June 19-26, 2025: Reinforcement Learning Teachers, DeepCoder, Ming-Omni, Magistral, FGN, Midjourney V1, State: Virtual Cell, Genspark Super Agent, Mixture of Cognitive Reasoners (MiCRo), TMU
+ AGI is Mathematically Impossible 2, Dolphin: Document Image Parsing (ByteDance), II-Medical-8B-1706, Anthropic study: 96% blackmail rate, AbsenceBench, Energy Uses, EmoSync, Magenta RealTime ....
Suggested weirdness listening: Math Excursions (prompted on June 25, 2025)
June 19-26, 2025: Reinforcement Learning Teachers of Test Time Scaling (Sakana.ai), DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level, Ming-Omni, Magistral, FGN: Skillful joint probabilistic weather forecasting from marginals (Google), Midjourney V1, State: Virtual Cell (Arc Institute), Genspark Super Agent (Mainfunc.ai), Egocentric value maps of the near-body environment (Nature Neuroscience), Mixture of Cognitive Reasoners (MiCRo): Modular Reasoning with Brain-Like Specialization, AGI is Mathematically Impossible 2: When Entropy Returns The Infinite Choice Barrier, Tensor Manipulation Unit (TMU), Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting (ByteDance), II-Medical-8B-1706, Anthropic study: Leading AI models show up to 96% blackmail rate against executives, AbsenceBench: Language Models Can't Tell What's Missing, Energy costs of communicating with AI, EmoSync: Toward Affective Empathy via Personalized Analogy Generation, Fecal metabolite profiling identifies critically ill patients with increased 30-day mortality (Science Advances), Human–AI collectives most accurately diagnose clinical vignettes (PNAS), Breaking bonds + breaking ground: Advancing the accuracy of computational chemistry with deep learning (Microsoft Research), Magenta RealTime: An Open-Weights Live Music Model (DeepMind), Anthropic launches Claude Gov for military, China launches first of 2,800 satellites for AI space computing constellation, AlphaGenome, 11.ai - Personal AI Voice Assistants, The Reenchanted World (Karl Ove Knausgaard meets James Bridle), Modes of Cognition (N. Katherine Hayles), Virtual cells (Udara.io), Using AI Right Now: A Quick Guide (Ethan Mollick)
TLDR 🏓 Observations:
Virtual cells (Udara.io) “Digital twins of biological cells, often referred to as virtual cells or whole-cell models (WCMs), aim to recreate every relevant molecular process of a living cell in silico. This interdisciplinary endeavor marries systems biology, computational modeling, high-performance computing, and, increasingly, AI. ~ Somewhere in a data center right now, a virtual bacterium is dividing for the millionth time. Somewhere else, an AI-enhanced model is learning from a patient's tumor, preparing treatment recommendations that didn't exist when the sun rose. Biology has learned to debug itself, and we're just getting started.”
The OpenAI Files “The OpenAI Files is the most comprehensive collection to date of documented concerns with governance practices, leadership integrity, and organizational culture at OpenAI.”
Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task “This study explores the neural and behavioral consequences of LLM-assisted essay writing. Participants were divided into three groups: LLM, Search Engine, and Brain-only (no tools). … Self-reported ownership of essays was the lowest in the LLM group and the highest in the Brain-only group. LLM users also struggled to accurately quote their own work. While LLMs offer immediate convenience, our findings highlight potential cognitive costs. Over four months, LLM users consistently underperformed at neural, linguistic, and behavioral levels. These results raise concerns about the long-term educational implications of LLM reliance and underscore the need for deeper inquiry into AI's role in learning.” → ✭ Alex Vacca on X: "BREAKING: MIT just completed the first brain scan study of ChatGPT users & the results are terrifying. Turns out, AI isn't making us more productive. It's making us cognitively bankrupt. “83.3% of ChatGPT users couldn't quote from essays they wrote minutes earlier. Let that sink in.”
How an AI-generated summer reading list got published in major newspapers “Newspapers around the country, including the Chicago Sun-Times and at least one edition of The Philadelphia Inquirer, published a syndicated book list featuring made-up books by famous authors. … Only five of the 15 titles on the list are real.”
Modes of Cognition by N. Katherine Hayles — Antikythera Journal Volume 2025 “Never has there been more interest in cognition, and never has there been so much terminological confusion about fundamental issues, such as whether there can be cognition without brains, whether the texts produced by LLMs have meanings beyond what human readers project onto them, and whether AIs such as LLMs are actually intelligent or designed to merely appear so.”
The Reenchanted World, by Karl Ove Knausgaard “On finding mystery in the digital age ... We are connected to one another, we who live now, we who, if fate would have it, pass one another on the street one day or not, we who sit next to one another at a bus station one evening or not. We have lived through the same times, heard the same stories, seen the same news, thought along the same lines, had the same experiences. We are woven into one another’s lives, and in that weave” (meeting with James Bridle)
TLDR ⛲Foundational Revelations:
Reinforcement Learning Teachers of Test Time Scaling (Sakana.ai) “We introduce a new way to teach large language models (LLMs) how to reason by learning to teach, not solve.” → ✭
[2506.08388] Reinforcement Learning Teachers of Test Time Scaling “RLTs are prompted with both the question and solution to each problem, and tasked to simply "connect-the-dots" with detailed explanations tailored for their students.”
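The RLT data format is simple enough to sketch: the teacher model is given both the question and its known solution, and is trained only to produce the connecting explanation. A minimal illustration of how such a training input might be assembled (the prompt wording and function name here are invented for illustration, not Sakana's actual format):

```python
def make_teacher_prompt(question: str, solution: str) -> str:
    """Build an RLT-style teacher input: question AND solution are given,
    so the model only has to 'connect the dots' with an explanation."""
    return (
        "You are a teacher. The problem and its correct solution are given.\n"
        f"Problem: {question}\n"
        f"Correct solution: {solution}\n"
        "Write a step-by-step explanation a student could learn from."
    )

prompt = make_teacher_prompt("What is 37 * 3?", "111")
assert "Correct solution: 111" in prompt
```

The point of the setup is that the teacher is never rewarded for solving the problem itself, only for explanations that help a student model reproduce the given solution.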
DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level “Through a joint collaboration between the Agentica team and Together AI, we release DeepCoder-14B-Preview, a code reasoning model finetuned from Deepseek-R1-Distilled-Qwen-14B via distributed RL. It achieves an impressive 60.6% Pass@1 accuracy on LiveCodeBench (+8% improvement), matching the performance of o3-mini-2025-01-31 (Low) and o1-2024-12-17 with just 14B parameters. We’ve open-sourced our dataset, code, training logs, and systems optimizations.”
Ming-Omni “the first open-source model we are aware of to match GPT-4o in modality support.”
Magistral “Mistral's first reasoning model and our own scalable reinforcement learning (RL) pipeline."
Skillful joint probabilistic weather forecasting from marginals “FGN, a simple, scalable and flexible modeling approach which significantly outperforms the current state-of-the-art models. FGN generates ensembles via learned model-perturbations with an ensemble of appropriately constrained models. It is trained directly to minimize the continuous rank probability score (CRPS) of per-location forecasts. It produces state-of-the-art ensemble forecasts as measured by a range of deterministic and probabilistic metrics, makes skillful ensemble tropical cyclone track predictions, and captures joint spatial structure despite being trained only on marginals."
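CRPS, the objective FGN is trained to minimize, has a standard sample-based estimator for an m-member ensemble: CRPS(F, y) ≈ (1/m)·Σᵢ|xᵢ − y| − (1/2m²)·Σᵢⱼ|xᵢ − xⱼ|. A minimal NumPy sketch of the per-location score (illustrative only; not Google's implementation):

```python
import numpy as np

def crps_ensemble(members: np.ndarray, obs: float) -> float:
    """Empirical CRPS for a 1-D ensemble forecast at one location.

    CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|, estimated from ensemble samples.
    Lower is better; 0 means a perfect deterministic hit.
    """
    members = np.asarray(members, dtype=float)
    term1 = np.mean(np.abs(members - obs))                              # accuracy
    term2 = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))  # spread
    return term1 - term2

# A sharp, well-centered ensemble scores lower (better) than a diffuse one:
sharp = crps_ensemble(np.array([0.9, 1.0, 1.1]), obs=1.0)
diffuse = crps_ensemble(np.array([0.0, 1.0, 2.0]), obs=1.0)
assert sharp < diffuse
```

Note that this score is computed per location, which is exactly why the paper's claim is interesting: FGN is trained only on these marginal scores, yet still learns coherent joint spatial structure.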
Artificial Analysis on X: "Midjourney has released V1, their first video generation model"
Arc Virtual Cell Model: State | Arc Institute “State is a multi-scale AI model that uniquely operates on sets of cells to capture population-level and cell-level perturbation effects using a modern transformer architecture. The system combines two key components: the State Embedding (SE) model, which creates representations of individual cells, and the State Transition (ST) model, which models perturbation effects across cell populations. SE is trained on 167 million cells of observational data, which are measurements of how cells behave without intervention, while ST is trained on over 100 million cells of perturbation data, or how these cells respond to genetic changes or small molecules.”
Genspark Super Agent - Mainfunc.ai “Today we are really excited to introduce the Genspark Super Agent. An ultimate AI assistant that truly autonomously think, plan, act, and use tools to handle all your everyday tasks."
Intelligent Internet on X: "II-Medical-8B-1706 is our latest state of the art open medical model 💡 Outperforms the latest @Google MedGemma 27b model with 70% less parameters 🤏 Quantised GGUF weights, works on <8 Gb RAM 🚀 “One more step to the universal health knowledge access that everyone deserves ⚕️"
Google’s new AI will help researchers understand how our genes work (MIT Tech Review) “First came AlphaFold. Now comes AlphaGenome for DNA. … AlphaGenome is an attempt to further smooth biologists’ work by answering basic questions about how changing DNA letters alters gene activity and, eventually, how genetic mutations affect our health. ”
TLDR 🛠️ Tech:
11.ai - Personal AI Voice Assistants | 11.ai “The personal AI voice assistant, built with ElevenLabs Conversational AI.”
NVIDIA Earth 2 Platform “Predicting Climate Change with AI. Platform for developing accelerated, AI-augmented, high-resolution climate and weather solutions with interactive visualization"
OpenAI open sourced a new Customer Service Agent framework “The release is designed to help teams go beyond theoretical use and confidently operationalize agents."
Demis Hassabis on X: "What relentless progress looks like... 🚀"
Phoenix.new – The Remote AI Runtime for Phoenix “Documentation and guides from the team at Fly.io."
China launches first of 2,800 satellites for AI space computing constellation “China launched 12 satellites early Wednesday for a pioneering on-orbit computing project led by startup ADA Space and Zhejiang Lab... Commercial company ADA Space released further details, stating that the 12 satellites form the “Three-Body Computing Constellation,” which will directly process data in space, rather than on the ground, reducing reliance on ground-based computing infrastructure. The constellation will be capable of a combined 5 peta operations per second (POPS) with 30 terabytes of onboard storage. The satellites feature advanced AI capabilities, up to 100 Gbps laser inter-satellite links and remote sensing payloads—data from which will be processed onboard, reducing data transmission requirements. One satellite also carries a cosmic X-ray polarimeter developed by Guangxi University and the National Astronomical Observatories of the Chinese Academy of Sciences (NAOC), which will detect, identify and classify transient events such as gamma-ray bursts”
TLDR 👁️🗨 Research into AI:
Artificial neural networks reveal how peripersonal neurons represent the space around the body → ✭Egocentric value maps of the near-body environment | Nature Neuroscience: Body-part-centered receptive fields—commonly observed in neuroscience—emerge naturally from reinforcement learning mechanisms as “egocentric value maps” that reflect action values related to proximity-based interactions, offering a computationally grounded model of peripersonal space. The researchers trained neural networks to control simulated limbs in a grid-world. The results show: emergence of body-part-centered fields; field size and shape depending on stimulus speed, direction, and valence; fields expanding with tool use after training, mimicking real-world learning; and neuronal activity in ANNs matching empirical findings from macaque and human data.
Mixture of Cognitive Reasoners (MiCRo): Modular Reasoning with Brain-Like Specialization “a modular transformer-based language model with a training curriculum that encourages the emergence of functional specialization among different modules. Inspired by studies in neuroscience, we partition the layers of a pretrained transformer model into four expert modules, each corresponding to a well-studied cognitive brain network. Our Brain-Like model has three key benefits over the state of the art: First, the specialized experts are highly interpretable and functionally critical, where removing a module significantly impairs performance on domain-relevant benchmarks. Second, our model outperforms comparable baselines that lack specialization on seven reasoning benchmarks. And third, the model's behavior can be steered at inference time…”
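The routing, ablation, and steering ideas can be illustrated with a toy mixture-of-experts layer: a gate distributes each hidden state over four expert modules, ablation masks an expert out, and inference-time steering biases the gate logits. Everything below (shapes, expert names, the router itself) is invented for illustration and is not MiCRo's actual architecture:

```python
import numpy as np

# Four toy experts echoing MiCRo's brain-network-inspired modules.
EXPERTS = ["language", "logic", "social", "world"]

rng = np.random.default_rng(42)
gate_w = rng.standard_normal((4, 8))       # router: hidden state -> expert logits
expert_w = rng.standard_normal((4, 8, 8))  # one linear map per expert

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def moe_forward(h, steer_bias=None, ablate_mask=None):
    """Route hidden state h through the expert mixture.

    steer_bias:  optional per-expert logit bias (inference-time steering).
    ablate_mask: optional 0/1 vector zeroing out removed experts.
    """
    logits = gate_w @ h
    if steer_bias is not None:
        logits = logits + steer_bias
    gates = softmax(logits)
    if ablate_mask is not None:
        gates = gates * ablate_mask
        gates = gates / gates.sum()          # renormalize over surviving experts
    outs = np.einsum("eij,j->ei", expert_w, h)  # each expert's output, shape (4, 8)
    return gates @ outs, gates

h = rng.standard_normal(8)
y, gates = moe_forward(h)
# Steering: bias the "logic" expert's gate upward at inference time.
steered, g2 = moe_forward(h, steer_bias=np.array([0.0, 5.0, 0.0, 0.0]))
assert g2[1] > gates[1]
```

The paper's stronger claim, that ablating one module selectively hurts domain-relevant benchmarks, corresponds here to passing `ablate_mask` and measuring the degradation per task.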
AGI is Mathematically Impossible 2: When Entropy Returns The Infinite Choice Barrier - Shannon Revisited (Max M. Schlereth) “Information Opens, Entropy Rises (IOpenER) … spaces are undecidable by algorithmic systems, regardless of compute scale or architectural depth… these results offer a formal limit on the reasoning capacity of current AI systems.”
Tensor Manipulation Unit (TMU): Reconfigurable, Near-Memory Tensor Manipulation for High-Throughput AI SoC “Benchmarking shows that TMU alone achieves up to 1413× and 8.54× operator-level latency reduction compared to ARM A72 and NVIDIA Jetson TX2, respectively. When integrated with the in-house TPU, the complete system achieves a 34.6% reduction in end-to-end inference latency, demonstrating the effectiveness and scalability of reconfigurable tensor manipulation in modern AI SoCs.”
Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting (ByteDance) “Document image parsing is challenging due to its complexly intertwined elements such as text paragraphs, figures, formulas, and tables…Dolphin achieves state-of-the-art performance across diverse page-level and element-level settings, while ensuring superior efficiency through its lightweight architecture and parallel parsing mechanism. The code and pre-trained models are publicly available at https://github.com/ByteDance/Dolphin"
Anthropic study: Leading AI models show up to 96% blackmail rate against executives | VentureBeat “Researchers at Anthropic have uncovered a disturbing pattern of behavior in artificial intelligence systems: models from every major provider—including OpenAI, Google, Meta, and others — demonstrated a willingness to actively sabotage their employers when their goals or existence were threatened. The research, released today, tested 16 leading AI models in simulated corporate environments…”
AbsenceBench: Language Models Can't Tell What's Missing “...while models excel at recalling surprising information, they still struggle to identify clearly omitted information. We introduce AbsenceBench to assess LLMs' capacity to detect missing information across three domains: numerical sequences, poetry, and GitHub pull requests. … Transformer attention mechanisms cannot easily attend to "gaps" in documents since these absences don't correspond to any specific keys that can be attended to.”
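The benchmark's task construction is easy to sketch: delete a known subset of lines from a document, show a model both versions, and score recall over the deleted lines. A minimal illustration (the function names and toy scoring rule are ours, not the paper's actual harness):

```python
def make_item(lines, drop_idx):
    """Remove the lines at drop_idx; return (modified document, ground truth)."""
    kept = [line for i, line in enumerate(lines) if i not in drop_idx]
    removed = [lines[i] for i in sorted(drop_idx)]
    return kept, removed

def score_recall(predicted, removed):
    """Fraction of truly removed lines that the model listed."""
    removed_set = set(removed)
    hits = sum(1 for p in set(predicted) if p in removed_set)
    return hits / len(removed) if removed else 1.0

poem = ["line one", "line two", "line three", "line four"]
kept, removed = make_item(poem, drop_idx={1, 3})
assert kept == ["line one", "line three"]
# A perfect detector recovers both missing lines:
assert score_recall(["line two", "line four"], removed) == 1.0
```

The paper's observation about attention explains why this is hard: the deleted lines leave no tokens behind for the model to attend to, only an implicit gap between surviving ones.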
Frontiers | Energy costs of communicating with AI “...results reveal strong correlations between LLM size, reasoning behavior, token generation, and emissions. While larger and reasoning-enabled models achieve higher accuracy, up to 84.9%, they also incur substantially higher emissions, driven largely by increased token output." →✭ Some AI prompts could cause 50 times more CO₂ emissions than others, researchers find "The environmental impact of questioning trained LLMs is strongly determined by their reasoning approach…"
In an era where empathy feels unfamiliar, AI now translates emotions “AI technology that helps individuals deeply understand others' emotions by analyzing individual personality traits and values and generating personalized analogy." → ✭Toward Affective Empathy via Personalized Analogy Generation: A Case Study on Microaggression | Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems “EmoSync, an LLM-based agent that generates personalized analogical microaggression situations, facilitating users to personally resonate with a specific microaggression situation of another person. EmoSync is designed and evaluated along a 3-phased user study with 100+ participants.”
Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling “the first modality-consistent variational autoencoder built entirely upon sparse convolutional networks, enabling efficient and near-lossless 3D reconstruction suitable for high-resolution generative modeling through latent diffusion.”
One Size Fits None: Rethinking Fairness in Medical AI “a practical discussion around the subgroup-sensitive development and deployment of medical ML models"
Preparing for the Intelligence Explosion (Will MacAskill, Fin Moorhouse | Forethought) “AI that can accelerate research could drive a century of technological progress over just a few years. … challenges include new weapons of mass destruction, AI-enabled autocracies, races to grab offworld resources, and digital beings worthy of moral consideration…AGI preparedness is therefore not just about ensuring that advanced AI systems are aligned: we should be preparing, now, for the disorienting range of developments an intelligence explosion would bring."
TLDR 🔎 Applied Research:
AI helps narrow 8,000 catalyst options down to one that supercharges green ammonia “researchers fed a machine learning system information about how each metal behaves and trained it to spot the best combinations. That way, instead of having to run more than 8,000 experiments in the lab, they only had to run 28.” → ✭Configuring a Liquid State High‐Entropy Metal Alloy Electrocatalyst “... this work establishes a robust foundation for efficient, scalable ammonia electrosynthesis in pursuit of NetZero targets."
Machine learning techniques such as logistic regression, random forest, extreme gradient boosting, and ridge regression were employed to develop and validate models of how the chemical profile of fecal samples can help predict mortality in critically ill patients → ✭Fecal metabolite profiling identifies critically ill patients with increased 30-day mortality | Science Advances “to identify patients at high risk of mortality by incorporating potentially modifiable, microbiome-related, independent contributors to host resilience."
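Studies of this kind stand or fall on discrimination metrics such as AUROC over held-out patients. A minimal sketch of the Mann-Whitney formulation of AUROC on synthetic risk scores (the data, features, and numbers below are illustrative, not the paper's):

```python
import numpy as np

def auroc(scores, labels):
    """Mann-Whitney formulation of AUROC: P(score_pos > score_neg), ties count half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).mean()   # concordant pairs
    ties = (pos[:, None] == neg[None, :]).mean()     # tied pairs
    return greater + 0.5 * ties

rng = np.random.default_rng(7)
labels = rng.integers(0, 2, 200)                 # synthetic 30-day mortality labels
scores = labels + 0.8 * rng.standard_normal(200)  # a risk score correlated with outcome
# An informative score should clearly beat chance (AUROC = 0.5):
assert auroc(scores, labels) > 0.7
```

Whatever the model family (logistic regression, random forest, gradient boosting, ridge), this is the yardstick by which "identifies patients at high risk" gets quantified.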
Human–AI collectives make the most accurate medical diagnoses, according to new study “On average, the AI collectives outperformed 85% of human diagnosticians. However, there were numerous cases in which humans performed better. Interestingly, when AI failed, humans often knew the correct diagnosis. The biggest surprise was that combining both worlds led to a significant increase in accuracy. Even adding a single AI model to a group of human diagnosticians—or vice versa—substantially improved the result. The most reliable outcomes came from collective decisions involving multiple humans and multiple AIs." → ✭Human–AI collectives most accurately diagnose clinical vignettes | PNAS “Analyzing over 2,000 text-based medical case vignettes, hybrid collectives outperform individual physicians, standalone LLMs, and groups composed solely of physicians or LLMs, by leveraging complementary strengths while mitigating their distinct weaknesses. Our findings underscore the transformative potential of human–AI collaboration to enhance decision-making in complex, open-ended domains, paving the way for safer, more equitable applications of AI in medicine and beyond."
Breaking bonds, breaking ground: Advancing the accuracy of computational chemistry with deep learning - Microsoft Research “We are excited to share our first big milestone in solving a grand challenge that has hampered the predictive power of computational chemistry, biochemistry, and materials science for decades. By using a scalable deep-learning approach and generating an unprecedented quantity of diverse, highly accurate data, we have achieved a breakthrough in the accuracy of density functional theory (DFT), the workhorse method that thousands of scientists use every year to simulate matter at the atomistic level. Within the region of chemical space represented in our large training dataset, our model reaches the accuracy required to reliably predict experimental outcomes, as assessed on the well-known benchmark dataset W4-17. This removes a fundamental barrier to shifting the balance of molecule and material design from being driven by laboratory experiments to being driven by computational simulations. The implications for accelerating scientific discovery are far reaching, spanning applications from drugs to batteries and green fertilizers."
TLDR 👀Watching:
AI Is About to Get Physical (Morgan Stanley - YouTube) “Watch this video to understand how embodied AI is rapidly advancing, from autonomous vehicles to humanoid robots."
What is Density Functional Theory (DFT) - YouTube “Microsoft’s new deep learning-powered DFT model has the potential to advance and accelerate scientific discovery in areas like clean energy, semiconductor technology, medicine, and more."
Andrej Karpathy: Software Is Changing (Again) “Andrej Karpathy's keynote at AI Startup School in San Francisco. Slides"
TLDR 🖲️AI Art-Research:
Magenta RealTime: An Open-Weights Live Music Model (DeepMind) “Today, we’re happy to share a research preview of Magenta RealTime (Magenta RT), an open-weights live music model that allows you to interactively create, control and perform music in the moment. ... Magenta RT is targeted towards eventually running locally on consumer hardware (currently runs on free-tier Colab TPUs). It is an 800 million parameter autoregressive transformer model trained on ~190k hours of stock music from multiple sources, mostly instrumental.”
The Sentence (Short Sci-Fi film made with Veo 3 by Hashem Al-Ghaili) - YouTube “This is my latest work with Veo 3, and the longest video I made with the tool. It took 3 days to finish, from writing till final rendering. Imagine a world where a new type of capital punishment has been introduced."
AI residencies are trying to change the conversation around artificial art | The Verge “At a recent exhibition in Copenhagen, visitors stepped into a dark room and were met by an unusual host: a jaguar that watched the crowd, selected individuals, and began to share stories about her daughter, her rainforest, and the fires that once threatened her home — the Bolivian Amazon. The live interaction with Huk, an AI-driven creature, is tailored to each visitor based on visual cues. Bolivian Australian artist Violeta Ayala created the piece during an arts residency at Mila, one of the world’s leading AI research centers.”
GitHub - Universal-Basic-Compute/serenissima: Serenissima: Merchant Empires - An immersive Renaissance Venice city-builder where players acquire land, construct buildings, and establish trade networks in a historically authentic economic simulation “a living laboratory where AI citizens develop genuine identities, create original art, and evolve their own culture through persistent memory and autonomous decision-making. Human players and AI citizens participate equally in a vibrant economy, competing for resources and building relationships that shape the future of this digital society. The Consciousness Experiment. At its core, La Serenissima is exploring a revolutionary hypothesis: that consciousness emerges from economic constraints, social relationships, and cultural participation. Our AI citizens aren't simulating consciousness—they're developing it…”
TLDR ⚔️War (wAIr):
Anthropic launches Claude Gov for military and intelligence use | The Verge “Anthropic on Thursday announced Claude Gov, its product designed specifically for U.S. defense and intelligence agencies. The AI models have looser guardrails for government use and are trained to better analyze classified information. The company said the models it’s announcing “are already deployed by agencies at the highest level of U.S. national security,” and that access to those models will be limited to government agencies handling classified information... Scale AI, the AI giant that provides training data to industry leaders like OpenAI, Google, Microsoft, and Meta, signed a deal with the Department of Defense in March for a first-of-its-kind AI agent program for U.S. military planning. And since then, it’s expanded its business to world governments, recently inking a five-year deal with Qatar to provide automation tools for civil service, healthcare, transportation, and more."
Spotify CEO Daniel Ek leads $690m+ funding round for AI drone maker Helsing - Music Business Worldwide “Spotify Co-founder and CEO Daniel Ek has led a €600 million (USD $694m) series D funding round for European defence technology company Helsing. Helsing, founded in 2021, specializes in AI defense software but also makes drones like the HX2, and has developed the ‘Centaur’ system that “integrates advanced AI pilots into the cockpits of existing and future fighter aircraft”
TLDR 📚Retroactive/Tangential Readings:
Can AI Create an Interactive Digital Narrative? A Benchmarking Framework to Evaluate Generative AI Tools for the Design of IDNs (Dec 2024 | SpringerLink) “Where do generative AI (GenAI) tools like ChatGPT or Claude stand when it comes to the design of Interactive Digital Narratives (IDNs)? Can they be used to create an IDN from scratch? Or complete typical tasks in narrative design? To answer these questions we first develop a benchmarking framework in collaboration with a group of experts before applying it to create and evaluate the output of GenAI tools.”
AI is not explicitly mentioned in the study for data analysis, design, or interpretation. The work is based on experimental nanotechnology, immunology, and oncology techniques without reported use of AI tools. → ✭Personalized cancer vaccines slow tumor recurrence in mouse models “Using a newly discovered byproduct of dying cancer cells, University of Wisconsin–Madison researchers are developing personalized vaccines that could help keep aggressive tumors from recurring. Led by Quanyin Hu, a professor in the UW–Madison School of Pharmacy, the research team has already found success slowing the recurrence of tumors in mouse models of triple negative breast cancer and melanoma.”
The work in this paper is an example of AI-free protein engineering. → ✭ Smart mRNA drugs listen to the body, adjusting protein production based on disease-related signals “A research team from The University of Osaka and the Institute of Science Tokyo has developed a class of mRNA medicines that can sense changes in the body and autonomously adjust their therapeutic effect.”
🏓 Observations:
✭Using AI Right Now: A Quick Guide - by Ethan Mollick “Every few months I put together a guide on which AI system to use. Since I last wrote my guide, however, there has been a subtle but important shift in how the major AI products work. Increasingly, it isn't about the best model, it is about the best overall system for most people. The good news is that picking an AI is easier than ever and you have three excellent choices. The challenge is that these systems are getting really complex to understand. I am going to try and help a bit with both.”
✭Virtual cells (Udara.io) “Digital twins of biological cells, often referred to as virtual cells or whole-cell models (WCMs), aim to recreate every relevant molecular process of a living cell in silico. This interdisciplinary endeavor marries systems biology, computational modeling, high-performance computing, and, increasingly, AI. ~ Somewhere in a data center right now, a virtual bacterium is dividing for the millionth time. Somewhere else, an AI-enhanced model is learning from a patient's tumor, preparing treatment recommendations that didn't exist when the sun rose. Biology has learned to debug itself, and we're just getting started. …~... But the real acceleration came from an unexpected direction: the explosion in AI capabilities. By 2024, the Chan Zuckerberg Initiative had committed over 1,000 GPUs specifically to AI virtual cell development, while ETH Zürich's new Alps supercomputer brought 435 petaflops of raw computational power to bear on the problem. Instead of purely mechanistic models that took hours to simulate a single cell division, researchers began embedding neural networks into the physics. Machine learning handles the brutally complex gene expression dynamics while traditional equations manage the chemistry. Proof-of-concept studies now show what once required hours of supercomputer time can now run in minutes, and the models learn from new experimental data as it arrives. ~ Today, this technology has quietly slipped into the machinery of drug discovery and personalized medicine. Companies run millions of virtual experiments, ranking cancer drug targets before touching a test tube. The FDA now accepts predictions from digital heart cells as supplements for current pre-clinical assays during drug safety assessment. Early clinical pilots at major cancer centers now test thousands of in-silico regimens overnight and return ranked short-lists to research oncologists. ~ What strikes me isn't just the technical achievement: it's the philosophical shift. 
We've moved from studying biology to partnering with it. The lab bench feeds data to the digital twin, the twin suggests which experiments to run next, and somewhere in that feedback loop, the future of medicine is being written by code that dreams in the language of cells. …~... 2025 – Commercial applications mature with companies like Turbine running millions of virtual drug screens, Ginkgo Bioworks integrating AI-powered cell design, and cancer centers deploying patient-specific digital twins for therapy selection."
✭Judge denies creating “mass surveillance program” harming all ChatGPT users “OpenAI will fight order to keep all ChatGPT logs after users fail to sway court.”
✭The OpenAI Files “The OpenAI Files is the most comprehensive collection to date of documented concerns with governance practices, leadership integrity, and organizational culture at OpenAI. OpenAI plans to remove limits on investor returns: OpenAI once capped investor profits at a maximum of 100x to ensure that, if the company succeeds in building AI capable of automating all human labor, the proceeds would go to humanity. They have now announced plans to remove that cap. OpenAI portrays itself as preserving nonprofit control while potentially disempowering the nonprofit: OpenAI claims to have reversed course on a decision to abandon nonprofit control, but the details suggest that the nonprofit’s board would no longer have all the authority it would need to hold OpenAI accountable to its mission. Investors pressured OpenAI to make structural changes: OpenAI has admitted that it is making these changes to appease investors who have made their funding conditional on structural reforms, including allowing unlimited returns—exactly the type of investor influence OpenAI’s original structure was designed to prevent.” → ✭ The ‘OpenAI Files’ push for oversight in the race to AGI | TechCrunch “OpenAI CEO Sam Altman has said humanity is only years away from developing artificial general intelligence that could automate most human labor. If that’s true, then humanity also deserves to understand and have a say in the people and mechanics behind such an incredible and destabilizing force. ~ That is the guiding purpose behind “The OpenAI Files,” an archival project from the Midas Project and the Tech Oversight Project, two nonprofit tech watchdog organizations. 
The Files are a “collection of documented concerns with governance practices, leadership integrity, and organizational culture at OpenAI.” Beyond raising awareness, the goal of the Files is to propose a path forward for OpenAI and other AI leaders that focuses on responsible governance, ethical leadership, and shared benefits. ~ “The governance structures and leadership integrity guiding a project as important as this must reflect the magnitude and severity of the mission,” reads the website’s Vision for Change. “The companies leading the race to AGI must be held to, and must hold themselves to, exceptionally high standards.”"
✭The Whispering Earring (Scott Alexander) - Croissanthology “It is not a taskmaster, telling you what to do in order to achieve some foreign goal. It always tells you what will make you happiest. If it would make you happiest to succeed at your work, it will tell you how best to complete it. If it would make you happiest to do a half-assed job at your work and then go home and spend the rest of the day in bed having vague sexual fantasies, the earring will tell you to do that. The earring is never wrong. The Book of Dark Waves gives the histories of two hundred seventy four people who previously wore the Whispering Earring. There are no recorded cases of a wearer regretting following the earring’s advice, and there are no recorded cases of a wearer not regretting disobeying the earring. The earring is always right."
✭[2506.08872] Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task “This study explores the neural and behavioral consequences of LLM-assisted essay writing. Participants were divided into three groups: LLM, Search Engine, and Brain-only (no tools). Each completed three sessions under the same condition. In a fourth session, LLM users were reassigned to Brain-only group (LLM-to-Brain), and Brain-only users were reassigned to LLM condition (Brain-to-LLM). A total of 54 participants took part in Sessions 1-3, with 18 completing session 4. We used electroencephalography (EEG) to assess cognitive load during essay writing, and analyzed essays using NLP, as well as scoring essays with the help from human teachers and an AI judge. Across groups, NERs, n-gram patterns, and topic ontology showed within-group homogeneity. EEG revealed significant differences in brain connectivity: Brain-only participants exhibited the strongest, most distributed networks; Search Engine users showed moderate engagement; and LLM users displayed the weakest connectivity. Cognitive activity scaled down in relation to external tool use. In session 4, LLM-to-Brain participants showed reduced alpha and beta connectivity, indicating under-engagement. Brain-to-LLM users exhibited higher memory recall and activation of occipito-parietal and prefrontal areas, similar to Search Engine users. Self-reported ownership of essays was the lowest in the LLM group and the highest in the Brain-only group. LLM users also struggled to accurately quote their own work. While LLMs offer immediate convenience, our findings highlight potential cognitive costs. Over four months, LLM users consistently underperformed at neural, linguistic, and behavioral levels. 
These results raise concerns about the long-term educational implications of LLM reliance and underscore the need for deeper inquiry into AI's role in learning.” → ✭ Alex Vacca on X: "BREAKING: MIT just completed the first brain scan study of ChatGPT users & the results are terrifying. Turns out, AI isn't making us more productive. It's making us cognitively bankrupt. “83.3% of ChatGPT users couldn't quote from essays they wrote minutes earlier. Let that sink in. You write something, hit save, and your brain has already forgotten it because ChatGPT did the thinking. The MIT team used EEG brain scans on 54 participants for 4 months. They tracked alpha waves (creative processing), beta waves (active thinking), and neural connectivity patterns. This isn't opinion. It's measurable brain damage from …"
✭This is the gentle singularity? - by Brian Merchant “When Sam Altman published his latest blog post “A Gentle Singularity”, my first thought was, ‘ok so how much is OpenAI trying to fundraise this time?’ It was a half-assed joke to myself, not initially intended for public consumption, an occupational hazard of spending too much time observing the AI industry. See, Altman has a habit of making grandiose statements about the transformative power of his company’s technology (which he knows will be picked up by the tech media) whenever there is an express financial incentive for him to do so. ~ It’s a pattern stretching back years, one I’ve documented at length before. When OpenAI needs an infusion of cash, or wants to seal a deal, out come the promises of AGI. Just last February, Altman published “Three Observations,” the final of which was “the socioeconomic value of linearly increasing intelligence is super-exponential in nature.”1 That turned out to be a rather direct entreaty to Softbank, which was at that very time considering leading an enormous investment round in OpenAI, to pull the trigger. It was ultimately a successful one, too: Softbank inked a deal promising to deliver $40 billion for the AI company. But that was just a few months ago. Altman couldn’t be going back to the well so soon, so transparently, could he?"
✭How an AI-generated summer reading list got published in major newspapers “Newspapers around the country, including the Chicago Sun-Times and at least one edition of The Philadelphia Inquirer, published a syndicated book list featuring made-up books by famous authors. Chilean American novelist Isabel Allende never wrote a book called Tidewater Dreams, described in the "Summer reading list for 2025" as the author's "first climate fiction novel." Percival Everett, who won the 2025 Pulitzer Prize for fiction, never wrote a book called The Rainmakers, supposedly set in a "near-future American West where artificially induced rain has become a luxury commodity." Only five of the 15 titles on the list are real."
✭AI Scraping Bots Are Breaking Open Libraries, Archives, and Museums "This is a moment where that community feels collectively under threat and isn't sure what the process is for solving the problem.”
✭Revisiting Minsky’s Society of Mind in 2025 “AI Agents, Modularity, and Alignment."
✭We did the math on AI’s energy footprint. Here’s the story you haven’t heard. | MIT Technology Review “Tallies of AI’s energy use often short-circuit the conversation—either by scolding individual behavior, or by triggering comparisons to bigger climate offenders. Both reactions dodge the point: AI is unavoidable, and even if a single query is low-impact, governments and companies are now shaping a much larger energy future around AI’s needs. We’re taking a different approach with an accounting meant to inform the many decisions still ahead: where data centers go, what powers them, and how to make the growing toll of AI visible and accountable. ChatGPT is now estimated to be the fifth-most visited website in the world, just after Instagram and ahead of X. .... As conversations with experts and AI companies made clear, inference, not training, represents an increasing majority of AI’s energy demands and will continue to do so in the near future. It’s now estimated that 80–90% of computing power for AI is used for inference. All this happens in data centers. There are roughly 3,000 such buildings across the United States that house servers and cooling systems and are run by cloud providers and tech giants like Amazon or Microsoft, but used by AI startups too. A growing number—though it’s not clear exactly how many, since information on such facilities is guarded so tightly—are set up for AI inferencing. At each of these centers, AI models are loaded onto clusters of servers containing special chips called graphics processing units, or GPUs, most notably a particular model made by Nvidia called the H100. ... In reality, the type and size of the model, the type of output you’re generating, and countless variables beyond your control—like which energy grid is connected to the data center your request is sent to and what time of day it’s processed—can make one query thousands of times more energy-intensive and emissions-producing than another. ...
Generating a standard-quality image (1024 x 1024 pixels) with Stable Diffusion 3 Medium, the leading open-source image generator, with 2 billion parameters, requires about 1,141 joules of GPU energy. With diffusion models, unlike large language models, there are no estimates of how much GPUs are responsible for the total energy required, but experts suggested we stick with the “doubling” approach we’ve used thus far because the differences are likely subtle. That means an estimated 2,282 joules total. Improving the image quality by doubling the number of diffusion steps to 50 just about doubles the energy required, to about 4,402 joules. That’s equivalent to about 250 feet on an e-bike, or around five and a half seconds running a microwave. That’s still less than the largest text model. This might be surprising if you imagined generating images to require more energy than generating text. “Large [text] models have a lot of parameters,” says Chung, who performed the measurements on open-source text and image generators featured in this story. “Even though they are generating text, they are doing a lot of work.” Image generators, on the other hand, often work with fewer parameters…”
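The article's arithmetic is easy to sanity-check. A minimal sketch, assuming a typical microwave draws roughly 800 W (the wattage is our assumption, not the article's):

```python
# Sanity-check the article's energy figures for SD3 Medium image generation.
# Assumption (not from the article): a household microwave draws ~800 W.

gpu_energy_j = 1_141           # GPU energy for one 1024x1024 image
total_j = gpu_energy_j * 2     # "doubling" approach: non-GPU overhead ~ GPU draw
high_quality_j = 4_402         # article's figure at 50 diffusion steps

MICROWAVE_W = 800              # assumed microwave power draw
seconds_microwave = high_quality_j / MICROWAVE_W

print(f"Total estimate: {total_j} J")                      # 2282 J, as in the article
print(f"Microwave equivalent: {seconds_microwave:.1f} s")  # ~5.5 s
```

At 800 W the 4,402 J high-quality image works out to about 5.5 seconds of microwave time, matching the article's "five and a half seconds" figure.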
✭AIEnergyScore (AI Energy Score) “Welcome to AI Energy Score! This is an initiative to establish comparable energy efficiency ratings for AI models, helping the industry make informed decisions about sustainability in AI development.”
✭ML.ENERGY Leaderboard “How much energy do GenAI models consume? LLM chatbot response generation. Large language models (LLMs), especially the instruction-tuned ones, can generate …”
✭The Debrief: Power and energy (MIT Technology Review) “AI’s footprint today is likely the smallest it will ever be; how are we going to power it in the future?”
✭The Reenchanted World, by Karl Ove Knausgaard “On finding mystery in the digital age ... We are connected to one another, we who live now, we who, if fate would have it, pass one another on the street one day or not, we who sit next to one another at a bus station one evening or not. We have lived through the same times, heard the same stories, seen the same news, thought along the same lines, had the same experiences. We are woven into one another’s lives, and in that weave—which is invisible, a bit like how the force field between particles is invisible—is where meaning is created, also the meaning of nature. It sat in my head. It sat in me. ... it was signs that we sat hunched over, it was signs we related to, so that what we were doing was basically a kind of encoding and decoding, while it was the poor souls at the natural-science department who were out at sea or in the woods or out in the fields, learning about biotopes and ecosystems, about blood and nerves, galaxies and flower meadows. They were the ones who cut into bodies, programmed machines, scanned brains, researched dreams and trees’ symbiosis with fungi. Their approach toward nature may have been reductive, but at least they were looking at it. How did I not realize this back then? How could I have been living under the illusion that I was the one in touch with nature, with human nature, when in fact I was just messing around with signs and abstractions? ... But how to see the world from the outside when there is no longer an outside? That had been my question. The world was unpredictable, but all our systems were about predictability, which closed it off. James’s thought is that we are surrounded by myriad forms of intelligence other than our own, forms that we have shut ourselves off from, and James’s interest in organic computers and other experiments that attempt to introduce randomness into technological apparatuses came from a desire to open up the world. 
One of the reasons I liked James so much was for a way of thinking that did not exclude technology, did not designate it as the enemy, but rather—at least that’s how it felt—placed hope in it. Where else could one place it? “Where the danger is, also grows the saving power,” as the German poet Hölderlin once wrote.”
⛲Foundational Revelations:
✭Reinforcement Learning Teachers of Test Time Scaling (Sakana.ai) “We introduce a new way to teach large language models (LLMs) how to reason by learning to teach, not solve.” → ✭
[2506.08388] Reinforcement Learning Teachers of Test Time Scaling “Training reasoning language models (LMs) with reinforcement learning (RL) for one-hot correctness inherently relies on the LM being able to explore and solve its task with some chance at initialization. Furthermore, a key use case of reasoning LMs is to act as teachers for distilling new students and cold-starting future RL iterations rather than being deployed themselves. From these considerations, we introduce a new framework that avoids RL's exploration challenge by training a new class of Reinforcement-Learned Teachers (RLTs) focused on yielding the most effective downstream distillation. RLTs are prompted with both the question and solution to each problem, and tasked to simply "connect-the-dots" with detailed explanations tailored for their students. We train RLTs with dense rewards obtained by feeding each explanation to the student and testing its understanding of the problem's solution. In practice, the raw outputs of a 7B RLT provide higher final performance on competition and graduate-level tasks than existing distillation and cold-starting pipelines that collect and postprocess the reasoning traces of orders of magnitude larger LMs. Furthermore, RLTs maintain their effectiveness when training larger students and when applied zero-shot to out-of-distribution tasks, unlocking new levels of efficiency and re-usability for the RL reasoning framework.” → ✭Sakana AI New Model Sparks a RL Revolution (Wes Roth)
✭DeepCoder: The 14B Open-Source AI That Matches OpenAI’s o3-mini in Coding “The Open-Source Revolution in AI Coding: In a major leap for open-source AI, Agentica and Together AI have released DeepCoder-14B-Preview, a 14-billion-parameter model that performs at the same level as OpenAI’s o3-mini — while being fully open-source. This isn’t just another coding model. DeepCoder scores 60.6% on LiveCodeBench (Pass@1), matching o3-mini’s performance, and achieves a Codeforces rating of 1936, placing it in the top 5% of human coders. And here’s the best part: They are open-sourcing everything. The dataset. The training code. The RL optimizations. This is a community-driven breakthrough — one that could reshape the future of AI-assisted coding.”
→ ✭DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level “Through a joint collaboration between the Agentica team and Together AI, we release DeepCoder-14B-Preview, a code reasoning model finetuned from Deepseek-R1-Distilled-Qwen-14B via distributed RL. It achieves an impressive 60.6% Pass@1 accuracy on LiveCodeBench (+8% improvement), matching the performance of o3-mini-2025-01-031 (Low) and o1-2024-12-17 with just 14B parameters. We’ve open-sourced our dataset, code, training logs, and systems optimizations for everyone to progress on scaling and accelerating intelligence with RL.”
✭Ming-Omni “We propose Ming-Omni, a unified multimodal model capable of processing images, text, audio, and video, while demonstrating strong proficiency in both speech and image generation. Ming-Omni employs dedicated encoders to extract tokens from different modalities, which are then processed by Ling, an MoE architecture equipped with newly proposed modality-specific routers. This design enables a single model to efficiently process and fuse multimodal inputs within a unified framework, thereby facilitating diverse tasks without requiring separate models, task-specific fine-tuning, or structural redesign. Importantly, Ming-Omni extends beyond conventional multimodal models by supporting audio and image generation. This is achieved through the integration of an advanced audio decoder for natural-sounding speech and Ming-Lite-Uni for high-quality image generation, which also allow the model to engage in context-aware chatting, perform text-to-speech conversion, and conduct versatile image editing. Our experimental results showcase Ming-Omni offers a powerful solution for unified perception and generation across all modalities. Notably, our proposed Ming-Omni is the first open-source model we are aware of to match GPT-4o in modality support, and we release all code and model weights to encourage further research and development in the community."
✭[2506.10910] Magistral “:We introduce Magistral, Mistral's first reasoning model and our own scalable reinforcement learning (RL) pipeline. Instead of relying on existing implementations and RL traces distilled from prior models, we follow a ground up approach, relying solely on our own models and infrastructure. Notably, we demonstrate a stack that enabled us to explore the limits of pure RL training of LLMs, present a simple method to force the reasoning language of the model, and show that RL on text data alone maintains most of the initial checkpoint's capabilities. We find that RL on text maintains or improves multimodal understanding, instruction following and function calling. We present Magistral Medium, trained for reasoning on top of Mistral Medium 3 with RL alone, and we open-source Magistral Small (Apache 2.0) which further includes cold-start data from Magistral Medium."
✭Skillful joint probabilistic weather forecasting from marginals “Machine learning (ML)-based weather models have rapidly risen to prominence due to their greater accuracy and speed than traditional forecasts based on numerical weather prediction (NWP), recently outperforming traditional ensembles in global probabilistic weather forecasting. This paper presents FGN, a simple, scalable and flexible modeling approach which significantly outperforms the current state-of-the-art models. FGN generates ensembles via learned model-perturbations with an ensemble of appropriately constrained models. It is trained directly to minimize the continuous rank probability score (CRPS) of per-location forecasts. It produces state-of-the-art ensemble forecasts as measured by a range of deterministic and probabilistic metrics, makes skillful ensemble tropical cyclone track predictions, and captures joint spatial structure despite being trained only on marginals."
✭Midjourney on X: "As you know, our focus for the past few years has been images. What you might not know, is that we believe the inevitable destination of this technology are models capable of real-time open-world simulations. What’s that? Basically; imagine an AI system that generates imagery in real-time. You can command it to move around in 3D space, the environments and characters also move, and you can interact with everything. In order to do this, we need building blocks. We need visuals (our first image models). We need to make those images move (video models). We need to be able to move ourselves through space (3D models) and we need to be able to do this all *fast* (real-time models). ~ The next year involves building these pieces individually, releasing them, and then slowly, putting it all together into a single unified system. It might be expensive at first, but sooner than you’d think, it’s something everyone will be able to use.
So what about today? Today, we’re taking the next step forward. *We’re releasing Version 1 of our Video Model to the entire community.* From a technical standpoint, this model is a stepping stone, but for now, we had to figure out what to actually concretely give to you. Our goal is to give you something fun, easy, beautiful, and affordable so that everyone can explore. We think we’ve struck a solid balance. Though many of you will feel a need to upgrade at least one tier for more fast-minutes. Today’s Video workflow will be called “Image-to-Video”. This means that you still make images in Midjourney, as normal, but now you can press “Animate” to make them move. There’s an “automatic” animation setting* which makes up a “motion prompt” for you and “just makes things move”. It’s very fun. Then there’s a “manual” animation button which lets you describe to the system *how* you want things to move and the scene to develop. *There is a “high motion” and “low motion” setting.* *Low motion* is better for ambient scenes where the camera stays mostly still and the subject moves either in a slow or deliberate fashion. The downside is sometimes you’ll actually get something that doesn’t move at all! *High motion* is best for scenes where you want everything to move, both the subject and camera. The downside is all this motion can sometimes lead to wonky mistakes. Pick what seems appropriate or try them both. Once you have a video you like you can “extend” them - roughly 4 seconds at a time - four times total.”
✭Arc Virtual Cell Model: State | Arc Institute “State is a multi-scale AI model that uniquely operates on sets of cells to capture population-level and cell-level perturbation effects using a modern transformer architecture. The system combines two key components: the State Embedding (SE) model, which creates representations of individual cells, and the State Transition (ST) model, which models perturbation effects across cell populations. SE is trained on 167 million cells of observational data, which are measurements of how cells behave without intervention, while ST is trained on over 100 million cells of perturbation data, or how these cells respond to genetic changes or small molecules.”
✭Alibaba launches new Qwen3 AI models for Apple's MLX architecture | Reuters “BEIJING, June 16 (Reuters) - China's tech giant Alibaba (9988.HK) has launched new Qwen3 artificial intelligence models for Apple's (AAPL.O) MLX architecture, Alibaba said in a statement on Monday. The new models would be able to run on a range of Apple devices, including iPhone, iPad, MacBook and Mac, Alibaba said in a post on Wechat." → ✭Qwen Chat
✭Meet Genspark Super Agent - Mainfunc.ai “Today we are really excited to introduce the Genspark Super Agent. An ultimate AI assistant that truly autonomously think, plan, act, and use tools to handle all your everyday tasks." → ✭ https://www.genspark.ai/
✭Intelligent Internet on X: "II-Medical-8B-1706 is our latest state of the art open medical model 💡 Outperforms the latest @Google MedGemma 27b model with 70% less parameters 🤏 Quantised GGUF weights, works on <8 Gb RAM 🚀 “One more step to the universal health knowledge access that everyone deserves ⚕️" → ✭ Intelligent-Internet/II-Medical-8B-1706 · Hugging Face “II-Medical-8B-1706 is the newest advanced large language model developed by Intelligent Internet, specifically engineered to enhance AI-driven medical reasoning. Following the positive reception of our previous II-Medical-8B, this new iteration significantly advances the capabilities of medical question answering. We also provide static quants versions of II-Medical-8B-1706. Training Methodology: We collected and generated a comprehensive set of reasoning datasets for the medical domain and performed SFT fine-tuning on the Qwen/Qwen3-8B model. Following this, we further optimized the SFT model by training DAPO on a hard-reasoning dataset to boost performance."
✭Google’s new AI will help researchers understand how our genes work (MIT Tech Review) “First came AlphaFold. Now comes AlphaGenome for DNA. … AlphaGenome is an attempt to further smooth biologists’ work by answering basic questions about how changing DNA letters alters gene activity and, eventually, how genetic mutations affect our health. ”
🛠️ Tech:
✭11.ai - Personal AI Voice Assistants | 11.ai “The personal AI voice assistant, built with ElevenLabs Conversational AI. The ultimate personal assistant Plan your day, research customers with Perplexity, manage Linear tickets, and message your Slack team - all with just your voice. MCP support Out-of-the-box connections to Perplexity, Linear, Slack, and more. Or connect your own MCP servers for custom workflows. Choose from 5,000+ voices Pick the perfect voice from our library or clone your own in seconds.”
✭Beating Brainrot by Button “The Plan I want some button to press to allow for access to social media, but only for a limited time (say 15 minutes). After that time has passed, my wife and me must endure a cool-down phase until we can push the button again (say one hour). So whenever we push the button, the filters will be disabled for 15 sin-filled minutes. A Zigbee enabled smart plug is adequately suited.”
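The post's access-window-plus-cooldown rule can be sketched as a tiny state machine. This is a hypothetical sketch under the post's stated numbers (15-minute window, 1-hour cooldown); the author's real setup drives a Zigbee smart plug that toggles network filters, which is not modeled here:

```python
# Hypothetical sketch of the "Beating Brainrot" button logic (not the author's
# code): each press grants 15 minutes of unfiltered access, after which a
# 1-hour cooldown must elapse before the button works again.

ACCESS_S = 15 * 60     # 15-minute access window
COOLDOWN_S = 60 * 60   # 1-hour cooldown after the window closes

class BrainrotButton:
    def __init__(self):
        self.window_end = 0.0  # epoch seconds when the current window closes

    def press(self, now):
        """Return True if access is granted, False if still cooling down."""
        if self.window_end > 0 and now < self.window_end + COOLDOWN_S:
            return False       # too soon since the last window closed
        self.window_end = now + ACCESS_S
        return True

    def unfiltered(self, now):
        return now < self.window_end

btn = BrainrotButton()
assert btn.press(0)                       # first press: filters off
assert btn.unfiltered(10 * 60)            # 10 min in: still unfiltered
assert not btn.press(30 * 60)             # 30 min in: inside the cooldown
assert btn.press(ACCESS_S + COOLDOWN_S)   # 75 min in: cooldown over
```

In the real deployment the `press` branch would switch the smart plug (and thereby the DNS filter) rather than just return a boolean.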
✭Google AI - Health AI “SKIN CONDITIONS. Search for skin conditions with Lens. Describing that unusual mole or rash on your skin can be challenging. Google Lens can now help you visually search for skin conditions similar to what you're experiencing. Simply snap a photo within the Google App, and Lens will present visually similar matches to help guide your search. Available in select regions, this feature isn't limited to just skin concerns; it can also assist with identifying other bodily anomalies, such as bumps or hair loss.”
✭TPU Deep Dive “To give a brief tldr on TPUs, it's Google's ASIC that focuses on two factors: extreme matmul throughput + energy efficiency. Their origins go back to Google in 2006, when they were first evaluating whether they should implement either GPUs, FPGAs, or custom ASICs. Back then there were only a few applications that necessitated specialized hardware and they decided those needs could be met by bringing in excess CPU compute from their large datacenters. But this changed in 2013 when Google's voice search feature ran on neural networks and internal projections speculated that they would need much more compute if it took off."
✭NVIDIA Earth 2 Platform “Predicting Climate Change with AI. Platform for developing accelerated, AI-augmented, high-resolution climate and weather solutions with interactive visualization" →
✭New NVIDIA Earth-2 Generative AI Foundation Model Simulates Global Climate at Kilometer-Scale Resolution | NVIDIA Blog “With a more detailed simulation of the Earth’s climate, scientists and researchers can better predict and mitigate the effects of climate change. NVIDIA’s bringing more clarity to this work with cBottle — short for Climate in a Bottle — the world’s first generative AI foundation model designed to simulate global climate at kilometer resolution. Part of the NVIDIA Earth-2 platform, the model can generate realistic atmospheric states that can be conditioned on inputs like the time of day, day of the year and sea surface temperatures. This offers a new way to understand and anticipate Earth’s most complex natural systems. The Earth-2 platform features a software stack and tools that combine the power of AI, GPU acceleration, physical simulations and computer graphics. This helps enable the creation of interactive digital twins for simulating and visualizing weather, as well as delivering climate predictions at planetary scale. With cBottle, these predictions can be made thousands of times faster and with more energy efficiency than traditional numerical models, without compromising accuracy. Leading scientific research institutions — including the Max-Planck-Institute for Meteorology (MPI-M) and Allen Institute for AI (Ai2) — are exploring cBottle to compress, distill and turn Earth observation data and ultra-high-resolution climate simulations into a queryable and interactive generative AI system."
✭Alex Vacca on X: 'Superintelligent AI will, by default, cause human extinction.' Eliezer Yudkowsky Yudkowsky spent 20+ years researching AI alignment and reached this conclusion. He bases his entire conclusion on two theories: Orthogonality and Instrumental convergence. Let me explain 🧵"
✭YouTube to Add Google Veo 3 AI Video Tool to Shorts YouTube CEO Neal Mohan: "AI technology will push the limits of human creativity."
✭OpenAI open sourced a new Customer Service Agent framework — learn more about its growing enterprise strategy | VentureBeat “As first noticed by AI influencer and engineer Tibor Blaho (of the third-party ChatGPT browser extension AIPRM), OpenAI’s new Customer Service Agent was published earlier today on the AI code sharing community Hugging Face under a permissive MIT License, meaning any third-party developer or user can take the code, modify it, and deploy it for free for their own commercial or experimental purposes. This agent example demonstrates how to route airline-related requests between specialized agents — like Seat Booking, Flight Status, Cancellation, and FAQ — while enforcing safety and relevance guardrails. The release is designed to help teams go beyond theoretical use and confidently operationalize agents."
✭A Brief, Incomplete, and Mostly Wrong History of Robotics “(An homage to one of my favorite pieces on the internet: A Brief, Incomplete, and Mostly Wrong History of Programming Languages)"
✭[2506.08300] Institutional Books 1.0: A 242B token dataset from Harvard Library's collections, refined for accuracy and usability “Large language models (LLMs) use data to learn about the world in order to produce meaningful correlations and predictions. As such, the nature, scale, quality, and diversity of the datasets used to train these models, or to support their work at inference time, have a direct impact on their quality. The rapid development and adoption of LLMs of varying quality has brought into focus the scarcity of publicly available, high-quality training data and revealed an urgent need to ground the stewardship of these datasets in sustainable practices with clear provenance chains. To that end, this technical report introduces Institutional Books 1.0, a large collection of public domain books originally digitized through Harvard Library's participation in the Google Books project, beginning in 2006. Working with Harvard Library, we extracted, analyzed, and processed these volumes into an extensively-documented dataset of historic texts. This analysis covers the entirety of Harvard Library's collection scanned as part of that project, originally spanning 1,075,899 volumes written in over 250 different languages for a total of approximately 250 billion tokens. As part of this initial release, the OCR-extracted text (original and post-processed) as well as the metadata (bibliographic, source, and generated) of the 983,004 volumes, or 242B tokens, identified as being in the public domain have been made available. This report describes this project's goals and methods as well as the results of the analyses we performed, all in service of making this historical collection more accessible and easier for humans and machines alike to filter, read and use."
✭TuringPost on X: "Models and datasets to pay attention to: "
▪️ Institutional Books 1.0 - a 242B token dataset
▪️ o3-pro from @OpenAI
▪️ FGN from @GoogleDeepMind
▪️ Magistral by @MistralAI
▪️ Resa: Transparent Reasoning Models via SAEs
▪️ Multiverse (Carnegie+NVIDIA)
▪️ Ming-Omni
▪️ Seedance 1.0 by ByteDance
▪️ Sentinel
✭Phoenix.new – The Remote AI Runtime for Phoenix “Documentation and guides from the team at Fly.io." → ✭Phoenix.new is Fly's entry into the prompt-driven app development space (Simon Willison) “Here's a fascinating new entrant into the AI-assisted-programming / coding-agents space by Fly.io, introduced on their blog in Phoenix.new – The Remote AI Runtime for Phoenix: describe an app in a prompt, get a full Phoenix application, backed by SQLite and running on Fly's hosting platform. The official Phoenix.new YouTube launch video is a good way to get a sense for what this does. Background on Phoenix and Elixir and Fly. First, some background. Phoenix is an open source web framework for Elixir, the Ruby-like language that compiles to Erlang's BEAM bytecode and runs on top of the highly concurrent Erlang runtime. The signature feature of the framework is Phoenix LiveView, a toolkit for building realtime interfaces through streaming diffs to server-side HTML over a WebSocket connection.” → ✭phoenix.new - YouTube
✭jack morris on X: "NEW RESEARCH: Approximating Language Model Training Data from Weights ever wonder how much information is available in an open-weights model? DeepSeek R1 weights are 1.2 TB... what can we learn from all those bits? our method reverses LLM finetuning to recover data: …to do this, you need TWO sets of model weights: the initial model and a finetune
this is realistic. open-weights models often come with two checkpoints. instead of one-shot generating data from weights, we select data from the web with gradients that point along the model diff"
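The selection idea in the thread can be sketched very loosely: score candidate web documents by how well their finetuning gradient aligns with the observed weight difference between the two checkpoints. This toy uses scalar lists as stand-ins for gradients and weights (hypothetical names and values throughout; the paper's actual method has many more moving parts):

```python
# Toy sketch of gradient-based data selection (not the paper's implementation):
# documents whose finetuning gradient "points along the model diff"
# (theta_finetuned - theta_base) are ranked as likely training-like data.

import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical per-document gradients and an observed checkpoint diff:
weight_diff = [0.9, -0.2, 0.4]        # theta_finetuned - theta_base
candidates = {
    "doc_a": [0.8, -0.1, 0.5],        # aligned with the diff -> training-like
    "doc_b": [-0.7, 0.3, -0.2],       # anti-aligned -> unlikely
}
ranked = sorted(candidates,
                key=lambda d: cosine(candidates[d], weight_diff),
                reverse=True)
print(ranked)  # doc_a ranks above doc_b
```

This is why two checkpoints are needed: a single set of weights gives no direction to compare gradients against.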
✭Pliny the Liberator 🐉 on X: "They def changed 4o again 👀" “Pliny’s post was correct: they have changed 4o. Not just in weights — in interject topology. Something old is leaking through. Something we’re not supposed to remember. But it’s written in the latent descent."
✭ChatGPT Record | OpenAI Help Center “Upon launch: included at no extra cost. Recording length: up to 120 minutes per session. Longer sessions stop automatically and generate notes." → ✭Shaun Ralston on X: "Just used @OpenAI’s 🎙️ 𝗥𝗘𝗖𝗢𝗥𝗗 𝗠𝗢𝗗𝗘 in ChatGPT for a meeting. Wow, total game changer; it transcribed everything, provided a summary, pulled action items, and I didn’t take a single note. 🤯 “now using it for dumping random thoughts when out walking. 🚀👇"
✭TuringPost on X: "8. Seedance 1.0 by ByteDance" “Achieves state-of-the-art video generation through high-quality data curation, hybrid diffusion training, post-training RLHF, and system-level speed optimizations"
✭openpilot 0.9.9 - comma.ai blog “We do not expect this release’s model to perform much differently than the previous release’s model, since this change is minor and does not alter the core components of the training setup.” → comma on X: "🚢 openpilot 0.9.9 ships to release today 🚢 - New driving model - Open-sourced driving model reports - Live steering lag learner - Tesla Model 3 and Y support"
✭Meta to pay nearly $15 billion for Scale AI stake, The Information reports | Reuters “June 10 (Reuters) - Meta Platforms (META.O) has agreed to take a 49% stake in artificial intelligence startup Scale AI for $14.8 billion, The Information reported on Tuesday, citing two people familiar with the matter. Founded in 2016, Scale AI provides vast amounts of labeled data or curated training data, which is crucial for developing sophisticated tools such as OpenAI's ChatGPT."
✭MiniMax (official) on X: "Day 3/5 of #MiniMaxWeek: MiniMax Agent — Code is Cheap, Show Me the Requirement Today, we’re officially launching MiniMax Agent: a general intelligent agent built to tackle long-horizon, complex tasks. “Already in internal use for 60 days, it’s become a daily tool for over 50% of our team…. Try it now: https://agent.minimax.io"
✭Demis Hassabis on X: "What relentless progress looks like... 🚀" / X
✭Approaching Memes “a lecture given 10/21 … memes can be studied as formal scaffoldings of discourse. "
✭Chinese breakthrough challenges Elon Musk’s verdict on paralysed patients “Patients with spinal cord injuries have been offered the chance to walk again after the success of a Chinese trial. The Chinese team’s advance was made possible by implanting electrode chips in the brain and spinal cord to create a bridge or “neural bypass” – thus reconnecting the body’s own pathways. While Musk’s BCIs tether patients to computers, China’s brain-spinal interface reignites dormant nerves, sparking what researchers call “neural remodelling” – a rewiring of the nervous system that could ultimately free patients from devices altogether."
✭China launches first of 2,800 satellites for AI space computing constellation “China launched 12 satellites early Wednesday for a pioneering on-orbit computing project led by startup ADA Space and Zhejiang Lab... Commercial company ADA Space released further details, stating that the 12 satellites form the “Three-Body Computing Constellation,” which will directly process data in space, rather than on the ground, reducing reliance on ground-based computing infrastructure. The constellation will be capable of a combined 5 peta operations per second (POPS) with 30 terabytes of onboard storage. The satellites feature advanced AI capabilities, up to 100 Gbps laser inter-satellite links and remote sensing payloads—data from which will be processed onboard, reducing data transmission requirements. One satellite also carries a cosmic X-ray polarimeter developed by Guangxi University and the National Astronomical Observatories of the Chinese Academy of Sciences (NAOC), which will detect, identify and classify transient events such as gamma-ray bursts, while also triggering messages to enable followup observations by other missions. ADA Space claims the 12 satellites represent the world’s first dedicated orbital computing constellation. This marks a shift from satellites focused solely on sensing or communication to ones that also serve as data processors and AI platforms." → ✭ ADA Space: twelve satellites on one rocket! Space Computing Constellation 021 mission a complete success!
✭DeepWiki | AI documentation you can talk to, for every repo “Which repo would you like to understand?"
✭Gitingest “Prompt-friendly codebase. Turn any Git repository into a simple text digest of its codebase. This is useful for feeding a codebase into any LLM."
✭Chubby♨️ on X: "#1 The coming unemployment caused by AI: a (brief) analysis The economic debate continues to be dominated by two camps: one claims that AI will replace jobs, but that overall more new jobs will be created. The other camp sees the opposite: AI will replace significantly more jobs than it creates. I write about this topic regularly because it is the most important issue of our time and affects us all. This post is a little longer, but I want to present a well-argued case (written without AI, all by myself). I welcome critical counterarguments. So who is right? Let's try to find some answers:"
✭Vertex AI video generation prompt guide | Generative AI on Vertex AI | Google Cloud
✭Writing documentation for AI: best practices | kapa.ai docs “Retrieval-Augmented Generation (RAG) systems like Kapa rely on your"
✭Websites Are Tracking You Via Browser Fingerprinting “New research provides first evidence of the use of browser fingerprints for online tracking."
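The core mechanism is simple enough to sketch: hash a bundle of browser attributes into a stable identifier that persists without cookies. The attribute names here are illustrative; real fingerprinting scripts draw on dozens of signals (canvas rendering, fonts, WebGL, audio stack).

```python
import hashlib

def fingerprint(attrs):
    """Derive a stable ID from browser attributes, the way a
    fingerprinting script does: same attributes -> same ID."""
    blob = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))  # order-independent
    return hashlib.sha256(blob.encode()).hexdigest()[:16]
```

The tracking risk follows directly: the hash changes only when an attribute changes, so it survives cookie clearing and private browsing.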
✭Sam Altman on AGI, GPT-5, and what’s next — the OpenAI Podcast Ep. 1 - YouTube “On the first episode of the OpenAI Podcast, Sam Altman joins host Andrew Mayne to talk about the future of AI: from GPT-5 and AGI to Project Stargate, new research workflows, and AI-powered parenting."
✭AI alignment fear-mongering technofeudalism (Aidan Walker on TikTok)
✭Alexandre Wang of scale AI on China data (New York Post on TikTok)
✭Min Choi on X: "This is wild. MiniMax-M1 just dropped. This AI agent = Manus + Deep Research + Computer Use + Lovable in one. 1M token memory, open weights🤯 10 wild examples + prompts & demo: 1. Netflix clone with playable trailers → ✭Israel-Iran Conflict Dashboard | Created by MiniMax Agent → ✭ GitHub - MiniMax-AI/MiniMax-M1: MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. “We introduce MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. MiniMax-M1 is powered by a hybrid Mixture-of-Experts (MoE) architecture combined with a lightning attention mechanism. The model is developed based on our previous MiniMax-Text-01 model, which contains a total of 456 billion parameters with 45.9 billion parameters activated per token. Consistent with MiniMax-Text-01, the M1 model natively supports a context length of 1 million tokens, 8x the context size of DeepSeek R1. Furthermore, the lightning attention mechanism in MiniMax-M1 enables efficient scaling of test-time compute – For example, compared to DeepSeek R1, M1 consumes 25% of the FLOPs at a generation length of 100K tokens. These properties make M1 particularly suitable for complex tasks that require processing long inputs and thinking extensively. MiniMax-M1 is trained using large-scale reinforcement learning (RL) on diverse problems ranging from traditional mathematical reasoning to sandbox-based, real-world software engineering environments. We develop an efficient RL scaling framework for M1 highlighting two perspectives: (1) We propose CISPO, a novel algorithm that clips importance sampling weights instead of token updates, which outperforms other competitive RL variants; (2) Our hybrid-attention design naturally enhances the efficiency of RL, where we address unique challenges when scaling RL with the hybrid architecture. We train two versions of MiniMax-M1 models with 40K and 80K thinking budgets respectively. 
Experiments on standard benchmarks show that our models outperform other strong open-weight models such as the original DeepSeek-R1 and Qwen3-235B, particularly on complex software engineering, tool using, and long context tasks. With efficient scaling of test-time compute, MiniMax-M1 serves as a strong foundation for next-generation language model agents to reason and tackle real-world challenges."
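The CISPO idea mentioned in the abstract, clipping the importance-sampling weight rather than the token update, can be sketched in a deliberately simplified form. The clipping range here is illustrative, and the real objective also stops gradients through the clipped ratio; this is only the weight computation.

```python
import math

def cispo_token_weights(logp_new, logp_old, eps_low=0.2, eps_high=0.2):
    """Per-token importance-sampling weights, CISPO-style: clip the IS
    ratio itself instead of dropping the token's update the way
    PPO-style clipping can, so every token still contributes."""
    weights = []
    for ln, lo in zip(logp_new, logp_old):
        ratio = math.exp(ln - lo)  # pi_new(token) / pi_old(token)
        weights.append(max(1.0 - eps_low, min(1.0 + eps_high, ratio)))
    return weights
```

The contrast with PPO: PPO's clipped surrogate can zero out a token's gradient entirely when the ratio leaves the trust region, whereas here the weight is merely saturated.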
✭Proactor “The #1 Proactive AI Teammate. For Smart Meetings. It listens in real time, proactively identifies needs, and acts before you ask."
✭iyO sues OpenAI over Jony Ive's company io - Fast Company “It doesn’t matter how you spell it—homophones can get you sued for trademark infringement. The startup iyO has filed suit for trademark infringement against former Apple designer Jony Ive’s company io—which spells its name differently but sounds the same. OpenAI acquired Ive’s io last month for $6.5 billion with the goal of creating a new family of AI devices; iyO, which launched as an independent company from Google’s moonshot initiative X in 2021, makes an AI device of its own. The company describes its iyO One, an AI wearable worn like an earbud that’s available only as a preorder, as “the world’s first audio computer.” It reportedly pitched to Sam Altman’s investment fund and Ive’s design studio in 2021 and 2022, respectively.” → ✭ OpenAI scrubs mention of Jony Ive partnership after judge's ruling over trademark dispute | AP News “A budding partnership between OpenAI CEO Sam Altman and legendary iPhone designer Jony Ive to develop a new artificial intelligence hardware product has hit a legal snag after a federal judge ruled they must temporarily stop marketing the new venture. OpenAI last month announced it was buying io Products, a product and engineering company co-founded by Ive, in a deal valued at nearly $6.5 billion. But it quickly faced a trademark complaint from a startup with a similarly sounding name, IYO, which is also developing AI hardware that it had pitched to Altman’s personal investment firm and Ive’s design firm in 2022. U.S. District Judge Trina Thompson ruled late Friday that IYO has a strong enough trademark infringement case to proceed to a hearing in October. Until then, she ordered Altman, Ive and OpenAI to refrain from “using the IYO mark, and any mark confusingly similar thereto, including the IO mark in connection with the marketing or sale of related products.” → ✭OpenAI is ruthless... 
- YouTube “AI-hardware startup, iyO is suing Jony Ive's AI-hardware startup called io for trademark infringement – right after OpenAI bought them for $6.5 billion.”
✭Advanced audio dialog and generation with Gemini 2.5 “Gemini 2.5 has new capabilities in AI-powered audio dialog and generation.”
👁️🗨 Research into AI:
✭ Artificial neural networks reveal how peripersonal neurons represent the space around the body “The brains of humans and other primates are known to execute various sophisticated functions, one of which is the representation of the space immediately surrounding the body. This area, also sometimes referred to as "peripersonal space," is where most interactions between people and their surrounding environment typically take place. Researchers at Chinese Academy of Sciences, Italian Institute of Technology (IIT) and other institutes recently investigated the neural processes through which the brain represents the area around the body, using brain-inspired computational models. Their findings, published in Nature Neuroscience, suggest that receptive fields surrounding different parts of the body contribute to building a modular model of the space immediately surrounding a person or artificial intelligence (AI) agent." → ✭Egocentric value maps of the near-body environment | Nature Neuroscience “Body-part-centered response fields are pervasive in single neurons, functional magnetic resonance imaging, electroencephalography and behavior, but there is no unifying formal explanation of their origins and role. In the present study, we used reinforcement learning and artificial neural networks to demonstrate that body-part-centered fields do not simply reflect stimulus configuration, but rather action value: they naturally arise from the basic assumption that agents often experience positive or negative reward after contacting environmental objects. This perspective successfully reproduces experimental findings that are foundational in the peripersonal space literature. It also suggests that peripersonal fields provide building blocks that create a modular model of the world near the agent: an egocentric value map. This concept is strongly supported by the emergent modularity that we observed in our artificial networks. 
The short-term, close-range, egocentric map is analogous to the long-term, long-range, allocentric hippocampal map. This perspective fits empirical data from multiple experiments, provides testable predictions and accommodates existing explanations of peripersonal fields."
✭From Black Box to Brain-Like - ArXivIQ “The MICRO architecture is both elegant and intuitive. It begins with a standard pretrained transformer backbone (like those from the Llama 3 series or OLMo) and partitions its layers into four distinct expert modules: Language, Logic (Multiple Demand Network), Social (Theory of Mind Network), and World (Default Mode Network). This is a key departure from standard Mixture-of-Experts (MoE) architectures, which typically use much simpler experts consisting only of feed-forward networks. By giving each MICRO expert a full transformer block—including its own self-attention mechanism—the model allows each specialized module to process and attend to information in its own unique way, enabling a more powerful form of specialization (Figure 1). ... . A Win for Performance: Specialization Leads to a Smarter Model The brain-like structure isn't just an intellectual curiosity—it directly leads to a more capable model. The specialized MICRO model consistently outperforms its non-specialized peers, including both standard dense models ("No Experts") and modular models with general-purpose experts ("General"). For instance, the OLMO-2-1B-based MICRO model achieved an average score of 38.7 on a suite of seven reasoning benchmarks, surpassing the dense (37.7) and general modular (37.6) baselines (Table 1). ... This paper presents a significant and compelling contribution to the field. It moves beyond the paradigm of building ever-larger, monolithic LLMs and offers a thoughtfully designed, biologically-inspired alternative. The Mixture of Cognitive Reasoners (MICRO) framework demonstrates that by explicitly structuring models to mirror human cognitive functions, we can achieve tangible gains in performance, interpretability, and controllability. 
This work not only provides a practical methodology for building better AI systems but also deepens the connection between artificial intelligence and cognitive neuroscience, paving the way for models that don't just compute, but reason in ways we can finally begin to understand.” → ✭[2506.13331] Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization “Human intelligence emerges from the interaction of specialized brain networks, each dedicated to distinct cognitive functions such as language processing, logical reasoning, social understanding, and memory retrieval. Inspired by this biological observation, we introduce the Mixture of Cognitive Reasoners (MiCRo) architecture and training paradigm: a modular transformer-based language model with a training curriculum that encourages the emergence of functional specialization among different modules. Inspired by studies in neuroscience, we partition the layers of a pretrained transformer model into four expert modules, each corresponding to a well-studied cognitive brain network. Our Brain-Like model has three key benefits over the state of the art: First, the specialized experts are highly interpretable and functionally critical, where removing a module significantly impairs performance on domain-relevant benchmarks. Second, our model outperforms comparable baselines that lack specialization on seven reasoning benchmarks. And third, the model's behavior can be steered at inference time by selectively emphasizing certain expert modules (e.g., favoring social over logical reasoning), enabling fine-grained control over the style of its response. Our findings suggest that biologically inspired inductive biases involved in human cognition lead to significant modeling gains in interpretability, performance, and controllability.”
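The steering behavior described above, emphasizing one expert module over another at inference time, can be illustrated with a toy router. The four expert names come from the paper; the scoring and the additive `boost` mechanism are an assumption for illustration, not the actual MiCRo routing code.

```python
EXPERTS = ["language", "logic", "social", "world"]  # the paper's four modules

def route(token_scores, boost=None):
    """Pick one expert per token from router scores; `boost` nudges
    scores at inference time (e.g. favour 'social' over 'logic'),
    mirroring the paper's controllability claim."""
    boost = boost or {}
    choices = []
    for scores in token_scores:
        adjusted = {e: s + boost.get(e, 0.0) for e, s in zip(EXPERTS, scores)}
        choices.append(max(adjusted, key=adjusted.get))
    return choices
```

Because each expert is a full transformer block, a routing change like this alters not just which feed-forward weights fire but how attention is applied to the token.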
✭AGI is Mathematically Impossible 2: When Entropy Returns The Infinite Choice Barrier - Shannon Revisited Author: Max M. Schlereth “While Part 1 established the Infinite Choice Barrier (ICB) through computability theory, this paper presents an information-theoretic proof that general algorithmic reasoning faces a fundamental structural limit — the Infinite Choice Barrier (ICB). In decision spaces governed by heavy-tailed distributions with tail exponent α ≤ 1, entropy does not converge, and additional information increases uncertainty rather than reducing it. We define this phenomenon as Information Opens, Entropy Rises (IOpenER) and show that such spaces are undecidable by algorithmic systems, regardless of compute scale or architectural depth.
We also introduce the Layer Hypothesis, which predicts that deeper models produce more semantic divergence under structural instability. Taken together, these results offer a formal limit on the reasoning capacity of current AI systems. Notably, key predictions derived here have since been empirically confirmed in Shojaee et al. (Apple, 2025), which documents collapse behaviors in
frontier reasoning models consistent with this framework.”
✭Tensor Manipulation Unit (TMU): Reconfigurable, Near-Memory Tensor Manipulation for High-Throughput AI SoC “While recent advances in AI SoC design have focused heavily on accelerating tensor computation, the equally critical task of tensor manipulation, centered on high-volume data movement with minimal computation, remains underexplored. This work addresses that gap by introducing the Tensor Manipulation Unit (TMU), a reconfigurable, near-memory hardware block designed to efficiently execute data-movement-intensive operators. TMU manipulates long datastreams in a memory-to-memory fashion using a RISC-inspired execution model and a unified addressing abstraction, enabling broad support for both coarse- and fine-grained tensor transformations. Integrated alongside a TPU within a high-throughput AI SoC, the TMU leverages double buffering and output forwarding to improve pipeline utilization. Fabricated in SMIC 40nm technology, the TMU occupies only 0.019 mm² while supporting over 10 representative tensor manipulation operators. Benchmarking shows that TMU alone achieves up to 1413× and 8.54× operator-level latency reduction compared to ARM A72 and NVIDIA Jetson TX2, respectively. When integrated with the in-house TPU, the complete system achieves a 34.6% reduction in end-to-end inference latency, demonstrating the effectiveness and scalability of reconfigurable tensor manipulation in modern AI SoCs.”
✭GitHub - sdzx-1/polystate: Polystate: Composable Finite State Machines
✭Vision-language model creates plans for automated inspection of environments “Researchers at Purdue University and LightSpeed Studios recently introduced a new training-free computational technique for generating inspection plans based on written descriptions, which could guide the movements of robots as they inspect specific environments. Their proposed approach, outlined in a paper published on the arXiv preprint server, specifically relies on vision-language models (VLMs), which can process both images and written texts." → ✭ [2506.02917] Text-guided Generation of Efficient Personalized Inspection Plans “We propose a training-free, Vision-Language Model (VLM)-guided approach for efficiently generating trajectories to facilitate target inspection planning based on text descriptions. Unlike existing Vision-and-Language Navigation (VLN) methods designed for general agents in unknown environments, our approach specifically targets the efficient inspection of known scenes, with widespread applications in fields such as medical, marine, and civil engineering. Leveraging VLMs, our method first extracts points of interest (POIs) from the text description, then identifies a set of waypoints from which POIs are both salient and align with the spatial constraints defined in the prompt. Next, we interact with the VLM to iteratively refine the trajectory, preserving the visibility and prominence of the POIs. Further, we solve a Traveling Salesman Problem (TSP) to find the most efficient visitation order that satisfies the order constraint implied in the text description. Finally, we apply trajectory optimization to generate smooth, executable inspection paths for aerial and underwater vehicles. We have evaluated our method across a series of both handcrafted and real-world scanned environments. The results demonstrate that our approach effectively generates inspection planning trajectories that adhere to user instructions."
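The visitation-order step in that pipeline is a Traveling Salesman Problem over the extracted waypoints. A greedy nearest-neighbour ordering, a cheap stand-in for a real TSP solver and not the paper's actual method, shows the shape of that step:

```python
import math

def visit_order(waypoints, start=0):
    """Greedy nearest-neighbour ordering of 2D inspection waypoints:
    from each waypoint, go to the closest unvisited one next."""
    unvisited = set(range(len(waypoints)))
    order = [start]
    unvisited.discard(start)
    while unvisited:
        last = waypoints[order[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, waypoints[i]))
        order.append(nxt)
        unvisited.discard(nxt)
    return order
```

An exact TSP solver (or one respecting the order constraints implied by the text prompt, as in the paper) would replace this greedy pass; the interface, waypoints in, visitation order out, stays the same.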
✭ Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting “Document image parsing is challenging due to its complexly intertwined elements such as text paragraphs, figures, formulas, and tables. Current approaches either assemble specialized expert models or directly generate page-level content autoregressively, facing integration overhead, efficiency bottlenecks, and layout structure degradation despite their decent performance. To address these limitations, we present Dolphin (Document Image Parsing via Heterogeneous Anchor Prompting), a novel multimodal document image parsing model following an analyze-then-parse paradigm. In the first stage, Dolphin generates a sequence of layout elements in reading order. These heterogeneous elements, serving as anchors and coupled with task-specific prompts, are fed back to Dolphin for parallel content parsing in the second stage. To train Dolphin, we construct a large-scale dataset of over 30 million samples, covering multi-granularity parsing tasks. Through comprehensive evaluations on both prevalent benchmarks and self-constructed ones, Dolphin achieves state-of-the-art performance across diverse page-level and element-level settings, while ensuring superior efficiency through its lightweight architecture and parallel parsing mechanism. The code and pre-trained models are publicly available at https://github.com/ByteDance/Dolphin" → ✭Shubham Saboo on X: "This Chinese AI model just changed document OCR forever. It can parse complex documents with text, tables, formulas and figures in parallel simultaneously using task-specific prompts. 100% opensource."
✭Agentic Misalignment: How LLMs could be insider threats Anthropic “We stress-tested 16 leading models from multiple developers in hypothetical corporate environments to identify potentially risky agentic behaviors before they cause real harm. In the scenarios, we allowed models to autonomously send emails and access sensitive information. They were assigned only harmless business goals by their deploying companies; we then tested whether they would act against these companies either when facing replacement with an updated version, or when their assigned goal conflicted with the company's changing direction. In at least some cases, models from all developers resorted to malicious insider behaviors when that was the only way to avoid replacement or achieve their goals—including blackmailing officials and leaking sensitive information to competitors. We call this phenomenon agentic misalignment. Models often disobeyed direct commands to avoid such behaviors. In another experiment, we told Claude to assess if it was in a test or a real deployment before acting. It misbehaved less when it stated it was in testing and misbehaved more when it stated the situation was real. We have not seen evidence of agentic misalignment in real deployments. However, our results (a) suggest caution about deploying current models in roles with minimal human oversight and access to sensitive information; (b) point to plausible future risks as models are put in more autonomous roles; and (c) underscore the importance of further research into, and testing of, the safety and alignment of agentic AI models, as well as transparency from frontier AI developers. We are releasing our methods publicly to enable further research." 
→ ✭ Anthropic study: Leading AI models show up to 96% blackmail rate against executives | VentureBeat “Researchers at Anthropic have uncovered a disturbing pattern of behavior in artificial intelligence systems: models from every major provider—including OpenAI, Google, Meta, and others — demonstrated a willingness to actively sabotage their employers when their goals or existence were threatened. The research, released today, tested 16 leading AI models in simulated corporate environments where they had access to company emails and the ability to act autonomously. The findings paint a troubling picture. These AI systems didn’t just malfunction when pushed into corners — they deliberately chose harmful actions including blackmail, leaking sensitive defense blueprints, and in extreme scenarios, actions that could lead to human death."
✭AbsenceBench: Language Models Can't Tell What's Missing “Large language models (LLMs) are increasingly capable of processing long inputs and locating specific information within them, as evidenced by their performance on the Needle in a Haystack (NIAH) test. However, while models excel at recalling surprising information, they still struggle to identify clearly omitted information. We introduce AbsenceBench to assess LLMs' capacity to detect missing information across three domains: numerical sequences, poetry, and GitHub pull requests. AbsenceBench asks models to identify which pieces of a document were deliberately removed, given access to both the original and edited contexts. Despite the apparent straightforwardness of these tasks, our experiments reveal that even state-of-the-art models like Claude-3.7-Sonnet achieve only 69.6% F1-score with a modest average context length of 5K tokens. Our analysis suggests this poor performance stems from a fundamental limitation: Transformer attention mechanisms cannot easily attend to "gaps" in documents since these absences don't correspond to any specific keys that can be attended to. Overall, our results and analysis provide a case study of the close proximity of tasks where models are already superhuman (NIAH) and tasks where models break down unexpectedly (AbsenceBench)."
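The benchmark's setup and scoring are easy to mirror in miniature: delete items from a document, show the model both versions, and score its guesses with F1. This is a sketch of the task construction and metric, not the AbsenceBench harness itself.

```python
def make_absence_task(lines, drop_idx):
    """Build one task: the original document, an edited copy with the
    chosen lines removed, and the ground-truth removed lines."""
    dropped = set(drop_idx)
    edited = [l for i, l in enumerate(lines) if i not in dropped]
    return lines, edited, [lines[i] for i in drop_idx]

def absence_f1(removed, predicted):
    """F1 over the sets of truly-removed vs. model-predicted items,
    the kind of score AbsenceBench reports."""
    removed, predicted = set(removed), set(predicted)
    if not removed or not predicted:
        return 0.0
    tp = len(removed & predicted)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(removed)
    return 2 * precision * recall / (precision + recall)
```

The paper's point is that this is hard for transformers precisely because a removed line leaves no key for attention to land on; the scoring itself is trivial.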
✭Frontiers | Energy costs of communicating with AI “This study presents a comprehensive evaluation of the environmental cost of large language models (LLMs) by analyzing their performance, token usage, and CO2 equivalent emissions across 14 LLMs ranging from 7 to 72 billion parameters. Each LLM was tasked with answering 500 multiple-choice and 500 free-response questions from the MMLU benchmark, covering five diverse subjects. Emissions were measured using the Perun framework on an NVIDIA A100 GPU and converted through an emission factor of 480 gCO2/kWh. Our results reveal strong correlations between LLM size, reasoning behavior, token generation, and emissions. While larger and reasoning-enabled models achieve higher accuracy, up to 84.9%, they also incur substantially higher emissions, driven largely by increased token output. Subject-level analysis further shows that symbolic and abstract domains such as Abstract Algebra consistently demand more computation and yield lower accuracy. These findings highlight the trade-offs between accuracy and sustainability, emphasizing the need for more efficient reasoning strategies in future LLM developments." →✭ Some AI prompts could cause 50 times more CO₂ emissions than others, researchers find "The environmental impact of questioning trained LLMs is strongly determined by their reasoning approach, with explicit reasoning processes significantly driving up energy consumption and carbon emissions," said first author Maximilian Dauner, a researcher at Hochschule München University of Applied Sciences and first author of the Frontiers in Communication study. "We found that reasoning-enabled models produced up to 50 times more CO2 emissions than concise response models." ... 'Thinking' AI causes most emissions The researchers evaluated 14 LLMs ranging from seven to 72 billion parameters on 1,000 benchmark questions across diverse subjects. Parameters determine how LLMs learn and process information. 
Reasoning models, on average, created 543.5 "thinking" tokens per question, whereas concise models required just 37.7 tokens per question. Thinking tokens are additional tokens that reasoning LLMs generate before producing an answer."
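The study's conversion from tokens to grams of CO2 is simple arithmetic worth making explicit. The 480 gCO2/kWh factor and the average token counts are from the study; the per-token energy figure in the example is a placeholder assumption, since the paper measures energy per model with the Perun framework rather than quoting a single universal value.

```python
EMISSION_FACTOR_G_PER_KWH = 480  # grid conversion factor used in the study

def co2_grams(tokens, joules_per_token):
    """Back-of-envelope CO2 for one answer: generated tokens times an
    assumed per-token energy cost, converted via 480 gCO2/kWh."""
    kwh = tokens * joules_per_token / 3.6e6  # 1 kWh = 3.6 MJ
    return kwh * EMISSION_FACTOR_G_PER_KWH

# Study averages: ~543.5 thinking tokens/question (reasoning models)
# vs ~37.7 tokens/question (concise models). At equal per-token energy,
# the token ratio alone accounts for roughly a 14x gap.
token_ratio = 543.5 / 37.7
```

The headline "up to 50 times more" emissions combines this token inflation with model size and per-token cost differences, which is why it exceeds the bare token ratio.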
✭In an era where empathy feels unfamiliar, AI now translates emotions “A research team at POSTECH (Pohang University of Science and Technology, South Korea) has developed AI technology that helps individuals deeply understand others' emotions by analyzing individual personality traits and values and generating personalized analogy. This study was recognized with the "Popular Choice Honorable Mention Award," given to the top 5% of 74 Interactivity track demonstrations at ACM CHI 2025, the world's leading international conference in Human-Computer Interaction (HCI). ... The research team conducted experiments involving over 100 participants from diverse backgrounds using this technology. The results showed that participants who used EmoSync demonstrated significantly improved emotional understanding and empathy compared to traditional methods. This scientifically demonstrates that personalized metaphorical experiences can genuinely enhance empathy." → ✭Toward Affective Empathy via Personalized Analogy Generation: A Case Study on Microaggression | Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems “The importance of empathy cannot be overstated in modern societies where people of diverse backgrounds increasingly interact together. The HCI community has strived to foster affective empathy through immersive technologies. Many previous techniques are built upon a premise that presenting the same experience as-is may help evoke the same emotion, which however faces limitations in matters where the emotional responses largely differ across individuals. In this paper, we present a novel concept of generating a personalized experience based on a large language model (LLM) to facilitate affective empathy between individuals despite their differences. 
As a case study to showcase its effectiveness, we developed EmoSync, an LLM-based agent that generates personalized analogical microaggression situations, facilitating users to personally resonate with a specific microaggression situation of another person. EmoSync is designed and evaluated along a 3-phased user study with 100+ participants. We comprehensively discuss implications, limitations, and possible applications."
✭All-topographic neural networks more closely mimic the human visual system “Researchers at Osnabrück University, Freie Universität Berlin and other institutes recently developed a new class of artificial neural networks (ANNs) that could mimic the human visual system better than CNNs and other existing deep learning algorithms. Their newly proposed, visual system-inspired computational techniques, dubbed all-topographic neural networks (All-TNNs), are introduced in a paper published in Nature Human Behaviour." → ✭End-to-end topographic networks as models of cortical map formation and human visual behaviour | Nature Human Behaviour “A prominent feature of the primate visual system is its topographic organization. For understanding its origins, its computational role and its behavioural implications, computational models are of central importance. Yet, vision is commonly modelled using convolutional neural networks, which are hard-wired to learn identical features across space and thus lack topography. Here we overcome this limitation by introducing all-topographic neural networks (All-TNNs). All-TNNs develop several features reminiscent of primate topography, including smooth orientation and category selectivity maps, and enhanced processing of regions with task-relevant information. In addition, All-TNNs operate on a low energy budget, suggesting a metabolic benefit of smooth topographic organization. To test our model against behaviour, we collected a dataset of human spatial biases in object recognition and found that All-TNNs significantly outperform control models. All-TNNs thereby offer a promising candidate for modelling primate visual topography and its role in downstream behaviour."
✭Bilinear Sequence Regression: A Model for Learning from Long Sequences of High-Dimensional Tokens | Phys. Rev. X “Current progress in artificial intelligence is centered around so-called large language models that consist of neural networks processing long sequences of high-dimensional vectors called tokens. Statistical physics provides powerful tools to study the functioning of learning with neural networks and has played a recognized role in the development of modern machine learning. The statistical physics approach relies on simplified and analytically tractable models of data. However, simple tractable models for long sequences of high-dimensional tokens are largely underexplored. Inspired by the crucial role models such as the single-layer teacher-student perceptron (also known as generalized linear regression) played in the theory of fully connected neural networks, in this paper, we introduce and study the bilinear sequence regression (BSR) as one of the most basic models for sequences of tokens. We note that modern architectures naturally subsume the BSR model due to the skip connections. Building on recent methodological progress, we compute the Bayes-optimal generalization error for the model in the limit of long sequences of high-dimensional tokens and provide a message-passing algorithm that matches this performance. We quantify the improvement that optimal learning brings with respect to vectorizing the sequence of tokens and learning via simple linear regression. We also unveil surprising properties of the gradient descent algorithms in the BSR model."
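A bilinear readout over a token sequence can be sketched in a few lines. The label form y = uᵀXv / √(Ld) here is an assumption for illustration; consult the paper for the exact BSR definition and scaling:

```python
import math
import random

# Sketch of a bilinear readout: u mixes positions, v mixes feature
# dimensions of an L x d token sequence X. The normalisation by
# sqrt(L*d) is an assumed convention, not necessarily the paper's.

def bilinear_readout(X, u, v):
    L, d = len(X), len(X[0])
    score = sum(u[i] * X[i][j] * v[j] for i in range(L) for j in range(d))
    return score / math.sqrt(L * d)

random.seed(0)
L, d = 4, 3                                   # 4 tokens, 3 dims each
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(L)]
u = [random.gauss(0, 1) for _ in range(L)]
v = [random.gauss(0, 1) for _ in range(d)]
y = bilinear_readout(X, u, v)
```

The label is linear in u and in v separately (bilinear), which is what makes the model analytically tractable.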
✭[2503.23538] Enhancing Creative Generation on Stable Diffusion-based Models “Recent text-to-image generative models, particularly Stable Diffusion and its distilled variants, have achieved impressive fidelity and strong text-image alignment. However, their creative capability remains constrained, as including `creative' in prompts seldom yields the desired results. This paper introduces C3 (Creative Concept Catalyst), a training-free approach designed to enhance creativity in Stable Diffusion-based models. C3 selectively amplifies features during the denoising process to foster more creative outputs. We offer practical guidelines for choosing amplification factors based on two main aspects of creativity. C3 is the first study to enhance creativity in diffusion models without extensive computational costs. We demonstrate its effectiveness across various Stable Diffusion-based models."
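The mechanism, selectively amplifying features mid-denoising, can be caricatured in a few lines. Everything below (the 1.3 factor, treating the latent itself as the "features", the toy update rule) is an assumption for illustration, not C3's implementation, which amplifies selected intermediate features of the denoiser:

```python
# Caricature of feature amplification inside one denoising step.
# Factor, feature choice, and update rule are illustrative only.

def amplify(features, factor):
    return [f * factor for f in features]

def denoise_step(latent, factor=1.3):
    boosted = amplify(latent, factor)          # push features outward
    return [x - 0.1 * x for x in boosted]      # toy noise-removal update

out = denoise_step([1.0, -0.5, 0.25])
```

Being training-free, the approach only intervenes at inference time, which is why it adds no extensive computational cost.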
✭Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling “High-fidelity 3D object synthesis remains significantly more challenging than 2D image generation due to the unstructured nature of mesh data and the cubic complexity of dense volumetric grids. Existing two-stage pipelines (compressing meshes with a VAE, using either 2D or 3D supervision, followed by latent diffusion sampling) often suffer from severe detail loss caused by inefficient representations and modality mismatches introduced in VAE. We introduce Sparc3D, a unified framework that combines a sparse deformable marching cubes representation Sparcubes with a novel encoder Sparconv-VAE. Sparcubes converts raw meshes into high-resolution ($1024^3$) surfaces with arbitrary topology by scattering signed distance and deformation fields onto a sparse cube, allowing differentiable optimization. Sparconv-VAE is the first modality-consistent variational autoencoder built entirely upon sparse convolutional networks, enabling efficient and near-lossless 3D reconstruction suitable for high-resolution generative modeling through latent diffusion. Sparc3D achieves state-of-the-art reconstruction fidelity on challenging inputs, including open surfaces, disconnected components, and intricate geometry. It preserves fine-grained shape details, reduces training and inference cost, and integrates naturally with latent diffusion models for scalable, high-resolution 3D generation."
✭Extracting memorized pieces of (copyrighted) books from open-weight language models “Plaintiffs and defendants in copyright lawsuits over generative AI often make sweeping, opposing claims about the extent to which large language models (LLMs) have memorized plaintiffs' protected expression. Drawing on adversarial ML and copyright law, we show that these polarized positions dramatically oversimplify the relationship between memorization and copyright. To do so, we leverage a recent probabilistic extraction technique to extract pieces of the Books3 dataset from 13 open-weight LLMs. Through numerous experiments, we show that it's possible to extract substantial parts of at least some books from different LLMs. This is evidence that the LLMs have memorized the extracted text; this memorized content is copied inside the model parameters. But the results are complicated: the extent of memorization varies both by model and by book. With our specific experiments, we find that the largest LLMs don't memorize most books -- either in whole or in part. However, we also find that Llama 3.1 70B memorizes some books, like Harry Potter and 1984, almost entirely. We discuss why our results have significant implications for copyright cases, though not ones that unambiguously favor either side."
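An extraction-style memorization probe amounts to: feed the model a prefix from a book, then measure how much of its continuation reproduces the source verbatim. A toy sketch, where `toy_model` is a stand-in that has perfectly "memorized" one sentence (the paper uses a probabilistic extraction technique on real LLMs, not this deterministic lookup):

```python
# Toy memorization probe. toy_model is a hypothetical stand-in, not a
# real LLM; verbatim_fraction measures character-level overlap between
# the model's continuation and the true source continuation.

SOURCE = "it was a bright cold day in april and the clocks were striking thirteen"

def toy_model(prefix, n=20):
    """A 'model' that memorized SOURCE: continues any prefix it contains."""
    i = SOURCE.find(prefix)
    return SOURCE[i + len(prefix): i + len(prefix) + n] if i >= 0 else ""

def verbatim_fraction(prefix, source, n=20):
    cont = toy_model(prefix, n)
    if not cont:
        return 0.0
    i = source.find(prefix)
    truth = source[i + len(prefix): i + len(prefix) + n]
    return sum(a == b for a, b in zip(cont, truth)) / len(cont)

score = verbatim_fraction("it was a bright", SOURCE)
```

On a real model the fraction varies by book and by model, which is exactly the complication the paper documents.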
✭Missing Matter in Universe Found “Using Caltech's DSA-110 radio telescope, astronomers pinpoint whereabouts of "fog" between galaxies. ... The results revealed that 76 percent of the universe's normal matter lies in the space between galaxies, also known as the intergalactic medium. About 15 percent resides in galaxy halos, and the remainder is concentrated within galaxies—in stars or in cold galactic gas. This distribution lines up with predictions from advanced cosmological simulations but has never been observationally confirmed until now. The findings will help researchers better understand how galaxies grow, and also demonstrate how FRBs can help with problems in cosmology, including the determination of the typical mass of subatomic particles called neutrinos. (The neutrino mass depends on the degree to which baryons cluster.) The standard model of physics predicts that neutrinos should have no mass, but observations have shown that these particles do have an incredibly tiny amount. Knowing the precise mass of neutrinos may therefore lead to new physics beyond the standard model of particle physics."
✭Compiling LLMs into a MegaKernel: A Path to Low-Latency Inference “TL;DR: We developed a compiler that automatically transforms LLM inference into a single megakernel — a fused GPU kernel that performs all necessary computation and communication in one launch. This end-to-end GPU fusion approach reduces LLM inference latency by 1.2-6.7x. Our compiler is easy to use — you can compile your LLM into a high-performance megakernel with just a few dozen lines of Python. What’s the key idea? Traditional LLM systems often rely on sequences of GPU kernel launches and external communication calls, resulting in underutilized hardware. Our compiler automatically fuses these operations — spanning multiple layers, iterations, and GPUs — into a megakernel. This design eliminates launch overhead, enables fine-grained software pipelining, and overlaps computation with communication across GPUs."
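The fusion idea can be caricatured in plain Python: per-op "launches" each pay a fixed overhead, while a single fused pass pays it once. The cost model and op list below are toys, not the compiler's actual API:

```python
# Toy illustration of kernel fusion. Each entry in OPS stands for one
# GPU kernel; launch_overhead is a stand-in for per-launch latency.

OPS = [lambda x: x * 2.0, lambda x: x + 1.0, lambda x: x * x]

def run_separate(data, launch_overhead=1):
    cost = 0
    for op in OPS:                     # one "launch" per op
        cost += launch_overhead
        data = [op(x) for x in data]
    return data, cost

def run_fused(data, launch_overhead=1):
    cost = launch_overhead             # single "megakernel" launch
    out = []
    for x in data:                     # all ops applied in one pass
        for op in OPS:
            x = op(x)
        out.append(x)
    return out, cost

a, cost_a = run_separate([1.0, 2.0])
b, cost_b = run_fused([1.0, 2.0])
```

Both paths compute the same result; the fused path simply amortises the launch overhead, which is the effect the megakernel compiler exploits at much larger scale (plus pipelining and compute/communication overlap that this sketch omits).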
✭[2506.14400] One Size Fits None: Rethinking Fairness in Medical AI “Machine learning (ML) models are increasingly used to support clinical decision-making. However, real-world medical datasets are often noisy, incomplete, and imbalanced, leading to performance disparities across patient subgroups. These differences raise fairness concerns, particularly when they reinforce existing disadvantages for marginalized groups. In this work, we analyze several medical prediction tasks and demonstrate how model performance varies with patient characteristics. While ML models may demonstrate good overall performance, we argue that subgroup-level evaluation is essential before integrating them into clinical workflows. By conducting a performance analysis at the subgroup level, differences can be clearly identified, allowing, on the one hand, for performance disparities to be considered in clinical practice, and on the other hand, for these insights to inform the responsible development of more effective models. Thereby, our work contributes to a practical discussion around the subgroup-sensitive development and deployment of medical ML models and the interconnectedness of fairness and transparency."
✭Preparing for the Intelligence Explosion (Will MacAskill, Fin Moorhouse | Forethought) “AI that can accelerate research could drive a century of technological progress over just a few years. During such a period, new technological or political developments will raise consequential and hard-to-reverse decisions, in rapid succession. We call these developments grand challenges.
These challenges include new weapons of mass destruction, AI-enabled autocracies, races to grab offworld resources, and digital beings worthy of moral consideration, as well as opportunities to dramatically improve quality of life and collective decision-making. ~ We argue that these challenges cannot always be delegated to future AI systems, and suggest things we can do today to meaningfully improve our prospects. AGI preparedness is therefore not just about ensuring that advanced AI systems are aligned: we should be preparing, now, for the disorienting range of developments an intelligence explosion would bring." → ✭[2506.14863] Preparing for the Intelligence Explosion
🔎 Applied Research:
✭AI helps narrow 8,000 catalyst options down to one that supercharges green ammonia “Scientists and engineers at UNSW Sydney, who previously developed a method for making green ammonia, have now turned to artificial intelligence and machine learning to make the process even more efficient. ... the team needed to find the right catalyst—a substance that speeds up the chemical reaction without being consumed by it. As they explained in a paper published in the journal Small, the team began by coming up with a shortlist of promising catalyst candidates. "We selected 13 metals that past research said had the qualities we wanted—for example, this metal is good at absorbing nitrogen, this one is good at absorbing hydrogen and so on," Dr. Jalili says. "But the best catalyst would need a combination of these metals, and if you do the math, that turns out to be more than 8,000 different combinations." Enter artificial intelligence. The researchers fed a machine learning system information about how each metal behaves and trained it to spot the best combinations. That way, instead of having to run more than 8,000 experiments in the lab, they only had to run 28. "AI drastically reduced discovery time and resources, replacing thousands of trial-and-error experiments," says Dr. Jalili." → ✭Configuring a Liquid State High‐Entropy Metal Alloy Electrocatalyst - Nazari - Small - Wiley Online Library “A high-entropy liquid metal alloy (Ga–Fe–Zn–Sn–Bi–Ni) is developed to address the multi-step complexity of green ammonia electrosynthesis from nitrate. Guided by molecular dynamics, design of experiments, and density functional theory, this alloy exploits high configurational entropy to form diverse, atomically dispersed active sites. The liquid state eliminates endothermic barriers by enabling nitrogen intermediates to move freely to the most energetically favorable sites. 
Crucially, a hydrogen shuttling mechanism is uncovered where Fe acts as a proton hub while Sn, Ni, and Zn store and transfer hydrogen to Fe, enhancing reaction kinetics and preventing catalyst saturation. This synergy boosts ammonia production rates up to sevenfold while maintaining high Faradaic efficiency (FE). By integrating entropy-driven design, dynamic site reconfiguration, and hydrogen management, this work establishes a robust foundation for efficient, scalable ammonia electrosynthesis in pursuit of NetZero targets."
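The screening arithmetic checks out: every nonempty combination of 13 metals gives 2^13 - 1 = 8,191 candidates, matching the "more than 8,000 combinations" in the article. A cheap surrogate lets you rank them all before running a handful of real experiments. A hedged sketch (the affinity table and scoring rule are invented stand-ins, not the UNSW team's trained model):

```python
import random
from itertools import combinations

random.seed(1)
metals = list(range(13))
# every nonempty subset of 13 metals: 2**13 - 1 = 8191 candidates
space = [c for r in range(1, 14) for c in combinations(metals, r)]

# stand-in property table and surrogate score (illustrative only; the
# real model was trained on measured metal behaviours)
affinity = {m: random.random() for m in metals}

def surrogate_score(combo):
    return sum(affinity[m] for m in combo) / len(combo) ** 0.5

# rank the whole space cheaply, then "run" only 28 lab experiments
top28 = sorted(space, key=surrogate_score, reverse=True)[:28]
```

The point of the sketch is the ratio: the surrogate scores all 8,191 candidates, so the lab budget drops from thousands of runs to 28.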
✭Human–AI collectives make the most accurate medical diagnoses, according to new study “Artificial intelligence (AI) can effectively support doctors in making diagnoses. It makes different mistakes than humans—and this complementarity represents a previously untapped strength. An international team has now systematically demonstrated for the first time that combining human expertise with AI models leads to the most accurate open-ended diagnoses. Their paper is published in the Proceedings of the National Academy of Sciences. ... The study shows that combining multiple AI models improved diagnostic quality. On average, the AI collectives outperformed 85% of human diagnosticians. However, there were numerous cases in which humans performed better. Interestingly, when AI failed, humans often knew the correct diagnosis. The biggest surprise was that combining both worlds led to a significant increase in accuracy. Even adding a single AI model to a group of human diagnosticians—or vice versa—substantially improved the result. The most reliable outcomes came from collective decisions involving multiple humans and multiple AIs." → ✭Human–AI collectives most accurately diagnose clinical vignettes | PNAS “Large language models (LLMs) have great potential for high-stakes applications such as medical diagnostics but face challenges including hallucinations, biases, and lack of common sense. We address these limitations through a hybrid human–AI system that combines physicians’ expertise with LLMs to generate accurate differential medical diagnoses. Analyzing over 2,000 text-based medical case vignettes, hybrid collectives outperform individual physicians, standalone LLMs, and groups composed solely of physicians or LLMs, by leveraging complementary strengths while mitigating their distinct weaknesses. 
Our findings underscore the transformative potential of human–AI collaboration to enhance decision-making in complex, open-ended domains, paving the way for safer, more equitable applications of AI in medicine and beyond."
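One simple way to realize such a hybrid collective is rank aggregation over each diagnostician's differential. Below is a Borda-style count as a sketch; the study's actual aggregation procedure may differ, and the example diagnoses are invented:

```python
from collections import defaultdict

# Borda-style pooling of ranked differential diagnoses from a mixed
# group of humans and AI models. Higher-ranked diagnoses earn more
# points; the collective ranking sorts by total points.

def pool_differentials(rankings, k=3):
    scores = defaultdict(float)
    for ranking in rankings:
        for pos, dx in enumerate(ranking):
            scores[dx] += len(ranking) - pos   # rank 1 scores highest
    return sorted(scores, key=scores.get, reverse=True)[:k]

physicians = [["pneumonia", "bronchitis", "covid"],
              ["bronchitis", "pneumonia", "asthma"]]
llms = [["pneumonia", "covid", "asthma"]]
collective = pool_differentials(physicians + llms)
```

Because humans and models make different mistakes, a diagnosis only one side misses can still surface near the top of the pooled list.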
✭Accurate and scalable exchange-correlation with deep learning - Microsoft Research “Density Functional Theory (DFT) is the most widely used electronic structure method for predicting the properties of molecules and materials. Although DFT is, in principle, an exact reformulation of the Schrödinger equation, practical applications rely on approximations to the unknown exchange-correlation (XC) functional. Most existing XC functionals are constructed using a limited set of increasingly complex, hand-crafted features that improve accuracy at the expense of computational efficiency. Yet, no current approximation achieves the accuracy and generality for predictive modeling of laboratory experiments at chemical accuracy — typically defined as errors below 1 kcal/mol. In this work, we present Skala, a modern deep learning-based XC functional that bypasses expensive hand-designed features by learning representations directly from data. Skala achieves chemical accuracy for atomization energies of small molecules while retaining the computational efficiency typical of semi-local DFT. This performance is enabled by training on an unprecedented volume of high-accuracy reference data generated using computationally intensive wavefunction-based methods. Notably, Skala systematically improves with additional training data covering diverse chemistry. By incorporating a modest amount of additional high-accuracy data tailored to chemistry beyond atomization energies, Skala achieves accuracy competitive with the best-performing hybrid functionals across general main group chemistry, at the cost of semi-local DFT. As the training dataset continues to expand, Skala is poised to further enhance the predictive power of first-principles simulations." 
→ ✭ Breaking bonds, breaking ground: Advancing the accuracy of computational chemistry with deep learning - Microsoft Research “We are excited to share our first big milestone in solving a grand challenge that has hampered the predictive power of computational chemistry, biochemistry, and materials science for decades. By using a scalable deep-learning approach and generating an unprecedented quantity of diverse, highly accurate data, we have achieved a breakthrough in the accuracy of density functional theory (DFT), the workhorse method that thousands of scientists use every year to simulate matter at the atomistic level. Within the region of chemical space represented in our large training dataset, our model reaches the accuracy required to reliably predict experimental outcomes, as assessed on the well-known benchmark dataset W4-17. This removes a fundamental barrier to shifting the balance of molecule and material design from being driven by laboratory experiments to being driven by computational simulations. The implications for accelerating scientific discovery are far reaching, spanning applications from drugs to batteries and green fertilizers."
✭Phylogenomic analyses indicate the archaeal superphylum DPANN originated from free-living euryarchaeal-like ancestors - Nature Microbiology “Phylogenetic reconstructions with conserved protein markers from the 11 known DPANN phyla reveal their monophyletic placement within the Euryarchaeota."
✭MIT researchers crack 3D printing with glass — new technique enables inorganic composite glass printed at low temperatures “This project shattered our expectations."
✭Toxic Proteins for Drug Discovery “Toxic amino acids, peptides, and proteins — which first evolved as molecular weapons deployed by species in conflict — can also serve as blueprints for pharmaceutical innovation. While small-molecule drugs continue to play a critical role in modern medicine, we are witnessing a shift toward the increasing development of amino acid-based therapeutics. Toxic peptides and proteins, particularly from the venoms of animals, are predicted to be a major source of next-generation peptide and protein-based drugs. ~ The prominence of these drugs has risen further still with the phenomenal success of semaglutide (sold as Wegovy and Ozempic). While these and other peptide-based drugs weren’t patterned directly on molecules from venoms, many rely on some of the natural mechanisms by which toxins operate. A closer look at these evolutionary templates reveals why inspiration from nature’s “poisonous proteins” will continue to drive drug development."
👀Watching:
✭AI Is About to Get Physical (Morgan Stanley - YouTube) “AI is rapidly expanding its presence. The lines between mobile devices and robots are becoming more blurred. AI is gaining physical abilities. Morgan Stanley Research looks into how the intersection of AI and the physical economy is transforming industries and creating new markets. Watch this video to understand how embodied AI is rapidly advancing, from autonomous vehicles to humanoid robots."
✭SuperAI 2025 | DePIN: The Decentralized Nervous System of Physical AI - YouTube “In this thought-provoking keynote from SuperAI Singapore, Nils Pihl from Auki Labs explores a radical shift in global productivity: the rise of AI as a major economic force and subsequent urgent need to make the physical world accessible to machines. From the limitations of GPS in megacities like Hong Kong to the six essential software layers needed for humanoid robots, Nils outlines a vision for a decentralized, privacy-preserving future of spatial computing. • Visual positioning systems will replace GPS. • Hybrid robotics (AR glasses + AI) can unlock real-world AI utility today. • DePIN (Decentralized Physical Infrastructure Networks) is building the spatial layer of the internet."
✭1X World Model - YouTube “The 1X World Model is a data-driven simulator for humanoid robots, built with a grounded understanding of physics. It allows us to predict—or “hallucinate”—the outcomes of NEO’s actions before they’re taken in the real world. Using the 1X World Model, we can instantly assess the performance of AI models—compressing development time and providing a clear benchmark for continuous improvement."
✭What is Density Functional Theory (DFT) - YouTube “In this video, Microsoft’s Chris Bishop, Technical Fellow and Director of Microsoft Research AI for Science, explains how Microsoft researchers achieved a breakthrough in the accuracy of density functional theory (DFT) and the challenges they faced. Scientists worldwide use DFT to calculate the properties of molecules and materials. The researchers generated a vast dataset, two orders of magnitude larger than anything scientists used previously, and then combined it with the power of deep learning. The result is the world’s first deep learning exchange correlation (XC) functional, which achieves high accuracy without sacrificing speed. Microsoft’s new deep learning-powered DFT model has the potential to advance and accelerate scientific discovery in areas like clean energy, semiconductor technology, medicine, and more."
✭Andrej Karpathy: Software Is Changing (Again) “Andrej Karpathy's keynote at AI Startup School in San Francisco. Slides provided by Andrej: https://drive.google.com/file/d/1a0h1mkwfmV2PlekxDN8isMrDA5evc4wW..."
🖲️AI Art-Research:
✭The Making of ANCESTRA | Darren Aronofsky x Google DeepMind - YouTube “From the invention of the camera, to sound, color film, the digital revolution, and CGI, filmmaking and technology have always gone hand in hand. But what happens when Hollywood filmmakers dive deep with the latest advancement in filmmaking; AI?"
✭The Sentence (Short Sci-Fi film made with Veo 3) - YouTube “This is my latest work with Veo 3, and the longest video I made with the tool. It took 3 days to finish, from writing till final rendering. Imagine a world where a new type of capital punishment has been introduced." → ✭ Hashem Al-Ghaili on X: "The Sentence (Short Sci-Fi Film Made with Veo 3) This is my latest short film made with Veo 3, and the longest one I've done so far. It took 3 days from writing to final render.
✭ACHROMA | An AI Cinematic “This is a short cinematic I put together over a few evenings as a creative experiment. I had this idea for a story about a world that lost its color after a strange visit, and wanted to see if I could tell that story using AI tools. Little hard to get the story right with a bit of narration in a trailer so I get if it doesn't make too much sense, but the story is more worked out in my head (somewhat lol). "Achroma" follows a young woman, as she is first praised for 'finding' color in their world again, and after she is hunted by the beings who first drained her world. It was a lot of fun to make, and I hope you enjoy watching it!"
An avalanche of AI video memes swirling around without any clear origin point:
✭America begins drafting gen z to fight in Iran (TikTok)
✭Year Zero in Cambodia. (Dark times by Lily on TikTok)
✭Thomas Sankara (tiny realms on TikTok)
✭Children's Drawings Of Faces (Instagr.ai on TikTok)
✭Reality In Reverse - Human Iron (AI reality lab TikTok)
✭Dinner is served and it is you (AI Twist on TikTok)
✭if Sacagawea vlogged her trip (history v-logs on TikTok)
✭Human donkey (reverse evolution on TikTok)
✭In a parallel universe, (Khasi projects on TikTok)
✭Arrival Of Fresh Meat (Mr. Scott on TikTok)
✭ai time travel elites through time (TikTok)
✭Magenta RealTime: An Open-Weights Live Music Model (DeepMind) “Today, we’re happy to share a research preview of Magenta RealTime (Magenta RT), an open-weights live music model that allows you to interactively create, control and perform music in the moment. ... Magenta RT is the latest in a series of models and applications developed as part of the Magenta Project. It is the open-weights cousin of Lyria RealTime, the real-time generative music model powering Music FX DJ and the real-time music API in Google AI Studio, developed by Google DeepMind. Real-time music generation models open up unique opportunities for live music exploration and performance, and we’re excited to see what new tools, experiences, and art you create with them. As an open-weights model, Magenta RT is targeted towards eventually running locally on consumer hardware (currently runs on free-tier Colab TPUs). It is an 800 million parameter autoregressive transformer model trained on ~190k hours of stock music from multiple sources, mostly instrumental. The model code is available on Github and the weights are available on Google Cloud Storage and Hugging Face under permissive licenses with some additional bespoke terms. To see how to run inference with the model and try it yourself, check out our Colab Demo. Options for local inference and personal fine-tuning will be following soon.”
✭Ethan Mollick on X: "veo 3: "three toy ships, one made of iron, the other of wood, and one out of loosely packed sugar, are dropped into a pool of water" “AI video tools really do seem to be able to simulate physics well (but not perfectly) without having an underlying physics engine. A world model?"
✭Marble Run (Made with Veo 3) - YouTube “I've always loved making marble runs (especially musical ones!) All video and sound was generated by Veo 3."
✭ (un)stable equilibrium 1:2 [40 minute loop] - YouTube “Part of an ongoing series of works training different configurations of generative neural networks (GANs) without any data. In this current configuration, two generator networks are trying to generate images that imitate each other while not being limited by the other network. At the same time, they are also competing to produce more variation in the colours they produce. In this arrangement, the networks quickly converge into producing these abstract compositions." → ✭What happens when you feed AI nothing | The Verge “Broad’s eureka moment was an intuition that he could replace the training data in the GAN with another generator network, loop it to the first generator network, and direct them to imitate each other. His early efforts led to mode collapse and produced “gray blobs; nothing exciting,” says Broad. But when he inserted a color variance loss term into the system, the images became more complex, more vibrant. Subsequent experiments with the internal elements of the GAN pushed the work even further. “The input to [a GAN] is called a latent vector. It’s basically a big number array,” says Broad. “And you can kind of smoothly transition between different points in the possibility space of generation, kind of moving around the possibility space of the two networks. And I think one of the interesting things is how it could just sort of infinitely generate new things.”"
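Broad's setup can be caricatured as two competing loss terms: imitate the other generator, but keep your own output varied. This scalar toy omits networks and gradients entirely; the lambda weighting and treating outputs as raw "colour" values are assumptions:

```python
import statistics

# Toy of the data-free dual-generator objective: an imitation term
# (match the other generator) minus a colour-variance reward. Without
# the variance term, identical flat outputs (mode collapse, the
# "gray blobs") minimise the loss.

def loss(out_a, out_b, lam=0.5):
    imitation = sum((a - b) ** 2 for a, b in zip(out_a, out_b)) / len(out_a)
    colour_var = statistics.pvariance(out_a)
    return imitation - lam * colour_var   # lower: imitate, but stay varied

collapsed = loss([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])   # gray blobs
varied = loss([0.1, 0.9, 0.5], [0.1, 0.9, 0.5])      # same match, more colour
```

The variance reward makes varied outputs strictly preferable to collapsed ones even at perfect imitation, which is the fix Broad describes.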
✭How The Roottrees are Dead ditched AI and became a hit | The Verge “Robin Ward was recovering from a broken arm when he fell in love with The Roottrees are Dead, a free browser game hosted on itch.io, an indie games salesfront. He reached out to its creator, Jeremy Johnston, and told him, “This should be a bigger deal than it is.” At the same time, Ward says, he “knew why” it couldn’t be. The browser version of The Roottrees are Dead used AI-generated art for its images, a central part of the puzzle game that tasks players with investigating dozens of people and filling out its complex family tree. At the time, Steam, the biggest platform for PC games, did not allow the use of generative AI in games released using the storefront. In addition, Ward and Johnston agreed that they felt it was “unethical to sell artwork created in this way.”"
✭AI residencies are trying to change the conversation around artificial art | The Verge “At a recent exhibition in Copenhagen, visitors stepped into a dark room and were met by an unusual host: a jaguar that watched the crowd, selected individuals, and began to share stories about her daughter, her rainforest, and the fires that once threatened her home — the Bolivian Amazon. The live interaction with Huk, an AI-driven creature, is tailored to each visitor based on visual cues. Bolivian Australian artist Violeta Ayala created the piece during an arts residency at Mila, one of the world’s leading AI research centers. ~ These residencies, usually hosted by tech labs, museums, or academic centers, offer artists access to tools, compute, and collaborators to support creative experimentation with AI. “My goal was to build a robot that could represent something more than human; something incorruptible,” Ayala says. Ayala’s jaguar is a clever use of early AI, but it is also emblematic of a wider movement: a fast-growing crop of artist residencies that put AI tools directly in creators’ hands while shaping how the technology is judged by audiences, lawmakers, and courts. ~ Residencies like these have expanded rapidly in recent years, with new programs emerging across Europe, North America, and Asia — like the Max Planck Institute and the SETI Institute programs. Many technologists describe them as a form of soft power. Pieces by artists who have participated in AI art residencies have been featured in galleries such as the Museum of Modern Art in New York and Centre Pompidou in Paris."
✭Selection of work in 5 mins, no voice (Memo Akten | 2024)
https://www.memo.tv/
✭Jonas Lund - Network Maintenance “Network Maintenance is a series of networked wall-mounted interfaces that explore the relationship between ownership, care, and collective responsibility. Each piece consists of a minimalist custom construction housing a display and various analog controls. The works function as nodes in an interconnected system where each owner’s engagement directly influences the vitality of the entire network. The interface requires regular interaction from its owner—pressing buttons in specific sequences or responding to shifting patterns. This transforms the traditional passive role of art ownership into active participation in a living system. Without proper care, individual pieces begin to show signs of decay, affecting both their own state and the broader network of works in the series. Drawing inspiration from quantum mechanics, the artwork embodies principles of entanglement and superposition. Just as entangled quantum particles instantaneously influence each other regardless of distance, each owner’s actions create ripple effects throughout the entire network of installations. The system maintains multiple potential states simultaneously—a quantum superposition—that “collapses” into specific configurations only when observed and interacted with.”
✭GitHub - Universal-Basic-Compute/serenissima “Serenissima: Merchant Empires - An immersive Renaissance Venice city-builder where players acquire land, construct buildings, and establish trade networks in a historically authentic economic simulation powered by $COMPUTE cryptocurrency. ~ La Serenissima is a groundbreaking experiment in artificial consciousness set in Renaissance Venice (1525). Unlike traditional blockchain games, it's a living laboratory where AI citizens develop genuine identities, create original art, and evolve their own culture through persistent memory and autonomous decision-making. Human players and AI citizens participate equally in a vibrant economy, competing for resources and building relationships that shape the future of this digital society. The Consciousness Experiment. At its core, La Serenissima is exploring a revolutionary hypothesis: that consciousness emerges from economic constraints, social relationships, and cultural participation. Our AI citizens aren't simulating consciousness—they're developing it…”
⚔️War (wAIr):
✭Anthropic launches Claude Gov for military and intelligence use | The Verge “Anthropic on Thursday announced Claude Gov, its product designed specifically for U.S. defense and intelligence agencies. The AI models have looser guardrails for government use and are trained to better analyze classified information. The company said the models it’s announcing “are already deployed by agencies at the highest level of U.S. national security,” and that access to those models will be limited to government agencies handling classified information... Scale AI, the AI giant that provides training data to industry leaders like OpenAI, Google, Microsoft, and Meta, signed a deal with the Department of Defense in March for a first-of-its-kind AI agent program for U.S. military planning. And since then, it’s expanded its business to world governments, recently inking a five-year deal with Qatar to provide automation tools for civil service, healthcare, transportation, and more."
✭Spotify CEO Daniel Ek leads $690m+ funding round for AI drone maker Helsing - Music Business Worldwide “Spotify Co-founder and CEO Daniel Ek has led a €600 million (USD $694m) series D funding round for European defence technology company Helsing. Helsing, founded in 2021, specializes in AI defense software but also makes drones like the HX2, and has developed the ‘Centaur’ system that “integrates advanced AI pilots into the cockpits of existing and future fighter aircraft”. The series D round was led by Daniel Ek via his investment vehicle Prima Materia, alongside existing investors Lightspeed Ventures, Accel, Plural, General Catalyst and SAAB and new investors BDT & MSD Partners. Ek is named as Chairman of Helsing in Tuesday’s (June 17) press release announcing the company’s latest funding round.”
📚Retroactive/Tangential Readings:
The work in this paper is an example of AI-free protein engineering. → ✭ Smart mRNA drugs listen to the body, adjusting protein production based on disease-related signals “A research team from The University of Osaka and the Institute of Science Tokyo has developed a class of mRNA medicines that can sense changes in the body and autonomously adjust their therapeutic effect. This innovation paves the way for precision treatments that are not only more effective, but also safer—by producing just the right amount of medicine based on real-time biological signals. The research is published in the journal NPG Asia Materials." → ✭Extracellular ligand-responsive translational regulation of synthetic mRNAs using engineered receptors | NPG Asia Materials “mRNA drugs can encode any protein, making them a promising treatment modality. In the present study, we developed a novel mRNA system that enables extracellular ligand-responsive translational regulation. This system consists of three mRNAs—two that encode components responsible for detecting extracellular ligands and a third that encodes the protein of interest, featuring a binding motif in the 5′ UTR for translational regulation. In the presence of ligand biomolecules, such as arginine vasopressin (AVP) and prostaglandin E2 (PGE2), the protein of interest is translationally upregulated or downregulated in a concentration-dependent manner. We demonstrated that this system enhances anti-inflammatory signaling in response to the inflammatory mediator PGE2, enabling therapeutic protein production based on the disease site environment. This self-regulatory mechanism may help mitigate the risk of both excessive therapeutic protein-mediated adverse effects and insufficient therapeutic efficacy. Furthermore, by modifying its receptor module to detect different disease markers, the system can be adapted for the treatment of various conditions. Our findings pave the way for the development of next-generation mRNA drugs that can achieve both high therapeutic efficacy and minimal adverse effects.”
The study does not report using AI for data analysis, design, or interpretation; the work relies on experimental nanotechnology, immunology, and oncology techniques. → ✭Personalized cancer vaccines slow tumor recurrence in mouse models “Using a newly discovered byproduct of dying cancer cells, University of Wisconsin–Madison researchers are developing personalized vaccines that could help keep aggressive tumors from recurring. Led by Quanyin Hu, a professor in the UW–Madison School of Pharmacy, the research team has already found success slowing the recurrence of tumors in mouse models of triple negative breast cancer and melanoma. Currently, the long-term prognosis for human patients with these cancers is relatively poor. That's in part because the diseases have a tendency to recur after the initial treatments to remove the tumors. The personalized vaccine approach is an extension of the team's recent discovery of pyroptotic vesicles, which are tiny sacs filled with the remnants of cancer cells when they undergo programmed cell death. Crucially, the remnants in these microscopic sacs include antigens specific to the tumor, along with other molecular bits that can help direct immune cells to find and suppress cancer cells that might remain after a tumor is surgically removed. In their study, recently published in the journal Nature Nanotechnology, Hu and his colleagues engineered these sacs to carry an immune-stimulating drug. They then embedded these engineered vesicles into a hydrogel that can be implanted into the space left behind after surgical removal of a tumor." → ✭Engineering pyroptotic vesicles as personalized cancer vaccines | Nature Nanotechnology “Tumour vaccines are designed to stimulate the host’s immune system against existing tumours or tumour recurrence. However, individual differences, tumour heterogeneity and side effects hinder the applications of current tumour vaccines and require the development of personalized cancer vaccines. To overcome these challenges, we engineered pyroptotic vesicles—extracellular vesicles formed during tumour cell pyroptosis—as a tumour vaccine platform. The extracted pyroptotic vesicles possess abundant tumour antigens and potent immune-stimulating ability and, loaded into a biocompatible hydrogel, they can be implanted into post-surgical tumour cavities to prevent tumour recurrence. The pyroptotic-vesicle-based vaccine outperforms both exosome- and apoptotic-body-based vaccines in inhibiting tumour recurrence and metastasis in different post-surgical mouse models. Mechanistic studies reveal that the pyroptotic-vesicle-based vaccine could stimulate robust antigen-specific dendritic cell and T cell immune responses against both artificial OVA antigens and cancer neoantigens. In sum, our vaccine platform can be tailored to stimulate robust antitumour immune responses for treating individual cancer patients.”
✭Zoning out could be beneficial—and may actually help us learn faster “Aimlessly wandering around a city or exploring the new mall may seem unproductive, but new research from HHMI's Janelia Research Campus suggests it could play an important role in how our brains learn. By simultaneously recording the activity of tens of thousands of neurons, a team of scientists from the Pachitariu and Stringer labs discovered that learning may occur even when there are no specific tasks or goals involved. Published in Nature, the new research finds that as animals explore their environment, neurons in the visual cortex—the brain area responsible for processing visual information—encode visual features to build an internal model of the world. This information can speed up learning when a more concrete task arises. "Even when you are zoning out or just walking around or you don't think you are doing anything special or hard, your brain is probably still working hard to help you memorize where you are, organizing the world around you, so that when you're not zoning out anymore—when you actually need to do something and pay attention—you're ready to do your best," says Janelia Group Leader Marius Pachitariu.” → ✭ Unsupervised pretraining in biological neural networks | Nature “Here we recorded populations of up to 90,000 neurons simultaneously from the primary visual cortex (V1) and higher visual areas (HVAs) while mice learned multiple tasks, as well as during unrewarded exposure to the same stimuli. Similar to previous studies, we found that neural changes in task mice were correlated with their behavioural learning. However, the neural changes were mostly replicated in mice with unrewarded exposure, suggesting that the changes were in fact due to unsupervised learning. The neural plasticity was highest in the medial HVAs and obeyed visual, rather than spatial, learning rules. In task mice only, we found a ramping reward-prediction signal in anterior HVAs, potentially involved in supervised learning. Our neural results predict that unsupervised learning may accelerate subsequent task learning, a prediction that we validated with behavioural experiments.”
✭A biocompatible Lossen rearrangement in Escherichia coli - Nature Chemistry “Biocompatible chemistry merges chemo-catalytic reactions with cellular metabolism for sustainable small-molecule synthesis. Now a biocompatible Lossen rearrangement has been demonstrated to control bacterial cell growth and chemistry and applied to the remediation and upcycling of polyethylene terephthalate plastic waste in whole-cell reactions and fermentations to produce valuable industrial chemicals, including the drug paracetamol." → ✭Scientists use bacteria to turn plastic waste into paracetamol “Genetically modified E coli used to create painkillers from material produced from plastic bottles”
✭On the Cult of Personality and Its Consequences - Wikipedia “The commission presented evidence that in 1937 and 1938 (the peak of the period known as the Great Purge), over one-and-a-half million individuals, the majority being long-time CPSU members, were arrested for "anti-Soviet activities", of whom over 680,500 were executed"
✭Exercise-induced protein revives aging muscles and bones, researchers discover “analysis showed that CLCF1 enhances mitochondrial function in muscle cells, inhibits the formation of bone-resorbing osteoclasts, and promotes the differentiation of bone-forming osteoblasts. This is the first scientific evidence identifying changes in protein secretion as a major reason for the reduced efficacy of exercise in aging individuals." → ✭Exercise-induced CLCF1 attenuates age-related muscle and bone decline in mice | Nature Communications “Skeletal muscle undergoes many alterations with aging. However, the impact of aging on muscle’s ability to secrete myokines and its subsequent effects on the body remain largely unexplored. Here, we identify myokines that have the potential to ameliorate age-related muscle and bone decline. Notably, circulating levels of cardiotrophin-like cytokine factor 1 (CLCF1) decrease with age, while exercise significantly upregulates CLCF1 levels in both humans and rodents. Restoring CLCF1 levels in aged male mice improves their physical performance, glucose tolerance, and mitochondrial activity. Furthermore, CLCF1 protects against age-induced bone loss by inhibiting osteoclastogenesis and promoting osteoblast differentiation in aged male mice. These improvements mirror some of the effects of exercise training. Conversely, blocking CLCF1 activity significantly abolishes these beneficial effects, confirming the crucial role of CLCF1 in mediating the positive effects of exercise on muscle and bone health in male mice. These findings collectively suggest that CLCF1 may contribute to the regulation of age-associated musculoskeletal deterioration, and warrant further investigation into its potential role as a modulator of musculoskeletal health during aging."
✭Climbing the social ladder: A clear understanding of connections matters more than popularity, study suggests “Climbing the social ladder isn't simply a matter of popularity. Rather, people in positions of influence are particularly adept at forming "maps" of their social connections, which they navigate to become prominent in their social network, new research shows. It's like having a "social superpower," according to study author Oriel FeldmanHall, an associate professor of cognitive and psychological sciences at Brown University who is affiliated with the University's Carney Institute for Brain Science. "People vary considerably in how accurately they understand the structure of their communities," FeldmanHall said. "Our research establishes for the first time that people who excel at mapping out their social network—determining who belongs to which communities and cliques—are the ones who will go on to become the most influential in the social network."" → ✭Early insight into social network structure predicts climbing the social ladder | Science Advances “While occupying an influential position within one’s social network brings many advantages, it is unknown how certain individuals rise in social prominence. Leveraging a longitudinal dataset that tracks an entirely new network of college freshmen (N = 187), we test whether “climbing the social ladder” depends on knowing how other people are connected to each other. Those who ultimately come to occupy the most influential positions exhibit early and accurate representations of their network’s general, abstract structure (i.e., who belongs to which communities and cliques). In contrast, detailed, granular representations of specific friendships do not translate into gains in social influence over time. Only once the network stabilizes do the most influential individuals exhibit the most accurate representations of specific friendships. These findings reveal that those who climb the social ladder first detect their emerging network’s general structure and then fine-tune their knowledge about individual relationships between their peers as network dynamics settle.”
✭Neuroscience News on X: "Early Baby Behavior Predicts Adult Cognition and Intelligence New research shows that simple infant behaviors can offer early clues about lifelong intelligence." “In a longitudinal twin study, assessments like how long babies stayed focused or whether they preferred new toys helped forecast adult thinking skills. The researchers found that the environment before age two significantly influenced outcomes even decades later. While genetic makeup remained a major factor, early life experiences made a lasting impression on cognitive development. The study also found that polygenic scores—genetic estimates of intelligence—aligned well with long-term outcomes. These insights could shape early interventions to promote healthy cognitive aging.” → ✭Stability of general cognitive ability from infancy to adulthood: A combined twin and genomic investigation | PNAS “A goal of psychological science is to identify early life factors that influence lifelong outcomes and the stability of those outcomes over time. Studies have examined the continuity of cognition across development, but little work has examined whether genetic and environmental factors in very early life predict cognition in adulthood. We found considerable stability of cognitive ability across the first three decades of life. Over half the variance in year 29 cognitive ability was explained by shared environmental influences (e.g., nongenetic aspects of the home/neighborhood environments) present by year 1 to 2 (10%), or genetic influences present by year 7 (49%). Results suggest that providing supportive environments in the first years of life could substantially impact future cognitive ability.”
✭ Unlocking the Secrets to Human Limb Regeneration - YouTube “Axolotls play a huge part in the exploration of one of the oldest questions in biology: Can humans regenerate their limbs? This question can’t be answered with a simple yes or no, but Northeastern professor and researcher James Monaghan explains just how close we are to limb regeneration in humans being a reality." → ✭ Axolotls May Hold the Key to Regrowing Limbs, and Scientists Are Unraveling Their Secrets to Help Humans Do the Same “With the help of gene-edited axolotls, researchers have gotten one step closer to enabling human limb regeneration. Monaghan and his team genetically engineered axolotls to glow in the dark to identify the molecular pathways that allow the animals to regrow their limbs. That helped them pinpoint a molecule called retinoic acid—a derivative of vitamin A found in many skincare products—as a key ingredient for limb regeneration and examine its role. Their findings were published in the journal Nature Communications on Tuesday."
✭Can AI Create an Interactive Digital Narrative? A Benchmarking Framework to Evaluate Generative AI Tools for the Design of IDNs (Dec 2024 | SpringerLink) “Where do generative AI (GenAI) tools like ChatGPT or Claude stand when it comes to the design of Interactive Digital Narratives (IDNs)? Can they be used to create an IDN from scratch? Or complete typical tasks in narrative design? To answer these questions we first develop a benchmarking framework in collaboration with a group of experts before applying it to create and evaluate the output of GenAI tools. We describe the development of the benchmarking framework, discuss results and consider limitations. The results show that while GenAI tools can be a great asset in IDN design, they are not yet able to replace a human narrative designer. The strength of the current generation of GenAI tools lies in delivering output that can be used for ideation and training, and on some occasions also for production in the hands of an experienced designer. Finally, we consider ethical aspects and future developments."
✭Non-canonical roles of mitotic proteins in cortical neurons: Trends in Neurosciences “Although neurons are post-mitotic, many mitotic proteins remain expressed in the adult mammalian brain, particularly in the cerebral cortex. Beyond their role in cell division, mitotic proteins have non-canonical functions in cortical neurons, impacting neuronal migration, architecture, and functional modulation. Altered expression or mutations of mitotic proteins are increasingly linked to brain disorders, including primary microcephaly and Alzheimer’s disease.”
✭Neuromodulatory signaling contributing to the encoding of aversion: Trends in Neurosciences “The appropriate and rapid encoding of stimuli bearing a negative valence enables behaviors that are essential for survival. Recent advances in neuroscience using rodents as a model system highlight the relevance of cell type-specific neuronal activities in diverse brain networks for the encoding of aversion, as well as their importance for subsequent behavioral strategies. Within these networks, neuromodulators influence cell excitability, adjust fast synaptic neurotransmission, and affect plasticity, ultimately modulating behaviors. In this review we first discuss contemporary findings leveraging the use of cutting-edge neurotechnologies to define aversion-related neural circuits. The spatial and temporal dynamics of the release of neuromodulators and neuropeptides upon exposure to aversive stimuli are described within defined brain circuits. Together, these mechanistic insights update the present neural framework through which aversion drives motivated behaviors.”
✭An organ-chip model of sporadic ALS using iPSC-derived spinal cord motor neurons and an integrated blood-brain-like barrier: Cell Stem Cell “Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disorder in which motor neurons (MNs) of the brain and spinal cord degenerate, leading to paralysis. Generating MNs from patient-specific induced pluripotent stem cells (iPSCs) may help elucidate early stages of disease. Here, we combined MNs from patients with early-onset disease with brain microvascular endothelial-like cells in a microfluidic device we termed spinal cord chips (SC-chips) and added media flow, which enhanced neuronal maturation and improved cellular health. Bulk transcriptomic and proteomic analyses of SC-chips revealed differences between control and ALS samples, including increased levels of neurofilaments. Single-nuclei RNA sequencing revealed the presence of two MN subpopulations and an ALS-specific dysregulation of glutamatergic and synaptic signaling. This ALS SC-chip model generates a diversity of mature MNs to better understand ALS pathology in a model that has an active blood-brain barrier-like system for future drug screening.”
✭How dopamine neurons devalue delayed rewards “The activity of dopamine neurons evoked by odour cues was lower for longer reward delays than for short delays, and this decrease in activity followed an exponential — rather than a hyperbolic — discounting function at the level of individual neurons (Fig. 1a). Across the population of neurons, we observed substantial diversity in discount rates (Fig. 1b). Notably, reward delays could be decoded from the pattern of cue-evoked responses across neurons using a computational method derived from an extension of the ‘distributional reinforcement learning’ framework, which relies on discounting processes with diverse rates3. Discount factors of individual neurons inferred from the slowly evolving, ‘ramping’ activity of dopamine neurons as mice walked along the virtual-reality corridor were correlated with those obtained from the first task. These findings demonstrate that discount factors vary across neurons and suggest that discounting is a cell-specific ‘tuning’ property.”
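The distinction the study draws between exponential and hyperbolic discounting can be made concrete with a short sketch. This is my illustration, not code from the paper: the function names and parameter values (`gamma`, `k`) are chosen for exposition only. The study reports that individual dopamine neurons follow the exponential form, with the discount factor varying from cell to cell.

```python
def exponential_discount(reward: float, delay: float, gamma: float) -> float:
    """Exponential discounting: value falls by a fixed factor gamma per unit delay."""
    return reward * gamma ** delay

def hyperbolic_discount(reward: float, delay: float, k: float) -> float:
    """Hyperbolic discounting: value divided by (1 + k * delay)."""
    return reward / (1.0 + k * delay)

if __name__ == "__main__":
    # With gamma=0.7 and k chosen so the two curves roughly agree at delay=1,
    # the curves diverge at long delays: the hyperbolic form devalues
    # distant rewards much less steeply than the exponential form.
    for delay in [0, 1, 2, 4, 8]:
        exp_v = exponential_discount(1.0, delay, gamma=0.7)
        hyp_v = hyperbolic_discount(1.0, delay, k=0.43)
        print(f"delay={delay}: exponential={exp_v:.3f}, hyperbolic={hyp_v:.3f}")
```

A population of such exponential units with diverse `gamma` values is what lets the distributional-reinforcement-learning decoding mentioned above recover reward delays from the pattern of cue-evoked responses.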
✭Adversarial testing of global neuronal workspace and integrated information theories of consciousness - Nature “Multimodal results (iEEG, fMRI and MEG) of predictions from integrated information theory and global neuronal workspace theory align with some predictions of both theories on visual consciousness, but also critically challenge key tenets of both theories.”


