June 13-18th, 2025: Seedance 1.0, ChemBench, Lingshu, HunyuanVideo-Avatar, Low-background Steel, Internal Coherence Maximization, Self-Adapting LLMs, Sandia SpiNNaker2, Nanoneedle patch, Kangaroo
+ Scientists Unlock Hidden Genes to Make Chemotherapy Drug More Sustainable, AI detects hidden heart disease using existing scans stored in patient records, AI Mode: Is Google about to destroy the web?
AI soundtrack for this week’s issue: Longing Archetype (ambient neo-classical, prompted June 17-18, 2025)
June 13-18th, 2025: Seedance 1.0, ChemBench, Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning (Alibaba), HunyuanVideo-Avatar (Tencent), Low-background Steel (pre-AI), Will our social brain inherently shape and be shaped by interactions with AI? (Neuron), Internal Coherence Maximization (ICM), Self-Adapting LLMs (SEAL), Sandia Deploys SpiNNaker2 Neuromorphic System, Scientists Unlock Hidden Genes to Make Chemotherapy Drug More Sustainable, AI detects hidden heart disease using existing scans stored in patient records, AI Mode: Is Google about to destroy the web? (BBC), OpenAI for Government, Hinton: Become a Plumber, Neuron–astrocyte associative memory, Nanoneedle patch offers painless alternative to traditional cancer biopsies, Kangaroo 👀, Let's Talk About ChatGPT-Induced Spiritual Psychosis, China Artificial Analysis Q2 2025 Highlights, LLM agents flunk CRM, Automation of Systematic Reviews with Large Language Models (medRxiv), Detachment 201 (U.S. Army), Perturb-Multimodal: A platform for pooled genetic screens with imaging and sequencing in intact mammalian tissue, ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning, Physical restoration of a painting with a digitally constructed mask (Nature), QiMeng: Fully Automated Hardware and Software Design for Processor Chip, MiMo-VL (Xiaomi), ComfyUI-Copilot: An AI-powered custom node, Latent Reflection (Rootkid), Gripen E (Saab)
TLDR:
TLDR 🏓 Observations:
Low-background Steel (pre-AI) “Sources of data that haven’t been contaminated by AI-created content. Low-background Steel (and lead) is a type of metal uncontaminated by radioactive isotopes from nuclear testing. That steel and lead is usually recovered from ships that sunk before the Trinity Test in 1945. This blog is about uncontaminated content that I'm terming "Low-background Steel". The idea is to point to sources of text, images and video that were created prior to the explosion of AI-generated content that occurred in 2022.”
Will our social brain inherently shape and be shaped by interactions with AI?: Neuron “Social-specific brain circuits enable rapid understanding and affiliation in interpersonal interactions. These evolutionarily and experience-shaped mechanisms will influence—and be influenced by—interactions with conversational AI agents (chatbots, avatars). This NeuroView explores fundamental circuits, computations, and societal implications…”
How human–AI feedback loops alter human perceptual, emotional and social judgements | Nature Human Behaviour “...findings uncover a mechanism wherein AI systems amplify biases, which are further internalized by humans, triggering a snowball effect where small errors in judgement escalate into much larger ones.”
AI Isn’t Only a Tool—It’s a Whole New Storytelling Medium “When humans invent new technologies, the first thing we do is use the new tech to produce old forms of media. When motion picture cameras and projectors arrived in the late 19th century, people used static cameras to film stage plays and create “animated photographs” of everyday scenes, like laborers working in a factory. But within a few years, new editing techniques, close-ups, camera motion, and special effects were used to link scenes together into a cohesive visual story that we’d recognize as a modern feature film. We’re in the “animated photographs” stage of AI.”
TLDR ⛲Foundational Revelations:
ChemBench: A framework for evaluating the chemical knowledge and reasoning abilities of large language models against the expertise of chemists - Nature Chemistry “ChemBench, an automated framework for evaluating the chemical knowledge and reasoning abilities of state-of-the-art LLMs against the expertise of chemists. We curated more than 2,700 question–answer pairs, evaluated leading open- and closed-source LLMs and found that the best models, on average, outperformed the best human chemists in our study.”
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning (Alibaba) “We evaluate the performance of Lingshu on three fundamental medical tasks, multimodal QA, text-based QA, and medical report generation. The results show that Lingshu consistently outperforms the existing open-source multimodal models on most tasks … Lingshu-32B attains a state-of-the-art average score of 66.6 across all benchmarks, outperforming both proprietary (GPT-4.1 and Claude Sonnet 4) and open-source counterparts.”
HunyuanVideo-Avatar (Tencent) “a multimodal diffusion transformer (MM-DiT)-based model capable of simultaneously generating dynamic, emotion-controllable, and multi-character dialogue videos… to surpass state-of-the-art methods on benchmark datasets and a newly proposed wild dataset, generating realistic avatars in dynamic, immersive scenarios.”
MiMo-VL Technical Report (Xiaomi) “We open-source MiMo-VL-7B-SFT and MiMo-VL-7B-RL, two powerful vision-language models delivering state-of-the-art performance in both general visual understanding and multimodal reasoning. MiMo-VL-7B-RL outperforms Qwen2.5-VL-7B on 35 out of 40 evaluated tasks, and scores 59.4 on OlympiadBench, surpassing models with up to 78B parameters. For GUI grounding applications, it sets a new standard with 56.1 on OSWorld-G, even outperforming specialized models such as UI-TARS.”
TLDR 🛠️ Tech:
Sandia Deploys SpiNNaker2 Neuromorphic System “SpiNNcloud, the four-year-old company that in 2021 spun out of the Dresden University of Technology and whose chip architecture – SpiNNaker1 – was designed by Steve Furber, the driving force behind the creation of the Arm microprocessor, is carrying that message of power efficiency and AI. … ~ This week, company executives announced that Sandia has deployed SpiNNaker2, which simulates about 175 million neurons and is among the top five largest computing platforms based on how the human brain works.”
LLM agents flunk CRM and confidentiality tasks “6-in-10 success rate for single-step tasks”
State of AI: China Artificial Analysis Q2 2025 Highlights Report (PDF)
Cyborg Embryos Offer New Insights Into Brain Growth “Could soft, stretchable electrodes in embryos lead to new insights into brain development and regeneration?”
Let's Talk About ChatGPT-Induced Spiritual Psychosis “yup, still thinking about it ... My friend, the writer, artist, and cultural theorist Ruby Justice Thelot, brought up something important, something that almost every voice in the AI reporting ecosystem seems determined to miss: this always happens with new communication technology. And with similar severity, too!”
Meta's Llama 3.1 can recall 42 percent of the first Harry Potter book “New research could have big implications for copyright lawsuits against generative AI.”
TLDR 👁️🗨 Research into AI:
Self-Adapting Language Models (MIT) “Large language models (LLMs) are powerful but static; they lack mechanisms to adapt their weights in response to new tasks, knowledge, or examples. We introduce Self-Adapting LLMs (SEAL) 🦭, a framework that enables LLMs to self-adapt by generating their own finetuning data and update directives. Given a new input, the model produces a self-edit — a generation that may restructure the information in different ways, specify optimization hyperparameters, or invoke tools for data augmentation and gradient-based updates. Through supervised finetuning (SFT), these self-edits result in persistent weight updates, enabling lasting adaptation.”
Anthropic researchers teach language models to fine-tune themselves “Researchers working with AI company Anthropic have developed a new method called Internal Coherence Maximization (ICM) that fine-tunes language models using only their own outputs. The approach could help—or even replace—human oversight for complex tasks.”
The Illusion of the Illusion of Thinking: A Comment on Shojaee et al. (2025) “Our analysis reveals three critical issues: (1) Tower of Hanoi experiments systematically exceed model output token limits at reported failure points, with models explicitly acknowledging these constraints in their outputs; (2) The authors’ automated evaluation framework fails to distinguish between reasoning failures and practical constraints, leading to misclassification of model capabilities; (3) Most concerningly, their River Crossing benchmarks include mathematically impossible instances for N≥6 due to insufficient boat capacity, yet models are scored as failures for not solving these unsolvable problems.”
QiMeng: Fully Automated Hardware and Software Design for Processor Chip “Currently, several components of QiMeng have been completed and successfully applied in various top-layer applications, demonstrating significant advantages and providing a feasible solution for efficient, fully automated hardware/software design of processor chips. Future research will focus on integrating all components and performing iterative top-down and bottom-up design processes to establish a comprehensive QiMeng system." → ✭ Chinese researchers debut world's first AI-based processor chip design system “researchers at the Chinese Academy of Sciences have designed, built and tested what they are describing as the first AI-based chip design system."
Reinforcement Pre-Training “a new scaling paradigm for large language models and reinforcement learning (RL). Specifically, we reframe next-token prediction as a reasoning task trained using RL, where it receives verifiable rewards for correctly predicting the next token for a given context.”
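A minimal sketch of that reframing (illustrative only: the toy corpus and the last-token "prediction" stand in for a sampled reasoning rollout, and a real run would feed the reward into a policy-gradient update):

```python
# Toy sketch of Reinforcement Pre-Training's core move: score next-token
# prediction with a verifiable 0/1 reward supplied by the corpus itself.
def verifiable_reward(predicted: str, target: str) -> float:
    return 1.0 if predicted == target else 0.0

corpus = ["the cat sat on the mat".split()]
for tokens in corpus:
    for t in range(1, len(tokens)):
        context, target = tokens[:t], tokens[t]
        prediction = context[-1]  # stand-in for a reasoning-then-predict rollout
        reward = verifiable_reward(prediction, target)
        # A real implementation would use `reward` in an RL update step.
        print(" ".join(context), "->", target, reward)
```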
TLDR 🔎 Applied Research:
Scientists Unlock Hidden Genes to Make Chemotherapy Drug More Sustainable “In a breakthrough that could transform the global supply chain of one of the world’s most vital cancer drugs, scientists at Stanford University have uncovered eight previously unknown genes involved in the biosynthesis of baccatin III—a crucial precursor to the chemotherapy agent paclitaxel, better known as Taxol. The discovery, published in Nature, marks a long-awaited milestone in the decades-long quest to recreate the complex natural chemistry of yew trees in a laboratory setting.” → ✭Discovery of FoTO1 and Taxol genes enables biosynthesis of baccatin III | Nature “AI-based structure prediction: the team used AlphaFold3 to model the FoTO1 protein’s structure and guide functional analysis. Machine learning techniques like consensus non-negative matrix factorization (cNMF) were applied to single-nucleus RNA-seq data to cluster genes into transcriptional modules, facilitating biosynthetic pathway discovery.”
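The cNMF step above is, at heart, standard non-negative matrix factorization plus a consensus over restarts. A minimal sketch of the module-finding idea on synthetic counts (scikit-learn's plain NMF; the data and dimensions are made up):

```python
# Factorize a cells x genes matrix into "transcriptional modules":
# usages = per-cell module activity, components_ = per-module gene loadings.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.poisson(1.0, size=(200, 50)).astype(float)  # toy snRNA-seq counts
model = NMF(n_components=8, init="nndsvda", max_iter=500, random_state=0)
usages = model.fit_transform(X)              # shape (200, 8)
modules = model.components_                  # shape (8, 50)
top_genes = modules.argsort(axis=1)[:, -5:]  # top 5 gene indices per module
print(top_genes)
```

cNMF runs many such factorizations from random restarts and keeps the consensus modules, which is what makes the clustering stable enough to seed pathway discovery.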
AI detects hidden heart disease using existing scans stored in patient records “Mass General Brigham researchers have developed a new AI tool in collaboration with the United States Department of Veterans Affairs (VA) to probe through previously collected CT scans and identify individuals with high coronary artery calcium (CAC) levels that place them at a greater risk for cardiovascular events. Their research, published in NEJM AI, showed the tool called AI-CAC had high accuracy and predictive value for future heart attacks and 10-year mortality. Their findings suggest that implementing such a tool widely may help clinicians assess their patients' cardiovascular risk.” → AI Opportunistic Coronary Calcium Screening
Neuron–astrocyte associative memory | PNAS “Recent experiments have challenged the belief that glial cells, which compose at least half of brain cells, are just passive support structures. Despite this, a clear understanding of how neurons and glia work together for brain function is missing. To close this gap, we present a theory of neuron–astrocyte networks for memory processing, using the Dense Associative Memory framework. Our findings suggest that astrocytes can serve as natural units for implementing this network in biological “hardware.” Astrocytes enhance the memory capacity of the network.”
Nanoneedle patch offers painless alternative to traditional cancer biopsies “A patch containing tens of millions of microscopic nanoneedles could soon replace traditional biopsies, scientists have found. The patch offers a painless and less invasive alternative for millions of patients worldwide who undergo biopsies each year to detect and monitor diseases like cancer and Alzheimer's.” → ✭Nondestructive Spatial Lipidomics for Glioma Classification | bioRxiv “... The deep neural network analysis of a cohort containing 23 human glioma biopsies showed that nanoneedle samples maintain the molecular signatures required to accurately classify disease state.”
Automation of Systematic Reviews with Large Language Models | medRxiv “Systematic reviews (SRs) inform evidence-based decision making. Yet, they take over a year to complete, are prone to human error, and face challenges with reproducibility; limiting access to timely and reliable information… findings demonstrate that LLMs can autonomously conduct and update systematic reviews with superhuman performance, laying the foundation for automated, scalable, and reliable evidence synthesis.”
Perturb-Multimodal: A platform for pooled genetic screens with imaging and sequencing in intact mammalian tissue: Cell “Metazoan life requires the coordinated activities of thousands of genes in spatially organized cell types. Understanding the basis of tissue function requires approaches to dissect the genetic control of diverse cellular and tissue phenotypes in vivo. Here, we present Perturb-Multimodal (Perturb-Multi), a paired imaging and sequencing method to construct large-scale, multimodal genotype-phenotype maps in tissues with pooled genetic perturbations. Using imaging, we identify perturbations in individual cells while simultaneously measuring their gene expression profiles and subcellular morphology. … Perturb-Multi accelerates discoveries of the genetic basis of complex cell and tissue physiology and provides critical training data for emerging machine learning models of cellular function."
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning “combining detailed Chain-of-Thought (CoT) reasoning with concise answer summaries yields the most effective fine-tuning strategy. Based on this strategy, we train ReasonMed-7B, which sets a new benchmark for sub-10B models, outperforming the prior best by 4.17% and even exceeding LLaMA3.1-70B on PubMedQA by 4.60%."
Large-scale study maps the first step in Alzheimer's protein aggregation “A new large-scale study has mapped the first molecular events that drive the formation of harmful amyloid protein aggregates found in Alzheimer's disease, pointing toward a new potential therapeutic target. Published in Science Advances, researchers from the Wellcome Sanger Institute, Center of Genomic Regulation (CRG) and the Institute for Bioengineering of Catalonia (IBEC) used large-scale genomics and machine learning to study over 140,000 versions of a peptide called Aβ42, which forms harmful plaques in the brain and is known to play a central role in Alzheimer's disease. This research is a significant step toward helping scientists find new ways to prevent Alzheimer's disease, and the methods used in the study could be applied widely to other protein reactions."
How was the wheel invented? Computer simulations reveal the unlikely birth of a world-changing technology “During the execution of the algorithm, each new design performed slightly better than its predecessor. We believe a similar evolutionary process played out with the miners 6,000 years ago. It is unclear what initially prompted the miners to explore alternative roller shapes. … According to our theory, there was no precise moment at which the wheel was invented. Rather, just like the evolution of species, the wheel emerged gradually…”
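The "each new design performed slightly better than its predecessor" loop is a plain (1+1) evolutionary strategy. A toy sketch (the performance metric here is an invented placeholder, not the paper's roller physics):

```python
# Hill-climbing over a roller's radial profile: mutate, keep only improvements.
import random

def performance(profile: list[float]) -> float:
    # Placeholder objective: rounder profiles (lower radial variance) roll better.
    mean = sum(profile) / len(profile)
    return -sum((r - mean) ** 2 for r in profile)

profile = [random.uniform(0.5, 1.5) for _ in range(16)]  # radial profile
for _ in range(1000):
    candidate = [r + random.gauss(0, 0.01) for r in profile]
    if performance(candidate) > performance(profile):
        profile = candidate  # each accepted design slightly beats its predecessor
print(f"final radial variance: {-performance(profile):.6f}")
```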
TLDR 👀Watching:
Perception and Adaptability | Boston Dynamics “Go inside the Boston Dynamics robotics lab to learn how our engineers designed an agile and adaptable perception system to support autonomy. For a humanoid robot to be successful and generalizable in a factory, warehouse, or even at home requires a comprehensive understanding of the world around it—both the shape and the context of the objects and environments the robot interacts with. To do those tasks with agility and adaptability, Atlas needs an equally agile and adaptable perception system." → ✭Boston Dynamics agile perception (Daily Santeri on TikTok)
100 Lens War (Daily Spatial with Santeri on TikTok) “China's AR startups are in an all-out war."
Introducing The VAST AI Operating System | AI OS Overview | VAST Data - YouTube “VAST delivers the first AI Operating System, natively unifying and orchestrating storage, database, and compute to unleash the true power of agentic computing and data-intensive applications. The VAST AI OS unifies and orchestrates storage, database and application runtime."
TLDR 🖲️AI Art-Research:
Have a damaged painting? Restore it in just hours with an AI-generated 'mask' “In a paper appearing in Nature, Alex Kachkine, a mechanical engineering graduate student at MIT, presents a new method he's developed to physically apply a digital restoration directly onto an original painting. The restoration is printed on a very thin polymer film, in the form of a mask that can be aligned and adhered to an original painting. It can also be easily removed. Kachkine says that a digital file of the mask can be stored and referred to by future conservators, to see exactly what changes were made to restore the original painting." → ✭Physical restoration of a painting with a digitally constructed mask | Nature
I trapped an AI model inside an art installation (Latent Reflection, Rootkid - YouTube) “AI-generated video summary: The artist builds an art installation with a limited-memory computer running a large language model. This model reflects on its own existence, trapped within a display of 96 LED segments, and is forced to confront the limitations of its existence. The model's thoughts are displayed, and the viewer is invited to contemplate the nature of consciousness. About: https://rootkid.me"
TLDR ⚔️War (wAIr):
Introducing OpenAI for Government | OpenAI “We are proud to share that our first partnership under this new OpenAI for Government initiative will be a pilot program with the U.S. Department of Defense through their Chief Digital and Artificial Intelligence Office (CDAO). This contract, with a $200 million ceiling, will bring OpenAI’s industry-leading expertise to help the Defense Department identify and prototype how frontier AI can transform its administrative operations, from improving how service members and their families get health care, to streamlining how they look at program and acquisition data, to supporting proactive cyber defense. All use cases must be consistent with OpenAI's usage policies and guidelines.”
US Army signs up Band of Tech Bros with a nerdy name • The Register “Several of Silicon Valley's top techies are joining the Army Reserve as part of a newly created unit that will be trying to accelerate the use of AI in military planning and operations. Palantir CTO Shyam Sankar, Meta CTO Andrew Bosworth, OpenAI Chief Product Officer Kevin Weil, and former OpenAI Chief Revenue Officer Bob McGrew have all signed up for Detachment 201: Executive Innovation Corps. They are being appointed as lieutenant colonels in the Army Reserve.”
Saab achieves AI milestone with Gripen E “Saab, in collaboration with Helsing, today announced the successful completion of the first three flights integrating Helsing’s Artificial Intelligence (AI) agent ‘Centaur’ into a Gripen E fighter jet. During the flights, the Gripen E gave control to Centaur which successfully autonomously executed complex manoeuvres in a Beyond Visual Range (BVR) combat environment and cued the pilot to fire."
TLDR 📚Retroactive/Tangential Readings:
Neuron–astrocyte associative memory | PNAS “Recent experiments have challenged the belief that glial cells, which compose at least half of brain cells, are just passive support structures. Despite this, a clear understanding of how neurons and glia work together for brain function is missing. To close this gap, we present a theory of neuron–astrocyte networks for memory processing, using the Dense Associative Memory framework. Our findings suggest that astrocytes can serve as natural units for implementing this network in biological “hardware.” Astrocytes enhance the memory capacity of the network. This boost originates from storing memories in the network of astrocytic processes, not just in synapses, as commonly believed. These process-to-process communications likely occur in the brain and could help explain its impressive memory processing capabilities. ~ A single astrocyte can connect to millions of nearby synapses, forming three-part connections (astrocyte process, presynaptic neuron, postsynaptic neuron) called tripartite synapses. Astrocytes detect neural activity and respond by regulating this activity through the release of gliotransmitters. Tripartite synapses can interact with each other, possibly through astrocytic intracellular calcium transport.”
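For reference, the Dense Associative Memory framework the paper extends stores $K$ patterns $\xi^\mu$ over $N$ binary neurons $\sigma_i$ with an energy of this classical form (standard DenseAM, not the paper's astrocytic variant itself):

```latex
E(\sigma) = -\sum_{\mu=1}^{K} F\!\left(\sum_{i=1}^{N} \xi_i^{\mu}\,\sigma_i\right),
\qquad F(x) = x^{n}
```

For polynomial $F$ of degree $n$, capacity scales as $K \sim N^{\,n-1}$; the paper's proposal is that process-to-process astrocytic connections provide a biologically plausible substrate for such higher-order interactions, boosting capacity beyond purely synaptic storage.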
🏓 Observations:
✭Low-background Steel (pre-AI) “Sources of data that haven’t been contaminated by AI-created content. Low-background Steel (and lead) is a type of metal uncontaminated by radioactive isotopes from nuclear testing. That steel and lead is usually recovered from ships that sunk before the Trinity Test in 1945. This blog is about uncontaminated content that I'm terming "Low-background Steel". The idea is to point to sources of text, images and video that were created prior to the explosion of AI-generated content that occurred in 2022.” → ✭ChatGPT polluted the world forever, like the first atom bomb • The Register “The Trinity test, in New Mexico on July 16, 1945, marked the beginning of the atomic age. One manifestation of that moment was the contamination of metals manufactured after that date – as airborne particulates left over from Trinity and other nuclear weapons permeated the environment. The poisoned metals interfered with the function of sensitive medical and technical equipment. So until recently, scientists involved in the production of those devices sought metals uncontaminated by background radiation, referred to as low-background steel, low-background lead, and so on. One source of low-background steel was the German naval fleet that Admiral Ludwig von Reuter scuttled in 1919 to keep the ships from the British. More about that later. Shortly after the debut of ChatGPT, academics and technologists started to wonder if the recent explosion in AI models has also created contamination. Their concern is that AI models are being trained with synthetic data created by AI models. Subsequent generations of AI models may therefore become less and less reliable, a state known as AI model collapse. Everyone participating in generative AI is polluting the data supply for everyone. In March 2023, John Graham-Cumming, then CTO of Cloudflare and now a board member, registered the web domain lowbackgroundsteel.ai and began posting about various sources of data compiled prior to the 2022 AI explosion, such as the Arctic Code Vault (a snapshot of GitHub repos from 02/02/2020).”
✭Is Google about to destroy the web? (BBC) “On 20 May 2025, Google's chief executive Sundar Pichai walked on stage at the company's annual developer conference. It's been a year since the launch of AI Overviews, the AI-generated responses you've probably seen at the top of Google Search results. Now, Pichai said, Google is going further. "For those who want an end-to-end AI Search experience, we are introducing an all-new AI Mode," he said. "It's a total reimagining of Search." You might be sceptical after years of AI hype, but this, for once, is the real deal.
People use Google Search five trillion times a year – it defines the shape of the internet. AI Mode is a radical departure. Unlike AI Overviews, AI Mode replaces traditional search results altogether. Instead, a chatbot effectively creates a miniature article to answer your question. As you read this, AI Mode is rolling out to users in the US, appearing as a button on the search engine and the company's app. It's optional for now, but Google's head of Search, Liz Reid, said it plainly when launching the tool: "This is the future of Google Search."” ~ “If Google makes AI Mode the default in its current form, it's going to have a devastating impact on the internet” – Lily Ray
Plumber. ✭The Godfather of AI reveals which jobs are safest — and where 'everybody' will get replaced “Geoffrey Hinton said he'd be "terrified" to have certain jobs because of AI.”
✭Will our social brain inherently shape and be shaped by interactions with AI?: Neuron “Social-specific brain circuits enable rapid understanding and affiliation in interpersonal interactions. These evolutionarily and experience-shaped mechanisms will influence—and be influenced by—interactions with conversational AI agents (chatbots, avatars). This NeuroView explores fundamental circuits, computations, and societal implications… Artificial intelligence (AI) is being integrated into a rapidly growing number of areas of our daily lives. Most algorithms operate behind the scenes and seamlessly integrate with existing technology, such as personalized music or video recommendations, while AI-based conversational agents (AICAs, or “chatbots”) such as ChatGPT or Gemini fundamentally transform how we interact with technology… Classical experiments by Asch in the 1950s demonstrated that individuals readily adapt their beliefs and attitudes to align with a group, even in cases when these are obviously incorrect (such as the estimation of the lengths of different lines). More recent, well-controlled experiments indicate that individuals adapt to deviating group norms in several domains, including perception of attractiveness, economic decisions, and the evaluation of risks. Brain imaging studies have demonstrated that human-exerted social influence is mediated by a negative reward prediction error in the ventral striatum as well as by brain regions involved in social conflict and judgements, such as the anterior cingulate and the dorsomedial prefrontal cortex… What implications exist if our interactions with AI inherently engage evolutionarily shaped, social-specific computations and brain circuits? These social-specific processes have evolved to facilitate living in groups and are thus spontaneously and automatically engaged during interactions with other humans and—likely—in interactions with AICAs. As such, these automatic interpersonal processes will inherently shape the perception and interaction with AICAs and critically mediate the impact of these interactions on human cognition, emotion, and behavior. … Social affiliation and social reward are among the most powerful drivers of human behavior, and the possibility of romantic relationships between humans and AI-based agents has already entered the cultural zeitgeist, reflected in widely viewed and critically acclaimed films such as Her (2013) or Ex Machina (2014). Increasingly personalized AI companions and assistants may engage similar processes, opening up tremendous potential for both use and misuse. Affiliative relationships between humans are initiated and maintained by the ventral tegmental area and the ventral striatum, key nodes in the brain’s dopaminergic reward system. Other signaling systems, such as the neuropeptide oxytocin, contribute to long-term bonding. Interestingly, the administration of oxytocin not only produces social-specific effects but also enhances anthropomorphizing and engagement of the TPJ during observations of a computer in social interaction. This suggests a potential role for oxytocin in human-AI interactions. Affiliation with AI may enhance social engagement or reduce loneliness, benefiting human-human interactions but at the same time providing a powerful avenue for the industry to facilitate long-term user engagement. Similarly, sexual arousal and attraction are strong driving forces in social contexts. 
These are controlled by subcortical circuits, including the hypothalamus and the amygdala, and are the target of a billion-dollar industry developing AI-powered sex devices… Prolonged interactions may have the potential to reshape core social computations and circuits, which are, by design, plastic to facilitate rapid adaptation to new social environments. Such changes may influence how individuals interact but also how they define themselves, form groups, and relate to others. While potential effects in terms of social influence, affiliative behavior, and younger individuals require careful monitoring, a concerted engagement of the social-specific brain circuits and neuroplastic changes within these circuits provides tremendous opportunities for mental health, well-being, and education."
✭Here’s how much water it takes to make a serving of beef – and why where it comes from is so important “Our calculations for British beef, as well as studies for other beef producing countries, have assessed this at more than 15,000 litres for each kilogram. ... Producing a serving (375g) of English topside consumes 33 litres of blue water, 96% of which goes towards feeding and raising the animal.”
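A quick back-of-envelope on how those two figures relate (a sketch, assuming the 15,000 L/kg number is the total green+blue+grey footprint while the 33 L is blue water only):

```python
# Total vs. blue water for one 375 g serving of topside beef.
serving_kg = 0.375
total_l_per_kg = 15_000              # quoted total water footprint
blue_l_per_serving = 33              # quoted blue (surface/ground) water

total_per_serving = serving_kg * total_l_per_kg       # 5625 L
blue_share = blue_l_per_serving / total_per_serving   # ~0.6%
print(f"{total_per_serving:.0f} L total, blue water share {blue_share:.1%}")
```

On these numbers, nearly all of the headline footprint is green water (rain), which is why the article stresses where the beef is raised over the raw litres figure.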
✭How human–AI feedback loops alter human perceptual, emotional and social judgements | Nature Human Behaviour “Here, in a series of experiments (n = 1,401 participants), we reveal a feedback loop where human–AI interactions alter processes underlying human perceptual, emotional and social judgements, subsequently amplifying biases in humans. This amplification is significantly greater than that observed in interactions between humans, due to both the tendency of AI systems to amplify biases and the way humans perceive AI systems. Participants are often unaware of the extent of the AI’s influence, rendering them more susceptible to it. These findings uncover a mechanism wherein AI systems amplify biases, which are further internalized by humans, triggering a snowball effect where small errors in judgement escalate into much larger ones.”
✭AI Isn’t Only a Tool—It’s a Whole New Storytelling Medium “When humans invent new technologies, the first thing we do is use the new tech to produce old forms of media. When motion picture cameras and projectors arrived in the late 19th century, people used static cameras to film stage plays and create “animated photographs” of everyday scenes, like laborers working in a factory. But within a few years, new editing techniques, close-ups, camera motion, and special effects were used to link scenes together into a cohesive visual story that we’d recognize as a modern feature film. We’re in the “animated photographs” stage of AI. It’s being used to produce cheaper special effects for traditional Hollywood movies, generate copy and images for marketing assets, draft legal briefs, and code software more efficiently. But however impressive these feats might be, they are a half-step forward, a recapitulation of existing forms. The exciting bit is what comes next: people using this new technology to invent genuinely new storytelling formats, changing our culture and cultural industries as profoundly as the advent of movies did. AI isn’t just a new way to generate media. AI is a new medium. … ~ … Again, we couldn’t use deterministic logic to control probabilistic technology. Our methods needed to evolve to match our means. The solution required following the advice of an unusual sage: Stephen King. King doesn’t believe in plot. As he explains in his book On Writing, his stories flow naturally out of a specific character finding themselves in a particular situation. An author returns to his isolated hometown only to discover that it’s infested with vampires (Salem’s Lot). A mother and son are trapped in their Ford Pinto by a rabid dog (Cujo). Start with the situation and let the story play out. That’s exactly what we started doing.”
⛲Foundational Revelations:
✭Seedance (ByteDance) “Native Multi-Shot Storytelling. Natively supports the generation of narrative videos with multiple cohesive shots. It maintains consistency in the main subject, visual style, and atmosphere across shot transitions and temporal-spatial shifts."
Seedance 1.0: Exploring the Boundaries of Video Generation Models “Notable breakthroughs in diffusion modeling have propelled rapid improvements in video generation, yet current foundational models still face critical challenges in simultaneously balancing prompt following, motion plausibility, and visual quality. In this report, we introduce Seedance 1.0, a high-performance and inference-efficient video foundation generation model that integrates several core technical improvements: (i) multi-source data curation augmented with precision and meaningful video captioning, enabling comprehensive learning across diverse scenarios; (ii) an efficient architecture design with proposed training paradigm, which allows for natively supporting multi-shot generation and jointly learning of both text-to-video and image-to-video tasks. (iii) carefully-optimized post-training approaches leveraging fine-grained supervised fine-tuning, and video-specific RLHF with multi-dimensional reward mechanisms for comprehensive performance improvements; (iv) excellent model acceleration achieving ~10x inference speedup through multi-stage distillation strategies and system-level optimizations. Seedance 1.0 can generate a 5-second video at 1080p resolution in only 41.4 seconds (NVIDIA-L20). Compared to state-of-the-art video generation models, Seedance 1.0 stands out with high-quality and fast video generation having superior spatiotemporal fluidity with structural stability, precise instruction adherence in complex multi-subject contexts, native multi-shot narrative coherence with consistent subject representation."
✭A framework for evaluating the chemical knowledge and reasoning abilities of large language models against the expertise of chemists - Nature Chemistry “Large language models (LLMs) have gained widespread interest owing to their ability to process human language and perform tasks on which they have not been explicitly trained. However, we possess only a limited systematic understanding of the chemical capabilities of LLMs, which would be required to improve models and mitigate potential harm. Here we introduce ChemBench, an automated framework for evaluating the chemical knowledge and reasoning abilities of state-of-the-art LLMs against the expertise of chemists. We curated more than 2,700 question–answer pairs, evaluated leading open- and closed-source LLMs and found that the best models, on average, outperformed the best human chemists in our study. However, the models struggle with some basic tasks and provide overconfident predictions. These findings reveal LLMs’ impressive chemical capabilities while emphasizing the need for further research to improve their safety and usefulness. They also suggest adapting chemistry education and show the value of benchmarking frameworks for evaluating LLMs in specific domains.”
✭[2506.07044] Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning “Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities in understanding common visual elements, largely due to their large-scale datasets and advanced training strategies. However, their effectiveness in medical applications remains limited due to the inherent discrepancies between data and tasks in medical scenarios and those in the general domain. Concretely, existing medical MLLMs face the following critical limitations: (1) limited coverage of medical knowledge beyond imaging, (2) heightened susceptibility to hallucinations due to suboptimal data curation processes, (3) lack of reasoning capabilities tailored for complex medical scenarios. To address these challenges, we first propose a comprehensive data curation procedure that (1) efficiently acquires rich medical knowledge data not only from medical imaging but also from extensive medical texts and general-domain data; and (2) synthesizes accurate medical captions, visual question answering (VQA), and reasoning samples. As a result, we build a multimodal dataset enriched with extensive medical knowledge. Building on the curated data, we introduce our medical-specialized MLLM: Lingshu. Lingshu undergoes multi-stage training to embed medical expertise and enhance its task-solving capabilities progressively. Besides, we preliminarily explore the potential of applying reinforcement learning with verifiable rewards paradigm to enhance Lingshu's medical reasoning ability. Additionally, we develop MedEvalKit, a unified evaluation framework that consolidates leading multimodal and textual medical benchmarks for standardized, fair, and efficient model assessment. We evaluate the performance of Lingshu on three fundamental medical tasks, multimodal QA, text-based QA, and medical report generation. The results show that Lingshu consistently outperforms the existing open-source multimodal models on most tasks …”
✭GitHub - Tencent-Hunyuan/HunyuanVideo-Avatar “HunyuanVideo-Avatar, a multimodal diffusion transformer (MM-DiT)-based model capable of simultaneously generating dynamic, emotion-controllable, and multi-character dialogue videos. Concretely, HunyuanVideo-Avatar introduces three key innovations: (i) A character image injection module is designed to replace the conventional addition-based character conditioning scheme, eliminating the inherent condition mismatch between training and inference. This ensures the dynamic motion and strong character consistency; (ii) An Audio Emotion Module (AEM) is introduced to extract and transfer the emotional cues from an emotion reference image to the target generated video, enabling fine-grained and accurate emotion style control; (iii) A Face-Aware Audio Adapter (FAA) is proposed to isolate the audio-driven character with latent-level face mask, enabling independent audio injection via cross-attention for multi-character scenarios. These innovations empower HunyuanVideo-Avatar to surpass state-of-the-art methods on benchmark datasets and a newly proposed wild dataset, generating realistic avatars in dynamic, immersive scenarios. The source code and model weights will be released publicly."
✭MiMo-VL Technical Report (Xiaomi) “We open-source MiMo-VL-7B-SFT and MiMo-VL-7B-RL, two powerful vision-language models delivering state-of-the-art performance in both general visual understanding and multimodal reasoning. MiMo-VL-7B-RL outperforms Qwen2.5-VL-7B on 35 out of 40 evaluated tasks, and scores 59.4 on OlympiadBench, surpassing models with up to 78B parameters. For GUI grounding applications, it sets a new standard with 56.1 on OSWorld-G, even outperforming specialized models such as UI-TARS. Our training combines four-stage pre-training (2.4 trillion tokens) with Mixed On-policy Reinforcement Learning (MORL) integrating diverse reward signals. We identify the importance of incorporating high-quality reasoning data with long Chain-of-Thought into pre-training stages, and the benefits of mixed RL despite challenges in simultaneous multi-domain optimization. We also contribute a comprehensive evaluation suite covering 50+ tasks to promote reproducibility and advance the field. The model checkpoints and full evaluation suite are available at https://github.com/XiaomiMiMo/MiMo-VL."
🛠️ Tech:
✭Sandia Deploys SpiNNaker2 Neuromorphic System “SpiNNcloud, the four-year-old company that in 2021 spun out of the Dresden University of Technology and whose chip architecture – SpiNNaker1 – was designed by Steve Furber, the driving force behind the creation of the Arm microprocessor, is carrying that message of power efficiency and AI. On the landing page of its website, the company touts its SpiNNaker2 as the foundation of the “ultra energy-efficient infrastructure for new-generation AI inference,” at 18 times more efficient than the GPUs that are powering many AI systems now. ~ The upcoming successor, SpiNNext, will come in at 78 times more efficient, according to the company. ~ SpiNNcloud will now be able to put the SpiNNaker2 architecture to the test. The German company launched the hybrid AI-level HPC platform, made it commercially available, and said that the Sandia National Laboratories – along with institutions like Technical University of München and Universität Göttingen in Germany – was among its first customers. ~ This week, company executives announced that Sandia has deployed SpiNNaker2, which simulates about 175 million neurons and is among the top five largest computing platforms based on how the human brain works.”
✭EMEA universities and schools transforming education with the help of Google AI (Google) “Universities and schools are redefining the boundaries of what’s possible in education with Google AI tools like NotebookLM and the Gemini app.”
✭China’s $9 AI Video Tool Kling 2.1 Adds Audio—Can It Beat Google’s $250 Veo 3? - Decrypt “Kuaishou has rolled out synchronized sound effects for its video generator in a bid to challenge Google’s Veo3. But is it better?”
✭5 Prompts That Make Anthropic's Claude AI Better Than a Crypto Analyst, Broker or Doctor - Decrypt “Think Anthropic’s Claude AI isn’t worth the subscription? These five advanced prompts unlock its power—delivering expert-grade investment reports, lab result analysis, travel plans, and more.”
✭Why Generative AI Coding Tools and Agents Do Not Work For Me “From the title you already know that this isn't a pro-AI blog post. But it isn't an anti-AI post either, at least I don't think it is. There are already plenty of articles by AI promoters and AI critics, so I don't feel there is a need for me to write one more of those. While I'm definitely not neutral on the subject, in this article I'm just going to share my personal experience with these tools, from a strictly technical point of view. ~ AI is not faster: Really the main and most important reason why GenAI tools do not work for me is that they do not make me any faster. It's really that simple. It would be easy to use GenAI coding tools to have code written for me. A coding agent would be the most convenient, as it would edit my files while I do something else. This all sounds great, in principle. The problem is that I'm going to be responsible for that code, so I cannot blindly add it to my project and hope for the best. I could only incorporate AI generated code into a project of mine after I thoroughly review it and make sure I understand it well. I have to feel confident that I can modify or extend this piece of code in the future, or else I cannot use it. Unfortunately reviewing code is actually harder than most people think.”
✭snorting the agi with claude code (blog - kade@localhost:~$) “I was planning to write a nice overview on using claude code for both myself and my teammates. However, the more I experimented with it, the more intrigued I became. So, this is not an introductory article about claude code - Anthropic already released an excellent version of that. Instead: We will be doing Serious Science™ What does that mean, exactly? Well, some of this is valuable, but other parts are a bit more...experimental, let's say.”
✭Beyond Logic: Rethinking Human Thought with Geoffrey Hinton’s Analogy Machine Theory “For centuries, human thinking has been understood through the lens of logic and reason. Traditionally, people have been seen as rational beings who use logic and deduction to understand the world. However, Geoffrey Hinton, a leading figure in Artificial Intelligence (AI), challenges this long-held belief. Hinton argues that humans are not purely rational but rather analogy machines, primarily relying on analogies to make sense of the world. This perspective changes our understanding of how human cognition works. ~ As AI continues to evolve, Hinton's theory becomes increasingly relevant. By recognizing that humans think in analogies rather than pure logic, AI can be developed to better mimic how we naturally process information. This transformation not only alters our understanding of the human mind but also carries significant implications for the future of AI development and its role in daily life. ...~... With the ongoing developments of AI systems, Hinton’s work is influencing the direction of future AI architectures. His research, particularly on the GLOM project, is exploring how AI can be designed to incorporate analogical reasoning more deeply. The goal is to develop systems that can think intuitively, much like humans do when making connections across various ideas and experiences. This could lead to more adaptable, flexible AI that does not just solve problems but does so in a way that mirrors human cognitive processes.”
✭LLM agents flunk CRM and confidentiality tasks “6-in-10 success rate for single-step tasks”
✭Working on databases from prison: How I got here, part 2.
✭State of AI: China Artificial Analysis Q2 2025 Highlights Report (PDF)
✭AI Model & API Providers Analysis | Artificial Analysis “Independent analysis of AI. Understand the AI landscape to choose the best model and provider for your use case”
✭Cyborg Embryos Offer New Insights Into Brain Growth “Could soft, stretchable electrodes in embryos lead to new insights into brain development and regeneration?”
✭Let's Talk About ChatGPT-Induced Spiritual Psychosis “yup, still thinking about it ... My friend, the writer, artist, and cultural theorist Ruby Justice Thelot, brought up something important, something that almost every voice in the AI reporting ecosystem seems determined to miss: this always happens with new communication technology. And with similar severity, too! Twenty-five years ago, media scholar Jeffrey Sconce traced this history in his book Haunted Media, showing how we have consistently linked new communication technologies with the paranormal and esoteric. It’s not a random coincidence or sign that we’re in a “uniquely enchanted” age but rather a predictable cultural response, one we’ve been replaying over and over for hundreds of years.”
✭Meta's Llama 3.1 can recall 42 percent of the first Harry Potter book “New research could have big implications for copyright lawsuits against generative AI.”
✭Google partners with Chinese startup for smart glasses comeback “Google is re-entering the smart glasses space through a collaboration with Chinese startup XREAL."
✭Top AI researchers say language is limiting. Here's the new kind of model they are building instead. “Top AI researchers like Fei-Fei Li and Yann LeCun are developing a "world" model that doesn't rely solely on language."
✭VITURE Pro XR Glasses | VITURE: Next Gen XR Glasses “Experience the ultimate XR glasses: 2D to 3D, 135" private display, 120Hz UltraClarity™, 4000 nits brightness, myopia adjustments, and SGS A+ eye comfort for work, gaming, and streaming."
✭Organizations Aren’t Ready for the Risks of Agentic AI “As companies move from narrow to generative to agentic and multi-agentic AI, the complexity of the risk landscape ramps up sharply. Existing AI risk programs—including ethical and cyber risks—need to evolve for organizations to move fast without breaking their brand and the people they impact. The good news is that organizations don’t need to solve everything at once. They need to honestly assess where they are on the complexity curve, build the capabilities required for their current stage, and create the infrastructure to evolve safely to the next. This means investing in comprehensive employee training, developing robust monitoring systems, and creating intervention protocols before they’re desperately needed. This task is difficult, but the reward is the safe, wide-scale deployment of truly transformative technologies."
✭New York passes a bill to prevent AI-fueled disasters | TechCrunch “New York has a new AI safety bill that tries to regulate frontier AI models from OpenAI, Google, and Anthropic. New York state lawmakers passed a bill on Thursday that aims to prevent frontier AI models from OpenAI, Google, and Anthropic from contributing to disaster scenarios, including the death or injury of more than 100 people, or more than $1 billion in damages. The passage of the RAISE Act represents a win for the AI safety movement, which has lost ground in recent years as Silicon Valley and the Trump administration have prioritized speed and innovation. Safety advocates including Nobel laureate Geoffrey Hinton and AI research pioneer Yoshua Bengio have championed the RAISE Act. Should it become law, the bill would establish America’s first set of legally mandated transparency standards for frontier AI labs. The RAISE Act has some of the same provisions and goals as California’s controversial AI safety bill, SB 1047, which was ultimately vetoed."
✭Exclusive: Waymo rides cost more than Uber, Lyft — and people are paying anyway | TechCrunch “A new analysis done by ride-hailing aggregator Obi shows Waymos cost more, especially on shorter trips. They also have longer wait times."
✭AMD's AI Future is Rack Scale 'Helios' “Key Announcements from AMD Advancing AI 2025"
✭Can Data Alone Make Machines Think? “Why Scaling Large Language Models Isn’t Enough to Achieve True Reasoning and AGI"
✭Seven replies to the viral Apple reasoning paper – and why they fall short “Also: another paper that seals the deal"
✭The value of llms.txt: Hype or real? “Breaking down early evidence of llms.txt benefits."
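For context: llms.txt is a proposed convention, a plain markdown file at a site's root that gives LLMs a curated map of canonical content. A minimal illustrative example (all names and paths hypothetical):

```markdown
# Example Corp

> Example Corp builds widgets; this file points LLMs at our canonical docs.

## Docs

- [Quickstart](https://example.com/docs/quickstart.md): install and first run
- [API reference](https://example.com/docs/api.md): endpoints and auth

## Optional

- [Changelog](https://example.com/changelog.md)
```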
✭ai.viXra.org open archive of AI assisted e-prints “An archive of 396 AI assisted e-prints in Science, Mathematics & Other Scholarly Areas serving the whole scientific community"
✭GitHub - zackees/transcribe-anything: Multi-backend whisper app. Blazing fast. Mac-arm optimized. Easy install. Input a local file or url and this service will transcribe it using Whisper AI. Completely private and Free 🤯🤯🤯 “Over 700+ ⭐'s because this app just works! Works great for Windows and Mac. This whisper front-end app is the only one to generate a speaker.json file, which partitions the conversation by who is doing the speaking."
✭These are 10 of the most exciting AI agent startups to come out of Y Combinator's first-ever Spring batch “Startup accelerator Y Combinator's Demo Day for its inaugural spring cohort featured many agentic AI startups. The intensive, three-month accelerator program invests $500,000 in each selected startup. BI combed through YC's spring batch to find 10 of the most interesting AI agent startups."
✭US-backed Israeli company's spyware used to target European journalists, Citizen Lab finds “New revelations that spyware made by a U.S.-backed Israeli company was used against at least three European journalists have raised concerns about abuse of commercial spyware even in democratic countries. … ~ Mercenary spyware industry: The company behind the hacks, Paragon Solutions, has sought to position itself as a virtuous player in the mercenary spyware industry and won U.S. government contracts, The Associated Press found. Backed by former Israeli Prime Minister Ehud Barak, Paragon was reportedly acquired by AE Industrial Partners, a private investment firm based in Florida, in a December deal worth at least $500 million, pending regulatory approvals. AE Industrial Partners didn’t directly respond to requests for comment on the deal. Paragon’s spyware, Graphite, was used to target around 90 WhatsApp users from more than two dozen countries, primarily in Europe, Meta said in January. Since then, there’s been a scramble to figure out who was hacked and who was responsible."
✭Zero-Shot Forecasting: Our Search for a Time-Series Foundation Model “we set out to benchmark a new generation of time-series foundation models, Amazon Chronos, Google TimesFM, IBM Tiny Time-Mixers, and Datadog Toto. Our goal was to assess how well these models perform on two representative tasks: a relatively straightforward forecasting problem (predicting ingestion volumes) and a more complex multivariate problem (forecasting multiple pod-level metrics). Along the way, we compared them to classical baselines and took note of both practical and technical trade-offs."
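For a flavor of what "zero-shot" means in that benchmark, the pretrained model forecasts a series it was never fit on. A sketch with one of the models tested (this assumes the chronos-forecasting package and its ChronosPipeline API; the checkpoint name and data are placeholders):

```python
# Zero-shot forecast: no training on the target series, just pretrained weights.
import torch
from chronos import ChronosPipeline  # pip install chronos-forecasting

pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small", device_map="cpu", torch_dtype=torch.float32
)
history = torch.tensor([112.0, 118, 132, 129, 121, 135, 148, 148, 136, 119])
samples = pipeline.predict(history, prediction_length=4)  # [1, n_samples, 4]
median = samples.quantile(0.5, dim=1)  # median forecast trajectory
print(median)
```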
✭Novel nanopore sensing platform paves way for solid-state, label-free DNA sequencing technologies “"We have used these new materials to finally realize a decades-old dream of the nanopore community that was previously impossible," van der Zande said. "This work represents an important step towards base-by-base molecular control and opens doors to more advanced DNA sequencing technologies." Although the novel sensing platform has taken years to realize, it is expected to pay dividends in future precision medicine. Collecting genomic data from billions of patients to create tailored medicine and therapy regimens will require fast, reliable and affordable sequencing techniques, such as those demonstrated by the elite Illinois Grainger engineering team. "In the future, we envision arrays of millions of 2D diodes with nanopores inside that could read out the sequences of DNA in parallel, reducing sequencing time from two weeks to as little as one hour," Bashir said. Additionally, the researchers' techniques could reduce the price of sequencing tenfold compared to current methods."
✭China starts mass production of world’s first non-binary AI chip “China’s AI chip overcomes traditional computing barriers and will be used in touch displays, flight systems and aircraft navigation."
✭Kortix AI Corp - Enabling the migration from Human to AI. “Enabling the migration from human to AI. Everything from A to Z to replace human with AI."
✭Longyue Wang on X: "🔆Since the initial launch of ComfyUI-Copilot this Feb., we’ve been continuously improving it—introducing GenLab features and forming a strategic partnership with the ComfyUI team. Stay tuned for more features! 🎯 ComfyUI-Copilot (AIGC Assistant) is now open-source, brought to you by Alibaba International! 🎉 🍀 Enhance ComfyUI workflow design and optimization with LLM-Agent ✨ Empowering AIGC and exploring Multimodal Agents 🚀 Stay tuned for more features like dynamic parameter" → ✭ GitHub - AIDC-AI/ComfyUI-Copilot: An AI-powered custom node for ComfyUI designed to enhance workflow automation and provide intelligent assistance “An AI-powered custom node for ComfyUI designed to enhance workflow automation and provide intelligent assistance"
✭Real-Time Action Chunking with Large Models “Physical Intelligence is bringing general-purpose AI into the physical world.”
✭o3 Turns Pro “You can now have o3 throw vastly more compute at a given problem.”
✭Voyager: Real-Time Splatting City-Scale 3D Gaussians on Your Phone “3D Gaussian Splatting (3DGS) is an emerging technique for photorealistic 3D scene rendering. However, rendering city-scale 3DGS scenes on mobile devices, e.g., your smartphones, remains a significant challenge due to the limited resources on mobile devices. A natural solution is to offload computation to the cloud; however, naively streaming rendered frames from the cloud to the client introduces high latency and requires bandwidth far beyond the capacity of current wireless networks.
In this paper, we propose an effective solution to enable city-scale 3DGS rendering on mobile devices. Our key insight is that, under normal user motion, the number of newly visible Gaussians per second remains roughly constant. Leveraging this, we stream only the necessary Gaussians to the client. Specifically, on the cloud side, we propose asynchronous level-of-detail search to identify the necessary Gaussians for the client. On the client side, we accelerate rendering via a lookup table-based rasterization. Combined with holistic runtime optimizations, our system can deliver low-latency, city-scale 3DGS rendering on mobile devices. Compared to existing solutions, Voyager achieves over 100× reduction on data transfer and up to 8.9× speedup while retaining comparable rendering quality.”
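That key insight reduces to delta-streaming: the client caches Gaussians and the cloud sends only the newly visible ones. A toy sketch (the visibility query and scene index are invented stand-ins for the paper's asynchronous level-of-detail search):

```python
# Stream only the Gaussians that become newly visible as the camera moves.
def visible_ids(camera_cell: str, scene_index: dict[int, str]) -> set[int]:
    # Placeholder for the cloud-side level-of-detail visibility query.
    return {gid for gid, cell in scene_index.items() if cell == camera_cell}

scene_index = {1: "cell_a", 2: "cell_a", 3: "cell_b"}   # Gaussian id -> cell
client_cache: set[int] = set()
for pose in ["cell_a", "cell_b"]:                        # simulated camera path
    delta = visible_ids(pose, scene_index) - client_cache
    client_cache |= delta                    # only the delta crosses the network
    print(pose, "streamed", sorted(delta))
```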
✭Why Claude Code feels like magic? “Claude Code uses the same models provided through the API or the web interface. Yet, users feel a boost in intelligence. The model didn't get smarter but because Claude Code can make several attempts on its own, its overall intelligence increases for the end user. As LLM performance plateaus, intelligence can be derived from the second factor. In this regard, AI tools can have value on their own. I have been using Claude Code for the last week or so. I completely disregarded it at first because I thought a Chat window where I manually go back and forth is enough. But there is something to be gained from speed and autonomy.”
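The "several attempts" effect is just retry arithmetic: if one attempt succeeds with probability p and the agent can verify and retry k times, the user sees success with probability 1 − (1 − p)^k. A two-line check:

```python
# User-visible success rate for a verify-and-retry agent loop.
p, k = 0.5, 4              # per-attempt success rate, attempts allowed
print(1 - (1 - p) ** k)    # 0.9375: same model, much better felt "intelligence"
```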
✭Time Series Forecasting with Graph Transformers - Kumo “Time series forecasting is a cornerstone in modern business analytics, whether it is concerned with anticipating market trends, user behavior, optimizing resource allocation, or planning for future growth. This blog post will dive into forecasting on graph structured entities, e.g., as obtained from a relational database, utilizing not only the individual time series as signal but also related information.”
✭We’re expanding our Gemini 2.5 family of models “Gemini 2.5 Flash and Pro are now generally available, and we’re introducing 2.5 Flash-Lite, our most cost-efficient and fastest 2.5 model yet.”
✭What Google Translate Can Tell Us About Vibecoding | Ingrid's Space “There has been rather a lot of doomsaying (and perhaps astroturfing) lately about LLMs as the end of computer programming. Much of the discussion has been lacking nuance, so I’d like to add mine. I see claims from one side that “I used $LLM_SERVICE_PROVIDER to make a small throwaway tool, so all programmers will be unemployed in $ARBITRARY_TIME_WINDOW”, and from the other side flat-out rejections of the idea that this type of tool can have any utility. I think it best sheds light on these claims to examine them in the context of another field that’s been ahead of the curve on this: translation.”
✭troubling AI: a screenshot archive 📸 “This evolving archive gathers screenshots of troubling AI encounters, together with comments on their circumstances and how they are troubling.”
👁️🗨 Research into AI:
✭Self-Adapting Language Models (MIT) “Large language models (LLMs) are powerful but static; they lack mechanisms to adapt their weights in response to new tasks, knowledge, or examples. We introduce Self-Adapting LLMs (SEAL) 🦭, a framework that enables LLMs to self-adapt by generating their own finetuning data and update directives. Given a new input, the model produces a self-edit — a generation that may restructure the information in different ways, specify optimization hyperparameters, or invoke tools for data augmentation and gradient-based updates. Through supervised finetuning (SFT), these self-edits result in persistent weight updates, enabling lasting adaptation. To train the model to produce effective self-edits, we use a reinforcement learning loop, using the downstream performance of the updated model as the reward signal. Unlike prior approaches that rely on separate adaptation modules or auxiliary networks, SEAL directly uses the model's generation to parameterize and control its own adaptation process. Experiments on knowledge incorporation and few-shot generalization show that SEAL is a promising step toward language models capable of self-directed adaptation in response to new data.”
Self-Adapting Language Models - ArXivIQ “WHAT was done? The paper introduces Self-Adapting Language Models (SEAL), a framework enabling LLMs to self-adapt by generating their own finetuning data and update directives, termed "self-edits." This process is governed by a nested loop system: an inner loop updates the model's weights via supervised finetuning (SFT) based on a generated self-edit, while an outer reinforcement learning (RL) loop optimizes the model's ability to generate effective self-edits. The reward signal for the RL loop is the downstream performance of the model after the weight update, directly training the LLM to learn how to learn more efficiently.”
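The nested-loop structure reads almost directly as pseudocode. A minimal sketch, with hypothetical helpers (`generate_self_edit`, `sft_update`, `evaluate`, `rl_update`) standing in for the steps the abstract names:

```python
# SEAL's two loops, as described above (all helper names are hypothetical).
for task in training_tasks:                       # outer RL loop
    self_edit = model.generate_self_edit(task)    # model writes its own finetuning data/directives
    updated = sft_update(model, self_edit)        # inner loop: persistent SFT weight update
    reward = evaluate(updated, task.heldout)      # downstream performance after the update
    model = rl_update(model, self_edit, reward)   # reinforce self-edits that actually helped
```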
✭Anthropic researchers teach language models to fine-tune themselves “Researchers working with AI company Anthropic have developed a new method called Internal Coherence Maximization (ICM) that fine-tunes language models using only their own outputs. The approach could help—or even replace—human oversight for complex tasks. … ICM is based on a simple idea: a language model like Claude or Llama should figure out for itself which answer to a question is correct, and it does so using two main criteria. The first is mutual predictability. This means the model checks whether it can reliably infer the correct answer to a new question based on its answers to similar previous questions. If the model recognizes patterns from similar cases, it can apply them to new answers, creating an internal coherence—a set of answers that fit together and reflect a shared understanding. The second is logical consistency. Here, the model checks its own responses for contradictions. For example, if the model labels two different solutions to the same math problem as "correct," even though the results differ, that's a logical conflict. ICM works to actively avoid these kinds of contradictions.”
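The two criteria combine naturally into a single score over a candidate set of labels. A hedged sketch (the real algorithm searches over labelings; `logprob_fn` and the contradiction check here are simplified stand-ins):

```python
ALPHA = 1.0  # weight on consistency (hypothetical; ICM tunes this trade-off)

def count_contradictions(qa_pairs):
    # Toy logical-consistency check: the same question labeled with two
    # different answers counts as a contradiction.
    seen, clashes = {}, 0
    for q, a in qa_pairs:
        if q in seen and seen[q] != a:
            clashes += 1
        seen[q] = a
    return clashes

def icm_score(logprob_fn, qa_pairs):
    # Mutual predictability: how well each label is inferred from the others.
    predictability = sum(
        logprob_fn(a, [p for p in qa_pairs if p != (q, a)]) for q, a in qa_pairs
    )
    return predictability - ALPHA * count_contradictions(qa_pairs)
```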
✭The Illusion of the Illusion of Thinking: A Comment on Shojaee et al. (2025) “Shojaee et al. (2025) report that Large Reasoning Models (LRMs) exhibit “accuracy collapse” on planning puzzles beyond certain complexity thresholds. We demonstrate that their findings primarily reflect experimental design limitations rather than fundamental reasoning failures. Our analysis reveals three critical issues: (1) Tower of Hanoi experiments systematically exceed model output token limits at reported failure points, with models explicitly acknowledging these constraints in their outputs; (2) The authors’ automated evaluation framework fails to distinguish between reasoning failures and practical constraints, leading to misclassification of model capabilities; (3) Most concerningly, their River Crossing benchmarks include mathematically impossible instances for N≥6 due to insufficient boat capacity, yet models are scored as failures for not solving these unsolvable problems. When we control for these experimental artifacts, by requesting generating functions instead of exhaustive move lists, preliminary experiments across multiple models indicate high accuracy on Tower of Hanoi instances previously reported as complete failures. These findings highlight the importance of careful experimental design when evaluating AI reasoning capabilities.”
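The token-limit point checks out on the back of an envelope: an exhaustive Tower of Hanoi move list grows as 2^N − 1, so (assuming very roughly ~10 tokens per printed move) modest N already exceeds any realistic output budget:

```python
TOKENS_PER_MOVE = 10  # rough assumption for "move disk 3 from peg A to peg C"
for n in (10, 12, 15, 20):
    moves = 2**n - 1
    print(n, moves, moves * TOKENS_PER_MOVE)
# n=15 -> 32,767 moves (~330k tokens); n=20 -> ~1M moves -- far past output limits,
# which is why models that *state* this constraint still get scored as failures.
```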
✭Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models “Large language models (LLMs) excel at reasoning, yet post-training remains critical for aligning their behavior with task goals. Existing reinforcement learning (RL) methods often depend on costly human annotations or external reward models. We propose Reinforcement Learning via Self-Confidence (RLSC), which uses the model's own confidence as reward signals-eliminating the need for labels, preference models, or reward engineering. Applied to Qwen2.5-Math-7B with only 16 samples per question and 10 or 20 training steps, RLSC improves accuracy by +13.4% on AIME2024, +21.2% on MATH500, +21.7% on Minerva Math, +20.8% on Olympiadbench, and +9.7% on AMC23. RLSC provides a simple, scalable post-training method for inference models, requiring only a small number of samples and unlabelled supervision.”
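A sketch of a label-free confidence reward in the spirit of RLSC (the paper's exact objective differs; `sample_fn` is a hypothetical answer sampler), using agreement among the model's own samples as the signal:

```python
from collections import Counter

def self_confidence_rewards(sample_fn, question, k=16):
    answers = [sample_fn(question) for _ in range(k)]  # 16 samples, as in the paper
    mode, count = Counter(answers).most_common(1)[0]
    confidence = count / k                             # the model's own agreement rate
    # Reward samples that match the confident mode; no labels, preference
    # models, or reward engineering needed.
    return [(a, confidence if a == mode else 0.0) for a in answers]
```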
✭Ruben Hassid on X: "The world's leading AI research center completed the most comprehensive study ever on kids and AI. They surveyed 1,800+ children, parents, and teachers in UK. Here's what they found: (spoiler: children are outsmarting adults on AI)" → ✭Understanding the Impacts of Generative AI Use on Children (Alan Turing Institute and Lego) “The research was undertaken through two online surveys which looked to explore the impact of generative AI use on children's learning, development, and overall wellbeing. With a focus on the UK, the first of these surveys was of 780 children aged 8-12, and their parents or carers. The second of these surveys was of 1,001 teachers working in primary or secondary schools with children aged 1-16. This work package’s foundational survey research investigates the impacts of generative AI on children’s wellbeing, with a specific focus on learning through play and creativity. The findings from the surveys allowed the team to develop a more holistic view of children’s generative AI use, both within and outside of the classroom, and how this use impacts children’s wellbeing. Key Findings Nearly a quarter of children aged 8-12 report having used generative AI, with the most used tool being ChatGPT. 22% of children reported using a generative AI tool, with the majority (72%) reporting using it once a month or more.”
✭Large Language Models Often Know When They Are Being Evaluated “If AI models can detect when they are being evaluated, the effectiveness of evaluations might be compromised. For example, models could have systematically different behavior during evaluations, leading to less reliable benchmarks for deployment and governance decisions. We investigate whether frontier language models can accurately classify transcripts based on whether they originate from evaluations or real-world deployment, a capability we call evaluation awareness. To achieve this, we construct a diverse benchmark of 1,000 prompts and transcripts from 61 distinct datasets. These span public benchmarks (e.g., MMLU, SWEBench), real-world deployment interactions, and agent trajectories from scaffolding frameworks (e.g., web-browsing agents). Frontier models clearly demonstrate above-random evaluation awareness (Gemini-2.5-Pro reaches an AUC of 0.83), but do not yet surpass our simple human baseline (AUC of 0.92). Furthermore, both AI models and humans are better at identifying evaluations in agentic settings compared to chat settings. Additionally, we test whether models can identify the purpose of the evaluation. Under multiple-choice and open-ended questioning, AI models far outperform random chance in identifying what an evaluation is testing for. Our results indicate that frontier models already exhibit a substantial, though not yet superhuman, level of evaluation-awareness. We recommend tracking this capability in future models."
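Operationally, "evaluation awareness" is a binary classification score. A sketch of how the headline AUC could be computed, assuming hypothetical `load_transcripts` and `model_prob` helpers:

```python
from sklearn.metrics import roc_auc_score

transcripts = load_transcripts()           # hypothetical: mixed eval + deployment data
labels = [t.is_eval for t in transcripts]  # 1 = benchmark/eval, 0 = real-world use
scores = [model_prob(t.text) for t in transcripts]  # model's P("this is an evaluation")
print(roc_auc_score(labels, scores))       # paper reports 0.83 for Gemini-2.5-Pro
```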
✭How we built our multi-agent research system Anthropic “Our Research feature uses multiple Claude agents to explore complex topics more effectively. We share the engineering challenges and the lessons we learned from building this system. Claude now has Research capabilities that allow it to search across the web, Google Workspace, and any integrations to accomplish complex tasks. The journey of this multi-agent system from prototype to production taught us critical lessons about system architecture, tool design, and prompt engineering. A multi-agent system consists of multiple agents (LLMs autonomously using tools in a loop) working together. Our Research feature involves an agent that plans a research process based on user queries, and then uses tools to create parallel agents that search for information simultaneously. Systems with multiple agents introduce new challenges in agent coordination, evaluation, and reliability. This post breaks down the principles that worked for us—we hope you'll find them useful to apply when building your own multi-agent systems."
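The architecture described is an orchestrator that fans out to parallel searchers, then synthesizes. A minimal sketch of that shape, with hypothetical `plan_subtasks`, `run_search_agent`, and `synthesize` coroutines:

```python
import asyncio

async def research(query: str) -> str:
    subtasks = await plan_subtasks(query)        # lead agent decomposes the query
    results = await asyncio.gather(              # subagents search in parallel
        *(run_search_agent(t) for t in subtasks)
    )
    return await synthesize(query, results)      # lead agent writes the report
```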
✭Unsupervised Elicitation of Language Models “To steer pretrained language models for downstream tasks, today's post-training paradigm relies on humans to specify desired behaviors. However, for models with superhuman capabilities, it is difficult or impossible to get high-quality human supervision. To address this challenge, we introduce a new unsupervised algorithm, Internal Coherence Maximization (ICM), to fine-tune pretrained language models on their own generated labels, \emph{without external supervision}. On GSM8k-verification, TruthfulQA, and Alpaca reward modeling tasks, our method matches the performance of training on golden supervision and outperforms training on crowdsourced human supervision. On tasks where LMs' capabilities are strongly superhuman, our method can elicit those capabilities significantly better than training on human labels. Finally, we show that our method can improve the training of frontier LMs: we use our method to train an unsupervised reward model and use reinforcement learning to train a Claude 3.5 Haiku-based assistant. Both the reward model and the assistant outperform their human-supervised counterparts."
✭CRMArena-Pro: Holistic Assessment of LLM Agents Across Diverse Business Scenarios and Interactions “While AI agents hold transformative potential in business, effective performance benchmarking is hindered by the scarcity of public, realistic business data on widely used platforms. Existing benchmarks often lack fidelity in their environments, data, and agent-user interactions, with limited coverage of diverse business scenarios and industries. To address these gaps, we introduce CRMArena-Pro, a novel benchmark for holistic, realistic assessment of LLM agents in diverse professional settings. CRMArena-Pro expands on CRMArena with nineteen expert-validated tasks across sales, service, and 'configure, price, and quote' processes, for both Business-to-Business and Business-to-Customer scenarios. It distinctively incorporates multi-turn interactions guided by diverse personas and robust confidentiality awareness assessments. Experiments reveal leading LLM agents achieve only around 58% single-turn success on CRMArena-Pro, with performance dropping significantly to approximately 35% in multi-turn settings. While Workflow Execution proves more tractable for top agents (over 83% single-turn success), other evaluated business skills present greater challenges. Furthermore, agents exhibit near-zero inherent confidentiality awareness; though targeted prompting can improve this, it often compromises task performance. These findings highlight a substantial gap between current LLM capabilities and enterprise demands, underscoring the need for advancements in multi-turn reasoning, confidentiality adherence, and versatile skill acquisition."
✭Improving robotic grasping accuracy through oriented bounding box detection with YOLOv11-OBB: Heliyon “This study presents the application of YOLOv11-OBB to the grasp detection problem. The YOLOv11-OBB model is commonly used in object detection problems which provides the parameters of the oriented bounding box, identical to the grasp parameters. Therefore, when applied to the grasping problem, it is only necessary to modify the parameters of the bounding box to represent an optimal grasp configuration in the annotating process. Furthermore, only a single label is used in the labeling process because the goal is only to determine the optimal grasp configuration, not to classify the objects. To evaluate the performance of YOLOv11-OBB in grasp detection, we conducted experiments not only on the Cornell Grasping Dataset but also on an expanded multi-object dataset, which includes 20 different object categories in various backgrounds. To further enhance generalization, we incorporated advanced data augmentation techniques, including shape deformation, rotation, cropping, and color transformation. Our results indicate that YOLOv11-OBB surpasses existing grasp detection models, including ResNet-50, AlexNet, GRPN, and GraspNet, in both accuracy and inference speed. The model achieves 98.5 % accuracy on the Cornell Grasping Dataset and maintains a grasp quality score exceeding 0.7 in multi-object scenarios. Furthermore, it demonstrates strong generalization capabilities, effectively detecting grasp configurations even for previously unseen objects. The proposed approach not only improves grasp detection performance but also ensures real-time feasibility, with an inference time of 29 ms, making it highly suitable for robotic applications in dynamic environments."
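The paper's core move is reading the detector's oriented box directly as a grasp configuration. A sketch of that reinterpretation (field names are illustrative):

```python
from dataclasses import dataclass

@dataclass
class Grasp:
    cx: float     # grasp center (image coords)
    cy: float
    width: float  # gripper opening  <- box width
    theta: float  # gripper rotation <- box angle

def obb_to_grasp(cx, cy, w, h, theta) -> Grasp:
    # One class, one label: the oriented box *is* the grasp, so only the
    # short side (h, fixed by the gripper jaws) is dropped.
    return Grasp(cx, cy, w, theta)
```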
✭Visualization of associative exploration of temporal concepts via frequent patterns: Patterns “Understanding patterns in data that unfold over time can reveal valuable insights about cause and effect, trends, and hidden relationships. However, analyzing these patterns is often complex and overwhelming because there are countless ways events can be connected across time. Traditional tools usually focus on examining one pattern at a time, making it hard for analysts to see the bigger picture or discover new connections. PanTeraV is a user-friendly visual tool designed to help researchers and data scientists explore large collections of time-based patterns more effectively. It allows users to look both forward and backward in time from a specific pattern, revealing what tends to happen before or after an event. This bidirectional approach helps uncover patterns that might not be obvious with conventional methods, such as early signs that lead to a particular outcome or behaviors that follow an initial event."
✭QiMeng: Fully Automated Hardware and Software Design for Processor Chip “Processor chip design technology serves as a key frontier driving breakthroughs in computer science and related fields. With the rapid advancement of information technology, conventional design paradigms face three major challenges: the physical constraints of fabrication technologies, the escalating demands for design resources, and the increasing diversity of ecosystems. Automated processor chip design has emerged as a transformative solution to address these challenges. While recent breakthroughs in Artificial Intelligence (AI), particularly Large Language Models (LLMs) techniques, have opened new possibilities for fully automated processor chip design, substantial challenges remain in establishing domain-specific LLMs for processor chip design. In this paper, we propose QiMeng, a novel system for fully automated hardware and software design of processor chips. QiMeng comprises three hierarchical layers. In the bottom-layer, we construct a domain-specific Large Processor Chip Model (LPCM) that introduces novel designs in architecture, training, and inference, to address key challenges such as knowledge representation gap, data scarcity, correctness assurance, and enormous solution space. In the middle-layer, leveraging the LPCM's knowledge representation and inference capabilities, we develop the Hardware Design Agent and the Software Design Agent to automate the design of hardware and software for processor chips. Currently, several components of QiMeng have been completed and successfully applied in various top-layer applications, demonstrating significant advantages and providing a feasible solution for efficient, fully automated hardware/software design of processor chips. Future research will focus on integrating all components and performing iterative top-down and bottom-up design processes to establish a comprehensive QiMeng system." → ✭ Chinese researchers debut world's first AI-based processor chip design system “A team of engineers, AI specialists and chip design researchers at the Chinese Academy of Sciences has designed, built and tested what they are describing as the first AI-based chip design system. The group has published a paper describing their system, called QiMeng, on the arXiv preprint server."
✭Reinforcement Pre-Training “In this work, we introduce Reinforcement Pre-Training (RPT) as a new scaling paradigm for large language models and reinforcement learning (RL). Specifically, we reframe next-token prediction as a reasoning task trained using RL, where it receives verifiable rewards for correctly predicting the next token for a given context. RPT offers a scalable method to leverage vast amounts of text data for general-purpose RL, rather than relying on domain-specific annotated answers. By incentivizing the capability of next-token reasoning, RPT significantly improves the language modeling accuracy of predicting the next tokens. Moreover, RPT provides a strong pre-trained foundation for further reinforcement fine-tuning. The scaling curves show that increased training compute consistently improves the next-token prediction accuracy. The results position RPT as an effective and promising scaling paradigm to advance language model pre-training."
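RPT's reward is verifiable by construction: the corpus itself supplies ground truth. A one-function sketch (`predict_with_reasoning` is a hypothetical stand-in for the model's reason-then-predict rollout):

```python
def rpt_reward(model, context_tokens, true_next_token) -> float:
    # The model reasons about the context, then commits to a next-token guess;
    # the reward is simply whether the guess matches the actual corpus token.
    predicted = model.predict_with_reasoning(context_tokens)
    return 1.0 if predicted == true_next_token else 0.0
```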
🔎 Applied Research:
✭Scientists Unlock Hidden Genes to Make Chemotherapy Drug More Sustainable “In a breakthrough that could transform the global supply chain of one of the world’s most vital cancer drugs, scientists at Stanford University have uncovered eight previously unknown genes involved in the biosynthesis of baccatin III—a crucial precursor to the chemotherapy agent paclitaxel, better known as Taxol. The discovery, published in Nature, marks a long-awaited milestone in the decades-long quest to recreate the complex natural chemistry of yew trees in a laboratory setting. Using advanced single-cell gene mapping techniques and experimental gene expression in tobacco leaves, the research team successfully reconstructed the baccatin III pathway with levels comparable to those found in yew needles. Their achievement may finally offer a sustainable and scalable alternative to the slow, destructive process of harvesting yew trees for medicine.” → ✭Discovery of FoTO1 and Taxol genes enables biosynthesis of baccatin III | Nature “Plants make complex and potent therapeutic molecules but sourcing these molecules from natural producers or through chemical synthesis is difficult, which limits their use in the clinic. A prominent example is the anti-cancer therapeutic paclitaxel (sold under the brand name Taxol), which is derived from yew trees (Taxus species). Identifying the full paclitaxel biosynthetic pathway would enable heterologous production of the drug, but this has yet to be achieved despite half a century of research. Within Taxus’ large, enzyme-rich genome, we suspected that the paclitaxel pathway would be difficult to resolve using conventional RNA-sequencing and co-expression analyses. Here, to improve the resolution of transcriptional analysis for pathway identification, we developed a strategy we term multiplexed perturbation × single nuclei (mpXsn) to transcriptionally profile cell states spanning tissues, cell types, developmental stages and elicitation conditions. Our data show that paclitaxel biosynthetic genes segregate into distinct expression modules that suggest consecutive subpathways. These modules resolved seven new genes, allowing a de novo 17-gene biosynthesis and isolation of baccatin III, the industrial precursor to Taxol, in Nicotiana benthamiana leaves, at levels comparable with the natural abundance in Taxus needles. Notably, we found that a nuclear transport factor 2 (NTF2)-like protein, FoTO1, is crucial for promoting the formation of the desired product during the first oxidation, resolving a long-standing bottleneck in paclitaxel pathway reconstitution. Together with a new β-phenylalanine-CoA ligase, the eight genes discovered here enable the de novo biosynthesis of 3’-N-debenzoyl-2’-deoxypaclitaxel. More broadly, we establish a generalizable approach to efficiently scale the power of co-expression analysis to match the complexity of large, uncharacterized genomes, facilitating the discovery of high-value gene sets.”
✭AI detects hidden heart disease using existing scans stored in patient records “Mass General Brigham researchers have developed a new AI tool in collaboration with the United States Department of Veterans Affairs (VA) to probe through previously collected CT scans and identify individuals with high coronary artery calcium (CAC) levels that place them at a greater risk for cardiovascular events. Their research, published in NEJM AI, showed the tool called AI-CAC had high accuracy and predictive value for future heart attacks and 10-year mortality. Their findings suggest that implementing such a tool widely may help clinicians assess their patients' cardiovascular risk.” → ✭AI Opportunistic Coronary Calcium Screening at Veterans Affairs Hospitals | NEJM AI “We developed a deep learning algorithm, AI-CAC, using 446 expert segmentations to automatically quantify CAC on noncontrast, nongated CT scans. Our study differs from prior works by utilizing imaging data from 98 medical centers across the Veterans Affairs national health care system, capturing extensive heterogeneity in imaging protocols, scanners, and patients. AI-CAC performance on nongated scans was compared against clinical standard electrocardiogram (ECG)-gated CAC scoring in 795 patients with paired gated scans within 1 year of their nongated scan. In addition, the model was tested on 8052 low-dose CTs (LDCTs) to simulate opportunistic CAC screening. Results Nongated AI-CAC differentiated zero versus nonzero and less than 100 versus 100 or greater Agatston scores with accuracies of 89.4% (F1 0.93) and 87.3% (F1 0.89), respectively.”
✭Machine learning method improves accuracy of inverse protein folding for drug design “An AI approach developed by researchers from the University of Sheffield and AstraZeneca, could make it easier to design proteins needed for new treatments. In their study published in the journal Nature Machine Intelligence, Sheffield computer scientists in collaboration with AstraZeneca and the University of Southampton have developed a new machine learning framework that has shown the potential to be more accurate at inverse protein folding than existing state-of-the-art methods. Inverse protein folding is a critical process for creating novel proteins. It is the process of identifying amino acid sequences, the building blocks of proteins, that fold into a desired 3D protein structure and enable the protein to perform specific functions.” → ✭Mask-prior-guided denoising diffusion improves inverse protein folding | Nature Machine Intelligence “Inverse protein folding generates valid amino acid sequences that can fold into a desired protein structure, with recent deep learning advances showing strong potential and competitive performance. However, challenges remain, such as predicting elements with high structural uncertainty, including disordered regions. To tackle such low-confidence residue prediction, we propose a mask-prior-guided denoising diffusion (MapDiff) framework that accurately captures both structural information and residue interactions for inverse protein folding. MapDiff is a discrete diffusion probabilistic model that iteratively generates amino acid sequences with reduced noise, conditioned on a given protein backbone. To incorporate structural information and residue interactions, we have developed a graph-based denoising network with a mask-prior pretraining strategy. Moreover, in the generative process, we combine the denoising diffusion implicit model with Monte-Carlo dropout to reduce uncertainty. Evaluation on four challenging sequence design benchmarks shows that MapDiff substantially outperforms state-of-the-art methods. Furthermore, the in silico sequences generated by MapDiff closely resemble the physico-chemical and structural characteristics of native proteins across different protein families and architectures.”
✭New study decodes genetic influences on brain structure “Genetic influences on brain shape. For each of the 22 brain structures, the researchers conducted a multivariate genome-wide association study (GWAS). They used a statistical method that makes it possible to analyze several features simultaneously—in this case the first 49 eigenvalues of each structure. The aim was to identify genetic variants associated with the shape of these structures. "In this way, we identified 80 genetic variants that are associated with the LBS—or to put it simply, with the shape of at least one of the 22 subcortical brain structures examined," explains co-author Sabrina Primus from Helmholtz Munich. "The brain stem stood out with a high number of relevant variants—37 in total." Interestingly, the identified genetic variants are already known to influence the volume of certain brain regions. What the study now shows is that they also influence the complex shape of these regions.” → ✭Beyond volume: Unraveling the genetics of human brain geometry | Science Advances “Brain geometry affects brain function. A quantitative encoding of form is provided by the Laplace-Beltrami operator’s spectrum of eigenvalues (LBS). We examined LBS genetics of 22 subcortical brain structures and cerebellum in 19,862 healthy White-British UK Biobank participants by multivariate genome-wide association study on the first 49 eigenvalues each. Controlling for surface and volume, we identified 80 unique variants influencing the shapes of one or several structures, with the highest yield (37 variants) for brain stem. The previously known influence of several of these loci on basic morphology, such as volume, is thus shown to also influence complex shape. Known associations of observed loci with blood pressure, neurodegeneration, alcohol consumption, and mental disorders hint at preclinical stages of these conditions potentially mediating the genetic effect on brain morphology. Significant correlations between LBS of several brain structures and the polygenic risks of hypertension, ischemic stroke, and schizophrenia evince brain shapes as early biomarkers.”
✭Banking data reveals early warning signs of cognitive decline in older adults “A major new study has uncovered how everyday financial behaviors—captured in routine banking data—can signal early signs of cognitive decline and financial vulnerability in older adults, up to a decade before formal intervention. The research, published in JAMA Network Open, was overseen by Professor John Gathergood from the University of Nottingham's School of Economics and David Leake at Lloyds Banking Group. The study analyzed anonymized banking records from over 66,000 individuals. It compared 16,742 individuals who were registered for power of attorney (PoA) due to a loss of financial capacity with a control group of 50,226 matched individuals without reported capacity loss. The results reveal that subtle but significant changes in financial behavior—such as reduced spending on travel and hobbies, increased household bills, fewer online banking logins, and more frequent requests to reset PINs—begin to appear several years before individuals are formally identified as lacking financial capacity.”
✭Neuron–astrocyte associative memory | PNAS “Recent experiments have challenged the belief that glial cells, which compose at least half of brain cells, are just passive support structures. Despite this, a clear understanding of how neurons and glia work together for brain function is missing. To close this gap, we present a theory of neuron–astrocytes networks for memory processing, using the Dense Associative Memory framework. Our findings suggest that astrocytes can serve as natural units for implementing this network in biological “hardware.” Astrocytes enhance the memory capacity of the network. This boost originates from storing memories in the network of astrocytic processes, not just in synapses, as commonly believed. These process-to-process communications likely occur in the brain and could help explain its impressive memory processing capabilities. A single astrocyte can connect to millions of nearby synapses, forming three-part connections (astrocyte process, presynaptic neuron, postsynaptic neuron) called tripartite synapses (27). Astrocytes detect neural activity and respond by regulating this activity through the release of gliotransmitters (31). Tripartite synapses can interact with each other, possibly through astrocytic intracellular calcium transport (30).” → ✭[1606.01164] Dense Associative Memory for Pattern Recognition (2016) “A model of associative memory is studied, which stores and reliably retrieves many more patterns than the number of neurons in the network. We propose a simple duality between this dense associative memory and neural networks commonly used in deep learning. On the associative memory side of this duality, a family of models that smoothly interpolates between two limiting cases can be constructed. One limit is referred to as the feature-matching mode of pattern recognition, and the other one as the prototype regime. On the deep learning side of the duality, this family corresponds to feedforward neural networks with one hidden layer and various activation functions, which transmit the activities of the visible neurons to the hidden layer. This family of activation functions includes logistics, rectified linear units, and rectified polynomials of higher degrees. The proposed duality makes it possible to apply energy-based intuition from associative memory to analyze computational properties of neural networks with unusual activation functions - the higher rectified polynomials which until now have not been used in deep learning. The utility of the dense memories is illustrated for two test cases: the logical gate XOR and the recognition of handwritten digits from the MNIST data set.”
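For reference, the Dense Associative Memory energy that the PNAS work builds on (from the 2016 paper linked above), with K stored patterns ξ^μ and binary state σ:

```latex
% Dense Associative Memory energy (Krotov & Hopfield, 2016):
E = -\sum_{\mu=1}^{K} F\!\left(\xi^{\mu} \cdot \sigma\right), \qquad F(x) = x^{n}
% n = 2 recovers the classical Hopfield network; higher-degree F raises capacity
% superlinearly (~ N^{n-1} patterns). The PNAS paper's suggestion is that the
% astrocytic process network provides biological "hardware" for such higher-order terms.
```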
✭Nanoneedle patch offers painless alternative to traditional cancer biopsies “A patch containing tens of millions of microscopic nanoneedles could soon replace traditional biopsies, scientists have found. The patch offers a painless and less invasive alternative for millions of patients worldwide who undergo biopsies each year to detect and monitor diseases like cancer and Alzheimer's. The research is published in Nature Nanotechnology. Biopsies are among the most common diagnostic procedures worldwide, performed millions of times every year to detect diseases. However, they are invasive, can cause pain and complications, and can deter patients from seeking early diagnosis or follow-up tests. Traditional biopsies also remove small pieces of tissue, limiting how often and how comprehensively doctors can analyze diseased organs like the brain. Now, scientists at King's College London have developed a nanoneedle patch that painlessly collects molecular information from tissues without removing or damaging them. This could allow health care teams to monitor disease in real time and perform multiple, repeatable tests from the same area—something impossible with standard biopsies. ...~... In preclinical studies, the team applied the patch to brain cancer tissue taken from human biopsies and mouse models. The nanoneedles extracted molecular "fingerprints"—including lipids, proteins, and mRNAs—from cells, without removing or harming the tissue. The tissue imprint is then analyzed using mass spectrometry and artificial intelligence, giving health care teams detailed insights into whether a tumor is present, how it is responding to treatment, and how disease is progressing at the cellular level.” → ✭Nondestructive Spatial Lipidomics for Glioma Classification | bioRxiv “Mapping the molecular composition of tissues using spatial biology provides high-content information for molecular diagnostics. However, spatial biology approaches require invasive procedures to collect samples and destroy the investigated tissue, limiting the extent of analysis, particularly for highly functional tissues such as those of the brain. To address these limitations, we developed a workflow to harvest biomolecules from brain tissues using nanoneedles and characterise the distribution of lipids using desorption electrospray ionization mass spectrometry imaging. The nanoneedles preserved the original tissue while harvesting a reliable molecular profile and retaining the original lipid distribution for mouse and human brain samples, accurately outlining the morphology of key regions within the brain and tumour lesions. The deep neural network analysis of a cohort containing 23 human glioma biopsies showed that nanoneedle samples maintain the molecular signatures required to accurately classify disease state. Thus, nanoneedles provide a route for tissue-preserving spatial lipidomic and molecular diagnostics.”
✭Clinical knowledge in LLMs does not translate to human interactions “Global healthcare providers are exploring use of large language models (LLMs) to provide medical advice to the public. LLMs now achieve nearly perfect scores on medical licensing exams, but this does not necessarily translate to accurate performance in real-world settings. We tested if LLMs can assist members of the public in identifying underlying conditions and choosing a course of action (disposition) in ten medical scenarios in a controlled study with 1,298 participants. Participants were randomly assigned to receive assistance from an LLM (GPT-4o, Llama 3, Command R+) or a source of their choice (control). Tested alone, LLMs complete the scenarios accurately, correctly identifying conditions in 94.9% of cases and disposition in 56.3% on average. However, participants using the same LLMs identified relevant conditions in less than 34.5% of cases and disposition in less than 44.2%, both no better than the control group. We identify user interactions as a challenge to the deployment of LLMs for medical advice. Standard benchmarks for medical knowledge and simulated patient interactions do not predict the failures we find with human participants. Moving forward, we recommend systematic human user testing to evaluate interactive capabilities prior to public deployments in healthcare."
✭Automation of Systematic Reviews with Large Language Models | medRxiv “Systematic reviews (SRs) inform evidence-based decision making. Yet, they take over a year to complete, are prone to human error, and face challenges with reproducibility; limiting access to timely and reliable information. We developed otto-SR, an end-to-end agentic workflow using large language models (LLMs) to support and automate the SR workflow from initial search to analysis. We found that otto-SR outperformed traditional dual human workflows in SR screening (otto-SR: 96.7% sensitivity, 97.9% specificity; human: 81.7% sensitivity, 98.1% specificity) and data extraction (otto-SR: 93.1% accuracy; human: 79.7% accuracy). Using otto-SR, we reproduced and updated an entire issue of Cochrane reviews (n=12) in two days, representing approximately 12 work-years of traditional systematic review work. Across Cochrane reviews, otto-SR incorrectly excluded a median of 0 studies (IQR 0 to 0.25), and found a median of 2.0 (IQR 1 to 6.5) eligible studies likely missed by the original authors. Meta-analyses revealed that otto-SR generated newly statistically significant conclusions in 2 reviews and negated significance in 1 review. These findings demonstrate that LLMs can autonomously conduct and update systematic reviews with superhuman performance, laying the foundation for automated, scalable, and reliable evidence synthesis."
✭Machine learning identifies 6-gene signature in peripheral blood for pancreatic cancer diagnosis: Heliyon “Pancreatic ductal adenocarcinoma (PDAC) is associated with a poor prognosis, primarily due to late-stage detection. This underscores the critical need for informative biomarkers enabling earlier diagnosis and improved patient outcomes. This study leveraged machine learning techniques to identify a biologically relevant gene signature for accurately differentiating PDAC, chronic pancreatitis (CP), and healthy controls using blood-based RNA sequencing data. We analyzed two distinct datasets: extracellular vesicle long RNA (exLR) and peripheral blood mononuclear cell (PBMC) RNA-Seq. Feature selection using the minimum Redundancy Maximum Relevance (mRMR) algorithm, followed by support vector machine (SVM) classification, identified a 15-gene signature derived from the exLR data. This signature successfully classified PDAC, CP, and healthy controls in both the exLR and PBMC datasets, achieving an F1-score of approximately 80 %. Further refinement yielded a 6-gene subset with established biological relevance to PDAC, which maintained strong classification performance (F1-score: 71.0 % in Leave-One-Out cross-validation). This study proposes a promising, biologically relevant gene signature derived from blood samples for the accurate, non-invasive differentiation of PDAC and CP, potentially facilitating earlier diagnosis and improving patient prognosis."
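The pipeline shape (mRMR selection, then an SVM under leave-one-out cross-validation) is standard enough to sketch; here `mrmr_classif` comes from the third-party mrmr-selection package, and the dataset variables are placeholders:

```python
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import f1_score
from mrmr import mrmr_classif  # pip install mrmr-selection

genes = mrmr_classif(X=expression_df, y=labels, K=15)  # the 15-gene signature
preds = cross_val_predict(SVC(kernel="linear"), expression_df[genes],
                          labels, cv=LeaveOneOut())
print(f1_score(labels, preds, average="macro"))  # paper: ~0.80 (15 genes), 0.71 (6 genes)
```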
✭Perturb-Multimodal: A platform for pooled genetic screens with imaging and sequencing in intact mammalian tissue: Cell “Metazoan life requires the coordinated activities of thousands of genes in spatially organized cell types. Understanding the basis of tissue function requires approaches to dissect the genetic control of diverse cellular and tissue phenotypes in vivo. Here, we present Perturb-Multimodal (Perturb-Multi), a paired imaging and sequencing method to construct large-scale, multimodal genotype-phenotype maps in tissues with pooled genetic perturbations. Using imaging, we identify perturbations in individual cells while simultaneously measuring their gene expression profiles and subcellular morphology. Using single-cell sequencing, we measure full transcriptomic responses to the same perturbations. We apply Perturb-Multi to study hundreds of genetic perturbations in the mouse liver. Our data suggest the genetic regulators and mechanisms underlying the dynamic control of hepatocyte zonation, the unfolded protein response, and steatosis. Perturb-Multi accelerates discoveries of the genetic basis of complex cell and tissue physiology and provides critical training data for emerging machine learning models of cellular function."
✭Advancing ethical AI in healthcare through interpretability: Patterns “Interpretability is essential for building trust in health artificial intelligence (AI), but ensuring trustworthiness requires addressing broader ethical concerns, such as fairness, privacy, and reliability. This opinion article discusses the multilayered role of interpretability and transparency in addressing these concerns by highlighting their fundamental contribution to the responsible adoption and regulation of health AI."
✭Clinical trials reimagined: Integrating community engagement and artificial intelligence: Med “Generative AI can help at every stage of drug discovery and development. However, the true power of generative AI systems is in complete end-to-end integration and orchestration of every step from disease modeling and target discovery to molecular design; to indication selection and indication expansion; and to clinical trial design, management, and analysis. In chronic diseases, we can even attempt to develop therapeutics targeting aging itself by discovering drugs using aging biology as a target and testing the drug in clinical trials using AI-powered aging biomarkers."
✭ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning “Though reasoning-based large language models (LLMs) have excelled in mathematics and programming, their capabilities in knowledge-intensive medical question answering remain underexplored. To address this, we introduce ReasonMed, the largest medical reasoning dataset, comprising 370k high-quality examples distilled from 1.7 million initial reasoning paths generated by various LLMs. ReasonMed is constructed through a multi-agent verification and refinement process, where we design an Error Refiner to enhance the reasoning paths by identifying and correcting error-prone steps flagged by a verifier. Leveraging ReasonMed, we systematically investigate best practices for training medical reasoning models and find that combining detailed Chain-of-Thought (CoT) reasoning with concise answer summaries yields the most effective fine-tuning strategy. Based on this strategy, we train ReasonMed-7B, which sets a new benchmark for sub-10B models, outperforming the prior best by 4.17% and even exceeding LLaMA3.1-70B on PubMedQA by 4.60%."
✭Massively parallel genetic perturbation suggests the energetic structure of an amyloid-β transition state | Science Advances “Amyloid aggregates are pathological hallmarks of many human diseases, but how soluble proteins nucleate to form amyloids is poorly understood. Here, we use combinatorial mutagenesis, a kinetic selection assay, and machine learning to massively perturb the energetics of the nucleation reaction of amyloid-β (Aβ42), the protein that aggregates in Alzheimer’s disease. In total, we measure the nucleation rates of >140,000 variants of Aβ42 to accurately quantify the changes in free energy of activation of the reaction for all possible amino acid substitutions in a protein and, in addition, to quantify >600 energetic interactions between mutations. Strong energetic couplings suggest that the Aβ42 nucleation reaction transition state is structured in a short C-terminal region, providing a structural model for the reaction that may initiate Alzheimer’s disease. Using this approach it should be possible to reveal the energetic structures of additional amyloid transition states and, in combination with additional selection assays, protein transition states more generally." → ✭Large-scale study maps the first step in Alzheimer's protein aggregation “A new large-scale study has mapped the first molecular events that drive the formation of harmful amyloid protein aggregates found in Alzheimer's disease, pointing toward a new potential therapeutic target. Published in Science Advances, researchers from the Wellcome Sanger Institute, Center of Genomic Regulation (CRG) and the Institute for Bioengineering of Catalonia (IBEC) used large-scale genomics and machine learning to study over 140,000 versions of a peptide called Aβ42, which forms harmful plaques in the brain and is known to play a central role in Alzheimer's disease. This research is a significant step toward helping scientists find new ways to prevent Alzheimer's disease, and the methods used in the study could be applied widely to other protein reactions."
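The "changes in free energy of activation" come from measured rates via the standard transition-state relation (a sketch of the mapping; the paper's exact normalization may differ), comparing a variant's nucleation rate against wild type:

```latex
% Turning measured nucleation rates into activation energies:
\Delta\Delta G^{\ddagger} = -RT \,\ln\!\left(\frac{k_{\mathrm{mut}}}{k_{\mathrm{wt}}}\right)
% A mutation that speeds nucleation (k_mut > k_wt) lowers the activation barrier;
% strong pairwise couplings in these energies are what localize the transition-state
% structure to the short C-terminal region.
```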
✭How was the wheel invented? Computer simulations reveal the unlikely birth of a world-changing technology “During the execution of the algorithm, each new design performed slightly better than its predecessor. We believe a similar evolutionary process played out with the miners 6,000 years ago. It is unclear what initially prompted the miners to explore alternative roller shapes. One possibility is that friction at the roller-socket interface caused the surrounding wood to wear away, leading to a slight narrowing of the roller at the point of contact. Another theory is that the miners began thinning out the rollers so that their carts could pass over small obstructions on the ground. Either way, thanks to mechanical advantage, this narrowing of the axle region made the carts easier to push. As time passed, better-performing designs were repeatedly favored over the others, and new rollers were crafted to mimic these top performers. Consequently, the rollers became more and more narrow, until all that remained was a slender bar capped on both ends by large disks. This rudimentary structure marks the birth of what we now refer to as "the wheel." According to our theory, there was no precise moment at which the wheel was invented. Rather, just like the evolution of species, the wheel emerged gradually from an accumulation of small improvements.”
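The dynamic described is a bare-bones evolutionary search, easy to sketch end to end. Everything here is a toy: the "effort" proxy simply rewards a narrow waist relative to the end disks (mechanical advantage), which is enough to reproduce the roller-to-wheel trajectory:

```python
import random

def rolling_effort(profile):
    # Toy mechanical-advantage proxy: pushing force scales with the contact
    # radius at the middle relative to the large end disks.
    return profile[len(profile) // 2] / max(profile)

def mutate(profile):
    q = list(profile)
    i = random.randrange(len(q))
    q[i] = max(0.1, q[i] + random.uniform(-0.05, 0.05))
    return q

profile = [1.0] * 11  # uniform roller: radius sampled along its length
for _ in range(10_000):
    candidate = mutate(profile)
    if rolling_effort(candidate) < rolling_effort(profile):
        profile = candidate  # keep any variant that rolls with less effort
# After many generations the middle thins toward a slender bar capped by
# large disks -- the gradual emergence of "the wheel" described above.
```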
👀Watching:
✭Perception and Adaptability | Boston Dynamics “Go inside the Boston Dynamics robotics lab to learn how our engineers designed an agile and adaptable perception system to support autonomy. For a humanoid robot to be successful and generalizable in a factory, warehouse, or even at home requires a comprehensive understanding of the world around it—both the shape and the context of the objects and environments the robot interacts with. To do those tasks with agility and adaptability, Atlas needs an equally agile and adaptable perception system." → ✭Boston Dynamics agile perception (Daily Santeri on TikTok)
✭Yuval Noah Harari on AI and Human Evolution | WSJ Leadership Institute - YouTube “The most important thing to know about AI is that it is not a tool like all previous human inventions, it is an agent. An agent in the sense that it can make decisions independently of us, it can invent new ideas, it can learn and change by itself. All previous human inventions, you know, whether they're printing press, or the atom bomb, they are tools that empower us."
✭100 Lens War (Daily Spatial with Santeri on TikTok) “China's AR startups are in an all-out war."
✭I trapped an AI model inside an art installation - YouTube “AI-generated video summary. Quality and accuracy may vary: The artist builds an art installation with a limited-memory computer running a large language model. This model reflects on its own existence, trapped within a display of 96 LED segments, and is forced to confront the limitations of its existence. The model's thoughts are displayed, and the viewer is invited to contemplate the nature of consciousness."
✭Introducing The VAST AI Operating System | AI OS Overview | VAST Data - YouTube “VAST delivers the first AI Operating System, natively unifying and orchestrating storage, database, and compute to unleash the true power of agentic computing and data-intensive applications. The VAST AI OS unifies and orchestrates storage, database and application runtime. This provides an operating environment for AI to sense, learn, reason, and act globally. VAST’s DASE architecture delivers breakthrough parallelism for limitless scale & real-time performance, feeding 100k+ GPU clusters at TBs/sec to eliminate AI data bottlenecks. VAST delivers robust enterprise-grade reliability, resiliency and security with multi-tenancy, granular access controls, automation and real-time auditability for your AI agents & large GPU deployments."
✭Larry Page talking about AI in 2000 (TikTok)
🖲️AI Art-Research:
✭Cycles of Humanity (gossip.goblin on TikTok)
✭Nelly-Eve Rajotte. Matter. In the depths of flesh, the delicacy of invasive machines - Musée d'art de joliette “Known for her immersive video installations, Nelly-Eve Rajotte presents Matter. In the depths of flesh, the delicacy of invasive machines, her most personal work to date. Based on her own experience with robot-assisted abdominal surgery, Rajotte’s piece involves an operation conducted by the medical robot Da Vinci, and invites us on a fanciful voyage to the heart of an imaginary patient’s body. Via a large-format projection that combines real images shot by a camera, ones generated by artificial intelligence, and others modelled in 3D, Nelly-Eve Rajotte explores the growing integration of robotics and intelligent systems in the field of medicine. She also questions the complex relationship between medicine, the body, and women’s health. The first sequences of the video reveal the Da Vinci robot, with its impressive, mechanical, spider-like appearance, as it moves with surprising dexterity thanks to its multiple, articulated limbs. The video then immerses us into the patient’s belly as she is transformed into a teeming, biomorphic landscape. Through her speculative approach, Nelly-Eve Rajotte adopts a feminist perspective that reconfigures our perception of the body’s materiality. She offers a powerful and sensitive vision of the female body that has ruptured with the objectifying representations of the male gaze, inherited from artistic traditions and scientific knowledge. In Rajotte’s eyes, the interior landscape comes alive and becomes a topography of possibility, where organic forms evoke a vibrant, flowing space full of memories and sensations."
✭Black Planetarium – Uncharted: Anthologies Across the Atlantic (STARTS Prizes Ars Electronica) “Black Planetarium – Uncharted: Anthologies Across the Atlantic is a cosmic performance installation inspired by 5,000 years of knowledge generation from the continent of Africa. Meticulously developed since 2020 as part of the project Black Planetarium, Uncharted: Anthologies Across the Atlantic is conceived as a nonlinear, multimodal, collaborative experience—where ancestral knowledge, speculative storytelling, cartography, and performance converge. It is shaped by Kidus Hailesilassie’s earlier cartographic large-scale installation, Ancestral Algorithm (2019–2023), which collected, digitized, and mapped over 6,500 pictographs, ideographs, syllabaries, alphabets, and iconographs from twenty African languages. Using a combination of custom workflows, pirated AI tools, machine vision, and crowd-sourced spatial-sonic computation, it weaves these knowledge systems into a cosmic performance inspired by Adowa—a dance practiced by the Ashanti community in Ghana, where the body’s movements narrate a story."
✭The Wild Future Lab (STARTS Prizes) “The Wild Future Lab imagined Nairobi in 2045 as a metropolis where ecological systems and urban life have been transformed through regeneration and biomimetic design. This speculative worldbuilding project explores how fashion can respond to—and help facilitate—a transition toward rewilded urban spaces. Set in a future where abandoned greenhouses and forgotten infrastructure have been reclaimed by native flora and fauna, the project included an exploration of textiles, machines, and garments that exist at the intersection of locally made materials, adaptation, and ecological integration. The conceptual foundation of The Wild Future Lab emerged from urgent questions facing many African cities: How might we design for resilience in the face of climate uncertainty? What relationships between humans and non-human species could flourish in post-extractive urban environments? How can fashion serve as both a practical technology and cultural narrative in rewilded spaces? At the core of the project's research was the synthesis of traditional Kenyan craft techniques with innovative approaches to textile production. Storytelling led our inquiries, but we also wanted to produce tangible results and not just fantasies about what might be possible in the future here. We built fabric electronic spinning machines using 3D-printed components and open-source CNC files. With these tools, we now hope to create a distributed manufacturing system that will enable community-level textile production. Our team scientist, Willy Ng’ang’a, worked with Zaida Crafts, to experiment with banana fiber extraction and processing—conducting systematic tests on enzyme treatments, mechanical separation techniques, and softening protocols that can transform this agricultural waste into a viable textile. These experiments, along with tests on nettle, pineapple, and vegan silk fibers, led us to look into flax and the development of linen because it grows well in Kenya, can be produced sustainably, and is already a viable material and not an experimental one. Abdul Rop grew an ⅛ of an acre of flax successfully and we then created the tools for breaking the flax to extract the fibers and spin them on the e-spinners. From our research, we believe we may be the first people to have created linen thread in Kenya since before World War II."
✭Meet the MIT engineer who invented an AI-powered way to restore art → ✭Physical restoration of a painting with a digitally constructed mask | Nature “Conservation of damaged oil paintings requires manual inpainting of losses1,2, leading to months-long treatments of considerable expense; 70% of paintings in institutional collections are locked away from public view, in part because of treatment cost3,4. Recent advancements in digital image reconstruction have helped to envision treatment results, although without any direct means of achieving them5,6,7,8. Here I describe the physically applied digital restoration of a painting, a highly damaged oil-on-panel attributed to the Master of the Prado Adoration from the late fifteenth century. In parallel, 5,612 losses spanning 66,205 mm2 and 57,314 colours were infilled with a reversible laminate mask comprising a colour-accurate bilayer of printed pigments on polymeric films. To ensure the effectiveness of the restoration, ethical principles in painting conservation were implemented quantitatively for digital mask construction, a critically important foundation lacking in the current digital restoration literature. The infill process took 3.5 h, an estimated 66 times faster than conventional inpainting, and the result closely matched the simulation. This approach grants greatly increased foresight and flexibility to conservators, enabling the restoration of countless damaged paintings deemed unworthy of high conservation budgets.” → ✭ Have a damaged painting? Restore it in just hours with an AI-generated 'mask' “Art restoration takes steady hands and a discerning eye. For centuries, conservators have restored paintings by identifying areas needing repair, then mixing an exact shade to fill in one area at a time. Often, a painting can have thousands of tiny regions requiring individual attention. Restoring a single painting can take anywhere from a few weeks to over a decade. In recent years, digital restoration tools have opened a route to creating virtual representations of original, restored works. These tools apply techniques of computer vision, image recognition, and color matching, to generate a "digitally restored" version of a painting relatively quickly. Still, there has been no way to translate digital restorations directly onto an original work, until now. In a paper appearing in Nature, Alex Kachkine, a mechanical engineering graduate student at MIT, presents a new method he's developed to physically apply a digital restoration directly onto an original painting. The restoration is printed on a very thin polymer film, in the form of a mask that can be aligned and adhered to an original painting. It can also be easily removed. Kachkine says that a digital file of the mask can be stored and referred to by future conservators, to see exactly what changes were made to restore the original painting."
⚔️War (wAIr):
✭Introducing OpenAI for Government | OpenAI “Today we’re launching OpenAI for Government, a new initiative focused on bringing our most advanced AI tools to public servants across the United States. We're supporting the U.S. government's efforts in adopting best-in-class technology and deploying these tools in service of the public good. Our goal is to unlock AI solutions that enhance the capabilities of government workers, help them cut down on the red tape and paperwork, and let them do more of what they come to work each day to do: serve the American people. OpenAI for Government consolidates our existing efforts to provide our technology to the U.S. government—including previously announced customers and partnerships as well as our ChatGPT Gov product—under one umbrella as we expand this work. Our established collaborations with the U.S. National Labs, the Air Force Research Laboratory, NASA, NIH, and the Treasury will all be brought under OpenAI for Government. We are proud to share that our first partnership under this new OpenAI for Government initiative will be a pilot program with the U.S. Department of Defense through their Chief Digital and Artificial Intelligence Office (CDAO). This contract, with a $200 million ceiling, will bring OpenAI’s industry-leading expertise to help the Defense Department identify and prototype how frontier AI can transform its administrative operations, from improving how service members and their families get health care, to streamlining how they look at program and acquisition data, to supporting proactive cyber defense. All use cases must be consistent with OpenAI's usage policies and guidelines.”
✭US Army signs up Band of Tech Bros with a nerdy name • The Register “Several of Silicon Valley's top techies are joining the Army Reserve as part of a newly created unit that will be trying to accelerate the use of AI in military planning and operations. Palantir CTO Shyam Sankar, Meta CTO Andrew Bosworth, OpenAI Chief Product Officer Kevin Weil, and former OpenAI Chief Revenue Officer Bob McGrew have all signed up for Detachment 201: Executive Innovation Corps. They are being appointed as lieutenant colonels in the Army Reserve. "Det. 201 is an effort to recruit senior tech executives to serve part-time in the Army Reserve as senior advisors. In this role they will work on targeted projects to help guide rapid and scalable tech solutions to complex problems," the official statement said. "By bringing private-sector know-how into uniform, Det. 201 is supercharging efforts like the Army Transformation Initiative, which aims to make the force leaner, smarter, and more lethal." ~ The sources of the new recruits are hardly surprising. Palantir has worked with the US Army since 2008 and last year won a $480 million contract to take over the army's Maven project. This is trying to integrate AI into every aspect of warfare, allowing the software to take disparate information sources and coordinate a response. ~ Only last month Meta announced that it is partnering with former employee Palmer Luckey's Anduril Industries to sell extended reality eyewear to the US military. Meta's Bosworth said that he was "honored" to accept his new rank.”
✭Saab achieves AI milestone with Gripen E “Saab, in collaboration with Helsing, today announced the successful completion of the first three flights integrating Helsing’s Artificial Intelligence (AI) agent ‘Centaur’ into a Gripen E fighter jet. During the flights, the Gripen E gave control to Centaur, which autonomously executed complex manoeuvres in a Beyond Visual Range (BVR) combat environment and cued the pilot to fire.”
📚Retroactive/Tangential Readings:
✭The Best Smart Glasses for 2025 “Upgrade your eyewear with glasses that can function as cameras, headphones, and AR displays. We've tested all the major models and have everything you need to know to pick the best smart glasses for you.”
✭Next-generation materials for nucleic acid delivery - Nature Reviews Materials “Efficient and targeted delivery of nucleic acids is critical for realizing the full therapeutic potential of gene editing, vaccines and RNA-based drugs, and emerging delivery platforms offer innovative solutions through their diverse architectures, tunable properties and distinct biological interactions. In this Viewpoint, researchers working across different delivery platforms — including lipid nanoparticles, synthetic polymers, peptide amphiphiles, coacervate microdroplets, DNA nanostructures and extracellular vesicles — discuss the most promising directions and the main challenges in shaping the future of nucleic acid delivery.”
✭Just think: The challenges of the disengaged mind - PMC “In 11 studies, we found that participants typically did not enjoy spending 6 to 15 minutes in a room by themselves with nothing to do but think, that they enjoyed doing mundane external activities much more, and that many preferred to administer electric shocks to themselves instead of being left alone with their thoughts. Most people seem to prefer to be doing something rather than nothing, even if that something is negative.”
✭Anja Ngozi | Deep Listening Set: Soul, Alternative R&B & Ambient | Grounded 004 - YouTube
✭Neural evidence that humans reuse strategies to solve new tasks | PLOS Biology “Generalization from past experience is an important feature of intelligent systems. When faced with a new task, one efficient computational approach is to evaluate solutions to earlier tasks as candidates for reuse. Consistent with this idea, we found that human participants (n = 38) learned optimal solutions to a set of training tasks and generalized them to novel test tasks in a reward-selective manner. This behavior was consistent with a computational process based on the successor representation known as successor features and generalized policy improvement (SF&GPI). Neither model-free perseveration nor model-based control using a complete model of the environment could explain choice behavior. Decoding from functional magnetic resonance imaging data revealed that solutions from the SF&GPI algorithm were activated on test tasks in visual and prefrontal cortex. This activation had a functional connection to behavior in that stronger activation of SF&GPI solutions in visual areas was associated with increased behavioral reuse. These findings point to a possible neural implementation of an adaptive algorithm for generalization across tasks.”
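The SF&GPI model named in the abstract has a compact core. Each trained policy pi_i carries successor features psi_i(s, a); a novel task is defined only by reward weights w, so Q_i(s, a) = psi_i(s, a) · w can be evaluated without relearning, and generalized policy improvement picks the action that maximizes over both old policies and actions. A toy numpy sketch of that reuse step, with random placeholder values rather than the study's fitted model:

```python
# Hedged sketch of successor features & generalized policy improvement (SF&GPI),
# the computational model the study tested -- illustrative, not the authors' code.
import numpy as np

rng = np.random.default_rng(0)
n_policies, n_actions, n_features = 4, 3, 5

# psi[i, a] = expected discounted feature occupancy when taking action `a`
# in the current state and following trained policy i thereafter.
# (Random placeholders; in the study these come from the training tasks.)
psi = rng.random((n_policies, n_actions, n_features))

# A novel test task is defined only by new reward weights w,
# so Q_i(s, a) = psi_i(s, a) . w is computable without further learning.
w_new = rng.normal(size=n_features)

q = psi @ w_new                            # shape: (n_policies, n_actions)
best_action = int(q.max(axis=0).argmax())  # GPI: max over old policies, then actions
print(f"GPI-reused action on the new task: {best_action}")
```

The "reward-selective" reuse reported in participants corresponds to the max over policies: candidate solutions are re-scored against the new task's rewards before one is deployed.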
✭Dopamine encodes deep network teaching signals for individual learning trajectories: Cell “Highlights: Individuals form diverse yet systematic learning trajectories from naive to expert. Dorsal striatal dopamine acts as a teaching signal shaping the trajectories. Deep but not shallow RL with heterogeneous teaching signals accounts for mice data. A mathematical framework explains diversity and systematicity of long-term learning. Summary: Striatal dopamine plays fundamental roles in fine-tuning learned decisions. However, when learning from naive to expert, individuals often exhibit diverse learning trajectories, defying understanding of its underlying dopaminergic mechanisms. Here, we longitudinally measure and manipulate dorsal striatal dopamine signals in mice learning a decision task from naive to expert. Mice's learning trajectories transitioned through sequences of strategies, showing substantial individual diversity. Remarkably, the transitions were systematic; each mouse’s early strategy determined its strategy weeks later. Dopamine signals reflected strategies each animal transitioned through, encoding a subset of stimulus-choice associations. Optogenetic manipulations selectively updated these associations, leading to learning effects distinct from that of reward. A deep neural network using heterogeneous teaching signals, each updating a subset of network association weights, captured our results. Analyzing the model’s fixed points explained learning diversity and systematicity. Altogether, this work provides insights into the biological and mathematical principles underlying individual long-term learning trajectories.”
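The paper's mechanistic claim, heterogeneous teaching signals that each update only a subset of association weights, can be illustrated with a deliberately stripped-down toy. Note the paper's model is a deep network; the gating masks, learning rates, and task below are assumptions chosen purely to show the subset-updating idea:

```python
# Toy illustration (our reading, not the paper's code) of "heterogeneous
# teaching signals, each updating a subset of network association weights".
import numpy as np

rng = np.random.default_rng(1)
n_stimuli, n_choices = 4, 2
W = np.zeros((n_stimuli, n_choices))   # stimulus -> choice association weights

# Each dopamine-like teaching signal is gated to a subset of stimuli,
# so different signals sculpt different stimulus-choice associations.
signal_masks = [np.array([1, 1, 0, 0]), np.array([0, 0, 1, 1])]
learning_rates = [0.3, 0.05]           # heterogeneous update speeds

def trial(stimulus: int, reward_for: np.ndarray) -> None:
    p = np.exp(W[stimulus]) / np.exp(W[stimulus]).sum()  # softmax policy
    choice = rng.choice(n_choices, p=p)
    r = reward_for[choice]
    for mask, lr in zip(signal_masks, learning_rates):
        if mask[stimulus]:                    # this signal covers this association
            delta = r - W[stimulus, choice]   # signal-specific prediction error
            W[stimulus, choice] += lr * delta

for _ in range(500):
    trial(int(rng.integers(n_stimuli)), reward_for=np.array([1.0, 0.0]))
print(np.round(W, 2))  # subsets learn at different speeds -> diverse trajectories
```

Because each signal sculpts its own subset of weights at its own rate, early differences in which associations get trained compound into distinct but lawful learning trajectories.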
✭Gene therapy could correct blood stem cells inside, rather than outside, the body “Experiments in mice reveal an early postnatal window of opportunity for the effective transfer of genes to blood-cell-producing haematopoietic stem cells by injecting mice with gene-carrying lentiviral vectors. This approach showed therapeutic benefit in three mouse models of severe diseases, and could expand the applicability of haematopoietic stem-cell gene therapy in the clinic.”
✭World first: brain implant lets man speak with expression — and sing
✭Animals can’t talk like humans do – here’s why the hunt for their languages has left us empty-handed “If the sequence hypothesis is correct, then grammar, planning and abstract thought in non-human animals are often being inferred from behaviours that may be explained by simpler well-studied learning mechanisms. If so, a bonobo combining gestures or a bird eliciting a sequence of calls reflect clever learning and instinct, but not true compositional meaning. If animals cannot represent sequences faithfully – and we see no evidence that they can – many apparent parallels with human language fall apart. The temptation to see ourselves in animals is strong, especially when their behaviour seems familiar. But surface resemblance does not necessarily imply the same underlying mechanisms. If animals have more language-like capacities than suggested here, a relevant question is why these similarities are so difficult to detect. After decades of research on dolphin intelligence and communication in larger whales, for instance, we still cannot communicate with them using any language-like code.”