🫓 Sept 4-10th: AlphaProteo, Replit Agent, Colossus, SSI raises $1 billion, Lumen Orbit, Neurode, Pong prodigy Hydrogel, Project Sid, Loopy Avatar, DepthCrafter, ReMamba, Transfusion, NiceAunties Sora
+ AI Hype Fraud: Reflection-Llama-70B, Apple introduces clinical-grade, over-the-counter Hearing Aid, Hacker News Book Map, DeepSeek-V2.5, Emotion Recognition in VR, Where Minds Come From (M. Levin)
Image from proof-of-process semi-autonomous AI-Artist (Aug 13-27th, 2024)
🫓 Sept 4-10th: Why A.I. Isn’t Going to Make Art (Ted Chiang | The New Yorker), Apple's Hidden AI Prompts Discovered in macOS Beta - MacRumors, The biology of smell is a mystery — AI is helping to solve it (Nature Feature), Council of Europe opens AI convention for signature | Digital Watch Observatory, Study shows ‘alarming’ level of trust in AI for life and death decisions - The Engineer, NaNoWriMo Says Condemning AI Is ‘Classist and Ableist’ (404Media), AlphaProteo generates novel proteins for biology and health research (DeepMind), "Replit Agent in early access—available today for subscribers" - Amjad Masad on X, Neurode, Scrape anything with AI - FetchFox, Zed.dev, Aider.chat, Multi-Datacenter Training: OpenAI's Ambitious Plan To Beat Google's Infrastructure (SemiAnalysis), Fine-tuning Best Practices Chapter 2: Models - OpenPipe, 🚀 𝐋𝐥𝐚𝐦𝐚-𝟑.𝟏-𝐒𝐭𝐨𝐫𝐦-𝟖𝐁, Two AI Developers Are Plotting $125 Billion Supercomputers — The Information, @xAI team brought Colossus 100k H100 training cluster online, Lumen Orbit 🚀 Data Centers in Space | Y Combinator, OpenAI co-founder Sutskever's new safety-focused AI startup SSI raises $1 billion | Reuters, anthropic-quickstarts, The AI Hype Fraud Reflection-Llama-70B, Apple introduces clinical-grade, over-the-counter Hearing Aid feature, Hacker News Book Map, DeepSeek-V2.5, A day in the life of the world’s fastest supercomputer (Nature), Pong prodigy: Hydrogel material shows unexpected learning abilities, Project Sid - Altera’s Substack, Loopy loopyavatar.github.io/ (ByteDance), DepthCrafter (Tencent), ReMamba: Equip Mamba with Effective Long-Sequence Modeling, Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model, Sweating the Details: Emotion Recognition and the Influence of Physical Exertion in Virtual Reality Exergaming, Artificial intelligence guided screening for cardiomyopathies in an obstetric population: a pragmatic randomized clinical trial | Nature Medicine, Gut Microbiome Wellness Index 2 enhances health status prediction from gut microbiome taxonomic profiles | Nature, Multi-Relational Graph Representation Learning for Financial Statement Fraud Detection, Build SaaS with AI (Sabrina Ramonov 🍄 on YouTube), $125B for Superintelligence? 3 Models Coming, Sutskever's Secret SSI, & Data Centers (in space)... (AI Explained), Where Minds Come From: The Scaling of Collective Intelligence, AI, and You | Michael Levin Lecture, RoyalCities/Vocal_Textures_Main · Hugging Face, Singaporean artist Niceaunties uses Sora (OpenAI | TikTok), Generative artificial intelligence, human creativity, and art | PNAS Nexus | Oxford Academic (March 2024)
TLDR: DeepMind released AlphaProteo, which generates novel protein binders for biology and health research; it “achieves higher experimental success rates and 3 to 300 times better binding affinities than the best existing methods on seven target proteins”. ByteDance released Loopy (loopyavatar.github.io), a lifelike lip-sync module: “It can generate vivid motion details from audio alone, such as non-speech movements like sighing, emotion-driven eyebrow and eye movements, and natural head movements”. Researchers at the University of Reading taught a hydrogel (a type of soft, flexible material) to play the 1970s computer game "Pong." This follows last week’s biohybrid robot controlled by fungal mycelia. Tencent AI Lab released DepthCrafter, which “transforms monocular videos into depth videos with consistency and precision”. Lumen Orbit 🚀 proposes data centers in space “to make use of abundant solar energy, cooling, and the ability to freely scale up” (renewable perhaps, but what about maintenance costs?). Project Sid by Altera.ai launched a multi-agent simulation in Minecraft and “saw agents form a merchant hub, vote in a democracy, spread religions, & collect 5x more distinct items than ever before.” Sutskever’s Safe Superintelligence (SSI) raised $1 billion in funding, confirmed by Reuters. Ted Chiang is certain A.I. Isn’t Going to Make Art. Early access is open for Neurode’s personalized ADHD brain-stimulation “headband that helps you improve focus, impulse control & memory in 20 minutes a day”. Replit released Replit Agent in early access; it sets up dev environments, installs packages, configures databases, and deploys. The release spawned headlines like Replit Agents are Here to Replace All Software Engineers (AIA), and Andrej Karpathy, who has been actively building with Cursor, said Replit Agent can be placed in the “feel the AGI” category. Michael Levin lectures convincingly on Where Minds Come From: The Scaling of Collective Intelligence, AI, and You. Pooling 8,069 existing stool shotgun metagenomes, the Gut Microbiome Wellness Index 2 offers a “resource for evaluating health using an individual’s unique gut microbial composition.” SemiAnalysis details OpenAI's ambitious plan to beat Google's infrastructure with multi-datacenter, multi-gigawatt training-compute buildouts. After “debates about AI on our social media channels became vitriolic,” NaNoWriMo retracted the statement: “We also want to be clear in our belief that the categorical condemnation of Artificial Intelligence has classist and ableist undertones …”
Notes toward AI & Narrative #7: Personalized Narratives as Cultural Medicine
In medical research, personalized-medicine technologies are proliferating. Will similarly robust individualization resources, such as personalized biometric narratives, emerge in art, literature, and cultural experience? Yes, but only if there is sufficient data. In personalized medicine, the Gut Microbiome Wellness Index 2 offers a “resource for evaluating health using an individual’s unique gut microbial composition,” and EMBL's European Bioinformatics Institute “maintains the world’s most comprehensive range of freely available and up-to-date molecular data resources.” Comparable personalized open-narrative datasets do not exist. In September 2024, the Internet Archive lost its appeal after being sued by publishers including Penguin Random House, HarperCollins, and Wiley. Big pharma’s protectionism is equalled by litigious big-copyright culture.
Cultural data at the scale of individual biometric-affect-engagement responses will be necessary to develop personalized art-experiences. Existing proprietary datasets (such as those internal to YouTube, Google, TikTok, Tinder, Facebook), built on swipe-timing, lack biological granularity: there’s no sweat or heartbeat or eye-tracking. Perhaps fitness trackers merged with Netflix? Recent localized and relatively simple research initiatives reveal parts of the path: Sweating the Details: Emotion Recognition and the Influence of Physical Exertion in Virtual Reality Exergaming utilized “pupillometry, electrodermal activity, heart rate, and facial tracking, as well as subjective affect ratings” to enhance engagement in VR exercise games.
Future narrative general path-topologies and character-sets will be established by the AI+Author(s), yet release genre-modalities and details of delivery (cultural-gender-ideology orientations) into dynamically adaptive, personally cathartic impact affordances. Intimate sensor packages (implants, telepathically streaming dreams) and datasets (pulse-plot mappings) would allow astute AI+author-artist-teams to sculpt the flow parameters of plot estuaries.
Consider a future AI-enhanced rom-com: destabilize gender identities, enlarge the racial spectrum, discard ableist tropes, include those on the spectrum. As the system monitors fluctuations in viewer glands and dopamine flow, the personalized plot-guide-AI will sense and experiment with the limits of attentional capacity. Hybrid augmented AI+biometric feedback art-creation requires artists willing to surrender egoic attachments to the one right way. It also requires a world that is less predatory. And even though Ted Chiang is certain A.I. Isn’t Going to Make Art, I am not certain about anything, except that humans + AI will make art, and it will be unprecedented, personalized and (if we are lucky as a civilization) potentially healing.
🏓 Observations: Why A.I. Isn’t Going to Make Art (Ted Chiang | The New Yorker), Apple's Hidden AI Prompts Discovered in macOS Beta - MacRumors, The biology of smell is a mystery — AI is helping to solve it (Nature Feature), Council of Europe opens AI convention for signature | Digital Watch Observatory, Study shows ‘alarming’ level of trust in AI for life and death decisions - The Engineer, NaNoWriMo Says Condemning AI Is ‘Classist and Ableist’ (404Media)
✭ Why A.I. Isn’t Going to Make Art (Ted Chiang | The New Yorker) “To create a novel or a painting, an artist makes choices that are fundamentally alien to artificial intelligence. ~ If an A.I. generates a ten-thousand-word story based on your prompt, it has to fill in for all of the choices that you are not making. There are various ways it can do this. One is to take an average of the choices that other writers have made, as represented by text found on the Internet; that average is equivalent to the least interesting choices possible, which is why A.I.-generated text is often really bland. Another is to instruct the program to engage in style mimicry, emulating the choices made by a specific writer, which produces a highly derivative story. In neither case is it creating interesting art. ~ I think the same underlying principle applies to visual art, although it’s harder to quantify the choices that a painter might make. Real paintings bear the mark of an enormous number of decisions. By comparison, a person using a text-to-image program like DALL-E enters a prompt such as “A knight in a suit of armor fights a fire-breathing dragon,” and lets the program do the rest. (The newest version of DALL-E accepts prompts of up to four thousand characters—hundreds of words, but not enough to describe every detail of a scene.) Most of the choices in the resulting image have to be borrowed from similar paintings found online; the image might be exquisitely rendered, but the person entering the prompt can’t claim credit for that. ~ Some commentators imagine that image generators will affect visual culture as much as the advent of photography once did. Although this might seem superficially plausible, the idea that photography is similar to generative A.I. deserves closer examination. When photography was first developed, I suspect it didn’t seem like an artistic medium because it wasn’t apparent that there were a lot of choices to be made; you just set up the camera and start the exposure. But over time people realized that there were a vast number of things you could do with cameras, and the artistry lies in the many choices that a photographer makes. It might not always be easy to articulate what the choices are, but when you compare an amateur’s photos to a professional’s, you can see the difference. So then the question becomes: Is there a similar opportunity to make a vast number of choices using a text-to-image generator? I think the answer is no. An artist—whether working digitally or with paint—implicitly makes far more decisions during the process of making a painting than would fit into a text prompt of a few hundred words.”
✭ Apple's Hidden AI Prompts Discovered in macOS Beta - MacRumors “You are a helpful mail assistant which can help identify relevant questions from a given mail and a short reply snippet. Given a mail and the reply snippet, ask relevant questions which are explicitly asked in the mail. The answer to those questions will be selected by the recipient which will help reduce hallucination in drafting the response. Please output top questions along with set of possible answers/options for each of those questions. Do not ask questions which are answered by the reply snippet. The questions should be short, no more than 8 words. The answers should be short as well, around 2 words. Present your output in a json format with a list of dictionaries containing question and answers as the keys. If no question is asked in the mail, then output an empty list. Only output valid json and nothing else.”
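The prompt above pins down a strict output contract. As a minimal sketch (with hypothetical mail content and question wording), here is the JSON shape the assistant is instructed to emit and how a client might validate the “only output valid json” rule in Python:

```python
import json

# Hypothetical assistant reply conforming to the hidden prompt: a JSON list of
# {"question": ..., "answers": [...]} dicts (questions <= 8 words, short answers),
# or an empty list when the mail asks nothing. Mail content is invented.
raw_reply = """[
  {"question": "Can you attend Friday's meeting?", "answers": ["Yes", "No", "Maybe"]},
  {"question": "Which time works best?", "answers": ["Morning", "Afternoon"]}
]"""

def parse_mail_questions(raw: str) -> list[dict]:
    """Enforce the 'only output valid json and nothing else' contract."""
    data = json.loads(raw)                 # raises ValueError on non-JSON output
    if not isinstance(data, list):
        raise ValueError("expected a JSON list")
    for item in data:
        if set(item) != {"question", "answers"}:
            raise ValueError("expected question/answers keys")
    return data

for item in parse_mail_questions(raw_reply):
    print(item["question"], "->", ", ".join(item["answers"]))
```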
✭The biology of smell is a mystery — AI is helping to solve it (Nature Feature) “Some researchers are using machine learning to accelerate the search for structures and their preferred chemical partners. Right now, scientists have identified odour molecules that bind to only about 20% of human ORs (olfactory receptors). The protein-prediction algorithm AlphaFold has suggested thousands of structures for mammalian odorant receptors. And machine learning and modelling has helped Matsunami and his colleagues to screen millions of compounds to see which ones might bind to two candidate OR structures. One of the molecules they found smells of orange blossom; another strongly of honey. The dream end point is to gather data on hundreds of ORs and how their activation lines up with the chemistry of millions of odorants, says Manglik.”
✭ Council of Europe opens AI convention for signature | Digital Watch Observatory On 5 September 2024, the Council of Europe’s Framework Convention on Artificial Intelligence, Human Rights, Democracy, and the Rule of Law will be officially opened for signature during an informal Conference of the Ministers of Justice of the Council of Europe in Vilnius, Lithuania. The Convention, adopted on 17 May 2024, during the Council of Europe’s Committee of Ministers’ annual meeting, provides a global legal framework for regulating AI systems, with a focus on ensuring these technologies align with human rights, democratic integrity, and the rule of law. ...~... The Convention allows for national security exemptions, i.e. the Parties are allowed to not implement the treaty for activities protecting national security, provided that these comply with international law.” ✭The Framework Convention on Artificial Intelligence - Artificial Intelligence “The Council of Europe Framework Convention on Artificial Intelligence and human rights, democracy and the rule of law is the first-ever international legally binding treaty in this field. It aims to ensure that activities within the lifecycle of artificial intelligence systems are fully consistent with human rights, democracy and the rule of law, while being conducive to technological progress and innovation. …~... Parties to the Framework Convention are not required to apply the provisions of the treaty to activities related to the protection of their national security interests but must ensure that such activities respect international law and democratic institutions and processes. The Framework Convention does not apply to national defence matters nor to research and development activities, except when the testing of AI systems may have the potential to interfere with human rights, democracy, or the rule of law.”
✭ Study shows ‘alarming’ level of trust in AI for life and death decisions - The Engineer “A US study that simulated life and death decisions has shown that humans place excessive trust in artificial intelligence when guiding their choices.”
✭ NaNoWriMo Says Condemning AI Is ‘Classist and Ableist’ (404Media) “The organization that runs National Novel Writing Month (NaNoWriMo) declared condemnation of AI “classist and ableist,” and participants in its annual writing challenge are pissed. ~ In a Zendesk post published on Saturday titled “What is NaNoWriMo's position on Artificial Intelligence (AI)?,” [UPDATED] NaNoWriMo organizers wrote that it “does not explicitly support any specific approach to writing, nor does it explicitly condemn any approach, including the use of AI.” ~ NaNoWriMo started as a group writing project in the ‘90s with the goal of writing a 50,000-word manuscript in one month (November), but in 2005, NaNoWriMo became a nonprofit organization that takes donations and runs fundraising campaigns. ~ “We also want to be clear in our belief that the categorical condemnation of Artificial Intelligence has classist and ableist undertones, and that questions around the use of AI tie to questions around privilege,” the post continues. It then outlines how it feels AI condemnation is classist, ableist, and poses general access issues.” → ✭ What is NaNoWriMo's position on Artificial Intelligence (AI)? – National Novel Writing Month “NaNoWriMo neither explicitly supports nor condemns any approach to writing, including the use of tools that leverage AI. We recognize that harm has been done to the writing and creative communities at the hands of bad actors in the generative AI space, and that the ethical questions and risks posed by some aspects of this technology are real. The fact that AI is a large, complex technology category (which encompasses both non-generative and generative AI, applied in a range of ways to a range of uses) contributes to our belief that AI is simply too big and too varied to categorically support or condemn. ~ NaNoWriMo's mission is to "provide the structure, community, and encouragement to help people use their voices, achieve creative goals, and build new worlds—on and off the page." We fulfill our mission by supporting the humans doing the writing. Please see this related post that speaks to our overall position on nondiscrimination with respect to approaches to creativity, writer's resources, and personal choice.”
⛲Foundational Revelations: AlphaProteo generates novel proteins for biology and health research (DeepMind), "Replit Agent in early access—available today for subscribers" - Amjad Masad on X
✭ AlphaProteo generates novel proteins for biology and health research (DeepMind) “New AI system designs proteins that successfully bind to target molecules, with potential for advancing drug design, disease understanding and more. Today, we introduce AlphaProteo, our first AI system for designing novel, high-strength protein binders to serve as building blocks for biological and health research. This technology has the potential to accelerate our understanding of biological processes, and aid the discovery of new drugs, the development of biosensors and more. ~ AlphaProteo can generate new protein binders for diverse target proteins, including VEGF-A, which is associated with cancer and complications from diabetes. This is the first time an AI tool has been able to design a successful protein binder for VEGF-A. ~ AlphaProteo also achieves higher experimental success rates and 3 to 300 times better binding affinities than the best existing methods on seven target proteins we tested. …~... Protein binders that can bind tightly to a target protein are hard to design. Traditional methods are time intensive, requiring multiple rounds of extensive lab work. After the binders are created, they undergo additional experimental rounds to optimize binding affinity, so they bind tightly enough to be useful. Trained on vast amounts of protein data from the Protein Data Bank (PDB) and more than 100 million predicted structures from AlphaFold, AlphaProteo has learned the myriad ways molecules bind to each other. Given the structure of a target molecule and a set of preferred binding locations on that molecule, AlphaProteo generates a candidate protein that binds to the target at those locations.” → PDF ✭ 2024-09-05 De novo design of high-affinity protein binders with AlphaProteo “Computational design of protein-binding proteins is a fundamental capability with broad utility in biomedical research and biotechnology. Recent methods have made strides against some target proteins, but on-demand creation of high-affinity binders without multiple rounds of experimental testing remains an unsolved challenge. This technical report introduces AlphaProteo, a family of machine learning models for protein design, and details its performance on the de novo binder design problem. With AlphaProteo, we achieve 3- to 300-fold better binding affinities and higher experimental success rates than the best existing methods on seven target proteins. Our results suggest that AlphaProteo can generate binders "ready-to-use" for many research applications using only one round of medium-throughput screening and no further optimization.”
✭ "Announcing Replit Agent in early access—available today for subscribers" - Amjad Masad on X: "AI is incredible at writing code. But that's not enough to create software. You need to set up a dev environment, install packages, configure DB, and, if lucky, deploy. It's time to automate all this.” → ✭ Replit Agents are Here to Replace All Software Engineers (AIA) “Andrej Karpathy, who has been actively building using Cursor, said Replit Agents can be placed under the “feel the AGI” category.”
🛠️ Tech: Neurode, Scrape anything with AI - FetchFox, Zed.dev, Aider.chat, Multi-Datacenter Training: OpenAI's Ambitious Plan To Beat Google's Infrastructure (SemiAnalysis), Fine-tuning Best Practices Chapter 2: Models - OpenPipe, 🚀 𝐋𝐥𝐚𝐦𝐚-𝟑.𝟏-𝐒𝐭𝐨𝐫𝐦-𝟖𝐁, Two AI Developers Are Plotting $125 Billion Supercomputers — The Information, @xAI team brought Colossus 100k H100 training cluster online, Lumen Orbit 🚀 Data Centers in Space | Y Combinator, OpenAI co-founder Sutskever's new safety-focused AI startup SSI raises $1 billion | Reuters, anthropic-quickstarts, The AI Hype Fraud Reflection-Llama-70B, Apple introduces clinical-grade, over-the-counter Hearing Aid feature, Hacker News Book Map, DeepSeek-V2.5, A day in the life of the world’s fastest supercomputer (Nature)
✭ Neurode “Our goal - enhance productivity without drugs. Neurode is developing a headband that helps you improve focus, impulse control & memory in 20 minutes a day”
✭Scrape anything with AI - FetchFox
✭Zed.dev “The editor for what's next. Zed is a next-generation code editor designed for high-performance collaboration with humans and AI.”
✭ Aider.chat “Aider is AI pair programming in your terminal. Aider lets you pair program with LLMs, to edit code in your local git repository. Start a new project or work with an existing git repo. Aider works best with GPT-4o & Claude 3.5 Sonnet and can connect to almost any LLM.”
✭Multi-Datacenter Training: OpenAI's Ambitious Plan To Beat Google's Infrastructure (SemiAnalysis) “Buildouts of AI infrastructure are insatiable due to the continued improvements from fueling the scaling laws. The leading frontier AI model training clusters have scaled to 100,000 GPUs this year, with 300,000+ GPUs clusters in the works for 2025. Given many physical constraints including construction timelines, permitting, regulations, and power availability, the traditional method of synchronous training of a large model at a single datacenter site are reaching a breaking point. ~ Google, OpenAI, and Anthropic are already executing plans to expand their large model training from one site to multiple datacenter campuses. Google owns the most advanced computing systems in the world today and has pioneered the large-scale use of many critical technologies that are only just now being adopted by others such as their rack-scale liquid cooled architectures and multi-datacenter training. ~ Gemini 1 Ultra was trained across multiple datacenters. Despite having more FLOPS available to them, their existing models lag behind OpenAI and Anthropic because they are still catching up in terms of synthetic data, RL, and model architecture, but the impending release of Gemini 2 will change this. Furthermore, in 2025, Google will have the ability to conduct Gigawatt-scale training runs across multiple campuses, but surprisingly Google’s long-term plans aren’t nearly as aggressive as OpenAI and Microsoft. …~... Microsoft and OpenAI will be first to a multi-GW computing system. Along with their supply chain partners they are deep into the most ambitious infrastructure buildout ever.”
✭Fine-tuning Best Practices Chapter 2: Models - OpenPipe “Keep in mind that the best way to improve the quality of a fine-tuned model is through better selection and curation of your training data. So if you missed it, I highly recommend reviewing the first chapter in our series which covers training datasets. Even so, there are meaningful implications to consider when choosing a base model to fine-tune, so today we’re sharing insight into the factors that will help you make the best decision there.” ✭Tejas Kumar on X: "NEW on ConTejas Code @corbtt, CEO @openpipeai: How to fine-tune your own language models. Through this discussion, I learned a ton from Kyle about machine learning, overfitting, hyperparameters, and a lot more. I hope you do too."
✭ Ashvini Jindal on X: "🚀 𝐋𝐥𝐚𝐦𝐚-𝟑.𝟏-𝐒𝐭𝐨𝐫𝐦-𝟖𝐁 has arrived! A new 8B parameter LLM that outperforms @Meta 𝗟𝗹𝗮𝗺𝗮-𝟯.𝟭-𝟴𝗕-𝗜𝗻𝘀𝘁𝗿𝘂𝗰𝘁 and Hermes-3-Llama-3.1-8B" ✭ ⛈️ Llama-3.1 Storm Models - a akjindal53244 Collection
✭Two AI Developers Are Plotting $125 Billion Supercomputers — The Information “Developers of artificial intelligence say they need bigger and bigger data centers to concentrate processing power so that it produces better versions of the technology. The companies are notoriously secretive about the details of those plans, though, which is why we just published an AI Data Center Database that lists some of the biggest existing and upcoming data centers—which we also call supercomputers or AI chip clusters—as the race among half a dozen major developers intensifies. Beyond what we've listed in our database, discussions are hot and heavy about even bigger data center projects across the U.S., such as Microsoft and OpenAI's proposed $100 billion supercomputer.
It's now clear that Microsoft isn't the only company drawing up plans for what we'll call mega AI data centers—in fact, I've been speaking with a growing number of people involved in such projects.”
✭Launch YC: Lumen Orbit 🚀 Data Centers in Space | Y Combinator “TL;DR - We should train future large AI models in space to make use of abundant solar energy, cooling, and the ability to freely scale up. Hey all, we're building data centers in space. We’re launching our first satellite next year, which will have the most powerful GPUs ever put in space by >100x. We will launch a larger iteration each year until we reach gigawatt scale. ~ ❌ The Problem: Future hyperscale data centers will put a huge strain on electricity grids, freshwater distribution, and the Western world’s permitting systems. It will simply not be possible to deploy multi-gigawatt scale data centers rapidly in the way we build data centers today. ~ ✨ Our Solution: We take advantage of falling launch costs to make use of inexpensive solar energy in space and low-cost passive radiative cooling, rapidly scaling up orbital data centers almost indefinitely without the physical or permitting constraints faced on Earth. This will ensure we can continue training ever larger models without destroying the environment.”
✭ Exclusive: OpenAI co-founder Sutskever's new safety-focused AI startup SSI raises $1 billion | Reuters “Safe Superintelligence (SSI), newly co-founded by OpenAI's former chief scientist Ilya Sutskever, has raised $1 billion in cash to help develop safe artificial intelligence systems that far surpass human capabilities, company executives told Reuters. SSI, which currently has 10 employees, plans to use the funds to acquire computing power and hire top talent. It will focus on building a small highly trusted team of researchers and engineers split between Palo Alto, California and Tel Aviv, Israel. The company declined to share its valuation but sources close to the matter said it was valued at $5 billion. The funding underlines how some investors are still willing to make outsized bets on exceptional talent focused on foundational AI research. That's despite a general waning in interest towards funding such companies which can be unprofitable for some time, and which has caused several startup founders to leave their posts for tech giants.” ✭ Ilya Sutskever on X: "Mountain: identified. Time to climb"
✭ The AI Hype Fraud: Reflection-Llama-70B (X) “A story about fraud in the AI research community: On September 5th, Matt Shumer, CEO of OthersideAI, announces to the world that they've made a breakthrough, allowing them to train a mid-size model to top-tier levels of performance. This is huge. If it's real. ~ It isn't.” ✭ r/LocalLLaMA: CONFIRMED: REFLECTION 70B'S OFFICIAL API IS SONNET 3.5 → BEWARE of …. ✭ Matt Shumer on X: “I'm excited to announce Reflection 70B, the world’s top open-source model. Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes. 405B coming next week - we expect it to be the best model in the world. Built w/ @GlaiveAI”
✭ Apple introduces groundbreaking health features “Apple Watch delivers new sleep apnea notifications, and AirPods Pro 2 provide the world’s first all-in-one hearing health experience including a clinical-grade, over-the-counter Hearing Aid feature”
✭Hacker News Book Map “The 1000 most popular books on Hacker News visualized on an interactive map.” (with assistance from GPT-4)
✭DeepSeek-V2.5 - a deepseek-ai Collection
✭ A day in the life of the world’s fastest supercomputer (Nature) “In the hills of eastern Tennessee, a record-breaking machine called Frontier is providing scientists with unprecedented opportunities to study everything from atoms to galaxies.”
👁️🗨️ Research into AI: Pong prodigy: Hydrogel material shows unexpected learning abilities, Project Sid - Altera’s Substack, Loopy loopyavatar.github.io/ (ByteDance), DepthCrafter (Tencent), ReMamba: Equip Mamba with Effective Long-Sequence Modeling, Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
✭ Pong prodigy: Hydrogel material shows unexpected learning abilities In a study published 22 August in Cell Reports Physical Science, a team led by Dr. Yoshikatsu Hayashi demonstrated that a simple hydrogel—a type of soft, flexible material—can learn to play the simple 1970s computer game "Pong." The hydrogel, interfaced with a computer simulation of the classic game via a custom-built multi-electrode array, showed improved performance over time. Dr. Hayashi, a biomedical engineer at the University of Reading's School of Biological Sciences, said, "Our research shows that even very simple materials can exhibit complex, adaptive behaviors typically associated with living systems or sophisticated AI.” → ✭Electro-active polymer hydrogels exhibit emergent memory when embodied in a simulated game environment - ScienceDirect “Keypoints: EAP gel memory mechanics are demonstrated via ion concentration measurements. A hybrid EAP gel control system is integrated into a simulated Pong environment. The system shows improved performance over time, supported by control experiments. This demonstrates the application of alternate active medium to computational tasks. ~ Summary: The goal of artificial neural networks is to utilize the functions of biological brains to develop computational algorithms. However, these purely artificial implementations cannot achieve the adaptive behavior found in biological neural networks (BNNs) via their inherent memory. Alternative computing mediums that integrate biological neurons with computer hardware have shown similar emergent behavior via memory, as found in BNNs. By applying current theories in BNNs, can emergent memory functions be achieved with alternative mediums? Electro-active polymer (EAP) hydrogels were embedded in the simulated game-world of Pong via custom multi-electrode arrays and feedback between motor commands and stimulation. Through performance analysis within the game environment, emergent memory acquisition was demonstrated, driven by ion migration through the hydrogels.”
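To make the feedback loop concrete, here is a toy Python sketch of the closed-loop protocol the paper describes; the electrode encoding, decay constant, and decoding rule are invented stand-ins for the custom multi-electrode hardware, not the study's actual interface:

```python
import random

# Toy sketch of the closed loop: ball position is encoded as an electrode
# stimulation pattern, and the paddle command is decoded from the gel's charge
# distribution, which carries a decaying history of stimulation (a crude analogue
# of the ion-migration "memory" the authors identify).

N_ELECTRODES = 8

def stimulate(gel: list[float], ball_y: float) -> None:
    """Stimulate the electrode nearest the ball; older charge slowly decays."""
    target = round(ball_y * (N_ELECTRODES - 1))
    for i in range(N_ELECTRODES):
        gel[i] = 0.9 * gel[i] + (1.0 if i == target else 0.0)

def read_paddle(gel: list[float]) -> float:
    """Decode paddle position as the charge-weighted electrode centroid."""
    total = sum(gel) or 1.0
    return sum(i * c for i, c in enumerate(gel)) / (total * (N_ELECTRODES - 1))

gel, hits, steps, ball_y = [0.0] * N_ELECTRODES, 0, 1000, 0.5
for _ in range(steps):
    ball_y = min(1.0, max(0.0, ball_y + random.uniform(-0.1, 0.1)))  # drifting ball
    stimulate(gel, ball_y)
    hits += abs(read_paddle(gel) - ball_y) < 0.15  # paddle tracks recent history
print(f"rally rate: {hits / steps:.2f}")
```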
✭Project Sid - Altera’s Substack “What is Project Sid? What does it look like to have a civilization of AI agents? How far away are we from Westworld? Are we able to align the AI civilization with human civilization? We introduce Project Sid, our first step towards exploring these questions. Under Sid, we investigated many scenarios and aspects of society, including democracies, regulation of social norms, societal roles, hierarchies, trading, economy, religion, and more. Simulating tens, hundreds, and even thousands of agents together, we discovered phenomena and challenges never seen before at a small scale with just a few agents.” ✭ Altera.ai “Altera is an applied research lab building its first products in games. Building in games allows us to iterate in virtual worlds, build strong data flywheels, and ship products where the consequences for frontier limitations are limited and where emergent behaviors could be features, not bugs. Our first digital human being products are friends that can play Minecraft with you and are always online.” ✭ Robert Yang on X: "Introducing Project Sid: the first simulations of 1000+ truly autonomous agents collaborating in a virtual world, w/ emergent economy, culture, religion, and government. Humans are the only species to land the moon, because we can cooperate at a vast scale. Can AI do the same? At Altera, we ask whether our agents can organize at unprecedented scale to achieve what individual agents can’t. We saw agents form a merchant hub, vote in a democracy, spread religions, & collect 5x more distinct items than ever before.”
✭Loopy loopyavatar.github.io/ (ByteDance) “TL;DR: we propose an end-to-end audio-only conditioned video diffusion model named Loopy. Specifically, we designed an inter- and intra-clip temporal module and an audio-to-latents module, enabling the model to leverage long-term motion information from the data to learn natural motion patterns and improving audio-portrait movement correlation. This method removes the need for manually specified spatial motion templates used in existing methods to constrain motion during inference, delivering more lifelike and high-quality results across various scenarios.”
✭DepthCrafter (Tencent) “We innovate DepthCrafter, a novel video depth estimation approach, by leveraging video diffusion models. It can generate temporally consistent long depth sequences with fine-grained details for open-world videos, without requiring additional information such as camera poses or optical flow. ~ Motivation: Despite significant advancements in monocular depth estimation for static images, estimating video depth in the open world remains challenging, since open-world videos are extremely diverse in content, motion, camera movement, and length. We present DepthCrafter, an innovative method for generating temporally consistent long depth sequences with intricate details for open-world videos, without requiring any supplementary information such as camera poses or optical flow. DepthCrafter achieves generalization ability to open-world videos by training a video-to-depth model from a pre-trained image-to-video diffusion model, through our meticulously designed three-stage training strategy with the compiled paired video-depth datasets. Our training approach enables the model to generate depth sequences with variable lengths at one time, up to 110 frames, and harvest both precise depth details and rich content diversity from realistic and synthetic datasets. We also propose an inference strategy that processes extremely long videos through segment-wise estimation and seamless stitching.” ✭ Ying Shan on X: "Thanks @_akhaliq for featuring! DepthCrafter transforms monocular videos into depth videos with consistency and precision. Paper: https://t.co/FbO6COl9cU Project page: https://t.co/x9Lq29RzAZ
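The segment-wise estimation and stitching strategy is easy to picture in code. A rough sketch (not DepthCrafter's implementation; `estimate_depth_segment`, the overlap length, and the cross-fade are illustrative assumptions):

```python
import numpy as np

SEG_LEN, OVERLAP = 110, 25  # the paper reports variable lengths up to 110 frames

def estimate_depth_segment(frames: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the video-to-depth diffusion model."""
    return np.zeros(frames.shape[:3])  # (T, H, W) depth maps

def stitch_depth(video: np.ndarray) -> np.ndarray:
    """Estimate overlapping segments, cross-fading overlaps for consistency."""
    depths, start, prev_tail = [], 0, None
    while start < len(video):
        seg = estimate_depth_segment(video[start : start + SEG_LEN])
        if prev_tail is not None:
            w = np.linspace(0, 1, OVERLAP)[:, None, None]  # linear cross-fade
            seg[:OVERLAP] = (1 - w) * prev_tail + w * seg[:OVERLAP]
        last = start + SEG_LEN >= len(video)
        depths.append(seg if last else seg[: SEG_LEN - OVERLAP])
        prev_tail = seg[SEG_LEN - OVERLAP :]
        start += SEG_LEN - OVERLAP
    return np.concatenate(depths)[: len(video)]

print(stitch_depth(np.zeros((300, 64, 64, 3))).shape)  # (300, 64, 64)
```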
✭ReMamba: Equip Mamba with Effective Long-Sequence Modeling “While the Mamba architecture demonstrates superior inference efficiency and competitive performance on short-context natural language processing (NLP) tasks, empirical evidence suggests its capacity to comprehend long contexts is limited compared to transformer-based models. In this study, we investigate the long-context efficiency issues of the Mamba models and propose ReMamba, which enhances Mamba's ability to comprehend long contexts. ReMamba incorporates selective compression and adaptation techniques within a two-stage re-forward process, incurring minimal additional inference costs overhead. Experimental results on the LongBench and L-Eval benchmarks demonstrate ReMamba's efficacy, improving over the baselines by 3.2 and 1.6 points, respectively, and attaining performance almost on par with same-size transformer models.”
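A schematic of the two-stage idea as the abstract describes it (not the authors' code; the salience scoring and dimensions are invented for illustration):

```python
import torch

# Stage 1: a normal forward pass produces hidden states for a long prompt, and
# salience scores rank them; stage 2 would re-forward only the compressed
# prefix plus the suffix, keeping extra inference cost minimal.

def selective_compress(hidden: torch.Tensor, scores: torch.Tensor, keep: int):
    """hidden: (T, D) first-pass states; scores: (T,) salience per position."""
    idx = scores.topk(keep).indices.sort().values  # keep temporal ordering
    return hidden[idx]

T, D, keep = 4096, 256, 512
hidden = torch.randn(T, D)                # stand-in stage-1 outputs
scores = hidden @ hidden[-1] / D**0.5     # toy salience: similarity to final state
prefix = selective_compress(hidden, scores, keep)
print(prefix.shape)                       # torch.Size([512, 256]) for the re-forward
```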
✭Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model “We introduce Transfusion, a recipe for training a multi-modal model over discrete and continuous data. Transfusion combines the language modeling loss function (next token prediction) with diffusion to train a single transformer over mixed-modality sequences. We pretrain multiple Transfusion models up to 7B parameters from scratch on a mixture of text and image data, establishing scaling laws with respect to a variety of uni- and cross-modal benchmarks. Our experiments show that Transfusion scales significantly better than quantizing images and training a language model over discrete image tokens. By introducing modality-specific encoding and decoding layers, we can further improve the performance of Transfusion models, and even compress each image to just 16 patches. We further demonstrate that scaling our Transfusion recipe to 7B parameters and 2T multi-modal tokens produces a model that can generate images and text on a par with similar scale diffusion models and language models, reaping the benefits of both worlds.”
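The recipe's core is simply summing the two losses over one transformer's mixed-modality outputs. A minimal sketch, assuming illustrative shapes and a balancing weight `lam` not taken from the paper:

```python
import torch
import torch.nn.functional as F

# Sketch of Transfusion's combined objective as stated in the abstract: next-token
# cross-entropy on discrete text positions plus a diffusion noise-prediction MSE
# on continuous image-patch positions, trained through a single transformer.

def transfusion_loss(text_logits, text_targets, noise_pred, noise, lam=1.0):
    lm = F.cross_entropy(text_logits.flatten(0, 1), text_targets.flatten())
    diffusion = F.mse_loss(noise_pred, noise)   # epsilon-prediction objective
    return lm + lam * diffusion

B, T, V, P, D = 2, 16, 1000, 16, 64             # batch, text len, vocab, patches, dim
loss = transfusion_loss(
    torch.randn(B, T, V, requires_grad=True),   # stand-in transformer text logits
    torch.randint(V, (B, T)),
    torch.randn(B, P, D, requires_grad=True),   # stand-in predicted patch noise
    torch.randn(B, P, D),
)
loss.backward()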
🔎 Applied Research: Sweating the Details: Emotion Recognition and the Influence of Physical Exertion in Virtual Reality Exergaming, Artificial intelligence guided screening for cardiomyopathies in an obstetric population: a pragmatic randomized clinical trial | Nature Medicine, Gut Microbiome Wellness Index 2 enhances health status prediction from gut microbiome taxonomic profiles | Nature, Multi-Relational Graph Representation Learning for Financial Statement Fraud Detection
✭Sweating the Details: Emotion Recognition and the Influence of Physical Exertion in Virtual Reality Exergaming | Proceedings of the CHI Conference on Human Factors in Computing Systems “There is great potential for adapting Virtual Reality (VR) exergames based on a user’s affective state. However, physical activity and VR interfere with physiological sensors, making affect recognition challenging. We conducted a study (n=72) in which users experienced four emotion inducing VR exergaming environments (happiness, sadness, stress and calmness) at three different levels of exertion (low, medium, high). We collected physiological measures through pupillometry, electrodermal activity, heart rate, and facial tracking, as well as subjective affect ratings. Our validated virtual environments, data, and analyses are openly available. We found that the level of exertion influences the way affect can be recognised, as well as affect itself. Furthermore, our results highlight the importance of data cleaning to account for environmental and interpersonal factors interfering with physiological measures. The results shed light on the relationships between physiological measures and affective states and inform design choices about sensors and data cleaning approaches for affective VR.” ✭ How personalized technology could turn exercise pain into pleasure “Virtual reality (VR) video games that combine screen time with exercise are a great way to get fit, but game designers face a major challenge—like with regular exercise, adherence to "exergames" is low, with most users dropping out once they start to feel uncomfortable or bored. ~ Computer scientists at the University of Bath believe they've found a solution: Create exergames that use sensors to continuously measure a person's emotional state while they exercise, then tweak the game—for instance, making it easier or harder—to keep the user engaged. ~ Dr. Dominic Potts, lead author of a new study into harnessing cutting-edge sensor technology to keep exercisers motivated, said, "When it comes to physical exercise in all forms, motivation and exercise adherence are huge problems. With exergaming, we can address this issue and maximize a person's enjoyment and performance by adapting the challenge level to match a user's abilities and mood.”
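As a concrete picture of this kind of pipeline (not the study's released code), a sketch with synthetic data: windowed physiological features plus exertion level feeding a standard classifier over the four induced affect labels:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Illustrative only: random synthetic features, so held-out accuracy sits near
# chance. Exertion is included as a feature because the paper finds it changes
# how affect is physiologically expressed.

rng = np.random.default_rng(0)
n = 720  # e.g. 72 participants x 10 windows (hypothetical)
X = np.column_stack([
    rng.normal(6.0, 2.0, n),    # mean electrodermal activity (microsiemens)
    rng.normal(110, 25, n),     # mean heart rate (bpm)
    rng.normal(4.0, 0.7, n),    # mean pupil diameter (mm)
    rng.integers(0, 3, n),      # exertion level: low / medium / high
])
y = rng.integers(0, 4, n)       # happiness / sadness / stress / calmness

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X[:600], y[:600])
print("held-out accuracy:", clf.score(X[600:], y[600:]))
```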
✭Artificial intelligence guided screening for cardiomyopathies in an obstetric population: a pragmatic randomized clinical trial | Nature Medicine “Nigeria has the highest reported incidence of peripartum cardiomyopathy worldwide. This open-label, pragmatic clinical trial randomized pregnant and postpartum women to usual care or artificial intelligence (AI)-guided screening to assess its impact on the diagnosis of left ventricular systolic dysfunction (LVSD) in the perinatal period. The study intervention included digital stethoscope recordings with point-of-care AI predictions and a 12-lead electrocardiogram with asynchronous AI predictions for LVSD. The primary end point was identification of LVSD during the study period. In the intervention arm, the primary end point was defined as the number of identified participants with LVSD as determined by a positive AI screen, confirmed by echocardiography. In the control arm, this was the number of participants with clinical recognition and documentation of LVSD on echocardiography in keeping with current standard of care. Participants in the intervention arm had a confirmatory echocardiogram at baseline for AI model validation. A total of 1,232 (616 in each arm) participants were randomized and 1,195 participants (587 intervention arm and 608 control arm) completed the baseline visit at 6 hospitals in Nigeria between August 2022 and September 2023 with follow-up through May 2024. Using the AI-enabled digital stethoscope, the primary study end point was met with detection of 24 out of 587 (4.1%) versus 12 out of 608 (2.0%) patients with LVSD (intervention versus control odds ratio 2.12, 95% CI 1.05–4.27; P = 0.032). With the 12-lead AI-electrocardiogram model, the primary end point was detected in 20 out of 587 (3.4%) versus 12 out of 608 (2.0%) patients (odds ratio 1.75, 95% CI 0.85–3.62; P = 0.125). A similar direction of effect was observed in prespecified subgroup analysis. There were no serious adverse events related to study participation. In pregnant and postpartum women, AI-guided screening using a digital stethoscope improved the diagnosis of pregnancy-related cardiomyopathy.” ✭ AI stethoscope doubles detection of pregnancy heart failure “Heart failure during pregnancy is a dangerous and often under-detected condition because common symptoms—shortness of breath, extreme fatigue and trouble breathing while lying down—are easily mistaken for typical pregnancy discomforts. Late-breaking research presented at the European Society of Cardiology Congress on a Mayo Clinic study showed an artificial intelligence (AI)-enabled digital stethoscope helped doctors identify twice as many cases of heart failure compared to a control group that received usual obstetric care and screening.”
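The headline odds ratios can be reproduced from the raw counts quoted above; a quick check:

```python
# Sanity-checking the trial's reported odds ratios from its raw counts.
def odds_ratio(events_i, n_i, events_c, n_c):
    return (events_i / (n_i - events_i)) / (events_c / (n_c - events_c))

print(f"{odds_ratio(24, 587, 12, 608):.2f}")  # stethoscope arm: 2.12 (as published)
print(f"{odds_ratio(20, 587, 12, 608):.2f}")  # 12-lead ECG arm: 1.75 (as published)
```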
✭Gut Microbiome Wellness Index 2 enhances health status prediction from gut microbiome taxonomic profiles | Nature Communications “Recent advancements in translational gut microbiome research have revealed its crucial role in shaping predictive healthcare applications. Herein, we introduce the Gut Microbiome Wellness Index 2 (GMWI2), an enhanced version of our original GMWI prototype, designed as a standardized disease-agnostic health status indicator based on gut microbiome taxonomic profiles. Our analysis involves pooling existing 8069 stool shotgun metagenomes from 54 published studies across a global demographic landscape (spanning 26 countries and six continents) to identify gut taxonomic signals linked to disease presence or absence. GMWI2 achieves a cross-validation balanced accuracy of 80% in distinguishing healthy (no disease) from non-healthy (diseased) individuals and surpasses 90% accuracy for samples with higher confidence (i.e., outside the “reject option”). This performance exceeds that of the original GMWI model and traditional species-level α-diversity indices, indicating a more robust gut microbiome signature for differentiating between healthy and non-healthy phenotypes across multiple diseases. When assessed through inter-study validation and external validation cohorts, GMWI2 maintains an average accuracy of nearly 75%. Furthermore, by reevaluating previously published datasets, GMWI2 offers new insights into the effects of diet, antibiotic exposure, and fecal microbiota transplantation on gut health. Available as an open-source command-line tool, GMWI2 represents a timely, pivotal resource for evaluating health using an individual’s unique gut microbial composition.” ✭Computational tool can measure the health of a person's gut microbiome “A team of Mayo Clinic researchers has developed an innovative computational tool that analyzes the gut microbiome, a complex ecosystem of trillions of bacteria, fungi, viruses and other microorganisms within the digestive system, to provide insights into overall well-being. In a new study published in Nature Communications, the tool demonstrated at least 80% accuracy in differentiating healthy individuals from those with any disease. The tool was developed by analyzing stool gut microbiome profiles from more than 8,000 samples representing various diseases, geographic regions and demographic groups.”
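Conceptually, GMWI2 maps a taxonomic profile to a signed health score with a “reject option” for low-confidence samples. A toy sketch of that scoring pattern (the published tool is an open-source command-line tool trained on the pooled metagenomes; the logistic model and threshold here are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.dirichlet(np.ones(50), size=400)  # toy species relative-abundance profiles
y = rng.integers(0, 2, 400)               # 1 = healthy, 0 = non-healthy (toy labels)

model = LogisticRegression(max_iter=1000).fit(X, y)
p = np.clip(model.predict_proba(X)[:, 1], 1e-6, 1 - 1e-6)
score = np.log10(p / (1 - p))              # signed wellness-index-style score
decided = np.abs(score) > 0.5              # "reject option": abstain near zero
print(f"classified {decided.mean():.0%} of samples; abstained on the rest")
```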
✭Multi-Relational Graph Representation Learning for Financial Statement Fraud Detection | TUP Journals & Magazine | IEEE Xplore “Financial statement fraud refers to malicious manipulations of financial data in listed companies' annual statements. Traditional machine learning approaches focus on individual companies, overlooking the interactive relationships among companies that are crucial for identifying fraud patterns. Moreover, fraud detection is a typical imbalanced binary classification task with normal samples outnumbering fraud ones. In this paper, we propose a multi-relational graph convolutional network, named FraudGCN, for detecting financial statement fraud. A multi-relational graph is constructed to integrate industrial, supply chain, and accounting-sharing relationships, effectively encapsulating the multidimensional and complex interactions among companies. We then develop a multi-relational graph convolutional network to aggregate information within each relationship and employ an attention mechanism to fuse information across multiple relationships. The attention mechanism enables the model to distinguish the importance of different relationships, thereby aggregating more useful information from key relationships. To alleviate the class imbalance problem, we present a diffusion-based under-sampling strategy that strategically selects key nodes globally for model training. We also employ focal loss to assign greater weights to harder-to-classify minority samples. We build a real-world dataset from the annual financial statement of listed companies in China. The experimental results show that FraudGCN achieves an improvement of 3.15% in Macro-recall, 3.36% in Macro-F1, and 3.86% in GMean compared to the second-best method. The dataset and codes are publicly available at: https://github.com/XNetLab/MRG-for-Finance.” ✭ Machine learning technique predicts likely accounting fraud across supply chains “The cutting-edge artificial intelligence financial-fraud detective they devised involves a graph, a structure that mathematically represents the connections or relations (described as edges) between different companies, individuals and products (described as nodes). And multi-relational graphs allow for multiple types of edges, allowing the representation of diverse relationships between nodes, and offer a more comprehensive representation of the complexity of connections among them. And the detective itself, called FraudGCN, is a graph convolutional network, or GCN, a type of neural network designed to operate on graph-structured data. Unlike traditional neural networks that operate on grid-like data such as images, GCNs can operate on data represented as graphs. FraudGCN itself constructs a multi-relational graph representing various industry connections, supply chain links, and shared accounting firm auditing practices, and by doing so, capture rich information arising from these relationships, in particular details uncovered in particular 'neighborhoods' of nodes in the graphs. By aggregating such information, FraudGCN not only enhances the ability to identify patterns indicative of existing likely fraudulent activities, but also predict where they are likely to arise. Finally, unlike previous efforts at machine-learning assisted fraud detection, FraudGCN is able to handle the addition of new nodes without the need for the model to be retrained, enhancing its adaptability and scalability.
The team trialed FraudGCN on a real-world dataset from Chinese listed companies to assess its performance, and found that it beat state-of-the-art approaches by between 3.15% and 3.86%.”
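A compact sketch of the architecture's three signature pieces as the paper describes them: per-relation graph convolutions, attention fusion across relations, and focal loss for the imbalanced fraud class (illustrative PyTorch, not the released code at github.com/XNetLab/MRG-for-Finance):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiRelationGCN(nn.Module):
    """One GCN aggregation per relation (industry, supply chain, shared auditor),
    fused by attention so the model can weight the most informative relation."""
    def __init__(self, in_dim, hid_dim, n_relations=3):
        super().__init__()
        self.rel_layers = nn.ModuleList(
            nn.Linear(in_dim, hid_dim) for _ in range(n_relations))
        self.attn = nn.Linear(hid_dim, 1)
        self.out = nn.Linear(hid_dim, 2)

    def forward(self, x, adjs):  # adjs: list of (N, N) normalized adjacencies
        rel_h = torch.stack([F.relu(layer(adj @ x))
                             for layer, adj in zip(self.rel_layers, adjs)])
        weights = torch.softmax(self.attn(rel_h), dim=0)   # (R, N, 1) per relation
        fused = (weights * rel_h).sum(dim=0)
        return self.out(fused)

def focal_loss(logits, targets, gamma=2.0):
    ce = F.cross_entropy(logits, targets, reduction="none")
    p_t = torch.exp(-ce)                      # probability of the true class
    return ((1 - p_t) ** gamma * ce).mean()   # down-weights easy majority cases

N, D = 100, 16
x = torch.randn(N, D)
adjs = [torch.eye(N) for _ in range(3)]       # placeholder relation graphs
labels = torch.zeros(N, dtype=torch.long)
labels[:5] = 1                                # 5% fraud: the imbalance focal loss targets
print(focal_loss(MultiRelationGCN(D, 32)(x, adjs), labels))
```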
👀Watching: Build SaaS with AI (Sabrina Ramonov 🍄 on YouTube), $125B for Superintelligence? 3 Models Coming, Sutskever's Secret SSI, & Data Centers (in space)... (AI Explained), Where Minds Come From: The Scaling of Collective Intelligence, AI, and You | Michael Levin Lecture
✭ Build SaaS with AI (Sabrina Ramonov 🍄 on YouTube) “Playlist series: from MVP, Cursor AI to deployment with Firestore and Vercel”
✭$125B for Superintelligence? 3 Models Coming, Sutskever's Secret SSI, & Data Centers (in space)... (AI Explained) “Ilya Sutskever's 'straight shot to superintelligence' is already valued at $5B, but now we get $125B data centers in the works. Yes, plural. Will this be the ultimate gambit on the scaling hypothesis?”
✭ Where Minds Come From: The Scaling of Collective Intelligence, AI, and You | Michael Levin Lecture “Michael Levin is a Distinguished Professor in the Biology department at Tufts University and associate faculty at the Wyss Institute for Bioinspired Engineering at Harvard University. Prof Levin holds the Vannevar Bush endowed Chair and serves as director of the Allen Discovery Center at Tufts and the Tufts Center for Regenerative and Developmental Biology. Prior to college, Michael Levin worked as a software engineer and independent contractor in the field of scientific computing. He attended Tufts University, interested in artificial intelligence and unconventional computation. To explore the algorithms by which the biological world implemented complex adaptive behavior, he got dual B.S. degrees, in CS and in Biology and then received a PhD from Harvard University. He did post-doctoral training at Harvard Medical School, where he began to uncover a new bioelectric language by which cells coordinate their activity during embryogenesis. His independent laboratory develops new molecular-genetic and conceptual tools to probe large-scale information processing in regeneration, embryogenesis, and cancer suppression.”
🖲️AI Art-Research: RoyalCities/Vocal_Textures_Main · Hugging Face, Singaporean artist Niceaunties uses Sora (OpenAI | TikTok)
✭RoyalCities/Vocal_Textures_Main · Hugging Face “This finetuned Stable Audio Open model specializes in Vocal / Operatic Chord Progressions to support granular music production workflows. Capable of creating an infinite variety of chord progressions, all output is BPM-synced and key-locked to any note within the 12-tone chromatic scale, in both major and minor keys. This model was trained on a custom dataset crafted within FL Studio and features three distinct voicings: Male Vocals, Female Vocals, Ensemble Vocals (Combination of Male and Female)” → ✭ RoyalCities on X: "Flipping a Stable Audio AI Sample into glitchy liquid texture" → ✭ RoyalCities on X: "📢Calling all Music Producers & AI enthusiasts📢 “I'm dropping a ~5GB file that makes you Infinite Choir Textures AND the 14GB dataset that made it! I've ALSO joined a company that enables me to push this open source tech even further.🎉”
✭ Singaporean artist Niceaunties uses Sora (OpenAI | TikTok) “Singaporean artist Niceaunties (@niceaunties) uses AI for “an art project about aging, beauty, freedom & fun, and an attempt to understand ‘auntie culture'” based in south-east and east Asia. Outside of this project, she is an architectural designer: “Sora is at its most powerful when the interactive process feels almost like working alongside another human. The creations are incredible because of its ability to fill in details and craft uncanny visuals that blur the line between reality and illusion. It makes you question: What is real?! What excites me the most is the 'human-ness' of the characters, particularly their facial expressions and the emotions they convey. There is immense potential for these visuals to forge strong emotional connections, making it an incredibly powerful communication tool. I also learn from the way Sora interprets prompts—the more I engage with it, the more I learn. I love that I am constantly learning while creating! In this film Auntie bought some eggs from a mysterious vendor at the market, driven by food cravings intensified by the full moon's energy. To her surprise, mini aunties hatched from the eggs, throwing big Auntie's life into delightful chaos. Suddenly, with a squad of tiny assistants by her side, every aspect of Auntie's life was meticulously taken care of. But one day, just as mysteriously as they had arrived, the mini aunties departed, and the egg stall vanished from the market. ”
📚Retroactive Readings: Generative artificial intelligence, human creativity, and art | PNAS Nexus | Oxford Academic (March 2024)
✭ Generative artificial intelligence, human creativity, and art | PNAS Nexus | Oxford Academic (March 2024) “Recent artificial intelligence (AI) tools have demonstrated the ability to produce outputs traditionally considered creative. One such system is text-to-image generative AI (e.g. Midjourney, Stable Diffusion, DALL-E), which automates humans’ artistic execution to generate digital artworks. Utilizing a dataset of over 4 million artworks from more than 50,000 unique users, our research shows that over time, text-to-image AI significantly enhances human creative productivity by 25% and increases the value as measured by the likelihood of receiving a favorite per view by 50%. While peak artwork Content Novelty, defined as focal subject matter and relations, increases over time, average Content Novelty declines, suggesting an expanding but inefficient idea space. Additionally, there is a consistent reduction in both peak and average Visual Novelty, captured by pixel-level stylistic elements. Importantly, AI-assisted artists who can successfully explore more novel ideas, regardless of their prior originality, may produce artworks that their peers evaluate more favorably. Lastly, AI adoption decreased value capture (favorites earned) concentration among adopters. The results suggest that ideation and filtering are likely necessary skills in the text-to-image process, thus giving rise to “generative synesthesia”—the harmonious blending of human exploration and AI exploitation to discover new creative workflows.”