FLUX.1, Mistral Large 2, Gemma 2, Mapping misuse of generative AI (Jigsaw), Stable Fast 3D, COPIED Act, Foundation Models For Chemical Language, Warzone Caldera Data Set, Github Models, Whisper-Medusa
Fully-automatic robot dentist performs world's first human procedure, IsoFLOP curves, sqlite-vec, Calculating the Cost of a Google Deepmind Paper, Kolmogorov-Arnold Neural Networks, light-driven AI...
🛴July 31st - Aug 6th: Mapping the misuse of generative AI (Jigsaw @ Google), The reanimation of pseudoscience in machine learning and its ethical repercussions (Patterns | Cell), How I Use "AI" (Nicholas Carlini), How Does OpenAI Survive? (Edward Zitron), Will A.I. Kill Meaningless Jobs? (NYT), FLUX.1 (Black Forest Labs - Frontier AI Lab), Mistral Large 2, Scaling Exponents Across Parameterizations and Optimizers (DeepMind), A Large Encoder-Decoder Family of Foundation Models For Chemical Language, Gemma 2: Improving Open Language Models at a Practical Size, Stable Fast 3D: Rapid 3D Asset Generation From Single Images (Stability AI), Content Origin Protection and Integrity from Edited and Deepfaked Media Act (COPIED Act), Activision Releases Call of Duty®: Warzone™ Caldera Data Set for Academic Use, Friend.com (After paying $1.8 million for a domain name, Harvard dropout launches AI-wearable), Gemini 1.5 Pro experimental (0801) is currently ranked #1 on LMSYS for both text and multi-modal!, Introducing GitHub Models: A new generation of AI engineers building on GitHub, Google Cloud now has a dedicated cluster of Nvidia GPUs for Y Combinator startups (TechCrunch), Artificial Intelligence Gives Weather Forecasters a New Edge (NYT), Fully-automatic robot dentist performs world's first human procedure (NewAtlas), IsoFLOP curves of large language models are extremely flat (Severely Theoretical), Whisper-Medusa: Our Latest Open-Source AI Model (aiOla), sqlite-vec: Work-in-progress vector search SQLite extension, Calculating the Cost of a Google Deepmind Paper (How to burn US$10,000,000 on an arXiv preprint), Massive Multitask Agent Understanding (MMAU) benchmark, Closing the gap between open-source and commercial large language models for medical evidence summarization, A New Type of Foundation Model Based on Recordings of People's Emotions and Physiology, Why editing the knowledge of LLMs post-training can create messy ripple effects, RoCE networks for distributed AI 
training at scale (Meta), Kolmogorov-Arnold Neural Networks Shake Up How AI Is Done, ExAvatar, Berkeley Humanoid: A Research Platform for Learning-based Control, Deep learning assists detection of esophageal cancer and precursor lesions in a prospective, randomized controlled study (Science Translational Medicine), Empowering AlphaFold2 for protein conformation selective drug discovery with AlphaFold2-RAVE, Partial coherence enhances parallelized photonic computing (Nature): New 'game-changing' discovery for light-driven artificial intelligence (Phys.org), Superior polymeric gas separation membrane designed by explainable graph machine learning (Cell Reports Physical Science), Hunting for Polluted White Dwarfs and Other Treasures with Gaia XP Spectra and Unsupervised Machine Learning, Genetic Programming for Population Genetics (GP4PG): Modelling the demographic history of human North African genomes points to a recent soft split divergence between populations (Genome Biology), Federated learning as a catalyst for digital healthcare innovations (Patterns | Cell), InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds, BNP-Track (Bayesian nonparametric track): a framework for superresolved tracking (Nature Methods), AI and The Next Computing Platforms With Jensen Huang and Mark Zuckerberg, Indus Model by NASA and IBM, ‘Google smokes math Olympiads’ (Fireship), Trying to Convince ChatGPT It's Conscious (Alex O'Connor), Artificial Immediacy, Autolume (METACREATION), Replicate “Run AI with an API”, Fal.ai, InfoWars ‘2084’, Computational Life: How Well-formed, Self-replicating Programs Emerge from Simple Interaction, Folk psychological attributions of consciousness to large language models (Neuroscience of Consciousness | Oxford)
TL;DR: Tech: a fully autonomous robot dentist performed the world's first human procedure, about eight times faster than a human, using a handheld OCT scanner, which means no X-rays are necessary; FLUX.1, an open-source state-of-the-art image-generation model, was released by Black Forest Labs, a startup founded by the original creators of Stable Diffusion; Mistral released Mistral Large 2 while Google released the research paper for its small model family Gemma 2; Stability AI released the 30sec Stable Fast 3D: Rapid 3D Asset Generation From Single Images; Whisper-Medusa, a 50%-faster speech-to-text model with 95%+ accuracy, was released by aiOla; “Gemini 1.5 Pro experimental (0801) is currently ranked #1 on LMSYS for both text and multi-modal”; Jigsaw, a unit within Google, released Mapping the misuse of generative AI; an IEEE article explores a new AI architecture that is reputedly smaller and more interpretable, Kolmogorov-Arnold Neural Networks Shake Up How AI Is Done; and if Chinchilla scaling ‘laws’ intrigue you, read the dense yet revelatory IsoFLOP curves of large language models are extremely flat (Severely Theoretical). Applied-AI: deep learning assists detection of esophageal cancer and precursor lesions in a prospective, randomized controlled study; a Superior polymeric gas separation membrane designed by explainable graph machine learning (industrial processes responsible for 15% of energy use can be optimized); a new 'game-changing' discovery for light-driven artificial intelligence performs “high-speed AI tasks at around 100 billion operations per second”; InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds offers swift 3D reconstruction of scenes from a few images; and the obscurely-titled yet important BNP-Track: a framework for superresolved tracking uses Bayesian methods to provide a clearer picture of biomolecules in motion (Phys.org). AI-Art: I wrote up my June & July 2024 audio-prompting binge creating 7 hours of 'reasonable' music in approx. 21 hours (with stats!) 
as Artificial Immediacy implies Personalized-Audio-Visual-Streams (PAVS); and I wonder, Does Anyone Know who/what made This? (Runway Gen 3 Alpha transitions invoke surreal fun).
Note #3 re AI & narrative: Beingness data (biometrics as the birthplace of AI literature)
Consider A New Type of Foundation Model Based on Recordings of People's Emotions and Physiology (as proposed by Gamez, Barcari and Grig in a July 31st, 2024 arXiv preprint): “...This paper describes how a new type of foundation model - a first-person foundation model [FPFM] - could be created from recordings of what a person sees and hears as well as their emotional and physiological reactions to these stimuli. A first-person foundation model would map environmental stimuli to a person's emotional and physiological states, and map a person's emotional and physiological states to their behavior… To obtain training data for a first-person foundation model, we have developed a recording rig that captures what the wearer is seeing and hearing as well as their emotional and physiological states.” The apex of this ambition is reminiscent of “We Live in Public” or Steve Mann's 'sousveillance', updated and enhanced by contemporary wearables/implants, ripe for harvesting. By whom? For what?
Pernicious? Yes: there is a peculiar innocence to the FPFM paper, in that these digital twins will inevitably expose a myriad of intimate details to manipulative extraction. People will be hacked. Troubling? Yes: the ideological contours of state/corporate-sponsored algorithms predictively analyzing emotions for security/sales are sub-optimal and dystopian. Improbable? Yes: the technical schematics of how this would occur are still radically implausible. And the recording rig in the paper is severely low-tech. More probably: market forces, leveraging competitive instincts, will induce the emergence of a lucrative, sophisticated data-pool derived from skin-biometric activity-trackers, web/phone cameras, search/ads/cookies, pendant or AR-glass first-person-style recordings, along with (eventually) implants. Multiple, intrusive, enhancing mergers of machine and metabolism. Massive replications of a simulacrum of life.
Yet, the core idea (of biometric data harvests informing a foundation model) has implications for literature. → Just imagine for a moment, a collective-humanity frontier model trained on multimodal biometric camera audio visual data from an entire generation in real time; updated as they grew, fell in love, worked, played, celebrated and grieved. What would this mean for the evolution of artificial intelligence models capable of generating plausible fiction? → It would mean complex data offering intimate insights into the overlap between physiology and the often paradoxical spoken or gestural choreography of social situations, circumstances, and events. In this scenario, the large language model would become a multimodal physiological model trained on Beingness data.
🏓 Observations: Mapping the misuse of generative AI (Jigsaw @ Google), The reanimation of pseudoscience in machine learning and its ethical repercussions (Patterns | Cell), How I Use "AI" (Nicholas Carlini), How Does OpenAI Survive? (Edward Zitron), Will A.I. Kill Meaningless Jobs? (NYT)
✭ Mapping the misuse of generative AI (Jigsaw @ Google) “New research analyzes the misuse of multimodal generative AI today, in order to help build safer and more responsible technologies” ✭ Jigsaw “Jigsaw is a unit within Google that explores threats to open societies, and builds technology that inspires scalable solutions.” ✭ [2406.13843] Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data “Generative, multimodal artificial intelligence (GenAI) offers transformative potential across industries, but its misuse poses significant risks. Prior research has shed light on the potential of advanced AI systems to be exploited for malicious purposes. However, we still lack a concrete understanding of how GenAI models are specifically exploited or abused in practice, including the tactics employed to inflict harm. In this paper, we present a taxonomy of GenAI misuse tactics, informed by existing academic literature and a qualitative analysis of approximately 200 observed incidents of misuse reported between January 2023 and March 2024. Through this analysis, we illuminate key and novel patterns in misuse during this time period, including potential motivations, strategies, and how attackers leverage and abuse system capabilities across modalities (e.g. image, text, audio, video) in the wild.”
✭ The reanimation of pseudoscience in machine learning and its ethical repercussions (Patterns | Cell) “Machine learning has a pseudoscience problem. In this perspective, the authors explore the recent resurgence of deep learning-assisted physiognomy and argue that pseudoscientific and socially harmful applications of machine learning often arise from shared underlying epistemic failings. They urge researchers to reject notions that machine learning can be theory free and to consider their work in an appropriate social and historical context.”
✭How I Use "AI" (Nicholas Carlini) “I don't think that "AI" models (by which I mean: large language models) are over-hyped. ~ Yes, it's true that any new technology will attract the grifters. And it is definitely true that many companies like to say they're "Using AI" in the same way they previously said they were powered by "The Blockchain". (As we've seen again, and again, and again, and again.) It's also the case we may be in a bubble. The internet was a bubble that burst in 2000, but the Internet applications we now have are what was previously the stuff of literal science fiction. ~ But the reason I think that the recent advances we've made aren't just hype is that, over the past year, I have spent at least a few hours every week interacting with various large language models, and have been consistently impressed by their ability to solve increasingly difficult tasks I give them. And as a result of this, I would say I'm at least 50% faster at writing code for both my research projects and my side projects as a result of these models. ~ Most of the people online I find who talk about LLM utility are either wildly optimistic, and claim all jobs will be automated within three years, or wildly pessimistic, and say they have contributed nothing and never will. ~ So in this post, I just want to try and ground the conversation. I'm not going to make any arguments about what the future holds. I just want to provide a list of 50 conversations that I (a programmer and research scientist studying machine learning) have had with different large language models to meaningfully improve my ability to perform research and help me work on random coding side projects.”
✭ How Does OpenAI Survive? (Edward Zitron) “Throughout the last year I’ve written in detail about the rot in tech — the spuriousness of charlatans looking to accumulate money and power, the desperation of the most powerful executives to maintain control and rapacious growth, and the speciousness of the latest hype cycle — but at the end of the day, these are just companies, which leads to a very simple question: can the largest, most prominent company in tech’s latest hype cycle actually survive? ~ I am, of course, talking about OpenAI. Regulars to this newsletter will know that I’m highly skeptical of OpenAI’s product, its business model, and its sustainability. While I don’t want to rehash the arguments made in previous newsletters and podcasts, here’s the crux of the matter: generative AI is a product with no mass-market utility - at least on the scale of truly revolutionary movements like the original cloud computing and smartphone booms - and it’s one that costs an eye-watering amount to build and run.”
✭ Will A.I. Kill Meaningless Jobs? (NYT) “When imagining a future where technology replaces human effort, we tend to think in two extremes: as a productivity boon for businesses and a disaster for the humans who will become obsolete. There is a possibility that lies somewhere between these scenarios, however, in which A.I. kills off some jobs that workers themselves deem meaningless, and even find psychologically degrading. If it did, would these workers be better off?”
⛲Foundational Revelations: FLUX.1 (Black Forest Labs - Frontier AI Lab), Mistral Large 2, Scaling Exponents Across Parameterizations and Optimizers (DeepMind), A Large Encoder-Decoder Family of Foundation Models For Chemical Language, Gemma 2: Improving Open Language Models at a Practical Size, Stable Fast 3D: Rapid 3D Asset Generation From Single Images (Stability AI)
✭FLUX.1 (Black Forest Labs - Frontier AI Lab) “The best of FLUX.1, offering state-of-the-art performance image generation with top of the line prompt following, visual quality, image detail and output diversity.” ✭ Stable Diffusion creators launch Black Forest Labs, secure $31M for FLUX.1 AI image generator | VentureBeat “Black Forest Labs, a startup founded by the original creators of Stable Diffusion, unveiled its brand new FLUX.1 text-to-image model suite today, injecting new life into the open-source artificial intelligence community. ~ This launch marks a potential watershed moment for accessible and powerful generative AI technology.” ✭ Announcing Black Forest Labs - Black Forest Labs “Today, we are excited to announce the launch of Black Forest Labs. Deeply rooted in the generative AI research community, our mission is to develop and advance state-of-the-art generative deep learning models for media such as images and videos, and to push the boundaries of creativity, efficiency and diversity. We believe that generative AI will be a fundamental building block of all future technologies. By making our models available to a wide audience, we want to bring its benefits to everyone, educate the public and enhance trust in the safety of these models. We are determined to build the industry standard for generative media. Today, as the first step towards this goal, we release the FLUX.1 suite of models that push the frontiers of text-to-image synthesis.”
✭Large Enough | Mistral AI | Frontier AI in your hands “Today, we are announcing Mistral Large 2, the new generation of our flagship model. Compared to its predecessor, Mistral Large 2 is significantly more capable in code generation, mathematics, and reasoning. It also provides a much stronger multilingual support, and advanced function calling capabilities.”
✭[2407.05872] Scaling Exponents Across Parameterizations and Optimizers “Robust and effective scaling of models from small to large width typically requires the precise adjustment of many algorithmic and architectural details, such as parameterization and optimizer choices. In this work, we propose a new perspective on parameterization by investigating a key assumption in prior work about the alignment between parameters and data and derive new theoretical results under weaker assumptions and a broader set of optimizers. Our extensive empirical investigation includes tens of thousands of models trained with all combinations of three optimizers, four parameterizations, several alignment assumptions, more than a dozen learning rates, and fourteen model sizes up to 26.8B parameters. We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work. Our results show that all parameterizations, not just maximal update parameterization (muP), can achieve hyperparameter transfer; moreover, our novel per-layer learning rate prescription for standard parameterization outperforms muP. Finally, we demonstrate that an overlooked aspect of parameterization, the epsilon parameter in Adam, must be scaled correctly to avoid gradient underflow and propose Adam-atan2, a new numerically stable, scale-invariant version of Adam that eliminates the epsilon hyperparameter entirely.”
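The Adam-atan2 idea in the abstract above is easy to sketch: replace Adam's epsilon-guarded division m / (sqrt(v) + eps) with atan2(m, sqrt(v)), which matches the ratio when the second moment dominates, stays bounded otherwise, and is well-defined even when v is exactly zero. A minimal Python sketch of the core substitution (the paper's exact scaling constants and the surrounding optimizer loop are omitted, and unit scale factors are an assumption here):

```python
import math

def adam_update(m, v, eps=1e-8):
    """Classic Adam direction: division guarded by epsilon."""
    return m / (math.sqrt(v) + eps)

def adam_atan2_update(m, v):
    """Adam-atan2 direction (simplified, unit scale factors).

    For |m| << sqrt(v) this approximates m / sqrt(v); unlike the
    division it is bounded by pi/2 and needs no epsilon, so it
    cannot underflow or blow up as v -> 0.
    """
    return math.atan2(m, math.sqrt(v))

# Near-identical in the typical regime where the second moment dominates:
small = adam_atan2_update(1e-6, 1.0)
# Graceful when the second moment vanishes (plain Adam would return ~m/eps):
degenerate = adam_atan2_update(0.5, 0.0)  # bounded at pi/2
```

This also illustrates the "scale-invariant" claim: scaling m and sqrt(v) by the same constant leaves atan2(m, sqrt(v)) unchanged, whereas a fixed epsilon breaks that invariance for plain Adam.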
✭[2407.20267] A Large Encoder-Decoder Family of Foundation Models For Chemical Language “Large-scale pre-training methodologies for chemical language models represent a breakthrough in cheminformatics. These methods excel in tasks such as property prediction and molecule generation by learning contextualized representations of input tokens through self-supervised learning on large unlabeled corpora. Typically, this involves pre-training on unlabeled data followed by fine-tuning on specific tasks, reducing dependence on annotated datasets and broadening chemical language representation understanding. This paper introduces a large encoder-decoder chemical foundation models pre-trained on a curated dataset of 91 million SMILES samples sourced from PubChem, which is equivalent to 4 billion of molecular tokens. The proposed foundation model supports different complex tasks, including quantum property prediction, and offer flexibility with two main variants (289M and 8×289M). Our experiments across multiple benchmark datasets validate the capacity of the proposed model in providing state-of-the-art results for different tasks. We also provide a preliminary assessment of the compositionality of the embedding space as a prerequisite for the reasoning tasks. We demonstrate that the produced latent space is separable compared to the state-of-the-art with few-shot learning capabilities.”
✭[2408.00118] Gemma 2: Improving Open Language Models at a Practical Size “In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the Transformer architecture, such as interleaving local-global attentions (Beltagy et al., 2020a) and group-query attention (Ainslie et al., 2023). We also train the 2B and 9B models with knowledge distillation (Hinton et al., 2015) instead of next token prediction. The resulting models deliver the best performance for their size, and even offer competitive alternatives to models that are 2-3 times bigger. We release all our models to the community.”
✭ Introducing Stable Fast 3D: Rapid 3D Asset Generation From Single Images — Stability AI “We are excited to introduce Stable Fast 3D, Stability AI’s latest breakthrough in 3D asset generation technology. This innovative model transforms a single input image into a detailed 3D asset, setting a new standard for speed and quality in the field of 3D reconstruction.”
🛠️ Tech: Content Origin Protection and Integrity from Edited and Deepfaked Media Act (COPIED Act), Activision Releases Call of Duty®: Warzone™ Caldera Data Set for Academic Use, Friend.com (After paying $1.8 million for a domain name, Harvard dropout launches AI-wearable), Gemini 1.5 Pro experimental (0801) is currently ranked #1 on LMSYS for both text and multi-modal!, Introducing GitHub Models: A new generation of AI engineers building on GitHub, Google Cloud now has a dedicated cluster of Nvidia GPUs for Y Combinator startups (TechCrunch), Artificial Intelligence Gives Weather Forecasters a New Edge (NYT), Fully-automatic robot dentist performs world's first human procedure (NewAtlas), IsoFLOP curves of large language models are extremely flat (Severely Theoretical), Whisper-Medusa: Our Latest Open-Source AI Model (aiOla), sqlite-vec: Work-in-progress vector search SQLite extension
✭ Activision Releases Call of Duty®: Warzone™ Caldera Data Set for Academic Use
✭ The AI-focused COPIED Act would make removing digital watermarks illegal - The Verge “A bipartisan group of senators introduced a new bill to make it easier to authenticate and detect artificial intelligence-generated content and protect journalists and artists from having their work gobbled up by AI models without their permission. ~ The Content Origin Protection and Integrity from Edited and Deepfaked Media Act (COPIED Act) would direct the National Institute of Standards and Technology (NIST) to create standards and guidelines that help prove the origin of content and detect synthetic content, like through watermarking. It also directs the agency to create security measures to prevent tampering and requires AI tools for creative or journalistic content to let users attach information about their origin and prohibit that information from being removed. Under the bill, such content also could not be used to train AI models. ~ Content owners, including broadcasters, artists, and newspapers, could sue companies they believe used their materials without permission or tampered with authentication markers. State attorneys general and the Federal Trade Commission could also enforce the bill, which its backers say prohibits anyone from “removing, disabling, or tampering with content provenance information” outside of an exception for some security research purposes.” → ✭Cantwell, Blackburn, Heinrich Introduce Legislation to Increase Transparency, Co... “Today, U.S. Senators Maria Cantwell (D-Wash.), Chair of the Senate Commerce Committee, Marsha Blackburn (R-Tenn.), member of the Commerce Committee, and Martin Heinrich (D-N.M.), member of the Senate AI Working Group, introduced the Content Origin Protection and Integrity from Edited and Deepfaked Media Act (COPIED ACT), to combat the rise of harmful deepfakes. 
The bill would set new federal transparency guidelines for marking, authenticating and detecting AI-generated content, protect journalists, actors and artists against AI-driven theft, and hold violators accountable for abuses.”
✭ Friend.com ✭ After paying $1.8 million for a domain name, Harvard dropout launches AI-wearable | Cybernews “The startup Friend has launched a $99 AI necklace that listens to all of a user's conversations. Friend, founded by Harvard dropout Avi Schiffmann, who previously created a website that tracked COVID-19, has launched an AI wearable that is supposed to combat loneliness. According to Techcrunch, the startup managed to get a $2.5 million investment from well-known names in the tech world, including Perplexity CEO Aravind Srinivas, Solana founders Anatoly Yakovenko and Raj Gokal, and a few more investors. Interestingly, GeekWire reports that Schiffmann borrowed $1.8 million to buy the domain friend.com.” ✭'Black Mirror'-like Friend companion has AI startup founders beefing “Friend revealed its wearable AI companion on Tuesday, sinking us ever deeper into our Black Mirror-style technology dystopia. Now the company's founder, Avi Schiffmann, and two other wearable AI companion startups are beefing about alleged copying of branding. ~ Like Friend, fellow startup Based Hardware offers a strikingly similar necklace device also called Friend, while the unaffiliated Based Social has the less confusing but still pendant-like Compass. Both these wearables are focused on productivity, using AI to listen to and summarise your conversations, while Schiffmann's Friend is designed to be an emotional support AI.” ✭ Covid-era whiz kid is back, and he brought a Friend — a wearable, always listening, $99 AI companion – GeekWire “A friend might have your back, or always be by your side. Avi Schiffmann’s Friend is around his neck. Schiffmann, the former Mercer Island High School student who wowed the world as a teen with a website he built in 2020 to help track the spread of Covid-19, is back with technology that he hopes will change how people view and interact with artificial intelligence. Friend is an AI-enabled blood-cell-shaped pendant that hangs on a cord around a user’s neck. 
The always-listening device is meant to be like a close companion, who you would share experiences with and develop a strong relationship with over time. From idle chit chat to deep talks, Friend is always up for conversation. The hardware works in a couple different ways. A touchable light in the center of the device lets you speak directly to the AI and its replies are sent via text message from a companion app on your smartphone. Because Friend is always listening, it’s also gathering context for what is happening in your life from various circumstances and conversations, and it can proactively offer its AI viewpoint via push notifications.”
✭Logan Kilpatrick on X: "Today, we are making an experimental version (0801) of Gemini 1.5 Pro available for early testing and feedback in Google AI Studio and the Gemini API. Try it out and let us know what you think!" ✭ Logan Kilpatrick on X: "Gemini 1.5 Pro experimental (0801) is currently ranked #1 on LMSYS for both text and multi-modal!"
✭ Introducing GitHub Models: A new generation of AI engineers building on GitHub - The GitHub Blog “We are enabling the rise of the AI engineer with GitHub Models–bringing the power of industry leading large and small language models to our more than 100 million users directly on GitHub.”
✭Google Cloud now has a dedicated cluster of Nvidia GPUs for Y Combinator startups | TechCrunch “Google Cloud is giving Y Combinator startups access to a dedicated, subsidized cluster of Nvidia graphics processing units and Google tensor processing”
✭Artificial Intelligence Gives Weather Forecasters a New Edge (NYT) “The brainy machines are predicting global weather patterns with new speed and precision, doing in minutes and seconds what once took hours.”
✭Fully-automatic robot dentist performs world's first human procedure (NewAtlas) “Nightmare fuel? Maybe – but in a historic moment for the dental profession, an AI-controlled autonomous robot has performed an entire procedure on a human patient for the first time, about eight times faster than a human dentist could do it.”
✭ IsoFLOP curves of large language models are extremely flat (Severely Theoretical) “An interesting detail in the recently released Llama-3 technical report has caught my eye (p. 8): “An important observation is that IsoFLOPs curves become flatter around the minimum as the compute budget increases. This implies that performance of the flagship model is relatively robust to small changes in the trade-off between model size and training tokens” ~ Since I had noted the same phenomenon in a previous post about the Chinchilla scaling laws (more than two years ago) to argue for training smaller models (point 4 in that post). I’m glad that this observation is finally being taken seriously, but I think the quotation above from the Llama-3 paper still underestimates the extent of this isoFLOP flatness issue. The performance of these models is not just robust to small variations in model size around the optimal, but it is actually pretty robust to even massive variations in model size at the Llama-3 compute scale.”
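The flatness claim is easy to reproduce numerically from the Chinchilla parametric loss fit, L(N, D) = E + A/N^alpha + B/D^beta, holding compute C = 6·N·D fixed and sweeping model size N. A hedged sketch (the constants below are the published Hoffmann et al. 2022 fit, not Llama-3's, and 6·N·D is only an approximation of training FLOPs):

```python
# Sweep model size N along a fixed-compute (isoFLOP) curve and watch
# how little the predicted loss moves. Constants: Chinchilla parametric
# fit from Hoffmann et al. (2022).
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(N, C):
    D = C / (6 * N)  # tokens implied by the fixed compute budget
    return E + A / N**alpha + B / D**beta

C = 5.76e23  # roughly Chinchilla's own training budget, in FLOPs

# Analytic minimizer of loss(N, C) in N at fixed C:
N_opt = ((alpha * A) / (beta * B * 6**beta) * C**beta) ** (1 / (alpha + beta))

for factor in (0.25, 0.5, 1, 2, 4):
    print(f"N = {factor:>4}x optimal: predicted loss = {loss(factor * N_opt, C):.4f}")
```

Running this shows the predicted loss moving by roughly a percent or less across a 16x range of model sizes at fixed compute, which is exactly the "extremely flat" point the post makes.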
✭ Introducing Whisper-Medusa: Our Latest Open-Source AI Model (aiOla) “With aiOla’s Whisper-Medusa businesses can take advantage of a speech recognition model that is faster at understanding language with 95%+ accuracy.”
✭ GitHub - asg017/sqlite-vec: Work-in-progress vector search SQLite extension that runs anywhere.
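For readers unfamiliar with what a vector-search SQLite extension does, the core operation (brute-force nearest-neighbor search over embeddings stored in a table) can be mimicked in pure stdlib Python by registering a distance function with sqlite3. This is a conceptual stand-in, not sqlite-vec's actual API, which provides a dedicated `vec0` virtual table; the table, column names, and toy 3-dimensional embeddings below are illustrative:

```python
import json
import math
import sqlite3

def cosine_distance(a_json: str, b_json: str) -> float:
    """Cosine distance between two embeddings stored as JSON arrays."""
    a, b = json.loads(a_json), json.loads(b_json)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

db = sqlite3.connect(":memory:")
db.create_function("cosine_distance", 2, cosine_distance)
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, embedding TEXT)")
db.executemany(
    "INSERT INTO docs (body, embedding) VALUES (?, ?)",
    [
        ("robot dentistry",  json.dumps([0.9, 0.1, 0.0])),
        ("image generation", json.dumps([0.1, 0.9, 0.1])),
        ("scaling laws",     json.dumps([0.0, 0.2, 0.9])),
    ],
)

query = json.dumps([1.0, 0.0, 0.1])  # toy query embedding
rows = db.execute(
    "SELECT body FROM docs ORDER BY cosine_distance(embedding, ?) LIMIT 1",
    (query,),
).fetchall()
```

An extension like sqlite-vec moves this same k-nearest-neighbor query into optimized C with a compact binary vector format, rather than re-parsing JSON per row as this sketch does.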
👁️🗨️ Research into AI: Calculating the Cost of a Google Deepmind Paper (How to burn US$10,000,000 on an arXiv preprint), Massive Multitask Agent Understanding (MMAU) benchmark, Closing the gap between open-source and commercial large language models for medical evidence summarization, A New Type of Foundation Model Based on Recordings of People's Emotions and Physiology, Why editing the knowledge of LLMs post-training can create messy ripple effects
✭Calculating the Cost of a Google Deepmind Paper (How to burn US$10,000,000 on an arXiv preprint) “Recently, GDM released a great paper titled, Scaling Exponents Across Parameterizations and Optimizers, in which they conduct over 10,000 LLM training runs to obtain optimal hyperparameters under different regimes. After reading it (it was great), I wanted to test my understanding of the paper by tallying up all experiments conducted within, calculating the total compute cost it would take to replicate the paper.”
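The back-of-envelope method behind such cost tallies is simple: training FLOPs of roughly 6 × parameters × tokens, divided by realistic accelerator throughput, times a rental price. A sketch with assumed inputs (the H100 bf16 peak, 40% utilization, hourly rate, and the 100B token count are illustrative assumptions of mine, not figures from the post):

```python
def training_cost_usd(params, tokens,
                      peak_flops=989e12,       # assumed H100 bf16 peak, FLOP/s
                      utilization=0.40,        # assumed realistic utilization
                      usd_per_gpu_hour=2.50):  # assumed rental rate
    """Rough training cost from the 6*N*D FLOP approximation."""
    total_flops = 6 * params * tokens
    gpu_seconds = total_flops / (peak_flops * utilization)
    return gpu_seconds / 3600 * usd_per_gpu_hour

# e.g. a single run of the paper's largest 26.8B-parameter model;
# the 100B token count is illustrative, not taken from the paper.
cost = training_cost_usd(26.8e9, 100e9)
print(f"~${cost:,.0f}")
```

The blog post's tally is essentially this arithmetic repeated over every (optimizer, parameterization, learning rate, model size) combination in the sweep, which is how a hyperparameter study reaches an eight-figure replication cost.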
✭axlearn/docs/research/mmau at main · apple/axlearn · GitHub “The Massive Multitask Agent Understanding (MMAU) benchmark is designed to evaluate the performance of large language models (LLMs) as agents across a wide variety of tasks. It provides comprehensive insights into the capabilities and limitations of these models by featuring extensive offline tasks that eliminate the need for complex environment setups. MMAU evaluates models across five key domains: Tool-use, Directed Acyclic Graph (DAG) QA, Data Science and Machine Learning Coding, Contest-level Programming, Mathematics. These domains cover five essential capabilities: Understanding, Reasoning, Planning, Problem-solving, Self-correction. With a total of 20 meticulously designed tasks encompassing over 3,000 distinct prompts, MMAU offers a robust framework for assessing the strengths and weaknesses of LLM agents.”
✭[2408.00588] Closing the gap between open-source and commercial large language models for medical evidence summarization “Large language models (LLMs) hold great promise in summarizing medical evidence. Most recent studies focus on the application of proprietary LLMs. Using proprietary LLMs introduces multiple risk factors, including a lack of transparency and vendor dependency. While open-source LLMs allow better transparency and customization, their performance falls short compared to proprietary ones. In this study, we investigated to what extent fine-tuning open-source LLMs can further improve their performance in summarizing medical evidence. Utilizing a benchmark dataset, MedReview, consisting of 8,161 pairs of systematic reviews and summaries, we fine-tuned three broadly-used, open-sourced LLMs, namely PRIMERA, LongT5, and Llama-2. Overall, the fine-tuned LLMs obtained an increase of 9.89 in ROUGE-L (95% confidence interval: 8.94-10.81), 13.21 in METEOR score (95% confidence interval: 12.05-14.37), and 15.82 in CHRF score (95% confidence interval: 13.89-16.44). The performance of fine-tuned LongT5 is close to GPT-3.5 with zero-shot settings. Furthermore, smaller fine-tuned models sometimes even demonstrated superior performance compared to larger zero-shot models. The above trends of improvement were also manifested in both human and GPT4-simulated evaluations. Our results can be applied to guide model selection for tasks demanding particular domain knowledge, such as medical evidence summarization.”
✭[2408.00030] A New Type of Foundation Model Based on Recordings of People's Emotions and Physiology “Foundation models have had a big impact in recent years and billions of dollars are being invested in them in the current AI boom. The more popular ones, such as Chat-GPT, are trained on large amounts of data from the Internet, and then reinforcement learning, RAG, prompt engineering and cognitive modelling are used to fine-tune and augment their behavior. This technology has been used to create models of individual people, such as Caryn Marjorie. However, these chatbots are not based on people's actual emotional and physiological responses to their environment, so they are, at best, surface-level approximations to the characters they are imitating. This paper describes how a new type of foundation model - a first-person foundation model - could be created from recordings of what a person sees and hears as well as their emotional and physiological reactions to these stimuli. A first-person foundation model would map environmental stimuli to a person's emotional and physiological states, and map a person's emotional and physiological states to their behavior. First-person foundation models have many exciting applications, including a new type of recommendation engine, personal assistants, generative adversarial networks, dating and recruitment. To obtain training data for a first-person foundation model, we have developed a recording rig that captures what the wearer is seeing and hearing as well as their emotional and physiological states. This novel source of data could help to address the shortage of new data for building the next generation of foundation models.”
✭ Why editing the knowledge of LLMs post-training can create messy ripple effects “models can sometimes report outdated information that they were fed during training, as opposed to other relevant and up-to-date information released after their training. To overcome this limitation of LLMs and increase the reliability of their answers, some computer scientists have been exploring the possibility of editing their knowledge base after they have completed their training. These knowledge editing (KE) interventions should then influence all the content produced by an LLM, creating a ripple effect. This means that all the model's future answers about a given topic should reflect the new information it acquired about this topic after its knowledge was altered. Unfortunately, studies suggest that these ripple effects do not always take place. In essence, this means that while a model might be able to correctly answer direct questions about altered information, it might not encompass the new knowledge it acquired in all of the answers it generates, including those that indirectly touch on the new information.”
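The ripple-effect problem is easiest to see in a toy where propagation is guaranteed. Below, a dict-backed "model" stands in for an LLM, with one multi-hop rule that recomputes from the edited store; all names and the query format are illustrative:

```python
def make_model(facts):
    """Stand-in for an LLM: direct lookup, plus one 'multi-hop'
    rule that composes two stored facts."""
    def answer(q):
        if q in facts:
            return facts[q]
        if q.startswith("capital_of_country_of:"):
            person = q.split(":", 1)[1]
            country = facts.get(f"country_of:{person}")
            return facts.get(f"capital_of:{country}")
        return None
    return answer

facts = {"country_of:Ada": "UK", "capital_of:UK": "London",
         "capital_of:France": "Paris"}
model = make_model(facts)

# knowledge edit: move Ada to France
facts["country_of:Ada"] = "France"

direct = model("country_of:Ada")             # edited fact
ripple = model("capital_of_country_of:Ada")  # derived answer
```

In this toy the ripple propagates because the hop recomputes from the edited store; in a real LLM the direct and derived answers come from separately memorized pathways, which is exactly where knowledge-editing methods can fail to update consistently.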
✭ RoCE networks for distributed AI training at scale (Meta) “AI networks play an important role in interconnecting tens of thousands of GPUs together, forming the foundational infrastructure for training, enabling large models with hundreds of billions of parameters such as LLAMA 3.1 405B. This week at ACM SIGCOMM 2024 in Sydney, Australia, we are sharing details on the network we have built at Meta over the past few years to support our large-scale distributed AI training workload. Our paper, “RDMA over Ethernet for Distributed AI Training at Meta Scale,” provides the details on how we design, implement, and operate one of the world’s largest AI networks at scale.”
✭ Kolmogorov-Arnold Neural Networks Shake Up How AI Is Done “A New Type of Neural Network Is More Interpretable: Kolmogorov-Arnold Networks could point physicists to new hypotheses. This new type of network learns functions rather than linear weights, allowing researchers to understand their behavior better. These networks could be the antidote to "black box" artificial intelligence, which could be particularly useful in helping scientists discover new laws.”
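The "learns functions rather than linear weights" idea can be sketched concretely: in a Kolmogorov-Arnold layer, every input-to-output edge carries its own learnable univariate function, and each output node sums its incoming edges. A toy NumPy version using a cubic polynomial basis per edge (real KANs use B-spline bases with grid refinement; this class and its parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

class KANLayer:
    """Toy Kolmogorov-Arnold layer: each input->output edge applies
    its own learnable univariate function (here a cubic in a fixed
    power basis); each output node sums its incoming edge functions."""
    def __init__(self, n_in, n_out, degree=3):
        # coeffs[i, j, k]: coefficient of x**k on the edge i -> j
        self.coeffs = rng.normal(0, 0.1, (n_in, n_out, degree + 1))

    def __call__(self, x):  # x: (batch, n_in)
        powers = np.stack([x ** k for k in range(self.coeffs.shape[-1])], -1)
        # edge_out[b, i, j] = sum_k coeffs[i, j, k] * x[b, i]**k
        edge_out = np.einsum("bik,ijk->bij", powers, self.coeffs)
        return edge_out.sum(axis=1)  # node output: sum over incoming edges

layer = KANLayer(4, 2)
y = layer(rng.normal(size=(8, 4)))  # shape (8, 2)
```

Because each edge function is one-dimensional, it can be plotted and inspected directly, which is the source of the interpretability claim.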
🔎 Applied Research: ExAvatar, Berkeley Humanoid: A Research Platform for Learning-based Control, Deep learning assists detection of esophageal cancer and precursor lesions in a prospective, randomized controlled study (Science Translational Medicine), Empowering AlphaFold2 for protein conformation selective drug discovery with AlphaFold2-RAVE, Partial coherence enhances parallelized photonic computing (Nature): New 'game-changing' discovery for light-driven artificial intelligence (Phys.org), Superior polymeric gas separation membrane designed by explainable graph machine learning (Cell Reports Physical Science), Hunting for Polluted White Dwarfs and Other Treasures with Gaia XP Spectra and Unsupervised Machine Learning, Genetic Programming for Population Genetics (GP4PG): Modelling the demographic history of human North African genomes points to a recent soft split divergence between populations (Genome Biology), Federated learning as a catalyst for digital healthcare innovations (Patterns | Cell), InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds, BNP-Track (Bayesian nonparametric track): a framework for superresolved tracking (Nature Methods)
✭ ExAvatar “What is ExAvatar? ExAvatar is our new expressive whole-body 3D Gaussian avatar. Combines 1) whole-body (body, hands, and face) drivability of SMPL-X and 2) powerful appearance modeling capability of 3DGS. Made from a casually captured short phone scan (around 10 seconds of the neutral pose). Supports animation with novel body poses, hand poses, and facial expressions and rendering from any viewpoints.”
✭ Berkeley Humanoid: A Research Platform for Learning-based Control “We introduce Berkeley Humanoid, a reliable and low-cost mid-scale humanoid research platform for learning-based control. Our lightweight, in-house-built robot is designed specifically for learning algorithms with low simulation complexity, anthropomorphic motion, and high reliability against falls. The robot's narrow sim-to-real gap enables agile and robust locomotion across various terrains in outdoor environments, achieved with a simple reinforcement learning controller using light domain randomization. Furthermore, we demonstrate the robot traversing for hundreds of meters, walking on a steep unpaved trail, and hopping with single and double legs as a testimony to its high performance in dynamical walking. Capable of omnidirectional locomotion and withstanding large perturbations with a compact setup, our system aims for scalable, sim-to-real deployment of learning-based humanoid systems.”
✭Deep learning assists detection of esophageal cancer and precursor lesions in a prospective, randomized controlled study | Science Translational Medicine “Early-stage esophageal cancers show better treatment response but are harder to detect. Li et al. developed a deep learning pipeline to aid clinicians in identifying early-stage, high-risk esophageal lesions and tested it in a randomized clinical trial in patients undergoing endoscopy. Deep learning assistance doubled the detection of high-risk esophageal lesions compared with the unassisted control group. This clinical trial demonstrates the promise of this pipeline to improve early esophageal cancer detection.” ✭ Experimental AI method boosts doctors' ability to diagnose cancers and precancers of the esophagus “a team of doctors and scientists from major research centers in China say they have developed and tested a deep-learning algorithm that substantially boosts the detection of high-risk malignant lesions in the esophagus.”
✭Empowering AlphaFold2 for protein conformation selective drug discovery with AlphaFold2-RAVE “Small molecule drug design hinges on obtaining co-crystallized ligand-protein structures. Despite AlphaFold2’s strides in protein native structure prediction, its focus on apo structures overlooks ligands and associated holo structures. Moreover, designing selective drugs often benefits from the targeting of diverse metastable conformations. Therefore, direct application of AlphaFold2 models in virtual screening and drug discovery remains tentative. Here, we demonstrate an AlphaFold2 based framework combined with all-atom enhanced sampling molecular dynamics and induced fit docking, named AF2RAVE-Glide, to conduct computational model based small molecule binding of metastable protein kinase conformations, initiated from protein sequences. We demonstrate the AF2RAVE-Glide workflow on three different protein kinases and their type I and II inhibitors, with special emphasis on binding of known type II kinase inhibitors which target the metastable classical DFG-out state. These states are not easy to sample from AlphaFold2. Here we demonstrate how with AF2RAVE these metastable conformations can be sampled for different kinases with high enough accuracy to enable subsequent docking of known type II kinase inhibitors with more than 50% success rates across docking calculations. We believe the protocol should be deployable for other kinases and more proteins generally.” ✭ Scientists 'cautiously optimistic' about AI's role in drug discovery “The human body contains at least 20,000 different proteins, often called the "workhorses of the cell" because of their role in keeping cells healthy. Each protein consists of a unique string of amino acids that affects its shape and function—or dysfunction, in the case of proteins that assemble incorrectly, which can cause disease. 
~ By understanding and predicting the vast array of shapes a protein can take, scientists can design drugs that target specific proteins with specific roles in a cell. The hope is that technologies like Google's AlphaFold—which uses artificial intelligence (AI) to predict the structure of proteins, DNA and other biomolecules—will speed up this daunting task and subsequently the development of potentially lifesaving medications. ~ University of Maryland researchers are "cautiously optimistic" about this ambitious goal but say that AlphaFold must be paired with a stronger foundation of physics to be successful. A method they developed, described in a new paper published in the journal eLife, does just that. ~ "There are lots of uncured diseases, and we hope that AI can help us screen a large number of compounds to identify effective, non-toxic drugs in a cost-efficient manner, ultimately lowering health care costs for all," said the study's senior author, Pratyush Tiwary. "Our method will speed up drug discovery and enable personalized medicine for complex diseases."”
✭Partial coherence enhances parallelized photonic computing | Nature “Advancements in optical coherence control have unlocked many cutting-edge applications, including long-haul communication, light detection and ranging (LiDAR) and optical coherence tomography. Prevailing wisdom suggests that using more coherent light sources leads to enhanced system performance and device functionalities. Our study introduces a photonic convolutional processing system that takes advantage of partially coherent light to boost computing parallelism without substantially sacrificing accuracy, potentially enabling larger-size photonic tensor cores. The reduction of the degree of coherence optimizes bandwidth use in the photonic convolutional processing system. This breakthrough challenges the traditional belief that coherence is essential or even advantageous in integrated photonic accelerators, thereby enabling the use of light sources with less rigorous feedback control and thermal-management requirements for high-throughput photonic computing. Here we demonstrate such a system in two photonic platforms for computing applications: a photonic tensor core using phase-change-material photonic memories that delivers parallel convolution operations to classify the gaits of ten patients with Parkinson’s disease with 92.2% accuracy (92.7% theoretically) and a silicon photonic tensor core with embedded electro-absorption modulators (EAMs) to facilitate 0.108 tera operations per second (TOPS) convolutional processing for classifying the Modified National Institute of Standards and Technology (MNIST) handwritten digits dataset with 92.4% accuracy (95.0% theoretically).” ✭ New 'game-changing' discovery for light-driven artificial intelligence (Phys.org) “low coherence light sources can actually function better in specific cases, such as a photonic AI accelerator—an emerging technology where photons are used instead of electrons to perform AI computations. 
~ The team used a partially coherent light source by harnessing a narrow portion of the spectrum of incoherent light produced by an electrically-pumped erbium-doped fiber amplifier (a device used in optical communication to boost the strength of light signals traveling through optical fibers). ~This partially coherent light was evenly split and distributed into different input channels for a parallel AI computational array. Using such a light source, the parallelism of AI computation is surprisingly enhanced by N times in a photonic accelerator with N input channels. ~ As a test case, the team used this system to identify Parkinson's disease patients by analyzing how they walked, achieving a classification accuracy of over 92%. ~ The team also demonstrated how a simple system using only one partially coherent light source with nine input channels could be used to perform high-speed AI tasks at around 100 billion operations per second. Normally, such a speed—equivalent to playing more than two hours of 4K video in one second—could only be achieved in a coherent photonic AI accelerator with multiple separate coherent lasers.”
✭Superior polymeric gas separation membrane designed by explainable graph machine learning: Cell Reports Physical Science Outcomes: Graph imbalanced ML guides discovery of polymer membranes surpassing empirical limits. Two synthesized polymers show superior gas separation for O2/N2, H2/CH4, and H2/N2. O2/N2 separation selectivity is 1.6–6.7× higher than in existing polymer membranes. Explainable ML and simulations reveal molecular origins of high performance” ✭ Machine learning discovers 'hidden-gem' materials for heat-free gas separation (Phys.org) “Chemical separation, including gas separation, is a common process that is required for manufacturing and research. It accounts for a whopping 15% of U.S. energy consumption and produces millions of tons of carbon emissions. ~ Separating gases by passing them through membranes could be an efficient, environmentally friendly alternative to current methods—if only the right materials could be found to make them. ~ Applying a graph-based machine learning approach, a team of chemical and mechanical engineers and computer scientists at the University of Notre Dame have discovered, synthesized and tested polymer membranes that can separate gases up to 6.7 times more effectively than previously synthesized membranes.”
✭Hunting for Polluted White Dwarfs and Other Treasures with Gaia XP Spectra and Unsupervised Machine Learning - IOPscience “White dwarfs (WDs) polluted by exoplanetary material provide the unprecedented opportunity to directly observe the interiors of exoplanets. However, spectroscopic surveys are often limited by brightness constraints, and WDs tend to be very faint, making detections of large populations of polluted WDs difficult. In this paper, we aim to increase considerably the number of WDs with multiple metals in their atmospheres. Using 96,134 WDs with Gaia DR3 BP/RP (XP) spectra, we constructed a 2D map using an unsupervised machine-learning technique called Uniform Manifold Approximation and Projection (UMAP) to organize the WDs into identifiable spectral regions. The polluted WDs are among the distinct spectral groups identified in our map. We have shown that this selection method could potentially increase the number of known WDs with five or more metal species in their atmospheres by an order of magnitude. Such systems are essential for characterizing exoplanet diversity and geology.” ✭ Astronomers use AI to find elusive stars 'gobbling up' planets (Phys.org) “Astronomers have recently found hundreds of "polluted" white dwarf stars in our home galaxy, the Milky Way. These are white dwarfs caught actively consuming planets in their orbit. They are a valuable resource for studying the interiors of these distant, demolished planets. They are also difficult to find. ~ Historically, astronomers have had to manually review mountains of survey data for signs of these stars. Follow-up observations would then prove or refute their suspicions.
By using a novel form of artificial intelligence, called manifold learning, a team led by University of Texas at Austin graduate student Malia Kao has accelerated the process, leading to a 99% success rate in identification. The findings were published July 31 in The Astrophysical Journal. ~ White dwarfs are stars in their final stage of life. They've used up their fuel, released their outer layers into space and are slowly cooling. One day, our sun will become a white dwarf—but that won't be for another 6 billion years. ~Sometimes, the planets orbiting a white dwarf will be drawn in by their star's gravity, ripped apart and consumed. When this happens, the star becomes "polluted" with heavy metals from the planet's interior. Because white dwarfs' atmospheres are made almost entirely of hydrogen and helium, the presence of other elements can be reliably attributed to external sources. ~ "For polluted white dwarfs, the inside of the planet is literally being seared onto the surface of the star for us to look at," Kao said. "Polluted white dwarfs right now are the best way we can characterize planetary interiors."”
✭Genetic Programming for Population Genetics (GP4PG): Modelling the demographic history of human North African genomes points to a recent soft split divergence between populations (Genome Biology) “We conducted a comprehensive analysis of 364 genomes to construct detailed demographic models for the North African region, encompassing its two primary ethnic groups, the Arab and Amazigh populations. This was achieved through an Approximate Bayesian Computation with Deep Learning (ABC-DL) framework and a novel algorithm called Genetic Programming for Population Genetics (GP4PG). This innovative approach enabled us to effectively model intricate demographic scenarios, utilizing a subset of 16 whole genomes at > 30X coverage. The demographic model suggested by GP4PG exhibited a closer alignment with the observed data compared to the ABC-DL model. Both point to a back-to-Africa origin of North African individuals and a close relationship with Eurasian populations. Results support different origins for Amazigh and Arab populations, with Amazigh populations originating back in Epipaleolithic times, while GP4PG supports Arabization as the main source of Middle Eastern ancestry. The GP4PG model includes population substructure in surrounding populations (sub-Saharan Africa and Middle East) with continuous decaying gene flow after population split. Contrary to ABC-DL, the best GP4PG model does not require pulses of admixture from surrounding populations into North Africa pointing to soft splits as drivers of divergence in North Africa. 
Conclusions We have built a demographic model on North Africa that points to a back-to-Africa expansion and a differential origin between Arab and Amazigh populations.” ✭ Demographics of north African human populations unraveled using genomic data and artificial intelligence “To shed light on the origin and evolution of the Arab and Imazighen populations, the team conducted a comprehensive analysis of 364 complete genomes from different populations. To do so, it developed an innovative computational model with natural computing methods, within the field of artificial intelligence, dubbed "genetic programming for population genetics" (GP4PG). The results reveal that the differentiation between the Arab people and the Amazigh took place far earlier than expected. "The new GP4PG model has allowed a more precise, robust and refined analysis, which for the first time clearly separates the two peoples more than 20,000 years ago, when the Imazighen returned to Africa from Eurasia in the movement known as 'back to Africa'"”
✭Federated learning as a catalyst for digital healthcare innovations (Patterns | Cell) “As the landscape of digital healthcare continues to evolve, the integration of artificial intelligence (AI) presents both immense opportunities and profound challenges. At the heart of this dynamic field lies the quest for innovative solutions that enhance patient care while safeguarding sensitive medical data. In response to these imperatives, the emergence of federated learning (FL) represents a pivotal advancement, offering a pathway to harness the collective intelligence of distributed healthcare datasets while respecting privacy and security protocols. ~ This special collection on “federated learning in digital healthcare” curated for Patterns stands as a testament to the growing significance of FL in revolutionizing healthcare AI. In an era where data is hailed as the new currency of innovation, FL emerges as a beacon of promise, addressing the twin pillars of data accessibility and privacy preservation. Central to the ethos of FL is its ability to transcend traditional data silos, enabling collaborative model training across diverse healthcare institutions without necessitating the sharing of raw patient data. This decentralized approach not only fosters a spirit of cooperation but also empowers organizations to leverage the collective wisdom inherent in their datasets, thereby fuelling advancements in medical research and clinical practice.”
✭ InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds “While novel view synthesis (NVS) from a sparse set of images has made substantial progress in 3D computer vision, it requires an accurate initial estimation of camera parameters using Structure-from-Motion (SfM). However, SfM processes are time-consuming and prove unreliable in sparse-view scenarios where matched features are scarce. Moreover, the recent point-based representation (3D Gaussian Splatting or 3D-GS) is substantially dependent on the precision of SfM outcomes, leading to significant accumulated errors and limited generalization capability across varied datasets. In this study, we introduce a novel and streamlined framework to enhance robust NVS from sparse-view images. ~ Our framework, InstantSplat, integrates dense stereo predictions with point-based representations to construct 3D Gaussians of large-scale scenes from sparse-view data within seconds. Specifically, InstantSplat generates densely populated surface points across all training views and determines the initial camera parameters using pixel-aligned points.”
✭BNP-Track (Bayesian nonparametric track): a framework for superresolved tracking (Nature Methods) “Superresolution tools, such as PALM and STORM, provide nanoscale localization accuracy by relying on rare photophysical events, limiting these methods to static samples. By contrast, here, we extend superresolution to dynamics without relying on photodynamics by simultaneously determining emitter numbers and their tracks (localization and linking) with the same localization accuracy per frame as widefield superresolution on immobilized emitters under similar imaging conditions (≈50 nm). We demonstrate our Bayesian nonparametric track (BNP-Track) framework on both in cellulo and synthetic data. BNP-Track develops a joint (posterior) distribution that learns and quantifies uncertainty over emitter numbers and their associated tracks propagated from shot noise, camera artifacts, pixelation, background and out-of-focus motion. In doing so, we integrate spatiotemporal information into our distribution, which is otherwise compromised by modularly determining emitter numbers and localizing and linking emitter positions across frames. For this reason, BNP-Track remains accurate in crowding regimens beyond those accessible to other single-particle tracking tools.” ✭ BNP-Track algorithm offers a clearer picture of biomolecules in motion (Phys.org) “It's about to get easier to catch and analyze a high-quality image of fast-moving molecules. Assistant Professor Ioannis Sgouralis, Department of Mathematics, and colleagues have developed an algorithm that adds a new level to microscopy: super-resolution in motion. ~ The cutting-edge advancement of super-resolution microscopy was recognized with the 2014 Nobel Prize in Chemistry for its groundbreaking innovation. It improves optical microscopy with a suite of techniques that overcome the inherent limitations set by the physics of light. 
The high-frequency oscillations of light waves escape detection by the naked eye or conventional cameras, appearing continuous. Super-resolution microscopy captures details more refined than the wavelength of light which, due to diffraction, are otherwise missed by common microscopes and optical devices. ~ "For scientific experiments in biochemistry and molecular biology, where we typically need to observe individual biomolecules, such missing details are critical," said Sgouralis. "Characteristically, important biomolecules like DNA, RNA, and proteins are about 1,000 times smaller than light's wavelength, as a result their images appear noisy, distorted, and heavily blurred—which makes them inappropriate for scientific purposes." Super-resolution tools such as PALM or STORM fill in these details by relying on image-analysis algorithms to recover the missing information and capture accurate still images at the molecular level. "Although super-resolution experiments have had a huge impact on the life sciences, they allow recovery of the missing information only when the biomolecules remain immobile," said Sgouralis. "However, life is all about motion and biomolecules within a living organism are constantly moving." In their new research, published July 22 in Nature Methods, Sgouralis and colleagues demonstrate a new framework called Bayesian nonparametric track (BNP-Track), the first image-analysis algorithm that allows super-resolution for moving biomolecules.”
😵(Might) Watch (Partially): AI and The Next Computing Platforms With Jensen Huang and Mark Zuckerberg, Indus Model by NASA and IBM
✭ AI and The Next Computing Platforms With Jensen Huang and Mark Zuckerberg “NVIDIA founder and CEO Jensen Huang and Meta founder and CEO Mark Zuckerberg discuss how fundamental research is enabling AI breakthroughs, and how generative AI and open-source software will empower developers and creators. They also discuss the role of generative AI in building virtual worlds, and the potential of virtual worlds for building the next wave of AI and robots.”
👀(yes) Watched: ‘Google smokes math Olympiads’ (Fireship), Trying to Convince ChatGPT It's Conscious (Alex O'Connor)
Google smokes math Olympiads… and 15 crazy tech stories you missed in July
Trying to Convince ChatGPT It's Conscious (Alex O'Connor)
🖲️AI Art (Research, Tools, Play): Artificial Immediacy: AI Music Generation with Suno + Personalized-Audio-Visual-Streams (PAVS) & the listenership
✭ Artificial Immediacy: AI Music Generation with Suno + Personalized-Audio-Visual-Streams (PAVS) & the listenership “(June & July 2024) Prompting 7 hours of 'reasonable' music is approx. 21 hours of work. Contrast an arduous 20th-century digital art apprenticeship, with its years of physical practice on interfaces and instruments, with the following Artificial Immediacy music generation process.” → ✭ ReRites, August 2017 + Suno Aug 4th 2024 (Lyrics from August 2017 RERITES: machine learning poetry edited by a human)
🔌AI-Art-Tools: Autolume (METACREATION), Replicate “Run AI with an API”, Fal.ai
✭ Autolume — METACREATION “A Neural-network based visual synthesizer. Autolume is a no-coding generative AI system allowing artists to train, craft, and explore their own models.”
✭ Replicate “Run AI with an API. Run and fine-tune open-source models. Deploy custom models at scale. All with one line of code.”
✭ Fal.ai “Welcome to fal! The fastest generative media platform for developers. Create an API key secret. Call the API Endpoint. We have the most-popular models implemented and available as API endpoints for you to start crafting your own AI-powered app today.”
🖲️AI-Art-Play: Does Anyone Know who/what made This?, Moss 369 - Perfumes, Fever Dream
Does Anyone Know who/what made This? Or this: U know me I know you (on TikTok)
⚔️War (wAIr) Propaganda: InfoWars ‘2084’
✭ InfoWars '2084' (seen on Runway ML Discord channel)
Example of YOLO object detection and tracking #ai #artificialintelligence #elsemind
📚Retroactive Readings: Computational Life: How Well-formed, Self-replicating Programs Emerge from Simple Interaction, Folk psychological attributions of consciousness to large language models (Neuroscience of Consciousness | Oxford)
✭ [2406.19108] Computational Life: How Well-formed, Self-replicating Programs Emerge from Simple Interaction “The fields of Origin of Life and Artificial Life both question what life is and how it emerges from a distinct set of "pre-life" dynamics. One common feature of most substrates where life emerges is a marked shift in dynamics when self-replication appears. While there are some hypotheses regarding how self-replicators arose in nature, we know very little about the general dynamics, computational principles, and necessary conditions for self-replicators to emerge. This is especially true on "computational substrates" where interactions involve logical, mathematical, or programming rules. In this paper we take a step towards understanding how self-replicators arise by studying several computational substrates based on various simple programming languages and machine instruction sets. We show that when random, non self-replicating programs are placed in an environment lacking any explicit fitness landscape, self-replicators tend to arise. We demonstrate how this occurs due to random interactions and self-modification, and can happen with and without background random mutations. We also show how increasingly complex dynamics continue to emerge following the rise of self-replicators. Finally, we show a counterexample of a minimalistic programming language where self-replicators are possible, but so far have not been observed to arise.” Random Code Can Learn to Self-Replicate, New Study Finds
✭ Folk psychological attributions of consciousness to large language models | Neuroscience of Consciousness | Oxford Academic “Technological advances raise new puzzles and challenges for cognitive science and the study of how humans think about and interact with artificial intelligence (AI). For example, the advent of large language models and their human-like linguistic abilities has raised substantial debate regarding whether or not AI could be conscious. Here, we consider the question of whether AI could have subjective experiences such as feelings and sensations (‘phenomenal consciousness’). While experts from many fields have weighed in on this issue in academic and public discourse, it remains unknown whether and how the general population attributes phenomenal consciousness to AI. We surveyed a sample of US residents (n = 300) and found that a majority of participants were willing to attribute some possibility of phenomenal consciousness to large language models. These attributions were robust, as they predicted attributions of mental states typically associated with phenomenality—but also flexible, as they were sensitive to individual differences such as usage frequency. Overall, these results show how folk intuitions about AI consciousness can diverge from expert intuitions—with potential implications for the legal and ethical status of AI.” How could we tell whether AI has become conscious?


