AI Auto research
Curated list of Auto research systems (listings under the AI tab).
Autonomous Scientific Discovery
AI systems that autonomously conduct research — generating hypotheses, designing experiments, and validating results:
- Foundation (2025): Specialized multi-agent systems for structured scientific reasoning (Co-Scientist, Gottweis et al. Nature 2026; Virtual Lab, Swanson et al. PMID 39719521, Nature 2025).
- Full autonomy (2025-2026): End-to-end systems that generate complete research papers from idea to manuscript (AI Scientist, Lu et al. Nature 2026; Robin, Ghareeb et al. Nature 2026).
- Open-source pipelines: Reproducible multi-stage workflows (AutoResearchClaw, CodeScientist from Ai2).
Trend: from assistive → autonomous, from specialized domains → general-purpose scientific discovery.
Summary Table
| NAME | Main citation | YEAR |
|---|---|---|
| AI Scientist | Lu C et al., Nature, 2026 | 2026 |
| AutoResearchClaw | AIMING Lab et al. | NA |
| Biomni | Huang K et al., bioRxiv, 2025 | 2025 |
| Co-Scientist | Gottweis J et al., Nature, 2026 | 2026 |
| CodeScientist | Jansen P et al. | NA |
| Denario | Villaescusa-Navarro F et al. | NA |
| Robin | Ghareeb AE et al., Nature, 2026 | 2026 |
| SPARK | Trost F et al., Nat Med, 2026 | 2026 |
| Virtual Lab | Swanson K et al., Nature, 2025 | 2025 |
AI Scientist
FULL NAME
The AI Scientist — Towards End-to-End Automation of AI Research
DESCRIPTION
The AI Scientist is the first fully autonomous AI system to generate a paper that passed the first round of peer review at a workshop (ICLR 2025 ICBINB workshop). It creates research ideas, writes code, runs experiments, analyses data, writes manuscripts, and performs peer review, all end to end. A template-free mode uses agentic tree search for open-ended scientific exploration. An automated reviewer achieves balanced accuracy comparable to human reviewers (69%). Paper quality scales with foundation model capability and test-time compute.
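For readers unfamiliar with the metric quoted above: balanced accuracy is the mean of the true-positive rate and the true-negative rate, so it is not inflated when one class (for example, rejected papers) dominates. The sketch below is a minimal illustration only; the function name and the toy accept/reject labels are assumptions made for this example and are not data from the paper.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity (TPR) and specificity (TNR) for 0/1 labels (1 = accept)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    return 0.5 * (tp / pos + tn / neg)


# Toy labels (made up for illustration): 2 accepted and 8 rejected papers.
truth = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
preds = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
score = balanced_accuracy(truth, preds)  # TPR = 1/2, TNR = 6/8
plain = sum(t == p for t, p in zip(truth, preds)) / len(truth)
```

On these toy labels the plain accuracy is 0.7 while the balanced accuracy is 0.625, which shows how the metric discounts easy wins on the majority class.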
TITLE
Towards end-to-end automation of AI research.
Main citation
Lu C, Lu C, Lange RT, Yamada Y, Hu S, Foerster J, Ha D, Clune J. (2026) Towards end-to-end automation of AI research. Nature, 651(8107):914-919. doi:10.1038/s41586-026-10265-5. PMID 41882133
ABSTRACT
The automation of science is a long-standing ambition in artificial intelligence research. Although the community has made substantial progress in automating individual components of the scientific process, a system that autonomously navigates the entire research life cycle from conception to publication has remained out of reach. Here we present a pipeline for automating the entire scientific process end to end. We present The AI Scientist, which creates research ideas, writes code, runs experiments, plots and analyses data, writes the entire scientific manuscript, and performs its own peer review. Its ideas, execution and presentation are of sufficient quality that the manuscript generated by this AI system passed the first round of peer review for a workshop of a top-tier machine learning conference.
DOI
10.1038/s41586-026-10265-5
AutoResearchClaw
FULL NAME
AutoResearchClaw — Self-Reinforcing Autonomous Research with Human-AI Collaboration
DESCRIPTION
AutoResearchClaw is an open-source 23-stage autonomous research pipeline from UNC Chapel Hill that turns a research idea into a conference-ready LaTeX paper. Features: multi-agent debate for hypothesis generation, self-healing executor with Pivot/Refine decision loop, verifiable result reporting preventing hallucinations, human-in-the-loop with 7 intervention modes, and cross-run evolution. Outperforms AI Scientist v2 by 54.7% on ARC-Bench. MIT licensed.
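The Pivot/Refine idea can be pictured as a retry loop that records each failure and then chooses between patching the current attempt (refine) and moving to a different hypothesis (pivot). The sketch below is a toy under stated assumptions, not AutoResearchClaw's implementation: `run_with_pivot_refine`, `RunLog`, `toy_execute`, and the fixed `max_refines` rule are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class RunLog:
    """Decisions taken and errors seen; failures are kept as information."""
    decisions: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)


def run_with_pivot_refine(
    hypotheses: List[str],
    execute: Callable[[str, int], Optional[str]],
    max_refines: int = 2,
) -> Tuple[Optional[str], RunLog]:
    """Refine a failing hypothesis up to max_refines times, then pivot to the next.

    execute(hypothesis, attempt) returns None on success or an error message.
    """
    log = RunLog()
    for hypothesis in hypotheses:
        for attempt in range(max_refines + 1):
            error = execute(hypothesis, attempt)
            if error is None:
                log.decisions.append(f"accept:{hypothesis}")
                return hypothesis, log
            log.errors.append(error)
            if attempt < max_refines:
                log.decisions.append(f"refine:{hypothesis}")
        log.decisions.append(f"pivot:{hypothesis}")
    return None, log


def toy_execute(hypothesis: str, attempt: int) -> Optional[str]:
    # Hypothetical stand-in for an experiment runner:
    # "A" always fails; "B" succeeds on its second attempt.
    if hypothesis == "B" and attempt >= 1:
        return None
    return f"{hypothesis} failed on attempt {attempt}"


winner, log = run_with_pivot_refine(["A", "B"], toy_execute, max_refines=1)
```

In this toy run the loop refines "A" once, pivots away from it, refines "B" once, and then accepts "B". A real system would presumably base the pivot decision on the content of the errors rather than on a fixed retry count.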
Main citation
AIMING Lab. (2026) AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration. arXiv:2605.20025. doi:10.48550/arXiv.2605.20025
ABSTRACT
Automating scientific discovery requires more than generating papers from ideas. Real research is iterative: hypotheses are challenged from multiple perspectives, experiments fail and inform the next attempt, and lessons accumulate across cycles. We present AutoResearchClaw, a multi-agent autonomous research pipeline built on five mechanisms: structured multi-agent debate for hypothesis generation and result analysis, a self-healing executor with a Pivot/Refine decision loop that transforms failures into information, verifiable result reporting that prevents fabricated numbers and hallucinated citations, human-in-the-loop collaboration with seven intervention modes, and cross-run evolution that converts past mistakes into future safeguards. On ARC-Bench, AutoResearchClaw outperforms AI Scientist v2 by 54.7%.
DOI
10.48550/arXiv.2605.20025
Biomni
FULL NAME
Biomni: A General-Purpose Biomedical AI Agent
DESCRIPTION
Biomni is a general-purpose biomedical AI agent designed to autonomously execute a wide spectrum of research tasks across diverse biomedical subfields. It employs an action discovery agent to mine tools, databases, and protocols from tens of thousands of publications across 25 biomedical domains, creating the first unified agentic environment (Biomni-E1). Its generalist agentic architecture (Biomni-A1) integrates LLM reasoning with retrieval-augmented planning and code-based execution, dynamically composing complex workflows without predefined templates. Systematic benchmarking demonstrates strong zero-shot generalization across heterogeneous tasks including causal gene prioritization, drug repurposing, rare disease diagnosis, microbiome analysis, and molecular cloning.
TITLE
Biomni: A General-Purpose Biomedical AI Agent.
Main citation
Huang K, Zhang S, Wang H, Qu Y, Lu Y, Roohani Y, Li R, Qiu L, Li G, Zhang J, Yin D, Marwaha S, Carter JN, Zhou X, Wheeler M, Bernstein JA, Wang M, He P, Zhou J, Snyder M, Cong L, Regev A, Leskovec J. (2025) Biomni: A General-Purpose Biomedical AI Agent. bioRxiv. doi:10.1101/2025.05.30.656746. PMID 40501924
ABSTRACT
Biomedical research underpins progress in our understanding of human health and disease, drug discovery, and clinical care. However, with the growth of complex lab experiments, large datasets, many analytical tools, and expansive literature, biomedical research is increasingly constrained by repetitive and fragmented workflows that slow discovery and limit innovation. Here, we introduce Biomni, a general-purpose biomedical AI agent designed to autonomously execute a wide spectrum of research tasks across diverse biomedical subfields. To systematically map the biomedical action space, Biomni first employs an action discovery agent to create the first unified agentic environment, mining essential tools, databases, and protocols from tens of thousands of publications across 25 biomedical domains. Built on this foundation, Biomni features a generalist agentic architecture that integrates LLM reasoning with retrieval-augmented planning and code-based execution, enabling it to dynamically compose and carry out complex biomedical workflows entirely without relying on predefined templates or rigid task flows. Systematic benchmarking demonstrates that Biomni achieves strong generalization across heterogeneous biomedical tasks including causal gene prioritization, drug repurposing, rare disease diagnosis, microbiome analysis, and molecular cloning without any task-specific prompt tuning. Real-world case studies further showcase Biomni's ability to interpret complex, multi-modal biomedical datasets and autonomously generate experimentally testable protocols.
DOI
10.1101/2025.05.30.656746
Co-Scientist
FULL NAME
Co-Scientist - A Multi-Agent AI System for Accelerating Scientific Discovery
DESCRIPTION
Co-Scientist is a multi-agent AI system built on Gemini for structured scientific thinking and hypothesis generation. It aims to help scientists discover new original knowledge by formulating demonstrably novel research hypotheses for experimental validation, conditioned on research objectives and prior scientific evidence.
TITLE
Accelerating scientific discovery with Co-Scientist.
Main citation
Gottweis J, Weng WH, Daryin A, Tu T, Sirkovic P, Myaskovsky A, Glowaty G, Weissenberger F, Orlandi A, Popovici D, Palepu A, Rong K, Tanno R, Saab K, Zhang F, Blum J, Carroll A, Kulkarni K, Tomašev N, Zverinski D, Rendulic I, Vedadi E, Hasler F, Rimanic L, Boia M, Budiselic I, Feinstein B, Bellaiche M, Sheffer T, Freyberg J, Ratcliff J, Bertolli O, Chou K, Hassidim A, Gokturk B, Vahdat A, Guan Y, Dhillon V, Vaishnav ED, Lee B, Costa TRD, Penadés JR, Peltz G, Matias Y, Manyika J, Hassabis D, Xu Y, Kohli P, Pawlosky A, Karthikesalingam A, Natarajan V. (2026) Accelerating scientific discovery with Co-Scientist. Nature. doi:10.1038/s41586-026-10644-y. PMID 42156544
ABSTRACT
Scientific discovery is driven by scientists generating novel hypotheses for complex problems that undergo rigorous experimental validation. To augment this process, we introduce Co-Scientist, a multi-agent AI system built on Gemini for structured scientific thinking and hypothesis generation. Co-Scientist aims to help scientists discover new original knowledge. Conditioned on their research objectives and prior scientific evidence, it formulates demonstrably novel research hypotheses for experimental validation.
DOI
10.1038/s41586-026-10644-y
CodeScientist
FULL NAME
CodeScientist — End-to-End Semi-Automated Scientific Discovery with Code-based Experimentation
DESCRIPTION
CodeScientist is an autonomous scientific discovery system from Ai2 (Allen Institute for AI) that frames ideation and experiment construction as genetic search over research articles and code blocks. It conducted hundreds of automated experiments on agents and virtual environments, returning 19 discoveries, 6 of which were judged minimally sound and incrementally novel after multi-faceted evaluation (conference review, code review, replication). Discoveries span new tasks, agents, metrics, and data. Published at ACL 2025 Findings.
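To give a feel for what "genetic search over research articles and code blocks" can mean, here is a deliberately small sketch. It is not the CodeScientist algorithm: the real system relies on language models and multi-faceted evaluation, whereas this toy pairs one article with one code block, uses a hypothetical scoring function (`toy_fitness`), and applies textbook crossover, mutation, and elitism. All names and parameters are assumptions made for illustration.

```python
import random
from typing import Callable, List, Tuple

Candidate = Tuple[str, str]  # (research article, code block) pairing


def genetic_search(
    articles: List[str],
    codeblocks: List[str],
    fitness: Callable[[Candidate], float],
    generations: int = 20,
    population_size: int = 6,
    mutation_rate: float = 0.3,
    seed: int = 0,
) -> Candidate:
    """Evolve (article, codeblock) pairings; the best half survives each round."""
    rng = random.Random(seed)
    population = [
        (rng.choice(articles), rng.choice(codeblocks)) for _ in range(population_size)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]  # elitism
        children: List[Candidate] = []
        while len(survivors) + len(children) < population_size:
            mother, father = rng.choice(survivors), rng.choice(survivors)
            child = (mother[0], father[1])  # crossover: article and code from different parents
            if rng.random() < mutation_rate:  # mutation: swap in a random article
                child = (rng.choice(articles), child[1])
            if rng.random() < mutation_rate:  # mutation: swap in a random code block
                child = (child[0], rng.choice(codeblocks))
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)


# Hypothetical inputs and scoring, standing in for real experiment evaluation.
articles = ["paper_a", "paper_b", "paper_c"]
codeblocks = ["logger", "simulator", "llm_call"]


def toy_fitness(candidate: Candidate) -> float:
    return float(articles.index(candidate[0]) + codeblocks.index(candidate[1]))


best = genetic_search(articles, codeblocks, toy_fitness)
```

Because the top half of each generation is carried over unchanged, the best score in the population never decreases from one generation to the next.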
Main citation
Jansen P, et al. (2025) CodeScientist: End-to-End Semi-Automated Scientific Discovery with Code-based Experimentation. arXiv:2503.22708. ACL 2025 Findings. doi:10.48550/arXiv.2503.22708
ABSTRACT
Despite the surge of interest in autonomous scientific discovery (ASD) of software artifacts, current ASD systems face two key limitations: they largely explore variants of existing codebases, and they produce large volumes of research artifacts typically evaluated using conference-style paper review with limited evaluation of code. In this work we introduce CodeScientist, a novel ASD system that frames ideation and experiment construction as a form of genetic search jointly over combinations of research articles and codeblocks defining common actions in a domain. We use this paradigm to conduct hundreds of automated experiments, with the system returning 19 discoveries, 6 of which were judged as being both at least minimally sound and incrementally novel after multi-faceted evaluation.
DOI
10.48550/arXiv.2503.22708
Denario
FULL NAME
Denario — Deep Knowledge AI Agents for Scientific Discovery
DESCRIPTION
Denario is an AI multi-agent system from the Flatiron Institute (Simons Foundation) designed to serve as a scientific research assistant across disciplines. It can generate ideas, check literature for novelty, develop research plans, write and execute code, make plots, and draft and review scientific papers. Demonstrated across 11 AI-generated paper drafts in astrophysics, biology, biophysics, chemistry, material science, medicine, neuroscience and more. Excels at combining ideas across disciplines (e.g., quantum physics + ML applied to astrophysics).
Main citation
Villaescusa-Navarro F, Bolliet B, Villanueva-Domingo P, et al. (2025) The Denario project: Deep knowledge AI agents for scientific discovery. arXiv:2510.26887. doi:10.48550/arXiv.2510.26887
ABSTRACT
We present Denario, an AI multi-agent system designed to serve as a scientific research assistant. Denario can perform many different tasks, such as generating ideas, checking the literature, developing research plans, writing and executing code, making plots, and drafting and reviewing a scientific paper. In this work, we describe in detail Denario and its modules, and illustrate its capabilities by presenting multiple AI-generated papers generated by it in many different scientific disciplines such as astrophysics, biology, biophysics, biomedical informatics, chemistry, material science, mathematical physics, medicine, neuroscience and planetary science. Denario also excels at combining ideas from different disciplines.
DOI
10.48550/arXiv.2510.26887
Robin
FULL NAME
Robin - A Multi-Agent System for Automating Scientific Discovery
DESCRIPTION
Robin is the first multi-agent system capable of fully automating both hypothesis generation and data analysis for experimental biology. By integrating literature search agents with data analysis agents, Robin can generate testable hypotheses from literature and design experiments to validate them, covering the hypothesis-generation, experiment-design, and data-analysis stages of the discovery cycle for biological research.
TITLE
A multi-agent system for automating scientific discovery.
Main citation
Ghareeb AE, Chang B, Mitchener L, Yiu A, Szostkiewicz CJ, Shved D, Gyimesi GJ, Laurent JM, Wright SM, Razzak MT, White AD, Finnemann SC, Hinks MM, Rodriques SG. (2026) A multi-agent system for automating scientific discovery. Nature. doi:10.1038/s41586-026-10652-y. PMID 42156546
ABSTRACT
Scientific discovery is driven by the iterative process of observation, hypothesis generation, experimentation, and data analysis. Despite recent advancements in applying artificial intelligence to biology, no system has yet automated all these stages. Here, we introduce Robin, the first multi-agent system capable of fully automating both hypothesis generation and data analysis for experimental biology. By integrating literature search agents with data analysis agents, Robin can generate testable hypotheses from literature and design experiments to validate them.
DOI
10.1038/s41586-026-10652-y
SPARK
FULL NAME
SPARK (System of Pathology Agents for Research and Knowledge)
DESCRIPTION
SPARK (System of Pathology Agents for Research and Knowledge) is a foundational agentic AI framework that uses language as a universal interface to autonomously generate biologically driven concepts for tumor analysis. It functions as a pathology 'brain': an interconnected system of AI agents that autonomously reason about, generate, and implement biologically meaningful hypotheses as analytical tools without additional model training. SPARK uses four linked modules: idea generation (OpenAI o1), idea refinement, parameter coding (Claude 3.5 Sonnet), and parameter verification. Evaluated across 18 patient cohorts spanning 5 cancer types and >5,400 patients, SPARK produced clinically and biologically relevant concepts correlated with prognosis, pathological variables, and predictive biomarkers, including patterns of tumor progression inferred from static images.
TITLE
An agentic framework for autonomous scientific discovery in cancer pathology.
Main citation
Trost F, Zhang B, Aring J, Glamann L, Wessolly M, Johnson K, Göbel H, Lerbs T, Sangenne T, Herrmann P, Mairinger F, Kopp C, Michels S, Rasokat A, Heldwein M, Wagner S, Schömig-Markiefka B, Wolf J, Hartmann S, Wickenhauser C, Bychkov A, Klussmann JP, Quaas A, Buettner R, Tolkach Y. (2026) An agentic framework for autonomous scientific discovery in cancer pathology. Nature Medicine. doi:10.1038/s41591-026-04357-y. PMID 42056496
ABSTRACT
Artificial intelligence has advanced cancer pathology, but many systems still depend on hand-crafted features, are hard to explain and rely on fragmented workflows. We introduce SPARK (System of Pathology Agents for Research and Knowledge), a foundational agentic artificial intelligence approach that uses language as a universal interface to autonomously generate biologically driven concepts for tumor analysis. SPARK turns biological ideas into analytical tools and works directly with complex pathology data without extra model training. We evaluated SPARK across 18 patient cohorts spanning five cancer types (lung adenocarcinoma, lung squamous cell carcinoma, colorectal cancer, breast cancer and oropharyngeal squamous cell carcinoma) and more than 5,400 patients with available histopathology images and clinical/follow-up information, in both prognostic and predictive settings and on a well characterized spatial biology breast cancer dataset (n=625). We found that SPARK produced clinically and biologically relevant concepts correlated with prognosis, known pathological variables and predictive biomarkers, including patterns of tumor progression and temporal change inferred from static images. A dedicated module allows for human interaction with SPARK. All code, parameters and results are openly released.
DOI
10.1038/s41591-026-04357-y
Virtual Lab
FULL NAME
Virtual Lab - AI Agent Teams for Scientific Discovery
DESCRIPTION
The Virtual Lab is an AI agent framework that uses LLM-powered researchers in a simulated laboratory environment to collaboratively design and test scientific hypotheses. It was demonstrated by successfully designing new SARS-CoV-2 nanobodies, with AI agents specializing in different scientific roles working together to propose, evaluate, and refine experimental designs.
TITLE
The Virtual Lab of AI agents designs new SARS-CoV-2 nanobodies.
Main citation
Swanson K, et al. (2025) The Virtual Lab of AI agents designs new SARS-CoV-2 nanobodies. Nature. doi:10.1038/s41586-025-09442-9
ABSTRACT
Science frequently benefits from teams of interdisciplinary researchers, but many scientists do not have easy access to experts from multiple fields. Although large language models (LLMs) have shown an impressive ability to aid researchers across diverse domains, their uses have been largely limited to answering specific scientific questions rather than performing open-ended research. Here we expand the capabilities of LLMs for science by introducing the Virtual Lab, a framework where LLM-powered AI agents collaborate in a simulated laboratory to design and test scientific hypotheses. The Virtual Lab successfully designed new SARS-CoV-2 nanobodies, demonstrating the potential of multi-agent AI systems for open-ended scientific discovery.
DOI
10.1038/s41586-025-09442-9
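Each entry in this listing carries a DOI, and several citations also give a PMID. As a convenience, the sketch below shows one way to turn those identifiers into resolvable links using the public doi.org resolver and the PubMed URL pattern. The helper names `doi_url` and `pubmed_url` and the loose DOI pattern check are assumptions of this example, not part of the listing.

```python
import re
from urllib.parse import quote


def doi_url(doi: str) -> str:
    """Build a doi.org resolver link from a bare DOI such as 10.1038/xyz."""
    doi = doi.strip()
    # Loose sanity check (an assumption of this sketch, not a full DOI grammar).
    if not re.match(r"^10\.\d{4,9}/\S+$", doi):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + quote(doi, safe="/")


def pubmed_url(pmid: str) -> str:
    """Build a PubMed record link from a numeric PMID."""
    pmid = pmid.strip()
    if not pmid.isdigit():
        raise ValueError(f"not a PMID: {pmid!r}")
    return f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"


# Identifiers copied from the Virtual Lab and Biomni entries above.
virtual_lab_link = doi_url("10.1038/s41586-025-09442-9")
biomni_pubmed_link = pubmed_url("40501924")
```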