Auto Research
Catalog entries using this tag:
Entries
AI Scientist
FULL NAME
The AI Scientist — Towards End-to-End Automation of AI Research
DESCRIPTION
The AI Scientist is described as the first fully autonomous AI system to produce a manuscript that passed the first round of peer review at a workshop of a top-tier machine learning conference (the ICBINB workshop at ICLR 2025). It creates research ideas, writes code, runs experiments, analyses data, writes manuscripts, and performs peer review, all end to end. A template-free mode uses agentic tree search for open-ended scientific exploration. Its automated reviewer reaches a balanced accuracy of 69%, comparable to human reviewers. Paper quality scales with foundation model capability and test-time compute.
TITLE
Towards end-to-end automation of AI research.
Main citation
Lu C, Lu C, Lange RT, Yamada Y, Hu S, Foerster J, Ha D, Clune J. (2026) Towards end-to-end automation of AI research. Nature, 651(8107):914-919. doi:10.1038/s41586-026-10265-5. PMID 41882133
ABSTRACT
The automation of science is a long-standing ambition in artificial intelligence research. Although the community has made substantial progress in automating individual components of the scientific process, a system that autonomously navigates the entire research life cycle from conception to publication has remained out of reach. Here we present a pipeline for automating the entire scientific process end to end. We present The AI Scientist, which creates research ideas, writes code, runs experiments, plots and analyses data, writes the entire scientific manuscript, and performs its own peer review. Its ideas, execution and presentation are of sufficient quality that the manuscript generated by this AI system passed the first round of peer review for a workshop of a top-tier machine learning conference.
DOI
10.1038/s41586-026-10265-5
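The AI Scientist description above reports the automated reviewer's quality as balanced accuracy. As a reminder of what that metric measures, here is a minimal sketch assuming a binary accept/reject setting. The function name and the toy labels are hypothetical illustrations, not data or code from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of the recall on the positive class and the recall on the negative class."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t and p)
    true_neg = sum(1 for t, p in pairs if not t and not p)
    n_pos = sum(1 for t in y_true if t)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present in y_true")
    return 0.5 * (true_pos / n_pos + true_neg / n_neg)


# Hypothetical toy data: 1 = accept, 0 = reject.
human_decisions = [1, 0, 0, 0, 1, 0, 0, 0]
reviewer_predictions = [1, 0, 1, 0, 0, 0, 0, 0]

score = balanced_accuracy(human_decisions, reviewer_predictions)
print(f"balanced accuracy: {score:.3f}")
```

Unlike plain accuracy, this metric is not inflated when most submissions are rejected, which is presumably why it suits accept/reject prediction on imbalanced review data.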
AutoResearchClaw
FULL NAME
AutoResearchClaw — Self-Reinforcing Autonomous Research with Human-AI Collaboration
DESCRIPTION
AutoResearchClaw is an open-source, MIT-licensed, 23-stage autonomous research pipeline from UNC Chapel Hill that turns a research idea into a conference-ready LaTeX paper. Its features include multi-agent debate for hypothesis generation and result analysis, a self-healing executor with a Pivot/Refine decision loop, verifiable result reporting that guards against fabricated numbers and hallucinated citations, human-in-the-loop collaboration with seven intervention modes, and cross-run evolution. On ARC-Bench it outperforms AI Scientist v2 by 54.7%.
Main citation
AIMING Lab. (2026) AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration. arXiv:2605.20025. doi:10.48550/arXiv.2605.20025
ABSTRACT
Automating scientific discovery requires more than generating papers from ideas. Real research is iterative: hypotheses are challenged from multiple perspectives, experiments fail and inform the next attempt, and lessons accumulate across cycles. We present AutoResearchClaw, a multi-agent autonomous research pipeline built on five mechanisms: structured multi-agent debate for hypothesis generation and result analysis, a self-healing executor with a Pivot/Refine decision loop that transforms failures into information, verifiable result reporting that prevents fabricated numbers and hallucinated citations, human-in-the-loop collaboration with seven intervention modes, and cross-run evolution that converts past mistakes into future safeguards. On ARC-Bench, AutoResearchClaw outperforms AI Scientist v2 by 54.7%.
DOI
10.48550/arXiv.2605.20025
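The AutoResearchClaw entry above mentions a self-healing executor with a Pivot/Refine decision loop. The listing does not describe how that loop is implemented, so the following is only a rough sketch of one way such a loop could work: refine the current plan a limited number of times after a failure, then pivot to a different plan. Every name here (`run_with_pivot_refine`, `refine_budget`, the toy callbacks) is a hypothetical illustration, not the project's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    plan: str
    ok: bool
    note: str


def run_with_pivot_refine(
    plan: str,
    execute: Callable[[str], Tuple[bool, str]],
    refine: Callable[[str, str], str],
    pivot: Callable[[str, List[Attempt]], str],
    max_attempts: int = 5,
    refine_budget: int = 2,
) -> List[Attempt]:
    """Retry a failing plan: refine it up to refine_budget times, then pivot."""
    history: List[Attempt] = []
    refinements = 0
    for _ in range(max_attempts):
        ok, note = execute(plan)
        history.append(Attempt(plan, ok, note))
        if ok:
            break
        if refinements < refine_budget:
            # Refine: keep the approach, patch it using the failure note.
            plan = refine(plan, note)
            refinements += 1
        else:
            # Pivot: abandon the approach, using the full failure history.
            plan = pivot(plan, history)
            refinements = 0
    return history


# Toy callbacks (assumptions for illustration): only the "alt" approach succeeds.
def toy_execute(plan: str) -> Tuple[bool, str]:
    return (True, "ok") if plan.startswith("alt") else (False, "diverged")


history = run_with_pivot_refine(
    "base",
    execute=toy_execute,
    refine=lambda plan, note: plan + "+fix",
    pivot=lambda plan, past: "alt",
)
for attempt in history:
    print(attempt)
```

Keeping every attempt in `history` is the part that turns failures into information: the pivot step sees what was already tried and why it failed.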
CodeScientist
FULL NAME
CodeScientist — End-to-End Semi-Automated Scientific Discovery with Code-based Experimentation
DESCRIPTION
CodeScientist is an autonomous scientific discovery system from Ai2 (Allen Institute for AI) that frames ideation and experiment construction as genetic search over research articles and code blocks. It conducted hundreds of automated experiments on agents and virtual environments, returning 19 discoveries, 6 of which were judged minimally sound and incrementally novel after multi-faceted evaluation (conference review, code review, replication). Discoveries span new tasks, agents, metrics, and data. Published at ACL 2025 Findings.
Main citation
Jansen P, et al. (2025) CodeScientist: End-to-End Semi-Automated Scientific Discovery with Code-based Experimentation. arXiv:2503.22708. ACL 2025 Findings. doi:10.48550/arXiv.2503.22708
ABSTRACT
Despite the surge of interest in autonomous scientific discovery (ASD) of software artifacts, current ASD systems face two key limitations: they largely explore variants of existing codebases, and they produce large volumes of research artifacts typically evaluated using conference-style paper review with limited evaluation of code. In this work we introduce CodeScientist, a novel ASD system that frames ideation and experiment construction as a form of genetic search jointly over combinations of research articles and codeblocks defining common actions in a domain. We use this paradigm to conduct hundreds of automated experiments, with the system returning 19 discoveries, 6 of which were judged as being both at least minimally sound and incrementally novel after multi-faceted evaluation.
DOI
10.48550/arXiv.2503.22708
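The CodeScientist entry above frames ideation as genetic search jointly over combinations of research articles and code blocks. A toy sketch of that idea follows, assuming an "idea" is simply a pair of small sets (articles, code blocks). The article and code block names, the mutation and crossover rules, and the `toy_fitness` function are all hypothetical stand-ins. In the real system, candidate ideas are evaluated by running experiments, not by a fixed scoring function.

```python
import random
from typing import FrozenSet, List, Tuple

Idea = Tuple[FrozenSet[str], FrozenSet[str]]  # (articles, code blocks)

# Hypothetical pools to combine.
ARTICLES = ["paper_a", "paper_b", "paper_c", "paper_d"]
CODEBLOCKS = ["llm_call", "logger", "text_env", "plotting"]


def random_idea(rng: random.Random) -> Idea:
    return (frozenset(rng.sample(ARTICLES, 2)), frozenset(rng.sample(CODEBLOCKS, 2)))


def crossover(a: Idea, b: Idea, rng: random.Random) -> Idea:
    # Take the articles from one parent and the code blocks from the other.
    return (a[0], b[1]) if rng.random() < 0.5 else (b[0], a[1])


def mutate(idea: Idea, rng: random.Random) -> Idea:
    articles, blocks = set(idea[0]), set(idea[1])
    pool, chosen = (ARTICLES, articles) if rng.random() < 0.5 else (CODEBLOCKS, blocks)
    # Swap one element for another from the same pool (may re-pick the same one).
    chosen.remove(rng.choice(sorted(chosen)))
    chosen.add(rng.choice([x for x in pool if x not in chosen]))
    return (frozenset(articles), frozenset(blocks))


def toy_fitness(idea: Idea) -> int:
    # Stand-in score: overlap with an arbitrary "good" combination.
    return len(idea[0] & {"paper_b", "paper_d"}) + len(idea[1] & {"text_env", "llm_call"})


def evolve(generations: int = 10, pop_size: int = 8, seed: int = 0) -> Tuple[Idea, List[int]]:
    rng = random.Random(seed)
    population = [random_idea(rng) for _ in range(pop_size)]
    best_per_generation: List[int] = []
    for _ in range(generations):
        population.sort(key=toy_fitness, reverse=True)
        best_per_generation.append(toy_fitness(population[0]))
        parents = population[: pop_size // 2]  # elitism: the top half survives
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    population.sort(key=toy_fitness, reverse=True)
    return population[0], best_per_generation


best, trace = evolve()
print(best, trace)
```

Because the top half of each generation is carried over unchanged, the best score in `trace` can never decrease from one generation to the next.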
Denario
FULL NAME
Denario — Deep Knowledge AI Agents for Scientific Discovery
DESCRIPTION
Denario is an AI multi-agent system from the Flatiron Institute (Simons Foundation) designed to serve as a scientific research assistant across disciplines. It can generate ideas, check literature for novelty, develop research plans, write and execute code, make plots, and draft and review scientific papers. It has been demonstrated on 11 AI-generated paper drafts spanning astrophysics, biology, biophysics, chemistry, materials science, medicine, neuroscience and other fields. It also excels at combining ideas across disciplines (e.g., quantum physics and ML applied to astrophysics).
Main citation
Villaescusa-Navarro F, Bolliet B, Villanueva-Domingo P, et al. (2025) The Denario project: Deep knowledge AI agents for scientific discovery. arXiv:2510.26887. doi:10.48550/arXiv.2510.26887
ABSTRACT
We present Denario, an AI multi-agent system designed to serve as a scientific research assistant. Denario can perform many different tasks, such as generating ideas, checking the literature, developing research plans, writing and executing code, making plots, and drafting and reviewing a scientific paper. In this work, we describe in detail Denario and its modules, and illustrate its capabilities by presenting multiple AI-generated papers generated by it in many different scientific disciplines such as astrophysics, biology, biophysics, biomedical informatics, chemistry, material science, mathematical physics, medicine, neuroscience and planetary science. Denario also excels at combining ideas from different disciplines.
DOI
10.48550/arXiv.2510.26887