Japanese startup Sakana has claimed that its AI produced the first peer-reviewed scientific paper. While the assertion holds some truth, it comes with important caveats.
The discourse regarding AI’s function in the realm of scientific research intensifies with each passing day. While numerous researchers argue that AI is not yet equipped to act as a “co-scientist,” others see potential, albeit noting that we’re still in the early stages.
Sakana aligns with the latter perspective.
The firm stated that it used an AI framework called The AI Scientist-v2 to generate a manuscript, which Sakana then submitted to a workshop at ICLR, a prominent AI conference. Sakana says the workshop's organizers and ICLR leadership agreed to work with the company to conduct a double-blind review of AI-generated manuscripts.
The company reported collaborating with researchers from the University of British Columbia and the University of Oxford to submit three AI-generated manuscripts for peer review at the workshop. According to Sakana, The AI Scientist-v2 produced the papers end to end: formulating the scientific hypotheses, conducting the experiments, writing the code, analyzing the data, visualizing the results, and drafting the text and titles.
“We provided the workshop abstract and details as input for the AI to generate relevant research ideas,” Robert Lange, a research scientist and founding member at Sakana, shared with TechCrunch via email. “This approach ensured that the generated papers were on topic and appropriate submissions.”
Of the three submissions, one — a paper that takes a critical look at training techniques for AI models — was accepted to the ICLR workshop. However, in a gesture of transparency and to adhere to ICLR conventions, Sakana withdrew the paper before publication.

“The accepted manuscript introduces an innovative technique for training neural networks and highlights ongoing empirical challenges,” Lange remarked. “It serves as an intriguing data point to inspire further scientific inquiry.”
Nevertheless, this milestone may not be as remarkable as it initially appears.
In a blog post, Sakana conceded that its AI occasionally produced “embarrassing” errors in citations, such as mistakenly crediting a 2016 paper instead of the original work from 1997.
Sakana’s paper also didn’t face the same scrutiny as some other peer-reviewed work. Because the company withdrew its submission after the initial review, the paper skipped an additional “meta-review,” at which point the workshop organizers could in principle have rejected it.
Additionally, it’s worth noting that acceptance rates for conference workshops typically exceed those for the main “conference track” — a point Sakana acknowledges in its blog post. The company also mentioned that none of its AI-generated studies met their internal criteria for publication in the ICLR conference track.
Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, labeled Sakana’s findings as “somewhat misleading.”
“The Sakana team selected the papers from a set of generated options, indicating that human judgment was employed in choosing outputs deemed most likely to succeed,” he wrote in an email. “What this demonstrates is that a collaboration between humans and AI can yield effective results, rather than suggesting that AI alone can advance scientific discovery.”
Mike Cook, a research fellow at King’s College London with a focus on AI, expressed skepticism regarding the rigor of the peer review process employed by the workshop.
“New workshops, like this one, are frequently reviewed by less experienced researchers,” he remarked to TechCrunch. “It’s also important to note that this workshop is centered on negative results and challenges — which is beneficial, as I have moderated a similar workshop — but it’s arguably simpler for an AI to convincingly articulate a failure.”
Cook further noted that he was not surprised an AI would succeed in passing peer review, given that AI excels in creating human-like writing. The phenomenon of partially-AI-generated papers clearing journal evaluations is not new, nor are the ethical dilemmas that accompany it within the scientific arena.
The technical limitations of AI — such as its propensity for hallucination — make many scientists hesitant to endorse it for significant tasks. Additionally, experts are concerned that AI could simply add to the noise in scientific literature rather than facilitate progress.
“We must question whether [Sakana’s] outcome speaks to the ability of AI in designing and conducting experiments or if it merely showcases how proficient it is in persuading humans — something we know AI excels at already,” Cook concluded. “There’s a distinction between successfully passing peer review and genuinely contributing knowledge to a field.”
Sakana has not claimed that its AI is capable of producing groundbreaking or exceptionally novel scientific discoveries. Instead, the objective of this experiment was to “examine the quality of AI-generated research,” according to the company, while also emphasizing the immediate need for establishing “norms surrounding AI-generated science.”
“There are challenging questions regarding whether AI-generated science should be evaluated on its own merits to prevent bias against it,” the company stated. “Going forward, we intend to maintain discussions with the research community regarding the advancement of this technology, ensuring that it does not devolve into a scenario where its primary aim is merely to pass peer review, thereby significantly undermining the essence of the scientific peer review process.”
Compiled by Techarena.au.