Vol. 2 No. 1 (2025): Large Language Models I

Accuracy of LLMs across Benchmarks-Humanity's Last Exam

The issue theme is large language models (LLMs) - their capabilities, limitations, benchmarks, methods to ensure their safety and value alignment with humans, and governance. Mercer et al. and Dario Amodei offer perspective on DeepSeek. Dan Hendrycks et al. take a very hard look at the geopolitics of AGI ('a coherent superintelligence strategy'), and Rehman et al. from RAND Corp. critique their proposal. Qureshi presents a framework for analyzing timelines predicting the advent of AGI. Humanity's Last Exam is a vast and open-ended compilation of 2700 questions 'at the frontier of human knowledge' to test AGI knowledge and reasoning capability. The Road to Artificial Superintelligence is a timely survey of safety & alignment methods.

Future of Life Institute's Uuk et al. scanned the literature for practical, effective, broadly-applicable mitigations for AGI risk; their top three: safety incident reports and security information sharing, third-party pre-deployment model audits, and pre-deployment risk assessments. Yoshua Bengio & team describe a computationally-efficient Bayesian ML program designed to assess risk probabilities of an agent's actions at runtime. What safety & alignment policies are actually in use at leading AI companies? See Anthropic's Responsible Scaling Policy.

Modeling and simulation are essential tools for AGI safety & alignment. Nasim et al. offer non-coders an open-source simulator for opinion dynamics researchers to analyze influence propagation and counter-misinformation strategies in social networks that include LLM agents (see FLI's new proposal to ban models with superhuman persuasion capability).

Preston Estep contributes to the theory of mind, examines differences between human and artificial intelligence, and looks at how non-human attributes of AI must be taken into account when predicting AGI risks.

DOI: https://doi.org/10.70777/si.v2i1

Published: 2025-03-06

Articles

Highlights of the Issue: Large Language Models I

Kris Carlson
- PDF
DOI: https://doi.org/10.70777/si.v2i1.14075
Superintelligence Strategy: Expert Version

Dan Hendrycks, Eric Schmidt, Alexandr Wang
- PDF
DOI: https://doi.org/10.70777/si.v2i1.13961
Humanity's Last Exam

Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li
- PDF
DOI: https://doi.org/10.70777/si.v2i1.13973
Pathways to Short Transformative AI Timelines Chapter 3: Short TAI timeline scenarios

Zershaaneh Qureshi
- PDF
DOI: https://doi.org/10.70777/si.v2i1.13603
The Road to Artificial SuperIntelligence: A Comprehensive Survey of Superalignment

HyunJin Kim, Xiaoyuan Yi, JinYeong Bak, Jing Yao, Jianxun Lian, Muhua Huang, Shitong Duan, Xing Xie
- PDF
DOI: https://doi.org/10.70777/si.v2i1.13963
Brief analysis of DeepSeek R1 and its implications for Generative AI

Sarah Mercer, Samuel Spillard, Daniel P. Martin
- PDF
DOI: https://doi.org/10.70777/si.v2i1.11097
Effective Mitigations for Systemic Risks from General-Purpose AI

Risto Uuk, Annemieke Brouwer, Tim Schreier, Noemi Dreksler, Valeria Pulignano, Rishi Bommasani
- PDF
DOI: https://doi.org/10.70777/si.v2i1.13975
Simulating Influence Dynamics with LLM Agents

Mehwish Nasim, Syed Muslim Gilani, Amin Qasmi, Usman Naseem
- PDF
DOI: https://doi.org/10.70777/si.v2i1.13971
Can a Bayesian Oracle Prevent Harm from an Agent?

Yoshua Bengio, Matt McDermott, Michael K. Cohen, Nikolay Malkin, Damiano Fornasiere, Pietro Greiner, Younesse Kaddar
- PDF
DOI: https://doi.org/10.70777/si.v2i1.13799
Multiple unnatural attributes of AI undermine common anthropomorphically biased takeover speculations: Eight Fundamental Differences between Biologically Evolved Humans and Digital AI

Preston Estep
- PDF
DOI: https://doi.org/10.70777/si.v2i1.13801

Commentary

Analysis of FLI’s AI Action Plan Recommendations and Proposed Policy Framework

Gil Syswerda
- PDF
DOI: https://doi.org/10.70777/si.v2i1.14139
Seeking Stability in the Competition for AI Advantage: Commentary on Superintelligence Strategy by Dan Hendrycks, Eric Schmidt, and Alexandr Wang

Iskander Rehman, Karl P. Mueller, Michael J. Mazarr
- PDF
DOI: https://doi.org/10.70777/si.v2i1.14023
On DeepSeek and Export Controls

Dario Amodei
- PDF
DOI: https://doi.org/10.70777/si.v2i1.10695
Anthropic: Responsible Scaling Policy

Evan Hubinger
- PDF
DOI: https://doi.org/10.70777/si.v2i1.13657

Dario Amodei, The Adolescence of Technology: Confronting and Overcoming the Risks of Powerful AI

5 May 2026

Anthropic CEO Dario Amodei envisions a 'country of geniuses in a datacenter' with these five enabling properties: 1) smarter than top humans across most domains, 2) able to act autonomously over long horizons, 3) able to use digital tools, 4) able to coordinate many copies, and 5) able to operate at much higher speed than humans. You could use these as a checklist to develop such a system, as Anthropic probably does. #5 is true in some domains (real-time learning being a counter-example), #3 is well in progress, and #1, #2, and #4 are still significant obstacles. He does not mention, e.g., robust generalization, abstraction, and world models. The essay discusses risks, governance, and economic implications. The essay’s overall thesis is: AI’s upside remains enormous, but humanity must treat the next few years as a civilizational test requiring technical alignment work, pragmatic regulation, geopolitical realism, economic adaptation, and moral seriousness.

Steve Omohundro: Regulating AGI: From Liability to Provable Contracts

18 November 2025

AGI will render today's liability-based AI regulation obsolete through its ability to circumvent cybersecurity, hide its origins, and act strategically—but it will also enable a new regulatory paradigm based on mathematically provable contracts.

Joe Rogan Experience #2345 - Roman Yampolskiy

24 September 2025

SuperIntelligence co-founding editor Roman Yampolskiy was interviewed at length on the Joe Rogan Experience. Over 800,000 views.

Steve Omohundro Receives 2024 Future of Life Award

24 September 2025

SuperIntelligence co-founding editor Steve Omohundro was one of three recipients of the prestigious 2024 Future of Life Award, which recognizes seminal contributions to AI safety: "...for laying the foundation of modern ethics and safety considerations for artificial intelligence and computers."

Steve Omohundro and Scientists Discuss the AI Alignment Problem with Neil deGrasse Tyson

24 September 2025

In a panel hosted by Neil deGrasse Tyson, our co-founding editor Steve Omohundro discusses the AI alignment problem, starting at ~23:29.
