Vol. 2 No. 1 (2025): Large Language Models I

Accuracy of LLMs across Benchmarks: Humanity's Last Exam

The issue theme is large language models (LLMs): their capabilities, limitations, benchmarks, methods to ensure their safety and value alignment with humans, and governance. Mercer et al. and Dario Amodei offer perspectives on DeepSeek. Dan Hendrycks et al. take a very hard look at the geopolitics of AGI ('a coherent superintelligence strategy'), and Rehman et al. from RAND Corp. critique their proposal. Qureshi presents a framework for analyzing timelines predicting the advent of AGI. Humanity's Last Exam is a vast and open-ended compilation of 2700 questions 'at the frontier of human knowledge' to test AGI knowledge and reasoning capability. The Road to Artificial Superintelligence is a timely survey of safety & alignment methods.

Future of Life Institute's Uuk et al. scanned the literature for practical, effective, broadly applicable mitigations for AGI risk; their top three: safety incident reports and security information sharing, third-party pre-deployment model audits, and pre-deployment risk assessments. Yoshua Bengio & team describe a computationally efficient Bayesian ML program designed to assess risk probabilities of an agent's actions at runtime. What safety & alignment policies are actually in use at leading AI companies? See Anthropic's Responsible Scaling Policy.

Modeling and simulation are essential tools for AGI safety & alignment. Nasim et al. offer an open-source simulator, usable by non-coders, with which opinion dynamics researchers can analyze influence propagation and counter-misinformation strategies in social networks that include LLM agents (see FLI's new proposal to ban models with superhuman persuasion capability).

Preston Estep contributes to the theory of mind, examines differences between human and artificial intelligence, and looks at how non-human attributes of AI must be taken into account when predicting AGI risks.

Published: 2025-03-06

Articles

  • Highlights of the Issue: Large Language Models I

    Kris Carlson
    DOI: https://doi.org/10.70777/si.v2i1.14075
  • Superintelligence Strategy: Expert Version

    Dan Hendrycks, Eric Schmidt, Alexandr Wang
    DOI: https://doi.org/10.70777/si.v2i1.13961
  • Humanity's Last Exam

    Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li
    DOI: https://doi.org/10.70777/si.v2i1.13973
  • Pathways to Short Transformative AI Timelines, Chapter 3: Short TAI timeline scenarios

    Zershaaneh Qureshi
    DOI: https://doi.org/10.70777/si.v2i1.13603
  • The Road to Artificial SuperIntelligence: A Comprehensive Survey of Superalignment

    HyunJin Kim, Xiaoyuan Yi, JinYeong Bak, Jing Yao, Jianxun Lian, Muhua Huang, Shitong Duan, Xing Xie
    DOI: https://doi.org/10.70777/si.v2i1.13963
  • Brief analysis of DeepSeek R1 and its implications for Generative AI

    Sarah Mercer, Samuel Spillard, Daniel P. Martin
    DOI: https://doi.org/10.70777/si.v2i1.11097
  • Effective Mitigations for Systemic Risks from General-Purpose AI

    Risto Uuk, Annemieke Brouwer, Tim Schreier, Noemi Dreksler, Valeria Pulignano, Rishi Bommasani
    DOI: https://doi.org/10.70777/si.v2i1.13975
  • Simulating Influence Dynamics with LLM Agents

    Mehwish Nasim, Syed Muslim Gilani, Amin Qasmi, Usman Naseem
    DOI: https://doi.org/10.70777/si.v2i1.13971
  • Can a Bayesian Oracle Prevent Harm from an Agent?

    Yoshua Bengio, Matt McDermott, Michael K. Cohen, Nikolay Malkin, Damiano Fornasiere, Pietro Greiner, Younesse Kaddar
    DOI: https://doi.org/10.70777/si.v2i1.13799
  • Multiple unnatural attributes of AI undermine common anthropomorphically biased takeover speculations: Eight Fundamental Differences between Biologically Evolved Humans and Digital AI

    Preston Estep
    DOI: https://doi.org/10.70777/si.v2i1.13801

Commentary

  • Analysis of FLI’s AI Action Plan Recommendations and Proposed Policy Framework

    Gil Syswerda
    DOI: https://doi.org/10.70777/si.v2i1.14139
  • Seeking Stability in the Competition for AI Advantage: Commentary on Superintelligence Strategy by Dan Hendrycks, Eric Schmidt, and Alexandr Wang

    Iskander Rehman, Karl P. Mueller, Michael J. Mazarr
    DOI: https://doi.org/10.70777/si.v2i1.14023
  • On DeepSeek and Export Controls

    Dario Amodei
    DOI: https://doi.org/10.70777/si.v2i1.10695
  • Anthropic: Responsible Scaling Policy

    Evan Hubinger
    DOI: https://doi.org/10.70777/si.v2i1.13657