AGI Metrics

32 Items

Metrics that help us predict the advent of AGI

All Items

AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges

Ranjan Sapkota, Konstantinos I. Roumeliotis, Manoj Karkee

DOI: https://doi.org/10.70777/si.v2i3.15161
Comparing Apples to Oranges: A Taxonomy for Navigating the Global Landscape of AI Regulation

Sacha Alanoca, Shira Gur-Arieh, Tom Zick, Kevin Klyman

DOI: https://doi.org/10.70777/si.v2i3.15137
Critical Review: Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Kris Carlson

DOI: https://doi.org/10.70777/si.v2i4.15315
Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents

Jenny Zhang, Shengran Hu, Cong Lu, Robert Lange, Jeff Clune

DOI: https://doi.org/10.70777/si.v2i3.15063
Deliberative Alignment: Reasoning Enables Safer Language Models

Melody Y. Guan, Manas Joglekar, Eric Wallace, Saachi Jain, Boaz Barak, Alec Helyar, Rachel, Andrea Vallone, Hongyu Ren, Jason Wei, Hyung Won Chung, Sam Toyer, Johannes Heidecke, Alex, Amelia Glaese

DOI: https://doi.org/10.70777/si.v2i3.15159
GDPVAL: Evaluating AI Model Performance on Real-World Economically Valuable Tasks

Tejal Patwardhan, Rachel Dias, Elizabeth Proehl, Grace Kim, Michele Wang, Olivia Watkins, Simón Posada Fishman, Marwan Aljubeh, Phoebe Thacker, Laurance Fauconnet, Natalie S. Kim, Patrick Chao, Samuel Miserendino, Gildas Chabot, David Li, Michael Sharman, Alexandra Barr, Amelia Glaese, Jerry Tworek

DOI: https://doi.org/10.70777/si.v2i4.17197
Hardware-Enabled Mechanisms for Verifying Responsible AI Development

Aidan O’Gara, Gabriel, Will Hodgkins, James Petrie, Vincent Immler, Aydin Aysu, Kanad Basu, Shivam Bhasin, Stjepan Picek, Ankur Srivastava

DOI: https://doi.org/10.70777/si.v2i3.15157
Highlights of the Issue: Singapore Consensus – Safety Technology In Progress

Kris Carlson

DOI: https://doi.org/10.70777/si.v2i5.15525
International AI Safety Report 2025: Second Key Update: Technical Safeguards and Risk Management

Yoshua Bengio, Stephen Clare, Carina Prunkl, Maksym Andriushchenko, Ben Bucknall, Philip Fox, Nestor Maslej, Conor McGlynn, Malcolm Murray, Shalaleh Rismani, Stephen Casper, Jessica Newman, Daniel Privitera, Sören Mindermann, Daron Acemoglu, Thomas G. Dietterich, Fredrik Heintz, Geoffrey Hinton, Nick Jennings, Susan Leavy, Teresa Ludermir, Vidushi Marda, Helen Margetts, John McDermid, Jane Munga, Arvind Narayanan, Alondra Nelson, Clara Neppel, Sarvapali D. (Gopal) Ramchurn, Stuart Russell, Marietje Schaake, Bernhard Schölkopf, Alvaro Soto, Lee Tiedrich, Gaël Varoquaux, Andrew Yao, Ya-Qin Zhang

DOI: https://doi.org/10.70777/si.v2i4.16671
International AI Safety Report: First Key Update: Capabilities and Risk Implications

Yoshua Bengio, Benjamin Bucknall, Stephen Clare, Carina Prunkl, Maksym Andriushchenko, Philip Fox, Tiancheng Hu, Cameron Jones, Sam Manning, Nestor Maslej, Vasilios Mavroudis, Conor McGlynn, Malcolm Murray, Shalaleh Rismani, Charlotte Stix, Lucia Velasco, Nicole Wheeler, Daniel Privitera, Sören Mindermann, Daron Acemoglu, Thomas G. Dietterich, Fredrik Heintz, Geoffrey Hinton, Nick Jennings, Susan Leavy, Teresa Ludermir, Vidushi Marda, Helen Margetts, John McDermid, Jane Munga, Arvind Narayanan, Alondra Nelson, Clara Neppel, Sarvapali D. (Gopal) Ramchurn, Stuart Russell, Marietje Schaake, Bernhard Schölkopf, Alvaro Soto, Lee Tiedrich, Gaël Varoquaux, Andrew Yao, Ya-Qin Zhang

DOI: https://doi.org/10.70777/si.v2i6.16253
Measuring AI Agent Autonomy: Towards a Scalable Approach with Code Inspection

Peter Cihon, Merlin Stein, Gagan Bansal, Sam Manning, Kevin Xu

DOI: https://doi.org/10.70777/si.v2i3.15295
Outline: Proposed Zero Draft for a Standard on AI Testing, Evaluation, Verification, and Validation

NIST

DOI: https://doi.org/10.70777/si.v2i5.15513
Pathways to Short Transformative AI Timelines Chapter 3: Short TAI timeline scenarios

Zershaaneh Qureshi

DOI: https://doi.org/10.70777/si.v2i1.13603
Pitfalls of Evidence-Based AI Policy

Stephen Casper, David Krueger, Dylan Hadfield-Menell

DOI: https://doi.org/10.70777/si.v2i2.14611
Review: Large Language Model-Powered AI Systems Achieve Self-Replication with No Human Intervention Xudong Pan (潘旭东), Jiarun Dai (戴嘉润), Yihe Fan (范一禾), Minyuan Luo (罗铭源), Changyi Li (李长艺), Min Yang (杨珉)

Kris Carlson

DOI: https://doi.org/10.70777/si.v2i2.14607
Review: Large Language Models Pass the Turing Test Cameron R. Jones and Benjamin K. Bergen

Kris Carlson

DOI: https://doi.org/10.70777/si.v2i2.14697
Review: Safety at Scale: Comprehensive Survey of Large Model Safety Xingjun Ma, Yifeng Gao, Yixu Wang, Ruofan Wang, Xin Wang, Ye Sun, Yifan Ding, ... Yu-Gang Jiang

Kris Carlson

DOI: https://doi.org/10.70777/si.v2i2.14741
Review: Strategic Patience: Long-Horizon AI Dominance and the Erosion of Human Vigilance Roman Yampolskiy

Kris Carlson

DOI: https://doi.org/10.70777/si.v2i2.14603
The 2025 Foundation Model Transparency Index

Alexander Wan, Kevin Klyman, Sayash Kapoor, Nestor Maslej, Shayne Longpre, Betty Xiong, Percy Liang, Rishi Bommasani

DOI: https://doi.org/10.70777/si.v2i4.17165
The AI Productivity Index (APEX)

Bertie Vidgen, Abby Fennelly, Evan Pinnix, Chirag Mahapatra, Zach Richards, Austin Bridges, Calix Huang, Ben Hunsberger, Fez Zafar, Brendan Foody, Dominic Barton, Cass R. Sunstein, Eric Topol, Osvald Nitski

DOI: https://doi.org/10.70777/si.v2i4.17205
The Asymptotic Intelligence Thesis: Rethinking the Ceiling of AGI Cognition

Jeffrey E. Arle, MD, PhD, FAANS, FCNS

DOI: https://doi.org/10.70777/si.v2i6.16255
The First International AI Safety Report: The International Scientific Report on the Safety of Advanced AI

Yoshua Bengio

DOI: https://doi.org/10.70777/si.v2i2.14755
The Iceberg Index: Measuring Workforce Exposure in the AI Economy

Ayush Chopra, Santanu Bhattacharya, DeAndrea Salvador, Ayan Paul, Teddy Wright, Aditi Garg, Feroz Ahmad, Alice C. Schwarze, Ramesh Raskar, Prasanna Balaprakash

DOI: https://doi.org/10.70777/si.v2i4.17207
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

Parshin Shojaee, Iman Mirzadeh, Keivan Alizadeh, Maxwell Horton, Samy Bengio, Mehrdad Farajtabar

DOI: https://doi.org/10.70777/si.v2i6.15919
The Singapore Consensus on Global AI Safety Research Priorities: Building a Trustworthy, Reliable and Secure AI Ecosystem

Yoshua Bengio, Max Tegmark, Stuart Russell, Dawn Song, Sören Mindermann, Lan Xue, Stephen Casper, Luke Ong, Vanessa Wilfred, Tegan Maharaj, Wan Sie Lee, Ya-Qin Zhang

DOI: https://doi.org/10.70777/si.v2i5.15503
