Safety, Alignment & Ethics

Dario Amodei, The Adolescence of Technology: Confronting and Overcoming the Risks of Powerful AI

5 May 2026

Anthropic CEO Dario Amodei envisions a 'country of geniuses in a datacenter' with five enabling properties: 1) smarter than top humans across most domains, 2) able to act autonomously over long horizons, 3) able to use digital tools, 4) able to coordinate many copies, and 5) able to operate at much higher speed than humans. You could use these as a checklist to develop such a system, as Anthropic probably does. #5 is true in some domains (real-time learning being a counter-example), #3 is well in progress, and #1, #2, and #4 remain significant obstacles. He does not mention, e.g., robust generalization, abstraction, or world models. The essay also discusses risks, governance, and economic implications. Its overall thesis is that AI’s upside remains enormous, but humanity must treat the next few years as a civilizational test requiring technical alignment work, pragmatic regulation, geopolitical realism, economic adaptation, and moral seriousness.

Steve Omohundro: Regulating AGI: From Liability to Provable Contracts

18 November 2025

AGI will render today's liability-based AI regulation obsolete through its ability to circumvent cybersecurity, hide its origins, and act strategically—but it will also enable a new regulatory paradigm based on mathematically provable contracts.

Joe Rogan Experience #2345 - Roman Yampolskiy

24 September 2025

SuperIntelligence co-founding editor Roman Yampolskiy is interviewed at length on the Joe Rogan Experience. Over 800,000 views.

Steve Omohundro Receives 2024 Future of Life Award

24 September 2025

SuperIntelligence co-founding editor Steve Omohundro was one of three recipients of the prestigious 2024 Future of Life Award, which recognizes seminal contributions to AI safety: "...for laying the foundation of modern ethics and safety considerations for artificial intelligence and computers."

Steve Omohundro and Scientists Discuss the AI Alignment Problem with Neil deGrasse Tyson

24 September 2025

In a discussion hosted by Neil deGrasse Tyson, our co-founding editor Steve Omohundro addresses the AI alignment problem, starting at ~23:29.

All Items

Outline: Proposed Zero Draft for a Standard on AI Testing, Evaluation, Verification, and Validation

America's AI Action Plan Winning the Race

The Singapore Consensus on Global AI Safety Research Priorities Building a Trustworthy, Reliable and Secure AI Ecosystem

Measuring AI Agent Autonomy: Towards a Scalable Approach with Code Inspection

Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation

AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges

Deliberative Alignment: Reasoning Enables Safer Language Models

Hardware-Enabled Mechanisms for Verifying Responsible AI Development

Trends in Frontier AI Model Count: A Forecast to 2028

Comparing Apples to Oranges: A Taxonomy for Navigating the Global Landscape of AI Regulation

Timeline to Artificial General Intelligence 2025 – 2030+

Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents

Review: Addressing the challenges of harmonizing law and artificial intelligence technology in modern society Lamprini Seremeti, Sofia Anastasiadou, Andreas Masouras, Stylianos Papalexandris

The Perilous State of AI Governance, June 2025

The First International AI Safety Report The International Scientific Report on the Safety of Advanced AI

Review: Safety at Scale: Comprehensive Survey of Large Model Safety Xingjun Ma, Yifeng Gao, Yixu Wang, Ruofan Wang, Xin Wang, Ye Sun, Yifan Ding, ... Yu-Gang Jiang

Review: Large Language Models Pass the Turing Test Cameron R. Jones and Benjamin K. Bergen

Review: Strategic Patience: Long-Horizon AI Dominance and the Erosion of Human Vigilance Roman Yampolskiy

Strategic Patience: Long-Horizon AI Dominance and the Erosion of Human Vigilance

LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures

A Framework for the Private Governance of Frontier Artificial Intelligence

Review: On Regulating Downstream AI Developers Sophie Williams, Jonas Schuett, Markus Anderljung

Review: AI Governance through Markets Philip Moreira Tomei, Rupal Jain, Matija Franklin

Review: Large Language Model-Powered AI Systems Achieve Self-Replication with No Human Intervention Xudong Pan (潘旭东), Jiarun Dai (戴嘉润), Yihe Fan (范一禾), Minyuan Luo (罗铭源), Changyi Li (李长艺), Min Yang (杨珉)

Pitfalls of Evidence-Based AI Policy
