Guan, Melody Y., Manas Joglekar, Eric Wallace, Saachi Jain, Boaz Barak, Alec Helyar, Rachel Dias, et al. “Deliberative Alignment: Reasoning Enables Safer Language Models.” SuperIntelligence - Robotics - Safety & Alignment 2, no. 3 (July 20, 2025). Accessed May 18, 2026. https://s-rsa.com/index.php/agi/article/view/15159.