On DeepSeek and Export Controls
DOI: https://doi.org/10.70777/si.v2i1.10695

Keywords: agi governance, agi risks, artificial general intelligence, superintelligence, ai scaling, agi architecture, ai pretrained model

Abstract
DeepSeek does not "do for $6M what cost US AI companies billions". I can only speak for Anthropic, but Claude 3.5 Sonnet is a mid-sized model that cost a few $10M's to train (I won't give an exact number). Also, 3.5 Sonnet was not trained in any way that involved a larger or more expensive model (contrary to some rumors). Sonnet's training was conducted 9-12 months ago, while DeepSeek's model was trained in November/December; yet Sonnet remains notably ahead in many internal and external evals. Thus, I think a fair statement is "DeepSeek produced a model close to the performance of US models 7-10 months older, for a good deal less cost (but not anywhere near the ratios people have suggested)".
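The cost comparison above can be put in rough numbers. A minimal back-of-envelope sketch, using only figures mentioned in the abstract: the $6M DeepSeek figure and "a few $10M's" for Claude 3.5 Sonnet are from the text (the Sonnet range below is an assumed reading of "a few $10M's"); the "$1B" is a hypothetical lower bound for the rebutted "billions" claim.

```python
# Back-of-envelope cost ratios; all specific values beyond $6M are
# illustrative assumptions, not figures confirmed by the author.
deepseek_cost = 6e6                           # "$6M" from the text
sonnet_low, sonnet_high = 3e7, 8e7            # assumed range for "a few $10M's"
claimed_us_cost = 1e9                         # lower bound of "billions"

actual_low = sonnet_low / deepseek_cost       # actual ratio, low end
actual_high = sonnet_high / deepseek_cost     # actual ratio, high end
claimed = claimed_us_cost / deepseek_cost     # ratio implied by the claim

print(f"plausible actual ratio: {actual_low:.0f}x-{actual_high:.0f}x")
print(f"claimed ratio: at least {claimed:.0f}x")
```

Under these assumptions the real gap is on the order of 5x-13x, far from the 100x-plus implied by "billions vs. $6M", which is the point of the passage.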
License
Copyright (c) 2025 Dario Amodei

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.