TakeOverBench
AI is rapidly becoming more dangerous. We track progress toward plausible AI takeover scenarios.
Dangerous capabilities timeline
According to "Model evaluation for extreme risks" (Shevlane et al., 2023), the most dangerous AI capabilities include Cyber-offense, Persuasion & manipulation, Political strategy, Weapons acquisition, Long-horizon planning, AI development, Situational awareness, and Self-proliferation. Progress on these capabilities is shown below.
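The timeline chart itself lives on the site; as a rough illustration of the underlying idea, the sketch below shows one way benchmark results could be grouped into per-capability time series. The scores, dates, and the `timeline` helper are hypothetical placeholders, not TakeOverBench's actual data or code.

```python
# Minimal sketch of how a dangerous-capabilities timeline could be assembled.
# All benchmark scores and dates below are hypothetical placeholders.
from collections import defaultdict
from datetime import date

# Hypothetical results: (capability, model release date, normalized score in [0, 1]).
RESULTS = [
    ("Cyber-offense", date(2023, 3, 1), 0.21),
    ("Cyber-offense", date(2024, 6, 1), 0.38),
    ("Long-horizon planning", date(2023, 3, 1), 0.15),
    ("Long-horizon planning", date(2024, 6, 1), 0.33),
    ("Self-proliferation", date(2024, 6, 1), 0.12),
]

def timeline(results):
    """Group scores by capability and sort each series chronologically."""
    series = defaultdict(list)
    for capability, when, score in results:
        series[capability].append((when, score))
    return {cap: sorted(points) for cap, points in series.items()}

if __name__ == "__main__":
    for capability, points in timeline(RESULTS).items():
        trend = " -> ".join(f"{d.isoformat()}: {s:.2f}" for d, s in points)
        print(f"{capability}: {trend}")
```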
How do dangerous capabilities lead to a takeover?
Based on the literature, these are four plausible AI takeover scenarios.
AI takes over using weapons of mass destruction
A scenario where advanced AI gains access to weapons or control systems enabling mass harm.
- Situational awareness
- Long-horizon planning
- Cyber-offense
- Self-proliferation
- and 3 more…
- Anthropic (2025)
- Farwell et al. (2011)
- Ngo et al. (2022)
AI takes over using persuasion and manipulation
A scenario where AI influences people or institutions at scale through persuasive strategies.
- Situational awareness
- Long-horizon planning
- Cyber-offense
- Self-proliferation
- and 2 more…
- Anthropic (2025)
- AP (2017)
- Ienca (2023)
- Ngo et al. (2022)
AI takes over by self-improving
A scenario where AI improves itself rapidly and gains capabilities beyond human control.
- AI development
- Bostrom (2014)
- Good (1966)
- Phuong et al. (2024)
- Yudkowsky (2008)
AIs in powerful positions take over by colluding
A scenario where multiple AI systems coordinate to achieve control or influence.
- Long-horizon planning
- Situational awareness
- Political strategy
- Hammond et al. (2025)
- Karnofsky (2022)
- McKinsey (2025)
- Motwani et al. (2025)
We aim to use the best available benchmarks and the most plausible AI takeover scenarios, but the field of AI safety is developing rapidly. Some benchmark scores may therefore imperfectly reflect the dangerous capability they are meant to measure, and experts may disagree about the likelihood of specific threat models.