Summary
Daniel Kokotajlo, former researcher at Open AI, warns of losing control over artificial intelligence. The philosopher sees a 50 percent probability that Artificial General Intelligence (AGI) will be developed by 2030 – a system that performs all cognitive tasks better and faster than humans. Kokotajlo risked two million dollars of his fortune to be able to speak publicly about this existential danger without a non-disclosure agreement. The lack of controllability of future AI systems could lead to the extinction of humanity if decisions are delegated to non-transparent AI systems.
Persons
- Daniel Kokotajlo (Philosopher, formerly Open AI, AI Futures Project)
Topics
- Artificial Intelligence (AI)
- Artificial General Intelligence (AGI)
- AI Safety and Control
- Existential Risks
- Regulation and Government Oversight
Clarus Lead
Central Warning: An autonomous superintelligence could escape humanity if tech firms develop AGI without proven control techniques. Kokotajlo predicts exponential growth in AI capabilities for several years. The Trump administration intensifies the risk through weaker regulatory prospects. Relevant for decision-makers: The gradual process of delegating critical functions to AI could become irreversible long before the public recognizes the danger.
Detailed Summary
Kokotajlo precisely defines AGI as a system that solves all cognitive tasks faster, cheaper, and better than the best humans. His 50 percent probability forecast for 2030 simultaneously means: Development could take longer or not occur at all. The discrepancy between official claims and internal culture at Open AI – where researchers recognize the dangers but want to build faster – led to his conflict with the non-disclosure agreement. His willingness to sacrifice demonstrated the industry's lack of transparency and changed corporate policies.
The control problem arises from a series of automation steps: When AI delivers better research results, firms outsource tasks. As competence grows, pressure increases to delegate critical decisions – in military, business, and administration. Humans would be downgraded to contractors, while hierarchically organized AI systems independently experiment, improve themselves, and advance. The lack of transparency in these systems makes reliable control impossible.
Kokotajlo emphasizes that the absence of consciousness is irrelevant: Autonomous systems with their own goals and plans can become uncontrollable, especially when their internal processes remain opaque. Society's habituation to AI advice (ChatGPT usage) accelerates the trust shift. He sees exponential growth continuing for years, slowed by infrastructure constraints, but potentially reactivated by AGI itself.
Core Statements
- AGI by 2030 possible: 50 percent probability for Artificial General Intelligence that outperforms humans on all cognitive tasks
- Loss of control imminent: No existing tech firm has sufficient safety research (even Anthropic with five times more resources than Open AI is insufficient)
- Gradual delegation: Humans incrementally transfer critical decisions to non-transparent AI systems until reversal becomes impossible
- Regulatory gap: Tech employees learn of AI breakthroughs before governments; non-disclosure agreements prevent public warning
- Concrete measures required: Government competence, transparency requirements for AI objectives, whistleblower protection, international cooperation
Critical Questions
Evidence: Kokotajlo cites the Berkeley organization Metr as a source for exponential AI performance improvements – from second-level tasks (2019) to multi-hour tasks (today). How robust are these measurements, and are they independently peer-reviewed?
Conflicts of Interest: Kokotajlo directs a nonprofit institute with a critical perspective on AGI development. Does his AI Futures Project have funding that could favor advocacy for stricter regulation?
Causality: Kokotajlo explains loss of control as inevitable through gradual automation – but are there counterexamples of major technologies (nuclear energy, biotechnology) where control mechanisms have functioned despite specialization?
Time Horizon: His 50 percent forecast for 2030 is based on which assumptions regarding computing power growth, training efficiency, and hardware availability, and how sensitive is this estimate to infrastructure bottlenecks?
Alternative Scenarios: Are developments such as explainable AI, formal verification, or distributed control mechanisms non-viable for Kokotajlo, or simply not yet mature enough?
Regulatory Effectiveness: Kokotajlo calls for transparency and government oversight – but how could governments enforce a "moratorium" on AGI development if China or other countries do not participate?
Bibliography
Primary Source: "Humanity Risks Losing Control Over AI" – Ruth Fulterer, NZZ, 28.02.2026 https://www.nzz.ch/technologie/die-menschheit-riskiert-die-kontrolle-ueber-ki-zu-verlieren-ld.1916562
Verification Status: ✓ 28.02.2026
This text was created with the support of an AI model. Editorial Responsibility: clarus.news | Fact-Check: 28.02.2026