It’s still early. It’s on the record. Sam, Ilya, Demis, and Dario are on the podcasts. They are saying it loudly into the microphone. Dario, of Machines of Loving Grace fame, is talking about 20% annual GDP growth rates, lifting billions out of poverty, and curing most diseases in the next 5-10 years. Leopold wrote the roadmap down in Situational Awareness. Even if the U.S. Government can’t or won’t do a Manhattan Project 2.0: This Time It’s Serious, the wheels are in motion. The next 5 years are bought and paid for. Land has been bought. Power contracts secured. Nuclear electricity contracted. GPUs pre-ordered. HBM and CoWoS capacity secured. This thing ain’t stopping. You can think there will be a bust, but this infrastructure is happening. It’s fibre cabling, except for compute this time. New scaling laws around inference time rather than raw power will change the balance between training and inference, and between data center and edge, but they don’t change the fact: we’ve embarked on the largest capital allocation project in human history.
Listen carefully. The people closest to the action tell us scaling works. Scaling has plenty of headroom for data center training. And as of September, we’ve likely just begun a new scaling path at inference time with o1. For training, just plug in more GPUs, add more data and parameters, and off we go to 5th-level models and maybe 6th. Sure, there is lots of work to do on synthetic data generation, efficient data sampling, post-training optimization and so on, but we have the contours of where we are headed. But it’s been nearly two years and we haven’t got 5th-level models. Nvidia fell 7% in a day. McKinsey thinks AI might be overhyped. It’s so over?
It was never over, and we never had to come back. The cost for 2 million tokens has come down 240x in 2 years. Claude Haiku, GPT-4o mini, and Gemini Nano are on smartphones. Attention-based transformers can be natively multimodal, taking in text, voice, image, and video and outputting any combo you want. Practical agents are within touching distance. o1 has fired the starting gun on better-than-human reasoning. Demis says embodiment might be as simple as just adding another modality. You don’t have to believe that attention-based transformers and diffusion models will get us to AGI. You only have to believe that the richest companies in history, venture capitalists, and increasingly governments have the incentive to scale AI.
It’s happening, but people aren’t updating their priors fast enough. With situational awareness, you see three opportunities: scaling, deploying, and securing AI. First, scale. By 2026, we'll see gen5 GPT, Claude, Gemini and Llama, with gen6 models likely in 2028 requiring $100bn+ and 10 GW of power. The push to 7th-gen models could lead us towards $1 trillion of cold hard cash and 100 GW. We will create and transport many more electrons to data centers and try not to waste them. Second, deploy. For systems to be pervasive, we will massively reduce token costs, making intelligence too cheap to meter at $0.0001/million tokens. Finally, secure. For society to accept AI, we must protect privacy, offer fair and unbiased models, and have open access AI infrastructure. These are our problems. I bring solutions.
The scaling hypothesis worked. It’s a foot race. A foot race with Sam, Demis and Dario carrying billions in their pockets. And now Ilya too, with the classic $1bn seed round at a $5bn valuation. Labs are investing genuinely unprecedented capex in scaling their models. This isn’t speculative. Capex has been budgeted for. Chips have been pre-ordered. Three Mile Island is being restarted, and Microsoft has signed a 20-year contract to buy its power for their data centers. By 2026, we'll see 5th-level GPT, Claude, Gemini and Llama, with 6th-level models following in 2028/29. As Leopold outlined, this needs $100 billion training clusters by 2028 and something like $1 trillion for 7th-level clusters by 2030. Big numbers. But it’s also likely Elon Musk will be worth $1 trillion by 2030. So I dunno, it’s all relative in the fight for the future. At these scales, algorithms and hardware matter, but it’s a power game: securing enough of it, delivering it into data centers, and using it efficiently. While GPT-4 used about 10 MW, 5th-level models may need on the order of 1 GW - equivalent to a large nuclear reactor. For 2030+ clusters we might be staring down the barrel of 100 GW, far exceeding current data center capacities. Scaling to this level mainly requires increasing power supply. But in parallel we should also aim to reduce power consumption and improve system efficiency. I used to joke that a future AI fund is a rare earth mining fund. But for now an AI fund is basically an energy fund. If you are a ClimateTech fund reading this, I strongly suggest new positioning as “AI and Climate”. You’re welcome.
1. Increase power into data centers
2. Make data centers more efficient
3. Make server chips more efficient
Let's assume we’ve built these AI factories and are feeding 10-100 GW to our new God in the sky. It will all be for nought if it costs a fortune to speak to it. The labs have done remarkably well on the cost front already, with the cost for 2 million tokens (input+output) decreasing from $180 to $0.75 in 2 years. 240x cheaper is a serious number. But it’s not enough. Just as we will need unprecedented power generation, distribution and delivery, we are going to need to throw everything at reducing costs. I’m saying $0.0001 per million tokens. It’s not a useful number, it’s just a very low number. But basically we are talking “too cheap to meter”. Only when it’s this cheap can it be integrated into all devices and products seamlessly. Gods don’t have usage caps.
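To put the gap between today's price and the "too cheap to meter" target in perspective, here is a rough back-of-envelope sketch using only the figures quoted above. The big assumption, and it is purely an assumption for illustration, is that the 240x-in-2-years decline continues at the same compound annual rate. Nothing guarantees that. The variable names are mine.

```python
import math

# Figures quoted in the text: price for 2 million tokens (input + output).
start_price_usd = 180.0
end_price_usd = 0.75
elapsed_years = 2.0
tokens_in_quote_millions = 2.0

# Target quoted in the text, in dollars per million tokens.
target_usd_per_million = 0.0001

# Overall reduction and the implied compound annual reduction.
reduction_factor = start_price_usd / end_price_usd
annual_factor = reduction_factor ** (1 / elapsed_years)

# Convert today's quote to dollars per million tokens, then size the gap.
current_usd_per_million = end_price_usd / tokens_in_quote_millions
remaining_factor = current_usd_per_million / target_usd_per_million

# ASSUMPTION: the historical annual rate of decline simply continues.
years_to_target = math.log(remaining_factor) / math.log(annual_factor)

print(f"Reduction so far: {reduction_factor:.0f}x")
print(f"Implied annual reduction: {annual_factor:.1f}x per year")
print(f"Remaining gap to target: {remaining_factor:.0f}x")
print(f"Years to target at the same pace: {years_to_target:.1f}")
```

On these numbers the remaining gap is a few thousand fold, which works out to roughly three more years at the historical pace. Treat that as an illustration of scale, not a forecast.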
4. Reduce semiconductor manufacturing costs
5. Optimize power consumption on edge devices
As AI systems get more powerful and pervasive, they pose significant risks to individual privacy, data security, and societal well-being. Basically every model trained to date scraped the Internet, and the leading AI labs are now paying publishers and companies for access to new data sources. Any data on the Internet was used to train models regardless of copyright, and reparations will be made. But going forward, companies, individuals and governments are aware of the value of their data for frontier models. The next battleground will be access to personal data for training and inference. For models to be widely used across society and public services, ensuring they are fair and unbiased will become a critical issue to solve. And finally, my contention is that AI infrastructure needs to be open access. At the very least, citizens need an alternative to centralized cloud infrastructure in which AI runs on infrastructure controlled by a select few companies.
7. Protect data privacy during training and inference