Tesla has revealed its investment in a massive compute cluster comprising 10,000 Nvidia H100 GPUs, chips designed specifically to power AI workloads.
The system, which went online this week, is designed to process the mountains of data that Tesla's fleet of vehicles collects, with a view to accelerating the development of fully self-driving vehicles, according to its leader of AI infrastructure, Tim Zaman.
Tesla has been striving for years to reach the point at which its vehicles can be considered entirely autonomous and has invested more than a billion dollars in the infrastructure needed to make this possible.
Tesla supercomputer
In July 2023, CEO Elon Musk revealed the firm would invest $1 billion into building out its Dojo supercomputer over the next year. Dojo, which is based on Tesla’s own tech, began with the D1 chip, fitted with 354 custom CPU cores. Each training tile module comprises 25 D1 chips, with the base Dojo V1 configuration including 53,100 D1 cores in total.
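The Dojo figures above can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers quoted in this article; the variable names are illustrative, and the chip and tile counts it derives are implied by those numbers rather than stated by Tesla.

```python
# Figures quoted in the article.
CORES_PER_D1_CHIP = 354      # custom CPU cores per D1 chip
D1_CHIPS_PER_TILE = 25       # D1 chips per training tile
TOTAL_D1_CORES = 53_100      # cores in the base Dojo V1 configuration

# Derive how many chips and tiles those totals imply.
implied_chips, leftover_cores = divmod(TOTAL_D1_CORES, CORES_PER_D1_CHIP)
implied_tiles, leftover_chips = divmod(implied_chips, D1_CHIPS_PER_TILE)

print(f"Implied D1 chips: {implied_chips} (leftover cores: {leftover_cores})")
print(f"Implied training tiles: {implied_tiles} (leftover chips: {leftover_chips})")
```

On these numbers the totals divide evenly, which suggests the quoted core count corresponds to a whole number of training tiles.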
The firm also built a compute cluster fitted with 5,760 Nvidia A100 GPUs in June 2021. But the firm’s latest investment in 10,000 of Nvidia’s H100 GPUs dwarfs the power of this supercomputer.
This AI cluster, worth more than $300 million, will offer a peak performance of 340 FP64 PFLOPS for technical computing and 39.58 INT8 ExaFLOPS for AI applications, according to Tom’s Hardware.
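The headline performance figures appear to be straightforward multiples of per-GPU peak ratings. The Python sketch below shows that arithmetic; the per-GPU values (34 TFLOPS for FP64 and 3,958 TOPS for INT8 with sparsity) are assumptions taken from Nvidia's published H100 SXM specifications, not from Tesla or this article, and real workloads rarely reach theoretical peak.

```python
GPU_COUNT = 10_000  # from the article

# Assumed per-GPU peak ratings (Nvidia H100 SXM datasheet values).
FP64_TFLOPS_PER_GPU = 34.0     # assumption: FP64 vector peak, in teraFLOPS
INT8_TOPS_PER_GPU = 3_958.0    # assumption: INT8 Tensor Core peak with sparsity, in teraOPS

TERA_PER_PETA = 1_000
TERA_PER_EXA = 1_000_000

# Theoretical aggregate peak: per-GPU rating multiplied by GPU count.
cluster_fp64_pflops = GPU_COUNT * FP64_TFLOPS_PER_GPU / TERA_PER_PETA
cluster_int8_exaops = GPU_COUNT * INT8_TOPS_PER_GPU / TERA_PER_EXA

print(f"FP64 peak: {cluster_fp64_pflops:.0f} PFLOPS")
print(f"INT8 peak: {cluster_int8_exaops:.2f} exa-operations per second")
```

Under these assumptions the results match the 340 PFLOPS and 39.58 figures quoted above. Note that INT8 throughput counts integer operations, so it is usually expressed in OPS rather than FLOPS.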
The power at Tesla’s disposal is actually more than that offered by the Leonardo supercomputer, the publication pointed out, making it one of the most powerful computers on the planet.
Nvidia’s chips are the components that power many of the world’s leading generative AI platforms. These GPUs, which are fitted into servers, have several other use cases, ranging from medical imaging to generating weather models.
Tesla is hoping to use the power of these GPUs to churn through its vast quantities of data more efficiently and effectively, and so build a model that can successfully rival a human driver.
While many businesses would usually lean on infrastructure hosted by the likes of Google or Microsoft, Tesla’s supercomputing infrastructure is all on-prem, meaning the firm will also have to maintain all of it.