Cloudflare joined the industrywide race to deploy AI-optimized graphics processing units in the cloud last month.
“Right now, there are members of the Cloudflare team traveling the world with suitcases full of GPUs, installing them throughout our network,” CEO Matthew Prince said during the company’s Q3 2023 earnings call in November.
The cloud service provider had inference-optimized GPUs running in 75 cities and was on schedule to install the chips in 100 cities by the end of the year. Prince aims to make Cloudflare “the most widely distributed cloud-based AI inference platform,” he told CIO Dive in an interview Thursday.
Not all AI workloads are the same. Training LLMs is far more data-intensive than the inferencing that happens when users submit queries to a pre-trained model or the customization that takes place during fine-tuning.
Cloudflare is banking on this distinction as it readies its edge network for an inferencing influx.
“We are not the right place to do training directly on our infrastructure,” Prince said. “For training, what you really want is a big building full of lots of machines with the latest, greatest GPUs.”
Cloudflare’s network of smaller data centers, which extends to 310 cities in over 120 countries, has two key use cases for enterprise customers, Prince said: moving training data closer to hyperscaler GPU clusters and running inference workloads.
“Inference actually requires a different set of GPUs and a different scheduling algorithm than training,” Prince said. “You don't need the latest, greatest GPUs. In fact, inference-tuned GPUs are in much greater supply than some of the other AI chips.”
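The developer-facing side of that pitch is an inference call that never names a region or a GPU. As a rough illustration only (the article does not discuss APIs), here is a minimal sketch of a Cloudflare Worker invoking Workers AI, the inference service Cloudflare launched in September; the binding name `AI` and the model identifier used here are assumptions for the example, not details from the article:

```typescript
// Minimal Cloudflare Worker that forwards a prompt to a model running on
// Cloudflare's inference network. Assumes a Workers AI binding named "AI"
// is configured in wrangler.toml and that the "@cf/meta/llama-2-7b-chat-int8"
// model identifier is available -- both are illustrative choices.

export interface Env {
  AI: { run(model: string, inputs: Record<string, unknown>): Promise<unknown> };
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Take the prompt from a query parameter; a real service would validate it.
    const prompt = new URL(request.url).searchParams.get("prompt") ?? "Hello";

    // The call runs on whichever nearby data center has inference-optimized
    // GPUs available -- the routing is Cloudflare's concern, not the developer's.
    const result = await env.AI.run("@cf/meta/llama-2-7b-chat-int8", { prompt });

    return Response.json(result);
  },
};
```

In this sketch the developer only picks a model and pays per request, which is the “they just pay for the results” experience Prince describes later in the article.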
Training technology
AWS, Microsoft and Google Cloud are scaling for training as generative AI remakes their infrastructure.
The three largest hyperscalers were invested in expanding GPU infrastructure to handle AI workloads prior to the arrival of ChatGPT late last year. But OpenAI’s breakthrough technology has intensified the GPU deployment race and raised the possibility of a chip shortage.
Leading GPU manufacturer Nvidia has established partnerships around AI chips with AWS, Microsoft and Google Cloud. The CSPs are also developing proprietary AI chips as a hedge against scarcity and third-party vendor dependency.
Cloudflare partnered with Nvidia in 2021 to bring GPUs to its edge network and began installing the chipmaker’s full-stack inference servers and software in September. But, as GPUs become a scarce resource, Prince said Cloudflare will shop around for silicon.
“While Nvidia has been a great partner and they are the leader in the space, we anticipate that we will be very promiscuous with the various GPU providers that we use,” Prince said, pointing to Intel, AMD and Qualcomm as potential sources.
“What we want to do is make sure that, across the world, we've got a number of different inference-optimized silicon options so that our customers don't have to think about it at all,” Prince said. “They just pay for the results.”