As enterprise cloud demand surges and power-hungry generative AI workloads devour compute resources, big tech has embarked on a data center building spree.
The buildouts are anchored in an already growing market for cloud resources. With only a minimal boost from AI, Gartner estimates enterprises spent $500 billion globally on cloud infrastructure and services in 2023. This year, the analyst firm expects cloud spend to approach $700 billion, as generative AI infrastructure investments help to push total IT spend to over $5 trillion.
Public cloud’s shared compute model is inherently more efficient than enterprise-run, on-premises data centers. But cloud’s scalability opens the floodgates on compute, a digital commodity that consumes natural resources.
The impact on enterprise IT remains unclear, but CIOs already grappling with cloud costs and potential GPU shortages may be facing a sustainability threat as they ramp up generative AI projects.
Sourcing infrastructure isn’t an immediate concern. A cloud construction boom has long been underway.
The number of hyperscale data centers has more than doubled in the last six years, reaching nearly 1,000 in 2023, according to Synergy Research Group’s analysis of 19 global cloud and internet providers. Compute capacity grew at an even faster rate, doubling in just four years, the firm said Wednesday.
As AI enthusiasm helps drive expansion, SRG expects global capacity to double again in the next four years, with more than 100 new data centers coming online annually.
SRG is also seeing a bifurcation in data center scale, John Dinsdale, the firm’s chief analyst and research director, said in the Wednesday report.
“While the core data centers are getting ever bigger, there is also an increasing number of relatively smaller data centers being deployed in order to push infrastructure nearer to customers,” Dinsdale said. “Putting it all together though, all major growth trend lines are heading sharply up and to the right.”
The growing cost of AI infrastructure
Training the large language models behind ChatGPT, Gemini, Claude and dozens of other generative AI products is a massive compute job, requiring high-capacity servers loaded with graphics processing units.
In comparison, enterprise use cases are relatively modest.
“Most IT organizations are not going to deploy trillion-weight ChatGPT models,” Gartner VP Analyst Alan Priestley said. “If the marketing department wants to use it, they'll use it in the cloud and someone else will bear the brunt of the infrastructure cost and the power budget.”
The potential value of growing AI usage over time is worth the up-front costs, and the major hyperscalers are all in on the action.
AWS pledged to pour $10 billion into multiple complexes in Mississippi, the largest single capital investment in the state’s history, the Mississippi Development Authority announced in January. That’s in addition to the $7.8 billion the company plans to spend over the next six years to expand data center operations in Ohio and the $35 billion it has allotted to construct campuses across Virginia by 2040.
Microsoft and Google Cloud ramped up infrastructure spending, too, and Oracle recently committed to $10 billion in cloud capacity build-outs.
In the single largest move, one that reflects the sheer scale of infrastructure needed to run LLMs, Microsoft has pledged $100 billion to build a supercomputing facility to support OpenAI’s model-building operations.
“$100 billion on infrastructure says you expect to be generating multiple hundreds of billions of dollars in revenue from what’s developed on that infrastructure,” Priestley said.
Downstream environmental impacts
As generative AI gained traction, even optimists initially worried over chip shortages and infrastructure capacity. Regulatory compliance, safety issues and broader existential threats also surfaced.
Now, sourcing the power to run the data centers is a concern.
“In the overall AI race, silicon constraints will eventually give way to energy constraints, given the sheer amount of compute forecasted,” Ken Englund, Americas leader for technology, media and telecommunications at consulting firm EY, said.
EY analysts see data center sustainability as an opportunity for technology companies, highlighting the energy demands of LLMs in a December report.
AI compute consumes enough energy to power a small country, Englund said.
Predictive modeling coupled with automation can already create more efficiency, optimize energy usage and reduce environmental impacts, according to a November Boston Consulting Group report commissioned by Google Cloud.
As AI capabilities mature, the technology has the potential to reduce global greenhouse gas emissions by up to 10% in the next few years. Yet, water and energy consumption by “newer and more complex AI models” could undercut potential gains, the report acknowledged.
The magnitude of the threat is difficult to quantify, but the issue surfaced during the 2023 United Nations Climate Change Conference, COP 28, in December, Englund said.
“Coming out of COP 28, there were a few interesting discussions about how the total energy demand [of AI] is roughly 2% of all total consumption that is used for compute and, broadly, that could be 20% in the next ten years,” Englund said. “If we have that level of energy demand for compute, it’s going to ripple through everything.”
Optimizing cost and resource usage
The battle to rein in cloud costs may help mitigate environmental impacts — optimization includes paring back unnecessary compute.
Nevertheless, spending overshadows sustainability in the enterprise ecosystem.
Companies prioritized cost optimization over sustainability as the economy soured last year, according to cloud software company Flexera. Nearly half of 753 IT professionals and executive leaders surveyed said their organization has environmental initiatives in place, but 59% were more focused on controlling costs than carbon.
“Until there are penalties for not meeting sustainability goals, it’s not going to be more important than saving 12 cents on a reserved instance,” Brian Adler, senior director of cloud market strategy at Flexera, said.
For now, embedding sustainability in AI strategy is a job for hyperscalers, and one they have an incentive to pursue, SRG’s Dinsdale told CIO Dive in an email.
“Power costs money and hyperscale operators are hugely motivated to be as power efficient as they possibly can,” said Dinsdale. “They will rearchitect networks, redesign data center layouts, separate and manage different types of workloads and put enormous pressure on product development teams to creatively solve potential issues. They are not going to do a Chicken Little and bemoan that the sky is falling.”
SRG has seen data center development limited by power availability in some regions, Dinsdale said, but that plays to one of cloud’s inherent strengths.
“One of the great things about having a geographically dispersed global network of hyperscale data centers is that it gives you flexibility in moving workloads around and placing new data center developments in different regions where power is not a constraint,” Dinsdale said.
Chip manufacturers and cloud providers are all working toward a solution, framing each advance not just as an expansion of compute power, but as an energy saver.
“Our goal is to continuously drive down the cost and the energy,” Nvidia CEO Jensen Huang said during the GTC Artificial Intelligence Conference keynote in March. The chipmaker introduced its Blackwell GPU platform at the event and plans to bring it to market by the end of the year.
AWS, Microsoft and Google are all vying for the new chips while deploying proprietary AI-optimized silicon solutions and prioritizing energy efficiency in their design.
AWS CEO Adam Selipsky extolled the virtues of the company’s third-generation Graviton chip at the hyperscaler’s re:Invent 2023 conference in November. “Our current Graviton 3 delivers up to 25% better compute performance compared to Graviton 2,” he said, adding, “It also uses 60% less energy for the same level of performance.”