Generative AI energy consumption grows, but ROI role unclear

by Pelican Press


IT leaders have several cost considerations as they build a business case for generative AI — some obvious and some hidden.

Fees associated with large language models (LLMs) and SaaS subscriptions are among the most visible expenses. But then there are the less obvious costs of technology adoption: preparing data, upgrading cloud infrastructure and managing organizational change.

Another latent cost has been generative AI (GenAI) energy consumption. Training LLMs requires tremendous amounts of computing power, as does responding to user requests — answering questions or creating images, for instance. Such compute-intensive functions produce heat and require elaborate data center cooling systems that also consume energy.

Enterprise consumers of GenAI tools haven’t fixated on the technology’s power demands. But those requirements are getting more attention, at least at a high level. In January, the International Energy Agency (IEA), a forum of 29 industrialized nations, predicted global “electricity consumption from data [centers], AI and cryptocurrency could double by 2026.” IEA’s “Electricity 2024” report noted data centers’ electricity use in 2026 could reach more than 1,000 terawatt-hours, a total the agency likened to Japan’s entire electricity use.

Goldman Sachs in an April report also pointed to spiraling energy use, citing AI as a contributor. Growth from AI — along with other factors, such as broader energy demand — has created a “power surge from data centers,” according to the financial services company. The report projected global data center electricity use to more than double by 2030.

What higher energy consumption means for GenAI ROI calculations remains unclear. To date, the expected benefits of generative AI deployment have outweighed energy cost concerns. The typical business has been somewhat shielded from having to deal directly with energy considerations, which have been mostly an issue for hyperscalers. Google, for example, reported a 13% year-over-year increase in its greenhouse gas emissions in 2023, citing higher data center energy consumption and pointing to AI as a contributor.

“As we further integrate AI into our products, reducing emissions may be challenging due to increasing energy demands from the greater intensity of AI compute,” the company noted in its “2024 Environmental Report.”


But industry executives suggested businesses, as advanced technology consumers, should reckon with GenAI’s energy dimension — even if it hasn’t been a critical adoption obstacle.

“I wouldn’t say it’s been a blocker, but we do think it’s a key part of the long-term strategy,” said Scott Likens, U.S. and global chief AI engineering officer at consultancy PwC. “There’s energy being used — you don’t take it for granted. There’s a cost somewhere for the enterprise, and we have to take that into account.”

Accounting for energy costs

Enterprise consumers of GenAI might not see energy costs as a billing line item, but those costs are nevertheless present.

Ryan Gross, senior director of data and applications at Caylent, an AWS cloud services provider in Irvine, Calif., said generative AI’s energy consumption is directly proportional to the cost.

Much of the energy cost stems from two categories: model training and model inferencing. Model inferencing happens every time a user prompts a GenAI tool to create a response. The energy use associated with a single query is minuscule compared with training an LLM (fractions of a cent vs. millions of dollars). However, the power demands and costs of individual queries add up over time and across millions of users.
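To make the "add up over time" point concrete, here is a minimal back-of-the-envelope sketch in Python. Every input is a hypothetical placeholder rather than a figure from the vendors or analysts quoted in this article; the function name `total_inference_cost` is likewise invented for illustration.

```python
def total_inference_cost(cost_per_query_usd: float,
                         queries_per_user_per_day: float,
                         users: int,
                         days: int) -> float:
    """Return total inference spend in USD over the period."""
    return cost_per_query_usd * queries_per_user_per_day * users * days


# All inputs below are hypothetical, chosen only to show the scaling.
annual_cost = total_inference_cost(
    cost_per_query_usd=0.002,      # a fraction of a cent per query
    queries_per_user_per_day=10,
    users=1_000_000,
    days=365,
)
print(f"Estimated annual inference cost: ${annual_cost:,.0f}")
```

Under these assumed inputs, a per-query cost of a fifth of a cent grows to roughly $7.3 million a year, which is why small per-query figures still matter at scale.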

How customers absorb those costs remains a bit murky. A business using an enterprise version of a generative AI product pays a licensing fee to access the technology. To the extent energy costs are baked into the fee, those expenses are diffused across the customer base.

Indeed, a PwC sustainability study, published in January, found that emissions stemming from generative AI’s power consumption — during model training, for instance — were distributed across each corporate entity licensing the model.

“Because the foundational training is shared, you actually spread that cost across a lot of users,” Likens said.

As for inference costs, GenAI vendors use a system of tokens to assess LLM usage fees. There’s a charge for each token, and the more complex the query, the more tokens the vendor processes. More tokens signal higher energy use, as inferencing requires power. But the financial effects on enterprises appear to be minimal.
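As a rough illustration of token-based billing, the sketch below prices a simple and a complex query. The per-million-token prices and token counts are hypothetical, and the split between input and output pricing is an assumption about how a vendor might structure fees, not a detail taken from this article.

```python
def query_cost_usd(input_tokens: int,
                   output_tokens: int,
                   input_price_per_million: float,
                   output_price_per_million: float) -> float:
    """Cost of one request when the vendor charges per token processed."""
    return (input_tokens * input_price_per_million
            + output_tokens * output_price_per_million) / 1_000_000


# Hypothetical prices in USD per million tokens.
INPUT_PRICE = 0.50
OUTPUT_PRICE = 1.50

# A short question vs. a long, multi-part request (token counts are made up).
simple_cost = query_cost_usd(200, 100, INPUT_PRICE, OUTPUT_PRICE)
complex_cost = query_cost_usd(4_000, 1_500, INPUT_PRICE, OUTPUT_PRICE)

print(f"Simple query:  ${simple_cost:.5f}")
print(f"Complex query: ${complex_cost:.5f}")
```

The more complex request processes more tokens and therefore costs more, which mirrors the link between token volume and energy use described above.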

[Figure: Graphic outlining generative AI costs. Energy ranks among GenAI costs, but its role in ROI calculations has been limited to this point.]

“The token cost has come down since last year,” Likens said, citing PwC’s in-house use of generative AI. “So, the inferencing cost has not been a large [cost] driver, even though we are using it more.”

The biggest cost contributors to generative AI deployments continue to be the usual suspects, such as infrastructure and data preparation, Likens said.

Rajesh Devnani, vice president of energy and utilities at Hitachi Digital Services, the technology services subsidiary of Hitachi Ltd., offered a similar assessment. He acknowledged the importance of generative AI’s energy use, citing various estimates that a GenAI query response consumes at least four to five times the power of a typical internet search query. But he pointed to other cost contributors as playing a greater role in determining a financial return: data preparation and ongoing data governance; training and change management; and model training, which includes infrastructure and tool costs.

“ROI calculations of GenAI should definitely consider energy costs as a relevant cost factor, though it would not count as the most significant one,” he said.

Indirectly influencing energy consumption

Most GenAI adopters don’t appear to have elevated energy costs as a concern. But they could end up indirectly addressing consumption as they tackle other deployment challenges.

That prospect has much to do with how organizations perceive their top obstacles. Until recently, the cost efficiency of models has prevented organizations from scaling GenAI from limited deployments to entire customer bases, Gross said. But the latest generation of models is more economical, he added.

For example, OpenAI’s GPT-4o mini, released in July, costs 60% less per token processed than GPT-3.5 Turbo, according to the company.

Against that backdrop, organizations now are starting to focus on user expectations, specifically the time it takes to fulfill a request made to a generative AI model.

“It’s more of a latency problem,” Gross said. “Users won’t accept what we are seeing from the usability [perspective].”

Enterprises, however, can tap smaller, fine-tuned models to reduce latency. Such models generally demand fewer computational resources — and, therefore, require less energy to run. Organizations can also include smaller models as part of a multimodel GenAI approach, Gross said. Multiple models offer a range of latency and accuracy levels, as well as different carbon footprints.

In addition, the emergence of agentic AI means problems can be broken down into multiple steps and routed through an autonomous agent to the optimal GenAI model. Prompts that don’t require a general-purpose LLM are dispatched to smaller models for faster processing and — behind the scenes — lower energy use.

But cost efficiency, despite the increased interest in latency, remains an issue for GenAI adopters.

“In general, we’re trying to use agentic architecture to optimize costs,” Likens said. “So, triaging a broken-down question for the right model that costs the least amount of money for the highest accuracy.”

Yet, organizations that build AI agents and create effective agentic architectures also stand to reduce energy consumption, Likens noted.
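The routing idea described in this section can be sketched in a few lines of Python. This is a toy illustration, not PwC's or Caylent's architecture: the model names, relative costs and the word-count heuristic for complexity are all hypothetical, and a production agent would use a far better signal than prompt length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    relative_cost: float   # hypothetical cost (and, loosely, energy) per request
    max_complexity: int    # highest complexity score this model is trusted with


# Hypothetical model lineup, ordered from smallest to largest.
MODELS = [
    ModelOption("small-finetuned", 1.0, 1),
    ModelOption("medium", 5.0, 2),
    ModelOption("general-purpose-llm", 25.0, 3),
]


def estimate_complexity(step: str) -> int:
    """Crude stand-in heuristic: longer prompts are treated as more complex."""
    words = len(step.split())
    if words <= 8:
        return 1
    if words <= 25:
        return 2
    return 3


def route(step: str) -> ModelOption:
    """Pick the cheapest model that is capable of handling the step."""
    needed = estimate_complexity(step)
    capable = [m for m in MODELS if m.max_complexity >= needed]
    return min(capable, key=lambda m: m.relative_cost)


# A broken-down task: each step is routed independently.
steps = [
    "Translate hello to French",
    "Summarize the attached quarterly sales notes into three bullet points "
    "for the regional managers",
]
for step in steps:
    print(f"{route(step).name}: {step[:40]}")
```

Only steps that exceed the smaller models' assumed limits reach the general-purpose LLM, so the same triage that lowers cost also tends to lower compute and energy per request.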

Top data centers deal with GenAI energy demands

Companies consuming generative AI might obliquely address energy consumption. But data centers that train and run models face increasing power demands head on. Their expanding investment in cooling systems offers evidence.

The data center physical infrastructure (DCPI) market’s growth rate increased for the first time in five quarters during the second quarter of 2024, according to the Dell’Oro Group. The Redwood City, Calif., market research firm said the uptick signals the beginning of the “AI growth cycle” for infrastructure sales.

That infrastructure includes thermal management systems. Lucas Beran, research director at Dell’Oro Group, said the thermal management market returned to a double-digit growth rate in the second quarter after a single-digit rate in the first quarter. Beran added that thermal management is a “meaningful part” of DCPI vendor backlogs, which he said grew notably in the first half of 2024.

Liquid cooling gaining traction

Within thermal management, liquid cooling is gathering momentum as a way to cool the high-density computing centers handling AI workloads.

“Liquid cooling is definitely far more efficient at conducting heat than air cooling,” Devnani said.

Liquids have a higher heat capacity than air and can absorb heat more efficiently, he said. Liquid cooling is becoming more relevant due to the power density of GenAI and enhanced high-performance computing workloads, he added.
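A short calculation shows the size of that heat-capacity gap. The sketch below uses approximate room-temperature textbook values for water and air and the standard relation Q = ρ · V̇ · c_p · ΔT. Water is only a stand-in for whatever coolant a data center actually uses, and the 1,200-watt heat load and 10-kelvin temperature rise are illustrative assumptions.

```python
# Approximate room-temperature properties (textbook values).
WATER_DENSITY = 997.0   # kg/m^3
WATER_CP = 4180.0       # J/(kg*K)
AIR_DENSITY = 1.2       # kg/m^3
AIR_CP = 1005.0         # J/(kg*K)


def volumetric_heat_capacity(density: float, specific_heat: float) -> float:
    """Heat absorbed per cubic meter per kelvin of temperature rise, J/(m^3*K)."""
    return density * specific_heat


def required_flow_m3_per_s(heat_watts: float, density: float,
                           specific_heat: float, delta_t_kelvin: float) -> float:
    """Coolant flow needed to carry away heat_watts with a delta_t_kelvin rise.

    Rearranged from Q = density * flow * specific_heat * delta_t.
    """
    return heat_watts / (density * specific_heat * delta_t_kelvin)


ratio = (volumetric_heat_capacity(WATER_DENSITY, WATER_CP)
         / volumetric_heat_capacity(AIR_DENSITY, AIR_CP))

# Hypothetical load: one 1,200 W chip, coolant allowed to warm by 10 K.
air_flow = required_flow_m3_per_s(1200.0, AIR_DENSITY, AIR_CP, 10.0)
water_flow = required_flow_m3_per_s(1200.0, WATER_DENSITY, WATER_CP, 10.0)

print(f"Water holds about {ratio:,.0f}x more heat per unit volume than air")
print(f"Air flow needed:   {air_flow * 1000:.1f} L/s")
print(f"Water flow needed: {water_flow * 1000:.3f} L/s")
```

Per unit volume, water absorbs on the order of a few thousand times more heat than air for the same temperature rise, so far less coolant has to move past a hot chip.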

Liquid cooling represents a much smaller slice of the data center thermal management market than air cooling, but the method showed strong revenue growth during the first half of 2024, Beran noted. Liquid cooling deployments will “significantly accelerate” during the second half of 2024 and into 2025, he added, citing AI workloads and accelerated computing platforms, such as Nvidia’s upcoming Blackwell GPUs.

In addition, IDTechEx, a technology and market research firm based in Cambridge, U.K., projected annual data center liquid cooling revenue to exceed $50 billion by 2035. Chips with increasingly higher thermal design power (TDP) numbers call for more efficient thermal management systems, said Yulin Wang, senior technology analyst at IDTechEx. TDP is the maximum amount of heat a chip produces.

Wang said the company has observed chips with TDP of around 1,200 watts and said chips with TDP of around 1,500 watts are likely to emerge in the next year or two. In comparison, a laptop’s CPU might have a TDP of 15 watts.

Nuclear power for managing AI energy demands

Another power strategy taking shape is harnessing nuclear energy for data centers, a direction AWS, Google and Microsoft are exploring. AWS, for example, earlier this year bought Talen Energy’s nuclear-powered data center campus in Pennsylvania. The use of nuclear power aims to help massive data centers keep pace with AI’s energy demands and address sustainability goals. Nuclear power provides a lower carbon footprint than energy sources such as coal and natural gas.

The hyperscalers’ energy moves may ultimately improve cooling efficiency, address sustainability and keep the power costs of generative AI in check. The last of those outcomes could continue to shield businesses from energy’s ROI effects. Yet, the careful selection of GenAI models, whether by humans or AI agents, can contribute to energy conservation.

Likens said PwC includes “carbon impact” as part of its generative AI value flywheel, a framework for prioritizing GenAI deployments that the company uses internally and with clients.

“It’s part of the decision-making,” he said. “The cost of carbon is in there, so we shouldn’t ignore it.”

John Moore is a writer for TechTarget Editorial covering the CIO role, economic trends and the IT services industry.


