Just one ChatGPT query uses 10 times as much power as a Google search
“Yes, we want to use the good part, the benefits we’re gaining from these models as a researcher or a user. But we need to look at how much (power) we’re spending,” said Tushar Sharma, an assistant professor at Dalhousie University’s faculty of computer science who is currently researching how to make generative AI models more energy-efficient.
Just one ChatGPT query uses 10 times as much power as a Google search, according to Goldman Sachs Group Inc. research released in May. Generating an image with a powerful AI model takes the same amount of energy as fully charging a smartphone, according to a December 2023 paper by researchers at AI startup Hugging Face Inc. and Carnegie Mellon University, while generating text takes the same amount of power as charging a smartphone to a 16 per cent battery level.
As a result, the power demands of data centres used for AI processes are surging. Goldman Sachs estimated the power draw from those centres could increase 160 per cent by the end of the decade, rising from roughly one to two per cent of global power demand today to three or four per cent.
“As we further integrate AI into our products, reducing emissions may be challenging,” Google said in its most recent environmental report.
Microsoft Corp., which has baked generative AI into much of its enterprise products and is rolling out AI-generated summaries in Bing, said in its 2024 sustainability report that its emissions had grown 29 per cent since 2020 due to the construction of more data centres “designed and optimized to support AI workloads.”
Sasha Luccioni, an artificial intelligence researcher and climate lead at Hugging Face in Montreal who co-authored the December 2023 paper, said one under-explored area of AI’s carbon footprint is the supply chain for the powerful graphics processing units (GPUs) that AI technologies depend on.
She said generative models require “orders of magnitude more” energy than the AI models that came before them, because those “good old-fashioned” models were extractive in nature.
A search engine using an extractive model turns a query into a series of numbers and then finds the documents or web pages with the closest matching numbers. A generative model instead has to produce a series of probable next words to answer a query, or build an image by generating each pixel individually, around 100 times over, until the model arrives at the final result.
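To make that difference concrete, here is a minimal, purely illustrative Python sketch (not any vendor’s actual implementation): the extractive path performs a single vector comparison per query, while the generative path pays for a model evaluation on every word it produces.

```python
import numpy as np

# --- Extractive retrieval: the query becomes one vector, matched against stored vectors ---
def embed(text: str, dim: int = 16) -> np.ndarray:
    """Toy stand-in for a real embedding model: hash words into a fixed-size vector."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

documents = ["how data centres are cooled", "charging a smartphone efficiently"]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str) -> str:
    """One similarity computation returns the best-matching existing document."""
    scores = doc_vectors @ embed(query)
    return documents[int(np.argmax(scores))]

# --- Generative answering: one model evaluation for every token produced ---
def generate(prompt: str, next_token, max_tokens: int = 50) -> str:
    """Builds the answer word by word; the cost grows with every token generated."""
    tokens = prompt.split()
    for _ in range(max_tokens):
        tokens.append(next_token(tokens))  # each call is a full forward pass in a real model
    return " ".join(tokens)

print(retrieve("data centre cooling"))
print(generate("AI energy use is", next_token=lambda toks: "high", max_tokens=3))
```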
Sharma’s research has focused on optimizing AI model training. Large language models (LLMs) are currently trained on vast amounts of data; he is looking to identify the less necessary or lower-quality data and prune it from the training process.
“We probably don’t need everything,” he said. “The trick is what to select and what to drop … without (creating) a reduction in the value (of the model).”
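What that selection might look like in code is sketched below; the scoring function is a deliberately crude placeholder, not Sharma’s method.

```python
def prune_dataset(examples, score_fn, keep_fraction=0.7):
    """Rank training examples by an estimated usefulness score and keep the top slice."""
    ranked = sorted(examples, key=score_fn, reverse=True)
    cutoff = max(1, int(len(ranked) * keep_fraction))
    return ranked[:cutoff]

# Crude quality proxy for illustration only: deduplicate, then prefer longer documents.
corpus = ["buy milk", "buy milk", "A detailed explainer on how data centre cooling systems work ..."]
kept = prune_dataset(set(corpus), score_fn=len, keep_fraction=0.5)
print(kept)  # the short, duplicated entry is dropped before training begins
```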
Another one of his research projects has focused on identifying the most power-intensive code within an AI model and making those “hotspots” more efficient.
“A program may contain 5,000 lines of code, but not all 5,000 lines are contributing (equally),” Sharma said. “We can put a little bit more human effort into the 200 lines that are (the biggest draw).”
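The article does not describe Sharma’s tooling, but the general idea of hunting for hotspots can be illustrated with Python’s built-in profiler, using runtime as a rough proxy for energy draw.

```python
import cProfile
import pstats

def expensive_step(n):
    # Deliberately heavy loop, standing in for the costly few hundred lines Sharma describes.
    return sum(i * i for i in range(n))

def cheap_step(n):
    return n + 1

def pipeline():
    for _ in range(100):
        expensive_step(100_000)  # the "hotspot": almost all of the cost lives here
        cheap_step(100_000)

profiler = cProfile.Profile()
profiler.runcall(pipeline)
# The top entries point to the code worth hand-optimizing first.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```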
He said he’s searching for “strong industry collaboration” for his research and is currently in discussions with a couple of AI companies.
Researchers at MIT Lincoln Laboratory and Northeastern University, both in Massachusetts, have also explored capping the amount of power GPUs can draw at any one time. When they tested a cap on an LLM being trained, it added three hours to the model’s training time but produced energy savings equivalent to what the average U.S. household uses in a week.
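As an illustration of how such a cap can be applied in practice (not necessarily how the MIT and Northeastern researchers did it), NVIDIA GPUs expose a software power limit through the nvidia-smi tool; the 250-watt figure below is an arbitrary example, and changing the limit requires administrator rights on a machine with an NVIDIA GPU.

```python
import subprocess

def set_gpu_power_limit(watts: int, gpu_index: int = 0) -> None:
    """Ask the NVIDIA driver to cap the power draw of one GPU at `watts`."""
    subprocess.run(
        ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)],  # -pl is the power-limit flag
        check=True,
    )

# e.g. set_gpu_power_limit(250) before launching a training job
```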
Some hyperscalers have started investing in cleaner forms of power for their data centres. Google announced an investment into small modular nuclear reactors in mid-October, and Microsoft signed a deal in late September to restart the Three Mile Island nuclear plant in Pennsylvania.
Luccioni said one way to reduce the carbon impact of using AI models is to use smaller ones, which draw far less energy than LLMs. One small model built to tag movie reviews as positive or negative processed 1,000 reviews while emitting 0.3 grams of carbon dioxide; the same task performed by an LLM emitted 10 grams of CO2, according to Luccioni’s paper.
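As a sketch of that advice (the model choice and example reviews below are illustrative, and the code does not reproduce the paper’s carbon measurements), a compact fine-tuned classifier from the Hugging Face transformers library can tag sentiment without calling an LLM.

```python
from transformers import pipeline

# Loads a small fine-tuned sentiment classifier (hundreds of megabytes), not a full LLM.
classifier = pipeline("sentiment-analysis")

reviews = [
    "A gorgeous, moving film with a career-best performance.",
    "Two hours I will never get back.",
]
print(classifier(reviews))
# e.g. [{'label': 'POSITIVE', 'score': ...}, {'label': 'NEGATIVE', 'score': ...}]
```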
In the enterprise space in particular, she said, companies are often looking to apply a model for specific purposes, such as sentiment analysis or an internal tool that can search and summarize company documents for employees. Those kinds of applications don’t require an LLM and can be done quite well by smaller models.
“Think through what you’re trying to do and what the most relevant tool is for doing that,” she said. “If you want to do summarization, you can use a generative model, but there are extractive models that instead of generating, they’ll take the words from a text, put them together and make a shorter version. That existed before ChatGPT. There are ways of doing things that are more efficient and meant to do exactly what you want to do.”
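A bare-bones version of the extractive summarization she describes is sketched below: it scores sentences by the frequency of the words they contain and keeps the top few, rather than generating any new text. Real extractive systems are more sophisticated; this only shows the idea.

```python
import re
from collections import Counter

def extractive_summary(text: str, num_sentences: int = 2) -> str:
    """Pick the highest-scoring existing sentences instead of generating new ones."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(words)
    # Score each sentence by the frequency of the words it contains.
    scored = sorted(
        sentences,
        key=lambda s: sum(freq[w] for w in re.findall(r"[a-z']+", s.lower())),
        reverse=True,
    )
    keep = set(scored[:num_sentences])
    # Return the chosen sentences in their original order.
    return " ".join(s for s in sentences if s in keep)
```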
Luccioni urged people to be much more sparing with their use of gen AI systems and to question whether the query they’re turning to it for is something they could do on their own, such as using a calculator rather than ChatGPT to answer a math question, or writing down “Buy milk” on a notepad instead of asking Siri to do it.
“AI isn’t ephemeral. It’s quite material and it does use compute (power), it uses water for cooling data centres, it uses energy for powering data centres, and none of that is free,” she said. “Just having this reflection of, ’This action I’m going to take is using AI (and that) has an impact. I should think about whether it’s the right thing to do.’”