AI uses between 10 and 15 TWh of electricity per year as of early 2026, roughly the annual consumption of a country the size of Ghana. A single ChatGPT query consumes approximately 10 times the energy of a Google search. That gap is widening as models grow larger, inference volumes scale, and new data centres come online every quarter. Here is exactly how much energy AI uses, broken down by provider, model, and use case.
How Much Energy Does AI Use Compared to Traditional Computing
AI energy consumption is the total electrical power drawn by hardware systems during model training, inference, and the cooling infrastructure that supports them. The International Energy Agency (IEA) estimated that global data centres consumed approximately 460 TWh in 2024, with AI workloads accounting for roughly 2 to 3% of that total. By 2026, the IEA projects AI-specific electricity demand will reach 90 TWh under moderate growth assumptions, and could exceed 150 TWh if current deployment rates accelerate.
A standard Google search query uses about 0.3 Wh of electricity. A ChatGPT query on GPT-4 uses approximately 2.9 Wh, nearly ten times more. An image generation request through Midjourney or DALL-E 3 consumes roughly 3.4 Wh per image. These numbers sound small in isolation, but ChatGPT alone handles over 1 billion queries per week as of late 2025. At 2.9 Wh per query, that single product consumes approximately 150 GWh per year, equivalent to the annual electricity use of roughly 14,000 average US households.
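The annual figure follows directly from the per-query estimate and the weekly query volume. A minimal sketch of that arithmetic, using only the estimates quoted above; the function name `annual_energy_gwh` is illustrative and not taken from any source.

```python
WH_PER_GWH = 1e9       # 1 GWh = 1,000,000,000 Wh
WEEKS_PER_YEAR = 52


def annual_energy_gwh(wh_per_query: float, queries_per_week: float) -> float:
    """Annual electricity use in GWh for a given per-query energy and weekly volume."""
    return wh_per_query * queries_per_week * WEEKS_PER_YEAR / WH_PER_GWH


# Estimates quoted above: 2.9 Wh per query, 1 billion queries per week.
chatgpt_gwh = annual_energy_gwh(2.9, 1e9)
print(f"{chatgpt_gwh:.0f} GWh per year")  # about 151 GWh
```

Because the result is linear in both inputs, halving the per-query energy or the query volume halves the annual total.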
Training is where AI data centre power consumption reaches its most extreme levels. Training GPT-4 reportedly consumed approximately 50 GWh of electricity across several months on a cluster of 25,000 A100 GPUs. Training Llama 3 405B on 16,384 H100 GPUs over 54 days consumed roughly 30 GWh. Google’s Gemini Ultra training run is estimated at 45 to 55 GWh. Each successive generation of frontier models requires two to four times the compute of its predecessor, which suggests training energy costs are doubling roughly every 12 to 18 months.
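A rough way to sanity-check a training estimate is to multiply GPU count, per-GPU power, and run length. The sketch below does this for the Llama 3 405B run. It assumes every GPU draws the 700 W H100 TDP for the whole run, which is a simplification, and the `overhead` multiplier is a hypothetical knob for everything beyond the GPUs (host CPUs, networking, cooling, restarts).

```python
def gpu_energy_gwh(num_gpus: int, watts_per_gpu: float, days: float,
                   overhead: float = 1.0) -> float:
    """Energy in GWh for num_gpus running at watts_per_gpu for the given days.

    overhead is an assumed multiplier for non-GPU power; 1.0 means GPUs only.
    """
    hours = days * 24
    watt_hours = num_gpus * watts_per_gpu * hours * overhead
    return watt_hours / 1e9


# Llama 3 405B figures quoted above: 16,384 H100 GPUs for 54 days.
gpu_only = gpu_energy_gwh(16_384, 700, 54)
print(f"GPU-only estimate: {gpu_only:.1f} GWh")  # about 14.9 GWh

# Multiplier needed to reach the roughly 30 GWh estimate quoted above.
print(f"Implied overhead multiplier: {30 / gpu_only:.2f}")  # about 2.02
```

On these assumptions the GPUs account for about half of the quoted total. The rest would have to come from the surrounding servers, facility overhead, and repeated or failed runs.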
Energy Consumption by AI Provider and Model: 2025 Data
The energy footprint of AI varies significantly depending on the provider, the model architecture, and whether you are measuring training or inference. The table below compiles the best available data from corporate sustainability reports, academic papers, and independent research published through early 2026.
| Provider / Model | Training Energy (GWh) | Energy per Query (Wh) | Annual Inference Energy (est. GWh) | PUE Ratio |
|---|---|---|---|---|
| OpenAI GPT-4 | ~50 | 2.9 | ~150 | 1.10 – 1.20 |
| OpenAI GPT-4o | ~25 | 1.5 | ~200 | 1.10 – 1.20 |
| Google Gemini Ultra | ~50 | 2.2 | ~180 | 1.10 |
| Google Gemini Flash | ~8 | 0.4 | ~90 | 1.10 |
| Meta Llama 3 405B | ~30 | 1.8 | ~40 | 1.15 – 1.25 |
| Anthropic Claude 3.5 Sonnet | ~15 (est.) | 1.2 (est.) | ~60 | 1.12 (AWS) |
| Mistral Large | ~5 | 0.8 | ~15 | Varies (cloud) |
| Midjourney v6 (per image) | N/A | 3.4 | ~25 | Varies (cloud) |
| Stable Diffusion XL (per image) | ~0.2 | 0.8 | ~5 | Varies |
Several patterns emerge from this data. First, training energy scales roughly with parameter count and dataset size. GPT-4, estimated at 1.8 trillion parameters, consumed significantly more training energy than Llama 3 405B, which has less than a quarter of the parameter count. Second, inference energy per query varies by a factor of roughly 8 between the most and least efficient entries in the table (3.4 Wh versus 0.4 Wh). Smaller, distilled models like Gemini Flash and Mistral Large deliver dramatically lower per-query costs. Third, total annual inference energy now exceeds training energy for the most widely deployed models because inference volumes are growing faster than anyone predicted in 2023.
Power Usage Effectiveness (PUE) is the ratio of total facility power to IT equipment power. A PUE of 1.10 means the data centre uses 10% additional energy for cooling, lighting, and other overhead beyond what the servers themselves consume. Google reports a fleet-wide PUE of 1.10, which is among the lowest in the industry. Microsoft reports 1.12 across Azure regions. Older facilities operated by colocation providers often run at PUE ratios of 1.4 to 1.6, meaning overhead adds 40 to 60% on top of the IT load, or roughly 29 to 38% of total facility electricity. When you evaluate how much energy AI uses, the PUE of the hosting facility matters as much as the model architecture itself.
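PUE overhead can be expressed either relative to the IT load or as a share of total facility power, and the two numbers are easy to confuse. A short sketch of both, derived only from the definition of PUE given above; the function names are illustrative.

```python
def overhead_vs_it(pue: float) -> float:
    """Overhead energy as a fraction of IT equipment energy."""
    return pue - 1.0


def overhead_share_of_total(pue: float) -> float:
    """Overhead energy as a fraction of total facility energy."""
    return (pue - 1.0) / pue


for pue in (1.10, 1.12, 1.4, 1.6):
    print(f"PUE {pue:.2f}: +{overhead_vs_it(pue):.0%} on top of IT load, "
          f"{overhead_share_of_total(pue):.1%} of total facility energy")
```

For a PUE of 1.4 this gives 40% on top of the IT load but only about 28.6% of total facility energy. At 1.6 the figures are 60% and 37.5%.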
AI Data Center Power Consumption: Where the Electricity Actually Goes
Understanding how an AI data centre works reveals why AI workloads are so power-hungry. A modern AI data centre allocates its power budget across four primary categories: GPU compute, networking, cooling, and auxiliary systems. The distribution is not what most people expect.
GPU compute accounts for 60 to 70% of total facility power in a purpose-built AI data centre. A single NVIDIA H100 SXM GPU has a thermal design power (TDP) of 700W. A DGX H100 system containing eight GPUs draws approximately 10.2 kW at full load. A rack of four DGX systems pulls 40 to 42 kW. Scale that to a 100,000-GPU cluster and the GPUs alone demand 70 MW of continuous power at TDP, or roughly 127.5 MW once the full draw of each DGX system is counted. The newer B200 GPUs are even more power-hungry, with a TDP of 1,000W per chip, pushing rack-level power to around 120 kW in dense rack-scale configurations that pack 72 GPUs into a single rack.
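The per-GPU, per-system, and per-rack figures above can be combined into a quick cluster-level estimate. The sketch below gives two bounds for a cluster: GPUs alone at TDP, and whole DGX systems at full load. The constants come from the paragraph above; the function names are illustrative.

```python
H100_TDP_W = 700      # per-GPU thermal design power quoted above
DGX_H100_KW = 10.2    # full-load draw of one eight-GPU DGX H100 system
GPUS_PER_DGX = 8


def gpu_only_mw(num_gpus: int, tdp_w: float = H100_TDP_W) -> float:
    """Power of the GPUs alone, in MW, if every GPU runs at TDP."""
    return num_gpus * tdp_w / 1e6


def dgx_level_mw(num_gpus: int) -> float:
    """Power in MW counting whole DGX systems rather than bare GPUs."""
    systems = num_gpus / GPUS_PER_DGX
    return systems * DGX_H100_KW / 1000


def rack_kw(dgx_per_rack: int = 4) -> float:
    """Full-load power of one rack of DGX H100 systems, in kW."""
    return dgx_per_rack * DGX_H100_KW


print(f"Rack of four DGX systems: {rack_kw():.1f} kW")
print(f"100,000 GPUs, GPU-only: {gpu_only_mw(100_000):.1f} MW")
print(f"100,000 GPUs, DGX-level: {dgx_level_mw(100_000):.1f} MW")
```

Neither bound includes networking, cooling, or auxiliary systems, which are covered separately.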
Networking consumes 10 to 15% of facility power. InfiniBand NDR switches draw 1,200 to 1,500W per unit, and a large training cluster requires thousands of these switches in a fat-tree or Clos topology. The spine-leaf architecture for a 100,000-GPU cluster might include 800 to 1,200 leaf switches and 200 to 300 spine switches, collectively drawing 1.5 to 2.5 MW.
Cooling takes 15 to 25% depending on the technology deployed. Air-cooled facilities at the high end of that range are increasingly rare for new AI deployments. Direct liquid cooling reduces cooling overhead to 10 to 15% of total power. Rear-door heat exchangers, cold plate systems from vendors like CoolIT and Vertiv, and full immersion solutions from GRC and LiquidCool Solutions each offer different tradeoffs between cost, complexity, and efficiency. Every percentage point of PUE improvement at a 500 MW campus saves 5 MW of continuous power, which translates to $2.2 million per year at $0.05/kWh.
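The savings figure can be reproduced in a few lines. This sketch assumes the 500 MW refers to IT load and that "one percentage point" means a PUE change of 0.01; both are readings of the sentence above rather than stated facts, and the function name is illustrative.

```python
HOURS_PER_YEAR = 8760


def annual_savings_usd(it_load_mw: float, pue_delta: float,
                       usd_per_kwh: float) -> float:
    """Yearly cost saved by lowering PUE by pue_delta at a constant IT load."""
    saved_mw = it_load_mw * pue_delta            # continuous power no longer drawn
    saved_kwh = saved_mw * 1000 * HOURS_PER_YEAR
    return saved_kwh * usd_per_kwh


savings = annual_savings_usd(it_load_mw=500, pue_delta=0.01, usd_per_kwh=0.05)
print(f"${savings:,.0f} per year")  # about $2.19 million
```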
IEA Projections for AI Electricity Demand Through 2030
The International Energy Agency published its most detailed AI energy projections in the 2025 World Energy Outlook. The numbers challenge the assumption that efficiency improvements will keep pace with demand growth. Under the IEA’s Stated Policies Scenario (STEPS), global data centre electricity consumption reaches 945 TWh by 2030, with AI workloads comprising 15 to 20% of that total, roughly 140 to 190 TWh. Under the accelerated deployment scenario, AI-specific consumption could reach 300 TWh by 2030.
For context, 300 TWh is more than the entire annual electricity consumption of the United Kingdom (290 TWh in 2024). It is roughly 1% of projected global electricity generation in 2030. That may sound modest as a percentage, but the concentration matters. AI data centres cluster in specific grid regions: Northern Virginia, Dublin, Amsterdam, Singapore, and increasingly, the US Sunbelt. In Northern Virginia, data centres already account for over 25% of regional electricity demand, and Dominion Energy projects that figure will reach 40% by 2030.
The IEA identifies three variables that will determine whether AI energy consumption follows the moderate or accelerated trajectory. First, inference efficiency gains from techniques like quantisation, speculative decoding, and mixture-of-experts architectures could reduce per-query energy by 50 to 80% over the next three years. Second, the rate of AI adoption across enterprise and consumer applications determines total query volume. Third, the cadence of frontier model training runs sets the ceiling for peak power demand at individual facilities. If you are tracking how much energy AI uses at a macro level, these three factors are the ones to watch.
Training vs Inference: Which Phase of AI Uses More Energy
Training a large language model is the most energy-intensive single computation most organisations will ever run. But inference, the process of actually using a trained model to generate responses, now consumes more total energy across the industry because it runs continuously at massive scale.
Training GPT-4 consumed approximately 50 GWh. That training run happened once, with restarts and experimentation adding perhaps 20 to 30% overhead. The total training-related energy for GPT-4, including preliminary experiments, might therefore reach 60 to 65 GWh. By contrast, OpenAI’s inference infrastructure for GPT-4 and GPT-4o serves over 1 billion queries per week, consuming an estimated 350 GWh per year across all models. For OpenAI alone, one year of inference already uses 5 to 6 times the energy of the GPT-4 training effort.
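The comparison above is simple arithmetic, sketched here with the estimates from the paragraph. Note that it sets a one-off training cost against a single year of inference, so the gap grows with every additional year a model stays in service. The function name is illustrative.

```python
def training_total_gwh(base_gwh: float, overhead_fraction: float) -> float:
    """Training energy including restarts and experiments, in GWh."""
    return base_gwh * (1 + overhead_fraction)


low = training_total_gwh(50, 0.20)    # 60 GWh
high = training_total_gwh(50, 0.30)   # 65 GWh
inference_gwh_per_year = 350          # estimate quoted above

print(f"Training total: {low:.0f} to {high:.0f} GWh")
print(f"Inference-to-training ratio: "
      f"{inference_gwh_per_year / high:.1f} to {inference_gwh_per_year / low:.1f}")
```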
This ratio is shifting further toward inference as more companies deploy AI models in production. Meta embeds Llama-based models in recommendation systems across Facebook, Instagram, and WhatsApp, generating trillions of inference calls per day. Google runs Gemini-based models in Search, Ads, YouTube, and Gmail. Each of these integrations multiplies the total inference energy footprint. Goldman Sachs estimated in late 2024 that global AI inference energy consumption would exceed AI training energy by 8 to 10x by 2027.
The efficiency characteristics of training and inference differ fundamentally. Training maximises GPU utilisation at near 100% for weeks or months. Inference operates in bursts, with average GPU utilisation often below 30% because providers must maintain capacity for peak demand. This low utilisation rate means inference wastes a significant amount of the energy allocated to it. Improving inference efficiency through batching, model distillation, and dynamic scaling represents one of the largest opportunities to reduce how much energy AI uses overall.
How Much Energy Does a Single AI Query Use: ChatGPT, Gemini, and Claude
The per-query energy consumption of AI models depends on four factors: model size (parameter count), context length, hardware efficiency, and data centre PUE. Larger models with longer context windows on older hardware in less efficient facilities use the most energy per query. Smaller models on newer hardware in optimised facilities use the least.
A ChatGPT query using GPT-4 Turbo processes approximately 500 to 1,500 tokens of input and generates 200 to 800 tokens of output in a typical conversational exchange. On NVIDIA H100 hardware, this requires roughly 0.001 to 0.003 kWh (1 to 3 Wh) of GPU compute energy, with facility overhead adding 10 to 20% on top. The weighted average across all ChatGPT interactions, including shorter GPT-3.5 queries, lands around 2.9 Wh per query according to analysis by the AI researcher Alex de Vries, published in Joule.
Google’s Gemini models span a wider efficiency range. Gemini Ultra, the largest variant, consumes an estimated 2.0 to 2.5 Wh per query. Gemini Pro sits at roughly 1.0 to 1.5 Wh. Gemini Flash, designed explicitly for efficiency, consumes approximately 0.3 to 0.5 Wh per query, making it one of the most energy-efficient frontier-class models available. Google achieves this through a combination of TPU hardware optimisation, smaller model variants, and aggressive quantisation.
The energy gap between a standard web search and an AI query is narrowing for lighter models but remains substantial for frontier models. When you ask how much energy AI uses per interaction, the answer ranges from 0.3 Wh for an optimised small model to 3.5 Wh for a large multimodal request. Your choice of model and provider directly determines your per-query AI carbon footprint.
AI Carbon Footprint: Converting Energy to Emissions
Energy consumption translates to carbon emissions through the carbon intensity of the local electricity grid. A data centre running on 100% renewable energy in Norway (carbon intensity: 8 gCO2/kWh) produces a fundamentally different carbon footprint than an identical facility running on the coal-heavy grid in West Virginia (carbon intensity: 800 gCO2/kWh).
Training GPT-4 at 50 GWh of electricity consumption would produce approximately 25,000 tonnes of CO2 on the average US grid (500 gCO2/kWh). On Microsoft’s reported energy mix, which includes substantial renewable procurement, the actual emissions were likely 40 to 60% lower. For comparison, 25,000 tonnes of CO2 is equivalent to approximately 5,500 return flights from London to New York, or the annual emissions of 5,000 average UK cars.
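Converting energy to emissions is a single multiplication by grid carbon intensity. A minimal sketch using the 50 GWh estimate and the grid intensities quoted above; the function name is illustrative.

```python
def emissions_tonnes(energy_gwh: float, grid_g_per_kwh: float) -> float:
    """CO2 emissions in tonnes for a given energy use and grid carbon intensity."""
    kwh = energy_gwh * 1_000_000        # 1 GWh = 1,000,000 kWh
    grams = kwh * grid_g_per_kwh
    return grams / 1_000_000            # 1 tonne = 1,000,000 g


# Grid intensities in gCO2/kWh, as quoted above.
grids = {"Norway": 8, "average US grid": 500, "West Virginia": 800}

for name, intensity in grids.items():
    print(f"50 GWh on {name}: {emissions_tonnes(50, intensity):,.0f} tonnes CO2")
```

The same 50 GWh run works out to 400 tonnes on the Norwegian figure and 40,000 tonnes on the West Virginian one, a hundredfold difference from location alone.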
Google reported in its 2024 Environmental Report that its total greenhouse gas emissions reached 14.3 million tonnes CO2e in 2023, up 13% year-over-year and 48% above its 2019 baseline, driven largely by data centre energy consumption and supply chain emissions as AI workloads grew. Microsoft reported in its 2024 Sustainability Report that its total emissions had risen roughly 29% above its 2020 baseline, mostly from Scope 3 emissions tied to data centre construction. Both companies have 2030 climate commitments (net zero for Google, carbon negative for Microsoft), but the trajectory is currently moving in the wrong direction because AI energy demand is growing faster than renewable energy procurement can scale.
The AI carbon footprint of inference is particularly difficult to reduce because inference runs 24/7 across globally distributed data centres. Training runs can be scheduled to coincide with periods of high renewable energy availability, a technique called carbon-aware computing. Inference cannot wait for the wind to blow. This makes the grid mix of inference data centre locations the single most important variable in determining the carbon impact of deployed AI systems.
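Carbon-aware scheduling of a deferrable job can be sketched as a search for the cleanest contiguous window in a grid-intensity forecast. The example below is a simplified illustration, not any provider's scheduler. The forecast values are made up, and `best_start_hour` is a hypothetical helper.

```python
from collections.abc import Sequence


def best_start_hour(forecast_g_per_kwh: Sequence[float], job_hours: int) -> int:
    """Return the start index of the contiguous window with the lowest total intensity."""
    n = len(forecast_g_per_kwh)
    if job_hours <= 0 or job_hours > n:
        raise ValueError("job_hours must be between 1 and the forecast length")

    window = sum(forecast_g_per_kwh[:job_hours])
    best_total, best_start = window, 0
    for start in range(1, n - job_hours + 1):
        # Slide the window one hour: add the newest hour, drop the oldest.
        window += forecast_g_per_kwh[start + job_hours - 1] - forecast_g_per_kwh[start - 1]
        if window < best_total:
            best_total, best_start = window, start
    return best_start


# Hypothetical hourly grid intensity forecast in gCO2/kWh (made-up values).
forecast = [500, 450, 300, 200, 180, 220, 400, 520]
start = best_start_hour(forecast, job_hours=3)
print(f"Start the 3-hour job at hour {start}")  # hour 3 for this forecast
```

A latency-sensitive inference request has no equivalent freedom to shift in time, which is why location matters more than scheduling for inference.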
Reducing AI Energy Consumption: Hardware and Software Approaches
The AI industry is pursuing energy reduction on two parallel tracks: building more efficient hardware and deploying smarter software. Both are producing measurable results, though neither is keeping pace with overall demand growth.
On the hardware side, each GPU generation delivers roughly 2 to 3x better performance per watt than its predecessor. The NVIDIA B200 delivers approximately 2.5x the inference throughput of an H100 at 1.4x the power draw, yielding a net 1.8x improvement in performance per watt. Google’s TPU v5p delivers 2.8x the training throughput of TPU v4 with less than 2x the power increase. These improvements are real and significant, but they are being consumed by the exponential growth in model size and deployment scale.
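Performance per watt is throughput gain divided by power gain. The sketch below reproduces the B200 figure and then shows how volume growth can cancel the saving. The 3x query growth is a hypothetical value chosen purely for illustration.

```python
def perf_per_watt_gain(throughput_ratio: float, power_ratio: float) -> float:
    """Improvement in work done per unit of energy between two hardware generations."""
    return throughput_ratio / power_ratio


# B200 versus H100, using the ratios quoted above.
gain = perf_per_watt_gain(throughput_ratio=2.5, power_ratio=1.4)
print(f"Performance per watt: {gain:.2f}x")            # about 1.79x

# Energy needed per query relative to the previous generation.
energy_per_query = 1 / gain
print(f"Energy per query: {energy_per_query:.2f}x")    # 0.56x

# Hypothetical: if query volume triples, total energy still rises.
volume_growth = 3.0
print(f"Total energy: {volume_growth * energy_per_query:.2f}x")  # 1.68x
```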
Software-level optimisations offer more immediate energy savings. Quantisation reduces model precision from 16-bit or 32-bit floating point to 8-bit or even 4-bit integers. Relative to a 16-bit baseline, that cuts memory requirements by 50 to 75%, with broadly comparable reductions in compute energy and minimal accuracy loss for most applications. Mixture-of-experts (MoE) architectures activate only a subset of model parameters for each query, reducing compute energy roughly in proportion to the inactive share. Speculative decoding uses a small, fast model to draft responses that a larger model then verifies, reducing the number of large-model forward passes by 40 to 60%.
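The memory side of quantisation is straightforward to quantify, since weight storage scales linearly with bits per parameter. A small sketch, assuming a hypothetical 70-billion-parameter model and counting weights only (activations and caches are ignored); the function names are illustrative.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed to store model weights, in GB (weights only)."""
    return num_params * bits_per_param / 8 / 1e9


def saving_vs(baseline_bits: int, target_bits: int) -> float:
    """Fractional memory saving when moving from baseline_bits to target_bits."""
    return 1 - target_bits / baseline_bits


PARAMS = 70e9  # hypothetical model size, chosen for illustration

for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(PARAMS, bits):.0f} GB")

print(f"16-bit to 8-bit: {saving_vs(16, 8):.0%} smaller")   # 50%
print(f"16-bit to 4-bit: {saving_vs(16, 4):.0%} smaller")   # 75%
```

Energy savings do not follow memory savings exactly, because they also depend on whether the hardware has native low-precision arithmetic.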
Model distillation, where a smaller model is trained to replicate the behaviour of a larger one, represents perhaps the most impactful approach. GPT-4o Mini delivers roughly 80% of GPT-4’s quality at approximately 50% of the per-query energy cost. If providers can shift the majority of traffic to these distilled models while reserving frontier models for tasks that genuinely require them, the aggregate energy savings would be substantial. The question is whether user expectations and competitive dynamics will permit that shift. If you are evaluating how much energy AI uses within your own organisation, selecting the smallest model that meets your accuracy requirements is the most effective single decision you can make.
Nuclear Power for AI Data Centres: The Long-Term Energy Solution
Nuclear energy is emerging as the preferred long-term power source for AI data centres because it delivers the two things AI facilities need most: high-capacity baseload power and near-zero carbon emissions. Nuclear plants operate at capacity factors above 90%, meaning they deliver more than 90% of their maximum possible output over a year, compared to 25 to 35% for solar and 35 to 45% for onshore wind. For AI data centres that need uninterrupted power 8,760 hours per year, nuclear’s reliability is not a preference but a requirement. Microsoft’s 20-year power purchase agreement with Constellation Energy to restart the Three Mile Island Unit 1 reactor (835 MW capacity) signalled a turning point for the industry in September 2024.
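Capacity factor translates nameplate capacity into energy actually delivered. The sketch below applies the capacity factors quoted above to the 835 MW figure. Treating solar at the same nameplate rating is purely for comparison, and the function names are illustrative.

```python
HOURS_PER_YEAR = 8760


def annual_twh(capacity_mw: float, capacity_factor: float) -> float:
    """Energy delivered per year, in TWh, at a given capacity factor."""
    return capacity_mw * capacity_factor * HOURS_PER_YEAR / 1e6


def nameplate_needed_mw(target_avg_mw: float, capacity_factor: float) -> float:
    """Nameplate capacity required to average target_avg_mw over a year."""
    return target_avg_mw / capacity_factor


nuclear = annual_twh(835, 0.90)
solar = annual_twh(835, 0.30)
print(f"835 MW nuclear at 90%: {nuclear:.2f} TWh per year")  # about 6.58
print(f"835 MW solar at 30%:   {solar:.2f} TWh per year")    # about 2.19

# Solar nameplate needed to match the reactor's average output, before storage.
print(f"{nameplate_needed_mw(835 * 0.90, 0.30):,.0f} MW of solar")  # about 2,505 MW
```

Matching annual energy is only part of the picture. A data centre also needs that energy at every hour, which is what storage or firm generation has to cover.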
Amazon had moved earlier, in March 2024, acquiring a nuclear-powered data centre campus adjacent to the Susquehanna Steam Electric Station in Pennsylvania for $650 million. Google signed the first corporate agreement for small modular reactor (SMR) power with Kairos Power in October 2024, targeting 500 MW of capacity by 2035. Oracle announced plans for a 1 GW data centre campus powered by three SMRs. Most of these are signed agreements with capital committed, not speculative announcements.
Small modular reactors (SMRs) are factory-built nuclear reactors with capacities of 50 to 300 MW per unit, designed for faster deployment and lower upfront costs than conventional nuclear plants. NuScale Power’s VOYGR design and TerraPower’s Natrium reactor are the two leading SMR platforms in the US, with first commercial operations expected between 2029 and 2032. For AI companies planning data centre campuses that will operate for 20 to 30 years, nuclear provides a power source that scales with their needs without depending on weather patterns or fossil fuel price volatility.
Frequently Asked Questions
How much electricity does ChatGPT use per day?
ChatGPT consumes approximately 1.3 GWh of electricity per day based on an estimated 180 million daily active users generating roughly 450 million queries. That daily consumption is equivalent to powering about 120,000 UK households for one day. OpenAI’s total daily electricity consumption across all products and training runs likely exceeds 2 GWh.
Does AI use more energy than Bitcoin mining?
Bitcoin mining consumed approximately 120 to 150 TWh per year in 2025, while AI workloads consumed roughly 10 to 15 TWh. Bitcoin currently uses 8 to 15 times more energy than AI. However, AI energy demand is growing at 40 to 50% annually while Bitcoin mining growth has plateaued, so the gap is closing and AI could surpass Bitcoin by 2028 to 2030.
Can renewable energy power all AI data centres?
Renewable energy can technically supply the power AI data centres need, but not at the pace and concentration required. AI data centres need 24/7 baseload power in specific locations, while solar and wind are intermittent and geographically constrained. A hybrid approach combining renewables, battery storage, and nuclear power is the most realistic path to carbon-neutral AI infrastructure by 2035.
How much water do AI data centres consume for cooling?
A large AI data centre consumes 1 to 5 million litres of water per day for evaporative cooling, depending on climate and cooling technology. Microsoft reported a 34% increase in water consumption in its 2022 fiscal year, and its reported global consumption reached roughly 7.8 billion litres in 2023. Direct liquid cooling and closed-loop systems can reduce water consumption by 80 to 90% compared to traditional evaporative methods.
Will AI energy consumption slow down the energy transition?
AI energy demand is placing measurable strain on grid decarbonisation timelines. Google and Microsoft both reported rising emissions in 2023 and 2024 despite aggressive renewable procurement. However, AI also accelerates the energy transition by optimising grid management, improving battery chemistry research, and enhancing weather forecasting for renewable energy planning. The net effect depends on how much energy AI uses versus how much energy AI helps save.