Artificial Footprints Series: The environmental impact of AI
Artificial Intelligence is set to change the way we live. But have we thought about how it could change our planet?
Over the next few weeks we will be sharing a series of articles, exploring the environmental impact of AI. Read the full paper here, or dive into one of our articles below:
The carbon costs of training AI
Building an AI model requires monumental computing power - and, therefore, energy use - to generate and train the new system. That demand is largely driven by three things: the size of the model (its number of parameters - the variables, or weights, that the model adjusts during training), the size of the training datasets, and the tuning of the model's hyperparameters, with the latter often going underreported. So how exactly can you measure the environmental impact of creating a new AI model?
Strubell devised a useful methodology for estimating the carbon emissions associated with training an AI model, based on the power draw of the hardware used and the hours of training required. We applied this methodology to data from other sources to estimate the carbon emissions for additional AI models, in order to give a broader picture of the range of emissions typically generated by training AI. Table 1 shows Strubell's results alongside our own additions. Note that these are estimates only, due to the difficulty of obtaining precise information on some of these models.
Emissions output is measured in kgCO2e, or kilograms of Carbon Dioxide Equivalent, a unit used to measure carbon footprints in terms of the amount of CO2 that would create the same level of global warming.
Model | Hardware | Power (W) | Hours | kWh·PUE | CO2e (kg) |
---|---|---|---|---|---|
T2Tbig | P100x8 | 1515 | 84 | 201 | 87 |
ELMO | P100x3 | 518 | 336 | 275 | 119 |
BERTbase (V100) | V100x64 | 12,042 | 79 | 1507 | 652 |
BERTbase (TPU) | TPUv2x16 | 4000 | 96 | 607 | 261 |
BERTlarge | TPUv3x64 | 12,800 | 96 | 1941 | 840 |
NAS | P100x8 | 1515 | 274,120 | 656,347 | 284,000 |
NAS | TPUv2 | 250 | 32,623 | 12,900 | 5580 |
GPT-2 | TPUv3x32 | 6400 | 168 | 1700 | 735 |
GPT-3 | V100 | 300 | 3,100,000 | 1,474,000 | 638,000 |
Table 1 - Power requirements and CO2e emissions for AI models. Based on data and methodology from Strubell, with additional data from Li, Schwartz, Teich and TechPowerUp.
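To illustrate the calculation behind Table 1, here is a minimal sketch of Strubell's methodology in Python, applied to the BERTlarge row above. The power figure, training hours, PUE of 1.58 and grid carbon intensity of 433gCO2/kWh come from the table and the text; the function and variable names are our own illustration, not part of the original methodology's code.

```python
# Estimate training emissions following Strubell's methodology:
# energy (kWh) = power draw (W) x training hours / 1000, scaled up by the
# data centre PUE, then multiplied by the grid carbon intensity.

def training_emissions_kg(power_watts: float, hours: float,
                          pue: float = 1.58,
                          grid_gco2_per_kwh: float = 433) -> float:
    """Return estimated training emissions in kgCO2e."""
    energy_kwh = power_watts * hours / 1000           # raw hardware energy
    energy_kwh_pue = energy_kwh * pue                 # include data centre overheads
    return energy_kwh_pue * grid_gco2_per_kwh / 1000  # gCO2 -> kgCO2e

# BERTlarge row from Table 1: 64 x TPUv3 drawing 12,800 W for 96 hours
print(training_emissions_kg(12_800, 96))  # prints ~840.7, in line with the 840 kgCO2e in Table 1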
So what is the environmental impact of training AI? As illustrated above, there’s no simple answer, with emissions ranging from 87kgCO2e (roughly equivalent to driving 220 miles in the average car) to 638,000kgCO2e (equivalent to one person flying 10,500 miles from London to Sydney almost 142 times).
To put this latter figure into further context, the average human is responsible for 5,000kgCO2e in a single year - meaning it would take the average person over 127 years to generate the same level of emissions that it took to train GPT-3.
So why is there so much variation in energy consumption between different AI models?
The level of emissions produced by the AI training process depends on four key factors. The first and most obvious is the computational requirement of the individual system - the size of the model and of its training dataset. This can range from BERTbase, which took a mere 79 hours to train on 64 GPUs, to GPT-3, which required a whopping 3,100,000 hours - a little over 350 years - of total computing time. In the next section we’ll delve more deeply into the reasons behind this and the trends that are emerging in computational demands.
The second key factor determining emissions from AI training is the Power Usage Effectiveness (PUE) of the data centre - the ratio of the total energy the facility draws to the energy that actually reaches its computing hardware, with the difference going to overheads such as cooling. Strubell estimates this at 1.58, but it’s possible to reduce this by improving the efficiency and management of data centres. In fact, AI itself could be one tool for improving efficiency - more on this later.
The third factor is the power draw of the hardware itself. Comparing the BERTbase (TPU) and BERTlarge rows above, the latter draws just over three times the power of the former despite using four times as many TPUs (Tensor Processing Units) - roughly 200 W per chip for the 64 TPUv3s versus 250 W per chip for the 16 TPUv2s. This is because the newer TPUv3 has a lower power draw per chip, illustrating how improvements in hardware can also reduce emissions from AI.
The final major factor is the carbon intensity of the grid. In their research, Strubell assumes a carbon intensity of 433gCO2/kWh based on 2018 estimates from the EPA. However, the 2022 US grid carbon intensity was 376gCO2/kWh, which would produce lower emissions than Strubell’s estimates. Other locations would be lower still: the EU’s average grid carbon intensity in 2022 was 250gCO2/kWh, whilst the UK’s was 182gCO2/kWh. Hence, running an AI model on servers in the UK rather than the US could halve the carbon emissions from the model.
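To make the grid comparison concrete, here is a small sketch showing how the emissions from one and the same training run scale with grid carbon intensity. It uses the GPT-3 energy figure (kWh·PUE) from Table 1 and the intensity values quoted above; the constant and variable names are ours, and the outputs differ slightly from the headline 638,000kgCO2e figure because of rounding in the table.

```python
# GPT-3's estimated training energy from Table 1, already including PUE
GPT3_ENERGY_KWH = 1_474_000

# Grid carbon intensities quoted above, in gCO2 per kWh
grid_intensity = {
    "US (2018, Strubell)": 433,
    "US (2022)": 376,
    "EU (2022)": 250,
    "UK (2022)": 182,
}

for grid, gco2_per_kwh in grid_intensity.items():
    emissions_kg = GPT3_ENERGY_KWH * gco2_per_kwh / 1000  # gCO2 -> kgCO2e
    print(f"{grid}: {emissions_kg:,.0f} kgCO2e")
# US (2018, Strubell): 638,242 kgCO2e
# US (2022): 554,224 kgCO2e
# EU (2022): 368,500 kgCO2e
# UK (2022): 268,268 kgCO2e
```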
So, taking these factors into account, is it possible to train AI in an environmentally responsible way?
Fig. 1 and Table 2 compare the real-world emissions estimates above with the hypothetical emissions that would have resulted if the PUE were reduced to 1.25 and the grid carbon intensity were equivalent to that of the UK in 2022. Together, these data centre efficiency and grid carbon intensity measures would reduce emissions in each case by roughly two-thirds.
Model | CO2e (kg) base | CO2e (kg) improved PUE | CO2e (kg) UK grid carbon | CO2e (kg) both |
---|---|---|---|---|
T2Tbig | 87 | 69 | 37 | 29 |
ELMO | 119 | 94 | 50 | 39 |
BERTbase (V100) | 652 | 515 | 274 | 216 |
BERTbase (TPU) | 261 | 206 | 110 | 87 |
BERTlarge | 840 | 664 | 353 | 279 |
NAS (P100) | 284,000 | 224,360 | 119,280 | 94,230 |
NAS (TPUv2) | 5,580 | 4,408 | 2,344 | 1,850 |
GPT-2 | 735 | 581 | 309 | 244 |
GPT-3 | 638,000 | 504,020 | 267,960 | 211,700 |
Table 2 - How reducing PUE (in this case from 1.58 to 1.25) and grid carbon (from 433gCO2/kWh to 182gCO2/kWh) affect emissions from AI.
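Because emissions scale linearly with both PUE and grid carbon intensity, the figures in Table 2 can be reproduced (to within rounding) by scaling each baseline estimate by the ratio of the new PUE to the old and the new grid intensity to the old. A minimal sketch, using the BERTlarge baseline from Table 1; the helper function is our own illustration:

```python
# Rescale a baseline emissions estimate for a different PUE and/or grid intensity.
def rescale_emissions(base_kg: float,
                      pue_old: float = 1.58, pue_new: float = 1.58,
                      grid_old: float = 433, grid_new: float = 433) -> float:
    """Emissions scale linearly with both PUE and grid carbon intensity."""
    return base_kg * (pue_new / pue_old) * (grid_new / grid_old)

base = 840  # BERTlarge baseline from Table 1, in kgCO2e
print(rescale_emissions(base, pue_new=1.25))                # ~665 kg: improved PUE only
print(rescale_emissions(base, grid_new=182))                # ~353 kg: UK grid only
print(rescale_emissions(base, pue_new=1.25, grid_new=182))  # ~279 kg: both measures
```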
However, although that two-thirds reduction illustrates the value of improving data centre efficiency and grid carbon intensity, it is still far outweighed by the huge increase in emissions between GPT-2 and its successor, GPT-3. This is indicative of a wider trend in which small gains in efficiency are far outstripped by a shift towards larger, more complex AI systems that require far more energy to train and run.
In next week’s article, we will take a look at the incredible growth rate of AI models and explore the reasons behind this trend: The exponential growth of computational power behind AI
This article is part of the Artificial Footprints Series, taken from our report by Owain Jones.