The artificial-intelligence (AI) industry is often compared to the oil industry: once mined and refined, data, like oil, can be a highly lucrative commodity. Now it seems the metaphor may extend even further.
Like its fossil-fuel counterpart, the process of deep learning has an outsize environmental impact.
In a new paper, researchers at the University of Massachusetts, Amherst, performed a life cycle assessment for training several common large AI models.
They found that the process can emit more than 626,000 pounds of carbon dioxide equivalent—nearly five times the lifetime emissions of the average American car (and that includes manufacture of the car itself).
The paper specifically examines the model training process for natural-language processing (NLP), the subfield of AI that focuses on teaching machines to handle human language.
The researchers looked at four models in the field that have been responsible for the biggest leaps in performance: the Transformer, ELMo, BERT, and GPT-2.
They trained each on a single GPU for up to a day to measure its power draw, then used the training durations reported in each model's original paper to extrapolate the total energy consumed over the full training process and convert it into carbon emissions.
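To make the arithmetic concrete, here is a minimal sketch of that style of estimation in Python. The function name, the PUE and grid-intensity defaults, and the example workload are illustrative assumptions for this article, not figures taken from the paper.

```python
def co2e_pounds(gpu_watts: float, num_gpus: int, hours: float,
                pue: float = 1.58, lbs_co2_per_kwh: float = 0.954) -> float:
    """Estimate CO2-equivalent emissions for a training run.

    pue: data-center power usage effectiveness (overhead multiplier).
    lbs_co2_per_kwh: carbon intensity of the electricity grid.
    Both defaults are illustrative assumptions, not the paper's figures.
    """
    # Energy in kWh: measured device power x device count x runtime,
    # scaled up by data-center overhead.
    kwh = gpu_watts * num_gpus * hours / 1000.0 * pue
    # Convert energy to emissions via the grid's carbon intensity.
    return kwh * lbs_co2_per_kwh

# Hypothetical workload: 8 GPUs drawing 250 W each, trained for two weeks.
print(f"{co2e_pounds(gpu_watts=250, num_gpus=8, hours=14 * 24):,.0f} lbs CO2e")
```

The key point the sketch illustrates is that emissions scale linearly with power draw, hardware count, and runtime, which is why longer training runs and repeated tuning passes compound the footprint so quickly.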
They found that the computational and environmental costs of training grew proportionally to model size and then exploded when additional tuning steps were used to increase the model’s final accuracy.
In particular, they found that a tuning process known as neural architecture search, which tries to optimize a model by incrementally tweaking a neural network's design through exhaustive trial and error, had extraordinarily high associated costs for little performance benefit.
Without it, the most costly model, BERT, had a carbon footprint of roughly 1,400 pounds of carbon dioxide equivalent, close to a round-trip trans-America flight for one person.
What’s more, the researchers note that these figures should be treated only as baselines.
In practice, it’s much more likely that AI researchers would develop a new model from scratch or adapt an existing model to a new data set, either of which can require many more rounds of training and tuning.
Needless to say, that’s a substantial toll on the environment.
References: University of Massachusetts Amherst paper; Futurism; The New York Times; MIT Technology Review