Researchers at Google DeepMind have unveiled a new method to speed up AI training, significantly reducing the computational resources and time required to do the job. According to a recent research paper, this new approach to a typically energy-intensive process could make AI development faster and cheaper—and that could be good news for the environment.
“Our approach—multimodal contrastive learning with joint example selection (JEST)—surpasses state-of-the-art models with 13 times fewer iterations and 10 times less computation,” the study said.
The AI industry is notorious for its high energy consumption. Large-scale AI systems such as ChatGPT require massive processing power, which in turn demands a lot of energy and water to cool the underlying hardware. Microsoft's water consumption, for example, reportedly jumped 34 percent from 2021 to 2022 due to increased demand for AI computing, with ChatGPT estimated to consume about half a liter of water for every 5 to 50 prompts.
The International Energy Agency (IEA) projects that data center electricity consumption will double between 2022 and 2026, drawing comparisons between the power demands of AI and the oft-criticized energy profile of the cryptocurrency mining industry.
However, approaches such as JEST may offer a solution. By optimizing the selection of data for AI training, Google said, JEST can significantly reduce the number of necessary iterations and computational power, reducing overall energy consumption. This approach is consistent with efforts to improve the efficiency of AI technologies and reduce their environmental impact.
If this technique proves effective at scale, AI trainers will need only a fraction of the power used to train their models. This means they can either build more powerful AI tools with the same resources they currently use, or use fewer resources to develop new models.
How JEST works
JEST works by selecting complementary batches of data to maximize the learnability of the AI model. Unlike traditional methods that select individual instances, this algorithm considers the composition of the entire set.
For example, imagine you are learning multiple languages. Rather than studying English, German, and Norwegian separately, you may find it more effective to study them together in a way where knowledge of one supports the learning of the others.
Google took a similar approach, and it proved successful.
“We show that jointly selecting batches of data is more efficient for learning than selecting examples independently,” the researchers said in their paper.
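To make that difference concrete, here is a minimal, hypothetical Python sketch—not code from the paper—that contrasts scoring examples independently with scoring a candidate batch as a whole. The pairwise interaction matrix is a made-up stand-in for how examples relate to one another when trained together:

```python
import numpy as np

def joint_learnability(scores: np.ndarray, subset: list[int]) -> float:
    """Score a candidate batch as a whole.

    `scores` is a hypothetical (N x N) matrix for N candidate examples:
    diagonal entries act as per-example scores, off-diagonal entries
    capture how examples interact when placed in the same batch.
    """
    idx = np.array(subset)
    return float(scores[np.ix_(idx, idx)].sum())

def select_batch_independently(scores: np.ndarray, k: int) -> list[int]:
    """Baseline: rank examples one by one and take the top-k."""
    per_example = np.diag(scores)
    return list(np.argsort(per_example)[::-1][:k])

def select_batch_jointly(scores: np.ndarray, k: int) -> list[int]:
    """Greedy joint selection: at each step, add the example that most
    increases the score of the batch as a composition, interactions included."""
    chosen: list[int] = []
    remaining = set(range(scores.shape[0]))
    for _ in range(k):
        best = max(remaining,
                   key=lambda i: joint_learnability(scores, chosen + [i]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Toy usage: 6 candidate examples, batch size 3.
rng = np.random.default_rng(0)
S = rng.normal(size=(6, 6))
S = (S + S.T) / 2  # symmetric interaction matrix
print(select_batch_independently(S, 3))
print(select_batch_jointly(S, 3))
```

The point is simply that joint selection can prefer a batch whose members reinforce one another, even if none of them would top an independent ranking.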
To do this, Google researchers used “multimodal contrastive learning,” in which the JEST process identifies dependencies between data points. This approach improves the speed and efficiency of AI training while requiring far less computing power.
Google noted that the key to the approach was to start with pre-trained reference models to drive the data selection process. This technique allowed the model to focus on high-quality, well-structured datasets, further improving training performance.
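One common way to operationalize this idea—presented here as an illustrative assumption rather than the paper's exact formula—is a “learnability” score that compares the current learner's loss with a pre-trained reference model's loss on the same example:

```python
import numpy as np

def learnability_scores(learner_losses: np.ndarray,
                        reference_losses: np.ndarray) -> np.ndarray:
    """Hypothetical learnability score per example: high when the current
    learner still finds the example hard, low when even the pre-trained
    reference model finds it hard (a sign of noisy or low-quality data)."""
    return learner_losses - reference_losses

# Toy usage: four candidate examples.
learner = np.array([2.1, 0.3, 1.8, 2.5])    # current model's loss per example
reference = np.array([0.4, 0.2, 1.7, 2.4])  # pre-trained reference model's loss
print(learnability_scores(learner, reference))
# Example 0 scores highest: hard for the learner, easy for the reference model.
```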
“The quality of a batch is also a function of its composition, in addition to the abstract quality of its data points considered independently,” explained the paper.
The study's experiments showed tangible performance gains across various benchmarks. For example, training on the common WebLI dataset using JEST showed significant improvements in learning speed and resource efficiency.
The researchers also found that the algorithm quickly discovered the most learnable subsets, speeding up training by focusing on pieces of data that “fit” together, as sketched below. This technique, described as bootstrapping data quality, prioritizes quality over quantity and proved better suited to AI training.
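As a rough sketch of that filtering step—again an illustration under assumptions, not the paper's exact procedure—one can draw a large candidate “super-batch,” score it with the learnability measure sketched above, and keep only the most learnable fraction for the actual training update:

```python
import numpy as np

def filter_super_batch(learner_losses: np.ndarray,
                       reference_losses: np.ndarray,
                       keep_fraction: float = 0.2) -> np.ndarray:
    """Illustrative filtering step: score every candidate in a large pool
    by learnability and return the indices of the most learnable fraction,
    which would then form the batch used for the gradient update."""
    scores = learner_losses - reference_losses
    k = max(1, int(len(scores) * keep_fraction))
    return np.argsort(scores)[::-1][:k]

# Toy usage: a super-batch of 10 candidates, keep the top 20%.
rng = np.random.default_rng(1)
learner = rng.uniform(0, 3, size=10)
reference = rng.uniform(0, 3, size=10)
print(filter_super_batch(learner, reference, keep_fraction=0.2))
```

Because only the kept fraction reaches the expensive training step, fewer iterations and fewer computations are needed to cover the useful parts of the data.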
“A reference model trained on a small curated dataset can effectively guide the curation of a much larger dataset, allowing the training of models that strongly outperform the reference model on many downstream tasks,” the paper said.
Edited by Ryan Ozawa.