Computer scientists have discovered a new way to multiply large matrices faster than ever before by eliminating a previously unknown inefficiency, Quanta Magazine reports. This could ultimately accelerate AI models like ChatGPT, which rely heavily on matrix multiplication to function. The results, presented in two recent papers, represent the largest improvement in matrix multiplication performance in more than a decade.

Multiplying two rectangular arrays of numbers, known as matrix multiplication, plays a crucial role in today’s AI models, including speech and image recognition, chatbots from every major vendor, AI image generators, and video synthesis models such as Sora. Beyond AI, matrix math is so important to modern computing (think image processing and data compression) that even slight gains in efficiency can lead to computational and power savings.

Graphics processing units (GPUs) excel at handling matrix multiplication tasks because of their ability to process many calculations at once. They break large matrix problems into smaller segments and solve them concurrently using specialized algorithms.
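To illustrate that divide-and-conquer idea, here is a minimal pure-Python sketch of block-wise matrix multiplication. The function name and block size are my own for illustration, not from the papers; the point is that each block product is independent of the others and could, in principle, be handed to a separate processing core:

```python
def matmul_blocked(A, B, n, block=2):
    """Multiply two n-by-n matrices (lists of lists) block by block."""
    C = [[0] * n for _ in range(n)]
    for i0 in range(0, n, block):          # block-row of A / C
        for j0 in range(0, n, block):      # block-column of B / C
            for k0 in range(0, n, block):  # inner block dimension
                # Each (i0, j0, k0) sub-problem is independent of the
                # others, which is what makes GPU-style parallelism work.
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, n)):
                        for k in range(k0, min(k0 + block, n)):
                            C[i][j] += A[i][k] * B[k][j]
    return C

print(matmul_blocked([[1, 2], [3, 4]], [[5, 6], [7, 8]], 2))
# [[19, 22], [43, 50]]
```

Real GPU libraries do this with hardware-tuned tile sizes rather than a fixed block of 2, but the decomposition principle is the same.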

Perfecting these algorithms has been key to advances in matrix multiplication performance over the past century, even before computers entered the picture. In October 2022, we covered a new technique discovered by a Google DeepMind AI model called AlphaTensor, which focused on practical algorithmic improvements for specific matrix sizes, such as 4×4 matrices.

By contrast, the new research, conducted by Ran Duan and Renfei Zhou of Tsinghua University, Hongxun Wu of the University of California, Berkeley, and, in a second paper, Virginia Vassilevska Williams, Yinzhan Xu, and Zixuan Xu of the Massachusetts Institute of Technology, seeks theoretical enhancements by aiming to lower the complexity exponent, ω, for broad efficiency gains across matrices of all sizes. Instead of finding immediate, practical solutions like AlphaTensor, the new technique addresses foundational improvements that could transform the efficiency of matrix multiplication on a more general scale.

## Approaching the ideal value

The traditional method for multiplying two n-by-n matrices requires n³ separate multiplications. However, the new technique, which improves upon the “laser method” introduced by Volker Strassen in 1986, has reduced the upper bound of the exponent (denoted as ω above), bringing it closer to the ideal value of 2, which represents the theoretical minimum number of operations needed.
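The n³ figure is easy to see in code. This illustrative sketch (the function name is my own) runs the schoolbook algorithm and counts the scalar multiplications it performs:

```python
def matmul_naive(A, B, n):
    """Schoolbook multiplication of two n-by-n matrices.

    Returns the product and the number of scalar multiplications,
    which is always n**3 for this algorithm."""
    C = [[0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
                mults += 1
    return C, mults

# Multiplying a 3x3 matrix by the 3x3 identity: 27 = 3**3 multiplications.
_, count = matmul_naive([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                        [[1, 0, 0], [0, 1, 0], [0, 0, 1]], 3)
print(count)  # 27
```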

In concrete terms, the traditional method of multiplying two grids full of numbers requires 27 multiplications for a 3×3 grid (3³). With these advancements, the process is accelerated by significantly reducing the required multiplication steps: the number of operations grows only as the grid’s side length raised to the power 2.371552, rather than cubed. This is a big deal because it comes close to the theoretical ideal, in which the work grows only as the square of the side length (an exponent of 2), the fastest anyone could ever hope to do it.
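The classic concrete example of beating the cubic count is Strassen’s original 1969 algorithm, a precursor to his 1986 laser method, which multiplies two 2×2 matrices with 7 scalar multiplications instead of the schoolbook 8. Applied recursively to blocks, it yields an exponent of about 2.807, the first proof that ω is below 3. A sketch of the base case:

```python
def strassen_2x2(A, B):
    """Strassen's 1969 scheme: multiply two 2x2 matrices using only
    7 scalar multiplications (m1..m7) instead of the schoolbook 8."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19, 22], [43, 50]]
```

The laser method and the new papers push far beyond this, but the underlying idea is the same: trade a few multiplications for extra additions, and the savings compound at every level of recursion.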

Here is a brief rundown of events. In 2020, Josh Alman and Williams introduced a significant improvement in matrix multiplication efficiency by establishing a new upper bound for ω at approximately 2.3728596. In November 2023, Duan and Zhou revealed a method that addressed an inefficiency within the laser method, setting a new upper bound for ω at approximately 2.371866. The achievement marked the most substantial progress in the field since 2010. But just two months later, Williams and her team published a second paper detailing optimizations that reduced the upper bound on ω to 2.371552.

The 2023 breakthrough stemmed from the discovery of a “hidden loss” in the laser method, where useful blocks of data were unintentionally discarded. In the context of matrix multiplication, “blocks” refer to the smaller segments a large matrix is divided into for easier processing, and “block labeling” is the technique of categorizing these segments to identify which ones to keep and which to discard, optimizing the multiplication process for speed and efficiency. By modifying the way the laser method labels blocks, the researchers were able to reduce waste and significantly improve efficiency.

While the reduction of the omega constant might appear minor at first glance, cutting the 2020 record value by 0.0013076, the cumulative work of Duan, Zhou, and Williams represents the most substantial progress the field has seen since 2010.
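Some back-of-the-envelope arithmetic shows what shaving 0.0013076 off the exponent means at scale. This is illustrative only: the ω bounds are asymptotic, the matrix size is hypothetical, and the (large) hidden constant factors of these algorithms are ignored:

```python
# Illustrative arithmetic only: omega bounds are asymptotic upper
# bounds, and the constant factors of the algorithms are ignored.
n = 1_000_000                  # a hypothetical matrix dimension
ops_2020 = n ** 2.3728596      # 2020 Alman-Williams exponent
ops_2024 = n ** 2.371552       # 2024 exponent from the second paper
print(ops_2020 / ops_2024)     # ~1.018: roughly a 2% gap at this size
```

Small exponent changes compound with n, which is why theorists care about every digit of ω even when the algorithms themselves are not yet practical.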

“This is a significant technical breakthrough,” said William Kuszmaul, a theoretical computer scientist at Harvard University, as quoted by Quanta Magazine. “This is the biggest improvement in matrix multiplication we’ve seen in more than a decade.”

While further progress is expected, the current approach has limitations. The researchers believe that a deeper understanding of the problem will lead to the development of even better algorithms. As Zhou said in the Quanta report, “People are still in the very early stages of understanding this age-old problem.”

So what are the practical applications? For AI models, the reduction in computational steps for matrix math could translate into faster training times and more efficient execution of tasks. It could allow more complex models to be trained more quickly, potentially leading to advances in AI capabilities and the development of more sophisticated AI applications. Additionally, efficiency gains could make AI technologies more accessible by lowering the computational power and energy consumption these tasks require, which would also reduce AI’s environmental impact.

The exact impact on the speed of AI models depends on the specific architecture of the AI system and how heavily its tasks rely on matrix multiplication. Advances in algorithmic efficiency often need to be coupled with hardware optimization to fully realize the potential speed gains. But still, as improvements in algorithmic techniques add up over time, AI will get faster.