Ethernet Advances Will End Nvidia’s InfiniBand Lead in AI Networks: Gartner • The Register


Three improvements to Ethernet standards will make it a better option for hosting AI workloads, and will see vendors turn back to the tech as an alternative to Nvidia's InfiniBand kit, which is expected to dominate for the next two years.

That's according to a piece published this week by analyst firm Gartner titled "Emerging Tech: Top Trends in Networking for Generative AI." Authored by analyst Anushree Verma, a member of Gartner's Emerging Technologies and Trends Group, the paper predicts that InfiniBand adoption among technology providers such as vendors and clouds will reach about 25 percent by 2026 and then plateau.

Ethernet will achieve the same rate of adoption by providers this year, then accelerate to the point where it is offered by 80 percent of providers within a decade.

This shift by tech providers will mean that by 2028, 45 percent of generative AI workloads will run over Ethernet, up from less than 20 percent now.

The swing will come as Ethernet improves. Gartner currently calls it "not ideal" for AI training, but Verma highlighted three innovations that she believes will make Ethernet a worthy, even superior, contender against InfiniBand:

  • RDMA over Converged Ethernet (RoCE) – will allow direct memory access between devices over Ethernet, improving performance and reducing CPU usage.
  • Lossless Ethernet – will bring advanced flow control, better congestion handling, improved hashing and buffering, and flow telemetry that improves the capabilities of modern switches.
  • The formation of the Ultra Ethernet Consortium (UEC) in 2023 – a body created specifically to develop Ethernet for AI.
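
The lossless-Ethernet idea in the list above can be sketched as a toy model: instead of dropping frames when its buffer fills, a switch port pauses its upstream sender, as priority flow control does, and un-pauses it once the buffer drains. The thresholds, class names, and driver loop here are illustrative assumptions, not taken from any standard.

```python
from collections import deque

XOFF = 8   # pause upstream when the buffer reaches this depth (illustrative)
XON = 4    # resume once the buffer drains back to this depth

class LosslessPort:
    """Toy model of priority-flow-control behavior: the port pauses its
    upstream sender instead of dropping frames when its buffer fills."""
    def __init__(self):
        self.buffer = deque()
        self.paused = False
        self.dropped = 0

    def receive(self, frame):
        if len(self.buffer) >= XOFF:
            # A lossy port would drop here; a lossless one never should,
            # because the sender was already paused before this point.
            self.dropped += 1
            return
        self.buffer.append(frame)
        if len(self.buffer) >= XOFF:
            self.paused = True   # emit PAUSE (XOFF) to the upstream sender

    def drain(self, n):
        for _ in range(min(n, len(self.buffer))):
            self.buffer.popleft()
        if self.paused and len(self.buffer) <= XON:
            self.paused = False  # emit un-pause (XON)

# A well-behaved sender checks the pause state before transmitting.
port = LosslessPort()
sent = 0
for step in range(100):
    if not port.paused:
        port.receive(f"frame-{step}")
        sent += 1
    if step % 3 == 0:        # the port drains slower than frames arrive
        port.drain(1)

print(sent, port.dropped)    # some frames were throttled, none were dropped
```

In a real deployment this pause/resume signalling is carried by per-traffic-class PFC frames, and is typically paired with ECN-based congestion notification so senders slow down before pauses are even needed.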

Because Ethernet is open, Verma expects many suppliers to implement the three innovations above, giving buyers choice and creating competition.

InfiniBand, by contrast, is more expensive than Ethernet and will remain so for five years. Verma believes it has "limitations in scalability and requires specialized skills to manage," which leads some network designers to avoid it for fear of unmanageable complexity.

Yet she predicted that by 2028, 30 percent of generative AI workloads will run on InfiniBand, up from less than 20 percent today.

Growth will be slower for optical interconnects in networks used to carry generative AI traffic. Verma found that less than one percent of networks used for AI workloads employ optical interconnects today, but predicts that figure will increase to 25 percent by 2030.

She warned that even though the tech has major backers, such as Intel, TSMC, and HPE, and requires less power than electrical switching, it won't be widely used until around 2028.

PCIe is also on the rise, and when combined with servers that use it to share memory across the bus under the CXL spec, Gartner expects both to become prevalent in AI workloads.
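
The memory-sharing idea can be illustrated with a small sketch: hosts prefer their fast local DRAM and spill overflow allocations to a shared pool reached across the bus, roughly what CXL memory pooling enables. The classes, sizes, and method names here are hypothetical, chosen for illustration rather than drawn from the CXL spec.

```python
class CXLMemoryPool:
    """Toy model of a shared memory pool reached over the PCIe bus."""
    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.allocated_gb = 0

    def allocate(self, gb):
        if self.allocated_gb + gb > self.capacity_gb:
            raise MemoryError("pool exhausted")
        self.allocated_gb += gb
        return gb

class Host:
    """A server with local DRAM that can borrow from the shared pool."""
    def __init__(self, local_gb, pool):
        self.local_gb = local_gb
        self.used_local_gb = 0
        self.borrowed_gb = 0
        self.pool = pool

    def allocate(self, gb):
        # Prefer fast local DRAM; spill the remainder to the shared pool.
        local = min(gb, self.local_gb - self.used_local_gb)
        self.used_local_gb += local
        remainder = gb - local
        if remainder:
            self.borrowed_gb += self.pool.allocate(remainder)

pool = CXLMemoryPool(capacity_gb=512)
hosts = [Host(local_gb=256, pool=pool) for _ in range(4)]
hosts[0].allocate(300)   # 256 GB local + 44 GB borrowed from the pool
print(hosts[0].borrowed_gb, pool.allocated_gb)  # 44 44
```

The point of the sketch is the economics Gartner alludes to: a workload larger than any one server's DRAM can run without over-provisioning every host, because stranded capacity sits in the pool instead.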

Again, Verma predicts that uptake is a few years away: CXL debuted in early 2023, and she expects serious adoption to begin in 2026, around the same time that PCIe 6.0 is implemented.

Verma recommends customers "evaluate early adoption opportunities to gain a competitive advantage by partnering with leading technology providers at the design stage," and otherwise ensure that they understand the technologies described above.

And for those considering InfiniBand, she wrote, "it will be necessary to re-evaluate networking choices for performance, reliability, scalability and cost by evaluating InfiniBand-based switches versus Ultra Ethernet-based switches." ®

