Michael Cooney
Senior Editor

Major network vendors team to advance Ethernet for scale-up AI networking

News
Oct 14, 2025 | 6 mins
Artificial Intelligence, Data Center, High-Performance Computing

The Ethernet for Scale-Up Networking (ESUN) initiative includes AMD, Arista, ARM, Broadcom, Cisco, HPE Networking, Marvell, Meta, Microsoft, and Nvidia.

As AI networking technology blossoms, yet another group has formed to make sure Ethernet can handle the stress.

AMD, Arista, ARM, Broadcom, Cisco, HPE Networking, Marvell, Meta, Microsoft, Nvidia, OpenAI and Oracle have joined the new Ethernet for Scale-Up Networking (ESUN) initiative, which promises to advance the networking technology to handle scale-up connectivity across accelerated AI infrastructure. ESUN was formed by the nonprofit Open Compute Project, which is hosting its 2025 OCP Global Summit this week in San Jose, Calif.

“AI workloads are re-shaping modern data center architectures, and networking solutions must evolve to meet the growing demands,” wrote Martin Lund, executive vice president of Cisco’s common hardware group, in a blog post about the news. “ESUN brings together AI infrastructure operators and vendors to align on open standards, incorporate best practices, and accelerate innovation in Ethernet solutions for scale-up networking.”

ESUN will focus solely on open, standards-based Ethernet switching and framing for scale-up networking, excluding host-side stacks, non-Ethernet protocols, application-layer solutions, and proprietary technologies. The group will expand the development and interoperability of XPU network interfaces and Ethernet switch ASICs for scale-up networks, the OCP stated in a blog: “The initial focus will be on L2/L3 Ethernet framing and switching, enabling robust, lossless, and error-resilient single-hop and multi-hop topologies.”

Importantly, OCP says ESUN will actively engage with other organizations working to advance Ethernet for AI networks, such as the Ultra Ethernet Consortium (UEC) and the long-standing IEEE 802.3 Ethernet working group, to align open standards, incorporate best practices, and accelerate innovation.

AMD, Arista, Broadcom, Cisco, Eviden, HPE, Intel, Meta and Microsoft originally formed the UEC in 2023 with the goal of bringing together industry leaders to build a complete Ethernet-based communication stack architecture for high-performance networking. The consortium now has more than 75 members.

Another multivendor development group, the Ultra Accelerator Link (UALink) consortium, recently published its first specification aimed at delivering an open standard interconnect for AI clusters. The UALink 200G 1.0 Specification was crafted by many of the group’s 75 members, which include AMD, Broadcom, Cisco, Google, HPE, Intel, Meta, Microsoft and Synopsys. It lays out the technology needed to support a maximum data rate of 200 gigatransfers per second (GT/s) per lane between accelerators and switches, for up to 1,024 accelerators within an AI computing pod, UALink stated.

ESUN will leverage the work of IEEE and UEC for Ethernet when possible, stated Arista’s CEO Jayshree Ullal and chief development officer Hugh Holbrook in a blog post about ESUN. To that end, Ullal and Holbrook described a modular framework for Ethernet scale-up with three key building blocks:

  1. Common Ethernet headers for interoperability: ESUN will build on top of Ethernet to enable the widest range of upper-layer protocols and use cases.
  2. Open Ethernet data link layer: Provides the foundation for AI collectives with high performance at XPU cluster scale, where even minor delays can stall thousands of concurrent operations. By selecting standards-based mechanisms such as Link-Layer Retry (LLR), Priority-based Flow Control (PFC) and Credit-based Flow Control (CBFC), ESUN enables cost-efficiency and flexibility with performance for these networks.
  3. Ethernet PHY layer: Relying on the ubiquitous Ethernet physical layer assures interoperability across multiple vendors and a wide range of optical and copper interconnect options.
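For readers unfamiliar with credit-based flow control, one of the link-layer mechanisms named above, the basic idea can be sketched in a few lines of Python. This is a toy model and not the ESUN, UEC or IEEE specification: the `Sender`, `Receiver` and `simulate` names, the frame-counted credits and the round-based loop are all assumptions made for illustration. The sender starts with one credit per receiver buffer slot, spends a credit for every frame it sends, and waits when it runs out, so the receiver's buffer cannot overflow and no frame is dropped.

```python
from collections import deque


class Receiver:
    """Toy receiver with a fixed-size buffer, counted in frames."""

    def __init__(self, buffer_slots):
        self.buffer_slots = buffer_slots
        self.queue = deque()

    def accept(self, frame):
        # With correct credit accounting this branch is never reached.
        if len(self.queue) >= self.buffer_slots:
            raise OverflowError("frame dropped: receive buffer full")
        self.queue.append(frame)

    def drain(self, count):
        """Consume up to `count` buffered frames; return the slots freed."""
        freed = min(count, len(self.queue))
        for _ in range(freed):
            self.queue.popleft()
        return freed


class Sender:
    """Toy sender that transmits only while it holds credits."""

    def __init__(self, initial_credits):
        self.credits = initial_credits

    def try_send(self, frame, receiver):
        if self.credits == 0:
            return False  # hold the frame rather than risk a drop
        self.credits -= 1
        receiver.accept(frame)
        return True


def simulate(num_frames, buffer_slots, drain_per_round):
    """Run the toy link until every frame is delivered (hypothetical helper)."""
    if buffer_slots < 1 or drain_per_round < 1:
        raise ValueError("buffer_slots and drain_per_round must be >= 1")
    receiver = Receiver(buffer_slots)
    sender = Sender(initial_credits=buffer_slots)  # one credit per slot
    pending = deque(f"frame-{i}" for i in range(num_frames))
    delivered = 0
    stalls = 0
    while pending or receiver.queue:
        # Sender phase: transmit until frames or credits run out.
        while pending and sender.try_send(pending[0], receiver):
            pending.popleft()
        if pending:
            stalls += 1  # sender had to wait for credits this round
        # Receiver phase: free buffer slots and return them as credits.
        freed = receiver.drain(drain_per_round)
        delivered += freed
        sender.credits += freed
    return {"delivered": delivered, "stalls": stalls}


if __name__ == "__main__":
    print(simulate(num_frames=10, buffer_slots=4, drain_per_round=2))
```

In this sketch a slow receiver shows up as sender stalls rather than dropped frames, which is the trade-off a lossless link makes. Real implementations count credits in bytes or cells per virtual channel and return them in-band, details this model leaves out.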

“ESUN is designed to support any upper layer transport, including one based on SUE-T. SUE-T (Scale-Up Ethernet Transport) is a new OCP workstream, seeded by Broadcom’s contribution of SUE (Scale-Up Ethernet) to OCP. SUE-T looks to define functionality that can be easily integrated into an ESUN-based XPU for reliability scheduling, load balancing, and transaction packing, which are critical performance enhancers for some AI workloads,” Ullal and Holbrook wrote.

“In essence, the ESUN framework enables a collection of individual accelerators to become a single, powerful AI super computer, where network performance directly correlates to the speed and efficiency of AI model development and execution,” Ullal and Holbrook wrote. “The layered approach of ESUN and SUE-T over Ethernet promotes innovation without fragmentation. XPU accelerator developers retain flexibility on host-side choices such as access models (push vs. pull, and memory vs streaming semantics), transport reliability (hop-by-hop vs. end-to-end), ordering rules, and congestion control strategies while retaining system design choices. The ESUN initiative takes a practical approach for iterative improvements.”

Gartner expects gains in AI networking fabrics

Scale-up AI fabrics (SAIF) have captured a lot of industry attention lately, according to Gartner. The research firm is forecasting massive growth in SAIF to support AI infrastructure initiatives through 2029. The vendor landscape will remain dynamic over the next two years, with multiple technology ecosystems emerging, Gartner wrote in its report, What are “Scale-Up” AI Fabrics and Why Should I Care?

“‘Scale-Up’ AI fabrics (SAIF) provide high-bandwidth, low-latency physical network interconnectivity and enhanced memory interaction between nearby AI processors,” Gartner wrote. “Current implementations of SAIF are vendor-proprietary platforms, and there are proximity limitations (typically, SAIF is confined to only a rack or row). In most scenarios, Gartner recommends using Ethernet when connecting multiple SAIF systems together. We believe the scale, performance and supportability of Ethernet is optimal.”

“From 2025 through 2027, we expect major shifts in this technology, including traction for Nvidia’s SAIF offering and other SAIF options. As of mid-2025, this technology segment remains dominated by Nvidia, which is evolving and expanding its NVLink technology to partners such as Marvell, Fujitsu, Qualcomm and Astera Labs to directly integrate with Nvidia’s SAIF offering (branded as Nvidia NVLink Fusion),” Gartner stated.

However, competing ecosystems are emerging, including UALink and others. These initiatives create the potential for a multivendor ecosystem, greater flexibility and reduced lock-in, leading to a more competitive environment, Gartner wrote.