Edgecore Networks Demonstrates Up to 35.6% Performance Improvement in Collective Communication Benchmark
Broadcom’s Congestion Aware Sprayed Traffic (CAST) successfully mitigates congestion measured in a cluster of MI300X GPUs interconnected with Broadcom’s Tomahawk 5 switches and Thor 2 400G NIC
HSINCHU, Taiwan, January 06, 2026–(BUSINESS WIRE)–Edgecore Networks, a leading provider of open networking solutions, today announced the results of a joint performance study with its strategic technology partners. The study evaluated Broadcom’s Congestion Aware Sprayed Traffic (CAST) technology running on Edgecore AIS800-64O 800G switches in an AMD MI300X GPU cluster, and measured its impact on collective communication workloads in AI and high-performance computing (HPC) environments.
Broadcom CAST optimizes multi-path communication by dynamically directing traffic based on real-time congestion metrics, specifically round-trip time (RTT). This intelligent traffic distribution significantly improves performance over traditional load balancing methods, particularly in distributed training, inference, and parallel computation.
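To make the idea of RTT-driven traffic distribution concrete, the following is a minimal Python sketch of a generic scheme that sprays packets across several paths and favors those with the lowest smoothed round-trip time. It is an illustration only and does not represent Broadcom's CAST implementation, which is not described in this release. The class name `RttAwareSprayer`, the smoothing factor `alpha`, and the inverse-RTT weighting rule are all assumptions chosen for clarity.

```python
import random


class RttAwareSprayer:
    """Toy model (hypothetical): weight each path by the inverse of its smoothed RTT."""

    def __init__(self, paths, alpha=0.25):
        self.alpha = alpha                      # assumed EWMA smoothing factor
        self.srtt = {p: None for p in paths}    # smoothed RTT per path, in microseconds

    def record_rtt(self, path, rtt_us):
        """Fold a new RTT sample into the path's exponentially weighted moving average."""
        if rtt_us <= 0:
            raise ValueError("RTT sample must be positive")
        prev = self.srtt[path]
        if prev is None:
            self.srtt[path] = float(rtt_us)
        else:
            self.srtt[path] = (1 - self.alpha) * prev + self.alpha * rtt_us

    def weights(self):
        """Return normalized weights; a lower smoothed RTT yields a larger share."""
        known = [v for v in self.srtt.values() if v is not None]
        # Paths without samples are treated as being as good as the best known path.
        default = min(known) if known else 1.0
        inverse = {p: 1.0 / (v if v is not None else default)
                   for p, v in self.srtt.items()}
        total = sum(inverse.values())
        return {p: w / total for p, w in inverse.items()}

    def pick_path(self, rng=random):
        """Choose a path for the next packet at random, according to the weights."""
        w = self.weights()
        paths = list(w)
        return rng.choices(paths, weights=[w[p] for p in paths], k=1)[0]
```

In this sketch, a path whose measured RTT rises (a sign of queuing along that path) automatically receives a smaller share of new packets, whereas static hash-based load balancing would keep sending the same flows down the congested path.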
The study focused on four key RCCL collective operations:
- All-Reduce
- All-Gather
- Reduce-Scatter
- All-To-All
Testing was conducted across three cluster configurations: oversubscribed (2:1), non-blocking (1:1), and undersubscribed (1:2). CAST delivered performance improvements in all three scenarios. Highlights include:
- Oversubscribed (2:1): Up to 26.7% improvement
- Non-blocking (1:1): Up to 35.6% improvement
- Undersubscribed (1:2): Up to 29.8% improvement
The performance diagram is available on the Edgecore website.
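For readers unfamiliar with the ratio notation used for the three cluster configurations, the short Python sketch below shows one conventional way to derive such a ratio: host-facing (downlink) bandwidth divided by fabric-facing (uplink) bandwidth at a leaf switch. The release does not state how the test configurations were built, so this interpretation and the bandwidth figures in the usage note are assumptions for illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def subscription_ratio(downlink_gbps, uplink_gbps):
    """Ratio of host-facing to fabric-facing bandwidth, reduced to lowest terms."""
    return Fraction(downlink_gbps, uplink_gbps)


def classify(ratio):
    """Label a ratio using the terminology from the release."""
    if ratio > 1:
        return "oversubscribed"
    if ratio < 1:
        return "undersubscribed"
    return "non-blocking"


def describe(downlink_gbps, uplink_gbps):
    """Format the ratio as 'N:M label' (bandwidth values here are hypothetical)."""
    r = subscription_ratio(downlink_gbps, uplink_gbps)
    return f"{r.numerator}:{r.denominator} {classify(r)}"
```

Under these assumptions, `describe(3200, 1600)` gives a 2:1 oversubscribed leaf, `describe(1600, 1600)` a 1:1 non-blocking leaf, and `describe(1600, 3200)` a 1:2 undersubscribed leaf.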
Edgecore contributed its expertise in open networking, high-speed Ethernet fabrics, and multi-rail RDMA environments, supporting end-to-end, system-level performance tuning throughout the study. The study used a comprehensive SONiC and Broadcom Thor 2 NIC telemetry and monitoring solution, which provided real-time, granular visibility into configuration, utilization, thermal metrics, and PFC/DCQCN congestion control states for accurate performance validation.
“Mitigating congestion is of vital importance to AI networks,” said Karen Schramm, vice president of Architecture, Data Center Solutions Group, Broadcom. “Enhanced with our CAST technology, Edgecore’s networking solutions effectively limit congestion delivering compelling performance improvements in AI applications as well as rapid link failure recovery without breaking compatibility with RoCEv2 standards — thereby ensuring compatibility with existing hardware.”
