What happened
OpenAI, collaborating with AMD, Broadcom, Intel, Microsoft, and NVIDIA, released Multipath Reliable Connection (MRC), an open-source networking protocol. This new protocol, whose full specification is now available through the Open Compute Project, aims to significantly enhance GPU networking performance and resilience specifically within large-scale AI training clusters. The joint effort by these industry leaders focuses on establishing a common standard for improving the underlying infrastructure critical for advanced AI model development.
Why it matters
MRC's release directly impacts platform engineers and security architects managing large-scale AI infrastructure by standardising network reliability and efficiency. This open specification, following previous hyperscaler efforts to standardise AI interconnects, provides a common framework. It aims to improve training throughput and predictability for frontier models, reducing reliance on proprietary solutions and potentially lowering overall network costs by optimising hardware utilisation.




