MoE DeepEP IBGDA RDMA 2025-10-27
Previewing UCCL-EP: Flexible and Efficient Expert Parallelism for Cloud and Beyond
GPU-driven communication (e.g., DeepEP) is key to efficient, large-scale expert parallelism (EP), but it cannot run on heterogeneous platforms in the public cloud because it tightly couples the GPU and the NIC.