Rethinking Data-Center Network for AI: A Co-designed Architecture with Ramanujan Graphs and Spectral Principles
Abstract
Modern data centers face unprecedented challenges from the complex communication patterns generated by large-scale AI training and novel heterogeneous accelerators. To address these challenges, we present a co-designed data center network (DCN) architecture that integrates topology, routing, and scheduling around a unified spectral principle. The core of our design is a hierarchical network built from Ramanujan subclusters, whose near-optimal expansion properties are exploited in a form suited to practical deployment. At the topology level, we propose a heuristic construction for Ramanujan graphs that supports arbitrary degrees and arbitrary numbers of nodes, together with a scalable subcluster-merging procedure. At the routing level, we introduce a spectral-health-aware routing scheme that combines offline precomputation with lightweight online eigenvalue estimation to maintain performance under congestion and failures. At the scheduling level, we develop a topology-aware scheduling framework that uses spectral-affinity clustering to align distributed training workloads with the underlying network structure. Extensive simulations across diverse network topologies, traffic patterns, and large-scale training scenarios show that the co-designed system consistently achieves lower latency, more balanced link utilization, and improved robustness under challenging conditions, including hotspot traffic and link failures.
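For readers unfamiliar with the spectral property the abstract relies on: a connected d-regular graph is called Ramanujan when every nontrivial adjacency eigenvalue λ satisfies |λ| ≤ 2√(d−1). The sketch below illustrates only that standard definition. It is not the authors' construction or their online estimator; the function names (`petersen_adjacency`, `nontrivial_eigenvalue`, `is_ramanujan`) and the choice of the Petersen graph as test input are assumptions made for illustration, and the eigenvalue is estimated with a plain power iteration, assuming a connected, non-bipartite regular graph.

```python
import math
import random


def petersen_adjacency():
    """Adjacency sets of the Petersen graph (3-regular, 10 nodes)."""
    adj = {v: set() for v in range(10)}
    for i in range(5):
        edges = [
            (i, (i + 1) % 5),          # outer 5-cycle
            (i, i + 5),                # spokes
            (i + 5, (i + 2) % 5 + 5),  # inner pentagram
        ]
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def nontrivial_eigenvalue(adj, iters=200, seed=0):
    """Estimate the largest |eigenvalue| of A other than the trivial one (d).

    Assumes a connected d-regular graph with nodes labelled 0..n-1, so the
    top eigenvector is the all-ones vector and can be projected out.
    Iterating with A applied twice avoids oscillation when +x and -x are
    both eigenvalues.
    """
    n = len(adj)
    rng = random.Random(seed)

    def deflate(v):
        mean = sum(v) / n  # remove the component along the all-ones vector
        return [vi - mean for vi in v]

    def matvec(v):
        return [sum(v[u] for u in adj[i]) for i in range(n)]

    def norm(v):
        return math.sqrt(sum(vi * vi for vi in v))

    x = deflate([rng.uniform(-1.0, 1.0) for _ in range(n)])
    nx = norm(x)
    x = [xi / nx for xi in x]
    estimate = 0.0
    for _ in range(iters):
        y = deflate(matvec(matvec(x)))
        ny = norm(y)               # converges to lambda^2 since |x| = 1
        estimate = math.sqrt(ny)
        x = [yi / ny for yi in y]
    return estimate


def is_ramanujan(adj, tol=1e-9):
    """Check the Ramanujan bound |lambda| <= 2*sqrt(d-1) for a regular graph."""
    degrees = {len(neigh) for neigh in adj.values()}
    if len(degrees) != 1:
        raise ValueError("graph is not regular")
    d = degrees.pop()
    return nontrivial_eigenvalue(adj) <= 2.0 * math.sqrt(d - 1) + tol


if __name__ == "__main__":
    g = petersen_adjacency()
    print(nontrivial_eigenvalue(g))  # close to 2.0; the bound is 2*sqrt(2), about 2.83
    print(is_ramanujan(g))
```

The Petersen graph has adjacency spectrum {3, 1, −2}, so its largest nontrivial magnitude is 2, below the bound 2√2. A smaller nontrivial eigenvalue means a larger spectral gap and therefore better expansion, which is the property the abstract attributes to the Ramanujan subclusters.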
Citation Information
@article{chengma2026,
title={Rethinking Data-Center Network for AI: A Co-designed Architecture with Ramanujan Graphs and Spectral Principles},
author={Cheng Ma and Yuqi Yang and Bo Xiao and Haizheng Xu and Sirui Zhang},
journal={Nature Portfolio},
year={2026},
doi={https://doi.org/10.21203/rs.3.rs-8648659/v1}
}
SinoXiv