Network Architecture Optimization is the engineering discipline of removing the "speed limit" from a supercomputer.
In HPC, processors have become so fast that they spend most of their time waiting for data to arrive from other nodes. If you have 1,000 CPUs but they spend 50% of the time waiting for messages, you effectively only have 500 CPUs.
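The arithmetic behind that claim can be sketched in a few lines of Python. This is a simplified model that assumes every CPU idles for the same fraction of time; the function name `effective_cpus` is hypothetical and used only for illustration.

```python
def effective_cpus(total_cpus: int, wait_fraction: float) -> float:
    """Return how many CPUs' worth of useful work remains when each CPU
    spends `wait_fraction` of its time waiting on the network."""
    if not 0.0 <= wait_fraction <= 1.0:
        raise ValueError("wait_fraction must be between 0 and 1")
    return total_cpus * (1.0 - wait_fraction)


# The example from the text: 1,000 CPUs waiting 50% of the time.
print(effective_cpus(1000, 0.50))  # 500.0
# Cutting the wait to 25% recovers 250 CPUs without buying hardware.
print(effective_cpus(1000, 0.25))  # 750.0
```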
Network Optimization focuses on Latency (how fast a message starts) and Topological Efficiency (how many hops a message takes). It transforms a "connected" cluster into a "tightly coupled" supercomputer.
Below is a detailed breakdown of the optimization strategies, the critical role of RDMA, and the topology choices.
1. The Core Objective: Eliminating "Jitter"
Optimization is not just about buying 400 Gbps cables (bandwidth). It is about ensuring that every message arrives with the same latency, for example 1.2 microseconds, every single time.
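One way to make "jitter" concrete is to compare the average latency of a run with its worst case. The sketch below assumes you already have a list of per-message latencies in microseconds (the sample values are made up for illustration, and `latency_summary` is a hypothetical helper, not part of any benchmark suite).

```python
import statistics


def latency_summary(samples_us):
    """Summarize latency samples (microseconds): mean, p99, max, and spread."""
    ordered = sorted(samples_us)
    n = len(ordered)
    # Nearest-rank 99th percentile, computed with integer arithmetic:
    # rank = ceil(0.99 * n)
    rank = -(-99 * n // 100)
    return {
        "mean_us": statistics.fmean(ordered),
        "p99_us": ordered[rank - 1],
        "max_us": ordered[-1],
        "spread_us": ordered[-1] - ordered[0],  # max minus min
    }


# Illustrative data: 99 messages at 1.2 us and a single 15 us straggler.
samples = [1.2] * 99 + [15.0]
for name, value in latency_summary(samples).items():
    print(f"{name}: {value:.2f}")
```

The mean barely moves when one message is slow, but the maximum and the spread expose it. In a tightly coupled job where every rank waits for the slowest message, the worst case is usually the number worth tracking.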
2. The Protocol: RDMA is Mandatory
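You cannot use standard TCP/IP for high-performance simulation. It requires the OS kernel to process every packet, which adds ~10 microseconds of latency. RDMA (Remote Direct Memory Access) avoids this by letting the network adapter read and write application memory directly, bypassing the kernel.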
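To see why ~10 microseconds per message matters, here is a rough back-of-the-envelope model. It assumes a latency-bound workload that sends small messages one after another with no overlap, uses the 1.2 microsecond figure from section 1 as the RDMA baseline, and adds the ~10 microsecond kernel cost on top for TCP/IP. The constants and the function name are illustrative assumptions, not measurements.

```python
RDMA_LATENCY_US = 1.2       # assumed baseline, taken from section 1
KERNEL_OVERHEAD_US = 10.0   # approximate kernel cost per message for TCP/IP


def communication_time_s(messages: int, per_message_latency_us: float) -> float:
    """Total time in seconds spent on latency for sequential small messages."""
    return messages * per_message_latency_us / 1_000_000


messages = 1_000_000
rdma = communication_time_s(messages, RDMA_LATENCY_US)
tcp = communication_time_s(messages, RDMA_LATENCY_US + KERNEL_OVERHEAD_US)
print(f"RDMA: {rdma:.1f} s, TCP/IP: {tcp:.1f} s")
```

Under these assumptions, a million sequential exchanges cost about 1.2 seconds over RDMA and about 11.2 seconds over TCP/IP, and that gap is paid again on every run.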
3. Topology Optimization (The Shape of the Web)
How you connect the switches determines how scalable the system is.
A. Fat Tree (The Gold Standard)
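As a sizing sketch, the snippet below uses the textbook three-tier k-ary fat tree, built entirely from identical k-port switches. That construction is an assumption here (the article does not specify a particular fat-tree variant), and real deployments often use different radices or oversubscription per tier. The function name `fat_tree_size` is hypothetical.

```python
def fat_tree_size(k: int) -> dict:
    """Size a three-tier k-ary fat tree built from k-port switches.

    The fabric has k pods; each pod has k/2 edge and k/2 aggregation
    switches, and (k/2)^2 core switches connect the pods.
    """
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even number of ports, at least 2")
    half = k // 2
    edge = k * half          # k pods * k/2 edge switches
    aggregation = k * half   # k pods * k/2 aggregation switches
    core = half * half
    return {
        "hosts": edge * half,  # each edge switch serves k/2 hosts
        "edge_switches": edge,
        "aggregation_switches": aggregation,
        "core_switches": core,
        "total_switches": edge + aggregation + core,
    }


print(fat_tree_size(4))   # 16 hosts, 20 switches
print(fat_tree_size(40))  # 16000 hosts, 2000 switches
```

The host count grows with the cube of the switch radix, which is why a modest increase in ports per switch changes how large a cluster the same three tiers can support.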
B. Dragonfly / Dragonfly+
4. Advanced Tuning Techniques
A. Adaptive Routing (AR)
B. SHARP / In-Network Computing
5. Key Tools & Applications
| Category | Tool | Usage |
|---|---|---|
| Diagnostics | Ibnetdiscover / Ibdiagnet | The "MRI" for InfiniBand. It scans the fabric to find links running at 1x speed instead of 4x (bad cables). |
| Benchmarking | OSU Micro-Benchmarks | The standard ruler. Measures Latency (in microseconds) and Bandwidth (in GB/s) between two nodes or all-to-all. |
| Management | UFM (Unified Fabric Manager) | NVIDIA's software brain that watches for "Congestion Spreading" and suggests routing changes. |
| Testing | Netperf / Iperf3 | Basic tools for testing Ethernet/RoCE throughput. |