AI Storage Network Interfaces: The Battle for Speed
AI storage network interfaces are divided into internal and external interfaces. Internal interfaces include the common PCIe, Nvidia's NVLink, AMD's Infinity Fabric, and Intel's Xe Link; these are still at the copper-interconnect stage. A future article can cover the trend toward optical interconnection for internal interfaces. External interfaces include Ethernet, InfiniBand (IB), Fibre Channel (FC), and SAS, all of which have long been interconnected with optical modules. This article discusses how each of these networks is used from the perspective of optical modules.

Note that an optical module transmits data without regard to the network type. The smallest unit of any network technology, the frame, is a sequence of 0s and 1s, and the optical module simply transmits the 0s and 1s that the host signals. When purchasing optical modules, customers therefore only need to pay attention to three things: working rate, package form, and transmission distance. This article covers each network from these three aspects. For ease of memory, all rates mentioned in this article are rounded to integers.
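The three purchasing parameters can be captured in a small record. The sketch below is illustrative only: the `ModuleSpec` class, its field names, and the `fits` helper are hypothetical names introduced here, not part of any vendor tool. It follows this article's simplification that the network type never enters the match; only rate, package, and reach do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleSpec:
    """Hypothetical record of the three parameters a buyer specifies."""
    rate_gbps: int      # working rate, rounded to an integer as in this article
    package: str        # package form, e.g. "QSFP56" or "OSFP"
    reach_m: int        # maximum transmission distance in meters


def fits(module: ModuleSpec, port_rate_gbps: int, port_package: str,
         link_length_m: int) -> bool:
    """Return True if the module suits the host port and the link length.

    The network type (Ethernet, IB, FC, SAS) is deliberately not an input:
    under the article's model the module only forwards the host's bits.
    """
    return (module.rate_gbps == port_rate_gbps
            and module.package == port_package
            and module.reach_m >= link_length_m)


module = ModuleSpec(rate_gbps=200, package="QSFP56", reach_m=100)
print(fits(module, 200, "QSFP56", 70))    # True
print(fits(module, 200, "QSFP56", 500))   # False: link is longer than the reach
```

In practice a buyer may also need to check vendor compatibility coding, which this sketch leaves out.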
1. Ethernet
We have covered the Ethernet protocol in an earlier article, so we will not go into detail here. Currently, the protocols for the highest Ethernet speeds, 800G and 1.6T, are still under development. According to actual usage by Dachengpeng's customers, AI computing currently uses mainly 400G and 200G, while AI storage uses mainly 200G. The 400G package forms are OSFP, QSFP-DD, and QSFP112, and the 200G package forms are QSFP-DD and QSFP56; the choice is determined by the package type of the host interface. Transmission distance: multimode optical modules are used for links up to 100 m over multimode fiber, single-mode optical modules for 500 m and 2 km links over single-mode fiber, passive copper cables (DAC) for up to 3 m inside the cabinet, and active optical cables (AOC) for connecting server/storage hosts to TOR switches within 30 m. Table 1 shows how Ethernet rate types have developed.
| No. | Designation | Rate | Package | Rate × channels |
| --- | --- | --- | --- | --- |
| 1 | FE | 100 Mbps | SFP | 100M × 1 |
| 2 | GE | 1 Gbps | SFP | 1G × 1 |
| 3 | 10GE | 10 Gbps | SFP+ | 10G × 1 |
| 4 | 25GE | 25 Gbps | SFP28 | 25G × 1 |
| 5 | 40GE | 40 Gbps | QSFP+ | 10G × 4 |
| 6 | 50GE | 50 Gbps | SFP56 | 50G × 1 |
| 7 | 100GE | 100 Gbps | QSFP28 | 25G × 4 |
| 8 | 200GE | 200 Gbps | QSFP56 | 50G × 4 |
| | | | QSFP-DD | 25G × 8 |
| 9 | 400GE | 400 Gbps | QSFP-DD | 50G × 8 |
| | | | OSFP | 100G × 4 |
| | | | QSFP112 | 100G × 4 |
Table 1
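The distance rules given above for Ethernet can be written as a simple lookup. This is a minimal sketch, not a procurement rule: the thresholds (3 m, 30 m, 100 m, 500 m, 2 km) come from the text, while the function name `pick_interconnect` and the choice to prefer the shortest-reach option that still covers the link are assumptions made for illustration.

```python
def pick_interconnect(distance_m: float) -> str:
    """Map a link length to the interconnect type described in the text.

    Assumption: the shortest-reach option that covers the link is chosen.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    if distance_m <= 3:
        return "passive copper cable (DAC)"
    if distance_m <= 30:
        return "active optical cable (AOC)"
    if distance_m <= 100:
        return "multimode optical module"
    if distance_m <= 500:
        return "single-mode optical module (500 m)"
    if distance_m <= 2000:
        return "single-mode optical module (2 km)"
    raise ValueError("beyond the distances covered in this article")


for d in (2, 20, 80, 300, 1500):
    print(d, "m ->", pick_interconnect(d))
```

Real deployments also weigh the installed fiber type and cost, which this lookup ignores.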
2. InfiniBand
Currently, the maximum IB network speed is 800G. According to actual usage by Dachengpeng's customers, AI computing currently uses mainly 800G and 400G, while AI storage uses mainly 400G and 200G. For IB, both 800G and 400G are specified with the OSFP package, and 200G is specified with QSFP56. Transmission distance: multimode optical modules are used for links up to 100 m over multimode fiber, single-mode optical modules for 500 m and 2 km links over single-mode fiber, passive copper cables (DAC) for up to 3 m inside the cabinet, and active optical cables (AOC) for connecting server/storage hosts to TOR switches within 30 m. Table 2 shows how IB rate types have developed.
| No. | Rate designation | Rate | Package | Rate × channels |
| --- | --- | --- | --- | --- |
| 1 | SDR | 2.5 Gbps per channel | CX4 | 2.5G × 4 |
| 2 | DDR | 5 Gbps per channel | CX4 | 5G × 4 |
| | | | QSFP+ | 5G × 4 |
| 3 | QDR | 10 Gbps per channel | QSFP+ | 10G × 4 |
| | | | CXP | 10G × 12 |
| 4 | FDR | 14 Gbps per channel | QSFP+ | 14G × 4 |
| | | | CXP | 14G × 12 |
| 5 | EDR | 25 Gbps per channel | QSFP28 | 25G × 4 |
| | | | CXP2 | 25G × 12 |
| 6 | HDR | 50 Gbps per channel | QSFP56 | 50G × 4 |
| 7 | NDR | 100 Gbps per channel | OSFP | 100G × 4 |
| 8 | XDR | 200 Gbps per channel | OSFP | 200G × 4 |
Table 2
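Table 2 lists IB rates per channel, so the port rate has to be worked out as the per-channel rate multiplied by the channel count. The sketch below does that arithmetic for the 4-channel package of each generation, using the rounded figures from Table 2; the names `IB_GENERATIONS`, `aggregate_gbps`, and `PORT_RATES` are illustrative only. The HDR, NDR, and XDR rows work out to the 200G, 400G, and 800G figures quoted in the text.

```python
# Per-channel rate (Gbps, rounded as in Table 2) and channel count for the
# 4-channel package of each InfiniBand generation.
IB_GENERATIONS = {
    "SDR": (2.5, 4),
    "DDR": (5, 4),
    "QDR": (10, 4),
    "FDR": (14, 4),
    "EDR": (25, 4),
    "HDR": (50, 4),
    "NDR": (100, 4),
    "XDR": (200, 4),
}


def aggregate_gbps(channel_rate_gbps: float, channels: int) -> float:
    """Port rate is the per-channel rate times the number of channels."""
    return channel_rate_gbps * channels


PORT_RATES = {
    name: aggregate_gbps(rate, channels)
    for name, (rate, channels) in IB_GENERATIONS.items()
}

for name, total in PORT_RATES.items():
    print(f"{name}: {total:g} Gbps")
```

The same multiplication applies to the 12-channel CXP entries, for example 10G × 12 for QDR.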
3. Fibre Channel
According to the earlier roadmap, the 8th-generation 128GFC standard should have been finalized in 2021, but it has been delayed repeatedly and had still not been released as of 2024. The current maximum speed is the 7th-generation 64GFC, finalized in 2018, but its adoption remains very slow. According to feedback from Dachengpeng's customers, AI storage currently does not use FC networks; mainstream domestic equipment is still at 16GFC and 32GFC, speeds that cannot keep up with the demands of AI computing. The package used for 16GFC and 32GFC is SFP28. Transmission distance: mainly multimode optical modules, for links up to 100 m over multimode fiber. Table 3 shows how FC rate types have developed.
| No. | Designation | Rate | Package | Rate × channels |
| --- | --- | --- | --- | --- |
| 1 | 1GFC | 1 Gbps | SFP | 1G × 1 |
| 2 | 2GFC | 2 Gbps | SFP | 2G × 1 |
| 3 | 4GFC | 4 Gbps | SFP | 4G × 1 |
| 4 | 8GFC | 8 Gbps | SFP+ | 8G × 1 |
| 5 | 16GFC | 14 Gbps | SFP+ | 14G × 1 |
| 6 | 32GFC | 28 Gbps | SFP28 | 28G × 1 |
| 7 | 64GFC | 56 Gbps | QSFP+ | 14G × 4 |
Table 3
4. SAS
SAS can serve as either an internal or an external interface for storage devices. The latest SAS specification is SAS-4, finalized in 2017 and also known as 24G SAS. According to the original roadmap, the next step should have been SAS-5 with doubled bandwidth, namely 48G SAS. However, at the end of 2023 the SCSI Trade Association (STA), which is responsible for formulating the specifications, proposed a plan that departs from the original roadmap: it abandoned the bandwidth-doubling 48G SAS route and instead keeps the existing 24G SAS physical layer, pairing it with upper-layer protocol enhancements to reliability, security, and efficiency under the name 24G+ SAS. From the standpoint of optical module transmission, therefore, there is no difference in data transmission between 24G+ SAS and 24G SAS (SAS-4). SAS uses the Mini SAS package and mainly takes the product form of active optical cables (AOC) and passive copper cables (DAC). Table 4 shows how SAS rate types have developed.

| No. | Designation | Rate | Package | Rate × channels |
| --- | --- | --- | --- | --- |
| 1 | SAS-2 | 6 Gbps per channel | Mini SAS | 6G × 4 |
| 2 | SAS-3 | 12 Gbps per channel | Mini SAS | 12G × 4 |
| 3 | SAS-4 (24G SAS) | 24 Gbps per channel | Mini SAS | 24G × 4 |
| 4 | 24G+ SAS | 24 Gbps per channel | Mini SAS | 24G × 4 |
Table 4
From the above, it is clear that in the field of AI storage, FC and SAS network interfaces lag far behind Ethernet and IB, so the competition today is directly between Ethernet and IB. IB, with its excellent RDMA capabilities, currently holds the technological lead. In response to the rise of IB, the Ethernet camp formulated the RoCE protocol, which ports InfiniBand's RDMA transport architecture to Ethernet so that RoCE combines the low latency of RDMA with the low cost of Ethernet. However, RoCE's RDMA still has a series of shortcomings in areas such as congestion control and load balancing. The Ultra Ethernet Consortium (UEC), established last year, therefore aims to formulate a new protocol to replace the existing RoCE protocol, improving congestion management through a new transport layer and reducing latency. The consortium plans to release its new standard in the third quarter of this year.