Case study

Fujian Telecom assesses the benefits of remote direct memory access over 25GE for their cloud storage

Fujian Telecom Company Limited, founded in 2003 and based in Fuzhou, China, is now part of China Telecom Corp. Ltd. Fujian provides fixed telephony, data communications, network elements and Internet services, and now plays a leading role in the emerging narrowband IoT (NB-IoT) sector. NB-IoT connects large populations of small devices such as sensors and smart meters to serve utilities and applications across smart cities, agriculture, water quality monitoring and other large-scale deployments.

Fujian had been relying on a traditional enterprise database for several years and, like most service providers, had been struggling with software and hardware performance issues. Most pressing was the challenge posed by the ever-increasing amount of data to be processed and the poor performance resulting from low-speed, legacy storage.

Instead of looking for another closed, black-box solution, Fujian wanted an open source networking infrastructure that would give them greater flexibility, agility and higher performance. The company also decided to migrate from 10Gbps to 25Gbps Ethernet (25GE).

25GE is a new standard based on technologies defined for 100G Ethernet, which is itself implemented as four 25Gbps channels running on four fiber or copper cable pairs. 25GE has significant advantages over 10GE, including excellent scalability. It offers higher bandwidth in a single channel at lower cost and lower power consumption, allowing superior server and switch port density. 25GE solutions are also backward and forward compatible with 10GE, 50GE, 100GE, and future 200GE and 400GE.
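The channel arithmetic behind that description can be sketched in a few lines of Python. This is only an illustration: the function and constant names are hypothetical, and the only inputs are the figures stated above (100GE as four 25Gbps channels, 25GE and 10GE as single channels).

```python
# Per-channel rates in Gb/s, taken from the description above.
CHANNEL_GBPS_25G = 25
CHANNEL_GBPS_10G = 10


def port_rate_gbps(channels, channel_gbps):
    """Aggregate port rate for a number of equal-speed channels (hypothetical helper)."""
    return channels * channel_gbps


rate_100ge = port_rate_gbps(4, CHANNEL_GBPS_25G)  # four 25Gbps channels
rate_25ge = port_rate_gbps(1, CHANNEL_GBPS_25G)   # one 25Gbps channel
rate_10ge = port_rate_gbps(1, CHANNEL_GBPS_10G)   # one 10Gbps channel

# Bandwidth gained per channel when moving from 10GE to 25GE.
per_channel_gain = rate_25ge / rate_10ge

print(f"100GE = 4 x {CHANNEL_GBPS_25G} = {rate_100ge} Gb/s")
print(f"25GE carries {per_channel_gain}x the bandwidth of 10GE on one channel")
```

Because each port uses one channel in both cases, the extra bandwidth does not require extra cabling per port, which is where the port density advantage comes from.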

Choosing partner and solution

After weighing up its options, Fujian sought advice from Mellanox Technologies, a major supplier of end-to-end Ethernet intelligent interconnect solutions for servers, storage, and hyper-converged infrastructures. In the first eight months of this year Mellanox shipped over 2 million 25G and higher speed Ethernet adapters worldwide, making it a natural first choice.

Mellanox, with experience of supplying global cloud giants’ data centers, suggested a creative solution: using remote direct memory access (RDMA) over Converged Ethernet (RoCE) to build a cloud storage platform. On its own, 25GE’s high-bandwidth interconnection would have imposed a heavy network request processing load on the CPU, but the adoption of smart adapters and flow control technology such as RoCE would greatly reduce that burden. RDMA gives a network adapter direct access to an application buffer, bypassing the kernel, CPU and protocol stack and freeing the CPU to process more application tasks during I/O transmission. Better server performance enables a significant increase in application workload, making full use of the higher-speed Ethernet.

The storage challenge

System performance depends on CPU and network, but equally on storage performance. Wikibon predicts continuing steady growth in the overall storage market. However, the market share of Hyper-Scale Server SAN, enterprise-level Server SAN and other distributed storage is expected to exceed 80% in 2021 and 90% in 2026, while the decline of traditional SAN/NAS storage is accelerating.

The challenge for distributed storage is to optimize remote access performance. In terms of protocol, RoCE-based access technologies are suitable. In network terms, the emphasis is on better utilization of the network adapters’ offload characteristics, combined with the low latency, zero packet loss and advanced flow control mechanisms of Ethernet switching.

In terms of storage medium, NVMe SSD has better random read-write performance, and its U.2 interface supports hot plug and can be used with RAID to meet storage’s high reliability requirement. To provide the right blend of capacity and speed, high-speed NVMe SSD serves as the cache, with high-capacity hard disks holding masses of cold data. Since reading and writing a single NVMe SSD requires bandwidth exceeding 20Gb/s, the chosen 25GE network is the minimum requirement.
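The bandwidth reasoning can be checked with a short Python sketch. It is illustrative only: the 20Gb/s figure is the lower bound quoted above, the helper names are hypothetical, and protocol overhead on the link is ignored, so real headroom would be somewhat smaller.

```python
# Lower bound on the bandwidth one NVMe SSD needs, as quoted in the text (Gb/s).
SSD_MIN_GBPS = 20.0

# Nominal line rates of the two networks compared in this case study (Gb/s).
LINK_RATES_GBPS = {"10GE": 10.0, "25GE": 25.0}


def link_can_feed_ssd(link_gbps, ssd_gbps=SSD_MIN_GBPS):
    """True if the nominal link rate exceeds what a single SSD needs."""
    return link_gbps > ssd_gbps


def gbps_to_gigabytes_per_second(gbps):
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbps / 8.0


for name, rate in LINK_RATES_GBPS.items():
    verdict = "sufficient" if link_can_feed_ssd(rate) else "a bottleneck"
    print(f"{name}: {rate} Gb/s "
          f"({gbps_to_gigabytes_per_second(rate)} GB/s) is {verdict} "
          f"for one SSD needing more than {SSD_MIN_GBPS} Gb/s")
```

Under these assumptions a 10GE link supplies at most half of what one SSD can consume, while a 25GE link clears the threshold with a few Gb/s to spare.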

In a distributed architecture it is also very important to ensure data integrity. The right combination of switch and adapter is essential to achieve end-to-end flow control. Mellanox’s intelligent network solutions provided appropriate high bandwidth and low latency, while the recommended RDMA/RoCE technology would accelerate data transmission in an efficient lossless network. See Figure 1.

The resulting performance

Fujian Telecom measured the resulting performance upgrade with 8 computing nodes. In the key 4K random read-write test, the throughput of the 25GE RoCE network is about 9 to 14 times that of the 10GE network (Figure 2), while the latency is only 5 to 10% of that of the 10GE network (Figure 3).

In the same environment, the sequential read-write bandwidth of the 25GE network is more than 2.2 times that of the 10GE network, and the latency is again only about 5% of that of the 10GE network.
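To put the reported ratios in perspective, the following Python sketch converts the latency percentages into reduction factors and compares the 4K random throughput gain with the 2.5x increase in nominal line rate. The input numbers are the ones reported above; the function names are hypothetical, and the comparison with line rate is an illustration, not part of Fujian Telecom's own analysis.

```python
# Nominal line-rate increase when moving from 10GE to 25GE.
LINE_RATE_GAIN = 25 / 10


def latency_reduction_factor(percent_of_baseline):
    """Turn 'latency is N% of the 10GE latency' into an 'N times lower' factor."""
    if not 0 < percent_of_baseline <= 100:
        raise ValueError("percentage must be in the range (0, 100]")
    return 100 / percent_of_baseline


def gain_beyond_line_rate(throughput_gain, line_rate_gain=LINE_RATE_GAIN):
    """Throughput gain left over after dividing out the faster line rate."""
    return throughput_gain / line_rate_gain


# 4K random read-write: latency is 5 to 10% of the 10GE figure.
print("4K random latency reduction:",
      latency_reduction_factor(10), "to", latency_reduction_factor(5), "times")

# 4K random read-write: throughput is 9 to 14 times the 10GE figure.
print("4K random gain beyond line rate:",
      gain_beyond_line_rate(9), "to", gain_beyond_line_rate(14), "times")
```

A latency of 5 to 10% of the baseline corresponds to a 10- to 20-fold reduction. The 4K random throughput gain also exceeds the 2.5x line-rate increase several times over, which is consistent with the improvement coming largely from RDMA offload and lower latency rather than from raw bandwidth alone.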

Conclusion

For high-performance distributed storage, the 25GE network’s higher throughput and reduced latency have obvious advantages compared with traditional 10GE. These results confirm the feasibility, stability and high performance of distributed storage in a core production deployment.

Further work can address ways to optimize the fit between distributed storage and specific applications. Another project could be to explore RoCE technology’s potential to improve other aspects of the telecommunication network.