Jianxin Xiong, Intel Corp.
Jianxin Xiong is a Software Engineer at Intel. Over the past 15+ years he has worked on various layers of the interconnect software stack, including RDMA drivers in the Linux kernel, RDMA device virtualization, Open Fabric Interface, DAPL, the Tag Matching Interface, and Intel MPI. His current focus is GPU/accelerator scale-out with RDMA devices.
Discrete GPUs are widely used in systems for high-performance data-parallel computation. Scale-out configurations of such systems often include RDMA-capable NICs to provide high-bandwidth, low-latency inter-node communication. On the PCIe bus, the GPU appears as a peer device of the NIC, and extra steps are needed to set up GPU memory for RDMA operations. Proprietary solutions such as Peer-Direct from Mellanox have existed for this purpose for a while. However, direct use of GPU memory in RDMA operations (a.k.a. GPUDirect RDMA) is still unsupported by the upstream RDMA drivers. Dma-buf is a standard mechanism in the Linux kernel for sharing buffers for DMA access across different device drivers and subsystems. This talk presents a prototype that uses dma-buf to enable peer-to-peer DMA between the NIC and GPU memory. The required changes in the kernel RDMA driver, the user-space RDMA core libraries, and the Open Fabric Interface library (libfabric) are discussed in detail. The goal is to provide a non-proprietary approach to direct RDMA to and from GPU memory.
Video recording published on the insideHPC Report channel, 22 June 2020.