Kenneth Cain, Intel Corp.
Ken Cain is a software developer in Intel’s HPC organization in the Cloud and Enterprise Solutions Group. He is contributing to the development of DAOS, a very large-scale distributed storage solution. His experience in high performance networking includes switch and fabric management software, high performance host interfaces and communication middleware, and spans HPC, traditional Ethernet, and embedded systems interconnects. Before joining Intel, he contributed on teams in both research and commercial systems provider organizations.
The Intel team developing the Distributed Asynchronous Object Storage (DAOS) system proposes to present to, and engage in discussion with, the OFA community of administrators, developers, and technology providers. The DAOS architecture and implementation will be presented by exploring its interaction with fabric hardware and software (e.g., libfabric) and its role in delivering, together with storage-class persistent memory and NVMe SSDs, very high-performance scale-out object storage. Feedback from building this service using libfabric and potential opportunities to further leverage the fabric will be discussed. Areas of potential interest to this audience may include: increasing diversity of I/O patterns (e.g., large volumes of random reads/writes), checkpoint/restart snapshots, and producer/consumer flows; small I/O and metadata handling using fabric and persistent memory; bulk data handling with fabric, persistent memory, and SSDs; collective communication for scalability and efficient dissemination/retrieval of metadata; dynamic contraction/expansion of storage servers; data scaling for performance; data replication, erasure coding, and fault domain awareness for resilience; metadata service resilience and protocols/fabric communications; user-space design for storage (PMDK, SPDK) and fabric (OFI) interactions; asynchronous data and metadata operations and progress; online data rebuild when a storage node/target is lost or removed, and rebalancing when one is added; integration with Lustre and a unified namespace; and POSIX filesystem emulation, MPI-I/O, and HDF5 over DAOS.
Video: Distributed Asynchronous Object Storage (DAOS), uploaded to the insideHPC Report channel on 22 June 2020.