Offloading frees host resources for application work. Traditional offloading has focused mostly on either plumbing features (TSO, checksum, etc.) or limited L4 features (TC offloads, etc.). Offloading the TCP machinery itself, famously known as TOE, has been frowned upon for many reasons[1]. Ryo Nakamura and Hajime Tazaki feel they can address these outstanding issues transparently: the application on the host has no idea that the rest of the stack is outsourced somewhere else. How, you ask?
On the host side, they intercept system calls made by the application and transparently route them to a syscall proxy sitting on an xPU. The syscall proxy, a lightweight user-space program, interfaces northbound with the host and southbound with the Linux Kernel Library (LKL)[2]. LKL is optimized to improve packet processing code paths. Our esteemed speakers have implemented this design on the NVIDIA BlueField DPU.
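For readers who want a feel for the split, here is a toy sketch of the idea, not the speakers' implementation. Everything in it is an assumption made for illustration: the names HostShim, proxy_loop and ToyStack are hypothetical, a local socketpair stands in for the host-to-xPU channel, a tiny in-memory class stands in for LKL, and only the payload length (not the data) crosses the channel to keep things short.

```python
import json
import socket
import struct
import threading


def send_msg(sock, obj):
    """Send one length-prefixed JSON message."""
    data = json.dumps(obj).encode()
    sock.sendall(struct.pack("!I", len(data)) + data)


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closed the channel."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length))


class ToyStack:
    """Hypothetical stand-in for the network stack behind the proxy."""

    def __init__(self):
        self.next_fd = 3
        self.sent = {}  # fd -> total bytes "transmitted"

    def socket(self):
        fd = self.next_fd
        self.next_fd += 1
        self.sent[fd] = 0
        return fd

    def send(self, fd, nbytes):
        self.sent[fd] += nbytes
        return nbytes


def proxy_loop(channel, stack):
    """xPU side: receive forwarded calls, run them on the stack, reply."""
    handlers = {"socket": stack.socket, "send": stack.send}
    while True:
        try:
            req = recv_msg(channel)
        except ConnectionError:
            channel.close()
            return
        ret = handlers[req["call"]](*req["args"])
        send_msg(channel, {"ret": ret})


class HostShim:
    """Host side: looks like a local API, but forwards every call."""

    def __init__(self, channel):
        self.channel = channel

    def _forward(self, call, *args):
        send_msg(self.channel, {"call": call, "args": list(args)})
        return recv_msg(self.channel)["ret"]

    def socket(self):
        return self._forward("socket")

    def send(self, fd, payload):
        # Toy simplification: only the length crosses the channel.
        return self._forward("send", fd, len(payload))


def demo():
    host_end, xpu_end = socket.socketpair()
    stack = ToyStack()
    proxy = threading.Thread(target=proxy_loop, args=(xpu_end, stack))
    proxy.start()
    shim = HostShim(host_end)
    fd = shim.socket()
    sent = shim.send(fd, b"hello")
    host_end.close()  # closing the channel stops the proxy loop
    proxy.join()
    return fd, sent, stack.sent[fd]


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only the shape: the caller of HostShim never sees that the work happened on the other end of the channel. How the real system intercepts syscalls and moves data between host and DPU is what the talk covers.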
In this talk they will present the design of the offload and discuss the results of performance tests on several workloads: 1) a typical TCP transmission flow with existing offloading features (TSO/LRO, checksum), 2) avoiding data copies between host and NIC with the sendfile syscall, and 3) kTLS offload with the sendfile syscall (not yet optimized, as there is no hardware crypto offloading) using existing implementations (nginx and OpenSSL). Last, but not least, they will briefly discuss opportunities for future improvements, including further offload (crypto, a larger segmentation size for TCP, etc.), their RDMA channel implementation, and the choice of passthrough method.
[1] https://wiki.linuxfoundation.org/networking/toe
[2] https://lkl.github.io/
cheers, jamal