Willem de Bruijn says timestamping, via SO_TIMESTAMPING, is key to debugging network stack latency. Instead of gut-feel finger-pointing between the network and kernel tribes, we can get down to the facts of where the latency really is. SO_TIMESTAMPING can isolate transmission, reception and even scheduling as sources of delay. Capturing connection state along with timestamps further enables root cause discovery, such as pinpointing TCP receive window size as the culprit. Capturing timestamps at more points, such as traffic shaping and NIC hardware, expands visibility into tough issues like incast. SO_TIMESTAMPING has seen iterative development to enable fleetwide RPC monitoring at Google, which presented details of this infra, called "Fathom", at SIGCOMM 2023[1]. In this talk Willem will take us beyond where that SIGCOMM paper ends, with a deep dive into the Linux kernel infrastructure that makes fleetwide continuous latency analysis and attribution possible.
API extensions include covering TCP bytestreams, capturing transport protocol state along with events (OPT_STATS), and supporting selective sampling (OPT_CMSG). The talk reviews the core SO_TIMESTAMPING API, discusses non-obvious extensions (MSG_EOR, SO_RCVLOWAT), summarizes gotchas from the field (OPT_ID_TCP), and explains how all this combines to enable robust continuous RPC monitoring. It touches on clock synchronization and precision. Finally, it compares this UAPI to dynamic tracing with uprobes, kprobes, tracepoints and BPF.
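For readers who have not used the core API yet, here is a minimal sketch of what turning on transmit timestamping looks like from userspace. It is not taken from the talk or from Fathom. The helper names tx_tstamp_flags() and enable_tx_timestamps() are made up for illustration, and the particular flag combination is just one plausible choice: software timestamps at the scheduler, driver handoff and ACK points, with OPT_ID to match timestamps to sends and OPT_TSONLY to skip looping back the payload.

```c
#include <linux/net_tstamp.h>   /* SOF_TIMESTAMPING_* flag definitions */
#include <sys/socket.h>         /* setsockopt(), SOL_SOCKET */

#ifndef SO_TIMESTAMPING
#define SO_TIMESTAMPING 37      /* assumption: asm-generic value, for older libc headers */
#endif

/* Hypothetical helper: one possible flag mask for software TX timestamps. */
static unsigned int tx_tstamp_flags(void)
{
    return SOF_TIMESTAMPING_TX_SCHED |     /* stamp when entering the packet scheduler */
           SOF_TIMESTAMPING_TX_SOFTWARE |  /* stamp when handed to the device driver */
           SOF_TIMESTAMPING_TX_ACK |       /* stamp when the data is acknowledged (TCP) */
           SOF_TIMESTAMPING_SOFTWARE |     /* report software timestamps to userspace */
           SOF_TIMESTAMPING_OPT_ID |       /* tag each timestamp with an identifier */
           SOF_TIMESTAMPING_OPT_TSONLY;    /* return the timestamp without the packet payload */
}

/* Hypothetical helper: apply the mask to a socket. Returns 0 on success,
 * -1 with errno set on failure, exactly as setsockopt() does. */
static int enable_tx_timestamps(int fd)
{
    unsigned int flags = tx_tstamp_flags();

    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}
```

Once enabled, the timestamps come back as control messages on the socket error queue, read with recvmsg() and MSG_ERRQUEUE. The extensions the talk covers (OPT_STATS, OPT_ID_TCP and friends) are further bits in the same mask.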
Should be fun, can't wait to attend this talk!
[1] search for "fathom" at https://conferences.sigcomm.org/sigcomm/2023/program.html
cheers, jamal