Over the last few years there has been an ongoing community effort to
get zero-copy working, with much success.
While both MSG_ZEROCOPY (zero-copy transmit) and socket mmap (zero-copy
receive) have been out in the wild for some time now, there are no known
high-scale open source application consumers of these interfaces.
In this talk, Or Gerlitz will describe how he uses these interfaces to
integrate with SPDK (https://spdk.io/), an open source storage framework
which uses sockets for NVMe-over-TCP in a smart NIC environment. The
goal is to use kernel uAPIs while achieving high-scale performance.
Or will delve into MSG_ZEROCOPY and the challenges that he had to
deal with. He will describe why an app author needs to understand these
interfaces and how best to tie the app state machine into them, so that
it handles both transaction responses from the peer app and zero-copy
notifications from the local socket provider. Should the app use ZC with
an all-or-nothing approach, or only some of the time?
Come to the talk to get the answer, and advice on how to use these
interfaces effectively.
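To make the state-machine point concrete, here is a minimal sketch in C.
It is not taken from the talk or from SPDK. It assumes the documented
MSG_ZEROCOPY behaviour that each successful zero-copy send gets the next
value of a per-socket counter starting at 0, and that the kernel later
reports an inclusive range of completed counters on the socket error
queue. The zc_* names and MAX_INFLIGHT are hypothetical. A buffer is
released only once both the peer has responded and the kernel has
finished with the pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_INFLIGHT 64u /* hypothetical cap on outstanding zero-copy sends */

struct zc_request {
    bool in_use;
    bool peer_acked;   /* transaction response seen from the peer app */
    bool zc_completed; /* kernel no longer references the buffer */
    uint32_t id;       /* position of the send in the per-socket sequence */
};

struct zc_tracker {
    struct zc_request reqs[MAX_INFLIGHT];
    uint32_t next_id;
};

/* Record one successful zero-copy send; returns its id, or -1 if full. */
static int64_t zc_track_send(struct zc_tracker *t)
{
    struct zc_request *r = &t->reqs[t->next_id % MAX_INFLIGHT];

    if (r->in_use)
        return -1;
    r->in_use = true;
    r->peer_acked = false;
    r->zc_completed = false;
    r->id = t->next_id;
    return (int64_t)t->next_id++;
}

/* The buffer may be reused only when both events have arrived. */
static bool zc_try_release(struct zc_request *r)
{
    if (r->in_use && r->peer_acked && r->zc_completed) {
        r->in_use = false;
        return true;
    }
    return false;
}

/* Peer answered request `id`; returns true if the buffer was released. */
static bool zc_on_peer_response(struct zc_tracker *t, uint32_t id)
{
    struct zc_request *r = &t->reqs[id % MAX_INFLIGHT];

    if (!r->in_use || r->id != id)
        return false;
    r->peer_acked = true;
    return zc_try_release(r);
}

/* Kernel reported the inclusive range [lo, hi]; returns buffers released. */
static unsigned zc_on_notification(struct zc_tracker *t, uint32_t lo, uint32_t hi)
{
    uint32_t count = hi - lo + 1u; /* unsigned math survives counter wrap */
    unsigned released = 0;

    assert(count >= 1 && count <= MAX_INFLIGHT);
    for (uint32_t i = 0; i < count; i++) {
        uint32_t id = lo + i;
        struct zc_request *r = &t->reqs[id % MAX_INFLIGHT];

        if (!r->in_use || r->id != id)
            continue;
        r->zc_completed = true;
        if (zc_try_release(r))
            released++;
    }
    return released;
}
```

In a real application the (lo, hi) pair would come from the ee_info and
ee_data fields of the sock_extended_err read with
recvmsg(MSG_ERRQUEUE). The ids only line up with the kernel counter if
every tracked send actually used MSG_ZEROCOPY and succeeded.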
In addition, Or will go into the details of the performance analysis,
correlating I/O performance with CPU cost (which is often tricky to get
right with traditional tools such as profilers and flame graphs).
More info:
https://netdevconf.info/0x14/session.html?talk-storage-application-performa…
Reminder, registration is now open and early bird is still in effect.
https://netdevconf.info/0x14/registration.html
cheers,
jamal
MRP (Media Redundancy Protocol) is an open standard for ring topologies
in industrial Ethernet networks, defined in IEC 62439-2. In an
MRP-enabled network each Ethernet switch is connected to two other
switches, forming a ring. An MRP-enabled ring can recover from a single
link failure with a worst-case recovery time of 30ms, which is faster
than STP.
In this talk Horatiu Vultur will describe the MRP protocol in
some detail. Horatiu will then discuss the effort to add support
to the kernel: the different implementation approaches considered and
the implementation path eventually taken after receiving feedback on the
mailing list. Last but not least, future work will be discussed,
including hardware offload of MRP as well as preliminary results
comparing hardware-offloaded MRP with the non-offloaded version.
More info:
https://netdevconf.info/0x14/session.html?talk-adding-MRP-to-the-linux-kern…
cheers,
jamal
RPL is an IPv6 Routing Protocol for Low-Power and Lossy Networks defined
in RFC 6550. RFC 6550 defines a mode that is known as "storing" mode.
RFC 6554 defines the IPv6 Routing Header used for route propagation in
"non-storing" mode.
While there are several other RPL open source implementations, they
all implement RPL using the "storing" mode which propagates routes
via ICMPv6. There are no open source "non-storing" mode implementations.
In this talk, Alexander Aring et al will discuss an implementation
that uses "non-storing" mode (RFC 6554).
In the non-storing mode (per RFC 6554), route propagation is done via
the IPv6 Routing Header.
Aring et al discuss the architecture, interface approach, challenges
that they overcame and outstanding future work.
More info:
https://netdevconf.info/0x14/session.html?talk-extend-segment-routing-for-R…
cheers,
jamal
Operations, Administration, and Maintenance (OAM)
refers to a set of techniques and mechanisms for performing
fault detection, isolation and performance measurements.
Classical approaches such as traceroute, ping, etc. can now be
improved by collecting more granular and precise per-packet telemetry.
The IETF is currently in the process of standardizing In-situ OAM (IOAM)
to allow collecting operational information along a path.
In this talk Justin Iurman et al discuss an implementation of IOAM
for the Linux kernel with IPv6 as the encapsulation protocol.
They will discuss the details of their approach and demonstrate
evaluation results.
More info:
https://netdevconf.info/0x14/session.html?talk-implementation-of-IPv6-IOAM-…
cheers,
jamal
The PC has accepted a new workshop session.
Donald Sharp and David Lamparter will chair the FRRouting
(https://frrouting.org/) workshop.
Current agenda for the workshop includes:
* Where to put kernel boundary conventions
* Current status of netconf/yang models
* Status of using kernel nexthop groups in FRR
* FRR feature rundown: recent past and future work
* Installing FRR for kernel developers
For more details:
https://netdevconf.info/0x14/session.html?workshop-FRR
cheers,
jamal
Hierarchical bandwidth management is a very important packet
service for a lot of networking use cases (ranging from large data
centres to service providers).
Over the last decade, the TC Hierarchical Token Bucket (HTB) qdisc
has emerged as the most popular non-work-conserving queueing discipline
for enabling this service in Linux.
HTB is quite flexible and versatile, but at large scale
(think thousands to millions of flows) it comes at a cost:
1) CPU cycles, predominantly due to stalls caused by contention on the
shared queue lock
2) extensive memory costs when adding many flows.
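For readers less familiar with the idea, the following is a deliberately
simplified C sketch of a hierarchical token bucket. It is not the
kernel's HTB implementation: the tb_* names are hypothetical, there is a
single rate per class instead of separate rate and ceil, and a child
that runs out of tokens simply borrows from the nearest ancestor that
can pay. A packet that nobody can pay for has to wait, which is what
makes the discipline non-work-conserving.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

struct tb_class {
    struct tb_class *parent; /* NULL for the root class */
    uint64_t rate_Bps;       /* refill rate, bytes per second */
    uint64_t burst;          /* bucket depth, bytes */
    uint64_t tokens;         /* bytes that may be sent right now */
    uint64_t last_ns;        /* time of the last refill */
};

/* Add the tokens earned since the last refill, capped at the burst size. */
static void tb_refill(struct tb_class *c, uint64_t now_ns)
{
    uint64_t add;

    assert(now_ns >= c->last_ns);
    add = (now_ns - c->last_ns) * c->rate_Bps / NSEC_PER_SEC;
    if (add == 0)
        return; /* keep last_ns so small intervals still accumulate */
    c->tokens = (c->tokens + add > c->burst) ? c->burst : c->tokens + add;
    c->last_ns = now_ns;
}

/*
 * Try to send `len` bytes from `leaf`. The leaf pays if it can; otherwise
 * the nearest ancestor with enough tokens lends them. Returns false when
 * the packet has to stay queued.
 */
static bool tb_try_send(struct tb_class *leaf, uint64_t len, uint64_t now_ns)
{
    for (struct tb_class *c = leaf; c != NULL; c = c->parent) {
        tb_refill(c, now_ns);
        if (c->tokens >= len) {
            c->tokens -= len;
            return true;
        }
    }
    return false;
}
```

Even in this toy version every send walks and modifies state shared
with sibling classes, which hints at why a single lock around the
hierarchy becomes a bottleneck once many CPUs and flows are involved.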
At 0x14 we have two sessions that are addressing this issue in
different ways.
Our first talk is from Yosef Kuperman, Rony Efraim and Maxim
Mikityanskiy and focuses on offloading HTB to the NIC hardware
(Mellanox cnx5).
Flow classification takes place in the TC egress clsact to avoid
any sort of (queue) locking. Packets are tagged and the offloaded
HTB uses these tags as flow/classids to select the correct queue in
the hierarchy.
Kuperman et al will go over the challenges they overcame, show
performance numbers and solicit feedback.
More Info:
https://netdevconf.info/0x14/session.html?talk-hierarchical-QoS-hardware-of…
Our second talk is from the Google folks Stanislav Fomichev, Eric
Dumazet, Willem de Bruijn, Vlad Dumitrescu, Bill Sommerfeld and
Peter Oskolkov.
Google has for many years utilized HTB and consequently faced scaling
challenges.
With the recent introduction of the Earliest Departure Time (EDT) model
(see Van Jacobson's keynote on EDT at netdev 0x12), an opportunity has
opened up to achieve the same packet service in a more efficient way.
In this talk, Stan et al describe how they moved away from HTB
altogether.
The packet service is created using a composition of BPF, FQ and
EDT. The authors will provide performance numbers, discuss some of the
outstanding challenges and solicit feedback from the community.
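As a rough illustration of the EDT idea (not the authors' code), the C
sketch below computes a departure timestamp per packet so that a flow
never exceeds its configured rate. The edt_* names are hypothetical. In
a real deployment a BPF program would typically write such a timestamp
into the packet and the FQ qdisc would hold the packet until that time.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

struct edt_flow {
    uint64_t rate_Bps; /* allowed rate, bytes per second */
    uint64_t next_ns;  /* earliest time the next packet may leave */
};

/*
 * Return the departure time for a packet of `len` bytes arriving at
 * `now_ns`, and push the flow's next slot out by the time the packet
 * occupies at the configured rate.
 */
static uint64_t edt_stamp(struct edt_flow *f, uint32_t len, uint64_t now_ns)
{
    uint64_t depart;

    assert(f->rate_Bps > 0);
    depart = (f->next_ns > now_ns) ? f->next_ns : now_ns;
    f->next_ns = depart + (uint64_t)len * NSEC_PER_SEC / f->rate_Bps;
    return depart;
}
```

No queue or lock is shared between flows here: the only state touched
is the per-flow next_ns value, in contrast to a shared token bucket
tree.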
More info:
https://www.netdevconf.info/0x14/session.html?talk-replacing-HTB-with-EDT-a…
cheers,
jamal
We are pleased to announce Bronze sponsorship from Meter (meter.com)!
Thank you for your support, Meter.
Meter is the easiest way to get the best Internet and WiFi for offices.
Meter takes care of everything, from ISP selection and installation to
ongoing support and network management. Meter combines powerful
software, custom hardware, and dedicated experts to provide
dramatically better Internet speed, security, and reliability.
More info:
https://netdevconf.info/0x14/news.html?bronze-sponsor-meter-com
cheers,
jamal
In this talk Alex Markuze et al. describe how they improved,
by orders of magnitude, client download times of a
global overlay network across public clouds.
The overlay network, known as the Pathway project
(operated by VMware Research), interconnects
geographically spread public clouds and their vast
compute and networking infrastructure.
The secret sauce? KTCP, a kernel module running on a
modified Linux kernel which implements novel TCP
splicing.
Markuze and co. will discuss why their approach is
different relative to the many approaches already
out in the wild that implement TCP proxying.
They will present numbers against classical approaches
which demonstrate that KTCP is able to considerably
increase the link utilization by TCP connections and
reduce the connection latency close to its theoretical
minimum.
More info:
https://netdevconf.info/0x14/session.html?talk-kernels-of-splitting-TCP-in-…
cheers,
jamal
Tom Herbert loves moonshots and three-letter acronyms.
First it was XDP and now it is BP4.
In this talk, Tom will introduce BP4 - a Domain Specific Language
for programmable dataplanes based on unifying the best features of
eBPF and P4. The goal of BP4 is “write once, run anywhere, run well!”
BP4 is intended to run in _both software and hardware_ execution
environments.
Central to a BP4 program is a dynamically programmable parser that
supports a wide variety of protocols and permits support for new
protocols to be added on the fly. The BP4 parser semantics include
native support for parsing Variable Length Headers (VLH) that contain
TLVs, flag-fields, or variable length arrays.
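To show what parsing a variable length header full of TLVs involves,
here is a small C sketch. It is not BP4 syntax and not taken from the
talk. It assumes a hypothetical encoding with a one-byte type and a
one-byte length followed by that many value bytes, and the name
tlv_find is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a buffer of {type, length, value...} records looking for `want`.
 * Returns 0 and sets *val / *vlen when found, 1 when the buffer is well
 * formed but has no such TLV, and -1 when a record is truncated.
 */
static int tlv_find(const uint8_t *buf, size_t buflen, uint8_t want,
                    const uint8_t **val, uint8_t *vlen)
{
    size_t off = 0;

    assert(buf != NULL && val != NULL && vlen != NULL);
    while (off + 2 <= buflen) {
        uint8_t type = buf[off];
        uint8_t len = buf[off + 1];

        if (len > buflen - off - 2)
            return -1; /* value would run past the end of the header */
        if (type == want) {
            *val = buf + off + 2;
            *vlen = len;
            return 0;
        }
        off += 2 + (size_t)len;
    }
    /* A single leftover byte cannot hold a type/length pair. */
    return (off == buflen) ? 1 : -1;
}
```

The bounds check before every access is the part a programmable parser
has to get right for each protocol, since the number of records is not
known until the header has been walked.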
Tom will describe the first PoC for BP4 which leverages the eBPF
infrastructure. The PoC implements a flow dissector as a BP4 program by
essentially replicating the functionality of the current Linux kernel
flow_dissector with extra functionality to handle TLV and flag-fields.
The programmable flow dissector will then be used as the basis for a
dynamic tc-flower classification (which will allow protocols to be
programmed and dynamically added for tc-flower processing).
More info:
https://netdevconf.info/0x14/session.html?talk-BP4-byte-code-for-programmab…
cheers,
jamal
TLS ain't cheap on the CPU. Internet traffic trends indicate
that the majority of traffic is now encrypted with TLS.
In other words the most common packets
are using TLS! For this reason, we need to pay more attention
to TLS performance. At 0x14 we have a small TLS festival. And
it's all about improving performance.
In the first talk, Pawel Szymanski and Manasi Deval assert
that you can achieve good performance by
letting the CPU do its thing: use modern CPU instructions
such as x86 AES-NI.
They run experiments that compare user-mode TLS, kernel TLS
write and kernel TLS sendfile to contrast the various bottlenecks
in each one with regard to encryption and authentication,
cost of system calls and memory bandwidth.
They will present their results during the talk.
The talk will also provide some insight into which of the
three approaches is best suited to different types of application
scenarios.
More info:
https://netdevconf.info/0x14/session.html?talk-TLS-performance-characteriza…
And in the second TLS talk, Tariq Toukan, Bar Tuaf and
Tal Gilboa discuss offloading of TLS to the NIC.
They start by reviewing the life cycle of a HW
offloaded kTLS connection and the driver-HW interaction
in order to support it.
They will then present their experiments where NGINX TLS
activity is offloaded to the Mellanox ConnectX-6 Dx NIC
(using the mlx5e driver). And finally, they present their experimental
results, which show a significant speed-up gained by
offloading kTLS operations to the HW.
More info:
https://netdevconf.info/0x14/session.html?talk-kTLS-HW-offload-implementati…
In the third talk, Alexander Krizhanovsky and Ivan Koveshnikov continue
their quest (see their netdev 0x12 talk) to investigate and improve the
TLS handshake by moving it into the kernel (it is currently handled in
user space).
In continuation from 0x12, Alexander and Ivan have been experimenting
with new kernel approaches to reduce some perf culprits involved in
completing a TLS handshake, namely dynamic memory allocation and big
integer initialization. They will present their results which quantify
the new approach. The talk will cover many other perf topics related
to TLS handshake.
More info:
https://netdevconf.info/0x14/session.html?talk-performance-study-of-kernel-…
cheers,
jamal