Batching with End-to-End Performance Estimation

Networking
Performance

Introduces methods to optimize TCP/IP batching decisions by estimating end-to-end performance through simple counters in TCP metadata exchanges.

Authors

Avidan Borisov, Nadav Amit, and Dan Tsafrir

Published

May 14, 2025

Abstract

Batching heuristics are used in multiple layers of the TCP/IP stack, aiming to improve performance by amortizing overheads. When performance is defined as average latency and throughput, optimal batching decisions can be infeasible if application-perceived end-to-end performance is unknown, which is commonly the case in general-purpose setups. We address this problem by occasionally adding a few easily maintained counters to TCP metadata exchanges and using them to estimate end-to-end performance via Little’s law. We experimentally show that these estimates are accurate when application requests can be identified by the kernel (corresponding, for example, to send system calls, packets, or some fixed number of bytes). Had these estimates been used to dynamically toggle Nagle batching, they could have extended Redis’s range of sustainable throughput at tolerable latencies by nearly 2x and improved latency within this range by up to nearly 3x. When the kernel cannot identify requests on its own, we propose that applications use a simple new interface to enlighten it, thereby ensuring accuracy.
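To give a flavor of the core idea, the following is a minimal sketch (not the paper's actual kernel implementation) of estimating average end-to-end latency from two simple counters via Little's law, W = L / λ: a time-averaged in-flight request count L divided by the observed completion rate λ. The class and method names are illustrative assumptions.

```python
class LittleEstimator:
    """Illustrative sketch: latency estimation via Little's law (W = L / lambda)
    from two counters, roughly in the spirit of the abstract. Not the paper's API."""

    def __init__(self):
        self.sent = 0             # requests handed to the stack
        self.completed = 0        # requests whose response was observed
        self.inflight_area = 0.0  # integral of in-flight count over time
        self.last_t = 0.0         # timestamp of the last counter update

    def on_send(self, t):
        self._advance(t)
        self.sent += 1

    def on_complete(self, t):
        self._advance(t)
        self.completed += 1

    def _advance(self, t):
        # Accumulate the time-weighted in-flight count since the last event.
        self.inflight_area += (self.sent - self.completed) * (t - self.last_t)
        self.last_t = t

    def estimate_latency(self, window):
        # lambda = completions per unit time; L = time-averaged in-flight count.
        if self.completed == 0:
            return None
        throughput = self.completed / window       # lambda
        avg_inflight = self.inflight_area / window  # L
        return avg_inflight / throughput            # W = L / lambda
```

Such an estimate could then drive a batching decision, for example toggling Nagle's algorithm per connection with `setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, ...)` when the estimated latency crosses a threshold, which is the kind of dynamic toggling the abstract evaluates on Redis.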