DeepErr: Automatic Root-Cause Analysis of System Call Failures

System-call analyzer that uses symbolic-execution–generated comparative executions and probe-point–based control-flow tracing to pinpoint the exact failure predicate.

Operating Systems
Authors

Nadav Amit and Michael Wei

Published

September 10, 2025

Abstract

System call failures present significant challenges for operating system (OS) users, as the failures are often cryptic and difficult to diagnose due to limited error codes and missing documentation. As a result, software developers struggle to utilize system calls effectively, and power users encounter difficulties configuring the OS and resolving environment problems. Existing automatic root-cause analysis tools are inadequate, primarily due to dependence on comparative analysis, which requires similar successful executions that are often unavailable.

In this paper, we present a tracing and root-cause analysis solution to address these limitations. We enable comparative analysis by using symbolic execution to generate analogous successful executions when no actual ones are available. Furthermore, to address the limited availability and shortcomings of hardware-based control-flow tracing, we propose probe-point-based tracing of the entire control flow. Using these techniques, we develop DeepErr, a system call analyzer that identifies the precise predicate responsible for a failure. We evaluate DeepErr on 100 tests from the Linux Test Project: it pinpoints the root cause in 91% of the scenarios and identifies the failing function in an additional 7% of cases.
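To make the comparative step concrete, the following is a minimal sketch of how a failing control-flow trace might be compared against a successful one to locate the first predicate whose outcome differs. It is an illustration only, not DeepErr's implementation: the trace format (a list of probe point and branch outcome pairs), the function name `first_divergence`, and the probe-point names are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Assumed trace format: an ordered list of (probe_point, branch_taken) records.
# The probe-point names used below are hypothetical, not real kernel symbols.
Trace = List[Tuple[str, bool]]


def first_divergence(
    failing: Trace, passing: Trace
) -> Optional[Tuple[str, bool, bool]]:
    """Return (probe_point, failing_outcome, passing_outcome) for the first
    predicate that both runs evaluate with different outcomes, or None."""
    for (f_point, f_taken), (p_point, p_taken) in zip(failing, passing):
        if f_point != p_point:
            # The runs reached different probe points without a recorded
            # differing predicate, so this simple comparison gives up.
            return None
        if f_taken != p_taken:
            return (f_point, f_taken, p_taken)
    return None


# Hypothetical traces: the failing run takes the other side of "check_perm".
failing_trace = [("check_fd_valid", True), ("check_perm", False), ("return_error", True)]
passing_trace = [("check_fd_valid", True), ("check_perm", True), ("do_io", True)]

result = first_divergence(failing_trace, passing_trace)
```

In this sketch the passing trace stands in for the analogous successful execution that, according to the abstract, symbolic execution would generate when no real one exists. The reported probe point is the candidate failure predicate.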

Award: This paper received the Best Paper Award at SysTor 2025.