OSDI 06 Seattle WiP: Pattern Mining Kernel Trace Data to Detect Systemic Problems
Christopher LaRosa, Li Xiong, Ken Mandelberg
Abstract:
Systemic performance problems resulting from inter-process or
process-to-operating system interactions are difficult to diagnose using
traditional system administration tools. Recent kernel-level tracing tools,
including DTrace and the Linux Trace Toolkit (LTT), provide detailed process
and operating system tracing functionality that may help users isolate these
difficult-to-identify problems on deployment systems. However, using these
tools to diagnose systemic problems typically involves writing a series of ad
hoc scripts to control an in-kernel tracing system; these scripts often rely
on application-level implementation knowledge. General, system-wide traces
have not been useful for identifying the causes of systemic problems because
users have been stymied by a lack of proper trace analysis tools. Manually
inspecting the noisy and voluminous data generated by unrestricted
kernel-level tracing remains nearly impossible.
Our key idea is to address the analysis functionality shortcomings of kernel
tracing tools by applying data mining techniques, specifically frequent
pattern mining, to general system-wide traces. These mining algorithms uncover
systemic problems in trace data that escape the detection of traditional
system monitoring utilities. While our system has a number of innovative
features, we will focus on the novel application of frequent itemset mining,
which allows for special treatment of the temporal characteristics of kernel
trace data. Our treatment overcomes the challenges presented by large gaps in
process execution and process reorderings introduced by the scheduler. Our
framework also includes a data filtering technique that allows versatile and
flexible cross-attribute pattern mining and a suite of tools that maintains
the full semantics of each trace event while allowing for efficient mining.
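
As a rough illustration of the idea (not the authors' implementation), the sketch below groups trace records into fixed-width time windows and counts itemsets that recur across windows. The record layout `(timestamp, process, event)`, the window width, and the brute-force counting are all assumptions made for this example. Because each window is treated as an unordered set, reorderings inside a window do not change the result.

```python
from collections import Counter
from itertools import combinations

def window_transactions(events, width):
    """Group (timestamp, process, event) records into fixed-width time windows.

    Each window becomes one transaction: an unordered set of 'process:event' items.
    The record layout is hypothetical, chosen only for this sketch.
    """
    windows = {}
    for ts, proc, ev in events:
        windows.setdefault(int(ts // width), set()).add(f"{proc}:{ev}")
    return [windows[k] for k in sorted(windows)]

def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force count of every itemset up to max_size; keep those with enough support."""
    counts = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}

# Illustrative sample data, not taken from a real trace.
events = [
    (0.1, "gtik", "write"), (0.2, "X", "read"),
    (1.1, "gtik", "write"), (1.3, "X", "read"), (1.5, "cron", "fork"),
    (2.2, "X", "read"), (2.4, "gtik", "write"),  # reordered within the window
]
transactions = window_transactions(events, width=1.0)
patterns = frequent_itemsets(transactions, min_support=3)
```

With this sample data, the pair `gtik:write` and `X:read` appears in all three windows regardless of their order within each window, while the single `cron:fork` event falls below the support threshold.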
We will show one preliminary study with the problematic gtik gnome applet
(version 2.0), using both LTT and DTrace data, where we successfully detect
frequent interactions between gtik and the X server that point directly to the
problematic gtik application. Our second preliminary study shows the approach
successfully detects a poorly implemented server application that spawns many
short-lived processes. These systemic problems had previously only been
identified through repeated, ad hoc use of DTrace in conjunction with
traditional system administration tools. Our approach is an improvement in
both efficiency and effectiveness in that it requires no custom script writing
or implementation knowledge. We will briefly discuss our ongoing work,
including tightening our system's coupling with the operating system and
developing a library of standard frequent patterns that can be used to
eliminate output noise in the form of frequent but uninteresting patterns.
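
One way such a pattern library might be applied is sketched below. The matching rule (drop any mined pattern contained in a library pattern), the function name, and the sample patterns and counts are all assumptions for illustration; the abstract does not specify how the library would be used.

```python
def drop_known_patterns(mined, library):
    """Remove mined patterns that are subsets of a known, uninteresting pattern.

    mined:   dict mapping frozenset of items -> support count
    library: iterable of item collections considered routine background activity
    """
    known = [frozenset(p) for p in library]
    return {p: c for p, c in mined.items() if not any(p <= k for k in known)}

# Hypothetical mined output and library, for illustration only.
mined = {
    frozenset({"gtik:write", "X:read"}): 40,
    frozenset({"init:wait"}): 55,
    frozenset({"syslogd:write", "kernel:timer"}): 38,
}
library = [{"init:wait"}, {"syslogd:write", "kernel:timer"}]
interesting = drop_known_patterns(mined, library)
```

Only the gtik and X server pattern survives the filter here, since the other two match library entries.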
Presentation Slides, Monday Nov. 6, 2006
MSKD: A System for Mining Kernel Trace Data
Christopher LaRosa
The introduction of low-impact, kernel-level tracing tools allows for comprehensive and transparent reporting of process and operating system activity. An operating system trace log provides detailed, time-series information about process [...] proper trace analysis tools that can be used to extract useful knowledge from raw trace logs.
We develop a Mining System for Kernel Trace Data (MSKD) that fuses kernel tracing tools with data mining tools. Our system transforms the output of tracing tools into compact, mine-ready data. It then feeds this data to a maximal frequent itemset mining application, harvests the miner's output, and reports frequent execution patterns in a human-readable format for use in debugging or optimization tasks.
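
The pipeline described above can be sketched roughly as follows. This is a toy stand-in under stated assumptions, not MSKD itself: the integer encoding, the brute-force miner (used here in place of a real maximal frequent itemset miner), and the sample transactions are all hypothetical.

```python
from itertools import combinations

def encode(transactions):
    """Map each distinct event string to a small integer id (compact, mine-ready form)."""
    ids = {}
    encoded = [
        frozenset(ids.setdefault(item, len(ids)) for item in sorted(t))
        for t in transactions
    ]
    names = {i: item for item, i in ids.items()}
    return encoded, names

def frequent(encoded, min_support, max_size=4):
    """Brute-force frequent itemset search; a real miner would prune the search space."""
    counts = {}
    for t in encoded:
        for k in range(1, min(max_size, len(t)) + 1):
            for combo in combinations(sorted(t), k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s for s, c in counts.items() if c >= min_support}

def maximal(itemsets):
    """Keep only itemsets that have no frequent proper superset."""
    return [s for s in itemsets if not any(s < other for other in itemsets)]

def report(maximal_sets, names):
    """Decode integer ids back into readable event names."""
    return sorted(sorted(names[i] for i in s) for s in maximal_sets)

# Hypothetical transactions resembling a server that spawns short-lived processes.
transactions = [
    {"server:fork", "server:exec", "child:exit"},
    {"server:fork", "server:exec", "child:exit", "cron:wake"},
    {"server:fork", "server:exec"},
]
encoded, names = encode(transactions)
patterns = report(maximal(frequent(encoded, min_support=2)), names)
```

Reporting only maximal itemsets keeps the output short: the seven frequent itemsets in this sample collapse into the single pattern `child:exit`, `server:exec`, `server:fork`.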
We implement prototypes of our system using DTrace, the Linux Trace Toolkit (LTT), and MAFIA. In this paper we detail our practical experience using these frameworks.
Presentation Slides, Aug. 14, 2006