Friday, March 27, 2009

DTrace, XTrace, Friday


DTrace: Dynamic Instrumentation of Production Systems

DTrace is a tracing tool with the following features:
-entirely dynamic, has zero cost when not turned on
-centralized, one stop shop for all kernel and user level tracing
-built into Solaris, and subsequently other systems.
-guarantees no silent trace data corruption
-arbitrary actions can be attached to any enabled probe
-predicates (filtering) at the source
-virtualized per consumer (so no weird sharing of probes)

->Providers are kernel modules responsible for instrumenting the system. The framework calls into them, and they call back into the DTrace kernel framework to set up probes.
->Probes are created and then advertised to consumers, who can enable them. Enabling a probe creates an enabling control block (ECB). If more than one consumer enables a probe, each one gets its own ECB; when the provider fires that probe, the DTrace framework loops through all of its ECBs and runs each one independently. This is where per-ECB predicates kick in, filtering which probe firings actually run each consumer's actions.
->Each ECB has a list of actions for the probe. The actions are run and can store data in the per-CPU memory buffer that each consumer has associated with it. Actions are limited to restrict the amount of overhead and damage they can impose (e.g. they can't change registers). An ECB always stores the same amount of data in the consumer's buffer. An action can fail because the buffer is full, in which case the data is simply dropped and the buffer's drop count is incremented.
->Two buffers are kept per consumer per CPU, one active and one inactive. Note that interrupts are disabled during both probe processing and buffer switching.
->Actions and predicates are specified in a simple RISC-like virtual instruction set (the D Intermediate Format, DIF). Users write them in a C-like language called D, which is compiled down to it.
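
To make the probe/ECB/buffer flow above concrete, here is a minimal Python sketch. It is only a toy model of my reading of the paper, not DTrace's actual implementation: the class names (PerCpuBuffer, ECB, Probe), the dictionary used as probe context, and the record-count capacity are all assumptions made for illustration. It shows each consumer's ECB being run independently, the predicate filtering at the source, fixed-size records, and drops being counted rather than silently lost.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Context = Dict[str, int]  # hypothetical probe arguments, e.g. {"cpu": 0, "pid": 42}


@dataclass
class PerCpuBuffer:
    capacity: int                                   # max records in the active buffer
    active: List[tuple] = field(default_factory=list)
    inactive: List[tuple] = field(default_factory=list)
    drops: int = 0

    def store(self, record: tuple) -> None:
        if len(self.active) >= self.capacity:
            self.drops += 1                         # full: drop the record, but count it
        else:
            self.active.append(record)

    def switch(self) -> List[tuple]:
        """Swap buffers and return the previously active one for the consumer to read."""
        self.active, self.inactive = [], self.active
        return self.inactive


@dataclass
class ECB:
    predicate: Callable[[Context], bool]            # filtering at the source
    actions: List[Callable[[Context], int]]         # each action yields one value
    buffers: Dict[int, PerCpuBuffer]                # one buffer pair per CPU


class Probe:
    def __init__(self, name: str) -> None:
        self.name = name
        self.ecbs: List[ECB] = []                   # one ECB per enabling consumer

    def fire(self, ctx: Context) -> None:
        for ecb in self.ecbs:                       # each consumer is handled independently
            if not ecb.predicate(ctx):
                continue
            # Every firing of a given ECB produces a record of the same shape.
            record = tuple(action(ctx) for action in ecb.actions)
            ecb.buffers[ctx["cpu"]].store(record)


# Two consumers enable the same (hypothetical) probe with different predicates.
probe = Probe("read-entry")
only_pid_42 = ECB(lambda c: c["pid"] == 42,
                  [lambda c: c["pid"], lambda c: c["nbytes"]],
                  {0: PerCpuBuffer(capacity=2)})
everything = ECB(lambda c: True,
                 [lambda c: c["pid"]],
                 {0: PerCpuBuffer(capacity=2)})
probe.ecbs += [only_pid_42, everything]

for pid in (42, 7, 42, 42):
    probe.fire({"cpu": 0, "pid": pid, "nbytes": 512})

print(only_pid_42.buffers[0].drops, everything.buffers[0].drops)
```

With a capacity of two records per buffer, both consumers overflow in this run, and the overflow shows up in each buffer's drop count instead of corrupting the data already stored. The real system does the equivalent work with interrupts disabled, which this sketch does not model.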

What is the problem? Is the problem real?
Today, performance testing tools are aimed at developers, but the performance testing itself is increasingly done by systems integrators, and tools for this level of performance testing are scarce.

What is the solution's main idea (nugget)?
A complex but amazingly well-engineered low-level tracing tool that can capture system calls, function boundaries, kernel synchronization primitives, and many other events.

Why is solution different from previous work? (Is technology different?, Is workload different?, Is problem different?)
They adhere to a zero-overhead policy when tracing is not enabled. Also, this system is implemented more ubiquitously and at a deeper level than any other tracing tool I know of.

Can you write straight assembly code for your actions/predicates? (Probably.) Do permissions ever become an issue? It seems like you need to run this as root; are there scenarios that would require giving someone (potentially restricted) DTrace functionality without giving them root?


The X-Trace paper presents a path-based tracing tool which requires instrumenting source code to propagate a small, constant amount of trace metadata and to emit log statements. The log statements can be reconstructed offline into a partial ordering, so that the relative order of many events (e.g. function calls or returns), even events collected from different architectural layers of the system, can be known deterministically.
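
The offline reconstruction step can be sketched roughly as follows. This is a toy illustration in Python, not the paper's tooling: the Report fields and the helper names build_edges and happens_before are hypothetical, and I assume each log statement records the ids of the events that directly caused it. Two events are ordered only if a causal path connects them, which is what makes the result a partial rather than a total order.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Report(NamedTuple):
    task_id: str               # shared by every event belonging to one task
    op_id: str                 # unique per logged event
    parents: Tuple[str, ...]   # op_ids of the events that directly caused this one
    label: str                 # human-readable description of the event


def build_edges(reports: Iterable[Report], task_id: str) -> Dict[str, Set[str]]:
    """Collect parent -> children edges for one task; arrival order does not matter."""
    children: Dict[str, Set[str]] = defaultdict(set)
    for r in reports:
        if r.task_id != task_id:
            continue
        children[r.op_id]          # touch the key so every event appears as a node
        for parent in r.parents:
            children[parent].add(r.op_id)
    return children


def happens_before(children: Dict[str, Set[str]], a: str, b: str) -> bool:
    """True if there is a causal path from event a to event b."""
    stack, seen = [a], set()
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child == b:
                return True
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return False


# Reports from different layers, deliberately listed out of order.
reports = [
    Report("t1", "d", ("b",), "TCP recv"),
    Report("t1", "b", ("a",), "TCP send"),
    Report("t2", "x", (), "unrelated task"),
    Report("t1", "a", (), "HTTP request issued"),
    Report("t1", "c", ("a",), "HTTP request logged"),
]
edges = build_edges(reports, "t1")
print(happens_before(edges, "a", "d"))   # causally ordered across layers
print(happens_before(edges, "b", "c"), happens_before(edges, "c", "b"))  # unordered pair
```

No timestamps or synchronized clocks are needed here: the ordering comes entirely from the ids carried in the log statements.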

What is the problem? Is the problem real?
Finger-pointing (working out which component is at fault) and event correlation in a distributed system are difficult.

What is the solution's main idea (nugget)?
We can augment the data execution path to pass around a small, fixed amount of metadata, which can be used to correlate events in a system that is distributed physically or even administratively.
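
A minimal sketch of what that propagation might look like inside instrumented code, written in Python under my own assumptions: the Metadata and LogRecord types, the propagate helper, and the two toy layers (http_handle, rpc_send) are hypothetical names, only loosely analogous to the pushDown and pushNext operations the paper describes. The point is that the metadata stays the same size no matter how deep the call path gets, while each hop logs one edge.

```python
import secrets
from dataclasses import dataclass
from typing import List, NamedTuple


@dataclass(frozen=True)
class Metadata:
    task_id: str   # constant for the whole task
    op_id: str     # id of the most recent operation on this path


class LogRecord(NamedTuple):
    task_id: str
    parent_op: str
    op_id: str
    label: str


def new_task() -> Metadata:
    """Start a new task with fresh, fixed-length identifiers."""
    return Metadata(secrets.token_hex(4), secrets.token_hex(4))


def propagate(md: Metadata, label: str, log: List[LogRecord]) -> Metadata:
    """Derive metadata for the next operation and log the causal edge to it."""
    child = Metadata(md.task_id, secrets.token_hex(4))
    log.append(LogRecord(md.task_id, md.op_id, child.op_id, label))
    return child


def rpc_send(md: Metadata, payload: str, log: List[LogRecord]) -> Metadata:
    # Lower layer: a real library would serialize md into its own header here.
    return propagate(md, "rpc_send", log)


def http_handle(md: Metadata, request: str, log: List[LogRecord]) -> Metadata:
    # Upper layer: record this hop, then hand the metadata down with the request.
    md = propagate(md, "http_handle", log)
    return rpc_send(md, request, log)


log: List[LogRecord] = []
root = new_task()
last = http_handle(root, "GET /index.html", log)
for record in log:
    print(record)
```

Every function on the path has to accept and forward the metadata, which is exactly the instrumentation burden on low-level libraries discussed further down.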

Why is solution different from previous work? (Is technology different?, Is workload different?, Is problem different?)
Previous work did not focus on the cross-layer aspect of causal tracing.

One criticism I have is that code instrumentation is a heavyweight tool, especially when it comes to low-level libraries like RPC libraries, and they don't present any work on techniques for automating it to ease the pain.
They also cite Pinpoint, which they describe as being very similar to X-Trace. In the paper they say "Our work focuses on recovering the task trees associated with multi-layer protocols, rather than the analysis of those recovered paths." The lack of focus on analysis of trace data is actually a weakness, not something which should be used to differentiate the systems. Also, analyzing trace data presupposes that collecting it is feasible, which makes me wonder whether the Pinpoint authors simply took the collection as trivial or as less interesting than the analysis.

1 comment:

Ion said...

Good point about the analysis vs. instrumentation. At the end of the day the instrumentation is only as valuable as the type of analysis it enables.