USDT/BPF Tracing Tools: Java, Python, Ruby, Node, MySQL, PostgreSQL

December 23, 2016

no comments

A lot of high-level languages, runtimes, and libraries people use on Linux have USDT probes embedded in them. In some cases, you have to compile with a specific flag to get the probes embedded (e.g. in Node.js), and in other cases they are part of the default package on most major distributions (e.g. in OpenJDK). Some examples of information these probes can provide include:

  • Garbage collection events and latencies in Java and Node
  • Method invocations and latencies in Python and Ruby
  • Object allocations in Ruby and Java
  • Thread start and stop events in Java
  • Class load events in Java and Ruby

This information can be used for a variety of investigations in the field, as well as performance monitoring of a running production system.

If you have an application or runtime instrumented with USDT probes, there are a bunch of tools in the BCC project that you can now use to get low-overhead, safe, and powerful information from a production system (granted that you have a recent enough kernel to use BPF). A couple of days ago I was able to get several new tools merged, and it’s time to describe a few of them along with examples and usage scenarios. Let’s go!

Raw Probes: tplist, trace, and argdist

If you know exactly which probes you care about, and are willing to do the legwork of correlating trace output yourself, you can use the multi-tool tracing trio: tplist, trace, and argdist. All three tools support USDT probes. For example, you can use tplist to get a list of the probes embedded into a running Node process:

# tplist -p $(pgrep -n node)
../node node:gc__start
../node node:gc__done
../node node:http__server__request
../node node:http__server__response
<...more output omitted for brevity...>

To figure out the format of the probe, you can apply tplist again to see the number and arguments and their types, but you will often need to consult the runtime’s source to see what the arguments stand for. Specifically, for the http__server__request probe in Node, the probe’s arguments include the requested URL, the client’s IP address, the HTTP method, and some other details.

Next, you can instrument probes you’re interested in, for example to monitor URLs that are frequently accessed by clients, with the argdist multi-tool:

# argdist -C 'u:/opt/bin/node:http__server__request():char*:arg6' -p $(pgrep -n node)
  123    arg6 = /index.html
  11     arg6 = /favicon.ico
  1      arg6 = /

… or to print a trace message whenever there is a garbage collection in the Node process with the trace multi-tool:

# trace 'u:/opt/bin/node:gc__start' -p $(pgrep -n node) -T
TIME     PID   TID   COMM       FUNC
13:56:54 7333  7333  node       gc__start

I especially love trace because it’s essentially a dynamic logger. I can attach it to arbitrary locations in the kernel or in user-space processes, and get live logs when interesting things happen — with minimal overhead. But I digress.

Java, Python, Ruby, Node: u* Tools

Learning the different kinds of probes that are available in different runtimes and then tracing and correlating this information can be a bit tedious. This is why I wrote a set of tools (inspired by Brendan Gregg’s work on the DTrace Toolkit for Solaris) that help trace GC latency, method calls and flow (including syscalls), thread start and stop events, and other interesting statistics and metrics for high-level language runtimes. These tools have just recently been merged into the BCC project, and should work on all Linux kernels after 4.3.

For example, let’s trace garbage collections in a Java process:

# ugc -m java $(pidof java)
Tracing garbage collections in java process 1241... Ctrl-C to quit.
START   DESCRIPTION                    TIME (ms)
1.831   PS Scavenge PS Eden Space ...  50.00
2.005   PS Scavenge PS Old Gen ...     19.00

Or, trace method calls in a Python script:

# ucalls -L -l python $(pidof python)
Tracing calls in process 4049 (language: python)... Ctrl-C to quit.
METHOD                       # CALLS   TIME (us)
./   42        2017394.35
./        17        1011.38

Or, count syscall frequencies and latencies in the same process:

# ucalls -LS 4923
Attached 742 kernel probes for syscall tracing.
Tracing calls in process 4923 (language: none)... Ctrl-C to quit.
METHOD                       # CALLS   TIME (us)
sys_select                   42        1998941.02

Or, trace Java thread creation and destruction events:

# uthreads -l java 27420
Tracing thread events in process 27420 (language: java)... Ctrl-C to quit.
TIME     ID               TYPE     DESCRIPTION                   
18.596   R=9/N=0          start    SIGINT handler                
18.596   R=4/N=0          stop     Signal Dispatcher             

Or, trace Ruby object allocations in a REPL process:

# uobjnew ruby 27245
Tracing allocations in process 27245 (language: ruby)... Ctrl-C to quit.
TYPE                           # ALLOCS      # BYTES
NameError                             1            0
RubyToken::TkSPACE                    1            0
RubyToken::TkSTRING                   1            0
String                                7            0
RubyToken::TkNL                       2            0
RubyToken::TkIDENTIFIER               2            0
array                                55          129
string                              344         1348

Or even trace method flow in a Ruby process, including filtering capabilities (this also works for Python OOTB and for Java with the -XX:+ExtendedDTraceProbes flag):

# uflow ruby 27245
Tracing method calls in ruby process 27245... Ctrl-C to quit.
3   27245  27245  4.536    <- IO.gets                              
3   27245  27245  4.536    <- IRB::StdioInputMethod.gets           
3   27245  27245  4.536    -> IRB::Context.verbose?                
3   27245  27245  4.536      -> NilClass.nil?                      
3   27245  27245  4.536      <- NilClass.nil?                      
3   27245  27245  4.536      -> IO.tty?                            
3   27245  27245  4.536      <- IO.tty?                            
3   27245  27245  4.536      -> Kernel.kind_of?                    
3   27245  27245  4.536      <- Kernel.kind_of?                    
3   27245  27245  4.536    <- IRB::Context.verbose?                
3   27245  27245  4.536    <- IRB::Irb.signal_status               
3   27245  27245  4.536    -> String.chars                         
3   27245  27245  4.536    <- String.chars                         

Or just monitor interesting events across all processes on the system that have one of the supported runtimes:

# ustat -C 5
Tracing... Output every 5 secs. Hit Ctrl-C to end
12:18:26 loadavg: 0.27 0.11 0.04 2/336 26455

PID    CMDLINE              METHOD/s   GC/s   OBJNEW/s   CLOAD/s  EXC/s  THR/s 
3018   node/node            0          1      0          0        0      0     

12:18:31 loadavg: 0.33 0.12 0.04 2/336 26456

PID    CMDLINE              METHOD/s   GC/s   OBJNEW/s   CLOAD/s  EXC/s  THR/s 
3018   node/node            0          0      0          0        0      0     
26439  java -XX:+ExtendedDT 2776045    0      0          0        0      0     

12:18:37 loadavg: 0.38 0.14 0.05 2/336 26457

PID    CMDLINE              METHOD/s   GC/s   OBJNEW/s   CLOAD/s  EXC/s  THR/s 
3018   node/node            0          0      0          0        0      0     
26439  java -XX:+ExtendedDT 2804378    0      0          0        0      0    

Databases: dbslower

Finally, database engines such as MySQL and PostgreSQL are also instrumented with USDT probes. These can be very effective in identifying slow queries, tracing the query text, monitoring transaction execution and locks taken, and other various database events that might be difficult to trace without otherwise instrumenting your applications. The dbslower tool, which was developed as an exercise solution for my Linux tracing workshop, can attach to MySQL and PostgreSQL processes and trace queries slower than a predefined threshold (following the design of fileslower, ext4slower, and other similar tools from BCC). Already in BCC you can find Brendan Gregg’s mysqld_slower, which is a version that only supports MySQL:

# mysqld_qslower $(pgrep -n mysqld) 0.1
Tracing MySQL server queries for PID 14371 slower than 0.1 ms...
TIME(s)        PID          MS QUERY
0.000000       18608    24.201 SELECT COUNT(*) FROM words
13.242390      18608   130.378 SELECT * FROM words WHERE word REGEXP '^bre.*n$'
23.601751      18608   119.198 SELECT * FROM words WHERE word REGEXP '^zzzzzzzz$'

In Closing

If you haven’t seen the power of USDT tracing before, this is a good time to start. Runtimes and frameworks that are instrumented with USDT probes can be used very effectively for production investigations that do not require a system restart, do not require debug information, and do not require invasive steps such as ptrace-ing or suspending the target. I can only hope more software for Linux will ship with USDT on by default in the future, so we can all benefit from it in our production apps.

It’s also worth mentioning that even if you don’t have a kernel new enough to use the BPF tools above, you can still benefit from USDT tracepoints. The venerable perf is getting support for SDT events (try perf buildid-cache --add <path> and then perf list sdt), and you can also use tools that support uprobes to collect USDT information, albeit with some effort. SystemTap, too, is an option to consider.

You can also follow me on Twitter, where I put stuff that doesn’t necessarily deserve a full-blown blog post.

Add comment
facebook linkedin twitter email

Leave a Reply

Your email address will not be published.

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>