Sunday, April 2, 2017
In my last post, I lamented the lack of call stack support for LTTng events in .NET Core. Fortunately, being open source, this is somewhat correctable -- so I set out to produce a quick-and-dirty patch that adds USDT support for CoreCLR's tracing events. This post explores some of the things that then become possible, and will hopefully become available in one form or another in CoreCLR in the future.
Very Brief USDT Primer
USDT (User Statically Defined Tracing) is a lightweight approach for embedding static trace markers into user-space libraries and applications. I've taken a closer look a year ago when...
Thursday, March 30, 2017
After exploring the basic profiling story, let's turn to ETW events. On Windows, the CLR is instrumented with a myriad of ETW events, which can be used to tackle very hard problems at runtime. Here are some examples of these events:
Thread start/stop (including thread pool threads)
Exceptions thrown, caught, filtered
Methods compiled (JIT)
By collecting all of, or a subset of, these events, you can get a very nice picture of what your .NET application is doing. By combining these with Windows kernel events for CPU sampling, file accesses, process creations and more -- you have a fairly...
Friday, December 23, 2016
A lot of high-level languages, runtimes, and libraries people use on Linux have USDT probes embedded in them. In some cases, you have to compile with a specific flag to get the probes embedded (e.g. in Node.js), and in other cases they are part of the default package on most major distributions (e.g. in OpenJDK). Some examples of information these probes can provide include:
Garbage collection events and latencies in Java and Node
Method invocations and latencies in Python and Ruby
Object allocations in Ruby and Java
Thread start and stop events in Java
Class load events in Java and Ruby
Thursday, March 31, 2016
Now that BCC has support for USDT probes, another thing I wanted to try is look at OpenJDK probes and extract some useful examples. To follow along, install a recent OpenJDK (I used 1.8) that has USDT probes enabled. On my Fedora 22, sudo dnf install java was just enough for everything.
Conveniently, OpenJDK ships with a set of .stp files that contain probe definitions. Here's an example -- and there are many more in your $JAVA_HOME/tapset directory:
probe hotspot.thread_start = process("/usr/lib/jvm/java-1.8.0-openjdk-188.8.131.52-1.b03.fc22.x86_64/jre/lib/amd64/server/libjvm.so").mark("thread__start")
name = "thread_start";
thread_name = user_string_n($arg1, $arg2);
id = $arg3;
native_id = $arg4;
is_daemon = $arg5;
probestr = sprintf("%s(thread_name='%s',id=%d,native_id=%d,is_daemon=%d)",
Wednesday, March 30, 2016
BPF is the next Linux tracing superpower, and its potential just keeps growing. The BCC project just merged my latest PR, which introduces USDT probe support in BPF programs. Before we look at the details, here's an example of what it means:
# trace -p $(pidof node) 'u:node:http__server__request "%s %s (from %s:%d)" arg5, arg6, arg3, arg4'
TIME PID COMM FUNC -
04:50:44 22185 node http__server__request GET /foofoo (from ::1:51056)
04:50:46 22185 node http__server__request GET / (from ::1:51056)
Yep, that's Node.js running...
Sunday, February 14, 2016
Warning: This post requires a bit of background. I strongly recommend Brendan Gregg's introduction to eBPF and bcc. With that said, the post below describes two new bcc-based tools, which you can use directly without perusing the implementation details.
A few weeks ago, I started experimenting with eBPF. In a nutshell, eBPF (introduced in Linux kernel 3.19 and further improved in 4.x kernels) allows you to attach verifiably-safe programs to arbitrary functions in the kernel or a user process. These little programs, which execute in kernel mode, can collect performance information, trace diagnostic data, and aggregate statistics that are then...
Friday, January 22, 2016
This blog post is also on GitHub in its entirety. If you prefer to read it there along with the code, I won't mind. Go ahead.
In one of my recent training classes, I was asked to demonstrate some practical uses of shared memory. My knee-jerk reply was that shared memory can be used for inter-process communication and message-passing. In fact, most IPC mechanisms are based on shared memory in their implementation. The question was whether it's worth the effort to build a message-passing interface on top of shared memory queues, or whether sockets or pipes could produce a better result...