December 23, 2016
A lot of high-level languages, runtimes, and libraries people use on Linux have USDT probes embedded in them. In some cases, you have to compile with a specific flag to get the probes embedded (e.g. in Node.js), and in other cases they are part of the default package on most major distributions (e.g. in OpenJDK). Some examples of information these probes can provide include:
Garbage collection events and latencies in Java and Node
Method invocations and latencies in Python and Ruby
Object allocations in Ruby and Java
Thread start and stop events in Java
Class load events in Java and Ruby
December 21, 2016
This has been an incredible year. Unfortunately, incredible often means very busy -- so this blog spent a record period of several months sitting alone in the dark, crying for attention. I thought it would make sense to briefly review what has been going on during the last few months before I try to return to my scheduled blogging routine.
Briefly put, 2016 was a year of conferences. I spoke at 20 events (18 of them international), not including various user group presentations. I haven't done the exact math but I guess I spent at least 2 months not sleeping in my own...
July 5, 2016
A couple of months ago it came to my attention that a certain Russian publisher is selling a Russian translation of "Pro .NET Performance". The book seems reasonably popular, and at the recent DotNext conferences in Moscow and St. Petersburg I was even asked to sign it a couple of times.
I just wanted to let you know that I have nothing to do with this translation. As far as I'm aware, Apress (the book's publisher) doesn't know about it. I certainly didn't know about it, and I didn't approve the translation itself. They even got the authors' names on the...
May 5, 2016
The first few months of 2016 have been so incredibly busy that I haven't found time to blog about my conference talks and provide additional resources, as I usually do. So here's a quick summary of my speaking engagements so far, and the plan for the next two months. Thanks for your patience!
Chilly Montreal: ConFoo
My first conference for 2016 was ConFoo in Montreal. This is a great community-driven non-profit conference, organized by the indefatigable Anna Filina and Yann Larrivee. One highlight from my visit this year is that, in three days, I spent exactly 2 minutes outside. And most of that time was frantically trying...
March 31, 2016
Now that BCC has support for USDT probes, another thing I wanted to try is to look at OpenJDK probes and extract some useful examples. To follow along, install a recent OpenJDK (I used 1.8) that has USDT probes enabled. On my Fedora 22 system, sudo dnf install java was enough to get everything.
Conveniently, OpenJDK ships with a set of .stp files that contain probe definitions. Here's an example -- and there are many more in your $JAVA_HOME/tapset directory:
probe hotspot.thread_start = process("/usr/lib/jvm/java-1.8.0-openjdk-18.104.22.168-1.b03.fc22.x86_64/jre/lib/amd64/server/libjvm.so").mark("thread__start")
name = "thread_start";
thread_name = user_string_n($arg1, $arg2);
id = $arg3;
native_id = $arg4;
is_daemon = $arg5;
probestr = sprintf("%s(thread_name='%s',id=%d,native_id=%d,is_daemon=%d)",
March 30, 2016
BPF is the next Linux tracing superpower, and its potential just keeps growing. The BCC project just merged my latest PR, which introduces USDT probe support in BPF programs. Before we look at the details, here's an example of what it means:
# trace -p $(pidof node) 'u:node:http__server__request "%s %s (from %s:%d)" arg5, arg6, arg3, arg4'
TIME PID COMM FUNC -
04:50:44 22185 node http__server__request GET /foofoo (from ::1:51056)
04:50:46 22185 node http__server__request GET / (from ::1:51056)
Yep, that's Node.js running...
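To make the probe specifier in the trace command above more concrete, here is a minimal Python sketch that splits it into its parts: the probe kind (u for USDT), the target binary or library, the probe name, the format string, and the argument list. The parse_probe_spec helper is hypothetical and written only for illustration; it is not how bcc itself parses the specifier.

```python
import shlex


def parse_probe_spec(spec):
    """Split a trace-style probe specifier into its parts (illustrative only)."""
    # shlex keeps the quoted format string together as a single token.
    tokens = shlex.split(spec)
    # The first token has the form kind:target:probe, e.g. u:node:http__server__request.
    kind, target, probe = tokens[0].split(":")
    fmt = tokens[1] if len(tokens) > 1 else ""
    # The remaining tokens are the comma-separated probe arguments.
    args = [t.strip(",") for t in tokens[2:] if t.strip(",")]
    return {"kind": kind, "target": target, "probe": probe,
            "format": fmt, "args": args}


spec = 'u:node:http__server__request "%s %s (from %s:%d)" arg5, arg6, arg3, arg4'
parsed = parse_probe_spec(spec)
print(parsed["kind"], parsed["target"], parsed["probe"])
print(parsed["format"], parsed["args"])
```

Read this way, the command asks for the fifth and sixth probe arguments to be printed as strings, followed by the third as a string and the fourth as an integer.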
February 14, 2016
Warning: This post requires a bit of background. I strongly recommend Brendan Gregg's introduction to eBPF and bcc. With that said, the post below describes two new bcc-based tools, which you can use directly without perusing the implementation details.
A few weeks ago, I started experimenting with eBPF. In a nutshell, eBPF (introduced in Linux kernel 3.19 and further improved in 4.x kernels) allows you to attach verifiably-safe programs to arbitrary functions in the kernel or a user process. These little programs, which execute in kernel mode, can collect performance information, trace diagnostic data, and aggregate statistics that are then...
January 22, 2016
This blog post is also on GitHub in its entirety. If you prefer to read it there along with the code, I won't mind. Go ahead.
In one of my recent training classes, I was asked to demonstrate some practical uses of shared memory. My knee-jerk reply was that shared memory can be used for inter-process communication and message-passing. In fact, most IPC mechanisms are based on shared memory in their implementation. The question was whether it's worth the effort to build a message-passing interface on top of shared memory queues, or whether sockets or pipes could produce a better result...
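As a rough illustration of message-passing on top of shared memory, here is a minimal Python sketch. It is not the code from the post: it uses Python's multiprocessing.shared_memory module (Python 3.8+), a single message slot rather than a real queue, and a made-up framing of a 4-byte length prefix followed by the payload. The send_message and receive_message helpers are hypothetical, and both ends run in one process to keep the example self-contained. A real queue would also need synchronization between the producer and the consumer.

```python
import struct
from multiprocessing import shared_memory


def send_message(buf, payload):
    """Write a length-prefixed message at the start of the shared buffer."""
    struct.pack_into("<I", buf, 0, len(payload))   # 4-byte little-endian length
    buf[4:4 + len(payload)] = payload              # payload right after the length


def receive_message(buf):
    """Read the length prefix, then copy the payload out of the shared buffer."""
    (length,) = struct.unpack_from("<I", buf, 0)
    return bytes(buf[4:4 + length])


# The "producer" creates a named shared memory block.
producer = shared_memory.SharedMemory(create=True, size=1024)
try:
    send_message(producer.buf, b"hello over shared memory")

    # The "consumer" attaches to the same block by name, as another process would.
    consumer = shared_memory.SharedMemory(name=producer.name)
    try:
        received = receive_message(consumer.buf)
    finally:
        consumer.close()
finally:
    producer.close()
    producer.unlink()   # release the block once nobody needs it

print(received)
```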
January 21, 2016
I am often asked why memory-mapped files can be more efficient than plain read/write I/O calls, and whether shared memory is slower than private memory. These seemingly unrelated mechanisms share a common implementation in the Windows kernel, known as section objects or file mapping objects. Yes, this shared implementation powers memory pages that are shared across multiple processes (by name) as well as file regions mapped to memory pages (even in a single process).
If you're interested in a thorough discussion of how section objects work, I must refer you to Windows Internals, 6th Edition. But if you're only here for...
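For a feel of what mapping a file into memory looks like from user code, here is a minimal sketch using Python's cross-platform mmap module. It is an illustration only, not the Windows API discussed in the post; the temporary file and its contents are made up for the example. The file's bytes are read and modified through ordinary slicing, with no explicit read or write calls on the mapped region.

```python
import mmap
import os
import tempfile

# Create a small scratch file to map (hypothetical contents).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello, mapped world\n")
    path = f.name

try:
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the default access allows writes.
        with mmap.mmap(f.fileno(), 0) as view:
            first_word = bytes(view[:5])   # read through the mapping
            view[:5] = b"HELLO"            # write through the mapping
            view.flush()                   # push dirty pages back to the file

    # Re-read the file with plain I/O to confirm the change reached it.
    with open(path, "rb") as f:
        contents = f.read()
finally:
    os.remove(path)

print(first_word, contents)
```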
January 5, 2016
"How much memory is your process using?" -- I bet you've been asked that question, or asked it yourself, more times than you can remember. But what do you really mean by memory?
I never thought it would be hard to find a definitive resource for what the various memory usage counters mean for a Windows process. But try it: Google "Windows Task Manager memory columns" and you'll see confusing, conflicting, inconsistent, unclear explanations of what the different metrics represent. If we can't even agree on what "working set" or "commit size" means, how can we ever monitor our Windows...