March 31, 2016
Now that BCC supports USDT probes, the next thing I wanted to try was looking at the OpenJDK probes and extracting some useful examples. To follow along, install a recent OpenJDK (I used 1.8) that has USDT probes enabled. On my Fedora 22 machine, sudo dnf install java was enough to get everything.
Conveniently, OpenJDK ships with a set of .stp files that contain probe definitions. Here's an example -- and there are many more in your $JAVA_HOME/tapset directory:
probe hotspot.thread_start = process("/usr/lib/jvm/java-1.8.0-openjdk-184.108.40.206-1.b03.fc22.x86_64/jre/lib/amd64/server/libjvm.so").mark("thread__start")
{
  name = "thread_start";
  thread_name = user_string_n($arg1, $arg2);
  id = $arg3;
  native_id = $arg4;
  is_daemon = $arg5;
  probestr = sprintf("%s(thread_name='%s',id=%d,native_id=%d,is_daemon=%d)",
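Because the tapset files are plain text, the probe names can be pulled out with a few lines of scripting. The following is a minimal sketch, not part of OpenJDK or BCC: the helper name parse_probe_aliases and the sample path are made up for illustration, and the regular expression only handles alias definitions of the form shown above.

```python
import re

# Matches alias definitions such as:
#   probe hotspot.thread_start = process("/path/libjvm.so").mark("thread__start")
# Assumption: alias, process(...) and mark(...) appear in this order;
# the whitespace between them may include line breaks.
PROBE_RE = re.compile(
    r'probe\s+(?P<alias>[\w.]+)\s*=\s*'
    r'process\("(?P<binary>[^"]+)"\)\s*\.\s*mark\("(?P<mark>[^"]+)"\)'
)

def parse_probe_aliases(stp_text):
    """Return (alias, binary, mark) tuples found in SystemTap tapset text."""
    return [(m.group("alias"), m.group("binary"), m.group("mark"))
            for m in PROBE_RE.finditer(stp_text)]

# Hypothetical sample input; a real run would read the .stp files instead.
SAMPLE = '''
probe hotspot.thread_start =
  process("/path/to/libjvm.so").mark("thread__start")
{
  name = "thread_start";
}
'''

if __name__ == "__main__":
    for alias, binary, mark in parse_probe_aliases(SAMPLE):
        print(alias, "->", mark, "in", binary)
```

The mark name (thread__start here) is the USDT probe name that a tracing tool would attach to; the alias is only the SystemTap-side label.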
March 30, 2016
BPF is the next Linux tracing superpower, and its potential just keeps growing. The BCC project just merged my latest PR, which introduces USDT probe support in BPF programs. Before we look at the details, here's an example of what it means:
# trace -p $(pidof node) 'u:node:http__server__request "%s %s (from %s:%d)" arg5, arg6, arg3, arg4'
TIME PID COMM FUNC -
04:50:44 22185 node http__server__request GET /foofoo (from ::1:51056)
04:50:46 22185 node http__server__request GET / (from ::1:51056)
Yep, that's Node.js running...
February 14, 2016
Warning: This post requires a bit of background. I strongly recommend Brendan Gregg's introduction to eBPF and bcc. With that said, the post below describes two new bcc-based tools, which you can use directly without perusing the implementation details.
A few weeks ago, I started experimenting with eBPF. In a nutshell, eBPF (introduced in Linux kernel 3.19 and further improved in 4.x kernels) allows you to attach verifiably-safe programs to arbitrary functions in the kernel or a user process. These little programs, which execute in kernel mode, can collect performance information, trace diagnostic data, and aggregate statistics that are then...
January 22, 2016
This blog post is also on GitHub in its entirety. If you prefer to read it there along with the code, I won't mind. Go ahead.
In one of my recent training classes, I was asked to demonstrate some practical uses of shared memory. My knee-jerk reply was that shared memory can be used for inter-process communication and message-passing. In fact, most IPC mechanisms are based on shared memory in their implementation. The question was whether it's worth the effort to build a message-passing interface on top of shared memory queues, or whether sockets or pipes could produce a better result...
January 21, 2016
I am often asked why memory-mapped files can be more efficient than plain read/write I/O calls, and whether shared memory is slower than private memory. These seemingly unrelated mechanisms share a common implementation in the Windows kernel, known as section objects or file mapping objects. Yes, this shared implementation powers memory pages that are shared across multiple processes (by name) as well as file regions mapped to memory pages (even in a single process).
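To make the comparison concrete, here is a small sketch that reads the same bytes from a file twice: once with an ordinary read call and once through a memory mapping. It uses Python's standard mmap module, which on Windows is backed by a file mapping object. The file contents and the helper names are invented for the example, and it illustrates the shape of the two APIs rather than any performance difference.

```python
import mmap
import os
import tempfile

def read_with_read(path, offset, length):
    """Copy bytes out of the file with an explicit seek + read."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

def read_with_mmap(path, offset, length):
    """Map the file and slice the mapping like a bytes object."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the view read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[offset:offset + length]

def demo():
    # Hypothetical sample data written to a temporary file.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"0123456789" * 1000)
        return read_with_read(path, 5000, 10), read_with_mmap(path, 5000, 10)
    finally:
        os.remove(path)

if __name__ == "__main__":
    print(demo())
```

Both calls return the same ten bytes. The difference is that the mapped version addresses the file as memory instead of copying it through a read buffer.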
If you're interested in a thorough discussion of how section objects work, I must refer you to Windows Internals, 6th Edition. But if you're only here for...
January 5, 2016
"How much memory is your process using?" -- I bet you were asked that question, or asked it yourself, more times than you can remember. But what do you really mean by memory?
I never thought it would be hard to find a definitive resource for what the various memory usage counters mean for a Windows process. But try it: Google "Windows Task Manager memory columns" and you'll see confusing, conflicting, inconsistent, unclear explanations of what the different metrics represent. If we can't even agree on what "working set" or "commit size" means, how can we ever monitor our Windows...
January 2, 2016
A few weeks ago, I had the honor of being invited to speak at DotNext 2015, Russia's only .NET conference and one of the leading developer conferences in the country. As some of my readers probably know already, I was born in the USSR, so I speak Russian with a heavy Israeli accent but can understand both written and spoken Russian very well. The fact that it was my wife's birthday and we could elope for a weekend of wintery weather and hardcore CLR internals only added to my resolve.
I proposed two talks, and the organizers had such difficulty picking...
December 9, 2015
I'm writing this on the plane back home from a week-long trip to Orlando, Vilnius, and Kiev, where I had the chance to speak at Live360! and BuildStuff; I've just counted and it's my tenth flight in three weeks, which is quite insane. But this is my second-to-last trip for 2015 -- the last one is going to be in December to DotNext Moscow.
In this talk, we discussed vector registers and instructions that have been usable from languages like FORTRAN and C++ for more than 15 years. Starting from the MMX instruction set extensions in the 1997...
October 27, 2015
I travel to a lot of conferences, but among the ones I like the most are Software Architect and DevWeek. I'm writing this post on the flight back home from Software Architect, where I had the pleasure of delivering a workshop and three talks. If you attended the conference, thanks a lot for coming and I hope you find the materials useful; if you haven't been to the conference, I expect to see you next year!
My first talk was an introduction to Haskell for developers with no prior experience in functional programming. For me personally, Haskell is not...
October 23, 2015
The Windows heap manager was designed to avoid the overhead of having to allocate virtual memory directly with VirtualAlloc, among other things. If you only need a 20-byte object, it's a waste to call a system service (involving a user-kernel transition) and allocate a full page. The heap manager avoids that overhead by managing large blocks of virtual memory in user mode; it is implemented in ntdll.dll.
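The arithmetic behind the 20-byte example is easy to check. The sketch below assumes a 4 KB page size (the usual value on x86 and x64 Windows) and ignores the heap's own bookkeeping overhead; the function names are made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KB pages, typical for x86/x64 Windows

def pages_needed(size, page_size=PAGE_SIZE):
    """Round a request up to whole pages, as a page-granular allocator must."""
    return (size + page_size - 1) // page_size

def wasted_bytes(size, page_size=PAGE_SIZE):
    """Bytes left unused if the request gets its own page(s)."""
    return pages_needed(size, page_size) * page_size - size

if __name__ == "__main__":
    print(wasted_bytes(20))  # 4076 bytes of a 4096-byte page go unused
```

A heap that packs many small objects into one page spreads both the page and the system call across all of them.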
However, when you allocate particularly large blocks of memory (>= 512KB at the time of writing), the heap manager doesn't see a reason to interfere, so it just forwards your request to VirtualAlloc. It still knows about...