Profiling the JVM on Linux: A Hybrid Approach

Friday, July 7, 2017

I hope you're outraged that your performance tools are lying to you. For quite a while, many Java sampling profilers have been known to blatantly misrepresent reality. In a nutshell, stack sampling using the documented JVMTI GetStackTrace method produces results that are biased towards safepoints, and not representative of the real CPU processing performed by your program. Over the years, alternative profilers popped up, trying to fix this problem by using AsyncGetCallTrace, a less-documented API that doesn't wait for a safepoint, and can produce more accurate results. Simply calling AGCT from a timer signal handler gives you a fairly reliable way...
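To make the safepoint bias concrete, here is a toy model in Python. It is not a JVM profiler and does not call JVMTI or AsyncGetCallTrace; the timeline, the method names, and the `sample` helper are all made up for illustration. The only idea it borrows from the text above is that a safepoint-bound sampler records the stack at the next safepoint poll rather than at the moment the sample was requested.

```python
from collections import Counter

# Hypothetical timeline: (method name, duration in ms, polls for a safepoint?).
TIMELINE = [
    ("hot_loop", 90, False),   # tight loop, assumed to contain no safepoint poll
    ("cold_call", 10, True),   # short method that does reach a safepoint poll
]


def method_at(t, timeline):
    """Return the index of the timeline entry that is executing at time t."""
    start = 0
    for i, (_, duration, _) in enumerate(timeline):
        if start <= t < start + duration:
            return i
        start += duration
    raise ValueError("time outside timeline")


def sample(timeline, interval, wait_for_safepoint):
    """Count samples per method, optionally deferring each sample to a safepoint.

    Assumes at least one timeline entry polls for a safepoint.
    """
    total = sum(duration for _, duration, _ in timeline)
    counts = Counter()
    for t in range(0, total, interval):
        i = method_at(t, timeline)
        if wait_for_safepoint:
            # Slide forward to the next entry that reaches a safepoint poll.
            while not timeline[i][2]:
                i = (i + 1) % len(timeline)
        counts[timeline[i][0]] += 1
    return dict(counts)


if __name__ == "__main__":
    print("sampled immediately:  ", sample(TIMELINE, 10, False))
    print("sampled at safepoints:", sample(TIMELINE, 10, True))
```

In this model, sampling immediately attributes nine of ten samples to `hot_loop`, while deferring to safepoints attributes all ten to `cold_call`, even though `cold_call` accounts for only a tenth of the run time.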

Tracing .NET Core on Linux with USDT and BCC

Sunday, April 2, 2017

In my last post, I lamented the lack of call stack support for LTTng events in .NET Core. Fortunately, since CoreCLR is open source, this is somewhat correctable -- so I set out to produce a quick-and-dirty patch that adds USDT support for CoreCLR's tracing events. This post explores some of the things that then become possible, and will hopefully become available in one form or another in CoreCLR in the future.

Very Brief USDT Primer

USDT (User Statically Defined Tracing) is a lightweight approach for embedding static trace markers into user-space libraries and applications. I took a closer look a year ago when...

Tracing Runtime Events in .NET Core on Linux

Thursday, March 30, 2017

After exploring the basic profiling story, let's turn to ETW events. On Windows, the CLR is instrumented with a myriad of ETW events, which can be used to tackle very hard problems at runtime. Here are some examples of these events:

- Garbage collections
- Assembly load/unload
- Thread start/stop (including thread pool threads)
- Object allocations
- Exceptions thrown, caught, filtered
- Methods compiled (JIT)

By collecting all of, or a subset of, these events, you can get a very nice picture of what your .NET application is doing. By combining these with Windows kernel events for CPU sampling, file accesses, process creations and more -- you have a fairly...

Profiling a .NET Core Application on Linux

Monday, February 27, 2017

In the same vein as my previous post on analyzing core dumps of .NET Core applications on Linux, let's take a look at what it takes to do some basic performance profiling. When starting out, I wrote down a few things that would be nice to do:

- CPU profiling (sampling) to see where the CPU bottlenecks are
- Grabbing stacks for interesting system events (file accesses, network, forks, etc.)
- Tracing memory management activity such as GCs and object allocations
- Identifying blocked time and the block and wake-up reasons

With this task list in mind, let's get started!

Collecting Call Stacks of .NET Core Processes

Generally speaking, a...

Analyzing a .NET Core Core Dump on Linux

Sunday, February 26, 2017

Recently, I had to open a core dump of a .NET Core application on Linux. I thought this walkthrough might be useful if you find yourself in the same boat, because, to be quite honest, I didn't find it trivial.

Configure Linux to Generate Core Dumps

Before you begin, you need to configure your Linux box to generate core dumps in the first place. A lot of distros will have something preconfigured, but the simplest approach is to just put a file name in the /proc/sys/kernel/core_pattern file:

    # echo core > /proc/sys/kernel/core_pattern

Additionally, there's a system limit on the maximum size of the generated core file. ulimit -c unlimited removes...
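Both settings mentioned above, the core_pattern file and the core file size limit, can also be checked from a script before you try to reproduce a crash. Here is a minimal sketch in Python using the standard `resource` module; the helper names `read_core_pattern` and `describe_core_limit` are hypothetical and exist only for this example. Note that `getrlimit` reports the limit in bytes, and that it reflects the limit of the process running the script, which is normally inherited from the shell that launched it.

```python
import resource

CORE_PATTERN_PATH = "/proc/sys/kernel/core_pattern"


def read_core_pattern(path=CORE_PATTERN_PATH):
    """Return the kernel's core file pattern, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def describe_core_limit(soft_limit):
    """Turn an RLIMIT_CORE soft limit into a readable label."""
    if soft_limit == resource.RLIM_INFINITY:
        return "unlimited"
    if soft_limit == 0:
        return "disabled"
    return f"{soft_limit} bytes"


if __name__ == "__main__":
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    print("core_pattern:   ", read_core_pattern())
    print("core size limit:", describe_core_limit(soft))
```

If the second line reports "disabled", no core file will be written for processes started from that shell, whatever the pattern says.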

USDT/BPF Tracing Tools: Java, Python, Ruby, Node, MySQL, PostgreSQL

Friday, December 23, 2016

A lot of high-level languages, runtimes, and libraries people use on Linux have USDT probes embedded in them. In some cases, you have to compile with a specific flag to get the probes embedded (e.g. in Node.js), and in other cases they are part of the default package on most major distributions (e.g. in OpenJDK). Some examples of information these probes can provide include:

- Garbage collection events and latencies in Java and Node
- Method invocations and latencies in Python and Ruby
- Object allocations in Ruby and Java
- Thread start and stop events in Java
- Class load events in Java and Ruby

This...

What A Year!

Wednesday, December 21, 2016

This has been an incredible year. Unfortunately, incredible often means very busy -- so this blog spent a record period of several months sitting alone in the dark, crying for attention. I thought it would make sense to briefly review what was going on during the last few months, before I try to return to my scheduled blogging routine. Briefly put, 2016 was a year of conferences. I spoke at 20 events (18 of them international), not including various user group presentations. I haven't done the exact math but I guess I spent at least 2 months not sleeping in my own...

H1 2016 Conference Plan

Thursday, May 5, 2016

The first few months of 2016 have been so incredibly busy that I haven't found time to blog about my conference talks and provide additional resources, as I usually do. So here's a quick summary of my speaking engagements so far, and the plan for the next two months. Thanks for your patience!

Chilly Montreal: ConFoo

My first conference of 2016 was ConFoo in Montreal. This is a great community-driven non-profit conference, organized by the indefatigable Anna Filina and Yann Larrivee. One highlight from my visit this year is that, in three days, I spent exactly 2 minutes outside. And most of that time was frantically trying...

Probing the JVM with BPF/BCC

Thursday, March 31, 2016

Now that BCC has support for USDT probes, another thing I wanted to try is to look at OpenJDK probes and extract some useful examples. To follow along, install a recent OpenJDK (I used 1.8) that has USDT probes enabled. On my Fedora 22, sudo dnf install java was enough for everything. Conveniently, OpenJDK ships with a set of .stp files that contain probe definitions. Here's an example -- and there are many more in your $JAVA_HOME/tapset directory:

    probe hotspot.thread_start =
      process("/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.77-1.b03.fc22.x86_64/jre/lib/amd64/server/libjvm.so").mark("thread__start")
    {
      name = "thread_start";
      thread_name = user_string_n($arg1, $arg2);
      id = $arg3;
      native_id = $arg4;
      is_daemon = $arg5;
      probestr = sprintf("%s(thread_name='%s',id=%d,native_id=%d,is_daemon=%d)",
                  ...

USDT Probe Support in BPF/BCC

Wednesday, March 30, 2016

BPF is the next Linux tracing superpower, and its potential just keeps growing. The BCC project just merged my latest PR, which introduces USDT probe support in BPF programs. Before we look at the details, here's an example of what it means:

    # trace -p $(pidof node) 'u:node:http__server__request "%s %s (from %s:%d)" arg5, arg6, arg3, arg4'
    TIME     PID    COMM         FUNC             -
    04:50:44 22185  node         http__server__request GET /foofoo (from ::1:51056)
    04:50:46 22185  node         http__server__request GET / (from ::1:51056)
    ^C

Yep, that's Node.js running...
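The quoted argument in the trace command above packs several things into one string: the probe type (u for USDT), the binary that contains the probe, the probe name, a printf-style format, and the probe arguments to print. Here is a rough Python sketch of how that string breaks down. `parse_usdt_spec` is a hypothetical helper written for this illustration; it is not part of BCC, it is not how the trace tool parses its input, and it handles only this simple shape.

```python
import shlex


def parse_usdt_spec(spec):
    """Split a 'u:<binary>:<probe> "<format>" arg, arg, ...' string into parts.

    Illustration only: covers the simple form shown in the example command.
    """
    tokens = shlex.split(spec)  # honors the double quotes around the format
    if not tokens:
        raise ValueError("empty specifier")
    parts = tokens[0].split(":")
    if len(parts) != 3 or parts[0] != "u":
        raise ValueError("expected 'u:<binary>:<probe>'")
    _, binary, probe = parts
    fmt = tokens[1] if len(tokens) > 1 else ""
    # The remaining tokens are the comma-separated probe arguments.
    args = [a.strip() for a in " ".join(tokens[2:]).split(",") if a.strip()]
    return {"binary": binary, "probe": probe, "format": fmt, "args": args}


if __name__ == "__main__":
    spec = ('u:node:http__server__request '
            '"%s %s (from %s:%d)" arg5, arg6, arg3, arg4')
    for key, value in parse_usdt_spec(spec).items():
        print(f"{key:7} {value}")
```

Read against the output above, arg5 and arg6 fill the first two %s slots (the HTTP method and the URL), while arg3 and arg4 supply the client address and port.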