Getting Stacks for LTTng Events with .NET Core on Linux

Tuesday, February 6, 2018

On Windows, .NET contains numerous very useful ETW events, which can be used for tracing garbage collections, assembly loading, exceptions thrown, object allocations, and other interesting scenarios. All events can come with a stack trace, which helps you understand where they’re coming from. In fact, I’d say that for some events, not getting the stack trace means the event is completely useless -- e.g. what good is the ExceptionThrown event if you don’t have the exception stack trace? On Linux, .NET Core doesn’t use ETW (Event Tracing for Windows, ya know). It uses LTTng instead, which is an awesome tracing framework but doesn’t have stack...
no comments

Wrapping Up Sela’s Hackathon With Four New Diagnostic Projects

Monday, December 25, 2017

In the beginning of December, the consultants team at Sela had a day off-site for our annual hackathon to work on a variety of projects. This day was a blast, and there was a bunch of great energy and interesting work being done all around, but my team (Avi Avni and I) focused on diagnostics tools -- my favorite -- and here are some preliminary results.

Real-time Win32 memory leak diagnoser

This is a project I've had on my todo list for a couple of years now. In a nutshell, Win32 memory leak analysis in production is quite painful because of...
no comments

Lightweight JVM Diagnostics Tools and Containers

Wednesday, September 27, 2017

If you're reading this, I hope you're curious what your options are when it comes to running JVM diagnostic tools on containerized applications. Generally when it comes to containers, you can either shove all your diagnostic tools into the container image, or you can try running them from the host -- this short post tries to explain what works, what doesn't, and what can be done about it. Although it is focused on JVM tools (and HotSpot specifically), a lot of the same obstacles will apply to other runtimes and languages.

Container isolation

As a very quick reminder, container isolation on...
one comment

Profiling the JVM on Linux: A Hybrid Approach

Friday, July 7, 2017

I hope you're outraged that your performance tools are lying to you. For quite a while, many Java sampling profilers have been known to blatantly misrepresent reality. In a nutshell, stack sampling using the documented JVMTI GetStackTrace method produces results that are biased towards safepoints, and not representative of the real CPU processing performed by your program. Over the years, alternative profilers popped up, trying to fix this problem by using AsyncGetCallTrace, a less-documented API that doesn't wait for a safepoint, and can produce more accurate results. Simply calling AGCT from a timer signal handler gives you a fairly reliable way...
no comments

Tracing .NET Core on Linux with USDT and BCC

Sunday, April 2, 2017

In my last post, I lamented the lack of call stack support for LTTng events in .NET Core. Fortunately, being open source, this is somewhat correctable -- so I set out to produce a quick-and-dirty patch that adds USDT support for CoreCLR's tracing events. This post explores some of the things that then become possible, and will hopefully become available in one form or another in CoreCLR in the future.

Very Brief USDT Primer

USDT (User Statically Defined Tracing) is a lightweight approach for embedding static trace markers into user-space libraries and applications. I took a closer look a year ago when...

Tracing Runtime Events in .NET Core on Linux

Thursday, March 30, 2017

After exploring the basic profiling story, let's turn to ETW events. On Windows, the CLR is instrumented with a myriad of ETW events, which can be used to tackle very hard problems at runtime. Here are some examples of these events:

- Garbage collections
- Assembly load/unload
- Thread start/stop (including thread pool threads)
- Object allocations
- Exceptions thrown, caught, filtered
- Methods compiled (JIT)

By collecting all of, or a subset of, these events, you can get a very nice picture of what your .NET application is doing. By combining these with Windows kernel events for CPU sampling, file accesses, process creations and more -- you have a fairly...

Profiling a .NET Core Application on Linux

Monday, February 27, 2017

In the same vein as my previous post on analyzing core dumps of .NET Core applications on Linux, let's take a look at what it takes to do some basic performance profiling. When starting out, here are a few things I wrote down that would be nice to do:

- CPU profiling (sampling) to see where the CPU bottlenecks are
- Grabbing stacks for interesting system events (file accesses, network, forks, etc.)
- Tracing memory management activity such as GCs and object allocations
- Identifying blocked time and the block and wake-up reasons

With this task list in mind, let's get started!

Collecting Call Stacks of .NET Core Processes

Generally speaking, a...
3 comments

USDT/BPF Tracing Tools: Java, Python, Ruby, Node, MySQL, PostgreSQL

Friday, December 23, 2016

A lot of high-level languages, runtimes, and libraries people use on Linux have USDT probes embedded in them. In some cases, you have to compile with a specific flag to get the probes embedded (e.g. in Node.js), and in other cases they are part of the default package on most major distributions (e.g. in OpenJDK). Some examples of information these probes can provide include:

- Garbage collection events and latencies in Java and Node
- Method invocations and latencies in Python and Ruby
- Object allocations in Ruby and Java
- Thread start and stop events in Java
- Class load events in Java and Ruby

This...
no comments

Two New eBPF Tools: memleak and argdist

Sunday, February 14, 2016

Warning: This post requires a bit of background. I strongly recommend Brendan Gregg's introduction to eBPF and bcc. With that said, the post below describes two new bcc-based tools, which you can use directly without perusing the implementation details. A few weeks ago, I started experimenting with eBPF. In a nutshell, eBPF (introduced in Linux kernel 3.19 and further improved in 4.x kernels) allows you to attach verifiably-safe programs to arbitrary functions in the kernel or a user process. These little programs, which execute in kernel mode, can collect performance information, trace diagnostic data, and aggregate statistics that are then...
2 comments

Shared Memory Queue, Adaptive pthread_mutex, and Dynamic Tracing

Friday, January 22, 2016

This blog post is also on GitHub in its entirety. If you prefer to read it there along with the code, I won't mind. Go ahead. In one of my recent training classes, I was asked to demonstrate some practical uses of shared memory. My knee-jerk reply was that shared memory can be used for inter-process communication and message-passing. In fact, most IPC mechanisms are based on shared memory in their implementation. The question was whether it's worth the effort to build a message-passing interface on top of shared memory queues, or whether sockets or pipes could produce a better result...
4 comments