In the same vein of my previous post on analyzing core dumps of .NET Core applications on Linux, let's take a look at what it takes to do some basic performance profiling. When starting out, here are a few things I wrote down that would be nice to do:
CPU profiling (sampling) to see where the CPU bottlenecks are
Grabbing stacks for interesting system events (file accesses, network, forks, etc.)
Tracing memory management activity such as GCs and object allocations
Identifying blocked time and the block and wake-up reasons
With this task list in mind, let's get started!
Collecting Call Stacks of .NET Core Processes
Generally speaking, a...
A lot of high-level languages, runtimes, and libraries people use on Linux have USDT probes embedded in them. In some cases, you have to compile with a specific flag to get the probes embedded (e.g. in Node.js), and in other cases they are part of the default package on most major distributions (e.g. in OpenJDK). Some examples of information these probes can provide include:
Garbage collection events and latencies in Java and Node
Method invocations and latencies in Python and Ruby
Object allocations in Ruby and Java
Thread start and stop events in Java
Class load events in Java and Ruby
BPF is the next Linux tracing superpower, and its potential just keeps growing. The BCC project just merged my latest PR, which introduces USDT probe support in BPF programs. Before we look at the details, here's an example of what it means:
# trace -p $(pidof node) 'u:node:http__server__request "%s %s (from %s:%d)" arg5, arg6, arg3, arg4'
TIME PID COMM FUNC -
04:50:44 22185 node http__server__request GET /foofoo (from ::1:51056)
04:50:46 22185 node http__server__request GET / (from ::1:51056)
Yep, that's Node.js running...
Warning: This post requires a bit of background. I strongly recommend Brendan Gregg's introduction to eBPF and bcc. With that said, the post below describes two new bcc-based tools, which you can use directly without perusing the implementation details.
A few weeks ago, I started experimenting with eBPF. In a nutshell, eBPF (introduced in Linux kernel 3.19 and further improved in 4.x kernels) allows you to attach verifiably-safe programs to arbitrary functions in the kernel or a user process. These little programs, which execute in kernel mode, can collect performance information, trace diagnostic data, and aggregate statistics that are then...
"How much memory is your process using?" -- I bet you were asked that question, or asked it yourself, more times than you can remember. But what do you really mean by memory?
I never thought it would be hard to find a definitive resource for what the various memory usage counters mean for a Windows process. But try it: Google "Windows Task Manager memory columns" and you'll see confusing, conflicting, inconsistent, unclear explanations of what the different metrics represent. If we can't even agree on what "working set" or "commit size" means, how can we ever monitor our Windows...
In my previous post on MiniDumper, I promised to explain in more detail how it figures out which memory ranges are required for .NET heap analysis. This is an interesting story, actually, because I tried a couple of approaches that failed before coming up with the final idea. Basically, I knew that a dump with full memory contains way more information than is necessary for .NET dump analysis. Even if you need the entire .NET heap available, you typically don't need a bunch of other memory ranges: executable code, Win32 heaps, unused regions of thread stacks, and so much more.
Oops! This was sitting in my queue for several months now, and I just noticed it needs to be published. But better late than never I guess. Here goes:
I've been lucky enough to be invited to speak at TechDays Netherlands again this year. This time I was asked to do four talks on some of my favorite subjects -- performance optimization, debugging, and diagnostics. Same as last year, the conference was impeccably organized.
I'm really looking forward to next year's TechDays :-) In the meantime, here are the materials from my talks.
Making .NET Applications Faster
My usual favorite on improving...
MiniDumper is born. It is an open source library and command-line tool that can generate dump files of .NET processes. However, unlike standard tools such as Procdump, MiniDumper has three modes of operation:
Full memory dumps (analogous to Procdump's -ma option). This is a complete dump of the process' memory, which includes the CLR heap but also a bunch of unnecessary information if you're mostly working with .NET applications. For example, a full memory dump will contain the binary code for all loaded modules, the unmanaged heap data, and a lot more.
Heap-only dumps (no Procdump analog). This is a dump that contains the...
Three years ago I blogged about obtaining SOS.dll and mscordacwks.dll indirectly from the Microsoft KB websites in case you only have a dump from the production system but can't gain access to copy these files over.
(Reminder: SOS.dll is a WinDbg extension for debugging .NET processes and dump files. The DAC, or mscordacwks.dll, is a helper library used by SOS to access the inner workings of a specific CLR version's data structures. The DAC is also used by ClrMD, a managed library that provides an API replacement for the SOS extension commands.)
It turns out that for Windows Phone applications (using the Windows Phone...
Just a couple of months ago, I agreed to deliver eight breakout sessions and a full-day workshop at DevWeek 2015. And no, I don't have any regrets -- but it was definitely a very packed week with lots of room changes and, more importantly, context switches from one topic to another. If you've been to DevWeek this year, I'm sure you enjoyed it: it's getting better year over year, and this is my third one so far.
Below you can find the materials for my eight sessions. If you've been to my workshop and haven't got the materials, please contact...