tl;dr I wrote a simple proof-of-concept tool called place-probe.py which helps place dynamic tracepoints on .NET methods. For example:
place-probe.py $PID 'System.Threading.Thread::Sleep'.
Dynamic tracing is one of the Linux diagnostics superpowers. By adding dynamic tracepoints on arbitrary functions across the system, you can diagnose a variety of “impossible” bugs and performance problems on a live production application without having to add instrumentation, rebuild, and restart. The underlying kernel mechanisms that make dynamic tracing possible are uprobes (for user-space functions) and kprobes (for kernel functions).
Unfortunately, uprobes can only be placed on code that is backed by an on-disk image; in other words, not on code that was generated at runtime. This precludes runtimes like the JVM or the CLR from using uprobes, because Java bytecode and CLR intermediate language instructions are compiled to machine code on the fly, and the resulting code is not backed by a disk image.
But the CLR has a trick up its sleeve: ahead-of-time compilation. On Windows, this is known as NGen, and the .NET Core cross-platform mechanism is called CrossGen. This is a tool that invokes the JIT compiler (libclrjit.so) on an assembly and stores the compilation results in a native image, which contains machine code instructions. These native images are then loaded into memory and executed directly, and because they are backed by a disk image, they can be traced with dynamic tracepoints!
To place a dynamic probe on a CrossGen-compiled image, you need the method’s offset from the image base. You then place the probe with something like:
perf probe -x /path/to/MyImage.dll --add 0xbadcafe
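If you want to script this step, here is a minimal Python sketch that builds the same perf probe command line from an image path and an offset. The function names perf_probe_command and place_probe are hypothetical helpers for illustration (this is not the code of place-probe.py), and actually running the command typically requires root.

```python
import subprocess


def perf_probe_command(image_path, offset):
    """Build the argument list for 'perf probe -x IMAGE --add OFFSET'."""
    return ["perf", "probe", "-x", image_path, "--add", hex(offset)]


def place_probe(image_path, offset, run=False):
    """Return the command; execute it only when run=True (hypothetical helper)."""
    command = perf_probe_command(image_path, offset)
    if run:
        # Raises CalledProcessError if perf exits with a non-zero status.
        subprocess.run(command, check=True)
    return command


# Only print the command here; nothing is executed.
print(" ".join(place_probe("/path/to/MyImage.dll", 0xbadcafe)))
# prints: perf probe -x /path/to/MyImage.dll --add 0xbadcafe
```

Keeping the command as a list rather than a single string avoids shell quoting problems when the image path contains spaces.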
The only problem is finding the offset that corresponds to a given managed method. The general approach is as follows:
- Use the crossgen command-line tool to generate debug information for all the CrossGen-compiled assemblies. This produces .map files in a simple format that contains the method offset and name.
- Find the desired managed method in the .map files. The map entry will look like the following, where the offset (in the first column) is the offset from the base address where the native image is loaded (let’s call it $METHODOFFSET):
0000000000020D70 36 instance void [app] app.Employee::Work()
- Find the native assembly’s load address and first executable section in /proc/$PID/maps. We need the offset of the executable section from the assembly’s load address (let’s call it $EXEOFFSET), and the offset within the on-disk image ($DISKOFFSET). Here’s an example for System.Console.dll – the executable section starts at 7f7e038a1000, while the first section is at 7f7e03880000, so the difference is 0x21000; and the on-disk offset for the executable section is the third column, which is 0x1000.
7f7e03880000-7f7e03881000 r--p 00000000 ca:01 537652 /home/ubuntu/dotnet/out/System.Console.dll
7f7e03890000-7f7e03892000 rw-p 00000000 ca:01 537652 /home/ubuntu/dotnet/out/System.Console.dll
7f7e038a1000-7f7e038cd000 r-xp 00001000 ca:01 537652 /home/ubuntu/dotnet/out/System.Console.dll
7f7e038dc000-7f7e038dd000 r--p 0002c000 ca:01 537652 /home/ubuntu/dotnet/out/System.Console.dll
- Now, compute $PROBEOFFSET = $METHODOFFSET - $EXEOFFSET + $DISKOFFSET. This is the offset at which we need to place the dynamic probe in order to trace the managed method.
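The steps above can be sketched in a few lines of Python. This is only an illustration under a few assumptions, not the actual place-probe.py code: the helper names are hypothetical, the .map line is assumed to be whitespace-separated as in the example entry, the first mapping of the image in /proc/$PID/maps is taken as its load address, and the method offset 0x23456 is a made-up value used purely to show the arithmetic.

```python
def parse_map_line(line):
    """Return (method_offset, signature) from one line of a .map file."""
    # Columns: hex offset, size, then the full method signature.
    offset_hex, _size, signature = line.strip().split(None, 2)
    return int(offset_hex, 16), signature


def section_offsets(maps_text, image_path):
    """Return (exe_offset, disk_offset) for image_path from /proc/PID/maps text."""
    load_address = None
    for line in maps_text.splitlines():
        # Columns: address range, perms, file offset, device, inode, path.
        fields = line.strip().split(None, 5)
        if len(fields) < 6 or fields[5] != image_path:
            continue
        start = int(fields[0].split("-")[0], 16)
        if load_address is None:
            load_address = start  # assumed: first mapping is the load address
        if "x" in fields[1]:
            # First executable mapping: distance from load address, file offset.
            return start - load_address, int(fields[2], 16)
    raise ValueError("no executable mapping found for " + image_path)


def probe_offset(method_offset, exe_offset, disk_offset):
    """$PROBEOFFSET = $METHODOFFSET - $EXEOFFSET + $DISKOFFSET."""
    return method_offset - exe_offset + disk_offset


MAPS = """\
7f7e03880000-7f7e03881000 r--p 00000000 ca:01 537652 /home/ubuntu/dotnet/out/System.Console.dll
7f7e03890000-7f7e03892000 rw-p 00000000 ca:01 537652 /home/ubuntu/dotnet/out/System.Console.dll
7f7e038a1000-7f7e038cd000 r-xp 00001000 ca:01 537652 /home/ubuntu/dotnet/out/System.Console.dll
7f7e038dc000-7f7e038dd000 r--p 0002c000 ca:01 537652 /home/ubuntu/dotnet/out/System.Console.dll
"""

exe_offset, disk_offset = section_offsets(
    MAPS, "/home/ubuntu/dotnet/out/System.Console.dll")
print(hex(exe_offset), hex(disk_offset))  # prints: 0x21000 0x1000

# 0x23456 is a hypothetical method offset, not taken from a real .map file.
print(hex(probe_offset(0x23456, exe_offset, disk_offset)))  # prints: 0x3456
```

In a real run you would read /proc/$PID/maps for the target process and look up the method offset in the .map file that belongs to the same native image.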
This process is encapsulated in a POC tool I wrote called place-probe.py, which performs the computations above and places the probe for you, or only prints the required command if given the --dry-run switch. Here’s a simple example:
$ ./place-probe.py $(pidof app) 'System.Threading.Thread::Sleep(int32)'
Added new event:
  probe_System:abs_4d6610 (on 0x4d6610 in /home/ubuntu/dotnet/out/System.Private.CoreLib.dll)
You can now use it in all perf tools, such as:
  perf record -e probe_System:abs_4d6610 -aR sleep 1
Added new event:
  probe_System:abs_5920 (on 0x5920 in /home/ubuntu/dotnet/out/System.Threading.Thread.dll)
You can now use it in all perf tools, such as:
  perf record -e probe_System:abs_5920 -aR sleep 1
$ sudo perf record -e probe_System:* -ag -- sleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.136 MB perf.data (20 samples) ]
$ sudo chown $USER perf.data
$ perf script | head
Failed to open /home/ubuntu/dotnet/out/System.Threading.Thread.dll, continuing without symbols
Failed to open [kernel.kallsyms], continuing without symbols
Failed to open /home/ubuntu/dotnet/out/System.Private.CoreLib.dll, continuing without symbols
app 29891  154218.288270: probe_System:abs_5920: (7f7e03855920)
      5920 [unknown] (/home/ubuntu/dotnet/out/System.Threading.Thread.dll)
    256f07 CallDescrWorkerInternal (/home/ubuntu/dotnet/out/libcoreclr.so)
    167ce0 MethodDescCallSite::CallTargetWorker (/home/ubuntu/dotnet/out/libcoreclr.so)
    278c03 RunMain (/home/ubuntu/dotnet/out/libcoreclr.so)
    278ea3 Assembly::ExecuteMainMethod (/home/ubuntu/dotnet/out/libcoreclr.so)
     aa3fb CorHost2::ExecuteAssembly (/home/ubuntu/dotnet/out/libcoreclr.so)
     84dd6 coreclr_execute_assembly (/home/ubuntu/dotnet/out/libcoreclr.so)
     8a433 coreclr::execute_assembly (/home/ubuntu/dotnet/out/libhostpolicy.so)
     7f0d8 run (/home/ubuntu/dotnet/out/libhostpolicy.so)
$ sudo perf probe --del=*
To use this with your own application binaries (and not just CrossGen-compiled .NET Core assemblies), run CrossGen on them. Here’s an example that assumes you’ve used dotnet publish --self-contained, so that all .NET dependencies are in the out directory:
crossgen /Platform_Assemblies_Paths out out/app.dll
After doing this, you can replace the original out/app.dll with the generated out/app.ni.dll (or out/app.ni.exe for the main executable) and use place-probe.py on that binary.
Oh, and where does CrossGen come from? You can either build it from source, or download it from the .NET Core NuGet packages. My dotnet-mapgen-v2.py script can help, among other things, with downloading CrossGen automatically and generating the required map files.