Sunday, December 02, 2007

Thanks, SystemTap!

I started this week's work with the idea of instrumenting the spufs module found in Linux/Cell so that I could trace the execution of Cell applications. At first, I modified that module to emit events at certain key points and record them in a circular queue. Then, I implemented a file in /proc so that a user-space application could read events from it, freeing space in the queue and thus preventing the loss of events when the queue filled up.
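To make the idea concrete, here is a minimal user-space sketch of that kind of queue, written in Python rather than as kernel code. The names (TraceQueue, emit, read) are made up for illustration, and the policy of dropping new events when the queue is full is an assumption on my part; an implementation could just as well overwrite the oldest entries instead.

```python
class TraceQueue:
    """Fixed-size circular queue of trace events (illustrative only)."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.head = 0      # index of the oldest unread event
        self.count = 0     # number of unread events in the queue
        self.dropped = 0   # events lost because the queue was full

    def emit(self, event):
        """Producer side: store an event, or count it as lost if full."""
        if self.count == len(self.slots):
            self.dropped += 1  # assumed policy: drop the new event
            return False
        tail = (self.head + self.count) % len(self.slots)
        self.slots[tail] = event
        self.count += 1
        return True

    def read(self, max_events):
        """Consumer side: return up to max_events and free their slots."""
        batch = []
        while self.count > 0 and len(batch) < max_events:
            batch.append(self.slots[self.head])
            self.slots[self.head] = None
            self.head = (self.head + 1) % len(self.slots)
            self.count -= 1
        return batch
```

The reader plays the role of the user-space application: every call to read() frees slots, so as long as it runs often enough, emit() never has to drop anything.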

That first implementation never worked well, but as I liked how it was evolving, I thought it could be a neat idea to make this "framework" more generic so that other parts of the kernel could use it. I rewrote everything with this idea in mind and then modified the regular scheduler and the process-management system calls to raise events for my trace as well. And I got it working.

But then, I was talking to Brainstorm about his new "Sun Campus Ambassador" position at the University, and during the conversation he mentioned DTrace. So I asked... "Mmm, that tool could probably simplify all my work; is there something similar for Linux?". And yes; yes there is! Its name: SystemTap.

As the web page says, SystemTap "provides an infrastructure to simplify the gathering of information about the running Linux system". You do this by writing small scripts that hook into specific points of the kernel (at the function level, at specific markers, etc.). Each script is translated into a loadable kernel module and installed into the live kernel, and its handlers get executed whenever one of those points is hit.

With this tool I can discard my several-hundred-line changes to gather traces and replace them with some very, very simple SystemTap scripts. No need to rebuild the kernel, no need to deal with custom changes to it, no need to rebuild every now and then... neat!

Now I'm having problems using the feature that instruments kernel markers, and I need markers because otherwise some private functions cannot be instrumented due to compiler optimizations (I think). OK, I could expose those functions instead, but while I'm at it, I think it'd be a good idea to write a decent tapset for spufs that could later be published. And that prevents me from doing such hacks.

But anyway, kudos to the SystemTap developers. I now understand why everybody is so excited about DTrace.

3 comments:

Anonymous said...

There is also http://www.opersys.com/LTT/index.html

fche said...

Systemtap should be able to probe static functions just fine with kprobes breakpoints - it has roughly the same visibility into a program as gdb does. There is a recent gcc optimization that makes our lives hard though (always inlining a one-caller-only static function - and losing debugging information on its arguments). That may be what you're encountering.

Julio M. Merino Vidal said...

Yes, that's the exact problem I was having: not being able to retrieve arguments from some static functions called only once from the source file. And I found some post somewhere mentioning the compiler optimization thing, which is why I mentioned it here as a possibility. It turns out that that comment was right :-)

But anyway, I've got the markers to work with some changes to the SystemTap code (already submitted as part of bug 4446). Hope they are correct and they can be integrated!