Comments on this Systemtap vs DTrace comparison
This is in response to the web page posted at
http://sources.redhat.com
I have gone line by line and commented on its content and inaccuracies. This was also sent to the Systemtap mailing list.
Going line by line...
Licenses: are correct.
Processor support:
I am not sure whether Apple or the FreeBSD developers have ported DTrace to PPC, but if Apple doesn't support it, the Polaris project will support DTrace on its PPC OpenSolaris port.
Of course it should be pointed out that Systemtap only supports Linux, whereas DTrace now supports Solaris and OS X, with the FreeBSD DTrace port proceeding rapidly. In fact, last I checked, the FreeBSD port was further along in the process than Systemtap is.
Kernel Lock in:
DTrace's kernel lock-in is very limited compared to Systemtap's. Remember that Linux doesn't believe in API/ABI stability. I just ran a complex DTrace script I wrote over 2 years ago on a modern Solaris Express release and it worked just as it always has. Systemtap isn't even stable enough to run a complex script written 6 months ago, even on the same kernel, and of course you have to update your elfutils quite often as well. I don't really see anything that is stable about Systemtap. So on this line, it is Systemtap that has the greater amount of lock-in, not just to the kernel but to Systemtap itself and possibly even to the tools used to compile the script.
Core Developers:
The majority of the core developers and active contributors on the Systemtap mailing list are IBM and Red Hat employees.
DTrace is currently developed by a number of people outside of Sun, including the people who ported DTrace to OS X and the team that is porting it to FreeBSD. There are also many DTrace users who submit bug reports and static probes for scripting languages and applications.
Development began:
Are you accounting for kprobes, the base of Systemtap, which was under development for a while before Systemtap started?
DTrace development began in October 2001, and it was first released in September 2003 along with the submission of the USENIX paper.
Ongoing evolution:
Yes, Systemtap development is rapid, but at the same time they claim to have no kernel lock-in? Which is it? We have Linux, which has no guaranteed API/ABI and changes rapidly, including the kprobes kernel module that is still under active development, and a project that is itself rapidly changing, so how can it have little kernel lock-in? Of course, Systemtap needs rapid evolution to try to match where DTrace is today.
DTrace, on the other hand, has met most of its goals and is now adding features and fixing bugs. The active development is in adding static probes to languages and applications, making it easier for end users and developers. DTrace has currently accomplished 99% of the goals it stated at the beginning of the project (http://www.usenix.org/event/usenix04/tech/general/full_papers/cantrill/cantrill_html/).
Target Audience:
Systemtap seems to have missed its target audience. Its current audience is kernel developers; I know of no users who use Systemtap on a daily basis, and the same holds true for sysadmins. There are hardly any pre-made Systemtap scripts at the moment, so the only users are kernel coders. I’ve not heard of any userland developers using Systemtap to solve problems, since userland probes are not included, nor is support for probing any other application or scripting language.
DTrace is targeted at developers, including those who specialize in kernel and userland code, system administrators, and end users. DTrace is so stable that scripts providing performance data are included as part of the operating system and are used daily on production systems without fear. Brendan Gregg’s DTrace Toolkit provides ready-made scripts so that even newbies who can’t program a line can benefit from DTrace. DTrace is also working to integrate static probes into scripting languages and applications including Java, Ruby, PHP, Perl, Python, Postgres, Apache, and the X server. These, plus a one-line DTrace script, allow end users to find performance bottlenecks in applications even though they can’t code in either DTrace or the target language.
Target Usage:
DTrace is targeted at debugging; well, if it wasn’t, it sure does a great job at it anyway. Please see http://uadmin.blogspot.com/2006/05/what-is-dtrace.html for examples of bugs being solved by DTrace. These bugs are not limited to Solaris applications: DTrace has been used to solve new and long-standing bugs in userland applications including NTP, Gnome, Java applets, Mozilla, and StarOffice/OpenOffice. This is not a complete list, just the ones I dug up on the internet a while ago.
Systemtap seems limited to solving Linux Kernel problems since it has no userland probes, and no non-kernel developers or system administrators using it on a daily basis.
Style:
Since when did the C language turn into a scripting language? Your scripts are basically C code modified to make it compatible with a C compiler. If one turns on guru mode, you are writing 100% C code; there is no way to consider it anything but C.
Control structures:
Systemtap has full control structures as stated, but if a bug happens in the Systemtap script it can cause the box to crash.
DTrace: doesn't have functions or loops per se, but you can work around this with a little thought. In return, even if a DTrace user makes a horrendous mistake, it doesn't take down the box. Seems like a fair trade, doesn't it? It does to me. You should ask your target users: are any of them perfect, in that they never make a mistake? Are they willing to crash their production box because of a mistake in their Systemtap script?
Variable Typing:
Systemtap’s use of implicit type control seems more of a limitation than a feature, because by implicitly setting the type of a variable you are locked into that type; if the code you are probing changes a variable's type, your script no longer works. This eliminates the possibility of providing ready-made scripts long term. It is made even worse by the fact that Linux doesn’t have a stable API or ABI, so a script developed today will have to be recompiled and possibly ported with the next kernel release or the next release of a program.
When Systemtap gets userland probes, its inferred variable typing will become even more of a hindrance. Lots of programs don’t ship with debugging information embedded, so if you don’t have the source code to recompile with debugging information, Systemtap is useless. DTrace will try to make guesses at the data structures and include files, and allows the user to type cast variables as needed. Systemtap does not have the native ability to process include files or handle data of unknown types.
Complex Reports:
If Systemtap’s report generating ability is so great, why is work being done on a dashboard designed to make the reports look better?
What limitations are you seeing in DTrace's report generating capability?
Thread-local variables:
Systemtap: these seem to have been added as an afterthought, as evidenced by the need for a disclaimer.
DTrace: this is a built-in feature, not an afterthought.
Speculative tracing:
Systemtap: you can’t just wish this requirement away. In a complex system you can’t judge whether or not you need a piece of data when the first event occurs; you have to store it until possibly many other events have occurred. So unless your script can predict the future, the framework really should provide a way to abandon information it collected and turned out not to care about. Your users may not yet feel confident enough with the framework to write the complex scripts that need this functionality, which is where its absence becomes a problem.
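To make the idea concrete, here is a minimal sketch in Python of what speculative tracing buys you. It is only a toy model, not DTrace code: the method names are borrowed from DTrace's speculation(), speculate(), commit() and discard() actions, while the SpeculationBuffer class, the trace_call helper, and the "open#1"/"open#2" events are hypothetical names invented for this illustration.

```python
class SpeculationBuffer:
    """Toy model of speculative tracing: hold records until we know if they matter."""

    def __init__(self):
        self._pending = {}    # speculation id -> records buffered so far
        self._next_id = 1
        self.committed = []   # records that reached the "real" trace output

    def speculation(self):
        """Open a new speculative buffer and return its id."""
        spec_id = self._next_id
        self._next_id += 1
        self._pending[spec_id] = []
        return spec_id

    def speculate(self, spec_id, record):
        """Buffer a record tentatively; it is not part of the output yet."""
        self._pending[spec_id].append(record)

    def commit(self, spec_id):
        """The data turned out to be interesting: move it to the output."""
        self.committed.extend(self._pending.pop(spec_id))

    def discard(self, spec_id):
        """The data turned out to be uninteresting: throw it away."""
        self._pending.pop(spec_id, None)


def trace_call(buf, name, steps, failed):
    """Record every step of a call, but keep the records only if the call failed."""
    spec = buf.speculation()
    for step in steps:
        buf.speculate(spec, (name, step))
    if failed:
        buf.commit(spec)
    else:
        buf.discard(spec)


buf = SpeculationBuffer()
trace_call(buf, "open#1", ["entry", "lookup", "return"], failed=False)
trace_call(buf, "open#2", ["entry", "lookup", "return"], failed=True)
print(buf.committed)
```

Only the three records from the failing call end up in the output; the records from the successful call were collected and then dropped, which is exactly the decision that cannot be made when the first event fires.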
Binary Tracing:
This seems vague at best, and an unimportant implementation detail that the end user really won’t care about.
Probe execution:
For Systemtap it's compiled C code, especially in guru mode.
Number of probe points:
Systemtap has too much bloat to have a truly unlimited number of probes. The overhead of each probe, last I checked, was at minimum 64 bytes of code per script code segment attached to a probe, plus additional storage requirements for each probe, so it does have a finite number of probes: 1 million active probes require 64MB of kernel space. And since each Systemtap module is independent of all others, you get multiple copies of kernel code that is shared in DTrace.
What is the highest number of Systemtap probes ever activated in an active script without the machine falling over? In my tests of DTrace it is over 500,000. The reason for stopping at 500,000 wasn’t that I hit any limit in the system; I just decided any more was pointless. I could easily run multiple copies of the same script without a problem. By the way, the limit of 50,000 pre-defined probe points in DTrace applies only to the kernel; you can probe any line of userland code without limit. Really, for all but hard core kernel coders it is an unlimited number of probes, because 99.99% of your users will not need to probe a specific line in the kernel.
Of course a good question is, what kind of kernel coder can’t debug a problem when he knows which functions are being called and by which function, how often and how much time each is taking, what functions it calls, and has complete userland and kernel stack traces? It seems pretty silly to do all that extra work and risk system stability in the name of probing every line in the kernel.
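The arithmetic behind that figure is easy to check. The sketch below takes the 64 bytes per probe from the paragraph above as its only input (my last-checked number, so treat it as an assumption); the function name is made up for this example, and the result is a lower bound because it ignores the additional per-probe storage.

```python
# Assumption from the text: at least 64 bytes of generated code per probe.
BYTES_PER_PROBE = 64


def probe_code_bytes(active_probes, bytes_per_probe=BYTES_PER_PROBE):
    """Lower bound on kernel memory used by per-probe code, in bytes."""
    return active_probes * bytes_per_probe


total = probe_code_bytes(1_000_000)
print(total)                     # 64000000 bytes
print(total / 10**6)             # 64.0 (decimal megabytes)
print(round(total / 2**20, 1))   # 61.0 (binary megabytes, MiB)
```

Running two independent modules with the same probes doubles the figure, since nothing is shared between them.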
Probe arbitrary points in code:
DTrace can probe the entry and exit points of any function, whether it is in userland or kernel space, and you can also define static probe points in both userland and kernel code. Arbitrary probe points in kernel code really seem to be of very limited importance to 99.9999% of coders, as explained above. DTrace can probe arbitrary points in userspace applications should it be required.
Can Systemtap still probe arbitrary points in code if you don’t have a binary with debugging information intact?
Dynamically loaded kernel objects:
Yes, DTrace can probe dynamically loaded kernel modules, as well as userland libs that are loaded dynamically.
Concurrent probes on multiprocessors:
DTrace can probe multiple processors, multiple tasks and multiple threads, regardless of whether they are in userland or kernel space. You can even run multiple copies of the same script without problems; the last time I tried this simple test on Systemtap, it failed.
Extract arbitrary data at probe point:
DTrace can read any location in memory, be it in kernel space or userland, during probe execution. DTrace also handles any traps that occur as a result of attempting to read the data on all of its platforms; last I heard, this feat had not been accomplished on all of Systemtap’s platforms. DTrace also understands the C struct construct, so it can read data stored in structs and use pointers stored in structs to access other data even if the program isn’t compiled with debugging information. Systemtap requires debugging information to access data in structures even if the user knows the layout of the data structures.
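As an illustration of why knowing the layout is enough, here is a small Python sketch that decodes records from raw bytes and follows the "pointer" stored in each one, with no debugging information involved. Everything in it is hypothetical: the two-field record layout, the offsets, and the byte buffer standing in for a process's memory. It shows the general technique, not how DTrace implements it.

```python
import struct

# Hypothetical layout supplied by the user, not by debug info:
# uint32 value, followed by uint32 offset of the next record (0 ends the chain).
RECORD = struct.Struct("<II")


def build_memory():
    """Build a fake 64-byte memory image holding a chain of three records."""
    mem = bytearray(64)
    RECORD.pack_into(mem, 8, 111, 24)    # record at offset 8 points to offset 24
    RECORD.pack_into(mem, 24, 222, 40)   # record at offset 24 points to offset 40
    RECORD.pack_into(mem, 40, 333, 0)    # last record, next offset is 0
    return bytes(mem)


def walk(mem, start):
    """Follow the chain from `start`, decoding each record with the known layout."""
    values = []
    off = start
    while off != 0:
        value, off = RECORD.unpack_from(mem, off)
        values.append(value)
    return values


print(walk(build_memory(), 8))   # [111, 222, 333]
```

The only knowledge needed is the struct definition itself, which the user can supply by hand or from a header file.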
End user extendible libraries:
Systemtap is at the same state as DTrace. A Systemtap script is a compiled program requiring a compiler on the target system; there is no way for a user to extend the probe library with precompiled scripts that can be distributed to systems without a compiler installed.
Hardware performance counters probing:
DTrace is working on adding access to performance counters that are available in the processors.
Safety:
The safety category is mostly correct on the DTrace side.
The Systemtap side talks a good game, but the proof is in the pudding. Even given Systemtap’s limited use by the general public, when was the last week in which no bug report was filed against Systemtap that involved the system falling over? In the last 2 years, DTrace has had perhaps a handful of such reports.
Has anyone used Systemtap on a daily basis to solve problems on production systems without fear of it falling over?










