
This work is licensed under a Creative Commons Attribution-Share Alike 2.0 France License.
Often, when I change something in my project, I wonder whether my modifications had any impact on other modules. On the one hand there is correctness. If I change something, I want to be confident that I didn't break anything in a dependent part of the code. To alleviate this problem, I use a battery of unit tests that I run before committing my changes. This does not prove that I didn't break anything, but it gives me at least some confidence that, if I broke something, it is not something I had thought of before... As OCaml unit testing library I use the excellent oUnit.
Another haunting question is about performance. This is more difficult to test. To be sure I didn't degrade the performance of my code, I need access to performance data about my code from some point in the past. If you use an SCM such as git to manage your code, there are facilities to run this kind of test (which of course requires some heavy scripting if you want to check all your functions). Otherwise you are left to your own devices...
Starting from the Benchmark module, I cooked up a companion module ExtBenchmark to take care of time regression testing for you.
This is the .mli file, which I hope is reasonably readable...
The idea of the module is pretty simple. Every time I change my code, I run my battery of tests and save the results in a file on disk, together with the timestamp and the id of the machine. The next time, these saved results are compared with the new ones to check whether our modifications had an impact on some part of the code. Below I give a small example that borrows a bit of code from the examples of the benchmark module.
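To make the save-and-reload idea concrete, here is a minimal sketch that uses only the standard library. It is not the ExtBenchmark code: the sample type, the one-line-per-sample text format and the names to_line, of_line, save and load are all assumptions made for illustration, and function names are assumed to contain neither spaces nor an equals sign.

```ocaml
(* Hypothetical sample type; not the representation used by ExtBenchmark. *)
type sample = {
  timestamp : float;                (* seconds since the epoch *)
  results : (string * float) list;  (* function name, running time in seconds *)
}

(* Assumed text format, one sample per line: "<timestamp> name=time ..." *)
let to_line s =
  let fields = List.map (fun (n, t) -> Printf.sprintf "%s=%.6f" n t) s.results in
  String.concat " " (Printf.sprintf "%.0f" s.timestamp :: fields)

(* Parse one "name=time" field; malformed fields are skipped. *)
let parse_field field =
  match String.index_opt field '=' with
  | None -> None
  | Some i ->
    let name = String.sub field 0 i in
    let time = String.sub field (i + 1) (String.length field - i - 1) in
    Option.map (fun t -> (name, t)) (float_of_string_opt time)

(* Parse a whole line; returns None when the timestamp is missing or invalid. *)
let of_line line =
  match String.split_on_char ' ' (String.trim line) with
  | [] -> None
  | ts :: fields ->
    Option.map
      (fun timestamp -> { timestamp; results = List.filter_map parse_field fields })
      (float_of_string_opt ts)

(* Append one sample to the file at [path], creating the file if needed. *)
let save path s =
  let oc = open_out_gen [ Open_append; Open_creat; Open_text ] 0o644 path in
  output_string oc (to_line s ^ "\n");
  close_out oc

(* Load every well-formed sample from [path], oldest first. *)
let load path =
  let ic = open_in path in
  let rec loop acc =
    match input_line ic with
    | line -> loop (match of_line line with Some s -> s :: acc | None -> acc)
    | exception End_of_file -> close_in ic; List.rev acc
  in
  loop []
```

Appending to a single file keeps the sketch short. Writing one timestamped file per run would work just as well, with load iterating over a directory instead.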
First we declare three functions that we are going to test. Then we run the tests in the function run(), and in the main function we actually use the module ExtBenchmark. We execute all benchmarks and obtain a test sample, then we save the sample on disk in a timestamped file. In the second part, we load all samples and print a comparison table. The printing function shows whether the running time of a function increased with respect to the lowest recorded running time of the same function. It can also print samples that contain different functions, which makes it easy to add tests along the way.
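The first half of that workflow (declare three functions, run them, collect one sample) can be sketched as follows. This is a stand-in under stated assumptions: it measures CPU time with Sys.time from the standard library instead of calling the Benchmark module, and the three functions as well as the helpers time_n and run are hypothetical names, not the ones from the original example.

```ocaml
(* Three hypothetical functions to measure; not the ones from the post. *)
let sum_int n =
  let s = ref 0 in
  for i = 1 to n do s := !s + i done;
  !s

let sum_float n =
  let s = ref 0. in
  for i = 1 to n do s := !s +. float_of_int i done;
  !s

let sum_int64 n =
  let s = ref 0L in
  for i = 1 to n do s := Int64.add !s (Int64.of_int i) done;
  !s

(* Total CPU time, in seconds, of [repeat] calls to [f ()]. *)
let time_n repeat f =
  let start = Sys.time () in
  for _i = 1 to repeat do
    ignore (Sys.opaque_identity (f ()))
  done;
  Sys.time () -. start

(* Run every test and return one (name, seconds) pair per function. *)
let run ?(repeat = 100) tests =
  List.map (fun (name, f) -> (name, time_n repeat f)) tests

let tests = [
  ("int", fun () -> ignore (sum_int 10_000));
  ("float", fun () -> ignore (sum_float 10_000));
  ("int64", fun () -> ignore (sum_int64 10_000));
]

(* Print the sample obtained from a single run. *)
let () =
  List.iter (fun (name, t) -> Printf.printf "%-6s %.6f s\n" name t) (run tests)
```

Wrapping each function in a unit -> unit closure lets functions with different return types share one list. The measured times depend on the machine and its load, so no expected output is shown.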
Notice that I run the program twice in order to generate two traces. By default the results are saved in the .benchmarks directory, in a simple textual format. To make it a bit more reliable, I'll also add a machine id in the future, to avoid mixing benchmarks that were run on different hosts. Moreover, time regressions of more than 0.001 seconds are marked with an asterisk in the table, to pinpoint possible problems. I'm aware that these results must be taken with a grain of salt: a function must be run with many repetitions to avoid false positives. Anyway, I think this is a good starting point to enhance the benchmark module.
The code is available as part of the dose3 framework I'm writing for the mancoosi project. You can download it here. If there is interest, I might ask to merge it with the benchmark project. At the moment the module has no dependencies on the rest of the code. Enjoy!
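The comparison rule described above (flag a time that exceeds the lowest recorded time of the same function by more than 0.001 seconds, and cope with samples that contain different functions) can be sketched like this. It is an illustration only, not the ExtBenchmark printing code: a sample is assumed to be a plain (name, seconds) list, and best, cell, names, table and print_table are hypothetical helpers.

```ocaml
(* Regressions larger than this many seconds are flagged with an asterisk. *)
let threshold = 0.001

(* Lowest recorded time of [name] across all samples, if it was ever measured. *)
let best name samples =
  List.fold_left
    (fun acc results ->
      match List.assoc_opt name results, acc with
      | Some t, Some b -> Some (min t b)
      | Some t, None -> Some t
      | None, _ -> acc)
    None samples

(* One table cell: "-" when the function is absent from this sample,
   the time followed by "*" when it regressed, the plain time otherwise. *)
let cell name samples results =
  match List.assoc_opt name results, best name samples with
  | None, _ -> "-"
  | Some t, Some b when t -. b > threshold -> Printf.sprintf "%.3f*" t
  | Some t, _ -> Printf.sprintf "%.3f" t

(* All function names seen in any sample, in order of first appearance. *)
let names samples =
  List.fold_left
    (fun acc results ->
      List.fold_left
        (fun acc (n, _) -> if List.mem n acc then acc else acc @ [ n ])
        acc results)
    [] samples

(* One row per function, one column per sample (oldest first). *)
let table samples =
  List.map (fun name -> name :: List.map (cell name samples) samples) (names samples)

let print_table samples =
  List.iter (fun row -> print_endline (String.concat "\t" row)) (table samples)
```

Comparing against the lowest time ever recorded, rather than against the previous run only, means a slow drift over several commits is still flagged.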
Comments
Machine id storage?
Hello,
How are you going to store the machine id, i.e. how can you extract the machine id when running the benchmark on a given machine?
Yours, d.
hostid
On Linux you have the gethostid call. It is part of the POSIX standard and should give you a 32-bit id of the machine you are running on. How the id is derived is left to the implementation (glibc, for example, reads /etc/hostid or falls back to the host's IPv4 address). Of course you can't be sure that it is unique, but it is better than nothing...
http://0pointer.de/blog/projects/ids.html