
For one of our experiments, we ended up analyzing every version that appears in a Debian Packages file, either in a versioned constraint or in the version field of a package. It seems that a lot of DDs like to use strange ‘formatting’ when writing down a versioned constraint. This might not be a choice made directly by the Debian maintainer, but just a consequence of a particular versioning scheme used upstream.

The Debian policy is rigorous about the algorithm used to compare versions:

The strings are compared from left to right.

First the initial part of each string consisting entirely of non-digit characters is determined. These two parts (one of which may be empty) are compared lexically. If a difference is found it is returned. The lexical comparison is a comparison of ASCII values modified so that all the letters sort earlier than all the non-letters and so that a tilde sorts before anything, even the end of a part. For example, the following parts are in sorted order from earliest to latest: ~~, ~~a, ~, the empty part, a.

Then the initial part of the remainder of each string which consists entirely of digit characters is determined. The numerical values of these two parts are compared, and any difference found is returned as the result of the comparison. For these purposes an empty string (which can only occur at the end of one or both version strings being compared) counts as zero.

These two steps (comparing and removing initial non-digit strings and initial digit strings) are repeated until a difference is found or both strings are exhausted.
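The quoted steps can be sketched in Python as follows. This is only an illustration, not dpkg's implementation: the function names are made up for this example, and the way the version is split into epoch, upstream version and revision (first colon, last hyphen) is an assumption based on my reading of the policy, not something stated in the excerpt above.

```python
import re

def _char_order(c):
    # Sort weight for the modified lexical comparison:
    # tilde first, then end of part, then letters, then all other characters.
    if c == "~":
        return -1
    if c == "":
        return 0
    if c.isalpha():
        return ord(c)
    return ord(c) + 256

def _compare_nondigit(a, b):
    # Compare two non-digit parts character by character.
    for i in range(max(len(a), len(b))):
        oa = _char_order(a[i] if i < len(a) else "")
        ob = _char_order(b[i] if i < len(b) else "")
        if oa != ob:
            return -1 if oa < ob else 1
    return 0

def compare_part(a, b):
    # Alternate between non-digit and digit prefixes until a difference
    # is found or both strings are exhausted.
    while a or b:
        na = re.match(r"[^0-9]*", a).group()
        nb = re.match(r"[^0-9]*", b).group()
        result = _compare_nondigit(na, nb)
        if result:
            return result
        a, b = a[len(na):], b[len(nb):]
        da = re.match(r"[0-9]*", a).group()
        db = re.match(r"[0-9]*", b).group()
        va, vb = int(da or "0"), int(db or "0")  # an empty digit part counts as zero
        if va != vb:
            return -1 if va < vb else 1
        a, b = a[len(da):], b[len(db):]
    return 0

def split_version(v):
    # Assumed layout: [epoch:]upstream[-revision]
    epoch, sep, rest = v.partition(":")
    if not sep:
        epoch, rest = "0", v
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision

def compare_versions(a, b):
    ea, ua, ra = split_version(a)
    eb, ub, rb = split_version(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return compare_part(ua, ub) or compare_part(ra, rb)

print(compare_versions("0.00001-1", "0.1-1"))  # 0, the two versions are equivalent
print(compare_versions("1.0~rc1-1", "1.0-1"))  # -1, a tilde sorts before the end of a part
```

Because each digit part is converted to an integer before it is compared, any number of leading zeros disappears at that point.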

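I think the important part is the numerical comparison: “The numerical values of these two parts are compared, and any difference found is returned as the result of the comparison.” Since digit parts are compared as numbers, leading zeros are ignored, so 01 and 1 denote the same component.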

Our tool finds a number of examples of equivalent versions:

* version 0.1-1 is equivalent to (0.00001-1, 0.001-1, 0.01-1, 0.1-1)
* version 2.1.0-1 is equivalent to (2.01.00-1, 2.1.00-1, 2.1.0-1)
* … etc.

I’ve one page full of equivalent versions.
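One simple way to find such groups, shown here only as a sketch and not as the tool used for the experiment, is to rewrite every run of digits without its leading zeros and to group versions by the result. The helper names `canonical_key` and `group_equivalent` are hypothetical. Note that this only catches versions that differ by leading zeros, not every pair the policy algorithm would consider equal.

```python
import re
from collections import defaultdict

def canonical_key(version):
    # Replace each run of digits by its integer value, which drops leading zeros.
    return re.sub(r"[0-9]+", lambda m: str(int(m.group())), version)

def group_equivalent(versions):
    # Map each canonical key to the distinct spellings seen for it,
    # keeping only the keys that have more than one spelling.
    groups = defaultdict(set)
    for v in versions:
        groups[canonical_key(v)].add(v)
    return {key: sorted(vs) for key, vs in groups.items() if len(vs) > 1}

versions = ["0.00001-1", "0.001-1", "0.01-1", "0.1-1",
            "2.01.00-1", "2.1.00-1", "2.1.0-1", "1.2-3"]
for key, spellings in group_equivalent(versions).items():
    print(key, "is equivalent to", spellings)
```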

For example, version ‘0.00001-1’ comes from the package libbenchmark-progressbar-perl. I don’t know why so many equivalent ways of writing a version string are used. They probably appear in completely unrelated packages, and I’m pretty sure no harm is intended :) However, to avoid confusion, it might be a good idea to settle on a non-normative canonical representation of versions, for instance no leading ‘0’ in the numeric parts separated by dots and hyphens… Maybe we could add a warning to lintian?
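To give an idea of what such a warning could look for, here is a small Python sketch. It is not lintian code (lintian has its own check framework), and both the function names and the tag name `zero-padded-version-component` are invented for this example.

```python
import re

def zero_padded_components(version):
    # Find digit runs that start with '0' and contain more than one digit.
    # The lookbehind ensures the match starts at the beginning of a digit run.
    return re.findall(r"(?<![0-9])0[0-9]+", version)

def check_version(package, version):
    # Return a warning line in a lintian-like style, or None if the version is fine.
    padded = zero_padded_components(version)
    if padded:
        return (f"W: {package}: zero-padded-version-component "
                f"{version} ({', '.join(padded)})")
    return None

print(check_version("libbenchmark-progressbar-perl", "0.00001-1"))
```

A single ‘0’ component, as in 2.1.0-1, is not flagged, since it is already the shortest way to write zero.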

If somebody is interested, I can generate a full report that associates each version with a package name.