Changes with respect to the 2011 competition

The definition of the measurements has been completely reorganized: we now have a separation into
- a selection of a set of packages that will be measured,
- the definition of the measurements that act on the selected sets of packages.
The above change has a consequence on the interpretation of the paranoid criterion in that one counts packages, not names of packages.
The trendy criterion has been dropped. We now have criteria paranoid, basic user, full user.
Timeouts are now specific to tracks.

Basics

Three different optimization criteria will be used. All three are lexicographic combinations of different measurements which we describe below.

Note that each of these measurements calculates an integer value only from the initial and the final installation status, that is the user request to install or remove a certain package is not used here. The reason is that each proposed solution must first of all satisfy the user request, otherwise it will be discarded and the solver will obtain a penalty value. Only those proposed solutions that actually are solutions will be compared by the optimization criteria defined below, yielding a score according to the ranking obtained. This is more precisely described in the ranking rules.

Additional package property

For the third ("full user") criterion we will make use of an extra package property which is not among the mandatory package properties. This additional property is recommends, and it has values of type vpkgformula (that is the same type as the mandatory property depends). The idea is that recommended packages are non-mandatory dependencies.

For that reason, the CUDF documents will contain a specification of that property in the document preamble, like this:

preamble: property: recommends: vpkgformula = [ true! ]

Definition of package selectors

The measurements that we will describe below are taken on selected sets of packages in order to measure the quality of a proposed solution. Package sets are denoted by the following constant expressions:

solution, new, removed, changed. up, down

Basic sets are expressed using only solution, new, changed and removed.

Given an initial universe I and a proposed solution S, each of these constant expressions denotes a set of packages. In the following we write p.f for the value of property f in package p. If X is a set of packages then X.f denotes the set of values of packages in X under property f. For instance, I.name denotes the set of names of packages that are initially installed.

solution = S, that is solution denotes the set of packages that are in the proposed solution.
changed = {p ∈ S| p ∉ I} ∪ {p ∈ I| p ∉ S}, in other words the symmetric set difference between S and I. That is, changed denotes the set of packages that have been newly installed or removed. Note that when a package with name foo gets upgraded from version 1 to version 2 this means that (foo,1) gets removed and (foo,2) gets installed, and as a consequence the set changed contains both packages (foo,1) and (foo,2).
new = { p ∈ S | p.name ∉ I.name } is the set of packages that are finally installed, and for which no package of the same name was initially installed.
removed = { p ∈ I | p.name ∉ S.name } is the set of packages that were initially installed, and for which no package of the same name is finally installed.
up = { p ∈ S | ∃ q ∈ I with q.name = p.name, and ∀ q ∈ I with q.name = p.name : q.version < p.version}. In other words, up is the set of upgraded packages. In case multiple versions of packages with the same name may be installed, the upgrade consists of the packages with a version higher than any previously installed version.
down = { p ∈ S | ∃ q ∈ I with q.name = p.name, and ∀ q ∈ I with q.name = p.name: q.version > p.version}. In other words, down is the set of downgraded packages. In case multiple versions of packages with the same name may be installed, the downgrade consists of the packages with a version smaller than any previously installed version.

Example: The following table illustrates these definitions, The first two lines define I and S, the remaining ones show the sets calculated from these:

package name	a	b	c	d	r	s	t
versions in I	2	3	5	-	4, 6	4,6	4,6
versions in S	3	-	5	1	3, 7	5	5,7
versions in `solution`	3	-	5	1	3, 7	5	5,7
versions in `changed`	2,3	3	-	1	3,4,6,7	4,5,6	4,5,6,7
versions in `new`	-	-	-	1	-	-	-
versions in `removed`	-	3	-	-	-	-	-
versions in `up`	3	-	-	-	7	-	7
versions in `down`	-	-	-	-	3	-	-

Definition of measurements

Measurements are terms that are evaluated to an integer value. They are always constructed as a function applied to a set of packages (this set will be one of the sets solution, changed, new, removed defined above) plus possibly additional arguments. More exactly, a measure is a term of one of the forms

count(X), sum(X,f), unsat_recommends(X), aligned(X,g1,g2), notuptodate(X)

where X is a set of packages, f a property of type int or one of its subtypes, and g1 and g2 any properties. These properties may be core properties defined in the CUDF specification, or properties that are declared in the preamble of the CUDF document. Basic measurements are measurements built using only count or sum.

Measurements are evaluated as follows. Note that some of the definitions make use of the set S (the solution) or U (the universe, that is the set of all known packages):

count(X) = #(X)
that is, count(X) is the number of packages in X.

sum(X,f) = Σ_{x∈ X} x.f
that is, sum(X,f) is the sum over all packages in the set X of the value of their field f.
A typical example of this would be sum(solution,size), where size is an additional integer-valued property of packages indicating their size.

notuptodate(X) = # { p∈ X | ∃ q ∈ U : p.name = q.name and p.version < q.version }
that is, notuptodate(X) is the number of packages in X that are not in the latest known version.

unsat_recommends(X) = #{ (p,c) |p ∈ X, p recommends ..., c, ... and c is not satisfied by S }
that is unsat_recommends(X) counts the number of disjunctions in Recommends-fields of packages in X that are not satisfied by S.

For instance, if package a recommends
b, c|d|e, e|f|g, b|g, h
and if S installs a, e, f, and h, but neither of b, c, d, or g, then one would obtain for the package a alone a value of 2 for unsat_recommends since the 2nd, 3rd and 5th disjunct of the recommendation are satisfied, and the 1st and 4th disjunct are not. If no other package in X contains recommendations that means that, in that case, unsat_recommends(X)=2.

aligned(X,g1,g2) = #{ (x.g1,x.g2) | x ∈ X } - #{ x.g1 | x ∈ X }
In other words, we cluster the packages in X according to their values at the properties g1 and g2 and count the number of clusters, yielding a value v1. Then we do the same when clustering only by the property g1, yielding a value v2. The value returned is v1-v2.

The properties g1 and g2 may only have type int or string.
<p/>
  There are several applications for this:
  <ol>
<li> aligned(solution,package,version) is the number of multiple
  versions of installed packages: a package installed in 
  only one version counts 0, and a package installed in n different
  versions counts n-1.
  <p/>
    For instance, if the <kbd>solution</kbd> is
    S={(a,1), (a,2), (a,3), (b,1), (b,2)} then the first set of
    clusters by (package,version) would be the same as S of
    cardinality 5, and the second set of clusters by package name only
    would be {a,b} of cardinality 2, yielding a result of 5-2=3.
</li>
<li> aligned(solution,sourcename,sourceversion) is
the number of <i>version changes</i> as defined in the paper
<a href=http://eptcs.org/paper.cgi?LoCoCo2011.1>Aligning
  component upgrades</a> by Di Cosmo, Lhomme, Michel at LoCoCo 2011.
  </ol>

Definition of the three tracks

The criteria used in the three tracks are

paranoid: we want to answer the user request, minimizing the number of packages removed in the solution, and also the packages changed by the solution; the optimization criterion is

lex( min removed, min changed)
Hence, two solutions S1 and S2 will be compared as follows:
1. compute r1 = count(removed) and c1 = count(changed) for the first solution S1, and r2 = count(removed) and c2 = count(changed) for the second solution S2,
2. S1 is better than S2 iff r1 < r2 or (r1=r2 and c1< c2)
Note that this is definition is subtly different from the paranoid criterion used in earlier MISC competitions in that one counts changed packages, not only names of changed packages.

The timeout for this track is 30 seconds. The signal SIGUSR1 is sent approximately 5 seconds before the timeout.
basic user: we want to answer the user request, optimizing by a criterion provided by the user. The criterion is a lexicographic combination of basic measurements and basic sets as defined above, each of them with a polarity ("+" for a function to maximize, "-" for a function to minimize). The functions are listed in decreasing order of priority, and separated by the symbol ",".
Hence, in this track we use only sets build with solution, new, changed and removed, and we use only measurements build with count and sum.
For instance, the paranoid criterion of the first track could be written as
-count(removed), -count(changed)
The timeout for this track is 150 seconds. The signal SIGUSR1 is sent approximately 10 seconds before the timeout.

full user: we want to answer the user request, optimizing by a criterion provided by the user. The criterion is a lexicographic combination of any set expressions and any measurements as defined above, each of them with a polarity ("+" for a function to maximize, "-" for a function to minimize). The functions are listed in decreasing order of priority, and separated by the symbol ",".
Hence, in this track we use sets build with solution, new, removed, changed. up, down; and we use measurements build with count, sum, unsat_recommends, aligned, and notuptodate.
For instance, a user of a machine with a small disk might write

-count(removed),-sum(solution,installedsize),-notuptodate(solution),-unsat_recommends(solution),-count(new)
The timeout for this track is 300 seconds. The signal SIGUSR1 is sent approximately 10 seconds before the timeout.