EBNF grammar for CUDF documents

The latest version of this document is available online at: http://www.mancoosi.org/cudf/ebnf

Revision History
Revision 0.5, 2010-04-22
first public (beta) release, comments are welcome

This document provides a non-authoritative EBNF grammar for CUDF documents.

The actual grammar is given in Section 2, “Grammar”, whereas some of the additional constraints that the grammar does not capture are documented in Section 3, “Additional constraints”. The authoritative reference for all constraints that apply to the parsing of CUDF documents is [TZ09].

Please refer to [CUDFW] for general information about the CUDF format.

Overall structure
[1] cudf ::= preamble? universe request /* a CUDF document is preamble + universe + request */
Flow elements
[2] ssep ::= (comment | '\n')* '\n' (comment | '\n')* /* stanza separator: empty line(s) and comments */
[3] comment ::= '#' line /* a comment starts with '#' and extends to the end of the line */
[4] line ::= [^\n]* '\n' /* (the final part of) a text line */
Document parts
[5] preamble ::= 'preamble: ' line stanza ssep /* the rest of the line after 'preamble: ' is ignored */
[6] universe ::= package* /* the universe is a list of packages */
[7] package ::= 'package: ' pkgname stanza ssep
[8] request ::= 'request: ' line stanza comment* /* the rest of the line after 'request: ' is ignored */
Stanzas
[9] stanza ::= (property '\n' | comment)* /* a stanza is a (non-empty) list of properties */
[10] property ::= propname ': ' value /* a property is a pair: name, value */
[11] propname ::= ident /* a property name is an identifier */
[12] value ::= bool | enum | int | nat | posint | string | pkgname | ident | typedecl | vpkg | veqpkg | vpkgformula | vpkglist | veqpkglist /* a property value belongs to the lexical space of one of the CUDF types */
Values: CUDF types
[13] bool ::= 'true' | 'false'
[14] int ::= ('+'|'-')? ['0'-'9']+
[15] string ::= [^'\r''\n']* /* Unicode strings with no CR/LF */
[16] vpkg ::= pkgname (sp+ vconstr)? /* package names with optional version predicates */
[17] vpkgformula ::= andfla | 'true!' | 'false!' /* a package formula is either a CNF, or a boolean */
[18] vpkglist ::= '' | vpkg (sp* ',' sp* vpkg)* /* a (possibly empty) list of package predicates */
[19] enum ::= ident
[20] pkgname ::= ['A'-'Z' 'a'-'z' '0'-'9' '-' '+' '.' '/' '@' '(' ')' '%']+
[21] ident ::= ['a'-'z'] ['a'-'z' '0'-'9' '-']*
[22] nat ::= '+'? ['0'-'9']+
[23] posint ::= '+'? ['0'-'9']* ['1'-'9'] ['0'-'9']*
[24] veqpkg ::= pkgname (sp+ veqconstr)? /* package names with optional version equality predicates */
[25] veqpkglist ::= '' | veqpkg (sp* ',' sp* veqpkg)*
[26] typedecl ::= '' | typedecl1 (sp* ',' sp* typedecl1)*
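The character-level productions above translate almost directly into regular expressions. The following Python sketch is an illustration only, not part of the CUDF specification: it transcribes productions [13], [14] and [20] to [23], and the names `LEXICAL` and `matching_types` are hypothetical helpers introduced here.

```python
import re

# Regular expressions transcribed from productions [13], [14], [20]-[23].
# The dictionary name and the helper below are illustrative assumptions.
LEXICAL = {
    "bool": re.compile(r"true|false"),
    "int": re.compile(r"[+-]?[0-9]+"),
    "nat": re.compile(r"\+?[0-9]+"),
    "posint": re.compile(r"\+?[0-9]*[1-9][0-9]*"),
    "ident": re.compile(r"[a-z][a-z0-9-]*"),
    "pkgname": re.compile(r"[A-Za-z0-9+./@()%-]+"),
}


def matching_types(text):
    """Return the names of all listed types whose lexical space contains text."""
    return [name for name, rx in LEXICAL.items() if rx.fullmatch(text)]
```

Note that the lexical spaces overlap: `matching_types("true")` reports `bool`, `ident` and `pkgname`. The type of a value therefore cannot be recovered from its text alone; it has to come from the schema of the property being parsed.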
Values: gory details
[27] vconstr ::= relop sp+ ver
[28] veqconstr ::= '=' sp+ ver
[29] relop ::= '=' | '!=' | '>=' | '>' | '<=' | '<' /* relational / comparison operators */
[30] sp ::= ' ' | '\t' /* space or tab Unicode characters */
[31] ver ::= posint /* versions are positive integers */
[32] andfla ::= orfla (sp* ',' sp* orfla)* /* a conjunction of disjunctions */
[33] orfla ::= atomfla (sp* '|' sp* atomfla)* /* a disjunction of atoms */
[34] atomfla ::= vpkg /* atoms are package predicates */
[35] typedecl1 ::= ident sp* ':' sp* typeexpr (sp* '=' sp* '[' value* ']')?
[36] typeexpr ::= typename | 'enum' sp* '[' ident (',' sp* ident)* ']'
[37] typename ::= 'bool' | 'int' | 'nat' | 'posint' | 'string' | 'pkgname' | 'ident' | 'vpkg' | 'veqpkg' | 'vpkgformula' | 'vpkglist' | 'veqpkglist' /* a type name other than 'enum' and 'typedecl' */
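As an illustration of how productions [16], [17], [27], [29] and [31] to [34] fit together, here is a minimal Python sketch of a vpkgformula parser. It is an assumption-laden example rather than a reference implementation: the function names are hypothetical, the result is a list of clauses (each a list of `(name, operator, version)` tuples), and representing 'true!' as an empty conjunction and 'false!' as a single empty disjunction is a choice made here, not something the grammar prescribes. The sketch is also slightly more permissive than the grammar, since it tolerates leading and trailing spaces around the whole formula.

```python
import re

# vpkg per productions [16], [27], [29], [31]: a package name, optionally
# followed by a relational operator and a positive-integer version.
VPKG = re.compile(
    r"(?P<name>[A-Za-z0-9+./@()%-]+)"
    r"(?:[ \t]+(?P<op>!=|>=|<=|=|>|<)[ \t]+(?P<ver>\+?[0-9]*[1-9][0-9]*))?"
)


def parse_vpkg(text):
    """Parse one atom into a (name, operator, version) tuple."""
    m = VPKG.fullmatch(text)
    if m is None:
        raise ValueError(f"not a vpkg: {text!r}")
    ver = m.group("ver")
    return (m.group("name"), m.group("op"), int(ver) if ver else None)


def parse_vpkgformula(text):
    """Parse a formula into a list of clauses (CNF), each a list of atoms."""
    if text == "true!":
        return []      # assumption: empty conjunction, always satisfied
    if text == "false!":
        return [[]]    # assumption: one empty disjunction, never satisfied
    # ',' and '|' cannot occur inside pkgname, so plain splitting is safe.
    return [
        [parse_vpkg(atom.strip(" \t")) for atom in clause.split("|")]
        for clause in text.split(",")
    ]
```

For example, `parse_vpkgformula("a >= 2, b | c != 1")` yields two clauses: one with the single atom `a >= 2`, and one with the alternatives `b` and `c != 1`.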

This section describes some of the additional constraints, not captured by the grammar of Section 2, “Grammar”, which are relevant to parsing CUDF documents.

The overall structure of a CUDF document is a sequence of stanzas, separated by empty lines. Three types of stanza are distinguished, according to whether they start with preamble: , package: , or request: .
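The stanza structure just described can be sketched in a few lines of Python. This is a simplified illustration, not a conforming parser: `split_stanzas` and `stanza_kind` are hypothetical names, values are kept as raw strings, and none of the type or schema checks discussed in this section are performed.

```python
def split_stanzas(text):
    """Group property lines into stanzas.

    Empty lines separate stanzas and comment lines are dropped. Each stanza
    is returned as a list of (property name, raw value) pairs.
    """
    stanzas, current = [], []
    for line in text.split("\n"):
        if line.startswith("#"):
            continue                  # comment line, production [3]
        if line == "":
            if current:               # an empty line closes the open stanza
                stanzas.append(current)
                current = []
            continue
        name, sep, value = line.partition(": ")
        if not sep:
            raise ValueError(f"not a property line: {line!r}")
        current.append((name, value))
    if current:
        stanzas.append(current)
    return stanzas


def stanza_kind(stanza):
    """Return 'preamble', 'package' or 'request' from the first property."""
    kind = stanza[0][0]
    if kind not in ("preamble", "package", "request"):
        raise ValueError(f"unknown stanza type: {kind!r}")
    return kind
```

Since a property line is split at the first occurrence of ': ', values that themselves contain ': ' (such as type declarations in the preamble) are preserved intact.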

Each stanza consists of a list of properties. Different properties are allowed depending on the type of stanza. For each property, a type of stanza defines a type schema, that is: whether the property is mandatory or not, which type of value is allowed for it (corresponding to one of the Values: CUDF types productions), and (in case it is optional) its default value.

In the remainder you can find a brief summary that contains, for each type of stanza, the relevant property schemata.




In addition to the property schemata listed above, extra schemata can be allowed, in package stanzas only, when declared in the preamble using the property named property. Its typedecl value defines a list of pairs <name, type>, each possibly equipped with a default value in square brackets. When 'enum' is used instead of a type name, the type of the property is ident, but only the identifiers listed in square brackets after 'enum' are allowed (see production [36]). When the declared property is of type string, the default value is specified within double quotes and is subject to unescaping; see the type definition of typedecl, in Section 2.2.2 of [TZ09] for more information.
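To make the shape of a typedecl value concrete, the following Python sketch parses a declaration list according to productions [26] and [35] to [37]. It is a rough illustration under several assumptions: `parse_typedecl` is a hypothetical name, default values are returned as raw text (no type checking and no string unescaping is attempted), and defaults containing ']' are not supported; [TZ09] remains the reference for the exact rules.

```python
import re

# Type names from production [37].
TYPENAMES = {
    "bool", "int", "nat", "posint", "string", "pkgname", "ident",
    "vpkg", "veqpkg", "vpkgformula", "vpkglist", "veqpkglist",
}

# One declaration, production [35]: name ':' type, then an optional default.
TYPEDECL1 = re.compile(
    r"(?P<name>[a-z][a-z0-9-]*)[ \t]*:[ \t]*"
    r"(?:enum[ \t]*\[(?P<enum>[^\]]*)\]|(?P<type>[a-z]+))"
    r"(?:[ \t]*=[ \t]*\[(?P<default>[^\]]*)\])?"
)
SEPARATOR = re.compile(r"[ \t]*,[ \t]*")


def parse_typedecl(text):
    """Return a list of (name, (type, enum identifiers or None), raw default)."""
    if text == "":
        return []
    decls, pos = [], 0
    while True:
        m = TYPEDECL1.match(text, pos)
        if m is None:
            raise ValueError(f"bad declaration at offset {pos}")
        if m.group("enum") is not None:
            idents = [i.strip(" \t") for i in m.group("enum").split(",")]
            type_ = ("enum", idents)
        elif m.group("type") in TYPENAMES:
            type_ = (m.group("type"), None)
        else:
            raise ValueError(f"unknown type name: {m.group('type')!r}")
        decls.append((m.group("name"), type_, m.group("default")))
        pos = m.end()
        if pos == len(text):
            return decls
        sep = SEPARATOR.match(text, pos)
        if sep is None:
            raise ValueError(f"expected ',' at offset {pos}")
        pos = sep.end()
```

For instance, the value `suite: enum[stable,unstable] = [stable], bugs: int = [0]` declares two extra package properties: `suite`, restricted to two identifiers with default `stable`, and `bugs`, an integer with default `0`.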