stOTTR: Terse syntax for Reasonable Ontology Templates
Terse Syntax for Reasonable Ontology Templates (stOTTR)
Easy to read and write syntax for OTTR templates and instances.
- Version
- 0.1.2
- Published
- 2022-09-20
- Authors
- Martin G. Skjæveland, Leif Harald Karlsen
- Issues
- https://gitlab.com/ottr/language/stOTTR/issues
1 Introduction
This specification defines the Terse Syntax for Reasonable Ontology Templates (stOTTR) for serialising OTTR templates and instances of OTTR templates, as defined by rOTTR [2].
This specification determines what is syntactically correct stOTTR. It does not define the semantics of stOTTR, e.g., the result of expanding templates or what constitutes a valid template library or dataset; these are defined in other specifications [1,2].
The formal specification of stOTTR is done in Antlr EBNF [3].
The vocabulary in this formalisation lies close to the formal vocabulary established in mOTTR [1] and rOTTR [2]. We will therefore not explicitly establish the (reasonably obvious) connection between these vocabularies, e.g., that the stOTTR syntactic element template represents a template as defined in mOTTR [1].
The stOTTR grammar is based on parts of the RDF Turtle grammar [4]. More specifically, it extends the syntax used for terms, i.e., IRIs, blank nodes and literals, to include the declaration of templates and instances.
stOTTR is designed to be compact and easy to read and write.
1.1 Documents
This specification consists of the following files:
- ./stOTTR.g4
- Antlr grammar specification of stOTTR. This is the formal grammar specification of the stOTTR syntax.
- ./Turtle.g4
- Antlr grammar specification of the parts of Turtle which are used by stOTTR.
- ./index.html
- This file. It provides explanations around the Antlr grammar specification, including examples and non-examples.
1.2 Document conventions
Prefixes
In this document, we assume the following prefix declarations:
Prefix | Namespace |
---|---|
ottt | https://spec.ottr.xyz/rottr/0/ |
xsd | http://www.w3.org/2001/XMLSchema# |
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs | http://www.w3.org/2000/01/rdf-schema# |
owl | http://www.w3.org/2002/07/owl# |
ex | http://example.net/ns# |
Formatting
This is the first main occurrence of a defined term.
This is a mention of an already defined term.
This is an example box containing correct stOTTR syntax. However, the example need not be a complete stOTTR document.
This is an example box containing incorrect stOTTR syntax.
This is a grammar box containing grammar rules in Antlr syntax.
2 stOTTR language
2.1 stOTTR documents
A stOTTR document is the unit or artefact that stOTTR is read from or written to.
A stOTTR document may contain any number of RDF prefixes and base declarations, as defined by Turtle [4], and any number of stOTTR statements. A stOTTR statement is either a signature, a template, a base template or an instance. Each stOTTR statement must be terminated by a dot (.).
stOTTRDoc
  : ( directive   // Turtle prefixes and base
    | statement
    )* EOF
  ;

statement : ( signature | template | baseTemplate | instance ) '.' ;
@prefix : <http://example.xyz/ns> .
@prefix ex: <http://example.net/ns> .
PREFIX ex2: <http://example.com/ns>
2.2 Comments
Comments are useful for adding explanatory notes to documents. These are ignored by the parser.
Comments may be added to documents in two forms: single line comments or multi-line comments. Anything following the # character on the same line is a single line comment. Anything between the tokens /*** and ***/ is a multi-line comment.
Comment      : '#' ~('\r' | '\n')* -> skip ;
CommentBlock : '/***' .*? '***/' -> skip ;
# This is a single line comment
@prefix foaf: <http://xmlns.com/foaf/0.1/>. # This is a comment too
/*** This is a
     multi-line comment ***/
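The effect of the two comment rules can be approximated outside Antlr as follows. This is only a minimal Python sketch, not part of the specification: the function name strip_comments is hypothetical, and the sketch assumes that the only tokens that may contain a # character are IRI references and double-quoted strings, which it matches first so that they are left intact.

```python
import re

# Tokens that may contain '#' are matched first and kept, roughly mirroring
# how a lexer would prefer the IRI or string token at that position.
# Assumption: only <...> IRI references and "..." strings need this protection.
_TOKEN = re.compile(
    r'(?P<keep><[^<>\s]*>|"(?:[^"\\\r\n]|\\.)*")'  # IRI reference or double-quoted string
    r"|/\*\*\*.*?\*\*\*/"                            # multi-line comment
    r"|#[^\r\n]*",                                   # single line comment
    re.DOTALL,
)


def strip_comments(text: str) -> str:
    """Remove stOTTR comments from text, keeping IRIs and strings untouched."""
    return _TOKEN.sub(lambda m: m.group("keep") or "", text)


source = (
    "@prefix ex: <http://example.net/ns#> . # a comment\n"
    "/*** block\ncomment ***/"
    'ex:T("a#b") .'
)
cleaned = strip_comments(source)
```

Note that the # inside the namespace IRI and inside the string literal is preserved, whereas a plain search for # would truncate both lines.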
2.3 Terms
A term is, syntactically, either a variable, a constant or a list of terms.
A variable denotes a blank node and is written as ?blank_label, where blank_label must be in the same format as the blank node labels of Turtle [4], but without the leading underscore and colon characters _:.
A constant can either be an IRI, a blank node or a literal, all as defined by Turtle [4]. The keyword none is short for ottr:none, the IRI for the empty term.
A list is a sequence of terms separated by commas , and surrounded by round brackets ( ).
Note, however, that the variable ?var and the blank node _:var denote the same RDF node. Using a blank node whose label equals that of a variable in the same template is therefore discouraged, and a parser should issue a warning.
term         : Variable | constant | termList ;
constantTerm : constant | constantList ;

Variable : '?' BNodeLabel ;

/* Turtle blank node labels without leading '_:' */
fragment BNodeLabel : (PN_CHARS_U) ((PN_CHARS | '.')* PN_CHARS)? ;

constant : iri | blankNode | literal | none ;
none     : 'none' ;

termList     : '(' (term (',' term)*)? ')' ;
constantList : '(' (constantTerm (',' constantTerm)*)? ')' ;
# These are variables:
?myVariable
?MYVARIABLE2

# These are IRIs:
<http://example.com/iri>
ex:iri2
:iri3

# These are blank nodes:
[]
_:blanknode

# These are literals:
"string"
"typed string"^^xsd:normalizedString
23
true
3.4

# These are constant lists:
("string", ex:iri, 23)
(("string", 23), ex:iri, ( 34) )
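The warning about variables and blank nodes sharing a label could be detected along the following lines. This is a rough Python sketch under stated assumptions: the function name label_clashes is hypothetical, labels are restricted to a simplified ASCII subset of the BNodeLabel rule, and the input is scanned as plain text, so occurrences inside string literals are not excluded.

```python
import re

# Simplified, ASCII-only label patterns (assumption: the full PN_CHARS
# ranges of the grammar are not needed for this illustration).
VARIABLE = re.compile(r"\?([A-Za-z_][A-Za-z0-9_-]*)")
BLANK_NODE = re.compile(r"_:([A-Za-z0-9_][A-Za-z0-9_-]*)")


def label_clashes(template_text: str) -> list:
    """Return the labels used both as ?label and as _:label in the text."""
    variables = set(VARIABLE.findall(template_text))
    blank_nodes = set(BLANK_NODE.findall(template_text))
    return sorted(variables & blank_nodes)


clashes = label_clashes("ex:T [ ?var ] :: { ex:U (?var, _:var, _:other) } .")
```

A parser following the recommendation above would report one warning per label in the returned list.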
2.4 Types
A type, i.e., the type of a term, is either a basic type, a list type or a LUB-type (least upper bound).
A basic type is an IRI written as a prefixed name, as defined by Turtle [4]. Full IRIs are not allowed, so that they cannot be confused with the angle brackets of list types and LUB-types. Permissible types are defined in rOTTR [2].
A list type is written as List<TYPE> or NEList<TYPE> where TYPE is a type. NEList is short for "non-empty list".
A LUB-type is written as LUB<TYPE> where TYPE is a basic type.
type       : basicType | lubType | listType | neListType ;
listType   : 'List<' type '>' ;
neListType : 'NEList<' type '>' ;
lubType    : 'LUB<' basicType '>' ;
basicType  : prefixedName ;
# These are basic types:
xsd:string
owl:Class
rdfs:Resource
ottt:Bot

# These are list types:
List<xsd:string>
List<NEList<xsd:int>> ## list types can be nested.

# These are least upper bound types:
LUB<xsd:string>
LUB<owl:Class>
# Error! This is a valid IRI and a permissible type according to rOTTR,
# but it should be written as a prefixedName:
<http://www.w3.org/2002/07/owl#Class>

# Error! LUB-types cannot be nested:
LUB<LUB<owl:Class>>
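The recursive structure of the type rules can be illustrated with a small recogniser. The following Python sketch is not the Antlr parser: the function names are hypothetical and prefixed names are approximated by an ASCII-only pattern, which is an assumption made to keep the example short. It checks syntax only, not whether a type is permissible according to rOTTR.

```python
import re

# Simplified, ASCII-only approximation of Turtle's prefixedName (assumption).
PREFIXED_NAME = re.compile(
    r"(?:[A-Za-z][A-Za-z0-9_.-]*)?:(?:[A-Za-z0-9_][A-Za-z0-9_.-]*)?"
)


def is_basic_type(text: str) -> bool:
    """basicType : prefixedName"""
    return PREFIXED_NAME.fullmatch(text) is not None


def is_type(text: str) -> bool:
    """type : basicType | lubType | listType | neListType"""
    text = text.strip()
    for keyword in ("List<", "NEList<"):
        if text.startswith(keyword) and text.endswith(">"):
            return is_type(text[len(keyword):-1])  # list types may nest
    if text.startswith("LUB<") and text.endswith(">"):
        return is_basic_type(text[4:-1].strip())   # LUB takes a basic type only
    return is_basic_type(text)
```

Applied to the boxes above, is_type accepts List<NEList<xsd:int>> and LUB<owl:Class>, and rejects both the full IRI and the nested LUB-type.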
2.5 Template instances
A template instance consists of a template name (see specification of templates below), a possibly empty list of arguments, and may be prefixed by a list expander separated from the template name with a vertical bar |.
The list expander is either cross, zipMin or zipMax.
The arguments in the argument list are separated by commas , and surrounded by round brackets ( ).
An argument is a term, possibly prefixed by the "listExpand" marker ++.
instance     : (ListExpander '|')? templateName argumentList ;
ListExpander : 'cross' | 'zipMin' | 'zipMax' ;
argumentList : '(' (argument (',' argument)*)? ')' ;
argument     : ListExpand? term ;
ListExpand   : '++' ;
# An instance with two arguments, both are IRIs:
ex:Template (ex:A, ex:B) .

# An instance with two arguments, both are no-value arguments:
ex:Template (none, none) .

# An instance with two arguments, one variable and one literal:
ex:Template (?var, "string") .

# An instance with one argument, which is a list with two elements.
# The instance is marked with the list expander cross, and the argument is marked with the listExpand marker ++.
cross | ex:Template (++(ex:A, ex:B)) .

# An instance with three arguments, where the last argument is a list.
# The instance is marked with the list expander zipMin, and the list argument is marked with the listExpand marker ++.
zipMin | ex:Template (1, 2, ++(ex:A, ex:B, ex:C)) .
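The shape of an instance, as given by the grammar rules above, can be mirrored in a small data structure. The following Python sketch is illustrative only: the class names Argument and Instance and the method serialise are hypothetical, terms are assumed to be already serialised strings, and nothing is said about how list expanders are evaluated, which is outside the scope of this specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional

LIST_EXPANDERS = ("cross", "zipMin", "zipMax")


@dataclass
class Argument:
    term: str                  # already serialised term, e.g. 'ex:A' or '(ex:A, ex:B)'
    list_expand: bool = False  # True if the argument is prefixed by '++'


@dataclass
class Instance:
    template: str                                   # template name, an IRI
    arguments: List[Argument] = field(default_factory=list)
    expander: Optional[str] = None                  # one of LIST_EXPANDERS, or None

    def serialise(self) -> str:
        """Write the instance in stOTTR syntax, without the terminating dot."""
        if self.expander is not None and self.expander not in LIST_EXPANDERS:
            raise ValueError(f"unknown list expander: {self.expander}")
        args = ", ".join(
            ("++" if a.list_expand else "") + a.term for a in self.arguments
        )
        prefix = f"{self.expander} | " if self.expander else ""
        return f"{prefix}{self.template} ({args})"


example = Instance(
    "ex:Template", [Argument("(ex:A, ex:B)", list_expand=True)], "cross"
)
text = example.serialise()
```

The terminating dot is left out on purpose, since an instance inside a pattern or an annotation is not followed by a dot.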
2.6 Template signatures
A signature consists of a template name, a possibly empty list of parameters, and an optional list of template instances which annotate the signature.
A template name is an IRI.
The parameters in the parameter list are separated by commas , and surrounded by square brackets [ ].
A parameter consists of zero or more parameter modifiers, an optional type, a variable and an optional default value. There are two parameter modifiers: optional, written ?, and non blank, written !. The type and the variable are written as specified in the sections above. If no type is given, the type is set to the most general type, Top. A default value is given by separating the variable and the default value with an equals sign =.
Annotation instances must be prefixed by @@.
signature      : templateName parameterList annotationList? ;
templateName   : iri ;
parameterList  : '[' (parameter (',' parameter)*)? ']' ;
parameter      : ParameterMode* type? Variable defaultValue? ;
ParameterMode  : '?' /* optional */ | '!' /* non blank */ ;
defaultValue   : '=' constantTerm ;
annotationList : annotation (',' annotation)* ;
annotation     : '@@' instance ;
ex:Template1 [ ?a , ?b ] .
The signature ex:Template1 has two parameters, both with no modifiers, the default type and no default value. The template has no annotations.
ex:Template2 [ ! owl:Class ?a, ? xsd:int ?b = 5 ] .
The signature ex:Template2 has two parameters. The first parameter has a non blank modifier ! and the type owl:Class. The second parameter has an optional modifier ?, the type xsd:int and the default value 5.
ex:Template3 [ !??a ] .
The signature ex:Template3 has a single parameter with two modifiers: non blank and optional.
ex:Template4 [ ]
  @@ex:Template1(ex:Template4, "arg"),
  @@ex:Template1(ex:Template4, "other arg") .
The signature ex:Template4 has no parameters. It is annotated by two instances.
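How the parts of a single parameter fit together can be sketched with a regular expression. This Python example is an approximation and not the Antlr grammar: the name parse_parameter is hypothetical, variable names are limited to a simplified ASCII pattern, and the type and default value are captured as raw text without further checking.

```python
import re

# parameter : ParameterMode* type? Variable defaultValue?
# Assumptions: ASCII-only variable names; type and default kept as raw text.
PARAMETER = re.compile(
    r"\s*(?P<modes>[?!\s]*?)\s*"              # modifiers: '?' optional, '!' non blank
    r"(?P<type>[^\s?!=,]+)?\s*"               # optional type, e.g. owl:Class
    r"\?(?P<var>[A-Za-z_][A-Za-z0-9_-]*)\s*"  # the variable
    r"(?:=\s*(?P<default>.+?))?\s*"           # optional default value
)


def parse_parameter(text: str) -> dict:
    """Split one parameter, as written between '[' and ']', into its parts."""
    match = PARAMETER.fullmatch(text)
    if match is None:
        raise ValueError(f"not a parameter: {text!r}")
    modes = match.group("modes")
    return {
        "optional": "?" in modes,
        "non_blank": "!" in modes,
        "type": match.group("type"),
        "variable": match.group("var"),
        "default": match.group("default"),
    }


second = parse_parameter("? xsd:int ?b = 5")
both = parse_parameter("!??a")
```

The two calls correspond to the second parameter of ex:Template2 and the parameter of ex:Template3 above; a missing type is reported as None rather than being replaced by the default type.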
2.7 Templates and Base templates
A template consists of a signature and a pattern. The pattern is a list of template instances separated by commas , and surrounded by curly brackets { }.
A base template consists of a signature, but has no pattern. In place of a pattern, a base template contains the token BASE.
In both cases the signature is separated from what follows it by a double colon ::.
baseTemplate : signature '::' 'BASE' ;
template     : signature '::' patternList ;
patternList  : '{' (instance (',' instance)*)? '}' ;
# This is a base template:
ex:Template1 [ ?a, ?b ] :: BASE .

# This is a template with a pattern containing three instances:
ex:Template2 [ ?a = "String", ?b ] :: {
  ex:Template1 ( "arg1", "arg2" ),
  ex:Template1 ( "arg3", "arg4" ),
  ex:Template1 ( "arg5", "arg6" )
} .

# This is a template, which has one annotation, and a pattern containing one instance.
ex:Template [ ?a, ?b ]
  @@ex:Template2 (ottt:none, 23)
  :: {
    ex:Template3 ( true, ex:A )
  } .
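The difference between a template and a base template is purely in what follows the double colon, which the following Python sketch makes explicit. The function name render_template is hypothetical, and the signature and the instances are assumed to be already serialised strings without terminating dots.

```python
from typing import List, Optional


def render_template(signature: str, pattern: Optional[List[str]] = None) -> str:
    """Write a template statement; pattern=None yields a base template."""
    if pattern is None:
        return f"{signature} :: BASE ."
    if not pattern:
        return f"{signature} :: {{ }} ."  # the grammar allows an empty pattern
    body = ",\n".join(f"  {instance}" for instance in pattern)
    return f"{signature} :: {{\n{body}\n}} ."


base = render_template("ex:Template1 [ ?a, ?b ]")
full = render_template("ex:T [ ]", ['ex:U ("a")', 'ex:U ("b")'])
```

Each call produces one complete stOTTR statement, including the terminating dot.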
3 Appendix
3.1 stOTTR grammar
This is the complete stOTTR Antlr grammar.
grammar stOTTR;
import Turtle;

stOTTRDoc
  : ( directive   // Turtle prefixes and base
    | statement
    )* EOF
  ;

statement : ( signature | template | baseTemplate | instance ) '.' ;

/*** Comments ***/

Comment      : '#' ~('\r' | '\n')* -> skip ;
CommentBlock : '/***' .*? '***/' -> skip ;

/*** Signature ***/

signature      : templateName parameterList annotationList? ;
templateName   : iri ;
parameterList  : '[' (parameter (',' parameter)*)? ']' ;
parameter      : ParameterMode* type? Variable defaultValue? ;
ParameterMode  : '?' /* optional */ | '!' /* non blank */ ;
defaultValue   : '=' constantTerm ;
annotationList : annotation (',' annotation)* ;
annotation     : '@@' instance ;

/*** Templates ***/

baseTemplate : signature '::' 'BASE' ;
template     : signature '::' patternList ;
patternList  : '{' (instance (',' instance)*)? '}' ;

/*** Instance ***/

instance     : (ListExpander '|')? templateName argumentList ;
ListExpander : 'cross' | 'zipMin' | 'zipMax' ;
argumentList : '(' (argument (',' argument)*)? ')' ;
argument     : ListExpand? term ;
ListExpand   : '++' ;

/*** Types ***/

type       : basicType | lubType | listType | neListType ;
listType   : 'List<' type '>' ;
neListType : 'NEList<' type '>' ;
lubType    : 'LUB<' basicType '>' ;
basicType  : prefixedName ;

/*** Terms ***/

term         : Variable | constant | termList ;
constantTerm : constant | constantList ;

Variable : '?' BNodeLabel ;

/* Turtle blank node labels without leading '_:' */
fragment BNodeLabel : (PN_CHARS_U) ((PN_CHARS | '.')* PN_CHARS)? ;

constant : iri | blankNode | literal | none ;
none     : 'none' ;

termList     : '(' (term (',' term)*)? ')' ;
constantList : '(' (constantTerm (',' constantTerm)*)? ')' ;
3.2 Turtle grammar
The Turtle Antlr grammar is created from the EBNF grammar taken from the W3C recommendation at https://www.w3.org/TR/turtle/#sec-grammar-grammar by following this procedure:
- replacing ::= with :
- terminating rules with ;
- fixing quotes, e.g., from ''' to '\'\'\''
- escaping necessary characters, e.g., backslashes
- correcting negations, e.g., from [^A] to (~[A])
- correcting unicode, e.g., from #x20 to \u0020
- correcting WS and PN_CHARS_BASE:
  - converting into acceptable lexer rules
  - removing '\u10000' .. '\uEFFFF' range from PN_CHARS_BASE
- changing some capitalised non-terminal rules to lowercase: prefixedName, rdfLiteral
Additionally,
- Many parser rules have been commented out as they are not needed by stOTTR, e.g., triples
- The lexer rules BlankNode and ANON have been made into parser rules in order to work with stOTTR's parameter lists.
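Some of the mechanical steps in the procedure above can be expressed as text substitutions. The following Python sketch is an illustration only and is not the tool that was used to create the grammar: the function name ebnf_rule_to_antlr is hypothetical, and it covers just three steps (replacing ::=, rewriting #x code points of up to four hexadecimal digits, and rewriting negated character classes) plus the terminating semicolon. Its output would still need the manual corrections listed above before Antlr accepts it.

```python
import re


def ebnf_rule_to_antlr(rule: str) -> str:
    """Apply a few of the textual conversion steps to one W3C EBNF rule."""
    # Step: replace '::=' with ':'
    rule = rule.strip().replace("::=", ":", 1)
    # Step: '#x20' -> '\u0020' (code points of up to four hex digits only)
    rule = re.sub(
        r"#x([0-9A-Fa-f]{1,4})\b",
        lambda m: "\\u" + m.group(1).upper().zfill(4),
        rule,
    )
    # Step: negated character class '[^A]' -> '~[A]'
    rule = rule.replace("[^", "~[")
    # Step: terminate the rule with ';'
    return rule + " ;"


converted = ebnf_rule_to_antlr("X ::= [^#x22#x5C]")
```

Code points with more than four hexadecimal digits are deliberately left untouched, in line with the removal of the '\u10000' .. '\uEFFFF' range described above.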
/* RDF turtle antlr grammar, reduced to rules for prefixes and terms
   for use by stOTTR grammar. */
grammar Turtle;

// [1]
// turtleDoc : statement*;
// [2]
// statement : directive | triples '.';
// [3]
directive : prefixID | base | sparqlPrefix | sparqlBase;
// [4]
prefixID : '@prefix' PNAME_NS IRIREF '.';
// [5]
base : '@base' IRIREF '.';
// [5s]
sparqlBase : 'BASE' IRIREF;
// [6s]
sparqlPrefix : 'PREFIX' PNAME_NS IRIREF;
// [6]
// triples : subject predicateObjectList | blankNodePropertyList predicateObjectList?;
// [7]
// predicateObjectList : verb objectList (';' (verb objectList)?)*;
// [8]
// objectList : object (',' object)*;
// [9]
// verb : predicate | 'a';
// [10]
// subject : iri | blankNode | collection;
// [11]
// predicate : iri;
// [12]
// object : iri | blankNode | collection | blankNodePropertyList | literal;
// [13]
literal : rdfLiteral | numericLiteral | BooleanLiteral;
// [14]
// blankNodePropertyList : '[' predicateObjectList ']';
// [15]
// collection : '(' object* ')';
// [16]
numericLiteral : INTEGER | DECIMAL | DOUBLE;
// [128s]
rdfLiteral : String (LANGTAG | '^^' iri)?;
// [133s]
BooleanLiteral : 'true' | 'false';
// [17]
String : STRING_LITERAL_QUOTE | STRING_LITERAL_SINGLE_QUOTE | STRING_LITERAL_LONG_SINGLE_QUOTE | STRING_LITERAL_LONG_QUOTE;
// [135s]
iri : IRIREF | prefixedName;
// [136s]
prefixedName : PNAME_LN | PNAME_NS;
// [137s]
blankNode : BLANK_NODE_LABEL | anon ;

// Productions for terminals

// [18]
IRIREF : '<' ((~[\u0000-\u0020<>"{}|^`\\]) | UCHAR)* '>' /* #x00=NULL #01-#x1F=control codes #x20=space */ ;
// [139s]
PNAME_NS : PN_PREFIX? ':';
// [140s]
PNAME_LN : PNAME_NS PN_LOCAL;
// [141s]
BLANK_NODE_LABEL : '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)?;
// [144s]
LANGTAG : '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*;
// [19]
INTEGER : [+-]? [0-9]+;
// [20]
DECIMAL : [+-]? [0-9]* '.' [0-9]+;
// [21]
DOUBLE : [+-]? ([0-9]+ '.' [0-9]* EXPONENT | '.' [0-9]+ EXPONENT | [0-9]+ EXPONENT);
// [154s]
EXPONENT : [eE] [+-]? [0-9]+;
// [22]
STRING_LITERAL_QUOTE : '"' ((~[\u0022\u005C\u000A\u000D]) | ECHAR | UCHAR)* '"' /* #x22=" #x5C=\ #xA=new line #xD=carriage return */ ;
// [23]
STRING_LITERAL_SINGLE_QUOTE : '\'' ((~[\u0027\u005C\u000A\u000D]) | ECHAR | UCHAR)* '\'' /* #x27=' #x5C=\ #xA=new line #xD=carriage return */ ;
// [24]
STRING_LITERAL_LONG_SINGLE_QUOTE : '\'\'\'' (('\'' | '\'\'')? ((~['\\]) | ECHAR | UCHAR))* '\'\'\'';
// [25]
STRING_LITERAL_LONG_QUOTE : '"""' (('"' | '""')? ((~["\\]) | ECHAR | UCHAR))* '"""';
// [26]
UCHAR : '\\u' HEX HEX HEX HEX | '\\U' HEX HEX HEX HEX HEX HEX HEX HEX;
// [159s]
ECHAR : '\\' [tbnrf"'\\];
// [161s]
WS : [\u0020\u0009\u000D\u000A] -> skip /* #x20=space #x9=character tabulation #xD=carriage return #xA=new line */;
// [162s]
anon : '[' WS* ']';
// [163s]
PN_CHARS_BASE
  : 'A' .. 'Z'
  | 'a' .. 'z'
  | '\u00C0' .. '\u00D6'
  | '\u00D8' .. '\u00F6'
  | '\u00F8' .. '\u02FF'
  | '\u0370' .. '\u037D'
  | '\u037F' .. '\u1FFF'
  | '\u200C' .. '\u200D'
  | '\u2070' .. '\u218F'
  | '\u2C00' .. '\u2FEF'
  | '\u3001' .. '\uD7FF'
  | '\uF900' .. '\uFDCF'
  | '\uFDF0' .. '\uFFFD'
  /*| '\u10000' .. '\uEFFFF'*/
  ;
// [164s]
PN_CHARS_U : PN_CHARS_BASE | '_';
// [166s]
PN_CHARS : PN_CHARS_U | '-' | [0-9] | [\u00B7] | [\u0300-\u036F] | [\u203F-\u2040];
// [167s]
PN_PREFIX : PN_CHARS_BASE ((PN_CHARS | '.')* PN_CHARS)?;
// [168s]
PN_LOCAL : (PN_CHARS_U | ':' | [0-9] | PLX) ((PN_CHARS | '.' | ':' | PLX)* (PN_CHARS | ':' | PLX))?;
// [169s]
PLX : PERCENT | PN_LOCAL_ESC;
// [170s]
PERCENT : '%' HEX HEX;
// [171s]
HEX : [0-9] | [A-F] | [a-f];
// [172s]
PN_LOCAL_ESC : '\\' ('_' | '~' | '.' | '-' | '!' | '$' | '&' | '\'' | '(' | ')' | '*' | '+' | ',' | ';' | '=' | '/' | '?' | '#' | '@' | '%');
3.3 Grammar tests
Here follow various examples of stOTTR useful for testing the correctness of the grammar specification.
# modifiers
ex:NamedPizza [ ??pizza ] .
ex:NamedPizza [ !?pizza ] .
ex:NamedPizza [ ?!?pizza ] .
ex:NamedPizza [ !??pizza ] .

# type too
ex:NamedPizza [ owl:Class ?pizza ] .
ex:NamedPizza [ ? owl:Class ?pizza ] .
ex:NamedPizza [ ?! owl:Class ?pizza ] .

# default value
ex:NamedPizza [ owl:Class ?pizza = p:pizza] .
ex:NamedPizza [ ? owl:Class ?pizza = 2] .
ex:NamedPizza [ ?! owl:Class ?pizza = "asdf" ] .

# more parameters
ex:NamedPizza [ ?pizza , ?country , ?toppings ] .

# lists
ex:NamedPizza [ ?pizza = "asdf" , ?country = ("asdf", "asdf") , ?toppings = ((())) ] .

# more complex types
ex:NamedPizza [
  ! owl:Class ?pizza ,
  ?! owl:NamedIndividual ?country = ex:Class ,
  NEList<List<List<owl:Class>>> ?toppings
] .
ex:template [ ] :: { ex:template((ex:template)) } .
ex:template [?!?var ] :: { ex:template((((ex:template)))) } .
ex:template [ ] :: { ex:template(( ex:template )) } .

ex:NamedPizza [
    ! owl:Class ?pizza = p:Grandiosa ,
    ?! LUB<owl:NamedIndividual> ?country ,
    List<owl:Class> ?toppings
  ]
  @@ cross | ex:SomeAnnotationTemplate("asdf", "asdf", "asdf" ),
  @@<http://asdf>("asdf", "asdf", ++("A", "B", "C") )
  :: {
    cross | ex:Template1 (?pizza, ++?toppings) ,
    ex:Template2 (1, 2,4, 5) ,
    <http://Template2.com> ("asdf"^^xsd:string) ,
    zipMax | ex:Template4 ("asdf"^^xsd:string, ?pizza, ++( "a", "B" )),
    zipMax | ex:Template4 ([], [], [], ++([], []))
  } .
4 Change log
- 0.1.2
- Updated grammar for default values to also accept lists.
5 References
- [1] mOTTR
- https://spec.ottr.xyz/mottr/0/
- [2] rOTTR
- https://spec.ottr.xyz/rottr/0/
- [3] Antlr
- https://www.antlr.org/
- [4] RDF 1.1 Turtle
- https://www.w3.org/TR/turtle/