A Prolog program consists of a sequence of sentences or lists of sentences. Each sentence is a Prolog term. How terms are interpreted as sentences is defined below (see section Syntax of Sentences as Terms). Note that a term representing a sentence may be written in any of its equivalent syntactic forms. For example, the 2-ary functor `:-' could be written in standard prefix notation instead of as the usual infix operator.
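As a small illustration of the equivalence of syntactic forms, the following sketch writes the same sentence once with the usual infix operators and once in standard prefix notation, and checks that both denote the identical term. The predicate names `same_sentence/0`, `grandparent/2` and `parent/2` are hypothetical and chosen only for this example.

```prolog
% A sketch: one sentence written in two equivalent syntactic forms.
% grandparent/2 and parent/2 are hypothetical names used as data only.
same_sentence :-
    Infix  = (grandparent(X, Z) :- parent(X, Y), parent(Y, Z)),
    Prefix = ':-'(grandparent(X, Z), ','(parent(X, Y), parent(Y, Z))),
    % ==/2 succeeds only if the two terms are literally identical.
    Infix == Prefix.
```

Calling `same_sentence` is expected to succeed, since the reader builds the same term from either notation.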
Terms are written as sequences of tokens. Tokens are sequences of characters which are treated as separate symbols. Tokens include the symbols for variables, constants and functors, as well as punctuation characters such as brackets and commas.
We define below how lists of tokens are interpreted as terms (see section Syntax of Terms as Tokens). Each list of tokens which is read in (for interpretation as a term or sentence) has to be terminated by a full-stop token. Two tokens must be separated by a layout-text token if they could otherwise be interpreted as a single token. Layout-text tokens are ignored when interpreting the token list as a term, and may appear at any point in the token list.
We define below how tokens are represented as strings of characters (see section Syntax of Tokens as Character Strings). But we start by describing the notation used in the formal definition of Prolog syntax (see section Notation).
A syntactic rule takes the general form

    C --> F1 | F2 | F3

which states that an entity of category C may take any of the alternative forms F1, F2, F3, etc.
sentence          --> module : sentence
                   |  list
                         { where list is a list of sentence }
                   |  clause
                   |  directive
                   |  grammar-rule

clause            --> non-unit-clause | unit-clause

directive         --> command | query

non-unit-clause   --> head :- body

unit-clause       --> head
                         { where head is not otherwise a sentence }

command           --> :- body

query             --> ?- body

head              --> module : head
                   |  goal
                         { where goal is not a variable }

body              --> module : body
                   |  body -> body ; body
                   |  body -> body
                   |  \+ body
                   |  body ; body
                   |  body , body
                   |  goal

goal              --> term
                         { where term is not otherwise a body }

grammar-rule      --> gr-head --> gr-body

gr-head           --> module : gr-head
                   |  gr-head , terminals
                   |  non-terminal
                         { where non-terminal is not a variable }

gr-body           --> module : gr-body
                   |  gr-body -> gr-body ; gr-body
                   |  gr-body -> gr-body
                   |  \+ gr-body
                   |  gr-body ; gr-body
                   |  gr-body , gr-body
                   |  non-terminal
                   |  terminals
                   |  gr-condition

non-terminal      --> term
                         { where term is not otherwise a gr-body }

terminals         --> list | string

gr-condition      --> ! | { body }

module            --> atom
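The top-level alternatives of the sentence grammar can be sketched as a classifier over terms. This is only an approximation under stated assumptions: `sentence_kind/2` is a hypothetical helper, and it deliberately ignores module qualification (`module : sentence`) and lists of sentences.

```prolog
% A sketch: decide which syntactic category a term falls into when it is
% interpreted as a sentence. sentence_kind/2 is a hypothetical name.
% Module-qualified sentences and lists of sentences are not handled.
sentence_kind(S, Kind) :-
    nonvar(S),                          % a variable is not a sentence
    (   S = (:- _)    -> Kind = command
    ;   S = (?- _)    -> Kind = query
    ;   S = (_ --> _) -> Kind = grammar_rule
    ;   S = (_ :- _)  -> Kind = non_unit_clause
    ;   Kind = unit_clause              % head "not otherwise a sentence"
    ).
```

The order of the tests mirrors the side condition on unit-clause: a term is treated as a unit clause only when none of the other forms applies.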
term-read-in      --> subterm(1200) full-stop

subterm(N)        --> term(M)
                         { where M is less than or equal to N }

term(N)           --> op(N,fx) subterm(N-1)
                         { except in the case of a number }
                         { if subterm starts with a (,
                           op must be followed by layout-text }
                   |  op(N,fy) subterm(N)
                         { if subterm starts with a (,
                           op must be followed by layout-text }
                   |  subterm(N-1) op(N,xfx) subterm(N-1)
                   |  subterm(N-1) op(N,xfy) subterm(N)
                   |  subterm(N) op(N,yfx) subterm(N-1)
                   |  subterm(N-1) op(N,xf)
                   |  subterm(N) op(N,yf)

term(1000)        --> subterm(999) , subterm(1000)

term(0)           --> functor ( arguments )
                         { provided there is no layout-text between
                           the functor and the ( }
                   |  ( subterm(1200) )
                   |  { subterm(1200) }
                   |  list
                   |  string
                   |  constant
                   |  variable

op(N,T)           --> name
                         { where name has been declared as an
                           operator of type T and precedence N }

arguments         --> subterm(999)
                   |  subterm(999) , arguments

list              --> []
                   |  [ listexpr ]

listexpr          --> subterm(999)
                   |  subterm(999) , listexpr
                   |  subterm(999) | subterm(999)

constant          --> atom | number

number            --> unsigned-number
                   |  sign unsigned-number
                   |  sign inf
                   |  sign nan

unsigned-number   --> natural-number | unsigned-float

atom              --> name

functor           --> name
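The effect of the precedence and type rules can be checked by comparing operator notation with standard functional notation. The sketch below assumes only the standard operator declarations for `:-` (1200, xfx), `,` (1000, xfy) and `-` (500, yfx); `precedence_examples/0` is a hypothetical name. The last two goals illustrate the third listexpr alternative, in which the bar is a literal token separating the tail of the list.

```prolog
% A sketch: operator notation versus standard functional notation.
precedence_examples :-
    % ',' (precedence 1000) binds tighter than ':-' (precedence 1200).
    (a :- b, c) == ':-'(a, ','(b, c)),
    % An xfy operator nests to the right.
    (a, b, c) == ','(a, ','(b, c)),
    % A yfx operator nests to the left.
    (1 - 2 - 3) == '-'('-'(1, 2), 3),
    % The literal bar in a list introduces the tail.
    [a, b | T] = [a | [b | T]],
    [a | [b, c]] == [a, b, c].
```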
By default, SICStus Prolog uses the ISO 8859/1 character set standard, but will
alternatively support the EUC (Extended UNIX Code) standard. This is
governed by the value of the environment variable SP_CTYPE
(see section Getting Started).
The character categories used below are defined as follows in the two standards:
token             --> name
                   |  natural-number
                   |  unsigned-float
                   |  variable
                   |  string
                   |  punctuation-char
                   |  layout-text
                   |  full-stop

name              --> quoted-name
                   |  word
                   |  symbol
                   |  solo-char
                   |  [ ?layout-text ]
                   |  { ?layout-text }

quoted-name       --> ' ?quoted-item... '

quoted-item       --> char  { other than ' or \ }
                   |  ''
                   |  \ escape-sequence

word              --> small-letter ?alpha...

symbol            --> symbol-char...
                         { except in the case of a full-stop
                           or where the first 2 chars are /* }

natural-number    --> digit...
                   |  base ' alpha...
                         { where each alpha must be less than the base,
                           treating a,b,... and A,B,... as 10,11,... }
                   |  0 ' char-item
                         { yielding the character code for char }

char-item         --> char  { other than \ }
                   |  \ escape-sequence

base              --> digit...  { in the range [2..36] }

unsigned-float    --> simple-float
                   |  simple-float exp exponent

simple-float      --> digit... . digit...

exp               --> e | E

exponent          --> digit... | sign digit...

sign              --> - | +

variable          --> underline ?alpha...
                   |  capital-letter ?alpha...

string            --> " ?string-item... "

string-item       --> char  { other than " or \ }
                   |  ""
                   |  \ escape-sequence

layout-text       --> layout-text-item...

layout-text-item  --> layout-char | comment

comment           --> /* ?char... */
                         { where ?char... must not contain */ }
                   |  % ?char... LFD
                         { where ?char... must not contain LFD }

full-stop         --> .
                         { the following token, if any, must be layout-text }

char              --> { any character, i.e. }
                      layout-char
                   |  alpha
                   |  symbol-char
                   |  solo-char
                   |  punctuation-char
                   |  quote-char

alpha             --> capital-letter | small-letter | digit | underline

escape-sequence   --> b  { backspace, character code 8 }
                   |  t  { horizontal tab, character code 9 }
                   |  n  { newline, character code 10 }
                   |  v  { vertical tab, character code 11 }
                   |  f  { form feed, character code 12 }
                   |  r  { carriage return, character code 13 }
                   |  e  { escape, character code 27 }
                   |  d  { delete, character code 127 }
                   |  a  { alarm, character code 7 }
                   |  x alpha alpha
                         { treating a,b,... and A,B,... as 10,11,... }
                         { in the range [0..15], hex character code }
                   |  digit ?digit ?digit
                         { in the range [0..7], octal character code }
                   |  ^ ?  { delete, character code 127 }
                   |  ^ capital-letter
                   |  ^ small-letter
                         { the control character alpha mod 32 }
                   |  c ?layout-char...  { ignored }
                   |  layout-char  { ignored }
                   |  char  { other than the above, represents itself }
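A few of the token rules can be exercised directly. The sketch below covers the natural-number, unsigned-float, variable and name rules; `token_examples/0` is a hypothetical name, and the example assumes a system that accepts the `base ' alpha...` notation as defined above.

```prolog
% A sketch: how some character strings are read as tokens.
token_examples :-
    2'1010 =:= 10,          % base ' alpha... with base 2
    16'ff  =:= 255,         % base 16: a..f stand for 10..15
    0'a    =:= 97,          % 0 ' char-item yields the character code
    1.5e3  =:= 1500.0,      % simple-float exp exponent
    var(_Unbound),          % underline ?alpha... is a variable
    var(Capital),           % capital-letter ?alpha... is a variable
    Capital = bound,
    atom(word),             % small-letter ?alpha... is a name
    atom('Quoted name').    % a quoted-name is a name, hence an atom
```

Note that an unsigned-float requires digits on both sides of the decimal point, so a string such as `1.e3` is not covered by the simple-float rule.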
A backslash occurring inside integers in `0'' notation or inside quoted atoms or strings has special meaning, and indicates the start of an escape sequence. Character escaping can be turned off for compatibility with old code. The following escape sequences exist:
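\b            backspace (character code 8)
\t            horizontal tab (character code 9)
\n            newline (character code 10)
\v            vertical tab (character code 11)
\f            form feed (character code 12)
\r            carriage return (character code 13)
\e            escape (character code 27)
\d
\^?           delete (character code 127)
\a            alarm (character code 7)
\xCD          the character with hexadecimal code CD, where C and D are two hexadecimal digits
\octal        the character with octal code octal, where octal is one to three octal digits
\^char        the control character with code char mod 32, where char is a letter
\layout-char  ignored
\c            ignored, together with any layout characters that follow
\other        any character other than the above represents itself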
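The more portable escape sequences can be checked by inspecting character codes. The sketch below is limited to `\n`, `\t`, `\\` and the doubled quote, because the exact form of hexadecimal and octal escapes differs between Prolog systems; `escape_examples/0` is a hypothetical name, and the example assumes character escaping has not been turned off.

```prolog
% A sketch: escape sequences in 0' notation and in quoted atoms.
escape_examples :-
    0'\n =:= 10,                          % newline
    0'\t =:= 9,                           % horizontal tab
    atom_codes('a\tb', [97, 9, 98]),      % escape inside a quoted atom
    atom_codes('\\', [92]),               % backslash represents itself
    atom_codes('it''s', [105, 116, 39, 115]),  % '' is a single quote
    atom_length('a\nb', 3).               % the escape is one character
```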
X,Y denotes the term ','(X,Y) in standard syntax.
(X) denotes simply the term X.
{X} denotes the term {}(X) in standard syntax.
-3 denotes a number whereas -(3) denotes a compound term which has the 1-ary functor - as its principal functor.
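These notes can be verified with a handful of unifications and type tests. This is a sketch only; `notes_examples/0` is a hypothetical name.

```prolog
% A sketch: checking the notes on commas, brackets and signs.
notes_examples :-
    Pair = (a, b),   Pair == ','(a, b),      % X,Y is ','(X,Y)
    Paren = (a),     Paren == a,             % (X) is simply X
    Curly = {a},     Curly == '{}'(a),       % {X} is {}(X)
    number(-3),                              % -3 is a number
    Neg = -(3),
    compound(Neg),                           % -(3) is a compound term
    functor(Neg, -, 1).                      % with principal functor -/1
```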