2.15 Syntax Notes
SWI-Prolog syntax is close to ISO-Prolog standard syntax, which is closely compatible with Edinburgh Prolog syntax. A description of this syntax can be found in the Prolog books referenced in the introduction. Below are some non-standard or non-common constructs that are accepted by SWI-Prolog:
/* ... /* ... */ ... */
The /* ... */ comment statement can be nested. This is useful if some code with /* ... */ comment statements in it should be commented out.
2.15.1 ISO Syntax Support
SWI-Prolog offers ISO compatible extensions to the Edinburgh syntax.
2.15.1.1 Processor Character Set
The
processor character set specifies the class of each character used for
parsing Prolog source text. Character classification is fixed to use
UCS/Unicode as provided by the C-library wchar_t based
primitives. See also section 2.18.
2.15.1.2 Character Escape Syntax
Within quoted atoms (using single quotes: '<atom>')
special characters are represented using escape sequences. An escape
sequence is led in by the backslash (\)
character. The list of escape sequences is compatible with the ISO
standard but contains some extensions, and the interpretation of
numerically specified characters is slightly more flexible to improve
compatibility.
\a- Alert character. Normally the ASCII character 7 (beep).
\b- Backspace character.
\c- No output. All input characters up to but not including the first
non-layout character are skipped. This allows for the specification of
pretty-looking long lines. For compatibility with Quintus Prolog. Not
supported by ISO. Example:
    format('This is a long line that looks better if it was \c
           split across multiple physical lines in the input')
\<RETURN>- No output. Skips input to the next non-layout character or to the end of
the next line. ISO demands skipping only the newline. We advise using
\c or putting the layout before the \, as shown below. Using \c is supported by various other Prolog implementations and will remain supported by SWI-Prolog. The style shown below is the most compatible solution. (Future versions are likely to interpret \<RETURN> according to ISO.)
    format('This is a long line that looks better if it was \
    split across multiple physical lines in the input')
instead of
    format('This is a long line that looks better if it was\
     split across multiple physical lines in the input')
\e- Escape character (ASCII 27). Not ISO, but widely supported.
\f- Form-feed character.
\n- Next-line character.
\r- Carriage-return only (i.e., go back to the start of the line).
\s- Space character. Intended to allow writing
0'\s to get the character code of the space character. Not ISO.
\t- Horizontal tab-character.
\v- Vertical tab-character (ASCII 11).
\xXX..\- Hexadecimal specification of a character. The closing
\ is obligatory according to the ISO standard, but optional in SWI-Prolog to enhance compatibility with the older Edinburgh standard. The code \xa\3 emits the character 10 (hexadecimal `a') followed by `3'. Characters specified this way are interpreted as Unicode characters. See also \u.
\uXXXX- Unicode character specification where the character is specified using
exactly 4 hexadecimal digits. This is an extension to the ISO
standard fixing two problems. First of all, where
\x defines a numeric character code, it doesn't specify the character set in which the character should be interpreted. Second, it is not needed to use the idiosyncratic closing \ of the ISO Prolog syntax.
\UXXXXXXXX- Same as
\uXXXX, but using 8 digits to cover the whole Unicode set.
\40- Octal character specification. The rules and remarks for hexadecimal specifications apply to octal specifications as well.
\<character>- Any character immediately preceded by a \
and not covered by the above escape sequences is copied verbatim. Thus, '\\' is an atom consisting of a single \ and '\'' and '''' both describe the atom with a single '.
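The escape sequences listed above can be checked with a few small goals. The following is a minimal sketch, not part of SWI-Prolog: escape_demo/0 is a hypothetical helper name, and the code assumes the character_escapes flag is at its default value (true).

```prolog
% escape_demo/0 is a hypothetical helper used only for illustration.
% It assumes the default setting of the character_escapes flag (true).
escape_demo :-
    atom_codes('\x41\', [0'A]),        % hexadecimal escape with ISO closing backslash
    atom_codes('\u0041', [0'A]),       % Unicode escape, exactly 4 hexadecimal digits
    atom_codes('\U00000041', [0'A]),   % Unicode escape, 8 hexadecimal digits
    atom_codes('\101\', [0'A]),        % octal escape (101 octal is 65 decimal)
    atom_codes('\xa\3', [10, 0'3]),    % character 10 followed by the digit 3
    atom_length('\\', 1),              % an atom consisting of a single backslash
    '\'' == '''',                      % two ways to write the atom with a single quote
    0'\s =:= 32.                       % character code of the space character
```

Calling escape_demo from the toplevel should succeed if each escape is read as described in the list above.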
Character escaping is only available if
current_prolog_flag(character_escapes, true) is active
(default). See current_prolog_flag/2.
Character escapes conflict with writef/2
in two ways: \40 is interpreted as decimal 40 by writef/2,
but as octal 40 (decimal 32) by read. Also, \l
is translated to a single `l'. It is advised to use the more widely
supported format/[2,3]
predicate instead. If you insist upon using writef/2,
either switch character_escapes to false, or use double \\, as in writef('\\l').
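The doubled backslash can be compared with the format-based alternative in a short sketch. The helper names writef_demo/1 and format_demo/1 are hypothetical, and capturing the output with with_output_to/2 is an assumption made here only so that the two results can be compared.

```prolog
% writef_demo/1 and format_demo/1 are hypothetical helpers for illustration.
% With character escapes enabled, '\\l' is read as the two characters \ and l,
% which writef/1 then interprets as its own newline sequence.
writef_demo(Text) :-
    with_output_to(atom(Text), writef('hello\\l')).

% The format/1 version uses the ~n directive and needs no doubled backslash.
format_demo(Text) :-
    with_output_to(atom(Text), format('hello~n')).
```

Both helpers are expected to produce the same text, which is one reason the format family is the simpler choice when character escapes are enabled.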
2.15.1.3 Syntax for non-decimal numbers
SWI-Prolog implements both Edinburgh and ISO representations for
non-decimal numbers. According to Edinburgh syntax, such numbers are
written as <radix>'<number>, where <radix>
is a number between 2 and 36. ISO defines binary, octal and hexadecimal
numbers using
0[bxo]<number>. For example: A is 0b100 \/ 0xf00
is a valid expression. Such numbers are always unsigned.
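The two notations can be mixed freely in arithmetic. The following minimal sketch illustrates this; radix_demo/0 is a hypothetical helper name, not a library predicate.

```prolog
% radix_demo/0 is a hypothetical helper used only for illustration.
radix_demo :-
    A is 0b100 \/ 0xf00,     % ISO binary and hexadecimal, combined with bitwise or
    A =:= 3844,              % 4 \/ 3840
    0o17 =:= 15,             % ISO octal
    2'101 =:= 5,             % Edinburgh syntax, radix 2
    8'17 =:= 0o17,           % Edinburgh and ISO octal denote the same number
    16'ff =:= 0xff.          % Edinburgh and ISO hexadecimal denote the same number
```

Since such numbers are always unsigned, a negative value is written by applying the minus sign as an operator inside an expression, for example X is -(0xff).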
2.15.1.4 Unicode Prolog source
The ISO standard specifies the Prolog syntax in ASCII characters. As SWI-Prolog supports Unicode in source files we must extend the syntax. This section describes the implication for the source files, while writing international source files is described in section 3.1.3.
The SWI-Prolog Unicode character classification is based on version 6.0.0 of the Unicode standard. Please note that char_type/2 and friends, intended to be used with all text except Prolog source code, are based on the C-library locale-based classification routines.
- Quoted atoms and strings
Any character of any script can be used in quoted atoms and strings. The escape sequences \uXXXX and \UXXXXXXXX (see section 2.15.1.2) were introduced to specify Unicode code points in ASCII files.
- Atoms and Variables
We handle them in one item as they are closely related. The Unicode standard defines a syntax for identifiers in computer languages (see http://www.unicode.org/reports/tr31/). In this syntax identifiers start with ID_Start followed by a sequence of ID_Continue codes. Such sequences are handled as a single token in SWI-Prolog. The token is a variable iff it starts with an uppercase character or an underscore (_). Otherwise it is an atom. Note that many languages do not have the notion of character-case. In such languages variables must be written as _name.
- White space
All characters marked as separators (Z*) in the Unicode tables are handled as layout characters.
- Control and unassigned characters
Control and unassigned (C*) characters produce a syntax error if encountered outside quoted atoms/strings and outside comments.
- Other characters
The first 128 characters follow the ISO Prolog standard. Unicode symbol and punctuation characters (general category S* and P*) act as glueing symbol characters (i.e., just like ==: an unquoted sequence of symbol characters is combined into an atom). Other characters (this is mainly No: a numeric character of other type) are currently handled as `solo'.
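The atom and variable rule from the list above can be sketched by parsing text that contains non-ASCII identifiers. The helper name unicode_demo/0 is hypothetical, and the use of term_to_atom/2 together with \u escapes is an assumption made here only to keep the example file pure ASCII.

```prolog
% unicode_demo/0 is a hypothetical helper used only for illustration.
% The \u escapes keep this file pure ASCII; the text handed to
% term_to_atom/2 contains the actual Unicode characters.
unicode_demo :-
    % U+03B1 (Greek small alpha) is not uppercase, so the token is an atom.
    term_to_atom(T1, 'f(\u03B1\u03B2)'),
    arg(1, T1, A1),
    atom(A1),
    % U+0391 (Greek capital alpha) is uppercase, so the token is a variable.
    term_to_atom(T2, 'f(\u0391\u03B2)'),
    arg(1, T2, A2),
    var(A2),
    % A leading underscore yields a variable regardless of the script.
    term_to_atom(T3, 'f(_\u03B1)'),
    arg(1, T3, A3),
    var(A3).
```

The third case is the form recommended above for scripts that have no notion of character case.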
2.15.1.5 Singleton variable checking
A singleton
variable is a variable that appears only once in a clause. It
can always be replaced by _, the
anonymous variable. In some cases, however, people prefer to give
the variable a name. As mistyping a variable is a common mistake, Prolog
systems generally give a warning (controlled by style_check/1)
if a variable is used only once. The system can be informed a variable
is known to appear once by starting it with an underscore. E.g. _Name.
Please note that any variable, except plain _, shares with
variables of the same name. The term t(_X, _X) is
equivalent to t(X, X), which is different from
t(_, _).
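The difference between a repeated named variable and repeated anonymous variables shows up in unification. A minimal sketch follows; same_args/1 and any_args/1 are hypothetical predicates, and X is used instead of _X because the two spellings share in the same way.

```prolog
% same_args/1 and any_args/1 are hypothetical predicates for illustration.
same_args(t(X, X)).     % X occurs twice, so both arguments must unify with each other
any_args(t(_, _)).      % each _ is a fresh variable, so the arguments are independent
```

With these definitions the query same_args(t(a, b)) fails, while any_args(t(a, b)) succeeds.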
As Unicode requires variables to start with an underscore in many languages this schema needs to be extended (after a proposal by Richard O'Keefe). First we define the two classes of named variables.
- Named singleton variables
Named singletons start with a double underscore (__) or a single underscore followed by an uppercase letter. E.g. __var or _Var.
- Normal variables
All other variables are `normal' variables. Note this makes _var a normal variable. (Some Prolog dialects write variables this way.)
Any normal variable appearing exactly once in the clause and any named singleton variables appearing more than once are reported. Below are some examples with warnings in the right column. Singleton messages can be suppressed using the style_check/1 directive.
| test(_). | |
| test(_a). | Singleton variables: [_a] |
| test(_12). | Singleton variables: [_12] |
| test(A). | Singleton variables: [A] |
| test(_A). | |
| test(__a). | |
| test(_, _). | |
| test(_a, _a). | |
| test(__a, __a). | Singleton-marked variables appearing more than once: [__a] |
| test(_A, _A). | Singleton-marked variables appearing more than once: [_A] |
| test(A, A). | |
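Suppressing the messages with style_check/1 could look like the sketch below. The predicate legacy_fact/1 is a hypothetical example, and switching the check back on afterwards is a stylistic choice rather than a requirement.

```prolog
% legacy_fact/1 is a hypothetical predicate used only for illustration.
:- style_check(-singleton).    % switch singleton warnings off
legacy_fact(Value).            % would otherwise report: Singleton variables: [Value]
:- style_check(+singleton).    % switch them back on for the rest of the file
```

Renaming the variable to _Value, or to plain _, avoids the warning without touching the style check, which is usually preferable for new code.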