The DTD (Document Type Definition) is a separate entity in sgml2pl that can be created, freed, defined and inspected. Like the parser itself, it is filled by opening it as a Prolog output stream and sending data to it. This section summarises the predicates for handling the DTD.
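The create, define, inspect and free cycle described above can be sketched as follows. This is a minimal illustration, assuming the library(sgml) predicates new_dtd/2, open_dtd/3, dtd_property/2 and free_dtd/1; the doctype `note` and the helper names note_dtd_elements/1 and fill_note_dtd/1 are hypothetical and chosen only for this example.

```prolog
:- use_module(library(sgml)).

% note_dtd_elements(-Elements)
% Create a DTD object, define one element in it, inspect the
% resulting element list, and free the DTD again.
note_dtd_elements(Elements) :-
    setup_call_cleanup(
        new_dtd(note, DTD),                     % hypothetical doctype name
        ( fill_note_dtd(DTD),
          once(dtd_property(DTD, elements(Elements)))
        ),
        free_dtd(DTD)).

% fill_note_dtd(+DTD)
% Open the DTD as an output stream and send a declaration to it.
fill_note_dtd(DTD) :-
    setup_call_cleanup(
        open_dtd(DTD, [], Out),
        format(Out, "<!ELEMENT note - - (#PCDATA)>~n", []),
        close(Out)).
```

Calling `note_dtd_elements(Es)` should bind `Es` to a list of element names that includes `note`. The setup_call_cleanup/3 wrappers are a design choice of this sketch so that the stream is closed and the DTD freed even if a step fails.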
dialect option from open_dtd/3 and the
encoding option from open/4. Notably the
dialect option must match the dialect used for subsequent parsing using this DTD.
xmlns processes the DTD case-sensitively.
dtd using the call:
..., absolute_file_name(dtd(Type), [ extensions([dtd]), access(read) ], DtdFile), ...
Note that DTD objects may be modified while processing erroneous
documents. For example, loading an SGML document starting with
<?xml ...?> switches the DTD to XML mode and
encountering unknown elements adds these elements to the DTD object.
Re-using a DTD object to parse multiple documents should be restricted
to situations where the documents processed are known to be error-free.
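When documents are not known to be error-free, one way to avoid a modified DTD leaking into the next parse is to build a fresh DTD object for every document. The sketch below assumes the library(sgml) predicates new_dtd/2, load_dtd/2, load_structure/3 and free_dtd/1; the predicate name load_with_fresh_dtd/4 and its arguments are hypothetical.

```prolog
:- use_module(library(sgml)).

% load_with_fresh_dtd(+DocType, +DtdFile, +DocFile, -DOM)
% Parse DocFile against a DTD object that is created for this call
% only, so changes the parser makes to the DTD while handling an
% erroneous document cannot affect later parses.
load_with_fresh_dtd(DocType, DtdFile, DocFile, DOM) :-
    setup_call_cleanup(
        new_dtd(DocType, DTD),
        ( load_dtd(DTD, DtdFile),
          % The dialect used for parsing matches the (default SGML)
          % dialect used for loading the DTD.
          load_structure(DocFile, DOM, [dtd(DTD), dialect(sgml)])
        ),
        free_dtd(DTD)).
```

This trades the cost of re-reading the DTD file for isolation between documents. For a batch of documents known to be valid, sharing one DTD object remains the cheaper option.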
html is handled separately. The Prolog flag
html_dialect specifies the default html dialect, which is
that HTML5 has no DTD. The loaded DTD is an informal DTD that includes
most of the HTML5 extensions (http://www.cs.tut.fi/~jkorpela/html5-dtd.html).
In addition, the parser sets the
dialect flag of the DTD
object. This is used by the parser to accept HTML extensions.
Next, the corresponding DTD is loaded.
omit(OmitOpen, OmitClose), where both arguments are booleans (true or
false) representing whether the open- or close-tag may be omitted. Content is the content-model of the element represented as a Prolog term. This term takes the following form:
cdata, but entity-references are expanded.
nutoken. For DTD types that allow for a list, the notation
list(Type) is used. Finally, the DTD construct
(a|b|...) is mapped to the term
Default describes the SGML default. It is one
implied. If a
real default is present, it is one of
As this parser allows for processing partial documents and for processing the DTD separately, the DOCTYPE declaration plays a special role.
If a document has no DOCTYPE declaration, the parser returns a list holding all elements and CDATA found. If the document has a DOCTYPE declaration, the parser will open the element defined in the DOCTYPE as soon as the first real data is encountered.
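The two cases can be explored with a small helper that parses text held in a string. This is a sketch only, assuming load_structure/3 from library(sgml) and the SWI-Prolog built-in open_string/2; the helper names parse_sgml_text/2, with_doctype/1 and without_doctype/1 and the sample documents are hypothetical.

```prolog
:- use_module(library(sgml)).

% parse_sgml_text(+Text, -DOM)
% Parse Text as SGML and return the resulting structure.
parse_sgml_text(Text, DOM) :-
    setup_call_cleanup(
        open_string(Text, In),
        load_structure(stream(In), DOM, [dialect(sgml)]),
        close(In)).

% Document with a DOCTYPE whose element allows both tags to be
% omitted, so the data arrives before any explicit open tag.
with_doctype(DOM) :-
    parse_sgml_text(
        "<!DOCTYPE note [<!ELEMENT note O O (#PCDATA)>]>hello",
        DOM).

% Document without a DOCTYPE: a sequence of elements and text.
% The parser may print warnings because no DTD defines <b>.
without_doctype(DOM) :-
    parse_sgml_text("<b>one</b> two", DOM).
```

Comparing the results of `with_doctype(DOM)` and `without_doctype(DOM)` at the toplevel shows how the DOCTYPE affects the outermost structure that is returned.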