9.3.2 Handling of White Spaces
Four modes for handling white space are provided. The initial mode can be
set with the space(SpaceMode) option of
load_structure/3 or set_sgml_parser/2. In XML
mode, the mode is further controlled by the xml:space attribute,
which may be specified both in the DTD and in the document. The defined
modes are:
- space(sgml)
  Newlines at the start and end of an element are removed.
  This is the default mode for the SGML dialect.
- space(preserve)
  White space is passed literally to the application. This mode leaves all
  white space handling to the application. This is the default mode for
  the XML dialect.
- space(default)
  In addition to the processing done in sgml mode, every run of
  consecutive white space is reduced to a single space character.
- space(remove)
  In addition to the processing done in default mode, all leading and
  trailing white space is removed from CDATA objects. If, as a result,
  the CDATA becomes empty, nothing is passed to the application. This mode
  is especially handy for processing data-oriented documents, such as RDF.
  It is not suitable for normal text documents. This mode is not part of
  any standard: XML 1.0 allows only default and preserve. Consider the
  HTML fragment below. When processed in this mode, the spaces surrounding
  the three elements are lost.
Consider adjacent <b>bold</b> <ul>and</ul> <it>italic</it> words.
In remove mode, the parsed term is
['Consider adjacent',element(b,[],[bold]),element(ul,[],[and]),element(it,[],[italic]),'words.'].
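The effect of the modes on a single piece of CDATA can be approximated in plain Prolog. The sketch below is only a simplified model for illustration, not the parser's implementation: the predicate names (normalize/3, collapse/2, trim/2 and the helpers) are hypothetical, white space is assumed to mean space, tab, newline and carriage return, and the removal of newlines at element boundaries done in sgml mode is left out.

```prolog
% Simplified model of the space modes applied to one CDATA atom.
% All predicate names here are hypothetical, chosen for illustration.

% ws(Code): Code is a space, tab, newline or carriage return (assumed set).
ws(32).
ws(9).
ws(10).
ws(13).

% skip_ws(Codes, Rest): Rest is Codes without its leading white space.
skip_ws([C|Cs], Rest) :- ws(C), !, skip_ws(Cs, Rest).
skip_ws(Cs, Cs).

% collapse(Codes, Out): every run of white space becomes one space.
collapse([], []).
collapse([C|Cs], [32|Out]) :-
    ws(C), !,
    skip_ws(Cs, Rest),
    collapse(Rest, Out).
collapse([C|Cs], [C|Out]) :-
    collapse(Cs, Out).

% rev(List, Acc, Reversed): accumulator-based list reversal.
rev([], Acc, Acc).
rev([X|Xs], Acc, R) :- rev(Xs, [X|Acc], R).

% trim(Codes, Trimmed): strip leading and trailing white space.
trim(Codes, Trimmed) :-
    skip_ws(Codes, T0),
    rev(T0, [], R0),
    skip_ws(R0, R1),
    rev(R1, [], Trimmed).

% normalize(Mode, Text, Out): Out is what the application would see.
% In remove mode the goal fails when nothing is left to pass on.
normalize(preserve, Text, Text).
normalize(default, Text, Out) :-
    atom_codes(Text, Cs),
    collapse(Cs, Os),
    atom_codes(Out, Os).
normalize(remove, Text, Out) :-
    atom_codes(Text, Cs),
    collapse(Cs, Os),
    trim(Os, Ts),
    Ts \== [],
    atom_codes(Out, Ts).

% ?- normalize(remove, 'Consider adjacent ', X).   % X = 'Consider adjacent'
% ?- normalize(remove, ' ', X).                    % fails: CDATA became empty
% ?- normalize(default, ' words.', X).             % X = ' words.'
```

Under this model, the single spaces between the three elements of the fragment above become empty CDATA in remove mode and are dropped, which is why the words would run together if the parsed term were rendered as text.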
Terrance Swift
2007-10-06