Next: 9.3.3 XML documents Up: 9.3 Predicate Reference Previous: 9.3.1 Loading Structured Documents   Contents   Index


9.3.2 Handling of White Spaces

Four modes for handling white space are provided. The initial mode can be selected with the space(SpaceMode) option to load_structure/3 or set_sgml_parser/2. In XML mode, white-space handling is further controlled by the xml:space attribute, which may be specified both in the DTD and in the document. The defined modes are:

space(sgml)

Newlines at the start and end of an element are removed. This is the default mode for the SGML dialect.

space(preserve)

White space is passed literally to the application, which is left to do all white-space handling itself. This is the default mode for the XML dialect.

space(default)

In addition to the processing done in sgml mode, every run of consecutive white space is reduced to a single space character.

space(remove)

In addition to the processing done in default mode, all leading and trailing white space is removed from CDATA objects. If the CDATA becomes empty as a result, nothing is passed to the application. This mode is especially handy for processing data-oriented documents, such as RDF. It is not suitable for normal text documents. This mode is not part of any standard: XML 1.0 allows only default and preserve. Consider the HTML fragment below; when it is processed in this mode, the spaces surrounding the three elements are lost.
    Consider adjacent <b>bold</b> <ul>and</ul> <it>italic</it> words.
The parsed term will be ['Consider adjacent',element(b,[],[bold]),element(ul,[],[and]),element(it,[],[italic]),'words.'].
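
The four modes can be compared side by side with a small emulation. The sketch below is written in Python and works on plain strings; it is not the parser's implementation. The function name apply_space_mode is hypothetical, and the sketch assumes that sgml mode drops exactly one line break at each end of the element content.

```python
import re

def apply_space_mode(text, mode):
    """Emulate a white-space mode on one CDATA string (illustrative only).

    Returns the processed string, or None when remove mode leaves
    nothing to pass to the application.
    """
    if mode == "preserve":
        return text                      # passed through literally
    if mode not in ("sgml", "default", "remove"):
        raise ValueError("unknown space mode: %r" % (mode,))

    # sgml: drop a line break at the very start and at the very end
    # (assumption: exactly one line break at each end).
    text = re.sub(r"\A\r?\n", "", text)
    text = re.sub(r"\r?\n\Z", "", text)
    if mode == "sgml":
        return text

    # default: additionally collapse every white-space run to one space.
    text = re.sub(r"\s+", " ", text)
    if mode == "default":
        return text

    # remove: additionally strip both ends; empty CDATA is dropped.
    text = text.strip()
    return text or None

# CDATA pieces of the fragment above, in document order.
chunks = ["Consider adjacent ", " ", " ", " words."]
kept = [c for c in (apply_space_mode(c, "remove") for c in chunks)
        if c is not None]
print(kept)  # ['Consider adjacent', 'words.']
```

In default mode the same four pieces would each survive with their single spaces intact, which is why the words in the fragment run together only under remove.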


Terrance Swift 2007-10-06