Scanners (sometimes called tokenizers) take an input string, usually in ASCII or a similar format, and produce a sequence of tokens. The requirements that various applications have for scanning differ in small but important ways: a character that is special to one application may be part of a token in another, or some applications may want lower-case text converted to upper-case text. The stdscan.P library provides a simple scanner written in XSB that can be configured in several ways. While useful, this scanner is not intended to be as powerful as general-purpose scanners such as lex or flex.
Given as input a List of character codes, scan/2 scans this list, producing a list of atoms that constitute the lexical tokens. Its parameters are set via set_scan_pars/1.
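For example, assuming scan/2 has been imported from stdscan and the default parameter settings are in effect, a query might look like the sketch below. The result shown in the comment is only illustrative; the tokens actually returned depend on how the scanner has been configured.

    | ?- import scan/2 from stdscan.

    | ?- atom_codes('width = 42 ;', Codes), scan(Codes, Tokens).

    % Under default settings one would expect Tokens to be bound to a list of
    % atoms along the lines of [width,'=','42',';'] (illustrative only).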
Each token produced is either a sequence of letters and/or numbers or a single special character (e.g., ( or )). Whitespace may occur between tokens.
Given as input a List of character codes, along with the character code of a field separator, scan/3 scans this list, producing a list of lists of atoms constituting the lexical tokens in each field. scan/3 can thus be used to scan tabular information. Its parameters are set via set_scan_pars/1.
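As an illustration, a comma-separated line might be scanned field by field as sketched below. The argument order shown (the separator code as the second argument) is an assumption made for this example, as is the result given in the comment.

    | ?- import scan/3 from stdscan.

    | ?- atom_codes(',', [Sep]),              % Sep is the character code of ','
         atom_codes('ann,42,oslo', Codes),
         scan(Codes, Sep, Fields).

    % A plausible binding, purely for illustration:
    % Fields = [[ann],['42'],[oslo]]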
set_scan_pars(+List) is used to configure the tokenizer to a particular need. List is a list of parameters including the following:
| { } [ ] " $ % & ' ( ) * + , - . / : ; < = > ? @ \ ^ _ ~ `
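The parameter terms themselves are defined by stdscan.P; the call below is only a schematic sketch of how the scanner might be configured before scanning, and the parameter shown (case(upper)) is hypothetical, standing in for whatever terms the library actually accepts.

    | ?- import set_scan_pars/1, scan/2 from stdscan.

    | ?- set_scan_pars([case(upper)]),   % case(upper) is hypothetical, for illustration only
         atom_codes('width = 42', Codes),
         scan(Codes, Tokens).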