39 Parsing and Generating XML

library(xml) is a package for parsing XML with Prolog, which provides Prolog applications with a simple “Document Value Model” interface to XML documents. A description of the subset of XML that it supports can be found at: http://homepages.tesco.net/binding-time/xml.pl.html

The package, originally written by Binding Time Ltd., is in the public domain and unsupported. To use the package, enter the query:

     | ?- use_module(library(xml)).

The package represents XML documents by the abstract data type document, which is defined by the following grammar:

document ::= xml(attributes,content) { well-formed document }
| malformed(attributes,content) { malformed document }

attributes ::= []
| [name=chardata|attributes]

content ::= []
| [cterm|content]

cterm ::= pcdata(chardata) { text }
| comment(chardata) { an XML comment }
| namespace(URI,prefix,element) { a Namespace }
| element(tag,attributes,content) { <tag>..</tag> encloses content or <tag /> if empty }
| instructions(name,chardata) { A PI <? name chardata ?> }
| cdata(chardata) { <![CDATA[chardata]]> }
| doctype(tag,doctypeid) { DTD <!DOCTYPE .. > }
| unparsed(chardata) { text that hasn't been parsed }
| out_of_context(tag) { tag is not closed }

tag ::= atom { naming an element }

name ::= atom { not naming an element }

URI ::= atom { giving the URI of a namespace }

chardata ::= code-list

doctypeid ::= public(chardata,chardata)
| system(chardata)
| local
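
As an illustration of the grammar, the following sketch writes out by hand a document term for a small XML fragment. The predicate name example_document/1 and the element and attribute names are hypothetical, and the double-quoted strings are assumed to denote code-lists (the usual default in SICStus Prolog).

```prolog
% Hand-written document term, following the grammar above, for:
%   <note id="n1"><!-- reminder --><to>Alice</to><empty/></note>
% example_document/1 is a hypothetical name used for illustration only.
example_document(
    xml([],                                   % empty attributes list
        [element(note, [id="n1"],             % tag, attributes, content
                 [comment(" reminder "),      % an XML comment
                  element(to, [], [pcdata("Alice")]),
                  element(empty, [], [])      % element with empty content
                 ])
        ])).
```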

The following predicates are exported by the package:

xml_parse(+Chars, -Document[, +Options])
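Parses Chars, a code-list, into Document, a document. Chars need not be strictly well-formed XML; input that cannot be parsed as well-formed XML is represented by a malformed/2 document, as defined in the grammar above.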

Options is a list of zero or more of the following, where Boolean must be true or false:

format(Boolean)
Indent the element content (default true).
extended_characters(Boolean)
Use the extended character entities for XHTML (default true).
remove_attribute_prefixes(Boolean)
Remove the namespace prefix from an attribute when it is the same as the prefix of the parent element (default false).
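
A minimal sketch of the parsing mode, assuming that double-quoted strings are read as code-lists. The predicate parse_note/1 and the sample text are hypothetical; only xml_parse/3 and the extended_characters/1 option come from the description above.

```prolog
:- use_module(library(xml)).

% parse_note(-Document): parse a small, hard-coded XML text.
% parse_note/1 and the sample text are hypothetical.
parse_note(Document) :-
    Chars = "<note id=\"n1\"><to>Alice</to></note>",   % a code-list
    xml_parse(Chars, Document, [extended_characters(false)]).
```

Calling parse_note(D) should bind D to a document term: an xml/2 term if the text was accepted as well-formed, and a malformed/2 term otherwise.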

xml_parse(-Chars, +Document[, +Options])
Generates Chars, a code-list, from Document, a document. If Document is not a valid document term representing well-formed XML, an exception is raised.

In this usage of the predicate, the only option available is format/1.
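
The generating mode can be sketched in the same way. The predicate generate_note/1 and the document term are hypothetical, and double-quoted strings are again assumed to be code-lists; format/2 is used only to print the result.

```prolog
:- use_module(library(xml)).

% generate_note(-Chars): generate XML text from a hand-built document term.
% generate_note/1 and the document term are hypothetical.
generate_note(Chars) :-
    Document = xml([], [element(note, [id="n1"],
                                [element(to, [], [pcdata("Alice")])])]),
    xml_parse(Chars, Document, [format(false)]),   % Chars is unbound here (-Chars)
    format('~s~n', [Chars]).                       % print the code-list as text
```

Passing format(false) asks for no indentation of the element content; omitting the option leaves the default, format(true).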

xml_subterm(+Term, ?Subterm)
Unifies Subterm with a sub-term of Term, a document. This is especially useful for testing for, or retrieving, a deeply nested sub-term of a document.

xml_pp(+Document)
“Pretty prints” Document, a document, on the current output stream.
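
The last two predicates can be combined as in the sketch below. The predicates recipient/2 and show_recipient/0 and the document term are hypothetical; the sketch also assumes that xml_subterm/2 enumerates matching sub-terms on backtracking and that double-quoted strings are code-lists.

```prolog
:- use_module(library(xml)).

% recipient(+Document, -Name): Name is the text of a <to> element found
% anywhere in Document, provided its content is a single pcdata/1 term.
recipient(Document, Name) :-
    xml_subterm(Document, element(to, _Attributes, [pcdata(Codes)])),
    atom_codes(Name, Codes).

% show_recipient/0: build a document, look up a recipient, then pretty
% print the whole document on the current output stream.
show_recipient :-
    Document = xml([], [element(note, [],
                                [element(to, [], [pcdata("Alice")])])]),
    recipient(Document, Name),
    format('Recipient: ~a~n', [Name]),
    xml_pp(Document).
```

Because the second argument of xml_subterm/2 is a pattern to be unified, leaving the attributes as an anonymous variable matches the element whatever attributes it carries.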