10.45 Parsing and Generating XML—library(xml)

This is a package for parsing XML with Prolog, which provides Prolog applications with a simple “Document Value Model” interface to XML documents. A description of the subset of XML that it supports can be found at: http://www.binding-time.co.uk/xmlpl.html

The package, originally written by Binding Time Ltd., is in the public domain and unsupported. To use the package, enter the query:

| ?- use_module(library(xml)).

The package represents XML documents by the abstract data type document, which is defined by the following grammar:

document ::= xml(attributes,content) { well-formed document }
    | malformed(attributes,content) { malformed document }
attributes ::= []
    | [name=char-data|attributes]
content ::= []
    | [cterm|content]
cterm ::= pcdata(char-data) { text }
    | comment(char-data) { an XML comment }
    | namespace(URI,prefix,element) { a Namespace }
    | element(tag,attributes,content) { <tag>..</tag> encloses content or <tag /> if empty }
    | instructions(name,char-data) { A PI <? name char-data ?> }
    | cdata(char-data) { <![CDATA[char-data]]> }
    | doctype(tag,doctype-id) { DTD <!DOCTYPE .. > }
    | unparsed(char-data) { text that hasn’t been parsed }
    | out_of_context(tag) { tag is not closed }
tag ::= atom { naming an element }
name ::= atom { not naming an element }
URI ::= atom { giving the URI of a namespace }
char-data ::= code-list
doctype-id ::= public(char-data,char-data)
    | public(char-data,dtd-literals)
    | system(char-data)
    | system(char-data,dtd-literals)
    | local
    | local,dtd-literals
dtd-literals ::= []
    | [dtd_literal(char-data)|dtd-literals]
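
To make the grammar concrete, the following sketch writes out by hand a document term for a one-element document such as <greeting lang="en">Hello</greeting> preceded by an XML declaration. The predicate name example_document/1 is hypothetical, and the sketch assumes the default setting in which double-quoted text denotes a code-list. The term actually returned by xml_parse/2 for a given input may differ in details such as whitespace handling.

```prolog
:- use_module(library(xml)).

% example_document(-Document)
% Hypothetical helper: a hand-built term that follows the grammar above.
% The outer xml/2 carries the attributes of the XML declaration;
% the content list holds a single element/3 cterm.
example_document(
    xml([version="1.0"],                    % attributes ::= [name=char-data|...]
        [element(greeting,                  % tag: an atom naming the element
                 [lang="en"],               % element attributes
                 [pcdata("Hello")])])).     % content: one text cterm
```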

The following predicates are exported by the package:

xml_parse(?Chars, ?Document)
xml_parse(?Chars, ?Document, +Options)

This predicate can be used in two modes. In the first, it parses Chars, a code-list, to Document, a document; Chars is not required to represent strictly well-formed XML. In the second, it generates Chars, a code-list, from Document, a document; if Document is not a valid document term representing well-formed XML, an exception is raised. In the second mode, the only option available is format/1.

Options is a list of zero or more of the following, where Boolean must be true or false:

format(Boolean)

Indent the element content (default true).

extended_characters(Boolean)

Use the extended character entities for XHTML (default true).

remove_attribute_prefixes(Boolean)

Remove the namespace prefix from an attribute when it is the same as the prefix of the parent element (default false).
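
The two modes of xml_parse might be exercised as in the following minimal sketch. The predicate names parse_example/1 and generate_example/1 and the sample XML are illustrative assumptions, not part of the package; double-quoted text is assumed to denote a code-list.

```prolog
:- use_module(library(xml)).

% parse_example(-Document)
% Hypothetical helper: parse a small code-list into a document term.
parse_example(Document) :-
    Chars = "<note id=\"n1\"><to>World</to></note>",
    xml_parse(Chars, Document).

% generate_example(-Chars)
% Hypothetical helper: generate a code-list from a document term.
% format(false) asks for output without indentation of element content.
generate_example(Chars) :-
    Document = xml([], [element(note, [id="n1"],
                                [element(to, [], [pcdata("World")])])]),
    xml_parse(Chars, Document, [format(false)]).
```

Because the generated result is a code-list, a query such as generate_example(Chars), format("~s~n", [Chars]) is one way to display it as text.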

xml_subterm(+Term, ?Subterm)

Unifies Subterm with a sub-term of Term, a document. This is especially useful for testing or retrieving a deeply nested subterm of a document.
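
As a sketch of this kind of retrieval, the hypothetical predicate title_text/2 below parses a code-list and extracts the text of a title element wherever it is nested. The element name title and the helper name are assumptions made for illustration; the pattern only matches a title element whose content is a single pcdata term.

```prolog
:- use_module(library(xml)).

% title_text(+Chars, -Text)
% Hypothetical helper: Text is the character data (a code-list) of a
% <title> element found anywhere inside the parsed document.
title_text(Chars, Text) :-
    xml_parse(Chars, Document),
    % A partially instantiated Subterm acts as a pattern to match.
    xml_subterm(Document, element(title, _Attributes, [pcdata(Text)])).
```

Since Text is a code-list, atom_codes/2 can be used to convert it to an atom if that is more convenient.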

xml_pp(+Document)

“Pretty prints” Document, a document, on the current output stream.
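
A minimal sketch of inspecting a parse result this way follows. The predicate name show_document/1 is hypothetical, and the exact layout of the printed output is not specified here.

```prolog
:- use_module(library(xml)).

% show_document(+Chars)
% Hypothetical helper: parse Chars and pretty print the resulting
% document term on the current output stream.
show_document(Chars) :-
    xml_parse(Chars, Document),
    xml_pp(Document).
```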


