sgml_write.pl -- XML/SGML writer module

This library provides the inverse functionality of the sgml.pl parser library, writing XML, SGML and HTML documents from the parsed output. It is intended to allow rewriting in a different dialect or encoding or to perform document transformation in Prolog on the parsed representation.

The current implementation is particularly keen on getting character encoding and the use of character entities right. Some work has been done on providing layout, but space handling in XML and SGML makes this a very hazardous area.

The Prolog-based low-level character and escape handling is the real bottleneck in this library and will probably be moved to C at a later stage.
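
The parse-and-rewrite use described above can be sketched as follows. This is a minimal illustration, not part of the library: the predicate name roundtrip/2 is hypothetical, and it assumes load_structure/3 from library(sgml) together with open_string/2 and with_output_to/2 as found in recent SWI-Prolog versions.

```prolog
:- use_module(library(sgml)).
:- use_module(library(sgml_write)).

% roundtrip(+XMLText, -Out)  (hypothetical helper)
% Parse XMLText into the parser's DOM representation, then write
% that DOM back as XML text collected in the string Out.
roundtrip(XMLText, Out) :-
    setup_call_cleanup(
        open_string(XMLText, In),
        load_structure(stream(In), DOM, [dialect(xml)]),
        close(In)),
    % xml_write/2 writes to the current output stream, which
    % with_output_to/2 redirects into a string.
    with_output_to(string(Out),
                   xml_write(DOM, [header(false), layout(false)])).
```

A transformation step would sit between the parse and the write, operating on DOM as an ordinary Prolog term.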

See also
- library(http/html_write) provides a high-level library for emitting HTML and XHTML.
xml_write(+Data, +Options) is det
sgml_write(+Data, +Options) is det
html_write(+Data, +Options) is det
xml_write(+Stream, +Data, +Options) is det
sgml_write(+Stream, +Data, +Options) is det
html_write(+Stream, +Data, +Options) is det
Write a term as created by the SGML/XML parser to a stream in SGML or XML format. Options:
cleanns(Bool)
If true (default), remove duplicate xmlns attributes.
dtd(DTD)
The DTD. This is needed for SGML documents that contain elements with content model EMPTY. Characters which may not be written directly in the Stream's encoding will be written using character data entities from the DTD if at all possible, otherwise as numeric character references. Note that the DTD will NOT be written out at all; as yet there is no way to write out an internal subset, though it would not be hard to add one.
doctype(DocType)
Document type for the SGML document type declaration. If omitted it is taken from the root element. There is never any point in having this disagree with the root element. A <!DOCTYPE> declaration will be written if and only if at least one of doctype(_), public(_), or system(_) is provided in Options.
public(PubId)
The public identifier to be written in the <!DOCTYPE> line.
system(SysId)
The system identifier to be written in the <!DOCTYPE> line.
header(Bool)
If Bool is 'false', do not emit the <?xml ...?> header line. (xml_write/3 only)
nsmap(Map:list(Id=URI))
When emitting embedded XML, assume these namespaces are already defined from the environment. (xml_write/3 only).
indent(Indent)
Indentation of the document (for embedding)
layout(Bool)
Emit/do not emit layout characters to make output readable.
net(Bool)
Use/do not use Null End Tags. For XML, this applies only to empty elements, so you get
    <foo/>      (default, net(true))
    <foo></foo> (net(false))

For SGML, this applies to empty elements, so you get

    <foo>       (if foo is declared to be EMPTY in the DTD)
    <foo></foo> (default, net(false))
    <foo//      (net(true))

and also to elements with character content not containing /

    <b>xxx</b>  (default, net(false))
    <b/xxx/     (net(true)).

Note that if the stream is UTF-8, the system will write special characters as UTF-8 sequences, while if it is ISO Latin-1 it will use (character) entities if there is a DTD that provides them, otherwise it will use numeric character references.
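
As a rough illustration of how these options combine, the sketch below writes a small hand-built term as an XML fragment. The DOM content and the predicate names demo_dom/1 and demo_xml/1 are made up for the example; the term follows the element(Name, Attributes, Content) shape produced by the parser.

```prolog
:- use_module(library(sgml_write)).

% Hypothetical DOM in the element(Name, Attributes, Content) form
% produced by the SGML/XML parser.
demo_dom(element(note, [lang=en],
                 [ element(to, [], ['World']),
                   element(br, [], [])
                 ])).

% Write the DOM as an embedded XML fragment: header(false) suppresses
% the XML header line and layout(false) suppresses added whitespace.
% The output is collected in String rather than printed.
demo_xml(String) :-
    demo_dom(DOM),
    with_output_to(string(String),
                   xml_write(DOM, [header(false), layout(false)])).
```

With the default net(true), the empty br element should come out in the short empty-element form described above; adding net(false) to the option list should produce the open and close tag pair instead.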

xmlns(?NS, ?URI) is nondet [multifile]
Hook to define human readable abbreviations for XML namespaces. xml_write/3 tries these locations:
  1. This hook
  2. Defaults (see below)
  3. rdf_db:ns/2 for RDF-DB integration

Default XML namespaces are:

    xsi     http://www.w3.org/2001/XMLSchema-instance
    xs      http://www.w3.org/2001/XMLSchema
    xhtml   http://www.w3.org/1999/xhtml
    soap11  http://schemas.xmlsoap.org/soap/envelope/
    soap12  http://www.w3.org/2003/05/soap-envelope
See also
- xml_write/2, rdf_register_ns/2.
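
A hedged sketch of using the hook: it assumes the hook lives in module sgml_write (so clauses are added as sgml_write:xmlns/2) and that namespaced element names use the parser's URI:LocalName form. The prefix ex, the URI http://example.org/ns/ and the predicate demo_ns/1 are invented for the example.

```prolog
:- use_module(library(sgml_write)).

:- multifile sgml_write:xmlns/2.

% Hypothetical abbreviation: suggest the prefix `ex` for this namespace.
sgml_write:xmlns(ex, 'http://example.org/ns/').

% Write one namespaced element and collect the output in String.
% The element name is URI:LocalName, as the parser produces in
% xmlns mode.
demo_ns(String) :-
    DOM = element('http://example.org/ns/':item, [], ['hello']),
    with_output_to(string(String),
                   xml_write(DOM, [header(false), layout(false)])).
```

The writer has to declare the namespace somewhere in the output; with the hook clause in place it is expected to prefer the ex abbreviation over a generated one.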