Prolog Level WCX Features
SICStus Prolog has a Prolog flag, called wcx, whose value can be an
arbitrary atom, and which is initialized to []. This flag is consulted
when a stream is opened: its value is normally passed to a user-defined
hook function, which makes it possible to pass information from Prolog
to the hook function. In the example of A Sample WCX Box, which
supports the selection of external encodings on a stream-by-stream
basis, the value of the wcx flag specifies the encoding to be used for
the newly opened stream.
The value of the wcx flag can be overridden by supplying a wcx(Value)
option to open/4 and load_files/2. If such an option is present, then
Value is passed on to the hook function.
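
The two ways of passing a value to the hook function can be sketched as
follows. This is only an illustration: the predicate names
open_with_flag/3 and open_with_option/3 are hypothetical helpers, and
which atoms are meaningful for Encoding depends entirely on the hook
functions that are installed.

```prolog
% Sketch only. open_with_flag/3 and open_with_option/3 are hypothetical
% helper names; Encoding must be an atom understood by the installed hook.

% Variant 1: set the wcx flag around the call to open/3.
open_with_flag(Encoding, File, Stream) :-
    prolog_flag(wcx, Old, Encoding),   % set the flag, remember the old value
    open(File, read, Stream),          % the hook receives Encoding
    prolog_flag(wcx, _, Old).          % restore the old value
                                       % (not restored if open/3 raises an error)

% Variant 2: override the flag for one stream only.
open_with_option(Encoding, File, Stream) :-
    open(File, read, Stream, [wcx(Encoding)]).
```

The option form leaves the global flag untouched, so it is probably the
safer choice when only a single stream needs a different encoding.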
The wcx flag has a reserved value. The value wci (wide character
internal encoding) signifies that the stream should use the SICStus
Prolog internal encoding (UTF-8), bypassing the hook functions supplied
by the user. This is appropriate, e.g., if a file with wide characters
is to be produced that has to be readable irrespective of the (possibly
user-supplied) encoding scheme.
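
As a minimal sketch of the reserved value, the following writes one wide
character using the internal encoding and reads it back the same way.
The predicate name wci_roundtrip/1 and the file name are hypothetical;
the character code 8364 (EURO SIGN) is just an arbitrary example of a
code outside the single-byte range.

```prolog
% Sketch only. wci_roundtrip/1 and 'wci_demo.txt' are hypothetical names.
wci_roundtrip(Code) :-
    File = 'wci_demo.txt',
    open(File, write, Out, [wcx(wci)]),  % internal encoding, hooks bypassed
    put_code(Out, 8364),                 % EURO SIGN, a wide character
    close(Out),
    open(File, read, In, [wcx(wci)]),    % read back with the same encoding
    get_code(In, Code),
    close(In).
```

Because both streams use wcx(wci), the result should not depend on
whatever encoding the user-supplied hook functions implement.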
Wide characters generally require several bytes to be input or output.
Therefore, for each stream, SICStus Prolog keeps track of the number of
bytes input or output, in addition to the number of (wide) characters.
Accordingly there is a built-in predicate byte_count(+Stream, ?N) for
accessing the number of bytes read/written on a stream.
Note that the predicate character_count/2 returns the number of
characters read or written, which may be less than the number of bytes
if some of the characters are multibyte. (On output streams the value
returned by byte_count/2 can also be less than that returned by
character_count/2, if some codes, not belonging to the code-set
handled, are not written out.)
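
The difference between the two counters can be observed with a sketch
like the one below. The predicate name count_demo/2 and the file name
are hypothetical; wcx(wci) is used so that the outcome depends only on
the internal UTF-8 encoding and not on a user-supplied hook.

```prolog
% Sketch only. count_demo/2 and 'count_demo.txt' are hypothetical names.
count_demo(Chars, Bytes) :-
    open('count_demo.txt', write, S, [wcx(wci)]),
    put_code(S, 8364),            % one wide character (EURO SIGN)
    character_count(S, Chars),    % characters written so far
    byte_count(S, Bytes),         % bytes written so far
    close(S).
```

Since the single character written is multibyte in UTF-8, Bytes is
expected to exceed Chars here.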
Note that if a stream is opened as a binary stream:
open(..., ..., ..., [type(binary)])
then no wide character handling will take place; every character output will produce a single byte on the stream, and every byte input will be considered a separate character.
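
A small sketch of the binary case, assuming the ISO byte I/O predicates
put_byte/2 and get_byte/2 are available: the three bytes below happen to
form the UTF-8 encoding of one wide character, yet on a binary stream
they are read back as three separate items. The predicate name
binary_roundtrip/1 and the file name are hypothetical.

```prolog
% Sketch only. binary_roundtrip/1 and 'bytes_demo.bin' are hypothetical names.
binary_roundtrip(Bytes) :-
    File = 'bytes_demo.bin',
    open(File, write, Out, [type(binary)]),
    put_byte(Out, 226),           % 0xE2
    put_byte(Out, 130),           % 0x82
    put_byte(Out, 172),           % 0xAC
    close(Out),
    open(File, read, In, [type(binary)]),
    get_byte(In, B1),             % each byte is delivered on its own,
    get_byte(In, B2),             % no wide character decoding is applied
    get_byte(In, B3),
    close(In),
    Bytes = [B1, B2, B3].
```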