

10.12 I/O on Comma-Separated Values (CSV) Files and Strings—library(csv)

This library module provides some utilities for Comma-Separated Values (CSV) files and strings. In this context, a file is a sequence of records, and a record is a sequence of fields. In a CSV file, fields are separated by commas, and each record is terminated by RET.

This module does not report any syntax errors. If the input file ends prematurely, the current field and record are terminated silently.

When a CSV record is read, it yields a list of fields of the following form:

integer(Number,Codes)

Stands for the integer Number, where number_codes(Number,Codes) holds, and Codes is the list of character codes actually read.

float(Number,Codes)

Stands for the float Number, where number_codes(Number,Codes) holds, and Codes is the list of character codes actually read.

string(Codes)

Stands for the text string Codes (a list of character codes), for which number_codes(Number,Codes) does not hold.

When a CSV record is written, the Codes argument of the above terms is used, but the following fields are also allowed:
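
As a minimal sketch of the field terms described above, the following reads one record from a code list and prints how each field was classified. The predicate names example_fields/1, show_fields/0, show_each/1 and describe/1 and the sample data are invented for this illustration; only read_record_from_codes/2 comes from this library.

```prolog
:- use_module(library(csv)).

% Parse one CSV line held in a code list and return the field terms.
% atom_codes/2 builds the code list, so the example does not depend
% on the setting of the double_quotes flag.
example_fields(Record) :-
    atom_codes('aaa,42,3.14', Codes),
    read_record_from_codes(Record, Codes).

% Print each field together with the kind of term it was read as.
show_fields :-
    example_fields(Record),
    show_each(Record).

show_each([]).
show_each([Field|Fields]) :-
    describe(Field),
    show_each(Fields).

% One clause per field form listed above.
describe(integer(N, Codes)) :- format('integer ~w, read from ~s~n', [N, Codes]).
describe(float(F, Codes))   :- format('float ~w, read from ~s~n', [F, Codes]).
describe(string(Codes))     :- format('string ~s~n', [Codes]).
```

Calling show_fields should print one line per field. Keeping Codes next to Number lets a program reproduce the text exactly as it was read.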

integer(Number)

Stands for the integer Number.

float(Number)

Stands for the float Number.

atom(Atom)

Stands for the atom Atom.
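
The write-only field forms can be tried without any file, for instance with write_record_to_codes/2. This is a sketch only: example_line/1, print_line/0 and the field values are made up for illustration, and the exact text produced for the float is left to the library.

```prolog
:- use_module(library(csv)).

% Build one CSV line from fields that carry no Codes argument.
example_line(Line) :-
    Record = [atom(widget), integer(7), float(2.5)],
    write_record_to_codes(Record, Line).

% Print the line followed by a newline; ~s expects a code list.
print_line :-
    example_line(Line),
    format('~s~n', [Line]).
```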

Adapted to the conventions of this manual, RFC 4180 specifies the following. Where this module relaxes the requirements, that is explicitly mentioned:

  1. Each record is located on a separate line, delimited by a line break. For example:
    aaa,bbb,ccc RET
    zzz,yyy,xxx RET
    
  2. The last record in the file may or may not have an ending line break. For example:
    aaa,bbb,ccc RET
    zzz,yyy,xxx
    
  3. There may be an optional header line appearing as the first line of the file with the same format as normal record lines. This header will contain names corresponding to the fields in the file and should contain the same number of fields as the records in the rest of the file. For example:
    field_name,field_name,field_name RET
    aaa,bbb,ccc RET
    zzz,yyy,xxx RET
    

    This module does not attempt to detect a header line nor treat it in any special way.

  4. Within the header and each record, there may be one or more fields, separated by commas. Each record should contain the same number of fields throughout the file. Spaces are considered part of a field and should not be ignored. The last field in the record must not be followed by a comma, so if the record ends with a comma, the last field is treated as empty. For example, the following is treated as four fields:
    aaa,bbb,ccc,
    

    This module does not require or check that each record contains the same number of fields.

  5. Each field may or may not be enclosed in double quotes. If fields contain line breaks (RET), double quotes or commas, then they should be enclosed in double quotes, otherwise the double quotes may be omitted. For example:
    "aaa","bbb","ccc" RET
    "aaa","b RET
    bb","ccc" RET
    zzz,yyy,xxx
    

    If an unenclosed field is immediately followed by a double quote (or an enclosed field is immediately followed by unenclosed text), then this module treats that as a new enclosed (or unenclosed) field to be read and appended to the field read so far.

  6. If double quotes are used to enclose fields, then a double quote appearing inside a field must be escaped by preceding it with another double quote. For example:
    "aaa","b""bb","ccc"
    

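Rules 4 and 6 can be checked directly on code lists. The following is a sketch under the assumption that the rules apply to read_record_from_codes/2 in the same way as to stream input; trailing_comma_count/1 and escaped_quote_field/1 are names invented for this example.

```prolog
:- use_module(library(csv)).

% Rule 4: a trailing comma introduces a final empty field, so this
% line is expected to be read as four fields.
trailing_comma_count(N) :-
    atom_codes('aaa,bbb,ccc,', Codes),
    read_record_from_codes(Record, Codes),
    length(Record, N).

% Rule 6: inside an enclosed field, two double quotes stand for one.
% The second field is returned so that it can be inspected.
escaped_quote_field(Field) :-
    atom_codes('"aaa","b""bb","ccc"', Codes),
    read_record_from_codes(Record, Codes),
    Record = [_, Field, _].
```

A goal such as trailing_comma_count(N) should bind N to 4, in line with the example under rule 4.
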
Exported predicates:

read_record(-Record)
read_record(+Stream, -Record)

Reads a single record from the stream Stream, which defaults to the current input stream, and unifies it with Record. On end of file, Record is unified with end_of_file.
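
A record-at-a-time loop might look as follows. This is a sketch, not part of the library: print_csv/1 and print_loop/1 are invented names, and the file name is supplied by the caller.

```prolog
:- use_module(library(csv)).

% Report the number of fields of every record in a CSV file.
% call_cleanup/2 closes the stream even if reading raises an error.
print_csv(File) :-
    open(File, read, Stream),
    call_cleanup(print_loop(Stream), close(Stream)).

% Read records until read_record/2 yields end_of_file.
print_loop(Stream) :-
    read_record(Stream, Record),
    (   Record == end_of_file
    ->  true
    ;   length(Record, N),
        format('~d field(s)~n', [N]),
        print_loop(Stream)
    ).
```

Reading one record at a time keeps memory use independent of the file size, which may matter for large inputs.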

read_records(-Records)
read_records(+Stream, -Records)

Reads records from the stream Stream, which defaults to the current input stream, up to the end of the stream, and unifies them with Records.
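
Since this module does not treat a header line specially, a caller that expects one can split it off after reading. The sketch below assumes the first record is a header; load_table/3 is a name invented for this example.

```prolog
:- use_module(library(csv)).

% Read a whole CSV file and split off the first record as a header.
% Treating the first record as a header is the caller's decision.
load_table(File, Header, Rows) :-
    open(File, read, Stream),
    call_cleanup(read_records(Stream, Records), close(Stream)),
    Records = [Header|Rows].
```

The goal fails if the file contains no records, because the final unification then has no header to bind.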

read_record_from_codes(-Record, +Codes)
read_record_from_codes(-Record, +Codes, -Suffix)

Reads a record from the code list Codes. In the arity 2 variant, there must be no trailing character codes after the record. In the arity 3 variant, any trailing character codes are unified with Suffix, which can be used for reading subsequent records.
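
The arity 3 variant can be used to read several records from one code list by passing each Suffix to the next call. This is a sketch that assumes records in the code list are separated by newline codes and that each call consumes the terminator of the record it reads; codes_records/2 is a name invented here.

```prolog
:- use_module(library(csv)).

% Read all records from a code list by threading the Suffix of one
% call into the next. Stops when no codes remain.
codes_records([], []).
codes_records([C|Cs], [Record|Records]) :-
    read_record_from_codes(Record, [C|Cs], Suffix),
    codes_records(Suffix, Records).
```

For example, after atom_codes('a,b\nc,d', Codes), the goal codes_records(Codes, Records) should yield one list of fields per line.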

write_record(+Record)
write_record(+Stream, +Record)

Writes a single record to the stream Stream, which defaults to the current output stream.

write_records(+Records)
write_records(+Stream, +Records)

Writes records to the stream Stream, which defaults to the current output stream.
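
Writing a small table to a file could be sketched like this. The predicate save_table/1 and the field values are made up for illustration; the stream handling uses the standard open/3 and close/1.

```prolog
:- use_module(library(csv)).

% Write three records to File, one per line.
% call_cleanup/2 closes the stream even if writing raises an error.
save_table(File) :-
    Records = [[atom(name), atom(qty)],
               [atom(bolt), integer(12)],
               [atom(nut),  integer(30)]],
    open(File, write, Stream),
    call_cleanup(write_records(Stream, Records), close(Stream)).
```

Reading the file back with read_records/2 should give three records of two fields each, with the quantities classified as integer fields.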

write_record_to_codes(+Record, -Codes)

Writes a single record to the code list Codes, without the terminating RET.


