
External Storage of Terms (External Database)

This library handles storage and retrieval of terms in files. Thanks to indexing, the store and retrieve operations remain efficient even for large data sets.

The package is loaded by the query:

| ?- use_module(library(db)).


The idea is to get behavior similar to assert/1, retract/1 and clause/2, but with the terms stored in files instead of in primary memory.

The differences compared with the internal database are:

Some commercial databases can't store non-ground terms or more than one instance of a term. The SICStus database can however store terms of either kind.

The database is kept in a secure state, and the last stored term is safe even if the SICStus process dies (machine reboot, killed process, halt/0, power failure, ...); but see section Current Limitations.

Current Limitations

The DB-Spec--Informal Description

The db-spec defines which parts of a term are used for indexing in a database. It is a structure with the functor on or off. Its arguments are on, off, or structures of the same kind.

The db-spec is compared with the indexed term and every argument where there is an on in the db-spec is indexed.

If the db-spec has lower arity than the indexed term, the remaining arguments of the term are not indexed; if the term has lower arity than the db-spec, the remaining arguments of the db-spec are ignored.

The idea of a db-spec is illustrated with a few examples. (A section further down explains the db-spec in a more formal way.)

DB-Spec          Term (the parts with indexing are underlined)

on(on)           a(b)   a(b,c)  a   [a,b(c),d]    /* as Prolog */
                 - -    - -     -   --

on(on,on)        a(b)   a(b,c)  a(b,c,d)  a(b,c(d))   [a,b(c),d]
                 - -    - - -   - - -     - - -       ---

on(off,on(on))   a(b)   a(b,c)  a(b,c,d)  a(b,c(d)) [a,b(c),d]
                 -      -   -   -   -     -   - -   - --


The following conventions are used in the predicate descriptions below.

db_open(+Name, +Mode, -DBref)
db_open(+Name, +Mode, ?Spec, -DBref)
Opens a database with the name Name. The database physically consists of a subdirectory with the same name, containing the files that make up the database. If the subdirectory does not exist, it is created. In that case Mode must be update.
The db-spec Spec must be ground when a new database is created. If an existing database is opened, Spec is unified with the db-spec given when the database was created. If the unification fails, the predicate fails and an error is raised.
With db_open/3, a new database is created with the db-spec on(on), which gives the same kind of indexing as in the internal database. When an existing database is opened with db_open/3, whatever db-spec the database was created with is accepted.
db_close(+DBref)
db_close
Closes a database. db_close/0 closes the default database. Note that after db_close/0 there is no default database. abort/0 does not close databases.
set_default_db(+DBref)
Sets the database Name or DBref to be the default database, regardless of whether there already is one.
get_default_db(?DBref)
Unifies DBref with the default database.
current_db(?Name, ?Mode, ?Spec, ?DBref)
Unifies the arguments with the open databases. This predicate can be used for enumerating all currently open databases through backtracking.
db_store(+Term, -TermRef)
db_store(+DBref, +Term, -TermRef)
Stores Term in the database DBref, which defaults to the default database. TermRef is unified with a corresponding term reference.
db_fetch(?Term, ?TermRef)
db_fetch(+DBref, ?Term, ?TermRef)
Unifies Term with a term from the database DBref, which defaults to the default database. At the same time, TermRef is unified with a corresponding term reference. Backtracking over the predicate unifies with all terms matching Term. If you simply want to find all matching terms, it is more efficient to use db_findall/(2-3). If TermRef and DBref are instantiated (and the referenced term is not erased), the referenced term is read and unified with Term.
db_findall(?Term, ?TermList)
db_findall(+DBref, ?Term, ?TermList)
Unifies TermList with the list of all terms matching Term from the database DBref, which defaults to the default database. The list is guaranteed to be free of duplicates, except in the cases where the same term has actually been stored more than once.
db_erase(+TermRef)
db_erase(+DBref, +TermRef)
Deletes the term referenced by TermRef from the database DBref, which defaults to the default database, unless the term has already been deleted.
db_canonical(+TermRef, -TermID)
db_canonical(+DBref, +TermRef, -TermID)
Returns the canonical term identifier for the term in DBref, which defaults to the default database, that is referenced by TermRef.
db_buffering(?Old, ?New)
db_buffering(+DBref, ?Old, ?New)
Unifies Old with the current buffering mode of DBref, which defaults to the default database, and sets its buffering mode to New, which must be on or off. Buffering is initially off. When it is on, modified pages are not immediately flushed to disk, which gives faster execution but a higher risk of inconsistencies if a crash occurs. db_close always flushes any modified pages.
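
The predicates above can be combined into small helpers. The following is only a sketch, assuming the predicates behave as described in this section: bulk_store/2, store_all/2, erase_matching/2 and erase_all/2 are names made up for this illustration and are not part of the library.

```prolog
:- use_module(library(db)).

% bulk_store(+DBref, +Terms)   (hypothetical helper)
% Store every term of the list Terms in DBref with buffering on,
% then put the buffering mode back to what it was before.
bulk_store(DBref, Terms) :-
    db_buffering(DBref, Old, on),
    store_all(Terms, DBref),
    db_buffering(DBref, _, Old).

store_all([], _).
store_all([Term|Terms], DBref) :-
    db_store(DBref, Term, _TermRef),
    store_all(Terms, DBref).

% erase_matching(+DBref, +Pattern)   (hypothetical helper)
% Erase every stored term matching Pattern. The term references are
% collected first, so that nothing is erased while db_fetch/3 is
% still being backtracked over.
erase_matching(DBref, Pattern) :-
    findall(TermRef, db_fetch(DBref, Pattern, TermRef), TermRefs),
    erase_all(TermRefs, DBref).

erase_all([], _).
erase_all([TermRef|TermRefs], DBref) :-
    db_erase(DBref, TermRef),
    erase_all(TermRefs, DBref).
```

With R bound to a DBref returned by db_open, a goal such as bulk_store(R, [a(b),a(c)]) followed by erase_matching(R, a(c)) would store two terms and then remove the second. Note that bulk_store/2 does not restore the buffering mode if storing fails or raises an error; a more careful version would have to handle that case.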

An Example Session

| ?- db_open(my_db,update,on(on),R), set_default_db(R).

R = '$db'(1411928) ?

| ?- db_store(a(b),_).

| ?- db_store(a(c),_).

| ?- db_fetch(X,_).

X = a(b) ? ;

X = a(c) ? ;

| ?- current_db(A,B,C,D).

A = my_db,
B = update,
C = on(on),
D = '$db'(1411928) ? ;

| ?- db_close.


The Db-Spec

A db-spec is one of the following terms:

The following table defines the way indices are calculated a bit more formally. The table defines the index as a function INDEX of the db-spec and the indexed term.

The column Index Calculation describes the procedure: a "yes" means that some primitive function is used, such as a hash function; "INDEX(s,t)" means simply that the function definition (the table) is applied again, but with a new db-spec s and a new term t; I means on or off.

DB-Spec         Indexed Term     Index Calculation
off             any              no
on              any              yes (on principal functor)
I(...)          atomic           yes
I(...)          variable         yes
I(S1,S2...Sn)   F(A1,A2...Am)    yes (on principal functor)
                                 INDEX(Si,Ai) for 1 =< i =< min(n,m)
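
The table can be read as a recursive procedure. The following is a minimal sketch in plain Prolog, not the library's implementation: index_parts/3 and its helper nonterminals are names made up for this illustration. Instead of applying a hash function, the sketch simply lists Name/Arity for every part of the term that would be indexed, and uses the atom '$var' as a placeholder for a variable.

```prolog
% index_parts(+Spec, +Term, -Parts)   (hypothetical, for illustration)
% Parts lists the parts of Term for which the table says "yes".
index_parts(Spec, Term, Parts) :-
    phrase(index(Spec, Term), Parts).

index(off, _) --> !, [].                 % off: no index
index(on, Term) --> !, key(Term).        % on: principal functor only
index(Spec, Term) -->                    % Spec is I(S1,...,Sn)
    { compound(Spec) },
    key(Term),
    (   { compound(Term) }
    ->  { Spec =.. [_|Specs], Term =.. [_|Args] },
        index_args(Specs, Args)          % INDEX(Si,Ai), i =< min(n,m)
    ;   []                               % atomic or variable term
    ).

% Walk db-spec arguments and term arguments in step and stop as soon
% as either list runs out.
index_args([S|Ss], [A|As]) --> !, index(S, A), index_args(Ss, As).
index_args(_, _) --> [].

key(Term) --> { var(Term) }, !, ['$var'].
key(Term) --> { functor(Term, Name, Arity) }, [Name/Arity].
```

For example, the goal index_parts(on(off,on(on)), a(b,c(d)), Parts) gives Parts = [a/2,c/1,d/0], which corresponds to the underlined a, c and d in the informal description of the db-spec.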

Every term is stored together with a set of "keywords" for indexing purposes. The space overhead is approximately 16 bytes per keyword per term. The number of keywords stored depends on the db-spec and on the term being indexed. The following table defines the number of keywords as a function K of the db-spec and the indexed term.

DB-Spec         Indexed Term     Number of Keywords
off             any              0
on              any              2
I(...)          atomic           2
I(...)          variable         2
I(S1,...,Sn)    F(A1,...,Am)     1+K(S1,A1)*...*K(Sj,Aj), j=min(n,m)
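
To estimate the space overhead for a given db-spec, the function K can be transcribed directly. The following is a sketch in plain Prolog that follows the table literally; keyword_count/3, product/4 and keyword_overhead/3 are names made up for this illustration, and the figure of 16 bytes per keyword is the approximation quoted above.

```prolog
% keyword_count(+Spec, +Term, -K)   (hypothetical, for illustration)
% K is the number of keywords according to the table above.
keyword_count(off, _, 0) :- !.
keyword_count(on, _, 2) :- !.
keyword_count(Spec, Term, K) :-          % Spec is I(S1,...,Sn)
    compound(Spec),
    (   compound(Term)
    ->  Spec =.. [_|Specs],
        Term =.. [_|Args],
        product(Specs, Args, 1, P),
        K is 1 + P
    ;   K = 2                            % atomic or variable term
    ).

% Multiply K(Si,Ai) for i = 1..min(n,m), starting from P0.
product([S|Ss], [A|As], P0, P) :- !,
    keyword_count(S, A, K),
    P1 is P0 * K,
    product(Ss, As, P1, P).
product(_, _, P, P).

% Approximate space overhead in bytes, at 16 bytes per keyword.
keyword_overhead(Spec, Term, Bytes) :-
    keyword_count(Spec, Term, K),
    Bytes is 16 * K.
```

For example, keyword_overhead(on(on,on), a(b,c), Bytes) counts 1 + 2*2 = 5 keywords and gives Bytes = 80. Read literally, the formula yields 1 whenever some K(Si,Ai) is 0, as for an argument whose db-spec is off, and the sketch reproduces that.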
