Next: lib-assoc, Up: The Prolog Library [Contents][Index]
library(aggregate)
Data base query languages usually provide so-called "aggregation" operations. Given a relation, aggregation specifies
One might, for example, ask
PRINT DEPT,SUM(AREA) WHERE OFFICE(_ID,DEPT,AREA,_OCCUPANT)
and get a table of <Department,TotalArea> pairs. The Prolog equivalent of this might be
dept_office_area(Dept, TotalArea) :- aggregate(sum(Area), I^O^office(I,Dept,Area,O), TotalArea).
where Area is the column and sum(_)
is the aggregation operator.
We can also ask who has the smallest office in each department:
smallest_office(Dept, Occupant) :- aggregate(min(Area), I^O^office(I,Dept,Area,O), MinArea), office(_, Dept, MinArea, Occupant).
This module provides an aggregation operator in Prolog:
aggregate(Template, Generator, Results)
where:
sum | min | max
{for now}
Results is unified with a form of the same structure as Template.
Things like mean and standard deviation can be calculated from sums, e.g. to find the average population of countries (defined as "if you sampled people at random, what would be the mean size of their answers to the question ’what is the population of your country?’?") we could do
?- aggregate(x(sum(Pop),sum(Pop*Pop)), Country^population(Country,Pop), x(People,PeopleTimesPops)), AveragePop is PeopleTimesPops/People.
Note that according to this definition, aggregate/3
FAILS if
there are no solutions. For max(_)
, min(_)
, and many other
operations (such as mean(_)
) this is the only sensible
definition (which is why bagof/3
works that way). Even if
bagof/3 yielded an empty list, aggregate/3 would still fail.
Concerning the minimum and maximum, it is convenient at times to know Which term had the minimum or maximum value. So we write
min(Expression, Term) max(Expression, Term)
and in the constructed term we will have
min(MinimumValue, TermForThatValue) max(MaximumValue, TermForThatValue)
So another way of asking who has the smallest office is
smallest_office(Dept, Occupant) :- aggregate(min(Area,O), I^office(I,Dept,Area,O), min(_,Occupant)).
Consider queries like
aggregate(sum(Pay), Person^pay(Person,Pay), TotalPay)
where for some reason pay/2
might have multiple solutions.
(For example, someone might be listed in two departments.)
We need a way of saying "treat identical instances of the
Template as a single instance, UNLESS they correspond to
different instances of a Discriminator." That is what
aggregate(Template, Discriminator, Generator, Results)
does.
Operations available:
count
sum(1)
sum(E)
sum of values of E
min(E)
minimum of values of E
min(E,X)
min(E)
with corresponding instance of X
max(E)
maximum of values of E
max(E,X)
max(E)
with corresponding instance of X
set(X)
ordered set of instances of X
bag(X)
list of instances of X in generated order.
bagof(X, G, B) :- aggregate(bag(X), G, L). setof(X, G, B) :- aggregate(set(X), X, G, L).
Exported predicates:
forall(:Generator, :Goal)
succeeds when Goal is provable for each true instance of Generator. Note that there is a sort of double negation going on in here (it is in effect a nested pair of failure-driven loops), so it will never bind any of the variables which occur in it.
foreach(:Generator, :Goal)
for each proof of Generator in turn, we make a copy of Goal with
the appropriate substitution, then we execute these copies in
sequence. For example, foreach(between(1,3,I), p(I))
is
equivalent to p(1), p(2), p(3)
.
Note that this is not the same as forall/2
. For example,
forall(between(1,3,I), p(I))
is equivalent to
\+ \+ p(1), \+ \+ p(2), \+ \+ p(3)
.
The trick in foreach/2
is to ensure that the variables of Goal which
do not occur in Generator are restored properly. (If there are no
such variables, you might as well use forall/2
.)
Like forall/2
, this predicate does a failure-driven loop over the
Generator. Unlike forall/2
, the Goals are executed as an ordinary
conjunction, and may succeed in more than one way.
aggregate(+Template, +Discriminator, :Generator, -Result)
is a generalisation of setof/3
which lets you compute sums,
minima, maxima, and so on.
aggregate(+Template, :Generator, -Result)
is a generalisation of findall/3
which lets you compute sums,
minima, maxima, and so on.
aggregate_all(+Template, +Discriminator, :Generator, -Result)
is like aggregate/4
except that it will find at most one solution,
and does not bind free variables in the Generator.
aggregate_all(+Template, :Generator, -Result)
is like aggregate/3
except that it will find at most one solution,
and does not bind free variables in the Generator.
free_variables(:Goal, +Bound, +Vars0, -Vars)
binds Vars to the union of Vars0 with the set of free variables
in Goal, that is the set of variables which are captured neither
by Bound nor by any internal quantifiers or templates in Goal.
We have to watch out for setof/3
and bagof/3
themselves, for the
explicit existential quantifier Vars^Goal
, and for things like
\+(_)
which might look as though they bind variables but can’t.
term_variables(+Term, +Vars0, -Vars)
binds Vars to a union of Vars0 and the variables which occur in Term. This doesn’t take quantifiers into account at all.
New code should consider the built in term_variables/2
which is likely
to be faster, and works for cyclic terms.
Could be defined as:
term_variables(Term, Vars0, Vars) :- nonvar(Term), !, ( foreacharg(Arg,Term), fromto(Vars0,S0,S,Vars) do term_variables(Arg, S0, S) ). term_variables(Term, Vars0, Vars) :- ( foreach(X,Vars0), param(Term) do X\==Term ), !, Vars = [Term|Vars0]. term_variables(_, Vars, Vars).