library(aggregate)
Data base query languages usually provide so-called "aggregation" operations. Given a relation, aggregation specifies
One might, for example, ask
PRINT DEPT,SUM(AREA) WHERE OFFICE(_ID,DEPT,AREA,_OCCUPANT)
and get a table of <Department,TotalArea> pairs. The Prolog equivalent of this might be
dept_office_area(Dept, TotalArea) :- aggregate(sum(Area), I^O^office(I,Dept,Area,O), TotalArea).
where Area is the column and sum(_)
is the aggregation operator.
We can also ask who has the smallest office in each department:
smallest_office(Dept, Occupant) :- aggregate(min(Area), I^O^office(I,Dept,Area,O), MinArea), office(_, Dept, MinArea, Occupant).
This module provides an aggregation operator in Prolog:
aggregate(Template, Generator, Results)
where:
sum | min | max
{for now}
Results is unified with a form of the same structure as Template.
Things like mean and standard deviation can be calculated from sums, e.g. to find the average population of countries (defined as "if you sampled people at random, what would be the mean size of their answers to the question 'what is the population of your country?'?") we could do
?- aggregate(x(sum(Pop),sum(Pop*Pop)), Country^population(Country,Pop), x(People,PeopleTimesPops)), AveragePop is PeopleTimesPops/People.
Note that according to this definition, aggregate/3
FAILS if
there are no solutions. For max(_)
, min(_)
, and many other
operations (such as mean(_)
) this is the only sensible
definition (which is why bagof/3
works that way). Even if
bagof/3 yielded an empty list, aggregate/3 would still fail.
Concerning the minimum and maximum, it is convenient at times to know Which term had the minimum or maximum value. So we write
min(Expression, Term) max(Expression, Term)
and in the constructed term we will have
min(MinimumValue, TermForThatValue) max(MaximumValue, TermForThatValue)
So another way of asking who has the smallest office is
smallest_office(Dept, Occupant) :- aggregate(min(Area,O), I^office(I,Dept,Area,O), min(_,Occupant)).
Consider queries like
aggregate(sum(Pay), Person^pay(Person,Pay), TotalPay)
where for some reason pay/2
might have multiple solutions.
(For example, someone might be listed in two departments.)
We need a way of saying "treat identical instances of the
Template as a single instance, UNLESS they correspond to
different instances of a Discriminator." That is what
aggregate(Template, Discriminator, Generator, Results)
does.
Operations available:
count
sum(1)
sum(
E)
min(
E)
min(
E,
X)
min(
E)
with corresponding instance of X
max(
E)
max(
E,
X)
max(
E)
with corresponding instance of X
set(
X)
bag(
X)
bagof(X, G, B) :- aggregate(bag(X), G, L). setof(X, G, B) :- aggregate(set(X), X, G, L).
Exported predicates:
forall(
:Generator,
:Goal)
foreach(
:Generator,
:Goal)
foreach(between(1,3,I), p(I))
is
equivalent to p(1), p(2), p(3)
.
Note that this is not the same as forall/2
. For example,
forall(between(1,3,I), p(I))
is equivalent to
\+ \+ p(1), \+ \+ p(2), \+ \+ p(3)
.
The trick in foreach/2
is to ensure that the variables of Goal which
do not occur in Generator are restored properly. (If there are no
such variables, you might as well use forall/2
.)
Like forall/2
, this predicate does a failure-driven loop over the
Generator. Unlike forall/2
, the Goals are executed as an ordinary
conjunction, and may succeed in more than one way.
aggregate(
+Template,
+Discriminator,
:Generator,
-Result)
setof/3
which lets you compute sums,
minima, maxima, and so on.
aggregate(
+Template,
:Generator,
-Result)
findall/3
which lets you compute sums,
minima, maxima, and so on.
aggregate_all(+Template, +Discriminator, :Generator, -Result)
aggregate/4
except that it will find at most one solution,
and does not bind free variables in the Generator.
aggregate_all(+Template, :Generator, -Result)
aggregate/3
except that it will find at most one solution,
and does not bind free variables in the Generator.
free_variables(:Goal, +Bound, +Vars0, -Vars)
setof/3
and bagof/3
themselves, for the
explicit existential quantifier Vars^
Goal, and for things like
\+(_)
which might look as though they bind variables but can't.
term_variables(
+Term,
+Vars0,
-Vars)
New code should consider the built in term_variables/2
which is likely
to be faster, and works for cyclic terms.
Could be defined as:
term_variables(Term, Vars0, Vars) :- nonvar(Term), !, ( foreacharg(Arg,Term), fromto(Vars0,S0,S,Vars) do term_variables(Arg, S0, S) ). term_variables(Term, Vars0, Vars) :- ( foreach(X,Vars0), param(Term) do X\==Term ), !, Vars = [Term|Vars0]. term_variables(_, Vars, Vars).