aggregate.pl -- Aggregation operators on backtrackable predicates

This library provides aggregating operators over the solutions of a predicate. The operations are a generalisation of the bagof/3, setof/3 and findall/3 built-in predicates. The defined aggregation operations are counting, computing the sum, minimum, maximum, a bag of solutions and a set of solutions. We first give a simple example, computing the country with the smallest area:

smallest_country(Name, Area) :-
        aggregate(min(A, N), country(N, A), min(Area, Name)).

There are four aggregation predicates (aggregate/3, aggregate/4, aggregate_all/3 and aggregate_all/4), distinguished on two properties.

aggregate vs. aggregate_all
The aggregate predicates use setof/3 (aggregate/4) or bagof/3 (aggregate/3), dealing with existentially quantified variables (Var^Goal) and providing multiple solutions for the remaining free variables in Goal. The aggregate_all/3 predicate uses findall/3, implicitly quantifying all free variables and providing exactly one solution, while aggregate_all/4 applies sort/2 to the solutions, paired with Discriminator (see below), that findall/3 collected.
The Discriminator argument
The versions with 4 arguments remove duplicate solutions of Goal. Solutions for which both the template variables and Discriminator are identical are treated as one solution. For example, if we wish to compute the total population of all countries, and for some reason country(belgium, 11000000) may succeed twice, we can use the following to avoid counting the population of Belgium twice:
    aggregate(sum(P), Name, country(Name, P), Total)
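The effect of the Discriminator can be seen in a small sketch. The country/2 facts and the helper names total_bag/1 and total_set/1 are hypothetical and exist only for this illustration; the Belgium fact is duplicated on purpose.

```prolog
:- use_module(library(aggregate)).

% Hypothetical facts; country(belgium, ...) is deliberately listed twice.
country(belgium,     11000000).
country(belgium,     11000000).
country(netherlands, 17000000).

% aggregate/3 (bagof/3): every solution counts, including the duplicate.
total_bag(Total) :-
    aggregate(sum(P), Name^country(Name, P), Total).

% aggregate/4 (setof/3): solutions with the same Name and P collapse.
total_set(Total) :-
    aggregate(sum(P), Name, country(Name, P), Total).
```

With these facts, total_bag/1 should yield 39000000 and total_set/1 should yield 28000000.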

All aggregation predicates support the operators below in Template. In addition, they allow for an arbitrary named compound term, where each of the arguments is a term from the list below. For example, the term r(min(X), max(X)) computes both the minimum and maximum binding for X.
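A minimal sketch of such a compound template, assuming hypothetical temperature/2 facts and a hypothetical helper temperature_range/2:

```prolog
:- use_module(library(aggregate)).

% Hypothetical facts: temperature(City, DegreesCelsius).
temperature(oslo,   4).
temperature(cairo, 31).
temperature(lima,  19).

% One pass over the solutions computes both aggregates; the result
% has the same shape as the template, here r(Min, Max).
temperature_range(Min, Max) :-
    aggregate_all(r(min(T), max(T)), temperature(_, T), r(Min, Max)).
```

The functor name r is arbitrary; only the arguments are interpreted as aggregation operators.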

count
Count number of solutions. Same as sum(1).
sum(Expr)
Sum of Expr for all solutions.
min(Expr)
Minimum of Expr for all solutions.
min(Expr, Witness)
A term min(Min, Witness), where Min is the minimal value of Expr over all solutions, and Witness is any other template applied to the solution that produced Min. If multiple solutions provide the same minimum, Witness corresponds to the first solution.
max(Expr)
Maximum of Expr for all solutions.
max(Expr, Witness)
As min(Expr, Witness), but producing the maximum result.
set(X)
An ordered set with all solutions for X.
bag(X)
A list of all solutions for X.
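The operators above can be tried on a small data set. This is only a sketch: the score/2 facts and the helper predicates are hypothetical names chosen for the illustration.

```prolog
:- use_module(library(aggregate)).

% Hypothetical facts: score(Player, Points). ann and cid tie on 7.
score(ann, 7).
score(bob, 3).
score(cid, 7).

n_scores(N)       :- aggregate_all(count,  score(_, _), N).
points_bag(Bag)   :- aggregate_all(bag(P), score(_, P), Bag).
points_set(Set)   :- aggregate_all(set(P), score(_, P), Set).

% max(Expr, Witness): the witness W identifies who scored the maximum.
best(Points, Who) :-
    aggregate_all(max(P, W), score(W, P), max(Points, Who)).
```

Here bag(P) keeps the duplicate 7 while set(P) does not, and best/2 should report ann, the first of the two tied solutions.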

Acknowledgements

The development of this library was sponsored by SecuritEase, http://www.securitease.com

Compatibility
- Quintus, SICStus 4. The forall/2 is a SWI-Prolog built-in and term_variables/3 is a SWI-Prolog built-in with different semantics.
To be done
- Analysing the aggregation template and compiling a predicate for the list aggregation can be done at compile time.
- aggregate_all/3 can be rewritten to run in constant space using non-backtrackable assignment on a term.
aggregate(+Template, :Goal, -Result) is nondet
Aggregate bindings in Goal according to Template. The aggregate/3 version performs bagof/3 on Goal.
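Because aggregate/3 is built on bagof/3, any variable of Goal that is neither in Template nor bound with ^ stays free and produces one result per distinct binding. The sketch below contrasts this with aggregate_all/3; the city/3 facts and both helper predicates are hypothetical.

```prolog
:- use_module(library(aggregate)).

% Hypothetical facts: city(Country, City, Population).
city(nl, amsterdam,  900000).
city(nl, rotterdam,  650000).
city(be, brussels,  1200000).

% Country is left free, so backtracking yields one count per country.
% City and Pop are existentially quantified with ^.
cities_per_country(Country, N) :-
    aggregate(count, City^Pop^city(Country, City, Pop), N).

% findall/3 semantics: all variables are quantified, one single answer.
total_cities(N) :-
    aggregate_all(count, city(_, _, _), N).
```

Querying cities_per_country(C, N) should enumerate C = be, N = 1 and then C = nl, N = 2, while total_cities(N) gives N = 3.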
aggregate(+Template, +Discriminator, :Goal, -Result) is nondet
Aggregate bindings in Goal according to Template. The aggregate/4 version performs setof/3 on Goal.
aggregate_all(+Template, :Goal, -Result) is semidet
Aggregate bindings in Goal according to Template. The aggregate_all/3 version performs findall/3 on Goal. Note that this predicate fails if Template contains one or more of min(X), max(X), min(X,Witness) or max(X,Witness) and Goal has no solutions, i.e., the minimum and maximum of an empty set are undefined.
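The empty case can be sketched as follows. The dynamic predicate reading/1 and the helper names are hypothetical; reading/1 has no clauses, so Goal has no solutions.

```prolog
:- use_module(library(aggregate)).

% Hypothetical predicate without clauses: every call to it fails.
:- dynamic reading/1.

% count and sum are defined for zero solutions and both give 0.
empty_count(N) :- aggregate_all(count,  reading(_), N).
empty_sum(S)   :- aggregate_all(sum(X), reading(X), S).

% max has no value to return for zero solutions, so this fails.
empty_max(M)   :- aggregate_all(max(X), reading(X), M).

% One way for a caller to turn that failure into a default value.
max_or_default(Default, M) :-
    (   aggregate_all(max(X), reading(X), M0)
    ->  M = M0
    ;   M = Default
    ).
```

Wrapping the call in an if-then-else, as in max_or_default/2, is one option when failure on an empty solution set is not the desired behaviour.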
aggregate_all(+Template, +Discriminator, :Goal, -Result) is semidet
Aggregate bindings in Goal according to Template. The aggregate_all/4 version performs findall/3 followed by sort/2 on Goal. See aggregate_all/3 to understand why this predicate can fail.
clean_body(+Goal0, -Goal) is det [private]
Remove redundant true from Goal0.
template_to_pattern(+Template, -Pattern, -Post, -Vars, -Aggregate) [private]
Determine which parts of the goal we must remember in the findall/3 pattern.
Arguments:
Post- is a body-term that evaluates expressions to reduce storage requirements.
Vars- is a list of intermediate variables that must be added to the existential variables for bagof/3.
Aggregate- defines the aggregation operation to execute.
needs_one(+Ops, -OneOrZero) [private]
If one of the operations in Ops needs at least one answer, unify OneOrZero to 1. Else 0.
aggregate_list(+Op, +List, -Answer) is semidet [private]
Aggregate the answer from the list produced by findall/3, bagof/3 or setof/3. The latter two cases deal with compound answers.
To be done
- Compile code for incremental state update, which we will use for aggregate_all/3 as well. We should be using goal_expansion to generate these clauses.
min_pair(+Pairs, -Key, -Value) is det [private]
max_pair(+Pairs, -Key, -Value) is det [private]
True if Key-Value has the smallest/largest key in Pairs. If multiple pairs share the smallest/largest key, the first pair is returned.
step(+AggregateAction, +New, +State0, -State1) [private]
state0(+Op, -State, -Finish) [private]
state1(+Op, +First, -State, -Finish) [private]
foreach(:Generator, :Goal)
True if the conjunction of results is true. Unlike forall/2, which runs a failure-driven loop that proves Goal for each solution of Generator, foreach/2 creates a conjunction. Each member of the conjunction is a copy of Goal, where the variables it shares with Generator are filled with the values from the corresponding solution.

The implementation executes forall/2 if Goal does not contain any variables that are not shared with Generator.

Here is an example:

?- foreach(between(1,4,X), dif(X,Y)), Y = 5.
Y = 5.
?- foreach(between(1,4,X), dif(X,Y)), Y = 3.
false.
bug
- Goal is copied repeatedly, which may cause problems if attributed variables are involved.
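The difference with forall/2 can be made explicit with a small sketch. The helper names check_only/1 and constrain/1 are hypothetical; dif/2 is the SWI-Prolog built-in constraint.

```prolog
:- use_module(library(aggregate)).

% forall/2 runs a failure-driven loop: each dif(X, Y) is proven and
% then undone, so no constraint on Y survives the call.
check_only(Y) :-
    forall(between(1, 4, X), dif(X, Y)).

% foreach/2 builds the conjunction dif(1,Y), dif(2,Y), dif(3,Y),
% dif(4,Y), so Y remains constrained after the call.
constrain(Y) :-
    foreach(between(1, 4, X), dif(X, Y)).
```

With these definitions, check_only(Y), Y = 3 succeeds, whereas constrain(Y), Y = 3 fails and constrain(Y), Y = 5 succeeds.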
free_variables(:Generator, +Template, +VarList0, -VarList) is det
Find free variables in a bagof/setof template. In order to handle variables properly, we have to find all the universally quantified variables in the Generator. All variables as yet unbound are universally quantified, unless
  1. they occur in the template
  2. they are bound by X^P, setof/3, or bagof/3

free_variables(Generator, Template, OldList, NewList) finds this set using OldList as an accumulator.
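The two exceptions can be illustrated with a small sketch. The term p(X, Y, Z) is a hypothetical goal that is only inspected, never called, and only_z_is_free/0 is a hypothetical helper.

```prolog
:- use_module(library(aggregate)).

% X occurs in the template and Y is bound by Y^, so under the two
% rules above only Z should be reported as free.
only_z_is_free :-
    free_variables(Y^p(X, Y, Z), X, [], Vars),
    Vars == [Z].
```

The empty list passed as the third argument is the initial accumulator.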

author
- Richard O'Keefe
- Jan Wielemaker (made some SWI-Prolog enhancements)
license
- Public domain (from DEC10 library).
To be done
- Distinguish between control-structures and data terms.
- Exploit our built-in term_variables/2 at some places?
term_is_free_of(+Term, +Var) is semidet [private]
True if Var does not appear in Term. This has been rewritten from the DEC10 library source to exploit our non-deterministic arg/3.
list_is_free_of(+List, +Var) is semidet [private]
True if Var is not in List.
sandbox:safe_meta(+Goal, -Called) is semidet [multifile]
Declare the aggregate meta-calls safe. This cannot be proven due to the manipulations of the argument Goal.