1/* Part of SWI-Prolog 2 3 Author: Jan Wielemaker 4 E-mail: J.Wielemaker@vu.nl 5 WWW: http://www.swi-prolog.org 6 Copyright (c) 2008-2024, University of Amsterdam 7 VU University Amsterdam 8 CWI Amsterdam 9 SWI-Prolog Solutions b.v. 10 All rights reserved. 11 12 Redistribution and use in source and binary forms, with or without 13 modification, are permitted provided that the following conditions 14 are met: 15 16 1. Redistributions of source code must retain the above copyright 17 notice, this list of conditions and the following disclaimer. 18 19 2. Redistributions in binary form must reproduce the above copyright 20 notice, this list of conditions and the following disclaimer in 21 the documentation and/or other materials provided with the 22 distribution. 23 24 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 25 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 26 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS 27 FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE 28 COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 29 INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 30 BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 31 LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 32 CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 33 LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN 34 ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 35 POSSIBILITY OF SUCH DAMAGE. 36*/ 37 38:- module(aggregate, 39 [ aggregate/3, % +Templ, :Goal, -Result 40 aggregate/4, % +Templ, +Discrim, :Goal, -Result 41 aggregate_all/3, % +Templ, :Goal, -Result 42 aggregate_all/4, % +Templ, +Discrim, :Goal, -Result 43 foldall/4, % :Folder, :Goal, +V0, -V) 44 foreach/2, % :Generator, :Goal 45 free_variables/4 % :Generator, :Template, +Vars0, -Vars 46 ]). 47:- autoload(library(apply),[maplist/4,maplist/5]). 48:- autoload(library(error), 49 [instantiation_error/1,type_error/2,domain_error/2]). 50:- autoload(library(lists), 51 [append/3,member/2,sum_list/2,max_list/2,min_list/2]). 52:- autoload(library(ordsets),[ord_intersection/3]). 53:- autoload(library(pairs),[pairs_values/2]). 54:- autoload(library(prolog_code), [mkconj/3]). 55 56:- set_prolog_flag(generate_debug_info, false). 57 58:- meta_predicate 59 aggregate( , , ), 60 aggregate( , , , ), 61 aggregate_all( , , ), 62 aggregate_all( , , , ), 63 foldall( , , , ), 64 foreach( , ).
148 /******************************* 149 * AGGREGATE * 150 *******************************/
157aggregate(Template, Goal0, Result) :-
158 template_to_pattern(bag, Template, Pattern, Goal0, Goal, Aggregate),
159 bagof(Pattern, Goal, List),
160 aggregate_list(Aggregate, List, Result).
167aggregate(Template, Discriminator, Goal0, Result) :-
168 template_to_pattern(bag, Template, Pattern, Goal0, Goal, Aggregate),
169 setof(Discriminator-Pattern, Goal, Pairs),
170 pairs_values(Pairs, List),
171 aggregate_list(Aggregate, List, Result).
min(X)
, max(X)
,
min(X,Witness)
or max(X,Witness)
and Goal has no solutions, i.e.,
the minimum and maximum of an empty set is undefined.
The Template values count
, sum(X)
, max(X)
, min(X)
, max(X,W)
and
min(X,W)
are processed incrementally rather than using findall/3 and
run in constant memory.
187aggregate_all(Var, _, _) :- 188 var(Var), 189 !, 190 instantiation_error(Var). 191aggregate_all(count, Goal, Count) :- 192 !, 193 aggregate_all(sum(1), Goal, Count). 194aggregate_all(sum(X), Goal, Sum) :- 195 !, 196 State = state(0), 197 ( call(Goal), 198 arg(1, State, S0), 199 S is S0 + X, 200 nb_setarg(1, State, S), 201 fail 202 ; arg(1, State, Sum) 203 ). 204aggregate_all(max(X), Goal, Max) :- 205 !, 206 State = state(X), 207 ( call(Goal), 208 arg(1, State, M0), 209 M is max(M0,X), 210 nb_setarg(1, State, M), 211 fail 212 ; arg(1, State, Max), 213 nonvar(Max) 214 ). 215aggregate_all(min(X), Goal, Min) :- 216 !, 217 State = state(X), 218 ( call(Goal), 219 arg(1, State, M0), 220 M is min(M0,X), 221 nb_setarg(1, State, M), 222 fail 223 ; arg(1, State, Min), 224 nonvar(Min) 225 ). 226aggregate_all(max(X,W), Goal, max(Max,Witness)) :- 227 !, 228 State = state(false, _Max, _Witness), 229 ( call(Goal), 230 ( State = state(true, Max0, _) 231 -> X > Max0, 232 nb_setarg(2, State, X), 233 nb_setarg(3, State, W) 234 ; number(X) 235 -> nb_setarg(1, State, true), 236 nb_setarg(2, State, X), 237 nb_setarg(3, State, W) 238 ; type_error(number, X) 239 ), 240 fail 241 ; State = state(true, Max, Witness) 242 ). 243aggregate_all(min(X,W), Goal, min(Min,Witness)) :- 244 !, 245 State = state(false, _Min, _Witness), 246 ( call(Goal), 247 ( State = state(true, Min0, _) 248 -> X < Min0, 249 nb_setarg(2, State, X), 250 nb_setarg(3, State, W) 251 ; number(X) 252 -> nb_setarg(1, State, true), 253 nb_setarg(2, State, X), 254 nb_setarg(3, State, W) 255 ; type_error(number, X) 256 ), 257 fail 258 ; State = state(true, Min, Witness) 259 ). 260aggregate_all(Template, Goal0, Result) :- 261 template_to_pattern(all, Template, Pattern, Goal0, Goal, Aggregate), 262 findall(Pattern, Goal, List), 263 aggregate_list(Aggregate, List, Result).
272aggregate_all(Template, Discriminator, Goal0, Result) :- 273 template_to_pattern(all, Template, Pattern, Goal0, Goal, Aggregate), 274 findall(Discriminator-Pattern, Goal, Pairs0), 275 sort(Pairs0, Pairs), 276 pairs_values(Pairs, List), 277 aggregate_list(Aggregate, List, Result). 278 279template_to_pattern(All, Template, Pattern, Goal0, Goal, Aggregate) :- 280 template_to_pattern(Template, Pattern, Post, Vars, Aggregate), 281 existential_vars(Goal0, Goal1, AllVars, Vars), 282 clean_body((Goal1, Post), Goal2), 283 ( All == bag 284 -> add_existential_vars(AllVars, Goal2, Goal) 285 ; Goal = Goal2 286 ). 287 288existential_vars(Var, Var) --> 289 { var(Var) }, 290 !. 291existential_vars(Var^G0, G) --> 292 !, 293 [Var], 294 existential_vars(G0, G). 295existential_vars(M:G0, M:G) --> 296 !, 297 existential_vars(G0, G). 298existential_vars(G, G) --> 299 []. 300 301add_existential_vars([], G, G). 302add_existential_vars([H|T], G0, H^G1) :- 303 add_existential_vars(T, G0, G1).
true
from Goal0.310clean_body((Goal0,Goal1), Goal) => 311 clean_body(Goal0, GoalA), 312 clean_body(Goal1, GoalB), 313 mkconj(GoalA, GoalB, Goal). 314clean_body(Goal0, Goal) => 315 Goal = Goal0.
329template_to_pattern(Term, Pattern, Goal, Vars, Aggregate) :- 330 templ_to_pattern(Term, Pattern, Goal, Vars, Aggregate), 331 !. 332template_to_pattern(Term, Pattern, Goal, Vars, term(MinNeeded, Functor, AggregateArgs)) :- 333 compound(Term), 334 !, 335 Term =.. [Functor|Args0], 336 templates_to_patterns(Args0, Args, Goal, Vars, AggregateArgs), 337 needs_one(AggregateArgs, MinNeeded), 338 Pattern =.. [Functor|Args]. 339template_to_pattern(Term, _, _, _, _) :- 340 invalid_template(Term). 341 342templ_to_pattern(sum(X), X, true, [], sum) :- var(X), !. 343templ_to_pattern(sum(X0), X, X is X0, [X0], sum) :- !. 344templ_to_pattern(count, 1, true, [], count) :- !. 345templ_to_pattern(min(X), X, true, [], min) :- var(X), !. 346templ_to_pattern(min(X0), X, X is X0, [X0], min) :- !. 347templ_to_pattern(min(X0, Witness), X-Witness, X is X0, [X0], min_witness) :- !. 348templ_to_pattern(max(X0), X, X is X0, [X0], max) :- !. 349templ_to_pattern(max(X0, Witness), X-Witness, X is X0, [X0], max_witness) :- !. 350templ_to_pattern(set(X), X, true, [], set) :- !. 351templ_to_pattern(bag(X), X, true, [], bag) :- !. 352 353templates_to_patterns([], [], true, [], []). 354templates_to_patterns([H0], [H], G, Vars, [A]) :- 355 !, 356 sub_template_to_pattern(H0, H, G, Vars, A). 357templates_to_patterns([H0|T0], [H|T], (G0,G), Vars, [A0|A]) :- 358 sub_template_to_pattern(H0, H, G0, V0, A0), 359 append(V0, RV, Vars), 360 templates_to_patterns(T0, T, G, RV, A). 361 362sub_template_to_pattern(Term, Pattern, Goal, Vars, Aggregate) :- 363 templ_to_pattern(Term, Pattern, Goal, Vars, Aggregate), 364 !. 365sub_template_to_pattern(Term, _, _, _, _) :- 366 invalid_template(Term). 367 368invalid_template(Term) :- 369 callable(Term), 370 !, 371 domain_error(aggregate_template, Term). 372invalid_template(Term) :- 373 type_error(aggregate_template, Term).
380needs_one(Ops, 1) :- 381 member(Op, Ops), 382 needs_one(Op), 383 !. 384needs_one(_, 0). 385 386needs_one(min). 387needs_one(min_witness). 388needs_one(max). 389needs_one(max_witness).
401aggregate_list(bag, List0, List) :- 402 !, 403 List = List0. 404aggregate_list(set, List, Set) :- 405 !, 406 sort(List, Set). 407aggregate_list(sum, List, Sum) :- 408 sum_list(List, Sum). 409aggregate_list(count, List, Count) :- 410 length(List, Count). 411aggregate_list(max, List, Sum) :- 412 max_list(List, Sum). 413aggregate_list(max_witness, List, max(Max, Witness)) :- 414 max_pair(List, Max, Witness). 415aggregate_list(min, List, Sum) :- 416 min_list(List, Sum). 417aggregate_list(min_witness, List, min(Min, Witness)) :- 418 min_pair(List, Min, Witness). 419aggregate_list(term(0, Functor, Ops), List, Result) :- 420 !, 421 maplist(state0, Ops, StateArgs, FinishArgs), 422 State0 =.. [Functor|StateArgs], 423 aggregate_term_list(List, Ops, State0, Result0), 424 finish_result(Ops, FinishArgs, Result0, Result). 425aggregate_list(term(1, Functor, Ops), [H|List], Result) :- 426 H =.. [Functor|Args], 427 maplist(state1, Ops, Args, StateArgs, FinishArgs), 428 State0 =.. [Functor|StateArgs], 429 aggregate_term_list(List, Ops, State0, Result0), 430 finish_result(Ops, FinishArgs, Result0, Result). 431 432aggregate_term_list([], _, State, State). 433aggregate_term_list([H|T], Ops, State0, State) :- 434 step_term(Ops, H, State0, State1), 435 aggregate_term_list(T, Ops, State1, State).
445min_pair([M0-W0|T], M, W) :- 446 min_pair(T, M0, W0, M, W). 447 448min_pair([], M, W, M, W). 449min_pair([M0-W0|T], M1, W1, M, W) :- 450 ( M0 < M1 451 -> min_pair(T, M0, W0, M, W) 452 ; min_pair(T, M1, W1, M, W) 453 ). 454 455max_pair([M0-W0|T], M, W) :- 456 max_pair(T, M0, W0, M, W). 457 458max_pair([], M, W, M, W). 459max_pair([M0-W0|T], M1, W1, M, W) :- 460 ( M0 > M1 461 -> max_pair(T, M0, W0, M, W) 462 ; max_pair(T, M1, W1, M, W) 463 ).
467step(bag, X, [X|L], L). 468step(set, X, [X|L], L). 469step(count, _, X0, X1) :- 470 succ(X0, X1). 471step(sum, X, X0, X1) :- 472 X1 is X0+X. 473step(max, X, X0, X1) :- 474 X1 is max(X0, X). 475step(min, X, X0, X1) :- 476 X1 is min(X0, X). 477step(max_witness, X-W, X0-W0, X1-W1) :- 478 ( X > X0 479 -> X1 = X, W1 = W 480 ; X1 = X0, W1 = W0 481 ). 482step(min_witness, X-W, X0-W0, X1-W1) :- 483 ( X < X0 484 -> X1 = X, W1 = W 485 ; X1 = X0, W1 = W0 486 ). 487step(term(Ops), Row, Row0, Row1) :- 488 step_term(Ops, Row, Row0, Row1). 489 490step_term(Ops, Row, Row0, Row1) :- 491 functor(Row, Name, Arity), 492 functor(Row1, Name, Arity), 493 step_list(Ops, 1, Row, Row0, Row1). 494 495step_list([], _, _, _, _). 496step_list([Op|OpT], Arg, Row, Row0, Row1) :- 497 arg(Arg, Row, X), 498 arg(Arg, Row0, X0), 499 arg(Arg, Row1, X1), 500 step(Op, X, X0, X1), 501 succ(Arg, Arg1), 502 step_list(OpT, Arg1, Row, Row0, Row1). 503 504finish_result(Ops, Finish, R0, R) :- 505 functor(R0, Functor, Arity), 506 functor(R, Functor, Arity), 507 finish_result(Ops, Finish, 1, R0, R). 508 509finish_result([], _, _, _, _). 510finish_result([Op|OpT], [F|FT], I, R0, R) :- 511 arg(I, R0, A0), 512 arg(I, R, A), 513 finish_result1(Op, F, A0, A), 514 succ(I, I2), 515 finish_result(OpT, FT, I2, R0, R). 516 517finish_result1(bag, Bag0, [], Bag) :- 518 !, 519 Bag = Bag0. 520finish_result1(set, Bag, [], Set) :- 521 !, 522 sort(Bag, Set). 523finish_result1(max_witness, _, M-W, R) :- 524 !, 525 R = max(M,W). 526finish_result1(min_witness, _, M-W, R) :- 527 !, 528 R = min(M,W). 529finish_result1(_, _, A, A).
533state0(bag, L, L). 534state0(set, L, L). 535state0(count, 0, _). 536state0(sum, 0, _).
540state1(bag, X, L, [X|L]) :- !. 541state1(set, X, L, [X|L]) :- !. 542state1(_, X, X, _). 543 544 545 /******************************* 546 * FOLDALL * 547 *******************************/
call(Folder,V0,V1)
. This predicate provides behaviour similar to
aggregate_all/3-4, but operates in constant space and allows for
custom aggregation (Folder) operators. The example below uses plus/3
to realise aggregate_all(sum(X), between(1,10,X), Sum)
.
?- foldall(plus(X), between(1,10,X), 0, Sum). Sum = 55
The implementation uses nb_setarg/3 for non-backtrackable state updates.
566foldall(Op, Goal, V0, V) :- 567 State = state(V0), 568 ( call(Goal), 569 arg(1, State, S0), 570 call(Op, S0, S), 571 nb_setarg(1, State, S), 572 fail 573 ; arg(1, State, V) 574 ). 575 576 577 /******************************* 578 * FOREACH * 579 *******************************/
foreach(Generator, Goal) :- findall(Goal, Generator, Goals), maplist(call, Goals).
The actual implementation uses findall/3 on a template created from the variables shared between Generator and Goal. Subsequently, it uses every instance of this template to instantiate Goal, call Goal and undo only the instantiation of the template and not other instantiations created by running Goal. Here is an example:
?- foreach(between(1,4,X), dif(X,Y)), Y = 5. Y = 5. ?- foreach(between(1,4,X), dif(X,Y)), Y = 3. false.
The predicate foreach/2 is mostly used if Goal performs backtrackable destructive assignment on terms. Attributed variables (underlying constraints) are an example. Another example of a backtrackable data structure is in library(hashtable). If we care only about the side effects (I/O, dynamic database, etc.) or the truth value of Goal, forall/2 is a faster and simpler alternative. If Goal instantiates its arguments it is will often fail as the argument cannot be instantiated to multiple values. It is possible to incrementally grow an argument:
?- foreach(between(1,4,X), member(X, L)). L = [1,2,3,4|_].
Note that SWI-Prolog up to version 8.3.4 created copies of Goal using copy_term/2 for each iteration, this makes the current implementation unable to properly handle compound terms (in Goal's arguments) that share variables with the Generator. As a workaround you can define a goal that does not use compound terms, like in this example:
mem(E,L) :- % mem/2 hides the compound argument from foreach/2 member(r(E),L). ?- foreach( between(1,5,N), mem(N,L)).
635foreach(Generator, Goal) :- 636 term_variables(Generator, GenVars0), sort(GenVars0, GenVars), 637 term_variables(Goal, GoalVars0), sort(GoalVars0, GoalVars), 638 ord_intersection(GenVars, GoalVars, SharedVars), 639 Templ =.. [v|SharedVars], 640 findall(Templ, Generator, List), 641 prove_list(List, Templ, Goal). 642 643prove_list([], _, _). 644prove_list([H|T], Templ, Goal) :- 645 Templ = H, 646 call(Goal), 647 '$unbind_template'(Templ), 648 prove_list(T, Templ, Goal).
free_variables(Generator, Template, OldList, NewList)
finds this
set using OldList as an accumulator.
670free_variables(Term, Bound, VarList, [Term|VarList]) :- 671 var(Term), 672 term_is_free_of(Bound, Term), 673 list_is_free_of(VarList, Term), 674 !. 675free_variables(Term, _Bound, VarList, VarList) :- 676 var(Term), 677 !. 678free_variables(Term, Bound, OldList, NewList) :- 679 explicit_binding(Term, Bound, NewTerm, NewBound), 680 !, 681 free_variables(NewTerm, NewBound, OldList, NewList). 682free_variables(Term, Bound, OldList, NewList) :- 683 functor(Term, _, N), 684 free_variables(N, Term, Bound, OldList, NewList). 685 686free_variables(0, _, _, VarList, VarList) :- !. 687free_variables(N, Term, Bound, OldList, NewList) :- 688 arg(N, Term, Argument), 689 free_variables(Argument, Bound, OldList, MidList), 690 M is N-1, 691 !, 692 free_variables(M, Term, Bound, MidList, NewList). 693 694% explicit_binding checks for goals known to existentially quantify 695% one or more variables. In particular \+ is quite common. 696 697explicit_binding(\+ _Goal, Bound, fail, Bound ) :- !. 698explicit_binding(not(_Goal), Bound, fail, Bound ) :- !. 699explicit_binding(Var^Goal, Bound, Goal, Bound+Var) :- !. 700explicit_binding(setof(Var,Goal,Set), Bound, Goal-Set, Bound+Var) :- !. 701explicit_binding(bagof(Var,Goal,Bag), Bound, Goal-Bag, Bound+Var) :- !.
709term_is_free_of(Term, Var) :- 710 \+ var_in_term(Term, Var). 711 712var_in_term(Term, Var) :- 713 Var == Term, 714 !. 715var_in_term(Term, Var) :- 716 compound(Term), 717 arg(_, Term, Arg), 718 var_in_term(Arg, Var), 719 !.
726list_is_free_of([Head|Tail], Var) :- 727 Head \== Var, 728 !, 729 list_is_free_of(Tail, Var). 730list_is_free_of([], _). 731 732 733% term_variables(+Term, +Vars0, -Vars) is det. 734% 735% True if Vars is the union of variables in Term and Vars0. 736% We cannot have this as term_variables/3 is already defined 737% as a difference-list version of term_variables/2. 738 739%term_variables(Term, Vars0, Vars) :- 740% term_variables(Term+Vars0, Vars).
748:- multifile sandbox:safe_meta_predicate/1. 749 750sandbox:safe_meta_predicate(aggregate:foreach/2). 751sandbox:safe_meta_predicate(aggregate:aggregate/3). 752sandbox:safe_meta_predicate(aggregate:aggregate/4). 753sandbox:safe_meta_predicate(aggregate:aggregate_all/3). 754sandbox:safe_meta_predicate(aggregate:aggregate_all/4)
Aggregation operators on backtrackable predicates
This library provides aggregating operators over the solutions of a predicate. The operations are a generalisation of the bagof/3, setof/3 and findall/3 built-in predicates. Aggregations that can be computed incrementally avoid findall/3 and run in constant memory. The defined aggregation operations are counting, computing the sum, minimum, maximum, a bag of solutions and a set of solutions. We first give a simple example, computing the country with the smallest area:
There are four aggregation predicates (aggregate/3, aggregate/4, aggregate_all/3 and aggregate/4), distinguished on two properties.
The aggregate predicates use setof/3 (aggregate/4) or bagof/3 (aggregate/3), dealing with existential qualified variables (
Var^Goal
) and providing multiple solutions for the remaining free variables in Goal. The aggregate_all/3 predicate uses findall/3, implicitly qualifying all free variables and providing exactly one solution, while aggregate_all/4 uses sort/2 over solutions that Discriminator (see below) generated using findall/3.The versions with 4 arguments deduplicate redundant solutions of Goal. Solutions for which both the template variables and Discriminator are identical will be treated as one solution. For example, if we wish to compute the total population of all countries, and for some reason
country(belgium, 11000000)
may succeed twice, we can use the following to avoid counting the population of Belgium twice:All aggregation predicates support the following operators below in Template. In addition, they allow for an arbitrary named compound term, where each of the arguments is a term from the list below. For example, the term
r(min(X), max(X))
computes both the minimum and maximum binding for X.sum(1)
.min(Min, Witness)
, where Min is the minimal version of Expr over all solutions, and Witness is any other template applied to solutions that produced Min. If multiple solutions provide the same minimum, Witness corresponds to the first solution.min(Expr, Witness)
, but producing the maximum result.Acknowledgements
The development of this library was sponsored by SecuritEase, http://www.securitease.com