1:- module( bims, [ 2 bims/0, bims/1, 3 bims_version/2, bims_citation/2, 4 dlp_load/1, dlp_load/2, 5 dlp_sample/1,dlp_sample/3, 6 dlp_call/1,dlp_call/3, 7 dlp_call_sum/2, 8 dlp_seed/0, dlp_path_prob/2, dlp_path_prob/3 9 ] ). 10 11:- use_module(library(lib)). 12% also depends on pack(stoics_lib), pack(lib) will ask to download it if missing 13 14:- ensure_loaded( '../src/init_lib' ). 15:- lib(source(bims), homonyms(true)). 16 17% :- lib(stoics_lib). % we should consider loading the whole of stoics_lib, as we using many of its preds 18% 19:- lib(stoics_lib:en_list/2). 20:- lib(stoics_lib:goal_spec/2). 21:- lib(stoics_lib:portray_clauses/2). 22 23:- lib(mcmcms/12). 24:- lib(os_unique_by_date/2). 25:- lib(n_digits/3). 26:- lib(clean_module/1). 27:- lib(report_triggers/1). 28:- lib(bims_bb_remove/2). 29:- lib(remove_template_duplicates/2). 30:- lib(get_date_time/1). 31:- lib(ord_only_add_elem/3). % needed on ad_expand 32:- lib(dlp_load/1). 33:- lib(dlp_sample/1). 34:- lib(dlp_call/1). 35:- lib(dlp_path_prob/2). 36:- lib(dlp_seed/0). 37:- lib(end(bims)). 38% :- lib_pack_end( bims ).

Bims- Bayesian inference over model structures.

Introduction

Bims (Bayesian inference over model structures) implements MCMC learning over statistical models defined in the Dlp (Distributional logic programming) probabilistic language.

Bims is released under GPL2, or Artistic 2.0

Currently there are 2 model spaces supported:

Carts (Classification & Regression trees), and
Bayesian Networks

Additional model spaces can be easily implemented by defining new likelihood plug-ins and programming appropriate priors.

Examples provided

Carts examples

?- bims([]).
?- bims([data(carts),models(carts),likelihood(carts)]).

The above are two equivalent ways to run the Carts example provided.

This runs 3 chains each of length 100 on the default Carts data using the default likelihood. The default dataset is the breast cancer Winsconsin (BCW) data from the machine learning repository. There are 2 categories, 9 variables and 683 data points in this dataset. You can view the data with

?- edit( pack(bims/data/carts) ).

The default likelihood is an implementation of the classification likelihood function presented in: H Chipman, E George, and R McCulloch. Bayesian CART model search (with discussion). J. of the American Statistical Association, 93:935â960, 1998.

Bns examples

?- bims([models(bns)]).
?- bims([data(bns),models(bns),likelihood(bns)]).

The above are two equivalent ways to run the Bns example provided.

This runs 3 chains each of length 100 on the default bns data using default likelihood. The dataset is a sampled dataset from the ASIA network and it comprises of 8 variables and 2295 datapoints. You can view the data with

?- edit( pack(bims/data/bns) ).

The default BN likelihood is an instance of the BDeu metric for scoring BN structures.

W. L. Buntine. Theory refinement of Bayesian networks. In Bruce DâAmbrosio, Philippe Smets, and Piero Bonissone, editors, Proceedings of the Seventh Annual Conference on Uncertainty in Artificial Intelligence (UAIâ1991), pages 52â60, 1991

David Heckerman, Dan Geiger, and David M. Chickering. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20(3):197â243, 1995.

Learning models from new datasets

An easy way to run Bims on your data is to create a new directory and within that sub-directory data/ copy your data there and pass options data/1 to the basename of the data file.

For example,

?- bims(data(mydata)).

Learning new statistical models.

By defining a new likelihood function and new priors the system can be used on new statistical models.

Resolution

In addition to model structure learning Bims implements two way of performing resolution over DLPs: stochastic sampling resolution (SSD) and SLD-based probabilisic inference.

Stochastic sampling definite clause (SSD) resolution

These predicates allow to sample from a loaded distributional logic program (Dlp). The resolution strategy here are that of chosing between probabilistic choices according to their relative values. The main idea is that when sampling many times from a top goal will in the long run sample each derivation path in proportion to the probability of the derivation. The probability of a derivation/refutation, is simply the product of all the probabilities attached to resolution steps during the derivation.

See

SLD-based probabilisic inference

These predicates allow standard SLD exploration of a stochastic query against a DLP. Predicates here allow to explore what is derivable and often attach a probability and ather information to each derivation.

Note that in probabilistic inference we often are more interested in failures than in standard LP. This is because there is a probability mass loss which each failed probabilistic branch.

Probabilistic inference predicates

Predicates index

Pack info

author: - Nicos Angelopoulos, http://stoics.org.uk/~nicos; - James Cussens (University of York), http://cs.york.ac.uk/~jc
version: - 2.0 2017/02/21, IJAR paper; - 2.1 2017/03/10, pack lib; - 2.2 2017/04/18, web-doc; de-git; - 2.3 2018/12/21, aux/ -> aux_code; - 2.4,2021/12/29, run on SWI 8.5.4; github core complete; - 2.5,2022/01/02, src/lib clean-up; - 3.0 2023/05/08, sampling & inference preds dlp_*; - 3.1 2025/09/26, fixed en_list/2 on variable input + broken bns likelihood
See also: - http://stoics.org.uk/~nicos/sware/bims
license: - MIT */
To be done: - bims_default(-Def).; - test on Windows (and Mac ?)

179bims_defaults( ArgsPrv, [ 180 chains(3), 181 report([]), 182 results_dir(Dir), 183 iterations(100), 184 tempered([]), 185 seeds(1), 186 progress_percentage(10), 187 progress_stub('.'), 188 DefModelT, 189 prior(ModelSingular), 190 likelihood(Model), 191 data(Model), 192 backtrack(uc), 193 top_goal(ModelSingular), 194 debug(true) 195 ] ) :- 196 en_list( ArgsPrv, Args ), 197 ( (memberchk(results_dir(ArgsDir),Args),ground(ArgsDir)) -> 198 make_directory(ArgsDir) 199 ; 200 true 201 ), 202 DefModel = carts, 203 DefModelT = models(DefModel), 204 append( Args, [DefModelT], Partial ), 205 memberchk( models(Model), Partial ), 206 atom_singular( Model, ModelSingular ), 207 bims_args_results_dir( Args, Dir ).

304bims :- bims( [] ). 305 306bims( ArgsPrv ) :- 307 en_list( ArgsPrv, Args ), 308 bims_defaults( Args, Defs ), 309 append( Args, Defs, Opts ), 310 debug( bims, 'Bims options: ~w', [Opts] ), 311 bims_option_debug( Opts, DbgRestore ), 312 memberchk( results_dir(ResDPrv), Opts ), 313 absolute_file_name( ResDPrv, ResD ), 314 debug( bims, 'Results directory: ~w', ResD ), 315 memberchk( chains(Runs), Opts ), 316 bims_option_seeds( Opts, Seeds ), 317 remove_template_duplicates( Opts, UnqOpts ), 318 number_codes( Runs, Rcodes ), 319 length( Rcodes, RcLen ), 320 n_digits( RcLen, 0, Zero ), 321 atomic_list_concat( [Zero,'opts.pl'], '-', BoptsB ), 322 directory_file_path( ResD, BoptsB, BroF ), 323 bims_write_options( BroF, UnqOpts ), 324 % fixme: likel 325 % likelihood(Lk=Model) 326 % model(Mdl), 327 PersOpts = [ models(Model), prior(Dlp), likelihood(Lkl), 328 data(Data), top_goal(Goal), 329 report(These) 330 ], 331 maplist( list_element(Opts), PersOpts ), 332 report_triggers( These ), 333 bims_ensure_likelihood_loaded( Model, Lkl, AbsLkl ), 334 bims_ensure_data_loaded( Model, Lkl, Data, AbsData ), 335 bims_locate_prior_file( Model, Dlp, AbsDlp ), 336 bims_write_abs_options( BroF, abs(AbsLkl,AbsData,AbsDlp) ), 337 bims_runs( Runs, 1, RcLen, Model, AbsDlp, Seeds, ResD, Goal, Opts ), 338 get_date_time( Now ), 339 portray_clauses( [finished_at(Now)], [file(BroF),mode(append)] ), 340 bims_option_debug_set( DbgRestore ). 341 342bims_runs( 0, _I, _Dgs, _Mdl, _, _Seelds, _ResD, _Goal, _Opts ) :- 343 !, 344 debug( bims, 'Finished all runs', [] ). 345bims_runs( R, I, Dgs, Mdl, Dlp, [Seed|Seeds], ResD, Goal, Opts ) :- 346 copy_term( Goal, TopG ), 347 % * Out, *Stats, Kernel, ModelType, Prior, PB, Repeats, Data, HotChainsIds, *Seed, SGl 348 % Out = chain_output( 01-chain.pl ), Stats= 01_stats.txt 349 debug( bims, 'Run: ~d', I ), 350 n_digits( Dgs, I, Iat ), 351 bims_out_file( ResD, Iat, chain, pl, Chain ), 352 bims_out_file( ResD, Iat, stats, pl, Stats ), 353 bims_out_file( ResD, Iat, report, pl, Rep ), % fixme: nicer name ? 354 MCopts = [ backtrack(BckT), iterations(Its), tempered(Hot), 355 progress_percentage(Pc), progress_stub(Stub) 356 ], 357 maplist( list_element(Opts), MCopts ), %fixme: allow lists 358 bims_progress_reporter( Its, Pc, Stub ), 359 bims_backtrack_term( BckT, Bck, P ), 360 mcmcms( Chain, Stats, Rep, Bck, Mdl, Dlp, P, Its, true, Hot, Seed, TopG ), 361 !, % fixme: do some cleaning 362 J is I + 1, 363 Q is R - 1, 364 bims_runs( Q, J, Dgs, Mdl, Dlp, Seeds, ResD, Goal, Opts ).

?- bims_citation(A, G), write(A), nl. Distributional Logic Programming for Bayesian Knowledge Representation. Nicos Angelopoulos and James Cussens. International Journal of Approximate Reasoning (IJAR). Volume 80, January 2017, pages 52-66.

410bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 411 Atom = 'Distributional Logic Programming for Bayesian Knowledge Representation. 412Nicos Angelopoulos and James Cussens. 413International Journal of Approximate Reasoning (IJAR). 414Volume 80, January 2017, pages 52-66.', 415 Type = article, 416 Key = 'AngelopoulosN_CussensJ_2017', 417 Pairs = [ 418 author = 'Nicos Angelopoulos and James Cussens', 419 title = 'Distributional Logic Programming for Bayesian Knowledge', 420 journal= 'International Journal of Approximate Reasoning', 421 year = 2017, 422 month = 'January', 423 volume = 80, 424 pages = '52-66', 425 url = 'http://www.sciencedirect.com/science/article/pii/S0888613X16301232' 426 ]. 427bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 428 Atom = 'Bayesian learning of Bayesian Networks with informative priors. 429Nicos Angelopoulos and James Cussens (2008). 430Special issue on BN learning. Annals of Mathematics and Artificial Intelligence (AMAI) 54(1-3), 53-98.', 431 Type = article, 432 Key = 'AngelopoulosN_Cussens_2008', 433 Pairs = [ 434 author = 'Nicos Angelopoulos and James Cussens', 435 title = 'Bayesian learning of Bayesian Networks with informative priors.', 436 year = '2008', 437 journal= 'Special issue on BN learning. Annals of Mathematics and Artificial Intelligence (AMAI)', 438 volume = '54', 439 issue = '1-3', 440 pages = '53-98', 441 url = 'http://dx.doi.org/10.1007/s10472-009-9133-x' 442 ]. 443bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 444 Atom = 'Exploiting Informative Priors for Bayesian Classification and Regression Trees. 445Nicos Angelopoulos and James Cussens (2005) 446In 19th International Joint Conference on Artificial Intelligence (IJCAI-05), 641-646, Edinburgh, UK, August 2005.', 447 Type = inproceedings, 448 Key = 'AngelopoulosN_Cussens_2005', 449 Pairs = [ 450 title = 'Exploiting Informative Priors for Bayesian Classification and Regression Trees', 451 author = 'Nicos Angelopoulos and James Cussens', 452 year = '2005', 453 inproceedings = 'In 19th International Joint Conference on Artificial Intelligence (IJCAI-05)', 454 pages = '641-646', 455 address = 'Edinburgh, UK', 456 month = 'August', 457 year = 2005 458 ]. 459 460bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 461 Atom = 'Tempering for Bayesian C&RT. 462Angelopoulos, Nicos and Cussens, James (2005) 463In 22nd International Conference on Machine Learning (ICML 2005), 17-24, Bonn, Germany, August 2005.', 464 Type = inproceedings, 465 Key = 'AngelopoulosN_CussensJ_2005a', 466 Pairs = [ 467 title = 'Tempering for {B}ayesian {C&RT}', 468 author = 'Angelopoulos, Nicos and Cussens, James', 469 booktitle= '22nd International Conference on Machine Learning (ICML 2005)', 470 address = 'Bonn, Germany', 471 publisher= 'ACM', 472 month = 'August', 473 pages = '17-24', 474 year = '2005', 475 url = 'ftp://ftp.cs.york.ac.uk/pub/aig/Papers/james.cussens/icml05.pdf' 476 ]. 477 478bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 479 Atom = 'Markov chain Monte Carlo using tree-based priors on model structure. 480Nicos Angelopoulos and James Cussens (2001). 481In 17th Conference on Uncertainty in Artificial Intelligence (UAI-2001), 16-23, Seattle, USA.', 482 Type = inproceedings, 483 Key = 'AngelopoulosN_Cussens_2001', 484 Pairs = [ 485 title = 'Markov chain Monte Carlo using tree-based priors on model structure.', 486 author = 'Nicos Angelopoulos and James Cussens', 487 year = 2001, 488 proceedings = 'In 17th Conference on Uncertainty in Artificial Intelligence (UAI-2001)', 489 pages = '16-23', 490 address = 'Seattle, USA' 491 ]. 492 493bims_lib( Spec ) :- 494 ( Spec = Fname/_Arity -> true; Spec = Fname ), 495 ensure_loaded( bims(src/lib/Fname) ).

502bims_option_seeds( Opts, Seeds ) :- 503 memberchk( chains(Runs), Opts ), 504 memberchk( seeds(PrvSeedS), Opts ), 505 en_list( PrvSeedS, PrvSeeds ), 506 length( PrvSeeds, PrvLen ), 507 Diff is max( Runs - PrvLen, 0 ), 508 last( PrvSeeds, Last ), 509 findall( A, ( between(1,Diff,I), 510 A is Last + I 511 ), As ), 512 append( PrvSeeds, As, Seeds ).

520bims_option_debug( Opts, Bef ) :- 521 memberchk( debug(Dbg), Opts ), 522 debugging( bims, Bef ), 523 !, 524 bims_option_debug_set( Dbg ). 525 526bims_option_debug_set( true ) :- 527 debug( bims ). 528bims_option_debug_set( false) :- 529 nodebug( bims ).

539bims_out_file( Dir, Iat, Tkn, Ext, File ) :- 540 atomic_list_concat( [Iat,Tkn], '-', Bstem ), 541 file_name_extension( Bstem, Ext, Bname ), 542 directory_file_path( Dir, Bname, File ). 543 544bims_ensure_data_loaded( Model, Lkl, DataT, DataF ) :- 545 DataT =.. [Data|Args], 546 % generic section 547 clean_module( data ), 548 AbsOpts = [file_type(directory),solutions(all)], 549 findall( BimsDD, absolute_file_name(bims(data),BimsDD,AbsOpts), BimsDDs ), 550 debug( bims, 'Bims data dirs: ~w', [BimsDDs] ), 551 member( Dir, [data|BimsDDs] ), 552 directory_file_path( Dir, Data, Stem ), 553 file_name_extension( Stem, pl, DataF ), 554 debug( bims, 'Testing existance of data file: ~w', DataF ), 555 exists_file( DataF ), 556 debug( bims, 'Loading data file: ~w', DataF ), 557 data:load_files( DataF, [silent(true)] ), 558 % data:ensure_loaded( DataF ), 559 bims_data_call( Model, Lkl, Args, PrepF, PrepG ), 560 assert( data:data_file(DataF) ), 561 assert( data:data_models(Model) ), 562 assert( data:data_prep_file(PrepF) ), 563 assert( data:data_prep_goal(PrepG) ), 564 !. 565bims_ensure_data_loaded( Model, Lkl, Data, _DataF ) :- 566 throw( fixme(cannot_load_data_for(Model,Lkl,Data)) ). 567 568bims_ensure_likelihood_loaded( Model, Lkl, PlLkl ) :- 569 Sub = models/Model/lklhoods/Lkl/Lkl, 570 clean_module( bims_lkl ), 571 absolute_file_name( bims(Sub), LklStem, [solutions(all)] ), 572 file_name_extension( LklStem, pl, PlLkl ), 573 debug( bims, 'Looking for likelihood in file: ~w', PlLkl ), 574 exists_file( PlLkl ), 575 debug( bims, 'Loading likelihood from file: ~w', PlLkl ), 576 % ensure_loaded( bims:PlLkl ), 577 bims_lkl:load_files( PlLkl, [silent(true),if(true)] ), 578 !. 579bims_ensure_likelihood_loaded( Model, Lkl, _ ) :- 580 throw( fixme(cannot_find_likelihood_for(Lkl,Model)) ). 581 582bims_data_call( Model, Lkl, Args, DataMan, GoalCopy ) :- 583 atom_concat( Lkl, '_data', BaseStem ), 584 file_name_extension( BaseStem, pl, BaseName ), 585 Sub = models/Model/lklhoods/Lkl, 586 AbsOpts = [file_type(directory),solutions(all)], 587 absolute_file_name( bims(Sub), LklD, AbsOpts ), 588 directory_file_path( LklD, BaseName, DataMan ), 589 debug( bims, 'Looking data preparation file in: ~w', DataMan ), 590 exists_file( DataMan ), 591 debug( bims, 'Loading data preparation file: ~w', DataMan ), 592 bims_data:load_files( DataMan, [silent(true),if(true)] ), 593 Goal =.. [BaseStem,Args], 594 duplicate_term( Goal, GoalCopy ), 595 call( bims_data:Goal ). 596 597list_element( List, Element ) :- 598 memberchk( Element, List ). 599 600bims_backtrack_term( Term, Bck, P ) :- 601 functor( Term, Bck, Arity ), 602 bims_backtrack_arity_term( Arity, Term, P ). 603 % fixme: add Arity > 1 error 604 605bims_backtrack_arity_term( 0, Atom, Atom ). 606bims_backtrack_arity_term( 1, Term, Arg ) :- 607 arg( 1, Term, Arg ). 608 609bims_locate_prior_file( Model, DlpStem, AbsDlp ) :- 610 AbsOpts = [file_type(directory),solutions(all)], 611 absolute_file_name( bims(models/Model/dlps), Bims, AbsOpts ), 612 member( Dir, ['dlps',Bims] ), 613 directory_file_path( Dir, DlpStem, DlpPathStem ), 614 file_name_extension( DlpPathStem, dlp, AbsDlp ), 615 debug( bims, 'Looking for prior in file: ~w', AbsDlp ), 616 exists_file( AbsDlp ), 617 debug( bims, 'Will be using prior in file: ~w', AbsDlp ), 618 !. 619bims_locate_prior_file( Model, DlpStem, _AbsDlp ) :- 620 throw( fixme(cannot_locate_prior(Model,DlpStem)) ). 621 622atom_singular( Atom, Singular ) :- 623 atom_concat( Singular, s, Atom ), 624 !. 625atom_singular( Atom, Atom ). 626 627bims_args_results_dir( Args, Dir ) :- 628 memberchk( results_dir(ResDir), Args ), 629 !, 630 Dir = ResDir, 631 bims_args_results_dir_given( ResDir, Args ). 632bims_args_results_dir( Args, Dir ) :- 633 bims_args_results_dir_constructed( Args, Dir ). 634 635bims_args_results_dir_given( ResDir, _Args ) :- 636 ground( ResDir ), 637 !. 638bims_args_results_dir_given( ResDir, Args ) :- 639 var( ResDir ), 640 !, 641 bims_args_results_dir_constructed( Args, ResDir ). 642bims_args_results_dir_given( ResDir, _Args ) :- 643 throw( partially_instantiated_results_dir(ResDir) ). 644 645bims_args_results_dir_constructed( Args, ResDir ) :- 646 ( memberchk(dir_prefix(Pfx),Args) -> true; Pfx = res ), 647 !, 648 os_unique_by_date( Pfx, ResDir ). 649 650bims_write_abs_options( OptsFile, abs(Lkl,Data,Dlp) ) :- 651 open( OptsFile, append, Out ), 652 nl( Out ), 653 write( Out, '% derived' ), nl( Out ), 654 Terms = [ likelihood_file(Lkl), 655 data_file(Data), 656 prior_file(Dlp) 657 ], 658 maplist( mcmcms_write_fact(Out), Terms ), 659 nl( Out ), 660 write( Out, '% date/time stamps' ), nl( Out ), 661 get_date_time( Now ), 662 mcmcms_write_fact( Out, started_at(Now) ), 663 close( Out ). 664 665bims_progress_reporter( Its, Pc, Stub ) :- 666 bims_bb_remove( progress, _ ), 667 number( Pc ), 668 0 < Pc, Pc < 100, 669 !, 670 calc_percentiles( Pc, Pc, Its, [HPtl|TPtiles] ), 671 bims_bb_put( progress, pts(Stub,HPtl,TPtiles) ). 672bims_progress_reporter( _Its, _Pc, _Stub ). 673 674bims_write_options( BroF, UnqOpts ) :- 675 open( BroF, write, Out ), 676 write( Out, '% options' ), nl( Out ), 677 maplist( mcmcms_write_fact(Out), UnqOpts ), 678 close( Out ). 679 680calc_percentiles( Perc, _Step, _Repeats, Ptiles ) :- 681 Perc > 100, 682 !, 683 Ptiles = []. 684calc_percentiles( Perc, Step, Repeats, [F|M] ) :- 685 F is integer( Perc * Repeats / 100 ), 686 NxP is Perc + Step, 687 calc_percentiles( NxP, Step, Repeats, M )