1:- module( bims, [bims/0,bims/1,bims_version/2,bims_citation/2] ). 2 3:- use_module( library(lib) ). 4% :- ensure_loaded( '../src/mcmcms' ). 5% :- ensure_loaded( '../src/init_lib' ). 6:- lib(source(bims), homonyms(true)). 7:- lib(stoics_lib). % en_list/2, goal_spec/2, portray_clauses/2. 8 9:- lib(mcmcms/12). 10:- lib(os_unique_by_date/2). 11:- lib(n_digits/3). 12:- lib(clean_module/1). 13:- lib(report_triggers/1). 14:- lib(bims_bb_remove/2). 15:- lib(remove_template_duplicates/2). 16:- lib(get_date_time/1). 17:- lib(ord_only_add_elem/3). % needed on ad_expand 18 19:- lib(end(bims)). 20% :- lib_pack_end( bims ).
120bims_defaults( ArgsPrv, [
121 chains(3),
122 report([]),
123 results_dir(Dir),
124 iterations(100),
125 tempered([]),
126 seeds(1),
127 progress_percentage(10),
128 progress_stub('.'),
129 DefModelT,
130 prior(ModelSingular),
131 likelihood(Model),
132 data(Model),
133 backtrack(uc),
134 top_goal(ModelSingular),
135 debug(true)
136 ] ) :-
137 en_list( ArgsPrv, Args ),
138 ( (memberchk(results_dir(ArgsDir),Args),ground(ArgsDir)) ->
139 make_directory(ArgsDir)
140 ;
141 true
142 ),
143 DefModel = carts,
144 DefModelT = models(DefModel),
145 append( Args, [DefModelT], Partial ),
146 memberchk( models(Model), Partial ),
147 atom_singular( Model, ModelSingular ),
148 bims_args_results_dir( Args, Dir ).
If the argument (File) corresponds to an existing file, then it is taken to be a settings file. Each argument should be a fact correspond to a known option. For example
chains(3). iterations(100). seeds([1,2,3]).
If the argument (Opts) does not correspond to a file is take to be a list of option terms.
The simplest way to use the software is to make a new directory and run some MCMC chains. The default call,
?- bims(). % equivelant to ?- bims([]).
runs a 3 chains (R=3, below) 100 iterations (I=100) MCMC simulation. The models learnt are classifications trees (carts) based on the default prior and the data are the BCW dataset. The above call is equivelant to:
?- bims([models(carts)]).
To run a toy BN learning example run
?- bims( [models(bns)] ).
This runs 3 chains on some synthetic data of the 8-nodal Asia BN.
To get familiar on how to run bims on private data, make a new directory,
create a subdirecory data
and copy file bims(data/asia.pl)
to
data/test_local.pl
.
?- bims( [data(test_local)] ).
Opts
bns
.debug(bims)
to get debuging messages. If Dbg==false, nodebug(bims)
is called.seeds(1)
when chains(3)
is given expands to seeds([1,2,3])
.top_goal(Top)
) expects. In general the dependency is with the likelihood,
with the prior expected to be compatible with what the likelihood dictates in terms of data.
In the likelihoods provided, Data is the stem of a filename that is loaded in memory.
The file is looked for in Dir/Data[.pl] where Dir is looked for in [./data,bims(Model/data/)].bims(dlps)
.prefix(Pfx)
recognition)all
is expanded to reporting all known reportable terms.All file name based options: Lk, Data, Prior or Rdir, are passed through absolute_file_name/2.
The predicate generates one results directory (Rdir) and files recording information about each run (R) are placed in Rdir.
*/
245bims :- bims( [] ). 246 247bims( ArgsPrv ) :- 248 en_list( ArgsPrv, Args ), 249 bims_defaults( Args, Defs ), 250 append( Args, Defs, Opts ), 251 debug( bims, 'Bims options: ~w', [Opts] ), 252 bims_option_debug( Opts, DbgRestore ), 253 memberchk( results_dir(ResDPrv), Opts ), 254 absolute_file_name( ResDPrv, ResD ), 255 debug( bims, 'Results directory: ~w', ResD ), 256 memberchk( chains(Runs), Opts ), 257 bims_option_seeds( Opts, Seeds ), 258 remove_template_duplicates( Opts, UnqOpts ), 259 number_codes( Runs, Rcodes ), 260 length( Rcodes, RcLen ), 261 n_digits( RcLen, 0, Zero ), 262 atomic_list_concat( [Zero,'opts.pl'], '-', BoptsB ), 263 directory_file_path( ResD, BoptsB, BroF ), 264 bims_write_options( BroF, UnqOpts ), 265 % fixme: likel 266 % likelihood(Lk=Model) 267 % model(Mdl), 268 PersOpts = [ models(Model), prior(Dlp), likelihood(Lkl), 269 data(Data), top_goal(Goal), 270 report(These) 271 ], 272 maplist( list_element(Opts), PersOpts ), 273 report_triggers( These ), 274 bims_ensure_likelihood_loaded( Model, Lkl, AbsLkl ), 275 bims_ensure_data_loaded( Model, Lkl, Data, AbsData ), 276 bims_locate_prior_file( Model, Dlp, AbsDlp ), 277 bims_write_abs_options( BroF, abs(AbsLkl,AbsData,AbsDlp) ), 278 bims_runs( Runs, 1, RcLen, Model, AbsDlp, Seeds, ResD, Goal, Opts ), 279 get_date_time( Now ), 280 portray_clauses( [finished_at(Now)], [file(BroF),mode(append)] ), 281 bims_option_debug_set( DbgRestore ). 282 283bims_runs( 0, _I, _Dgs, _Mdl, _, _Seelds, _ResD, _Goal, _Opts ) :- 284 !, 285 debug( bims, 'Finished all runs', [] ). 286bims_runs( R, I, Dgs, Mdl, Dlp, [Seed|Seeds], ResD, Goal, Opts ) :- 287 copy_term( Goal, TopG ), 288 % * Out, *Stats, Kernel, ModelType, Prior, PB, Repeats, Data, HotChainsIds, *Seed, SGl 289 % Out = chain_output( 01-chain.pl ), Stats= 01_stats.txt 290 debug( bims, 'Run: ~d', I ), 291 n_digits( Dgs, I, Iat ), 292 bims_out_file( ResD, Iat, chain, pl, Chain ), 293 bims_out_file( ResD, Iat, stats, pl, Stats ), 294 bims_out_file( ResD, Iat, report, pl, Rep ), % fixme: nicer name ? 295 MCopts = [ backtrack(BckT), iterations(Its), tempered(Hot), 296 progress_percentage(Pc), progress_stub(Stub) 297 ], 298 maplist( list_element(Opts), MCopts ), %fixme: allow lists 299 bims_progress_reporter( Its, Pc, Stub ), 300 bims_backtrack_term( BckT, Bck, P ), 301 mcmcms( Chain, Stats, Rep, Bck, Mdl, Dlp, P, Its, true, Hot, Seed, TopG ), 302 !, % fixme: do some cleaning 303 J is I + 1, 304 Q is R - 1, 305 bims_runs( Q, J, Dgs, Mdl, Dlp, Seeds, ResD, Goal, Opts ).
date(Y,M,D)
.
?- bims_version(Vers, Date). Vers = 2:5:0, Date = date(2022, 1, 2).
321bims_version( 2:5:0, date(2022,1,2) ).
bibtex(Type,Key,Pairs)
term of the same publication.
Produces all related publications on backtracking.
?- bims_citation( A, G )
, write( A )
nl.
Distributional Logic Programming for Bayesian Knowledge Representation.
Nicos Angelopoulos and James Cussens.
International Journal of Approximate Reasoning (IJAR).
Volume 80, January 2017, pages 52-66.
*/
342bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 343 Atom = 'Distributional Logic Programming for Bayesian Knowledge Representation. 344Nicos Angelopoulos and James Cussens. 345International Journal of Approximate Reasoning (IJAR). 346Volume 80, January 2017, pages 52-66.', 347 Type = article, 348 Key = 'AngelopoulosN_CussensJ_2017', 349 Pairs = [ 350 author = 'Nicos Angelopoulos and James Cussens', 351 title = 'Distributional Logic Programming for Bayesian Knowledge', 352 journal= 'International Journal of Approximate Reasoning', 353 year = 2017, 354 month = 'January', 355 volume = 80, 356 pages = '52-66', 357 url = 'http://www.sciencedirect.com/science/article/pii/S0888613X16301232' 358 ]. 359bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 360 Atom = 'Bayesian learning of Bayesian Networks with informative priors. 361Nicos Angelopoulos and James Cussens (2008). 362Special issue on BN learning. Annals of Mathematics and Artificial Intelligence (AMAI) 54(1-3), 53-98.', 363 Type = article, 364 Key = 'AngelopoulosN_Cussens_2008', 365 Pairs = [ 366 author = 'Nicos Angelopoulos and James Cussens', 367 title = 'Bayesian learning of Bayesian Networks with informative priors.', 368 year = '2008', 369 journal= 'Special issue on BN learning. Annals of Mathematics and Artificial Intelligence (AMAI)', 370 volume = '54', 371 issue = '1-3', 372 pages = '53-98', 373 url = 'http://dx.doi.org/10.1007/s10472-009-9133-x' 374 ]. 375bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 376 Atom = 'Exploiting Informative Priors for Bayesian Classification and Regression Trees. 377Nicos Angelopoulos and James Cussens (2005) 378In 19th International Joint Conference on Artificial Intelligence (IJCAI-05), 641-646, Edinburgh, UK, August 2005.', 379 Type = inproceedings, 380 Key = 'AngelopoulosN_Cussens_2005', 381 Pairs = [ 382 title = 'Exploiting Informative Priors for Bayesian Classification and Regression Trees', 383 author = 'Nicos Angelopoulos and James Cussens', 384 year = '2005', 385 inproceedings = 'In 19th International Joint Conference on Artificial Intelligence (IJCAI-05)', 386 pages = '641-646', 387 address = 'Edinburgh, UK', 388 month = 'August', 389 year = 2005 390 ]. 391bims_citation( Atom, bibtex(Type,Key,Pairs) ) :- 392 Atom = 'Markov chain Monte Carlo using tree-based priors on model structure. 393Nicos Angelopoulos and James Cussens (2001). 394In 17th Conference on Uncertainty in Artificial Intelligence (UAI-2001), 16-23, Seattle, USA.', 395 Type = inproceedings, 396 Key = 'AngelopoulosN_Cussens_2001', 397 Pairs = [ 398 title = 'Markov chain Monte Carlo using tree-based priors on model structure.', 399 author = 'Nicos Angelopoulos and James Cussens', 400 year = 2001, 401 proceedings = 'In 17th Conference on Uncertainty in Artificial Intelligence (UAI-2001)', 402 pages = '16-23', 403 address = 'Seattle, USA' 404 ]. 405 406bims_lib( Spec ) :- 407 ( Spec = Fname/_Arity -> true; Spec = Fname ), 408 ensure_loaded( bims(src/lib/Fname) ).
415bims_option_seeds( Opts, Seeds ) :-
416 memberchk( chains(Runs), Opts ),
417 memberchk( seeds(PrvSeedS), Opts ),
418 en_list( PrvSeedS, PrvSeeds ),
419 length( PrvSeeds, PrvLen ),
420 Diff is max( Runs - PrvLen, 0 ),
421 last( PrvSeeds, Last ),
422 findall( A, ( between(1,Diff,I),
423 A is Last + I
424 ), As ),
425 append( PrvSeeds, As, Seeds ).
debug(true)
is in Opts.
Before indicates the state of bims debugging prior to the call.
The values for Before are: false or true.433bims_option_debug( Opts, Bef ) :- 434 memberchk( debug(Dbg), Opts ), 435 debugging( bims, Bef ), 436 !, 437 bims_option_debug_set( Dbg ). 438 439bims_option_debug_set( true ) :- 440 debug( bims ). 441bims_option_debug_set( false) :- 442 nodebug( bims ).
?- bims:bims_out_file( dir, '01', chain, pl, File ). File = 'dir/01-chain.pl'.
452bims_out_file( Dir, Iat, Tkn, Ext, File ) :- 453 atomic_list_concat( [Iat,Tkn], '-', Bstem ), 454 file_name_extension( Bstem, Ext, Bname ), 455 directory_file_path( Dir, Bname, File ). 456 457bims_ensure_data_loaded( Model, Lkl, DataT, DataF ) :- 458 DataT =.. [Data|Args], 459 % generic section 460 clean_module( data ), 461 AbsOpts = [file_type(directory),solutions(all)], 462 findall( BimsDD, absolute_file_name(bims(data),BimsDD,AbsOpts), BimsDDs ), 463 debug( bims, 'Bims data dirs: ~w', [BimsDDs] ), 464 member( Dir, [data|BimsDDs] ), 465 directory_file_path( Dir, Data, Stem ), 466 file_name_extension( Stem, pl, DataF ), 467 debug( bims, 'Testing existance of data file: ~w', DataF ), 468 exists_file( DataF ), 469 debug( bims, 'Loading data file: ~w', DataF ), 470 data:load_files( DataF, [silent(true)] ), 471 % data:ensure_loaded( DataF ), 472 bims_data_call( Model, Lkl, Args, PrepF, PrepG ), 473 assert( data:data_file(DataF) ), 474 assert( data:data_models(Model) ), 475 assert( data:data_prep_file(PrepF) ), 476 assert( data:data_prep_goal(PrepG) ), 477 !. 478bims_ensure_data_loaded( Model, Lkl, Data, _DataF ) :- 479 throw( fixme(cannot_load_data_for(Model,Lkl,Data)) ). 480 481bims_ensure_likelihood_loaded( Model, Lkl, PlLkl ) :- 482 Sub = models/Model/lklhoods/Lkl/Lkl, 483 clean_module( bims_lkl ), 484 absolute_file_name( bims(Sub), LklStem, [solutions(all)] ), 485 file_name_extension( LklStem, pl, PlLkl ), 486 debug( bims, 'Looking for likelihood in file: ~w', PlLkl ), 487 exists_file( PlLkl ), 488 debug( bims, 'Loading likelihood from file: ~w', PlLkl ), 489 % ensure_loaded( bims:PlLkl ), 490 bims_lkl:load_files( PlLkl, [silent(true),if(true)] ), 491 !. 492bims_ensure_likelihood_loaded( Model, Lkl, _ ) :- 493 throw( fixme(cannot_find_likelihood_for(Lkl,Model)) ). 494 495bims_data_call( Model, Lkl, Args, DataMan, GoalCopy ) :- 496 atom_concat( Lkl, '_data', BaseStem ), 497 file_name_extension( BaseStem, pl, BaseName ), 498 Sub = models/Model/lklhoods/Lkl, 499 AbsOpts = [file_type(directory),solutions(all)], 500 absolute_file_name( bims(Sub), LklD, AbsOpts ), 501 directory_file_path( LklD, BaseName, DataMan ), 502 debug( bims, 'Looking data preparation file in: ~w', DataMan ), 503 exists_file( DataMan ), 504 debug( bims, 'Loading data preparation file: ~w', DataMan ), 505 bims_data:load_files( DataMan, [silent(true),if(true)] ), 506 Goal =.. [BaseStem,Args], 507 duplicate_term( Goal, GoalCopy ), 508 call( bims_data:Goal ). 509 510list_element( List, Element ) :- 511 memberchk( Element, List ). 512 513bims_backtrack_term( Term, Bck, P ) :- 514 functor( Term, Bck, Arity ), 515 bims_backtrack_arity_term( Arity, Term, P ). 516 % fixme: add Arity > 1 error 517 518bims_backtrack_arity_term( 0, Atom, Atom ). 519bims_backtrack_arity_term( 1, Term, Arg ) :- 520 arg( 1, Term, Arg ). 521 522bims_locate_prior_file( Model, DlpStem, AbsDlp ) :- 523 AbsOpts = [file_type(directory),solutions(all)], 524 absolute_file_name( bims(models/Model/dlps), Bims, AbsOpts ), 525 member( Dir, ['dlps',Bims] ), 526 directory_file_path( Dir, DlpStem, DlpPathStem ), 527 file_name_extension( DlpPathStem, dlp, AbsDlp ), 528 debug( bims, 'Looking for prior in file: ~w', AbsDlp ), 529 exists_file( AbsDlp ), 530 debug( bims, 'Will be using prior in file: ~w', AbsDlp ), 531 !. 532bims_locate_prior_file( Model, DlpStem, _AbsDlp ) :- 533 throw( fixme(cannot_locate_prior(Model,DlpStem)) ). 534 535atom_singular( Atom, Singular ) :- 536 atom_concat( Singular, s, Atom ), 537 !. 538atom_singular( Atom, Atom ). 539 540bims_args_results_dir( Args, Dir ) :- 541 memberchk( results_dir(ResDir), Args ), 542 !, 543 Dir = ResDir, 544 bims_args_results_dir_given( ResDir, Args ). 545bims_args_results_dir( Args, Dir ) :- 546 bims_args_results_dir_constructed( Args, Dir ). 547 548bims_args_results_dir_given( ResDir, _Args ) :- 549 ground( ResDir ), 550 !. 551bims_args_results_dir_given( ResDir, Args ) :- 552 var( ResDir ), 553 !, 554 bims_args_results_dir_constructed( Args, ResDir ). 555bims_args_results_dir_given( ResDir, _Args ) :- 556 throw( partially_instantiated_results_dir(ResDir) ). 557 558bims_args_results_dir_constructed( Args, ResDir ) :- 559 ( memberchk(dir_prefix(Pfx),Args) -> true; Pfx = res ), 560 !, 561 os_unique_by_date( Pfx, ResDir ). 562 563bims_write_abs_options( OptsFile, abs(Lkl,Data,Dlp) ) :- 564 open( OptsFile, append, Out ), 565 nl( Out ), 566 write( Out, '% derived' ), nl( Out ), 567 Terms = [ likelihood_file(Lkl), 568 data_file(Data), 569 prior_file(Dlp) 570 ], 571 maplist( mcmcms_write_fact(Out), Terms ), 572 nl( Out ), 573 write( Out, '% date/time stamps' ), nl( Out ), 574 get_date_time( Now ), 575 mcmcms_write_fact( Out, started_at(Now) ), 576 close( Out ). 577 578bims_progress_reporter( Its, Pc, Stub ) :- 579 bims_bb_remove( progress, _ ), 580 number( Pc ), 581 0 < Pc, Pc < 100, 582 !, 583 calc_percentiles( Pc, Pc, Its, [HPtl|TPtiles] ), 584 bims_bb_put( progress, pts(Stub,HPtl,TPtiles) ). 585bims_progress_reporter( _Its, _Pc, _Stub ). 586 587bims_write_options( BroF, UnqOpts ) :- 588 open( BroF, write, Out ), 589 write( Out, '% options' ), nl( Out ), 590 maplist( mcmcms_write_fact(Out), UnqOpts ), 591 close( Out ). 592 593calc_percentiles( Perc, _Step, _Repeats, Ptiles ) :- 594 Perc > 100, 595 !, 596 Ptiles = []. 597calc_percentiles( Perc, Step, Repeats, [F|M] ) :- 598 F is integer( Perc * Repeats / 100 ), 599 NxP is Perc + Step, 600 calc_percentiles( NxP, Step, Repeats, M )
Bims- Bayesian inference over model structures.
Introduction
Bims (Bayesian inference over model structures) implements MCMC learning over statistical models defined in the Dlp (Distributional logic programming) probabilistic language.
Bims is released under GPL2, or Artistic 2.0
Currently there are 2 model spaces supported:
Additional model spaces can be easily implemented by defining new likelihood plug-ins and programming appropriate priors.
Examples provided
Carts examples
The above are two equivalent ways to run the Carts example provided.
This runs 3 chains each of length 100 on the default Carts data using the default likelihood. The default dataset is the breast cancer Winsconsin (BCW) data from the machine learning repository. There are 2 categories, 9 variables and 683 data points in this dataset. You can view the data with
The default likelihood is an implementation of the classification likelihood function presented in: H Chipman, E George, and R McCulloch. Bayesian CART model search (with discussion). J. of the American Statistical Association, 93:935â960, 1998.
Bns examples
The above are two equivalent ways to run the Bns example provided.
This runs 3 chains each of length 100 on the default bns data using default likelihood. The dataset is a sampled dataset from the ASIA network and it comprises of 8 variables and 2295 datapoints. You can view the data with
The default BN likelihood is an instance of the BDeu metric for scoring BN structures.
W. L. Buntine. Theory refinement of Bayesian networks. In Bruce DâAmbrosio, Philippe Smets, and Piero Bonissone, editors, Proceedings of the Seventh Annual Conference on Uncertainty in Artificial Intelligence (UAIâ1991), pages 52â60, 1991
David Heckerman, Dan Geiger, and David M. Chickering. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20(3):197â243, 1995.
Learning models from new datasets
An easy way to run Bims on your data is to create a new directory and within that sub-directory data/ copy your data there and pass options data/1 to the basename of the data file.
For example,
Learning new statistical models.
By defining a new likelihood function and new priors the system can be used on new statistical models.
Pack info
bims_default(-Def)
.