1/*
    2Dirichlet process (DP), see https://en.wikipedia.org/wiki/Dirichlet_process
    3Samples are drawn from a base distribution. New samples have a nonzero
    4probability of being equal to already sampled values. The process depends
    5on a parameter alpha (concentration parameter): with alpha->0, a single 
    6value is sampled, with alpha->infinite the distribution is equal to the base
    7distribution.
    8In this example the base distribution is a Guassian with mean 0 and variance
    91, as in https://en.wikipedia.org/wiki/Dirichlet_process#/media/File:Dirichlet_process_draws.svg
   10To model the process, this example uses the Chinese Restaurant Process: 
   11# Draw <math>X_{1}</math> from the base distribution <math>H</math>.
   12# For <math>n>1</math>: <poem>
   13:::: a)  With probability  <math>\frac{\alpha}{\alpha+n-1}</math>  draw <math>X_{n}</math> from <math>H</math>.
   14
   15:::: b)  With probability  <math>\frac{n_{x}}{\alpha+n-1}</math>  set <math>X_{n}=x</math>, where <math>n_{x}</math> is the number of previous observations <math>X_{j}, j<n</math>, such that <math>X_{j}=x</math>. </poem>
   16
   17The example queries show both the distribution of indexes and values of the DP.
   18Moreover, they show the distribution of unique indexes as in 
   19http://www.robots.ox.ac.uk/~fwood/anglican/examples/viewer/?worksheet=nonparametrics/dp-mixture-model
   20*/

?- hist_val(200,100,G). % show the distribution of values with concentration parameter 10. Should look % like row 2 of https://en.wikipedia.org/wiki/Dirichlet_process#/media/File:Dirichlet_process_draws.svg

*/

   27:- use_module(library(mcintyre)).   28:- if(current_predicate(use_rendering/1)).   29:- use_rendering(c3).   30:- endif.   31:- mc.   32:- begin_lpad.   33
   34% dp_n_values(N0,N,Alpha,L,Counts0,Counts)
   35% returns in L a list of N-N0 samples from the DP with concentration parameter
   36% Alpha and initial counts Counts0. Also returns the updated counts Counts
   37dp_n_values(N,N,_Alpha,[],Counts,Counts):-!.
   38
   39% dp_value(NV,Alpha,V)
   40% returns in V the NVth sample from the DP with concentration parameter
   41% Alpha
   42dp_n_values(N0,N,Alpha,[[V]-1|Vs],Counts0,Counts):-
   43  N0<N,
   44  dp_value(N0,Alpha,Counts0,V,Counts1),
   45  N1 is N0+1,
   46  dp_n_values(N1,N,Alpha,Vs,Counts1,Counts).
   47  
   48dp_value(NV,Alpha,Counts,V,Counts1):-
   49  draw_sample(Counts,NV,Alpha,I),
   50  update_counts(0,I,Alpha,Counts,Counts1),
   51  dp_pick_value(I,V).
   52
   53
   54update_counts(_I0,_I,Alpha,[_C],[1,Alpha]):-!.
   55
   56
   57update_counts(I,I,_Alpha,[C|Rest],[C1|Rest]):-
   58  C1 is C+1.
   59
   60update_counts(I0,I,Alpha,[C|Rest],[C|Rest1]):-
   61  I1 is I0+1,
   62  update_counts(I1,I,Alpha,Rest,Rest1).
   63
   64draw_sample(Counts,NV,Alpha,I):-
   65  NS is NV+Alpha,
   66  maplist(div(NS),Counts,Probs),
   67  length(Counts,LC),
   68  numlist(1,LC,Values),
   69  maplist(pair,Values,Probs,Discrete),
   70  take_sample(NV,Discrete,I).
   71
   72take_sample(_,D,V):discrete(V,D).
   73
   74% dp_pick_value(I,V)
   75% returns in V the value of index I of the base distribution 
   76% (in this case N(0,1))
   77dp_pick_value(_,V):gaussian(V,0,1).
   78
   79div(Den,V,P):- P is V/Den.
   80
   81pair(A,B,A:B).
   82
   83:- end_lpad.   84
   85
   86hist_val(Samples,NBins,Chart):-
   87  mc_sample_arg_first(dp_n_values(0,Samples,10.0,V,[10.0],_),1,V,L),
   88  L=[Vs-_],
   89  histogram(Vs,Chart,[nbins(NBins)])
   90