dp_chinese.pl

1/* 2Dirichlet process (DP), see https://en.wikipedia.org/wiki/Dirichlet_process 3Samples are drawn from a base distribution. New samples have a nonzero 4probability of being equal to already sampled values. The process depends 5on a parameter alpha (concentration parameter): with alpha->0, a single 6value is sampled, with alpha->infinite the distribution is equal to the base 7distribution. 8In this example the base distribution is a Guassian with mean 0 and variance 91, as in https://en.wikipedia.org/wiki/Dirichlet_process#/media/File:Dirichlet_process_draws.svg 10To model the process, this example uses the Chinese Restaurant Process: 11# Draw <math>X_{1}</math> from the base distribution <math>H</math>. 12# For <math>n>1</math>: <poem> 13:::: a) With probability <math>\frac{\alpha}{\alpha+n-1}</math> draw <math>X_{n}</math> from <math>H</math>. 14 15:::: b) With probability <math>\frac{n_{x}}{\alpha+n-1}</math> set <math>X_{n}=x</math>, where <math>n_{x}</math> is the number of previous observations <math>X_{j}, j<n</math>, such that <math>X_{j}=x</math>. </poem> 16 17The example queries show both the distribution of indexes and values of the DP. 18Moreover, they show the distribution of unique indexes as in 19http://www.robots.ox.ac.uk/~fwood/anglican/examples/viewer/?worksheet=nonparametrics/dp-mixture-model 20*/

?- hist_val(200,100,G). % show the distribution of values with concentration parameter 10. Should look % like row 2 of https://en.wikipedia.org/wiki/Dirichlet_process#/media/File:Dirichlet_process_draws.svg

27:- use_module(library(mcintyre)). 28:- if(current_predicate(use_rendering/1)). 29:- use_rendering(c3). 30:- endif. 31:- mc. 32:- begin_lpad. 33 34% dp_n_values(N0,N,Alpha,L,Counts0,Counts) 35% returns in L a list of N-N0 samples from the DP with concentration parameter 36% Alpha and initial counts Counts0. Also returns the updated counts Counts 37dp_n_values(N,N,_Alpha,[],Counts,Counts):-!. 38 39% dp_value(NV,Alpha,V) 40% returns in V the NVth sample from the DP with concentration parameter 41% Alpha 42dp_n_values(N0,N,Alpha,[[V]-1|Vs],Counts0,Counts):- 43 N0<N, 44 dp_value(N0,Alpha,Counts0,V,Counts1), 45 N1 is N0+1, 46 dp_n_values(N1,N,Alpha,Vs,Counts1,Counts). 47 48dp_value(NV,Alpha,Counts,V,Counts1):- 49 draw_sample(Counts,NV,Alpha,I), 50 update_counts(0,I,Alpha,Counts,Counts1), 51 dp_pick_value(I,V). 52 53 54update_counts(_I0,_I,Alpha,[_C],[1,Alpha]):-!. 55 56 57update_counts(I,I,_Alpha,[C|Rest],[C1|Rest]):- 58 C1 is C+1. 59 60update_counts(I0,I,Alpha,[C|Rest],[C|Rest1]):- 61 I1 is I0+1, 62 update_counts(I1,I,Alpha,Rest,Rest1). 63 64draw_sample(Counts,NV,Alpha,I):- 65 NS is NV+Alpha, 66 maplist(div(NS),Counts,Probs), 67 length(Counts,LC), 68 numlist(1,LC,Values), 69 maplist(pair,Values,Probs,Discrete), 70 take_sample(NV,Discrete,I). 71 72take_sample(_,D,V):discrete(V,D). 73 74% dp_pick_value(I,V) 75% returns in V the value of index I of the base distribution 76% (in this case N(0,1)) 77dp_pick_value(_,V):gaussian(V,0,1). 78 79div(Den,V,P):- P is V/Den. 80 81pair(A,B,A:B). 82 83:- end_lpad. 84 85 86hist_val(Samples,NBins,Chart):- 87 mc_sample_arg_first(dp_n_values(0,Samples,10.0,V,[10.0],_),1,V,L), 88 L=[Vs-_], 89 histogram(Vs,Chart,[nbins(NBins)]) 90