In Part 1 of this introduction to nested sampling we grappled with how to search a likelihood space where the ground truth was already programmed in, which made for a fairly trivial example. That was sufficient for introducing the scheme, the general area, and its terminology.
Next we will take this to a more realistic example: we take some (simulated) data points that we have collected, build a probabilistic model for that dataset, and then make an inference on a set of hyperparameters. We keep the dimensionality low, looking to infer the position of a rotating lighthouse light that emits randomly timed pulses: its position along the coastline and its distance from the shore.
Rotating Source
Building on the nested sampling methodology in Part 1, we want to build an inference based on a probabilistic model, which we interpret through Bayes' theorem: $$P(\theta | D,M) = \frac{P(D|\theta,M)P(\theta|M)}{P(D|M)} $$ In this case our \(\theta \) will be \(\{ x, y \}\), where \(x \) and \(y \) are orthogonal to each other, with \(x\) being the position along the straight coastline and \(y\) being the distance from the coast. The probability distribution that describes a rotating source is the Cauchy distribution, which, amongst its many interesting properties, has a fully analytical form, making it perfect for this example. Its probability density function (PDF) is: $$ \frac{1}{\pi y \big( 1 + \big( \frac{x - x_0}{y}\big)^2 \big)}$$ and with some rearrangement the PDF value for each data point becomes $$ L_i = \frac{y / \pi}{y^2 + ( x_i - x )^2} $$ where \(x_i\) is the position along the coastline where the pulse was measured, and \(x\) and \(y\) are the current co-ordinates of that particular atom. We then use this to calculate the summed log likelihood: $$ L_k = \sum_i \log( L_i ) $$ We can clearly see a tendency towards very negative log likelihood values when summing over the samples, but this should not cause any issues with the double floating point precision we are using here. In the example below, I normalise the log likelihood values so that $$ L_k = \frac{1}{N_{data}} \sum_i \log( L_i ) $$
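To see where the Cauchy form comes from: if the lighthouse at \((x, y)\) emits a pulse at an azimuth angle \(\theta\) drawn uniformly from \((-\pi/2, \pi/2)\), the pulse strikes the coast at \(x_i = x + y \tan\theta\). Changing variables from \(\theta\) to \(x_i\) recovers the density above: $$ P(x_i | x, y) = P(\theta)\left|\frac{d\theta}{dx_i}\right| = \frac{1}{\pi} \cdot \frac{y}{y^2 + (x_i - x)^2} $$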
Implementation
The main loop is unchanged from Part 1, but I have defined slightly different Prior and Explore methods to incorporate a prior domain appropriate to our data. The hard-coded D array holds the positions along the coastline where pulses from the lighthouse were detected.
#define PRIOR_X_MIN ( -10.0 )
#define PRIOR_X_MAX ( 30.0 )
#define PRIOR_Y_MIN ( 0.0 )
#define PRIOR_Y_MAX ( 40.0 )
#define PRIOR_WIDTH(min, max) ( max - min )
#define PRIOR_X_WIDTH ( PRIOR_WIDTH( PRIOR_X_MIN, PRIOR_X_MAX ) )
#define PRIOR_Y_WIDTH ( PRIOR_WIDTH( PRIOR_Y_MIN, PRIOR_Y_MAX ) )
// Object within an n-dimensional space
typedef struct nestedObject {
double x[N_DIM]; // Physical co-ordinates, one per dimension.
double u[N_DIM];
double log_likelihood;
double logarea;
} nestedObject;
// Likelihood ("The Model") prototype.
double _lighthouse_model( nestedObject *object );
#define LIKELIHOOD( object ) ( _lighthouse_model( object ) )
int bounds_fail_check( double x, double y) {
return ( ( y < PRIOR_Y_MIN ) || ( x < PRIOR_X_MIN ) || ( x >= PRIOR_X_MAX ) );
}
double _lighthouse_model(nestedObject *object)
{
const int N_Data = 120;
double D[120] = { -7.10673269, -46.80201074, 3.51347374, 9.93716449, 0.9671578,
42.0402484, 49.31837839, 23.16023324, 32.75790862, -21.36586103,
-24.67435218, 17.1067892, -1.05680909, -6.07492508, 152.72672771,
-24.40356202, 47.94171621, 42.07266556, 33.42214928, -20.61308064,
-41.74847916, -16.88144346, 11.56842581, -6.05677264, 4.48931047,
131.0736122, -99.98824669, -6.23758309, 5.04789552, 11.35849687,
-4.20356311, 6.97391202, -91.14591724, 180.31645798, -141.26229914,
-22.14320861, 13.21477981, -37.28966747, 11.33413848, -10.06945706,
10.31704433, 16.70927306, 25.44803236, 3.34080992, 33.65248605,
6.76394825, 34.68356861, -9.88313966, -103.80184722, -4.31492774,
66.51684856, -23.95770879, 10.70521586, 12.29880215, 18.0017317,
33.07586169, 7.62973585, -4.39100137, -158.47858974, -33.57391485,
55.41549009, 11.27813131, 48.0543114, 13.84422087, 16.39517458,
-183.77689914, 105.28618091, 11.67441646, 1.17495233, 30.81052785,
41.16920128, 43.46706485, 36.5949243, 0.18780379, -30.69225096,
-12.53107989, 3.20745861, 40.22652877, 22.78768322, 30.7076922,
48.84991245, -55.27349561, -8.93504733, 24.16453158, 29.75221924,
-5.34764689, 21.32217819, 11.65608557, 26.32877194, -10.02920107,
33.109761, 32.09296084, 18.57116533, 18.60906021, -24.53124876,
-17.61067266, 24.09486465, 8.73644648, 144.52063783, 11.67525365,
38.11111074, 27.60956459, 72.01092819, -14.15922508, 140.57220581,
105.30969272, 26.87199113, -1.99171779, 6.59312843, 8.65441895,
22.81108584, 1.36484011, 34.78014171, -27.03539633, 39.49729209,
3.65886577, -9.8719139, 58.89804908, 7.59483289, 1.4160217 };
if( bounds_fail_check( object->x[0], object->x[1] ) ) {
return -DBL_MAX;
}
double logL = 0.0;
for( int k = 0; k < N_Data; ++k )
{
/* const double numerator = log( object->x[1] ) - log( 3.1416 ); */
/* const double denominator = log( ( object->x[1] * object->x[1] ) + ( D[k] - object->x[0] ) * ( D[k] - object->x[0] ) ); */
/* logL += ( numerator - denominator ) / N_Data; */
logL += log( ( object->x[1] / 3.1416 ) / ( ( D[k] - object->x[0] ) * ( D[k] - object->x[0] ) + object->x[1] * object->x[1] ) ) / N_Data;
}
return logL;
}
void Prior(nestedObject *object)
{
object->u[0] = UNIFORM; object->u[1] = UNIFORM;
object->x[0] = PRIOR_X_WIDTH * object->u[0] + PRIOR_X_MIN;
object->x[1] = PRIOR_Y_WIDTH * object->u[1];
object->log_likelihood = LIKELIHOOD( object );
}
// For re-nesting an object after it has been killed.
// We use a very crude genetic algorithm to do this by mixing co-ords from two random objects.
void Explore(nestedObject *object, // Object we want to replace
nestedObject *objects, // Objects
double logLstar)
{
double evolvedLogL = -DBL_MAX;
nestedObject dummy;
dummy.log_likelihood = -DBL_MAX;
dummy.logarea = 0.0;
do
{
int parents[2] = { 0 };
parents[0] = (int)( N * UNIFORM ) % N;
parents[1] = (int)( N * UNIFORM ) % N;
if( parents[0] == parents[1] ) continue; // We can not reproduce asexually.
for( int i = 0; i < N_DIM; ++i )
{
double mixing_factor = UNIFORM;
double new_gene = mixing_factor * objects[parents[0]].u[i] +
( 1 - mixing_factor ) * objects[parents[1]].u[i];
dummy.u[i] = new_gene;
}
dummy.x[0] = PRIOR_X_WIDTH * dummy.u[0] + PRIOR_X_MIN;
dummy.x[1] = PRIOR_Y_WIDTH * dummy.u[1];
if( bounds_fail_check( dummy.x[0], dummy.x[1] ) ) continue;
evolvedLogL = LIKELIHOOD( &dummy );
}
while( evolvedLogL < logLstar );
dummy.log_likelihood = evolvedLogL;
fprintf( stderr, "LogL:%g \n", evolvedLogL );
dummy.logarea = 0.0;
*object = dummy;
}
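As an aside, the crude genetic mixing above is only one way to re-seed a killed point. Below is an illustrative sketch of an alternative that is closer in spirit to the exploration step used in Sivia's book: a short random walk in the unit-cube co-ordinates, started from a random survivor, accepting only moves that keep the likelihood above logLstar. ExploreRandomWalk is a hypothetical drop-in replacement for Explore, not part of the implementation in this post, and it re-uses the surrounding program's definitions (UNIFORM, N, N_DIM, the PRIOR_* macros, bounds_fail_check and LIKELIHOOD).
// Alternative exploration: a constrained random walk in the unit cube (sketch only).
void ExploreRandomWalk(nestedObject *object,  // Object we want to replace
                       nestedObject *objects, // Current live objects
                       double logLstar)       // The new log likelihood limit
{
    double step = 0.1;  // Initial step size in u-space
    int accept = 0, reject = 0;
    nestedObject trial = objects[(int)( N * UNIFORM ) % N]; // Start from a random survivor
    for( int m = 0; m < 20; ++m )  // A handful of walk steps
    {
        nestedObject candidate = trial;
        for( int i = 0; i < N_DIM; ++i )
        {
            candidate.u[i] += step * ( 2.0 * UNIFORM - 1.0 ); // Symmetric proposal
            candidate.u[i] -= floor( candidate.u[i] );        // Wrap back into [0,1)
        }
        candidate.x[0] = PRIOR_X_WIDTH * candidate.u[0] + PRIOR_X_MIN;
        candidate.x[1] = PRIOR_Y_WIDTH * candidate.u[1];
        if( !bounds_fail_check( candidate.x[0], candidate.x[1] ) )
        {
            candidate.log_likelihood = LIKELIHOOD( &candidate );
            if( candidate.log_likelihood > logLstar )
            {
                trial = candidate; // Accept the move
                ++accept;
            }
            else ++reject;
        }
        else ++reject;
        // Nudge the step size towards roughly 50% acceptance.
        if( accept > reject )      step *= exp( 1.0 / accept );
        else if( accept < reject ) step /= exp( 1.0 / reject );
    }
    trial.logarea = 0.0;
    *object = trial;
}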
The data for the lighthouse were sampled from a ground truth distribution simulated in Python; samples falling outside \(\pm 200\) are discarded, leaving the 120 points hard-coded in the D array above:
import numpy
lighthouse_position_b = 20 # True distance from the coast (the y we will infer)
lighthouse_position_a = 10 # True position along the coast (the x we will infer)
## Draw samples from a Cauchy distribution
s = lighthouse_position_a + lighthouse_position_b*numpy.random.standard_cauchy(128)
s = s[(s>-200) & (s<200)]
Results
The ground truth of our data is that the lighthouse is at an \(x\) value of 10 and a \(y\) value of 20. As we iterate, we can see that we converge to roughly the correct values. Because the pulses are only measured along a single line (the coastline) of our two-dimensional parameter space, we have a significant degeneracy between our dataset and the model we are using. That is to say, the gradient of the Cauchy model in the \(y\) direction is very close to zero, so \(y\) is difficult to draw an inference on.

Now let’s have a look at our log likelihood and posterior distributions:


Nothing seems out of the ordinary here. As already discussed, we have significant uncertainty in the \(y\) direction, which shows up especially in our likelihood samples. So we have made a successful inference, but we are very well aware that the standard deviation is large. If we re-use our Results method from Part 1, which summarises the posterior by its mean and standard deviation and therefore implicitly treats it as roughly Gaussian (only approximately true for a posterior driven by a Cauchy likelihood), we can see this in our output:
### Posterior Results ###
Mean(x) = 9.99043, Std. Dev.(x) = 9.85485
Mean(y) = 22.9575, Std. Dev.(y) = 9.71035
# iterations = 1000
Evidence: ln(Z) = -5.37152 +- 0.0204138
Information: H = 0.0416725 nats = 0.0601206 bits
Remember that the posterior peak we see above is not the same as the summary produced in the results. This is due to the differences between the Cauchy and Gaussian distributions, especially in the \(y\) direction. These small discrepancies aside, it is clear that while our mean results are pretty much correct, we are also correctly calculating that they carry very high uncertainty! This is the primary lesson of Bayesian statistics: it is not just about inferring a set of hyperparameters, but also about quantifying your uncertainty in them, based on your evidence values and probabilistic model. With nested sampling, we can also try different models and compare them, thanks to the ability to estimate the evidence.
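For reference, that comparison uses the evidence directly: for two models \(M_1\) and \(M_2\) fitted to the same data, $$ \frac{P(M_1 | D)}{P(M_2 | D)} = \frac{P(D | M_1)}{P(D | M_2)} \cdot \frac{P(M_1)}{P(M_2)} $$ so the ratio of the two \(Z\) values (the Bayes factor) tells us how strongly the data favour one model over the other, before any prior preference between the models themselves.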
Full Implementation
Here is the full implementation in C, which should compile under gcc or clang like so (linking the maths library with -lm; define VERBOSE_DEBUG_MESSAGES=1 for the per-object debug output):
gcc nested_sampling_lighthouse.c -o nested_sampling.out -lm -DVERBOSE_DEBUG_MESSAGES=1
// This is adapted from D.S.Sivia's book:
// Data Analysis: A Bayesian Tutorial
// Chapter 9 - Nested Sampling
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <float.h>
#ifndef VERBOSE_DEBUG_MESSAGES
#define VERBOSE_DEBUG_MESSAGES ( 0 )
#endif
// ### Change these two parameters
#define N ( 100 ) // Number of atoms
#define MAX ( 1000 ) // Max number of iterations
// ## Do not change these.
#define N_DIM ( 2 ) // Dimensionality of the likelihood space
#define UNIFORM ( ( rand() + 0.5 ) / ( RAND_MAX + 1.0 ) ) // Defined on domain (0,1)
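// Logarithmic addition: PLUS(x,y) = log( exp(x) + exp(y) ), evaluated without overflow.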
#define PLUS(x,y) ( x > y ? x + log( 1 + exp( y - x ) ) : y + log( 1 + exp( x - y ) ) )
#define PRIOR_X_MIN ( -10.0 )
#define PRIOR_X_MAX ( 30.0 )
#define PRIOR_Y_MIN ( 0.0 )
#define PRIOR_Y_MAX ( 40.0 )
#define PRIOR_WIDTH(min, max) ( max - min )
#define PRIOR_X_WIDTH ( PRIOR_WIDTH( PRIOR_X_MIN, PRIOR_X_MAX ) )
#define PRIOR_Y_WIDTH ( PRIOR_WIDTH( PRIOR_Y_MIN, PRIOR_Y_MAX ) )
// Object within an n-dimensional space
typedef struct nestedObject {
double x[N_DIM]; // Physical co-ordinates, one per dimension.
double u[N_DIM];
double log_likelihood;
double logarea;
} nestedObject;
// Likelihood ("The Model") prototype.
double _lighthouse_model( nestedObject *object );
#define LIKELIHOOD( object ) ( _lighthouse_model( object ) )
void Prior(nestedObject *object); // Sets up an object within our prior boundaries
// For re-nesting an object after it has been killed.
void Explore(nestedObject *object, // Object we want to replace
nestedObject *objects, // Objects
double logLstar); // The new log likelihood limit.
// Process our discarded points by iterating back over the
// logarea and generating posterior samples
void Results(nestedObject *samples,
int nest,
double logZ );
void dumpSamples(nestedObject *samples,
int nest,
double logZ);
int main(void)
{
nestedObject objects[N];
nestedObject samples[MAX];
double logwidth; // ln(width in prior mass)
double logLstar; // ln(Likelihood constraint)
double H = 0.0; // Information (the relative entropy of the posterior with respect to the prior)
double logZ = -DBL_MAX; // ln(Evidence Z); Z = 0 initially, so ln(Z) starts at -DBL_MAX
double logZnew; // Updated logZ
int i; // Object counter
int copy; // Duplicated object
int worst; // Worst object
int nest; // Nested sampling iteration count
fprintf( stderr, "Initialising priors on objects...");
// Set prior objects
for( i = 0; i < N; ++i )
{
Prior(&objects[i]);
}
fprintf( stderr, "done\n");
#if VERBOSE_DEBUG_MESSAGES
for( i = 0; i < N; ++i )
{
fprintf( stderr, "Prior Object %d: %g %g logL:%g\n", i, objects[i].x[0], objects[i].x[1], objects[i].log_likelihood );
}
#endif
// Outermost interval of prior mass
logwidth = log( 1.0 - exp( - 1.0 / N ) );
// START - Nested Sampling Loop
for( nest = 0; nest < MAX; ++nest )
{
//fprintf( stderr, "nest:%d logarea:%g\n", nest, logarea );
worst = 0;
for( i = 1; i < N; ++i )
{
if( objects[i].log_likelihood < objects[worst].log_likelihood ) {
worst = i;
}
}
objects[worst].logarea = logwidth + objects[worst].log_likelihood;
// Update Evidence Z and Information H
logZnew = PLUS( logZ, objects[worst].logarea );
H = exp( objects[worst].logarea - logZnew ) * objects[worst].log_likelihood
+ exp( logZ - logZnew ) * ( H + logZ ) - logZnew;
logZ = logZnew;
// Posterior samples (optional, not needed for optimising over the parameter space)
samples[nest] = objects[worst];
// Kill the worst object in favour of a copy of a different survivor
do {
copy = (int)( N * UNIFORM ) % N;
}
while( copy == worst && N > 1 );
logLstar = objects[worst].log_likelihood;
objects[worst] = objects[copy];
// Evolve the copied object within our new constraint.
fprintf( stderr, "Nest %d: ", nest );
Explore( &objects[worst], objects, logLstar );
// Shrink the interval
logwidth -= 1.0 / N;
}
// END - Nested Sampling Loop
#if VERBOSE_DEBUG_MESSAGES
for( i = 0; i < N; ++i )
{
fprintf( stderr, "Post Nested Sampling Object %d: %g %g logL:%g\n", i, objects[i].x[0], objects[i].x[1], objects[i].log_likelihood );
}
#endif
printf( "# iterations = %d\n", nest );
printf( "Evidence: ln(Z) = %g +- %g\n", logZ, sqrt( H / N ) );
printf( "Information: H = %g nats = %g bits\n", H, H / log( 2. ) );
Results( samples, nest, logZ );
dumpSamples( samples, nest, logZ );
return 0;
}
int bounds_fail_check( double x, double y) {
return ( ( y < PRIOR_Y_MIN ) || ( x < PRIOR_X_MIN ) || ( x >= PRIOR_X_MAX ) );
}
double _lighthouse_model(nestedObject *object)
{
const int N_Data = 120;
double D[120] = { -7.10673269, -46.80201074, 3.51347374, 9.93716449, 0.9671578,
42.0402484, 49.31837839, 23.16023324, 32.75790862, -21.36586103,
-24.67435218, 17.1067892, -1.05680909, -6.07492508, 152.72672771,
-24.40356202, 47.94171621, 42.07266556, 33.42214928, -20.61308064,
-41.74847916, -16.88144346, 11.56842581, -6.05677264, 4.48931047,
131.0736122, -99.98824669, -6.23758309, 5.04789552, 11.35849687,
-4.20356311, 6.97391202, -91.14591724, 180.31645798, -141.26229914,
-22.14320861, 13.21477981, -37.28966747, 11.33413848, -10.06945706,
10.31704433, 16.70927306, 25.44803236, 3.34080992, 33.65248605,
6.76394825, 34.68356861, -9.88313966, -103.80184722, -4.31492774,
66.51684856, -23.95770879, 10.70521586, 12.29880215, 18.0017317,
33.07586169, 7.62973585, -4.39100137, -158.47858974, -33.57391485,
55.41549009, 11.27813131, 48.0543114, 13.84422087, 16.39517458,
-183.77689914, 105.28618091, 11.67441646, 1.17495233, 30.81052785,
41.16920128, 43.46706485, 36.5949243, 0.18780379, -30.69225096,
-12.53107989, 3.20745861, 40.22652877, 22.78768322, 30.7076922,
48.84991245, -55.27349561, -8.93504733, 24.16453158, 29.75221924,
-5.34764689, 21.32217819, 11.65608557, 26.32877194, -10.02920107,
33.109761, 32.09296084, 18.57116533, 18.60906021, -24.53124876,
-17.61067266, 24.09486465, 8.73644648, 144.52063783, 11.67525365,
38.11111074, 27.60956459, 72.01092819, -14.15922508, 140.57220581,
105.30969272, 26.87199113, -1.99171779, 6.59312843, 8.65441895,
22.81108584, 1.36484011, 34.78014171, -27.03539633, 39.49729209,
3.65886577, -9.8719139, 58.89804908, 7.59483289, 1.4160217 };
if( bounds_fail_check( object->x[0], object->x[1] ) ) {
return -DBL_MAX;
}
double logL = 0.0;
for( int k = 0; k < N_Data; ++k )
{
/* const double numerator = log( object->x[1] ) - log( 3.1416 ); */
/* const double denominator = log( ( object->x[1] * object->x[1] ) + ( D[k] - object->x[0] ) * ( D[k] - object->x[0] ) ); */
/* logL += ( numerator - denominator ) / N_Data; */
logL += log( ( object->x[1] / 3.1416 ) / ( ( D[k] - object->x[0] ) * ( D[k] - object->x[0] ) + object->x[1] * object->x[1] ) ) / N_Data;
}
return logL;
}
void Prior(nestedObject *object)
{
object->u[0] = UNIFORM; object->u[1] = UNIFORM;
object->x[0] = PRIOR_X_WIDTH * object->u[0] + PRIOR_X_MIN;
object->x[1] = PRIOR_Y_WIDTH * object->u[1];
object->log_likelihood = LIKELIHOOD( object );
}
// For re-nesting an object after it has been killed.
// We use a very crude genetic algorithm to do this by mixing co-ords from two random objects.
void Explore(nestedObject *object, // Object we want to replace
nestedObject *objects, // Objects
double logLstar)
{
double evolvedLogL = -DBL_MAX;
nestedObject dummy;
dummy.log_likelihood = -DBL_MAX;
dummy.logarea = 0.0;
do
{
int parents[2] = { 0 };
parents[0] = (int)( N * UNIFORM ) % N;
parents[1] = (int)( N * UNIFORM ) % N;
if( parents[0] == parents[1] ) continue; // We can not reproduce asexually.
for( int i = 0; i < N_DIM; ++i )
{
double mixing_factor = UNIFORM;
double new_gene = mixing_factor * objects[parents[0]].u[i] +
( 1 - mixing_factor ) * objects[parents[1]].u[i];
dummy.u[i] = new_gene;
}
dummy.x[0] = PRIOR_X_WIDTH * dummy.u[0] + PRIOR_X_MIN;
dummy.x[1] = PRIOR_Y_WIDTH * dummy.u[1];
if( bounds_fail_check( dummy.x[0], dummy.x[1] ) ) continue;
evolvedLogL = LIKELIHOOD( &dummy );
}
while( evolvedLogL < logLstar );
dummy.log_likelihood = evolvedLogL;
fprintf( stderr, "LogL:%g \n", evolvedLogL );
dummy.logarea = 0.0;
*object = dummy;
}
void Results(nestedObject *samples,
int nest,
double logZ )
{
double x = 0.0, xx = 0.0;
double y = 0.0, yy = 0.0;
double w;
int i;
for( i = 0; i < nest; ++i )
{
w = exp( samples[i].logarea - logZ );
x += w * samples[i].x[0];
xx += w * samples[i].x[0] * samples[i].x[0];
y += w * samples[i].x[1];
yy += w * samples[i].x[1] * samples[i].x[1];
}
fprintf( stderr, "### Posterior Results ###\n" );
fprintf( stderr, "Mean(x) = %g, Std. Dev.(x) = %g\n", x, sqrt( xx - x * x ) );
fprintf( stderr, "Mean(y) = %g, Std. Dev.(y) = %g\n", y, sqrt( yy - y * y ) );
}
void dumpSamples(nestedObject *samples,
int nest,
double logZ)
{
FILE *write_pointer;
write_pointer = fopen( "nestedSamples.bin", "wb" );
for(int i = 0; i < nest; ++i)
{
double x = samples[i].x[0];
double y = samples[i].x[1];
double logL = samples[i].log_likelihood;
double logarea = samples[i].logarea;
fwrite( &x, sizeof( x ), 1, write_pointer );
fwrite( &y, sizeof( y ), 1, write_pointer );
fwrite( &logL, sizeof( logL ), 1, write_pointer );
fwrite( &logarea, sizeof( logarea ), 1, write_pointer );
fwrite( &logZ, sizeof( logZ ), 1, write_pointer );
}
fclose( write_pointer );
}
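To post-process the run, the binary file written by dumpSamples can be read back as records of five doubles per discarded point: x, y, logL, logarea, and the (repeated) final logZ. Below is a minimal standalone reader sketch (a hypothetical readSamples.c), assuming only that layout; the weight it prints is the posterior weight exp(logarea - logZ) used in Results.
// readSamples.c - minimal reader for nestedSamples.bin (sketch).
// Each record holds five doubles: x, y, logL, logarea, logZ, as written by dumpSamples above.
#include <stdio.h>
#include <math.h>
int main(void)
{
    FILE *fp = fopen( "nestedSamples.bin", "rb" );
    if( !fp ) { perror( "nestedSamples.bin" ); return 1; }
    double rec[5];
    while( fread( rec, sizeof( double ), 5, fp ) == 5 )
    {
        // Posterior weight of this discarded point.
        double weight = exp( rec[3] - rec[4] );
        printf( "x=%g y=%g logL=%g weight=%g\n", rec[0], rec[1], rec[2], weight );
    }
    fclose( fp );
    return 0;
}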