TensorFlow Probability: Analysing “Brexit” page views with Bayesian Switchpoint Analysis
The TensorFlow Probability library supports detailed data analysis using statistical and probabilistic methods. It is also quite effective at detecting “switch points” in a time series, i.e. points at which the behaviour of the series changes significantly.
In this instance, TensorFlow Probability is used to detect the switchpoint in a time series of daily page views for the term “Brexit”. The data is available from Wikimedia Toolforge and ranges from June 2019 to July 2020.
The examples illustrated in this article use the template from the Bayesian Switchpoint Analysis tutorial, which the original authors (Copyright 2019 The TensorFlow Authors) have made available under the Apache 2.0 license.
Background
Since the onset of COVID-19, media interest in Brexit has been lower than it was previously.
This is evidenced by a significant decline in the overall trend for page views of the term per day, as well as a sharp decline in “spikes” of page view interest for the term. The time series is expressed in logarithmic format in order to smooth out the series.
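As a rough illustration of why a log transform smooths the series, the sketch below applies a logarithm to a handful of daily counts. The values in daily_views are invented for illustration and are not the Wikimedia data, and the use of the natural log (rather than base 10) is an assumption, since the article does not state the base.

```python
import math

# Hypothetical daily page-view counts (not the real Wikimedia data),
# including one large spike.
daily_views = [12000, 9500, 60000, 4200, 3900]

# Natural log is assumed here; the article does not state the base.
log_views = [math.log(v) for v in daily_views]

raw_ratio = max(daily_views) / min(daily_views)   # spike relative to the quietest day
log_range = max(log_views) - min(log_views)       # the same gap on the log scale

print(f"largest / smallest raw count: {raw_ratio:.1f}x")
print(f"largest - smallest log count: {log_range:.2f}")
```

On the log scale, a multiplicative spike becomes a modest additive offset, which is why the transformed series looks smoother.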
Using tensorflow_probability, posterior samples are generated in order to estimate probability distributions for the model parameters in the pre- and post-COVID periods. A posterior distribution assigns probabilities to the possible values of a parameter after the data has been observed: it is proportional to the prior probability multiplied by the likelihood function, which measures how well each candidate parameter value explains the observed data.
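To make the "prior multiplied by likelihood" idea concrete, here is a minimal grid-based sketch in plain Python. It assumes a Poisson likelihood with an Exponential(1) prior on the rate, which is the setup used in the TensorFlow switchpoint tutorial; the article does not restate its own priors, and the counts below are invented for illustration.

```python
import math

# Hypothetical observed counts, invented for illustration.
counts = [3, 4, 2, 5, 3]

# Candidate values for the Poisson rate, on a grid from 0.01 to 10.
rates = [0.01 * i for i in range(1, 1001)]

def log_prior(rate):
    # Exponential(1) prior: the log density is -rate.
    return -rate

def log_likelihood(rate):
    # Sum of Poisson log-probabilities of the observed counts.
    return sum(k * math.log(rate) - rate - math.lgamma(k + 1) for k in counts)

# Posterior is proportional to prior times likelihood (a sum on the log scale).
log_post = [log_prior(r) + log_likelihood(r) for r in rates]
peak = max(log_post)
weights = [math.exp(lp - peak) for lp in log_post]
total = sum(weights)
posterior = [w / total for w in weights]   # normalised over the grid

mode = rates[posterior.index(max(posterior))]
print(f"posterior mode of the rate: {mode:.2f}")
```

A grid like this only works for one or two parameters, which is one reason sampling methods such as MCMC are used for the full model.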
Methodology
Probabilities are calculated for both the switch model and sigmoid model.
# model_switch, model_sigmoid and brexit_data are assumed to be defined earlier.
def target_log_prob_fn(model, s, e, l):
    return model.log_prob(s=s, e=e, l=l, d_t=brexit_data)

models = [model_switch, model_sigmoid]
print([target_log_prob_fn(m, 40., 3., .9).numpy() for m in models])  # Somewhat likely result
print([target_log_prob_fn(m, 60., 1., 5.).numpy() for m in models])  # Rather unlikely result
# A negative switchpoint lies outside the allowed range, hence "impossible".
print([target_log_prob_fn(m, -10., 1., 1.).numpy() for m in models])  # Impossible result
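The article does not show how model_switch and model_sigmoid are built. As a hedged sketch, the two rate functions below follow the form used in the TensorFlow switchpoint tutorial, written in plain Python; the function names are hypothetical, and the parameter values are simply the switch-model medians reported later in this article.

```python
import math

def rate_switch(t, s, l, e):
    """Abrupt change: rate e before the switchpoint s, rate l from s onwards."""
    return e if t < s else l

def rate_sigmoid(t, s, l, e):
    """Smooth change: moves from e to l along a sigmoid centred on s."""
    return e + (l - e) / (1.0 + math.exp(-(t - s)))

# Illustrative parameter values only.
s, e, l = 257.0, 9.19, 8.35
for t in (0, 250, 257, 264, 400):
    print(t, rate_switch(t, s, l, e), round(rate_sigmoid(t, s, l, e), 3))
```

The switch version jumps from one rate to the other at s, while the sigmoid version passes through the midpoint of the two rates at s and approaches each rate a few days either side.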
Switchpoint Analysis
This analysis is based on the assumption that a switchpoint is present in the time series, i.e. a point at which there is a significant “shift” in the parameters of the time series.
Essentially, the purpose of this model is to test the following hypothesis: is there a significant decline in page views for the term “Brexit” at the time when COVID-19 became a mainstream news event in the West?
In the original example from the TensorFlow tutorial referenced above, the switchpoint corresponded to the year in which safety regulations for the UK mining industry changed, after which accidents declined.
For this example, the switchpoint is assumed to be the point at which COVID-19 became mainstream media news and interest in Brexit declined accordingly, as evidenced by a decline in page views for the term “Brexit”.
To do this, a Hamiltonian Monte Carlo (HMC) method is used to sample from the relevant posterior distributions. This post gives much more detail on the HMC method. In brief, each new sample is proposed by simulating Hamilton’s equations (a system of differential equations) using the gradient of the log probability, and the proposal is then accepted or rejected, so that the chain can make large moves across the posterior while still targeting the correct distribution.
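For intuition, here is a minimal one-dimensional HMC sketch in plain Python. It is not how tfp.mcmc.HamiltonianMonteCarlo is implemented internally; it assumes a standard normal target and uses illustrative values for step_size and num_leapfrog_steps, chosen only so the toy chain mixes well.

```python
import math
import random

def log_prob(x):
    # Standard normal target, up to an additive constant.
    return -0.5 * x * x

def grad_log_prob(x):
    return -x

def hmc_step(x, step_size, num_leapfrog_steps, rng):
    """One HMC transition: draw momentum, integrate, then accept or reject."""
    p = rng.gauss(0.0, 1.0)
    x_new, p_new = x, p
    # Leapfrog integration of Hamilton's equations.
    p_new += 0.5 * step_size * grad_log_prob(x_new)      # half step in momentum
    for i in range(num_leapfrog_steps):
        x_new += step_size * p_new                        # full step in position
        if i < num_leapfrog_steps - 1:
            p_new += step_size * grad_log_prob(x_new)     # full step in momentum
    p_new += 0.5 * step_size * grad_log_prob(x_new)      # final half step
    # Metropolis correction based on the change in total energy.
    energy_old = -log_prob(x) + 0.5 * p * p
    energy_new = -log_prob(x_new) + 0.5 * p_new * p_new
    if rng.random() < math.exp(min(0.0, energy_old - energy_new)):
        return x_new
    return x

rng = random.Random(0)
x, draws = 0.0, []
for _ in range(5000):
    x = hmc_step(x, step_size=0.5, num_leapfrog_steps=5, rng=rng)
    draws.append(x)

mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(f"sample mean {mean:.2f}, sample variance {var:.2f}")
```

The sample mean and variance should land close to 0 and 1, the moments of the target distribution.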
Burn-In
Burn-in describes the practice of discarding a number of iterations at the start of an MCMC run. There is considerable debate as to whether burn-in should be used at all; here is a useful link that provides some interesting insights on the topic.
The most common reason for using burn-in is to discard the early draws of the chain, which may be unrepresentative of the target distribution because the sampler has not yet moved away from its starting values.
As an example, let’s take a look at the time series once again.
Up until somewhere between t=50 and t=100, page views for the term “Brexit” are considerably lower than in the subsequent observations.
In this regard, a case could potentially be made for using a burn-in of 100. Note, however, that burn-in steps count sampler iterations rather than observations in the time series, so the plot offers only a loose motivation for this value. In this example, burn-in will be set to both 0 and 100 to determine whether any meaningful differences exist between the generated posterior distributions.
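The effect of burn-in is easiest to see on a chain that starts far from its target. The sketch below is a toy random-walk Metropolis sampler in plain Python, not the HMC kernel used in this article; the starting value of 50, the chain length and the 500-step burn-in are all arbitrary choices for illustration.

```python
import math
import random

def metropolis_chain(start, num_steps, rng):
    """Random-walk Metropolis targeting a standard normal distribution."""
    x = start
    chain = []
    for _ in range(num_steps):
        proposal = x + rng.gauss(0.0, 1.0)
        # Log acceptance ratio for a standard normal target.
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        chain.append(x)
    return chain

rng = random.Random(0)
chain = metropolis_chain(start=50.0, num_steps=2000, rng=rng)

num_burnin_steps = 500
kept = chain[num_burnin_steps:]   # discard the early draws made while approaching the target

mean_all = sum(chain) / len(chain)
mean_kept = sum(kept) / len(kept)
print(f"mean with no burn-in: {mean_all:.2f}")
print(f"mean after burn-in:   {mean_kept:.2f}")
```

When the chain starts close to the bulk of the target, as it may well do in the Brexit example, the two means differ very little, which is the kind of comparison carried out in the rest of this article.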
The two states here are defined as pre-COVID and post-COVID. 10,000 samples are generated with 0 burn-in steps.
import tensorflow as tf
import tensorflow_probability as tfp

tfb = tfp.bijectors

num_results = 10000
num_burnin_steps = 0

# Newer TensorFlow releases name the XLA flag jit_compile instead of experimental_compile.
@tf.function(autograph=False, experimental_compile=True)
def make_chain(target_log_prob_fn):
    kernel = tfp.mcmc.TransformedTransitionKernel(
        inner_kernel=tfp.mcmc.HamiltonianMonteCarlo(
            target_log_prob_fn=target_log_prob_fn,
            step_size=0.05,
            num_leapfrog_steps=3),
        bijector=[
            # Constrain the switchpoint to (0, len(date)) and both rates to be positive.
            tfb.Sigmoid(low=0., high=tf.cast(len(date), dtype=tf.float32)),
            tfb.Softplus(),
            tfb.Softplus(),
        ])
    # With num_burnin_steps = 0 this gives zero adaptation steps, so step_size stays fixed.
    kernel = tfp.mcmc.SimpleStepSizeAdaptation(
        inner_kernel=kernel,
        num_adaptation_steps=int(0.8*num_burnin_steps))
    states = tfp.mcmc.sample_chain(
        num_results=num_results,
        num_burnin_steps=num_burnin_steps,
        current_state=[
            # The three latent variables
            tf.ones([], name='init_switchpoint'),
            tf.ones([], name='init_pre_covid_rate'),
            tf.ones([], name='init_post_covid_rate'),
        ],
        trace_fn=None,
        kernel=kernel)
    return states

switch_samples = [s.numpy() for s in make_chain(
    lambda *args: target_log_prob_fn(model_switch, *args))]
sigmoid_samples = [s.numpy() for s in make_chain(
    lambda *args: target_log_prob_fn(model_sigmoid, *args))]

# Regroup the draws by variable: each name holds a (switch, sigmoid) pair of sample arrays.
switchpoint, pre_covid_rate, post_covid_rate = zip(
    switch_samples, sigmoid_samples)
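The bijector list in the code above lets the sampler work in an unconstrained space while the model only ever sees valid parameter values. As a hedged sketch of the two mappings, here are plain-Python equivalents of the Sigmoid and Softplus transforms; the function names are hypothetical, and n_days is a made-up stand-in for len(date).

```python
import math

def sigmoid_to_interval(u, low, high):
    """Map any real u into the open interval (low, high)."""
    return low + (high - low) / (1.0 + math.exp(-u))

def softplus(u):
    """Map any real u to a positive number, in a numerically stable form."""
    return max(u, 0.0) + math.log1p(math.exp(-abs(u)))

n_days = 400.0  # hypothetical stand-in for len(date)

for u in (-5.0, 0.0, 5.0):
    print(u, round(sigmoid_to_interval(u, 0.0, n_days), 2), round(softplus(u), 4))
```

Whatever real value the sampler proposes, the switchpoint stays inside the observed date range and both rates stay positive, so the log probability is never evaluated at an invalid point.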
Posterior Distributions: Burn-in = 0
The posterior distributions for the two states are generated:
import matplotlib.pyplot as plt
import numpy as np

def _desc(v):
    # Median and 95% interval of the samples, rounded for the plot title.
    return '(median: {}; 95%ile CI: $[{}, {}]$)'.format(
        *np.round(np.percentile(v, [50, 2.5, 97.5]), 2))

for t, v in [
    ('Pre-COVID ($e$) posterior samples', pre_covid_rate),
    ('Post-COVID ($l$) posterior samples', post_covid_rate),
    ('Switch point ($s$) posterior samples', date[0] + switchpoint),
]:
    fig, ax = plt.subplots(nrows=1, ncols=2, sharex=True)
    # One histogram per model: index 0 is the switch model, index 1 the sigmoid model.
    for (m, i) in (('Switch', 0), ('Sigmoid', 1)):
        a = ax[i]
        a.hist(v[i], bins=50)
        a.axvline(x=np.percentile(v[i], 50), color='k')
        a.axvline(x=np.percentile(v[i], 2.5), color='k', ls='dashed', alpha=.5)
        a.axvline(x=np.percentile(v[i], 97.5), color='k', ls='dashed', alpha=.5)
        a.set_title(m + ' model ' + _desc(v[i]))
    fig.suptitle(t)
    plt.show()
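The summary printed in each plot title is just a median and a central 95% interval of the posterior draws. The sketch below reproduces that calculation with the standard library on made-up draws; the samples list is hypothetical and merely stands in for the MCMC output.

```python
import random
import statistics

rng = random.Random(0)
# Hypothetical posterior draws; in the article these come from the MCMC chain.
samples = [rng.gauss(9.0, 0.1) for _ in range(10000)]

# n=40 splits the data into 2.5% steps, so cut points 1, 20 and 39
# are the 2.5th, 50th and 97.5th percentiles.
cuts = statistics.quantiles(samples, n=40, method="inclusive")
lower, median, upper = cuts[0], cuts[19], cuts[38]

print(f"median: {median:.2f}; 95% interval: [{lower:.2f}, {upper:.2f}]")
```

The interval runs between the 2.5th and 97.5th percentiles, so 95% of the draws fall inside it.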
Here are the “pre-COVID” posterior samples:
Here are the “post-COVID” posterior samples:
Additionally, here is an analysis of the switchpoint:
It is observed that:

The median for the pre-COVID posterior distribution is 9.19 for the switch model and 9.05 for the sigmoid model, while the median for the post-COVID posterior distribution is lower at 8.35 for the switch model and 8.49 for the sigmoid model.

For the switchpoint analysis, the median of the distribution for the switch model is 257.16, which means that the “switch” or drop in Brexit page views occurs around 24th February 2020, or t=257 in the analysis.

The sigmoid model suggests the switch to be slightly earlier, with a median of 244.84, or 11th February 2020. This coincides with the period in which COVID-19 became a mainstream media event, taking attention away from Brexit, with page views subsequently falling.
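The mapping from a switchpoint index to a calendar date can be sketched as follows. The start date of 12 June 2019 is an assumption: it is inferred from the article's own figures (t=257 falling on 24th February 2020) and is not stated in the article, and index_to_date is a hypothetical helper.

```python
import datetime

# Assumed first day of the series (t = 0), inferred from the reported
# switchpoint dates rather than stated in the article.
START_DATE = datetime.date(2019, 6, 12)

def index_to_date(t):
    """Convert a (possibly fractional) day index into a calendar date."""
    return START_DATE + datetime.timedelta(days=int(t))

print(index_to_date(257.16))  # switch model median
print(index_to_date(244.84))  # sigmoid model median
```

Under that assumed start date, the two medians map to 24th and 11th February 2020, matching the dates quoted above.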
Posterior Distributions: Burn-in = 100
The same analysis is run, but num_burnin_steps is now set to 100 instead of 0.
Let’s have a look at the posterior distributions in this instance.
Here are the “pre-COVID” posterior samples:
Here are the “post-COVID” posterior samples:
Additionally, here is an analysis of the switchpoint:
We see that the median readings as well as the indicated switchpoint are similar to those obtained when burn-in was set to 0, indicating that changing this setting has made little difference to our analysis: both instances have identified a drop in median page views for the term “Brexit”, with a similar switchpoint indicated.
Of course, the model cannot definitively tell us whether this switchpoint is directly due to COVID-19 becoming mainstream news and detracting from the Brexit issue. While there is strong circumstantial evidence to support this, the drop in page views for Brexit could still be due to other reasons.
Conclusion
This has been an introduction to Bayesian switchpoint analysis using tensorflow_probability.
We have seen:
 The usefulness of posterior distributions in working with time series data
 Why switchpoint identification can be useful in time series analysis
 Use of “burn-in” when running a Markov chain simulation
Many thanks for reading. The associated GitHub repository for this example is accessible here.
Hope you found this useful, and any questions or feedback appreciated!