
# Cumulative Binomial Probability Analysis with R and Shiny

In probability analysis, two variables determine the chance of an event happening: N (the number of trials or observations) and λ (lambda, our hit rate or chance of occurrence in a single trial or interval). A cumulative binomial probability is the probability that the event occurs at least once across N independent trials; the greater the number of trials, the higher that overall probability.

**probability = 1 – (1 – λ)^N**

For instance, the probability of rolling a 6 on a fair die is 1/6 (about 0.1667). However, suppose that same die is rolled 10 times:

**1 – (1 – 1/6)^10 ≈ 0.8385**

We see that the probability of rolling at least one 6 across the 10 rolls is now about 83.85%.
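As a quick check, the die calculation can be reproduced in R. This is a minimal sketch; the variable names `lambda`, `n_rolls`, `p_at_least_one`, and `p_check` are chosen here for illustration and are not part of the original script. The cross-check uses base R's `pbinom()` to get the probability of more than zero successes.

```r
# Probability of rolling a 6 on a single roll of a fair die
lambda <- 1 / 6
# Number of rolls (trials)
n_rolls <- 10

# Probability of at least one 6 across all rolls: 1 - P(no 6 on any roll)
p_at_least_one <- 1 - (1 - lambda)^n_rolls

# Cross-check with the binomial distribution: P(X > 0) for X ~ Binomial(10, 1/6)
p_check <- pbinom(0, size = n_rolls, prob = lambda, lower.tail = FALSE)

print(round(p_at_least_one, 4))
print(round(p_check, 4))
```

Both values should agree, since "at least one success" is the complement of "zero successes".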

Because each additional trial is another chance for the event to occur, the larger the number of trials, the larger the probability that the event happens at least once, even if the probability within a single trial is very low. So, let us generate cumulative binomial probabilities to demonstrate how the probability increases with the number of trials.

Firstly, we define a function and evaluate it with probabilities set at 2%, 4%, and 6%, for trials from 0 up to 100:

```
par(bg = '#191661', fg = '#ffffff', col.main = '#ffffff', col.lab = '#ffffff', col.axis = '#ffffff')
#lambda = probability of event occurring in a single trial
#powers = number of trials
#mu = overall probability given n number of trials
muCalculation <- function(lambda, powers) {1 - ((1 - lambda)^powers)}
probability_at_lambda <- sapply(c(0.02, 0.04, 0.06), muCalculation, seq(0, 100, 1))
```

Then, we can set up our data as a data frame and plot as normal:

```
probability_at_lambdadf=data.frame(probability_at_lambda)
col_headings <- c("probability1","probability2","probability3")
names(probability_at_lambdadf) <- col_headings
probability_at_lambdadf
# x-axis values: row i of the data frame corresponds to N = i - 1 trials
trials <- seq(0, 100, 1)
plot(trials, probability_at_lambdadf$probability1, type="o", col="#b1aef4", xlab="N", ylab="Probability", xlim=c(0, 100), ylim=c(0.0, 1.0), pch=19)
# lines() adds to the existing plot, so axis labels and limits are not repeated
lines(trials, probability_at_lambdadf$probability2, type="o", col="red", pch=19)
lines(trials, probability_at_lambdadf$probability3, type="o", col="green", pch=19)
title(main="Probability Chart")
grid(nx = NULL, ny = NULL, col = "lightgray", lty = "dotted",
lwd = par("lwd"), equilogs = TRUE)
legend("bottomright", legend=c("probability_at_lambda_1","probability_at_lambda_2", "probability_at_lambda_3"), cex=0.6, col=c("#b1aef4","red","green"), pch=19, lty=1)
proc.time()
```

### Sample Table

Here is a sample table with the calculated probabilities (probability_at_lambdadf):
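One way to print a few rows of that table yourself is sketched below. It rebuilds the data frame so that it runs on its own; the extra `N` column and the names `sample_rows` and `sample_table` are assumptions added for illustration, not part of the original script.

```r
# Same function and inputs as in the script above
muCalculation <- function(lambda, powers) {1 - ((1 - lambda)^powers)}
trials <- seq(0, 100, 1)
probability_at_lambdadf <- data.frame(sapply(c(0.02, 0.04, 0.06), muCalculation, trials))
names(probability_at_lambdadf) <- c("probability1", "probability2", "probability3")

# Add the trial count as an explicit column (hypothetical column name "N")
sample_rows <- cbind(N = trials, probability_at_lambdadf)

# Keep every 25th trial and round to four decimal places
sample_table <- round(sample_rows[sample_rows$N %% 25 == 0, ], 4)
print(sample_table)
```

Each row shows the probability of at least one occurrence after N trials for the 2%, 4%, and 6% single-trial probabilities.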

### Plot

Accordingly, here is a plot of the probabilities:

## Analyse Cumulative Binomial Probability with a Shiny Web Application

This is an example of a Shiny Web application that can calculate cumulative binomial probabilities on the fly.

You’ll remember that our previous R script invoked a function to calculate binomial probabilities based on lambda (the probability of an event happening), and the power value (or number of trials).

The idea is that while the probability of an individual event happening may be low, the cumulative probability of the event happening increases with the number of trials.

**1 – (1 – λ)^N**
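The same formula can be inverted to ask how many trials are needed before the cumulative probability reaches a given target. The sketch below solves 1 – (1 – λ)^N ≥ target for N; the function name `trials_needed` and its arguments are hypothetical and are not part of the original script, and it assumes both λ and the target lie strictly between 0 and 1.

```r
# Smallest number of trials N such that 1 - (1 - lambda)^N >= target.
# Rearranging gives N >= log(1 - target) / log(1 - lambda).
# Assumes 0 < lambda < 1 and 0 < target < 1.
trials_needed <- function(lambda, target) {
  stopifnot(lambda > 0, lambda < 1, target > 0, target < 1)
  ceiling(log(1 - target) / log(1 - lambda))
}

# Trials needed for a 2% event to have at least a 50% chance of occurring
n_half <- trials_needed(0.02, 0.5)
print(n_half)
```

Reading the answer off the probability curve should give the same value: it is the first N at which the 2% curve crosses 0.5.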

Here is an example of a Shiny Web App that allows us to manipulate the lambda values using a set of sliders and automatically update the probability curve.

To run this app, open RStudio and click File -> New File -> Shiny Web App, then select either Single File to keep the ui.R and server.R code together in one file, or Multiple File to paste them into separate files.

Additionally, if you are new to Shiny you can find my full tutorial on Sitepoint that describes how to build and run a Shiny app from scratch.

A few points when setting up the UI (User Interface):

- **lambda** represents the probability of an event occurring in a single trial
- The slider input allows the user to set different values for lambda based on the associated probability
- The plot is then output under the name “ProbPlot”

### ui.R

```
library(shiny)
# Define UI for application that draws a probability plot
shinyUI(fluidPage(
  # Application title
  titlePanel("Cumulative Binomial Probability Plot"),
  # Sidebar with a slider input for each value of lambda
  sidebarLayout(
    sidebarPanel(
      sliderInput("lambda",
                  "Probability 1:",
                  min = 0,
                  max = 1,
                  value = 0.01),
      sliderInput("lambda2",
                  "Probability 2:",
                  min = 0,
                  max = 1,
                  value = 0.01),
      sliderInput("lambda3",
                  "Probability 3:",
                  min = 0,
                  max = 1,
                  value = 0.01)
    ),
    # Show the generated probability plot
    mainPanel(
      plotOutput("ProbPlot")
    )
  )
))
```

Now, we set up the server - this is the part that takes the inputs and calculates the output that is eventually shown in the UI.

- The lambda values represent the inputs that we defined in the UI; i.e. the user sets the probability from the slider.
- The probability function is defined: {1 - ((1 - lambda)^powers)}
- The separate probability arrays are then calculated (probability_at_lambda, probability_at_lambda2, probability_at_lambda3)
- The probability is then plotted.
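The calculation steps listed above can be tried outside Shiny before wiring them into the server. The following is a minimal sketch under stated assumptions: `build_curves` is a hypothetical helper (it does not appear in the original app) that takes slider-style lambda values, checks that they lie between 0 and 1, and returns one probability column per lambda.

```r
# Hypothetical helper (not part of the original app): builds one probability
# curve per lambda value, without needing Shiny.
build_curves <- function(lambdas, max_trials = 100) {
  # Slider values are expected to be probabilities in [0, 1]
  stopifnot(is.numeric(lambdas), all(lambdas >= 0), all(lambdas <= 1))
  trials <- seq(0, max_trials, 1)
  # Same probability function as in the article
  muCalculation <- function(lambda, powers) {1 - ((1 - lambda)^powers)}
  curves <- sapply(lambdas, muCalculation, trials)
  colnames(curves) <- paste0("probability_at_lambda", seq_along(lambdas))
  data.frame(N = trials, curves)
}

# The three sliders all start at 0.01 in the UI
curves <- build_curves(c(0.01, 0.01, 0.01))
print(head(curves, 3))
```

Keeping the calculation in a plain function like this makes it easy to check the numbers separately from the plotting code.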

### server.R

```
library(shiny)
# Shiny Application
shinyServer(function(input, output) {
  # renderPlot re-runs automatically whenever a slider value changes
  output$ProbPlot <- renderPlot({
    # number of trials, N = 0 to 100
    trials <- seq(0, 100, 1)
    # overall probability given N trials, using lambda from the ui.R sliders
    muCalculation <- function(lambda, powers) {1 - ((1 - lambda)^powers)}
    probability_at_lambda <- muCalculation(input$lambda, trials)
    probability_at_lambda2 <- muCalculation(input$lambda2, trials)
    probability_at_lambda3 <- muCalculation(input$lambda3, trials)
    # draw the probability
    par(bg = '#191661', fg = '#ffffff', col.main = '#ffffff', col.lab = '#ffffff', col.axis = '#ffffff')
    plot(trials, probability_at_lambda, type="o", col="#b1aef4", xlab="N", ylab="Probability", xlim=c(0, 100), ylim=c(0.0, 1.0), pch=19)
    # lines() adds to the existing plot, so axis labels and limits are not repeated
    lines(trials, probability_at_lambda2, type="o", col="red", pch=19)
    lines(trials, probability_at_lambda3, type="o", col="green", pch=19)
    title(main="Cumulative Binomial Probability")
  })
})
```

## Conclusion

Today, you have learned how to:

- Generate a cumulative binomial probability distribution using R
- Use Shiny to visualise cumulative binomial probability