10 Binomial random variables

10.1 Learning objectives

  • Define and understand Binomial random variables.

10.2 Binomial random variable

10.2.1 Definition

Consider a fixed number \(n\) of Bernoulli trials conducted with success probability \(P(S)=p\) and failure probability \(P(F)=q=1-p\) in each trial. Define the random variable \(X\) as follows:

\(X =\) the number of Successes in \(n\) of the above Bernoulli trials.

Then \(X\) is called a binomial random variable, and we say that \(X\) follows a binomial distribution with \(n\) trials and success probability \(p\). The binomial distribution is completely defined by two parameters, \(n\) and \(p\), and we can write:

\(X \sim Binomial(n,p)\)
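To make the definition concrete, here is a minimal R sketch (an added illustration, not from the original text; the values \(n=10\) and \(p=0.3\) are arbitrary) showing that a binomial random variable is just a sum of Bernoulli trials. It uses base R's `rbinom()`:

```r
# Simulate X ~ Binomial(n = 10, p = 0.3) as a sum of Bernoulli trials
set.seed(42)   # for reproducibility
n <- 10        # number of trials
p <- 0.3       # success probability in each trial

trials <- rbinom(n, size = 1, prob = p)  # n Bernoulli(p) trials (0 = F, 1 = S)
X <- sum(trials)                         # X = number of Successes

# Equivalently, R can draw X directly from the binomial distribution:
X_direct <- rbinom(1, size = n, prob = p)
```

Drawing with `size = n` directly is what you would normally do in practice; the sum-of-Bernoullis version is just to mirror the definition above.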

10.2.2 Probability distribution

10.2.2.1 Population parameters: n=2, p=0.5

Define the random variable \(X\) as follows:

\(X\) = the number of heads in \(2\) flips of a fair coin.

Then \(X \sim Binomial(n=2, p=0.5)\)

To obtain the probability distribution, we must compute the probability of every possible outcome that \(X\) can produce. With \(n=2\) flips, it is easy to see that there are 3 possible outcomes (0, 1, or 2 heads) which are obtained with the following possible events and corresponding probabilities:

\[\begin{align} P(X=0) &= P(TT) \\ &= P(T)P(T) \\ &= 0.5 \times 0.5 \\ &= 0.25 \\\\ P(X=1) &= P(TH) + P(HT) \\ &= P(T)P(H) + P(H)P(T) \\ &= 0.5 \times 0.5 + 0.5 \times 0.5 \\ &= 0.25 + 0.25 \\ &= 0.5 \\\\ P(X=2) &= P(HH) \\ &= P(H)P(H) \\ &= 0.5 \times 0.5 \\ &= 0.25 \end{align}\]

Above, we were able to write \(P(TT)=P(T)P(T)\) because coin flips are statistically independent.

In simple cases like this it’s no big deal to compute the probabilities of all possible events by hand, but this approach quickly becomes intractable in real-world situations. Luckily, R provides functions for essentially all the probability distributions you will likely ever find yourself using. In the case of a binomial distribution, we will use the dbinom() function as illustrated below.
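For the coin-flip example above, `dbinom(x, size, prob)` returns \(P(X=x)\) for \(X \sim Binomial(size, prob)\), reproducing the hand computations:

```r
# P(X = x) for X ~ Binomial(n = 2, p = 0.5), i.e. heads in 2 fair coin flips
dbinom(0, size = 2, prob = 0.5)  # P(X = 0) = 0.25
dbinom(1, size = 2, prob = 0.5)  # P(X = 1) = 0.50
dbinom(2, size = 2, prob = 0.5)  # P(X = 2) = 0.25

# Or the whole probability distribution at once, by passing a vector:
dbinom(0:2, size = 2, prob = 0.5)  # 0.25 0.50 0.25
```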

10.2.2.2 Population parameters: function of n and p

Now let’s run through this exercise with different \(n\) and \(p\) values. Below, notice that when \(p=0.1\), most of the probability sits at small outcomes and the distribution is skewed to the right (its long tail extends toward larger values). This makes sense because if the probability of success is very small, then the probability of achieving zero or only a few successes is high. Similarly, notice that when \(p=0.9\), the distribution is skewed to the left. The same logic applies here: if the probability of success is very high, then we should expect many successes, and most of the probability will correspond to larger outcomes.
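We can check this intuition numerically with dbinom() (a quick sketch added here, taking \(n=10\) for illustration):

```r
# Compare X ~ Binomial(10, 0.1) with X ~ Binomial(10, 0.9)
n <- 10
probs_low  <- dbinom(0:n, size = n, prob = 0.1)  # p = 0.1: mass near 0
probs_high <- dbinom(0:n, size = n, prob = 0.9)  # p = 0.9: mass near n

round(probs_low, 3)   # largest probabilities at small x
round(probs_high, 3)  # largest probabilities at large x

# The two cases mirror each other, since swapping Success and Failure
# turns k successes out of n into n - k: P(X = k | p) = P(X = n - k | 1 - p)
all.equal(probs_low, rev(probs_high))  # TRUE
```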