
## Binomial Distribution

### Probability Mass Function

The binomial distribution is used when each trial has exactly two mutually exclusive outcomes, conventionally labeled "success" and "failure". It gives the probability of observing x successes in n trials, where p denotes the probability of success on a single trial. The binomial distribution assumes that the trials are independent and that p is fixed for all trials.

The formula for the binomial probability mass function is

$$P(x;p,n) = \left( \begin{array}{c} n \\ x \end{array} \right) (p)^{x}(1 - p)^{(n-x)} \;\;\;\;\;\; \mbox{for } x = 0, 1, 2, \cdots , n$$

where

$$\left( \begin{array}{c} n \\ x \end{array} \right) = \frac{n!} {x!(n-x)! }$$
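
The formula above can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function name `binom_pmf` and the example values (n = 10, p = 0.5) are illustrative choices, not part of the handbook text.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n trials, P(x; p, n)."""
    if not 0 <= x <= n:
        return 0.0  # outside the support 0, 1, ..., n
    # comb(n, x) is the binomial coefficient n! / (x! (n - x)!)
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Illustrative values: 3 successes in 10 trials with p = 0.5
prob = binom_pmf(3, 10, 0.5)  # 120 / 1024 = 0.1171875

# The probabilities over the whole support should sum to 1 (up to rounding)
total = sum(binom_pmf(x, 10, 0.5) for x in range(11))
```

For large n, the direct product can underflow or lose precision; working with logarithms is a common alternative in that case.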

Figure: binomial probability mass function for four values of p and n = 100.

### Cumulative Distribution Function

The formula for the binomial cumulative distribution function is

$$F(x;p,n) = \sum_{i=0}^{x}{\left( \begin{array}{c} n \\ i \end{array} \right) (p)^{i}(1 - p)^{(n-i)}}$$
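
As a rough illustration, the sum can be computed term by term. This sketch assumes small to moderate n, where direct summation is accurate enough; the name `binom_cdf` and the example values are hypothetical.

```python
from math import comb


def binom_cdf(x: int, n: int, p: float) -> float:
    """F(x; p, n): probability of at most x successes in n trials."""
    if x < 0:
        return 0.0
    if x >= n:
        return 1.0
    # Sum the probability mass function from i = 0 up to and including x
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x + 1))


# Illustrative values: at most 3 successes in 10 trials with p = 0.5
cdf_value = binom_cdf(3, 10, 0.5)  # (1 + 10 + 45 + 120) / 1024 = 0.171875
```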

Figure: binomial cumulative distribution function with the same values of p as the probability mass function plots above.

### Percent Point Function

The binomial percent point function does not exist in simple closed form; it is computed numerically. Because the distribution is discrete and defined only for integer values of x, the percent point function is a step function rather than a smooth curve, as it typically is for a continuous distribution.
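
One simple way to compute the percent point function numerically is to accumulate the probability mass function until the requested probability is reached. The sketch below assumes the common convention that the percent point at q is the smallest integer x whose cumulative probability is at least q; the handbook text does not state a convention, and the name `binom_ppf` is illustrative.

```python
from math import comb


def binom_ppf(q: float, n: int, p: float) -> int:
    """Smallest integer x with F(x; p, n) >= q (assumed convention)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    cumulative = 0.0
    for x in range(n + 1):
        cumulative += comb(n, x) * p**x * (1 - p) ** (n - x)
        if cumulative >= q:
            return x
    # Reached only if rounding keeps the running sum just below q
    return n


# Illustrative values: median of a binomial with n = 10, p = 0.5
median = binom_ppf(0.5, 10, 0.5)
```

Because the result jumps from one integer to the next, many different values of q map to the same x, which is the step behavior described above.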

Figure: binomial percent point function with the same values of p as the probability mass function plots above.

### Common Statistics

| Statistic | Value |
|---|---|
| Mean | $$np$$ |
| Mode | $$p(n + 1) - 1 \le x \le p(n + 1)$$ |
| Range | 0 to n |
| Standard Deviation | $$\sqrt{np(1 - p)}$$ |
| Coefficient of Variation | $$\sqrt{\frac{(1-p)} {np}}$$ |
| Skewness | $$\frac{(1-2p)} {\sqrt{np(1 - p)}}$$ |
| Kurtosis | $$3 - \frac{6} {n} + \frac{1} {np(1 - p)}$$ |
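
The closed-form statistics can be checked against moments computed directly from the probability mass function. This is a sketch under illustrative assumptions: the function names and the values n = 20, p = 0.3 are hypothetical, and the kurtosis here is the non-excess form (equal to 3 for a normal distribution), matching the formula listed above.

```python
from math import comb, sqrt


def binom_summary(n: int, p: float) -> dict:
    """Closed-form summary statistics of the binomial distribution."""
    var = n * p * (1 - p)
    return {
        "mean": n * p,
        "sd": sqrt(var),
        "cv": sqrt((1 - p) / (n * p)),
        "skewness": (1 - 2 * p) / sqrt(var),
        "kurtosis": 3 - 6 / n + 1 / var,
    }


def moments_from_pmf(n: int, p: float) -> dict:
    """The same statistics computed by summing over the support 0..n."""
    pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    mean = sum(x * w for x, w in enumerate(pmf))
    var = sum((x - mean) ** 2 * w for x, w in enumerate(pmf))
    third = sum((x - mean) ** 3 * w for x, w in enumerate(pmf))
    fourth = sum((x - mean) ** 4 * w for x, w in enumerate(pmf))
    return {
        "mean": mean,
        "sd": sqrt(var),
        "cv": sqrt(var) / mean,
        "skewness": third / var**1.5,
        "kurtosis": fourth / var**2,
        "mode": pmf.index(max(pmf)),  # most probable number of successes
    }


n_trials, p_success = 20, 0.3
closed_form = binom_summary(n_trials, p_success)
numeric = moments_from_pmf(n_trials, p_success)
```

With these values the two dictionaries should agree to within rounding error, and the numerically found mode should fall inside the interval given for the mode.
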
### Comments

The binomial distribution is probably the most commonly used discrete distribution.

### Parameter Estimation

Given x observed successes in n trials, the maximum likelihood estimator of p (for fixed n) is

$$\tilde{p} = \frac{x} {n}$$
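
The estimator is simply the observed proportion of successes. As a hedged illustration, the sketch below computes it and compares it with a brute-force search over a grid of candidate values of p; the function names, the data (7 successes in 20 trials), and the grid resolution are all hypothetical.

```python
from math import log


def mle_p(x: int, n: int) -> float:
    """Maximum likelihood estimate of p from x successes in n trials."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("require n > 0 and 0 <= x <= n")
    return x / n


def log_likelihood(p: float, x: int, n: int) -> float:
    """Binomial log-likelihood in p, up to an additive constant.

    The binomial coefficient is omitted because it does not depend on p.
    """
    return x * log(p) + (n - x) * log(1 - p)


x_obs, n_trials = 7, 20
p_tilde = mle_p(x_obs, n_trials)

# Brute-force check: maximize the log-likelihood over p = 0.001, ..., 0.999
grid = [i / 1000 for i in range(1, 1000)]
p_best = max(grid, key=lambda p: log_likelihood(p, x_obs, n_trials))
```

The grid maximizer should coincide with x / n whenever x / n lies on the grid, as it does for these values.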

### Software

Most general purpose statistical software programs support at least some of the probability functions for the binomial distribution.