What is the negative binomial distribution formula?
The negative binomial PMF is P(X = k) = C(k-1, r-1) times p^r times (1-p)^(k-r) for k = r, r+1, r+2, ... where k is the trial number of the r-th success, r is the number of successes needed, and p is the per-trial success probability. C(k-1, r-1) is the binomial coefficient counting ways to arrange r-1 successes among the first k-1 trials, with the k-th trial always being a success.
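As a rough illustration of the formula above, here is a minimal Python sketch. The function name nb_pmf is made up for this example and is not part of the calculator; it assumes integer k and r and 0 <= p <= 1.

```python
from math import comb

def nb_pmf(k: int, r: int, p: float) -> float:
    """P(X = k): probability that the r-th success lands on trial k."""
    if k < r:
        return 0.0  # the r-th success cannot occur before trial r
    # C(k-1, r-1) arrangements of the first r-1 successes, then a success on trial k
    return comb(k - 1, r - 1) * p**r * (1 - p) ** (k - r)

# Example: r = 3 successes needed, p = 0.30, third success exactly on trial 10
print(nb_pmf(10, 3, 0.30))  # about 0.0800
```

Direct evaluation like this is fine for small k and r; very large inputs are better handled in log space.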
What are the mean and variance of the negative binomial distribution?
The mean is mu = r/p and the variance is sigma^2 = r(1-p)/p^2, so the standard deviation is sigma = sqrt(r(1-p))/p. For r = 3, p = 0.30: mean = 10 trials, variance = 23.33, SD = 4.83. For the trial count X, variance/mean = (1-p)/p, which exceeds 1 only when p is less than 0.5. For the number of failures X - r (mean r(1-p)/p, same variance), variance/mean = 1/p, which is greater than 1 for every p below 1, so the failure count is always more spread out than a Poisson with the same mean.
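The worked numbers for r = 3, p = 0.30 can be reproduced with a short Python sketch. The helper name nb_moments is hypothetical, chosen only for this example.

```python
from math import sqrt

def nb_moments(r: float, p: float) -> tuple[float, float, float]:
    """Mean, variance and SD of the number of trials until the r-th success."""
    mean = r / p
    variance = r * (1 - p) / p**2
    return mean, variance, sqrt(variance)

mean, variance, sd = nb_moments(3, 0.30)
print(round(mean, 2), round(variance, 2), round(sd, 2))  # 10.0 23.33 4.83
```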
How is the negative binomial different from the binomial distribution?
In the binomial distribution B(n, p), the number of trials n is fixed and the number of successes X is random. In the negative binomial NB(r, p), the number of successes r is fixed and the number of trials X is random. Binomial answers "how many successes in n trials?" while negative binomial answers "how many trials until r successes?" They are complementary views of the same Bernoulli process.
How is the negative binomial related to the geometric distribution?
The geometric distribution is the special case of the negative binomial with r = 1. With r = 1, P(X = k) = C(k-1, 0) times p times (1-p)^(k-1) = p(1-p)^(k-1), which is exactly the geometric PMF. The geometric models trials until the first success; the negative binomial generalizes this to trials until the r-th success. Set r = 1 in this calculator to get geometric probabilities.
What is the cumulative distribution function (CDF) of the negative binomial?
P(X at most k) = sum of P(X = j) for j from r to k. The sum has no elementary closed form, but it equals the regularized incomplete beta function I_p(r, k-r+1). The upper tail is P(X at least k) = 1 - P(X at most k-1). For practical calculation, summing PMF values (as this calculator does) is the most straightforward approach.
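The summation approach can be sketched in a few lines of Python. The names nb_pmf, nb_cdf and nb_upper_tail are illustrative only; this is not the calculator's actual source code.

```python
from math import comb

def nb_pmf(k: int, r: int, p: float) -> float:
    """P(X = k) for the trial on which the r-th success occurs."""
    if k < r:
        return 0.0
    return comb(k - 1, r - 1) * p**r * (1 - p) ** (k - r)

def nb_cdf(k: int, r: int, p: float) -> float:
    """P(X <= k): sum the PMF over j = r, ..., k (empty sum if k < r)."""
    return sum(nb_pmf(j, r, p) for j in range(r, k + 1))

def nb_upper_tail(k: int, r: int, p: float) -> float:
    """P(X >= k) = 1 - P(X <= k-1)."""
    return 1.0 - nb_cdf(k - 1, r, p)

# Chance of reaching 3 successes within 10 trials when p = 0.30
print(nb_cdf(10, 3, 0.30))
```

A useful sanity check is that P(X at most k) equals the probability of at least r successes in k binomial trials, since both describe the same event.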
Why is the distribution called "negative binomial"?
The name comes from the generalized binomial series: (1-x)^(-r) = sum over k=0 to infinity of C(r+k-1, k) x^k. In the failures-before-the-r-th-success parameterization, the PMF is C(r+k-1, k) p^r (1-p)^k, which is p^r times the k-th term of this series with x = 1-p. The series then sums to p^(-r), so the probabilities sum to 1. The "negative" refers to the negative exponent -r in the binomial expansion, distinguishing it from the ordinary binomial (1+x)^n with a positive exponent.
What is the difference between P(X = k) and P(X at most k)?
P(X = k) is the point probability that the r-th success occurs exactly on trial k. P(X at most k) is the cumulative probability that the r-th success occurs on or before trial k; it sums the point probabilities P(X = j) for j from r up to k. P(X at most k) is useful for questions like "what is the chance we achieve r successes within k trials?" while P(X = k) answers "what is the chance it takes exactly k trials?"
Can I use the negative binomial to model overdispersed count data?
Yes. The negative binomial regression model is commonly used for count outcomes where variance exceeds the mean (overdispersion). In this context, the distribution is parameterized differently: as a Poisson-gamma mixture where the Poisson rate itself follows a gamma distribution. The r parameter becomes the "size" or "dispersion" parameter. Common applications include modelling insurance claims, hospital re-admissions, accident counts, and website visits per user.
What is the maximum likelihood estimate of p in the negative binomial?
If you observe n independent negative binomial counts x_1, x_2, ..., x_n, each recording the number of trials needed to reach the same known r successes, the MLE of p is p-hat = r / x-bar, where x-bar is the sample mean of the observed trial counts. This follows from setting the derivative of the log-likelihood with respect to p to zero. When r is also unknown, its MLE has no closed form and usually requires numerical optimization.
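A minimal Python sketch of the estimator p-hat = r / x-bar follows. The function name nb_mle_p and the sample values are invented purely for illustration, and r is assumed to be known.

```python
def nb_mle_p(trial_counts: list[int], r: int) -> float:
    """MLE of p from observed trial counts, each the trials needed for r successes."""
    if not trial_counts:
        raise ValueError("need at least one observation")
    x_bar = sum(trial_counts) / len(trial_counts)  # sample mean of trial counts
    return r / x_bar

# Hypothetical data: five experiments, each run until the 3rd success
observed = [8, 12, 10, 9, 11]
print(nb_mle_p(observed, r=3))  # sample mean is 10, so p-hat is about 0.3
```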
What are the assumptions of the negative binomial distribution?
The negative binomial requires: (1) each trial results in exactly one of two outcomes (success or failure), (2) trials are independent, (3) the probability of success p is constant on every trial, and (4) the experiment continues until exactly r successes are observed. Violations of these assumptions (changing p over time, correlated trials, unknown stopping rule) invalidate the model. Real-world applications should verify these conditions before using the negative binomial.
How does increasing r affect the negative binomial distribution?
Increasing r shifts the distribution to the right (higher mean) and makes it more bell-shaped. Mean = r/p grows linearly with r. Standard deviation = sqrt(r(1-p))/p grows as sqrt(r), so the coefficient of variation (SD/mean) = sqrt((1-p)/r) decreases. Because X is the sum of r independent geometric waiting times, the standardized negative binomial converges to a normal distribution as r approaches infinity, by the central limit theorem. The distribution becomes more symmetric and, relative to its mean, more concentrated.
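To see the shrinking relative spread numerically, the sketch below computes SD/mean directly from the mean and standard deviation for a few values of r. The helper name nb_cv and the choice p = 0.30 are assumptions made for this example.

```python
from math import sqrt

def nb_cv(r: float, p: float) -> float:
    """Coefficient of variation (SD / mean) of the trial count."""
    mean = r / p
    sd = sqrt(r * (1 - p)) / p
    return sd / mean

# The ratio falls as r grows: quadrupling r halves it
for r in (1, 4, 16, 64):
    print(r, round(nb_cv(r, 0.30), 3))
```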
How does this calculator compute probabilities for large k and r?
The calculator uses log-space computation to avoid floating-point overflow in the binomial coefficient and underflow in the power terms. The binomial coefficient C(k-1, r-1) is computed as exp(logGamma(k) - logGamma(r) - logGamma(k-r+1)) using the Lanczos approximation for the log-gamma function. The full log-PMF is log C(k-1,r-1) + r log(p) + (k-r) log(1-p), which is exponentiated only at the end. This keeps the calculation stable when k and r are in the thousands.
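The log-space idea can be sketched in Python as follows. This is an illustration rather than the calculator's own code: it uses the standard library's math.lgamma as a stand-in for a Lanczos-based logGamma, the function names are hypothetical, and it assumes 0 < p < 1.

```python
from math import exp, lgamma, log

def nb_log_pmf(k: int, r: int, p: float) -> float:
    """log P(X = k), computed without forming the huge binomial coefficient."""
    if not 0.0 < p < 1.0:
        raise ValueError("this sketch assumes 0 < p < 1")
    if k < r:
        return float("-inf")  # impossible event, log(0)
    # log C(k-1, r-1) = logGamma(k) - logGamma(r) - logGamma(k-r+1)
    log_coeff = lgamma(k) - lgamma(r) - lgamma(k - r + 1)
    return log_coeff + r * log(p) + (k - r) * log(1 - p)

def nb_pmf(k: int, r: int, p: float) -> float:
    """P(X = k), exponentiating only at the very end."""
    return exp(nb_log_pmf(k, r, p))

# A case where the coefficient alone would overflow a double
print(nb_pmf(5000, 1500, 0.30))
```

For small inputs the result should agree with the direct formula up to rounding error, which makes a convenient check.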