Sampling Distribution of the Sample Proportion Calculator
Find probabilities, z-scores, and standard error for the sampling distribution of a sample proportion p-hat given population proportion p and sample size n.
📊 What is the Sampling Distribution of the Sample Proportion?
The sampling distribution of the sample proportion is the probability distribution of all possible values of the sample proportion p-hat that could result from drawing random samples of size n from a population where the true proportion is p. Every time you survey a random sample and compute the fraction of respondents with a certain characteristic, you obtain one observation from this sampling distribution.
Three key properties define the distribution: (1) the mean of p-hat equals the population proportion p, meaning the sample proportion is an unbiased estimator; (2) the standard error (standard deviation of p-hat) is SE = sqrt(p*(1-p)/n), which shrinks as n grows; (3) by the Central Limit Theorem, the distribution is approximately normal when both np and n*(1-p) are at least 10. These properties underpin confidence intervals, hypothesis tests about proportions, and survey margin-of-error calculations that appear in political polling, clinical trials, quality control audits, and A/B testing.
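As a minimal sketch of the three properties above, the following Python snippet computes the mean, the standard error, and the np / n*(1-p) check for a given p and n. The function name `proportion_sampling_summary` and the returned keys are illustrative choices, not part of this calculator.

```python
import math

def proportion_sampling_summary(p: float, n: int) -> dict:
    """Mean, standard error, and normal-approximation check for p-hat.

    Hypothetical helper for illustration only.
    """
    if not 0 <= p <= 1 or n <= 0:
        raise ValueError("need 0 <= p <= 1 and n > 0")
    se = math.sqrt(p * (1 - p) / n)  # SE = sqrt(p*(1-p)/n)
    return {
        "mean": p,  # p-hat is unbiased: E[p-hat] = p
        "se": se,
        # Rule of thumb from the text: both np and n*(1-p) at least 10
        "normal_ok": n * p >= 10 and n * (1 - p) >= 10,
    }

summary = proportion_sampling_summary(0.5, 100)
print(summary)
```

With p = 0.5 and n = 100 the standard error comes out to 0.05, and both np and n*(1-p) equal 50, so the normality check passes.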
A common misconception is that the sampling distribution describes individual observations. It does not. If you survey 100 voters, the sampling distribution tells you how the fraction p-hat (e.g., 0.52 for 52 respondents out of 100 preferring candidate A) would vary across many hypothetical repetitions of the same survey. A single survey gives one p-hat; the sampling distribution characterizes all the p-hat values you would get if you repeated the survey thousands of times.
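The idea of repeating the same survey many times can be made concrete with a small simulation. The sketch below assumes a hypothetical true proportion of p = 0.5, a survey size of 100, and 5,000 repetitions; these numbers and the function name `simulate_p_hats` are illustrative assumptions, not values used by this calculator.

```python
import random
import statistics

def simulate_p_hats(p, n, repetitions, seed=0):
    """Draw `repetitions` surveys of size n and return each survey's p-hat."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        sum(rng.random() < p for _ in range(n)) / n  # fraction of "yes" answers
        for _ in range(repetitions)
    ]

p_hats = simulate_p_hats(p=0.5, n=100, repetitions=5000)

# The simulated p-hat values should center near p with spread near
# sqrt(p*(1-p)/n) = 0.05.
print("mean of p-hat:", statistics.mean(p_hats))
print("std dev of p-hat:", statistics.stdev(p_hats))
```

Each element of `p_hats` is one survey's result; the list as a whole approximates the sampling distribution.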
Another important point is that the normal approximation improves with larger n but is not exact. For small samples or extreme proportions (p near 0 or 1), the exact binomial distribution is more appropriate. When np or n*(1-p) falls below 10, this calculator displays a caution note. For rigorous small-sample inference about a single proportion, use an exact binomial test instead (Fisher's exact test applies when comparing two proportions). For typical survey research with n above 100 and p between 0.1 and 0.9, the normal approximation is generally accurate, with the largest discrepancies occurring when np or n*(1-p) is close to 10.
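To see how the approximation behaves in practice, one option is to compare it with the exact binomial tail probability. The sketch below is an illustration under assumed inputs (the two test cases and both function names are hypothetical); it uses no continuity correction, so some difference from the exact value is expected even for large n.

```python
import math
from statistics import NormalDist

def exact_upper_tail(p, n, k):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )

def normal_upper_tail(p, n, k):
    """Normal approximation to P(p-hat >= k/n), no continuity correction."""
    se = math.sqrt(p * (1 - p) / n)
    return 1 - NormalDist().cdf((k / n - p) / se)

# One case that satisfies the np >= 10 rule and one that does not (np = 2.5).
for p, n, k in [(0.5, 400, 212), (0.05, 50, 5)]:
    print(p, n, k, exact_upper_tail(p, n, k), normal_upper_tail(p, n, k))
```

Comparing the two printed columns for each case gives a feel for when the caution note matters.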
📐 Formula
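The quantities used on this page can be written compactly as follows. Here \hat{p} stands for p-hat and \Phi for the standard normal CDF (written normCDF elsewhere on this page); the labels a, b, z_a, and z_b are notation chosen for this summary.

```latex
\begin{aligned}
E[\hat{p}] &= p, \\
\mathrm{SE} &= \sqrt{\frac{p(1-p)}{n}}, \\
z &= \frac{\hat{p} - p}{\mathrm{SE}}, \\
P(\hat{p} \le a) &\approx \Phi\!\left(\frac{a - p}{\mathrm{SE}}\right), \\
P(a \le \hat{p} \le b) &\approx \Phi(z_b) - \Phi(z_a),
\qquad z_a = \frac{a - p}{\mathrm{SE}},\; z_b = \frac{b - p}{\mathrm{SE}}.
\end{aligned}
```

The approximations rely on the normal condition that both np and n(1-p) are at least 10.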
📖 How to Use This Calculator
Steps
💡 Example Calculations
Example 1 - Voter Poll (p = 40%, n = 100, query = 45%)
In an election where 40% of voters prefer candidate A, what is the probability that a poll of 100 voters shows 45% or more support?
Example 2 - Quality Control (p = 5% defect rate, n = 200, query = 8%)
A factory has a 5% defect rate. What is the probability that a sample of 200 units shows 8% or more defects?
Example 3 - Survey Range (p = 50%, n = 400, between 47% and 53%)
A 50-50 election: what fraction of polls of 400 voters will show results between 47% and 53%?
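The three examples above can be worked through with a few lines of Python. This is a sketch using the normal approximation without a continuity correction; the helper names are illustrative and the calculator's own implementation may differ.

```python
import math
from statistics import NormalDist

def standard_error(p, n):
    """SE = sqrt(p*(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def prob_at_least(p, n, q):
    """Normal approximation to P(p-hat >= q)."""
    return 1 - NormalDist().cdf((q - p) / standard_error(p, n))

def prob_between(p, n, lo, hi):
    """Normal approximation to P(lo <= p-hat <= hi)."""
    se = standard_error(p, n)
    phi = NormalDist().cdf
    return phi((hi - p) / se) - phi((lo - p) / se)

example_1 = prob_at_least(0.40, 100, 0.45)       # voter poll
example_2 = prob_at_least(0.05, 200, 0.08)       # quality control
example_3 = prob_between(0.50, 400, 0.47, 0.53)  # survey range
print(f"{example_1:.4f} {example_2:.4f} {example_3:.4f}")
```

Under these assumptions the results are roughly 0.15 for Example 1 (z about 1.02), 0.026 for Example 2 (z about 1.95), and 0.77 for Example 3 (z = -1.2 to 1.2). In Example 2, np equals exactly 10, so the approximation sits at the edge of the usual rule of thumb.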
❓ Frequently Asked Questions
What is the sampling distribution of the sample proportion?
When you draw a random sample of n observations from a population where the true proportion is p, the sample proportion p-hat = x/n varies from sample to sample. The collection of all possible p-hat values and their probabilities forms the sampling distribution. Its mean is p, its standard deviation is sqrt(p*(1-p)/n), and by the Central Limit Theorem it is approximately normal for large n.
What is the formula for the standard error of a proportion?
The standard error is SE = sqrt(p*(1-p)/n), where p is the population proportion and n is the sample size. For p=0.5 (maximum uncertainty) and n=100, SE = sqrt(0.25/100) = 0.05. Larger n always decreases SE; the product p*(1-p) is maximized at p=0.5 and shrinks toward zero as p approaches 0 or 1.
When can I use the normal approximation for the sample proportion?
The normal approximation works well when both np and n(1-p) are at least 10. For example, with p=0.2 and n=50, np=10 and n(1-p)=40, so the approximation is just barely acceptable. With p=0.05 and n=50, np=2.5, which is too small and the distribution is skewed; use a binomial exact test instead.
How do I calculate P(p-hat ≤ 0.55) when p=0.5 and n=100?
Step 1: SE = sqrt(0.5*0.5/100) = 0.05. Step 2: z = (0.55 - 0.5)/0.05 = 1.00. Step 3: P(p-hat ≤ 0.55) = normCDF(1.00) = 0.8413, or about 84.13%. This means about 84% of random samples of size 100 from a 50% population will have a sample proportion of 55% or less.
What is the mean and variance of the sample proportion distribution?
The mean (expected value) is E[p-hat] = p. The variance is Var(p-hat) = p*(1-p)/n. The standard deviation (standard error) is SE = sqrt(p*(1-p)/n). For example, with p=0.3 and n=200, the variance is 0.3*0.7/200 = 0.00105 and SE = 0.0324.
How does sample size affect the sampling distribution of p-hat?
Larger sample size concentrates the sampling distribution more tightly around p. Doubling n multiplies the variance by 1/2, which divides SE by sqrt(2). Quadrupling n halves the SE. This is why larger surveys give narrower confidence intervals and more reliable proportion estimates.
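The scaling described above is easy to check numerically. The sketch below assumes a baseline of p = 0.5 and n = 100 purely for illustration.

```python
import math

def standard_error(p, n):
    """SE = sqrt(p*(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

base = standard_error(0.5, 100)
for factor in (1, 2, 4):
    se = standard_error(0.5, 100 * factor)
    # Ratio to the baseline SE: 1, then 1/sqrt(2), then 1/2
    print(f"n x{factor}: SE = {se:.4f}, ratio = {se / base:.4f}")
```

The ratio depends only on the factor applied to n, not on the particular p chosen.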
What is the difference between p and p-hat in statistics?
The parameter p is the fixed, unknown true proportion in the population (e.g., the fraction of all voters who prefer a candidate). The statistic p-hat is the observed proportion in a particular sample. Because sampling is random, p-hat varies from sample to sample; p does not. The goal of inference is to estimate p using information about the distribution of p-hat.
How is the Between Values mode useful?
The Between Values mode computes P(p1 ≤ p-hat ≤ p2) = normCDF(z2) - normCDF(z1), where z1 = (p1-p)/SE and z2 = (p2-p)/SE. This answers questions like: if the true approval rating is 40%, what is the probability that a poll of 500 people shows between 37% and 43% approval? Answer: SE = sqrt(0.4*0.6/500) ≈ 0.0219, so z1 ≈ -1.37 and z2 ≈ 1.37, and P(0.37 ≤ p-hat ≤ 0.43) = normCDF(1.37) - normCDF(-1.37) ≈ 0.829.