When tutoring statistics, confidence intervals are an important topic.
A two-sided confidence interval for the population mean is given by
(sample_mean - (standard_dev/n^(1/2))*sig_factor, sample_mean + (standard_dev/n^(1/2))*sig_factor)
The sig_factor (significance factor) depends on the certainty (confidence level) with which we want the confidence interval to include the population mean; for 95% confidence it’s about 2 (more precisely, 1.96).
The population standard deviation might be known, or it might be estimated from the sample itself. If it’s known, the normal distribution is used; if it’s estimated from the sample, then technically the t-distribution should be used (see the third point below).
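As a minimal sketch of the formula above, the interval can be computed with Python's standard library. The function names `sig_factor` and `confidence_interval`, the `population_sd` parameter, and the sample data are all hypothetical, chosen only for illustration. The sketch uses the normal distribution for the factor even when the standard deviation is calculated from the sample, which is an approximation.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def sig_factor(confidence=0.95):
    """Two-sided normal factor for the given confidence level (about 1.96 for 95%)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def confidence_interval(sample, confidence=0.95, population_sd=None):
    """Return (lower, upper) bounds of a two-sided interval for the population mean.

    If population_sd is None, the standard deviation is calculated from the
    sample and the normal factor is used as an approximation.
    """
    n = len(sample)
    sample_mean = mean(sample)
    standard_dev = population_sd if population_sd is not None else stdev(sample)
    margin = (standard_dev / sqrt(n)) * sig_factor(confidence)
    return sample_mean - margin, sample_mean + margin


# Hypothetical measurements, for illustration only.
data = [9.8, 10.4, 10.1, 9.6, 10.3, 10.0, 9.9, 10.2]
low, high = confidence_interval(data)
print(f"95% confidence interval: ({low:.3f}, {high:.3f})")
```

Passing `population_sd` corresponds to the "known" case; leaving it out corresponds to the "calculated from the sample" case.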
There are a few points that make the two-sided confidence interval for the population mean an elegant construct:
- Its lower and upper boundaries depend on the sample size, but not the population size.
- For sample size n≥31, the parent population needn’t be normal for the sample mean to be approximately normally distributed. This validates the confidence interval even for a non-normal population for n≥31. It’s a consequence of the Central Limit Theorem. (Actually, the rule of thumb is n≥30, but for the purpose of the next point, I like 31.)
- For n≥31, the t-distribution approximates the normal to around 4% (at 95% confidence, for example, the t factor for n=31 is about 2.04 versus 1.96), so the normal approximation can probably be used even for unknown population standard deviation.
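The last two points can be checked numerically. The sketch below is one possible way to do it using only the standard library, not a definitive method. The t factor is found by integrating the t density with Simpson's rule and bisecting, and the coverage check draws samples of size 31 from an exponential population, chosen here only as an example of a non-normal population. All function names, the seed, and the number of trials are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, df):
    """Two-sided t factor, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def coverage(n=31, trials=2000, confidence=0.95, seed=1):
    """Fraction of normal-factor intervals that contain the true mean
    when sampling from an exponential population with mean 1."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        margin = z * stdev(sample) / math.sqrt(n)
        if abs(mean(sample) - 1.0) <= margin:
            hits += 1
    return hits / trials


z = NormalDist().inv_cdf(0.975)   # normal factor, about 1.96
t = t_critical(0.95, df=30)       # t factor for n = 31, about 2.04
print(f"normal: {z:.3f}  t: {t:.3f}  relative difference: {t / z - 1:.1%}")
print(f"coverage for n=31, exponential population: {coverage():.1%}")
```

Because the exponential population is strongly skewed, the simulated coverage may land a little below the nominal 95%, which is a reminder that n≥31 is a rule of thumb rather than a guarantee.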
Harnett, Donald L. and James L. Murphy. Statistical Analysis for Business and Economics, first Can. ed. Don Mills: Addison-Wesley, 1993.
Jack of Oracle Tutoring by Jack and Diane, Campbell River, BC.