Statistics: confidence interval for the mean (two sided)
When tutoring statistics, confidence intervals are an important topic.
A two-sided confidence interval for the population mean is given by
(sample_mean – (standard_dev/n^(1/2))*sig_factor, sample_mean + (standard_dev/n^(1/2))*sig_factor)
The sig_factor (significance factor) depends on the certainty (confidence level) with which we want the confidence interval to include the population mean; typically it’s around 2 (more precisely, 1.96) for 95% confidence.
The standard deviation might be known or might be calculated from the sample itself. If it’s known, the normal distribution is used; if it’s calculated from the sample, then technically the t-distribution should be used (see the third point below).
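To make the formula concrete, here is a minimal Python sketch of it. The function name mean_confidence_interval and the sample data are made up for illustration; they are not from the textbook. It uses the normal sig_factor whether the standard deviation is known or calculated from the sample, which is only an approximation in the calculated case.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95, standard_dev=None):
    """Two-sided, normal-based confidence interval for the population mean.

    Hypothetical helper for illustration. If standard_dev is None, it is
    estimated from the sample (normal approximation to the t-distribution).
    """
    n = len(sample)
    sample_mean = mean(sample)
    if standard_dev is None:
        # Sample standard deviation (divides by n - 1).
        standard_dev = stdev(sample)
    # Two-sided: split the leftover probability equally between both tails,
    # so 95% confidence looks up the 97.5th percentile (about 1.96).
    sig_factor = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = (standard_dev / sqrt(n)) * sig_factor
    return sample_mean - margin, sample_mean + margin


# Made-up sample of 35 measurements, for illustration only.
sample = [50 + (i % 7) for i in range(35)]

low, high = mean_confidence_interval(sample)
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")

# Same sample, but pretending the population standard deviation is known to be 2.
low_known, high_known = mean_confidence_interval(sample, standard_dev=2.0)
print(f"With known standard deviation: ({low_known:.2f}, {high_known:.2f})")
```

Raising the confidence level (say, to 0.99) increases sig_factor and widens the interval; increasing the sample size narrows it.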
There are a few points that make the two-sided confidence interval for the population mean an elegant construct:
- Its lower and upper boundaries depend on the sample size, but not the population size.
- For sample size n≥31, the parent population needn’t be normal for the sample mean to be approximately normally distributed. This validates the confidence interval even for a non-normal population for n≥31. It’s a consequence of the Central Limit Theorem. (Actually, the rule of thumb is n≥30, but for the purpose of the next point, I like 31.)
- For n≥31, the t-distribution’s sig_factor is within around 4% of the normal’s (at 95% confidence, about 2.04 versus 1.96), so the normal approximation can probably be used even for unknown population standard deviation.
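The "around 4%" figure in the last point can be checked numerically. The sketch below uses only the Python standard library, so the t-distribution factor is found by integrating its density with Simpson's rule and then bisecting. The helper names t_pdf, t_cdf and t_critical are made up for illustration, and the sketch assumes the usual n – 1 degrees of freedom, so n = 31 gives 30.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x] plus 0.5 by symmetry."""
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, df):
    """Two-sided sig_factor from the t-distribution, found by bisection."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


n = 31
z_factor = NormalDist().inv_cdf(0.975)  # normal sig_factor at 95% confidence
t_factor = t_critical(0.95, df=n - 1)   # t sig_factor with 30 degrees of freedom
relative_gap = t_factor / z_factor - 1

print(f"normal: {z_factor:.3f}  t (n = {n}): {t_factor:.3f}  gap: {relative_gap:.1%}")
```

Because the t factor is the larger of the two, using the normal factor with a calculated standard deviation gives a slightly narrower interval than the t-distribution would.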
Source:
Harnett, Donald L. and James L. Murphy. Statistical Analysis for Business and Economics, first Can. ed. Don Mills: Addison-Wesley, 1993.
Jack of Oracle Tutoring by Jack and Diane, Campbell River, BC.