Statistics: a confidence interval for the median
The tutor explains, with an example, how to construct a confidence interval for the median.
Let’s imagine we have the following data for purchases from a coffee shop, arranged from least to greatest:
2.05, 2.05, 2.05, 2.05, 2.05, 2.05, 2.55, 2.55, 2.85, 2.85, 3.95, 3.95, 4.10, 4.10, 4.95, 4.95, 5.10, 5.10, 5.40, 5.40, 5.95, 6.25, 6.55, 8.20, 8.80, 9.65, 10.25, 12.25, 12.25, 15.65, 17.50, 18.80, 19.95, 20.00, 20.10, 22.95, 25.00, 25.00, 25.00, 35.25
We want a 95% confidence interval for the median purchase. To construct it, we use the following ideas:
- Each entry has p=0.5 probability of being greater than the median.
- The number of entries greater than the median is then a binomial variable with standard deviation σ = √(np(1-p)) = (np(1-p))^(1/2), where n is the number of entries. In this particular case, σ = (40×0.5×0.5)^(1/2) = 3.162.
- For this situation, the standard deviation refers to a number of entries, rather than a price value.
- For a 95% confidence interval, we can safely use a margin of error of 1.96σ, because with 40 entries the binomial count is well approximated by a normal distribution. (The usual rule-of-thumb threshold is 30 entries.)
- 1.96σ=1.96×3.162=6.2
- 6.2 is not an integer. To be safe, we round up to ±7 entries, to be more than 95% confident of capturing the median.
- One way to construct the interval is to realize that it reaches out 7 entries on each side of the middle. Since there are 40 entries, the middle falls between the 20th and 21st, so we remove the bottom 13 and the top 13 (20-7=13 on each side) and are left with the middle 14.
- Our 95% confidence interval for the median is spanned by the remaining middle 14 entries:
4.10, 4.95, 4.95, 5.10, 5.10, 5.40, 5.40, 5.95, 6.25, 6.55, 8.20, 8.80, 9.65, 10.25
- With 95% confidence, we estimate that the median is between 4.10 and 10.25 inclusive :)
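The steps above can be sketched in Python. This is only a minimal sketch under stated assumptions: the function name `median_confidence_interval` and its parameter `z` are hypothetical, and the handling of odd-sized samples is an assumption, since the example above only covers an even number of entries. The last few lines add an exact binomial check of the coverage, which assumes the purchases come from a continuous distribution (the ties in this data show that is only approximately true).

```python
import math

# Coffee shop purchases from the example, already sorted least to greatest.
purchases = [
    2.05, 2.05, 2.05, 2.05, 2.05, 2.05, 2.55, 2.55, 2.85, 2.85,
    3.95, 3.95, 4.10, 4.10, 4.95, 4.95, 5.10, 5.10, 5.40, 5.40,
    5.95, 6.25, 6.55, 8.20, 8.80, 9.65, 10.25, 12.25, 12.25, 15.65,
    17.50, 18.80, 19.95, 20.00, 20.10, 22.95, 25.00, 25.00, 25.00, 35.25,
]


def median_confidence_interval(sorted_data, z=1.96):
    """Hypothetical helper following the steps in the list above.

    Returns (sigma, reach, trim, middle), where `middle` holds the
    entries left after trimming `trim` entries from each end.
    """
    n = len(sorted_data)
    p = 0.5  # chance that an entry lies above the median
    sigma = math.sqrt(n * p * (1 - p))  # standard deviation, in entries
    reach = math.ceil(z * sigma)        # round the margin up to whole entries
    trim = n // 2 - reach               # entries removed from each end
    if trim < 0:
        raise ValueError("sample too small for this approximation")
    # Assumption: for odd n this keeps the middle entry plus `reach` per side.
    middle = sorted_data[trim:n - trim]
    return sigma, reach, trim, middle


sigma, reach, trim, middle = median_confidence_interval(purchases)
print(f"sigma = {sigma:.3f} entries")
print(f"reach = {reach} entries each side, trim = {trim} from each end")
print(f"interval: {middle[0]:.2f} to {middle[-1]:.2f} ({len(middle)} entries)")

# Exact check (assumes a continuous distribution): the interval from the
# 14th to the 27th entry contains the median when the number of entries
# below the median is between 14 and 26 inclusive.
n = len(purchases)
first_rank = trim + 1  # rank of the lowest entry kept
last_rank = n - trim   # rank of the highest entry kept
coverage = sum(math.comb(n, k) for k in range(first_rank, last_rank)) / 2**n
print(f"exact binomial coverage = {coverage:.4f}")
```

For the 40 purchases, this trims 13 entries from each end and reports the interval from 4.10 to 10.25, matching the hand calculation. The exact coverage comes out slightly above 0.95, which is consistent with rounding 6.2 up to 7.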
Source:
Mills, Richard L. Statistics for Applied Economics and Business. Toronto: McGraw-Hill, 1977.
Jack of Oracle Tutoring by Jack and Diane, Campbell River, BC.