Statistics: goodness-of-fit test with Chi-square distribution
The tutor returns to a more academic focus with today’s post on goodness-of-fit.
In business, as well as in science, people often try to fit data to a mathematical equation. The obvious advantage of doing so is being able to predict results – if the model used is reliable.
To help arrive at such a model, many scientific calculators have regression functions that are easy to use; I’ll cover some of them in future posts. Today, we’ll look at a case where we already have a model, but we want confirmation that it’s reliable.
Example: At a production operation, the equation C=40u+25 models the cost, C, of producing u units. Head office, in charge of budget allocation, suspects this model of over-estimating the costs. The potential problem is that, from an accounting point of view, the sales division of the company is overpaying the production division.
The following data are collected:
| Units produced (u) | Actual cost (C) | Predicted cost (40u+25) |
| 0 | 30 | 25 |
| 15 | 650 | 625 |
| 20 | 900 | 825 |
| 30 | 1125 | 1225 |
| 40 | 1500 | 1625 |
| 50 | 1850 | 2025 |
| 75 | 2772 | 3025 |
To test the data against the model, we first evaluate
$$\text{error stat} = \frac{(C_{\text{actual}} - C_{\text{predicted}})^2}{C_{\text{predicted}}}$$
for each row. Since there are seven rows, we get seven different values, as follows:
| Units produced (u) | Actual cost (C) | Predicted cost (40u+25) | Error stat |
| 0 | 30 | 25 | 1 |
| 15 | 650 | 625 | 1 |
| 20 | 900 | 825 | 6.818 |
| 30 | 1125 | 1225 | 8.163 |
| 40 | 1500 | 1625 | 9.615 |
| 50 | 1850 | 2025 | 15.123 |
| 75 | 2772 | 3025 | 21.16 |
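The error stat column can be reproduced with a few lines of Python. This is only a minimal sketch and not part of the original example; the names `data`, `predicted_cost` and `error_stat` are made up for illustration.

```python
# (units produced, actual cost) pairs taken from the table above
data = [(0, 30), (15, 650), (20, 900), (30, 1125),
        (40, 1500), (50, 1850), (75, 2772)]


def predicted_cost(units):
    """Cost predicted by the model C = 40u + 25."""
    return 40 * units + 25


def error_stat(actual, predicted):
    """Squared deviation from the model, scaled by the predicted value."""
    return (actual - predicted) ** 2 / predicted


error_stats = [error_stat(actual, predicted_cost(u)) for u, actual in data]

# Print one row per data point: units, actual, predicted, error stat
for (u, actual), stat in zip(data, error_stats):
    print(f"{u:>3} {actual:>5} {predicted_cost(u):>5} {stat:>7.3f}")
```

Rounded to three decimal places, the printed error stats should agree with the last column of the table.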
If the model is correct, the sum of the n error stats approximately follows a Chi-square ($\chi^2$) distribution with n-1 degrees of freedom:
$$\chi^2_{n-1} = \sum \text{error stats}$$
In our case, we have seven error stats, so their sum is compared against $\chi^2_6$.
The sum of the error stats in this case is
1+1+6.818+8.163+9.615+15.123+21.16=62.879
At 0.5% significance, the critical value for a Chi-square distribution with 6 degrees of freedom is about 18.5, so we must reject the model if the sum is 18.5 or greater. Clearly, 62.879 is much greater than that. Therefore, the model C=40u+25 must be rejected.
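The decision can also be checked numerically. The sketch below is an illustration rather than the textbook's method: it sums the error stats, compares the total with the 18.5 cutoff quoted above, and additionally estimates the upper-tail probability. For the tail probability it uses the closed-form expression that holds for a Chi-square distribution with an even number of degrees of freedom, so no statistics library is needed. The function name `chi2_upper_tail_even_dof` and the constant `CRITICAL_VALUE` are made up for this example.

```python
import math

# Error stats from the table, rounded as shown there
error_stats = [1, 1, 6.818, 8.163, 9.615, 15.123, 21.16]

chi_square = sum(error_stats)
dof = len(error_stats) - 1  # n - 1 degrees of freedom, as stated above


def chi2_upper_tail_even_dof(x, dof):
    """P(X >= x) for a Chi-square variable with an even number of
    degrees of freedom: exp(-x/2) * sum over i < dof/2 of (x/2)**i / i!"""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this shortcut only handles even, positive dof")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(dof // 2))


CRITICAL_VALUE = 18.5  # cutoff quoted above for 0.5% significance, 6 dof
p_value = chi2_upper_tail_even_dof(chi_square, dof)

print(f"chi-square = {chi_square:.3f}, dof = {dof}")
print(f"reject at 0.5%: {chi_square >= CRITICAL_VALUE}")
print(f"p-value = {p_value:.2e}")
```

Under these assumptions the tail probability comes out far below 0.005, which is consistent with rejecting the model.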
Business examples such as this one can make applied math entertaining – at least to a dusty, armchair-bound academic :)
Source:
Harnett, Donald L. and James L. Murphy: Statistical Analysis for Business and Economics. Don Mills: Addison-Wesley, 1993.
Jack of Oracle Tutoring by Jack and Diane, Campbell River, BC.