Statistics: goodness-of-fit test with Chi-square distribution

The tutor returns to a more academic focus with today’s post on goodness-of-fit.

In business, as well as in science, people often try to fit data to a mathematical equation. The obvious advantage of doing so is being able to predict results – if the model used is reliable.

To help arrive at such a model, many scientific calculators have regression functions that are easy to use; I’ll cover some of them in future posts. Today, we’ll look at a case where we already have a model, but we want confirmation that it’s reliable.

Example: At a production operation, the equation C=40u+25 models the cost, C, of producing u units. Head office, in charge of budget allocation, suspects this model of over-estimating the costs. The potential problem is that, from an accounting point of view, the sales division of the company is overpaying the production division.

The following data are collected:

| Units produced (u) | Actual cost | Predicted cost (40u+25) |
|---|---|---|
| 0 | 30 | 25 |
| 15 | 650 | 625 |
| 20 | 900 | 825 |
| 30 | 1125 | 1225 |
| 40 | 1500 | 1625 |
| 50 | 1850 | 2025 |
| 75 | 2772 | 3025 |
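For readers who would rather check the predicted column by machine than by hand, here is a minimal Python sketch. The names `predicted_cost` and `observations` are mine, chosen for illustration; only the equation C=40u+25 and the data come from the example.

```python
# Cost model from the example: C = 40u + 25
def predicted_cost(units):
    """Return the modelled cost of producing `units` units."""
    return 40 * units + 25

# (units produced, actual cost) pairs copied from the table above
observations = [(0, 30), (15, 650), (20, 900), (30, 1125),
                (40, 1500), (50, 1850), (75, 2772)]

# Print each row alongside the cost the model predicts for it
for units, actual in observations:
    print(units, actual, predicted_cost(units))
```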

To test the data against the model, we first evaluate

$$\text{error stat} = \frac{(C_{\text{actual}} - C_{\text{predicted}})^2}{C_{\text{predicted}}}$$

for each row. Since there are seven rows, we get seven different values, as follows:

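| Units produced (u) | Actual cost | Predicted cost (40u+25) | Error stat |
|---|---|---|---|
| 0 | 30 | 25 | 1 |
| 15 | 650 | 625 | 1 |
| 20 | 900 | 825 | 6.818 |
| 30 | 1125 | 1225 | 8.163 |
| 40 | 1500 | 1625 | 9.615 |
| 50 | 1850 | 2025 | 15.123 |
| 75 | 2772 | 3025 | 21.16 |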
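The error stat column can be reproduced with a short Python sketch. It assumes the same seven data pairs and the model C=40u+25; the names `error_stat`, `rows` and `stats` are illustrative only.

```python
def error_stat(actual, predicted):
    """Squared difference between actual and predicted cost, scaled by the prediction."""
    return (actual - predicted) ** 2 / predicted

# (units produced, actual cost) pairs from the example
rows = [(0, 30), (15, 650), (20, 900), (30, 1125),
        (40, 1500), (50, 1850), (75, 2772)]

# One error stat per row, using the model C = 40u + 25 for the prediction
stats = [error_stat(actual, 40 * units + 25) for units, actual in rows]

for (units, actual), stat in zip(rows, stats):
    print(f"{units:>3} {actual:>5} {40 * units + 25:>5} {stat:8.3f}")
```

As a hand check of one row: for u=20, the difference is 900-825=75, and 75 squared over 825 is 5625/825, or about 6.818.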

If the model is correct, the sum of the n error stats approximately follows a Chi-square ($\chi^2$) distribution with n-1 degrees of freedom:

$$\chi^2_{n-1} = \sum \text{error stats}$$

In our case, we have seven error stats, so their sum is compared against $\chi^2_6$.

The sum of the error stats in this case is

1+1+6.818+8.163+9.615+15.123+21.16=62.879

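At the 0.5% significance level, the critical value of the Chi-square distribution with 6 degrees of freedom is about 18.5, so we must reject the model if the sum is 18.5 or greater. Clearly, 62.879 is much greater than that. Therefore, the model C=40u+25 must be rejected.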
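The decision can also be checked numerically without a printed table. The sketch below is not from the source text: it relies on the standard closed form for the upper tail of a Chi-square distribution with an even number of degrees of freedom, which is all that 6 degrees of freedom requires. The function name `chi2_sf_even_df` is hypothetical; a statistics library such as SciPy (`scipy.stats.chi2.sf`) would be the more general route.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    Uses exp(-x/2) * sum_{i < df/2} (x/2)**i / i!, valid only for even df.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2
    term, total = 1.0, 0.0
    for i in range(df // 2):
        total += term            # add (x/2)**i / i!
        term *= half / (i + 1)   # advance to the next term of the series
    return math.exp(-half) * total

# Error stats from the table in the example
error_stats = [1, 1, 6.818, 8.163, 9.615, 15.123, 21.16]

test_statistic = sum(error_stats)   # about 62.879
df = len(error_stats) - 1           # n - 1 = 6
alpha = 0.005                       # 0.5% significance

p_value = chi2_sf_even_df(test_statistic, df)
reject = p_value < alpha

print(f"sum = {test_statistic:.3f}, df = {df}, p-value = {p_value:.3g}, reject = {reject}")
```

Comparing the p-value with 0.005 is equivalent to comparing the sum with the critical value of about 18.5: the same function, evaluated at 18.548, returns a tail probability very close to 0.005.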

Business examples such as this one can make applied math entertaining – at least to a dusty, armchair-bound academic:)

Source:

Harnett, Donald L. and James L. Murphy: Statistical Analysis for Business and Economics. Don Mills: Addison-Wesley, 1993.

Jack of Oracle Tutoring by Jack and Diane, Campbell River, BC.
