{"id":10817,"date":"2015-06-04T19:15:05","date_gmt":"2015-06-04T19:15:05","guid":{"rendered":"http:\/\/www.oracletutoring.ca\/blog\/?p=10817"},"modified":"2017-09-07T01:21:46","modified_gmt":"2017-09-07T01:21:46","slug":"statistics-goodness-of-fit-test-with-chi-square-distribution","status":"publish","type":"post","link":"https:\/\/www.oracletutoring.ca\/blog\/statistics-goodness-of-fit-test-with-chi-square-distribution\/","title":{"rendered":"Statistics:  goodness-of-fit test with Chi-square distribution"},"content":{"rendered":"<h1>The tutor returns to a more academic focus with today&#8217;s post on goodness-of-fit.<\/h1>\n<p>In business, as well as in science, people often try to fit data to a mathematical equation. The obvious advantage of doing so is being able to predict results &#8211; if the model used is reliable.<\/p>\n<p>To help arrive at such a model, many scientific calculators have regression functions that are easy to use; I&#8217;ll cover some of them in future posts. Today, we&#8217;ll look at a case where we already have a model, but we want confirmation that it&#8217;s reliable.<\/p>\n<p><strong>Example:<\/strong> At a production operation, the equation C=40u+25 models the cost, C, of producing u units. Head office, in charge of budget allocation, suspects this model of over-estimating the costs. 
The potential problem is that, from an accounting point of view, the sales division of the company is overpaying the production division.<\/p>\n<p>The following data are collected:<\/p>\n<table style=\"display: block; width: 80%; margin-left: auto; margin-right: auto;\">\n<tbody>\n<tr>\n<td style=\"border: 1px solid black;\">\u00a0units produced<\/td>\n<td style=\"border: 1px solid black;\">\u00a0actual Cost<\/td>\n<td style=\"border: 1px solid black;\">\u00a0predicted Cost: 40u+25<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">0<\/td>\n<td style=\"text-align: center;\">30<\/td>\n<td style=\"text-align: center;\">25<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">15<\/td>\n<td style=\"text-align: center;\">650<\/td>\n<td style=\"text-align: center;\">625<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">20<\/td>\n<td style=\"text-align: center;\">900<\/td>\n<td style=\"text-align: center;\">825<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">30<\/td>\n<td style=\"text-align: center;\">1125<\/td>\n<td style=\"text-align: center;\">1225<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">40<\/td>\n<td style=\"text-align: center;\">1500<\/td>\n<td style=\"text-align: center;\">1625<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">50<\/td>\n<td style=\"text-align: center;\">1850<\/td>\n<td style=\"text-align: center;\">2025<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">75<\/td>\n<td style=\"text-align: center;\">2772<\/td>\n<td style=\"text-align: center;\">3025<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>To test the data against the model, we first evaluate<\/p>\n<p>error stat=(C<sub>actual<\/sub> &#8211; C<sub>predicted<\/sub>)<sup>2<\/sup>\/C<sub>predicted<\/sub><\/p>\n<p>for each row. 
Since there are seven rows, we get seven different values, as follows:<\/p>\n<table style=\"display: block; width: 95%; margin: auto;\">\n<tbody>\n<tr>\n<td style=\"border: 1px solid black;\">\u00a0units produced<\/td>\n<td style=\"border: 1px solid black;\">\u00a0actual Cost<\/td>\n<td style=\"border: 1px solid black;\">\u00a0predicted Cost: 40u+25<\/td>\n<td style=\"border: 1px solid black;\">\u00a0error stat<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">0<\/td>\n<td style=\"text-align: center;\">30<\/td>\n<td style=\"text-align: center;\">25<\/td>\n<td style=\"text-align: center;\">1<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">15<\/td>\n<td style=\"text-align: center;\">650<\/td>\n<td style=\"text-align: center;\">625<\/td>\n<td style=\"text-align: center;\">1<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">20<\/td>\n<td style=\"text-align: center;\">900<\/td>\n<td style=\"text-align: center;\">825<\/td>\n<td style=\"text-align: center;\">6.818<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">30<\/td>\n<td style=\"text-align: center;\">1125<\/td>\n<td style=\"text-align: center;\">1225<\/td>\n<td style=\"text-align: center;\">8.163<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">40<\/td>\n<td style=\"text-align: center;\">1500<\/td>\n<td style=\"text-align: center;\">1625<\/td>\n<td style=\"text-align: center;\">9.615<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">50<\/td>\n<td style=\"text-align: center;\">1850<\/td>\n<td style=\"text-align: center;\">2025<\/td>\n<td style=\"text-align: center;\">15.123<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;\">75<\/td>\n<td style=\"text-align: center;\">2772<\/td>\n<td style=\"text-align: center;\">3025<\/td>\n<td style=\"text-align: center;\">21.16<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The sum of the n error stats has a Chi-square (\u03a7<sup>2<\/sup>) distribution with n-1 degrees of freedom :<\/p>\n<p>\u03a7<sup>2<\/sup><sub>n-1<\/sub>=\u03a3error 
stats<\/p>\n<p>In our case, we have seven error stats; their sum follows the \u03a7<sup>2<\/sup><sub>6<\/sub> distribution.<\/p>\n<p>The sum of the error stats in this case is<\/p>\n<p>1+1+6.818+8.163+9.615+15.123+21.16=62.879<\/p>\n<p>At the 0.5% significance level, the critical value of \u03a7<sup>2<\/sup><sub>6<\/sub> is about 18.5: we must reject the model if the sum is 18.5 or greater. Clearly, 62.879 is much greater than that. Therefore, the model C=40u+25 must be rejected.<\/p>\n<p>Business examples such as this one can make applied math entertaining &#8211; at least to a dusty, armchair-bound academic :)<\/p>\n<p>Source:<\/p>\n<p>Harnett, Donald L. and James L. Murphy: <u>Statistical Analysis for Business and Economics<\/u>. Don Mills: Addison-Wesley, 1993.<\/p>\n<p>Jack of <a href=\"https:\/\/www.oracletutoring.ca\">Oracle Tutoring by Jack and Diane,<\/a> Campbell River, BC.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The tutor returns to a more academic focus with today&#8217;s post on goodness-of-fit. In business, as well as in science, people often try to fit data to a mathematical equation. 
The obvious advantage of doing so is being able to &hellip;<\/p>\n<p class=\"read-more\"> <a class=\"more-link\" href=\"https:\/\/www.oracletutoring.ca\/blog\/statistics-goodness-of-fit-test-with-chi-square-distribution\/\"> <span class=\"screen-reader-text\">Statistics:  goodness-of-fit test with Chi-square distribution<\/span> Read More &raquo;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[371,19],"tags":[859,856,857,858],"class_list":["post-10817","post","type-post","status-publish","format-standard","hentry","category-business","category-statistics","tag-business-example","tag-chi-square-distribution","tag-goodness-of-fit","tag-regression-model"],"_links":{"self":[{"href":"https:\/\/www.oracletutoring.ca\/blog\/wp-json\/wp\/v2\/posts\/10817","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.oracletutoring.ca\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.oracletutoring.ca\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.oracletutoring.ca\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.oracletutoring.ca\/blog\/wp-json\/wp\/v2\/comments?post=10817"}],"version-history":[{"count":40,"href":"https:\/\/www.oracletutoring.ca\/blog\/wp-json\/wp\/v2\/posts\/10817\/revisions"}],"predecessor-version":[{"id":23421,"href":"https:\/\/www.oracletutoring.ca\/blog\/wp-json\/wp\/v2\/posts\/10817\/revisions\/23421"}],"wp:attachment":[{"href":"https:\/\/www.oracletutoring.ca\/blog\/wp-json\/wp\/v2\/media?parent=10817"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.oracletutoring.ca\/blog\/wp-json\/wp\/v2\/categories?post=10817"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.oracletutoring.ca\/blog\/wp-json\/wp\/v2\/tags?post=10817"}],"curies":[{"name":"wp","href":"https:\/\
/api.w.org\/{rel}","templated":true}]}}