{"id":83,"date":"2021-01-12T22:19:42","date_gmt":"2021-01-12T22:19:42","guid":{"rendered":"https:\/\/textbooks.jaykesler.net\/introstats\/chapter\/two-population-means-independent\/"},"modified":"2021-05-11T20:12:18","modified_gmt":"2021-05-11T20:12:18","slug":"two-population-means-independent","status":"publish","type":"chapter","link":"https:\/\/textbooks.jaykesler.net\/introstats\/chapter\/two-population-means-independent\/","title":{"rendered":"Two Population Means (Independent)"},"content":{"raw":"<span style=\"display: none;\">\r\n[latexpage]\r\n<\/span>\r\n<div id=\"01e79f95-f1ac-4eab-bc98-375b3e0c6948\" class=\"chapter-content-module\" data-type=\"page\" data-cnxml-to-html-ver=\"2.1.0\">\r\n<ol id=\"element-984\">\r\n \t<li>The two independent samples are simple random samples from two distinct populations.<\/li>\r\n \t<li>For the two distinct populations:\r\n<ul id=\"fs-idm39244848\">\r\n \t<li>if the sample sizes are small, the distributions are important (should be normal)<\/li>\r\n \t<li>if the sample sizes are large, the distributions are not important (need not be normal)<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ol>\r\n<div id=\"fs-idm107349216\" class=\"ui-has-child-title\" data-type=\"note\" data-has-label=\"true\" data-label=\"\"><header>\r\n<h3 class=\"os-title\" data-type=\"title\"><span id=\"1\" class=\"os-title-label\" data-type=\"\">NOTE<\/span><\/h3>\r\n<\/header><section>\r\n<div class=\"os-note-body\">\r\n<p id=\"fs-idm165699296\">The test comparing two independent population means with unknown and possibly unequal population standard deviations is called the Aspin-Welch t-test. The degrees of freedom formula was developed by Aspin-Welch.<\/p>\r\n\r\n<\/div>\r\n<\/section><\/div>\r\nThe comparison of two population means is very common. A difference between the two samples depends on both the means and the standard deviations. Very different means can occur by chance if there is great variation among the individual samples. 
In order to account for the variation, we take the difference of the sample means, $\\bar X_1 - \\bar X_2$, and divide by the standard error in order to standardize the difference. The result is a t-score test statistic.\r\n\r\nBecause we do not know the population standard deviations, we estimate them using the two sample standard deviations from our independent samples. For the hypothesis test, we calculate the estimated standard deviation, or <span id=\"term197\" data-type=\"term\">standard error<\/span>, of <strong>the difference in sample means<\/strong>, $\\bar X_1 - \\bar X_2$.\r\n\r\nT<span data-type=\"title\">he standard error is:<\/span>\r\n\r\n$$\\sqrt{\\frac{(s_1)^2}{n_1}+\\frac{(s_2)^2}{n_2}}$$\r\n\r\nThe test statistic (<em data-effect=\"italics\">t<\/em>-score) is calculated as follows:\r\n\r\n$$t=\\frac{(\\bar x_1 - \\bar x_2) - (\\mu_1-\\mu_2)}{\\sqrt{\\frac{(s_1)^2}{n_1}+\\frac{(s_2)^2}{n_2}}}$$\r\n\r\nwhere:\r\n<div id=\"list1\" data-type=\"list\">\r\n<ul>\r\n \t<li><em data-effect=\"italics\">s<\/em><sub>1<\/sub> and <em data-effect=\"italics\">s<\/em><sub>2<\/sub>, the sample standard deviations, are estimates of <em data-effect=\"italics\">\u03c3<\/em><sub>1<\/sub> and <em data-effect=\"italics\">\u03c3<\/em><sub>2<\/sub>, respectively.<\/li>\r\n \t<li><em data-effect=\"italics\">\u03c3<\/em><sub>1<\/sub> and <em data-effect=\"italics\">\u03c3<\/em><sub>2<\/sub> are the unknown population standard deviations.<\/li>\r\n \t<li>$\\bar x_1$ and $\\bar x_2$ are the sample means.<\/li>\r\n \t<li><em data-effect=\"italics\">\u03bc<\/em><sub>1<\/sub> and <em data-effect=\"italics\">\u03bc<\/em><sub>2<\/sub> are the population means, and the null hypothesis will always be <em data-effect=\"italics\">H<\/em><sub>0<\/sub>: <em data-effect=\"italics\">\u03bc<sub>1<\/sub><\/em> = <em data-effect=\"italics\">\u03bc<sub>2 <\/sub><\/em><\/li>\r\n<\/ul>\r\n<div class=\"textbox note\">\r\n<h3>Note<\/h3>\r\n<em data-effect=\"italics\">\u03bc<\/em><sub>1<\/sub> 
and <em data-effect=\"italics\">\u03bc<\/em><sub>2<\/sub> are the population means, and the null hypothesis will always be <em data-effect=\"italics\">H<\/em><sub>0<\/sub>: <em data-effect=\"italics\">\u03bc<sub>1<\/sub><\/em> = <em data-effect=\"italics\">\u03bc<sub>2 <\/sub>. <\/em>This means that the calculation $(\\mu_1-\\mu_2)$ in our test statistic will always be zero, so we can simplify the test statistic calculation as\r\n\r\n$$t=\\frac{\\bar x_1 - \\bar x_2}{\\sqrt{\\frac{(s_1)^2}{n_1}+\\frac{(s_2)^2}{n_2}}}$$\r\n\r\n<\/div>\r\n<\/div>\r\n<p id=\"element-256\" class=\"finger\">The number of <span id=\"term198\" data-type=\"term\">degrees of freedom (<em data-effect=\"italics\">df<\/em>)<\/span> requires a somewhat complicated calculation. However, for this introductory course, we will be using a simplified calculation. The test statistic calculated previously is approximated by the Student's <em data-effect=\"italics\">t<\/em>-distribution with <em data-effect=\"italics\">df <\/em>that is 1 less than the smaller of $n_1$ and $n_2$ from your two samples. For example, if $n_1=14$ and $n_2=9$, we would use $df=8$.<\/p>\r\n<p id=\"element-316\">Notice that the sample variances (<em data-effect=\"italics\">s<\/em><sub>1<\/sub>)<sup>2<\/sup> and (<em data-effect=\"italics\">s<\/em><sub>2<\/sub>)<sup>2<\/sup> are not pooled, meaning we don't consider the standard deviation of the samples as if they were both just one big sample together. (If the question comes up, do not pool the variances for means.)<\/p>\r\n\r\n<div id=\"element-208\" class=\"ui-has-child-title\" data-type=\"example\"><header>\r\n<h3 class=\"os-title\"><span class=\"os-title-label\">Example <\/span><span class=\"os-number\">9.1<\/span><\/h3>\r\n<\/header><section>\r\n<div class=\"body\">\r\n<h4 id=\"4\" data-type=\"title\">Independent groups<\/h4>\r\n<p id=\"element-687\">The average amount of time boys and girls aged seven to 11 spend playing sports each day is believed to be the same. 
A study is done and data are collected, resulting in the data in <a class=\"autogenerated-content\" href=\"#uid888\">Table 9.1<\/a>. Each population has a normal distribution.<\/p>\r\n\r\n<div id=\"uid888\" class=\"os-table \">\r\n<table summary=\"Table 9.1 \" data-id=\"uid888\">\r\n<thead valign=\"top\">\r\n<tr>\r\n<th scope=\"col\" data-align=\"center\"><\/th>\r\n<th scope=\"col\" data-align=\"center\">Sample Size<\/th>\r\n<th scope=\"col\" data-align=\"center\">Average Number of Hours Playing Sports Per Day<\/th>\r\n<th scope=\"col\" data-align=\"center\">Sample Standard Deviation<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody valign=\"top\">\r\n<tr>\r\n<td>Girls<\/td>\r\n<td>9<\/td>\r\n<td>2<\/td>\r\n<td>0.866<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Boys<\/td>\r\n<td>16<\/td>\r\n<td>3.2<\/td>\r\n<td>1.00<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<div class=\"os-caption-container\"><span class=\"os-title-label\">Table <\/span><span class=\"os-number\">9.1<\/span><\/div>\r\n<\/div>\r\n<div id=\"element-114\" class=\" unnumbered\" data-type=\"exercise\"><header><\/header><section>\r\n<div id=\"id6322791\" data-type=\"problem\">\r\n<div class=\"os-problem-container \">\r\n<p id=\"element-939\">Is there a difference in the mean amount of time boys and girls aged seven to 11 play sports each day? 
Test at the 5% level of significance.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div id=\"id21329984\" data-type=\"solution\" aria-label=\"show solution\" aria-expanded=\"false\">\r\n<div class=\"ui-toggle-wrapper\"><\/div>\r\n<section class=\"ui-body\" style=\"display: block; overflow: hidden;\" role=\"alert\">\r\n<h4 data-type=\"solution-title\"><span class=\"os-title-label\">Solution <\/span><span class=\"os-number\">9.1<\/span><\/h4>\r\n<div class=\"os-solution-container\">\r\n<p id=\"element-296\"><strong>The population standard deviations are not known.<\/strong> Let <em data-effect=\"italics\">g<\/em> be the subscript for girls and <em data-effect=\"italics\">b<\/em> be the subscript for boys. Then, <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> is the population mean for girls and <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em> is the population mean for boys. This is a test of two <strong>independent groups<\/strong>, two population <strong>means<\/strong>.<\/p>\r\n<p id=\"element-221\"><span id=\"term199\" data-type=\"term\">Random variable<\/span>: $\\bar X_g-\\bar X_b=$ difference in the sample mean amount of time girls and boys play sports each day.<\/p>\r\n<span data-type=\"newline\">\r\n<\/span><em data-effect=\"italics\">H<\/em><sub>0<\/sub>: <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> = <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em>\u2003\u2003<em data-effect=\"italics\">H<sub>0<\/sub><\/em>: <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> \u2013 <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em> = 0\r\n\r\n<span data-type=\"newline\">\r\n<\/span><em data-effect=\"italics\">H<sub>1<\/sub><\/em>: <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> \u2260 <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em>\u2003\u2003<em data-effect=\"italics\">H<sub>1<\/sub><\/em>: <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> \u2013 <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em> \u2260 0\r\n<span 
data-type=\"newline\">\r\n<\/span>The words <strong>\"the same\"<\/strong> tell you <em data-effect=\"italics\">H<sub>0<\/sub><\/em> has an \"=\". Since there are no other words to indicate <em data-effect=\"italics\">H<sub>1<\/sub><\/em>, assume it says <strong>\"is different.\"<\/strong>\r\nThis is a two-tailed test (from the fact that <em data-effect=\"italics\">H<sub>1<\/sub><\/em>: <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> \u2260 <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em>).\r\n\r\n<strong>Calculate the test statistic:<\/strong>\r\n\r\n$$t=\\frac{\\bar x_g - \\bar x_b}{\\sqrt{\\frac{(s_g)^2}{n_g}+\\frac{(s_b)^2}{n_b}}}=\\frac{2- 3.2}{\\sqrt{\\frac{(0.866)^2}{9}+\\frac{(1)^2}{16}}}=-3.142390292$$\r\n<p id=\"element-529\"><strong>Distribution for the test:<\/strong>\r\nUse <em data-effect=\"italics\">t<sub>df<\/sub><\/em> where <em data-effect=\"italics\">df<\/em> = min($n_g, n_b$)-1 = min(9,16)-1 = 9-1=8.<\/p>\r\n<strong>Get the area in the tails beyond the test statistic:<\/strong>\r\n\r\nBecause this is a two-tailed test, we need the combined area in both tails beyond the test statistic, which we can get using Google Sheets.\r\n\r\n<code><bdo dir=\"ltr\"><span class=\"formula-content\"><span class=\" default-formula-text-color\" dir=\"auto\">=<\/span><span class=\" default-formula-text-color\" dir=\"auto\">TDIST<\/span><span class=\" default-formula-text-color\" dir=\"auto\">( <\/span><span class=\"number\" dir=\"auto\">3.142390292<\/span><span class=\" default-formula-text-color\" dir=\"auto\">, <\/span><span class=\"number\" dir=\"auto\">8<\/span><span class=\" default-formula-text-color\" dir=\"auto\">, <\/span><span class=\"number\" dir=\"auto\">2 <\/span><span class=\" default-formula-text-color\" dir=\"auto\">)<\/span><\/span><\/bdo><\/code>\r\n\r\nThe first value given to the TDIST function is the <strong>positive version<\/strong> of our test statistic, the second value is our degrees of freedom, and the third is the number of tails.\r\n\r\n<em 
data-effect=\"italics\">p<\/em>-value = 0.0138\r\n<p id=\"element-823\"><strong>Make a decision:<\/strong> Since <em data-effect=\"italics\">\u03b1<\/em> &gt; <em data-effect=\"italics\">p<\/em>-value (0.05 &gt; 0.0138), reject <em data-effect=\"italics\">H<sub>0<\/sub><\/em>. This means you reject <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> = <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em>. The means are different.<\/p>\r\n<p id=\"element-914\"><strong>Conclusion:<\/strong> At the 5% level of significance, the sample data show there is sufficient evidence to conclude that the mean number of hours that girls and boys aged seven to 11 play sports per day is different (mean number of hours boys aged seven to 11 play sports per day is greater than the mean number of hours played by girls OR the mean number of hours girls aged seven to 11 play sports per day is greater than the mean number of hours played by boys).<\/p>\r\n\r\n<div class=\"textbox\">If we used our Student's <em>t<\/em>-distribution table instead of Google Sheets, we would look in the row for 8 degrees of freedom and find the two columns between which the positive version of our test statistic, rounded to 3.142, falls.\r\n<img class=\"aligncenter size-full wp-image-628\" src=\"https:\/\/textbooks.jaykesler.net\/introstats\/wp-content\/uploads\/sites\/2\/2021\/01\/example10_1-pvalue.png\" alt=\"\" width=\"1176\" height=\"493\">\r\nWe see that 3.142 is less than 3.355, but greater than 2.896, so we find that our two-tailed <em>p<\/em>-value is greater than 0.01 and less than 0.02.\r\n0.01 &lt; <em>p<\/em>-value &lt; 0.02\r\nSince the significance level is 0.05, we know that our <em>p<\/em>-value is less than the significance level (if our <em>p<\/em>-value is less than 0.02, it <strong>must<\/strong> be less than 0.05), so the conclusion is the same as using Google Sheets.<\/div>\r\n&nbsp;\r\n\r\n<\/div>\r\n<\/section><\/div>\r\n<\/section><\/div>\r\n<\/div>\r\n<\/section><\/div>\r\n<div id=\"fs-idp67783984\" class=\"statistics try ui-has-child-title\" data-type=\"note\" 
data-has-label=\"true\" data-label=\"\"><header>\r\n<h3 class=\"os-title\"><span class=\"os-title-label\">Try It <\/span><span class=\"os-number\">9.1<\/span><\/h3>\r\n<\/header><section>\r\n<div id=\"fs-idm235264144\" class=\" unnumbered\" data-type=\"exercise\"><header><\/header><section>\r\n<div id=\"fs-idm162906288\" data-type=\"problem\">\r\n<div class=\"os-problem-container \">\r\n<p id=\"fs-idm193306272\">Two samples are shown in <a class=\"autogenerated-content\" href=\"#fs-idm99631856\">Table 9.2<\/a>. Both have normal distributions. The means for the two populations are thought to be the same. Is there a difference in the means? Test at the 5% level of significance.<\/p>\r\n\r\n<div id=\"fs-idm99631856\" class=\"os-table \">\r\n<table summary=\"Table 9.2 \" data-id=\"fs-idm99631856\">\r\n<thead>\r\n<tr>\r\n<th scope=\"col\"><\/th>\r\n<th scope=\"col\">Sample Size<\/th>\r\n<th scope=\"col\">Sample Mean<\/th>\r\n<th scope=\"col\">Sample Standard Deviation<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>Population A<\/td>\r\n<td>25<\/td>\r\n<td>5<\/td>\r\n<td>1<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Population B<\/td>\r\n<td>16<\/td>\r\n<td>4.7<\/td>\r\n<td>1.2<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<div class=\"os-caption-container\"><span class=\"os-title-label\">Table <\/span><span class=\"os-number\">9.2<\/span><\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/section><\/div>\r\n<\/section><\/div>\r\n<div id=\"element-968\" class=\"ui-has-child-title\" data-type=\"example\"><header>\r\n<h3 class=\"os-title\"><span class=\"os-title-label\">Example <\/span><span class=\"os-number\">9.2<\/span><\/h3>\r\n<\/header><section>\r\n<div class=\"body\">\r\n<p id=\"element-980\">A study is done by a community group in two neighboring colleges to determine which one graduates students with more math classes. College A samples 11 graduates. Their average is 4 math classes with a standard deviation of 1.5 math classes. College B samples 10 graduates. 
Their average is 3.5 math classes with a standard deviation of 1 math class. The community group believes that a student who graduates from college A <strong>has taken more math classes,<\/strong> on the average. Both populations have a normal distribution. Test at a 1% significance level. Answer the following questions.<\/p>\r\n&nbsp;\r\n<div id=\"fs-idm178911728\" class=\" unnumbered\" data-type=\"exercise\"><header><\/header><section>\r\n<div id=\"fs-idm177203296\" data-type=\"problem\">\r\n<div class=\"os-problem-container \">\r\n<ol>\r\n \t<li id=\"fs-idm171395200\">Is this a test of two means or two proportions?<\/li>\r\n \t<li>Are the population standard deviations known or unknown?<\/li>\r\n \t<li>Which distribution do you use to perform the test?<\/li>\r\n \t<li>What is the random variable?<\/li>\r\n \t<li>What are the null and alternative hypotheses? Write the null and alternative hypotheses in words and in symbols.<\/li>\r\n \t<li>Is this test right-, left-, or two-tailed?<\/li>\r\n \t<li>What is the <em data-effect=\"italics\">p<\/em>-value?<\/li>\r\n \t<li>Do you reject or not reject the null hypothesis?<\/li>\r\n \t<li><strong>Conclusion:<\/strong><\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\n<\/section><\/div>\r\n<div id=\"fs-idm65144592\" class=\" unnumbered\" data-type=\"exercise\"><section>\r\n<div id=\"fs-idm31976880\" data-type=\"solution\" aria-label=\"show solution\" aria-expanded=\"false\"><section class=\"ui-body\" style=\"display: block; overflow: hidden;\" role=\"alert\">\r\n<h4 data-type=\"solution-title\"><span class=\"os-title-label\">Solution <\/span><span class=\"os-number\">9.2<\/span><\/h4>\r\n<ol>\r\n \t<li>two means<\/li>\r\n \t<li>unknown<\/li>\r\n \t<li>Student's <em data-effect=\"italics\">t<\/em><\/li>\r\n \t<li>$\\bar X_A -\\bar X_B$ (or $\\bar X_1 -\\bar X_2$ )<\/li>\r\n \t<li>$H_0: \\mu_A = \\mu_B$\r\n$H_1: \\mu_A &gt; \\mu_B$<\/li>\r\n \t<li>right-tailed<\/li>\r\n \t<li>0.1942 (by Google Sheets with the following 
function)\r\n<code><bdo dir=\"ltr\"><span class=\"formula-content\"><span class=\" default-formula-text-color\" dir=\"auto\">=<\/span><span class=\" default-formula-text-color\" dir=\"auto\">TDIST<\/span><span class=\" default-formula-text-color\" dir=\"auto\">(<\/span><span class=\"number\" dir=\"auto\">0.906032848<\/span><span class=\" default-formula-text-color\" dir=\"auto\">,<\/span><span class=\"number\" dir=\"auto\">9<\/span><span class=\" default-formula-text-color\" dir=\"auto\">,<\/span><span class=\"number\" dir=\"auto\">1<\/span><span class=\" default-formula-text-color\" dir=\"auto\">)<\/span><\/span><\/bdo><\/code>\r\n(0.906032848 is the test statistic)<\/li>\r\n \t<li>Do not reject H<sub>0<\/sub><\/li>\r\n \t<li>At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that a student who graduates from college A has taken more math classes, on the average, than a student who graduates from college B.<\/li>\r\n<\/ol>\r\n<\/section><\/div>\r\n<\/section><\/div>\r\n<\/div>\r\n<\/section><\/div>\r\n<div id=\"fs-idp145514592\" class=\"statistics try ui-has-child-title\" data-type=\"note\" data-has-label=\"true\" data-label=\"\"><header>\r\n<h3 class=\"os-title\"><span class=\"os-title-label\">Try It <\/span><span class=\"os-number\">9.2<\/span><\/h3>\r\n<\/header><section>\r\n<div id=\"eip-idp166826064\" class=\" unnumbered\" data-type=\"exercise\"><header><\/header><section>\r\n<div id=\"eip-idp166826320\" data-type=\"problem\">\r\n<div class=\"os-problem-container \">\r\n<p id=\"fs-idm170467472\">A study is done to determine if Company A retains its workers longer than Company B. Company A samples 15 workers, and their average time with the company is five years with a standard deviation of 1.2. Company B samples 20 workers, and their average time with the company is 4.5 years with a standard deviation of 0.8. 
The populations are normally distributed.<\/p>\r\n\r\n<ol id=\"fs-idp55515664\" type=\"a\">\r\n \t<li>Are the population standard deviations known?<\/li>\r\n \t<li>Conduct an appropriate hypothesis test. At the 5% significance level, what is your conclusion?<\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\n<\/section><\/div>\r\n<\/section><\/div>\r\n<div id=\"fs-idm22935408\" class=\"ui-has-child-title\" data-type=\"example\"><header>\r\n<h3 class=\"os-title\"><span class=\"os-title-label\">Example <\/span><span class=\"os-number\">9.3<\/span><\/h3>\r\n<\/header><section>\r\n<div class=\"body\">\r\n<p id=\"fs-idm10086800\">A professor at a large community college wanted to determine whether there is a difference in the means of final exam scores between students who took his statistics course online and the students who took his face-to-face statistics class. He believed that the mean of the final exam scores for the online class would be lower than that of the face-to-face class. Was the professor correct? 
The randomly selected 30 final exam scores from each group are listed in <a class=\"autogenerated-content\" href=\"#fs-idm34056944\">Table 9.3<\/a> and <a class=\"autogenerated-content\" href=\"#fs-idm124378288\">Table 9.4<\/a>.<\/p>\r\n\r\n<div id=\"fs-idm34056944\" class=\"os-table \">\r\n<table summary=\"Table 9.3 Online Class \" data-id=\"fs-idm34056944\">\r\n<tbody>\r\n<tr>\r\n<td>67.6<\/td>\r\n<td>41.2<\/td>\r\n<td>85.3<\/td>\r\n<td>55.9<\/td>\r\n<td>82.4<\/td>\r\n<td>91.2<\/td>\r\n<td>73.5<\/td>\r\n<td>94.1<\/td>\r\n<td>64.7<\/td>\r\n<td>64.7<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>70.6<\/td>\r\n<td>38.2<\/td>\r\n<td>61.8<\/td>\r\n<td>88.2<\/td>\r\n<td>70.6<\/td>\r\n<td>58.8<\/td>\r\n<td>91.2<\/td>\r\n<td>73.5<\/td>\r\n<td>82.4<\/td>\r\n<td>35.5<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>94.1<\/td>\r\n<td>88.2<\/td>\r\n<td>64.7<\/td>\r\n<td>55.9<\/td>\r\n<td>88.2<\/td>\r\n<td>97.1<\/td>\r\n<td>85.3<\/td>\r\n<td>61.8<\/td>\r\n<td>79.4<\/td>\r\n<td>79.4<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<div class=\"os-caption-container\"><span class=\"os-title-label\">Table <\/span><span class=\"os-number\">9.3<\/span> <span class=\"os-title\" data-type=\"title\">Online Class<\/span><\/div>\r\n<\/div>\r\n<div id=\"fs-idm124378288\" class=\"os-table \">\r\n<table summary=\"Table 9.4 Face-to-face Class \" data-id=\"fs-idm124378288\">\r\n<tbody>\r\n<tr>\r\n<td>77.9<\/td>\r\n<td>95.3<\/td>\r\n<td>81.2<\/td>\r\n<td>74.1<\/td>\r\n<td>98.8<\/td>\r\n<td>88.2<\/td>\r\n<td>85.9<\/td>\r\n<td>92.9<\/td>\r\n<td>87.1<\/td>\r\n<td>88.2<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>69.4<\/td>\r\n<td>57.6<\/td>\r\n<td>69.4<\/td>\r\n<td>67.1<\/td>\r\n<td>97.6<\/td>\r\n<td>85.9<\/td>\r\n<td>88.2<\/td>\r\n<td>91.8<\/td>\r\n<td>78.8<\/td>\r\n<td>71.8<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>98.8<\/td>\r\n<td>61.2<\/td>\r\n<td>92.9<\/td>\r\n<td>90.6<\/td>\r\n<td>97.6<\/td>\r\n<td>100<\/td>\r\n<td>95.3<\/td>\r\n<td>83.5<\/td>\r\n<td>92.9<\/td>\r\n<td>89.4<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<div 
class=\"os-caption-container\"><span class=\"os-title-label\">Table <\/span><span class=\"os-number\">9.4<\/span> <span class=\"os-title\" data-type=\"title\">Face-to-face Class<\/span><\/div>\r\n<\/div>\r\n<div id=\"fs-idm298748224\" class=\" unnumbered\" data-type=\"exercise\"><header><\/header><section>\r\n<div id=\"fs-idm181409216\" data-type=\"problem\">\r\n<div class=\"os-problem-container \">\r\n<p id=\"fs-idm28696464\">Is the mean of the Final Exam scores of the online class lower than the mean of the Final Exam scores of the face-to-face class? Test at a 5% significance level. Answer the following questions:<\/p>\r\n\r\n<ol id=\"fs-idp180387984\" type=\"a\">\r\n \t<li>Is this a test of two means or two proportions?<\/li>\r\n \t<li>Are the population standard deviations known or unknown?<\/li>\r\n \t<li>Which distribution do you use to perform the test?<\/li>\r\n \t<li>What is the random variable?<\/li>\r\n \t<li>What are the null and alternative hypotheses? Write the null and alternative hypotheses in words and in symbols.<\/li>\r\n \t<li>Is this test right, left, or two tailed?<\/li>\r\n \t<li>What is the <em data-effect=\"italics\">p<\/em>-value?<\/li>\r\n \t<li>Do you reject or not reject the null hypothesis?<\/li>\r\n \t<li>At the ___ level of significance, from the sample data, there ______ (is\/is not) sufficient evidence to conclude that ______.<\/li>\r\n<\/ol>\r\n<p id=\"fs-idp169691824\">(See the conclusion in <a class=\"autogenerated-content\" href=\"#element-968\">Example 9.2<\/a>, and write yours in a similar fashion)<\/p>\r\n\r\n<div id=\"fs-idm107478368\" class=\"ui-has-child-title\" data-type=\"note\" data-has-label=\"true\" data-label=\"\"><header>\r\n<h3 class=\"os-title\" data-type=\"title\"><span id=\"28\" class=\"os-title-label\" data-type=\"\">Note<\/span><\/h3>\r\n<\/header><section>\r\n<div class=\"os-note-body\">\r\n<p id=\"fs-idm218153776\">Be careful not to mix up the information for Group 1 and Group 
2!<\/p>\r\n\r\n<\/div>\r\n<\/section><\/div>\r\n<\/div>\r\n<\/div>\r\n<div id=\"fs-idm168184384\" data-type=\"solution\" aria-label=\"show solution\" aria-expanded=\"false\">\r\n<div class=\"ui-toggle-wrapper\"><\/div>\r\n<section class=\"ui-body\" style=\"display: block; overflow: hidden;\" role=\"alert\">\r\n<h4 data-type=\"solution-title\"><span class=\"os-title-label\">Solution <\/span><span class=\"os-number\">9.3<\/span><\/h4>\r\nBefore you begin, copy and paste the data from each sample into a Google Sheet. Then use the AVERAGE, STDEV.S, and COUNT functions to get the sample mean, sample standard deviation, and sample size for each sample.\r\n\r\nTreating the online class as Group 1 and the face-to-face class as Group 2, you can then use these values to calculate the test statistic as -3.229.\r\n<div class=\"os-solution-container\">\r\n<ol id=\"fs-idm11713040\" type=\"a\">\r\n \t<li>two means<\/li>\r\n \t<li>unknown<\/li>\r\n \t<li>Student's <em data-effect=\"italics\">t<\/em> with <em>df<\/em> = 30-1 = 29<\/li>\r\n \t<li>$\\bar X_1-\\bar X_2$<\/li>\r\n \t<li>\r\n<ol id=\"fs-idm109578096\">\r\n \t<li><em data-effect=\"italics\">H<sub>0<\/sub><\/em>: <em data-effect=\"italics\">\u03bc<sub>1<\/sub><\/em> = <em data-effect=\"italics\">\u03bc<sub>2<\/sub><\/em> Null hypothesis: the means of the final exam scores are equal for the online and face-to-face statistics classes.<\/li>\r\n \t<li><em data-effect=\"italics\">H<sub>1<\/sub><\/em>: <em data-effect=\"italics\">\u03bc<sub>1<\/sub><\/em> &lt; <em data-effect=\"italics\">\u03bc<sub>2<\/sub><\/em> Alternative hypothesis: the mean of the final exam scores of the online class is less than the mean of the final exam scores of the face-to-face class.<\/li>\r\n<\/ol>\r\n<\/li>\r\n \t<li>left-tailed<\/li>\r\n \t<li><em data-effect=\"italics\">p<\/em>-value = 0.0015 using the Google Sheets function\r\n<code><bdo dir=\"ltr\"><span class=\"formula-content\"><span class=\" default-formula-text-color\" dir=\"auto\">=<\/span><span class=\" default-formula-text-color\" dir=\"auto\">TDIST<\/span><span 
class=\" default-formula-text-color\" dir=\"auto\">(<\/span><span class=\"number\" dir=\"auto\">3.229<\/span><span class=\" default-formula-text-color\" dir=\"auto\">,<\/span><span class=\"number\" dir=\"auto\">29<\/span><span class=\" default-formula-text-color\" dir=\"auto\">,<\/span><span class=\"number\" dir=\"auto\">1<\/span><span class=\" default-formula-text-color\" dir=\"auto\">)<\/span><\/span><\/bdo><\/code><\/li>\r\n \t<li>Reject the null hypothesis<\/li>\r\n \t<li>The professor was correct. The evidence shows that the mean of the final exam scores for the online class is lower than that of the face-to-face class.\r\n<span data-type=\"newline\">\r\n<\/span>At the <u data-effect=\"underline\">5%<\/u> level of significance, from the sample data, there is (is\/is not) sufficient evidence to conclude that the mean of the final exam scores for the online class is less than <u data-effect=\"underline\">the mean of final exam scores of the face-to-face class.<\/u><\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/section><\/div>\r\n<\/section><\/div>\r\n<\/div>\r\n<\/section><\/div>\r\n<\/div>\r\n","rendered":"<p><span style=\"display: none;\"><br \/>\n[latexpage]<br \/>\n<\/span><\/p>\n<div id=\"01e79f95-f1ac-4eab-bc98-375b3e0c6948\" class=\"chapter-content-module\" data-type=\"page\" data-cnxml-to-html-ver=\"2.1.0\">\n<ol id=\"element-984\">\n<li>The two independent samples are simple random samples from two distinct populations.<\/li>\n<li>For the two distinct populations:\n<ul id=\"fs-idm39244848\">\n<li>if the sample sizes are small, the distributions are important (should be normal)<\/li>\n<li>if the sample sizes are large, the distributions are not important (need not be normal)<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<div id=\"fs-idm107349216\" class=\"ui-has-child-title\" data-type=\"note\" data-has-label=\"true\" data-label=\"\">\n<header>\n<h3 class=\"os-title\" data-type=\"title\"><span id=\"1\" class=\"os-title-label\" 
data-type=\"\">NOTE<\/span><\/h3>\n<\/header>\n<section>\n<div class=\"os-note-body\">\n<p id=\"fs-idm165699296\">The test comparing two independent population means with unknown and possibly unequal population standard deviations is called the Aspin-Welch t-test. The degrees of freedom formula was developed by Aspin-Welch.<\/p>\n<\/div>\n<\/section>\n<\/div>\n<p>The comparison of two population means is very common. A difference between the two samples depends on both the means and the standard deviations. Very different means can occur by chance if there is great variation among the individual samples. In order to account for the variation, we take the difference of the sample means, $\\bar X_1 &#8211; \\bar X_2$, and divide by the standard error in order to standardize the difference. The result is a t-score test statistic.<\/p>\n<p>Because we do not know the population standard deviations, we estimate them using the two sample standard deviations from our independent samples. For the hypothesis test, we calculate the estimated standard deviation, or <span id=\"term197\" data-type=\"term\">standard error<\/span>, of <strong>the difference in sample means<\/strong>, $\\bar X_1 &#8211; \\bar X_2$.<\/p>\n<p>T<span data-type=\"title\">he standard error is:<\/span><\/p>\n<p>$$\\sqrt{\\frac{(s_1)^2}{n_1}+\\frac{(s_2)^2}{n_2}}$$<\/p>\n<p>The test statistic (<em data-effect=\"italics\">t<\/em>-score) is calculated as follows:<\/p>\n<p>$$t=\\frac{(\\bar x_1 &#8211; \\bar x_2) &#8211; (\\mu_1-\\mu_2)}{\\sqrt{\\frac{(s_1)^2}{n_1}+\\frac{(s_2)^2}{n_2}}}$$<\/p>\n<p>where:<\/p>\n<div id=\"list1\" data-type=\"list\">\n<ul>\n<li><em data-effect=\"italics\">s<\/em><sub>1<\/sub> and <em data-effect=\"italics\">s<\/em><sub>2<\/sub>, the sample standard deviations, are estimates of <em data-effect=\"italics\">\u03c3<\/em><sub>1<\/sub> and <em data-effect=\"italics\">\u03c3<\/em><sub>2<\/sub>, respectively.<\/li>\n<li><em data-effect=\"italics\">\u03c3<\/em><sub>1<\/sub> and <em 
data-effect=\"italics\">\u03c3<\/em><sub>2<\/sub> are the unknown population standard deviations.<\/li>\n<li>$\\bar x_1$ and $\\bar x_2$ are the sample means.<\/li>\n<li><em data-effect=\"italics\">\u03bc<\/em><sub>1<\/sub> and <em data-effect=\"italics\">\u03bc<\/em><sub>2<\/sub> are the population means, and the null hypothesis will always be <em data-effect=\"italics\">H<\/em><sub>0<\/sub>: <em data-effect=\"italics\">\u03bc<sub>1<\/sub><\/em> = <em data-effect=\"italics\">\u03bc<sub>2 <\/sub><\/em><\/li>\n<\/ul>\n<div class=\"textbox note\">\n<h3>Note<\/h3>\n<p><em data-effect=\"italics\">\u03bc<\/em><sub>1<\/sub> and <em data-effect=\"italics\">\u03bc<\/em><sub>2<\/sub> are the population means, and the null hypothesis will always be <em data-effect=\"italics\">H<\/em><sub>0<\/sub>: <em data-effect=\"italics\">\u03bc<sub>1<\/sub><\/em> = <em data-effect=\"italics\">\u03bc<sub>2 <\/sub>. <\/em>This means that the calculation $(\\mu_1-\\mu_2)$ in our test statistic will always be zero, so we can simplify the test statistic calculation as<\/p>\n<p>$$t=\\frac{\\bar x_1 &#8211; \\bar x_2}{\\sqrt{\\frac{(s_1)^2}{n_1}+\\frac{(s_2)^2}{n_2}}}$$<\/p>\n<\/div>\n<\/div>\n<p id=\"element-256\" class=\"finger\">The number of <span id=\"term198\" data-type=\"term\">degrees of freedom (<em data-effect=\"italics\">df<\/em>)<\/span> requires a somewhat complicated calculation. However, for this introductory course, we will be using a simplified calculation. The test statistic calculated previously is approximated by the Student&#8217;s <em data-effect=\"italics\">t<\/em>-distribution with <em data-effect=\"italics\">df <\/em>that is 1 less than the smaller of $n_1$ and $n_2$ from your two samples. 
For example, if $n_1=14$ and $n_2=9$, we would use $df=8$.<\/p>\n<p id=\"element-316\">Notice that the sample variances (<em data-effect=\"italics\">s<\/em><sub>1<\/sub>)<sup>2<\/sup> and (<em data-effect=\"italics\">s<\/em><sub>2<\/sub>)<sup>2<\/sup> are not pooled, meaning we don&#8217;t consider the standard deviation of the samples as if they were both just one big sample together. (If the question comes up, do not pool the variances for means.)<\/p>\n<div id=\"element-208\" class=\"ui-has-child-title\" data-type=\"example\">\n<header>\n<h3 class=\"os-title\"><span class=\"os-title-label\">Example <\/span><span class=\"os-number\">9.1<\/span><\/h3>\n<\/header>\n<section>\n<div class=\"body\">\n<h4 id=\"4\" data-type=\"title\">Independent groups<\/h4>\n<p id=\"element-687\">The average amount of time boys and girls aged seven to 11 spend playing sports each day is believed to be the same. A study is done and data are collected, resulting in the data in <a class=\"autogenerated-content\" href=\"#uid888\">Table 9.1<\/a>. 
Each population has a normal distribution.<\/p>\n<div id=\"uid888\" class=\"os-table\">\n<table summary=\"Table 9.1\" data-id=\"uid888\">\n<thead valign=\"top\">\n<tr>\n<th scope=\"col\" data-align=\"center\"><\/th>\n<th scope=\"col\" data-align=\"center\">Sample Size<\/th>\n<th scope=\"col\" data-align=\"center\">Average Number of Hours Playing Sports Per Day<\/th>\n<th scope=\"col\" data-align=\"center\">Sample Standard Deviation<\/th>\n<\/tr>\n<\/thead>\n<tbody valign=\"top\">\n<tr>\n<td>Girls<\/td>\n<td>9<\/td>\n<td>2<\/td>\n<td>0.866<\/td>\n<\/tr>\n<tr>\n<td>Boys<\/td>\n<td>16<\/td>\n<td>3.2<\/td>\n<td>1.00<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"os-caption-container\"><span class=\"os-title-label\">Table <\/span><span class=\"os-number\">9.1<\/span><\/div>\n<\/div>\n<div id=\"element-114\" class=\"unnumbered\" data-type=\"exercise\">\n<header><\/header>\n<section>\n<div id=\"id6322791\" data-type=\"problem\">\n<div class=\"os-problem-container\">\n<p id=\"element-939\">Is there a difference in the mean amount of time boys and girls aged seven to 11 play sports each day? Test at the 5% level of significance.<\/p>\n<\/div>\n<\/div>\n<div id=\"id21329984\" data-type=\"solution\" aria-label=\"show solution\" aria-expanded=\"false\">\n<div class=\"ui-toggle-wrapper\"><\/div>\n<section class=\"ui-body\" style=\"display: block; overflow: hidden;\" role=\"alert\">\n<h4 data-type=\"solution-title\"><span class=\"os-title-label\">Solution <\/span><span class=\"os-number\">9.1<\/span><\/h4>\n<div class=\"os-solution-container\">\n<p id=\"element-296\"><strong>The population standard deviations are not known.<\/strong> Let <em data-effect=\"italics\">g<\/em> be the subscript for girls and <em data-effect=\"italics\">b<\/em> be the subscript for boys. Then, <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> is the population mean for girls and <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em> is the population mean for boys. 
This is a test of two <strong>independent groups<\/strong>, two population <strong>means<\/strong>.<\/p>\n<p id=\"element-221\"><span id=\"term199\" data-type=\"term\">Random variable<\/span>: $\\bar X_g-\\bar X_b=$ difference in the sample mean amount of time girls and boys play sports each day.<\/p>\n<p><span data-type=\"newline\"><br \/>\n<\/span><em data-effect=\"italics\">H<\/em><sub>0<\/sub>: <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> = <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em>\u2003\u2003<em data-effect=\"italics\">H<sub>0<\/sub><\/em>: <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> \u2013 <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em> = 0<\/p>\n<p><span data-type=\"newline\"><br \/>\n<\/span><em data-effect=\"italics\">H<sub>1<\/sub><\/em>: <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> \u2260 <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em>\u2003\u2003<em data-effect=\"italics\">H<sub>1<\/sub><\/em>: <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> \u2013 <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em> \u2260 0<br \/>\n<span data-type=\"newline\"><br \/>\n<\/span>The words <strong>&#8220;the same&#8221;<\/strong> tell you <em data-effect=\"italics\">H<sub>0<\/sub><\/em> has an &#8220;=&#8221;. 
Since there are no other words to indicate <em data-effect=\"italics\">H<sub>1<\/sub><\/em>, assume it says <strong>&#8220;is different.&#8221;<\/strong><br \/>\nThis is a two-tailed test (from the fact that <em data-effect=\"italics\">H<sub>1<\/sub><\/em>: <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> \u2260 <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em>).<\/p>\n<p><strong>Calculate the test statistic:<\/strong><\/p>\n<p>$$t=\\frac{\\bar x_g-\\bar x_b}{\\sqrt{\\frac{(s_g)^2}{n_g}+\\frac{(s_b)^2}{n_b}}}=\\frac{2-3.2}{\\sqrt{\\frac{(0.866)^2}{9}+\\frac{(1)^2}{16}}}=-3.142390292$$<\/p>\n<p id=\"element-529\"><strong>Distribution for the test:<\/strong><br \/>\nUse <em data-effect=\"italics\">t<sub>df<\/sub><\/em> where <em data-effect=\"italics\">df<\/em> = min($n_g, n_b$)-1 = min(9,16)-1 = 9-1 = 8.<\/p>\n<p><strong>Find the area in the tails beyond the test statistic:<\/strong><\/p>\n<p>Our test statistic is negative, so it lies in the left tail. Because this is a two-tailed test, the <em data-effect=\"italics\">p<\/em>-value is the combined area in both tails, which we get using Google Sheets.<\/p>\n<p><code><bdo dir=\"ltr\"><span class=\"formula-content\"><span class=\"default-formula-text-color\" dir=\"auto\">=<\/span><span class=\"default-formula-text-color\" dir=\"auto\">TDIST<\/span><span class=\"default-formula-text-color\" dir=\"auto\">( <\/span><span class=\"number\" dir=\"auto\">3.142390292<\/span><span class=\"default-formula-text-color\" dir=\"auto\">, <\/span><span class=\"number\" dir=\"auto\">8<\/span><span class=\"default-formula-text-color\" dir=\"auto\">, <\/span><span class=\"number\" dir=\"auto\">2 <\/span><span class=\"default-formula-text-color\" dir=\"auto\">)<\/span><\/span><\/bdo><\/code><\/p>\n<p>The first value given to the TDIST function is the <strong>positive version<\/strong> of our test statistic, the second value is our degrees of freedom, and the third is the number of tails.<\/p>\n<p><em data-effect=\"italics\">p<\/em>-value = 0.0138<\/p>\n<p id=\"element-823\"><strong>Make a decision:<\/strong> Since <em 
data-effect=\"italics\">\u03b1<\/em> &gt; <em data-effect=\"italics\">p<\/em>-value (0.05 &gt; 0.0138), reject <em data-effect=\"italics\">H<sub>0<\/sub><\/em>. This means you reject <em data-effect=\"italics\">\u03bc<sub>g<\/sub><\/em> = <em data-effect=\"italics\">\u03bc<sub>b<\/sub><\/em>. The evidence suggests the means are different.<\/p>\n<p id=\"element-914\"><strong>Conclusion:<\/strong> At the 5% level of significance, the sample data show there is sufficient evidence to conclude that the mean number of hours that girls and boys aged seven to 11 play sports per day is different (either the mean for boys is greater than the mean for girls, or the mean for girls is greater than the mean for boys).<\/p>\n<div class=\"textbox\">If we used our Student&#8217;s <em>t<\/em>-distribution table instead of Google Sheets, we would look in the 8th row (<em>df<\/em> = 8) and find the two columns between which our test statistic, rounded to 3.142, falls.<br \/>\n<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-628\" src=\"https:\/\/textbooks.jaykesler.net\/introstats\/wp-content\/uploads\/sites\/2\/2021\/01\/example10_1-pvalue.png\" alt=\"\" width=\"1176\" height=\"493\" \/><br \/>\nWe see that 3.142 is less than 3.355, but greater than 2.896, so we find that our <em>p<\/em>-value is greater than 0.01 and less than 0.02.<br \/>\n0.01 &lt; <em>p<\/em>-value &lt; 0.02<br \/>\nSince the significance level is 0.05, we know that our <em>p<\/em>-value is less than the significance level (if our <em>p<\/em>-value is less than 0.02, it <strong>must<\/strong> be less than 0.05), so the conclusion is the same as using Google Sheets.<\/div>\n<p>&nbsp;<\/p>\n<\/div>\n<\/section>\n<\/div>\n<\/section>\n<\/div>\n<\/div>\n<\/section>\n<\/div>\n<div id=\"fs-idp67783984\" class=\"statistics try ui-has-child-title\" data-type=\"note\" data-has-label=\"true\" data-label=\"\">\n<header>\n<h3 
class=\"os-title\"><span class=\"os-title-label\">Try It <\/span><span class=\"os-number\">9.1<\/span><\/h3>\n<\/header>\n<section>\n<div id=\"fs-idm235264144\" class=\"unnumbered\" data-type=\"exercise\">\n<header><\/header>\n<section>\n<div id=\"fs-idm162906288\" data-type=\"problem\">\n<div class=\"os-problem-container\">\n<p id=\"fs-idm193306272\">Two samples are shown in <a class=\"autogenerated-content\" href=\"#fs-idm99631856\">Table 9.2<\/a>. Both have normal distributions. The means for the two populations are thought to be the same. Is there a difference in the means? Test at the 5% level of significance.<\/p>\n<div id=\"fs-idm99631856\" class=\"os-table\">\n<table summary=\"Table 9.2\" data-id=\"fs-idm99631856\">\n<thead>\n<tr>\n<th scope=\"col\"><\/th>\n<th scope=\"col\">Sample Size<\/th>\n<th scope=\"col\">Sample Mean<\/th>\n<th scope=\"col\">Sample Standard Deviation<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Population A<\/td>\n<td>25<\/td>\n<td>5<\/td>\n<td>1<\/td>\n<\/tr>\n<tr>\n<td>Population B<\/td>\n<td>16<\/td>\n<td>4.7<\/td>\n<td>1.2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"os-caption-container\"><span class=\"os-title-label\">Table <\/span><span class=\"os-number\">9.2<\/span><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/section>\n<\/div>\n<\/section>\n<\/div>\n<div id=\"element-968\" class=\"ui-has-child-title\" data-type=\"example\">\n<header>\n<h3 class=\"os-title\"><span class=\"os-title-label\">Example <\/span><span class=\"os-number\">9.2<\/span><\/h3>\n<\/header>\n<section>\n<div class=\"body\">\n<p id=\"element-980\">A study is done by a community group in two neighboring colleges to determine which one graduates students with more math classes. College A samples 11 graduates. Their average is 4 math classes with a standard deviation of 1.5 math classes. College B samples 10 graduates. Their average is 3.5 math classes with a standard deviation of 1 math class. 
The community group believes that a student who graduates from college A <strong>has taken more math classes,<\/strong> on average. Both populations have a normal distribution. Test at a 1% significance level. Answer the following questions.<\/p>\n<p>&nbsp;<\/p>\n<div id=\"fs-idm178911728\" class=\"unnumbered\" data-type=\"exercise\">\n<header><\/header>\n<section>\n<div id=\"fs-idm177203296\" data-type=\"problem\">\n<div class=\"os-problem-container\">\n<ol>\n<li id=\"fs-idm171395200\">Is this a test of two means or two proportions?<\/li>\n<li>Are the population standard deviations known or unknown?<\/li>\n<li>Which distribution do you use to perform the test?<\/li>\n<li>What is the random variable?<\/li>\n<li>What are the null and alternative hypotheses? Write the null and alternative hypotheses in words and in symbols.<\/li>\n<li>Is this test right-, left-, or two-tailed?<\/li>\n<li>What is the <em data-effect=\"italics\">p<\/em>-value?<\/li>\n<li>Do you reject or not reject the null hypothesis?<\/li>\n<li><strong>Conclusion:<\/strong><\/li>\n<\/ol>\n<\/div>\n<\/div>\n<\/section>\n<\/div>\n<div id=\"fs-idm65144592\" class=\"unnumbered\" data-type=\"exercise\">\n<section>\n<div id=\"fs-idm31976880\" data-type=\"solution\" aria-label=\"show solution\" aria-expanded=\"false\">\n<section class=\"ui-body\" style=\"display: block; overflow: hidden;\" role=\"alert\">\n<h4 data-type=\"solution-title\"><span class=\"os-title-label\">Solution <\/span><span class=\"os-number\">9.2<\/span><\/h4>\n<ol>\n<li>two means<\/li>\n<li>unknown<\/li>\n<li>Student&#8217;s <em data-effect=\"italics\">t<\/em> with <em>df<\/em> = 10-1 = 9<\/li>\n<li>$\\bar X_A -\\bar X_B$ (or $\\bar X_1 -\\bar X_2$ )<\/li>\n<li>$H_0: \\mu_A = \\mu_B$<br \/>\n$H_1: \\mu_A &gt; \\mu_B$<\/li>\n<li>right-tailed<\/li>\n<li>0.1942 (by Google Sheets with the following function)<br \/>\n<code><bdo dir=\"ltr\"><span class=\"formula-content\"><span class=\"default-formula-text-color\" dir=\"auto\">=<\/span><span 
class=\"default-formula-text-color\" dir=\"auto\">TDIST<\/span><span class=\"default-formula-text-color\" dir=\"auto\">(<\/span><span class=\"number\" dir=\"auto\">0.906032848<\/span><span class=\"default-formula-text-color\" dir=\"auto\">,<\/span><span class=\"number\" dir=\"auto\">9<\/span><span class=\"default-formula-text-color\" dir=\"auto\">,<\/span><span class=\"number\" dir=\"auto\">1<\/span><span class=\"default-formula-text-color\" dir=\"auto\">)<\/span><\/span><\/bdo><\/code><br \/>\n(0.906032848 is the test statistic)<\/li>\n<li>Do not reject H<sub>0<\/sub><\/li>\n<li>At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that a student who graduates from college A has taken more math classes, on the average, than a student who graduates from college B.<\/li>\n<\/ol>\n<\/section>\n<\/div>\n<\/section>\n<\/div>\n<\/div>\n<\/section>\n<\/div>\n<div id=\"fs-idp145514592\" class=\"statistics try ui-has-child-title\" data-type=\"note\" data-has-label=\"true\" data-label=\"\">\n<header>\n<h3 class=\"os-title\"><span class=\"os-title-label\">Try It <\/span><span class=\"os-number\">9.2<\/span><\/h3>\n<\/header>\n<section>\n<div id=\"eip-idp166826064\" class=\"unnumbered\" data-type=\"exercise\">\n<header><\/header>\n<section>\n<div id=\"eip-idp166826320\" data-type=\"problem\">\n<div class=\"os-problem-container\">\n<p id=\"fs-idm170467472\">A study is done to determine if Company A retains its workers longer than Company B. Company A samples 15 workers, and their average time with the company is five years with a standard deviation of 1.2. Company B samples 20 workers, and their average time with the company is 4.5 years with a standard deviation of 0.8. The populations are normally distributed.<\/p>\n<ol id=\"fs-idp55515664\" type=\"a\">\n<li>Are the population standard deviations known?<\/li>\n<li>Conduct an appropriate hypothesis test. 
At the 5% significance level, what is your conclusion?<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<\/section>\n<\/div>\n<\/section>\n<\/div>\n<div id=\"fs-idm22935408\" class=\"ui-has-child-title\" data-type=\"example\">\n<header>\n<h3 class=\"os-title\"><span class=\"os-title-label\">Example <\/span><span class=\"os-number\">9.3<\/span><\/h3>\n<\/header>\n<section>\n<div class=\"body\">\n<p id=\"fs-idm10086800\">A professor at a large community college wanted to determine whether there is a difference in the means of final exam scores between students who took his statistics course online and the students who took his face-to-face statistics class. He believed that the mean of the final exam scores for the online class would be lower than that of the face-to-face class. Was the professor correct? The randomly selected 30 final exam scores from each group are listed in <a class=\"autogenerated-content\" href=\"#fs-idm34056944\">Table 9.3<\/a> and <a class=\"autogenerated-content\" href=\"#fs-idm124378288\">Table 9.4<\/a>.<\/p>\n<div id=\"fs-idm34056944\" class=\"os-table\">\n<table summary=\"Table 9.3 Online Class\" data-id=\"fs-idm34056944\">\n<tbody>\n<tr>\n<td>67.6<\/td>\n<td>41.2<\/td>\n<td>85.3<\/td>\n<td>55.9<\/td>\n<td>82.4<\/td>\n<td>91.2<\/td>\n<td>73.5<\/td>\n<td>94.1<\/td>\n<td>64.7<\/td>\n<td>64.7<\/td>\n<\/tr>\n<tr>\n<td>70.6<\/td>\n<td>38.2<\/td>\n<td>61.8<\/td>\n<td>88.2<\/td>\n<td>70.6<\/td>\n<td>58.8<\/td>\n<td>91.2<\/td>\n<td>73.5<\/td>\n<td>82.4<\/td>\n<td>35.5<\/td>\n<\/tr>\n<tr>\n<td>94.1<\/td>\n<td>88.2<\/td>\n<td>64.7<\/td>\n<td>55.9<\/td>\n<td>88.2<\/td>\n<td>97.1<\/td>\n<td>85.3<\/td>\n<td>61.8<\/td>\n<td>79.4<\/td>\n<td>79.4<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"os-caption-container\"><span class=\"os-title-label\">Table <\/span><span class=\"os-number\">9.3<\/span> <span class=\"os-title\" data-type=\"title\">Online Class<\/span><\/div>\n<\/div>\n<div id=\"fs-idm124378288\" class=\"os-table\">\n<table summary=\"Table 9.4 Face-to-face 
Class\" data-id=\"fs-idm124378288\">\n<tbody>\n<tr>\n<td>77.9<\/td>\n<td>95.3<\/td>\n<td>81.2<\/td>\n<td>74.1<\/td>\n<td>98.8<\/td>\n<td>88.2<\/td>\n<td>85.9<\/td>\n<td>92.9<\/td>\n<td>87.1<\/td>\n<td>88.2<\/td>\n<\/tr>\n<tr>\n<td>69.4<\/td>\n<td>57.6<\/td>\n<td>69.4<\/td>\n<td>67.1<\/td>\n<td>97.6<\/td>\n<td>85.9<\/td>\n<td>88.2<\/td>\n<td>91.8<\/td>\n<td>78.8<\/td>\n<td>71.8<\/td>\n<\/tr>\n<tr>\n<td>98.8<\/td>\n<td>61.2<\/td>\n<td>92.9<\/td>\n<td>90.6<\/td>\n<td>97.6<\/td>\n<td>100<\/td>\n<td>95.3<\/td>\n<td>83.5<\/td>\n<td>92.9<\/td>\n<td>89.4<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"os-caption-container\"><span class=\"os-title-label\">Table <\/span><span class=\"os-number\">9.4<\/span> <span class=\"os-title\" data-type=\"title\">Face-to-face Class<\/span><\/div>\n<\/div>\n<div id=\"fs-idm298748224\" class=\"unnumbered\" data-type=\"exercise\">\n<header><\/header>\n<section>\n<div id=\"fs-idm181409216\" data-type=\"problem\">\n<div class=\"os-problem-container\">\n<p id=\"fs-idm28696464\">Is the mean of the Final Exam scores of the online class lower than the mean of the Final Exam scores of the face-to-face class? Test at a 5% significance level. Answer the following questions:<\/p>\n<ol id=\"fs-idp180387984\" type=\"a\">\n<li>Is this a test of two means or two proportions?<\/li>\n<li>Are the population standard deviations known or unknown?<\/li>\n<li>Which distribution do you use to perform the test?<\/li>\n<li>What is the random variable?<\/li>\n<li>What are the null and alternative hypotheses? 
Write the null and alternative hypotheses in words and in symbols.<\/li>\n<li>Is this test right-, left-, or two-tailed?<\/li>\n<li>What is the <em data-effect=\"italics\">p<\/em>-value?<\/li>\n<li>Do you reject or not reject the null hypothesis?<\/li>\n<li>At the ___ level of significance, from the sample data, there ______ (is\/is not) sufficient evidence to conclude that ______.<\/li>\n<\/ol>\n<p id=\"fs-idp169691824\">(See the conclusion in <a class=\"autogenerated-content\" href=\"#element-968\">Example 9.2<\/a>, and write yours in a similar fashion.)<\/p>\n<div id=\"fs-idm107478368\" class=\"ui-has-child-title\" data-type=\"note\" data-has-label=\"true\" data-label=\"\">\n<header>\n<h3 class=\"os-title\" data-type=\"title\"><span id=\"28\" class=\"os-title-label\" data-type=\"\">Note<\/span><\/h3>\n<\/header>\n<section>\n<div class=\"os-note-body\">\n<p id=\"fs-idm218153776\">Be careful not to mix up the information for Group 1 and Group 2!<\/p>\n<\/div>\n<\/section>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"fs-idm168184384\" data-type=\"solution\" aria-label=\"show solution\" aria-expanded=\"false\">\n<div class=\"ui-toggle-wrapper\"><\/div>\n<section class=\"ui-body\" style=\"display: block; overflow: hidden;\" role=\"alert\">\n<h4 data-type=\"solution-title\"><span class=\"os-title-label\">Solution <\/span><span class=\"os-number\">9.3<\/span><\/h4>\n<p>Before you begin, you should copy and paste the data from each sample into a Google Sheet. 
Then use the AVERAGE, STDEV.S, and COUNT functions to get the sample mean, sample standard deviation, and sample size for each sample.<\/p>\n<p>You can then use the information above to calculate the test statistic as -3.229.<\/p>\n<div class=\"os-solution-container\">\n<ol id=\"fs-idm11713040\" type=\"a\">\n<li>two means<\/li>\n<li>unknown<\/li>\n<li>Student&#8217;s <em data-effect=\"italics\">t<\/em> with <em>df<\/em> = 30-1 = 29<\/li>\n<li>$\\bar X_1-\\bar X_2$, where group 1 is the online class and group 2 is the face-to-face class<\/li>\n<li>\n<ol id=\"fs-idm109578096\">\n<li><em data-effect=\"italics\">H<sub>0<\/sub><\/em>: <em data-effect=\"italics\">\u03bc<sub>1<\/sub><\/em> = <em data-effect=\"italics\">\u03bc<sub>2<\/sub><\/em> Null hypothesis: the means of the final exam scores are equal for the online and face-to-face statistics classes.<\/li>\n<li><em data-effect=\"italics\">H<sub>1<\/sub><\/em>: <em data-effect=\"italics\">\u03bc<sub>1<\/sub><\/em> &lt; <em data-effect=\"italics\">\u03bc<sub>2<\/sub><\/em> Alternative hypothesis: the mean of the final exam scores of the online class is less than the mean of the final exam scores of the face-to-face class.<\/li>\n<\/ol>\n<\/li>\n<li>left-tailed<\/li>\n<li><em data-effect=\"italics\">p<\/em>-value = 0.0015 using the Google Sheets function<br \/>\n<code><bdo dir=\"ltr\"><span class=\"formula-content\"><span class=\"default-formula-text-color\" dir=\"auto\">=<\/span><span class=\"default-formula-text-color\" dir=\"auto\">TDIST<\/span><span class=\"default-formula-text-color\" dir=\"auto\">(<\/span><span class=\"number\" dir=\"auto\">3.229<\/span><span class=\"default-formula-text-color\" dir=\"auto\">,<\/span><span class=\"number\" dir=\"auto\">29<\/span><span class=\"default-formula-text-color\" dir=\"auto\">,<\/span><span class=\"number\" dir=\"auto\">1<\/span><span class=\"default-formula-text-color\" dir=\"auto\">)<\/span><\/span><\/bdo><\/code><\/li>\n<li>Reject the null hypothesis, because the <em data-effect=\"italics\">p<\/em>-value (0.0015) is less than the significance level (0.05)<\/li>\n<li>The professor was correct. 
The evidence supports the claim that the mean of the final exam scores for the online class is lower than that of the face-to-face class.<br \/>\n<span data-type=\"newline\"><br \/>\n<\/span>At the <u data-effect=\"underline\">5%<\/u> level of significance, from the sample data, there <u data-effect=\"underline\">is<\/u> sufficient evidence to conclude that <u data-effect=\"underline\">the mean of the final exam scores for the online class is less than the mean of the final exam scores of the face-to-face class.<\/u><\/li>\n<\/ol>\n<\/div>\n<\/section>\n<\/div>\n<\/section>\n<\/div>\n<\/div>\n<\/section>\n<\/div>\n<\/div>\n","protected":false},"author":1,"menu_order":1,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-83","chapter","type-chapter","status-publish","hentry"],"part":81,"_links":{"self":[{"href":"https:\/\/textbooks.jaykesler.net\/introstats\/wp-json\/pressbooks\/v2\/chapters\/83","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/textbooks.jaykesler.net\/introstats\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/textbooks.jaykesler.net\/introstats\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/textbooks.jaykesler.net\/introstats\/wp-json\/wp\/v2\/users\/1"}],"version-history":[{"count":3,"href":"https:\/\/textbooks.jaykesler.net\/introstats\/wp-json\/pressbooks\/v2\/chapters\/83\/revisions"}],"predecessor-version":[{"id":393,"href":"https:\/\/textbooks.jaykesler.net\/introstats\/wp-json\/pressbooks\/v2\/chapters\/83\/revisions\/393"}],"part":[{"href":"https:\/\/textbooks.jaykesler.net\/introstats\/wp-json\/pressbooks\/v2\/parts\/81"}],"metadata":[{"href":"https:\/\/textbooks.jaykesler.net\/introstats\/wp-json\/pressbooks\/v2\/chapters\/83\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/textbooks.jaykesler.net\/introstats\/wp-json\/wp\/v2\/media?parent=83"}],"wp:term":[{"taxonomy":"
chapter-type","embeddable":true,"href":"https:\/\/textbooks.jaykesler.net\/introstats\/wp-json\/pressbooks\/v2\/chapter-type?post=83"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/textbooks.jaykesler.net\/introstats\/wp-json\/wp\/v2\/contributor?post=83"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/textbooks.jaykesler.net\/introstats\/wp-json\/wp\/v2\/license?post=83"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}