difference between two population means

As such, it is reasonable to conclude that the special diet has the same effect on body weight as the placebo. Since the interest is in the difference, it makes sense to condense these two measurements into one and consider the difference between the two measurements. In practice, when the sample mean difference is statistically significant, our next step is often to calculate a confidence interval to estimate the size of the population mean difference. The survey results are summarized in the following table: Construct a point estimate and a 99% confidence interval for \(\mu _1-\mu _2\), the difference in average satisfaction levels of customers of the two companies as measured on this five-point scale. Do the data provide sufficient evidence to conclude that, on the average, the new machine packs faster? It measures the standardized difference between two means. We should check, using the Normal Probability Plot, whether there is any violation. If so, then the following formula for a confidence interval for \(\mu _1-\mu _2\) is valid. The p-value, critical value, rejection region, and conclusion are found similarly to what we have done before. We are 95% confident that at Indiana University of Pennsylvania, undergraduate women eating with women order between 9.32 and 252.68 more calories than undergraduate women eating with men. The general form of the interval is \[\begin{array}{l}(\text{sample statistic}) \pm (\text{margin of error})\\ (\text{sample statistic}) \pm (\text{critical T-value})(\text{standard error})\end{array}\] Null hypothesis: \(\mu_1-\mu_2=0\). 95% CI for mu sophomore - mu juniors: (-0.45, 0.173), T-Test mu sophomore = mu juniors (vs not =): T = -0.92. All that is needed is to know how to express the null and alternative hypotheses and to know the formula for the standardized test statistic and the distribution that it follows. 
Now we can apply all we learned for the one sample mean to the difference (Cool!). The only difference is in the formula for the standardized test statistic, where \(D_0\) is a number that is deduced from the statement of the situation. Conduct this test using the rejection region approach. D. the sum of the two estimated population variances. Perform the test of Example \(\PageIndex{2}\) using the \(p\)-value approach. When we are reasonably sure that the two populations have nearly equal variances, then we use the pooled variances test. We need all of the pieces for the confidence interval. The confidence interval for the difference between two means contains all the values of \(\mu_1-\mu_2\) (the difference between the two population means) which would not be rejected in the two-sided hypothesis test of \(H_0: \mu_1=\mu_2\) against \(H_a: \mu_1\neq\mu_2\). If each population is normal, then the sampling distribution of \(\bar{x}_i\) is normal with mean \(\mu_i\), standard error \(\dfrac{\sigma_i}{\sqrt{n_i}}\), and the estimated standard error \(\dfrac{s_i}{\sqrt{n_i}}\), for \(i=1, 2\). The hypotheses for two population means are similar to those for two population proportions. Denote the sample standard deviation of the differences as \(s_d\). So we compute Standard Error for Difference = \(\sqrt{0.0394^2+0.0312^2}\approx 0.05\). The samples must be independent, and each sample must be large. To compare customer satisfaction levels of two competing cable television companies, \(174\) customers of Company \(1\) and \(355\) customers of Company \(2\) were randomly selected and were asked to rate their cable companies on a five-point scale, with \(1\) being least satisfied and \(5\) most satisfied. In the context of the problem we say we are \(99\%\) confident that the average level of customer satisfaction for Company \(1\) is between \(0.15\) and \(0.39\) points higher, on this five-point scale, than that for Company \(2\). 
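The standard error computation quoted above, and the general "point estimate plus or minus margin of error" form of the interval, can be checked numerically. The following is a minimal sketch rather than anything taken from the original sources: the function names `se_difference` and `large_sample_ci` are hypothetical, the inputs 0.0394 and 0.0312 are the two per-sample standard errors quoted in the text, and the values passed to `large_sample_ci` are made-up numbers used only to show the arithmetic.

```python
import math


def se_difference(se1, se2):
    """Standard error of the difference of two independent sample means,
    given the standard error of each sample mean."""
    return math.sqrt(se1 ** 2 + se2 ** 2)


def large_sample_ci(point_estimate, standard_error, critical_value):
    """Interval of the form (estimate - margin, estimate + margin),
    where margin = critical value * standard error."""
    margin = critical_value * standard_error
    return point_estimate - margin, point_estimate + margin


# Values quoted in the text: sqrt(0.0394^2 + 0.0312^2), roughly 0.05.
se = se_difference(0.0394, 0.0312)
print(round(se, 4))  # 0.0503

# Hypothetical inputs, chosen only so the arithmetic is easy to follow.
low, high = large_sample_ci(1.0, 0.5, 2.0)
print(low, high)  # 0.0 2.0
```

Keeping the standard error and the critical value as separate arguments makes it easy to swap a normal multiplier for a \(t\) multiplier when the samples are small.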
The point estimate for the difference between the means of the two populations is 2. We test for a hypothesized difference between two population means: \(H_0: \mu_1=\mu_2\). Refer to Example \(\PageIndex{1}\) concerning the mean satisfaction levels of customers of two competing cable television companies. \(\bar{d}\pm t_{\alpha/2}\frac{s_d}{\sqrt{n}}\), where \(t_{\alpha/2}\) comes from the \(t\)-distribution with \(n-1\) degrees of freedom. Interpret the confidence interval in context. The two populations are independent. Here, we describe estimation and hypothesis-testing procedures for the difference between two population means when the samples are dependent. Before embarking on such an exercise, it is paramount to ensure that the samples taken are independent and sourced from normally distributed populations. We can be more specific about the populations. No information allows us to assume they are equal. Then the common standard deviation can be estimated by the pooled standard deviation: \(s_p=\sqrt{\dfrac{(n_1-1)s_1^2+(n_2-1)s^2_2}{n_1+n_2-2}}\). The same five-step procedure used to test hypotheses concerning a single population mean is used to test hypotheses concerning the difference between two population means. In order to test whether there is a difference between population means, we are going to make three assumptions: The two populations have the same variance. We found that the standard error of the sampling distribution of all sample differences is approximately 72.47. In Inference for a Difference between Population Means, we focused on studies that produced two independent samples. Basic situation: two independent random samples of sizes \(n_1\) and \(n_2\), means \(\bar{X}_1\) and \(\bar{X}_2\), and variances \(\sigma_1^2\) and \(\sigma_2^2\) respectively. The data for such a study follow. To use the methods we developed previously, we need to check the conditions. 
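The pooled standard deviation formula given above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the function names `pooled_sd` and `pooled_se` are hypothetical, the sample sizes and standard deviations are made-up numbers, and the standard error \(s_p\sqrt{1/n_1+1/n_2}\) is the usual companion formula for the pooled test rather than something quoted in this text.

```python
import math


def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation:
    sqrt(((n1-1)*s1^2 + (n2-1)*s2^2) / (n1 + n2 - 2))."""
    numerator = (n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2
    return math.sqrt(numerator / (n1 + n2 - 2))


def pooled_se(n1, s1, n2, s2):
    """Standard error of x1_bar - x2_bar under the equal-variance assumption
    (assumed companion formula: s_p * sqrt(1/n1 + 1/n2))."""
    return pooled_sd(n1, s1, n2, s2) * math.sqrt(1 / n1 + 1 / n2)


# Hypothetical data: n1 = 5, s1 = 3 and n2 = 7, s2 = 4.
# Numerator = 4*9 + 6*16 = 132, denominator = 10, so s_p = sqrt(13.2).
sp = pooled_sd(5, 3.0, 7, 4.0)
print(round(sp, 4))
```

When the two sample standard deviations are equal, the pooled value is that same number, which is a quick sanity check on any implementation.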
where \(C=\dfrac{\frac{s^2_1}{n_1}}{\frac{s^2_1}{n_1}+\frac{s^2_2}{n_2}}\). Dependent sample: The samples are dependent (also called paired data) if each measurement in one sample is matched or paired with a particular measurement in the other sample. We arbitrarily label one population as Population \(1\) and the other as Population \(2\), and subscript the parameters with the numbers \(1\) and \(2\) to tell them apart. Suppose we have two paired samples of size \(n\): \(x_1, x_2, \ldots, x_n\) and \(y_1, y_2, \ldots, y_n\). Their differences are \(d_1=x_1-y_1, d_2=x_2-y_2, \ldots, d_n=x_n-y_n\). In words, we estimate that the average customer satisfaction level for Company \(1\) is \(0.27\) points higher on this five-point scale than it is for Company \(2\). The symbols \(s_{1}^{2}\) and \(s_{2}^{2}\) denote the squares of \(s_1\) and \(s_2\). We can thus proceed with the pooled t-test. You can use a paired t-test in Minitab to perform the test. In the case of two dependent samples, two data values, one for each sample, are collected from the same source (or element) and, hence, these are also called paired or matched samples. When testing for the difference between two population means, we always use the Student's t-distribution. For a right-tailed test, the rejection region is \(t^*>1.8331\). Are these independent samples? Is there a difference between the two populations? For two-sample T-test or two-sample T-intervals, the df value is based on a complicated formula that we do not cover in this course. Suppose we wish to compare the means of two distinct populations. Each population has a mean and a standard deviation. 
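The quantity \(C\) defined above is the ingredient of the "complicated" degrees-of-freedom formula used when the variances are not pooled. The text does not show that formula, so the sketch below assumes it is the standard Satterthwaite approximation, \(df=\dfrac{(n_1-1)(n_2-1)}{(n_2-1)C^2+(1-C)^2(n_1-1)}\); the function name `satterthwaite_df` and the sample values are hypothetical.

```python
def satterthwaite_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the unpooled two-sample t procedure.

    Assumes the Satterthwaite form written in terms of
    C = (s1^2/n1) / (s1^2/n1 + s2^2/n2).
    """
    v1 = s1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2  # estimated variance of the second sample mean
    c = v1 / (v1 + v2)
    return (n1 - 1) * (n2 - 1) / ((n2 - 1) * c ** 2 + (1 - c) ** 2 * (n1 - 1))


# Hypothetical check: equal spreads and equal sizes give C = 0.5,
# and the formula reduces to n1 + n2 - 2.
print(satterthwaite_df(2.0, 10, 2.0, 10))  # 18.0
```

In practice the result is usually not an integer; a common convention is to round it down before looking up a critical value.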
In Minitab, if you choose a lower-tailed or an upper-tailed hypothesis test, an upper or lower confidence bound will be constructed, respectively, rather than a confidence interval. We have our usual two requirements for data collection. Therefore, the test statistic is: \(t^*=\dfrac{\bar{d}-0}{\frac{s_d}{\sqrt{n}}}=\dfrac{0.0804}{\frac{0.0523}{\sqrt{10}}}=4.86\). For two population means, the test statistic is the difference between \(\bar{x}_1-\bar{x}_2\) and \(D_0\) divided by the standard error. Is this an independent sample or paired sample? The results of such a test may then inform decisions regarding resource allocation or the rewarding of directors. The first step is to state the null hypothesis and an alternative hypothesis. The formula for estimation is: Sample must be representative of the population in question. In order to widen this point estimate into a confidence interval, we first suppose that both samples are large, that is, that both \(n_1\geq 30\) and \(n_2\geq 30\). C. the difference between the two estimated population variances. We only need the multiplier. Test at the \(1\%\) level of significance whether the data provide sufficient evidence to conclude that Company \(1\) has a higher mean satisfaction rating than does Company \(2\). Construct a 95% confidence interval for \(\mu_1-\mu_2\). Since the mean \(\bar{x}_1\) of the sample drawn from Population \(1\) is a good estimator of \(\mu _1\) and the mean \(\bar{x}_2\) of the sample drawn from Population \(2\) is a good estimator of \(\mu _2\), a reasonable point estimate of the difference \(\mu _1-\mu _2\) is \(\bar{x}_1-\bar{x}_2\). In the context of estimating or testing hypotheses concerning two population means, "small" samples means that at least one sample is small. In a packing plant, a machine packs cartons with jars. 
For instance, they might want to know whether the average returns for two subsidiaries of a given company exhibit a significant difference. A significance value (P-value) and 95% Confidence Interval (CI) of the difference is reported. We are interested in the difference between the two population means for the two methods. We consider each case separately, beginning with independent samples. To learn how to perform a test of hypotheses concerning the difference between the means of two distinct populations using large, independent samples. Let \(\mu_1\) denote the mean for the new machine and \(\mu_2\) denote the mean for the old machine. (In most problems in this section, we provided the degrees of freedom for you.) The standardized test statistic is \[Z=\frac{(\bar{x_1}-\bar{x_2})-D_0}{\sqrt{\frac{s_{1}^{2}}{n_1}+\frac{s_{2}^{2}}{n_2}}} \nonumber \] Recall from the previous example, the sample mean difference is \(\bar{d}=0.0804\) and the sample standard deviation of the difference is \(s_d=0.0523\). The following data summarizes the sample statistics for hourly wages for men and women. If there is no difference between the means of the two measures, then the mean difference will be 0. As before, we should proceed with caution. The variable is normally distributed in both populations. A difference between the two samples depends on both the means and the standard deviations. 
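The paired-sample test statistic can be reproduced from the summary values recalled above (\(\bar{d}=0.0804\), \(s_d=0.0523\), \(n=10\)). The sketch below is illustrative only: the function name `paired_t_statistic` is hypothetical, and it assumes that the right-tailed rejection region \(t^*>1.8331\) quoted earlier in the text belongs to this same example, which is consistent with \(n-1=9\) degrees of freedom.

```python
import math


def paired_t_statistic(d_bar, s_d, n, d0=0.0):
    """t* = (d_bar - d0) / (s_d / sqrt(n)) for paired differences."""
    return (d_bar - d0) / (s_d / math.sqrt(n))


# Summary statistics quoted in the text.
t_star = paired_t_statistic(0.0804, 0.0523, 10)

# Critical value taken from the text (assumed to be for 9 degrees of freedom).
critical_value = 1.8331
reject_null = t_star > critical_value

print(round(t_star, 2), reject_null)  # 4.86 True
```

Because the statistic lies well inside the rejection region, the sketch reaches the same decision as the rejection region approach described in the text.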
The drinks should be given in random order. Assume that the population variances are equal. The children took a pretest and posttest in arithmetic. We are 99% confident that the difference between the two population mean times is between -2.012 and -0.167. That is, \(p\)-value=\(0.0000\) to four decimal places.
