A town official claims that the average vehicle in their area sells for more than the 40th percentile of your data set. Using the data you obtained in Week 1, as well as the summary statistics you found for the original data set (excluding the supercar outlier), run a hypothesis test to determine whether the claim can be supported. Make sure you state all the important values so your fellow classmates can use them to run a hypothesis test as well. Use the descriptive statistics you found during Week 2, NOT the new SD you found during Week 4, because we are using the original data set of 10 values, NOT a new smaller sample. Use alpha = .05 to test the claim.
(Note: You will want to use the function =PERCENTILE.INC in Excel to find the 40th percentile of your data set. Hopefully this Excel function looks familiar to you from Week 2.)
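If you want to double-check the Excel result outside of Excel, the following is a minimal Python sketch of the inclusive linear interpolation that =PERCENTILE.INC performs. The function name percentile_inc and the list of prices are hypothetical placeholders for illustration only; substitute the 10 prices from your own data set.

```python
def percentile_inc(values, k):
    """Return the k-th percentile (0 <= k <= 1) using inclusive linear
    interpolation, the same rule Excel's =PERCENTILE.INC applies."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= k <= 1:
        raise ValueError("k must be between 0 and 1")
    ordered = sorted(values)
    position = k * (len(ordered) - 1)  # fractional index into the sorted data
    lower = int(position)              # index at or just below the position
    fraction = position - lower        # distance toward the next value
    if lower + 1 >= len(ordered):
        return float(ordered[-1])
    return ordered[lower] + fraction * (ordered[lower + 1] - ordered[lower])


# Hypothetical prices (in dollars), used only to illustrate the call.
example_prices = [18500, 21000, 23750, 25000, 27300,
                  29900, 31200, 34000, 36500, 41000]
p40 = percentile_inc(example_prices, 0.40)
print(f"40th percentile: {p40:.2f}")
```

The value this returns plays the role of the hypothesized value in the test of the town official's claim.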
First determine if you are using a z or t-test and explain why. Then conduct a four-step hypothesis test including a sentence at the end justifying the support or lack of support for the claim and why you made that choice.
I encourage you to review the Week 6 Hypothesis Testing PDF at the bottom of the discussion. It gives you a step-by-step example of how to calculate and run a hypothesis test using Excel. I DO NOT recommend doing this by hand. Let Excel do the heavy lifting for you. You can also use this PDF in the Quizzes section.
There are 5 additional PDFs that were created to help you with the Homework, Lessons and Tests in the Quizzes section. While they won't be used to answer the questions in the discussion, they are just as useful and beneficial. I encourage you to review these ASAP! These PDFs are also located at the bottom of the discussion.
In this document we will discuss 2-sample Z hypothesis testing and confidence
intervals that use sample means and known population S's.
This PDF uses the Z-Critical Value because we are working with sample means and
known population S's.
There are still 3 different hypothesis scenarios with a 2 – Sample Z Hypothesis
Test.
Lower Tailed Test (1 tail):
Ho: $\mu_1 - \mu_2 = 0$
Ha: $\mu_1 - \mu_2 < 0$
Upper Tailed Test (1 tail):
Ho: $\mu_1 - \mu_2 = 0$
Ha: $\mu_1 - \mu_2 > 0$
Two Tailed Test:
Ho: $\mu_1 - \mu_2 = 0$
Ha: $\mu_1 - \mu_2 \neq 0$
The hypothesized value is 0, and the same key words from a 1-sample
hypothesis test determine which scenario to use. $\mu_1 - \mu_2$ is the difference
between the average of the first population and the average of the second
population, which we estimate with the sample averages $\bar{x}_1$ and $\bar{x}_2$.
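The Z-Test Statistic is:
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$
Where $S_1$ and $S_2$ are the population standard deviations, $\bar{x}_1$ and $\bar{x}_2$ are
the sample averages, and $n_1$ and $n_2$ are the sample sizes.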
We can use =NORM.S.DIST to find the p-values. These should look familiar from
the discussion forum.
Example:
A dietitian has developed a diet that is low in fats, carbs, and cholesterol. The
dietitian wishes to examine the effects this diet has on the weights of obese
people. Two random samples of 30 obese people each are selected, and one group
of 30 people is placed on the low-fat diet. The other 30 people are placed on a
diet that contains approximately the same quantity of food but is not low in fats,
carbs, and cholesterol. For each person, the amount of weight lost (or gained) in a
three-week period is recorded. Is there a difference in the population mean
weight losses for the two diets? The population S1 = 4.67 and the population S2 =
4.04. Use alpha = .05. Here we are given the raw data set.
WL Low Diet    WL Regular Diet
8              6
21             14
13             4
8              6
11             13
4              11
3              11
6              8
16             14
5              8
10             6
8              4
8              12
12             2
7              1
3              2
12             6
14             1
16             0
11             9
10             5
9              10
10             6
8              6
14             9
3              8
7              3
14             1
11             7
14             8
The first step is to state the hypothesis scenario. Because the key word in the
question is "difference," this is a two tailed test.
Ho: $\mu_1 - \mu_2 = 0$
Ha: $\mu_1 - \mu_2 \neq 0$
Before we start calculating anything by hand and because we are given the raw
data set, we can actually run this hypothesis test in Excel. And since you installed
the Data Analysis Toolpak it is easy to do. First you need to input this Raw Data
into Excel.
Then go to Data -> Data Analysis -> and scroll to where it says z-Test Two Sample
for Means and click OK
Under Input:
Variable 1 Range: highlight the WL Low Diet column, and make sure you include
the top row where the label is located.
Variable 2 Range: highlight the WL Regular Diet column, and make sure you
include the top row where the label is located.
Hypothesized Mean Difference: this can be left as 0.
Variable 1 Variance (known): here is where you put the known variance for the
first sample. In the problem we are given the known standard deviation, so to
find the variance we square it: 4.67^2 = 21.8089. Input that value in the box.
Variable 2 Variance (known): here is where you put the known variance for the
second sample. Again we square the known standard deviation: 4.04^2 = 16.3216.
Input that value in the box.
Check the “Labels” box because we included the first row of labels. For Alpha,
put 0.05, but this can be changed depending on what significance level you use.
Then make sure the bubble for New Worksheet Ply is selected and click OK. It
should look similar to the screenshot below.
Once you click OK in a new Worksheet this should populate.
z-Test: Two Sample for Means
                               WL Low Diet    WL Regular Diet
Mean                           9.866666667    6.7
Known Variance                 21.8089        16.3216
Observations                   30             30
Hypothesized Mean Difference   0
z                              2.808838232
P(Z<=z) one-tail               0.002486031
z Critical one-tail            1.644853627
P(Z<=z) two-tail               0.004972062
z Critical two-tail            1.959963985
Here we have all the values we need to state a conclusion.
We see the Z-Test Statistic = 2.8088, and because we ran a two tailed test, the
p-value = .00497.
p-value = .00497 < .05. This p-value is less than alpha, which means we Reject Ho.
Yes, there is statistical evidence that there is a difference in the population mean
weight losses for the two diets.
If we were running a 1-tailed (upper tailed) test instead, the output also gives us
that p-value, which is .002486. The Z-Test Statistic is the same, and so is the
conclusion.
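As a cross-check on the Excel output, here is a minimal Python sketch that runs the same z-test from the raw data. It assumes the values in the data table alternate Low Diet, Regular Diet when read left to right, and it uses the standard library's NormalDist in place of Excel's normal distribution functions. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist, mean

# Weight loss values transcribed from the raw data table, assuming the
# values alternate Low Diet, Regular Diet when read left to right.
low_diet = [8, 21, 13, 8, 11, 4, 3, 6, 16, 5, 10, 8, 8, 12, 7,
            3, 12, 14, 16, 11, 10, 9, 10, 8, 14, 3, 7, 14, 11, 14]
regular_diet = [6, 14, 4, 6, 13, 11, 11, 8, 14, 8, 6, 4, 12, 2, 1,
                2, 6, 1, 0, 9, 5, 10, 6, 6, 9, 8, 3, 1, 7, 8]

# Known population variances: the given standard deviations, squared.
known_var_1 = 4.67 ** 2
known_var_2 = 4.04 ** 2

mean_1 = mean(low_diet)
mean_2 = mean(regular_diet)
standard_error = sqrt(known_var_1 / len(low_diet)
                      + known_var_2 / len(regular_diet))
z = (mean_1 - mean_2 - 0) / standard_error  # hypothesized difference is 0

# One-tail p-value is the area beyond |z|; two-tail doubles it.
p_one_tail = 1 - NormalDist().cdf(abs(z))
p_two_tail = 2 * p_one_tail

print(f"Mean 1 = {mean_1:.6f}, Mean 2 = {mean_2:.6f}")
print(f"z = {z:.6f}")
print(f"one-tail p = {p_one_tail:.6f}, two-tail p = {p_two_tail:.6f}")
```

The printed means, z value, and p-values can be compared line by line against the z-Test: Two Sample for Means output.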
Using Excel to run a hypothesis test when we are given the Raw Data is very
convenient. But if we aren’t given the Raw Data and we are given the averages
and known S’s we will need to compute the Z-Test Stat by hand and then use the
Excel function to find the p-value.
To find the Z-Test Stat we will use this equation and plug in what we know. You
should know by now how to calculate the average and SD using Excel, which is
what I did here.
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$$
$$Z = \frac{9.86667 - 6.7 - 0}{\sqrt{\frac{4.67^2}{30} + \frac{4.04^2}{30}}}$$
$$Z = \frac{3.16667}{\sqrt{.72696333 + .54405333}}$$
$$Z = \frac{3.16667}{\sqrt{1.27101666}}$$
$$Z = \frac{3.16667}{1.1273937}$$
$$Z = 2.80884$$
When we calculate the Test Stat by hand, we get the same value as the Excel output.
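The hand calculation above can also be sketched as a small Python function for the case where only the averages, population S's, and sample sizes are available. The function name two_sample_z and its parameter names are hypothetical; this is one way to arrange the arithmetic, not the only one.

```python
from math import sqrt


def two_sample_z(mean_1, mean_2, sd_1, sd_2, n_1, n_2, hypothesized_diff=0.0):
    """Z-Test Statistic for two sample means with known population SDs."""
    # Standard error: square root of the sum of each variance over its n.
    standard_error = sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2 - hypothesized_diff) / standard_error


# Values from the diet example.
z = two_sample_z(9.86667, 6.7, 4.67, 4.04, 30, 30)
print(f"Z-Test Statistic = {z:.5f}")
```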
Next, we need to find the p-value. We will use the =NORM.S.DIST function.
In Excel, input =NORM.S.DIST(2.80884,TRUE) and hit Enter. We type TRUE
because we want the cumulative probability.
The function returns .997514, BUT remember that this function is in the less than
form: it gives the area to the left of the Z-Test Stat. If we were running a Lower
Tailed Test, this would be our p-value. If we were running an Upper Tailed Test,
we need to take 1 – .997514 to get the p-value for our test.
p-value = 1 – .997514 = .002486. This is the p-value for an upper tailed test.
But since we are running a Two Tailed Test, we take whichever p-value is smaller
and multiply it by 2.
p-value = .002486*2 = .004972. This is the p-value for a two tailed test. If we
compare these values to the Excel output, they are the same, and we draw the
same conclusion.
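The three p-value rules just described (lower tailed, upper tailed, two tailed) can be sketched in Python as follows. NormalDist().cdf from the standard library stands in for =NORM.S.DIST(z,TRUE); the function name p_value and the tail labels are hypothetical choices for this illustration.

```python
from statistics import NormalDist


def p_value(z, tail):
    """Return the p-value for a Z-Test Statistic.

    tail is "lower", "upper", or "two".
    """
    lower = NormalDist().cdf(z)  # area to the left of z (the less than form)
    upper = 1 - lower            # area to the right of z
    if tail == "lower":
        return lower
    if tail == "upper":
        return upper
    if tail == "two":
        return 2 * min(lower, upper)  # double whichever tail is smaller
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


z = 2.80884
for tail in ("lower", "upper", "two"):
    print(f"{tail} tailed p-value = {p_value(z, tail):.6f}")
```

Doubling the smaller tail is what keeps the two tailed p-value from ever exceeding 1.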
This is how you would run a 2-sample Z hypothesis test using averages and
population S's when we don't have the raw data and can't use the Data Analysis
tool in Excel.
Now that we ran a hypothesis test, let's calculate a confidence interval and draw
the same conclusion.
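The equation for a 2-sample Z confidence interval is:
$$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \cdot \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$
Where
$$\text{Standard Error (SE)} = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$
$$\text{Margin of Error (ME)} = Z_{\alpha/2} \cdot \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$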
We have all the values we need, so let's plug them into our equation.
$$9.86667 - 6.7 \pm Z_{\alpha/2} \cdot \sqrt{\frac{4.67^2}{30} + \frac{4.04^2}{30}}$$
The last thing we need to find is the Z-Critical Value. We will use =NORM.S.INV
in Excel to find it.
If we want to find a 95% confidence interval, then alpha = 1 – .95 = .05. But
because this is a confidence interval and we need to take into account the plus
AND minus on both sides of the bell-shaped curve, we divide alpha by 2: .05/2
= .025. Then we take 1 – .025 = .975. We will use this value in our Excel function.
=NORM.S.INV(.975)
We see the Z-Critical Value is 1.96. We will plug this into the equation and
solve. If you compare this Critical Value to the Excel output we got when we
ran the hypothesis test, it is the same as the z Critical two-tail value because we
used Alpha = .05 in the output. This value will change depending on what you
input for Alpha.
z-Test: Two Sample for Means
                               WL Low Diet    WL Regular Diet
Mean                           9.866666667    6.7
Known Variance                 21.8089        16.3216
Observations                   30             30
Hypothesized Mean Difference   0
z                              2.808838232
P(Z<=z) one-tail               0.002486031
z Critical one-tail            1.644853627
P(Z<=z) two-tail               0.004972062
z Critical two-tail            1.959963985
$$9.86667 - 6.7 \pm 1.96 \cdot \sqrt{\frac{4.67^2}{30} + \frac{4.04^2}{30}}$$
$$3.16667 \pm 1.96 \cdot 1.1273937$$
$$3.16667 \pm 2.20969$$
The confidence interval goes from .95698 to 5.37636. This interval goes from a
positive value to a positive value. This means that 0 is NOT in this interval.
Because 0 is NOT in the interval, yes, it is Significant, and we Reject Ho. This is the
same conclusion that we got with the hypothesis test.
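The confidence interval steps above can be sketched in Python. NormalDist().inv_cdf stands in for =NORM.S.INV, so the critical value carries more decimal places than the rounded 1.96 and the endpoints may differ slightly in the later decimals. The function name two_sample_z_interval is a hypothetical label for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_interval(mean_1, mean_2, sd_1, sd_2, n_1, n_2,
                          confidence=0.95):
    """Confidence interval for the difference of two means, known SDs."""
    alpha = 1 - confidence
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. .975 for 95%
    standard_error = sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    margin_of_error = z_critical * standard_error
    difference = mean_1 - mean_2
    return difference - margin_of_error, difference + margin_of_error


lower, upper = two_sample_z_interval(9.86667, 6.7, 4.67, 4.04, 30, 30)
zero_in_interval = lower <= 0 <= upper
print(f"95% CI: ({lower:.5f}, {upper:.5f})")
print("Significant (Reject Ho)" if not zero_in_interval
      else "Not Significant (Do Not Reject Ho)")
```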
This week we will continue to discuss hypothesis testing and confidence intervals,
but now we will discuss 2 samples.
Just like with 1-sample hypothesis testing, there are 4 steps we will follow. To
review those 4 steps, please see the Week 6 Hypothesis Testing PDF.
But the conclusion will still be the same.
If the p-value is < alpha, you Reject Ho and state this test is significant.
If the p-value is > alpha, you Do Not Reject Ho and state this test is not significant.
In this document we will discuss 2 – sample proportion hypothesis testing and
confidence intervals.
There are still 3 different hypothesis scenarios with a 2 – sample proportion
hypothesis test.
Lower Tail Test (1 tail):
Ho: 𝑝1 − 𝑝2 = 0
Ha: 𝑝1 − 𝑝2 < 0
Upper Tailed Test (1 tail):
Ho: 𝑝1 − 𝑝2 = 0
Ha: 𝑝1 − 𝑝2 > 0
Two Tailed Test:
Ho: 𝑝1 − 𝑝2 = 0
Ha: 𝑝1 − 𝑝2 ≠ 0
The hypothesized value is 0 and the same key words apply from a 1 – sample
hypothesis test to determine which scenario to use.
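The Z-Test Statistic is:
$$Z = \frac{p_1 - p_2}{\sqrt{p \cdot q \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Where $p$ is the pooled proportion, with $x_1$ and $x_2$ the number of successes in
each sample:
$$p = \frac{x_1 + x_2}{n_1 + n_2}$$
$$q = 1 - p$$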
We will then use =NORM.S.DIST function in Excel to find the p-value. This Excel
function should look familiar from Week 4.
Example:
In a developing section of a district, 50 people were surveyed and 38 were in favor
of the new proposal. For the rest of the district, 100 people were surveyed and
only 65 people were in favor of the new proposal. Is there evidence that the
proportion of people favoring the new proposal is greater in the developing section
than in the rest of the district? Use alpha = .05.
First step is to state the hypothesis scenario. Because the key word says greater
this means it is an upper tailed test.
Ho: 𝑝1 − 𝑝2 = 0
Ha: 𝑝1 − 𝑝2 > 0
The first proportion is favoring the new proposal in the developing section and the
second proportion is favoring the new proposal in the rest of the district.
$$p_1 = \frac{38}{50} = .76$$
$$q_1 = 1 - .76 = .24$$
$$p_2 = \frac{65}{100} = .65$$
$$q_2 = 1 - .65 = .35$$
$$p = \frac{38 + 65}{50 + 100} = .68667$$
$$q = 1 - .68667 = .31333$$
Now that we have these values we can plug them in to find the Test Statistic.
$$Z = \frac{.76 - .65}{\sqrt{.68667 \cdot .31333 \left(\frac{1}{50} + \frac{1}{100}\right)}} = 1.369$$
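The calculation above can be sketched in Python directly from the survey counts. The function name two_proportion_z and its parameter names are hypothetical; the arithmetic follows the pooled proportion formula given earlier in this document.

```python
from math import sqrt


def two_proportion_z(x_1, n_1, x_2, n_2):
    """Z-Test Statistic for two proportions using the pooled proportion."""
    p_1 = x_1 / n_1
    p_2 = x_2 / n_2
    pooled_p = (x_1 + x_2) / (n_1 + n_2)  # p in the formula
    pooled_q = 1 - pooled_p               # q in the formula
    standard_error = sqrt(pooled_p * pooled_q * (1 / n_1 + 1 / n_2))
    return (p_1 - p_2) / standard_error


# 38 of 50 in the developing section, 65 of 100 in the rest of the district.
z = two_proportion_z(38, 50, 65, 100)
print(f"Z-Test Statistic = {z:.3f}")
```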
Now that we have the Z-Test Statistic, we can use the =NORM.S.DIST function to
find the p-value.
And yes, we can have a negative Z-Test Statistic; if we do, that is fine. You DO
NOT have to take the absolute value of anything. Use the Test Stat as is in the
Excel function.
In Excel, input =NORM.S.DIST(1.369,TRUE)
We write out TRUE because we want the cumulative probability.
The function returns .9145, BUT remember that this function is in the less than
form: it gives the area to the left of the Z-Test Stat. If we were running a Lower
Tailed Test, this would be our p-value. BUT since we are running an Upper Tailed
Test, we need to take 1 – .9145 to get the p-value for our test.
p-value = 1 – .9145 = .0855.
We see the p-value for our upper tailed test is .0855. If we compare this to .05,
we see that:
.0855 > .05. Since the p-value is greater than alpha, we Do Not Reject Ho. This
test is not significant, and no, there is not enough evidence that the proportion of
people favoring the new proposal is greater in the developing section than in the
rest of the district at alpha = .05.
What if we were running a two tailed test? To find this p-value we would take
whichever p-value is smaller and multiply it by 2.
.0855*2 = .171. The p-value for a two tailed test would be .171.
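The p-value and decision steps for this example can be sketched in Python. NormalDist().cdf stands in for =NORM.S.DIST(z,TRUE), and the variable names are illustrative only.

```python
from statistics import NormalDist

z = 1.369     # Z-Test Statistic from the proportion example
alpha = 0.05  # significance level

lower_tail_p = NormalDist().cdf(z)  # the less than form
upper_tail_p = 1 - lower_tail_p     # what an upper tailed test needs
two_tail_p = 2 * min(lower_tail_p, upper_tail_p)

# Decision rule: reject Ho only when the p-value is below alpha.
decision = "Reject Ho" if upper_tail_p < alpha else "Do Not Reject Ho"

print(f"upper tailed p-value = {upper_tail_p:.4f} -> {decision}")
print(f"two tailed p-value   = {two_tail_p:.4f}")
```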
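Now that we ran a hypothesis test, let's calculate a confidence interval and draw
the same conclusion.
The equation for a 2-sample proportion confidence interval is:
$$(p_1 - p_2) \pm Z_{\alpha/2} \cdot \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$
Where
$$\text{Standard Error (SE)} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$
$$\text{Margin of Error (ME)} = Z_{\alpha/2} \cdot \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$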
Plugging in what we know:
$$.76 - .65 \pm Z_{\alpha/2} \cdot \sqrt{\frac{.76 \cdot .24}{50} + \frac{.65 \cdot .35}{100}}$$
The last thing we need to find is the Z-Critical Value. We will use the
=NORM.S.INV function to find this. This function should look familiar from Week
4.
If we want to find a 95% confidence interval, then alpha = 1 – .95 = .05. But
because this is a confidence interval and we need to take into account the plus
AND minus on both sides of the bell-shaped curve, we divide alpha by 2: .05/2
= .025. Then we take 1 – .025 = .975. We will use this value in our Excel function.
=NORM.S.INV(.975)
We see the Z-Critical Value is 1.96. We will plug this into the equation and
solve.
$$.76 - .65 \pm 1.96 \cdot \sqrt{\frac{.76 \cdot .24}{50} + \frac{.65 \cdot .35}{100}}$$
$$.11 \pm 1.96(.076961)$$
$$.11 \pm .1508$$
$$(-.0408, .2608)$$
The confidence interval goes from -4.08% to 26.08%. This interval goes from a
negative value to a positive value. This means that 0 is in fact in this interval.
Because 0 is in the interval, it is Not Significant, and we Do Not Reject Ho. This is
the same conclusion that we got with the hypothesis test.
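The interval above can be sketched in Python from the survey counts. Note that the confidence interval uses each sample's own p and q in the standard error rather than the pooled proportion. NormalDist().inv_cdf stands in for =NORM.S.INV, and the function name two_proportion_interval is a hypothetical label.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_interval(x_1, n_1, x_2, n_2, confidence=0.95):
    """Confidence interval for the difference of two proportions."""
    p_1, p_2 = x_1 / n_1, x_2 / n_2
    q_1, q_2 = 1 - p_1, 1 - p_2
    alpha = 1 - confidence
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Unpooled standard error: each sample contributes its own p*q/n.
    standard_error = sqrt(p_1 * q_1 / n_1 + p_2 * q_2 / n_2)
    margin_of_error = z_critical * standard_error
    difference = p_1 - p_2
    return difference - margin_of_error, difference + margin_of_error


lower, upper = two_proportion_interval(38, 50, 65, 100)
zero_in_interval = lower <= 0 <= upper
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print("Not Significant (Do Not Reject Ho)" if zero_in_interval
      else "Significant (Reject Ho)")
```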
Hypothesis Testing is a decision-making process called a Test of Significance.
There are 4 unique parts to Hypothesis Testing.
1) The Hypothesis Scenario. This includes the Null and Alternative scenarios.
a. Ho: Null Hypothesis
Ha or H1: Alternative Hypothesis
2) Z-Test Statistic
$$Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 \cdot q_0}{n}}}$$
Where $\hat{p}$ is the sample proportion, $p_0$ is the hypothesized value, $q_0 = 1 - p_0$,
and $n$ is the sample size.
3) P-value. The p-value tells you whether the result is significant and whether
you can Reject Ho. You will use the p-value to draw a
conclusion regarding the hypothesis test.
a. We will use the =NORM.S.DIST function to find the p-value. It should
look familiar from Week 4.
4) Conclusion:
a. If the p-value is less than alpha (< α), then Reject Ho/Accept Ha.
b. If the p-value is greater than alpha (> α), then Do Not Reject Ho.
c. The most common alpha value is .05. If no alpha value is given, it will
default to .05, but do note that alpha can also be .10, .01, or .005, to
name a few. Essentially alpha can be any value the statistician
deems fit, but the most common values are .05, .01, and .10.
One last thing before we get to an example. There are 3 different scenarios that
are associated with the Hypothesis Scenario.
1) There is a Lower tailed (one tailed) Test or a Left Tailed Test. If the problem
asks whether there is a significant decrease, or uses less than, lower than, or
fewer than, then the problem is a lower tailed test. The “<” sign corresponds
with the Ha. The hypothesis scenario will look like:
a. Ho: $\hat{p} = p_0$
Ha: $\hat{p} < p_0$
(Here we see that $p_0$ is the hypothesized value and the Less
Than Sign “<” lines up with the Ha)
2) There is an Upper tailed (one tailed) Test or a Right Tailed Test. If the
problem asks whether there is a significant increase, or uses more than, greater
than, or higher than, then the problem is an upper tailed test. The “>” sign
corresponds with the Ha. The hypothesis scenario will look like:
a. Ho: $\hat{p} = p_0$
Ha: $\hat{p} > p_0$
(Here we see that $p_0$ is the hypothesized value and the
Greater Than Sign “>” lines up with the Ha)
3) There is a Two tailed Test. If the problem asks whether there is a significant
difference or statistical evidence, or asks if it is not the same, then the
problem is a two-tailed test. The “≠” sign corresponds with the Ha. The
hypothesis scenario will look like:
a. Ho: $\hat{p} = p_0$
Ha: $\hat{p} \neq p_0$
(Here we see that $p_0$ is the hypothesized value and the
Not Equal Sign “≠” lines up with the Ha)
The hypothesized value is what we think should happen or what has been found
to be true in the past.
Now let’s continue to look at our car price data from Week 3. In Week 3, I asked
you to calculate the average and then find the proportion of data points that fell
below the average. We called this value p, and then we found q. If we look back at
my data set, we see that p = .70 and q = .30.
We will call these $\hat{p} = .70$ and $\hat{q} = .30$.
We want to run a test to see how close our data set is to a 50/50 spread. In a
perfect world, 50% of the data would fall above the mean and 50% of the data
would fall below the mean.
In other words, is there a difference between your data set and 50%? We will
run a hypothesis test at alpha = .05 (95% confidence) to test this claim.
(Note: YES! I realize that some of you did see in your Week 3 forum that you did
get p = .50 and q = .50. If this is the case, your Test Statistic will be 0 and the p-
value will come out to be 1. That is fine, BUT it is still a good idea to go through
this example and make sure you can run a hypothesis test to get the correct
results. Extra practice never hurt anyone.)
Getting back to our test, this tells us that the hypothesized value is .50. The
hypothesis scenario will look like this:
1) Ho: $\hat{p} = .50$
Ha: $\hat{p} \neq .50$
2) $$Z\text{-Stat} = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 \cdot q_0}{n}}} = \frac{.70 - .50}{\sqrt{\frac{.50 \cdot .50}{10}}} = 1.264911$$
Note: If your Z-Stat is negative, that is fine. That does not mean the
problem is incorrect. And if your $\hat{p}$ = .50, your Z-Stat would be 0 here,
and that is fine also.
3) To find the p-value we will use the =NORM.S.DIST function. In Excel,
type in =NORM.S.DIST(1.264911,TRUE) and hit Enter. We type in TRUE
because we want the cumulative probability.
The function returns .897048. But remember this is in the Less Than form. If
we were running a Lower Tailed Test, this would be our p-value. To find the
p-value for an Upper Tailed Test we would take p-value = 1 – .897048 = .102952.
Since we are running a Two Tailed Test, to get the correct p-value we
multiply whichever p-value is smaller by 2. Which one is smaller depends on the
test, so you need to make sure you use the smaller one. Remember, p-values
CANNOT be greater than 1. If you get a p-value greater than 1, you did
something wrong.
p-value = .102952*2 = .205904. This is the p-value we will use for our conclusion.
If your Z-Stat is 0, then your p-value in this test will be 1. That is fine. Your p-value
can be 1, but it CANNOT be greater than 1.
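Steps 2 and 3 of this example can be sketched in Python as a check on the Excel work. NormalDist().cdf stands in for =NORM.S.DIST(z,TRUE); the function name one_proportion_z_test is a hypothetical label, and the standard error uses the hypothesized value (.50), matching the numbers plugged in above.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(p_hat, p_0, n):
    """Return (Z-Stat, two tailed p-value) for a 1-sample proportion test."""
    q_0 = 1 - p_0
    z = (p_hat - p_0) / sqrt(p_0 * q_0 / n)
    lower = NormalDist().cdf(z)  # Less Than form
    upper = 1 - lower
    two_tail = 2 * min(lower, upper)  # never exceeds 1
    return z, two_tail


z, p = one_proportion_z_test(0.70, 0.50, 10)
print(f"Z-Stat = {z:.6f}, two tailed p-value = {p:.6f}")

# The 50/50 case mentioned in the note: Z-Stat of 0 and p-value of 1.
z_even, p_even = one_proportion_z_test(0.50, 0.50, 10)
print(f"Z-Stat = {z_even:.6f}, two tailed p-value = {p_even:.6f}")
```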
4) Lastly, we need t