STAT 104 - Chapter 17 and 20 - Significance Tests

  1. What are the two general parts of significance testing?
    First, we identify a preconceived idea that we have about the parameter.

    Then, we use the data in the sample to measure the strength of evidence for the preconceived idea.
  2. 1. What is another name for the research hypothesis?

    2. Suppose we suspect that the average wage of all qualified electricians has increased from what it was in 1980 when it was $20.35.

    How would we denote the research hypothesis and the null hypothesis?

    3. We collect a random sample of wages of electricians and calculate the average (x̄). We then compare x̄ with $20.35.

    a. What happens if x̄ is a lot bigger than $20.35?

    b. What happens if x̄ is less than $20.35?

    c. What happens if x̄ is only a little bigger than $20.35?

    4. In this case, what happens as x̄ rises further above $20.35?
    1. The alternative hypothesis. 

    • 2.  Ha: μ > 20.35 
    •      Ho: μ ≤ 20.35

    3a. We have evidence for our claim that μ > 20.35

    b. We can't claim that μ > 20.35.

    c. We can't claim that μ > 20.35 because it might be true that μ = 20.35 and we just happened to get a freak sample of electricians who are receiving above average wages.

    We have insufficient evidence.

    4. The higher that x̄ is above 20.35, the stronger the evidence for the idea that μ > 20.35
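To make answers 3 and 4 concrete, here is a minimal sketch of how the evidence grows as x̄ rises above 20.35. It assumes a 1-sample z test with a right-sided alternative. The population standard deviation (σ = 2), the sample size (n = 25) and the three sample means are hypothetical values chosen only for illustration; none of them come from the card.

```python
import math

MU_0 = 20.35   # hypothesized mean wage from the card
SIGMA = 2.0    # hypothetical population standard deviation (assumption)
N = 25         # hypothetical sample size (assumption)


def upper_tail_p(xbar, mu0=MU_0, sigma=SIGMA, n=N):
    """p-value for Ha: mu > mu0 using a 1-sample z statistic."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # P(Z >= z) for a standard normal Z
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical sample means: below, slightly above, and far above 20.35
for xbar in (20.00, 20.50, 22.00):
    print(f"x-bar = {xbar:.2f}  p-value = {upper_tail_p(xbar):.4f}")
```

Under these assumptions, a sample mean below 20.35 gives a large p-value, a slightly higher one gives a moderate p-value, and a much higher one gives a very small p-value.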
  3. What do we need to remember about the null hypothesis? (2)
    It is the opposite of the research hypothesis, and it always contains the equal sign.

    ex. Ho: μ = 220, Ho: μ ≥ 220 or Ho: μ ≤ 220
  4. 1. What does p-value stand for?

    2. What happens when the p-value gets smaller?

    3. Fill in the following:

    What is the p-value given the following strength of evidence for research hypothesis?

    a. Very strong
    b. Strong
    c. Moderate
    d. Weak
    e. None
    1. Probability value, or significance.

    2. The smaller the p-value, the stronger the evidence for the research hypothesis.

    3a. Very strong = p-value less than .01

    b. Strong = p-value between .01 and .05

    c. Moderate = p-value between .05 and .1

    d. Weak = p-value between .1 and .2

    e. None = p-value more than .2
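The scale above can be sketched as a small lookup function. The card does not say which label applies when a p-value falls exactly on a boundary (for example, exactly .05), so placing each boundary in the weaker category is an assumption made here.

```python
def evidence_strength(p_value):
    """Map a p-value to the strength-of-evidence labels from the card."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    # Boundary handling (e.g. exactly 0.05 counts as "moderate") is an assumption.
    if p_value < 0.01:
        return "very strong"
    if p_value < 0.05:
        return "strong"
    if p_value < 0.10:
        return "moderate"
    if p_value < 0.20:
        return "weak"
    return "none"


for p in (0.005, 0.023, 0.07, 0.15, 1.00):
    print(p, evidence_strength(p))
```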
  5. 1. What is the strength of evidence for research hypothesis for the following p-values?

    a. more than .2
    b. between .1 and .2
    c. between .05 and .1
    d. between .01 and .05
    e. less than .01

    2. What is Gillian's fun rhyme for low p-values?
    1a. None

    b. Weak

    c. Moderate

    d. Strong

    e. Very strong

    2. When p-value is low, Ho must go!
  6. 1. What are the four steps to the Four Step Process (generally), and how many sub-steps are there in each?

    2. Describe the Four Step Process (in detail)

    Use exercise 17.14, melting point of copper as a reference. 
    1. State (1), Plan (4.5), Solve (4), Conclude (1)

    • 2. THE FOUR STEP PROCESS
    • State: How strong is the evidence that the average of all possible measurements of the melting point of copper is not 1084.8°C?

    Plan: μ = average of all possible measurements of the melting point of this copper. 

    • Ho: μ = 1084.8
    • Ha: μ ≠ 1084.8

    • We'll use 1-sample z because σ is given.
    • σ = 0.25

    • Solve: 
    • State the conditions: 
    • 1. Representative sample
    • 2. No outliers
    • 3. Normal population
    • 4. If n ≥ 30, disregard #'s 2 and 3.

    • Checking the conditions
    • 1. The question tells us to assume SRS.
    • 2. The boxplot shows that there are no outliers
    • 3. The histogram suggests that the population is normal. 

    • Test statistic: z = 0
    • p-value = 1.00

    Conclude: Since the p-value is larger than 0.2, we have no evidence that the average of all possible measurements for the melting point of this copper is not 1084.8°C.

    ***It is ok to have a double negative here
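The step from the test statistic to the p-value in this example can be checked by hand. Because Ha is two-sided (μ ≠ 1084.8), the p-value is the chance of a z value at least as far from 0, in either direction, as the observed one. The sketch below uses only the standard normal distribution; the function name is illustrative and is not how Minitab reports it.

```python
import math


def two_sided_p(z):
    """P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))


print(two_sided_p(0.0))   # z = 0 from the card gives 1.0
```

A test statistic of exactly 0 means the sample average equals the hypothesized value, so every possible sample is at least as extreme and the p-value is 1.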
  7. 1. What should we do before beginning the Four Step Process?

    2. What are the four steps in the Four Step Process (generally), and how many sub-steps are in each?

    3. What is wrong with Ha: μ = 1084.8 and why?

    4. What is wrong with Ha: x̄ > 15 ?
    1. Identify the parameter by finding the sample size, individual, population and variable.

    • 2
    • (1) State - 1
    • (2) Plan - 4.5
    • (3) Solve - 4
    • (4) Conclude - 1

    3. We can never set up a research hypothesis that allows us to claim equality. 

    This is because we are looking at only a sample, not the population, so we can never know exactly what μ is. 

    ** The best we can say is that μ is close to 1084.8.

    4. Hypotheses are always statements about the mean of a population (μ), not the mean of a sample (x̄).
  8. Describe the Four Step Process (in detail)

    Use exercise 19.36 and 37, Very Low Birth Weight (VLBW) men as a reference. 
    THE FOUR STEP PROCESS

    State: How strong is the evidence that the average IQ at age 20 of all VLBW males is less than 100?

    Plan: μ = average IQ (at age 20) of all VLBW males. 

    • Ho: μ ≥ 100
    • Ha: μ < 100

    • We'll use 1-sample z because σ is given.
    • σ = 15

    • Solve
    • State the conditions
    • 1. Representative sample
    • 2. No outliers
    • 3. Normal population
    • 4. If n ≥ 30, disregard #'s 2 and 3.

    • Checking the conditions
    • 1. No information is given about the selection process for the 113 male infants. 

    In practice, we would contact the person who collected the data to find out more.

    Numbers 2 and 3 don't matter because n = 113, which is at least 30.

    • Test statistic: z = -8.79
    • p-value = 0.00

    Conclude: Since the p-value is less than 0.01, we have very strong evidence that the average IQ of all VLBW males at age 20 is less than 100.
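For reference, the 1-sample z statistic is the distance between x̄ and the hypothesized mean, measured in standard errors (σ/√n). The card gives σ = 15, n = 113 and the hypothesized mean 100, but not the sample mean, so the value 87.6 below is an assumption chosen because it reproduces the reported z; treat it as illustrative only.

```python
import math


def z_statistic(xbar, mu0, sigma, n):
    """1-sample z test statistic: (x-bar - mu0) / (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)
    return (xbar - mu0) / standard_error


XBAR_ASSUMED = 87.6  # assumption: the sample mean is not stated in the card

z = z_statistic(XBAR_ASSUMED, mu0=100, sigma=15, n=113)
print(round(z, 2))
```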
  9. What is the relationship between the p-value and the test statistic? (4 points)
    The p-value is the chance of observing a test statistic at least as much in favour of the research hypothesis as the observed value.

    The chance is calculated assuming that the null hypothesis is true.

    If the null hypothesis is true, how likely is it that we would observe a test statistic at least as extreme as the one we actually observed?

    If there's only a very small chance, then we have strong evidence for the research hypothesis.

    * A test statistic of z = -8.79 is off the left end of the z-table, so the p-value will be approximately 0.
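A short sketch of why z = -8.79 gives a p-value of approximately 0: for a left-sided alternative (Ha: μ < 100), the p-value is the area under the standard normal curve to the left of z. The helper name is illustrative.

```python
import math


def lower_tail_p(z):
    """P(Z <= z) for a standard normal Z, the p-value when Ha is 'less than'."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


p = lower_tail_p(-8.79)
print(f"{p:.3e}")   # tiny, but not exactly zero
print(f"{p:.2f}")   # what appears when rounded to two decimals
```

The exact value is positive but far smaller than anything a printed z-table lists, which is why software output rounded to two decimals shows 0.00.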
  10. 1. What is a level of significance?

    2. What does α stand for, and what are three common values?

    3. What do the three common values represent?
    1. A level of significance expresses the chance of wrongly claiming that the research hypothesis is true.

    2. α is an error rate. Commonly used values of α (alpha) are: 0.1, 0.05 and 0.01

    • 3. 
    • A risk taker might use α = 0.1

    This expresses the idea that the user is comfortable if they wrongly claim that the research hypothesis is true in 10% of tests.

    Therefore, they will wrongly claim that the research hypothesis is true in 1 out of 10 tests

    A middle-of-road kind of person might use α = 0.05

    This expresses the idea that the user is comfortable if they wrongly claim that the research hypothesis is true in 5% of tests.

    Therefore, they will wrongly claim that the research hypothesis is true in 1 out of 20 tests.

    A conservative person might use α = 0.01

    This expresses the idea that the user is comfortable if they wrongly claim that the research hypothesis is true in 1% of tests. 

    Therefore, they will wrongly claim that the research hypothesis is true in 1 out of 100 tests.
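The meaning of α as an error rate can be illustrated with a small simulation: draw many samples from a population where Ho really is true, test each one, and count how often the research hypothesis is wrongly claimed. This is only a sketch. The population values (mean 20.35, σ = 2), the sample size (n = 25), the number of trials and the random seed are all hypothetical choices, and a right-sided 1-sample z test is assumed.

```python
import math
import random


def upper_tail_p(xbar, mu0, sigma, n):
    """p-value for Ha: mu > mu0 using a 1-sample z statistic."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    return 0.5 * math.erfc(z / math.sqrt(2))


def false_claim_rate(alpha, trials=10_000, mu0=20.35, sigma=2.0, n=25, seed=1):
    """Fraction of tests that wrongly claim Ha when Ho (mu = mu0) is true."""
    rng = random.Random(seed)
    wrong_claims = 0
    for _ in range(trials):
        # Draw a sample from a population where Ho really is true.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        if upper_tail_p(xbar, mu0, sigma, n) <= alpha:
            wrong_claims += 1
    return wrong_claims / trials


for alpha in (0.10, 0.05, 0.01):
    print(alpha, false_claim_rate(alpha))
```

Each printed rate should land close to its α, which is the sense in which α = 0.05 corresponds to a wrong claim in roughly 1 out of 20 tests when Ho is true.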
  11. In a significance test:

    1. What does it mean when the p-value is less than or equal to α? (3 parts)

    2. What does it mean when the p-value exceeds α?
    1. Then the results are significant.

    (In traditional language: We reject Ho)

    We can claim that the research hypothesis is true using the given level of significance. 

    2. Then the results are not significant. 

    (In traditional language: We do not reject Ho)

    We cannot claim that the research hypothesis is true using the given level of significance.
  12. In exercise 17.40 p. 409 - This wine stinks, Minitab gives us a p-value of 0.023.

    Research hypothesis: How strong is the evidence that the average odor threshold
    of all untrained wine tasters exceeds 25?

    How do we answer the following questions (if asked on a test)?

    1a. Using α = 0.05, what is the decision, and why?

    1b. Write down the conclusion

    2a. Using α = 0.01, what is the decision and why?

    2b. Write down the conclusion

    3. What can we never do in a significance test, and what is an analogy to explain it?
    • *There are only two possible decisions!
    • Either you can claim your research hypothesis, or you can't.

    1a. We reject Ho because p = 0.023 is less than α = 0.05

    1b. Testing at a 5% level of significance, we can claim that the average odor threshold of all untrained wine tasters exceeds 25.

    2a. We do not reject Ho because the p-value of 0.023 exceeds α = 0.01

    2b. Testing at a 1% level of significance, we have insufficient evidence to claim that the average odor threshold of all untrained wine tasters exceeds 25.

    3. We can NEVER accept Ho.

    Accepting Ho would be like declaring someone innocent in a court case, rather than not guilty.

    A 'not guilty' verdict means there was insufficient evidence to prove guilt, not that innocence was proven. Likewise, not rejecting Ho means there was insufficient evidence for Ha, not that Ho was proven.

    Proving the defendant guilty beyond reasonable doubt is like having a small p-value.

    In a court case, everyone is required to assume that the defendant is innocent. In a significance test, we are required to assume that Ho is true.

    In a court case, the only possible decisions are 'guilty' and 'not guilty'. In a significance test, the only possible decisions are 'Reject Ho' and 'Do not reject Ho'; the latter leads to the conclusion that we cannot claim that Ha is true.
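The decision rule used in this card (compare the p-value with α) can be written as a two-outcome function. The function name and the wording of the returned strings are illustrative; the p-value 0.023 and the two α values are the ones from the card.

```python
def decide(p_value, alpha):
    """Return the decision for a significance test at level alpha."""
    if p_value <= alpha:
        return "reject Ho"        # results are significant
    return "do not reject Ho"     # never "accept Ho"


P_VALUE = 0.023  # Minitab p-value quoted in the card

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(P_VALUE, alpha)}")
```

The same p-value leads to different decisions at the two levels, which is why the conclusion should always state the level of significance used.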
  13. 1. What are the only 3 differences between using 1-sample z and 1-sample t in the Four Step Process?

    2. When we see the word 'evidence' in a question, what are we being asked to do?

    3. What can we say if there are outliers or if the data is skewed (not normal population)?

    4. How do we make Minitab test Ha: μ < 400?
    1a. In the PLAN stage using 1-sample t, we will instead state the following.

    We'll use 1-sample t because σ is not given. 

    1b. In the SOLVE step, when stating the conditions, #4 will state the following:

    #2 and #3 don't matter if n ≥ 40

    (rather than n ≥ 30 in 1-sample z)

    1c. Further in the SOLVE step, we'll state the test statistic as follows:

    Test statistic: t = -0.97

    (rather than z = -0.97)

    2. When we see the word 'evidence', we will want to do a significance test (FOUR STEP PROCESS).

    3. Outliers will make things a bit inaccurate as they will distort the average, but they don't invalidate the test.

    If the histogram suggests that the population is not normal (skewed), it would also make the test a bit inaccurate, but would not invalidate it. 

    4. Go to 1-sample t options, and pull down the box that says Alternative Hypothesis

    • Use 'Mean < hypothesized mean'
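As a sketch of the 1-sample t differences listed in answer 1: when σ is not given, the sample standard deviation s takes its place in the test statistic. The five measurements below are hypothetical, made up only to show the arithmetic for a test against a hypothesized mean of 400. Turning t into a p-value needs the t distribution, which is the part Minitab handles and which is not reproduced here.

```python
import math
import statistics


def t_statistic(sample, mu0):
    """1-sample t statistic: (x-bar - mu0) / (s / sqrt(n))."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)   # sample standard deviation (n - 1 divisor)
    return (xbar - mu0) / (s / math.sqrt(n))


measurements = [398, 402, 395, 401, 399]  # hypothetical data (assumption)
t = t_statistic(measurements, mu0=400)
print(round(t, 2))
```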
Author
MissionMindhack
ID
347212
Card Set
STAT 104 - Chapter 17 and 20 - Significance Tests
Description
Prep for Midterm
Updated