Chapter 9 More About Correlation

The flashcards below were created by user damea134 on FreezingBlue Flashcards.

  1. The correlation coefficient

    Pure Number
    The correlation coefficient r is a pure number, without units. It is not affected by

    • interchanging the two variables
    • adding the same number to all the values of just one variable (just x or just y)
    • multiplying all the values of one variable by the same positive number

    Interchanging the two variables does not change r, because the products used to compute it come out the same in either order:

    (x × y) = (y × x). (Note: all the x values must be swapped with all the y values. Swapping x and y for just one point changes the data, and so will generally change r.)
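    The three properties above can be checked numerically. This is a minimal sketch in Python on made-up data; `corr` is a helper written for this illustration (not a library function) that computes r as the average product of the standard units.

```python
from math import sqrt

def corr(xs, ys):
    """Correlation coefficient r: the average product of standard units."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sdx = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sdy = sqrt(sum((y - my) ** 2 for y in ys) / n)
    return sum((x - mx) / sdx * (y - my) / sdy for x, y in zip(xs, ys)) / n

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 6]

r = corr(x, y)
r_swapped = corr(y, x)                     # interchange the two variables
r_shifted = corr([v + 10 for v in x], y)   # add 10 to every x value
r_scaled = corr(x, [3 * v for v in y])     # multiply every y value by 3
r_flipped = corr(x, [-3 * v for v in y])   # a negative multiplier flips the sign

print(r, r_swapped, r_shifted, r_scaled, r_flipped)
```

    The first four values agree up to rounding. The last has the same size but the opposite sign, which is why the multiplier has to be positive.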
  2. Changing the SD
    Two or more scatter diagrams can have the same correlation coefficient (for example, r = 0.70) and still look different. Changing the SDs affects how a diagram appears: with smaller SDs, the points are more tightly clustered around the SD line in absolute terms.

    Note: r measures clustering not in absolute terms but in relative terms, relative to the SDs.
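    A small Python sketch of this point, using made-up data. Halving every y value halves the vertical SD and halves the absolute spread around the SD line, but r stays the same. The helper names (`corr`, `rms_distance_to_sd_line`) are written for this illustration only.

```python
from math import sqrt

def mean(vs):
    return sum(vs) / len(vs)

def sd(vs):
    m = mean(vs)
    return sqrt(sum((v - m) ** 2 for v in vs) / len(vs))

def corr(xs, ys):
    mx, my, sx, sy = mean(xs), mean(ys), sd(xs), sd(ys)
    return mean([(x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)])

def rms_distance_to_sd_line(xs, ys):
    """r.m.s. vertical distance from the points to the SD line."""
    mx, my = mean(xs), mean(ys)
    slope = sd(ys) / sd(xs)          # the SD line rises one SD of y per SD of x
    if corr(xs, ys) < 0:
        slope = -slope               # it slopes down when r is negative
    return sqrt(mean([(y - (my + slope * (x - mx))) ** 2 for x, y in zip(xs, ys)]))

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 6]
y_half = [v / 2 for v in y]          # same pattern, half the vertical SD

r_full, r_half = corr(x, y), corr(x, y_half)
spread_full = rms_distance_to_sd_line(x, y)
spread_half = rms_distance_to_sd_line(x, y_half)

print(r_full, r_half)                # equal: r is relative to the SDs
print(spread_full, spread_half)      # the absolute spread is halved
```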
  3. Combining separate diagrams with the same variables.
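    If you combine data from separate diagrams that measure the same two variables and have similar r values, the combined diagram can have a higher r than either diagram alone. This happens when the groups have different averages, so that the combined cloud of points is stretched out along a line. (Whatever the result, r always stays between -1 and 1.)

    This is because the combined data are more spread out, so the SDs are bigger, while the scatter around the line stays about the same. Relative to the bigger SDs, the points are more tightly clustered.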
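    A minimal Python sketch of how this can happen, on made-up data. Two hypothetical groups each have r = 0.6, but the second group is higher on both variables, so the pooled points stretch out along a line. Other configurations can lower r instead, so this shows one case, not a general rule.

```python
from math import sqrt

def corr(xs, ys):
    """Correlation coefficient r: the average product of standard units."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sdx = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sdy = sqrt(sum((y - my) ** 2 for y in ys) / n)
    return sum((x - mx) / sdx * (y - my) / sdy for x, y in zip(xs, ys)) / n

# Hypothetical group A.
xa, ya = [1, 2, 3, 4], [2, 1, 4, 3]
# Hypothetical group B: the same pattern, shifted up by 10 on both variables.
xb, yb = [v + 10 for v in xa], [v + 10 for v in ya]

r_a = corr(xa, ya)
r_b = corr(xb, yb)
r_combined = corr(xa + xb, ya + yb)

print(r_a, r_b, r_combined)   # r_combined is well above r_a and r_b
```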
  4. Distance of points above or below the SD line.
    If r is close to 0, then a typical point is above or below the line by an amount roughly comparable in size to the vertical SD (the SD of the variable on the y axis).

    If r is close to 1, then a typical point is only a small fraction of a vertical SD above or below the line.
  5. Formula For Distance Above Or Below The SD Line
    The r.m.s. vertical distance to the SD line:

    $\sqrt{2(1 - |r|)} \times \text{vertical SD}$

    This measures the vertical spread (along the y axis) of the points around the SD line.

    If the square-root factor comes out to 0.3, the spread around the SD line is about 30% of a vertical SD.
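    A short Python sketch of the arithmetic, assuming the formula on this card is the standard one, sqrt(2(1 - |r|)) × vertical SD. The function name is made up for this illustration.

```python
from math import sqrt

def rms_fraction_of_vertical_sd(r):
    """Spread around the SD line, as a fraction of the vertical SD.

    Assumes the formula sqrt(2 * (1 - |r|)); multiply the result by the
    vertical SD to get the r.m.s. vertical distance in the units of y.
    """
    return sqrt(2 * (1 - abs(r)))

for r in (0.0, 0.5, 0.95, 1.0):
    print(r, round(rms_fraction_of_vertical_sd(r), 2))
```

    With r = 0.95 the factor is about 0.32, so the spread is roughly 30% of a vertical SD. With r = 1 it is 0, meaning every point lies on the line.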
  6. Using the Correlation Coefficient r
    r measures only linear association, not association in general.

    Note: r measures clustering not in absolute terms but in relative terms, relative to the SDs.
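    A minimal Python sketch of the first point, on made-up data. Here y is completely determined by x (y = x squared), yet r is 0 because the association is not linear. `corr` is a helper written for this illustration.

```python
from math import sqrt

def corr(xs, ys):
    """Correlation coefficient r: the average product of standard units."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sdx = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sdy = sqrt(sum((y - my) ** 2 for y in ys) / n)
    return sum((x - mx) / sdx * (y - my) / sdy for x, y in zip(xs, ys)) / n

# Hypothetical data: a perfect, but curved, relationship.
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]

r_curved = corr(x, y)
print(r_curved)   # zero (up to rounding), despite the strong association
```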
  7. Ecological Correlation
    Ecological correlations, which are based on rates or averages, tend to overstate the strength of an association.

    A correlation based on individuals (and not on rates or averages) gives a more accurate picture of the association among individuals.

    • Example:
    • The correlation between income and education for men aged 25-64 in the United States is r = 0.42.

    However, when the correlation is computed from the averages for each US state (an ecological correlation), the coefficient is r = 0.70.
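    The effect can be reproduced on a toy data set. This Python sketch uses three made-up groups (standing in for states) whose averages fall exactly on a line, while the individuals inside each group are scattered. The numbers are invented for illustration and are not the income and education data.

```python
from math import sqrt

def corr(xs, ys):
    """Correlation coefficient r: the average product of standard units."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sdx = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sdy = sqrt(sum((y - my) ** 2 for y in ys) / n)
    return sum((x - mx) / sdx * (y - my) / sdy for x, y in zip(xs, ys)) / n

# Hypothetical groups: a list of (x values, y values) for the individuals in each.
groups = [
    ([0, 2, 4], [4, 2, 0]),
    ([3, 5, 7], [7, 5, 3]),
    ([6, 8, 10], [10, 8, 6]),
]

# Correlation based on individuals: pool every point.
all_x = [x for xs, _ in groups for x in xs]
all_y = [y for _, ys in groups for y in ys]
r_individuals = corr(all_x, all_y)

# Ecological correlation: one point per group, using the group averages.
avg_x = [sum(xs) / len(xs) for xs, _ in groups]
avg_y = [sum(ys) / len(ys) for _, ys in groups]
r_averages = corr(avg_x, avg_y)

print(r_individuals, r_averages)   # the averages give a much higher r
```

    Averaging wipes out the scatter within each group, which is why the ecological correlation comes out stronger.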
  8. Association Not Causation
    Correlation measures association. But association is not the same as causation.

    There could be a strong correlation between fat in the diet and cancer. However, fat may not be the cause of the cancer, or, if it is, its effect could be hard to single out.

    For example, countries that consume large amounts of fat also consume large amounts of sugar and carbohydrates. Therefore we shouldn't single out fat as the only causal factor, even though the correlation is strong. It could be fat, sugar, or carbohydrates, or all of them combined, that is causing the cancer.