
what is a statistic
 a. any numerical indicator of a set of data
 b. the application of procedures to produce numerical descriptions and statistical inferences

descriptive statistics
 used to summarize the information in a given data set pertaining to a particular sample

inferential statistics
drawing conclusions from sample data so that we can generalize to the population

Two key components to Inferential Statistics
 estimation: estimating the characteristics of a population from data gathered on a sample; how representative is my estimation of my population
 significance testing: testing for significant statistical differences between groups and significant relationships between variables; p<.05

Five Key Types of Descriptive Stats
 Central Tendency: a center point in my data; could be the mean
 Dispersion: how spread out the scores are
 Standard scores: z scores, based on the standard deviation; raw scores are converted to a standardized scale
 Frequencies: how many in each group;
 Visual Displays: graphs and charts

Central Tendency: Mode
 Mode: simplest; what number occurs most often; can have multiple modes
  appropriate for nominal data; for ordinal data the median is usually preferred
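
A minimal sketch of finding the mode, assuming a made-up list of favorite colors (nominal data). It uses `multimode` from Python's standard `statistics` module, which returns every value tied for most frequent, so a data set with multiple modes is handled.

```python
from statistics import multimode

# Hypothetical nominal data: favorite colors reported by six participants
colors = ["red", "blue", "blue", "green", "red", "yellow"]

# multimode returns all values tied for the highest count,
# in the order they first appear in the data
modes = multimode(colors)
print(modes)
```

Here "red" and "blue" both occur twice, so the data set has two modes.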

Central Tendency: Median
  middle most score in a distribution; cut the distribution in half
  appropriate for ordinal data
  it is resistant to extreme scores (outliers)
  does not describe "typical"

Central Tendency: Mean
  arithmetic average; it is not resistant to extreme scores
  most appropriate and effective for interval/ratio data
  often fractional (round to two decimal places)
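
A small sketch comparing how the median and the mean react to one extreme score. The scores are invented for illustration; only the standard `statistics` module is used.

```python
from statistics import mean, median

# Hypothetical test scores, then the same scores plus one extreme value
scores = [70, 72, 75, 78, 80]
with_outlier = scores + [200]

median_before, mean_before = median(scores), mean(scores)
median_after = median(with_outlier)
mean_after = round(mean(with_outlier), 2)  # rounded to two decimal places

print("without outlier:", median_before, mean_before)
print("with outlier:   ", median_after, mean_after)
```

The median moves from 75 to 76.5, while the mean jumps from 75 to 95.83, which is what "resistant" and "not resistant" mean in practice.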

Dispersion: Range
  simplest measure
  reports the distance between our highest and lowest score
  general sense of the spectrum of scores
  not resistant: like the mean, an extreme score will affect the range
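
A quick sketch of the range, assuming the same kind of invented scores, to show how a single extreme score changes it.

```python
# Hypothetical scores; the range is the highest score minus the lowest
scores = [70, 72, 75, 78, 80]
score_range = max(scores) - min(scores)

# Adding one extreme score stretches the range dramatically
with_outlier = scores + [200]
range_with_outlier = max(with_outlier) - min(with_outlier)

print(score_range, range_with_outlier)
```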

Dispersion: Variance
 mathematical index of the average squared distance of the scores in a distribution from the mean
  tells us how much the scores vary; often treated as the amount of error in our study

Dispersion: Standard Deviation
  average deviation from the mean expressed in the original unit of measure
  most often used by researchers
  square root of variance
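
A worked sketch of variance and standard deviation on invented scores. It computes the population versions by hand (dividing by N) and also shows the sample variance (dividing by N - 1), which is the version usually reported when the data are a sample. The variable names are illustrative only.

```python
from math import sqrt
from statistics import pvariance, variance

# Hypothetical scores
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
m = sum(scores) / n  # mean

# Variance: average of the squared deviations from the mean
squared_deviations = [(x - m) ** 2 for x in scores]
pop_variance = sum(squared_deviations) / n

# Standard deviation: square root of the variance, back in the original units
pop_sd = sqrt(pop_variance)

# Sample variance divides by n - 1 instead of n
sample_variance = variance(scores)

print(m, pop_variance, pop_sd, round(sample_variance, 2))
```

The hand calculation matches `statistics.pvariance`, so either route can be used to check the other.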

Standard Score
  common unit of measurement that indicates how far any particular score is away from the mean
  they locate scores within a distribution

Z score
 several uses beyond "locating":
  multiple raters
  same scale but different context
  different scales
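
A minimal sketch of the "different scales" use, assuming two invented exams with different means and standard deviations. The helper `z_score` is a hypothetical name; it applies the usual formula z = (score - mean) / SD.

```python
def z_score(score, mean, sd):
    """Return how many standard deviations a score lies from the mean."""
    return (score - mean) / sd

# Hypothetical exams on different scales
z_exam_a = z_score(85, mean=70, sd=10)     # exam A: mean 70, SD 10
z_exam_b = z_score(620, mean=500, sd=100)  # exam B: mean 500, SD 100

print(z_exam_a, z_exam_b)
```

Although 620 looks much larger than 85, the score on exam A sits further above its mean in standard deviation units.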

Frequencies:
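  frequency distribution: a tally of how often each value occurs; used to find the mode
  absolute frequency: the number of times each value occurs
  relative frequency: the proportion of times each value occurs
  cumulative frequency: the running total of frequencies up to and including each value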
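
A short sketch of the three kinds of frequency, assuming a small invented set of scores. It uses `Counter` and `accumulate` from the standard library; the dictionary names are illustrative.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical scores
scores = [1, 2, 2, 3, 3, 3, 4]
n = len(scores)

# Absolute frequency: count of each value, sorted by value
absolute = dict(sorted(Counter(scores).items()))

# Relative frequency: each count as a proportion of all scores
relative = {value: count / n for value, count in absolute.items()}

# Cumulative frequency: running total of the counts
cumulative = dict(zip(absolute, accumulate(absolute.values())))

print(absolute)
print(relative)
print(cumulative)
```

The value with the largest absolute frequency (3 here) is the mode, and the last cumulative frequency equals the number of scores.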

Visual Displays of Frequency
 pie charts
 bar charts
 histograms: like a bar chart, except it is using a ratio or interval variable

Estimating Population Parameters
guessing at the characteristics of our population, statistically speaking

estimates
statistics computed from a sample and used to approximate population parameters

Normality Assumption
the variable of interest is "normally" distributed in the population

Random Sample
rarely have a true random sample

Normal Distribution
  theoretical distribution representing the location of deviations about the mean and the probability of these deviations happening
  interval or ratio data
  deviations about the mean are expressed in standard deviation units (SDs)
  the normal distribution tells researchers the probability of a score falling in any given area of the curve

68-95-99.7 Rule
 99.7% of scores fall between -3 and +3 SD of the mean
 95% of scores fall between -2 and +2 SD
 68% of scores fall between -1 and +1 SD
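
The percentages in this rule can be checked against the standard normal curve. A minimal sketch, using `NormalDist` from Python's standard `statistics` module; the dictionary name `within_sd` is illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1
std_normal = NormalDist(mu=0, sigma=1)

# Proportion of scores within k SDs of the mean: area between -k and +k
within_sd = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in within_sd.items():
    print(f"within {k} SD: {share:.1%}")
```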

Abnormal Distributions
 it is not perfectly symmetrical
 can be abnormal in two ways
  kurtosis: how pointed or flat the distribution is compared with a normal distribution
  skewness: direction of asymmetry

Mesokurtic (0)
Perfectly normal distribution

Leptokurtic (>0)
pointy distribution; scores cluster tightly around the center

Platykurtic (<0)
flat distribution; scores are widely spread out

Skewed Distribution
all about the direction of the tail; extreme scores pull the mean toward the tail, so the order is mode, median, then mean (not all perfectly aligned)
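
A rough sketch of how skewness and kurtosis can be computed, assuming the simple moment-based formulas (third and fourth standardized moments, with 3 subtracted so a normal distribution scores 0 on kurtosis). Statistical packages often apply small-sample corrections, so their numbers may differ slightly. The function names and data are invented for illustration.

```python
from statistics import mean, median

def central_moment(data, power):
    """Average of the deviations from the mean raised to the given power."""
    m = mean(data)
    return sum((x - m) ** power for x in data) / len(data)

def skewness(data):
    """Positive = tail to the right, negative = tail to the left."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """0 = mesokurtic, above 0 = leptokurtic, below 0 = platykurtic."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10]  # one high score creates a right tail

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed), median(right_skewed), mean(right_skewed))
```

In the right-skewed data the mean (3.5) sits above the median (3), pulled toward the tail. The flat, evenly spread symmetric data come out platykurtic (below 0).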

Central Limit Theorem
  larger sample size: the sampling distribution of the mean approaches normal, even when the population itself is not normal
  larger samples give more accurate results than do smaller samples
  if you can't do random, do large
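
A small simulation sketch of the idea, assuming an invented, strongly skewed population (drawn from an exponential distribution) and a fixed random seed so the run is repeatable. The helper `sample_means` is a hypothetical name. It draws many samples of a given size and records each sample's mean.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the simulation is repeatable

# Hypothetical skewed population (not normally distributed)
population = [random.expovariate(1.0) for _ in range(20_000)]

def sample_means(sample_size, repetitions=1000):
    """Draw repeated random samples and return the mean of each one."""
    return [mean(random.sample(population, sample_size))
            for _ in range(repetitions)]

small_sample_means = sample_means(5)
large_sample_means = sample_means(50)

# The spread of the sample means shrinks as the sample size grows
print("spread of means, n=5: ", round(stdev(small_sample_means), 3))
print("spread of means, n=50:", round(stdev(large_sample_means), 3))
```

The means from the larger samples cluster much more tightly around the population mean, which is why larger samples give more accurate estimates.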

Making Inferences
 standard error of the mean: how much does my sample mean differ from my population mean; look at sampling distribution
  confidence level: how confident am I that the mean in my sample represents the population mean
  confidence interval: range of values around my mean score associated with the confidence level
  size of the confidence interval is influenced by:
   variability: factors you can't necessarily control that could affect your findings
   confidence level
   sample size
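
A minimal sketch of a standard error and confidence interval, assuming an invented sample of ten scores and a normal (z) approximation. With a sample this small, a t distribution would usually be used instead, so treat the numbers as illustrative. The function name `confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample of scores
sample = [72, 75, 78, 80, 69, 74, 77, 81, 73, 76]
n = len(sample)
sample_mean = mean(sample)

# Standard error of the mean: sample SD divided by the square root of n
standard_error = stdev(sample) / sqrt(n)

def confidence_interval(level):
    """Interval around the sample mean for a given confidence level (z-based)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level = 0.95
    margin = z * standard_error
    return (sample_mean - margin, sample_mean + margin)

ci_95 = confidence_interval(0.95)
ci_99 = confidence_interval(0.99)

print(sample_mean, round(standard_error, 2))
print([round(x, 2) for x in ci_95])
print([round(x, 2) for x in ci_99])
```

Raising the confidence level from 95% to 99% widens the interval. More variability widens it too, while a larger sample narrows it.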

Statistical significance
patterns or relationships between variables are unlikely to be due to chance alone, so they are likely to exist in the real world

Do we really test research hypotheses?
We don't actually test the hypotheses proposed in the study. We test the null hypothesis.

Null Hypothesis
  a statement that statistical differences or relationships have occurred for no reason other than chance
  we use statistics to decide whether to reject or fail to reject (loosely, "accept") the null, not to prove or disprove hypotheses
  we focus on estimating the probability that hypotheses are true/not true. Hence, our language regarding findings is qualified and tentative

Null Decision
  accept or reject the null
  based upon statistical significance
  in making this decision, we risk making one of two errors: Type I (rejecting a true null) or Type II (failing to reject a false null)
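
A rough sketch of a null decision, using a permutation test. That technique is not named in these notes; it is chosen here because it mirrors the null hypothesis directly. If the group labels mean nothing (chance only), shuffling them should produce differences as large as the observed one fairly often. The two groups, the seed, and the .05 cutoff are assumptions for illustration.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical scores for two groups
group_a = [82, 85, 88, 90, 79, 84, 91, 87]
group_b = [75, 78, 80, 72, 77, 74, 81, 76]
observed_difference = mean(group_a) - mean(group_b)

# Under the null, group labels are arbitrary, so shuffle them repeatedly
pooled = group_a + group_b
repetitions = 5000
as_extreme = 0
for _ in range(repetitions):
    random.shuffle(pooled)
    shuffled_difference = mean(pooled[:8]) - mean(pooled[8:])
    if abs(shuffled_difference) >= abs(observed_difference):
        as_extreme += 1

# p value: share of shuffles at least as extreme as what we observed
p_value = as_extreme / repetitions
alpha = 0.05
decision = "reject the null" if p_value < alpha else "fail to reject the null"

print(observed_difference, p_value, decision)
```

Because the two invented groups barely overlap, very few shuffles match the observed difference, so the p value falls below .05 and the null is rejected.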

