-
statistics
mathematical tools used to organize, summarize, and manipulate data
-
data
- observations gathered from an experiment, survey, or observational study
- scores on variables, info expressed as numbers (quantitatively)
-
variables
- quantities that vary
- traits or characteristics that can change values from case to case (ex. age, gender, income)
-
cases
the entities from which data are gathered (ex. ppl, groups, provinces, countries)
-
3 types of descriptive statistics
- univariate
- bivariate
- multivariate
-
univariate statistics
summarize a single variable (ex. GPA)
-
bivariate statistics
- summarize the strength and direction of the relationship between two variables
- ex. older students having higher GPA than younger students
-
multivariate statistics
- summarize the relationship between 3 or more variables
- ex. GPA increases with age for females but not males
-
inferential statistics
generalize/infer from a sample to a population
-
independent variable
- variable beyond our control
- used to help explain dependent variables
-
dependent variable
- what we want to explain (usually)
- depends on the independent variable
-
discrete variables
- measured in units that can't be subdivided
- ex. marital status, country of birth
-
continuous variables
- measured in units that can be subdivided (often infinitely)
- ex. age in years, income in dollars
-
nominal values
- allow for only qualitative classification
- scores are different from one another but can't be treated as numbers, and don't have order to them
- ex. region, 1=urban, 2=suburban, 3=rural
-
ordinal values
- scores ranked from high to low or more to less
- always discrete variables
- can say something is more/less than something else
- ex. 1=strongly disagree, 2=disagree, 3=agree, etc
-
interval-ratio values
- scores are actual numbers and have equal intervals between them
- can be discrete or continuous
- ex. age (years), income (dollars), number of kids
-
rate
- refers to the number of actual occurrences of an event divided by the number of possible occurrences, per some unit of time
- rate=#actual/#possible in a period of time
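
A minimal Python sketch of the rate formula above. The function name `rate`, the `per` scaling argument, and the counts are hypothetical, chosen only for illustration.

```python
def rate(actual, possible, per=1):
    """Actual occurrences divided by possible occurrences, scaled by `per`."""
    if possible <= 0:
        raise ValueError("possible must be positive")
    return actual / possible * per

# Hypothetical counts: 50 events out of 10,000 possible in one year
crude = rate(50, 10_000)                   # proportion, 0.005
per_thousand = rate(50, 10_000, per=1000)  # about 5 per 1,000 per year
```

Scaling by 1,000 or 100,000 just makes small rates easier to read; the time period still has to be stated alongside the number.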
-
percent point difference
- refers to absolute change
- ex. the new percentage - the old percentage
-
percent change
- refers to relative change
- ex. (new value - old value) / old value x 100
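
The two cards above are easy to mix up, so here is a small sketch contrasting them. The function names and the 10% to 15% figures are made up for illustration.

```python
def percentage_point_difference(old_pct, new_pct):
    """Absolute change: new percentage minus old percentage."""
    return new_pct - old_pct

def percent_change(old, new):
    """Relative change: the difference as a percentage of the old value."""
    return (new - old) / old * 100

# Hypothetical: a share that moves from 10% to 15%
absolute = percentage_point_difference(10, 15)  # 5 percentage points
relative = percent_change(10, 15)               # 50.0 percent increase
```

The same move reads as "up 5 points" or "up 50%" depending on which measure is used.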
-
frequency distributions
summarize the distribution of a variable by reporting the # of times each score of a variable occurred
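
A rough sketch of building a frequency distribution in Python. The helper name `frequency_distribution` and the sample scores are assumptions for illustration, not from the notes.

```python
from collections import Counter

def frequency_distribution(scores):
    """Map each score to (frequency, percentage of all cases)."""
    counts = Counter(scores)
    n = len(scores)
    return {score: (count, 100 * count / n) for score, count in counts.items()}

# Hypothetical scores on a nominal variable (region)
scores = ["urban", "rural", "urban", "suburban", "urban"]
table = frequency_distribution(scores)
# table["urban"] is (3, 60.0): 3 of the 5 cases, or 60%
```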
-
requirements for the categories of frequency distributions
- exhaustive
- mutually exclusive
- relatively homogeneous
-
pie chart
show differences in frequencies and percentages among the categories of a nominal or ordinal variable
-
bar chart
show the differences in frequencies or percentages among categories of a nominal or ordinal level variable
-
histogram
- show differences in frequencies and percentages among categories of an interval-ratio variable
- bars are touching to indicate a continuous variable
-
frequency polygon
- similar to histograms
- uses lines and dots to represent frequencies
-
measures of central tendency
- the most typical, central, or common score of a variable
- mode, median, mean
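
The three measures can be checked quickly with Python's standard `statistics` module. The scores below are hypothetical.

```python
import statistics

# Hypothetical interval-ratio scores for five cases
scores = [2, 3, 3, 5, 7]

mode = statistics.mode(scores)      # most common score: 3
median = statistics.median(scores)  # score of the middle case: 3
mean = statistics.mean(scores)      # sum of scores / number of cases: 4
```

Here the mean sits above the median because the one high score (7) pulls it upward.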
-
mode
- most common score
- can be used at all 3 levels of measurement
- most often used with nominal level variables
-
median
- score of the middle case
- divides the distribution into two equal parts
- can be used with ordinal or interval-ratio variables because they have some kind of order to them
-
n
number of cases in a sample
-
N
the number of cases in the population
-
special characteristics of the mean
- all scores cancel out around the mean
- the mean is the point of minimized variation
- the mean incorporates all of the scores
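
A small numeric check of the first two properties, assuming "minimized variation" means the sum of squared deviations is smallest around the mean. The scores and the helper `sum_sq_dev` are hypothetical.

```python
# Hypothetical scores
scores = [2, 3, 3, 5, 7]
mean = sum(scores) / len(scores)  # 4.0

# 1. Deviations from the mean cancel out (sum to zero)
deviation_sum = sum(x - mean for x in scores)

# 2. Squared deviations are smaller around the mean than around any other point
def sum_sq_dev(values, point):
    """Sum of squared deviations of `values` from `point`."""
    return sum((x - point) ** 2 for x in values)

around_mean = sum_sq_dev(scores, mean)  # 16.0
around_median = sum_sq_dev(scores, 3)   # 21, larger than around the mean
```

The third property is visible in the first line: every score enters the sum, which is also why one extreme score can pull the mean.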
-
skewness
- the shape of the distribution
- can be unskewed (perf normal curve), positively skewed, negatively skewed
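
One common way to put a number on skew is a moment-based coefficient; this is a sketch of that approach, not necessarily the measure the course uses. The function name and all three data sets are made up.

```python
def skewness(scores):
    """Moment-based skewness: third central moment / (second central moment ** 1.5).

    Assumes the scores are not all identical (otherwise the denominator is zero).
    """
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    return m3 / m2 ** 1.5

symmetric = skewness([1, 2, 3, 4, 5])      # 0.0, unskewed
right_tail = skewness([1, 2, 2, 3, 12])    # positive: a few very high scores
left_tail = skewness([1, 10, 11, 11, 12])  # negative: a few very low scores
```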