-
statistics
the science of data
-
individuals/cases
the objects described by a set of data
-
variable
any characteristic of an individual; it can take different values for different individuals
-
quantitative variable
numeric
-
qualitative variable
takes values that are words or category labels rather than numbers
-
values
the particular outcomes a variable takes on
-
observational study
observes individuals and measures variables of interest but does not attempt to influence the responses. Used to describe a group or situation
-
response
variable that measures an outcome or result of a study
-
sampling
gaining information about the whole by examining only a part
-
sample surveys
survey a group of individuals by studying only some of its members, chosen to represent the larger group
-
population
entire group of individuals about which we want information (the group we want to study)
-
sample
part of population from which we actually get the information and use it to draw conclusions about the whole
-
census
sample survey that attempts to include entire population in sample
-
experiments
deliberately imposing some treatment on individuals in order to observe their responses. Can give evidence of cause and effect
-
biased
a statistical study is biased if it systematically favors certain outcomes
-
convenience sampling
selection of individuals who are easiest to reach
-
voluntary response sampling
the sample chooses itself by responding to a general appeal (e.g., a call-in poll)
-
simple random sample
a sample of n individuals chosen from the population in such a way that every set of n individuals has an equal chance of being the sample actually selected
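One way to picture an SRS is to let software do the drawing. The sketch below is only an illustration: the `students` list, the sample size, and the `seed` argument are made up for the example, and Python's standard `random.sample` stands in for whatever chance device is actually used.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals so that every set of n is equally likely."""
    rng = random.Random(seed)          # seed is optional; it only makes the draw repeatable
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population of 30 labeled individuals
students = [f"student_{i:02d}" for i in range(1, 31)]
sample = simple_random_sample(students, 5, seed=1)
print(sample)
```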
-
table of random digits
- long string of digits with two properties:
- 1. each entry is equally likely to be any of the digits 0 through 9
- 2. entries are independent of each other
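A table of random digits is typically used to pick an SRS by giving every individual a numeric label and reading off groups of digits. The sketch below assumes two-digit labels (so a population of at most 99) and an arbitrary digit string chosen for illustration; the function name is hypothetical.

```python
def sample_from_digits(digits, population_size, n):
    """Pick n distinct labels in 1..population_size by reading two-digit groups.

    Groups that are out of range or already chosen are skipped.
    Returns fewer than n labels if the digit string runs out.
    """
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen

# Arbitrary string of digits standing in for one line of a table
digits = "1922395034057562871396409125314254482853"
print(sample_from_digits(digits, population_size=30, n=5))  # [19, 22, 5, 13, 25]
```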
-
parameter
number that describes the population
-
statistic
number that describes a sample
-
parameter is to __________ as statistic is to ______.
population; sample
-
variability
describes how spread out values are
-
p
proportion (a fraction: the count with some characteristic divided by the total count)
-
margin of error
a plus-or-minus figure (e.g., plus or minus 2 percentage points) that says how close the sample statistic is likely to be to the population parameter
-
95% confident
95% of all possible samples give a result within the margin of error of the truth about the population
-
confidence statements
- a fact about what happens in all possible samples, used to say how trustworthy the result of one sample is; it has two parts:
- 1. margin of error
- 2. level of confidence
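As a rough illustration of how the two parts fit together, the sketch below computes an approximate 95% margin of error for a sample proportion. It assumes the usual large-sample formula 1.96 * sqrt(p_hat * (1 - p_hat) / n), which these cards do not state, and the poll numbers are invented.

```python
from math import sqrt

def margin_of_error_95(successes, n):
    """Approximate 95% margin of error for a sample proportion (large-sample formula)."""
    p_hat = successes / n
    return 1.96 * sqrt(p_hat * (1 - p_hat) / n)

# Invented poll: 520 of 1000 respondents say yes
p_hat = 520 / 1000
moe = margin_of_error_95(520, 1000)
print(f"95% confident the truth is {p_hat:.2f} plus or minus {moe:.3f}")
```

The printed sentence carries both pieces of a confidence statement: the margin of error and the level of confidence.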
-
sampling errors
- errors caused by the act of taking a sample
- -undercoverage
- -random sampling error
- -biased sampling methods
-
random sampling error
deviation between the sample statistic and the population parameter caused by chance in selecting a random sample
-
nonsampling error
- errors not related to the act of selecting a sample from the population
- -processing errors
- -response error
- -nonresponse
-
sampling frame
list of individuals from which the sample is actually drawn; ideally it lists every individual in the population
-
undercoverage
occurs when some groups in the population are left out of the process of choosing the sample
-
processing errors
mistakes in mechanical tasks like arithmetic or entering responses into a computer
-
response error
occurs when a subject gives an incorrect response (lying, guessing, or faulty memory)
-
nonresponse
failure to obtain data from an individual selected for a sample (cannot be contacted or refuses to cooperate)
-
stratified random sample
- 1. divide the sampling frame into distinct groups of individuals, called strata
- 2. take a separate SRS in each stratum and combine these to make up the complete sample
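The two steps can be sketched in code. Everything named here is hypothetical: the `frame` of (name, class year) pairs, the `stratum_of` function that says which stratum an individual belongs to, and the per-stratum sample sizes.

```python
import random

def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Step 1: split the frame into strata. Step 2: take an SRS in each and combine."""
    rng = random.Random(seed)
    strata = {}
    for individual in frame:
        strata.setdefault(stratum_of(individual), []).append(individual)
    sample = []
    for name in sorted(strata):
        # separate SRS within this stratum
        sample.extend(rng.sample(strata[name], n_per_stratum[name]))
    return sample

# Hypothetical sampling frame: 20 freshmen and 10 seniors
frame = [(f"s{i}", "freshman") for i in range(20)]
frame += [(f"s{i}", "senior") for i in range(20, 30)]
sample = stratified_sample(frame, lambda person: person[1],
                           {"freshman": 4, "senior": 2}, seed=0)
print(sample)
```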
-
probability sample
sample chosen by chance
-
response variable
variable that measures an outcome or result of a study (dependent)
-
explanatory variable
variable that we think explains or causes change to response variable (independent)
-
subjects
individuals studied in an experiment
-
treatment
specific experimental condition applied to subjects
-
lurking variable
variable that has an important effect on the relationship among the variables in a study but is not one of the explanatory variables studied
-
confounded
two variables are confounded when their effects on a response variable cannot be distinguished from each other
-
clinical trials
experiment that studies effectiveness of medical treatments on actual patients
-
placebo
dummy treatment with no active ingredients
-
placebo effect
response to a dummy treatment
-
double-blind
neither the subjects nor the people recording their responses know which treatment each subject received
-
randomized comparative experiment
experiment that compares two or more treatments, assigning subjects to them at random so the groups are alike except for the treatment; this allows cause-and-effect conclusions
-
control group
the comparison group in an experiment; it may receive a placebo or no treatment at all
-
control
control the effects of lurking variables on the response, most simply by comparing two or more treatments
-
randomize
use impersonal chance to assign subjects to treatments
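A minimal sketch of randomization, assuming equal-sized groups are wanted: shuffle the subjects by chance, then deal them out to the treatments in turn. The subject labels and treatment names are invented for the example.

```python
import random

def randomize(subjects, treatments, seed=None):
    """Assign subjects to treatments by chance, keeping group sizes as equal as possible."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)                      # impersonal chance decides the order
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical experiment with 12 subjects and two treatments
subjects = [f"subject_{i}" for i in range(1, 13)]
groups = randomize(subjects, ["new drug", "placebo"], seed=3)
for treatment, members in groups.items():
    print(treatment, members)
```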
-
statistically significant
observed effect of a size that would rarely occur by chance
-
comparative
good studies compare groups, even when they are observational studies rather than experiments
-
matching
choosing control-group members who resemble the treated individuals; combined with comparison when creating a control group
-
nonadherers
subjects who participate but do not follow the experimental treatment
-
dropouts
subjects who begin an experiment that continues over an extended period of time but do not complete it
-
generalizability
the extent to which a study's conclusions hold for the whole population, not just for the subjects studied
-
completely randomized
experimental design in which all the experimental subjects are allocated at random among all the treatments
-
matched pair design
compares two treatments using pairs of subjects that are as closely matched as possible
-
block design
random assignment of subjects to treatments is carried out separately within each block
-
block
group of experimental subjects known before the experiment to be similar in some way that is expected to affect the response to the treatments
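A block design can be sketched as randomization repeated inside each block. The example is an assumption-laden illustration: subjects are hypothetical (label, sex) pairs, sex is the blocking variable, and `block_of` is a made-up helper that reports each subject's block.

```python
import random

def block_randomize(subjects, block_of, treatments, seed=None):
    """Form blocks of similar subjects, then randomize to treatments within each block."""
    rng = random.Random(seed)
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)
    assignment = {}
    for name in sorted(blocks):
        members = blocks[name]
        rng.shuffle(members)                   # separate randomization per block
        for i, subject in enumerate(members):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects: 6 women and 6 men, blocked by sex
subjects = [(f"subject_{i}", "female" if i < 6 else "male") for i in range(12)]
assignment = block_randomize(subjects, lambda s: s[1], ["A", "B"], seed=5)
for subject, treatment in assignment.items():
    print(subject, treatment)
```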
-
measure
to measure a property of a person or thing is to assign a number to represent that property
-
instrument
the tool used to make a measurement
-
units
used to record the measurement
-
variable
the numerical result of a measurement
-
valid
a measure of a property is valid if it is relevant or appropriate as a representation of that property
-
predictive validity
a measurement has predictive validity if it can be used to predict success on tasks related to the property measured
-
bias in measurement
the measurement systematically tends to overstate or understate the true value of the measured property
-
random error in measurement
repeated measurements on the same individual give different results
-
reliable
a measurement is reliable if its random error is small
-
average in measurement
the average of several repeated measurements of the same individual is more reliable than a single measurement
-
distribution
the distribution of a variable tells us what values it takes and how often it takes each value
-
frequency table
- raw data
- values || frequency
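For illustration, a frequency table can be tallied from raw data with Python's standard `collections.Counter`. The raw data below are invented.

```python
from collections import Counter

# Invented raw data: one recorded value per individual
raw_data = ["B", "A", "C", "A", "B", "A"]

frequency = Counter(raw_data)      # maps each value to how often it occurs
for value in sorted(frequency):
    print(f"{value} || {frequency[value]}")
```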
-
roundoff errors
rounded entries do not quite add up to the total, which is rounded separately
-
pie chart
shows how a whole is divided into parts, forcing us to see that the parts make a whole
-
bar graph
shows each category's count or percent as the height of a bar; unlike a pie chart, it can compare quantities that are not parts of a whole
-
categorical variable
places an individual into one of several groups or categories
-
quantitative variable
takes numerical values for which arithmetic operations like adding and averaging make sense
-
pictogram
bar graph in which pictures replace the bars; misleading when the pictures' areas are not proportional to the values
-
line graph
displays change over time by plotting each value of a variable against the time at which it was measured
-
histogram
graph of the distribution of a quantitative variable; the bars touch, with no gaps between classes
-
center
midpoint of distribution
-
spread
variability of the data (described apart from any outliers)
-
shape
number of peaks (unimodal, bimodal, multimodal) and whether the distribution is symmetric or skewed
-
right skew (positively skewed)
when the tail goes to the right
-
left skew (negatively skewed)
when the tail goes to the left
-
stemplot
each observation is split into a stem (all but the final digit), written on the left, and a leaf (the final digit), written on the right
-
median
midpoint of a distribution: the number positioned halfway through the ordered observations
-
quartiles Q1 and Q3
Q1 is the median of the observations below the overall median and Q3 is the median of those above it; with the median, they divide the observations into quarters
-
five number summary
min, Q1, median, Q3, max
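A small sketch of the five-number summary, assuming the quartile rule used in these notes (Q1 and Q3 are the medians of the lower and upper halves, leaving out the overall median when the count is odd). Other software may use slightly different quartile conventions, and the data are invented.

```python
from statistics import median

def five_number_summary(observations):
    """Return (min, Q1, median, Q3, max); needs at least two observations."""
    data = sorted(observations)
    n = len(data)
    half = n // 2
    lower = data[:half]              # observations before the median position
    upper = data[half + n % 2:]      # observations after the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

print(five_number_summary([9, 1, 5, 3, 7, 2, 8, 4, 6]))  # (1, 2.5, 5, 7.5, 9)
```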
-
boxplot
graph of the five-number summary
-
Mean (x-bar)
average of a set of observations: add the values and divide by the number of observations
-
mode
most frequent value
-
standard deviation (s)
measures the typical (roughly average) distance of the observations from their mean
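To make "distance from the mean" concrete, the sketch below computes x-bar and s step by step for a small invented data set. It assumes the usual sample formula, which divides the sum of squared deviations by n - 1 before taking the square root; the cards themselves do not spell out the formula.

```python
from math import sqrt
from statistics import mean, stdev

observations = [2, 4, 4, 4, 5, 5, 7, 9]   # invented data

x_bar = mean(observations)                                   # mean (x-bar)
squared_deviations = [(x - x_bar) ** 2 for x in observations]
s = sqrt(sum(squared_deviations) / (len(observations) - 1))  # standard deviation (s)

print(x_bar, round(s, 3))
print(round(stdev(observations), 3))      # library function, for comparison
```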