-
More than one dependent variable and/or more than one independent variable and their relationships/correlation/etc
Multivariate procedures
-
used to analyze the effects of two or more independent variables on a continuous dependent variable
multiple regression analysis
-
multiple ______ and multiple _____ will be used almost interchangeably
-
_______ analysis is used to make predictions
regression
-
one independent variable (X) is used to predict a dependent variable (Y)
simple regression
-
used to determine a straight line fit to the data that minimizes deviations from the line
linear regression
-
most are small; they occur because the correlation between X and Y is not perfect (prediction is error-free only when r = 1.00 or -1.00)
errors of prediction (e)
-
standard regression is said to use this because the regression equation solves for a and b in a way that minimizes errors of prediction; more precisely, the solution minimizes the sums of squares of prediction errors
least squares criterion
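A minimal sketch of the least squares criterion for simple regression (Y' = a + bX). The data values and the function names `fit_line` and `sum_squared_errors` are made up for illustration; the formulas are the usual closed-form solution for a and b.

```python
def fit_line(xs, ys):
    """Return intercept a and slope b that minimize squared prediction errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Sum of squared errors of prediction (e = Y - Y') for the line a + bX."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
sse_fit = sum_squared_errors(xs, ys, a, b)
# Any other line, such as one with a slightly steeper slope, has a larger sum of squares
sse_other = sum_squared_errors(xs, ys, a, b + 0.1)
print(a, b, sse_fit, sse_other)
```

The fitted line gives the smallest possible sum of squared errors for these data; nudging the slope away from b makes it larger.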
-
standard regression is sometimes called this
ordinary least squares (OLS) regression
-
Expresses how variation in one variable is associated with variation in another; if r = 0.9, then r squared = 0.81, meaning 81% of the variability in Y values can be understood in terms of variability in X values.
correlation coefficient (r) and its square (r squared):
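A short sketch of computing Pearson's r and r squared by hand. The data and the function name `pearson_r` are illustrative assumptions, not part of the original notes.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
r_squared = r ** 2  # proportion of variability in Y associated with variability in X
print(round(r, 3), round(r_squared, 3))
```

Squaring r is all that is needed to move from the correlation to the proportion of variance explained, as in the r = 0.9 example on the card.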
-
With correlation coefficients, the stronger the correlation, the better the _______ (the stronger the correlation, the greater the _______ of variance explained)
-
The index used with two or more independent variables (Pearson’s r is used with bivariate correlation)
multiple correlation coefficient (R)
-
R (unlike r) does not have negative values so it can show the _____ of relationship between several independent variables and a dependent variable.
strength
-
R cannot be ______; it ranges from ___ to ____.
negative; 0.00 to 1.00
-
R is based on _______ scores.
standardized
-
R can show the _____ of a prediction or relationship but NOT the _______.
strength; direction
-
___________ predicts a DV from more than 1 IV.
Multiple Linear Regression
-
What does R squared tell you?
how much all the IVs together contribute to the DV (the proportion of variance in the DV they account for)
-
What should you do to learn how much influence each IV has on the DV?
Look at the beta weights
-
Three ways of entering predictor variables.
- Simultaneous
- Hierarchical
- Stepwise
-
Dependent variables in multiple regression analysis (as in ANOVA) should be measured on a _________ scale; independent variables can be _________.
- interval or ratio
- interval or ratio OR categorical
-
When a regression coefficient (b) is divided by its standard error, the result is a value for the t statistic, which can be used to assess the significance of ____________.
individual predictors
-
A significant t indicates that the regression coefficient (b) is significantly __________.
different from zero
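A rough sketch of the t statistic for a regression coefficient, shown for simple regression to keep the arithmetic short (in multiple regression, software computes each b and its standard error, but the ratio is formed the same way). The data and the function name `slope_t_statistic` are illustrative assumptions.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (b, standard error of b, t) for a simple regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Mean squared error of prediction, with n - 2 degrees of freedom
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    mse = sse / (n - 2)
    se_b = math.sqrt(mse / sxx)
    return b, se_b, b / se_b


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, se_b, t = slope_t_statistic(xs, ys)
print(round(b, 3), round(se_b, 3), round(t, 3))
```

The resulting t would be compared against a t distribution with n - 2 degrees of freedom to judge whether b differs significantly from zero.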
-
In ____________, the coefficients represent the number of units the dependent variable is predicted to change for each unit change in a given independent variable when the effects of other predictors are held constant (they are statistically controlled) - can enhance a study’s internal validity.
multiple regression
-
enters all predictor variables into the regression equation at the same time; there is no basis for considering any particular predictor as causally prior to another.
simultaneous multiple regression
-
involves entering predictors into the equation in a series of steps; researchers control the order of entry (typically based on theoretical considerations).
hierarchical multiple regression
-
empirically selecting the combination of independent variables with the most predictive power.
stepwise multiple regression
-
the regression coefficients for each z score are standardized regression coefficients, called what?
beta weights
-
___________ eliminate the problem of differing units by transforming all variables to scores with a mean of 0.0 and a standard deviation of 1.00
standard scores (z scores)
-
__________ are the difference between a score and the mean of its distribution, divided by the standard deviation
z scores
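A small sketch of converting raw scores to z scores. The scores are made up, and using the sample standard deviation (`statistics.stdev`) rather than the population version is an assumption made here for illustration.

```python
import statistics


def z_scores(scores):
    """Convert raw scores to z scores: (score - mean) / standard deviation."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD; an assumption for this sketch
    return [(s - mean) / sd for s in scores]


# Hypothetical raw scores, for illustration only
raw = [50, 60, 70, 80, 90]
z = z_scores(raw)

# After the transformation the scores have mean 0.0 and SD 1.00,
# whatever units the raw scores were in
print(z, statistics.mean(z), statistics.stdev(z))
```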
-
What is the problem with beta weights?
the regression coefficients are the same no matter what the order of entry of the variables, but they are unstable: beta weights tend to fluctuate from sample to sample and change when a variable is added to or removed from the regression equation, so it is difficult to attach theoretical importance to them
-
Power Analysis for Multiple Regression: a ratio of ______ for simultaneous and hierarchical regression and a ratio of ______ for stepwise
20:1; 40:1
-
Power Analysis for Multiple Regression: N should be greater than _________ times the number of predictors (independent variables)
50 + 8
-
An estimation of the number of participants needed to reject the null that R equals zero based on effect size, number of predictors, desired power, and the significance criterion
power analysis
-
used to compare the means of two or more groups, adjusts for initial differences so that the results more precisely reflect the effect of an intervention
Analysis of Covariance (ANCOVA):
-
offers post hoc statistical control; assumes randomization.
ANCOVA
-
___________ can statistically control for pretest scores - the posttest score is the DV and the IV is experimental/comparison group status and the covariate is pretest scores
ANCOVA
-
usually continuous variables (ex: anxiety scores) but can sometimes be dichotomous variables (male/female)
covariates
-
the independent variable in ANCOVA (the one whose effect is tested after adjusting for covariates) is a ______-level variable
nominal
-
covariates should be variables that you suspect are correlated with the ________ variable
dependent
-
techniques that fit data to straight-line (linear) solutions; foundation for the t-test, ANOVA, and multiple regression
general linear model (GLM):
-
group means on the dependent variable after removing the effect of covariates
adjusted means
-
adjusted means allow researchers to determine _________.
net effects
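A hedged sketch of how an adjusted mean is obtained with one covariate, using the standard ANCOVA adjustment: group mean on the DV minus the pooled within-group slope times the group's deviation from the grand covariate mean. All numbers and the function name `adjusted_mean` are hypothetical; in practice the pooled slope comes from the ANCOVA itself.

```python
def adjusted_mean(group_mean_y, group_mean_cov, grand_mean_cov, pooled_slope):
    """Group mean on the DV after removing the effect of one covariate."""
    return group_mean_y - pooled_slope * (group_mean_cov - grand_mean_cov)


# Hypothetical values: posttest is the DV, pretest is the covariate
grand_pretest_mean = 50.0
pooled_slope = 0.5  # assumed within-group regression of posttest on pretest

treatment_adj = adjusted_mean(80.0, 55.0, grand_pretest_mean, pooled_slope)
control_adj = adjusted_mean(70.0, 45.0, grand_pretest_mean, pooled_slope)

raw_difference = 80.0 - 70.0
net_difference = treatment_adj - control_adj
print(treatment_adj, control_adj, raw_difference, net_difference)
```

In this made-up case the treatment group started out ahead on the pretest, so the net effect (5 points) is smaller than the raw difference in posttest means (10 points).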
-
techniques that fit data to straight-line (linear) solutions; foundation for such procedures as the t-test, ANOVA, and multiple regression
general linear model (GLM):
-
used to test the significance of differences in group means for multiple dependent variables.
MANOVA
-
allows for the control of confounding variables (covariates) when there are two or more dependent variables.
MANCOVA
-
makes predictions about membership in groups; ex: predict membership in such groups as compliant vs noncompliant patients
- discriminant analysis
- (the equation is called the discriminant function)
-
an equation developed using discriminant analysis for a categorical dependent variable, with independent variables that are either dichotomous or continuous
discriminant function
-
researchers begin with data from people whose group membership is known and develop an equation to predict membership when only measures of the independent variables are available - the _________ indicates to which group each person would likely belong
discriminant function
-
indicates the proportion of variance unaccounted for by predictors
Wilks’ lambda
-
analyzes the relationship between multiple independent variables and a dependent variable; used to predict categorical dependent variables
logistic regression
-
used in logistic regression to estimate the parameters most likely to have generated the observed data
maximum likelihood estimation (MLE):
-
the factor by which the odds change for a one-unit change in a predictor; provides an estimate of relative risk
odds ratio
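A brief sketch of the arithmetic behind an odds ratio. The probabilities are invented for illustration, and the function names are hypothetical; the relation OR = exp(b) for a logistic regression coefficient b is the standard one.

```python
import math


def odds(p):
    """Odds of an event with probability p."""
    return p / (1 - p)


def odds_ratio_from_coefficient(b):
    """Odds ratio implied by a logistic regression coefficient b."""
    return math.exp(b)


# Hypothetical probabilities of an event in two groups
or_groups = odds(0.5) / odds(0.2)  # odds of 1.0 versus odds of 0.25

# A coefficient of ln(2) means the odds double for each one-unit change in the predictor
or_from_b = odds_ratio_from_coefficient(math.log(2))
print(or_groups, or_from_b)
```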
-
dependent variable in binary logistic regression is a _______ variable
dichotomous
-
_______ variables can be continuous variables, categorical variables, or interaction terms; can be entered in an equation in different ways (simultaneous, hierarchical, and stepwise)
predictor
-
________ variables (indicator variables) are a common method of representing dichotomous predictors
dummy-coded
-
one group in an analysis of a variable with more than two categories is given an OR of 1.0, and the other groups (categories of the variable) have ORs expressed in relation to the ___________.
reference group
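A minimal sketch of dummy coding a predictor with more than two categories. The variable, its categories, and the function name `dummy_code` are hypothetical; the point is that the reference group gets 0 on every indicator, so the other groups are interpreted relative to it.

```python
def dummy_code(values, reference):
    """Return one dict of 0/1 indicator variables per case, omitting the reference group."""
    categories = sorted(set(values) - {reference})
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


# Hypothetical three-category predictor, with "married" as the reference group
marital_status = ["single", "married", "divorced", "married"]
coded = dummy_code(marital_status, reference="married")

for status, row in zip(marital_status, coded):
    print(status, row)
```

A variable with k categories needs only k - 1 indicators, since membership in the reference group is implied when all indicators are 0.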
-
based on the residuals for all cases in the analysis (the difference between the observed probability of an event and the predicted probability)
goodness-of-fit statistic
-
compares the prediction model to a hypothetically “perfect” model (one that contains the exact set of predictors needed to duplicate the observed frequencies in the dependent variable)
Hosmer-Lemeshow test
-
to test the significance of individual predictors in the model; distributed as a chi-square
Wald Statistic
-
most frequently reported pseudo R squared index
Nagelkerke
-
widely used by epidemiologists when the dependent variable is a time interval between an initial event (onset of a disease) and a terminal event (death)
survival analysis
-
time-related data are ________ when the observation period does not cover all possible events
censored
-
testing a hypothesized causal explanation of a phenomenon, typically with data from nonexperimental studies.
Causal Modeling
-
Two approaches to causal modeling.
- Path analysis
- Structural equations modeling (SEM)
-
a method for studying causal patterns among variables; not a method for discovering causes (uses least-squares estimation)
path analysis
-
Model of path analysis where causal flow is unidirectional (variable 2 is a cause of variable 3, and variable 3 is NOT a cause of variable 2)
recursive model
-
the weights representing the effect of one variable on another; indicates the proportion of a standard deviation difference in the caused variable that is directly attributable to a 1 SD difference in the specified causal variable
path coefficient
-
uses maximum likelihood estimation and is a more powerful approach than path analysis (which assumes that causal flow is recursive, i.e., unidirectional, that variables are measured without error, and that residuals are uncorrelated; these assumptions are usually not plausible)
structural equations modeling (SEM):
-
can accommodate measurement errors, correlated residuals, and nonrecursive models (allows for reciprocal causation)
structural equations modeling (SEM):
-
can be used to analyze causal models involving latent variables (unmeasured variables corresponding to abstract constructs); the analysis proceeds in two phases
structural equations modeling (SEM):
-
Multivariate statistics allow for what two things?
- to examine complex phenomena
- to analyze 3 or more variables at once
-
In ________ one IV is used to predict a DV.
Simple Linear Regression
-
What does R squared tell you?
- accuracy of a prediction equation
- (How much all IVs contribute to the DV)
-
Sample size for simultaneous multiple regression.
20:1 (20 or more per IV)
-
Sample size for hierarchical multiple regression.
20:1 (20 or more per IV)
-
Sample size for Stepwise multiple regression.
40:1 (40 or more per IV)
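The sample-size rules of thumb in these notes (20:1, 40:1, and N greater than 50 + 8 times the number of predictors) reduce to simple arithmetic. A small sketch follows; the function names and the choice of five predictors are illustrative assumptions.

```python
# Cases per predictor, as given in the notes
CASES_PER_PREDICTOR = {"simultaneous": 20, "hierarchical": 20, "stepwise": 40}


def n_from_ratio(method, num_predictors):
    """Sample size suggested by the cases-per-predictor ratio."""
    return CASES_PER_PREDICTOR[method] * num_predictors


def n_threshold(num_predictors):
    """N should be greater than this value under the 50 + 8m rule."""
    return 50 + 8 * num_predictors


# Hypothetical study with five predictors
for method in CASES_PER_PREDICTOR:
    print(method, n_from_ratio(method, 5))
print("50 + 8m threshold:", n_threshold(5))
```

These are rough guides only; a formal power analysis uses effect size, number of predictors, desired power, and the significance criterion.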
-
Researchers often try to improve predictions of Y by including multiple IVs, which are often called _______ variables in a multiple regression context.
predictor
-
What is the index in bivariate correlation? With two or more IVs?
- Pearson's r
- multiple correlation coefficient (R)
-
The proportion of variance in Y accounted for by the combined, simultaneous influence of the IVs.
R squared
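A short sketch of R squared as a proportion of variance, computed as 1 minus the ratio of residual to total sum of squares. The observed and predicted values are made up, and the predictions are assumed to come from a least-squares regression equation fit to the same data.

```python
def r_squared(observed, predicted):
    """Proportion of variance in Y accounted for by the regression equation."""
    mean_y = sum(observed) / len(observed)
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total


# Hypothetical observed Y values and the values predicted by a fitted equation
observed = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

print(round(r_squared(observed, predicted), 3))
```

The same calculation applies whether the predictions come from one IV or several; only the equation that produced them changes.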
-
R is never less than the highest r between a _______ and the _______.
-
What does a high correlation among IVs do to the predictive power?
decreases it
-
What happens to increments to R as more IVs are added to the regression equation?
they decrease
-
What is difficult to avoid as more and more variables are added to the regression equation?
redundancy
-
Three tests of significance for multiple linear regression.
- Tests of Overall Equation and R
- Tests for Adding Predictors
- Tests of the Regression Coefficients
-
What is the basic null hypothesis in a multiple regression?
R = zero
(R= population multiple correlation coefficient)
-
What is used to decide if a third predictor will increase the ability to predict Y after two predictors have been used?
- F-statistic
- (tests for adding predictors)
-
A significant t indicates that the regression coefficient is what?
significantly different from zero
-
In simple regression, the ______ indicates the amount of change in predicted values of Y for a specified rate of change in X. In multiple regression, the _______ represent the number of units the DV is predicted to change for each unit change in a given IV.
-
Strategy used when there is no basis for considering any particular predictor as causally prior to another and when the predictors are of comparable importance to the research problem.
Simultaneous Multiple Regression
-
Any data for which ANOVA is appropriate can be analyzed by __________, but the reverse is not true.
multiple regression
-
used to examine the effect of a key independent variable after first removing (controlling) the effect of confounding variables
Hierarchical multiple regression
-
the analog of the overall F test in multiple regression (chi-squared distribution)
Goodness-of-fit statistic
-
researchers posit causal linkages among three or more variables and then test whether hypothesized pathways from the causes to the effect are consistent with the data
Causal Modeling