GU Research Mod 10

Home

Get App

Take Quiz

Create

More than one dependent variable and/or more than one independent variable and their relationships/correlation/etc

Multivariate procedures
used to analyze the effects of two or more independent variable on a continuous dependent variable

multiple regression analysis
multiple ______ and multiple _____ will be used almost interchangeably
- regression
- correlation
_______ analysis is used to make predictions

regression
one independent variable (X) is used to predict a dependent variable (Y)

simple regression
used to determine a straight line fit to the data that minimizes deviations from the line

linear regression
most are small; occur because the correlation between X and Y is not perfect (only when r= 1.00 or -1.00 are they perfect)

errors of prediction (e)
standard regression is said to use this because the regression equation solves for a and b in a way that minimizes errors of prediction; more precisely, the solution minimizes the sums of squares of prediction errors

least squares criterion
standard regression is sometimes called this.ordinary least square (OLS) regression

ordinary least square (OLS) regression
error terms

residuals
Expresses how variation in one variable is associated with variation in another; if r = 0.9 then r squared =0.81 meaning 81% of the variability in Y values can be understood in terms of variability in X values.

correlation coefficient (r?):
With correlation coefficients, the stronger the correlation, the better the _______ (the stronger the correlation, the greater the _______ of variance explained)
- prediction
- percentage
The index when using two or more independent variable (Pearson’s r is used with bivariate correlation)

multiple correlation coefficient (R)
R (unlike r) does not have negative values so it can show the _____ of relationship between several independent variables and a dependent variable.

strength
R cannot be ______; it ranges from ___ to ____.
- negative
- 0-1
R is based on _______ scores.

standardized
R can show the _____ of a prediction or relationship but NOT the _______.
- strength
- direction
___________ predicts a DV from more than 1 IV.

Multiple Linear Regression
What does R squared tell you?

how much all the IVs contribuite to DV
What should you do to learn how much influence each IV has on the DV?

Look at the Beta weight
Three ways of entering predictor variables.
- Simultaneous
- Hierarchical
- Stepwise
Dependent variables in multiple regression analysis (ANOVA) should be measured on a _________ scale; independent variables can be _________.
- interval or ratio
- interval or ratio OR categorical
When a regression coefficient (b) is divided by its standard error, the result is a value for the t statistic, which can be used to assess the significance of ____________.

individual predictors
A significant t indicates that the regression coefficient (b) is significantly __________.

different from zero
In ____________, the coefficients represent the number of units the dependent variable is predicted to change for each unit change in a given independent variable when the effects of other predictors are held constant (they are statistically controlled) - can enhance a study’s internal validity.

multiple regression
enters all predictor variables into the regression equation at the same time; there is no basis for considering any particular predictor as causally prior to another.

multiple regression
involves entering predictors into the equation in a series of steps; researchers control the order of entry (typically based on theoretical considerations).

hierarchical multiple regression
empirically selecting the combination of independent variables with the most predictive power.

stepwise multiple regression
the regression coefficients for each z are standardized regression coefficients called ?

beta weights
___________ eliminate the problem of differing units by transforming all variables to scores with a mean of 0.0 and a standard deviation of 1.00

standard scores (z scores)
__________ are the difference between a score and the mean of that score divided by the standard deviation

z scores
What is the problem with beta weights?

the regression coefficients will be the same no matter what the order of entry of the variables, but they are unstable, the value of beta weights tend to fluctuate from sample to sample and change if a variable is added to or subtracted from the regression equation so it is difficult to attach theoretical importance to them
Power Analysis for Multiple Regression: a ratio of ______ for simultaneous and hierarchical regression and a ratio of ______ for stepwise
- 20:1
- 40:1
Power Analysis for Multiple Regression: N should be greater than _________ times the number of predictors (independent variables)

50 + 8
An estimation of the number of participants needed to reject the null that R equals zero based on effect size, number of predictors, desired power, and the significance criterion

power analysis
used to compare the means of two or more groups, adjusts for initial differences so that the results more precisely reflect the effect of an intervention

Analysis of Covariance (ANCOVA):
offers post-hoc statistical control- assumes randomization.

ANCOVA
___________ can statistically control for pretest scores - the posttest score is the DV and the IV is experimental/comparison group status and the covariate is pretest scores

ANCOVA
usually continuous variables (ex: anxiety scores) but can sometimes be dichotomous variables (male/female)

covariates
independent variable for covariates is a ______-level variable

nominal
covariates should be variables that you suspect are correlated with the ________ variable

dependent
techniques that fit data to straight-line (linear) solutions; foundation for the t-test, ANOVA, and multiple regression

general linear model (GLM):
group of means on the dependent variable after removing the effect of covariates

adjusted means
adjusted means allow researchers to determine _________.

net effects
techniques that fit data to straight-line (linear) solutions; foundation for such procedures as the t-test, ANOVA, and multiple regression

general linear model (GLM):
used to test the significance of differences in group means for multiple dependent variables.

MANOVA
allows for the control of confounding variables (covariates) when there are two or more dependent variables.

MANCOVA
makes predictions about membership in groups; ex: predict membership in such groups as compliant vs noncompliant patients
- discriminant analysis
- (equation is called discriminant fxn)
an equation developed using discriminant analysis for a categorical dependent variable, with independent variables that are either dichotomous or continuous

discriminant function
researchers begin with data from people whose group membership is known and develop an equation to predict membership when only measures of the independent variables are available - the _________ indicates to which group each person would likely belong

discriminant function
indicates the proportion of variance unaccounted for by predictors

Wilkes’ lambda
analyzes the relationship between multiple independent variables and a dependent variable; used to predict categorical dependent variables

logistic regression
used in logistic regression to estimate the parameters most likely to have generated the observed data

maximum likelihood estimation (MLE):
the factor by which the odds change; provides an estimate

odds ratio
dependent variable in binary logistic regression is a _______ variable

dichotomous
_______ variables can be continuous variables, categorical variables, or interaction terms; can be entered in an equation in different ways (simultaneous, hierarchical, and stepwise)

predictor
________ variables (indicator variables) are a common method of representing dichotomous predictors

dummy-coded
one group in an analysis of a variable with more than two categories, given a OR of 1.0 and the other groups (categories of the variable) would have OR’s in relation to the ___________.

reference group
based on the residuals for all cases in the analysis (the difference between the observed probability of an event and the predicted probability)

goodness-of-fit statistic
compares the prediction model to a hypothetically “perfect” model (one that contains the exact set of predictors needed to duplicate the observed frequencies in the dependent variable)

Hosmer-Lemeshow test
to test the significance of individual predictors in the model; distributed as a chi-square

Wald Statistic
most frequently reported pseudo R squared index

Nagelkerke
widely used by epidemiologists when the dependent variable is a time interval between an initial event (onset of a disease) and a terminal event (death)

survival analysis
time-related data are ________ when the observation period does not cover all possible events

censored
testing a hypothesized causal explanation of a phenomenon, typically with data from non experimental studies.

Causal Modeling
Two approaches to causal modeling.
- Path analysis
- Structural equations modeling (SEM)
a method for studying causal patterns among variables; not a method for discovering causes (uses least-squares estimation)

path analysis
Model of path analysis where causal flow is unidirectional (variable 2 is a cause of variable 3, and variable 3 is NOT a cause of variable 2)

recursive model
the weights representing the effect of one variable on another; indicates the proportion of a standard deviation difference in the caused variable that is directly attributable to a 1 SD difference in the specified causal variable

path coefficient
uses maximum likelihood estimation and is a more powerful approach than path analysis (assumes causal flow is recursive/non directional, variables are measured without error, and residuals are uncorrelated - both not usually plausible)

structural equations modeling (SEM):
can accommodate measurement errors, correlated residuals, and nonrecursive models (allows for reciprocal causation)

structural equations modeling (SEM):
can be used to analyze causal models involving latent variables (an unmeasured variable corresponding to an abstract construct) two phases

structural equations modeling (SEM):
Multivariate statistics allow for what two things?
- to examine complex phenomena
- to move have 3 or more variables
In ________ one IV is used to predict a DV.

Simple Linear Regression
What does R squared tell you?
- accuracy of a prediction equation
- (How much all IVs contribute to the DV)
Sample size for simultaneous multiple regression.

20:1 (20 or more per IV)
Sample size for hierarchical multiple regression.

20:1 (20 or more per IV)
Sample size for Stepwise multiple regression.

40:1 (40 or more per IV)
Researchers often try to improve predictions of Y by including multiple IVs, which are often called _______ variables in a multiple regression context.

predictor
What is the index in bivariate correlation? With two or more IVs?
- Pearson's r
- multiple correlation coefficient (R)
The proportion of variance in Y accounted for by the combined, simultaneous influence of the IVs.

R squared
R is never less than the highest r b/w a _______ and the _______.
- predictor
- DV
What does a high correlation amond IVs do to the predictive power?

decreases it
What happens to increments to R as more IVs are added to the regression equation?

they decrease
What is difficult to avoid as more and more variables are added to the regression equation?

redundancy
Three tests of significance for mult linear regression.
- Tests of Overall Equation and R
- Tests for Adding Predictors
- Tests of the Regression Coefficients
What is the basic null hypotheis in a multiple regression?

R= ZERO

(R= population multiple correlation coefficient)
What is used to decide if a third predictor will increase the ability to predict Y after two predictors have been used?
- F-statistic
- (tests for adding predictors)
A significant t indicates that the regression coefficient is what?

significantly different from zero
In simple regression, the ______ indicates the amt of change in predicted values of Y, for a specified rate of change in X. In multiple regression, the _______ represent the number of units the DV is predicted to change for each unit change in a given IV.
- value of b
- coefficients
Strategy used when there is no basis for considering any particular predictor as causally prior to another and when the predictors are of comparable importance to the research problem.

Simultaneous Multiple Regression
Any data for which ANOVA is appropriate can be analyzed by __________, but the reverse is not true.

multiple regression
used to examine the effect of a key independent variable after first removing (controlling) the effect of confounding variables

Hierarchical multiple regression
the analog of the overall F test in multiple regression (chi-squared distribution)

Goodness-of-fit statistic
researchers posit causal linkages among three or more variables and then test whether hypothesized pathways from the causes to the effect are consistent with the data

Causal Modeling

Author

MeganM

311613

Card Set

GU Research Mod 10

Description

Mod 10 Chapter 18 Multivariate Stats

Updated

2015-11-18T11:52:06Z