-
Biserial correlation
a standardized measure of the strength of relationship between two variables when one of the two variables is dichotomous. The biserial correlation coefficient is used when the dichotomy is a continuous dichotomy (i.e., there is an underlying continuum between the categories, such as passing or failing an exam)
-
Bivariate correlation
a correlation between two variables
-
Coefficient of determination
the proportion of variance in one variable explained by a second variable. It is the Pearson correlation coefficient squared.
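As an illustration (not part of the original glossary), the minimal Python sketch below squares a Pearson r computed on made-up data; the variable names are hypothetical.

```python
# Sketch: coefficient of determination as Pearson's r squared (illustrative data only).
import numpy as np
from scipy.stats import pearsonr

exam_anxiety = np.array([60, 65, 70, 75, 80, 85, 90])   # hypothetical predictor
exam_score   = np.array([80, 75, 72, 68, 61, 58, 50])   # hypothetical outcome

r, p = pearsonr(exam_anxiety, exam_score)
r_squared = r ** 2   # proportion of variance in exam_score shared with exam_anxiety
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```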
-
Covariance
a measure of the 'average' relationship between two variables. It is the average cross-product deviation
-
Cross-product deviations
a measure of the 'total' relationship between two variables. It is the deviation of one variable from its mean multiplied by the other variable's deviation from its mean
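A minimal sketch, using made-up numbers, of how cross-product deviations relate to the covariance defined above:

```python
# Sketch: cross-product deviations ('total' relationship) and covariance ('average' relationship).
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])   # hypothetical variable
y = np.array([1.0, 3.0, 2.0, 5.0, 7.0])    # hypothetical variable

cross_products = (x - x.mean()) * (y - y.mean())   # each case's deviation in x times its deviation in y
total_cross_product = cross_products.sum()          # the 'total' relationship
covariance = total_cross_product / (len(x) - 1)     # the 'average' cross-product deviation

print(covariance, np.cov(x, y, ddof=1)[0, 1])       # should agree with numpy's covariance
```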
-
Kendall's tau
a non-parametric correlation coefficient similar to Spearman's correlation coefficient, but it should be used in preference for a small data set with a large number of tied ranks (see the sketch after the Spearman's correlation coefficient entry below)
-
Partial correlation
a measure of the relationship between two variables while 'controlling' for the effect that one or more additional variables has on both (illustrated in the sketch after the semi-partial correlation entry below)
-
Pearson correlation coefficient
or Pearson's product-moment correlation coefficient to give its full name, is a standardized measure of the strength of relationship between two variables. It can take any value from -1 (as one variable changes, the other changes in the opposite direction by the same amount), through 0 (as one variable changes the other doesn't change at all), to +1 (as one variable changes, the other changes in the same direction by the same amount).
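A quick illustrative sketch using scipy; the data and variable names are invented:

```python
# Sketch: Pearson's product-moment correlation on made-up data.
import numpy as np
from scipy.stats import pearsonr

adverts_watched = np.array([5, 4, 4, 6, 8])      # hypothetical variable
packets_bought  = np.array([8, 9, 10, 13, 15])   # hypothetical variable

r, p_value = pearsonr(adverts_watched, packets_bought)
print(f"r = {r:.3f} (bounded between -1 and +1), p = {p_value:.3f}")
```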
-
Point-biserial correlation
a standardized measure of the strength of relationship between two variables when one of the two variables is dichotomous. The point-biserial correlation coefficient is used when the dichotomy is a discrete, or true, dichotomy. An example of this is pregnancy: you can be either pregnant or not, there is no in between
-
Semi-partial correlation
a measure of the relationship between two variables while 'controlling' for the effect that one or more additional variables has on one of those variables. If we call our variables x and y, it gives us a measure of the variance in y that x alone shares
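The sketch below illustrates both this entry and the partial correlation entry above by residualizing with a simple least-squares fit; the data are simulated and the helper name is hypothetical:

```python
# Sketch: partial vs semi-partial correlation by removing a control variable z.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
z = rng.normal(size=200)              # the variable being 'controlled' for
x = z + rng.normal(size=200)          # predictor, contaminated by z
y = z + x + rng.normal(size=200)      # outcome, influenced by both

def residualize(a, b):
    """Return the part of a that b cannot explain (residuals from regressing a on b)."""
    slope, intercept = np.polyfit(b, a, 1)
    return a - (slope * b + intercept)

x_res = residualize(x, z)
y_res = residualize(y, z)

partial_r, _ = pearsonr(x_res, y_res)    # z removed from both x and y
semipartial_r, _ = pearsonr(x_res, y)    # z removed from x only: variance in y that x alone shares
print(f"partial = {partial_r:.3f}, semi-partial = {semipartial_r:.3f}")
```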
-
Spearman's correlation coefficient
a standardized measure of the strength of relationship between two variables that does not rely on the assumptions of a parametric test. It is Pearson's correlation coefficient performed on data that have been converted into ranked scores
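An illustrative sketch of the rank-based coefficients (this entry and Kendall's tau above) on invented data containing tied ranks:

```python
# Sketch: rank-based correlation coefficients on a small data set with ties.
import numpy as np
from scipy.stats import spearmanr, kendalltau, pearsonr, rankdata

x = np.array([1, 2, 2, 3, 5, 8, 8, 9])   # hypothetical variable with ties
y = np.array([2, 3, 3, 5, 4, 7, 9, 9])   # hypothetical variable with ties

rho, _ = spearmanr(x, y)
tau, _ = kendalltau(x, y)                              # often preferred with many ties in a small sample
r_on_ranks, _ = pearsonr(rankdata(x), rankdata(y))     # Spearman's rho is Pearson's r computed on ranks

print(f"Spearman rho = {rho:.3f}, Pearson on ranks = {r_on_ranks:.3f}, Kendall tau = {tau:.3f}")
```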
-
Standardization
the process of converting a variable into a standard unit of measurement. The unit of measurement typically used is standard deviation units. Standardization allows us to compare data when different units of measurement have been used
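A minimal sketch of standardization into z-scores, using made-up measurements:

```python
# Sketch: converting a variable into standard deviation units (z-scores).
import numpy as np

reaction_time_ms = np.array([250.0, 300.0, 310.0, 420.0, 500.0])   # hypothetical measurements

z_scores = (reaction_time_ms - reaction_time_ms.mean()) / reaction_time_ms.std(ddof=1)
print(z_scores)                                  # each value is now a distance from the mean in SD units
print(z_scores.mean(), z_scores.std(ddof=1))     # approximately 0 and exactly 1 after standardization
```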
-
βᵢ
standardized regression coefficient. Indicates the strength of relationship between a given predictor, i, and an outcome in a standardized form. It is the change in the outcome (in standard deviations) associated with a one standard deviation change in the predictor
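An illustrative sketch, for a simple regression on invented data, of how the standardized coefficient relates to the unstandardized slope:

```python
# Sketch: standardized regression coefficient (beta) in simple regression.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])     # hypothetical predictor
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])    # hypothetical outcome

b, a = np.polyfit(x, y, 1)                        # unstandardized slope and intercept
beta = b * x.std(ddof=1) / y.std(ddof=1)          # SD change in y per SD change in x

# Equivalent check: fit the model after converting both variables to z-scores.
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
beta_check, _ = np.polyfit(zx, zy, 1)
print(f"b = {b:.3f}, beta = {beta:.3f}, beta from z-scores = {beta_check:.3f}")
```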
-
DFFit
a measure of the influence of a case. It is the difference between the adjusted predicted value and the original predicted value of a particular case. If a case is not influential then its DFFit should be zero - hence, we expect non-influential cases to have small DFFit values. However, we have the problem that this statistic depends on the units of measurement of the outcome and so a DFFit of 0.5 will be very small if the outcome ranges from 1 to 100, but very large if the outcome varies from 0 to 1
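A minimal leave-one-out sketch of this idea on invented data (statistical packages compute DFFit for you; this is only meant to show what "adjusted predicted value" means):

```python
# Sketch: DFFit = predicted value (case included) minus adjusted predicted value (case excluded).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])    # hypothetical predictor; last case is unusual
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 30.0])    # hypothetical outcome

def predict(x_fit, y_fit, x_new):
    slope, intercept = np.polyfit(x_fit, y_fit, 1)
    return slope * x_new + intercept

dffit = np.empty(len(x))
for i in range(len(x)):
    full_pred = predict(x, y, x[i])                    # prediction with the case in the model
    mask = np.arange(len(x)) != i
    adjusted_pred = predict(x[mask], y[mask], x[i])    # adjusted prediction with the case excluded
    dffit[i] = full_pred - adjusted_pred               # near zero for non-influential cases

print(np.round(dffit, 3))   # the influential final case should stand out
```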
-
F-ratio
a test statistic with a known probability distribution. It is the ratio of the average variability in the data that a given model can explain to the average variability unexplained by that same model. It is used to test the overall fit of the model in simple regression and multiple regression, and to test for overall differences between group means in experiments.
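An illustrative sketch for a simple regression on invented data; it also shows the mean squares and the model, residual and total sums of squares defined later in this list:

```python
# Sketch: sums of squares, mean squares and the F-ratio for a simple regression.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])   # hypothetical predictor
y = np.array([2.0, 2.5, 4.1, 4.9, 5.2, 7.1, 7.8, 9.0])   # hypothetical outcome

slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

ss_total    = np.sum((y - y.mean()) ** 2)   # total variability in the outcome
ss_residual = np.sum((y - y_hat) ** 2)      # variability the model cannot explain
ss_model    = ss_total - ss_residual        # variability the model accounts for

k = 1                                        # number of predictors
n = len(y)
ms_model    = ss_model / k                   # mean squares = sum of squares / degrees of freedom
ms_residual = ss_residual / (n - k - 1)

f_ratio = ms_model / ms_residual             # average explained variability relative to unexplained
print(f"SS_M = {ss_model:.2f}, SS_R = {ss_residual:.2f}, F = {f_ratio:.2f}")
```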
-
Generalization
the ability of a statistical model to say something beyond the set of observations that spawned it. If a model generalizes, it is assumed that predictions from that model can be applied not just to the sample on which it is based, but to a wider population from which the sample came.
-
Goodness of fit
an index of how well a model fits the data from which it was generated. It's usually based on how well the data predicted by the model correspond to the data that were actually collected
-
Heteroscedasticity
the opposite of homoscedasticity. This occurs when the residuals at each level of the predictor variables have unequal variances. Put another way, at each point along any predictor variable, the spread of residuals is different
-
Hierarchical regression
a method of multiple regression in which the order in which predictors are entered into the regression model is determined by the researcher based on previous research: variables already known to be predictors are entered first, new variables are entered subsequently.
-
Homoscedasticity
an assumption in regression analysis that the residuals at each level of the predictor variables have similar variances
-
Independent errors
for any two observations in regression the residuals should be uncorrelated (or independent)
-
Mean squares
a measure of average variability: a sum of squares divided by the degrees of freedom used to calculate it.
-
Model sum of squares
a measure of the total amount of variability for which a model can account. It is the difference between the total sum of squares and the residual sum of squares
-
Multicollinearity
a situation in which two or more variables are very closely linearly related
-
Multiple R
the multiple correlation coefficient. It is the correlation between the observed values of an outcome and the values of the outcome predicted by a multiple regression model
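A minimal sketch on simulated data: fit a two-predictor model by least squares and correlate the observed values with the predicted values:

```python
# Sketch: multiple R as the correlation between observed and model-predicted outcome values.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(7)
X = rng.normal(size=(50, 2))                      # two hypothetical predictors
y = 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(size=50)

design = np.column_stack([np.ones(len(y)), X])    # add an intercept column
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
y_hat = design @ coefs                            # values predicted by the multiple regression model

multiple_r, _ = pearsonr(y, y_hat)
print(f"Multiple R = {multiple_r:.3f}, R^2 = {multiple_r**2:.3f}")
```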
-
Multiple regression
an extension of simple regression in which an outcome is predicted by a linear combination of two or more predictor variables
-
Outcome variable
a variable whose values we are trying to predict from one or more predictor variables
-
Perfect collinearity
exists when at least one predictor in a regression model is a perfect linear combination of the others
-
Predictor variable
a variable that is used to try to predict values of another variable known as an outcome variable
-
Residual
the difference between the value a model predicts and the value observed in the data on which the model is based
-
Residual sum of squares
a measure of the variability that cannot be explained by the model fitted to the data. It is the total squared deviance between the observations and the values of those observations predicted by whatever model is fitted to the data
-
Shrinkage
the loss of predictive power of a regression model if the model had been derived from the population from which the sample was taken, rather than the sample itself
-
Simple regression
a linear model in which one variable or outcome is predicted from a single predictor variable
-
Standardized residuals
the residuals of a model expressed in standard deviation units
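A minimal sketch on invented data (software typically divides by a slightly more refined estimate of the residual standard deviation, so treat this as the basic idea only):

```python
# Sketch: converting a simple regression's residuals into standard deviation units.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # hypothetical predictor
y = np.array([1.1, 2.3, 2.9, 4.2, 4.8, 6.3])   # hypothetical outcome

slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)          # unstandardized: in the outcome's own units

standardized = residuals / residuals.std(ddof=1) # residuals re-expressed in SD units
print(np.round(standardized, 2))                 # values beyond about +/-2 or +/-3 deserve a closer look
```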
-
Stepwise regression
a method of multiple regression in which variables are entered into the model based on a statistical criterion
-
Suppressor effects
when a predictor has a significant effect but only when another variable is held constant
-
t-statistic
Student's t is a test statistic with a known probability distribution (the t-distribution)
-
Tolerance
tolerance statistics measure multicollinearity and are simply the reciprocal of the variance inflation factor (1/VIF)
-
Total sum of squares
a measure of the total variability within a set of observations
-
Unstandardized residuals
the residuals of a model expressed in the units in which the original outcome variable was measured
-
Variance inflation factor (VIF)
a measure of multicollinearity. The VIF indicates whether a predictor has a strong linear relationship with the other predictors
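An illustrative sketch computing the VIF, and tolerance as its reciprocal, for simulated predictors (the helper function is hypothetical):

```python
# Sketch: VIF and tolerance for each predictor; x2 is deliberately collinear with x1.
import numpy as np

rng = np.random.default_rng(3)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.3, size=100)   # nearly a linear function of x1
x3 = rng.normal(size=100)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF for predictor j: 1 / (1 - R^2) from regressing it on the other predictors."""
    others = np.delete(X, j, axis=1)
    design = np.column_stack([np.ones(len(X)), others])
    coefs, *_ = np.linalg.lstsq(design, X[:, j], rcond=None)
    fitted = design @ coefs
    ss_res = np.sum((X[:, j] - fitted) ** 2)
    ss_tot = np.sum((X[:, j] - X[:, j].mean()) ** 2)
    r_squared = 1 - ss_res / ss_tot
    return 1 / (1 - r_squared)

for j in range(X.shape[1]):
    v = vif(X, j)
    print(f"predictor {j}: VIF = {v:.2f}, tolerance = {1 / v:.2f}")
```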