-
Limitations of Linear Models
- difficult to assert normality and constant var for resp variable (can transform like ln(x) to satisfy)
- values from resp var may be restricted to be > 0 (violates assumption of normality)
- if resp var strictly > 0 then σ² → 0 as μ → 0 ⇒ σ² is a fctn of μ
- additivity effect not realistic for some applications
-
Generalized Linear Model assumptions
- (GLM1) random component: each cpnt of Y is independent and follows a distribution from the exponential family
- (GLM2) systematic component: the p covariates are combined to give the linear predictor η = X β
- (GLM3) link fctn: relationship btwn rdm & syst cpnts is specified via a link fctn g that is differentiable & monotonic, such that E[Y] = μ = g⁻¹(η)
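As a rough illustration of how GLM2 and GLM3 fit together, the sketch below builds a linear predictor and maps it to a mean through a log link. The coefficients and design rows are hypothetical values chosen for illustration, and the log link is just one possible choice of g.

```python
import math

def linear_predictor(x_row, beta):
    """GLM2: eta = sum_j x_j * beta_j for a single observation."""
    return sum(x * b for x, b in zip(x_row, beta))

def inverse_log_link(eta):
    """GLM3 with g(mu) = ln(mu), so mu = g^{-1}(eta) = exp(eta)."""
    return math.exp(eta)

# Hypothetical coefficients: an intercept and one indicator covariate.
beta = [math.log(100.0), math.log(1.2)]

base_row = [1.0, 0.0]      # indicator off
flagged_row = [1.0, 1.0]   # indicator on

mu_base = inverse_log_link(linear_predictor(base_row, beta))
mu_flagged = inverse_log_link(linear_predictor(flagged_row, beta))
relativity = mu_flagged / mu_base
```

With a log link the covariate acts multiplicatively on the mean (here a factor of about 1.2), which is one way a GLM moves away from purely additive effects.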
-
What changed from LM to GLM
- no additivity assumption
- no assumption that the response var has constant var
- instead, Var(Yi) = φV(μi) / ωi, with V the variance fctn, φ the scale param & ωi the prior weight
- response var is not assumed to be normal, but rather to come from a member of the exponential family
- Y depends on X first through the linear predictor & then through the inverse link: Y = g⁻¹(ΣβiXi) + ε
-
Advantages of exponential family
- (+) each dist is fully specified in terms of μ and σ²
- (+) σ² is a function of μ: Var(Yi) = φV(μi) / ωi
- (+) incl normal, poisson, gamma, binomial, inv gaussian
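The mean-variance relationship can be sketched with the standard variance functions of the five listed distributions. The function and dictionary names are illustrative, and the binomial entry assumes the response is expressed as a proportion.

```python
# Standard variance functions V(mu) for the listed exponential family members.
VARIANCE_FUNCTIONS = {
    "normal": lambda mu: 1.0,
    "poisson": lambda mu: mu,
    "gamma": lambda mu: mu ** 2,
    "binomial": lambda mu: mu * (1.0 - mu),   # response as a proportion
    "inverse_gaussian": lambda mu: mu ** 3,
}

def response_variance(family, mu, phi=1.0, weight=1.0):
    """Var(Y_i) = phi * V(mu_i) / omega_i."""
    return phi * VARIANCE_FUNCTIONS[family](mu) / weight

poisson_var = response_variance("poisson", 4.0)
gamma_var = response_variance("gamma", 2.0, phi=0.5, weight=2.0)
```

Only the normal case gives a variance that does not move with the mean; every other entry ties σ² to μ.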
-
Canonical link function
- Distribution | g(x) | g⁻¹(x)
- Normal | x | x
- Poisson | ln(x) | e^x
- Gamma | 1/x | 1/x
- Binomial | ln(x/(1-x)) | e^x/(1+e^x)
- Inverse Gaussian | 1/x² | 1/√x
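The canonical links listed above, written in their standard forms, can be checked with a short round-trip sketch: applying g and then g⁻¹ should return the original mean. The dictionary and helper names are illustrative only.

```python
import math

# (g, g_inverse) pairs for each distribution's canonical link.
CANONICAL_LINKS = {
    "normal": (lambda x: x, lambda x: x),
    "poisson": (math.log, math.exp),
    "gamma": (lambda x: 1.0 / x, lambda x: 1.0 / x),
    "binomial": (
        lambda x: math.log(x / (1.0 - x)),             # logit
        lambda x: math.exp(x) / (1.0 + math.exp(x)),   # inverse logit
    ),
    "inverse_gaussian": (lambda x: 1.0 / x ** 2, lambda x: 1.0 / math.sqrt(x)),
}

def round_trip(family, mu):
    """Apply the link and then its inverse; the result should equal mu."""
    g, g_inv = CANONICAL_LINKS[family]
    return g_inv(g(mu))

# mu = 0.3 is valid for all five families (positive and below 1).
checks = {family: round_trip(family, 0.3) for family in CANONICAL_LINKS}
```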
-
GLM Aliasing
- aliasing: the solving routine removes as many params as necessary to make the model uniquely defined
- occurs when there is a linear dependency among covariates
- intrinsic: dependencies inherent in the definition of covariates
- extrinsic: from the nature of the data (eg: if X = . Y is .)
- choice of alias does not modify fitted values
- near aliasing: occurs when 2 var are almost 100% correlated. Convergence problems may occur, so exclude, delete or reclassify
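A small sketch of intrinsic aliasing, assuming a single two-level factor and an identity link, where the least-squares fit reduces to the group means. The data are made up for illustration. It shows the linear dependency among the columns and that dropping either indicator gives the same fitted values.

```python
levels = ["A", "B", "A", "B", "B"]    # hypothetical two-level factor
y = [10.0, 14.0, 12.0, 16.0, 18.0]    # hypothetical responses

# Full design: intercept plus an indicator for every level.
design = [[1.0, float(l == "A"), float(l == "B")] for l in levels]

# Intrinsic aliasing: the intercept column equals the sum of the indicators,
# so one of the three parameters has to be removed.
is_dependent = all(row[0] == row[1] + row[2] for row in design)

def group_mean(level):
    values = [yi for yi, l in zip(y, levels) if l == level]
    return sum(values) / len(values)

def fitted(base):
    """Fit with the indicator for `base` aliased (dropped)."""
    other = "B" if base == "A" else "A"
    intercept = group_mean(base)                   # mean of the base level
    effect = group_mean(other) - group_mean(base)  # effect of the other level
    return [intercept + effect * (l == other) for l in levels]

fitted_base_a = fitted("A")
fitted_base_b = fitted("B")
```

The parameter values differ between the two parameterizations (intercept 11 with effect +5, versus intercept 16 with effect -5), but the fitted values agree.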
-
GLM Model Diagnostics
- std error: reflects the speed w/ which the log-likelihood falls from its maximum given a change in the param (sharper fall ⇒ smaller std error)
- deviance test: measures how much fitted values differ from obs. Adjusts for V(x), giving more weight to a deviation if V(x) is small. Helps assess theoretical significance of a particular factor.
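Both diagnostics can be sketched for a Poisson model with a log link, chosen here only as a convenient example. The counts are hypothetical. The two fitted vectors are the overall mean (factor excluded) and the group means of the first two and last two observations (factor included).

```python
import math

def poisson_deviance(y, mu):
    """Unscaled Poisson deviance: 2 * sum(y*ln(y/mu) - (y - mu)), taking 0*ln(0) as 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

def intercept_only_std_error(y):
    """Std error of the intercept in an intercept-only Poisson log-link model.

    Up to a constant the log-likelihood is sum(y*b - exp(b)). Its second
    derivative at the maximum b = ln(mean(y)) is -sum(y), so the std error
    is 1/sqrt(sum(y)): more curvature means a smaller std error.
    """
    return 1.0 / math.sqrt(sum(y))

y_obs = [2.0, 0.0, 3.0, 5.0]        # hypothetical counts
mu_null = [2.5, 2.5, 2.5, 2.5]      # fit without the factor (overall mean)
mu_factor = [1.0, 1.0, 4.0, 4.0]    # fit with the factor (group means)

deviance_null = poisson_deviance(y_obs, mu_null)
deviance_factor = poisson_deviance(y_obs, mu_factor)
deviance_drop = deviance_null - deviance_factor
se_intercept = intercept_only_std_error(y_obs)
```

The drop in deviance from adding the factor is the quantity a deviance test examines; when the scale parameter is known it is typically compared against a chi-square reference with degrees of freedom equal to the number of added parameters.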