A.06.Anderson et al.

  1. Limitations of Linear Models
    • it is difficult to assert normality and constant variance for the response variable (a transformation such as ln(x) may help satisfy these assumptions)
    • values of the response variable may be restricted to be > 0, which violates the assumption of normality
    • if the response variable is strictly > 0, then σ² → 0 as μ → 0, so σ² is a function of μ
    • additive effects are not realistic for some applications
  2. Generalized Linear Model assumptions
    • (GLM1) random component: each component of Y is independent and follows a distribution from the exponential family
    • (GLM2) systematic component: the p covariates are combined to give the linear predictor η = Xβ
    • (GLM3) link function: the relationship between the random and systematic components is specified via a link function g that is differentiable and monotonic, such that E[Y] = μ = g⁻¹(η)
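
As a minimal sketch of how GLM2 and GLM3 fit together, the snippet below builds the linear predictor for one observation and maps it to the mean through an inverse link. The log link, the covariate row and the parameter values are all illustrative assumptions, not taken from the notes.

```python
import math

def linear_predictor(x_row, beta):
    """GLM2: eta = sum over j of x_j * beta_j for a single observation."""
    return sum(x * b for x, b in zip(x_row, beta))

def inverse_log_link(eta):
    """GLM3 with an assumed log link g(mu) = ln(mu), so mu = g^-1(eta) = exp(eta)."""
    return math.exp(eta)

# Hypothetical design row: an intercept plus one indicator covariate.
x_row = [1.0, 1.0]
# Hypothetical parameters: base level of 100 and a relativity of 1.2.
beta = [math.log(100.0), math.log(1.2)]

eta = linear_predictor(x_row, beta)   # additive on the scale of eta
mu = inverse_log_link(eta)            # multiplicative on the scale of mu
```

With a log link the effects add on the scale of η but multiply on the scale of μ, which is one way a GLM avoids the additivity assumption of the linear model.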
  3. What changed from LM to GLM
    • no additivity assumption
    • no assumption that the response var has constant var
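    • Var(Yi) = φV(μi) / ωi, where φ is the scale parameter, V the variance function and ωi the prior weight
    • response variable is not assumed to be normal, but rather to come from a member of the exponential family
    • Y depends on X through the linear predictor first, and then through the inverse link: Y = g⁻¹(ΣβiXi) + ε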
  4. Advantages of exponential family
    • (+) each distribution is fully specified in terms of μ and σ²
    • (+) σ² is a function of μ: Var(Yi) = φV(μi) / ωi
    • (+) includes normal, Poisson, gamma, binomial, inverse Gaussian
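
The relationship Var(Yi) = φV(μi) / ωi can be sketched as below. The variance functions used are the standard ones for each family (they are not listed in the notes), the binomial case treats μ as a proportion, and the names `VARIANCE_FUNCTIONS` and `variance` are hypothetical.

```python
# Standard variance functions V(mu) for common exponential family members.
VARIANCE_FUNCTIONS = {
    "normal": lambda mu: 1.0,                 # constant variance
    "poisson": lambda mu: mu,
    "gamma": lambda mu: mu ** 2,
    "binomial": lambda mu: mu * (1.0 - mu),   # mu is a proportion in (0, 1)
    "inverse_gaussian": lambda mu: mu ** 3,
}

def variance(dist, mu, phi=1.0, weight=1.0):
    """Var(Y) = phi * V(mu) / weight for the chosen distribution."""
    return phi * VARIANCE_FUNCTIONS[dist](mu) / weight

poisson_var = variance("poisson", 4.0)
gamma_var = variance("gamma", 2.0, phi=0.5, weight=2.0)
```

Only the normal case gives a variance that does not move with the mean. For the other families the variance shrinks as μ → 0, matching the behaviour expected of a strictly positive response.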
  5. Canonical link function
    • Distribution     | g(x)         | g⁻¹(x)
    • Normal           | x            | x
    • Poisson          | ln(x)        | e^x
    • Gamma            | 1/x          | 1/x
    • Binomial         | ln(x/(1−x))  | e^x/(1+e^x)
    • Inverse Gaussian | 1/x²         | 1/√x
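
A quick way to sanity-check a link and its inverse is a round trip: g⁻¹(g(μ)) should return μ. The sketch below writes out the standard canonical links with explicit division signs (logit for the binomial, reciprocal square for the inverse Gaussian). The names `CANONICAL_LINKS` and `round_trip` are hypothetical.

```python
import math

# Each entry is (g, g_inverse) for the canonical link of that distribution.
CANONICAL_LINKS = {
    "normal": (lambda mu: mu, lambda eta: eta),
    "poisson": (math.log, math.exp),
    "gamma": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "binomial": (
        lambda mu: math.log(mu / (1.0 - mu)),               # logit
        lambda eta: math.exp(eta) / (1.0 + math.exp(eta)),  # inverse logit
    ),
    "inverse_gaussian": (
        lambda mu: 1.0 / mu ** 2,
        lambda eta: 1.0 / math.sqrt(eta),
    ),
}

def round_trip(dist, mu):
    """Apply the link and then its inverse; the result should equal mu."""
    g, g_inv = CANONICAL_LINKS[dist]
    return g_inv(g(mu))

# mu = 0.3 is valid for every family here (positive and below 1).
results = {dist: round_trip(dist, 0.3) for dist in CANONICAL_LINKS}
```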
  6. GLM Aliasing
    • aliasing: the solving routine removes as many parameters as necessary to make the model uniquely defined
    • occurs when there is a linear dependency among covariates
    • intrinsic: dependencies inherent in the definition of covariates
    • extrinsic: arises from the nature of the data (eg: if X = . Y is .)
    • choice of alias does not modify fitted values
    • near aliasing: occurs when two variables are almost perfectly correlated. Convergence problems may occur, so exclude, delete or reclassify the affected levels
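
Linear dependency among covariates shows up as a design matrix whose rank is lower than its number of columns. The sketch below counts how many parameters would need to be aliased, using a small hand-written Gaussian elimination. The helper names and the two-level factor example are hypothetical, and a real GLM package would handle this internally.

```python
def matrix_rank(rows, tol=1e-10):
    """Rank of a small dense matrix via Gaussian elimination with pivoting."""
    m = [[float(v) for v in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the remaining row with the largest entry in this column.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # column is a combination of earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(n_rows):
            if r != rank:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def n_aliased(design):
    """Number of parameters to remove so the model is uniquely defined."""
    return len(design[0]) - matrix_rank(design)

# Intrinsic aliasing: intercept plus an indicator for EVERY level of a
# two-level factor, so intercept = is_A + is_B on every row.
x_full = [[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 1]]
# Dropping the is_A column (making A the base level) removes the dependency.
x_base = [[1, 0], [1, 0], [1, 1], [1, 1]]

aliased_full = n_aliased(x_full)
aliased_base = n_aliased(x_base)
```

Which indicator column is dropped is a matter of choice; it changes the parameter values but not the fitted values.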
  7. GLM Model Diagnostics
    • standard error: reflects the speed with which the log-likelihood falls from its maximum given a change in the parameter (a sharper fall means a smaller standard error)
    • deviance test: measures how much the fitted values differ from the observations. Adjusts for V(x), giving more weight to differences where V(x) is small. Helps assess the theoretical significance of a particular factor.
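
As an illustration of the deviance idea, the sketch below computes the unscaled deviance for a Poisson model, D = 2 Σ [y ln(y/μ) − (y − μ)]. This is the standard Poisson formula rather than one stated in the notes, and the observed and fitted values are made up for illustration.

```python
import math

def poisson_deviance(observed, fitted):
    """Unscaled Poisson deviance: 2 * sum(y * ln(y / mu) - (y - mu))."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        # y * ln(y / mu) is taken as 0 when y == 0 (its limiting value).
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total

observed = [1.0, 2.0, 5.0]                 # hypothetical claim counts
fitted_close = [1.2, 2.1, 4.7]             # hypothetical model with a useful factor
fitted_flat = [sum(observed) / 3.0] * 3    # mean-only model

dev_close = poisson_deviance(observed, fitted_close)
dev_flat = poisson_deviance(observed, fitted_flat)
```

A factor that brings the fitted values closer to the observations lowers the deviance; whether the drop is large enough to be significant would be judged with a formal test on the deviance difference.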
Author
Exam8
ID
161157
Description
A Practitioner's Guide to Generalized Linear Models