
What is Econometrics?
A unification of three necessary views: statistics, economic theory, and mathematics.
A field of economics that concerns itself with the application of mathematical statistics and the tools of statistical inference to the empirical measurement of relationships postulated by economic theory



Vector Space
 Closed under scalar multiplication
 Closed under addition

Basis vectors
A linearly independent set of vectors that span a vector space

Linearly independent vectors
The only solution of Ac=0 is c=0 (where the columns of A are the vectors)

Singular vs. Nonsingular matrices
Det(A)=0 <=> Singular

Properties of Determinant
 1. a row (column) of zeros => det=0
 2. det(A')=det(A)
 3. interchanging two rows(columns) => change the sign of det
 4. If 2 rows (columns) are identical => det=0
 5. If one row (column) is a multiple of another => det=0
 6. rows (columns) linearly independent <=> det<>0
 7. det(A*B)=det(A)*det(B)
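A quick numpy check of properties 2, 3, and 7 (the matrices are arbitrary illustrative numbers):

```python
import numpy as np

# Two arbitrary nonsingular matrices for illustration only.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[1.0, 4.0], [0.0, 2.0]])

# Property 2: det(A') = det(A)
assert np.isclose(np.linalg.det(A.T), np.linalg.det(A))

# Property 3: swapping two rows flips the sign of the determinant
A_swapped = A[[1, 0], :]
assert np.isclose(np.linalg.det(A_swapped), -np.linalg.det(A))

# Property 7: det(AB) = det(A) * det(B)
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))
```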

Row Rank & Column Rank
the maximum number of linearly independent rows (columns)

Properties of Rank
 1. rank(A*B)<=min(rank(A),rank(B))
 2. rank(A)=rank(A'*A)=rank(A*A')
 3. If A has full column rank, then Ax<>0 for any nonzero x
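A small sketch of property 2 on a deliberately rank-deficient matrix (the third column is the sum of the first two):

```python
import numpy as np

# Tall 4x3 matrix with rank 2 by construction: col3 = col1 + col2.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])
r = np.linalg.matrix_rank(A)
assert r == 2
# Property 2: rank(A) = rank(A'A) = rank(AA')
assert np.linalg.matrix_rank(A.T @ A) == r
assert np.linalg.matrix_rank(A @ A.T) == r
```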

Inverse matrix
AA^(-1)=A^(-1)A=I

Properties of inverse matrix
 1. det(inv(A))=1/det(A)
 2. inv(inv(A))=A
 3. inv(A)'=inv(A')
 4. A is symmetric => inv(A) is symmetric
 5. inv(ABC)=inv(C)inv(B)inv(A)
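Properties 3 and 5 checked numerically on arbitrary nonsingular matrices (numbers are illustrative only):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[1.0, 4.0], [0.0, 2.0]])
C = np.array([[3.0, 0.0], [1.0, 1.0]])
inv = np.linalg.inv

# Property 3: (inv(A))' = inv(A')
assert np.allclose(inv(A).T, inv(A.T))

# Property 5: inv(ABC) = inv(C) inv(B) inv(A) -- note the reversed order
assert np.allclose(inv(A @ B @ C), inv(C) @ inv(B) @ inv(A))
```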

Characteristic roots & vectors
=Eigenvalues & eigenvectors
 (A-lambda*I)*c=0
 >> lambda=eigenvalues
 >> c=eigenvectors

Properties of characteristic roots
 1. Zero characteristic roots are possible
 2. Rank of a symmetric matrix = # of nonzero characteristic roots
 => rank of any matrix A = # of nonzero eigenvalues of A'A (symmetric)
 3. det = product of its characteristic roots
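Properties 2 and 3 illustrated on a symmetric matrix that is singular by construction:

```python
import numpy as np

# Symmetric 3x3 matrix with a zero row/column, so one eigenvalue is 0.
S = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 0.0]])
lam = np.linalg.eigvalsh(S)   # eigenvalues of a symmetric matrix

# Property 3: det = product of the characteristic roots
assert np.isclose(np.prod(lam), np.linalg.det(S))

# Property 2: rank = number of nonzero characteristic roots
assert np.sum(~np.isclose(lam, 0.0)) == np.linalg.matrix_rank(S)
```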

Trace of a square matrix
Sum(a_{ii}) for all i=1,...,n

Properties of trace
 1. tr(A)=tr(A')
 2. tr(AB)=tr(BA)
 3. tr(ABC)=tr(BCA)=tr(CAB)
 4. A scalar = its trace
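The cyclic property (2 and 3 above) verified on random matrices; sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = rng.normal(size=(3, 4, 4))   # three random 4x4 matrices

assert np.isclose(np.trace(A @ B), np.trace(B @ A))
# tr(ABC) is invariant under cyclic permutation, not arbitrary reordering
assert np.isclose(np.trace(A @ B @ C), np.trace(B @ C @ A))
assert np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B))
```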

Quadratic form & definiteness
q=x'Ax for any nonzero x:
1. q>0 <=> positive definite <=> eigenvalues all +
2. q>=0 <=> positive semidefinite <=> eigenvalues all >=0 (some may be 0)
3. q<0 <=> negative definite <=> eigenvalues all -
4. q<=0 <=> negative semidefinite <=> eigenvalues all <=0 (some may be 0)
5. q takes both signs <=> indefinite <=> some eigenvalues -, some +
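Classifying definiteness through eigenvalues, as in the list above (diagonal matrices chosen so the eigenvalues are obvious):

```python
import numpy as np

A_pd  = np.array([[2.0, 0.0], [0.0, 3.0]])    # eigenvalues 2, 3  -> positive definite
A_ind = np.array([[1.0, 0.0], [0.0, -1.0]])   # eigenvalues 1, -1 -> indefinite

assert np.all(np.linalg.eigvalsh(A_pd) > 0)   # case 1: all eigenvalues +
lam = np.linalg.eigvalsh(A_ind)
assert lam.min() < 0 < lam.max()              # case 5: some -, some +
```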

Properties of a symmetric matrix A of a quadratic form
1. If A is positive definite, then det(A)>0; if positive semidefinite, det(A)>=0
2. If A is positive definite, then inv(A) is also positive definite <=> characteristic roots of inv(A) are reciprocals of those of A
3. If the (nxK) matrix A has full (column) rank, then A'A is positive definite => x'A'Ax>0 for any nonzero x

Compare size of matrices
 Q. definiteness of (A-B)
 => for all nonzero x,
 x'(A-B)x >0 or <0 ?
 positive definite or negative definite?

Random variable
Continuous vs. Discrete

PDF vs. CDF
 PDF: f(x); for continuous X, P(X=x)=0
 CDF: F(c)=sum(integral)_{x<=c} f(x)

Moments
 1. rth moment about the origin: E[X^{r}]
 2. rth moment about the mean of X: E[(X-E(X))^{r}]

E(X)?
 sum f(x)*x
 integral f(x)*x dx

Properties of E(X)
 1. E(b)=b, b is a scalar
 2. Y=aX+b => E(Y)=aE(X)+b
 3. if X and Y are independent, then E(XY)=E(X)*E(Y)

2nd moment about the mean = variance
 a measure of dispersion
 sum f(x)*(x-E(x))^{2}

E[(x-E(x))^{2}]?
E[X^2]-(E[X])^{2}
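The identity Var(X) = E[X^2] - (E[X])^2 checked numerically for a fair six-sided die:

```python
import numpy as np

x = np.arange(1, 7)        # outcomes 1..6
p = np.full(6, 1 / 6)      # fair die: equal probabilities

EX  = np.sum(p * x)        # E[X] = 3.5
EX2 = np.sum(p * x**2)     # E[X^2] = 91/6

var_direct  = np.sum(p * (x - EX) ** 2)   # definition: E[(X - E[X])^2]
var_moments = EX2 - EX ** 2               # shortcut: E[X^2] - (E[X])^2
assert np.isclose(var_direct, var_moments)   # both equal 35/12
```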

3rd moment about the mean
 skewness
 if it>0 => positive skew (peak on the left, long right tail)
 if it<0 => negative skew (peak on the right, long left tail)

4th moment about the mean
 kurtosis
 high kurtosis: fat tails
 low kurtosis: thin tails

Moment Generating Function (MGF)
M(t)=E[exp(Xt)]
=> M^{(n)}(0)=E[X^{n}] (nth derivative of M evaluated at t=0)

Normal Dist (mu,sigma^{2})
f(x)=[1/(sigma*sqrt(2*pi))]*exp(-(x-mu)^{2}/(2*sigma^{2}))

Standard normal dist
 Z=(X-mu)/sigma
 when X~N(mu,sigma^{2})

Chi square dist(d)
 d=degrees of freedom
 Chi(d)=sum of d independent z^{2}'s

t distribution (d)
 t=z/sqrt(Chi(d)/d)
 t->z as d->inf

F distribution (n1,n2)
 [Chi(n1)/n1]/[Chi(n2)/n2]
 e.g. F[K-1,n-K]=[R^{2}/(K-1)]/[(1-R^{2})/(n-K)] under H0: all slope coefficients of the CLRM are 0

Joint Distribution
f(x,y)

Marginal probability
 f_{x}(x)=sum_{y}f(x,y)
 f_{y}(y)=sum_{x}f(x,y)

Independence of joint distribution
 1. f(x,y)=f_{x}(x)*f_{y}(y)
 2. for any functions g1(x) and g2(y),
 E[g1(x)g2(y)]=E[g1(x)]*E[g2(y)]

Covariance
 E[(x-E(x))*(y-E(y))]
 = E[xy]-E[x]*E[y]

What if X and Y are independent?
Cov=0

Correlation
Cov(x,y)/(st.dev(x) st.dev(y))

Q. Correlation=0 => independent?
No
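A classic counterexample: take X symmetric about 0 and Y = X^2. Then Cov(X, Y) = E[X^3] = 0, yet Y is completely determined by X. A discrete version with equal weights on {-2,-1,0,1,2}:

```python
import numpy as np

# X uniform on a symmetric support, Y = X^2 (fully dependent on X).
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2

# Cov(X, Y) = E[XY] - E[X]E[Y]; here E[XY] = E[X^3] = 0 and E[X] = 0.
cov = np.mean(x * y) - np.mean(x) * np.mean(y)
assert np.isclose(cov, 0.0)   # zero correlation despite perfect dependence
```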

VarCov matrix
 diagonal = var(x_{i})
 off-diagonal = Cov(x_{i},x_{j})

Conditional Distribution
f(y|x)=f(x,y)/f_{x}(x)

Distributions of functions of r.v.s
 a. change of variables
 b. using MGF
 a. Assume that we know f(x) & y=g(x)
 1. x=g^{-1}(y)
 2. dx/dy
 3. domain of y
 4. f_Y(y)=f(g^{-1}(y))*abs(dx/dy)
 or f(g^{-1}(y))*abs(det(dx/dy)) (multivariate case)
b. using MGF e.g. E[exp(axt)]
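Steps 1-4 above, worked for y = g(x) = x^2 with X ~ Uniform(0,1): x = g^{-1}(y) = sqrt(y), dx/dy = 1/(2*sqrt(y)), domain (0,1), so f_Y(y) = 1/(2*sqrt(y)) and the implied CDF is F_Y(c) = sqrt(c). A simulation check (sample size is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(5)
y = rng.uniform(size=200_000) ** 2   # draws of Y = X^2, X ~ Uniform(0,1)

# Empirical CDF should match F_Y(c) = sqrt(c) from the change of variables.
for c in (0.09, 0.25, 0.64):
    assert abs(np.mean(y <= c) - np.sqrt(c)) < 0.01
```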

Statistics
A function of r.v.s that does not depend on unknown parameters
e.g. sample mean, median...

Random sample <=> iid (independently identically distributed)
A sample of n observations on one or more variables, x1, ..., xn, drawn independently from the same probability distribution f(x1,...,xn|theta)

Estimators vs. Estimates
Estimators (statistics) = a formula for using data to estimate a parameter
Estimates = the values you get by plugging data into the estimator

Method of moments
 sample moments = population moments
 e.g. sum(x_{i})/n = E[x]
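A method-of-moments sketch under an assumed Exponential model: since E[X] = 1/lambda, equating the first sample moment to the population moment gives lambda_hat = 1/x_bar (the true rate and sample size below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated iid sample from Exponential with true rate lambda = 2,
# i.e. mean 1/2 (scale parameter in numpy's convention).
x = rng.exponential(scale=1 / 2.0, size=100_000)

# MoM: set sample mean = E[X] = 1/lambda and solve for lambda.
lam_hat = 1 / x.mean()
assert abs(lam_hat - 2.0) < 0.05   # close to the true rate in a large sample
```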

Maximum likelihood estimation
: likelihood function & loglikelihood fn.
 cf. dist is known
 maximize L(theta|x1,...,xn) or lnL(.)

MLE procedures
 1. Find L by multiplying f(x_{i})'s
 2. Take the log (not necessary, but convenient)
 3. Find the theta's to maximize lnL(.)
 4. Use FOC=0
 5. Check SOC: negative definite
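The steps above for the textbook case of mu in N(mu, 1) with sigma known: the FOC sum(x_i - mu) = 0 gives mu_hat = x_bar, and the SOC is -n < 0. A numeric confirmation that x_bar maximizes lnL (data simulated for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=1.0, size=500)   # sample from N(5, 1)

def loglik(mu):
    # lnL(mu | x) for N(mu, 1), dropping the additive constant -n/2*ln(2*pi)
    return -0.5 * np.sum((x - mu) ** 2)

mu_hat = x.mean()   # closed-form MLE from the FOC
# lnL at mu_hat beats nearby candidate values, consistent with the SOC.
for mu in (mu_hat - 0.5, mu_hat + 0.5):
    assert loglik(mu_hat) > loglik(mu)
```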

Ways to evaluate estimators
 1. Monte Carlo analysis
 2. Pre-data analysis (small/large sample properties)

Small Sample Properties
 1. Unbiasedness
 2. Variance (Precision)
 3. Mean Square Error
 4. Efficiency

Unbiased
E(theta_hat)=theta
Bias=E(theta_hat)-theta

Variance
We prefer an estimator with smaller variance

MSE (Mean Squared Error)
 theta_hat=t:
 MSE(t)=Var(t)+[Bias(t)]^{2}
 =E[(t-E(t))^{2}]+[E(t)-theta]^{2}
 =E[(t-theta)^{2}]
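A Monte Carlo check of the decomposition MSE(t) = Var(t) + Bias(t)^2, using the deliberately biased estimator t = 0.9*x_bar of mu (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, n, reps = 2.0, 20, 100_000

# One estimate t = 0.9 * x_bar per replication (a biased estimator of mu).
t = 0.9 * rng.normal(mu, 1.0, size=(reps, n)).mean(axis=1)

mse   = np.mean((t - mu) ** 2)       # E[(t - theta)^2]
var_t = np.var(t)                    # E[(t - E(t))^2]
bias2 = (np.mean(t) - mu) ** 2       # [E(t) - theta]^2
assert np.isclose(mse, var_t + bias2)   # decomposition holds
```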

Efficiency
 Unbiased &
 the smallest variance
 => Cramer-Rao lower bound
 if the estimator is unbiased, then its variance >= CRLB = {-E[SOC of lnL(.)]}^{-1}
 cf. attaining the CRLB is a sufficient condition for efficiency, not a necessary one

Large sample property
=asymptotic properties as the sample size -> inf
 1. consistent
 2. asymptotically efficient

Consistency
plim theta_hat=theta

Asymptotically efficient
consistent & the smallest asymptotic variance

Convergence in Probability
 x_{n} ->p c
 lim_{n->inf} Pr(|x_{n}-c|>eps)=0
 <=> lim_{n->inf} Pr(|x_{n}-c|<eps)=1 for any eps>0

Mean Square Convergence
 x_{n} ->ms c
 mu_{n}->c & sigma^{2}_{n}->0 as n->inf

Mean Sq. Convergence => Convergence in Probability (not true conversely)
 Because of Chebyshev's inequality
 : Pr(|x-mu|>eps)<=(sigma^{2}/eps^{2})
 e.g. x_bar (sample mean)
 E(sample mean)=mu
 Var(sample mean)=sigma^{2}/n
 as n->inf, E(.)->mu & Var(.)->0, thus it is consistent

Khinchine's Weak Law of Large numbers
If x1,...,xn is a random iid sample from a distribution with a finite mean E(x_{n})=mu, then plim(sample mean)=mu
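A simulation sketch of the weak law: sample means of iid Uniform(0,1) draws (mu = 0.5) settle on mu as n grows. One simulated path, purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# Absolute deviation of x_bar from mu = 0.5 at increasing sample sizes.
devs = [abs(rng.uniform(size=n).mean() - 0.5) for n in (100, 10_000, 1_000_000)]

assert devs[-1] < 0.005   # essentially on top of mu at n = 1,000,000
```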

Convergence in Distribution
 F(x): limiting distribution
 if lim_{n->inf}|F_{n}(x_{n})-F(x)|=0 at all continuity points of F(x)
 x_{n} ->d x

Convergence in dist.
Q. Does x_{n} converge to a constant?
No. Different from convergence in probability: the limit here is a distribution, not a point
Convergence in dist. is related to the CLT

Lindeberg-Levy univariate central limit theorem (Asymptotic normality)
 Sums of r.v.s (like the sample mean or a weighted sum) are approximately normally distributed in large samples, no matter the distribution of the original population
 Formal def: let x1,...,xn be a random sample from a probability distribution with finite mean mu and finite variance sigma^2. Then sqrt(n)*(x_bar_n - mu) converges in distribution to N(0,sigma^2)
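A CLT sketch with deliberately skewed data: for Exponential(1) draws (mu = sigma = 1), z = sqrt(n)*(x_bar - mu) should look N(0,1) in large samples. The sample size and replication count below are arbitrary simulation choices:

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 500, 20_000

# One standardized sample mean per replication.
z = np.sqrt(n) * (rng.exponential(size=(reps, n)).mean(axis=1) - 1.0)

# Mean near 0 and standard deviation near 1, as N(0, 1) predicts.
assert abs(z.mean()) < 0.05
assert abs(z.std() - 1.0) < 0.05
```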

Repeated sampling
Get samples from the identical population distribution

Difference b/w joint dist & likelihood fn.
 Joint dist = L(x1,...,xn|theta)
 Likelihood = L(theta|x1,...,xn)

Classical estimators vs. Bayesian approach
In the Bayesian approach, estimation is not a matter of deducing the values of parameters, but rather one of continually updating and sharpening our subjective beliefs about the state of the world

