1. Briefly describe the shortcoming of univariate approaches.
    They do not accurately take into account the effect of other rating variables
  2. Minimum Bias Techniques
    • Iteratively standardized univariate approaches
    • Each procedure involves two selections:
    • Select rating structure (e.g., additive or multiplicative)
    • Select bias function: compares the procedure's observed loss statistics to the indicated loss statistics and measures the mismatch
    • "Minimum bias" refers to the balance principle, which requires that the sum of indicated weighted pure premiums equals the sum of weighted observed loss costs for every level of every rating variable
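As a rough illustration of the balance principle, the sketch below assumes a multiplicative rating structure with two rating variables and iterates until the exposure-weighted indicated loss costs match the observed ones at every level. The function name, the exposures, and the loss costs are all hypothetical and chosen only for the example.

```python
def minimum_bias_multiplicative(exposure, loss_cost, iterations=50):
    """Fit relativities x[i] and y[j] so that x[i] * y[j] approximates
    loss_cost[i][j], using the balance principle: for every level of each
    variable, exposure-weighted indicated loss cost equals observed."""
    n_rows = len(exposure)
    n_cols = len(exposure[0])
    x = [1.0] * n_rows  # relativities for the first rating variable
    y = [1.0] * n_cols  # relativities for the second rating variable
    for _ in range(iterations):
        # Balance each level i of the first variable, holding y fixed.
        for i in range(n_rows):
            num = sum(exposure[i][j] * loss_cost[i][j] for j in range(n_cols))
            den = sum(exposure[i][j] * y[j] for j in range(n_cols))
            x[i] = num / den
        # Balance each level j of the second variable, holding x fixed.
        for j in range(n_cols):
            num = sum(exposure[i][j] * loss_cost[i][j] for i in range(n_rows))
            den = sum(exposure[i][j] * x[i] for i in range(n_rows))
            y[j] = num / den
    return x, y


# Hypothetical data: rows are levels of variable 1, columns levels of variable 2.
exposure = [[100.0, 50.0], [30.0, 120.0]]
loss_cost = [[100.0, 200.0], [150.0, 300.0]]  # exactly multiplicative by design
x_rel, y_rel = minimum_bias_multiplicative(exposure, loss_cost)
```

Because the made-up loss costs are exactly multiplicative, the fitted products `x_rel[i] * y_rel[j]` should reproduce every cell; with real data the balance holds only in total for each level, not cell by cell.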
  3. Identify the circumstances that led to the adoption of multivariate techniques.
    • Computing power
    • Data warehouse initiatives
    • Competitive Pressure
  4. Identify the benefits of multivariate methods.
    • Adjust for exposure correlations
    • Allow for nature of random process
    • Provide diagnostics
    • Allow interaction variables
    • Considered transparent
  5. Describe what it means for multivariate techniques to be considered transparent.
    Regardless of how mathematically sophisticated the method is, the user must be able to follow and communicate how the results are developed
  6. Describe GLM
    • Generalized version of linear models
    • Removes restrictions of normality assumption & constant variance
    • Link function: defines the relationship between the expected response and the linear combination of predictor variables
    • (so predictor effects need not combine in an additive fashion)
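In symbols, a sketch of the standard GLM structure looks like the following. The notation is an assumption for illustration and does not come from the card: $g$ is the link function, $\mu_i$ the expected response for risk $i$, $x_{ik}$ the predictor values, and $\beta_k$ the fitted parameters.

```latex
g(\mu_i) = \beta_0 + \sum_{k=1}^{p} \beta_k x_{ik},
\qquad \mu_i = \mathrm{E}[Y_i]

% With a log link, g(\mu) = \ln \mu, the predictors combine multiplicatively:
\mu_i = e^{\beta_0} \prod_{k=1}^{p} e^{\beta_k x_{ik}}
```

The second line shows why a log link is a natural fit for a multiplicative rating plan: each $e^{\beta_k x_{ik}}$ acts like a relativity applied to the base level $e^{\beta_0}$.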
  7. Briefly describe four reasons an actuary may prefer to model on loss cost data rather than loss ratio data.
    • Modeling loss ratios requires premium at current rate level, which can be difficult to obtain at a granular level
    • Experienced actuaries have an a priori expectation of frequency and severity patterns, so they can better distinguish signal from noise; in contrast, loss ratio patterns depend on current rates
    • Loss ratio models become obsolete when rates and rating structures are changed
    • No commonly accepted distribution for modeling loss ratios
  8. You are modeling driver age for personal automobile bodily injury. The results of a univariate analysis and a multivariate analysis are significantly different. Explain.
    • Disparity suggests age is strongly correlated with another variable in model
    • E.g., prior accident experience, use of auto
    • Univariate results are distorted
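A small numerical sketch can make the distortion concrete. It assumes a made-up book in which young drivers are disproportionately likely to have a prior accident; the relativities, exposures, and variable names are all hypothetical.

```python
# Hypothetical "true" multiplicative structure (illustrative values only).
base = 100.0
age_rel = {"young": 1.5, "old": 1.0}
prior_rel = {"accident": 2.0, "clean": 1.0}

# Exposures are correlated: young drivers mostly have a prior accident.
exposure = {
    ("young", "accident"): 300.0,
    ("young", "clean"): 100.0,
    ("old", "accident"): 100.0,
    ("old", "clean"): 300.0,
}


def univariate_loss_cost(age):
    """Exposure-weighted average loss cost for one age level,
    ignoring the prior-accident variable entirely."""
    cells = [(key, w) for key, w in exposure.items() if key[0] == age]
    total_loss = sum(w * base * age_rel[a] * prior_rel[p] for (a, p), w in cells)
    total_exposure = sum(w for _, w in cells)
    return total_loss / total_exposure


univariate_rel = univariate_loss_cost("young") / univariate_loss_cost("old")
true_rel = age_rel["young"] / age_rel["old"]
print(f"univariate young/old relativity: {univariate_rel:.2f}")
print(f"true young/old relativity:       {true_rel:.2f}")
```

In this toy setup the univariate age relativity absorbs part of the prior-accident effect, so it overstates the true age relativity; a multivariate method that adjusts for the exposure correlation would not.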
  9. Briefly describe the benefits of statistical diagnostics with GLMs.
    • Aid the modeler in understanding the certainty of results and the appropriateness of the model
    • Some help determine whether a predictive variable has a systematic effect on insurance losses
    • Others assess the modeler's assumptions around the link function and error term
  10. Briefly describe two of four statistical diagnostics used with GLMs.
    • Standard errors:
    • Narrow standard errors suggest the variable is statistically significant;
    • Wide standard errors, often straddling a relativity of 1.0, suggest the factor is detecting mostly noise and should be eliminated from the model

    • Deviance tests:
    • Measure how much fitted values differ from observations;
    • Deviances of nested models are compared to assess whether the additional variables in the broader model are worth keeping
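To show the idea behind a deviance comparison, the sketch below assumes a Poisson error structure for claim counts and compares a narrow model (one overall mean) with a broader nested model (one mean per group). The function name and the data are hypothetical.

```python
import math


def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(y * ln(y / mu) - (y - mu)),
    with the y * ln(y / mu) term taken as 0 when y == 0."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total


# Hypothetical claim counts for eight risks; the first four form group A,
# the last four form group B.
observed = [2, 0, 3, 1, 7, 5, 9, 6]

# Narrow model: a single overall mean for every risk.
overall_mean = sum(observed) / len(observed)
narrow_fit = [overall_mean] * len(observed)

# Broader model: a separate mean for each group (one extra parameter).
mean_a = sum(observed[:4]) / 4
mean_b = sum(observed[4:]) / 4
broad_fit = [mean_a] * 4 + [mean_b] * 4

narrow_dev = poisson_deviance(observed, narrow_fit)
broad_dev = poisson_deviance(observed, broad_fit)
print(f"narrow deviance: {narrow_dev:.3f}")
print(f"broad deviance:  {broad_dev:.3f}")
print(f"reduction:       {narrow_dev - broad_dev:.3f}")
```

The reduction in deviance is what a formal test would weigh against the number of parameters added; for nested models with a known scale it is commonly compared to a chi-square distribution with degrees of freedom equal to the number of extra parameters.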
  11. Briefly describe two more statistical diagnostics used with GLMs.
    • Consistency with time:
    • Compare results from individual years;
    • Gauge consistency of results from one year to the next

    • Validation:
    • Compare expected outcome of the model with historical results on a hold-out sample of data:
    • Considerable differences between actual and expected may indicate the model is over- or under-fitting
  12. Briefly describe over-fitting a model.
    • Over-fitting results when variables in the model reflect noise or when the model is over-specified with high-order polynomials
    • Replicates historical data well but doesn't project the future reliably: future experience is unlikely to have the same noise
  13. Briefly describe under-fitting a model.
    • Under-fitting occurs if the model omits statistically significant variables
    • Model doesn't have enough explanatory power
    • Will predict future outcomes but will not help explain what is driving the result
  14. Briefly describe seven important areas that the actuary needs to consider when using GLMs.
    1. Ensuring data is adequate for the level of detail of the classification ratemaking analysis: avoid GIGO (garbage in, garbage out)

    2. Developing appropriate methods to communicate model results: Considering company's ratemaking objectives

    3. Identifying when anomalous results dictate additional exploratory analysis

    4. Reviewing model results in consideration of both statistical theory and business application

    5. Retrieval of data requires careful consideration: Volume of data, Definition of homogeneous claim types, Method of organization (e.g., policy vs. accident year), Treatment of midterm policy changes, Large losses, Underwriting changes during experience period, Effect of inflation and loss development

    6. Always must balance stability and responsiveness: Choice of experience period and geographies

    7. Commercial considerations: IT constraints, Marketing objectives, Regulatory requirements
  15. Briefly describe four tasks to successfully use GLMs in the ratemaking analysis.
    • 1. Have solid background in company's data warehouses
    • 2. Develop some understanding of statistical methods and diagnostics
    • 3. Work collaboratively with other professionals who know portfolio of business
    • 4. Communicate effectively with stakeholders of company to ensure the technical results are expressed in relation to company's business objectives
  16. Briefly describe four ways data mining techniques can be used to enhance a ratemaking analysis.
    • 1. Shorten long list of potential explanatory variables to use in GLM
    • 2. Provide guidance in how to categorize discrete variables
    • 3. Reduce dimension of multi-level discrete variables
    • 4. Identify candidates for interaction variables within GLMs by detecting patterns of interdependency between variables
  17. Identify five data mining techniques and briefly describe their use to enhance the underlying classification analysis.
    • 1. Cluster Analysis: Seeks to combine small groups of similar risks into larger homogeneous categories; Targets minimizing differences within a category and maximizing differences between categories
    • 2. CART (Classification and Regression Trees): Develop tree-building algorithms to determine a set of if-then logical conditions; Help improve classification and detect interactions between variables; Helps identify strongest list of initial variables and how to categorize each
    • 3. Factor Analysis: Reduce number of parameter estimates in classification analysis; May reduce number of variables or levels within a variable
    • 4. MARS (Multivariate Adaptive Regression Spline): Multiple piecewise linear regression where each breakpoint defines region for a particular linear regression equation; Use to select breakpoints for categorizing continuous variables
    • 5. Neural Networks: User gathers test data and invokes training algorithms designed to automatically learn structure of the data; Results of neural networks can be fed into GLM
  18. Identify and give an example of each of four types of external data sources used to supplement company data to be used with GLMs.
    • 1. Geo-demographics: E.g., population density of an area, average length of homeownership in an area
    • 2. Weather: E.g., average rainfall, number of days below freezing in an area
    • 3. Property characteristics: E.g., square footage of home or business, quality of fire department in area
    • 4. Information about insured individuals or business: E.g., credit info, occupation