
Briefly describe the shortcoming of univariate approaches.
They do not accurately take into account the effect of other rating variables

Identify the circumstances that led to the adoption of multivariate techniques.
 Computing power
 Data warehouse initiatives
 Competitive Pressure

Identify the benefits of multivariate methods.
 Adjust for exposure correlations
 Allow for nature of random process
 Provide diagnostics
 Allow interaction variables
 Considered transparent

Briefly describe four reasons an actuary may prefer to model on Loss Cost Data rather than Loss Ratio.
 Modeling loss ratios requires premium at current rate level (CRL), which can be difficult to obtain at a granular level
 Experienced actuaries have an a priori expectation of frequency and severity patterns, so they can better distinguish signal from noise; in contrast, loss ratio patterns depend on current rates
 Loss ratio models become obsolete when rates and rating structures are changed
 No commonly accepted distribution for modeling loss ratios

You are modeling driver age for personal automobile bodily injury. The results of a univariate analysis and a multivariate analysis are significantly different. Explain.
 Disparity suggests age is strongly correlated with another variable in model
 E.g., prior accident experience, use of auto
 Univariate results are distorted
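
The distortion described above can be sketched numerically. Every figure here is hypothetical: the exposures, the assumed "true" age factor of 1.5, and the prior-accident factor of 2.0 are made up for illustration. The multivariate estimate uses a simple iterative balance-principle (minimum bias) procedure as a lightweight stand-in for a GLM, not a full GLM fit.

```python
# Hypothetical exposures: young drivers are over-represented among
# insureds with prior accidents, so the two variables are correlated.
exposure = {
    ("young", "prior"): 300, ("young", "clean"): 200,
    ("old", "prior"): 100, ("old", "clean"): 900,
}

# Assumed "true" multiplicative structure used to generate the loss costs.
BASE = 100.0
AGE_TRUE = {"young": 1.5, "old": 1.0}
ACC_TRUE = {"prior": 2.0, "clean": 1.0}
loss_cost = {(a, p): BASE * AGE_TRUE[a] * ACC_TRUE[p] for (a, p) in exposure}


def univariate_age_relativity():
    """Young-vs-old relativity from one-way exposure-weighted averages."""
    def average(age):
        cells = [k for k in exposure if k[0] == age]
        losses = sum(exposure[k] * loss_cost[k] for k in cells)
        return losses / sum(exposure[k] for k in cells)
    return average("young") / average("old")


def multivariate_age_relativity(iterations=100):
    """Young-vs-old relativity from an iterative balance-principle fit."""
    ages, accs = ["young", "old"], ["prior", "clean"]
    y = {p: 1.0 for p in accs}  # starting prior-accident factors
    x = {}
    for _ in range(iterations):
        # Solve for age factors given the current prior-accident factors.
        x = {a: sum(exposure[a, p] * loss_cost[a, p] for p in accs)
                / sum(exposure[a, p] * y[p] for p in accs) for a in ages}
        # Solve for prior-accident factors given the current age factors.
        y = {p: sum(exposure[a, p] * loss_cost[a, p] for a in ages)
                / sum(exposure[a, p] * x[a] for a in ages) for p in accs}
    return x["young"] / x["old"]


uni = univariate_age_relativity()
multi = multivariate_age_relativity()
print(f"univariate: {uni:.3f}, multivariate: {multi:.3f}")
```

Under these assumptions the univariate relativity overstates the age effect because it absorbs part of the prior-accident effect, while the iterative fit recovers the assumed factor.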

Briefly describe the benefits of statistical diagnostics with GLMs.
They aid the modeler in understanding the certainty of results and the appropriateness of the model. Some help determine whether a predictive variable has a systematic effect on insurance losses; others assess the modeler's assumptions around the link function and error term

Briefly describe two statistical diagnostics used with GLMs.
 Standard errors: Narrow standard errors suggest the variable is statistically significant; Wide standard errors, often straddling a relativity of 1.0, suggest the factor is detecting mostly noise and should be eliminated from the model
 Deviance tests: Measure how much fitted values differ from observations; Deviance of models compared to assess whether the additional variables in a broader model are worth keeping
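
A minimal sketch of a deviance comparison between a narrow and a broader model follows. It assumes a Poisson error structure (the notes do not specify one), uses hypothetical claim counts, and relies on the fact that for an intercept-only model and a single two-level factor the fitted values are the overall mean and the group means. The 3.841 threshold is the 95th percentile of a chi-square distribution with one degree of freedom.

```python
import math


def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(y * ln(y / mu) - (y - mu))."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total


# Hypothetical claim counts for two levels of a candidate rating variable.
group_a = [2, 3, 1, 4, 2]
group_b = [5, 7, 6, 4, 8]
observed = group_a + group_b

# Narrow model: one overall mean. Broad model: one mean per level.
overall_mean = sum(observed) / len(observed)
narrow_fit = [overall_mean] * len(observed)
mean_a = sum(group_a) / len(group_a)
mean_b = sum(group_b) / len(group_b)
broad_fit = [mean_a] * len(group_a) + [mean_b] * len(group_b)

dev_narrow = poisson_deviance(observed, narrow_fit)
dev_broad = poisson_deviance(observed, broad_fit)
drop = dev_narrow - dev_broad

CHI2_95_1DF = 3.841  # one extra parameter in the broad model
keep_variable = drop > CHI2_95_1DF
print(f"deviance drop = {drop:.2f}, keep variable: {keep_variable}")
```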

Briefly describe two more statistical diagnostics used with GLMs.
 Consistency with time: Compare results from individual years; Gauge consistency of results from one year to the next
 Validation: One option is to compare the expected outcome of the model with historical results on a holdout sample of data; Considerable differences between actual and expected may indicate the model is over- or underfitting
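
The holdout comparison can be sketched as an actual-to-expected ratio by band of predicted loss. The records and the helper name `actual_vs_expected` are hypothetical, and banding by sorted prediction is only one of several reasonable ways to organize the comparison.

```python
def actual_vs_expected(records, n_bands):
    """Return actual/expected ratios for bands ordered by predicted loss.

    records: list of (predicted_loss, actual_loss) pairs from a holdout sample.
    """
    ordered = sorted(records, key=lambda r: r[0])
    size = len(ordered) // n_bands
    ratios = []
    for b in range(n_bands):
        start = b * size
        # The last band absorbs any leftover records.
        end = (b + 1) * size if b < n_bands - 1 else len(ordered)
        chunk = ordered[start:end]
        expected = sum(pred for pred, _ in chunk)
        actual = sum(act for _, act in chunk)
        ratios.append(actual / expected)
    return ratios


# Hypothetical holdout records: (model prediction, observed loss).
holdout_records = [(100, 90), (120, 130), (150, 160),
                   (200, 190), (300, 390), (400, 520)]
ratios = actual_vs_expected(holdout_records, n_bands=3)
print([round(r, 2) for r in ratios])
```

In this made-up sample the first two bands track closely, while the highest band shows actual losses well above expected, the kind of gap that would prompt a closer look at the model's fit.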

Briefly describe overfitting a model.
 Overfitting results when variables in the model reflect noise or when the model is overspecified with high-order polynomials
 Replicates historical data well but doesn't project future reliably: Future experience unlikely to have same noise
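
A small sketch of this behavior, using made-up data: the training points follow a straight line plus noise, and a polynomial of degree five that passes through every training point is compared with a simple least-squares line. The data, noise values, and function names are all illustrative assumptions.

```python
# Hypothetical training data: a linear trend (y = 2x) plus fixed "noise".
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.5, -0.4, 0.3, -0.6, 0.4, -0.3]
train_y = [2.0 * x + e for x, e in zip(train_x, noise)]
train = list(zip(train_x, train_y))

# Hypothetical later experience: same trend, different noise.
holdout = [(0.5, 1.2), (2.5, 4.8), (4.5, 9.1), (6.0, 11.9)]


def fit_line(xs, ys):
    """Ordinary least-squares straight line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_interpolating_polynomial(xs, ys):
    """Lagrange polynomial through every training point (degree n - 1)."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            basis = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    basis *= (x - xj) / (xi - xj)
            total += yi * basis
        return total
    return predict


def mse(model, points):
    """Mean squared error of a model over (x, y) points."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


line = fit_line(train_x, train_y)
poly = fit_interpolating_polynomial(train_x, train_y)
print("train   MSE  line:", mse(line, train), " poly:", mse(poly, train))
print("holdout MSE  line:", mse(line, holdout), " poly:", mse(poly, holdout))
```

The polynomial reproduces the training data exactly because it has also fitted the noise, and its error on the later points is far larger than the line's.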

Briefly describe underfitting a model.
 Underfitting occurs if the model omits statistically significant variables
 Model doesn't have enough explanatory power

Briefly describe seven important areas that the actuary needs to consider when using GLMs.
 1. Ensuring data is adequate for the level of detail of the classification ratemaking analysis: Avoiding GIGO (garbage in, garbage out)
 2. Developing appropriate methods to communicate model results: Considering company's ratemaking objectives
 3. Commercial considerations: IT constraints, Marketing objectives, Regulatory requirements
 4. Identifying when anomalous results dictate additional exploratory analysis
 5. Reviewing model results in consideration of both statistical theory and business application
 6. Retrieval of data requires careful consideration: Volume of data, Definition of homogeneous claim types, Method of organization (e.g., policy vs. accident year), Treatment of midterm policy changes, Large losses, Underwriting changes during experience period, Effect of inflation and loss development
 7. Always must balance stability and responsiveness: Choice of experience period and geographies

Briefly describe four actions the actuary should take to successfully use GLMs in the ratemaking analysis.
 1. Have solid background in company's data warehouses
 2. Develop some understanding of statistical methods and diagnostics
 3. Work collaboratively with other professionals who know portfolio of business
 4. Communicate effectively with stakeholders of company to ensure the technical results are expressed in relation to company's business objectives

Briefly describe four ways data mining techniques can be used to enhance a ratemaking analysis.
 1. Shorten long list of potential explanatory variables to use in GLM
 2. Provide guidance in how to categorize discrete variables
 3. Reduce dimension of multilevel discrete variables
 4. Identifying candidates for interaction variables within GLMs by detecting patterns of interdependency between variables

Identify five data mining techniques and briefly describe their use to enhance the underlying classification analysis.
 1. Cluster Analysis: Seeks to combine small groups of similar risks into larger homogeneous categories; Targets minimizing differences within a category and maximizing difference between categories
 2. CART (Classification and Regression Trees): Develop tree-building algorithms to determine a set of if-then logical conditions; Help improve classification and detect interactions between variables; Helps identify strongest list of initial variables and how to categorize each
 3. Factor Analysis: Reduce number of parameter estimates in classification analysis; May reduce number of variables or levels within a variable
 4. MARS (Multivariate Adaptive Regression Splines): Multiple piecewise linear regression where each breakpoint defines a region for a particular linear regression equation; Use to select breakpoints for categorizing continuous variables
 5. Neural Networks: User gathers test data and invokes training algorithms designed to automatically learn structure of the data; Results of neural networks can be fed into GLM
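
To make the MARS item above concrete, here is a minimal sketch of a piecewise linear function built from hinge terms, with its breakpoints reused to band a continuous variable. The breakpoints (ages 25 and 60), the coefficients, and the function names are hypothetical; a real MARS fit would select them from the data.

```python
from bisect import bisect_right

# Hypothetical breakpoints for driver age; a MARS fit would choose these.
KNOTS = [25.0, 60.0]


def hinge(value):
    """Hinge basis function: max(0, value)."""
    return max(0.0, value)


def mars_style_relativity(age):
    """Piecewise linear relativity with assumed coefficients.

    Slopes down until age 25, is flat from 25 to 60, and rises after 60.
    """
    return 1.0 + 0.05 * hinge(KNOTS[0] - age) + 0.02 * hinge(age - KNOTS[1])


def age_band(age):
    """Index of the band the age falls in, using the breakpoints as edges."""
    return bisect_right(KNOTS, age)


for age in (20, 40, 70):
    print(age, age_band(age), round(mars_style_relativity(age), 2))
```

Each breakpoint marks where the slope changes, so the same breakpoints give natural boundaries for grouping the continuous variable into categories for a GLM.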

Identify and give an example of each of four types of external data sources used to supplement company data to be used with GLMs.
 1. Geodemographics: E.g., population density of an area, average length of homeownership in an area
 2. Weather: E.g., average rainfall, number of days below freezing in an area
 3. Property characteristics: E.g., square footage of home or business, quality of fire department in area
 4. Information about insured individuals or business: E.g., credit info, occupation

