Briefly describe the shortcoming of univariate approaches.
They do not accurately take into account the effect of other rating variables
Minimum Bias Techniques
- "Minimum bias" refers to the balance principle, which requires that the sum of the indicated weighted pure premiums equal the sum of the weighted observed loss costs for every level of every rating variable
- Essentially iteratively standardized univariate approaches
- Select a rating structure (e.g., additive or multiplicative)
- Select a bias function, which compares the procedure's observed loss statistics to the indicated loss statistics and measures the mismatch
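The balance principle can be illustrated for a two-variable multiplicative rating structure. The following is a minimal sketch, not a definitive implementation: the function name, the exposures, and the observed loss costs are all hypothetical, and the multiplicative structure is an assumed choice of rating structure.

```python
def minimum_bias_multiplicative(exposure, loss_cost, iterations=100):
    """Iterate the balance principle for a two-variable multiplicative structure.

    exposure[i][j] and loss_cost[i][j] are the exposure weight and observed
    loss cost for level i of the first variable and level j of the second.
    The indicated loss cost for a cell is x[i] * y[j].
    """
    n_rows, n_cols = len(exposure), len(exposure[0])
    x = [1.0] * n_rows  # factors for the first rating variable (carry the base level)
    y = [1.0] * n_cols  # factors for the second rating variable
    for _ in range(iterations):
        # Balance each level of the first variable, holding y fixed.
        for i in range(n_rows):
            observed = sum(exposure[i][j] * loss_cost[i][j] for j in range(n_cols))
            x[i] = observed / sum(exposure[i][j] * y[j] for j in range(n_cols))
        # Balance each level of the second variable, holding x fixed.
        for j in range(n_cols):
            observed = sum(exposure[i][j] * loss_cost[i][j] for i in range(n_rows))
            y[j] = observed / sum(exposure[i][j] * x[i] for i in range(n_rows))
    return x, y


# Hypothetical data: rows are two territories, columns are two driver classes.
exposure = [[800.0, 200.0], [300.0, 700.0]]
loss_cost = [[100.0, 150.0], [210.0, 300.0]]
x, y = minimum_bias_multiplicative(exposure, loss_cost)
relativities = [y_j / y[0] for y_j in y]  # second variable relative to its first level
```

At convergence, the exposure-weighted indicated loss costs match the exposure-weighted observed loss costs for every row and every column, which is the balance condition stated above.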
Identify the circumstances that led to the adoption of multivariate techniques.
- Computing power
- Data warehouse initiatives
- Competitive Pressure
Identify the benefits of multivariate methods.
- Adjust for exposure correlations
- Allow for nature of random process
- Provide diagnostics
- Allow interaction variables
- Considered transparent
Describe what it means for multivariate techniques to be considered transparent.
Regardless of how mathematically sophisticated the technique is, the actuary must be able to follow and communicate how results are developed
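Generalized Linear Models (GLMs)
- Generalized version of linear models
- Remove the restrictions of the normality assumption and constant variance
- Link function: defines the relationship between the expected response and the linear combination of predictor variables
- Choice of link function means predictors need not combine in a strictly additive fashion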
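To illustrate how a link function relates the expected response to the linear combination of predictors, the sketch below assumes a log link, under which additive terms in the linear predictor become multiplicative relativities. The coefficient values, variable names, and the `expected_loss_cost` helper are hypothetical and are not taken from any fitted model.

```python
import math

# Hypothetical GLM coefficients on the log scale (illustrative values only).
coefficients = {
    "intercept": math.log(200.0),   # base loss cost of 200
    "young_driver": math.log(1.5),  # relativity of 1.5
    "urban": math.log(1.2),         # relativity of 1.2
}


def expected_loss_cost(is_young, is_urban):
    """Apply the inverse of a log link to the linear combination of predictors."""
    linear_predictor = (
        coefficients["intercept"]
        + coefficients["young_driver"] * is_young
        + coefficients["urban"] * is_urban
    )
    return math.exp(linear_predictor)


base = expected_loss_cost(0, 0)
young_urban = expected_loss_cost(1, 1)
```

Here the young urban risk is rated at 200 x 1.5 x 1.2 = 360, so the predictors combine multiplicatively even though the linear predictor itself is a sum.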
Briefly describe four reasons an actuary may prefer to model on Loss Cost Data rather than Loss Ratio.
- Modeling loss ratios requires premium at current rate level (CRL), which can be difficult to obtain at a granular level
- Experienced actuaries have an a priori expectation of frequency and severity patterns, so they can better distinguish signal from noise; in contrast, loss ratio patterns depend on current rates
- Loss ratio models become obsolete when rates and rating structures are changed
- No commonly accepted distribution for modeling loss ratios
You are modeling driver age for personal automobile bodily injury. The results of a univariate analysis and a multivariate analysis are significantly different. Explain.
- Disparity suggests age is strongly correlated with another variable in model
- E.g., prior accident experience, use of auto
- Univariate results are distorted
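The distortion can be seen in a small numerical sketch. All figures below are hypothetical: a true age relativity of 1.5, a true prior-accident relativity of 2.0, and exposures chosen so that young drivers are much more likely to have a prior accident. The one-way calculation ignores the accident variable entirely.

```python
# Hypothetical true structure: loss cost = base * age factor * prior-accident factor.
BASE = 100.0
AGE_FACTOR = {"young": 1.5, "adult": 1.0}
ACCIDENT_FACTOR = {"yes": 2.0, "no": 1.0}

# Hypothetical exposures: young drivers are far more likely to have a prior accident.
exposure = {
    ("young", "no"): 200.0,
    ("young", "yes"): 300.0,
    ("adult", "no"): 900.0,
    ("adult", "yes"): 100.0,
}


def one_way_loss_cost(age):
    """Exposure-weighted average loss cost for one age level, ignoring accidents."""
    cells = [(key, weight) for key, weight in exposure.items() if key[0] == age]
    losses = sum(w * BASE * AGE_FACTOR[a] * ACCIDENT_FACTOR[acc] for (a, acc), w in cells)
    return losses / sum(w for _, w in cells)


univariate_relativity = one_way_loss_cost("young") / one_way_loss_cost("adult")
true_relativity = AGE_FACTOR["young"] / AGE_FACTOR["adult"]
```

With these made-up numbers the one-way relativity for young drivers is 240 / 110, or about 2.18, against a true age effect of 1.5, because the univariate view picks up part of the prior-accident effect.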
Briefly describe the benefits of statistical diagnostics with GLMs.
- Aid the modeler in understanding the certainty of results and the appropriateness of the model
- Some help determine whether a predictive variable has a systematic effect on insurance losses; others assess the modeler's assumptions around the link function and error term
Briefly describe two of four statistical diagnostics used with GLMs.
- Standard errors:
- Narrow standard errors suggest variable is statistically significant;
- Wide standard errors, often with bands straddling a relativity of 1.0, suggest the factor is detecting mostly noise and the variable should be eliminated from the model
- Deviance tests:
- Measure how much fitted values differ from observations;
- Deviance of models compared to assess whether the additional variables in a broader model are worth keeping
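As an illustration of comparing deviances between nested models, the sketch below assumes a Poisson error structure for claim counts and uses hypothetical data. The narrow model fits one overall mean; the broader model adds a class variable with two levels. For models this simple, the maximum likelihood fitted values are just the relevant averages, so no fitting routine is needed. A formal test would go on to compare the drop in deviance against a reference distribution, which this sketch does not do.

```python
import math


def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(y * ln(y / mu) - (y - mu)), with 0 * ln(0) taken as 0."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total


# Hypothetical claim counts for ten policies, five in each of two classes.
counts_a = [0, 1, 0, 2, 1]
counts_b = [3, 2, 4, 1, 3]
observed = counts_a + counts_b

# Narrow model: one overall mean. Broader model: a separate mean per class.
overall_mean = sum(observed) / len(observed)
mean_a = sum(counts_a) / len(counts_a)
mean_b = sum(counts_b) / len(counts_b)

deviance_narrow = poisson_deviance(observed, [overall_mean] * len(observed))
deviance_broad = poisson_deviance(observed, [mean_a] * 5 + [mean_b] * 5)
deviance_drop = deviance_narrow - deviance_broad
```

The larger the drop in deviance relative to the number of parameters added, the stronger the case for keeping the additional variable.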
Briefly describe two more statistical diagnostics used with GLMs.
- Consistency over time:
- Compare results from individual years to gauge consistency of results from one year to the next
- Hold-out sample comparison:
- Compare the expected outcome of the model with historical results on a hold-out sample of data; considerable differences between actual and expected may indicate the model is over- or under-fitting
Briefly describe over-fitting a model.
- Over-fitting results when variables in the model reflect noise, or when the model is over-specified with high-order polynomials
- Replicates historical data well but doesn't project future reliably: Future experience unlikely to have same noise
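A small sketch of this effect, using made-up numbers: five training points follow an assumed underlying trend of 10 + 2x with alternating noise of +1 and -1. A fourth-degree polynomial through all five points is compared with a straight-line fit on a single hypothetical hold-out point. The helper names are illustrative only.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate the polynomial that passes exactly through every (xs, ys) point."""
    total = 0.0
    for i, (x_i, y_i) in enumerate(zip(xs, ys)):
        term = y_i
        for j, x_j in enumerate(xs):
            if j != i:
                term *= (x - x_j) / (x_i - x_j)
        total += term
    return total


def linear_fit(xs, ys):
    """Ordinary least squares line; returns (intercept, slope)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - slope * x_bar, slope


# Hypothetical training data: underlying trend 10 + 2x with alternating noise of +1 / -1.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [11.0, 11.0, 15.0, 15.0, 19.0]
holdout_x, holdout_y = 5.0, 20.0  # hypothetical hold-out point on the underlying trend

intercept, slope = linear_fit(train_x, train_y)
linear_prediction = intercept + slope * holdout_x
polynomial_prediction = lagrange_predict(train_x, train_y, holdout_x)
```

With these numbers the polynomial reproduces every training point exactly yet predicts 51 at the hold-out point, while the straight line predicts 20.2 against an observed 20. The polynomial has fitted the noise rather than the trend.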
Briefly describe under-fitting a model.
- Under-fitting results when the model omits statistically significant variables
- Model doesn't have enough explanatory power
- Will predict future outcomes but will not help explain what is driving the results
Briefly describe seven important areas that the actuary needs to consider when using GLMs.
1. Ensuring data is adequate for the level of detail of the classification ratemaking analysis:
Avoid "garbage in, garbage out" (GIGO)
2. Developing appropriate methods to communicate model results:
Considering company's ratemaking objectives
3. Identifying when anomalous results dictate additional exploratory analysis
4. Reviewing model results in consideration of both statistical theory and business application
5. Retrieval of data requires careful consideration:
Volume of data, Definition of homogeneous claim types, Method of organization (e.g., policy vs. accident year), Treatment of midterm policy changes, Large losses, Underwriting changes during experience period, Effect of inflation and loss development
6. Always must balance stability and responsiveness: Choice of experience period and geographies
7. Commercial considerations: IT constraints, Marketing objectives, Regulatory requirements
Briefly describe four tasks to successfully use GLMs in the ratemaking analysis.
- 1. Have solid background in company's data warehouses
- 2. Develop some understanding of statistical methods and diagnostics
- 3. Work collaboratively with other professionals who know portfolio of business
- 4. Communicate effectively with stakeholders of company to ensure the technical results are expressed in relation to company's business objectives
Briefly describe four ways data mining techniques can be used to enhance a ratemaking analysis.
- 1. Shorten long list of potential explanatory variables to use in GLM
- 2. Provide guidance in how to categorize discrete variables
- 3. Reduce dimension of multi-level discrete variables
- 4. Identifying candidates for interaction variables within GLMs by detecting patterns of interdependency between variables
Identify five data mining techniques and briefly describe their use to enhance the underlying classification analysis.
- 1. Cluster Analysis: Seeks to combine small groups of similar risks into larger homogeneous categories; Targets minimizing differences within a category and maximizing difference between categories
- 2. CART (Classification and Regression Trees): Develop tree-building algorithms to determine a set of if-then logical conditions; Help improve classification and detect interactions between variables; Help identify the strongest list of initial variables and how to categorize each
- 3. Factor Analysis: Reduce number of parameter estimates in classification analysis; May reduce number of variables or levels within a variable
- 4. MARS (Multivariate Adaptive Regression Splines): Multiple piecewise linear regression where each breakpoint defines the region for a particular linear regression equation; Use to select breakpoints for categorizing continuous variables
- 5. Neural Networks: User gathers test data and invokes training algorithms designed to automatically learn structure of the data; Results of neural networks can be fed into GLM
Identify and give an example of each of four types of external data sources used to supplement company data to be used with GLMs.
- 1. Geo-demographics: E.g., population density of an area, average length of homeownership in an area
- 2. Weather: E.g., average rainfall, number of days below freezing in an area
- 3. Property characteristics: E.g., square footage of home or business, quality of fire department in area
- 4. Information about insured individuals or business: E.g., credit info, occupation