
Hazard group
Collection of workers compensation classifications that have relatively similar expected XS loss factors over a broad range of limits

1993 NCCI review
 used principal components analysis on 3 yrs
 serious clm freq / total clm freq by class / statewide
 serious clm indemnity severity by class / statewide
 serious PP by class / statewide

2007 NCCI review
 based on class ELF & cluster analysis
 ELFs vary by state, but the hazard group mapping does not
 R_{j}(L) = Σ_{i} w_{i,j} S_{i}(L / μ_{i,j})
 w_{i,j} = % of loss due to injury type i
 S_{i} = state normalized XS ratio function for injury type i
 L / μ_{i,j} = entry ratio point
 XS ratio vector for class C: R_{C} = (R_{C}(L_{1}), ..., R_{C}(L_{n}))
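
A minimal sketch of the XS ratio formula above. It assumes, for illustration only, that every injury type shares one normalized XS ratio function S(r) = exp(-r) (the excess ratio of an exponential loss with mean 1). The weights, mean severities, and limits are made-up numbers, not NCCI values.

```python
import math

def normalized_xs_ratio(r):
    # Assumed form: for an exponential loss with mean 1, the excess ratio
    # at entry ratio r is exp(-r). The real curves differ by injury type.
    return math.exp(-r)

def xs_ratio(limit, weights, means, s=normalized_xs_ratio):
    # R(L) = sum over injury types i of w_i * S(L / mu_i)
    return sum(w * s(limit / mu) for w, mu in zip(weights, means))

def xs_ratio_vector(limits, weights, means):
    # One excess ratio per limit: (R(L_1), ..., R(L_n))
    return [xs_ratio(L, weights, means) for L in limits]

# Hypothetical inputs: share of loss and mean severity by injury type
weights = [0.10, 0.30, 0.60]
means = [500_000.0, 200_000.0, 20_000.0]
limits = [100_000, 500_000, 1_000_000]
print(xs_ratio_vector(limits, weights, means))
```

The vector decreases as the limit rises, and R(0) equals the sum of the weights.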

Corro & Engl
A distribution is characterized by its excess ratios, so there is no loss of information in working with XS ratios rather than with the size of loss distribution

Robertson hazard group credibility
 z = min(1.5 * n / (n + k), 1), n = class clm cnt, k = mean clm cnt
 1 - z given to R_{HG} (previous hazard group)
 k using median: too low, z too high
 excl medical only
 incl only serious claims
 k = mean of all classes w some minimum threshold → rejected, k was too high
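
A small sketch of the credibility formula and the blend with the previous hazard group vector. The function names and the sample numbers are hypothetical.

```python
def credibility(n, k):
    # z = min(1.5 * n / (n + k), 1); full credibility is reached at n >= 2k
    return min(1.5 * n / (n + k), 1.0)

def credibility_weighted(class_vec, hg_vec, n, k):
    # z goes to the class XS ratio vector, 1 - z to the previous HG vector
    z = credibility(n, k)
    return [z * rc + (1 - z) * rh for rc, rh in zip(class_vec, hg_vec)]

print(credibility(50, 100))   # 0.5
print(credibility(200, 100))  # 1.0
print(credibility_weighted([0.4, 0.2], [0.2, 0.1], n=50, k=100))
```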

Building hazard groups - Limits
 how to choose n and actual limits
 correlation btwn neighboring XS ratios is high
 looked at more limits, but little info was gained due to strong correlation btwn closer limits

Euclidean distance btwn vectors (L^{2})
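
As a quick illustration, the L^{2} distance between two XS ratio vectors can be sketched as below. The function name is hypothetical.

```python
import math

def l2_distance(x, y):
    # Square root of the sum of squared differences across limits
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

print(l2_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```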

Cluster method
 hierarchical: if 2 objects are in diff clusters in the k cluster partition, then they will be in different clusters in all partitions w more than k clusters
 k-means: for k clusters, group classes into k groups so as to minimize the sum of squared Euclidean distances btwn each class XS ratio vector and its group centroid
 centroid: avg XS ratio vector for i^{th} group
 |HG_{i}| = # of classes in hazard group i
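
A minimal sketch of k-means on XS ratio vectors, using plain Lloyd iterations with unweighted centroids and caller-supplied starting centroids. These choices and the sample vectors are assumptions for illustration, not NCCI's actual procedure.

```python
def centroid(vectors):
    # Average XS ratio vector of a group (simple mean, no weights)
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(x, y):
    # Squared Euclidean distance between two vectors
    return sum((a - b) ** 2 for a, b in zip(x, y))

def kmeans(vectors, init_centroids, max_iter=100):
    centroids = [list(c) for c in init_centroids]
    assignment = None
    for _ in range(max_iter):
        # Assign each class to its nearest centroid
        new_assignment = [
            min(range(len(centroids)), key=lambda i: sq_dist(v, centroids[i]))
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # converged: no class changed group
        assignment = new_assignment
        # Recompute each centroid from its current members
        for i in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == i]
            if members:  # keep the old centroid if a group is empty
                centroids[i] = centroid(members)
    return assignment, centroids

# Hypothetical class XS ratio vectors at two limits
vectors = [[0.10, 0.02], [0.12, 0.03], [0.40, 0.20], [0.42, 0.22]]
assignment, centroids = kmeans(vectors, [vectors[0], vectors[2]])
print(assignment)  # [0, 0, 1, 1]
```

The result depends on the starting centroids, so in practice several starts are usually compared.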

Optimal # of clusters
 Calinski & Harabasz statistic = [trace(B) / (k - 1)] / [trace(W) / (n - k)]
 n = # of classes, k = # of clusters
 maximize it (high means higher btwn & lower within)
 Cubic Clustering Criterion (CCC): compares amt of variance explained by a given set of clusters to that explained by random clusters
 (-) less reliable when data is elongated (variables are highly correlated)

Reasons why NCCI kept 7 hazard groups
 Calinski & Harabasz gave right answer more often than CCC on control data
 CCC less reliable when var are highly correlated
 both tests indicated 7 when only classes w high cred were used
 9 HG solution produces crossovers

NCCI update - why B & E have many classes
 XS ratios were credibility weighted w prior HG
 low cred classes have similar vectors → end up together

