Loss Simulation Model
Testing and Enhancement
Casualty Loss Reserve Seminar
By
Kailan Shang
Sept. 2011
Agenda
Research Overview
Model Testing
Real Data
Model Enhancement
Further Development
I. Research Overview
Background: Why use the LSM?
Reserving is a challenging task that requires a lot of judgment in assumption setting.
The loss simulation model (LSM) is a tool created by the CAS Loss Simulation Model Working Party (LSMWP) to generate claims that can be used to test loss reserving methods and models.
It helps us understand the impact of assumptions on reserving from a different perspective: distributions based on simulations that resemble real experience.
In addition, stochastic reserving is also a popular trend.
Background: How to use the LSM
[Flowchart]
Real claim data and reserve data
→ fit to statistical models (frequency, severity, trend, state)
→ Loss Simulation Model
→ test against real experience / model assumptions; once it passes,
→ run simulations to produce stochastic claim and reserve data
→ apply different reserve methods to obtain reserve distributions
→ compare against the simulated claim data
→ choose the best reserve method
We do not expect an accurate estimation of the claim amount.
We are more concerned about the adequacy of our reserve:
at what probability is the reserve expected to fall below the final payment?
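Given simulated claim data, that probability can be read off empirically. A minimal sketch in R, using hypothetical inputs (sim.claims standing in for LSM output, with two reserve levels from candidate methods):

R code sketch
# stand-in for LSM output: one simulated final payment per scenario
set.seed(16807)
sim.claims <- rlnorm(1000, meanlog = 2.5, sdlog = 0.3)
# hypothetical reserves set by two candidate methods
reserve.A <- quantile(sim.claims, 0.90)
reserve.B <- quantile(sim.claims, 0.995)
# probability that the reserve falls below the final payment
mean(sim.claims > reserve.A)
mean(sim.claims > reserve.B)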
Background: How to use the LSM
Claim Distribution vs. Reserve Distribution
[Figure: probability density functions of the aggregate claim and of the reserve estimates under Method A and Method B, with the mean marked; one out of hundreds of examples]

Cumulative probability by aggregate claim amount:

Amount   Claim    Method A   Method B
10       83.5%    73.7%      81.2%
15       95.7%    90.3%      96.7%
20       99.0%    96.6%      99.5%
25       99.8%    98.9%      99.9%
30       99.9%    99.6%      100.0%

The 99.9% percentile of Method B < the 99.9% percentile of the claim: is reserve Method B good enough?
Without stochastic analysis, Method B might be chosen. The LSM can help you on it!
Overview
Test some items suggested but not fully addressed in the CAS LSMWP summary report, "Modeling Loss Emergence and Settlement Processes".
Fit real claim data to the models.
Build a two-state regime-switching feature into the LSM to add an extra layer of flexibility for describing claim data.
Software: LSM and R. The R source code for model testing and model fitting is provided.
Model Testing
[Flowchart as before, highlighting the "test against real experience / model assumptions" step]
Test against model assumptions:
- Negative binomial frequency distribution
- Correlation
- Severity trend
- Case reserve adequacy distribution
Real Data: Model Fitting
[Flowchart as before, highlighting the "fit to statistical models" step]
Fit real claim data to statistical models:
- Frequency
- Severity
- Trend
- Correlation
Model Enhancement
[Flowchart as before, highlighting the "state" input to the statistical models]
Two-state regime-switching distribution:
- Switches between states with specified probabilities
- Each state represents a distinct distribution
II. Model Testing
DAY ONE
9 AM
"Tom, our company plans to use the loss simulation model to help with our reserving work. Let's do some tests first to get a better understanding of the model."
"Boss, where shall we start?"
"Start with the frequency model."
Negative Binomial Frequency Testing
Frequency simulation
- One line with annual frequency Negative Binomial (size = 100, prob. = 0.4)
- Monthly exposure: 1
- Frequency trend: 1
- Seasonality: 1
- Accident year: 2000
- Random seed: 16807
- No. of simulations: 1000

R code extract
# draw histogram
hist(dataf1, main="Histogram of observed data")
# QQ plot against a simulated Negative Binomial sample
freq.ex <- rnbinom(n=1000, size=100, prob=0.4)
qqplot(dataf1, freq.ex, main="QQ-plot distr. Negative Binomial")
abline(0,1)  ## a 45-degree reference line is plotted
Histogram and QQ plot
[Figure: histogram of observed frequencies (dataf1) and QQ plot against the Negative Binomial sample]
Negative Binomial Frequency Testing
Goodness-of-fit test - Pearson's χ²

  Pearson χ²   p-value
  197.4        0.64

Maximum likelihood (ML) estimation

        Estimation   S.D.
size    117.2        9.5
mu      144.2        0.57

Model assumption vs. ML estimation

                   Size   Prob.   Mean (μ)   Variance
Model assumption   100    0.4     150        375
ML estimation      117    0.448   144.2      321.5
R code extract
# goodness-of-fit test
library(vcd)  # load package vcd
gf <- goodfit(dataf1, type="nbinom", par=list(size=100, prob=0.4))
# maximum likelihood estimation
gf <- goodfit(dataf1, type="nbinom", method="ML")
library(MASS)  # for fitdistr
fitdistr(dataf1, "Negative Binomial")
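For reference, the mean and variance columns follow from the negative binomial identities mean = size × (1 − prob)/prob and variance = mean/prob; a quick check with the table's figures:

R code sketch
# negative binomial moments implied by (size, prob)
nb.moments <- function(size, prob) {
  m <- size * (1 - prob) / prob
  c(mean = m, variance = m / prob)
}
nb.moments(100, 0.4)    # model assumption: mean 150, variance 375
nb.moments(117, 0.448)  # ML estimates: mean ~144.2, variance ~321.5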
DAY ONE
5 PM
Good job, Tom!
Let's get the correlation test done tomorrow.
Correlation
Correlation among frequencies of different lines
- Gaussian copula
- Clayton copula
- Frank copula
- Gumbel copula
- t copula
Correlation between claim size and report lag
- Gaussian copula
- Clayton copula
- Frank copula
- Gumbel copula
- t copula
Use R package copula
Frequencies: Frank Copula
Frank copula:

  Cₙ(u) = -(1/θ) ln( 1 + (e^(-θu₁) - 1)(e^(-θu₂) - 1) ⋯ (e^(-θuₙ) - 1) / (e^(-θ) - 1)^(n-1) ),  θ > 0

- uᵢ: marginal cumulative distribution function (CDF)
- C(u): joint CDF
Frequencies simulation
- Two lines with annual frequency Poisson (λ = 96)
- Monthly exposure: 1
- Frequency trend: 1
- Seasonality: 1
- Accident year: 2000
- Random seed: 16807
- Frequency correlation: Frank copula with θ = 8, n = 2
- # of simulations: 1000
Test method
- Scatter plot
- Goodness-of-fit test
  1. Parameter estimation based on maximum likelihood and inversion of Kendall's tau
  2. Cramer-von Mises (CvM) statistic:

     Sₙ = Σₖ₌₁ⁿ { Cₙ(Uₖ) - C_θₙ(Uₖ) }²

     where Cₙ is the empirical copula and C_θₙ is the fitted parametric copula
  3. p-value by parametric bootstrapping
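A minimal sketch of this workflow for the Frank copula, mirroring the fitCopula/gofCopula calls shown in the Gumbel R code extract on the next slide (rCopula is the sampler's name in current versions of the copula package; older versions used rcopula):

R code sketch
library(copula)
set.seed(16807)
# simulate 1000 pairs of uniform marginals from a Frank copula with theta = 8
frank.cop <- frankCopula(8, dim=2)
u <- rCopula(1000, frank.cop)
plot(u, xlab="Line1", ylab="Line2")
# parameter estimation by maximum likelihood and inversion of Kendall's tau
fit.ml   <- fitCopula(frank.cop, u, method="ml")
fit.itau <- fitCopula(frank.cop, u, method="itau")
# goodness-of-fit test with p-value by parametric bootstrap
gofCopula(frank.cop, u, N=100, method="mpl")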
Frequencies: Frank Copula
Scatter plot
[Figure: simulated frequencies, Line 1 vs. Line 2, Frank copula (θ = 8)]

Goodness-of-fit test
- Maximum likelihood method
  Parameter estimate(s): 7.51
  Std. error: 0.28
  CvM statistic: 0.016 with p-value 0.31
- Inversion of Kendall's tau method
  Parameter estimate(s): 7.54
  Std. error: 0.31
  CvM statistic: 0.017 with p-value 0.20
R code extract
# construct a Gumbel copula object
gumbel.cop <- gumbelCopula(3, dim=2)
# parameter estimation
fit.gumbel <- fitCopula(gumbel.cop, x, method="ml")
fit.gumbel <- fitCopula(gumbel.cop, x, method="itau")
# copula goodness-of-fit test
gofCopula(gumbel.cop, x, N=100, method="mpl")
gofCopula(gumbel.cop, x, N=100, method="itau")
Claim Size and Report Lag: Normal Copula

Normal copula, a.k.a. Gaussian copula:

  C(u) = Φ_Σ( Φ⁻¹(u₁), …, Φ⁻¹(uₙ) )

- Σ: correlation matrix
- Φ: normal cumulative distribution function
Claim simulation
- One line with annual frequency Poisson (λ = 120)
- Monthly exposure: 1
- Frequency trend: 1.05
- Seasonality: 1
- Accident year: 2000
- Random seed: 16807
- Payment lag: exponential with rate = 0.00274, which implies a mean of 365 days
- Size of entire loss: lognormal with μ = 11.17 and σ = 0.83
- Correlation between payment lag and size of loss: normal copula with correlation ρ = 0.85, dimension 2
- # of simulations: 10
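To make the mechanics concrete, here is a minimal sketch of this simulation using the parameter values above (a stand-alone illustration, not the LSM's internal code):

R code sketch
library(copula)
set.seed(16807)
# correlated uniforms from a normal copula with rho = 0.85
norm.cop <- normalCopula(0.85, dim=2)
u <- rCopula(10, norm.cop)  # 10 simulations, as above
# transform the uniforms through the marginal distributions
lag  <- qexp(u[,1], rate=0.00274)                 # payment lag, mean ~365 days
size <- qlnorm(u[,2], meanlog=11.17, sdlog=0.83)  # size of entire loss
plot(lag, size, main="Simulated claim size vs. lag")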
Claim Size and Report Lag: Normal Copula
Scatter plot
[Figure: simulated claim size vs. report lag, normal copula (ρ = 0.85)]

Goodness-of-fit test
- Maximum likelihood method
  Parameter estimate(s): 0.83
  Std. error: 0.01
  CvM statistic: 0.062 with p-value 0.05
- Inversion of Kendall's tau method
  Parameter estimate(s): 0.85
  Std. error: 0.01
  CvM statistic: 0.029 with p-value 0.015
DAY THREE
9 AM
We often see trends in our
claim data. How is it handled
in the simulation model?
Severity Trend
The LSM models it with a cumulative trend factor (cum) and a persistency parameter α (the persistency of the force of the trend):

  trend = cum_acc_date × (cum_pmt_date / cum_acc_date)^α = (cum_acc_date)^(1-α) × (cum_pmt_date)^α
Trend factor test parameters
- One line with annual frequency Poisson (λ = 96)
- Monthly exposure: 1
- Frequency trend: 1
- Seasonality: 1
- Accident year: 2000 to 2005
- Random seed: 16807
- Size of entire loss: lognormal with μ = 11.17 and σ = 0.83
- Severity trend: 1.5
- # of simulations: 300
Severity Trend
Trend factor Test
- Decomposition of the time series by loess (locally weighted regression) into trend, seasonality, and remainder
[Figure: mean loss size by month, 2000-2005, and its decomposition into trend, seasonal, and remainder components]
- Time series analysis (linear regression)
R code extract
# set up time series
ts1 <- ts(data, start=2000, frequency=12)
plot(ts1)
# decomposition
plot(stl(ts1, s.window="periodic"))
# linear trend fitting
trend <- time(ts1) - 2000
reg <- lm(log(ts1) ~ trend, na.action=NULL)
Log(mean loss size) = intercept + trend × (time − 2000) + error term

Coefficients:
             Estimate    Std. Error   t value   Pr(>|t|)
(Intercept)  11.034162   0.007526     1466.1    <2e-16
trend        0.405552    0.002196     184.7     <2e-16

Residual standard error: 0.03226 on 70 degrees of freedom
Multiple R-squared: 0.998, Adjusted R-squared: 0.9979
F-statistic: 3.412e+04 on 1 and 70 DF, p-value: < 2.2e-16
exp(0.405552) = 1.50013 vs. model input 1.5
Severity Trend
Trend persistency test parameters
- One line with annual frequency Poisson (λ = 96)
- Monthly exposure: 1
- Frequency trend: 1
- Seasonality: 1
- Accident year: 2000 to 2001
- Random seed: 16807
- Size of entire loss: lognormal with μ = 11.17 and σ = 0.83
- Severity trend: 1.5
- Alpha (α) = 0.4
- # of simulations: 1000
But how do we test it?
Choose the loss payments with report date during the 1st month and payment date during the 7th month.
The severity trend is (1.5^(1/12))^(1-0.4) × (1.5^(7/12))^0.4 ≈ 1.122.
The expected loss size is 1.122 × e^(11.17 + 0.83²/2) ≈ 112,175.
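The same arithmetic can be checked directly in R, a small sketch using the values from the parameter list above:

R code sketch
# persistency-adjusted severity trend between accident and payment dates
trend.factor <- function(cum.acc, cum.pmt, alpha) {
  cum.acc^(1 - alpha) * cum.pmt^alpha
}
tf <- trend.factor(1.5^(1/12), 1.5^(7/12), alpha=0.4)  # ~1.122
tf * exp(11.17 + 0.83^2 / 2)  # expected loss size, ~112,000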
Severity Trend
Trend persistency Test
Histogram and fitted pdf; QQ plot of severity
[Figure: lognormal pdf overlaid on the severity histogram, and QQ plot of severity against the lognormal distribution]
- Maximum likelihood estimation (mean of severity = 113,346)

          Estimation   Standard Deviation
meanlog   11.32        0.052
sdlog     0.80         0.037

- Normality test of log(severity)
  Kolmogorov-Smirnov test: p-value = 0.82
  Anderson-Darling normality test: p-value = 0.34
R code extract
# Kolmogorov-Smirnov test
ks.test(a, "plnorm", meanlog=11.32, sdlog=0.8)
# Anderson-Darling test
library(nortest)  ## package loading
ad.test(datas1.norm)
DAY FOUR
9 AM
I heard you guys plan to use
the loss simulation model.
Is it capable of modeling
case reserve adequacy?
Case Reserve Adequacy
In the LSM, the case reserve adequacy (CRA) distribution attempts to model the reserving process by generating a case reserve adequacy ratio at each valuation date:
- Case reserve = generated final claim amount × case reserve adequacy ratio
Case reserve simulation
- One line with annual frequency Poisson (λ = 96)
- Monthly exposure: 1
- Frequency trend: 1
- Seasonality: 1
- Accident year: 2000 to 2001
- Random seed: 16807
- Size of entire loss: lognormal with μ = 11.17 and σ = 0.83
- Severity trend: 1
- P(0) = 0.4
- Est P(0) = 0.4
- # of simulations: 8
Test the case reserve adequacy ratio at the 40% time point (60% × report date + 40% × final payment date).
Mean: e^(0.25 + 0.05²/2) ≈ 1.2856
Case Reserve Adequacy
Case Reserve Adequacy Test
QQ plot of CRA ratio
[Figure: QQ plot of the case reserve adequacy ratio against the lognormal distribution]

- Maximum likelihood estimation

          Estimation   Standard Deviation
meanlog   0.08         0.014
sdlog     0.32         0.010

- Normality test of log(CRA ratio)
  Kolmogorov-Smirnov test: p-value = 0.00
  Anderson-Darling normality test: p-value = 0.00

Where did it go wrong?
- The case reserve is generated on the simulated valuation dates.
- On the report date, a case reserve of 2,000 is allocated to each claim.
- The linear interpolation method is used to get the case reserve ratio at the 40% time point.
- If the second valuation date is beyond the 40% time point, the linear interpolation method is not appropriate.
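For reference, a small sketch of the interpolation being described, with hypothetical valuation dates and case reserves (approx does the linear interpolation):

R code sketch
# hypothetical valuation history for one claim (days since report date)
val.dates    <- c(0, 200, 400, 730)
case.reserve <- c(2000, 2400, 2600, 0)  # 2000 allocated on the report date
final.pmt    <- 2500
# 40% time point between report date and final payment date
t40 <- 0.4 * 730
# interpolated CRA ratio at the 40% time point; if the second valuation
# date falls beyond t40, the interpolation leans on the fixed initial
# reserve of 2000 and distorts the ratio
cra.40 <- approx(val.dates, case.reserve, xout=t40)$y / final.pmt
cra.40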
III. Real Data
DAY FIVE
5 PM
Wait a minute, Tom! I want you to think about how to use real claim data for model calibration over the weekend!
Real Data
Marine claim data for distribution fitting, trend analysis, and correlation analysis
- Two product lines: Property and Liability
- Data period: 2006-2010
- Fields: accident date, payment date, and final payment amount
Fit the frequency
- Draw time series and decomposition
[Figure: historical monthly claim frequency, 2006-2010, and its decomposition into trend, seasonal, and remainder components]
Real Data
Fit the frequency (continued)
- Linear regression for trend analysis
Log(monthly frequency) = intercept + trend × (time − 2006) + error term

Coefficients:
             Estimate    Std. Error   t value   Pr(>|t|)
(Intercept)  1.93060     0.15164      12.732    <2e-16
trend        -0.14570    0.05919      -2.462    0.0172

Residual standard error: 0.5649 on 52 degrees of freedom
Multiple R-squared: 0.1044, Adjusted R-squared: 0.08715
F-statistic: 6.06 on 1 and 52 DF, p-value: 0.01718
Trend fitting
[Figure: log monthly frequency, log(ts1), with fitted linear trend, 2006-2010]
Real Data
Fit the frequency (continued)
- Detrend the frequency and fit to the lognormal distribution
          Estimation   Standard Deviation
meanlog   9.5539259    0.4260991
sdlog     3.1311762    0.3012976

- Normality test of log(detrended freq.)
  Kolmogorov-Smirnov test: p-value = 0.84

[Figure: QQ plot of detrended frequency against the fitted distribution]
Real Data
Fit the severity
Correlation calibration
- Empirical correlation
[Figure: scatter plot of Line 1 vs. Line 2, Frank copula (θ = 1.3)]
- Maximum likelihood method
  Parameter estimate(s): 1.51
  CvM statistic: 0.027 with p-value 0.35
- Inversion of Kendall's tau method
  Parameter estimate(s): 1.34
  CvM statistic: 0.028 with p-value 0.40
What is missing? Historical reserve data, which are essential for case reserve adequacy modeling.
IV. Model Enhancement
Two-state regime-switching model
Sometimes the frequency and severity distributions are not stable over time, because of:
- Structural change
- Cyclical patterns
- Idiosyncratic characteristics
The model
- Two distinct distributions represent the different states
- Transition rules from one state to another:
  P11: state 1 persistency, the probability that the state will be 1 next month given that it is 1 this month
  P12: the probability that the state will be 2 next month given that it is 1 this month
  P21: the probability that the state will be 1 next month given that it is 2 this month
  P22: state 2 persistency, the probability that the state will be 2 next month given that it is 2 this month
π₁: steady-state probability of state 1
π₂: steady-state probability of state 2

  (π₁ π₂) × | P11 P12 | = (π₁ π₂)
            | P21 P22 |

  P11 = 1 − P12,  P21 = 1 − P22,  π₁ + π₂ = 1
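Solving these equations gives π₁ = (1 − P22)/(2 − P11 − P22) and π₂ = 1 − π₁; a small R check (the same values are reused on the next slide):

R code sketch
# closed-form steady-state probabilities of the two-state chain
steady.state <- function(P11, P22) {
  pi1 <- (1 - P22) / (2 - P11 - P22)
  c(pi1 = pi1, pi2 = 1 - pi1)
}
steady.state(0.5, 0.7)  # pi1 = 0.375, pi2 = 0.625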
Two-state regime-switching model
The Simulation
- Steps
  1. Generate a uniform random number randf0 on the range [0,1].
  2. If randf0 < π₁, the state of the first month is 1; else, it is 2.
  3. Generate a uniform random number randfi on the range [0,1].
  4. For previous month state i, if randfi < Pi1, the state is 1; else it is 2.
  5. Repeat steps 3 and 4 until the end of the simulation is reached.
- Test parameters
  State 1: Poisson distribution (λ = 120)
  State 2: negative binomial distribution (size = 36, prob = 0.5)
  Assume the trend, monthly exposure, and seasonality are all 1
  State 1 persistency: 0.5
  State 2 persistency: 0.7
  Seed: 16807

  π₁ = (1 − P22)/(2 − P11 − P22) = (1 − 0.7)/(2 − 0.5 − 0.7) = 0.375
  π₂ = (1 − P11)/(2 − P11 − P22) = (1 − 0.5)/(2 − 0.5 − 0.7) = 0.625

Random Number (RN)     State   Criteria
0.634633548790589      2       RN > 0.375
0.801362191326916      1       RN > 0.7
0.529508789768443      2       RN > 0.5
0.0441845036111772     2       RN < 0.7
0.994539848994464      1       RN > 0.7
0.21886122901924       1       RN < 0.5
0.0928565948270261     1       RN < 0.5
0.797880138037726      2       RN > 0.5
0.129500501556322      2       RN < 0.7
0.24027365935035       2       RN < 0.7
0.797712686471641      1       RN > 0.7
0.0569291599094868     1       RN < 0.5
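A minimal sketch of steps 1-5 in R, following the same logic as the table above (the criteria compare the random number with the persistency of the current state, which is equivalent to step 4; annual parameters are converted to monthly values, an assumption on my part):

R code sketch
set.seed(16807)
n.months <- 12
P11 <- 0.5; P22 <- 0.7
pi1 <- (1 - P22) / (2 - P11 - P22)  # 0.375

state <- integer(n.months)
state[1] <- if (runif(1) < pi1) 1 else 2  # steps 1-2
for (i in 2:n.months) {                   # steps 3-5
  stay <- runif(1) < ifelse(state[i-1] == 1, P11, P22)
  state[i] <- if (stay) state[i-1] else 3 - state[i-1]
}
# monthly frequency drawn from the distribution of the current state
freq <- ifelse(state == 1,
               rpois(n.months, lambda=120/12),
               rnbinom(n.months, size=36/12, prob=0.5))
cbind(state, freq)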
Two-state regime-switching model
The Test: Transition Matrix
- Frequency
  State 1: Poisson (λ = 120); state 1 persistency: 0.2
  State 2: negative binomial (size = 36, prob = 0.5); state 2 persistency: 0.9

Line 1 frequency:
  | P11 P12 |   | 0.15 0.85 |
  | P21 P22 | = | 0.1  0.9  |
  (π₁ π₂) = (10.53% 89.47%)

Line 2 frequency:
  | P11 P12 |   | 0.2 0.8 |
  | P21 P22 | = | 0.1 0.9 |
  (π₁ π₂) = (11.11% 88.89%)
Non-zero cases:
  Line 1: state 1: 391; state 2: 2797
  Line 2: state 1: 410; state 2: 2733
Probability of zero cases:
  Line 1: state 1: 0.005% (e⁻¹⁰); state 2: 0.125 (prob³)
  Line 2: state 1: 0.005% (e⁻¹⁰); state 2: 0.135 (e⁻²)
Estimated all cases = non-zero cases / (1 − probability of zero cases):
  Line 1: state 1: 391; state 2: 3188 (2797/(1 − 0.125))
  Line 2: state 1: 410; state 2: 3161 (2733/(1 − 0.135))
Total cases: # of simulations × 12 months = 3600
Steady-state probability (compared with π₁ and π₂):
  Line 1: state 1: 391/3600 = 10.86%; state 2: 1 − 10.86% = 89.14%
  Line 2: state 1: 410/3600 = 11.4%; state 2: 1 − 11.4% = 88.6%
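The zero-claim probabilities behind the gross-up can be reproduced directly, assuming monthly parameters of λ = 120/12 = 10 for the Poisson state and size = 36/12 = 3 for the negative binomial state:

R code sketch
dpois(0, lambda=10)           # Poisson state: e^-10, about 0.005%
dnbinom(0, size=3, prob=0.5)  # negative binomial state: 0.5^3 = 0.125
# gross-up of observed non-zero months to estimate all months in a state
2733 / (1 - 0.135)            # ~3160 (the table shows 3161)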
Two-state regime-switching model
The Test: Correlation

[Figure: four scatter plots of Line 1 vs. Line 2 frequencies, one per state combination, normal copula (ρ = 0.95)]

- Set 1: state 1 for line 1 and state 1 for line 2
- Set 2: state 1 for line 1 and state 2 for line 2
- Set 3: state 2 for line 1 and state 1 for line 2
- Set 4: state 2 for line 1 and state 2 for line 2

A goodness-of-fit test is also conducted.
Interface
Input
Interface
Output
- An additional column in the claim and transaction output files records the state
- The state and random number are shown while simulating
THREE MONTHS LATER
Well done! It improved our
reserve adequacy a lot and
reduced our earnings volatility.
We created a new manager
position for you.
Congratulations!
V. Further Development
Further Development
The case reserve adequacy test shows that the assumption is not consistent with the simulated data.
This may be caused by the linear interpolation method used to derive the 40% time point case reserve.
It is suggested that the way valuation dates are determined in the LSM be revised. In addition to the simulated valuation dates based on the waiting-period distribution assumption, some deterministic time points can be added as valuation dates.
In the LSM, case reserve adequacy distributions can be input at the 0%, 40%, 70%, and 90% time points. Therefore, these time points may be added as deterministic valuation dates.
Thank you!