Article
Claims Modelling with Three-Component Composite Models
Jackie Li 1, * and Jia Liu 2
1 Department of Econometrics and Business Statistics, Monash University, Melbourne 3800, Australia
2 Research School of Finance, Actuarial Studies & Statistics, Australian National University,
Canberra 0200, Australia; [email protected]
* Correspondence: [email protected]
Abstract: In this paper, we develop a number of new composite models for modelling individual
claims in general insurance. All our models contain a Weibull distribution for the smallest claims, a
lognormal distribution for the medium-sized claims, and a long-tailed distribution for the largest
claims. They provide a more detailed categorisation of claims sizes when compared to the existing
composite models which differentiate only between the small and large claims. For each proposed
model, we express four of the parameters as functions of the other parameters. We fit these models
to two real-world insurance data sets using both maximum likelihood and Bayesian estimation,
and test their goodness-of-fit based on several statistical criteria. They generally outperform the
existing composite models in the literature, which comprise only two components. We also perform
regression using the proposed models.
Keywords: composite models; loss data; fire insurance claims; vehicle insurance claims; tail quantiles
1. Introduction
1.1. Current Literature
Modelling individual claim amounts which have a long-tailed distribution is an
important task for general insurance actuaries. The usual candidates with a heavy tail
include the two-parameter Weibull, lognormal, Pareto, and three-parameter Burr models
(e.g., Dickson 2016). Venter (1983) introduced the four-parameter generalised beta type-II
(GB2) model, which nests more than 20 popular distributions (e.g., Dong and Chan 2013)
and can provide more flexibility in describing the skewness and kurtosis of the claims.
Citation: Li, Jackie, and Jia Liu. 2023. Claims Modelling with Three-Component Composite Models. Risks 11: 196. https://doi.org/10.3390/risks11110196

McNeil (1997) applied the generalised Pareto distribution (GPD) to the excesses above a high threshold based on the extreme value theory. Many advanced models have been built with these various distribution assumptions, as it is crucial for an insurer to provide an adequate allowance for potential adverse financial outcomes.
Received: 28 September 2023; Revised: 29 October 2023; Accepted: 8 November 2023; Published: 13 November 2023

Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In order to deliver a reasonable parametric fit for both smaller claims and very large claims, Cooray and Ananda (2005) constructed the two-parameter composite lognormal-Pareto model. It is composed of a lognormal density up to an unknown threshold and a Pareto density beyond that threshold. Using a fire insurance data set, they demonstrated a better performance by the composite model when compared to traditional models like the gamma, Weibull, lognormal, and Pareto. Scollnik (2007) improved the lognormal-Pareto model by allowing the weights to vary and also introduced the lognormal-GPD model, in which the tail is modelled by the GPD instead. By contrast, Nadarajah and Bakar (2014) modelled the tail with the Burr density. Scollnik and Sun (2012) and Bakar et al. (2015) further tested several composite models which use the Weibull distribution below the threshold and a variety of heavy-tailed distributions above the threshold. In all these extensions, an important feature is that the threshold selection is based on the data. Moreover, all the authors hitherto imposed continuity and differentiability conditions on the threshold point, and so the effective number of parameters is reduced by two. While there are some other similar mixture models (e.g., Calderín-Ojeda and Kwok 2016; Reynkens et al. 2017) in the literature, we preserve the term "composite model" for only those with these continuity-differentiability requirements in this paper. Some other recent and related studies include those of Laudagé et al. (2019), Wang et al. (2020), and Poufinas et al. (2023).
1.2. Proposed Composite Models
All the composite models mentioned above have only two components. For a very large data set, the behaviour of claims of different sizes may differ vastly, which would then call for a finer division between the claim amounts and thus more components to be incorporated (e.g., Grün and Miljkovic 2019). In this paper, we develop new three-component composite models with an attempt to provide a better description of the characteristics of different data ranges. Each of our models contains a Weibull distribution for the smallest claims, a lognormal distribution for the medium-sized claims, and a heavy-tailed distribution for the largest claims. We choose the sequence of starting with the Weibull and then the lognormal for a few reasons. First, as shown in Figure 1, the Weibull distribution tends to have a more flexible shape on the left side, which makes it potentially more useful for the smallest claims. Second, the lognormal distribution usually has a heavier tail, given the mean and variance, as the limiting density ratio of Weibull to lognormal approaches zero when x goes to infinity (see Appendix A). This means that the lognormal distribution would be more suitable for claims of larger sizes. Nevertheless, neither the Weibull nor the lognormal possesses a sufficiently heavy tail for modelling the largest claims. Comparatively, a heavy-tailed distribution like the Pareto, Burr, or GPD is a better option for this purpose. We apply the proposed three-component composite models to two real-world insurance data sets and use both maximum likelihood and Bayesian methods to estimate the model parameters for comparison. Based on several statistical tests of the goodness-of-fit, we find that the new composite models outperform not just the traditional models but also the earlier two-component composite models. In particular, it is informative to see how the fitted models indicate the splits or thresholds that separate claims into three size categories: small, medium, and large. We experiment with applying regression under the proposed model structure and find that different claim sizes have different significant covariates. Moreover, we consider a 3D map which can serve as a risk management tool and summarise the entire model space and the resulting tail risk estimates. Note that we focus on the claim severity (but not the claim frequency) in this study.
Figure 1. Examples of density functions of Weibull and lognormal distributions.
The remainder of the paper is as follows. Sections 2–4 introduce the composite Weibull-lognormal-Pareto, Weibull-lognormal-GPD, and Weibull-lognormal-Burr models. Section 5 provides a numerical illustration using two insurance data sets of fire claims and vehicle claims. Section 6 sets forth the concluding remarks. Appendix A presents some JAGS (specific software for Bayesian modelling) outputs of Bayesian simulation for the proposed models.
2. Weibull-Lognormal-Pareto Model
Suppose X is a random variable with probability density function (pdf)

$$f(x)=\begin{cases}\dfrac{w_1}{1-\exp\left(-\theta_1^{\tau}/\phi^{\tau}\right)}\,f_1(x) & \text{for } 0<x\le\theta_1\\[2mm]\dfrac{w_2}{\Phi\left(\frac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\left(\frac{\ln\theta_1-\mu}{\sigma}\right)}\,f_2(x) & \text{for } \theta_1<x\le\theta_2\\[2mm](1-w_1-w_2)\,f_3(x) & \text{for } \theta_2<x<\infty\end{cases},\tag{1}$$

where

$$f_1(x)=\frac{\tau x^{\tau-1}}{\phi^{\tau}}\exp\left(-\frac{x^{\tau}}{\phi^{\tau}}\right),\qquad f_2(x)=\frac{1}{x\sigma\sqrt{2\pi}}\exp\left(-\frac{(\ln x-\mu)^2}{2\sigma^2}\right),\qquad f_3(x)=\frac{\alpha\theta_2^{\alpha}}{x^{\alpha+1}}.$$
In effect, f1(x) is the pdf of Weibull(φ, τ) for φ, τ > 0, f2(x) is the pdf of Lognormal(µ, σ) for −∞ < µ < ∞ and σ > 0, and f3(x) is the pdf of Pareto(α, θ2) for α, θ2 > 0, where φ, τ, µ, σ, and α are the model parameters. The weights w1 and w2 determine the total probability of each segment. The thresholds θ1 and θ2 are the points at which the Weibull and lognormal distributions are truncated, and they represent the splitting points between the three data ranges. We refer to this model as the Weibull-lognormal-Pareto model.
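For implementation, the three component densities map directly onto standard parametric families. The sketch below is our own illustration (not code from the paper): it expresses f1, f2, and f3 through scipy.stats, with arbitrary illustrative parameter values, and cross-checks the scipy parameterisation against the formulas above.

```python
import numpy as np
from scipy import stats

# Paper's parameterisation: f1 = Weibull(phi, tau), f2 = Lognormal(mu, sigma),
# f3 = Pareto(alpha, theta2).  Illustrative values only.
phi, tau = 2.0, 1.5        # Weibull scale and shape
mu, sigma = 1.0, 0.6       # lognormal log-mean and log-sd
alpha, theta2 = 1.8, 5.0   # Pareto tail index and lower bound

def f1_pdf(x):
    # tau * x^(tau-1) / phi^tau * exp(-x^tau / phi^tau)
    return stats.weibull_min.pdf(x, c=tau, scale=phi)

def f2_pdf(x):
    # 1 / (x * sigma * sqrt(2*pi)) * exp(-(ln x - mu)^2 / (2*sigma^2))
    return stats.lognorm.pdf(x, s=sigma, scale=np.exp(mu))

def f3_pdf(x):
    # alpha * theta2^alpha / x^(alpha+1), for x > theta2
    return stats.pareto.pdf(x, b=alpha, scale=theta2)

# Cross-check against the formulas written out explicitly
x = np.array([0.5, 1.0, 3.0, 7.0])
manual_f1 = tau * x**(tau - 1) / phi**tau * np.exp(-x**tau / phi**tau)
manual_f2 = np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi))
assert np.allclose(f1_pdf(x), manual_f1)
assert np.allclose(f2_pdf(x), manual_f2)
assert np.isclose(f3_pdf(7.0), alpha * theta2**alpha / 7.0**(alpha + 1))
```

The same mapping applies wherever these components appear in the composite densities below.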
In line with previous authors including Cooray and Ananda (2005), two continuity conditions f(θ1−) = f(θ1+) and f(θ2−) = f(θ2+), and also two differentiability conditions f′(θ1−) = f′(θ1+) and f′(θ2−) = f′(θ2+), are imposed at the two thresholds. It can be deduced that the former leads to the two equations below for the weights:

$$w_1=w_2\,\frac{\left(1-\exp\left(-\theta_1^{\tau}/\phi^{\tau}\right)\right)\phi^{\tau}\exp\left(\dfrac{\theta_1^{\tau}}{\phi^{\tau}}-\dfrac{(\ln\theta_1-\mu)^2}{2\sigma^2}\right)}{\theta_1^{\tau}\,\tau\,\sigma\sqrt{2\pi}\left(\Phi\left(\dfrac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\left(\dfrac{\ln\theta_1-\mu}{\sigma}\right)\right)},$$

$$w_1=1-w_2\left(1+\frac{\exp\left(-\dfrac{(\ln\theta_2-\mu)^2}{2\sigma^2}\right)}{\sigma\alpha\sqrt{2\pi}\left(\Phi\left(\dfrac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\left(\dfrac{\ln\theta_1-\mu}{\sigma}\right)\right)}\right),$$

while the latter leads to the two conditions:

$$\frac{\theta_1^{\tau}}{\phi^{\tau}}=1+\frac{\ln\theta_1-\mu}{\tau\sigma^2},\qquad\frac{\ln\theta_2-\mu}{\sigma^2}=\alpha.$$
Because of these four relationships, there are effectively five unknown parameters, namely τ, σ, α, θ1, and θ2, with the others φ, µ, w1, and w2 expressed as functions of these parameters. As in all the previous works on composite models, the second derivative requirement is not imposed here because it often leads to inconsistent parameter constraints.
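To make the parameter reduction concrete, the following sketch (ours, with arbitrary illustrative values for the five free parameters) recovers φ, µ, w1, and w2 from τ, σ, α, θ1, and θ2 using the four relationships, and verifies numerically that the resulting density is continuous at both thresholds and integrates to one.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

# Free parameters of the Weibull-lognormal-Pareto model (illustrative values)
tau, sigma, alpha, theta1, theta2 = 1.5, 0.6, 1.5, 2.0, 5.0

# The two differentiability conditions give mu and phi directly
mu = np.log(theta2) - alpha * sigma**2                    # (ln th2 - mu)/sigma^2 = alpha
phi = (theta1**tau / (1 + (np.log(theta1) - mu) / (tau * sigma**2)))**(1 / tau)

# The two continuity conditions then pin down the weights w1 and w2
dPhi = norm.cdf((np.log(theta2) - mu) / sigma) - norm.cdf((np.log(theta1) - mu) / sigma)
z1 = (theta1 / phi)**tau
A = (1 - np.exp(-z1)) * phi**tau * np.exp(z1 - (np.log(theta1) - mu)**2 / (2 * sigma**2)) \
    / (theta1**tau * tau * sigma * np.sqrt(2 * np.pi) * dPhi)     # w1 = A * w2
c = np.exp(-(np.log(theta2) - mu)**2 / (2 * sigma**2)) \
    / (sigma * alpha * np.sqrt(2 * np.pi) * dPhi)                 # w1 = 1 - w2 * (1 + c)
w2 = 1 / (A + 1 + c)
w1 = A * w2

def pdf(x):
    if x <= theta1:
        return w1 * (tau * x**(tau - 1) / phi**tau * np.exp(-(x / phi)**tau)) / (1 - np.exp(-z1))
    if x <= theta2:
        return w2 * np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi)) / dPhi
    return (1 - w1 - w2) * alpha * theta2**alpha / x**(alpha + 1)

# The density joins continuously at both thresholds ...
eps = 1e-9
assert np.isclose(pdf(theta1 - eps), pdf(theta1 + eps), rtol=1e-6)
assert np.isclose(pdf(theta2 - eps), pdf(theta2 + eps), rtol=1e-6)
# ... and integrates to one (the Pareto piece contributes exactly 1 - w1 - w2)
total = quad(pdf, 0, theta1)[0] + quad(pdf, theta1, theta2)[0] + (1 - w1 - w2)
assert np.isclose(total, 1.0, atol=1e-6)
```

Note that the chosen free parameters must keep φ, w1, and w2 positive; not every combination of τ, σ, α, θ1, and θ2 is admissible.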
One can readily derive that the kth moment of X is given as follows (see Appendix A):

$$\mathrm{E}\left[X^k\right]=w_1\,\frac{\phi^k\,\gamma\left(\frac{k}{\tau}+1,\frac{\theta_1^{\tau}}{\phi^{\tau}}\right)}{1-\exp\left(-\theta_1^{\tau}/\phi^{\tau}\right)}+w_2\,\frac{\exp\left(\mu k+\frac{1}{2}\sigma^2k^2\right)\left(\Phi\left(\frac{\ln\theta_2-\mu-\sigma^2k}{\sigma}\right)-\Phi\left(\frac{\ln\theta_1-\mu-\sigma^2k}{\sigma}\right)\right)}{\Phi\left(\frac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\left(\frac{\ln\theta_1-\mu}{\sigma}\right)}+(1-w_1-w_2)\,\frac{\alpha\theta_2^k}{\alpha-k},$$

in which $\gamma(s,z)=\int_0^z t^{s-1}\exp(-t)\,dt$ is the lower incomplete gamma function and α > k.
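The moment formula can be checked against direct numerical integration. The sketch below is ours, with freely chosen weights and illustrative parameters (each term is simply a weight times a truncated component moment, so the identity does not require the continuity conditions to hold); it verifies the formula for k = 1 and k = 2.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as Gamma, gammainc
from scipy.stats import norm

# Illustrative parameters; alpha > 2 so that the first two moments exist
tau, phi, mu, sigma, alpha = 1.4, 2.0, 1.0, 0.5, 2.5
theta1, theta2, w1, w2 = 1.0, 6.0, 0.2, 0.6

z1 = (theta1 / phi)**tau
dPhi = norm.cdf((np.log(theta2) - mu) / sigma) - norm.cdf((np.log(theta1) - mu) / sigma)

def lower_inc_gamma(s, z):
    # gamma(s, z); scipy's gammainc is the regularised version gamma(s, z) / Gamma(s)
    return gammainc(s, z) * Gamma(s)

def moment_closed_form(k):
    weib = w1 * phi**k * lower_inc_gamma(k / tau + 1, z1) / (1 - np.exp(-z1))
    logn = w2 * np.exp(mu * k + 0.5 * sigma**2 * k**2) * (
        norm.cdf((np.log(theta2) - mu - sigma**2 * k) / sigma)
        - norm.cdf((np.log(theta1) - mu - sigma**2 * k) / sigma)) / dPhi
    par = (1 - w1 - w2) * alpha * theta2**k / (alpha - k)   # requires alpha > k
    return weib + logn + par

def pdf(x):
    if x <= theta1:
        return w1 * tau * x**(tau - 1) / phi**tau * np.exp(-(x / phi)**tau) / (1 - np.exp(-z1))
    if x <= theta2:
        return w2 * np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi)) / dPhi
    return (1 - w1 - w2) * alpha * theta2**alpha / x**(alpha + 1)

for k in (1, 2):
    numeric = sum(quad(lambda x: x**k * pdf(x), a, b, limit=200)[0]
                  for a, b in [(0, theta1), (theta1, theta2), (theta2, np.inf)])
    assert np.isclose(moment_closed_form(k), numeric, rtol=1e-6)
```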
3. Weibull-Lognormal-GPD Model
Similarly, we construct the Weibull-lognormal-GPD model as

$$f(x)=\begin{cases}\dfrac{w_1}{1-\exp\left(-\theta_1^{\tau}/\phi^{\tau}\right)}\,\dfrac{\tau x^{\tau-1}}{\phi^{\tau}}\exp\left(-\dfrac{x^{\tau}}{\phi^{\tau}}\right) & \text{for } 0<x\le\theta_1\\[2mm]\dfrac{w_2}{\Phi\left(\frac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\left(\frac{\ln\theta_1-\mu}{\sigma}\right)}\,\dfrac{1}{x\sigma\sqrt{2\pi}}\exp\left(-\dfrac{(\ln x-\mu)^2}{2\sigma^2}\right) & \text{for } \theta_1<x\le\theta_2\\[2mm](1-w_1-w_2)\,\dfrac{\alpha(\lambda+\theta_2)^{\alpha}}{(\lambda+x)^{\alpha+1}} & \text{for } \theta_2<x<\infty\end{cases}.\tag{2}$$
Note that we use the GPD version as in Scollnik (2007), and that α, λ, θ2 > 0. Under the continuity and differentiability conditions, the weights are determined as follows:

$$w_1=w_2\,\frac{\left(1-\exp\left(-\theta_1^{\tau}/\phi^{\tau}\right)\right)\phi^{\tau}\exp\left(\dfrac{\theta_1^{\tau}}{\phi^{\tau}}-\dfrac{(\ln\theta_1-\mu)^2}{2\sigma^2}\right)}{\theta_1^{\tau}\,\tau\,\sigma\sqrt{2\pi}\left(\Phi\left(\dfrac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\left(\dfrac{\ln\theta_1-\mu}{\sigma}\right)\right)},$$

$$w_1=1-w_2\left(1+\frac{(\lambda+\theta_2)\exp\left(-\dfrac{(\ln\theta_2-\mu)^2}{2\sigma^2}\right)}{\theta_2\,\sigma\alpha\sqrt{2\pi}\left(\Phi\left(\dfrac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\left(\dfrac{\ln\theta_1-\mu}{\sigma}\right)\right)}\right),$$

$$\frac{\ln\theta_2-\mu}{\sigma^2}=\frac{\theta_2\alpha-\lambda}{\theta_2+\lambda}.$$
There are six effective model parameters, namely τ, σ, α, λ, θ1, and θ2, with the others φ, µ, w1, and w2 given as functions of these parameters. For a positive integer k with α > k, the kth moment of X is equal to

$$\mathrm{E}\left[X^k\right]=w_1\,\frac{\phi^k\,\gamma\left(\frac{k}{\tau}+1,\frac{\theta_1^{\tau}}{\phi^{\tau}}\right)}{1-\exp\left(-\theta_1^{\tau}/\phi^{\tau}\right)}+w_2\,\frac{\exp\left(\mu k+\frac{1}{2}\sigma^2k^2\right)\left(\Phi\left(\frac{\ln\theta_2-\mu-\sigma^2k}{\sigma}\right)-\Phi\left(\frac{\ln\theta_1-\mu-\sigma^2k}{\sigma}\right)\right)}{\Phi\left(\frac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\left(\frac{\ln\theta_1-\mu}{\sigma}\right)}+(1-w_1-w_2)\,\alpha\sum_{j=0}^{k}\binom{k}{j}\frac{(\lambda+\theta_2)^j(-\lambda)^{k-j}}{\alpha-j}.$$
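The tail term of this formula can be verified numerically. The following sketch (ours, with arbitrary illustrative values of α, λ, and θ2) compares the binomial-sum expression for the moment of the truncated GPD component against direct integration of its density.

```python
import math
import numpy as np
from scipy.integrate import quad

# Truncated GPD tail density: alpha * (lam + theta2)^alpha / (lam + x)^(alpha + 1) for x > theta2
alpha, lam, theta2 = 2.5, 1.0, 3.0

def gpd_tail_moment(k):
    # alpha * sum_j C(k, j) * (lam + theta2)^j * (-lam)^(k - j) / (alpha - j), for alpha > k
    return alpha * sum(math.comb(k, j) * (lam + theta2)**j * (-lam)**(k - j) / (alpha - j)
                       for j in range(k + 1))

def tail_pdf(x):
    return alpha * (lam + theta2)**alpha / (lam + x)**(alpha + 1)

for k in (1, 2):
    numeric = quad(lambda x: x**k * tail_pdf(x), theta2, np.inf)[0]
    assert np.isclose(gpd_tail_moment(k), numeric, rtol=1e-6)
```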
4. Weibull-Lognormal-Burr Model
Lastly, we define the Weibull-lognormal-Burr model as

$$f(x)=\begin{cases}\dfrac{w_1}{1-\exp\left(-\theta_1^{\tau}/\phi^{\tau}\right)}\,\dfrac{\tau x^{\tau-1}}{\phi^{\tau}}\exp\left(-\dfrac{x^{\tau}}{\phi^{\tau}}\right) & \text{for } 0<x\le\theta_1\\[2mm]\dfrac{w_2}{\Phi\left(\frac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\left(\frac{\ln\theta_1-\mu}{\sigma}\right)}\,\dfrac{1}{x\sigma\sqrt{2\pi}}\exp\left(-\dfrac{(\ln x-\mu)^2}{2\sigma^2}\right) & \text{for } \theta_1<x\le\theta_2\\[2mm](1-w_1-w_2)\,\dfrac{\alpha\gamma x^{\gamma-1}/\beta^{\gamma}}{\left(1+x^{\gamma}/\beta^{\gamma}\right)^{\alpha+1}}\left(\dfrac{\beta^{\gamma}+\theta_2^{\gamma}}{\beta^{\gamma}}\right)^{\alpha} & \text{for } \theta_2<x<\infty\end{cases}.\tag{3}$$
For α, β, γ, θ2 > 0, the Burr distribution is truncated from below. Again, the continuity and differentiability conditions lead to the following equations for the weights:

$$w_1=w_2\,\frac{\left(1-\exp\left(-\theta_1^{\tau}/\phi^{\tau}\right)\right)\phi^{\tau}\exp\left(\dfrac{\theta_1^{\tau}}{\phi^{\tau}}-\dfrac{(\ln\theta_1-\mu)^2}{2\sigma^2}\right)}{\theta_1^{\tau}\,\tau\,\sigma\sqrt{2\pi}\left(\Phi\left(\dfrac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\left(\dfrac{\ln\theta_1-\mu}{\sigma}\right)\right)},$$

$$w_1=1-w_2\left(1+\frac{\left(\theta_2^{\gamma}+\beta^{\gamma}\right)\exp\left(-\dfrac{(\ln\theta_2-\mu)^2}{2\sigma^2}\right)}{\theta_2^{\gamma}\,\sigma\alpha\gamma\sqrt{2\pi}\left(\Phi\left(\dfrac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\left(\dfrac{\ln\theta_1-\mu}{\sigma}\right)\right)}\right),$$

$$\frac{\ln\theta_2-\mu}{\sigma^2}=\frac{\theta_2^{\gamma}(\alpha+1)\gamma}{\theta_2^{\gamma}+\beta^{\gamma}}-\gamma.$$
There are effectively seven model parameters to be estimated, namely τ, σ, α, β, γ, θ1, and θ2. The others φ, µ, w1, and w2 are derived from these parameters. The kth moment of X is computed as

$$\mathrm{E}\left[X^k\right]=w_1\,\frac{\phi^k\,\gamma\left(\frac{k}{\tau}+1,\frac{\theta_1^{\tau}}{\phi^{\tau}}\right)}{1-\exp\left(-\theta_1^{\tau}/\phi^{\tau}\right)}+w_2\,\frac{\exp\left(\mu k+\frac{1}{2}\sigma^2k^2\right)\left(\Phi\left(\frac{\ln\theta_2-\mu-\sigma^2k}{\sigma}\right)-\Phi\left(\frac{\ln\theta_1-\mu-\sigma^2k}{\sigma}\right)\right)}{\Phi\left(\frac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\left(\frac{\ln\theta_1-\mu}{\sigma}\right)}+(1-w_1-w_2)\,\frac{\alpha\beta^k\,B\left(\frac{\beta^{\gamma}}{\beta^{\gamma}+\theta_2^{\gamma}};\,\alpha-\frac{k}{\gamma},\,1+\frac{k}{\gamma}\right)}{\left(\frac{\beta^{\gamma}}{\beta^{\gamma}+\theta_2^{\gamma}}\right)^{\alpha}},$$

in which $B(z;a,b)=\int_0^z t^{a-1}(1-t)^{b-1}\,dt$ is the incomplete beta function and α > k/γ.
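The incomplete beta term for the truncated Burr tail can likewise be verified numerically. The sketch below (ours, with arbitrary illustrative parameter values) compares it against direct integration of the tail density; note that scipy's betainc is the regularised incomplete beta function.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import betainc, beta as Beta

# Truncated Burr tail: alpha*gamma*x^(gamma-1)/beta^gamma * (1 + x^gamma/beta^gamma)^(-(alpha+1)),
# renormalised over x > theta2.  Illustrative values with alpha > k/gamma for k = 1, 2.
alpha, beta_, gamma_, theta2 = 2.0, 3.0, 1.5, 4.0

def inc_beta(z, a, b):
    # B(z; a, b); scipy's betainc(a, b, z) is B(z; a, b) / B(a, b)
    return betainc(a, b, z) * Beta(a, b)

def burr_tail_moment(k):
    z = beta_**gamma_ / (beta_**gamma_ + theta2**gamma_)
    return alpha * beta_**k * inc_beta(z, alpha - k / gamma_, 1 + k / gamma_) / z**alpha

def tail_pdf(x):
    surv = (beta_**gamma_ / (beta_**gamma_ + theta2**gamma_))**alpha
    dens = alpha * gamma_ * x**(gamma_ - 1) / beta_**gamma_ \
           * (1 + x**gamma_ / beta_**gamma_)**(-(alpha + 1))
    return dens / surv

for k in (1, 2):
    numeric = quad(lambda x: x**k * tail_pdf(x), theta2, np.inf)[0]
    assert np.isclose(burr_tail_moment(k), numeric, rtol=1e-6)
```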
Figure 2 gives a graphical illustration of the three new composite models. All the
graphs are based on the values of w1 = 0.2 and w2 = 0.6, that is, the expected proportions
of small, medium, and large claims are 20%, 60%, and 20%, respectively. For illustration
purposes, the parameters are arbitrarily chosen such that each set gives rise to exactly the
same expected proportions of the three claim sizes. For the case in the top panel, which
has similar Weibull and lognormal parameters and the same weights amongst the three
models, the Pareto tail is heavier than the GPD tail, followed by the Burr one. In the bottom
panel, while all the three Weibull-lognormal-Pareto models have the same component
weights, the differences in the parameter values can generate very different shapes and
tails of the densities. The three-component composite models can provide much flexibility
for modelling individual claims of different lines of business.
Figure 2. Examples of density functions of three-component composite models with weights of 20%, 60%, and 20%, respectively.
generally correspond to the MLE estimates. For each MCMC chain, we omit the first
5000 iterations and collect 5000 samples afterwards. Since the estimated Monte Carlo
errors are all well within 5% of the sample posterior standard deviations, the level of
convergence to the stationary distribution is considered adequate in our analysis. Some
JAGS outputs of MCMC simulation are provided in the Appendix A. We employ the “ones
trick” (Spiegelhalter et al. 2003) to specify the new models in JAGS. The Bayesian estimates
provide a useful reference for checking the MLE estimates. Despite the major differences in their underlying theories, their numerical results are expected to be reasonably close here, as we use non-informative priors, so that the posterior is driven largely by the data rather than by the prior. Since the posterior distribution of the unknown parameters of the proposed models is analytically intractable, MCMC simulation is a useful method for approximating it (Li 2014).
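The burn-in and sampling scheme described above can be illustrated with a minimal random-walk Metropolis sampler on a toy posterior. This is our own sketch in Python, not the JAGS code used in the analysis; the standard normal target merely stands in for the actual posterior of the composite model parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_post(x):
    # Toy log-posterior: standard normal (a stand-in for the real model's posterior)
    return -0.5 * x**2

# Random-walk Metropolis: discard the first 5000 iterations as burn-in and
# collect the next 5000 samples, mirroring the scheme described in the text.
x, samples = 0.0, []
for i in range(10000):
    prop = x + rng.normal(scale=1.0)
    if np.log(rng.uniform()) < log_post(prop) - log_post(x):
        x = prop
    if i >= 5000:
        samples.append(x)

samples = np.array(samples)
# For this toy target, the retained samples should have mean near 0 and sd near 1
assert abs(samples.mean()) < 0.2
assert 0.8 < samples.std() < 1.2
```

In practice, convergence diagnostics such as history plots, autocorrelation plots, and Monte Carlo errors (as used in the paper) should be examined on the retained samples.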
Table 1 reports the negative log-likelihood (NLL), AIC, Bayesian Information Criterion (BIC), Kolmogorov-Smirnov (KS) test statistic, and Deviance Information Criterion (DIC) values for the 14 models tested. The ranking of each model under each test is given in
brackets, in which the top three performers are highlighted for each test. Overall, the
Weibull-lognormal-Pareto model appears to provide the best fit, with the lowest AIC, BIC,
and DIC values and the second lowest NLL and KS values. The second position is taken
by the Weibull-lognormal-GPD model, which produces the lowest NLL and KS values
and the second (third) lowest AIC (DIC). The Weibull-lognormal-Burr and Weibull-Burr
models come next, each of which occupies at least two top-three positions. Apparently,
the new three-component composite models outperform the traditional models as well as
the earlier two-component composite models. The P–P (probability–probability) plots in
Figure 3 indicate clearly that the new models describe the data very well. Recently, Grün
and Miljkovic (2019) tested 16 × 16 = 256 two-component models on the same Danish
data set, using a numerical method (via numDeriv in R) to find the derivatives for the
differentiability condition rather than deriving the derivatives from first principles as in
the usual way. Based on their reported results, the Weibull-Inverse-Weibull model gives
the lowest BIC (7671.30), and the Paralogistic-Burr and Inverse-Burr-Burr models give the
lowest KS test values (0.015). Comparatively, as shown in Table 1, the Weibull-lognormal-
Pareto model produces a lower BIC (7670.88), and all three new composite models give lower KS values (around 0.011), which are smaller than the critical value at the 5% significance level, implying that the null hypothesis is not rejected.
Table 2 compares the fitted model quantiles (from MLE) against the empirical quantiles. It can be seen that the differences between them are generally small. This result conforms with the P–P plots in Figure 3. Note that the estimated weights of the three-component composite models are about w1 = 0.08 and w2 = 0.54. These estimates suggest that the claim amounts can be split into three categories of small, medium, and large sizes, with expected proportions of 8%, 54%, and 38%. For pricing, reserving, and reinsurance purposes, the three groups of claims may further be studied separately, possibly with different sets of covariates where feasible, as they may have different underlying driving factors (especially for long-tailed lines of business).
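Fitted quantiles such as those in Table 2, and the tail quantiles used later as risk measures, can be obtained by numerically inverting the composite distribution function. The sketch below is ours and uses illustrative Weibull-lognormal-Pareto parameter values rather than the fitted estimates.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

# Illustrative Weibull-lognormal-Pareto parameters (not the fitted estimates);
# the weights roughly echo the small/medium/large split discussed in the text.
tau, phi, mu, sigma, alpha = 1.4, 2.0, 1.0, 0.5, 1.8
theta1, theta2, w1, w2 = 1.0, 6.0, 0.08, 0.54

z1 = (theta1 / phi)**tau
Phi1 = norm.cdf((np.log(theta1) - mu) / sigma)
dPhi = norm.cdf((np.log(theta2) - mu) / sigma) - Phi1

def cdf(x):
    if x <= 0:
        return 0.0
    if x <= theta1:
        return w1 * (1 - np.exp(-(x / phi)**tau)) / (1 - np.exp(-z1))
    if x <= theta2:
        return w1 + w2 * (norm.cdf((np.log(x) - mu) / sigma) - Phi1) / dPhi
    return 1 - (1 - w1 - w2) * (theta2 / x)**alpha

def quantile(p):
    # Bracket the root; the Pareto tail guarantees cdf -> 1, so expand upward
    hi = theta2
    while cdf(hi) < p:
        hi *= 2
    return brentq(lambda x: cdf(x) - p, 1e-12, hi)

var99 = quantile(0.99)          # 99% VaR = 99th percentile of the fitted model
assert np.isclose(cdf(var99), 0.99, atol=1e-8)
assert var99 > theta2           # with these weights, the 99th percentile sits in the tail
```

The same inversion applied to each candidate model gives the VaR comparisons used in the 3D map later in the paper.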
Table 2. Empirical and fitted composite model quantiles for Danish fire insurance claims data.
Table 3 lists the parameter estimates of the three-component composite models ob-
tained from the MLE method and also the Bayesian MCMC method. It is reassuring to see
that not only the MLE estimates and the Bayesian estimates but also their corresponding
standard errors and posterior standard deviations are fairly consistent with one another
in general. A few exceptions include λ and β, which may suggest that these parameter
estimates are not as robust and are less significant. This implication is in line with the fact
that the Weibull-lognormal-GPD and Weibull-lognormal-Burr models are only the second
and third best models for this Danish data set.
Table 3. Parameter estimates of fitting three-component composite models to Danish fire insurance
claims data.
Table 5. Empirical and fitted composite model quantiles for vehicle insurance claims data.
Blostein and Miljkovic (2019) proposed a grid map as a risk management tool for risk managers to consider the trade-off between the best model based on the AIC or BIC and the risk measure. It covers the entire space of models under consideration, and allows one to have a comprehensive view of the different outcomes under different models. In Figure 4, we extend this grid map idea into a 3D map, considering more than just one model selection criterion. It can serve as a summary of the tail risk measures given by the 14 models being tested, comparing the tail estimates between the best models and the other models under two chosen statistical criteria. For both data sets, it is informative to see that the 99% value-at-risk (VaR) estimates are robust amongst the few best model candidates, while there is a range of outcomes for the other less than optimal models (the 99% VaR is calculated as the 99th percentile based on the fitted model). It appears that the risk measure estimates become more and more stable and consistent as we move to progressively better performing models. This 3D map can be seen as a new risk management tool and it would be useful for risk managers to have an overview of the whole model space and examine how the selection criteria would affect the resulting assessment of the risk. In particular, in many other modelling cases, there could be several equally well-performing models which produce significantly different risk measures, and this tool can provide a clear illustration for more informed model selection. Note that other risk measures and selection criteria than those in Figure 4 can be adopted in a similar way.

Figure 4. 3D map of 14 models' 99% VaR estimates against BIC and KS values for Danish fire insurance claims data (left) and vehicle insurance claims data (right). The three major categories are annotated as traditional models (triangles), two-component composite models (empty circles), and new three-component composite models (solid circles).
the various sources or reasons behind the claims, and it is very important to take into
account these subtle discrepancies in order to obtain a more accurate price on the risk. A
final note is that while θ2 = 4.637 remains about the same level after embedding regression,
θ1 has increased to 0.734 (when compared to Table 6). The inclusion of the explanatory
variables has led to a larger allocation to the Weibull component but a smaller allocation to
the lognormal component.
Table 7. Parameter estimates and standard errors of fitting Weibull-lognormal-GPD regression model
to vehicle insurance claims data with covariates.
| Model Component | Covariate | Estimate | Standard Error | t-Ratio | p-Value |
|---|---|---|---|---|---|
| Weibull Component (small claims) | Intercept | 0.850 | 0.644 | 1.32 | 0.19 |
| | Exposure | −0.085 | 0.055 | −1.54 | 0.12 |
| | Vehicle Age | 0.233 | 0.253 | 0.92 | 0.36 |
| | Driver Age | 1.921 | 0.013 | 143.55 | 0.00 |
| | Gender | −0.012 | 0.009 | −1.29 | 0.20 |
| Lognormal Component (medium claims) | Intercept | −57.411 | 26.128 | −2.20 | 0.03 |
| | Exposure | 8.179 | 26.821 | 0.30 | 0.76 |
| | Vehicle Age | 7.670 | 5.425 | 1.41 | 0.16 |
| | Driver Age | −5.023 | 4.670 | −1.08 | 0.28 |
| | Gender | −1.221 | 11.118 | −0.11 | 0.91 |
| GPD Component (large claims) | Intercept | 2.269 | 0.192 | 11.80 | 0.00 |
| | Exposure | −1.028 | 0.186 | −5.52 | 0.00 |
| | Vehicle Age | −0.116 | 0.043 | −2.71 | 0.01 |
| | Driver Age | −0.049 | 0.023 | −2.08 | 0.04 |
| | Gender | 0.275 | 0.074 | 3.72 | 0.00 |

Note: The figures are produced from the authors' calculations.
As a whole, it is interesting to see the gradual development over time in the area
of modelling individual claim amounts. As illustrated in Tables 1 and 4, the simple
models (Weibull, lognormal, Pareto) fail to capture the important features of the complete
data set when its size is large. More general models with additional parameters and so
more flexibility (Burr, GB2) are then explored as an alternative, which does bring some
improvement over the simple models. The two-component composite lognormal-kind
models represent a significant innovation in combining two distinct densities, though
these models do not always lead to obvious improvement over traditional three- and
four-parameter distributions. Later, some studies showed that two-component composite
Weibull-, Paralogistic-, and Inverse-Burr-kind models can produce better fitting results. In
the present work, we take a step ahead and demonstrate that a three-component composite
model, with the Weibull for small claims, lognormal for moderate claims, and a heavy
tail for large claims, can further improve the fitting performance. Moreover, based on the
estimated parameters, there is a rather objective guide for splitting the claims into different
groups, which can then be analysed separately for their own underlying features (e.g.,
Cebrián et al. 2003). This kind of separate analysis is particularly important for some long-tailed lines of business, such as public and product liability, for which certain large claims can be delayed significantly due to specific legal reasons. Note that the previous two-component
composite models, when fitted to the two insurance data sets, suggest a split at around
the 10% quantile, which is in line with the estimated values of w1 reported earlier. The
proposed three-component composite models can make a further differentiation between
moderate and large claim sizes.
6. Concluding Remarks
We have constructed three new composite models for modelling individual claims in
general insurance. All our models are composed of a Weibull distribution for the smallest
claims, a lognormal distribution for the moderate claims, and a long-tailed distribution for
the largest claims. Under each proposed model, we treat four of the parameters as functions
of the other parameters. We have applied these models to two real-world insurance data
sets of fire claims and vehicle claims, via both maximum likelihood and Bayesian estimation
methods. Based on standard statistical criteria, the proposed three-component composite
models are shown to outperform the earlier two-component composite models. We have
also devised a 3D map for analysing the impact of selection criteria on the resulting risk
measures, and experimented with applying regression under a three-component composite
model, from which the effects of different covariates on different claim sizes are illustrated
and compared. Note that inflation has been very high in recent years, and can have a
serious impact on the claim sizes. Accordingly, it is advisable to adjust recent claim sizes
with suitable inflation indices before the claims modelling, similar to the Danish data set.
There are a few areas that would require more investigation. For the two data sets
considered, each of which has a few thousand observations, it appears that three distinct
components are adequate to describe the major data patterns. For other much larger data
sets, however, we conjecture that an incorporation of more than three components can
become an optimal choice. Additionally, if the data set is sufficiently large, clustering
techniques can be applied, and the corresponding results can be compared to those of the
proposed approach. When clustering methods are used, the next step is to fit a distribution
or multiple distributions to different claim sizes, while our proposed approach has the
convenience of performing both in one single step. Moreover, we select the Weibull and
then lognormal distributions because of their suitability for the smallest and medium-sized
claims, as shown and discussed earlier, and the fact that they have been the common choices
in the existing two-component composite models. While we use these two distributions
as the base for the first two components, it may be worthwhile to test other distributions
instead and see whether they can significantly improve the fitting performance. Finally, as in
Pigeon and Denuit (2011), heterogeneity of the two threshold parameters can be introduced
by setting appropriate mixing distributions. In this way, the threshold parameters are
allowed to differ between observations. There are also other interesting and related studies
such as those of Frees et al. (2016), Millennium and Kusumawati (2022), and Poufinas et al.
(2023).
Author Contributions: Methodology, J.L. (Jackie Li) and J.L. (Jia Liu); Formal analysis, J.L. (Jackie Li)
and J.L. (Jia Liu); Writing—original draft, J.L. (Jackie Li) and J.L. (Jia Liu). All authors have read and
agreed to the published version of the manuscript.
Funding: This research received no external funding.
Data Availability Statement: The data used are publicly available, as in the links provided.
Conflicts of Interest: The authors declare no conflict of interest.
Appendix A
As shown in the example below, given the same mean and variance, the limiting density ratio of Weibull to lognormal tends to zero as x approaches infinity. This indicates that the lognormal distribution has a heavier tail than the Weibull distribution.
Risks 2023, 11, 196 14 of 16
Figure A1. Density ratios of Weibull (3.8511, 0.7717) to Lognormal (1, 1) (left graph) and Weibull (0.7071, 2) to Lognormal (−0.5881, 0.4915) (right graph).
The kth raw moment of the Weibull-lognormal-Pareto composite model is

$$
\begin{aligned}
E\!\left[X^k\right]
&= w_1 \int_0^{\theta_1} x^k \,
\frac{\dfrac{\tau x^{\tau-1}}{\phi^\tau}\exp\!\left(-\dfrac{x^\tau}{\phi^\tau}\right)}
{1-\exp\!\left(-\dfrac{\theta_1^{\,\tau}}{\phi^\tau}\right)} \, dx
+ w_2 \int_{\theta_1}^{\theta_2} x^k \,
\frac{\dfrac{1}{x\sigma\sqrt{2\pi}}\exp\!\left(-\dfrac{(\ln x-\mu)^2}{2\sigma^2}\right)}
{\Phi\!\left(\dfrac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\!\left(\dfrac{\ln\theta_1-\mu}{\sigma}\right)} \, dx \\
&\quad + (1-w_1-w_2) \int_{\theta_2}^{\infty} x^k \, \frac{\alpha\,\theta_2^{\,\alpha}}{x^{\alpha+1}} \, dx \\
&= w_1 \frac{\phi^k\,\gamma\!\left(\dfrac{k}{\tau}+1,\ \dfrac{\theta_1^{\,\tau}}{\phi^\tau}\right)}
{1-\exp\!\left(-\dfrac{\theta_1^{\,\tau}}{\phi^\tau}\right)}
+ w_2 \exp\!\left(\mu k+\tfrac{1}{2}\sigma^2 k^2\right)
\frac{\Phi\!\left(\dfrac{\ln\theta_2-\mu-\sigma^2 k}{\sigma}\right)-\Phi\!\left(\dfrac{\ln\theta_1-\mu-\sigma^2 k}{\sigma}\right)}
{\Phi\!\left(\dfrac{\ln\theta_2-\mu}{\sigma}\right)-\Phi\!\left(\dfrac{\ln\theta_1-\mu}{\sigma}\right)} \\
&\quad + (1-w_1-w_2)\,\frac{\alpha\,\theta_2^{\,k}}{\alpha-k},
\end{aligned}
$$

where $\gamma(\cdot,\cdot)$ denotes the lower incomplete gamma function, $\Phi(\cdot)$ the standard normal distribution function, and $k<\alpha$ is required for the Pareto term to be finite.
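The closed form can be verified against direct numerical integration of the truncated component densities. The sketch below is illustrative only: the parameter values are arbitrary (not fitted to either data set), and it assumes k < α so that the Pareto term is finite.

```python
import math

from scipy import integrate, special, stats

# Illustrative (not fitted) parameters of the Weibull-lognormal-Pareto composite
w1, w2 = 0.3, 0.5            # component weights
tau, phi = 1.5, 1.0          # Weibull shape and scale
mu, sigma = 0.5, 0.8         # lognormal parameters
alpha = 3.0                  # Pareto tail index
theta1, theta2 = 1.0, 4.0    # thresholds between components
k = 1                        # moment order (requires k < alpha)

# Closed form, term by term
wb_norm = 1 - math.exp(-(theta1 / phi) ** tau)
# special.gammainc is the regularised lower incomplete gamma, so multiply by gamma()
term1 = w1 * phi ** k * special.gammainc(k / tau + 1, (theta1 / phi) ** tau) \
        * special.gamma(k / tau + 1) / wb_norm
ln_norm = (stats.norm.cdf((math.log(theta2) - mu) / sigma)
           - stats.norm.cdf((math.log(theta1) - mu) / sigma))
term2 = w2 * math.exp(mu * k + 0.5 * sigma ** 2 * k ** 2) * (
    stats.norm.cdf((math.log(theta2) - mu - sigma ** 2 * k) / sigma)
    - stats.norm.cdf((math.log(theta1) - mu - sigma ** 2 * k) / sigma)) / ln_norm
term3 = (1 - w1 - w2) * alpha * theta2 ** k / (alpha - k)
closed_form = term1 + term2 + term3

# Numerical check: integrate x^k against each truncated component density
f1 = lambda x: x ** k * tau * x ** (tau - 1) / phi ** tau * math.exp(-(x / phi) ** tau) / wb_norm
f2 = lambda x: x ** k * stats.lognorm.pdf(x, s=sigma, scale=math.exp(mu)) / ln_norm
f3 = lambda x: x ** k * alpha * theta2 ** alpha / x ** (alpha + 1)
numeric = (w1 * integrate.quad(f1, 0, theta1)[0]
           + w2 * integrate.quad(f2, theta1, theta2)[0]
           + (1 - w1 - w2) * integrate.quad(f3, theta2, math.inf)[0])

print(closed_form, numeric)  # the two values should agree closely
```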
The following plots show the JAGS outputs of MCMC simulation when fitting the Weibull-lognormal-Pareto model to the Danish data, using uninformative uniform priors. All the parameters τ, σ, α, θ1, and θ2 are included. For each parameter, the four graphs include the history plot, posterior distribution function, posterior density function (in histogram), and autocorrelation plot (between iterations). The history and autocorrelation plots strongly suggest that the level of convergence to the underlying stationary distribution is highly satisfactory (Spiegelhalter et al. 2003).
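To illustrate the autocorrelation diagnostic, the toy sketch below (hypothetical, not the paper's JAGS output) computes sample autocorrelations for a simulated AR(1) chain standing in for MCMC output; the rapid decay across lags is the pattern a well-mixing chain should display.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy AR(1) chain standing in for MCMC output; low persistence (rho = 0.3)
# mimics a well-mixing sampler with quickly decaying autocorrelation
rho = 0.3
chain = np.empty(10_000)
chain[0] = 0.0
for t in range(1, chain.size):
    chain[t] = rho * chain[t - 1] + rng.normal()

def autocorr(x, lag):
    """Sample autocorrelation of the series at the given lag."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

lags = [1, 5, 10, 20]
acf = [autocorr(chain, lag) for lag in lags]
print(dict(zip(lags, acf)))  # values decay towards zero as the lag grows
```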
Figure A2. History plot, posterior distribution function, posterior density function, and autocorrelation plot of Weibull-lognormal-Pareto model parameters for Danish fire insurance claims data. (The blue and purple lines represent two separate chains of simulations.)
Notes
1 The AIC is defined as −2l + 2n_p, and the BIC as −2l + n_p ln n_d, where l is the computed maximum log-likelihood value, n_p is the effective number of parameters estimated, and n_d is the number of observations. The KS test statistic is calculated as max|F_n(x) − F(x)|, that is, the maximum distance between the empirical and fitted distribution functions. The DIC is computed as the posterior mean of the deviance plus the effective number of parameters under the Bayesian framework (Spiegelhalter et al. 2003).
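As a hypothetical illustration of how these criteria are computed (a single lognormal fitted to simulated claims, not the paper's composite models or data sets):

```python
import numpy as np
from scipy import stats

# Simulated claim sizes (hypothetical data, for illustration only)
rng = np.random.default_rng(0)
claims = rng.lognormal(mean=1.0, sigma=0.8, size=500)

# Maximum likelihood fit of a single lognormal (location fixed at zero)
shape, loc, scale = stats.lognorm.fit(claims, floc=0)
loglik = np.sum(stats.lognorm.logpdf(claims, shape, loc, scale))

n_p = 2                       # effective number of estimated parameters
n_d = len(claims)             # number of observations
aic = -2 * loglik + 2 * n_p
bic = -2 * loglik + n_p * np.log(n_d)

# KS statistic: max distance between empirical and fitted distribution functions
ks = stats.kstest(claims, lambda x: stats.lognorm.cdf(x, shape, loc, scale)).statistic
print(aic, bic, ks)
```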
2 The link functions are ln φ = ρ1,0 + ρ1,1 x1 + ρ1,2 x2 + ρ1,3 x3 + ρ1,4 x4, µ = ρ2,0 + ρ2,1 x1 + ρ2,2 x2 + ρ2,3 x3 + ρ2,4 x4, and ln β = ρ3,0 + ρ3,1 x1 + ρ3,2 x2 + ρ3,3 x3 + ρ3,4 x4, where the ρ's are the regression coefficients and x1, x2, x3, x4 are the four covariates. We have checked the covariates in the data, and there is no multicollinearity issue.
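A minimal sketch of how such link functions map covariates into distribution parameters, with hypothetical coefficient values and simulated covariates (the log links keep the positive-valued parameters positive):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
# Design matrix: intercept plus the four covariates x1, ..., x4 (simulated here)
X = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])

# Hypothetical coefficient vectors rho_1, rho_2, rho_3, one per linked parameter
rho1 = np.array([0.2, 0.1, -0.3, 0.05, 0.4])
rho2 = np.array([1.0, 0.2, 0.1, -0.1, 0.3])
rho3 = np.array([-0.5, 0.3, 0.2, 0.1, -0.2])

phi = np.exp(X @ rho1)    # log link keeps the Weibull scale positive
mu = X @ rho2             # identity link for the lognormal location
beta = np.exp(X @ rho3)   # log link keeps the tail parameter positive
print(phi, mu, beta)
```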
References
Bakar, S. A. Abu, Nor A. Hamzah, Mastoureh Maghsoudi, and Saralees Nadarajah. 2015. Modeling loss data using composite models.
Insurance: Mathematics and Economics 61: 146–54.
Blostein, Martin, and Tatjana Miljkovic. 2019. On modeling left-truncated loss data using mixtures of distributions. Insurance:
Mathematics and Economics 85: 35–46. [CrossRef]
Calderín-Ojeda, Enrique, and Chun Fung Kwok. 2016. Modeling claims data with composite Stoppa models. Scandinavian Actuarial
Journal 2016: 817–36. [CrossRef]
Cebrián, Ana C., Michel Denuit, and Philippe Lambert. 2003. Generalized Pareto fit to the Society of Actuaries’ large claims database.
North American Actuarial Journal 7: 18–36. [CrossRef]
Cooray, Kahadawala, and Malwane M. A. Ananda. 2005. Modeling actuarial data with a composite lognormal-Pareto model.
Scandinavian Actuarial Journal 2005: 321–34. [CrossRef]
Dickson, David C. M. 2016. Insurance Risk and Ruin, 2nd ed. Cambridge: Cambridge University Press.
Dong, Alice X. D., and Jennifer S. K. Chan. 2013. Bayesian analysis of loss reserving using dynamic models with generalized beta
distribution. Insurance: Mathematics and Economics 53: 355–65. [CrossRef]
Frees, Edward W., Gee Lee, and Lu Yang. 2016. Multivariate frequency-severity regression models in insurance. Risks 4: 4. [CrossRef]
Grün, Bettina, and Tatjana Miljkovic. 2019. Extending composite loss models using a general framework of advanced computational
tools. Scandinavian Actuarial Journal 2019: 642–60. [CrossRef]
Laudagé, Christian, Sascha Desmettre, and Jörg Wenzel. 2019. Severity modeling of extreme insurance claims for tariffication. Insurance:
Mathematics and Economics 88: 77–92. [CrossRef]
Li, Jackie. 2014. A quantitative comparison of simulation strategies for mortality projection. Annals of Actuarial Science 8: 281–97.
[CrossRef]
McNeil, Alexander J. 1997. Estimating the tails of loss severity distributions using extreme value theory. ASTIN Bulletin 27: 117–37.
[CrossRef]
Millennium, Ratih Kusuma, and Rosita Kusumawati. 2022. The simulation of claim severity and claim frequency for estimation of loss
of life insurance company. In AIP Conference Proceedings. College Park: AIP Publishing, vol. 2575.
Nadarajah, Saralees, and S. A. Abu Bakar. 2014. New composite models for the Danish fire insurance data. Scandinavian Actuarial Journal
2014: 180–87. [CrossRef]
Pigeon, Mathieu, and Michel Denuit. 2011. Composite lognormal-Pareto model with random threshold. Scandinavian Actuarial Journal
2011: 177–92.
Plummer, Martyn. 2017. JAGS Version 4.3.0 User Manual. Available online: https://sourceforge.net/projects/mcmc-jags/files/
Manuals/ (accessed on 1 November 2023).
Poufinas, Thomas, Periklis Gogas, Theophilos Papadimitriou, and Emmanouil Zaganidis. 2023. Machine learning in forecasting motor
insurance claims. Risks 11: 164. [CrossRef]
Reynkens, Tom, Roel Verbelen, Jan Beirlant, and Katrien Antonio. 2017. Modelling censored losses using splicing: A global fit strategy
with mixed Erlang and extreme value distributions. Insurance: Mathematics and Economics 77: 65–77. [CrossRef]
Scollnik, David P. M. 2007. On composite lognormal-Pareto models. Scandinavian Actuarial Journal 2007: 20–33. [CrossRef]
Scollnik, David P. M., and Chenchen Sun. 2012. Modeling with Weibull-Pareto models. North American Actuarial Journal 16: 260–72.
[CrossRef]
Spiegelhalter, David, Andrew Thomas, Nicky Best, and Dave Lunn. 2003. WinBUGS User Manual. Available online: https://www.
mrc-bsu.cam.ac.uk/software/bugs/ (accessed on 1 September 2023).
Venter, Gary C. 1983. Transformed beta and gamma distributions and aggregate losses. Proceedings of the Casualty Actuarial Society 70:
156–93.
Wang, Yinzhi, Ingrid Hobæk Haff, and Arne Huseby. 2020. Modelling extreme claims via composite models and threshold selection
methods. Insurance: Mathematics and Economics 91: 257–68. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.