
MAS-I

Updated 4/9/24

PROBABILITY

Probability Models

Basics

CDFs, Survival Functions, and Hazard Functions
F(x) = Pr(X ≤ x) = ∫_{−∞}^{x} f(t) dt
S(x) = Pr(X > x) = ∫_{x}^{∞} f(t) dt
h(x) = f(x) / S(x)
H(x) = ∫_{−∞}^{x} h(t) dt = −ln S(x)
S(x) = e^{−H(x)}

Percentiles
The 100q-th percentile is π_q, where F(π_q) = q.

Mode
The mode is the x that maximizes f(x).

Moments
E[g(X)] = ∫_{−∞}^{∞} g(x) · f(x) dx
        = ∫_{0}^{∞} g′(x) · S(x) dx   (for non-negative X with g(0) = 0)
Var[g(X)] = E[g(X)²] − (E[g(X)])²
μ′_k = E[X^k]
μ = μ′_1 = E[X]
μ_k = E[(X − μ)^k]
σ² = μ_2 = Var[X]
Cov[X, Y] = E[XY] − E[X] · E[Y]
Cov[X, X] = Var[X]
Coefficient of variation: CV = σ / μ
Skewness = μ_3 / σ³
Kurtosis = μ_4 / σ⁴

Moment Generating Function (MGF)
M_X(z) = E[e^{zX}]
M_X^{(n)}(0) = E[X^n], where M_X^{(n)} is the n-th derivative

Probability Generating Function (PGF)
P_N(z) = E[z^N]
P_N^{(n)}(1) = E[N(N − 1)⋯(N − n + 1)], where P_N^{(n)} is the n-th derivative

Conditional Distribution
Pr(A | B) = Pr(A ∩ B) / Pr(B) = Pr(B | A) · Pr(A) / Pr(B)
f_{X | j<X<k}(x) = f_X(x) / Pr(j < X < k), for j < x < k

Law of Total Probability
Pr(X = x) = E_Y[Pr(X = x | Y)]

Law of Total Expectation
E_X[X] = E_Y[ E_X[X | Y] ]

Law of Total Variance
Var_X[X] = E_Y[ Var_X[X | Y] ] + Var_Y[ E_X[X | Y] ]

Independence
Pr(A ∩ B) = Pr(A) · Pr(B)
For independent X and Y:
• f_{X,Y}(x, y) = f_X(x) · f_Y(y)
• E[g(X) · h(Y)] = E[g(X)] · E[h(Y)]

Claim Severity Distributions

Common Distributions
S-P Pareto(α, θ) ~ Pareto(α, θ) + θ
Beta(a = 1, b = 1, θ) ~ Uniform(0, θ)
Weibull(θ, τ = 1) ~ Exponential(θ)
Gamma(α = 1, θ) ~ Exponential(θ)

Gamma CDF Shortcut
F_X(x) = 1 − Pr(N < α), where:
• α is a positive integer
• X ~ Gamma(α, θ)
• N ~ Poisson(x/θ)

Properties of the Exponential Distribution
For X_i ~ Exponential(θ_i):
E[X] = θ
h(x) = 1/θ = λ
Pr(X > t + s | X > t) = Pr(X > s)
Pr(X_1 < X_2) = λ_1 / (λ_1 + λ_2)
min(X_1, X_2, …, X_n) ~ Exponential(1 / Σ_{i=1}^{n} λ_i)
Σ_{i=1}^{n} X_i ~ Gamma(n, θ), where θ_i = θ for all i

Greedy Algorithms

Algorithm A
For i = 1, 2, …, n:
1. Choose the assignment with the lowest cost, i.e., min_j C_{i,j}, among all n − i + 1 possible assignments.
2. Assign that job to that employee.
3. Remove that employee and that job from their respective sets.

Algorithm B
For k = n², (n − 1)², …, 1²:
1. Choose the assignment with the lowest cost, i.e., min_{i,j} C_{i,j}, among all k possible assignments.
2. Assign that job to that employee.
3. Remove that employee and that job from their respective sets.

E[Total Cost] = θ Σ_{i=1}^{n} (1/i)
where C_{i,j} ~ Exponential(θ)
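As an illustration (not part of the original sheet), Algorithm A and the expected-total-cost formula can be sketched in Python; the function names and the cost matrix are hypothetical examples:

```python
def greedy_assign(cost):
    """Algorithm A: employee i (in row order) takes the cheapest remaining job.
    cost[i][j] is the cost of assigning job j to employee i."""
    n = len(cost)
    remaining = set(range(n))
    total = 0.0
    for i in range(n):
        j = min(remaining, key=lambda col: cost[i][col])
        total += cost[i][j]          # lowest cost among n - i + 1 choices
        remaining.remove(j)          # remove that job from the set
    return total

def expected_total_cost(theta, n):
    """E[Total Cost] = theta * sum_{i=1}^{n} 1/i when C_ij ~ Exponential(theta)."""
    return theta * sum(1.0 / i for i in range(1, n + 1))
```

For example, with two employees and cost matrix [[1, 2], [3, 1]], employee 1 takes the job costing 1, then employee 2 takes the remaining job costing 1, for a total of 2.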

© 2024 Coaching Actuaries. All Rights Reserved www.coachingactuaries.com MAS-I Formula Sheet 1
Transformations
• Scaling: θ is a scale parameter for all continuous distributions on the exam table, except lognormal, inverse Gaussian, and log-t.
• CDF Method
• PDF Method
• MGF Method

Mixtures

Discrete Mixture
f_Y(y) = Σ_{i=1}^{n} w_i · f_{X_i}(y), where Σ_{i=1}^{n} w_i = 1
F_Y(y) = Σ_{i=1}^{n} w_i · F_{X_i}(y)
S_Y(y) = Σ_{i=1}^{n} w_i · S_{X_i}(y)
E[Y^k] = Σ_{i=1}^{n} w_i · E[X_i^k]

Continuous Mixture
• Poisson-Gamma Mixture
  X | Λ ~ Poisson(Λ)
  Λ ~ Gamma(α, θ)
  ⇒ X ~ Negative Binomial(r = α, β = θ)
• Exponential-Inverse Gamma Mixture
  X | Λ ~ Exponential(Λ)
  Λ ~ Inverse Gamma(α, θ)
  ⇒ X ~ Pareto(α, θ)

Splices
f_Y(y) = { c_1 · f_{X_1}(y),  a_0 < y < a_1
           c_2 · f_{X_2}(y),  a_1 < y < a_2
           ⋮
           c_n · f_{X_n}(y),  a_{n−1} < y < a_n
where Σ_{i=1}^{n} c_i does not need to equal 1.

Bernoulli Shortcut
For X = { a, Pr(X = a) = q;  b, Pr(X = b) = 1 − q }:
Var[X] = (a − b)² q(1 − q)

Insurance Applications
Y^L: payment per loss

Policy Limits, u
Y^L = X ∧ u = min(X, u) = { X, X < u;  u, X ≥ u }
E[(Y^L)^k] = E[(X ∧ u)^k]
           = ∫_0^u x^k f(x) dx + u^k · S(u)
           = ∫_0^u k x^{k−1} S(x) dx

Deductibles, d
Ordinary deductible:
Y^L = (X − d)_+ = { 0, X < d;  X − d, X ≥ d }
E[Y^L] = E[(X − d)_+] = E[X] − E[X ∧ d]
E[(Y^L)^k] = E[(X − d)_+^k]
           = ∫_d^∞ (x − d)^k f(x) dx
           = ∫_d^∞ k(x − d)^{k−1} S(x) dx

Loss elimination ratio:
LER = E[X ∧ d] / E[X]

Franchise deductible:
Y^L = { 0, X < d;  X, X ≥ d }
E[Y^L] = E[(X − d)_+] + d · S(d)

Payment per Payment
Y^P: payment per payment
E[Y^P] = e(d) = E[X − d | X > d]
       = E[Y^L] / S(d) = E[(X − d)_+] / S(d)

Special Cases for e(d)
Loss             →  Excess Loss
Exponential(θ)   →  Exponential(θ)
Uniform(a, b)    →  Uniform(0, b − d)
Pareto(α, θ)     →  Pareto(α, θ + d)
S-P Pareto(α, θ) →  Pareto(α, d)
Beta(1, b, θ)    →  Beta(1, b, θ − d)

Impact of Deductibles on Claim Frequency
For v = Pr(X > d), the # of payments N′:
N ~ Poisson(λ)           →  N′ ~ Poisson(vλ)
N ~ Binomial(m, q)       →  N′ ~ Binomial(m, vq)
N ~ Neg. Binomial(r, β)  →  N′ ~ Neg. Binomial(r, vβ)

The Ultimate Formula for Insurance
E[Y^L] = α(1 + r) [ E[X ∧ m/(1 + r)] − E[X ∧ d/(1 + r)] ]
where:
d: deductible (set to 0 if not applicable)
u: policy limit (set to ∞ if not applicable)
α: coinsurance (set to 1 if not applicable)
r: inflation rate (set to 0 if not applicable)
m: maximum covered loss = u/α + d

Tail Properties of Distributions

q quantile
π_q = F_X^{−1}(q)

Conditional Tail Expectation (CTE)
1 − q: tolerance probability
CTE_q(X) = E[X | X > π_q]
         = π_q + (E[X] − E[X ∧ π_q]) / (1 − q)

CTE_q(X) special cases:
Normal:    μ + σ · φ(z_q) / (1 − q)
Lognormal: E[X] · Φ(σ − z_q) / (1 − q)

Tail Weight
• The fewer positive raw moments that exist, the greater the tail weight.
• If the ratio of the survival functions or the density functions approaches infinity as x increases, the numerator has a heavier tail.
• If the hazard rate function decreases with x, the distribution has a heavy tail.
• The larger a given CTE or quantile is, the greater the tail weight.
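To make E[X ∧ d] and the LER concrete, here is a small numerical sketch (not from the sheet) for an Exponential(θ) loss, using E[X ∧ d] = ∫_0^d S(x) dx; θ and d are arbitrary example values:

```python
import math

def limited_expected_value_exp(theta, d, steps=200000):
    """E[X ^ d] = integral of S(x) from 0 to d, with S(x) = exp(-x/theta)
    for an Exponential(theta) loss; midpoint-rule numerical integration."""
    h = d / steps
    return sum(math.exp(-(k + 0.5) * h / theta) * h for k in range(steps))

theta, d = 100.0, 50.0
lev = limited_expected_value_exp(theta, d)   # closed form: theta*(1 - e^{-d/theta})
ler = lev / theta                            # LER = E[X ^ d] / E[X], E[X] = theta
```

The numerical result should match the closed form θ(1 − e^{−d/θ}), and the LER for an exponential simplifies to 1 − e^{−d/θ}.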

Poisson Processes
Counting process where non-overlapping increments are independent.
N(t + h) − N(t) ~ Poisson(λ)
where λ = ∫_t^{t+h} λ(u) du
• Homogeneous if λ(t) is constant
• Non-homogeneous if λ(t) varies with t

Time between Events
T_k: time until the k-th event occurs
V_k = T_k − T_{k−1}
Homogeneous Poisson process:
• V_k ~ Exponential(θ = 1/λ)
• T_k ~ Gamma(α = k, θ = 1/λ)

Conditional Distribution of Arrival Times
• Given that N(t) = n, past events T_1, T_2, …, T_n are order statistics of i.i.d. Uniform(0, t).
• Given that T_n = t, past events T_1, T_2, …, T_{n−1} are order statistics of i.i.d. Uniform(0, t).

Other Properties
• Subprocesses are Poisson processes with proportional rates.
• Sum of Poisson processes:
  Σ_{i=1}^{n} N_i ~ Poisson(Σ_{i=1}^{n} λ_i)
• Probability of observing n events from N_1 before m events from N_2:
  Σ_{i=n}^{n+m−1} C(n + m − 1, i) q^i (1 − q)^{n+m−1−i}
  = Σ_{j=0}^{m−1} C(n − 1 + j, n − 1) q^n (1 − q)^j
  where q = λ_1 / (λ_1 + λ_2)

Compound Poisson Processes
S(t) = Σ_{i=1}^{N(t)} X_i
E[S(t)] = λt · E[X]
Var[S(t)] = λt · E[X²]
• Use normal approximation to calculate probabilities of events in S(t).
• Continuity correction is needed if S(t) is discrete.

Reliability Theory*
• A parallel system functions as long as one of the components functions.
• A series system functions only when all components function.
• A k-out-of-n system functions only when at least k out of n components function.
• A minimal path set, A_j, is a minimal set of components whose functioning guarantees the functioning of the system.
• A minimal cut set, C_j, is a minimal set of components whose failure guarantees the failure of the system.

Combining Systems
# of Minimal Path Sets: Parallel placement → Sum;  Series placement → Product
# of Minimal Cut Sets:  Parallel placement → Product;  Series placement → Sum

Reliability of Systems
r(p) = Pr[φ(X) = 1] = E[φ(X)]

Bounds on Reliability Function
Method of Inclusion and Exclusion:
First two bounds using the s minimal path sets:
r(p) ≤ Σ_{j=1}^{s} Π_{i∈A_j} p_i
r(p) ≥ Σ_{j=1}^{s} Π_{i∈A_j} p_i − Σ_{j=1}^{s} Σ_{k<j} Π_{i∈A_j∪A_k} p_i
First two bounds using the m minimal cut sets:
1 − r(p) ≤ Σ_{j=1}^{m} Π_{i∈C_j} (1 − p_i)
1 − r(p) ≥ Σ_{j=1}^{m} Π_{i∈C_j} (1 − p_i) − Σ_{j=1}^{m} Σ_{k<j} Π_{i∈C_j∪C_k} (1 − p_i)

Method of Intersection:
r(p) ≤ 1 − Π_{j=1}^{s} [1 − Π_{i∈A_j} p_i]
r(p) ≥ Π_{j=1}^{m} [1 − Π_{i∈C_j} (1 − p_i)]

Lifetime of Systems
Pr(T > t) = r[S(t)]
E[T] = ∫_0^∞ r[S(t)] dt
For k-out-of-n systems whose components are T_i ~ Exponential(θ):
E[T] = E[X_{(n−k+1)}] = θ Σ_{i=k}^{n} (1/i)

Increasing Failure Rate (IFR) Distribution
h(x) is an increasing function of x.

Decreasing Failure Rate (DFR) Distribution
h(x) is a decreasing function of x.

Increasing Failure Rate on the Average (IFRA)
H(x)/x is an increasing function of x.
• An IFR distribution is also IFRA.
• A monotone system's lifetime distribution is IFRA if the lifetimes of all components are IFRA.

*Key information on Reliability Theory is on page 5.

Random Graphs
• n^{n−2} minimal path sets
• 2^{n−1} − 1 minimal cut sets
• P_{i,j} is the probability nodes i and j are connected.
• P_n is the probability that a random graph is connected, where all P_{i,j} = p (and q = 1 − p).
P_n = { 1,  n = 1
        p,  n = 2
        1 − Σ_{k=1}^{n−1} C(n − 1, k − 1) q^{k(n−k)} P_k,  n > 2
1 − P_n ≤ (n + 1) q^{n−1}
1 − P_n ≥ n q^{n−1} − C(n, 2) q^{2n−3}
P_n ≈ 1 − n q^{n−1}
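A quick simulation sketch (illustrative only; the parameter values are arbitrary) that checks E[S(t)] = λt·E[X] for a compound Poisson process with Exponential severities:

```python
import random

random.seed(42)
lam, t, theta = 3.0, 2.0, 10.0   # arbitrary example values; E[S(t)] = lam*t*theta

def simulate_compound_poisson(n_paths=20000):
    """Simulate S(t) = sum of Exponential(theta) severities over N(t) arrivals."""
    totals = []
    for _ in range(n_paths):
        # count arrivals in (0, t] via exponential inter-arrival times
        n, clock = 0, random.expovariate(lam)
        while clock <= t:
            n += 1
            clock += random.expovariate(lam)
        totals.append(sum(random.expovariate(1.0 / theta) for _ in range(n)))
    return totals

totals = simulate_compound_poisson()
mean_s = sum(totals) / len(totals)   # should be near lam * t * theta = 60
```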

Discrete Markov Chains

Multiple-Step Transition Probabilities
• Chapman-Kolmogorov Probabilities
  P_{i,j}^{(n+m)} = Σ_k P_{i,k}^{(n)} · P_{k,j}^{(m)}
• Unconditional probability of being in state j at time n:
  Pr(X_n = j) = Σ_i α_i · P_{i,j}^{(n)}
  where α_i is the probability of being in state i at time 0.
• The probability of entering state j at time m, starting at state i without entering any state in set 𝒜:

State i | State j | Desired Probability
i ∉ 𝒜  | j ∉ 𝒜  | Q_{i,j}^{(m)}
i ∉ 𝒜  | j ∈ 𝒜  | Σ_{k∉𝒜} Q_{i,k}^{(m−1)} P_{k,j}
i ∈ 𝒜  | j ∉ 𝒜  | Σ_{k∉𝒜} P_{i,k} Q_{k,j}^{(m−1)}
i ∈ 𝒜  | j ∈ 𝒜  | Σ_{k∉𝒜} Σ_{l∉𝒜} P_{i,k} Q_{k,l}^{(m−2)} P_{l,j}

where:
Q_{i,j} = P_{i,j}, if i ∉ 𝒜, j ∉ 𝒜
Q_{i,A} = Σ_{j∈𝒜} P_{i,j}, if i ∉ 𝒜
Q_{A,i} = 0, if i ∉ 𝒜
Q_{A,A} = 1

Classification of States
• Absorbing: State that cannot be left once it is entered
• Accessible: State that can be entered from another state
• Communicating: Two states are accessible to each other
• Class: A set of communicating states
• Irreducible: A chain with only one class
• Recurrent: Probability of re-entering the state is 1, f_i = 1
• Transient: Probability of re-entering the state is less than 1, f_i < 1
• Given that a process starts in a transient state i, the number of times the process re-enters state i, n ≥ 0, has a geometric distribution with β = f_i / (1 − f_i).
• Positive recurrent: Finite expected # of transitions for a chain to return to state j, given it started in that state
• Null recurrent: Infinite expected # of transitions for a chain to return to state j, given it started in that state
• Aperiodic: A chain that has limiting probabilities
• Periodic: A chain that does not have limiting probabilities
• Ergodic: A chain that is irreducible, positive recurrent, and aperiodic

Long-Run Proportions (Stationary Probabilities)
π_j = Σ_i π_i P_{i,j},  Σ_j π_j = 1
• The reciprocal of π_j is the expected time spent to return to state j.
• For aperiodic chains, long-run proportions equal limiting probabilities.

Time Spent in Transient States
S = (I − P_T)^{−1}
f_{i,j} = (s_{i,j} − δ_{i,j}) / s_{j,j}
δ_{i,j} = { 1, i = j;  0, otherwise }
• s_{i,j} is the expected time spent in state j given the process starts in state i.
• f_{i,j} is the probability of ever transitioning to state j from state i.

Time Reversibility
R_{i,j} = π_j P_{j,i} / π_i
A Markov chain is time reversible if R_{i,j} = P_{i,j} for every i and j.

Random Walk
All random walk models are transient except for one-dimensional and two-dimensional symmetric random walks.

Gambler's Ruin Problem
Probability of reaching j starting with i:
P_i = { [1 − (q/p)^i] / [1 − (q/p)^j],  p ≠ 1/2
        i / j,                           p = 1/2

Branching Processes
μ = Σ_{j=0}^{∞} j · P_j
σ² = Σ_{j=0}^{∞} (j − μ)² · P_j
For X_0 = 1:
E[X_n] = μ^n
Var[X_n] = { σ² μ^{n−1} (1 − μ^n)/(1 − μ),  μ ≠ 1
             n σ²,                           μ = 1
Extinction probability:
π_0 = { 1,                       μ ≤ 1
        Σ_{j=0}^{∞} π_0^j P_j,   μ > 1
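The gambler's ruin formula above can be sketched as a small function (an illustration, not from the sheet):

```python
def gamblers_ruin(i, j, p):
    """Probability of reaching fortune j before ruin (0), starting from i,
    with win probability p on each step."""
    if p == 0.5:
        return i / j
    ratio = (1 - p) / p          # q / p
    return (1 - ratio ** i) / (1 - ratio ** j)
```

For example, a fair game (p = 0.5) starting at 3 aiming for 10 succeeds with probability 3/10.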

Life Contingencies

Number of Deaths
d_x = l_x − l_{x+1}

Probability of Survival
_t p_x = l_{x+t} / l_x

Probability of Death
_t q_x = (l_x − l_{x+t}) / l_x

Curtate Life Expectancy
e_x = Σ_{k=1}^{∞} _k p_x = p_x (1 + e_{x+1})

Complete Expectation of Life
e̊_x ≈ 0.5 + Σ_{k=1}^{∞} _k p_x

Whole Life Insurance
A_x = Σ_{k=0}^{∞} v^{k+1} · _k p_x · q_{x+k}
    = v q_x + v p_x A_{x+1}
• The APV of whole life insurance is the sum of the APV of term life and deferred whole life.
• The APV of endowment insurance is the sum of the APV of term life and pure endowment.

Whole Life Annuity
ä_x = Σ_{k=0}^{∞} v^k · _k p_x
    = 1 + v p_x · ä_{x+1}
    = (1 − A_x) / d

Mortality Discount Factor
_t E_x = v^t · _t p_x

Joint Lives
ä_x + ä_y = ä_{xy} + ä_{\overline{xy}}   (joint life + last survivor)

Equivalence Principle
APV(premiums) = APV(benefits)

Simulation
U ~ Uniform(0, 1)

Uniform Number Generation
X_{n+1} = (a X_n + c) mod m,  n ≥ 0
U = X_{n+1} / m

Inversion Method
X = F_X^{−1}(U)

Acceptance-Rejection Method
1. Find a constant c that satisfies f(x)/g(x) ≤ c for all x.
2. Simulate U and a random number Y with density function g.
3. Accept the value Y if U ≤ f(Y) / (c · g(Y)). Otherwise, reject and return to step 2.

Key Information for Reliability Theory

System | φ(x) | # of Minimal Path Sets | # of Minimal Cut Sets | r(p)
Parallel | max(x_i) = 1 − Π_{i=1}^{n}(1 − x_i) | n | 1 | 1 − Π_{i=1}^{n}(1 − p_i)
Series | min(x_i) = Π_{i=1}^{n} x_i | 1 | n | Π_{i=1}^{n} p_i
k-out-of-n | – | C(n, k) | C(n, n − k + 1) | Σ_{i=k}^{n} C(n, i) p^i (1 − p)^{n−i}, where p_i = p for all i
Minimal Path Sets | max_j Π_{i∈A_j} x_i | – | – | –
Minimal Cut Sets | Π_j max_{i∈C_j} x_i | – | – | –
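The uniform-number recursion and the inversion method above can be sketched together (illustrative only; the LCG constants a, c, m are a hypothetical example choice, not from the sheet):

```python
import math

def lcg(seed, a=1103515245, c=12345, m=2**31):
    """Linear congruential generator: X_{n+1} = (a*X_n + c) mod m, U = X_{n+1}/m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m

def inversion_exponential(u, theta):
    """Inversion method for Exponential(theta): X = F^{-1}(U) = -theta*ln(1 - U)."""
    return -theta * math.log(1.0 - u)
```

Each U from the generator lies in (0, 1) and feeds the inverse CDF; for u = 0.5 and θ = 2, the inversion gives 2·ln 2.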

STATISTICS

Statistics

Parameter and Density Estimation

Method of Moments
To fit an r-parameter distribution, set:
E[X^k] = Σ_{i=1}^{n} x_i^k / n,  k = 1, 2, …, r

Percentile Matching
• Estimate parameters by setting the theoretical percentiles equal to the sample percentiles.

Smoothed Empirical Percentile – Unique Values
π_q = [q(n + 1)]-th smallest observed value
• If q(n + 1) is a non-integer, calculate π_q by interpolating between the order statistics before and after.

Maximum Likelihood Estimation
L(θ) = Π_{i=1}^{n} f(x_i)
• Estimate θ as the value that maximizes L(θ) or l(θ) = ln L(θ).
• Invariance property

Incomplete Data
Case | Likelihood contribution
Right-censored at m | Pr(X ≥ m)
Left-truncated at d | divide by Pr(X > d)
Grouped data on interval (a, b] | Pr(a < X ≤ b)

Special Cases – Complete Data
Distribution | Shortcut
Gamma, fixed α | θ̂ = x̄ / α
Normal | μ̂ = x̄;  σ̂² = Σ_{i=1}^{n} x_i²/n − μ̂²
Lognormal | μ̂ = Σ_{i=1}^{n} ln x_i / n;  σ̂² = Σ_{i=1}^{n} (ln x_i)²/n − μ̂²
Poisson | λ̂ = x̄
Binomial, fixed m | q̂ = x̄ / m
Negative Binomial, fixed r | β̂ = x̄ / r
Uniform [0, θ] | θ̂ = max(x_1, …, x_n)

Special Cases – Incomplete Data
Pareto, fixed θ:
α̂ = n / Σ_{i=1}^{n+c} [ln(x_i + θ) − ln(d_i + θ)]
S-P Pareto, fixed θ:
α̂ = n / Σ_{i=1}^{n+c} {ln x_i − ln[max(θ, d_i)]}
Exponential:
θ̂ = Σ_{i=1}^{n+c} (x_i − d_i) / n
Weibull, fixed τ:
θ̂ = [ (Σ_{i=1}^{n+c} x_i^τ − Σ_{i=1}^{n+c} d_i^τ) / n ]^{1/τ}
where:
• n: # of uncensored data points
• c: # of censored data points
• x_i: i-th observed value, or the censoring point for censored data points
• d_i: truncation point for the i-th observation

Kernel Density Estimation
f̂(x) = (1/n) Σ_{i=1}^{n} k_i(x)
• b: bandwidth
• x_i: i-th observed value
• k_i(x): kernel density function for x_i, evaluated at x
• f̂(x): PDF of the kernel-smoothed distribution

Rectangular Kernels
k_i(x) = { 1/(2b),  x_i − b ≤ x ≤ x_i + b
           0,       otherwise

Triangular Kernels
k_i(x) = { (b − |x − x_i|)/b²,  x_i − b ≤ x ≤ x_i + b
           0,                   otherwise

Gaussian Kernels
k_i(x) = 1/(b√(2π)) · exp(−(x − x_i)²/(2b²)),  −∞ < x < ∞
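The exponential MLE shortcut for incomplete data can be sketched as follows (an illustration, not from the sheet; the data layout is a hypothetical convention):

```python
def exponential_mle(data):
    """theta_hat = sum(x_i - d_i) / n, where the sum runs over all n + c points
    and n counts only the uncensored observations.
    data: list of (x, d, is_censored) with x the observed value (or censoring
    point) and d the truncation point (0 if none)."""
    total = sum(x - d for x, d, _ in data)
    n_uncensored = sum(1 for _, _, censored in data if not censored)
    return total / n_uncensored
```

For example, two uncensored points (5 with no truncation, 8 truncated at 2) and one point censored at 10 give θ̂ = (5 + 6 + 10)/2 = 10.5.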

Estimator Quality

Statistics and Estimators
X̄ = Σ_{i=1}^{n} X_i / n
S² = Σ_{i=1}^{n} (X_i − X̄)² / (n − 1)
For a random sample:
• E[X̄] = E[X]
• Var[X̄] = Var[X] / n

Bias
Bias[θ̂] = E[θ̂] − θ
• If lim_{n→∞} Bias[θ̂] = 0, then θ̂ is asymptotically unbiased.

Variance
Var[θ̂] = E[(θ̂ − E[θ̂])²]

Mean Squared Error
MSE[θ̂] = E[(θ̂ − θ)²] = Var[θ̂] + (Bias[θ̂])²

Consistency
lim_{n→∞} Pr(|θ̂ − θ| > ε) = 0 for all ε > 0
• If lim_{n→∞} Bias[θ̂] = 0 and lim_{n→∞} Var[θ̂] = 0, then θ̂ is consistent.

Efficiency
Eff[θ̂] = [I(θ)]^{−1} / Var[θ̂]
• If Eff[θ̂] = 1, then θ̂ is efficient.

Fisher Information
I(θ) = −E[d² l(θ)/dθ²] = −n · E[d² ln f(X)/dθ²]
• [I(θ)]^{−1} is the Rao-Cramér lower bound.
• I(θ) · [g′(θ)]^{−2} is the Fisher information for g(θ).

Minimum Variance Unbiased Estimator
• The MVUE is an unbiased estimator with the smallest variance among all unbiased estimators.

Sufficiency
• Y is a sufficient statistic for θ if and only if f(x_1, …, x_n | y) = h(x_1, …, x_n), where h(x_1, …, x_n) does not depend on θ.
• By the factorization theorem, Y is sufficient if and only if f(x_1, …, x_n) = h_1(y, θ) · h_2(x_1, …, x_n) for non-negative functions h_1 and h_2, where h_2(x_1, …, x_n) does not depend on θ.
• g(Y) is a sufficient statistic for θ if g(⋅) is a one-to-one function of sufficient Y.
• By the Rao-Blackwell theorem, the variance of the unbiased estimator E_Z[Z | Y] is at most the variance of any unbiased estimator Z for sufficient Y. The MVUE φ(Y) is E_Z[Z | Y].

Exponential Class of Distributions
f(x) = exp[a(x) · b(θ) + c(θ) + d(x)]
• Σ_{i=1}^{n} a(X_i) is a complete sufficient statistic for θ.

Maximum Likelihood Estimators
Under specific circumstances, the MLE of θ is:
• A consistent estimator
• Asymptotically normal with mean θ and variance [I(θ)]^{−1}; its exact variance may equal the asymptotic variance
• A function of sufficient statistic Y
• If Y is a complete sufficient statistic for θ and φ(Y) is an unbiased estimator of θ, then the MVUE of θ is φ(Y).

Key Results for Distributions in the Exponential Class

Distribution | Parameter of Interest | Σ_{i=1}^{n} a(X_i) | MVUE
Binomial | q | Σ X_i | X̄ / m
Normal | μ | Σ X_i | X̄
Normal | σ² | Σ (X_i − μ)² | (1/n) Σ (X_i − μ)²
Poisson | λ | Σ X_i | X̄
Gamma | θ | Σ X_i | X̄ / α
Inverse Gaussian | μ | Σ X_i | X̄
Negative Binomial | β | Σ X_i | X̄ / r
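As a numerical illustration (not from the sheet): for Exponential(θ), I(θ) = n/θ², so the Rao-Cramér lower bound is θ²/n, and X̄, being unbiased with Var[X̄] = θ²/n, attains it. A seeded simulation sketch with arbitrary example values:

```python
import random

random.seed(7)
theta, n = 5.0, 40

# Rao-Cramer lower bound for unbiased estimators of theta: theta^2 / n
rc_bound = theta ** 2 / n

# Estimate Var[X-bar] by simulating many sample means of size n
means = [sum(random.expovariate(1.0 / theta) for _ in range(n)) / n
         for _ in range(4000)]
m = sum(means) / len(means)
var_xbar = sum((v - m) ** 2 for v in means) / (len(means) - 1)
```

The simulated Var[X̄] should sit close to the bound θ²/n = 0.625, consistent with X̄ being efficient.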

Hypothesis Testing

Terminology
• Test statistic: A value calculated from data that assumes H_0 is true
• Critical region: The range of test statistic values where H_0 is rejected
• Critical value: A value that borders the critical region
• Two-tailed test: A test that includes both tails in its critical region
• Right-tailed test: A test that only includes the right tail in its critical region
• Left-tailed test: A test that only includes the left tail in its critical region
• Significance level, α: The probability of rejecting H_0, assuming it is true
• Power: The probability of rejecting H_0, assuming it is false
• p-value: The probability of observing the test statistic or a more extreme value, assuming H_0 is true

                    | H_0 is true      | H_0 is false
Reject H_0          | Type I error     | Correct decision
Fail to reject H_0  | Correct decision | Type II error

• For all hypothesis tests, reject H_0 if p-value ≤ α.

Tests for Means
• When variance is known, we apply the Central Limit Theorem.
• When variance is unknown, the random sample must be drawn from a normal distribution.

Critical Regions – Known Variance
Test Type | Critical Region
Left-tailed | t.s. ≤ −z_{1−α}
Two-tailed | |t.s.| ≥ z_{1−α/2}
Right-tailed | t.s. ≥ z_{1−α}

Critical Regions – Unknown Variance
Test Type | Critical Region
Left-tailed | t.s. ≤ −t_{2α, df}
Two-tailed | |t.s.| ≥ t_{α, df}
Right-tailed | t.s. ≥ t_{2α, df}

One Sample
• df = n − 1

Two Samples
s_p² = [(n_1 − 1)s_1² + (n_2 − 1)s_2²] / (n_1 + n_2 − 2)
• Assumes σ_1² = σ_2²
• df = n_1 + n_2 − 2

Two Samples – Paired
• Samples are not independent; observations form pairs.
• Identical to one sample of observed differences
• n* = n_1 = n_2
• df = n* − 1

Tests for Proportions
q̂ = (# of successes from n trials) / n
• Critical regions are the same as those for testing means with known variance.

Tests for Variances – One Sample
Test Type | Critical Region
Left-tailed | t.s. ≤ χ²_{α, n−1}
Two-tailed | (t.s. ≤ χ²_{α/2, n−1}) ∪ (t.s. ≥ χ²_{1−α/2, n−1})
Right-tailed | t.s. ≥ χ²_{1−α, n−1}

Tests for Variances – Two Samples
Test Type | Critical Region
Left-tailed | t.s. ≤ F_{1−α, n_1−1, n_2−1}
Two-tailed | [t.s. ≤ (F_{α/2, n_2−1, n_1−1})^{−1}] ∪ (t.s. ≥ F_{α/2, n_1−1, n_2−1})
Right-tailed | t.s. ≥ F_{α, n_1−1, n_2−1}
• A left-tailed test can be performed by writing H_0 in terms of σ_2²/σ_1² instead and doing a right-tailed test.
• F_{q, d_2, d_1} = (F_{1−q, d_1, d_2})^{−1}
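The pooled two-sample t statistic above can be sketched as a function (illustrative only; the function name and data are hypothetical):

```python
import math

def pooled_two_sample_t(x, y, h=0.0):
    """Two-sample t test of H0: mu1 - mu2 = h with unknown, equal variances.
    Returns (test statistic, degrees of freedom)."""
    n1, n2 = len(x), len(y)
    xb, yb = sum(x) / n1, sum(y) / n2
    s1 = sum((v - xb) ** 2 for v in x) / (n1 - 1)
    s2 = sum((v - yb) ** 2 for v in y) / (n2 - 1)
    sp2 = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)   # pooled variance
    ts = (xb - yb - h) / math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    return ts, n1 + n2 - 2
```

For example, x = [1, 2, 3] and y = [2, 3, 4] give s_p² = 1, t.s. = −√1.5 ≈ −1.2247 with 4 degrees of freedom.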

Summary for Hypothesis Testing

Parameter | # of Samples | H_0 | Variance | t.s.
Means | One | μ = h | Known | (x̄ − h) / (σ/√n)
Means | One | μ = h | Unknown | (x̄ − h) / (s/√n)
Means | Two | μ_1 − μ_2 = h | Known | (x̄_1 − x̄_2 − h) / √(σ_1²/n_1 + σ_2²/n_2)
Means | Two | μ_1 − μ_2 = h | Unknown | (x̄_1 − x̄_2 − h) / [s_p √(1/n_1 + 1/n_2)]
Means | Two, Paired | μ_1 − μ_2 = h | Known | (d̄ − h) / (σ_d/√n*)
Means | Two, Paired | μ_1 − μ_2 = h | Unknown | (d̄ − h) / (s_d/√n*)
Proportions | One | q = h | – | (q̂ − h) / √(h(1 − h)/n)
Proportions | Two | q_1 − q_2 = h | – | (q̂_1 − q̂_2 − h) / √(q̂_1(1 − q̂_1)/n_1 + q̂_2(1 − q̂_2)/n_2)
Variances | One | σ² = h | – | (n − 1)s² / h
Variances | Two | σ_1²/σ_2² = h | – | (s_1²/s_2²) · (1/h)

Intervals for Means
(All intervals below are 100k% confidence intervals.)

Parameter: μ
Known variance:
• Two-sided: x̄ ± z_{(1+k)/2} · σ/√n
• Left-sided: (−∞, x̄ + z_k · σ/√n)
• Right-sided: (x̄ − z_k · σ/√n, ∞)
Unknown variance:
• Two-sided: x̄ ± t_{1−k, n−1} · s/√n
• Left-sided: (−∞, x̄ + t_{2(1−k), n−1} · s/√n)
• Right-sided: (x̄ − t_{2(1−k), n−1} · s/√n, ∞)

Parameter: μ_1 − μ_2
Known variances:
• Two-sided: x̄_1 − x̄_2 ± z_{(1+k)/2} · √(σ_1²/n_1 + σ_2²/n_2)
• Left-sided: (−∞, x̄_1 − x̄_2 + z_k · √(σ_1²/n_1 + σ_2²/n_2))
• Right-sided: (x̄_1 − x̄_2 − z_k · √(σ_1²/n_1 + σ_2²/n_2), ∞)
Unknown variances:
• Two-sided: x̄_1 − x̄_2 ± t_{1−k, n_1+n_2−2} · s_p √(1/n_1 + 1/n_2)
• Left-sided: (−∞, x̄_1 − x̄_2 + t_{2(1−k), n_1+n_2−2} · s_p √(1/n_1 + 1/n_2))
• Right-sided: (x̄_1 − x̄_2 − t_{2(1−k), n_1+n_2−2} · s_p √(1/n_1 + 1/n_2), ∞)

Paired: All types are identical to the one-sample case applied to the observed differences.

Intervals for Proportions
(All intervals below are 100k% confidence intervals.)

Parameter: q
• Two-sided: q̂ ± z_{(1+k)/2} · √(q̂(1 − q̂)/n)
• Left-sided: (−∞, q̂ + z_k · √(q̂(1 − q̂)/n))
• Right-sided: (q̂ − z_k · √(q̂(1 − q̂)/n), ∞)

Parameter: q_1 − q_2
• Two-sided: q̂_1 − q̂_2 ± z_{(1+k)/2} · √(q̂_1(1 − q̂_1)/n_1 + q̂_2(1 − q̂_2)/n_2)
• Left-sided: (−∞, q̂_1 − q̂_2 + z_k · √(q̂_1(1 − q̂_1)/n_1 + q̂_2(1 − q̂_2)/n_2))
• Right-sided: (q̂_1 − q̂_2 − z_k · √(q̂_1(1 − q̂_1)/n_1 + q̂_2(1 − q̂_2)/n_2), ∞)

Intervals for Variances
(All intervals below are 100k% confidence intervals.)

Parameter: σ²
• Two-sided: ( (n − 1)s² / χ²_{(1+k)/2, n−1},  (n − 1)s² / χ²_{(1−k)/2, n−1} )
• Left-sided: ( 0,  (n − 1)s² / χ²_{1−k, n−1} )
• Right-sided: ( (n − 1)s² / χ²_{k, n−1},  ∞ )

Parameter: σ_1²/σ_2²
• Two-sided: ( (s_1²/s_2²) · (F_{(1−k)/2, n_1−1, n_2−1})^{−1},  (s_1²/s_2²) · F_{(1−k)/2, n_2−1, n_1−1} )
• Left-sided: ( 0,  (s_1²/s_2²) · F_{1−k, n_2−1, n_1−1} )
• Right-sided: ( (s_1²/s_2²) · (F_{1−k, n_1−1, n_2−1})^{−1},  ∞ )

Most Powerful Tests

Terminology
• Simple: Fully specifies the distribution(s)
• Composite: Does not fully specify the distribution(s)

Most Powerful Test
When H_0 and H_1 are both simple, the most powerful test of size α has the largest power among all tests with the same α.

Neyman-Pearson Theorem
The best critical region is embedded in
L(h_0) / L(h_1) ≤ k
where H_0 and H_1 are both simple.

Uniformly Most Powerful (UMP) Tests
• For a simple H_0 and composite H_1, a test is UMP when the best critical region is the same for testing H_0 against each simple hypothesis in H_1.
• For composite hypotheses H_0: θ ≤ h and H_1: θ > h, a test is UMP if there is a monotone likelihood ratio in a statistic Y.

Goodness-of-Fit Tests

Kolmogorov-Smirnov Test
t.s. = D = maximum absolute difference between F*(x) and F̂(x)
• Reject H_0 if t.s. ≥ critical value
• F*(x): CDF of the proposed distribution
• F̂(x): Empirical distribution function
  F̂(x) = (# of observations ≤ x) / n
Left-truncated at d:
F*(x) = [F(x) − F(d)] / [1 − F(d)]
Right-censored at m:
F̂(m) is undefined.

Chi-Square Goodness-of-Fit Test
t.s. = Σ_{j=1}^{k} (n_j − n q_j)² / (n q_j)
• Reject H_0 if t.s. ≥ χ²_{1−α, k−1−r}
• k: # of mutually exclusive intervals
• q_j: probability of being in interval j
• n_j: # of observed values in interval j
• r: # of free parameters

Chi-Square Test of Independence
t.s. = (1/n) Σ_{i=1}^{a} Σ_{j=1}^{b} (n_{i,j} · n − n_{i·} n_{·j})² / (n_{i·} n_{·j})
• Reject H_0 if t.s. ≥ χ²_{1−α, (a−1)(b−1)}
• a: # of categories for the first variable
• b: # of categories for the second variable
• n_{i,j}: # of observations in the first variable's category i and the second variable's category j
• n_{i·}: subtotal # of observations in category i, across all categories of the second variable
• n_{·j}: subtotal # of observations in category j, across all categories of the first variable

Likelihood Ratio Test
t.s. = −2 ln(L_0 / L_1) = 2(l_1 − l_0)
• Reject H_0 if t.s. ≥ χ²_{1−α, r_1−r_0}
• r_0: # of free parameters in the distribution under H_0
• r_1: # of free parameters in the distribution under H_1
• L_0: Maximized likelihood under H_0; l_0 = ln L_0
• L_1: Maximized likelihood under H_1; l_1 = ln L_1

Confidence Intervals
• For means and proportions, the two-sided general form is
  estimate ± (percentile)(standard error)
• H_0 will fail to be rejected at α if h is within the 100(1 − α)% confidence interval.

Order Statistics
X_{(k)} = k-th order statistic
X_{(1)} = min(X_1, …, X_n)
X_{(n)} = max(X_1, …, X_n)

First Principles
f_{X_{(k)}}(x) = n! / [(k − 1)!(n − k)!] · [F_X(x)]^{k−1} · f_X(x) · [S_X(x)]^{n−k}

Special Cases
Uniform(a, b):
E[X_{(k)}] = a + k(b − a)/(n + 1)
Uniform(0, θ):
X_{(k)} ~ Beta(k, n − k + 1, θ)
Exponential(θ):
E[X_{(k)}] = θ Σ_{i=n−k+1}^{n} (1/i)
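The Kolmogorov-Smirnov statistic D can be computed directly from the definition (an illustrative sketch, not from the sheet; note the empirical CDF must be checked on both sides of each jump):

```python
def ks_statistic(sample, cdf):
    """D = max |F*(x) - F_hat(x)|, where F_hat jumps from (i-1)/n to i/n
    at the i-th order statistic."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        d = max(d, abs(cdf(x) - i / n), abs(cdf(x) - (i - 1) / n))
    return d
```

For example, testing the sample [0.1, 0.2, 0.3, 0.4] against a proposed Uniform(0, 1), the largest gap occurs at 0.4, where F̂ jumps to 1 while F*(0.4) = 0.4, giving D = 0.6.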

EXTENDED LINEAR MODELS

Extended Linear Models

Introduction to Statistical Learning

Types of Variables
• Response: A variable of primary interest
• Explanatory: A variable used to study the response variable
• Count: A quantitative variable valid on non-negative integers
• Continuous: A quantitative variable valid on real numbers
• Nominal: A qualitative variable having categories without a meaningful or logical order
• Ordinal: A qualitative variable having categories with a meaningful or logical order

Model Accuracy
Y = f(x_1, …, x_p) + ε,  E[ε] = 0
Test MSE = E[(Y − Ŷ)²] can be estimated using Σ_{i=1}^{n} (y_i − ŷ_i)² / n
For fixed inputs x_1, …, x_p, the test MSE is
Var[f̂(x_1, …, x_p)] + (Bias[f̂(x_1, …, x_p)])²  (reducible error)  +  Var[ε]  (irreducible error)
• If training data y_i's are used, the training MSE is computed instead.
• As flexibility increases, the training MSE decreases, but the test MSE follows a U-shaped pattern.
• Low flexibility leads to a method with low variance and high bias; high flexibility leads to a method with high variance and low bias.

Contrasting Statistical Learning Elements

Numerical Summaries
x̄ = Σ_{i=1}^{n} x_i / n,  s_x² = Σ_{i=1}^{n} (x_i − x̄)² / (n − 1)
cov_{x,y} = Σ_{i=1}^{n} (x_i − x̄)(y_i − ȳ) / (n − 1)
r_{x,y} = cov_{x,y} / (s_x · s_y),  −1 ≤ r_{x,y} ≤ 1

Graphical Summaries
• A scatterplot plots values of two variables to investigate their relationship.
• A box plot captures a variable's distribution using its median, 1st and 3rd quartiles, and distribution tails.
• A QQ plot plots sample percentiles against theoretical percentiles to determine whether the sample and theoretical distributions have similar shapes.
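The numerical summaries above translate directly into code (an illustrative sketch, not from the sheet):

```python
import math

def sample_corr(x, y):
    """Sample correlation r = cov(x, y) / (s_x * s_y), all with n - 1 divisors."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    cov = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / (n - 1)
    sx = math.sqrt(sum((a - xb) ** 2 for a in x) / (n - 1))
    sy = math.sqrt(sum((b - yb) ** 2 for b in y) / (n - 1))
    return cov / (sx * sy)
```

A perfectly linear relationship, e.g. y = 2x + 1, yields r = 1.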

Simple Linear Regression (SLR) Other Numerical Results 𝑡𝑡 Tests
𝑒𝑒 = 𝑦𝑦 − 𝑦𝑦 estimate − hypothesized value
Special case of MLR where 𝑝𝑝 = 1 1
𝑡𝑡. 𝑠𝑠. =
standard error
SSR = Ä (𝑦𝑦9 − 𝑦𝑦’)*
Estimation 9:,
1 Test Type Critical Region
∑1 (𝑥𝑥9 − 𝑥𝑥̅ )(𝑦𝑦9 − 𝑦𝑦’) SSE = Ä (𝑦𝑦9 − 𝑦𝑦9 )*
𝛽𝛽œ, = 9:, 1
∑9:,(𝑥𝑥9 − 𝑥𝑥̅ )* 9:, Left-tailed 𝑡𝑡. 𝑠𝑠. ≤ −𝑡𝑡*c,1"l",
1
Simple Linear Regression (continued)

$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$

Standard Errors

$se(\hat{\beta}_0) = \sqrt{MSE\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\right)}$

$se(\hat{\beta}_1) = \sqrt{\dfrac{MSE}{\sum_{i=1}^{n}(x_i-\bar{x})^2}}$

$se(\hat{y}) = \sqrt{MSE\left(\dfrac{1}{n} + \dfrac{(x-\bar{x})^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\right)}$

$se(\hat{y}_{n+1}) = \sqrt{MSE\left(1 + \dfrac{1}{n} + \dfrac{(x_{n+1}-\bar{x})^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\right)}$

Other Numerical Results

$SST = \sum_{i=1}^{n}(y_i - \bar{y})^2 = SSR + SSE$

$R^2 = \dfrac{SSR}{SST} = 1 - \dfrac{SSE}{SST}$

$R^2_{adj.} = 1 - \dfrac{MSE}{s_y^2} = 1 - (1 - R^2)\left(\dfrac{n-1}{n-p-1}\right)$

$R^2 = r^2_{x,y}$ (for SLR)

t Tests
• Two-tailed: reject $H_0$ if $|t.s.| \ge t_{\alpha,\,n-p-1}$
• Right-tailed: reject $H_0$ if $t.s. \ge t_{2\alpha,\,n-p-1}$

F Tests

$t.s. = \dfrac{MSR}{MSE} = \dfrac{SSR \div p}{SSE \div (n-p-1)}$
• Reject $H_0$ if $t.s. \ge F_{\alpha,\,ndf,\,ddf}$
• ndf $= p$
• ddf $= n - p - 1$
• If $p = 1$, $t.s.$ is the squared test statistic of the $t$ test with the same $H_0$.

Source | SS | df | MS
Regression | SSR | $p$ | MSR
Error | SSE | $n-p-1$ | MSE
Total | SST | $n-1$ | $s_y^2$

Other Key Ideas
• $R^2$ is a poor measure for model comparison because it will increase simply by adding more predictors to a model.
• A polynomial does not change by a constant amount for each unit increase of its variable, i.e., it has no constant slope.
• Only $w - 1$ dummy variables are needed to represent $w$ classes of a categorical predictor; one of them acts as the baseline class.
• In effect, dummy variables define a distinct intercept for each class. Without an interaction between a dummy variable and a predictor, the dummy variable cannot additionally affect that predictor's regression coefficient.

Multiple Linear Regression (MLR)

$Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \varepsilon$

Assumptions
1. $Y_i = \beta_0 + \beta_1 x_{i,1} + \cdots + \beta_p x_{i,p} + \varepsilon_i$
2. $x_{i,j}$'s are non-random
3. $E[\varepsilon_i] = 0$
4. $Var[\varepsilon_i] = \sigma^2$
5. $\varepsilon_i$'s are independent
6. $\varepsilon_i$'s are normally distributed
7. The predictor $x_j$ is not a linear combination of the other $p$ predictors, for $j = 0, 1, \dots, p$

Estimation – Ordinary Least Squares (OLS)

$\hat{\boldsymbol{\beta}} = \begin{pmatrix} \hat{\beta}_0 \\ \vdots \\ \hat{\beta}_p \end{pmatrix} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$

$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_p x_p$

$\mathbf{H} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T, \quad \hat{\mathbf{y}} = \mathbf{H}\mathbf{y}$

$MSE = \dfrac{SSE}{n-p-1}$

residual standard error $= \sqrt{MSE}$

Standard Errors

$\widehat{Var}\big[\hat{\boldsymbol{\beta}}\big] = MSE\,(\mathbf{X}^T\mathbf{X})^{-1} = \begin{pmatrix} \widehat{Var}\big[\hat{\beta}_0\big] & \cdots & \widehat{Cov}\big[\hat{\beta}_0, \hat{\beta}_p\big] \\ \vdots & \ddots & \vdots \\ \widehat{Cov}\big[\hat{\beta}_0, \hat{\beta}_p\big] & \cdots & \widehat{Var}\big[\hat{\beta}_p\big] \end{pmatrix}$

$se(\hat{\beta}_j) = \sqrt{\widehat{Var}\big[\hat{\beta}_j\big]}$

Confidence Intervals
$\hat{\beta}_j \pm t_{\alpha,\,n-p-1} \cdot se(\hat{\beta}_j)$
$\hat{y} \pm t_{\alpha,\,n-p-1} \cdot se(\hat{y})$

Prediction Intervals
$\hat{y}_{n+1} \pm t_{\alpha,\,n-p-1} \cdot se(\hat{y}_{n+1})$

Bootstrapping
The bootstrapped $se(\hat{\beta}_j)$ is the unbiased sample standard deviation of the $\hat{\beta}_j$ bootstrap estimates.

Partial F Tests

$t.s. = \dfrac{\overbrace{(SSE_R - SSE_F)}^{\text{reduction in variability}} \div \overbrace{(p_F - p_R)}^{\text{additional df spent}}}{SSE_F \div (n - p_F - 1)} = \dfrac{(R^2_F - R^2_R) \div (p_F - p_R)}{(1 - R^2_F) \div (n - p_F - 1)}$

• Reject $H_0$ if $t.s. \ge F_{\alpha,\,ndf,\,ddf}$
• ndf $= p_F - p_R$
• ddf $= n - p_F - 1$

Source | SS | df
Reduced Regression | $SSR_R$ | $p_R$
Difference | $SSE_R - SSE_F$ or $SSR_F - SSR_R$ | $p_F - p_R$
Full Error | $SSE_F$ | $n - p_F - 1$
Total | SST | $n - 1$
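The SLR formulas in this section can be verified numerically; a minimal pure-Python sketch with made-up data:

```python
# Check of the SLR estimators and fit measures (illustrative data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1 = Sxy / Sxx                 # slope estimate
b0 = ybar - b1 * xbar          # intercept: beta0-hat = ybar - beta1-hat * xbar

fitted = [b0 + b1 * xi for xi in x]
SSE = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
SSR = sum((fi - ybar) ** 2 for fi in fitted)
SST = sum((yi - ybar) ** 2 for yi in y)   # equals SSR + SSE for OLS

MSE = SSE / (n - 2)            # n - p - 1 with p = 1
se_b1 = (MSE / Sxx) ** 0.5
R2 = SSR / SST
```

The decomposition $SST = SSR + SSE$ holds exactly for an OLS fit with an intercept, which the sketch confirms numerically.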

© 2024 Coaching Actuaries. All Rights Reserved www.coachingactuaries.com MAS-I Formula Sheet 14
Analysis of Variance (ANOVA)

One-Way ANOVA
$Y_{i,j} = \mu + \alpha_j + \varepsilon_{i,j}$
• $i = 1, \dots, n_j$
• Factor has $w$ levels, $j = 1, \dots, w$

$\bar{y}_j = \dfrac{1}{n_j}\sum_{i=1}^{n_j} y_{i,j}$

$SSR = \sum_{j=1}^{w}\sum_{i=1}^{n_j}(\bar{y}_j - \bar{y})^2 = \sum_{j=1}^{w} n_j(\bar{y}_j - \bar{y})^2$

$SSE = \sum_{j=1}^{w}\sum_{i=1}^{n_j}(y_{i,j} - \bar{y}_j)^2$

$SST = \sum_{j=1}^{w}\sum_{i=1}^{n_j}(y_{i,j} - \bar{y})^2$

Source | SS | df
Factor | SSR | $w - 1$
Error | SSE | $n - w$
Total | SST | $n - 1$

Testing the Significance of the Factor
$t.s. = \dfrac{SSR \div (w-1)}{SSE \div (n-w)}$
• Reject $H_0$ if $t.s. \ge F_{\alpha,\,ndf,\,ddf}$; ndf $= w - 1$, ddf $= n - w$

Two-Way ANOVA – Additive Model
$Y_{i,j,k} = \mu + \alpha_j + \beta_k + \varepsilon_{i,j,k}$
• $i = 1, \dots, n^*$
• Factor A has $w$ levels, $j = 1, \dots, w$
• Factor B has $v$ levels, $k = 1, \dots, v$

$SSR_B = SSE_A - SSE_{add} = SSR_{add} - SSR_A$

Source | SS | df
Factor A | $SSR_A$ | $w - 1$
Factor B | $SSR_B$ | $v - 1$
Error | $SSE_{add}$ | $n - w - v + 1$
Total | SST | $n - 1$

Testing the Significance of Factor A
$t.s. = \dfrac{SSR_A \div (w-1)}{SSE_{add} \div (n-w-v+1)}$
• Reject $H_0$ if $t.s. \ge F_{\alpha,\,ndf,\,ddf}$; ndf $= w - 1$, ddf $= n - w - v + 1$

Testing the Significance of Factor B
$t.s. = \dfrac{SSR_B \div (v-1)}{SSE_{add} \div (n-w-v+1)}$
• Reject $H_0$ if $t.s. \ge F_{\alpha,\,ndf,\,ddf}$; ndf $= v - 1$, ddf $= n - w - v + 1$

Two-Way ANOVA – Additive Model without Replication
$Y_{j,k} = \mu + \alpha_j + \beta_k + \varepsilon_{j,k}$
• $n^* = 1$
• $j = 1, \dots, w$
• $k = 1, \dots, v$

$\bar{y}_{j\cdot} = \dfrac{1}{v}\sum_{k=1}^{v} y_{j,k}, \quad \bar{y}_{\cdot k} = \dfrac{1}{w}\sum_{j=1}^{w} y_{j,k}$

$SSR_A = \sum_{k=1}^{v}\sum_{j=1}^{w}(\bar{y}_{j\cdot} - \bar{y})^2 = \sum_{j=1}^{w} v(\bar{y}_{j\cdot} - \bar{y})^2$

$SSR_B = \sum_{k=1}^{v}\sum_{j=1}^{w}(\bar{y}_{\cdot k} - \bar{y})^2 = \sum_{k=1}^{v} w(\bar{y}_{\cdot k} - \bar{y})^2$

$SSE_{add} = \sum_{k=1}^{v}\sum_{j=1}^{w}(y_{j,k} - \bar{y}_{j\cdot} - \bar{y}_{\cdot k} + \bar{y})^2$

$SST = \sum_{k=1}^{v}\sum_{j=1}^{w}(y_{j,k} - \bar{y})^2$

Two-Way ANOVA – Model with Interactions
$Y_{i,j,k} = \mu + \alpha_j + \beta_k + \gamma_{j,k} + \varepsilon_{i,j,k}$
• $i = 1, \dots, n^*$
• $j = 1, \dots, w$
• $k = 1, \dots, v$

$SS_{diff} = SSE_{add} - SSE_{int} = SSR_{int} - SSR_{add}$

Source | SS | df
Factor A | $SSR_A$ | $w - 1$
Factor B | $SSR_B$ | $v - 1$
Interaction | $SS_{diff}$ | $(w-1)(v-1)$
Error | $SSE_{int}$ | $n - wv$
Total | SST | $n - 1$

Testing the Significance of Interactions
$t.s. = \dfrac{SS_{diff} \div [(w-1)(v-1)]}{SSE_{int} \div (n-wv)}$
• Reject $H_0$ if $t.s. \ge F_{\alpha,\,ndf,\,ddf}$; ndf $= (w-1)(v-1)$, ddf $= n - wv$

Testing the Significance of Factor A
$t.s. = \dfrac{SSR_A \div (w-1)}{SSE_{int} \div (n-wv)}$
• Reject $H_0$ if $t.s. \ge F_{\alpha,\,ndf,\,ddf}$; ndf $= w - 1$, ddf $= n - wv$

Testing the Significance of Factor B
$t.s. = \dfrac{SSR_B \div (v-1)}{SSE_{int} \div (n-wv)}$
• Reject $H_0$ if $t.s. \ge F_{\alpha,\,ndf,\,ddf}$; ndf $= v - 1$, ddf $= n - wv$

Other Key Ideas
• In testing whether a source is significant, the test statistic is the mean square of that source divided by the MSE of the model that has the most predictors.
• ANCOVA models have both quantitative and qualitative predictors.
• The uncorrected total sum of squares is $\sum_{i=1}^{n} y_i^2$. The sources of an ANOVA/ANCOVA table may sum to the uncorrected total rather than the corrected total.
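As a numerical illustration of the one-way ANOVA decomposition and F statistic, the sketch below uses three made-up groups ($w = 3$, unequal sizes) and checks $SST = SSR + SSE$:

```python
# One-way ANOVA by hand (illustrative data).
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0, 8.0],
    "C": [3.0, 2.0, 4.0],
}

all_y = [v for ys in groups.values() for v in ys]
n = len(all_y)
w = len(groups)
grand_mean = sum(all_y) / n

# SSR: between-group sum of squares, n_j * (ybar_j - ybar)^2
SSR = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
          for ys in groups.values())
# SSE: within-group sum of squares
SSE = sum(sum((v - sum(ys) / len(ys)) ** 2 for v in ys)
          for ys in groups.values())
SST = sum((v - grand_mean) ** 2 for v in all_y)

# Compare against F_{alpha, w-1, n-w}
F = (SSR / (w - 1)) / (SSE / (n - w))
```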

Linear Model Assumptions

Leverage
$h_i = \dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{\sum_{m=1}^{n}(x_m - \bar{x})^2}$ for SLR
• $h_i$ is the $i$th diagonal entry of $\mathbf{H}$.
• $\sum_{i=1}^{n} h_i = p + 1$

Standardized Residuals
$e_{sta,i} = \dfrac{e_i}{\sqrt{MSE(1 - h_i)}}$

DFITS
$DFITS_i = e_{sta,i}\sqrt{\dfrac{h_i}{1 - h_i}}$

Cook's Distance
$d_i = \dfrac{DFITS_i^2}{p + 1} = \dfrac{e_{sta,i}^2\,h_i}{(p+1)(1-h_i)} = \dfrac{e_i^2\,h_i}{MSE\,(p+1)(1-h_i)^2}$

Plots of Residuals
• $e$ versus $\hat{y}$: residuals are well-behaved if points appear to be randomly scattered, residuals seem to average to 0, and the spread of residuals does not change.
• $e$ versus $i$: detects dependence of error terms.
• QQ plot of $e$: checks the normality of error terms.

Variance Inflation Factor
$VIF_j = \dfrac{1}{1 - R_j^2}$
$VIF_j > 5$ indicates multicollinearity.

Curse of Dimensionality
Having many predictors in a model increases the risk of including noise predictors that are not associated with the response.

Model Selection
• $g$: Total # of predictors in consideration
• $p$: # of predictors for a specific model
• $MSE_g$: MSE of the model that uses all $g$ predictors
• $M_p$: The "best" model with $p$ predictors

Best Subset Selection
1. For $p = 0, 1, \dots, g$, fit all $\binom{g}{p}$ models with $p$ predictors. The model with the largest $R^2$ is $M_p$.
2. Choose the best model among $M_0, \dots, M_g$ using a selection criterion of choice.

Forward Stepwise Selection
1. Fit all $g$ simple linear regression models. The model with the largest $R^2$ is $M_1$.
2. For $p = 2, \dots, g$, fit the models that add one of the remaining predictors to $M_{p-1}$. The model with the largest $R^2$ is $M_p$.
3. Choose the best model among $M_0, \dots, M_g$ using a selection criterion of choice.

Backward Stepwise Selection
1. Fit the model with all $g$ predictors, $M_g$.
2. For $p = g - 1, \dots, 1$, fit the models that drop one of the predictors from $M_{p+1}$. The model with the largest $R^2$ is $M_p$.
3. Choose the best model among $M_0, \dots, M_g$ using a selection criterion of choice.

Selection Criteria
• Adjusted $R^2$
• Mallows' $C_p$: $C_p = \dfrac{1}{n}\left(SSE + 2p \cdot MSE_g\right)$
• Akaike information criterion: $AIC = \dfrac{1}{n}\left(SSE + 2p \cdot MSE_g\right)$
• Bayesian information criterion: $BIC = \dfrac{1}{n}\left(SSE + \ln n \cdot p \cdot MSE_g\right)$
• Cross-validation error

Validation Set
• Randomly splits all available observations into two groups: the training set and the validation set.
• Only the observations in the training set are used to attain the fitted model, and those in the validation set are used to estimate the test MSE.

$k$-fold Cross-Validation
1. Randomly divide all available observations into $k$ folds.
2. For $v = 1, \dots, k$, obtain the $v$th fit by training with all observations except those in the $v$th fold.
3. For $v = 1, \dots, k$, use $\hat{y}$ from the $v$th fit to calculate a test MSE estimate with observations in the $v$th fold.
4. To calculate the CV error, average the $k$ test MSE estimates in the previous step.

Leave-One-Out Cross-Validation (LOOCV)
• Calculate the LOOCV error as a special case of $k$-fold cross-validation where $k = n$.

$\text{LOOCV Error} = \dfrac{1}{n}\sum_{i=1}^{n}\left(\dfrac{y_i - \hat{y}_i}{1 - h_i}\right)^2$ for MLR

Key Ideas on Cross-Validation
• The validation set approach has unstable results and will tend to overestimate the test MSE. The two other approaches mitigate these issues.
• With respect to bias, LOOCV < $k$-fold CV < Validation Set.
• With respect to variance, LOOCV > $k$-fold CV > Validation Set.
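For least-squares fits, the LOOCV shortcut formula agrees exactly with refitting the model $n$ times. A minimal pure-Python sketch for SLR, with made-up data:

```python
# LOOCV for SLR two ways: the hat-value shortcut, and brute force.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
n = len(x)

def slr_fit(xs, ys):
    xb = sum(xs) / len(xs)
    yb = sum(ys) / len(ys)
    b1 = (sum((a - xb) * (b - yb) for a, b in zip(xs, ys))
          / sum((a - xb) ** 2 for a in xs))
    return yb - b1 * xb, b1

# Shortcut: (1/n) * sum(((y_i - yhat_i) / (1 - h_i))^2), fitting only once
b0, b1 = slr_fit(x, y)
xbar = sum(x) / n
Sxx = sum((a - xbar) ** 2 for a in x)
shortcut = sum(((yi - (b0 + b1 * xi)) /
                (1 - (1 / n + (xi - xbar) ** 2 / Sxx))) ** 2
               for xi, yi in zip(x, y)) / n

# Brute force: refit n times, each time leaving one observation out
brute = 0.0
for i in range(n):
    xs = x[:i] + x[i + 1:]
    ys = y[:i] + y[i + 1:]
    a0, a1 = slr_fit(xs, ys)
    brute += (y[i] - (a0 + a1 * x[i])) ** 2
brute /= n
```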

Other Linear Regression Approaches

Standardizing Variables
• A centered variable is the result of subtracting the sample mean from a variable.
• A scaled variable is the result of dividing a variable by its standard deviation.
• A standardized variable is the result of first centering a variable, then scaling it.

Shrinkage Methods

Ridge: minimize $SSE + \lambda\sum_{j=1}^{p}\hat{\beta}_j^2$, or equivalently, minimize $SSE$ subject to $\sum_{j=1}^{p}\hat{\beta}_j^2 \le a$. The $\ell^2$ norm is $\|\hat{\boldsymbol{\beta}}\|_2 = \sqrt{\sum_{j=1}^{p}\hat{\beta}_j^2}$.

Lasso: minimize $SSE + \lambda\sum_{j=1}^{p}|\hat{\beta}_j|$, or equivalently, minimize $SSE$ subject to $\sum_{j=1}^{p}|\hat{\beta}_j| \le a$. The $\ell^1$ norm is $\|\hat{\boldsymbol{\beta}}\|_1 = \sum_{j=1}^{p}|\hat{\beta}_j|$.

• $\lambda$: Tuning parameter
• $a$: Budget parameter
• $x_1, \dots, x_p$ are scaled predictors.
• $\lambda$ is inversely related to flexibility.
• With a finite $\lambda$, none of the ridge estimates will equal 0, but the lasso estimates could equal 0.

Principal Components
$z_m = \sum_{j=1}^{p}\phi_{j,m}\,x_j$
$\sum_{j=1}^{p}\phi_{j,m}^2 = 1$
$\sum_{j=1}^{p}\phi_{j,m} \cdot \phi_{j,u} = 0, \quad m \ne u$
• Unsupervised technique that performs dimension reduction on $p$ variables.
• The variability explained by each subsequent principal component is always less than the variability explained by its previous principal component.
• Principal components form the lower-dimension surface that is closest to the observations in $p$-dimensional space.
• Standardizing the variables makes the loadings resistant to the varying scales among the original variables.

Principal Components Regression
• Uses the first $k$ principal components, which are orthogonal, as predictors in an MLR.
• $k$ is a measure of flexibility.
• When $k = p$, PCR is equivalent to performing MLR with the $p$ original variables as predictors.

Partial Least Squares
• Supervised technique that performs dimension reduction on $p$ variables.
• Uses the first $k$ PLS directions, which are orthogonal, as predictors in an MLR.
• $k$ is a measure of flexibility.
• When $k = p$, PLS is equivalent to performing MLR with the $p$ original variables as predictors.
• The first PLS direction is a linear combination of the $p$ standardized predictors, with coefficients that are based on the response $y$.
• Every subsequent PLS direction is calculated iteratively as a linear combination of "updated predictors", which are the residuals of fits with the "previous predictors" explained by the previous direction.
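Ridge shrinkage can be seen in the one-predictor special case (centered $x$ and $y$, no intercept), where minimizing $SSE + \lambda\beta^2$ has a closed form; the data below are made up:

```python
# Ridge with one centered predictor and no intercept:
# the minimizer of sum((y - beta*x)^2) + lam * beta^2 is Sxy / (Sxx + lam).
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-3.9, -2.1, 0.1, 1.8, 4.1]

Sxy = sum(a * b for a, b in zip(x, y))
Sxx = sum(a * a for a in x)

def ridge_beta(lam):
    # lam = 0 recovers the OLS estimate; larger lam shrinks toward 0
    return Sxy / (Sxx + lam)

betas = [ridge_beta(lam) for lam in (0.0, 1.0, 10.0, 100.0)]
# Estimates shrink toward zero as lambda grows but never reach it exactly,
# matching the bullet above on ridge vs. lasso.
```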

Generalized Linear Models

Exponential Family*
$f(y) = \exp[a(y) \cdot b(\theta) + c(\theta) + d(y)]$

$E[a(Y)] = -\dfrac{c'(\theta)}{b'(\theta)}$

$Var[a(Y)] = \dfrac{b''(\theta)c'(\theta) - c''(\theta)b'(\theta)}{[b'(\theta)]^3}$

Canonical Form
• $a(y) = y$
• $b(\theta)$ is the natural parameter
• $\mu = E[Y]$ is a function of $\theta$
• $Var[Y]$ is a function of $\mu$

*Key results on the Exponential Family are on page 21.

Model Framework
$g(\mu) = \mathbf{x}^T\boldsymbol{\beta} = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p$

Function Name | $g(\mu)$
Identity | $\mu$
Logit | $\ln\left(\dfrac{\mu}{1-\mu}\right)$
Logarithmic | $\ln\mu$
Inverse | $\dfrac{1}{\mu}$
Power | $\mu^d$

Distribution | Canonical Link
Normal | Identity
Binomial | Logit
Poisson | Logarithmic
Gamma | Inverse
Inverse Gaussian | Inverse squared

Parameter Estimation
$l(\boldsymbol{\beta}) = \sum_{i=1}^{n}[y_i \cdot b(\theta_i) + c(\theta_i) + d(y_i)]$

$\hat{\mu} = g^{-1}\big(\mathbf{x}^T\hat{\boldsymbol{\beta}}\big)$

$u_j = \sum_{i=1}^{n}\dfrac{(y_i - \mu_i)\,x_{i,j}}{Var[Y_i] \cdot g'(\mu_i)}$

$\mathbf{I} = \sum_{i=1}^{n}\dfrac{\mathbf{x}_i\mathbf{x}_i^T}{Var[Y_i] \cdot g'(\mu_i)^2}$

Parameter Estimation – Method of Scoring
$\hat{\boldsymbol{\beta}}^{(k)} = \hat{\boldsymbol{\beta}}^{(k-1)} + \left[\mathbf{I}^{(k-1)}\right]^{-1}\mathbf{u}^{(k-1)} = \left(\mathbf{X}^T\mathbf{W}^{(k-1)}\mathbf{X}\right)^{-1}\mathbf{X}^T\mathbf{W}^{(k-1)}\mathbf{z}^{(k-1)}$

$w_i = \dfrac{1}{Var[Y_i] \cdot g'(\mu_i)^2}$

$z_i = g(\mu_i) + (y_i - \mu_i)\,g'(\mu_i)$

Numerical Results
$D = 2\left[l_{sat} - l\big(\hat{\boldsymbol{\beta}}\big)\right]$

$R^2_{pseu.} = 1 - \dfrac{l\big(\hat{\boldsymbol{\beta}}\big)}{l_{null}}$

$AIC = -2 \cdot l\big(\hat{\boldsymbol{\beta}}\big) + 2k$
$BIC = -2 \cdot l\big(\hat{\boldsymbol{\beta}}\big) + k\ln n$
where $k$ is the # of estimated parameters

Residuals
Raw residual: $e_i = y_i - \hat{\mu}_i$
Pearson residual: $e_i^P = \dfrac{e_i}{\sqrt{\widehat{Var}[Y_i]}}$, with $e_{sta,i}^P = \dfrac{e_i^P}{\sqrt{1-h_i}}$
• The Pearson chi-square statistic is $\sum_{i=1}^{n}\big(e_i^P\big)^2$.
Deviance residual: $e_i^D = \pm\sqrt{D_i}$, whose sign follows the $i$th raw residual, with $e_{sta,i}^D = \dfrac{e_i^D}{\sqrt{1-h_i}}$
• The deviance is $\sum_{i=1}^{n}\big(e_i^D\big)^2$.

Inference
• Score statistics $\mathbf{U}$ asymptotically follow a multivariate normal distribution with mean $\mathbf{0}$ and asymptotic variance-covariance matrix $\mathbf{I}$. Thus, $\mathbf{U}^T\mathbf{I}^{-1}\mathbf{U}$ follows an approximate chi-square distribution with $p + 1$ degrees of freedom.
• Maximum likelihood estimators $\hat{\boldsymbol{\beta}}$ asymptotically follow a multivariate normal distribution with mean $\boldsymbol{\beta}$ and asymptotic variance-covariance matrix $\mathbf{I}^{-1}$.
• Overdispersion can be addressed by the quasi-likelihood method, which changes the variance to: $Var[Y_i] = \phi \cdot \text{original variance}$

Likelihood Ratio Test
$t.s. = 2\left[l\big(\hat{\boldsymbol{\beta}}_F\big) - l\big(\hat{\boldsymbol{\beta}}_R\big)\right] = D_R - D_F$
• Reject $H_0$ if $t.s. \ge \chi^2_{1-\alpha,\,p_F-p_R}$

Wald Test
$t.s. = \left(\dfrac{\hat{\beta}_j - h}{se(\hat{\beta}_j)}\right)^2$
• Reject $H_0$ if $t.s. \ge \chi^2_{1-\alpha,\,1}$
• $\big(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}\big)^T\mathbf{I}\big(\hat{\boldsymbol{\beta}}\big)\big(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}\big)$ follows an approximate chi-square distribution with $p + 1$ degrees of freedom.

Tweedie Distributions
$Var[Y] = a \cdot E[Y]^d$

Distribution | $d$
Normal | 0
Poisson | 1
Compound Poisson-Gamma | (1, 2)
Gamma | 2
Inverse Gaussian | 3

Connection with MLR
• A GLM with a normally distributed response, identity link, and homoscedasticity is the same as MLR.
• MLE estimates = OLS estimates
• $\sigma^2 D = SSE$
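The scoring iteration can be sketched for the simplest case, an intercept-only Poisson GLM with log link, where the score and information collapse to scalars ($u = \sum(y_i - \mu)$ and $I = n\mu$, since $Var[Y_i] \cdot g'(\mu_i) = 1$ for the canonical link) and the closed-form MLE $\ln\bar{y}$ is available as a check. The data below are made up.

```python
import math

# Method of scoring for an intercept-only Poisson GLM with log link:
# mu = exp(beta0); score u = sum(y_i) - n*mu; information I = n*mu.
y = [2, 0, 3, 1, 4, 2]
n = len(y)

beta = 0.0                        # starting value
for _ in range(25):
    mu = math.exp(beta)
    u = sum(y) - n * mu           # score
    info = n * mu                 # information
    beta = beta + u / info        # scoring update

# The closed-form MLE for this model is ln(ybar)
mle = math.log(sum(y) / n)
```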

Binomial and Categorical Response Regression

Binomial Response Variable
• The odds of an event are the ratio of the probability that the event will occur to the probability that the event will not occur, i.e., $\text{odds} = \dfrac{q}{1-q}$
• The odds ratio is the ratio of the odds of an event with the presence of a characteristic to the odds of the same event without the presence of that characteristic.

Function Name | $g(q)$
Logit | $\ln\left(\dfrac{q}{1-q}\right)$
Probit | $\Phi^{-1}(q)$
Complementary log-log | $\ln[-\ln(1-q)]$

Logistic Regression
$q_i = \dfrac{\exp\big(\mathbf{x}_i^T\boldsymbol{\beta}\big)}{1 + \exp\big(\mathbf{x}_i^T\boldsymbol{\beta}\big)}$

$u_j = \sum_{i=1}^{n}(y_i - \mu_i)\,x_{i,j}$

$\mathbf{I} = \sum_{i=1}^{n} m_i q_i(1-q_i)\,\mathbf{x}_i\mathbf{x}_i^T$

$l(\boldsymbol{\beta}) = \sum_{i=1}^{n}\left[y_i\ln\left(\dfrac{q_i}{1-q_i}\right) + m_i\ln(1-q_i) + \ln\dbinom{m_i}{y_i}\right]$

$D = 2\sum_{i=1}^{n}\left[y_i\ln\left(\dfrac{y_i}{\hat{\mu}_i}\right) + (m_i - y_i)\ln\left(\dfrac{m_i - y_i}{m_i - \hat{\mu}_i}\right)\right]$

$e_i^P = \dfrac{y_i - m_i q_i}{\sqrt{m_i q_i(1-q_i)}}$

Pearson chi-square statistic $= \sum_{i=1}^{n}\dfrac{(y_i - m_i q_i)^2}{m_i q_i(1-q_i)}$

Nominal Response
Let $\pi_{i,c}$ be the probability that the $i$th observation is classified as category $c$. $k$ is the reference category.

$\ln\left(\dfrac{\pi_{i,c}}{\pi_{i,k}}\right) = \mathbf{x}_i^T\boldsymbol{\beta}_c$

$\pi_{i,c} = \begin{cases} \dfrac{\exp\big(\mathbf{x}_i^T\boldsymbol{\beta}_c\big)}{1 + \sum_{\text{all } c}\exp\big(\mathbf{x}_i^T\boldsymbol{\beta}_c\big)}, & c \ne k \\[2ex] \dfrac{1}{1 + \sum_{\text{all } c}\exp\big(\mathbf{x}_i^T\boldsymbol{\beta}_c\big)}, & c = k \end{cases}$

Ordinal Response – Proportional Odds Cumulative
$\ln\left(\dfrac{\Pi_{i,c}}{1 - \Pi_{i,c}}\right) = \beta_{0,c} + \mathbf{x}_i^T\boldsymbol{\beta}$

$\Pi_c = \pi_1 + \cdots + \pi_c$

$\mathbf{x}_i = \begin{pmatrix} x_{i,1} \\ \vdots \\ x_{i,p} \end{pmatrix}, \quad \boldsymbol{\beta} = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_p \end{pmatrix}$

A ratio of cumulative odds is not a function of the predictor values, e.g.,
$\dfrac{\hat{\Pi}_1 \div \big(1 - \hat{\Pi}_1\big)}{\hat{\Pi}_2 \div \big(1 - \hat{\Pi}_2\big)} = \exp\big(\hat{\beta}_{0,1} - \hat{\beta}_{0,2}\big)$

Poisson Response Regression
$\mu_i = a_i \cdot \exp\big(\mathbf{x}_i^T\boldsymbol{\beta}\big)$
where $a_i$ is the exposure amount

$l(\boldsymbol{\beta}) = \sum_{i=1}^{n}[y_i\ln\mu_i - \mu_i - \ln(y_i!)]$

$u_j = \sum_{i=1}^{n}(y_i - \mu_i)\,x_{i,j}$

$\mathbf{I} = \sum_{i=1}^{n}\mu_i\,\mathbf{x}_i\mathbf{x}_i^T$

$D = 2\sum_{i=1}^{n}\left[y_i\ln\left(\dfrac{y_i}{\hat{\mu}_i}\right) - (y_i - \hat{\mu}_i)\right] = 2\sum_{i=1}^{n} y_i\ln\left(\dfrac{y_i}{\hat{\mu}_i}\right)$

$e_i^P = \dfrac{y_i - \hat{\mu}_i}{\sqrt{\hat{\mu}_i}}$

Pearson chi-square statistic $= \sum_{i=1}^{n}\dfrac{(y_i - \hat{\mu}_i)^2}{\hat{\mu}_i}$

Log-Linear Models
• Assess whether there is an association or dependence between two factors.
• The response is the count in each cell of the contingency table created by the two factors.
• Key results of the multinomial model and the product multinomial model are shared with the Poisson model.
• In testing the interaction effects with a likelihood ratio test, the reduced model does not have the interaction terms as predictors, while the full model has the interaction terms.
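The odds and odds-ratio definitions can be illustrated with hypothetical logit-link coefficients `b0` and `b1` (not from any fitted model); under the logit link, a one-unit increase in the predictor multiplies the odds by $e^{\beta_1}$:

```python
import math

# Odds and odds ratios under a logit link (hypothetical coefficients):
# logit(q) = b0 + b1*x, so odds(q(x)) = exp(b0 + b1*x).
b0, b1 = -1.0, 0.8

def q(x):
    eta = b0 + b1 * x
    return math.exp(eta) / (1 + math.exp(eta))

def odds(p):
    return p / (1 - p)

# Odds ratio for x = 1 vs. x = 0; equals exp(b1)
odds_ratio = odds(q(1.0)) / odds(q(0.0))
```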

Basis Function Approaches and Generalized Additive Models

Basis Functions
$Y = \beta_0 + \beta_1 b_1(x) + \cdots + \beta_p b_p(x) + \varepsilon$
The # of degrees of freedom used is the # of regression coefficients, i.e., $p + 1$.

Step Functions
$b_j(x) = \begin{cases} I\big(\xi_j \le x < \xi_{j+1}\big), & j = 1, \dots, k-1 \\ I(x \ge \xi_k), & j = k \end{cases}$

Piecewise Polynomial Regression
The basis functions are:
• $x, x^2, \dots, x^d$
• $k$ step functions
• $dk$ interaction terms

Regression Splines
• A degree-$d$ spline is a continuous piecewise degree-$d$ polynomial with continuity in derivatives up to degree $d - 1$ at each knot.
• The basis functions of a cubic spline can be $x, x^2, x^3, (x - \xi_1)_+^3, \dots, (x - \xi_k)_+^3$.
• A natural spline is a regression spline that is linear instead of a polynomial in the boundary regions.

Smoothing Splines
Minimize $\sum_{i=1}^{n}[y_i - g(x_i)]^2 + \lambda\int_{-\infty}^{\infty} g''(t)^2\,dt$
• Smoothing parameter $\lambda$ is inversely related to flexibility.
• $g(x)$ has the same form as the fitted natural cubic spline with knots at the $n$ values of $x$.
• Effective degrees of freedom measures flexibility as the sum of the diagonal entries of $\mathbf{S}_\lambda$, where $\hat{\mathbf{y}}_\lambda = \mathbf{S}_\lambda\mathbf{y}$.

Local Regression
• Calculates the fitted value for a specific input by mimicking weighted least squares, i.e., minimize $\sum_{i=1}^{n} w_i(y_i - \hat{y}_i)^2$.
• Weights are determined by the span and the weighting function, such that observations nearer to the input are given larger weights.
• Span is inversely related to flexibility.
• Does not perform well in high dimensions.

Generalized Additive Models
• Each explanatory variable contributes to the mean response independently of the other explanatory variables; no interactions are considered.
• The effect of each explanatory variable on the response can be investigated individually, assuming the other variables are held constant.
• Backfitting can be used for fitting if ordinary least squares cannot.
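The step-function basis can be sketched directly from its definition; the knot values below are arbitrary:

```python
# Step-function basis: with knots xi_1 < ... < xi_k,
# b_j(x) = I(xi_j <= x < xi_{j+1}) for j < k and b_k(x) = I(x >= xi_k).
knots = [1.0, 2.5, 4.0]          # xi_1, xi_2, xi_3 (k = 3)

def step_basis(x):
    k = len(knots)
    out = []
    for j in range(k):
        if j < k - 1:
            out.append(1 if knots[j] <= x < knots[j + 1] else 0)
        else:
            out.append(1 if x >= knots[k - 1] else 0)
    return out

rows = [step_basis(x) for x in (0.5, 1.7, 3.0, 4.2)]
# Each row has at most one 1; an x below the first knot falls in the
# baseline region, where the fit is just the intercept.
```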

Key Results for Distributions in the Exponential Family

Distribution | $\theta$ | Natural Parameter, $b(\theta)$ | $c(\theta)$
Binomial, fixed $m$ | $q$ | $\ln\left(\dfrac{q}{1-q}\right)$ | $m\ln(1-q)$
Normal, fixed $\sigma^2$ | $\mu$ | $\dfrac{\mu}{\sigma^2}$ | $-\dfrac{\mu^2}{2\sigma^2}$
Poisson | $\lambda$ | $\ln\lambda$ | $-\lambda$
Gamma, fixed $\alpha$ | $\theta$ | $-\dfrac{1}{\theta}$ | $-\alpha\ln\theta$
Inverse Gaussian, fixed $\theta$ | $\mu$ | $-\dfrac{\theta}{2\mu^2}$ | $\dfrac{\theta}{\mu}$
Negative Binomial, fixed $r$ | $\beta$ | $\ln\left(\dfrac{\beta}{1+\beta}\right)$ | $-r\ln(1+\beta)$

Number of Predictors for GAMs with a $d$th-degree polynomial and $k$ knots

Model | # of Predictors, $p$
Polynomial | $d$
Piecewise constant | $k$
Piecewise polynomial | $d + k + dk$
Continuous piecewise polynomial | $d + dk$
Cubic spline | $3 + k$
Natural cubic spline | $k - 1$
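The exponential family identities from the GLM section, $E[a(Y)] = -c'(\theta)/b'(\theta)$ and $Var[a(Y)] = [b''(\theta)c'(\theta) - c''(\theta)b'(\theta)]/[b'(\theta)]^3$, can be sanity-checked against the Poisson row of this table with numerical derivatives; for the Poisson, both should come out to $\lambda$:

```python
import math

# Poisson row: b(theta) = ln(theta), c(theta) = -theta, with theta = lambda.
def b(t): return math.log(t)
def c(t): return -t

def d1(f, t, h=1e-6):
    # Central-difference first derivative
    return (f(t + h) - f(t - h)) / (2 * h)

def d2(f, t, h=1e-4):
    # Central-difference second derivative
    return (f(t + h) - 2 * f(t) + f(t - h)) / h ** 2

lam = 3.7
mean = -d1(c, lam) / d1(b, lam)                           # should be ~lam
var = (d2(b, lam) * d1(c, lam) - d2(c, lam) * d1(b, lam)) / d1(b, lam) ** 3
```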

NOTATION

$X \sim \text{Name(parameters)}$ represents $X$ follows a "Name" distribution with "parameters" following the parametrization on the exam table.

Probability Models

Symbol | Description
$\mathbf{A}^T$ | Transpose of matrix $\mathbf{A}$
$\mathbf{A}^{-1}$ | Inverse of matrix $\mathbf{A}$

Statistics

Symbol | Description
$H_0$ | Null hypothesis
$H_1$ | Alternative hypothesis
$\alpha$ | Significance level
$t.s.$ | Test statistic
$h$ | Hypothesized value
df | Degrees of freedom
ndf | Numerator degrees of freedom
ddf | Denominator degrees of freedom
$t_{2(1-q),\,df}$ | 100$q$th percentile of a $t$-distribution
$F_{1-q,\,ndf,\,ddf}$ | 100$q$th percentile of an $F$-distribution
$\chi^2_{q,\,df}$ | 100$q$th percentile of a chi-square distribution
$se$ | Estimated standard error

Extended Linear Models

Symbol | Description
$n$ | # of observations
$p$ | # of predictors
SST | Total sum of squares
SSR | Regression sum of squares
SSE/RSS | Error sum of squares
SS | Sum of squares
MS | Mean square
$E[Y]$, $\mu$ | Mean response
$g(\mu)$ | Link function
$l\big(\hat{\boldsymbol{\beta}}\big)$ | Maximized log-likelihood
$l_{null}$ | Maximized log-likelihood for the null model
$l_{sat}$ | Maximized log-likelihood for the saturated model
$\mathbf{I}$ | Information matrix
$D$ | Deviance statistic

