MAS-I Formula Sheet
Updated 4/9/24
PROBABILITY MODELS

Probability

Basics

CDFs, Survival Functions, and Hazard Functions
F(x) = \Pr(X \le x) = \int_{-\infty}^{x} f(t)\,dt
S(x) = \Pr(X > x) = \int_{x}^{\infty} f(t)\,dt
h(x) = \frac{f(x)}{S(x)}
H(x) = \int_{-\infty}^{x} h(t)\,dt = -\ln S(x)
S(x) = e^{-H(x)}

Percentiles
The 100q-th percentile is \pi_q, where F(\pi_q) = q.

Mode
The mode is the x that maximizes f(x).

Moments
E[g(X)] = \int_{-\infty}^{\infty} g(x) \cdot f(x)\,dx = \int_{0}^{\infty} g'(x) \cdot S(x)\,dx \ (\text{second form for } X \ge 0)
Var[g(X)] = E[g(X)^2] - (E[g(X)])^2
\mu_k' = E[X^k]
\mu = \mu_1' = E[X]
\mu_k = E[(X - \mu)^k]
\sigma^2 = \mu_2 = Var[X]
Cov[X, Y] = E[XY] - E[X] \cdot E[Y]
Cov[X, X] = Var[X]
Coefficient of variation: CV = \frac{\sigma}{\mu}
Skewness = \frac{\mu_3}{\sigma^3}
Kurtosis = \frac{\mu_4}{\sigma^4}

Moment Generating Function (MGF)
M_X(z) = E[e^{zX}]
M_X^{(n)}(0) = E[X^n], where M_X^{(n)} is the n-th derivative

Probability Generating Function (PGF)
P_X(z) = E[z^X]
P_X^{(n)}(1) = E[X(X-1)\cdots(X-n+1)], where P_X^{(n)} is the n-th derivative

Conditional Distribution
\Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)} = \frac{\Pr(B \mid A)\Pr(A)}{\Pr(B)}
f_{X \mid j < X < k}(x) = \frac{f_X(x)}{\Pr(j < X < k)}, where j < x < k

Law of Total Probability
\Pr(X = x) = E_Y[\Pr(X = x \mid Y)]

Law of Total Expectation
E_X[X] = E_Y\big[E_X[X \mid Y]\big]

Law of Total Variance
Var_X[X] = E_Y\big[Var_X[X \mid Y]\big] + Var_Y\big[E_X[X \mid Y]\big]

Independence
\Pr(A \cap B) = \Pr(A) \cdot \Pr(B)
For independent X and Y:
• f_{X,Y}(x, y) = f_X(x) \cdot f_Y(y)
• E[g(X) \cdot h(Y)] = E[g(X)] \cdot E[h(Y)]

Claim Severity Distributions

Common Distributions
S-P Pareto(\alpha, \theta) ~ Pareto(\alpha, \theta) + \theta
Beta(a = 1, b = 1, \theta) ~ Uniform(0, \theta)
Weibull(\theta, \tau = 1) ~ Exponential(\theta)
Gamma(\alpha = 1, \theta) ~ Exponential(\theta)

Gamma CDF Shortcut
F_X(x) = 1 - \Pr(N < \alpha), where
• \alpha is a positive integer
• X ~ Gamma(\alpha, \theta)
• N ~ Poisson(x/\theta)

Properties of Exponential Distribution
For X_i ~ Exponential(\theta_i):
E[X] = \theta
h(x) = \frac{1}{\theta} = \lambda
\Pr(X > t + s \mid X > t) = \Pr(X > s)
\Pr(X_1 < X_2) = \frac{\lambda_1}{\lambda_1 + \lambda_2}
\min(X_1, X_2, \ldots, X_n) ~ Exponential\!\left(\frac{1}{\sum_{i=1}^{n} \lambda_i}\right)
\sum_{i=1}^{n} X_i ~ Gamma(n, \theta), where each \theta_i = \theta

Greedy Algorithms

Algorithm A
For i = 1, 2, \ldots, n:
1. Choose the assignment with the lowest cost, i.e., \min_j C_{i,j}, among all n - i + 1 possible assignments.
2. Assign that job to that employee.
3. Remove that employee and that job from their respective sets.

Algorithm B
For k = n^2, (n-1)^2, \ldots, 1^2:
1. Choose the assignment with the lowest cost, i.e., \min_{i,j} C_{i,j}, among all k possible assignments.
2. Assign that job to that employee.
3. Remove that employee and that job from their respective sets.

E[\text{Total Cost}] = \theta \sum_{i=1}^{n} \frac{1}{i}, where C_{i,j} ~ Exponential(\theta)
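A minimal simulation sketch, not from the source, comparing the two greedy assignment rules on i.i.d. exponential costs; the function names, cost matrix, and trial counts are assumptions made for illustration. The sample averages can be compared against \theta \sum_{i=1}^{n} 1/i above.

```python
import random

def algorithm_a(cost):
    """Row-by-row greedy: employee i takes the cheapest remaining job."""
    n = len(cost)
    remaining_jobs = set(range(n))
    total = 0.0
    for i in range(n):
        j = min(remaining_jobs, key=lambda jj: cost[i][jj])
        total += cost[i][j]
        remaining_jobs.remove(j)
    return total

def algorithm_b(cost):
    """Global greedy: repeatedly take the cheapest remaining (employee, job) pair."""
    n = len(cost)
    employees, jobs = set(range(n)), set(range(n))
    total = 0.0
    while employees:
        i, j = min(((i, j) for i in employees for j in jobs),
                   key=lambda ij: cost[ij[0]][ij[1]])
        total += cost[i][j]
        employees.remove(i)
        jobs.remove(j)
    return total

# Compare sample averages to theta * sum(1/i) (an assumed check, n and theta made up).
random.seed(1)
theta, n, trials = 10.0, 5, 2000
costs = [[[random.expovariate(1 / theta) for _ in range(n)] for _ in range(n)]
         for _ in range(trials)]
avg_a = sum(algorithm_a(c) for c in costs) / trials
avg_b = sum(algorithm_b(c) for c in costs) / trials
print(avg_a, avg_b, theta * sum(1 / i for i in range(1, n + 1)))
```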
Transformations
• Scaling: \theta is a scale parameter for all continuous distributions on the exam table, except lognormal, inverse Gaussian, and log-t.
• CDF Method
• PDF Method
• MGF Method

Mixtures

Discrete Mixture
f_Y(y) = \sum_{i=1}^{n} w_i \cdot f_{X_i}(y), where \sum_{i=1}^{n} w_i = 1
F_Y(y) = \sum_{i=1}^{n} w_i \cdot F_{X_i}(y)
S_Y(y) = \sum_{i=1}^{n} w_i \cdot S_{X_i}(y)
E[Y^k] = \sum_{i=1}^{n} w_i \cdot E[X_i^k]

Insurance Applications
Y^L: payment per loss

Policy Limits, u
Y^L = X \wedge u = \min(X, u) = \begin{cases} X, & X < u \\ u, & X \ge u \end{cases}
E[(Y^L)^k] = E[(X \wedge u)^k] = \int_{0}^{u} x^k f(x)\,dx + u^k \cdot S(u) = \int_{0}^{u} k x^{k-1} S(x)\,dx

Deductibles, d
Ordinary deductible:
Y^L = (X - d)_+ = \begin{cases} 0, & X < d \\ X - d, & X \ge d \end{cases}
E[Y^L] = E[(X - d)_+] = E[X] - E[X \wedge d]
E[(Y^L)^k] = E\big[(X - d)_+^k\big] = \int_{d}^{\infty} (x - d)^k f(x)\,dx = \int_{d}^{\infty} k(x - d)^{k-1} S(x)\,dx

Impact of Deductibles on Claim Frequency
For v = \Pr(X > d), the number of payments N' is distributed as follows:
• Poisson: \lambda \to v\lambda
• Binomial: (m, q) \to (m, vq)
• Negative Binomial: (r, \beta) \to (r, v\beta)

The Ultimate Formula for Insurance
E[Y^L] = \alpha(1 + r)\left[E\!\left[X \wedge \frac{m}{1 + r}\right] - E\!\left[X \wedge \frac{d}{1 + r}\right]\right]
where
• d: deductible (set to 0 if not applicable)
• u: policy limit (set to \infty if not applicable)
• \alpha: coinsurance (set to 1 if not applicable)
• r: inflation rate (set to 0 if not applicable)
• m: maximum covered loss = \frac{u}{\alpha} + d
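An illustrative numeric check of the ultimate formula above, not from the source, using an Exponential(\theta) severity for which E[X \wedge u] = \theta(1 - e^{-u/\theta}); the parameter values in the example are made up.

```python
import math

def lim_ev_exponential(theta, u):
    """E[X ^ u] for X ~ Exponential(theta): theta * (1 - exp(-u/theta))."""
    return theta * (1.0 - math.exp(-u / theta))

def expected_payment_per_loss(theta, d=0.0, u=math.inf, alpha=1.0, r=0.0):
    """Ultimate formula: alpha*(1+r)*(E[X ^ m/(1+r)] - E[X ^ d/(1+r)]), m = u/alpha + d."""
    m = u / alpha + d
    return alpha * (1.0 + r) * (lim_ev_exponential(theta, m / (1.0 + r))
                                - lim_ev_exponential(theta, d / (1.0 + r)))

# Hypothetical policy: theta = 1000, 500 deductible, 3000 limit, 80% coinsurance, 5% inflation
print(expected_payment_per_loss(1000, d=500, u=3000, alpha=0.8, r=0.05))
```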
Poisson Processes
A counting process where increments over non-overlapping intervals are independent, and
N(t + h) - N(t) ~ Poisson(\lambda), where \lambda = \int_{t}^{t+h} \lambda(u)\,du
• Homogeneous if \lambda(t) is constant
• Non-homogeneous if \lambda(t) varies with t

Time between Events
T_k: time until the k-th event occurs
V_k = T_k - T_{k-1}

Reliability Theory
• A parallel system functions as long as one of the components functions.
• A series system functions only when all components function.
• A k-out-of-n system functions only when at least k out of n components function.
• A minimal path set, A_j, is a minimal set of components whose functioning guarantees the functioning of the system.
• A minimal cut set, C_j, is a minimal set of components whose failure guarantees the failure of the system.

Random Graphs
• n^{n-2} minimal path sets
• 2^{n-1} - 1 minimal cut sets
• P_{i,j} is the probability that nodes i and j are connected.
• P_n is the probability that a random graph is connected, where all P_{i,j} = p.
P_n = \begin{cases} 1, & n = 1 \\ p, & n = 2 \\ 1 - \sum_{k=1}^{n-1} \binom{n-1}{k-1} q^{k(n-k)} P_k, & n > 2 \end{cases}
where q = 1 - p.
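A short sketch, not from the source, that evaluates the P_n recursion above for a common edge probability p; the function name and the example call are assumptions.

```python
from math import comb

def prob_connected(n, p):
    """P_n: probability an n-node random graph with edge probability p is connected,
    via P_n = 1 - sum_{k=1}^{n-1} C(n-1, k-1) * q^{k(n-k)} * P_k, q = 1 - p."""
    q = 1.0 - p
    P = [0.0, 1.0]                  # P[1] = 1
    if n >= 2:
        P.append(p)                 # P[2] = p
    for m in range(3, n + 1):
        P.append(1.0 - sum(comb(m - 1, k - 1) * q ** (k * (m - k)) * P[k]
                           for k in range(1, m)))
    return P[n]

print(prob_connected(4, 0.5))       # 0.59375 for a 4-node graph with p = 0.5
```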
Discrete Markov Chains

Multiple-Step Transition Probabilities
• Chapman-Kolmogorov:
P_{i,j}^{n+m} = \sum_{k=1}^{\infty} P_{i,k}^{n} P_{k,j}^{m}
• Unconditional probability of being in state j at time n:
\Pr(X_n = j) = \sum_{i=1}^{\infty} \alpha_i P_{i,j}^{n}
where \alpha_i is the probability of being in state i at time 0.
• The probability of entering state j at time m, starting at state i, without entering any state in set \mathcal{A}:

State i | State j | Desired Probability
i \notin \mathcal{A} | j \notin \mathcal{A} | Q_{i,j}^{m}
i \notin \mathcal{A} | j \in \mathcal{A} | \sum_{k \notin \mathcal{A}} Q_{i,k}^{m-1} P_{k,j}
i \in \mathcal{A} | j \notin \mathcal{A} | \sum_{k \notin \mathcal{A}} P_{i,k} Q_{k,j}^{m-1}
i \in \mathcal{A} | j \in \mathcal{A} | \sum_{k \notin \mathcal{A}} \sum_{l \notin \mathcal{A}} P_{i,k} Q_{k,l}^{m-2} P_{l,j}

Classification of States
• Absorbing: State that cannot be left once it is entered
• Accessible: State that can be entered from another state
• Communicating: Two states that are accessible to each other
• Class: A set of communicating states
• Irreducible: A chain with only one class
• Recurrent: Probability of re-entering the state is 1, f_i = 1
• Transient: Probability of re-entering the state is less than 1, f_i < 1
• Given that a process starts in a transient state i, the number of times the process re-enters state i, n \ge 0, has a geometric distribution with \beta = \frac{f_i}{1 - f_i}.
• Positive recurrent: Finite expected number of transitions for a chain to return to state j, given it started in that state
• Null recurrent: Infinite expected number of transitions for a chain to return to state j, given it started in that state
• Aperiodic: A chain that has limiting probabilities
• Periodic: A chain that does not have limiting probabilities
• Ergodic: A chain that is irreducible, positive recurrent, and aperiodic

Time Spent in Transient States
\mathbf{S} = (\mathbf{I} - \mathbf{P}_T)^{-1}
f_{i,j} = \frac{s_{i,j} - \delta_{i,j}}{s_{j,j}}
\delta_{i,j} = \begin{cases} 1, & i = j \\ 0, & \text{otherwise} \end{cases}
• s_{i,j} is the expected time spent in state j, given the chain starts in state i.
• f_{i,j} is the probability of ever transitioning to state j from state i.

Time Reversibility
R_{i,j} = \frac{\pi_j P_{j,i}}{\pi_i}
A Markov chain is time reversible if R_{i,j} = P_{i,j} for every i and j.

Random Walk
All random walk models are transient except for one-dimensional and two-dimensional symmetric random walks.

Gambler's Ruin Problem
The probability of reaching j starting with i is:
P_i = \begin{cases} \dfrac{1 - (q/p)^i}{1 - (q/p)^j}, & p \ne \tfrac{1}{2} \\ \dfrac{i}{j}, & p = \tfrac{1}{2} \end{cases}
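A small numpy sketch, not part of the source, of the transient-state results above: \mathbf{S} = (\mathbf{I} - \mathbf{P}_T)^{-1} and f_{i,j} = (s_{i,j} - \delta_{i,j}) / s_{j,j}. The transition matrix P_T among transient states is made up for the example.

```python
import numpy as np

# Hypothetical transition probabilities among the transient states only (P_T);
# the remaining probability mass in each row leads to recurrent/absorbing states.
P_T = np.array([[0.4, 0.3],
                [0.2, 0.5]])

S = np.linalg.inv(np.eye(2) - P_T)      # S[i, j]: expected time in state j starting from i
F = (S - np.eye(2)) / np.diag(S)        # F[i, j]: probability of ever reaching j from i
print(S)
print(F)
```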
Life Contingencies

Number of Deaths
d_x = l_x - l_{x+1}

Probability of Survival
{}_t p_x = \frac{l_{x+t}}{l_x}

Probability of Death
{}_t q_x = \frac{l_x - l_{x+t}}{l_x}

Curtate Life Expectancy
e_x = \sum_{k=1}^{\infty} {}_k p_x = p_x (1 + e_{x+1})

Complete Expectation of Life
\mathring{e}_x \approx 0.5 + \sum_{k=1}^{\infty} {}_k p_x

Whole Life Insurance
• The APV of whole life insurance is the sum of the APV of term life and deferred whole life.
• The APV of endowment insurance is the sum of the APV of term life and pure endowment.

Whole Life Annuity
\ddot{a}_x = \sum_{k=0}^{\infty} v^k \cdot {}_k p_x = 1 + v p_x \cdot \ddot{a}_{x+1} = \frac{1 - A_x}{d}

Mortality Discount Factor
{}_t E_x = v^t \cdot {}_t p_x

Joint Lives
\ddot{a}_x + \ddot{a}_y = \ddot{a}_{xy} + \ddot{a}_{\overline{xy}}

Equivalence Principle
APV_{\text{Premiums}} = APV_{\text{Benefits}}

Simulation
U ~ Uniform(0, 1)

Uniform Number Generation
X_{n+1} = (a X_n + c) \bmod m, \quad n \ge 0
U = \frac{X_{n+1}}{m}

Inversion Method
X = F_X^{-1}(U)

Acceptance-Rejection Method
1. Find a constant c that satisfies \frac{f(x)}{g(x)} \le c for all x.
2. Simulate U and a random number Y with density function g.
3. Accept the value Y if U \le \frac{f(Y)}{c \, g(Y)}. Otherwise, reject and return to step 2.
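A compact sketch, illustrative and not from the source, of the three simulation tools above: a linear congruential generator, inversion for an Exponential(\theta), and acceptance-rejection; the constants a, c, m and the target density f(x) = 6x(1 - x) are assumptions chosen for the example.

```python
import math, random

def lcg(x0, a=16807, c=0, m=2**31 - 1, n=5):
    """Linear congruential generator: X_{k+1} = (a*X_k + c) mod m, U = X_{k+1}/m."""
    us, x = [], x0
    for _ in range(n):
        x = (a * x + c) % m
        us.append(x / m)
    return us

def exponential_by_inversion(theta, u):
    """Inversion: F(x) = 1 - exp(-x/theta), so F^{-1}(u) = -theta * ln(1 - u)."""
    return -theta * math.log(1.0 - u)

def acceptance_rejection(f, g_sampler, g_pdf, c):
    """Return one draw from density f using candidate density g with bound f/g <= c."""
    while True:
        y, u = g_sampler(), random.random()
        if u <= f(y) / (c * g_pdf(y)):
            return y

# Example: sample from f(x) = 6x(1-x) on (0,1) with a Uniform(0,1) candidate (c = 1.5).
f = lambda x: 6.0 * x * (1.0 - x)
sample = acceptance_rejection(f, random.random, lambda x: 1.0, c=1.5)
print(lcg(12345), exponential_by_inversion(2.0, 0.75), sample)
```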
STATISTICS

Statistics

Parameter and Density Estimation

Method of Moments
To fit an r-parameter distribution, set:
E[X^k] = \frac{\sum_{i=1}^{n} x_i^k}{n}, \quad k = 1, 2, \ldots, r

Percentile Matching
• Estimate parameters by setting the theoretical percentiles equal to the sample percentiles.

Smoothed Empirical Percentile – Unique Values
\hat{\pi}_q = the [q(n+1)]-th smallest observed value
• If q(n + 1) is a non-integer, calculate \hat{\pi}_q by interpolating between the order statistics before and after.

Maximum Likelihood Estimation
L(\theta) = \prod_{i=1}^{n} f(x_i)
• Estimate \theta as the value that maximizes L(\theta) or l(\theta) = \ln L(\theta).
• Invariance property

Case | Likelihood
Right-censored at m | \Pr(X \ge m)
Left-truncated at d | divide by \Pr(X > d)
Grouped data on interval (a, b] | \Pr(a < X \le b)

Special Cases – Complete Data

Distribution | Shortcut
Gamma, fixed \alpha | \hat{\theta} = \bar{x}/\alpha
Normal | \hat{\mu} = \bar{x}, \quad \hat{\sigma}^2 = \frac{\sum_{i=1}^{n} x_i^2}{n} - \hat{\mu}^2
Lognormal | \hat{\mu} = \frac{\sum_{i=1}^{n} \ln x_i}{n}, \quad \hat{\sigma}^2 = \frac{\sum_{i=1}^{n} (\ln x_i)^2}{n} - \hat{\mu}^2
Poisson | \hat{\lambda} = \bar{x}
Binomial, fixed m | \hat{q} = \bar{x}/m
Negative Binomial, fixed r | \hat{\beta} = \bar{x}/r
Uniform [0, \theta] | \hat{\theta} = \max(x_1, \ldots, x_n)

Special Cases – Incomplete Data

Distribution | Shortcut
Pareto, fixed \theta | \hat{\alpha} = \frac{n}{\sum_{i=1}^{n+c} [\ln(x_i + \theta) - \ln(d_i + \theta)]}
S-P Pareto, fixed \theta | \hat{\alpha} = \frac{n}{\sum_{i=1}^{n+c} \{\ln x_i - \ln[\max(\theta, d_i)]\}}
Exponential | \hat{\theta} = \frac{\sum_{i=1}^{n+c} (x_i - d_i)}{n}
Weibull, fixed \tau | \hat{\theta} = \left(\frac{\sum_{i=1}^{n+c} x_i^{\tau} - \sum_{i=1}^{n+c} d_i^{\tau}}{n}\right)^{1/\tau}

where:
• n: # of uncensored data points
• c: # of censored data points
• x_i: i-th observed value, or the censoring point for censored data points
• d_i: truncation point for the i-th observation

Kernel Density Estimation
\hat{f}(x) = \frac{1}{n} \sum_{i=1}^{n} k_i(x)
• b: Bandwidth
• x_i: i-th observed value
• k_i(x): Kernel density function for x_i, evaluated at x
• \hat{f}(x): PDF of the kernel-smoothed distribution

Rectangular Kernels
k_i(x) = \begin{cases} \frac{1}{2b}, & x_i - b \le x \le x_i + b \\ 0, & \text{otherwise} \end{cases}

Triangular Kernels
k_i(x) = \begin{cases} \frac{b - |x - x_i|}{b^2}, & x_i - b \le x \le x_i + b \\ 0, & \text{otherwise} \end{cases}

Gaussian Kernels
k_i(x) = \frac{1}{b\sqrt{2\pi}} \exp\!\left[-\frac{(x - x_i)^2}{2b^2}\right], \quad -\infty < x < \infty
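An illustrative kernel density estimate, not from the source, using the rectangular, triangular, and Gaussian kernels defined above; the sample data and bandwidth are made up.

```python
import math

def kde(x, data, b, kernel="gaussian"):
    """Kernel density estimate f_hat(x) = (1/n) * sum_i k_i(x) with bandwidth b."""
    total = 0.0
    for xi in data:
        if kernel == "rectangular":
            total += 1.0 / (2 * b) if abs(x - xi) <= b else 0.0
        elif kernel == "triangular":
            total += (b - abs(x - xi)) / b**2 if abs(x - xi) <= b else 0.0
        else:  # gaussian
            total += math.exp(-(x - xi)**2 / (2 * b**2)) / (b * math.sqrt(2 * math.pi))
    return total / len(data)

data = [2.0, 3.5, 4.0, 7.5]      # hypothetical observed values
for k in ("rectangular", "triangular", "gaussian"):
    print(k, round(kde(3.0, data, b=1.5, kernel=k), 4))
```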
Estimator Quality

Statistics and Estimators
\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}
S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}
For a random sample:
• E[\bar{X}] = E[X]
• Var[\bar{X}] = \frac{Var[X]}{n}

Bias
Bias[\hat{\theta}] = E[\hat{\theta}] - \theta
• If \lim_{n \to \infty} Bias[\hat{\theta}] = 0, then \hat{\theta} is asymptotically unbiased.

Variance
Var[\hat{\theta}] = E\!\left[\big(\hat{\theta} - E[\hat{\theta}]\big)^2\right]

Mean Squared Error
MSE[\hat{\theta}] = E\!\left[\big(\hat{\theta} - \theta\big)^2\right] = Var[\hat{\theta}] + \big(Bias[\hat{\theta}]\big)^2

Consistency
\lim_{n \to \infty} \Pr\big(|\hat{\theta} - \theta| > \varepsilon\big) = 0 for all \varepsilon > 0
• If \lim_{n \to \infty} Bias[\hat{\theta}] = 0 and \lim_{n \to \infty} Var[\hat{\theta}] = 0, then \hat{\theta} is consistent.

Efficiency
Eff[\hat{\theta}] = \frac{[I(\theta)]^{-1}}{Var[\hat{\theta}]}
• If Eff[\hat{\theta}] = 1, then \hat{\theta} is efficient.

Fisher Information
I(\theta) = -E\!\left[\frac{d^2}{d\theta^2} l(\theta)\right] = -n \cdot E\!\left[\frac{d^2}{d\theta^2} \ln f(X)\right]
• [I(\theta)]^{-1} is the Rao-Cramér lower bound.
• I(\theta) \cdot [g'(\theta)]^{-2} is the Fisher information for g(\theta).

Minimum Variance Unbiased Estimator
• The MVUE is an unbiased estimator with the smallest variance among all unbiased estimators.

Sufficiency
• Y is a sufficient statistic for \theta if and only if f(x_1, \ldots, x_n \mid y) = h(x_1, \ldots, x_n), where h(x_1, \ldots, x_n) does not depend on \theta.
• By the factorization theorem, Y is sufficient if and only if f(x_1, \ldots, x_n) = h_1(y, \theta) \cdot h_2(x_1, \ldots, x_n) for non-negative functions h_1 and h_2, where h_2(x_1, \ldots, x_n) does not depend on \theta.
• g(Y) is a sufficient statistic for \theta if g(\cdot) is a one-to-one function of sufficient Y.
• By the Rao-Blackwell theorem, the variance of the unbiased estimator E_Z[Z \mid Y] is at most the variance of any unbiased estimator Z, for sufficient Y. The MVUE \varphi(Y) is E_Z[Z \mid Y].

Exponential Class of Distributions
f(x) = \exp[a(x) \cdot b(\theta) + c(\theta) + d(x)]
• \sum_{i=1}^{n} a(X_i) is a complete sufficient statistic for \theta.

Maximum Likelihood Estimators
Under specific circumstances, the MLE of \theta is:
• A consistent estimator
• Asymptotically normally distributed with mean \theta and variance [I(\theta)]^{-1}; its exact variance may equal the asymptotic variance
• A function of a sufficient statistic Y
• If Y is a complete sufficient statistic for \theta and \varphi(Y) is an unbiased estimator of \theta, then the MVUE of \theta is \varphi(Y).
Hypothesis Testing

Terminology
• Test statistic: A value calculated from data that assumes H_0 is true
• Critical region: The range of test statistic values where H_0 is rejected
• Critical value: A value that borders the critical region
• Two-tailed test: A test that includes both tails in its critical region
• Right-tailed test: A test that only includes the right tail in its critical region
• Left-tailed test: A test that only includes the left tail in its critical region
• Significance level, \alpha: The probability of rejecting H_0, assuming it is true
• Power: The probability of rejecting H_0, assuming it is false
• p-value: The probability of observing the test statistic or a more extreme value, assuming H_0 is true

Decision | H_0 is true | H_0 is false
Reject H_0 | Type I Error | Correct Decision
Fail to reject H_0 | Correct Decision | Type II Error

• For all hypothesis tests, reject H_0 if p-value \le \alpha.

Tests for Means
• When the variance is known, we apply the Central Limit Theorem.
• When the variance is unknown, the random sample must be drawn from a normal distribution.

Critical Regions – Known Variance
Test Type | Critical Region
Left-tailed | t.s. \le -z_{1-\alpha}
Two-tailed | |t.s.| \ge z_{1-\alpha/2}
Right-tailed | t.s. \ge z_{1-\alpha}

Critical Regions – Unknown Variance
Test Type | Critical Region
Left-tailed | t.s. \le -t_{2\alpha, df}
Two-tailed | |t.s.| \ge t_{\alpha, df}
Right-tailed | t.s. \ge t_{2\alpha, df}

One Sample
• df = n - 1

Two Samples
s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}
• Assumes \sigma_1^2 = \sigma_2^2
• df = n_1 + n_2 - 2

Two Samples – Paired
• Samples are not independent; observations form pairs.
• Identical to one sample of observed differences
• n^* = n_1 = n_2
• df = n^* - 1

Tests for Proportions
\hat{q} = \frac{\text{# of successes from } n \text{ trials}}{n}
• Critical regions are the same as those for testing means with known variance.

Tests for Variances – One Sample
Test Type | Critical Region
Left-tailed | t.s. \le \chi^2_{\alpha, n-1}
Two-tailed | (t.s. \le \chi^2_{\alpha/2, n-1}) \cup (t.s. \ge \chi^2_{1-\alpha/2, n-1})
Right-tailed | t.s. \ge \chi^2_{1-\alpha, n-1}

Tests for Variances – Two Samples
Test Type | Critical Region
Left-tailed | t.s. \le F_{1-\alpha, n_1-1, n_2-1}
Two-tailed | \big(t.s. \le (F_{\alpha/2, n_2-1, n_1-1})^{-1}\big) \cup \big(t.s. \ge F_{\alpha/2, n_1-1, n_2-1}\big)
Right-tailed | t.s. \ge F_{\alpha, n_1-1, n_2-1}

• A left-tailed test can be performed by writing H_0 in terms of \sigma_2^2 / \sigma_1^2 instead and doing a right-tailed test.
• F_{p, d_2, d_1} = (F_{1-p, d_1, d_2})^{-1}
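A small worked example, not from the source, of the pooled two-sample t statistic above; the sample summaries are made up, and the resulting statistic would be compared to the t critical values with n_1 + n_2 - 2 degrees of freedom.

```python
import math

def pooled_two_sample_t(xbar1, s1, n1, xbar2, s2, n2, h=0.0):
    """t.s. = (xbar1 - xbar2 - h) / (s_p * sqrt(1/n1 + 1/n2)), df = n1 + n2 - 2."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)   # pooled variance
    ts = (xbar1 - xbar2 - h) / (math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2))
    return ts, n1 + n2 - 2

# Hypothetical summaries
print(pooled_two_sample_t(xbar1=52.0, s1=8.0, n1=12, xbar2=47.5, s2=7.0, n2=10))
```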
Summary for Hypothesis Testing

Means, one sample, H_0: \mu = h
• Known variance: t.s. = \frac{\bar{x} - h}{\sigma / \sqrt{n}}
• Unknown variance: t.s. = \frac{\bar{x} - h}{s / \sqrt{n}}

Means, two samples, H_0: \mu_1 - \mu_2 = h
• Known variances: t.s. = \frac{\bar{x}_1 - \bar{x}_2 - h}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}
• Unknown variances: t.s. = \frac{\bar{x}_1 - \bar{x}_2 - h}{s_p \sqrt{1/n_1 + 1/n_2}}

Means, two paired samples, H_0: \mu_1 - \mu_2 = h
• Known variance: t.s. = \frac{\bar{d} - h}{\sigma_d / \sqrt{n^*}}
• Unknown variance: t.s. = \frac{\bar{d} - h}{s_d / \sqrt{n^*}}

Proportions, one sample, H_0: q = h
t.s. = \frac{\hat{q} - h}{\sqrt{h(1 - h)/n}}

Proportions, two samples, H_0: q_1 - q_2 = h
t.s. = \frac{\hat{q}_1 - \hat{q}_2 - h}{\sqrt{\hat{q}_1(1 - \hat{q}_1)/n_1 + \hat{q}_2(1 - \hat{q}_2)/n_2}}

Variances, one sample, H_0: \sigma^2 = h
t.s. = \frac{(n - 1)s^2}{h}

Variances, two samples, H_0: \sigma_1^2 / \sigma_2^2 = h
t.s. = \frac{s_1^2}{s_2^2} \cdot \frac{1}{h}
Intervals for Means
In the intervals below, k denotes the confidence level.

\mu, known variance
• Two-sided: \bar{x} \pm z_{(1+k)/2} \cdot \frac{\sigma}{\sqrt{n}}
• Left-sided: \left(-\infty, \ \bar{x} + z_k \cdot \frac{\sigma}{\sqrt{n}}\right)
• Right-sided: \left(\bar{x} - z_k \cdot \frac{\sigma}{\sqrt{n}}, \ \infty\right)

\mu, unknown variance
• Two-sided: \bar{x} \pm t_{1-k, n-1} \cdot \frac{s}{\sqrt{n}}
• Left-sided: \left(-\infty, \ \bar{x} + t_{2(1-k), n-1} \cdot \frac{s}{\sqrt{n}}\right)
• Right-sided: \left(\bar{x} - t_{2(1-k), n-1} \cdot \frac{s}{\sqrt{n}}, \ \infty\right)

\mu_1 - \mu_2, known variances
• Two-sided: \bar{x}_1 - \bar{x}_2 \pm z_{(1+k)/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
• Right-sided: \left(\bar{x}_1 - \bar{x}_2 - z_k \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}, \ \infty\right)

\mu_1 - \mu_2, unknown variances
• Two-sided: \bar{x}_1 - \bar{x}_2 \pm t_{1-k, n_1+n_2-2} \cdot s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
• Left-sided: \left(-\infty, \ \bar{x}_1 - \bar{x}_2 + t_{2(1-k), n_1+n_2-2} \cdot s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right)
• Right-sided: \left(\bar{x}_1 - \bar{x}_2 - t_{2(1-k), n_1+n_2-2} \cdot s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \ \infty\right)
Intervals for Proportions and Variances

q
• Two-sided: \hat{q} \pm z_{(1+k)/2} \sqrt{\frac{\hat{q}(1 - \hat{q})}{n}}
• Left-sided: \left(-\infty, \ \hat{q} + z_k \sqrt{\frac{\hat{q}(1 - \hat{q})}{n}}\right)
• Right-sided: \left(\hat{q} - z_k \sqrt{\frac{\hat{q}(1 - \hat{q})}{n}}, \ \infty\right)

\sigma^2
• Left-sided: \left(0, \ \frac{(n - 1)s^2}{\chi^2_{1-k, n-1}}\right)
• Right-sided: \left(\frac{(n - 1)s^2}{\chi^2_{k, n-1}}, \ \infty\right)

\sigma_1^2 / \sigma_2^2
• Two-sided: \left(\frac{s_1^2}{s_2^2} \cdot \big(F_{(1-k)/2, \, n_1-1, \, n_2-1}\big)^{-1}, \ \frac{s_1^2}{s_2^2} \cdot F_{(1-k)/2, \, n_2-1, \, n_1-1}\right)
• Left-sided: \left(0, \ \frac{s_1^2}{s_2^2} \cdot F_{1-k, \, n_2-1, \, n_1-1}\right)
• Right-sided: \left(\frac{s_1^2}{s_2^2} \cdot \big(F_{1-k, \, n_1-1, \, n_2-1}\big)^{-1}, \ \infty\right)
Most Powerful Tests

Terminology
• Simple: Fully specifies the distribution(s)
• Composite: Does not fully specify the distribution(s)

Most Powerful Test
When H_0 and H_1 are both simple, the most powerful test of size \alpha has the largest power among all tests with the same \alpha.

Neyman-Pearson Theorem
The best critical region is embedded in
\frac{L(h_0)}{L(h_1)} \le k
where H_0 and H_1 are both simple.

Uniformly Most Powerful (UMP) Tests
• For a simple H_0 and composite H_1, a test is UMP when the best critical region is the same for testing H_0 against each simple hypothesis in H_1.
• For composite hypotheses H_0: \theta \le h and H_1: \theta > h, a test is UMP if there is a monotone likelihood ratio in a statistic y.

Goodness of Fit Tests

Kolmogorov-Smirnov Test
t.s. = D = the maximum absolute difference between F^*(x) and \hat{F}(x)
• Reject H_0 if t.s. \ge the critical value.
• F^*(x): CDF of the proposed distribution
• \hat{F}(x): Empirical distribution function
\hat{F}(x) = \frac{\text{# of observations} \le x}{n}
Left-truncated at d:
F^*(x) = \frac{F(x) - F(d)}{1 - F(d)}
Right-censored at m:
\hat{F}(x) is undefined beyond m.

Chi-Square Goodness-of-Fit Test
t.s. = \sum_{j=1}^{k} \frac{(n_j - n q_j)^2}{n q_j}
• Reject H_0 if t.s. \ge \chi^2_{1-\alpha, \, k-1-r}
• k: # of mutually exclusive intervals
• q_j: probability of being in interval j
• n_j: # of observed values in interval j
• r: # of free parameters

Chi-Square Test of Independence
t.s. = \frac{1}{n} \sum_{i=1}^{a} \sum_{j=1}^{b} \frac{(n_{ij} \, n - n_{i\cdot} \, n_{\cdot j})^2}{n_{i\cdot} \, n_{\cdot j}}
• Reject H_0 if t.s. \ge \chi^2_{1-\alpha, \, (a-1)(b-1)}
• a: # of categories for the first variable
• b: # of categories for the second variable
• n_{ij}: # of observations in the first variable's category i and the second variable's category j
• n_{i\cdot}: subtotal # of observations in category i, across all categories of the second variable
• n_{\cdot j}: subtotal # of observations in category j, across all categories of the first variable

Likelihood Ratio Test
t.s. = -2 \ln\!\left(\frac{L_0}{L_1}\right) = 2(l_1 - l_0)
• Reject H_0 if t.s. \ge \chi^2_{1-\alpha, \, r_1 - r_0}
• r_0: # of free parameters in the distribution under H_0
• r_1: # of free parameters in the distribution under H_1
• L_0: Maximized likelihood under H_0
• L_1: Maximized likelihood under H_1
• l_0 = \ln L_0
• l_1 = \ln L_1

Confidence Intervals
• For means and proportions, the two-sided general form is estimate \pm (percentile)(standard error).
• H_0 will fail to be rejected at \alpha if h is within the 100(1 - \alpha)\% confidence interval.

Order Statistics
X_{(k)} = k-th order statistic
X_{(1)} = \min(X_1, \ldots, X_n)
X_{(n)} = \max(X_1, \ldots, X_n)

First Principles
f_{X_{(k)}}(x) = \frac{n!}{(k-1)!\,(n-k)!} \cdot [F_X(x)]^{k-1} \cdot f_X(x) \cdot [S_X(x)]^{n-k}

Special Cases
Uniform(a, b): E[X_{(k)}] = a + \frac{k(b - a)}{n + 1}
Uniform(0, \theta): X_{(k)} ~ Beta(k, n - k + 1, \theta)
Exponential(\theta): E[X_{(k)}] = \theta \sum_{i=n-k+1}^{n} \frac{1}{i}
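A worked sketch, not from the source, of the chi-square goodness-of-fit statistic above; the grouped counts, interval probabilities, and number of estimated parameters are made up for the example.

```python
# Hypothetical grouped data: observed counts per interval and the fitted model's
# interval probabilities q_j (r parameters were estimated to obtain the q_j's).
observed = [22, 31, 28, 19]            # n_j
q = [0.20, 0.30, 0.30, 0.20]           # q_j, must sum to 1
n = sum(observed)
r = 1                                   # # of free parameters estimated

ts = sum((nj - n * qj) ** 2 / (n * qj) for nj, qj in zip(observed, q))
df = len(observed) - 1 - r
print(ts, df)   # compare ts to the chi-square percentile with k - 1 - r degrees of freedom
```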
EXTENDED LINEAR MODELS

Extended Linear Models

Introduction to Statistical Learning
Y = f(x_1, \ldots, x_p) + \varepsilon, \quad E[\varepsilon] = 0

Types of Variables
• Response: A variable of primary interest
• Explanatory: A variable used to study the response variable
• Count: A quantitative variable valid on non-negative integers
• Continuous: A quantitative variable valid on real numbers
• Nominal: A qualitative variable having categories without a meaningful or logical order
• Ordinal: A qualitative variable having categories with a meaningful or logical order

Graphical Summaries
• A scatterplot plots values of two variables to investigate their relationship.
• A box plot captures a variable's distribution using its median, 1st and 3rd quartiles, and distribution tails.
• A QQ plot plots sample percentiles against theoretical percentiles to determine whether the sample and theoretical distributions have similar shapes.

Model Accuracy
Test MSE = E\!\left[(Y - \hat{Y})^2\right], which can be estimated using \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}
For fixed inputs x_1, \ldots, x_p, the test MSE is
\underbrace{Var\!\left[\hat{f}(x_1, \ldots, x_p)\right] + \big(Bias\!\left[\hat{f}(x_1, \ldots, x_p)\right]\big)^2}_{\text{reducible error}} + \underbrace{Var[\varepsilon]}_{\text{irreducible error}}
• If training data y_i's are used, the training MSE is computed instead.
• As flexibility increases, the training MSE decreases, but the test MSE follows a U-shaped pattern.
• Low flexibility leads to a method with low variance and high bias; high flexibility leads to a method with high variance and low bias.
Simple Linear Regression (SLR)
Special case of MLR where p = 1

Estimation
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}

Other Numerical Results
e_i = y_i - \hat{y}_i
SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2
SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
SST = \sum_{i=1}^{n} (y_i - \bar{y})^2 = SSR + SSE

t Tests
t.s. = \frac{\text{estimate} - \text{hypothesized value}}{\text{standard error}}

Test Type | Critical Region
Left-tailed | t.s. \le -t_{2\alpha, \, n-p-1}
Two-tailed | |t.s.| \ge t_{\alpha, \, n-p-1}
Right-tailed | t.s. \ge t_{2\alpha, \, n-p-1}
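A bare-bones OLS fit, illustrative and not from the source, implementing the \hat{\beta}_1 and \hat{\beta}_0 formulas above on made-up data.

```python
def fit_slr(x, y):
    """Return (b0, b1) minimizing SSE for y = b0 + b1*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    b0 = ybar - b1 * xbar
    return b0, b1

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
b0, b1 = fit_slr(x, y)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
print(b0, b1, sse)
```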
Analysis of Variance (ANOVA)

One-Way ANOVA
Y_{i,j} = \mu + \alpha_j + \varepsilon_{i,j}
• i = 1, \ldots, n_j
• The factor has w levels, j = 1, \ldots, w

\bar{y}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} y_{i,j}
SSR = \sum_{j=1}^{w} \sum_{i=1}^{n_j} (\bar{y}_j - \bar{y})^2 = \sum_{j=1}^{w} n_j (\bar{y}_j - \bar{y})^2
SSE = \sum_{j=1}^{w} \sum_{i=1}^{n_j} (y_{i,j} - \bar{y}_j)^2
SST = \sum_{j=1}^{w} \sum_{i=1}^{n_j} (y_{i,j} - \bar{y})^2

Source | SS | df
Factor | SSR | w - 1
Error | SSE | n - w
Total | SST | n - 1

Testing the Significance of the Factor
t.s. = \frac{SSR / (w - 1)}{SSE / (n - w)}
• Reject H_0 if t.s. \ge F_{\alpha, \, ndf, \, ddf}
• ndf = w - 1
• ddf = n - w

Two-Way ANOVA – Additive Model
Y_{i,j,k} = \mu + \alpha_j + \beta_k + \varepsilon_{i,j,k}
• Factor A has w levels, j = 1, \ldots, w
• Factor B has v levels, k = 1, \ldots, v
• i = 1, \ldots, n^*

SSR_B = SSE_A - SSE_{add} = SSR_{add} - SSR_A

Source | SS | df
Factor A | SSR_A | w - 1
Factor B | SSR_B | v - 1
Error | SSE_{add} | n - w - v + 1
Total | SST | n - 1

Testing the Significance of Factor A
t.s. = \frac{SSR_A / (w - 1)}{SSE_{add} / (n - w - v + 1)}
• Reject H_0 if t.s. \ge F_{\alpha, \, ndf, \, ddf}
• ndf = w - 1
• ddf = n - w - v + 1

Testing the Significance of Factor B
t.s. = \frac{SSR_B / (v - 1)}{SSE_{add} / (n - w - v + 1)}
• Reject H_0 if t.s. \ge F_{\alpha, \, ndf, \, ddf}
• ndf = v - 1
• ddf = n - w - v + 1

Two-Way ANOVA – Additive Model without Replication
Y_{j,k} = \mu + \alpha_j + \beta_k + \varepsilon_{j,k}
• n^* = 1
• j = 1, \ldots, w
• k = 1, \ldots, v

\bar{y}_{j\cdot} = \frac{1}{v} \sum_{k=1}^{v} y_{j,k}, \quad \bar{y}_{\cdot k} = \frac{1}{w} \sum_{j=1}^{w} y_{j,k}
SSR_A = \sum_{k=1}^{v} \sum_{j=1}^{w} (\bar{y}_{j\cdot} - \bar{y})^2 = \sum_{j=1}^{w} v (\bar{y}_{j\cdot} - \bar{y})^2
SSR_B = \sum_{k=1}^{v} \sum_{j=1}^{w} (\bar{y}_{\cdot k} - \bar{y})^2 = \sum_{k=1}^{v} w (\bar{y}_{\cdot k} - \bar{y})^2
SSE_{add} = \sum_{k=1}^{v} \sum_{j=1}^{w} (y_{j,k} - \bar{y}_{j\cdot} - \bar{y}_{\cdot k} + \bar{y})^2
SST = \sum_{k=1}^{v} \sum_{j=1}^{w} (y_{j,k} - \bar{y})^2

Two-Way ANOVA – Model with Interactions
Y_{i,j,k} = \mu + \alpha_j + \beta_k + \gamma_{j,k} + \varepsilon_{i,j,k}
• i = 1, \ldots, n^*
• j = 1, \ldots, w
• k = 1, \ldots, v

SS_{diff} = SSE_{add} - SSE_{int} = SSR_{int} - SSR_{add}

Source | SS | df
Factor A | SSR_A | w - 1
Factor B | SSR_B | v - 1
Interaction | SS_{diff} | (w - 1)(v - 1)
Error | SSE_{int} | n - wv
Total | SST | n - 1

Testing the Significance of Interactions
t.s. = \frac{SS_{diff} / [(w - 1)(v - 1)]}{SSE_{int} / (n - wv)}
• Reject H_0 if t.s. \ge F_{\alpha, \, ndf, \, ddf}
• ndf = (w - 1)(v - 1)
• ddf = n - wv

Testing the Significance of Factor A
t.s. = \frac{SSR_A / (w - 1)}{SSE_{int} / (n - wv)}
• Reject H_0 if t.s. \ge F_{\alpha, \, ndf, \, ddf}
• ndf = w - 1
• ddf = n - wv

Testing the Significance of Factor B
t.s. = \frac{SSR_B / (v - 1)}{SSE_{int} / (n - wv)}
• Reject H_0 if t.s. \ge F_{\alpha, \, ndf, \, ddf}
• ndf = v - 1
• ddf = n - wv

Other Key Ideas
• In testing whether a source is significant, the test statistic is the mean square of that source divided by the MSE of the model that has the most predictors.
• ANCOVA models have both quantitative and qualitative predictors.
• The uncorrected total sum of squares is \sum_{i=1}^{n} y_i^2. The sources of an ANOVA/ANCOVA table may sum to the uncorrected total rather than the corrected total.
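A small one-way ANOVA computation, illustrative and not from the source, of SSR, SSE, and the F statistic for made-up group data.

```python
def one_way_anova(groups):
    """groups: list of lists of observations, one list per factor level."""
    n = sum(len(g) for g in groups)
    w = len(groups)
    grand = sum(sum(g) for g in groups) / n
    ssr = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    sse = sum(sum((y - sum(g) / len(g)) ** 2 for y in g) for g in groups)
    f_stat = (ssr / (w - 1)) / (sse / (n - w))
    return ssr, sse, f_stat, (w - 1, n - w)

groups = [[4.1, 5.0, 5.8], [6.3, 7.1, 6.8, 7.4], [5.2, 4.8, 5.5]]
print(one_way_anova(groups))   # compare the F statistic to F_{alpha, w-1, n-w}
```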
Linear Model Assumptions

Leverage
h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{u=1}^{n} (x_u - \bar{x})^2} for SLR
• h_i is the i-th diagonal entry of \mathbf{H}.
• \sum_{i=1}^{n} h_i = p + 1

Standardized Residuals
e_{std,i} = \frac{e_i}{\sqrt{MSE(1 - h_i)}}

DFITS
DFITS_i = e_{std,i} \sqrt{\frac{h_i}{1 - h_i}}

Cook's Distance
d_i = \frac{DFITS_i^2}{p + 1} = \frac{e_{std,i}^2 \, h_i}{(p + 1)(1 - h_i)} = \frac{e_i^2 \, h_i}{MSE(p + 1)(1 - h_i)^2}

Plots of Residuals
• e versus \hat{y}: Residuals are well-behaved if
  o Points appear to be randomly scattered
  o Residuals seem to average to 0
  o Spread of residuals does not change
• e versus i: Detects dependence of error terms
• QQ plot of e

Variance Inflation Factor
VIF_j = \frac{1}{1 - R_j^2}
VIF_j > 5 indicates multicollinearity.

Curse of Dimensionality
Having many predictors in a model increases the risk of including noise predictors that are not associated with the response.

Model Selection
• g: Total # of predictors in consideration
• p: # of predictors for a specific model
• MSE_g: MSE of the model that uses all g predictors
• M_p: The "best" model with p predictors

Best Subset Selection
1. For p = 0, 1, \ldots, g, fit all \binom{g}{p} models with p predictors. The model with the largest R^2 is M_p.
2. Choose the best model among M_0, \ldots, M_g using a selection criterion of choice.

Forward Stepwise Selection
1. Fit all g simple linear regression models. The model with the largest R^2 is M_1.
2. For p = 2, \ldots, g, fit the models that add one of the remaining predictors to M_{p-1}. The model with the largest R^2 is M_p.
3. Choose the best model among M_0, \ldots, M_g using a selection criterion of choice.

Backward Stepwise Selection
1. Fit the model with all g predictors, M_g.
2. For p = g - 1, \ldots, 1, fit the models that drop one of the predictors from M_{p+1}. The model with the largest R^2 is M_p.
3. Choose the best model among M_0, \ldots, M_g using a selection criterion of choice.

Selection Criteria
• Adjusted R^2
• Mallows' C_p
C_p = \frac{1}{n}\big(SSE + 2p \cdot MSE_g\big)
• Akaike information criterion
AIC = \frac{1}{n}\big(SSE + 2p \cdot MSE_g\big)
• Bayesian information criterion
BIC = \frac{1}{n}\big(SSE + \ln n \cdot p \cdot MSE_g\big)
• Cross-validation error

Validation Set
• Randomly splits all available observations into two groups: the training set and the validation set.
• Only the observations in the training set are used to attain the fitted model, and those in the validation set are used to estimate the test MSE.

k-fold Cross-Validation
1. Randomly divide all available observations into k folds.
2. For v = 1, \ldots, k, obtain the v-th fit by training with all observations except those in the v-th fold.
3. For v = 1, \ldots, k, use \hat{y} from the v-th fit to calculate a test MSE estimate with the observations in the v-th fold.
4. To calculate the CV error, average the k test MSE estimates from the previous step.

Leave-One-Out Cross-Validation (LOOCV)
• Calculate the LOOCV error as a special case of k-fold cross-validation where k = n.
LOOCV Error = \frac{1}{n} \sum_{i=1}^{n} \left(\frac{y_i - \hat{y}_i}{1 - h_i}\right)^2 for MLR

Key Ideas on Cross-Validation
• The validation set approach has unstable results and will tend to overestimate the test MSE. The two other approaches mitigate these issues.
• With respect to bias, LOOCV < k-fold CV < Validation Set.
• With respect to variance, LOOCV > k-fold CV > Validation Set.
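A minimal k-fold cross-validation sketch, not from the source, wrapped around a simple linear regression fit; the data, seed, and fold assignment are made up, and the held-out MSEs are averaged into the CV error as in the steps above.

```python
import random

def fit_slr(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
          / sum((a - xbar) ** 2 for a in x))
    return ybar - b1 * xbar, b1

def k_fold_cv(x, y, k=5, seed=0):
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[v::k] for v in range(k)]          # k roughly equal folds
    mses = []
    for fold in folds:
        train = [i for i in idx if i not in fold]
        b0, b1 = fit_slr([x[i] for i in train], [y[i] for i in train])
        mses.append(sum((y[i] - (b0 + b1 * x[i])) ** 2 for i in fold) / len(fold))
    return sum(mses) / k                           # CV error

x = [float(i) for i in range(20)]
y = [2.0 + 0.5 * xi + random.Random(i).gauss(0, 1) for i, xi in enumerate(x)]
print(k_fold_cv(x, y, k=5))
```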
Other Linear Regression Approaches

Standardizing Variables
• A centered variable is the result of subtracting the sample mean from a variable.
• A scaled variable is the result of dividing a variable by its standard deviation.
• A standardized variable is the result of first centering a variable, then scaling it.

Shrinkage Methods
• Ridge: minimize SSE + \lambda \sum_{j=1}^{p} \hat{\beta}_j^2, or equivalently minimize SSE subject to a bound on \sum_{j=1}^{p} \hat{\beta}_j^2.
• Lasso: minimize SSE + \lambda \sum_{j=1}^{p} |\hat{\beta}_j|, or equivalently minimize SSE subject to a bound on \sum_{j=1}^{p} |\hat{\beta}_j|.

Principal Components
z_m = \sum_{j=1}^{p} \phi_{j,m} x_j
\sum_{j=1}^{p} \phi_{j,m}^2 = 1
\sum_{j=1}^{p} \phi_{j,m} \cdot \phi_{j,u} = 0, \quad m \ne u
• Unsupervised technique that performs dimension reduction on p variables
• The variability explained by each subsequent principal component is always less than the variability explained by its previous principal component.
• Principal components form the lower-dimension surface that is closest to the observations in p-dimensional space.
• Standardizing the variables affects the loadings, making them resistant to varying scales among the original variables.

Partial Least Squares
• Supervised technique that performs dimension reduction on p variables
• Uses the first k PLS directions, which are orthogonal, as predictors in an MLR
• k is a measure of flexibility.
• When k = p, PLS is equivalent to performing MLR with the p original variables as predictors.
• The first PLS direction is a linear combination of the p standardized predictors, with coefficients that are based on the response y.
• Every subsequent PLS direction is calculated iteratively as a linear combination of "updated predictors", which are the residuals of fits with the "previous predictors" explained by the previous direction.
Generalized Linear Models

Exponential Family
f(y) = \exp[a(y) \cdot b(\theta) + c(\theta) + d(y)]
E[a(Y)] = -\frac{c'(\theta)}{b'(\theta)}
Var[a(Y)] = \frac{b''(\theta) c'(\theta) - c''(\theta) b'(\theta)}{[b'(\theta)]^3}
Key results for distributions in the exponential family are tabulated near the end of this sheet.

Canonical Form
• a(y) = y
• b(\theta) is the natural parameter.
• \mu = E[Y] is a function of \theta.
• Var[Y] is a function of \mu.

Model Framework
g(\mu) = \mathbf{x}^T \boldsymbol{\beta} = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p

Function Name | g(\mu)
Identity | \mu
Logit | \ln\!\left(\frac{\mu}{1 - \mu}\right)
Logarithmic | \ln \mu
Inverse | \frac{1}{\mu}
Power | \mu^d

Distribution | Canonical Link
Normal | Identity
Binomial | Logit

Parameter Estimation – Method of Scoring
\hat{\boldsymbol{\beta}}^{(m)} = \hat{\boldsymbol{\beta}}^{(m-1)} + \big[\mathbf{I}^{(m-1)}\big]^{-1} \mathbf{u}^{(m-1)} = \big(\mathbf{X}^T \mathbf{W}^{(m-1)} \mathbf{X}\big)^{-1} \mathbf{X}^T \mathbf{W}^{(m-1)} \mathbf{z}^{(m-1)}
w_i = \frac{1}{Var[Y_i] \cdot g'(\mu_i)^2}
z_i = g(\mu_i) + (y_i - \mu_i) g'(\mu_i)

Numerical Results
D = 2\big[l_{sat} - l(\hat{\boldsymbol{\beta}})\big]
R^2_{pseudo} = 1 - \frac{l(\hat{\boldsymbol{\beta}})}{l_{null}}
AIC = -2 \cdot l(\hat{\boldsymbol{\beta}}) + 2k
BIC = -2 \cdot l(\hat{\boldsymbol{\beta}}) + k \ln n
where k is the # of estimated parameters.

Residuals
Raw residual: e_i = y_i - \hat{\mu}_i
Pearson residual: e_i^P = \frac{e_i}{\sqrt{\widehat{Var}[Y_i]}}, \quad e_{std,i}^P = \frac{e_i^P}{\sqrt{1 - h_i}}
• The Pearson chi-square statistic is \sum_{i=1}^{n} (e_i^P)^2.
Deviance residual: e_i^D = \pm\sqrt{D_i}, whose sign follows the i-th raw residual; e_{std,i}^D = \frac{e_i^D}{\sqrt{1 - h_i}}
• The deviance is \sum_{i=1}^{n} (e_i^D)^2.

Likelihood Ratio Test
t.s. = 2\big[l(\hat{\boldsymbol{\beta}}_F) - l(\hat{\boldsymbol{\beta}}_R)\big] = D_R - D_F
• Reject H_0 if t.s. \ge \chi^2_{1-\alpha, \, p_F - p_R}
where F denotes the full (larger) model and R the reduced (smaller) model.

Wald Test
t.s. = \left(\frac{\hat{\beta}_j - h}{se(\hat{\beta}_j)}\right)^2
• Reject H_0 if t.s. \ge \chi^2_{1-\alpha, \, 1}
• (\hat{\boldsymbol{\beta}} - \boldsymbol{\beta})^T \mathbf{I}(\hat{\boldsymbol{\beta}})(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta}) follows an approximate chi-square distribution with p + 1 degrees of freedom.

Tweedie Distributions
Var[Y] = a \cdot E[Y]^d

Distribution | d
Normal | 0
Poisson | 1
Compound Poisson-Gamma | (1, 2)
Gamma | 2
Inverse Gaussian | 3

Connection with MLR
• A GLM with a normally distributed response, identity link, and homoscedasticity is the same as MLR.
• MLE estimates = OLS estimates
• \sigma^2 D = SSE
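A method-of-scoring (IRLS) sketch, not from the source, for a Poisson GLM with log link; for that model the weights and working response above simplify to w_i = \mu_i and z_i = \ln \mu_i + (y_i - \mu_i)/\mu_i. The simulated data and true coefficients are assumptions.

```python
import numpy as np

def poisson_irls(X, y, iters=25):
    """Method of scoring / IRLS for a Poisson GLM with log link:
    beta <- (X^T W X)^{-1} X^T W z, with w_i = mu_i and z_i = ln(mu_i) + (y_i - mu_i)/mu_i."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = np.exp(X @ beta)
        W = np.diag(mu)                          # w_i = 1 / (Var[Y_i] * g'(mu_i)^2) = mu_i
        z = X @ beta + (y - mu) / mu             # working response
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ z)
    return beta

# Made-up data: one predictor plus an intercept column.
rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=200)
y = rng.poisson(np.exp(0.5 + 0.8 * x))
X = np.column_stack([np.ones_like(x), x])
print(poisson_irls(X, y))                        # should land near (0.5, 0.8)
```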
Binomial and Categorical Response Regression

Logistic Regression
q_i = \frac{\exp(\mathbf{x}_i^T \boldsymbol{\beta})}{1 + \exp(\mathbf{x}_i^T \boldsymbol{\beta})}
u_j = \sum_{i=1}^{n} (y_i - \mu_i) x_{i,j}
\mathbf{I} = \sum_{i=1}^{n} m_i q_i (1 - q_i) \mathbf{x}_i \mathbf{x}_i^T

Binomial Response Variable
• The odds of an event are the ratio of the probability that the event will occur to the probability that the event will not occur, i.e., \frac{q}{1 - q}.

Poisson Response Regression
\mu_i = a_i \cdot \exp(\mathbf{x}_i^T \boldsymbol{\beta}), where a_i is the exposure amount
l(\boldsymbol{\beta}) = \sum_{i=1}^{n} [y_i \ln \mu_i - \mu_i - \ln(y_i!)]
Generalized Additive Models

The # of degrees of freedom used is the # of regression coefficients, i.e., p + 1.

Basis Functions
Y = \beta_0 + \beta_1 b_1(x) + \cdots + \beta_p b_p(x) + \varepsilon

Step Functions
b_j(x) = \begin{cases} I(\xi_j \le x < \xi_{j+1}), & j = 1, \ldots, k - 1 \\ I(x \ge \xi_k), & j = k \end{cases}

Piecewise Polynomial Regression
The basis functions are:
• x, x^2, \ldots, x^d
• k step functions
• dk interaction terms

Regression Splines
• A degree-d spline is a continuous piecewise degree-d polynomial with continuity in derivatives up to degree d - 1 at each knot.
• The basis functions of a cubic spline can be x, x^2, x^3, (x - \xi_1)_+^3, \ldots, (x - \xi_k)_+^3.
• A natural spline is a regression spline that is linear instead of a polynomial in the boundary regions.

Smoothing Splines
Minimize \sum_{i=1}^{n} [y_i - g(x_i)]^2 + \lambda \int_{-\infty}^{\infty} g''(t)^2 \, dt
• The smoothing parameter \lambda is inversely related to flexibility.
• g(x) has the same form as the fitted natural cubic spline with knots at the n values of x.

Local Regression
• Calculates the fitted value for a specific input by mimicking weighted least squares, i.e., minimize \sum_{i=1}^{n} w_i (y_i - \hat{y}_i)^2.
• Weights are determined by the span and the weighting function, such that observations nearer to the input are given larger weights.
• The span is inversely related to flexibility.
• Does not perform well in high dimension.

Generalized Additive Models
• Each explanatory variable contributes to the mean response independently of the other explanatory variables; no interactions are considered.
• The effect of each explanatory variable on the response can be investigated individually, assuming the other variables are held constant.
• Backfitting can be used for fitting if ordinary least squares cannot.
• Effective degrees of freedom measures flexibility as the sum of the diagonal entries of \mathbf{S}_\lambda, where \hat{\mathbf{y}}_\lambda = \mathbf{S}_\lambda \mathbf{y}.
Key Results for Distributions in the Exponential Family

Distribution | Parameter | b(\theta) | c(\theta)
Binomial, fixed m | q | \ln\!\left(\frac{q}{1 - q}\right) | m \ln(1 - q)
Normal, fixed \sigma^2 | \mu | \frac{\mu}{\sigma^2} | -\frac{\mu^2}{2\sigma^2}
Poisson | \lambda | \ln \lambda | -\lambda
Gamma, fixed \alpha | \theta | -\frac{1}{\theta} | -\alpha \ln \theta
Inverse Gaussian, fixed \theta | \mu | -\frac{\theta}{2\mu^2} | \frac{\theta}{\mu}
Negative Binomial, fixed r | \beta | \ln\!\left(\frac{\beta}{1 + \beta}\right) | -r \ln(1 + \beta)

Number of Predictors for GAMs with a d-th degree polynomial and k knots

Model | # of Predictors, p
Polynomial | d
Piecewise constant | k
Cubic spline | 3 + k
NOTATION

Notation
X ~ Name(parameters) represents X follows a "Name" distribution with "parameters" following the parametrization on the exam table.

Probability Models
Symbol | Description
\mathbf{A}^T | Transpose of matrix \mathbf{A}
\mathbf{A}^{-1} | Inverse of matrix \mathbf{A}

Extended Linear Models
Symbol | Description
n | # of observations
p | # of predictors
SST | Total sum of squares
SSR | Regression sum of squares
SSE/RSS | Error sum of squares
SS | Sum of squares
© 2024 Coaching Actuaries. All Rights Reserved. www.coachingactuaries.com