This document introduces Program MARK, software for the analysis of mark-recapture data. It covers the maximum likelihood theory underlying MARK's approach, formatting data for input, running initial analyses in MARK, building and comparing models through parameter indexing and model-selection criteria such as AIC, and goodness-of-fit testing. The table of contents below shows how these topics are organized into chapters.
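The maximum-likelihood material previewed here (Chapter 1 of the book) starts from the binomial: given n marked individuals of which y are encountered again, the likelihood of encounter probability p is maximized at p̂ = y/n. The following is a minimal illustrative sketch only — the data values and function names are invented, not taken from the book:

```python
import math

# Invented example data (not from the book): 100 marked animals released,
# 45 encountered alive on the next occasion.
MARKED, ENCOUNTERED = 100, 45

def log_likelihood(p, y=ENCOUNTERED, n=MARKED):
    """Binomial log-likelihood, dropping the constant binomial coefficient."""
    return y * math.log(p) + (n - y) * math.log(1.0 - p)

def mle_grid(y, n, steps=10_000):
    """Locate the maximum of the log-likelihood by a simple grid search.
    The analytic answer for the binomial is y/n, so the grid should agree."""
    return max((i / steps for i in range(1, steps)),
               key=lambda p: log_likelihood(p, y, n))

p_hat = mle_grid(ENCOUNTERED, MARKED)
print(f"MLE of encounter probability: {p_hat:.2f}")  # close to 45/100 = 0.45
```

MARK itself maximizes far richer multinomial likelihoods numerically, but the logic is the same: pick the parameter values that make the observed encounter histories most probable.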


Program MARK

A Gentle Introduction

evan g. cooch & gary c. white (eds.)

19th edition
Table of contents

1 First steps...
1.1 Return ‘rates’ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
1.2 A more robust approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
1.3 Maximum likelihood theory – the basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
1.3.1 Why maximum likelihood? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
1.3.2 Simple estimation example – the binomial coefficient . . . . . . . . . . . . . . . . . . . . . 1-6
1.3.3 Multinomials: a simple extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 - 13
1.4 Application to mark-recapture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 - 15
1.5 Variance estimation for > 1 parameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 - 19
1.6 More than ‘estimation’ – ML and statistical testing . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 - 20
1.7 Technical aside: a bit more on variances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 - 22
1.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 - 23
1.9 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 - 23

2 Data formatting: the input file...


2.1 Encounter histories formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
2.1.1 Groups within groups... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4
2.2 Removing individuals from the sample . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-4
2.3 Missing sampling occasions + uneven time-intervals . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
2.4 Different encounter history formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7
2.5 Some more examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
2.5.1 Dead recoveries only . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
2.5.2 Individual covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-9
Addendum: generating .inp files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 - 11

3 First steps in using Program MARK...


3.1 Starting MARK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
3.2 Starting a new project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
3.3 Running the analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
3.4 Examining the results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9
3.4.1 MARK, PIMs, and parameter indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9
3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 - 15

4 Building & comparing models


4.1 Building models – parameter indexing & model structures . . . . . . . . . . . . . . . . . . . . . . . 4-2

© Cooch & White (2019) 08.30.2019



4.2 A quicker way to build models – the PIM chart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 12
4.2.1 PIM charts and single groups – Dipper re-visited . . . . . . . . . . . . . . . . . . . . . . . 4 - 12
4.2.2 PIM charts and multiple groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 19
4.3 Model selection – the basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 34
4.3.1 The AIC, in brief... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 35
4.3.2 Some important refinements to the AIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 39
4.3.3 BIC – an alternative to the AIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 41
4.4 Using the AIC for model selection – simple mechanics... . . . . . . . . . . . . . . . . . . . . . . . . 4 - 42
4.5 Model uncertainty: an introduction to model averaging . . . . . . . . . . . . . . . . . . . . . . . . 4 - 47
4.5.1 Model averaging: deriving SE and CI values . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 53
4.6 Significance? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 59
4.6.1 Classical significance testing in MARK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 60
4.6.2 Some problems with the classical approach... . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 64
4.6.3 ‘Significance’ of a factor using AIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 66
4.7 LRT or AIC, or something else? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 69
4.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 69
4.9 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 70
Addendum: counting parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 - 71

5 Goodness of fit testing...


5.1 Conceptual motivation – ‘c-hat’ (ĉ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
5.2 The practical problem – estimating ĉ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
5.3 Program RELEASE – details, details. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7
5.4 Program RELEASE – TEST 2 & TEST 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7
5.4.1 Running RELEASE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
5.4.2 Running RELEASE as a standalone application . . . . . . . . . . . . . . . . . . . . . . . . 5 - 12
5.5 Enhancements to RELEASE – program U-CARE . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 16
5.5.1 RELEASE & U-CARE – estimating ĉ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 20
5.6 MARK and bootstrapped GOF testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 22
5.6.1 RELEASE versus the bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 27
5.7 ‘median ĉ’ – a way forward? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 27
5.8 The Fletcher ĉ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 31
5.9 What to do when the general model ‘doesn’t fit’? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 32
5.9.1 Inappropriate model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 32
5.9.2 Extra-binomial variation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 35
5.10 How big a ĉ is ‘too big’? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 37
5.10.1 General recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 38
5.11 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 40
5.12 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 - 40

6 Adding constraints: MARK and linear models


6.1 A (brief) review of linear models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
6.2 Linear models and the ‘design matrix’: the basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-7
6.3 The European Dipper – the effects of flooding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 12
6.3.1 Design matrix options: full, reduced, and identity . . . . . . . . . . . . . . . . . . . . . . . 6 - 21
6.4 Running the model: details of the output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 21
6.5 Reconstituting parameter values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 25
6.5.1 Subset models and the design matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 32


6.6 Some additional design matrix tricks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 38
6.7 Design matrix...or PIMs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 39
6.8 Constraining with ‘real’ covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 41
6.8.1 Reconstituting estimates using real covariates . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 44
6.8.2 Plotting the functional form – real covariates . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 45
6.9 A special case of ‘real covariates’ – linear trend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 50
6.10 More than 2 levels of a group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 58
6.11 > 1 classification variables: n-way ANOVA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 60
6.12 Time + Group – building additive models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 64
6.13 Linear models and ‘effect size’: a test of your understanding. . . . . . . . . . . . . . . . . . . . . . . 6 - 66
6.13.1 Linear models: β estimates and odds ratios . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 74
6.13.2 ĉ and effect size: a cautionary note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 79
6.14 Pulling all the steps together: a sequential approach . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 79
6.14.1 Application – alternative design matrices for additive models . . . . . . . . . . . . . . . . 6 - 85
6.15 A final example: mean values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 92
6.16 Model averaging over linear covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 99
6.17 RMark – an alternative approach to linear models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 110
6.18 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 110
6.19 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 - 110

7 ‘Age’ and cohort models...


7.1 ‘Age’ models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
7.2 Constraining an age model: marked as young only . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 - 14
7.2.1 DM with > 2 age classes: ‘ugly’ interaction terms . . . . . . . . . . . . . . . . . . . . . . . 7 - 24
7.3 Multiple ‘groups’ of marked individuals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 - 28
7.3.1 Marked as young, multiple marking groups . . . . . . . . . . . . . . . . . . . . . . . . . . 7 - 28
7.3.2 Marked as young + marked as adults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 - 32
7.3.3 Marked as young and adult: the design matrix. . . . . . . . . . . . . . . . . . . . . . . . . . 7 - 34
7.4 ‘Time since marking’ – when an age model is NOT an ‘age’ model . . . . . . . . . . . . . . . . . . 7 - 39
7.4.1 Age, transience and the DM – a complex example . . . . . . . . . . . . . . . . . . . . . . . 7 - 44
7.5 Age/TSM models and GOF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 - 49
7.6 Cohort models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 - 49
7.6.1 Building cohort models: PIMS and design matrices . . . . . . . . . . . . . . . . . . . . . . 7 - 51
7.7 Model averaging and age/cohort models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 - 55
7.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 - 57
7.9 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 - 57

8 ‘Dead’ recovery models


8.1 ‘Brownie’ parameterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1
8.2 Counting parameters – Brownie parameterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7
8.3 Brownie estimation: individuals marked as young only . . . . . . . . . . . . . . . . . . . . . . . . . 8 - 11
8.4 Brownie analysis: individuals marked both as young + adults . . . . . . . . . . . . . . . . . . . . . 8 - 12
8.5 A different parameterization: Seber (S and r) models . . . . . . . . . . . . . . . . . . . . . . . . . . 8 - 16
8.5.1 Seber vs. Brownie estimates in constrained models: careful! . . . . . . . . . . . . . . . . . 8 - 19
8.6 Recovery analysis when the number marked is not known . . . . . . . . . . . . . . . . . . . . . . . 8 - 23
8.7 Recovery models and GOF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 - 26
8.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 - 28
8.9 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 - 28


9 Joint live encounter & dead recovery data


9.1 Combining live encounters and dead recoveries – first steps. . . . . . . . . . . . . . . . . . . . . . . 9-1
9.1.1 Estimating fidelity rate, Fᵢ: some key assumptions . . . . . . . . . . . . . . . . . . . . . . . . 9-2
9.2 Live + dead encounters: underlying probability structure . . . . . . . . . . . . . . . . . . . . . . . 9-3
9.3 Combined recapture/recovery analysis in MARK: marked as adult + young . . . . . . . . . . . . 9-5
9.4 Marked as young only: combining live encounters + dead recoveries . . . . . . . . . . . . . . . . . 9 - 11
9.5 Joint live-recapture/live resight/tag-recovery model (Barker’s Model) . . . . . . . . . . . . . . . 9 - 12
9.6 Barker Model – ‘movement’ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 - 13
9.6.1 Formatting encounter histories for the Barker model . . . . . . . . . . . . . . . . . . . . . 9 - 15
9.7 Live encounters, dead recoveries & multi-state models . . . . . . . . . . . . . . . . . . . . . . . . . 9 - 15
9.7.1 Barker model: assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 - 15
9.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 - 17
9.9 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 - 17

10 Multi-state models...
10.1 Separating survival and movement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 - 4
10.2 A worked example: cost of breeding analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 - 6
10.3 States as ‘groups’ – multi-state models and the DM . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 - 13
10.3.1 A simple metapopulation model – size, distance & quality . . . . . . . . . . . . . . . . . . 10 - 17
10.4 Multi-state models as a unifying framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 - 27
10.4.1 Simple example (1) – CJS mark-recapture as a MS problem . . . . . . . . . . . . . . . . . . 10 - 28
10.4.2 Simple example (2) – dead-recovery analysis as a MS problem . . . . . . . . . . . . . . . . 10 - 31
10.4.3 A more complex example – recruitment probability . . . . . . . . . . . . . . . . . . . . . . 10 - 36
10.5 GOF testing and multi-state models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 - 47
10.5.1 Program U-CARE and GOF for time-dependent multi-state models . . . . . . . . . . . . . 10 - 47
10.5.2 MS models and the median ĉ test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 - 48
10.6 Multi-state models & unequal time intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 - 49
10.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 - 49
10.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 - 49

11 Individual covariates
11.1 ML estimation and individual covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 2
11.2 Example 1 – normalizing selection on body weight . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 3
11.2.1 Specifying covariate data in MARK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 5
11.2.2 Executing the analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 6
11.3 A more complex example – time variation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 15
11.4 The DM & individual covariates – some elaborations . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 17
11.5 Plotting + individual covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 25
11.6 Missing covariate values, time-varying covariates, and other complications... . . . . . . . . . . . . 11 - 34
11.6.1 Continuous individual covariates & multi-state models... . . . . . . . . . . . . . . . . . . . 11 - 36
11.6.2 The ‘trinomial likelihood’ approach... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 40
11.7 Individual covariates as ‘group’ variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 46
11.7.1 Individual covariates for a binary classification variable . . . . . . . . . . . . . . . . . . . . 11 - 46
11.7.2 Individual covariates for non-binary classification variables . . . . . . . . . . . . . . . . . 11 - 52
11.8 Model averaging and individual covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 54
11.8.1 Careful! – traps to watch when model averaging . . . . . . . . . . . . . . . . . . . . . . . . 11 - 59
11.8.2 Model averaging and environmental covariates . . . . . . . . . . . . . . . . . . . . . . . . 11 - 61


11.9 GOF testing and individual covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 65
11.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 67
11.11 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 - 67

12 Jolly-Seber models in MARK


12.1 Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 1
12.2 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 2
12.3 Multiple formulations of the same process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 3
12.3.1 The Original Jolly-Seber formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 4
12.3.2 POPAN formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 5
12.3.3 Link-Barker and Pradel-recruitment formulations . . . . . . . . . . . . . . . . . . . . . . . 12 - 7
12.3.4 Burnham JS and Pradel-λ formulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 9
12.3.5 Choosing among the formulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 11
12.3.6 Interesting tidbits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 12
12.4 Example 1 – estimating the number of spawning salmon . . . . . . . . . . . . . . . . . . . . . . . . 12 - 13
12.4.1 POPAN formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 14
12.4.2 Link-Barker and Pradel-recruitment formulations . . . . . . . . . . . . . . . . . . . . . . . 12 - 25
12.4.3 Burnham Jolly-Seber and Pradel-λ formulations . . . . . . . . . . . . . . . . . . . . . . . . 12 - 29
12.5 Example 2 – Muir’s (1957) female capsid data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 35
12.5.1 POPAN formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 37
12.5.2 Link-Barker and Pradel-recruitment formulations . . . . . . . . . . . . . . . . . . . . . . . 12 - 42
12.5.3 Burnham Jolly-Seber and Pradel-λ formulations . . . . . . . . . . . . . . . . . . . . . . . . 12 - 45
12.6 Final words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 48
12.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 - 49

13 Time-symmetric open models: recruitment, survival, and population growth rate


13.1 Population growth: realized vs. projected . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 1
13.2 Estimating realized λ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 2
13.2.1 Reversing encounter histories: ϕ and γ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 3
13.2.2 Putting it together: deriving λ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 4
13.3 Projected λ versus realized λ: are they equivalent? . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 6
13.4 Time-symmetric models in MARK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 6
13.4.1 Linear constraints and time-symmetric models . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 10
13.5 Extensions using the S and f parameterization... . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 14
13.6 ‘Average’ realized growth rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 18
13.7 Time-symmetric models and Jolly-Seber estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 25
13.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 26
13.9 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 - 26

14 Closed population capture-recapture models


14.1 The basic idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 1
14.1.1 The Lincoln-Petersen estimator – a quick review . . . . . . . . . . . . . . . . . . . . . . . . 14 - 2
14.2 Likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 3
14.2.1 Full likelihood approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 3
14.2.2 Conditional likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 4
14.3 Model types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 7
14.3.1 Constraining the final p . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 9
14.4 Encounter histories format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 10


14.5 Building models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 11
14.6 Closed population models and the design matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 15
14.7 Heterogeneity models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 21
14.7.1 Finite, discrete mixture models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 23
14.7.2 Continuous mixture models using numerical integration . . . . . . . . . . . . . . . . . . . 14 - 27
14.8 Misidentification models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 34
14.8.1 Joint heterogeneity and misidentification models . . . . . . . . . . . . . . . . . . . . . . . 14 - 35
14.9 Goodness-of-fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 36
14.10 Model averaging and closed models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 37
14.10.1 Estimating CI for model averaged abundance estimates . . . . . . . . . . . . . . . . . . . . 14 - 44
14.11 Parameter estimability in closed models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 50
14.12 Other applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 51
14.13 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 51
14.14 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 51
Addendum 1: testing equality of estimated abundance between groups . . . . . . . . . . . . . . . . . . . 14 - 53
Addendum 2: heterogeneity modeling for other data types . . . . . . . . . . . . . . . . . . . . . . . . . . 14 - 59

15 The ‘robust design’


15.1 Decomposing the probability of subsequent encounter . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 1
15.2 Estimating γ: the classical ‘live encounter’ RD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 4
15.3 The RD extended – temporary emigration: γ′ and γ′′ . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 6
15.3.1 γ parameters and multi-state notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 7
15.3.2 Illustrating the extended model: encounter histories & probability expressions . . . . . . 15 - 8
15.3.3 Random (classical) versus Markovian temporary emigration . . . . . . . . . . . . . . . . . 15 - 9
15.3.4 Alternate movement models: no movement, and ‘even flow’ . . . . . . . . . . . . . . . . . 15 - 11
15.4 Advantages of the RD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 13
15.5 Assumptions of analysis under the RD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 13
15.6 RD (closed) in MARK – some worked examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 14
15.6.1 Closed robust design – simple worked example . . . . . . . . . . . . . . . . . . . . . . . . 15 - 14
15.6.2 Closed robust design – more complex worked example . . . . . . . . . . . . . . . . . . . . 15 - 23
15.7 The multi-state closed RD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 28
15.7.1 Multi-state closed RD – simple worked example . . . . . . . . . . . . . . . . . . . . . . . . 15 - 28
15.8 The ‘open’ robust design... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 38
15.8.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 38
15.8.2 The General ORD Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 39
15.8.3 Implementing the ORD in MARK: (relatively) simple example . . . . . . . . . . . . . . . 15 - 40
15.8.4 Dealing with unobservable states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 43
15.8.5 Which parameters can be estimated? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 45
15.8.6 Goodness of fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 45
15.8.7 Derived parameters from information within primary periods . . . . . . . . . . . . . . . . 15 - 45
15.8.8 Analyzing data for just one primary period . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 46
15.9 The robust design & unequal time intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 47
15.10 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 - 48

16 Known-fate models
16.1 The Kaplan-Meier Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 1
16.2 The binomial model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 3
16.3 Encounter histories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 5


16.4 Worked example: black duck survival . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 6
16.5 Pollock’s staggered entry design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 10
16.5.1 Staggered entry – worked example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 10
16.6 Known fate and joint live-dead encounter models . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 20
16.6.1 Live-dead and known fate models (1) – ‘radio impact’ . . . . . . . . . . . . . . . . . . . . . 16 - 22
16.6.2 Live-dead and known fate models: (2) – ‘temporary emigration’ . . . . . . . . . . . . . . . 16 - 22
16.7 Censoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 22
16.8 Goodness of fit and known fate models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 23
16.9 Known-fate models and derived parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 24
16.10 Known-fate analyses and ‘nest success models’ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 24
16.11 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 24
16.12 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 - 25

17 Nest survival models


17.1 Competing models of daily survival rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 3
17.2 Encounter histories format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 4
17.3 Nest survival, encounter histories, & cell probabilities . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 6
17.4 Building models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 8
17.4.1 Models that consider observer effects on DSR . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 15
17.5 Model results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 16
17.6 Individual covariates and design matrix functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 17
17.7 Additional applications of the nest success model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 18
17.8 Goodness of fit and nest survival . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 18
17.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 18
17.10 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 - 19

18 Mark-resight models
18.1 What is mark-resight? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 2
18.2 The mixed logit-normal mark-resight model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 5
18.2.1 No individually identifiable marks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 6
18.2.2 Individually identifiable marks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 11
18.3 The immigration-emigration mixed logit-normal model . . . . . . . . . . . . . . . . . . . . . . . . 18 - 16
18.3.1 No individually identifiable marks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 17
18.3.2 Individually identifiable marks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 20
18.4 The Poisson-log normal mark-resight model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 23
18.4.1 Closed resightings only . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 25
18.4.2 Full-likelihood robust design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 33
18.5 Which mark-resight model? A summary ‘decision table’... . . . . . . . . . . . . . . . . . . . . . . . 18 - 39
18.6 Suggestions for mark-resight analyses in MARK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 39
18.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 40
Addendum: formatting mark-resight input files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 - 42

19 Young survival from marked adults


19.1 Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 - 2
19.2 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 - 2
19.3 Model Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 - 3
19.4 Parameters and Sample Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 - 5
19.5 Relationship with CJS and Multi-state Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 - 5
19.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 - 6

19.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 - 6

20 Density estimation...
20.1 Likelihood . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 - 5
20.2 Implementation in MARK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 - 5
20.2.1 Estimate proportion on site or use the data? . . . . . . . . . . . . . . . . . . . . . . . . . . 20 - 12
20.2.2 Threshold Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 - 14
20.3 Confidence Intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 - 16
20.4 Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 - 16
20.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 - 18
20.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 - 18

21 Occupancy models – single-species


21.1 The static (single-season) occupancy model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 2
21.1.1 Single-season occupancy model – example without covariates . . . . . . . . . . . . . . . . 21 - 4
21.1.2 Single-season occupancy model – incorporating covariates . . . . . . . . . . . . . . . . . . 21 - 5
21.1.3 An example with covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 6
21.2 Model averaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 12
21.3 Model assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 15
21.3.1 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 15
21.3.2 Unmodeled heterogeneity in occupancy or detection probability . . . . . . . . . . . . . . 21 - 16
21.3.3 Lack of independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 17
21.3.4 False positives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 19
21.4 Unobserved detection heterogeneity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 20
21.4.1 Finite mixtures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 20
21.4.2 Random effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 22
21.5 Goodness of fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 23
21.6 The dynamic occupancy model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 24
21.6.1 Dynamic (multi-season) occupancy – an example . . . . . . . . . . . . . . . . . . . . . . . 21 - 27
21.7 The ‘false positive’ (misidentification) occupancy models . . . . . . . . . . . . . . . . . . . . . . . 21 - 32
21.7.1 False positive single-season occupancy model in MARK . . . . . . . . . . . . . . . . . . . 21 - 36
21.7.2 The dynamic (multi-season) occupancy model with false positives . . . . . . . . . . . . . 21 - 40
21.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 43
21.9 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 - 44

22 Occupancy models – multi-species


22.1 Multi-species occupancy model – without covariates . . . . . . . . . . . . . . . . . . . . . . . . . . 22 - 3
22.2 Multi-species occupancy model – with covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 - 6
22.2.1 Detection Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 - 11
22.3 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 - 13

A Simulations in MARK . . .
A.1 Simulating CJS data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
A.1.1 Simulating CJS data – MARK simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
A.1.2 Simulating CJS data – RELEASE simulations . . . . . . . . . . . . . . . . . . . . . . . . . . A - 10
A.2 Generating encounter histories – program MARK . . . . . . . . . . . . . . . . . . . . . . . . . . . . A - 12
A.3 Simulating data from a prior MARK analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A - 14
A.4 Simulation of robust design + closed capture data – special considerations . . . . . . . . . . . . . A - 18

A.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A - 18

B The ‘Delta method’ . . .


B.1 Background – mean and variance of random variables . . . . . . . . . . . . . . . . . . . . . . . . . B-1
B.2 Transformations of random variables and the Delta method . . . . . . . . . . . . . . . . . . . . . . B-4
B.3 Transformations of one variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-7
B.3.1 A potential complication – violation of assumptions . . . . . . . . . . . . . . . . . . . . . . B-9
B.4 Transformations of two or more variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B - 18
B.5 Delta method and model averaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B - 35
B.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B - 37
B.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B - 38
Addendum: ‘computationally intensive’ approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B - 38

C RMark - an alternative approach to building linear models in MARK


C.1 RMark installation and first steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3
C.2 A simple example (return of the dippers) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-5
C.3 How RMark works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-9
C.4 Dissecting the function “mark” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 15
C.4.1 Function process.data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 15
C.4.2 Function make.design.data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 17
C.5 More simple examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 20
C.6 Design covariates in RMark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 25
C.7 Comparing results from multiple models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 30
C.8 Producing model-averaged parameter estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 32
C.9 Quasi-likelihood adjustment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 33
C.10 Coping with identifiability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 34
C.11 Fixing real parameter values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 40
C.12 Data Structure and Import for RMark . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 46
C.13 A more organized approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 51
C.14 Defining groups with more than one variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 55
C.15 More complex examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 58
C.16 Individual covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 72
C.17 Multi-strata example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 78
C.18 Nest survival example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 86
C.19 Occupancy examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 90
C.20 Known fate example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 96
C.21 Exporting to MARK interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 98
C.22 Using R for further computation and graphics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 99
C.23 Problems and errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 102
C.24 A (very) brief R primer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C - 104

D Variance components and random effects models in MARK . . .


D.1 Variance components – some basic background theory . . . . . . . . . . . . . . . . . . . . . . . . . D-3
D.2 Variance components estimation – worked examples . . . . . . . . . . . . . . . . . . . . . . . . . . D-7
D.2.1 Binomial survival – simple mean (no sampling covariance) . . . . . . . . . . . . . . . . . . D-7
D.2.2 Binomial example extended – simple trend . . . . . . . . . . . . . . . . . . . . . . . . . . . D - 12
D.2.3 What about sampling covariance? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D - 15
D.3 Random effects models and shrinkage estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D - 17
D.3.1 The basic ideas... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D - 18

D.3.2 Some technical background... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D - 21


D.3.3 Deriving an AIC for the random effects model . . . . . . . . . . . . . . . . . . . . . . . . . D - 23
D.4 Random effects models – some worked examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . D - 23
D.4.1 Binomial survival revisited – basic mechanics . . . . . . . . . . . . . . . . . . . . . . . . . D - 24
D.4.2 A more complex example – California mallard recovery data . . . . . . . . . . . . . . . . . D - 27
D.4.3 Random effects – environmental covariates . . . . . . . . . . . . . . . . . . . . . . . . . . . D - 33
D.4.4 Worked example – λ – Pradel model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D - 37
D.5 Model averaging? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D - 40
D.6 Caveats, warnings, and general recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . . D - 42
D.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D - 44
D.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D - 45

E Markov Chain Monte Carlo (MCMC) estimation in MARK . . .


E.1 Variance components analysis revisited – MCMC approach . . . . . . . . . . . . . . . . . . . . . . E-5
E.1.1 Example 1 – binomial survival re-visited . . . . . . . . . . . . . . . . . . . . . . . . . . . . E-5
E.1.2 Estimating the hyperparameters µ and σ . . . . . . . . . . . . . . . . . . . . . . . . . . . . E - 18
E.1.3 Example 2 – California mallard survival re-visited . . . . . . . . . . . . . . . . . . . . . . . E - 25
E.1.4 Example 3 – environmental covariates re-visited . . . . . . . . . . . . . . . . . . . . . . . . E - 28
E.1.5 Example 4 – group + time as random effects . . . . . . . . . . . . . . . . . . . . . . . . . . E - 38
E.2 Hyperdistributions between structural parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . E - 41
E.2.1 Example 1 - dead recovery analysis (corr{S, f}) . . . . . . . . . . . . . . . . . . . . . . . . . E - 41
E.2.2 Example 2 – time-symmetric model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E - 48
E.3 Caveats, warnings, and general recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . E - 51
E.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E - 53
E.5 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E - 53

F Parameter identifiability by data cloning...


F.1 Worked example (1) – the Dippers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-2
F.1.1 Structural identifiability – confounded parameters . . . . . . . . . . . . . . . . . . . . . . . F-3
F.1.2 ‘boundary problems’ – data limits and link functions . . . . . . . . . . . . . . . . . . . . . F-5
F.1.3 Choice of link function – does it matter? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F-8
F.2 Worked example (2) – AFS monograph example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F - 10
F.3 Worked example (3) – robust design example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F - 10
F.4 Data cloning and ‘unbounded’ parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F - 13
F.5 Worked example (4) – Pradel model example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F - 13
F.6 Limitations & other thoughts and approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F - 17
F.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F - 21
F.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F - 21

G The ‘data bootstrap. . . ’


G.1 Empirical example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G-2
G.2 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G - 10
G.3 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G - 10
G.4 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G - 11

Program MARK – a ‘gentle introduction’
Program MARK is the most comprehensive and widely used software application currently available
for the ‘analysis of data from marked individuals’ (hence the name MARK). MARK is a very flexible and
powerful program, with many options, and a lot of technical sophistication. It encompasses virtually
all currently used methods for analysis of marked individuals – including many new approaches only
recently described in the primary literature.
As such, MARK is not a program that you can learn to use without some instruction. However, the
only ‘bundled’ documentation for MARK is the integrated ‘help file’. This is not to slight the help file
– it is extremely comprehensive, and covers much of the ‘technical details’. For people with a strong
background in analysis of these sorts of data, and especially experienced users of E/M-SURGE and
POPAN (the other ‘big’ applications in common use), the help file alone may, in fact, be sufficient to get
you ‘up and running’ with MARK with only a bit of work. MARK draws heavily on the strengths of
other applications, and its underlying principles are (to varying degrees) similar across many of
those applications.
However, for the ‘new user’, who may have little to no background in the analysis of these sorts of data,
learning how to use MARK from the help file alone is very inefficient, and is often a frustrating exercise.
This type of user needs a different type of ‘documentation’. It was with this type of user in mind that we
developed this book – a comprehensive example-driven ‘tutorial’ on the theory, mechanics and practice
of using MARK.
Of course, MARK is not the only program available for analysis of encounter data from marked
individuals (see http://www.phidot.org/software/ for pointers to other available software), so you
may wonder “why bother with MARK?”. The short answer is that MARK offers far more flexibility
and power in statistical modeling and hypothesis testing than other widely available and frequently
used programs. It also uses a consistent and familiar ‘Windows interface’, and allows the user to work
with a consistent data format throughout. If you’re just starting out, and have to pick one program to
become proficient with, we strongly suggest you spend your time with MARK. Of course, there may
be reasons why you don’t want to use MARK, but on average, it’ll be well worth your while.

About this book

This book is intended to allow you to (in effect) ‘teach yourself how to use MARK’. We have included
much of the material we normally cover in the classroom or during workshops, placing as much
emphasis on “why things work the way they do” as on “now...press this button”. Our basic view
of learning to use software is that the only way to really master an application is to understand what it
is doing, and then to practice the mechanics of the application (over and over again).
Having said that, it is worth letting you know right from the beginning that this is not a book on the
theory of analysis of data from marked individuals (in the strict sense), and should not be cited as such.∗
This guide is intended simply to be an accessible means by which you can learn how to use MARK. In
the process, however, we do cover a fair bit of the ‘conceptual theory’. If you’re an experienced analyst
of these sorts of data, you’ll quickly find which parts you can skip, and which you can’t. Regardless, we
urge you to read the current literature (see below) – it is the only way to keep up with the many recent
developments in the analysis of data from marked individuals.

Structure of the book

Chapters 1 through 7 are the ‘core’ mechanical and conceptual ‘skill-building’ chapters. Chapters 8 and
higher are focused on more advanced applications. For newcomers, we strongly suggest that you work
through Chapters 1 through 7 first. And, by ‘working through’, we mean sitting at the computer, with
this book, and working through all of the computer exercises.
This book is largely based on the premise that you ‘learn by doing’. Chapter 1 provides a simple
introduction to some of the ideas and theory. Chapter 2 covers the basics of data formatting (the obvious
first step to analyzing your data). Chapters 3 to 7 provide detailed instruction on the ‘basics’ of using
MARK, within the context of ‘standard’ open population mark-recapture analysis.
We decided to begin with basic mark-recapture for two reasons. First, it is the basis for most of
the commonly used software applications currently in wide use. Since most experienced analysts
will probably have some level of experience with one or more of these applications, building the
core introduction around mark-recapture seems to present the minimal learning curve. Second, if you
understand basic mark-recapture analysis, you can pick up the other types of analysis fairly quickly.
In these first chapters, we will take you through the process of using MARK, working for the most part
with ‘practice’ data sets† , starting with the basic rudiments, and ending with some fairly sophisticated
examples. Our goal is to provide you with enough understanding of how MARK works so that even if
we don’t explicitly cover the particular problem you’re working on, you should be able to figure out how
to approach the problem with MARK, on your own. In fact, we measure the success of this book by how
little you’ll need to refer to it again, once you’ve gone through all of the core chapters. Once you have
worked through Chapters 1 through 7, you should be able to jump to any of the following chapters with
relative ease. All succeeding chapters are reasonably self-contained, but do presume you’re familiar
with basic mark-recapture theory, and (especially) how it is implemented in MARK.

begin sidebar
sidebars – extra information

Interspersed throughout most chapters will be ‘sidebars’ – small snippets of technical information,
conceptual arm-waving, or other information which we think is potentially useful for you to read – but
not so essential that they can’t be skipped over to maintain your flow of the reading of the main body
of the text. Whenever you come across one of these sidebar items, it might be worth at least reading
the first few lines to see what it refers to.

end sidebar


∗ We’re occasionally asked how to properly cite this book. Easy answer – please don’t. This book is not a ‘technical reference’,
but a ‘software manual’. The various ‘technical’ bits in the book (i.e., suggestions on how to approach some sorts of analysis,
guides to interpreting results...) are drawn from our collective experience, and the primary literature, the latter of which should
be cited in all cases.

† Most of the practice data sets are contained in the file markdata.zip, which can be downloaded from the same website you
accessed to download this book. The last item of the drop-down menu where you select chapters (left-hand side of the page)
is a link to the example files. If you can’t find the files there, check the /mark/examples subdirectory that is created when you
install MARK.

Foreword

Getting MARK and installing it on your computer

The primary source for program MARK is Gary White’s MARK web page, which is currently located
at

https://sites.warnercnr.colostate.edu/gwhite/program-mark/

New versions, miscellaneous notes, and general comments concerning MARK are found there, as
well as links to lecture notes, and other relevant information. And, if you’re reading this, then you’ve
obviously found the ‘other MARK website’, maintained by Evan Cooch.

http://www.phidot.org/software/mark/

The purpose of this ‘other’ web site is twofold: (i) to provide access to this book, and (ii) to provide
a locally mirrored copy of the MARK install files (either one website or the other is always likely to
be ‘up and running’). Updates to MARK are relatively frequent – and attempts are made to keep the
book as current as the software. At present, the only way to check for new versions of both the software
and the book is to periodically visit the MARK website(s), and look to see if another version has been
posted. Alternatively, you can register for the online MARK discussion forum (see below), which will
send periodic emails announcing new releases of MARK.
Now, about installing MARK. Before we go into the details – some quick comments:

• MARK is a Windows program. For most users, this means a machine running Windows 7
→ Windows 10 as the operating system. However, as ‘virtualization’ software gets better and
better, it has become more tractable to run MARK on a non-Windows platform (e.g., using
virtualization software, like VMWare, or VirtualBox). Some details about running MARK on
a non-Windows machine can be found on the MARK website.
• Running MARK also requires a ‘real computer’ – we recommend minimally a machine with
a CPU clocked at 2 GHz or better (MARK supports and makes use of multi-core processors),
with at least 2 GB of RAM (>2 GB strongly recommended). We also suggest getting a decent
sized monitor (you’ll discover why we make this recommendation the first time you pull up
a ‘big, ugly’ design matrix).

You install MARK using a fairly standard setup program. Once you’ve downloaded the MARK
setup.exe program, simply double-click it, and off you go! It is a fairly standard Windows installer, with
prompts for where you want to install MARK, and so forth. The installation is generally uneventful.
Once the install program has finished, you’re done. The install routine should have placed a short-cut
to MARK for you on your desktop.

begin sidebar
upgrading from an earlier installation

Updates to MARK are fairly frequent (with the pace of change being roughly proportional to the rate
at which new methods enter the literature). To upgrade an existing MARK installation, you should

1. uninstall the old version, using the standard ‘uninstall software’ option from the Windows
control panel (ignore any error messages you might get about Windows not being able to
unregister certain items – these are spurious). If you really want to be thorough, follow this
by manually deleting the MARK subdirectory as well (although this isn’t really necessary).
2. install the new version. For some operating systems, you may get an error message or two
concerning problems trying to register certain graphics components – ignore these.

3. test the installation. Double-click the MARK icon that should have been placed on the desktop
during installation, and make sure the MARK GUI starts up correctly. If it does, you should
be fine.

end sidebar

Finding help

No matter how good the documentation, there will always be things that remain unclear, or simply
aren’t covered in this book (although we keep trying). As such, it’s nice to have some options for getting
help (beyond the earlier suggestion to ‘check the help file’). As such, we have created a web-based
discussion forum for just this purpose – a place where you can ask questions, make suggestions for
MARK∗ , and so forth. The forum can be accessed at

http://www.phidot.org/forum

In addition to providing a resource for getting answers to specific technical questions, registering
for the forum† is also a convenient way to learn about recent changes to MARK (and this book), and
finding out about upcoming workshops and training sessions.


∗ The forum also hosts similar discussions for a number of other software applications; e.g., PRESENCE, M/E-SURGE...

† Registration for the forum is free, and you have a fair bit of control over how much ‘email traffic’ it generates.

References & background reading

The literature for analysis of data from marked individuals is very large – and growing at an exponential
rate (100-150 new papers per year in recent years). As such, it’s easy to feel that keeping up with the
literature is not even remotely tractable. Don’t fret – it’s unlikely anyone reads all the papers.∗
Fortunately, there have been several recently published books, which do much of the collation and
synthesis of this large literature for you – we strongly suggest that you get access (in some fashion) to
the following 3 books:

Analysis and Management of Animal Populations – Ken Williams, Jim Nichols,


and Mike Conroy. (2002) Academic Press. 1,040 pages.
A comprehensive volume that is the de facto standard reference for the integration of
modeling, estimation, and management, written by 3 of the luminaries in the field. It
provides a superb synthesis of most of the vast literature on estimation from data from
marked individuals, as part of a cohesive framework of construction and use of models
in conservation management.

Handbook of Capture-Recapture Analysis – Steve Amstrup, Trent McDonald,


and Bryan Manly. (2006) Princeton University Press. 296 pages.
In some ways, a précis of some of the key ‘estimation’ sections of the WNC book
(above), in others, a more detailed ‘guide’ to several extensions to methods discussed
in WNC. A very good, compact summary of estimation methods, with a focus on
practical application.

Model Selection and Multi-Model Inference (2nd Edition) – Ken Burnham and
David Anderson. (2002) Springer-Verlag. 496 pages.
So you want to fit models to data, eh? Well, fundamental to this process is the issue
of selecting amongst such models. How should you do this? Burnham and Anderson
cover this critical issue in great detail – and in so doing, will give you a solid basis
for the mechanics, and theory, of model selection as applied to analysis of data from
marked individuals.

Collectively, these books represent the minimum library you should have at your disposal, and are
essential companions to this book.


∗ With the likely exception of Jim Nichols, who is a ‘special case’ in several respects...


Acknowledgements

The first draft of this book was written over two weeks in 1998. It was approximately 150 pages in
length. It is now >1,100 pages, and has gone through numerous revisions, based almost entirely on
comments, corrections and feedback submitted by many of the several thousand people who have used
the book in its various incarnations. In addition, several new chapters have recently been contributed
by some of our colleagues. These contributions are so significant that we now consider ourselves as
merely ‘editors’ of the larger effort by the community of MARK users to document the software. Any
strengths of this book come from these collegial interactions – its failings, however, are our fault alone.
We believe in earnest that the only truly ‘dumb’ question is one never asked. Hopefully, most of your
questions concerning the use of program MARK are answered here.

Ithaca, New York EGC


Ft. Collins, Colorado GCW
2019

This publication may not be reproduced, stored, or transmitted


in any form except for (i) fair use for the purposes of research,
teaching (which includes printing for instructional purposes)
or private study, or for criticism or review, or (ii) with the
express written permission of the editors, or authors of individual
chapters. Permission is considered ‘granted’ for use governed
by these terms. Enquiries concerning reproduction outside these
terms should be directed to the editors.

CHAPTER 1

First steps. . .

We introduce the basic idea for analysis of data from encounters of marked individuals by means
of a simple example. Suppose you are interested in exploring the potential ‘cost of reproduction’ on
survival of some species of your favorite taxa (say, a species of bird). The basic idea is pretty simple:
an individual that spends a greater proportion of available energy on breeding may have less available
for other activities which may be important for survival. In this case, individuals putting more effort
into breeding (i.e., producing more offspring) may have lower survival than individuals putting less
effort into breeding. On the other hand, it might be that individuals that are of better ‘quality’ are able
to produce more offspring, such that there is no relationship between ‘effort’ and survival.
You decide to reduce the confounding effects of the ‘quality’ hypothesis by doing an experiment. You
take a sample of individuals who all produce the same number of offspring (the idea being, perhaps,
that if they had the same number of offspring in a particular breeding attempt, that they are likely to be
of similar quality). For some of these individuals, you increase their ‘effort’ by adding some offspring
to the nest (i.e., more mouths to feed, more effort expended feeding them). For others, you reduce effort
by removing some offspring from the nest (i.e., fewer mouths to feed, less effort spent feeding them).
Finally, for some individuals, you do not change the number of offspring, thus creating a control group.
As described, you’ve set up an ‘experiment’, consisting of a control group (unmanipulated nests),
and 2 treatment groups: one where the number of offspring has been reduced, and one where the
number of offspring has been increased. For convenience, call the group where the number of offspring
was increased the ‘addition’ group, and call the group where the number of offspring was reduced
the ‘subtraction’ group. Your hypothesis might be that the survival probability of the females in the
‘addition’ group should be lower than the control (since the females with enlarged broods might have to
work harder, potentially at the expense of survival), whereas the survival probability of the females in
the ‘subtraction’ group should be higher than the control group (since the females with reduced broods
might not have to work as hard as the control group, potentially increasing their survival). To test this
hypothesis, you want to estimate the survival of the females in each of the 3 groups. To do this, you
capture and individually mark the adult females at each nest included in each of the treatment groups
(control, additions, subtractions). You release them, and come back at some time in future to see how
many of these marked individuals are ‘alive’ (the word ‘alive’ is written parenthetically for a reason
which will be obvious in a moment).
Suppose at the start of your study (time t) you capture and mark 50 individuals in each of the 3 groups.
Then, at some later time (time t+1), you go back out in the field and encounter alive 30 of the marked
individuals from the ‘additions’ treatment, 38 of the marked individuals from the control group, and
30 individuals from the ‘subtractions’ treatment. The ‘encounter data’ from our study are tabulated at
the top of the next page.

© Cooch & White (2019) 08.06.2019



group          marked at t    encountered alive at t+1
additions          50                    30
control            50                    38
subtractions       50                    30

Hmm. This seems strange. While you predicted that the 2 treatment groups would differ from the
controls, you did not predict that the results from the two treatments would be the same. What do these
results indicate? Well, of course, you could resort to the time-honored tradition of trying to concoct a
parsimonious ‘post-hoc adaptationist’ story to try to demonstrate that (in fact) these results ‘made
perfect sense’, according to some ‘new twist to underlying theory’. However, there is another possibility
– namely, that the analysis has not been thoroughly understood, and as such, interpretation of the results
collected so far needs to be approached very cautiously.

1.1. Return ‘rates’

Let’s step back for a moment and think carefully about our experiment – particularly, the analysis of
‘survival’. In our study, we marked a sample of individual females, and simply counted the numbers of
those females that were subsequently seen again on the next sampling occasion. The implicit assumption
is that by comparing relative proportions of ‘survivors’ in our samples (perhaps using a simple χ2 test),
we will be testing for differences in ‘survival probability’. However (and this is the key step), is this
a valid assumption? Our data consist of the number of marked and released individuals that were
encountered again at the second sampling occasion. While it is obvious that in order to be seen on the
second occasion, the marked individual must have survived, is there anything else that must happen?
The answer (perhaps obviously, but in case it isn’t) is ‘yes’ – the number of individuals encountered
on the second sampling occasion is a function of 2 probabilities: the probability of survival, and the
probability that conditional on surviving, that the surviving individual is encountered. While the first
of these 2 probabilities is obvious (and is in fact what we’re interested in), the second may not be. This
second probability (which we refer to generically as the ‘encounter probability’) is the probability that
given that the individual is alive and in the sample, that it is in fact encountered (e.g., seen, or ‘visually
encountered’). In other words, simply because an individual is alive and in the sampling area may not
guarantee that it is encountered.
So, the proportion of individuals that were encountered alive on the second sampling occasion (which
is often referred to in the literature as ‘return rate’∗ ) is the product of 2 different probability processes:
the probability of surviving and returning to the sampling area (which we’ll call ‘apparent’ or ‘local’
survival), and the probability of being encountered, conditional on being alive and in the sample (which
we’ll call ‘encounter probability’). So, ‘return rate’ = ‘survival probability’ × ‘encounter probability’.
Let’s let ϕ (pronounced ‘fee’ or ‘fie’, depending on where you come from) represent the ‘local survival
probability’, and p represent the ‘encounter probability’. Thus, we would write ‘return rate’ = ϕp.
So, why do we care? We care because this complicates the interpretation of ‘return rates’ – in our
example, differences in ‘return rates’ could reflect differences in the probability of survival, or they
could reflect differences in encounter probability, or both! Similarly, lack of differences in ‘return rates’
(as we see when comparing the ‘additions’ and ‘subtractions’ treatment groups in our example) may
not indicate ‘no differences in survival’ (as one interpretation) – there may in fact be differences in
survival, but corresponding differences in encounter probability, such that their products (‘return rate’)


∗ The term ‘return rate’ is something of a misnomer, since it is not a rate, but rather a proportion. However, because the term ‘return
rate’ is in wide use in the literature, we will continue to use it here.


are equal. For example, in our example study, the ‘return rate’ for both the ‘additions’ and ‘subtractions’
treatment groups is the same: (30/50) = 0.6. Our initial ‘reaction’ might have been that these data did
not support our hypothesis predicting difference in survival between the 2 groups.
However, suppose that in fact the ‘treatment’ (i.e., manipulating the number of offspring in the nest)
not only influenced survival probability (as was our original hypothesis), but also potentially influenced
encounter probabilities? For example, suppose the true survival probability of the ‘additions’ group was
ϕ_add = 0.65 (i.e., a 65% probability of surviving from t to t+1), while for the ‘subtractions’ group, the
survival probability is ϕ_sub = 0.80 (i.e., an 80% probability of surviving from t to t+1). However, in
addition, suppose that the encounter probability for the ‘additions’ group was p_add = 0.923 (i.e., a
92.3% chance that a marked individual will be encountered, conditional on it being alive and in the
sampling area), while for the ‘subtractions’ group, the encounter probability was p_sub = 0.75 (we’ll
leave it to proponents of the adaptationist paradigm to come up with a ‘plausible’ explanation for such
differences). While there are clear differences between the 2 groups, the products of the 2 probabilities
are the same: (0.65 × 0.923) ≈ 0.6, and (0.8 × 0.75) = 0.6. In other words, it is difficult to compare ‘return
rates’, since differences (or lack thereof) could reflect differences or similarities in the 2 underlying
probabilities (survival probability, and encounter probability).
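If it helps to see this confounding numerically, here is a minimal sketch (in Python), using the hypothetical ϕ and p values from the example above, showing how two quite different pairs of probabilities collapse to the same return rate:

```python
# Return rate is the product of apparent survival (phi) and encounter
# probability (p). The values below are the hypothetical ones from the text.
def return_rate(phi, p):
    return phi * p

phi_add, p_add = 0.65, 0.923   # 'additions' group
phi_sub, p_sub = 0.80, 0.75    # 'subtractions' group

rr_add = return_rate(phi_add, p_add)   # ~0.600
rr_sub = return_rate(phi_sub, p_sub)   # 0.600
```

The point is not the arithmetic, of course – it is that the product ϕp, on its own, cannot tell you which of its two components differs between groups.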

1.2. A more robust approach

How do we solve this dilemma? Well, the solution we’re going to focus on here (and essentially for
the next 1,000 pages or so) is to collect more data, and using these data, separately estimate all of the
probabilities (at least, when possible) underlying the encounters of marked individuals. Suppose for
example, we collected more data for our experiment, on a third sampling occasion (at time t + 2). On
the third occasion, we encounter individuals marked on the first occasion.
But, perhaps some of those individuals encountered on the third occasion were not encountered on
the second occasion. How would we be able to use these data? First, we introduce a simple bookkeeping
device, to help us keep track of our ‘encounter’ data (in fact, we will use this bookkeeping system
throughout the rest of the book – discussed in much more detail in Chapter 2). We will ‘keep track’ of
our data using what we call ‘encounter histories’. Let a ‘1’ represent an encounter with a marked individual
(in this example, we’re focusing only on ‘live encounters’), and let a ‘0’ indicate that a particular marked
individual was not encountered on a particular sampling occasion.
Now, recall from our previous discussion that a ‘0’ could indicate that the individual had in fact
died, but it could also indicate that the individual was in fact still alive, but simply not encountered (the
problem we face is how to differentiate between the two possibilities). For our 3 occasion study, where
individuals were uniquely marked on the first occasion only, there are 4 possible encounter histories:

encounter history interpretation

111 captured and marked on the first occasion, alive and encountered on
the second occasion, alive and encountered on the third occasion

110 captured and marked on the first occasion, alive and encountered on
the second occasion, and either (i) dead by the third occasion, or (ii)
alive on the third occasion, but not encountered

101 captured and marked on the first occasion, alive and not encountered
on the second occasion, and alive and encountered on the third
occasion


100 captured and marked on the first occasion, and either (i) dead
by the second occasion, (ii) alive on the second occasion, and not
encountered, and alive on the third occasion and not encountered,
or (iii) alive on the second occasion, and not encountered, and dead by
the third occasion

You might be puzzled by the verbal explanation of the third encounter history: 101. How do we know
that the individual is alive at the second occasion, if we didn’t see it? Easy – we come to this conclusion
logically, since we saw it alive at the third occasion. And, if it was alive at occasion 3, then it must also
have been alive at occasion 2. But, we didn’t see it on occasion 2, even though we know (logically) that it
was alive. This, in fact, is one of the key pieces of logic – the individual was alive at the second occasion
but not seen. If p is the probability of detecting (encountering) an individual given that it is alive and in
the sample, then (1 − p) is the probability of missing it (i.e., not detecting it). And clearly, for encounter
history ‘101’, we ‘missed’ the individual at the second occasion.
All we need to do next is take this basic idea, and formalize it. As written (above), you might see that
each of these encounter histories could occur due to a specific sequence of events, each of which has a
corresponding probability. Let ϕ i be the probability of surviving from time (i) to (i+1), and let p i be the
probability of encounter at time (i). Again, if p i is the probability of encounter at time (i), then (1 − p i )
is the probability of not encountering the individual at time (i).
Thus, we can re-write the preceding table as:

encounter history      probability of encounter history

111                    ϕ1 p2 ϕ2 p3

110                    ϕ1 p2 [ϕ2 (1 − p3) + (1 − ϕ2)] = ϕ1 p2 (1 − ϕ2 p3)

101                    ϕ1 (1 − p2) ϕ2 p3

100                    (1 − ϕ1) + ϕ1 (1 − p2)(1 − ϕ2) + ϕ1 (1 − p2) ϕ2 (1 − p3)
                         = 1 − ϕ1 p2 − ϕ1 (1 − p2) ϕ2 p3

(If you don’t immediately see how to derive the probability expressions corresponding to each
encounter history, not to worry: we will cover the derivations in much more detail in later chapters).
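For readers who like to ‘check the algebra’ numerically, the following sketch (in Python, with arbitrary illustrative parameter values – they are not estimates from the example data) codes the four probability expressions from the table, and confirms that they sum to 1, as they must, since the four histories are exhaustive:

```python
# The four possible 3-occasion encounter-history probabilities, coded
# directly from the expressions in the table above. Parameter values
# here are arbitrary illustrations, not estimates from the example data.
def history_probs(phi1, phi2, p2, p3):
    return {
        "111": phi1 * p2 * phi2 * p3,
        "110": phi1 * p2 * ((1 - phi2) + phi2 * (1 - p3)),
        "101": phi1 * (1 - p2) * phi2 * p3,
        "100": (1 - phi1)
               + phi1 * (1 - p2) * (1 - phi2)
               + phi1 * (1 - p2) * phi2 * (1 - p3),
    }

probs = history_probs(phi1=0.65, phi2=0.80, p2=0.90, p3=0.75)
total = sum(probs.values())   # the histories are exhaustive, so total = 1
```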
So, for each of our 3 treatment groups, we simply count the number of individuals with a given
encounter history. Then what? Once we have the number of individuals with a given encounter history,
we use these frequencies to estimate the probabilities which give rise to the observed frequency. For
example, suppose for the ‘additions’ group we had N111 = 7 (where N111 is the number of individuals
in our sample with an encounter history of ‘111’), N110 = 2, N101 = 5, and N100 = 36. So, of the 50
individuals marked at occasion 1, only (7 + 2 + 5) = 14 individuals were subsequently encountered alive
(at either sampling occasion 2, sampling occasion 3, or both), while 36 were never seen again. Suppose
for the ‘subtractions’ group we had N111 = 5, N110 = 7, N101 = 2, and N100 = 36. Again, 14 total
individuals encountered alive over the course of the study.
However, even though both treatment groups (additions and subtractions) have the same overall
3-year return rate (14/50 = 0.28), we see clearly that the frequencies of the various encounter histories
differ between the groups. This indicates that there are differences among encounter occasions in
survival probability, or encounter probability (or both) between the 2 groups, despite no difference


in overall return rate. The challenge, then, is how to estimate the various probabilities (parameters) in
the probability expressions, and how to determine if these parameter estimates are different between
the 2 treatment groups.
An ad hoc way of getting at this question involves comparing ratios of frequencies of different
encounter histories. For example,

    \frac{N_{111}}{N_{101}} = \frac{\phi_1 p_2 \phi_2 p_3}{\phi_1 (1-p_2) \phi_2 p_3} = \frac{p_2}{1-p_2},

since the ϕ1, ϕ2 and p3 terms cancel.
So, for the ‘additions’ group, (N111/N101) = (7/5) = 1.4. Thus, p̂(2,add) = 0.583. In contrast, for the
‘subtractions’ group, (N111/N101) = (5/2) = 2.5. Thus, p̂(2,sub) = 0.714. Once we have estimates of p2 , we
can see how we could substitute these values into the various probability expressions to solve for some
of the other parameter (probability) values. However, while this is reasonably straightforward (at least
for this very simple example), what about the question of ‘is this difference between the two different p̂2
values meaningful/significant?’. To get at this question, we clearly need something more – in particular
we need to be able to come up with estimates of the uncertainty (variance) in our parameter estimates.
To do this, we need a robust statistical tool.
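As a quick numerical sketch (in Python) of the ad hoc approach just described, using the frequencies from the example:

```python
# Ad hoc estimate of p2 from the ratio N111/N101 = p2/(1 - p2),
# which rearranges to p2 = ratio/(1 + ratio).
def p2_estimate(n111, n101):
    ratio = n111 / n101
    return ratio / (1 + ratio)

p2_add = p2_estimate(7, 5)   # 'additions' group
p2_sub = p2_estimate(5, 2)   # 'subtractions' group
```

This reproduces the two values in the text (0.583 and 0.714), but says nothing about the uncertainty in either estimate – which is precisely why a more formal tool is needed.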

1.3. Maximum likelihood theory – the basics

Fortunately, we have such a tool at our disposal. Analysis of data from marked individuals involves
making inference concerning the probability structure underlying the sequence of events that we
observe. Maximum likelihood (ML) estimation (courtesy of Sir Ronald Fisher) is the workhorse of
analysis of such data. While it is possible to become fairly proficient at analysis of data from marked
individuals without any real formal background in ML theory, in our experience at least a passing
familiarity with the concepts is helpful. The remainder of this (short) introductory chapter is intended
to provide a simple (very) overview of this topic. The standard ‘formal’ reference is the 1992 book by
AWF Edwards (‘Likelihood’, Johns Hopkins University Press). Readers with significant backgrounds in
the theory will want to skip this chapter, and are encouraged to refrain from comment as to the necessary
simplifications we make.
So here we go. . .the basics of maximum likelihood theory without (much) pain. . .

1.3.1. Why maximum likelihood?

The method of maximum likelihood provides estimators that are reasonably intuitive (in most
cases) and that have some ‘nice properties’ (at least statistically):

1. The method is very broadly applicable and is simple to apply.


2. Once a maximum-likelihood estimator is derived, the general theory of maximum-likelihood
estimation provides standard errors, statistical tests, and other results useful for statistical
inference. More technically:
(a) maximum-likelihood estimators (MLE) are consistent.
(b) they are asymptotically unbiased (although they may be biased in finite samples).
(c) they are asymptotically efficient – no asymptotically unbiased estimator has a smaller
asymptotic variance.
(d) they are asymptotically normally distributed – this is particularly useful since it


provides the basis for a number of statistical ‘tests’ based on the normal distribution
(discussed in more detail in Chapter 4).
(e) if there is a sufficient statistic for a parameter, then the MLE of the parameter is a
function of a sufficient statistic.∗
3. A disadvantage of the maximum likelihood method is that it frequently requires strong
assumptions about the structure of the data.

1.3.2. Simple estimation example – the binomial coefficient

We will introduce the basic idea behind maximum likelihood (ML) estimation using a simple, and
(hopefully) familiar example: a binomial model with data from a flip of a coin. Much of the analysis of
data from marked individuals involves ML estimation of the probabilities defining the occurrence of one
or more events. Probability events encountered in such analyses often involve binomial or multinomial
distributions.
There is a simple, logical connection between binomial probabilities, and analysis of data from marked
individuals, since many of the fundamental parameters we are interested in are ‘binary’ (having 2
possible states). For example, survival probability (live or die), detection probability (seen or not seen),
and so on. Like a coin toss (head or tail), the estimation methods used in the analysis of data from
marked individuals are deeply rooted in basic binomial theory. Thus, a brief review of this subject is in
order.
To understand binomial probabilities, you need to first understand binomial coefficients. Binomial
coefficients are commonly used to calculate the number of ways (combinations) a sample size of n
can be taken without replacement from a population of N individuals:
 
    \binom{N}{n} = \frac{N!}{n!(N-n)!}. \quad (1.1)

This is read as ‘the number of ways (or, ‘combinations’) a sample size of n can be taken (without
replacement) from a population of size N’. Think of N as the number of organisms in a defined
population, and let n be the sample size, for example. Recall that the ‘!’ symbol means factorial (e.g.,
5! = 5 × 4 × 3 × 2 × 1 = 120).
A quick example – how many ways can a sample of size 2 (i.e., n = 2) be taken from a population of
size 4 (i.e., N = 4)? Just to confirm we’re getting the right answer, let’s first derive the answer by ‘brute
force’. Let the individuals in the sample all have unique marks: call them individuals A, B, C and D,
respectively. So, given that we sample 2 at a time, without replacement, the possible combinations we
could draw from the ‘population’ are:

AB AC AD BC BD CD
BA CA DA CB DB DC

So, 6 different combinations are possible (6, not 12 – the pair in each column is equivalent;
e.g., ‘AB’ and ‘BA’ are treated as the same sample).
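If you would rather let the computer do the ‘brute force’ part, the enumeration above is easy to reproduce (a small Python sketch):

```python
# Brute-force enumeration of the 2-individual samples from {A, B, C, D},
# checked against the binomial coefficient of eqn. (1.1).
from math import comb
from itertools import combinations

population = ["A", "B", "C", "D"]
samples = list(combinations(population, 2))   # unordered, without replacement
n_by_counting = len(samples)                  # 6
n_by_formula = comb(4, 2)                     # 4!/(2! * 2!) = 6
```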


Sufficiency is the property possessed by a statistic, with respect to a parameter, when no other statistic which can be calculated
from the same sample provides any additional information as to the value of the parameter. For example, the arithmetic mean
is sufficient for the mean (µ) of a normal distribution with known variance. Once the sample mean is known, no further
information about µ can be obtained from the sample itself.


So, does this match with \binom{4}{2}?

    \binom{4}{2} = \frac{4!}{2!(4-2)!} = \frac{24}{2(2)} = \frac{24}{4} = 6.

Nice when things work out, eh? OK, to continue – we use the binomial coefficient to calculate the
binomial probability. For example, what is the probability of 5 heads in 20 tosses of a fair coin? Each
individual coin flip is called a Bernoulli trial, and if the coin is fair, then the probability of getting a head
is p = 0.5, while the probability of getting a tail is (1 − p) = 0.5 (commonly denoted as q). So, given a
fair coin, and p = q = 0.5, then the probability of y heads in N flips of the coin is:

    f(y \mid N, p) = \binom{N}{y} p^y (1-p)^{(N-y)}. \quad (1.2)

The left-hand side of the equation is read as ‘the probability of observing y events given that we do
the experiment – toss the coin N times, and given that the probability of a head in any given experiment
(i.e., toss of the coin) is p’. Given that N = 20, and p = 0.5, then the probability of getting exactly 5 heads
in 20 tosses of the coin is:

    f(5 \mid 20, p) = \binom{20}{5} p^5 (1-p)^{(20-5)}.

First, we calculate \binom{20}{5} = 15,504 (note: 20! is a huge number). If p = 0.5, then f(5 | 20, 0.5) = (15,504 ×
0.03125 × 0.000030517578125) = 0.0148. So, there is a 1.48% chance of having 5 heads out of 20 coin flips,
if p = 0.5.
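The same calculation, as a small Python sketch of eqn. (1.2):

```python
# Binomial probability of y = 5 heads in N = 20 tosses of a fair coin,
# computed directly from eqn. (1.2).
from math import comb

def binomial_prob(y, N, p):
    return comb(N, y) * p**y * (1 - p)**(N - y)

prob = binomial_prob(5, 20, 0.5)   # ~0.0148
```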
Now, in this example, we are assuming that we know both the number of times that we toss the coin,
and (critically) the probability of a head in a single toss of the coin. However, if we are studying the
survival of some organism, for example, what information on the left side of the probability equation
(above) would we know? Well, hopefully we know the number of individuals marked (N). Would we
know the survival probability (in the above, the survival probability would correspond to p – later, we’ll
call it S)? No! Clearly, this is what we’re trying to estimate.
So, given the number of marked individuals (N) at the start of the study and the number of individuals
that survive (y), how can we estimate the survival probability p? Easy enough, actually – we simply work
‘backwards’ (more or less). We find the value of p that maximizes the likelihood (L) that we would observe
the data we did.∗ So, for example, what would the value of p have to be to give us the observed data?
Formally, we write this as:
 


    L(p \mid N, y) = \binom{N}{y} p^y (1-p)^{(N-y)}. \quad (1.3)

We notice that the right-hand side of eqn. (1.3) is identical to what it was before in eqn. (1.2) – but the
left hand side is different in a subtle, but critical way. We read the left-hand side now as ‘the likelihood
L of survival probability p given that N individuals were released and that y survived’. Now, suppose
N = 20, and that we see 5 individuals survive (i.e., y = 5). What would p have to be to maximize the
chances of this occurring?

∗ The word ‘likelihood’ is often used synonymously for ‘probability’, but in statistical usage they are not equivalent. One may
ask ‘If I were to flip a fair coin 10 times, what is the probability of it landing heads-up every time?’ or ‘Given that I have flipped
a coin 10 times and it has landed heads-up 10 times, what is the likelihood that the coin is fair?’ but it would be improper to
switch ‘likelihood’ and ‘probability’ in the two sentences.


We’ll try a ‘brute force’ approach first, simply seeing what happens if we set p = 0, 0.1, 0.2, . . . , and
so on. Look at the following plot of the binomial probability calculated for different values of p:

As you see, the likelihood of ‘observing 5 survivals out of 20 individuals’ rises to a maximum when
p is 0.25. In other words, if p, which is unknown, were 0.25, then this would correspond to the maximal
probability of observing the data of 5 survivors out of 20 released individuals. This graph shows that
some values of the unknown parameter p are ‘relatively unlikely’ (i.e., those with low likelihoods), given
the data observed. The value of the parameter p at which this graph is at a maximum is the most likely
value of p (the probability of a head), given the data. In other words, the chances of actually observing
5 survivors out of 20 released individuals are maximal when p is at the maximum point of the curve, and the chances are less
when you move away from this point.
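The ‘brute force’ scan just described is easy to carry out numerically – a Python sketch (the grid of candidate p values is arbitrary):

```python
# Brute-force scan of the binomial likelihood L(p | N = 20, y = 5) over
# a grid of candidate values of p; the maximum falls at p = 5/20 = 0.25.
from math import comb

N, y = 20, 5
grid = [i / 100 for i in range(1, 100)]   # p = 0.01, 0.02, ..., 0.99
likelihood = {p: comb(N, y) * p**y * (1 - p)**(N - y) for p in grid}
p_best = max(likelihood, key=likelihood.get)   # 0.25
```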
While graphs are useful for getting a ‘look’ at the likelihood, we prefer a more elegant way to estimate
the parameter. If you remember any of your basic calculus at all, you might recall that what we want
to do is find the maximum point of the likelihood function. Recall that for any function y = f(x),
we can find a maximum over a given domain by setting the first derivative dy/dx
to zero and solving. This is exactly what we want to do here, except that we have one preliminary
step – we ‘could’ take the derivative of the likelihood function as written, but it is simpler to convert
everything to logarithms first. The main reason to do this is that it simplifies the analytical side of
things considerably. The log-transformed likelihood, now referred to as a ‘log-likelihood’, is denoted as
ln L(p | data).
Recall that our expression is

    L(p \mid N, y) = \binom{N}{y} p^y (1-p)^{(N-y)}.

The binomial coefficient in this equation is a constant (i.e., it does not depend on the unknown
parameter p), and so we can ignore it, and express this equation in log terms as:

    L(p \mid \text{data}) \propto p^y (1-p)^{(N-y)} \;\rightarrow\; \ln L(p \mid \text{data}) \propto y \ln(p) + (N-y) \ln(1-p).


Note that we’ve written the left-hand side in a sort of short-hand notation – ‘the likelihood L of the
parameter p, given the data’ (which in this case consist of 5 survivors out of 20 individuals). So, now
the equation we’re interested in is:

    \ln L(p \mid \text{data}) \propto y \ln(p) + (N-y) \ln(1-p).

So, all you need to do is differentiate this equation with respect to the unknown parameter p, set
equal to zero, and solve.
 
    \frac{\partial \ln L(p \mid \text{data})}{\partial p} = \frac{y}{p} - \frac{(N-y)}{(1-p)} = 0.

So, solving for p, we get:

    \hat{p} = \frac{y}{N}.

Thus, the value of parameter p which maximizes the likelihood of observing y = 5 given N = 20
(i.e., p̂, the maximum likelihood estimate for p) is the same as our intuitive estimate: simply, y/N. Now,
your intuition probably told you that the ‘only’ way you could estimate p from these data was to simply
divide the number of survivors by the total number of animals. But we’re sure you’re relieved to learn
that 5/20 = 0.25 is also the MLE for the parameter p.
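A quick numerical sanity check (in Python) that p̂ = y/N does sit at the maximum of the log-likelihood:

```python
# Numerical check that p_hat = y/N maximizes the binomial log-likelihood
# (the binomial coefficient is omitted, since it is constant in p).
from math import log

def log_lik(p, N, y):
    return y * log(p) + (N - y) * log(1 - p)

N, y = 20, 5
p_hat = y / N   # 0.25, the closed-form MLE
nearby_is_better = any(
    log_lik(p_hat + d, N, y) > log_lik(p_hat, N, y)
    for d in (-0.01, -0.001, 0.001, 0.01)
)   # False: no nearby value of p does better
```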

begin sidebar

closed and non-closed MLE

In the preceding example, we considered the MLE for the binomial likelihood. In that case, we
could ‘use algebra’ to ‘solve’ for the parameter of interest (p̂). When it is possible to derive an
‘analytical solution’ for a parameter (or set of parameters for likelihoods where there are more than
one parameter), then we refer to the solution as a solution in ‘closed form’. Put another way, there is a
closed form solution for the MLE for the binomial likelihood.
However, not all likelihoods have closed form solutions. Meaning, the MLE cannot be derived
‘analytically’ (generally, by taking the derivative of the likelihood and solving at the maximum, as
we did in the binomial example). MLE’s that cannot be expressed in closed form need to be solved
numerically. Here is a simple example of a likelihood that cannot be put in closed form. Suppose we
are interested in estimating the abundance of some population. We might intuitively understand that
unless we are sure that we are encountering the entire population in our sample, then the number we
encounter (the ‘count’ statistic; i.e., the number of individuals in our sample) is a fraction of the total
population. If p is the probability of encountering any one individual in a population, and if n is the
number we encounter (i.e., the number of individuals in our sample from the larger population), then
we might intuitively understand that our canonical estimator for the size of the larger population is
simply (n/p). For example, if there is a 50% chance of encountering an individual in a sample, and
we encounter 25 individuals, then our estimate of the population size is N̂ = (25/0.5) = 50. (Note: we
cover abundance estimation in detail in Chapter 14.)
Now, suppose you are faced with the following situation. You are sampling from a population for
which you’d like to derive an estimate of abundance. We assume the population is ‘closed’ (no entries
or exits while the population is being sampled). You go out on a number of sampling ‘occasions’, and
capture a sample of individuals in the population. You uniquely mark each individual, and release
it back into the population. At the end of the sampling, you record the total number of individuals
encountered at least once – call this M t+1 .
Now, if the canonical estimator for abundance is N̂ = (n/p), then p̂ = (n/N). In other words, if we
knew the size of the population N then we could derive a simple estimate of the encounter probability
p by dividing the number encountered in the sample n into the size of the population. Remember, p

Chapter 1. First steps. . .


1.3.2. Simple estimation example – the binomial coefficient 1 - 10

is the probability of encountering an individual. Thus, the probability of ‘missing’ an individual (i.e.,
not encountering it) is simple (1 − p)  1 − (n/N).
So, over t samples, we can write

      (1 − n1/N)(1 − n2/N) · · · (1 − nt/N)  =  (1 − p1)(1 − p2) · · · (1 − pt),

where p i is the encounter probability at time i, and n i is the number of individuals caught at time i.
If you think about it for a moment, you’ll see that the product on right-hand side is the overall
probability that an individual is not caught – not even once – over the course of the study (i.e., over t
total samples). Remember from above that we defined M_{t+1} as the number of individuals caught at
least once. So, we can write

      1 − M_{t+1}/N  =  (1 − n1/N)(1 − n2/N)(1 − n3/N) · · · (1 − nt/N).

In other words, the LHS and RHS both equal the probability of never being caught – not even once.
Now, if you had estimates of p_i for each sampling occasion i, then you could write

      1 − M_{t+1}/N  =  (1 − p1)(1 − p2) · · · (1 − pt)

          M_{t+1}/N  =  1 − (1 − p1)(1 − p2) · · · (1 − pt)

                N̂  =  M_{t+1} / [ 1 − (1 − p1)(1 − p2) · · · (1 − pt) ].

So, the expression is rewritten in terms of N – analytical solution – closed form, right? Not quite.
Note that we said if you had estimates of p i . In fact, you don’t. All you have is the count statistic (i.e.,
the number of individuals captured on each sampling occasion, n i ). So, in fact, ‘all we have’ are the
count data (i.e., M_{t+1}, n1, n2, . . . , nt), which (from above) we relate algebraically in the following:

      1 − M_{t+1}/N  =  (1 − n1/N)(1 − n2/N)(1 − n3/N) · · · (1 − nt/N).

It is not possible to 'solve' this equation so that only the parameter N appears on the LHS, while all
the other terms (representing data – i.e., M_{t+1}, n1, n2, . . . , nt) appear on the RHS. Thus, the estimator
for N cannot be expressed in closed form.
However, the expression does have a solution – but it is a solution we must derive numerically, rather
than analytically. In other words, we must use numerical, iterative methods to find the value of N that
‘solves’ this equation. That value of N is the MLE, and would be denoted as N̂.
Consider the following data:

      n1 = 30, n2 = 15, n3 = 22, n4 = nt = 45, and M_{t+1} = 79.

Thus, one wants the value of N that 'solves' the equation

      1 − 79/N  =  (1 − 30/N)(1 − 15/N)(1 − 22/N)(1 − 45/N).

One could try to solve this equation by ‘trial and error’. That is, one could plug in a guess for
population size and see if the LHS = RHS (not very likely unless you can guess very well). Thinking
about the problem a bit, one realizes that, logically, N ≥ M_{t+1} (i.e., the size of the population N must
be at least as large as the number of unique individuals caught at least once, M_{t+1}). So, at least, one
has a lower bound (in this case, 79 if we restrict the parameter space to integers). If the first guess for N
does not satisfy the equation, one could try another guess and see if that either (1) satisfies the equation
or (2) is closer than the first guess. The log-likelihood functions for many (but not all) problems are
unimodal (for the exponential family); thus, you can usually make a new guess in the right direction.
One could keep making guesses until a value of N (an integer) allows the LHS = RHS, and take
this value as the MLE, N̂. Clearly, the 'trial-and-error' method will unravel if there are more than 1 or 2
parameters. Likewise, plotting the log-likelihood function is useful only when 1 or 2 parameters are
involved. We will quickly be dealing with cases where there are 30-40 parameters, thus we must rely
on efficient computer routines for finding the maximum point in the multidimensional cases. Clever
search algorithms have been devised for the 1-dimensional case. Computers are great at such routine
computations and the MLE in this case can be found very quickly. Many (if not most) of the estimators
we will work with cannot be put in closed form, and we will rely on computer software – namely,
program MARK – to compute MLEs numerically.
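To make this concrete, here is a small sketch of a numerical solution (our own illustrative Python, not anything MARK does internally): solve the preceding equation for N by bisection on the difference between the two sides.

```python
# Numerically solve  1 - 79/N = (1 - 30/N)(1 - 15/N)(1 - 22/N)(1 - 45/N)
# for N, by bisection on the difference between the two sides.

counts = [30, 15, 22, 45]      # n_i: number encountered on each occasion
m_t1 = 79                      # M_{t+1}: number encountered at least once

def f(N):
    """LHS minus RHS; the root of f is the MLE, N-hat."""
    rhs = 1.0
    for n in counts:
        rhs *= 1 - n / N
    return (1 - m_t1 / N) - rhs

lo, hi = 80.0, 1000.0          # N must be at least M_{t+1} = 79
for _ in range(100):           # halve the bracket each iteration
    mid = (lo + hi) / 2
    if f(lo) * f(mid) <= 0:
        hi = mid
    else:
        lo = mid

print(round(lo, 1))            # N-hat: approximately 116.6
```

The bracket endpoints (80 and 1000) are arbitrary illustrative choices; any bracket known to contain the root works, and more sophisticated search algorithms simply find the root faster.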

end sidebar

Why go to all this trouble to derive an estimate for p? Well, the maximum likelihood approach also
has other uses – specifically, the ability to estimate the sampling variance. For example, suppose you
have some data from which you have estimated that p̂  0.6875. Is this ‘significantly different’ (by some
criterion) from, say, 0.5? Of course, to address this question, you need to consider the sampling variance
of the estimate, since this is a measure of the uncertainty we have about our estimate. How would you
do this? Of course, you might try the ‘brute force’ approach and simply repeat your ‘experiment’ a
large number of times. Each time, derive the estimate of p, and then calculate a mean and variance of
the parameter. While this works, there is a more elegant approach – again using ML theory and a bit
more calculus (fairly straightforward stuff).
Conceptually, the sampling variance is related to the curvature of the likelihood at its maximum.
Why? Consider the following: let’s say we release 16 animals, and observe 11 survivors. What would
the MLE estimate of p be? Well, we now know it is (y/N) = (11/16) = 0.6875. What if we had released
80 animals, instead of 16? Suppose we did this experiment, and observed 55 survivors (i.e., the expected
values assuming p  0.6875). What would the likelihood look like in this case? The maximum of the
likelihood in both ‘experiments’ should occur at precisely the same point: 0.6875. But what about the
‘shape’ of the curve?
In the following, we plot the likelihoods for both experiments (N = 16 and N = 80, respectively):

[Figure: binomial likelihood curves for (N = 16, y = 11) and (N = 80, y = 55), plotted against p. Both curves reach their maximum (the MLE) at p = 0.6875, but the N = 80 curve is much narrower around the maximum.]


Clearly, the larger sample size (N = 80) results in a 'narrower' function around the ML parameter
estimate, p̂ = 0.6875. If the sampling variance is related to the degree of curvature of the likelihood at
its maximum, then we would anticipate the sampling variance of the parameter in these 2 experiments
to be quite different, given the apparent differences in the likelihood functions.
What is the basis for stating that ‘variance is related to curvature’? Think of it this way – values of
the likelihood at increasing distances from the MLE are increasingly ‘unlikely’, relative to the MLE.
The degree to which they are less likely is a function of how rapidly the curve drops away from the
maximum as you move away from the MLE (i.e., the ‘steepness’ of the curve on either side of the MLE).
How do we address this question of ‘curvature’ analytically? Well, again we can use calculus. We use
the first derivative of the likelihood function to find the point on the curve where the rate of change was
0 (i.e., the maximum point on the function). This first derivative of the likelihood is known as Fisher’s
score function.
We can then use the derivative of the score function with respect to the parameter(s) (i.e., the second
derivative of the likelihood function, which is known as the Hessian), evaluated at the estimated value
of the parameter (p, in this case), to ‘tell us something about the curvature’ at this point. In fact, more
than just the curvature, Fisher showed that the negative inverse of the second partial derivative of the
log-likelihood function (i.e., the negative inverse of the Hessian), evaluated at the MLE, is the MLE of
the variance of the parameter. This negative inverse of the Hessian, evaluated at the MLE, is known as
the information function, or matrix.
For our example, our estimate of the variance of p is

      var̂(p̂)  =  − [ ∂² ln L(p | data) / ∂p² ]⁻¹ ,  evaluated at p = p̂.

So, we first find the second derivative of the log-likelihood (i.e., the Hessian):

      ∂²L/∂p²  =  − y/p² − (N − y)/(1 − p)².

We evaluate this second derivative at the MLE, by substituting y = pN (since p̂ = y/N). This gives

      ∂²L/∂p² |(y = pN)  =  − Np/p² − N(1 − p)/(1 − p)²

                         =  − N/p − N/(1 − p)

                         =  − N / [ p(1 − p) ].

The variance of p is then estimated as the negative inverse of this expression (i.e., the information
function, or matrix), such that:

      var̂(p̂)  =  p(1 − p) / N .

So, how do the sampling variances of our 2 experiments compare? Clearly, since p̂ and (1 − p̂) are the
same in both cases (i.e., the same ML estimate for p̂), the only difference is in the denominator, N. Since
N = 80 is obviously larger than N = 16, we know immediately that the sampling variance of the larger
sample will be smaller (0.0027) than the sampling variance of the smaller sample (0.0134).
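As a check on the algebra, the short sketch below (plain Python, illustrative only) computes var(p̂) = p̂(1 − p̂)/N for both experiments, and also recovers the same variance numerically from the curvature of the log likelihood at the MLE, using a central finite-difference second derivative.

```python
import math

def log_lik(p, N, y):
    """Binomial log likelihood (dropping the constant binomial coefficient)."""
    return y * math.log(p) + (N - y) * math.log(1 - p)

def var_mle(N, y, h=1e-5):
    """Negative inverse of a numerical 2nd derivative of ln L at the MLE."""
    p_hat = y / N
    d2 = (log_lik(p_hat + h, N, y) - 2 * log_lik(p_hat, N, y)
          + log_lik(p_hat - h, N, y)) / h**2
    return -1.0 / d2

for N, y in [(16, 11), (80, 55)]:
    p_hat = y / N
    analytic = p_hat * (1 - p_hat) / N
    print(N, round(analytic, 4), round(var_mle(N, y), 4))
# N = 16 gives a variance near 0.0134; N = 80 gives one near 0.0027
```

The two columns agree: the 'curvature' route and the closed-form p(1 − p)/N are the same quantity.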


1.3.3. Multinomials: a simple extension

A binomial probability involves 2 possible states (e.g., live or dead). What if there are more than 2
states? In this case, we use multinomial probabilities. As with our discussion of the binomial probability
(above), we start by looking at the multinomial coefficient – the multinomial equivalent of the binomial
coefficient. The multinomial is extremely useful in understanding the models we’ll discuss in this book.
The multinomial coefficient is nearly always introduced by way of a die tossing example. So, we’ll stick
with tradition and discuss this classic example here. Recall that a die has 6 sides – therefore there are 6
possible outcomes if you roll a die once.
The multinomial coefficient corresponding to the 'die' example is

      ( N; n1, n2, n3, n4, n5, n6 )  =  N! / (n1! n2! n3! n4! n5! n6!)  =  N! / ∏_{i=1}^{k} n_i! .

Note the use of the product operator '∏' in the denominator. In a multinomial context, we assume that
individual trials are independent, and that outcomes are mutually exclusive and all inclusive. Consider
the 'classic' die example. Assume we throw the die 60 times (N = 60), and a record is kept of the number
of times a 1, 2, 3, 4, 5 or 6 is observed. The outcomes of these 60 independent trials are shown below.

face frequency notation


1 13 y1
2 10 y2
3 8 y3
4 10 y4
5 12 y5
6 7 y6

Each trial has a mutually exclusive outcome (1 or 2 or 3 or 4 or 5 or 6). Note that there is a type of
dependency in the cell counts in that once n and y1 , y2 , y3 , y4 and y5 are known, then y6 can be obtained
by subtraction, because the total (N) is known. Of course, the dependency applies to any count, not just
y6 . This same dependency is also seen in the binomial case – if you know the total number of coin tosses,
and the total number of heads observed, then you know the number of tails, by subtraction.
The multinomial distribution is useful in a large number of applications in ecology. The probability
function for k = 6 is

      P(y_i | n, p_i)  =  ( n; y_i ) p1^{y1} p2^{y2} p3^{y3} p4^{y4} p5^{y5} p6^{y6} .

Again, as was the case with the binomial probability, the multinomial coefficient does not involve
any of the unknown parameters, and is conveniently ignored for many estimation issues.
This is a good thing, since in the simple die tossing example the multinomial coefficient is

      ( n; y_i )  =  60! / (13! 10! 8! 10! 12! 7!) ,

which is an absurdly big number – likely beyond the capacity of your simple hand calculator to calculate.
So, it is helpful that we can ignore it for all intents and purposes.
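As an aside, while the coefficient itself is far too large to compute directly, its logarithm is perfectly manageable – one practical reason estimation is done on the log-likelihood scale. A quick illustrative sketch, using the fact that lgamma(n + 1) = ln n!:

```python
import math

counts = [13, 10, 8, 10, 12, 7]          # die data: y_1 .. y_6, N = 60
# log of the multinomial coefficient, via log-factorials
log_coef = math.lgamma(61) - sum(math.lgamma(y + 1) for y in counts)
print(round(log_coef, 2))                # a manageable number (~96.75) on the log scale
```

The coefficient itself is e raised to this power – an astronomically large number, as the text says – but on the log scale it is just an additive constant, which is exactly why it can be ignored when maximizing.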


Some simple examples: suppose you roll a 'fair' die 6 times (i.e., 6 trials). First, assume (y1, y2, y3,
y4, y5, y6) is a multinomial random variable with parameters p1 = p2 = . . . = p6 = 1/6 ≈ 0.1667 and N = 6.
What is the probability that each face is seen exactly once? This is written simply as:

      P(1, 1, 1, 1, 1, 1 | 6, 1/6, 1/6, 1/6, 1/6, 1/6, 1/6)  =  [ 6! / (1! 1! 1! 1! 1! 1!) ] (1/6)⁶

                                                            =  5/324  ≈  0.0154.

What is the probability that exactly four 1's occur, and two 2's occur in 6 tosses? In this case,

      L(4, 2, 0, 0, 0, 0 | 6, 1/6, 1/6, 1/6, 1/6, 1/6, 1/6)  =  [ 6! / (4! 2! 0! 0! 0! 0!) ] (1/6)⁴ (1/6)²

                                                            =  5/15,552  ≈  0.0003  ≪  0.0154.
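Both of these die-rolling probabilities are easy to verify with a few lines of code (an illustrative sketch, not part of MARK):

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """P(counts | n, probs): multinomial coefficient times product of p_i^y_i."""
    n = sum(counts)
    coef = factorial(n)
    for y in counts:
        coef //= factorial(y)
    return coef * prod(p ** y for p, y in zip(probs, counts))

fair = [1 / 6] * 6                                   # a 'fair' die

print(round(multinomial_pmf([1, 1, 1, 1, 1, 1], fair), 4))   # each face once: ~0.0154
print(multinomial_pmf([4, 2, 0, 0, 0, 0], fair))             # four 1's, two 2's: much smaller
```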

As noted in our discussion of the binomial probability theorem, we are generally faced with the
reverse problem – we do not know the parameters, but rather we want to estimate the parameters
from the data. As we saw, these issues are the domain of the likelihood and log-likelihood functions.
The key to this estimation issue is the multinomial distribution, and, particularly, the likelihood and
log-likelihood functions

      L(q | data)    or    L(p_i | n_i, y_i),
which we read as ‘the likelihood of the parameters, given the data’ – the left-hand expression is the
more general one, where the symbol q indicates one or more parameters. The right-hand expression
specifies the parameters of interest.
The likelihood function looks somewhat messy, but it is only a slightly different view of the probability
function. Just as we saw from the binomial probability function, the multinomial function assumes N
is given. The probability function further assumes that the parameters are given, while the likelihood
function assumes the data are given. The likelihood function for the multinomial distribution is

      L(p_i | n_i, y_i)  =  ( N; y_i ) p1^{y1} p2^{y2} p3^{y3} p4^{y4} p5^{y5} p6^{y6} .

Since the first term – the multinomial coefficient – is a constant, and since it doesn't involve any
parameters, we ignore it. Next, because probabilities must sum to 1 (i.e., Σ_i p_i = 1), there
are only 5 'free' parameters, since the 6th one is defined by the other 5 (the 'dependency' issue we
mentioned earlier) and the total, N. We will use the symbol K to denote the total number of estimable
parameters in a model. Here, K = 5.
The likelihood function for K = 5, for example, is

      L(p_i | N, y_i)  =  ( N; y_i ) p1^{y1} p2^{y2} p3^{y3} p4^{y4} p5^{y5} ( 1 − Σ_{i=1}^{5} p_i )^{N − Σ_{i=1}^{5} y_i} .

As for the binomial example, we use a maximization routine to find the values of p1 , p2 , p3 , p4 and
p5 that maximize the likelihood of the data that we observe. Remember – all we are doing is finding the
values of the parameters which maximize the likelihood of observing the data that we see.
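Although the text speaks of a maximization routine, for the plain multinomial the maximum can be checked directly: the intuitive estimator p̂_i = y_i/N is the MLE. A small sketch (illustrative only) confirms numerically that no nearby set of probabilities does better for the die data:

```python
import math

counts = [13, 10, 8, 10, 12, 7]      # die data from above; N = 60
N = sum(counts)

def log_lik(probs):
    """Multinomial log likelihood (constant coefficient dropped)."""
    return sum(y * math.log(p) for y, p in zip(counts, probs))

mle = [y / N for y in counts]        # intuitive (and true) MLE: p_i = y_i / N

# Perturb the MLE slightly in several directions (renormalized so the
# probabilities still sum to 1): the log likelihood should never increase.
best = log_lik(mle)
for i in range(6):
    for eps in (-0.02, 0.02):
        trial = mle[:]
        trial[i] += eps
        total = sum(trial)
        trial = [p / total for p in trial]
        assert log_lik(trial) <= best

print([round(p, 3) for p in mle])
```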


1.4. Application to mark-recapture

Let's look at an example relevant to the task at hand (no more dice or coin flipping). Let's pretend
we do a three year mark-recapture study, with 55 total marked individuals from a single cohort.∗ Once
each year, we go out and look to see if we can ‘see’ (encounter) any of the 55 individuals we marked
alive and in our sample. For now, we’ll assume that we only encounter ‘live’ individuals.
The following represents the basic ‘structure’ of our sampling protocol:

ϕ1 ϕ2
1 2 3

p2 p3

In this diagram, each of the sampling events (referred to as ‘sampling occasions’) is indicated by a
shaded grey circle. Our ‘experiment’ has three sampling occasions, numbered 1 → 3, respectively. In
this diagram, time is moving forward going from left to right (i.e., sampling occasion 2 occurs one time
step after sampling occasion 1, and so forth). Connecting the sampling occasions we have an arrow –
the direction of the arrow indicates the direction of time – again, moving left to right, forward in time.
We’ve also added two variables (symbols) to the diagram: ϕ and p. What do these represent?
For this example, these represent the two primary parameters which we believe (assume) govern
the encounter process: ϕ i (the probability of surviving from occasion i to i + 1), and p i (the probability
that if alive and in the sample at time i, that the individual will be encountered). So, as shown on the
diagram, ϕ1 is the probability that an animal encountered and released alive at sampling occasion 1
will survive the interval from occasion 1 → occasion 2, and so on. Similarly, p2 is the probability that
conditional on the individual being alive and in the sample, that it will be encountered at occasion 2,
and so on.
Why no p1 ? Simple – p1 is the probability of encountering a marked individual in the population, and
none are marked prior to occasion 1 (which is when we start our study). In addition, the probability of
encountering any individual (marked or otherwise) could only be calculated if we knew the size of the
population, which we don’t (this becomes an important consideration we will address in later chapters
where we make use of estimated abundance). The important thing to remember here is the probability
of being encountered at a particular sampling occasion is governed by two parameters: ϕ and p.
Now, as discussed earlier, if we encounter the animal, we record it in our data as ‘1’. If we don’t
encounter the animal, it’s a ‘0’. So, based on a 3 year study, an animal with an encounter history of ‘111’
was ‘seen in the first year (the marking year), seen again in the second year, and also seen in the third
year’. Compare this with an animal with an encounter history of ‘101’. This animal was ‘seen in the first
year, when it was marked, not seen in the second year, but seen again in the third year’.
For a 3 occasion study, where the occasion refers to the sampling occasion, with a single release
cohort, there are 4 possible encounter histories: {111, 110, 101, 100}. The key question we have to
address, and (in simplest terms) the basis for analysis of data from marked individuals, is ‘what is
the probability of observing a particular encounter history?’. The probability of a particular encounter
history is determined by a set of parameters – for this study, we know (or assume) that the parameters
governing the probability of a given encounter history are ϕ and p.
Based on the diagram at the top of this page, we can write a probability expression corresponding


∗ In statistics and demography, a cohort is a group of ‘subjects’ defined by experiencing a common event (typically birth) over a
particular time span. In the present context, a cohort represents a group of individuals captured, marked, and released alive at
the same point in time. These individuals would be part of the same release cohort.


to each of these possible encounter histories:

encounter history      probability

      111              ϕ1 p2 ϕ2 p3
      110              ϕ1 p2 (1 − ϕ2 p3)
      101              ϕ1 (1 − p2) ϕ2 p3
      100              1 − ϕ1 p2 − ϕ1 (1 − p2) ϕ2 p3

For example, take encounter history ‘101’. The individual is marked and released on occasion 1 (the
first 1 in the history), is not encountered on the second occasion, but is encountered on the third occasion.
Now, because of this encounter on the third occasion, we know that the individual was in fact alive on
the second occasion, but simply not encountered. So, we know the individual survived from occasion
1 → 2 (with probability ϕ1 ), was not encountered at occasion 2 (with probability 1 − p2 ), and survived
to occasion 3 (with probability ϕ2 ) where it was encountered (with probability p3 ). So, the probability
of observing encounter history ‘101’ would be ϕ1 (1 − p2 )ϕ2 p3 .
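These expressions can also be generated mechanically, by summing over the unobserved fates of the animal (alive but missed, versus dead). The sketch below (our own illustrative Python, not MARK code) does this for a single release cohort, and confirms both the tabled expression for '101' and that the four mutually exclusive histories sum to 1:

```python
def ch_prob(h, phi, p):
    """Probability of encounter history h (a string such as '101') for an
    animal released alive on occasion 1. phi[i] is survival over the
    interval from occasion i+1 to i+2; p[i] is detection on occasion i+2
    (there is no p1, as discussed in the text)."""
    T = len(h)
    total = 0.0
    # Sum over the last occasion (0-indexed) at which the animal was alive.
    for last in range(T):
        if '1' in h[last + 1:]:
            continue                      # impossible: seen after death
        pr = 1.0
        for i in range(1, last + 1):      # alive at occasions 2 .. last+1
            pr *= phi[i - 1]              # survived the preceding interval
            pr *= p[i - 1] if h[i] == '1' else (1 - p[i - 1])
        if last < T - 1:
            pr *= (1 - phi[last])         # died in the following interval
        total += pr
    return total

phi, p = [0.8, 0.7], [0.4, 0.6]           # arbitrary illustrative values
probs = {h: ch_prob(h, phi, p) for h in ['111', '110', '101', '100']}

# '101' matches the tabled expression phi1 (1 - p2) phi2 p3 ...
assert abs(probs['101'] - 0.8 * (1 - 0.4) * 0.7 * 0.6) < 1e-12
# ... and the 4 mutually exclusive histories exhaust all possibilities.
assert abs(sum(probs.values()) - 1.0) < 1e-12
```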
Here are our ‘data’ – which consist of the observed frequencies of the 55 marked individuals with
each of the 4 possible encounter histories:

encounter history      frequency

      111              7
      110              13
      101              6
      100              29

So, of the 55 individually marked and released alive in the release cohort, 7 were encountered on
both sampling occasion 2 and sampling occasion 3, 13 were encountered on sampling occasion 2, but
were not seen on sampling occasion 3, and so on.
The estimation problem, then, is to derive estimates of the parameters p_i and ϕ_i which maximize the
likelihood of observing the frequency of individuals with each of these 4 different encounter histories.
Remember, the encounter histories are the data – we want to use the data to estimate the parameter values.
What parameters? Again, recall also that the probability of a given encounter history is governed (in
this case) by two parameters: ϕ, and p.
OK, so we’ve been playing with multinomials (above), and you might have suspected that these
encounter data must be related to multinomial probabilities, and likelihoods. Good guess! The basic
idea is to realize that the statistical likelihood of an actual encounter data set (as tabulated above) is
merely the product, over the individuals actually observed, of the probabilities of their capture histories.
As noted by Lebreton et al. (1992), because animals with the same encounter history have the same
probability expression, then the number of individuals observed with each encounter history appears
as an exponent of the corresponding probability in the likelihood.
Thus, we write
      L  =  [ϕ1 p2 ϕ2 p3]^{N(111)} [ϕ1 p2 (1 − ϕ2 p3)]^{N(110)} [ϕ1 (1 − p2) ϕ2 p3]^{N(101)} [1 − ϕ1 p2 − ϕ1 (1 − p2) ϕ2 p3]^{N(100)} ,

where N(ijk) is the observed frequency of individuals with encounter history ijk.


As with the binomial, we take the log transform of the likelihood expression, and after substituting
the frequencies of each history, we get:
      ln L(ϕ1, p2, ϕ2, p3)  =  7 ln[ϕ1 p2 ϕ2 p3] + 13 ln[ϕ1 p2 (1 − ϕ2 p3)] + 6 ln[ϕ1 (1 − p2) ϕ2 p3]
                               + 29 ln[1 − ϕ1 p2 − ϕ1 (1 − p2) ϕ2 p3].

All that remains is to derive the estimates of the parameters ϕ i and p i that maximize this likelihood.
Let’s go through a worked example, using the encounter history data tabulated on the preceding page.
To this point, we have assumed that these encounter histories are governed by ‘time-specific’ variation
in ϕ and p. In other words, we would write the probability statement for encounter history ‘111’ as
ϕ1 p 2 ϕ2 p 3 .
These time-specific parameters are indicated in the following diagram:

ϕ1 ϕ2
1 2 3

p2 p3

Again, the subscripting indicates a different survival and recapture probability for each interval or
sampling occasion.
However, what if instead we assume that the survival and recapture probabilities do not vary over
time? In other words, ϕ1  ϕ2  ϕ, and p2  p3  p. In this case, our diagram would now look like:

ϕ ϕ
1 2 3

p p

What would the probability statements be for the respective encounter histories? In fact, in this case
deriving them is very straightforward – we simply drop the subscripts from the parameters in the
probability expressions:

encounter history      probability

      111              ϕp ϕp
      110              ϕp (1 − ϕp)
      101              ϕ (1 − p) ϕp
      100              1 − ϕp − ϕ (1 − p) ϕp

So, what would the likelihood look like? Well, given the frequencies, the likelihood would be:
      L  =  [ϕpϕp]^{N(111)} [ϕp(1 − ϕp)]^{N(110)} [ϕ(1 − p)ϕp]^{N(101)} [1 − ϕp − ϕ(1 − p)ϕp]^{N(100)} .

Thus,

      ln L(ϕ, p)  =  7 ln[ϕpϕp] + 13 ln[ϕp(1 − ϕp)] + 6 ln[ϕ(1 − p)ϕp] + 29 ln[1 − ϕp − ϕ(1 − p)ϕp].


Again, we can use numerical methods to solve for the values of ϕ and p which maximize the likelihood
of the observed frequencies of each encounter history. The likelihood profile for these data is plotted as
a 2-dimensional contour plot, shown below:

[Figure: contour plot of the log likelihood surface as a function of ϕ and p.]

We see that the maximum of the likelihood occurs at p = 0.542 and ϕ = 0.665 (where the 2 dark black
lines cross in the figure).
For this example, we used a numerical approach to find the MLE. In fact, for this example where ϕ
and p are constant over time, the probability expressions are defined entirely by these two parameters,
and we could (if we really had to) write the likelihood as two closed-form equations in ϕ and p, and
derive estimates for ϕ and p analytically. All we need to do is (1) take the partial derivatives of the
likelihood with respect to each of the parameters (ϕ, p) in turn (∂L/∂ϕ, ∂L/∂p), (2) set each partial
derivative to 0, and (3) solve the resulting set of simultaneous equations.
Solving simultaneous equations is something that most symbolic math software programs (e.g.,
MAPLE, Mathematica, GAUSS, Maxima) do extremely well. For this problem, the ML estimates are
derived analytically as ϕ̂ = 0.665 and p̂ = 0.542 (just as we saw earlier using the numerical approach).
However, recall that many of the likelihoods we'll be working with cannot be evaluated analytically
in closed form, so we will rely on numerical methods. Program MARK evaluates all likelihoods (and
functions of likelihoods) numerically.
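To make this concrete, the following sketch (plain Python; MARK's optimizer is, of course, far more efficient) maximizes the preceding log likelihood with a simple coarse-to-fine grid search, recovering the estimates quoted above:

```python
import math

def log_lik(phi, p):
    """ln L(phi, p) for the constant-parameter model and observed frequencies."""
    h111 = phi * p * phi * p
    h110 = phi * p * (1 - phi * p)
    h101 = phi * (1 - p) * phi * p
    h100 = 1 - phi * p - phi * (1 - p) * phi * p
    return (7 * math.log(h111) + 13 * math.log(h110)
            + 6 * math.log(h101) + 29 * math.log(h100))

# Coarse-to-fine grid search over (0, 1) x (0, 1).
best, lo_phi, hi_phi, lo_p, hi_p = None, 0.01, 0.99, 0.01, 0.99
for _ in range(4):                       # refine the grid 4 times
    steps = 50
    pts = [(lo_phi + (hi_phi - lo_phi) * i / steps,
            lo_p + (hi_p - lo_p) * j / steps)
           for i in range(steps + 1) for j in range(steps + 1)]
    best = max(pts, key=lambda t: log_lik(*t))
    d_phi, d_p = (hi_phi - lo_phi) / steps, (hi_p - lo_p) / steps
    lo_phi, hi_phi = best[0] - d_phi, best[0] + d_phi
    lo_p, hi_p = best[1] - d_p, best[1] + d_p

phi_hat, p_hat = best
print(round(phi_hat, 3), round(p_hat, 3))   # close to 0.665 and 0.542
```

A grid search is the crudest possible maximizer – it is shown only because it is transparent; real software uses much cleverer search algorithms.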
What is the actual value of the likelihood at this point? On the log scale, ln(L) is maximized at
−65.041. For comparison, the maximized ln(L) for the model where both ϕ and p were allowed to vary
with time is -65.035. Now, these two likelihoods aren’t very far apart – only in the second and third
decimal places. Further, the two models (with constant ϕ and p, and with time varying ϕ and p) differ
by only 1 estimable parameter (we’ll talk a lot more about estimable parameters in coming lectures).
So, a χ2 test would have only 1 df. The difference in the ln(L) is 0.006 (actually, the test is based on

Chapter 1. First steps. . .


1.5. Variance estimation for > 1 parameter 1 - 19

2 ln(L), so the difference is actually 0.012). This difference is not significant (in the familiar sense of
‘statistical significance’) at P ≫ 0.5. So, the question we now face is, which of the two models do we use
for inference? This takes us to one of the main themes of this book – model selection – which we’ll cover
in some detail in Chapter 4.

1.5. Variance estimation for > 1 parameter

Earlier, we considered the derivation of the MLE, and the variance, for a simple situation involving only
a single parameter. If in fact we have more than one parameter, the same idea we’ve just described for
one parameter still works, but there is one important difference: a multi-parameter likelihood surface
will have more than one second partial derivative. In fact, what we end up with a matrix of second
partial derivatives, called the Hessian.
Consider for example, the log-likelihood of the simple mark-recapture data set we just analyzed in
the preceding section:

      ln L(ϕ, p)  =  7 ln[ϕpϕp] + 13 ln[ϕp(1 − ϕp)] + 6 ln[ϕ(1 − p)ϕp] + 29 ln[1 − ϕp − ϕ(1 − p)ϕp].

Thus, the Hessian H (i.e., the matrix of second partial derivatives of the likelihood L with respect to
ϕ and p) would be:

 2 
 ∂L ∂2 L 
 
 ∂ϕ2 ∂ϕ∂p 
H   .

 ∂2 L ∂ L 
2
 
 ∂p∂ϕ ∂p 2 

We’ll leave it as an exercise for you to derive the second partial derivatives corresponding to each of
the elements of the Hessian. It isn’t difficult, just somewhat cumbersome.
For our present example,

      ∂²L/∂ϕ²  =  − 26/ϕ² − 26p/[ϕ(1 − ϕp)] − 13[p(1 − ϕp) − ϕp²]/[ϕ²p(1 − ϕp)] + 13[p(1 − ϕp) − ϕp²]/[ϕ(1 − ϕp)²]

                  − 58(1 − p)p/[1 − ϕp − ϕ²(1 − p)p] − 29[−p − 2ϕ(1 − p)p]²/[1 − ϕp − ϕ²(1 − p)p]².

Next, we evaluate the Hessian at the MLE for ϕ and p (i.e., we substitute the MLE values for our
parameters – ϕ̂ = 0.6648 and p̂ = 0.5415 – into the Hessian), which yields the information matrix, I:

      I  =  | −203.06775    −136.83886 |
            | −136.83886    −147.43934 | .

The negative inverse of the information matrix (−I⁻¹) is the variance-covariance matrix for parameters
ϕ and p:

      −I⁻¹  =  − | −203.06775    −136.83886 | ⁻¹   =  |  0.0131    −0.0122 |
                 | −136.83886    −147.43934 |          | −0.0122     0.0181 | .


Note that the variances are found along the diagonal of the matrix, while the off-diagonal elements
are the covariances.
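We can mimic what MARK does 'behind the scenes' with a small sketch (illustrative Python, not MARK's actual routines): approximate the Hessian at the MLE by central finite differences, then take the negative inverse of the resulting 2 × 2 information matrix.

```python
import math

def log_lik(phi, p):
    """ln L(phi, p) for the constant-parameter example above."""
    return (7 * math.log(phi * p * phi * p)
            + 13 * math.log(phi * p * (1 - phi * p))
            + 6 * math.log(phi * (1 - p) * phi * p)
            + 29 * math.log(1 - phi * p - phi * (1 - p) * phi * p))

phi_hat, p_hat, h = 0.6648, 0.5415, 1e-4     # MLEs quoted in the text

def d2(dphi, dp):
    """Central-difference 2nd partial derivative of ln L at the MLE."""
    if dphi and dp:                           # mixed partial
        return (log_lik(phi_hat + h, p_hat + h) - log_lik(phi_hat + h, p_hat - h)
                - log_lik(phi_hat - h, p_hat + h)
                + log_lik(phi_hat - h, p_hat - h)) / (4 * h * h)
    if dphi:
        return (log_lik(phi_hat + h, p_hat) - 2 * log_lik(phi_hat, p_hat)
                + log_lik(phi_hat - h, p_hat)) / (h * h)
    return (log_lik(phi_hat, p_hat + h) - 2 * log_lik(phi_hat, p_hat)
            + log_lik(phi_hat, p_hat - h)) / (h * h)

a, b, c = d2(1, 0), d2(1, 1), d2(0, 1)        # Hessian elements

# Negative inverse of the 2 x 2 symmetric matrix [[a, b], [b, c]], by hand:
det = a * c - b * b
var_phi, var_p, cov = -c / det, -a / det, b / det
print(round(var_phi, 4), round(var_p, 4), round(cov, 4))
# recovers roughly 0.0131, 0.0181, and -0.0122, as in the matrix above
```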
In general, for an arbitrary parameter θ, the variance of θi is given as the elements of the negative
inverse of the information matrix corresponding to

      ∂² ln L / (∂θi ∂θi) ,

while the covariance of θi with θ j is given as the elements of the negative inverse of the information
matrix corresponding to

      ∂² ln L / (∂θi ∂θj) .

Obviously, the variance-covariance matrix is the basis for deriving measures of the precision of our
estimates. But, as we’ll see in later chapters, the variance-covariance matrix is used for much more –
including estimating the number of estimable parameters in the model. While MARK handles all this
for you, it’s important to have at least a feel for what MARK is doing ‘behind the scenes’, and why.

1.6. More than ‘estimation’ – ML and statistical testing

In the preceding, we focussed on the maximization of the likelihood as a means of deriving estimates of
parameters and the sampling variance of those parameters. However, the other primary use of likelihood
methods is for comparing the fits of different models.
We know that L(θ̂) is the value of the likelihood function evaluated at the MLE θ̂, whereas L(θ) is
the likelihood for the true (but unknown) parameter θ. Since the MLE maximizes the likelihood for a
given sample, the value of the likelihood at the true parameter value θ is generally smaller than its
value at the MLE, L(θ̂) (unless by chance θ̂ and θ happen to coincide).
This, combined with other properties of ML estimators noted earlier lead directly to several classic
and general procedures for testing the statistical hypothesis that H0: θ = θ0. Here we briefly describe
three of the more commonly used tests.

Fisher’s Score Test

The ‘score’ is the slope of the log-likelihood at a particular value of θ. In other words, S(θ) =
∂ ln L(θ)/∂θ. At the MLE, the score (slope) is 0 (by definition of a maximum).
Recall from earlier in this chapter that

      var̂(θ̂)  =  − [ ∂² ln L(θ | data) / ∂θ² ]⁻¹ ,  evaluated at θ = θ̂.

The term inside the inner parentheses

      I(θ)  =  − ∂² ln L(θ) / ∂θ² ,
is known as Fisher information.


It can be shown that the score statistic

      S0  =  S(θ0) / √I(θ0) ,

is asymptotically distributed as N(0, 1) under Ho .

Wald test

The Wald test relies on the asymptotic normality of the MLE θ̂. Given the normality of the MLE, we
can calculate the test statistic

      Z0  =  (θ̂ − θ0) / √var̂(θ̂) ,

which is asymptotically distributed as N(0, 1) under the null Ho .
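As a concrete (illustrative) example, we can apply the Wald test to the question posed earlier in this chapter: is p̂ = 0.6875 (11 survivors of 16 released) different from a null value of 0.5?

```python
import math

N, y, p0 = 16, 11, 0.5
p_hat = y / N                             # MLE: 0.6875
var_hat = p_hat * (1 - p_hat) / N         # variance from earlier (~0.0134)

z = (p_hat - p0) / math.sqrt(var_hat)     # Wald statistic (~1.62)
print(round(z, 2))
```

Since |Z0| ≈ 1.62 < 1.96, we would not reject H0 at the usual α = 0.05 level – consistent with the large sampling variance for such a small sample.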

Likelihood ratio test


  
It is known that 2[ ln L(θ̂) − ln L(θ0) ] follows an asymptotic χ² distribution with one degree of
freedom.
The relationship among these tests is shown in the following diagram:

[Figure: the log likelihood curve, annotated to show the quantities used by the score, Wald, and likelihood ratio tests.]

In general, these three tests are asymptotically equivalent, although in some applications, the score
test has the practical advantage of not requiring the computation of the MLE at θ̂ (since S0 depends
only on the null value θ0 , which is specified in H0 ). We consider one of these tests (the likelihood ratio
test) in more detail in Chapter 4.


1.7. Technical aside: a bit more on variances

As we discussed earlier, the classic MLE approach to variance calculation (for purposes of creating a
SE and so forth) is to use the negative inverse of the 2nd derivative of the log likelihood, evaluated at the MLE.
However, the problem with this approach is that, in general, it leads to derivation of symmetrical 95%
CI, and in many cases – especially for parameters that are bounded on the interval [0, 1] – this makes no
sense. A simple example will show what we mean. Suppose we release 30 animals, and find 1 survivor.
We know from earlier that the MLE for the survival probability is (1/30) = 0.0333. We also know
from earlier in this chapter that the classical estimator for the variance, based on the 2nd derivative, is

      var̂(p̂)  =  p̂(1 − p̂)/N  =  0.0333(1 − 0.0333)/30  =  0.0010741.

So, based on this, the 95% CI using classical approaches would be p̂ ± 1.96(SE), where the SE (standard
error) is estimated as the square-root of the variance. Thus, given var̂ = 0.0010741, the 95% CI would be
0.0333 ± 1.96(0.0328), or [−0.031, 0.098].
OK, so what’s wrong with this? Well, clearly, we don’t expect a 95% CI to ever allow values < 0 (or
> 1) for a parameter that is logically bounded to fall between 0 and 1 (like ϕ or p). So, there must be a
problem, right?
Well, somewhat. Fortunately, there is a better way, using something called the profile likelihood ap-
proach, which makes more explicit use of the shape of the likelihood. We’ll go into the profile likelihood
in further detail in later chapters, but to briefly introduce the concepts – consider the following diagram,
which shows the region around the maximum of the log likelihood for ϕ, given N = 30, y = 23 (i.e., 23/30 survive).

Profile likelihood confidence intervals are based on the log-likelihood function. For a single parameter, likelihood theory shows that the two points 1.92 units down from the maximum of the log likelihood
function provide a 95% confidence interval when there is no extra-binomial variation (i.e., c = 1; see
Chapter 5). The value 1.92 is half of the χ²₁ critical value, 3.84. Thus, the same confidence interval can be computed
with the deviance by adding 3.84 to the minimum of the deviance function, where the deviance is the
−2 log likelihood of the model minus the −2 log likelihood of the saturated model (more on these
concepts in later chapters).


Put another way, we use the critical value of 1.92 to derive the profile – you take the value of the log
likelihood at the maximum (for this example, the maximum occurs at −16.30), drop 1.92 units down from
it (yielding −18.22), and look to see where the −18.22 line intersects with the
profile of the log likelihood function. In this case, we see that the intersection occurs at approximately
0.6 and 0.9. The MLE is (23/30) = 0.767, so clearly, the profile 95% CI is not symmetrical around this
MLE value. But, it is bounded on the interval [0, 1]. The profile likelihood is the preferred approach to
deriving 95% CI. The biggest limit to using it is computational – it simply takes more work to derive a
profile likelihood (and corresponding CI). Fortunately, MARK does all the work for us.
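To make the contrast concrete, here is a short Python sketch (not part of MARK; the function names are our own) that computes the symmetric Wald interval for the 1-of-30 example, and then finds the profile likelihood interval for the 23-of-30 example by locating the two points 1.92 units below the maximum of the log likelihood:

```python
from math import log, sqrt

def loglik(p, y=23, n=30):
    """Binomial log-likelihood kernel (binomial coefficient omitted,
    matching the -16.30 maximum quoted in the text)."""
    return y * log(p) + (n - y) * log(1 - p)

# --- Wald (classical) interval for the 1-survivor-in-30 example ---
p_hat, n = 1 / 30, 30
se = sqrt(p_hat * (1 - p_hat) / n)
wald = (p_hat - 1.96 * se, p_hat + 1.96 * se)  # lower bound falls below 0

# --- Profile likelihood interval for the 23-survivors-in-30 example ---
# find the two points where the log likelihood is 1.92 units below its max
mle = 23 / 30
target = loglik(mle) - 1.92

def find_root(f, lo, hi, tol=1e-9):
    """Simple bisection; f(lo) and f(hi) must straddle zero."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(lo) > 0) == (f(mid) > 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

f = lambda p: loglik(p) - target
lower = find_root(f, 1e-6, mle)      # ~0.6
upper = find_root(f, mle, 1 - 1e-6)  # ~0.9
```

Running this, the Wald lower bound for the first example falls below 0, while the profile limits for the second land near 0.6 and 0.9 – consistent with the values read off the likelihood profile above.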

1.8. Summary

That’s it for Chapter 1! Nothing about MARK, but some important background. Beginning with
Chapter 2, we’ll consider formatting of our data (the ‘encounter histories’ we introduced briefly in
this chapter). After that, the real details of using program MARK. Our suggestion at this stage is to (i)
leave your own data alone – you need to master the basics first. This means working through at least
chapters 3 → 8, in sequence, using the example data sets. Chapter 9 and higher refer to specific data
types – one or more may be of particular interest to you. Then, when you’re ready (i.e., have a good
understanding of the basic concepts), (ii) get your data in shape – this is covered in Chapter 2.

1.9. References

Edwards, A. W. F. (1972) Likelihood. Cambridge University Press, Cambridge (expanded edition, 1992,
Johns Hopkins University Press, Baltimore).
Lebreton, J.-D., Burnham, K. P., Clobert, J., and Anderson, D. R. (1992). Modeling survival and testing
biological hypotheses using marked animals: a unified approach with case studies. Ecological
Monographs, 62, 67-118. doi:10.2307/2937171



CHAPTER 2

Data formatting: the input file . . .

Clearly, the first step in any analysis is gathering and collating your data. We’ll assume that at the
minimum, you have records for the individually marked individuals in your study, and from these
records, can determine whether or not an individual was ‘encountered’ (in one fashion or another) on
a particular sampling occasion. Typically, your data will be stored in what we refer to as a ‘vertical file’
– where each line in the file is a record of when a particular individual was seen. For example, consider
the following table, consisting of some individually identifying mark (ring or tag number), and the year.
Each line in the file (or, row in the matrix) corresponds to the animal being seen in a particular year.

tag number year


1147-38951 73
1147-38951 75
1147-38951 76
1147-38951 82
1147-45453 74
1147-45453 78

However, while it is easy and efficient to record the observation histories of individually marked
animals this way, the ‘vertical format’ is not at all useful for capture-mark-recapture analysis. The
preferred format is the encounter history. The encounter history is a contiguous series of specific dummy
variables, each of which indicates something concerning the encounter of that individual – for example,
whether or not it was encountered on a particular sampling occasion, how it was encountered, where it
was encountered, and so forth. The particular encounter history will reflect the underlying model type
you are working with (e.g., recaptures of live individuals, recoveries of dead individuals).
Consider for example, the encounter history for a typical mark-recapture analysis (the encounter
history for a mark-recapture analysis is often referred to as a capture history, since it implies physical
capture of the individual). In most cases, the encounter history consists of a contiguous series of ‘1’s
and ‘0’s, where ‘1’ indicates that an animal was recaptured (or otherwise known to be alive and in
the sampling area), and ‘0’ indicates the animal was not recaptured (or otherwise seen). Consider the
individual in the preceding table with tag number ‘1147-38951’. Suppose that 1973 is the first year of
the study, and that 1985 is the last year of the study. Examining the table, we see that this individual was
captured and marked during the first year of the study, was seen periodically until 1982, when it was seen
for the last time. The corresponding encounter-history for this individual would be: ‘1011000001000’.
In other words, the individual was seen in 1973 (the starting ‘1’), not seen in 1974 (‘0’), seen in 1975 and
1976 (‘11’), not seen for the next 5 years (‘00000’), seen again in 1982 (‘1’), and then not seen again (‘000’).

© Cooch & White (2019) 08.06.2019



While this is easy enough in principle, you surely don’t want to have to construct capture-histories
manually. Of course, this is precisely the sort of thing that computers are good for – large-scale data
manipulation and formatting. MARK does not do the data formatting itself – no doubt you have your
own preferred ‘data manipulation’ environment (dBASE, Excel, Paradox, SAS). Thus, in general, you’ll
have to write your own program to convert the typical ‘vertical’ file (where each line represents the
encounter information for a given individual on a given sampling occasion; see the example on the
preceding page) into encounter histories (where the encounter history is a horizontal string).
In fact, if you think about it a bit, you realize that in effect what you need to do is to take a vertical
file, and ‘transpose’ (or, ‘pivot’) it into a horizontal file – where fields to the right of the individual
tag number represent when an individual was recaptured or resighted. However, while the idea of a
‘transpose’ or ‘pivot’ seems simple enough, there is one rather important thing that needs to be done –
your program must insert the ‘0’ value whenever an individual was not seen.
We’ll assume for the purposes of this book that you will have some facility to put your data into the
proper encounter-history format. For those of you who have no idea whatsoever on how to approach
this problem, we provide some practical guidance in the Addendum at the end of this chapter. Of course,
you could always do it by hand, if absolutely necessary!
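For readers who want a head start, the ‘pivot’ described above can be sketched in a few lines of Python (the tags and years are taken from the example table earlier in the chapter; the variable names are our own):

```python
# Pivot a 'vertical' file of (tag, year) sightings into encounter
# histories, filling in '0' for every year an animal was not seen.
records = [
    ("1147-38951", 73), ("1147-38951", 75), ("1147-38951", 76),
    ("1147-38951", 82), ("1147-45453", 74), ("1147-45453", 78),
]
first_year, last_year = 73, 85  # span of the study (1973-1985)

# collect the set of years each tag was seen
seen = {}
for tag, year in records:
    seen.setdefault(tag, set()).add(year)

# build one horizontal 0/1 string per tag, zero-filled over the study span
histories = {
    tag: "".join("1" if y in years else "0"
                 for y in range(first_year, last_year + 1))
    for tag, years in seen.items()
}
# histories["1147-38951"] -> "1011000001000", as derived in the text
```

The key step is the zero-fill: every year in the study span gets a column, whether or not the animal was seen that year.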

begin sidebar

editing the .INP file

Many of the problems people have getting started with MARK can ultimately be traced back to prob-
lems with the .INP file. One common issue relates to choice of editor used to make changes/additions
to the .INP file. You are strongly urged to avoid – as in ‘like the plague’ – using Windows Notepad
(or, even worse, Word) to do much of anything related to building/editing .INP files. Do yourself a
favor and get yourself a real ASCII editor. There are a number of very good ‘free’ editors you can (and
should) use instead of Notepad (e.g., Notepad++, EditPad Lite, jEdit, and so on...)

end sidebar

2.1. Encounter histories formats

Now we’ll look at the formatting of the encounter histories file in detail. It is probably easiest to show
you a ‘typical’ encounter history file, and then explain it ‘piece by piece’. The encounter-history reflects
a mark-recapture experiment.

Superficially, the encounter histories file is structurally quite simple. It is an ASCII (text)
file, in which each line consists of the encounter history itself (the contiguous string of dummy variables), followed by
one or more additional columns of information pertaining to that history. Each record (i.e., each line)
in the encounter histories file ends with a semi-colon. Each history (i.e., each line, or record) must be
the same length (i.e., have the same number of elements – the encounter history itself must be the
same length over all records, and the number of elements ‘to the right’ of the encounter history must
also be the same) – this is true regardless of the data type. The encounter histories file should have a
.INP suffix (for example, EXAMPLE1.INP). Generally, there are no other ‘control statements’ or ‘PROC
statements’ required in a MARK input file. However, you can optionally add comments to the .INP file
using the ‘slash-asterisk . . . asterisk/slash’ (/* ... */) convention common to many programming environments – for
example, a descriptive comment line at the top of the file. The only thing to remember about comments is that they do not end with a semi-colon.
Let’s look at each record (i.e., each line) a bit more closely. Typically, each encounter history
is followed by a number. This number is the frequency of all individuals having a particular encounter
history. This is not required (and in fact isn’t what you want to do if you’re going to consider individual
covariates – more on that later), but is often more convenient for large data sets. For example, the
summary encounter history

110000101 4;

could also be entered in the .INP files as

110000101 1;
110000101 1;
110000101 1;
110000101 1;

Note again that each line – each ‘encounter history record’ – ends in a semi-colon. How would you
handle multiple groups? For example, suppose you had encounter data from males and females? In
fact, it is relatively straightforward to format the .INP file for multiple groups – very easy for summary
encounter histories, a bit less so for individual encounter histories. In the case of summary encounter
histories, you simply add a second column of frequencies to the encounter histories to correspond to
the other sex. For example,

110100111 23 17;
110000101 4 2;
101100011 1 3;

In other words, 23 of one sex and 17 of the other have history ‘110100111’ (the ordering of the sexes –
which column of frequencies corresponds to which sex – is entirely up to you). If you are using individual
records, rather than summary frequencies, you need to indicate group association in a slightly less-
obvious way – you will have to use a ‘0’ or ‘1’ within a group column to indicate the frequency – but
obviously for one group only.
We’ll demonstrate the idea here. Suppose we had the following summary history, with frequencies
for males and females (respectively):

110000101 4 2;

In other words, 4 males, and 2 females with this encounter history (note: the fact that males come
before females in this example is completely arbitrary. You can put whichever sex – or ‘group’ – you want
in any column you want – all you’ll need to do is remember which columns in the .INP file correspond
to which groups).

To ‘code’ individual encounter histories, the .INP file would be modified to look like:

110000101 1 0;
110000101 1 0;
110000101 1 0;
110000101 1 0;
110000101 0 1;
110000101 0 1;

In this example, the coding ‘1 0’ indicates that the individual is a male (frequency of 1 in the male
column, frequency of 0 in the female column), and ‘0 1’ indicates the individual is a female (frequency
of 0 in the male column, and frequency of 1 in the female column). The use of one-record per individual
is only necessary if you’re planning on using individual covariates in your analysis.
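Going the other way – collapsing one-record-per-individual data back into summary frequencies – is a simple counting exercise. A Python sketch (illustrative only, using the records above):

```python
from collections import Counter

# One-record-per-individual data: (history, male_freq, female_freq),
# exactly as in the .INP lines above.
individual = [
    ("110000101", 1, 0), ("110000101", 1, 0), ("110000101", 1, 0),
    ("110000101", 1, 0), ("110000101", 0, 1), ("110000101", 0, 1),
]

males, females = Counter(), Counter()
for hist, m, f in individual:
    males[hist] += m
    females[hist] += f

# Rebuild the summary .INP lines, one per distinct history
summary = [f"{h} {males[h]} {females[h]};"
           for h in sorted(set(males) | set(females))]
# summary -> ["110000101 4 2;"]
```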

2.1.1. Groups within groups...

In the preceding example, we had 2 groups: males and females. The frequency of encounters for each
sex is coded by adding the frequency for each sex to the right of the encounter history.
But, what if you had something like males, and females (i.e., data from both sexes) and good colony
and poor colony (i.e., data were sampled for both sexes from each of 2 different colonies – one classified
as good, and the other as poor). How do you handle this in the .INP file? Well, all you need to do is have
a frequency column for each (sex.colony) combination: one frequency column for females from the
good colony, one frequency column for females from the poor colony, one frequency column for males
from the good colony, and finally, one frequency column for males from the poor colony. An example
of such an .INP file (with illustrative frequencies) might look like:

/* hist  fem-good  fem-poor  male-good  male-poor */
110100111 23 11 17 9;
110000101 4 6 2 3;

As we will see in subsequent chapters, building models to test for differences between and among
groups, and for interactions among groups (e.g., an interaction of sex and colony in this example) is
relatively straightforward in MARK – all you’ll really need to do is remember which frequency column
codes for which grouping (hence the utility of adding comments to your .INP file, as we’ve done in this
example).

2.2. Removing individuals from the sample

Occasionally, you may choose to remove individuals from the data set at a particular sampling occasion.
For example, because your experiment requires you to remove the individual after its first recapture,
or because it is injured, or for some other reason. The standard encounter history we have looked at so
far records presence or absence only. How do we accommodate ‘removals’ in the .INP file? Actually,
it’s very easy – all you do is change the ‘sign’ on the frequencies from positive to negative. Negative
frequencies indicates that that many individuals with a given encounter history were removed from
the study.

For example,

100100 1500 1678;


100100 -23 -25;

In this example, we have 2 groups, and 6 sampling occasions. In the first record, we see that there were
1,500 individuals and 1,678 individuals in each group marked on the first occasion, not encountered on
the next 2 occasions, seen on the fourth occasion, and not seen again. In the second line, we see the same
encounter history, but with the frequencies ‘-23’ and ‘-25’. The negative values indicate to MARK that
23 and 25 individuals in both groups were marked on the first occasion, not seen on the next 2 occasions,
were encountered on the fourth occasion, at which time they were removed from the study. Clearly, if
they were removed, they cannot have been seen again. So, in other words, 1,500 and 1,678 individuals
recaptured and released alive, on the fourth occasion, in addition to 23 and 25 individuals that were
recaptured, but removed, on the fourth occasion. So, (1,500 + 23) = 1,523 individuals in group 1, and
(1,678 + 25) = 1,703 individuals in group 2, with encounter history ‘100100’.
Note: the negative-frequency code is for removing individuals from the live marked population. This is usually
reserved for losses on capture (i.e., where the investigator captures an animal, and then, for some reason,
decides to remove it from the study). The idea is that you don’t want to include these ‘biologist-caused’
mortalities in the survival estimate.
On the other hand, if the known mortalities are natural (i.e., the investigator encounters a ‘dead
recovery’), and are not associated with the capture event itself, you have two options to get unbiased
survival estimates

1. pretend you never observed the mortalities (i.e., just treat those individuals as regular
releases that you never observe again). This approach is probably reasonable if the number
of such mortalities is relatively small.
2. conduct a joint live recapture-dead recovery analysis, with the known mortalities treated as
dead recoveries (see Chapter 9). Including the known mortalities (i.e., dead recoveries) will
improve the precision of your survival estimates.
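A small Python sketch of how the signed frequencies could be tallied (the parsing here is illustrative – it is not how MARK reads the file):

```python
# Tally releases vs. removals from summary .INP lines with signed
# frequencies: positive = released alive, negative = removed.
inp_lines = [
    "100100 1500 1678;",
    "100100 -23 -25;",
]

released = [0, 0]  # per-group totals released alive
removed = [0, 0]   # per-group totals removed at last capture

for line in inp_lines:
    fields = line.rstrip(";").split()
    for g, freq in enumerate(int(x) for x in fields[1:]):
        if freq >= 0:
            released[g] += freq
        else:
            removed[g] += -freq

totals = [r + x for r, x in zip(released, removed)]
# totals -> [1523, 1703], matching the arithmetic in the text
```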

2.3. missing sampling occasions + uneven time-intervals

In the preceding, we have implicitly assumed that the sampling interval between sampling occasions is
identical throughout the course of the study (e.g., sampling every 12 months, or every month, or every
week). But, in practice, it is not uncommon for the time interval between occasions to vary – either by
design, or because of ‘logistical constraints’. In the extreme, you might even miss a sampling occasion
altogether. This has clear implications for how you analyze your data.
For example, suppose you sample a population each October, and again each May (i.e., two samples
within a year, with different time intervals between samples; October → May (7 months), and May →
October (5 months)). Suppose the true monthly survival rate is constant over all months, and is equal
to 0.9. As such, the estimated survival for October → May will be 0.9⁷ = 0.4783, while the estimated
survival rate for May → October will be 0.9⁵ = 0.5905. Thus, if you fit a model without accounting
for these differences in time intervals, it is clear that there would ‘appear’ to be differences in survival
between successive samples, when in fact the monthly survival does not change over time.
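The arithmetic behind this example can be sketched directly; the key identity is that survival over an interval of length t months is the monthly rate raised to the power t:

```python
# Survival over an interval of length t months is the monthly rate
# raised to the power t, which is why unequal intervals matter.
phi_monthly = 0.9

oct_to_may = phi_monthly ** 7  # 7-month interval
may_to_oct = phi_monthly ** 5  # 5-month interval

# Once the interval lengths are specified, a constant monthly rate can
# be recovered from either interval estimate by raising it to 1/t:
recovered = oct_to_may ** (1 / 7)
```

This is exactly what specifying the time intervals in MARK accomplishes: survival is reported on a common per-unit-time scale, so the two seasonal intervals no longer masquerade as temporal variation.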
So, how do you ‘tell MARK’ that the interval between samples may vary over time? In fact, depending
on the particular situation, there are a couple of things you can do. First, consider the situation where
you sample at all occasions, but where the interval between some of those occasions is not equal. You
might think that you need to ‘code’ this interval information in the .INP file in some fashion. In fact,
you don’t – you specify the time intervals when you are specifying the data type in MARK, and not in
the .INP file. In the .INP file, you simply enter the encounter histories as contiguous strings, regardless
of the true interval between sampling occasions.
Alternatively, what if you’re missing a sampling occasion altogether? For example, suppose you have
a 5 occasion study, but for some reason, were unable to sample on the third occasion. This situation has
2 implications. First, the encounter probability for the third occasion is logically 0. Second, in the absence
of encounter data from the third occasion, the transition estimate for an individual alive and in the sample
at occasion 2 would be from occasion 2 → 4, where 4 is the next sampling occasion in the data, and not
2 → 3.

  1            2            3            4            5
  occasion 1   occasion 2   <missing>    occasion 4   occasion 5

So, in effect, missing a sampling occasion altogether is strictly equivalent to an unequal interval, at
least with respect to estimating interval transition parameters, like survival.
However, it is quite different in terms of modeling the encounter probabilities. For simple unequal
intervals, where all occasions are sampled (at unequal intervals), there is a parameter estimated for each
encounter occasion. For missing sampling occasions, you need to account for both the unequal interval
that is generated as an artifact of the missing occasion, and the fact that the encounter probability is
logically 0 for the missing occasion.
In terms of formatting the data for the situation where a sampling occasion is missing, you have
a couple of choices. Consider the following example encounter histories for our proposed 5 occasion
study, where the third sampling occasion was missed:

11011 12;
10011 7;
01001 3;

Note that for all 3 histories, the missing third occasion is represented by a ‘0’. While this is technically
correct, since you have explicitly entered a ‘0’ for the third occasion, you will need to remember to fix
the encounter parameter for that occasion to 0.
Alternatively, for some data types, you can indicate missing occasions
explicitly in the encounter history by using a ‘.’ (‘dot’) – see the ‘Help | Data Types’ menu option in
MARK for a full list of data types that allow the ‘dot’ notation in the encounter history for missing
occasions.
For example, we would modify the preceding example histories as follows:

11.11 12;
10.11 7;
01.01 3;
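If your histories were generated without the missing occasion in mind, inserting the dot after the fact is a one-line string edit. A Python sketch (the function name is our own, not a MARK utility):

```python
# Insert the '.' (dot) for a missed sampling occasion after the fact.
def dot_missing(history, occasion):
    """Replace the 1-indexed `occasion` column of `history` with '.'."""
    i = occasion - 1
    return history[:i] + "." + history[i + 1:]

histories = ["11011", "10011", "01001"]
dotted = [dot_missing(h, 3) for h in histories]
# dotted -> ["11.11", "10.11", "01.01"]
```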

There are some important considerations for handling missing occasions for data types where the
transitions between occasions involve multiple states (e.g., multi-state models, robust design models –
see the relevant discussion in those chapters). We discuss general approaches to handling uneven time-
intervals and missing sampling occasions in detail in Chapter 4.

2.4. Different encounter history formats

Up until now, we’ve more or less used typical mark-recapture encounter histories (i.e., capture histories)
to illustrate the basic principles of constructing an .INP file. However, MARK can be applied to far more
than mark-recapture analysis, and as such, there are a number of slight permutations on the encounter
history that you need to be aware of in order to use MARK to analyze your particular data type. First,
we summarize in table form (below) the different data types MARK can handle, and the corresponding
encounter history format.

recaptures only LLLL
recoveries only LDLDLDLD
both LDLDLDLD
known fate LDLDLDLD
closed captures LLLL
BTO ring recoveries LDLDLDLD
robust design LLLL
both (Barker model) LDLDLDLD
multi-strata LLLL
Brownie recoveries LDLDLDLD
Jolly-Seber LLLL
Huggins’ closed captures LLLL
Robust design (Huggins) LLLL
Pradel recruitment LLLL
Pradel survival & seniority LLLL
Pradel survival & λ LLLL
Pradel survival & recruitment LLLL
POPAN LLLL
multi-strata - live and dead encounters LDLDLDLD
closed captures with heterogeneity LLLL
full closed captures with heterogeneity LLLL
nest survival LDLDLDLD
occupancy estimation LLLL
robust design occupancy estimation LLLL
open robust design multi-strata LLLL
closed robust design multi-strata LLLL

Each data type in MARK requires a primary form of data entry provided by the encounter history.
Encounter histories can consist of information on only live encounters (LLLL) or information on both
live and dead (LDLDLDLD). In addition, some types allow a summary format (e.g., the recovery matrix) which
reduces the amount of input. The second column of the table shows the basic structure for a 4 occasion
encounter history. There are, in fact, two broad types: live encounters only, and mixed live and dead (or
known fate) encounters.
For example, for a recaptures only study (i.e., live encounters), the structure of the encounter history
would be ‘LLLL’ – where ‘L’ indicates information on encountered/not encountered status. As such, each
‘L’ in the history would be replaced by the corresponding ‘coding variable’ to indicate encountered or
not encountered status (usually ‘1’ or ‘0’ for the recaptures only history). So, for example, the encounter
‘1011’ indicates seen and marked alive at occasion 1, not seen on occasion 2, and seen again at both
occasion 3 and occasion 4.
For data types including both live and dead individuals, the encounter history for the 4 occasion
study is effectively ‘doubled’ – taking the format ‘LDLDLDLD’, where the ‘L’ refers to the live encountered
or not encountered status, and the ‘D’ refers to the dead encountered or not encountered status. At each
sampling occasion, either ‘event’ is possible – an individual could be seen alive at occasion (i), and
then found dead during the interval between (i) and (i+1). Since both ‘potential events’
need to be coded at each occasion, this effectively doubles the length of the encounter history from a 4
character string to an 8 character string.
For example, suppose you record the following encounter history for an individual over 4 occasions
– where the encounters consist of both live encounters and dead recoveries. Thus, the history ‘10001100’
reflects an individual seen and marked alive on the first occasion, not recovered during the first interval,
not seen alive at the second occasion and not recovered during the second interval, seen alive on the
third occasion and then recovered dead during the third interval, and not seen or recovered thereafter
(obviously, since the individual was found dead during the preceding interval).
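A handy way to sanity-check an LDLD history is to split it into per-occasion (live, dead) pairs. A Python sketch using the ‘10001100’ example above (the helper name is our own):

```python
# Split an LDLD-format history into per-occasion (live, dead) pairs --
# a convenient way to sanity-check how a history will be read.
def ld_pairs(history):
    """Return [(L, D), ...] pairs for a live/dead encounter history."""
    return [(history[i], history[i + 1]) for i in range(0, len(history), 2)]

pairs = ld_pairs("10001100")
# -> [('1','0'), ('0','0'), ('1','1'), ('0','0')]: marked alive at
# occasion 1, not seen at occasion 2, seen alive at occasion 3 and
# recovered dead during the following interval.
```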

2.5. Some more examples

The MARK help files contain a number of different examples of encounter formats. We list only a few
of them here. For example, suppose you are working with dead recoveries only. If you look at the table
on the preceding page, you see that it has a format of ‘LDLDLDLD’. Why not just ‘LLLL’, and using ‘1’ for
live’, and ‘0’ for recovered dead? The answer is because you need to differentiate between known dead
(which is a known fate), and simply not seen. A ‘0’ alone could ambiguously mean either dead, or not
seen (or both!).

2.5.1. Dead recoveries only

The following is an example of dead recoveries only, because a live animal is never captured alive after
its initial capture. That is, none of the encounter histories have more than one ‘1’ in an L column. This
example has 15 encounter occasions and 1 group. If you study this example, you will see that 500 animals
were banded each banding occasion.

000000000000000000000000000010 465;
000000000000000000000000000011 35;
000000000000000000000000001000 418;
000000000000000000000000001001 15;
000000000000000000000000001100 67;
000000000000000000000000100000 395;
000000000000000000000000100001 3;
000000000000000000000000100100 25;
000000000000000000000000110000 77;

Traditionally, recoveries only data sets were summarized into what are known as recovery tables.
MARK accommodates recovery tables, which have a ‘triangular matrix form’, where time goes from left
to right (shown below). This format is similar to that used by Brownie et al. (1985).

7 4 1 0 1;
8 5 1 0;
10 4 2;
16 3;
12;
99 88 153 114 123;

Following each matrix is the number of individuals marked each year. So, 99 individuals marked
on the first occasion, of which 7 were recovered dead during the first interval, 4 during the second, 1
during the third, and so on.
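To illustrate how the triangular table relates to individual LDLD histories, the following Python sketch expands the example table into summary .INP lines, assuming rows are release cohorts and columns are successive recovery intervals (this layout is our reading of the example; the variable names are our own):

```python
# Expand the triangular recovery table into summary LDLD .INP lines.
# An animal marked at occasion i and recovered in interval j gets a '1'
# in the i-th L column and the j-th D column; animals never recovered
# get a single '1' in their release column and an all-zero remainder.
marked = [99, 88, 153, 114, 123]
recoveries = [[7, 4, 1, 0, 1], [8, 5, 1, 0], [10, 4, 2], [16, 3], [12]]
n_occ = 5

lines = []
for i, row in enumerate(recoveries):
    for k, count in enumerate(row):
        j = i + k                 # interval of recovery (0-indexed)
        h = ["0"] * (2 * n_occ)
        h[2 * i] = "1"            # L column: marked at occasion i
        h[2 * j + 1] = "1"        # D column: recovered in interval j
        lines.append(f"{''.join(h)} {count};")
    never = marked[i] - sum(row)  # cohort members never recovered
    h = ["0"] * (2 * n_occ)
    h[2 * i] = "1"
    lines.append(f"{''.join(h)} {never};")
# lines[0] -> "1100000000 7;" (marked occasion 1, recovered interval 1)
```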

2.5.2. Individual covariates

Finally, an example (below) of known fate data, where individual covariates are included. Comments are
given at the start of each line to identify the individual (this is optional, but often very helpful in keeping
track of things). Then comes the capture history for this individual, in a ‘LDLDLD. . .’ sequence. Thus the
first capture history is for an animal that was released on occasion 1, and died during the interval. The
second animal was released on occasion 1, survived the interval, released again on occasion 2, and died
during this second interval. Following the capture history is the count of animals with this history
(always 1 in this example). Then, 4 covariates are provided. The first is a dummy variable representing
age (0=subadult, 1=adult), then a condition index, wing length, and body weight.

/* 01 */ 1100000000000000 1 1 1.16 27.7 4.19;


/* 04 */ 1011000000000000 1 0 1.16 26.4 4.39;
/* 05 */ 1011000000000000 1 1 1.08 26.7 4.04;
/* 06 */ 1010000000000000 1 0 1.12 26.2 4.27;
/* 07 */ 1010000000000000 1 1 1.14 27.7 4.11;
/* 08 */ 1010110000000000 1 1 1.20 28.3 4.24;
/* 09 */ 1010000000000000 1 1 1.10 26.4 4.17;

What if you have multiple groups, such that individuals are assigned (or part of) a given group, and
where you also have individual covariates? There are a couple of ways you could handle this sort of
situation. You can either code for the groups explicitly in the .INP file, or use an individual covariate
for the groups. There are pros and cons to either approach (this issue is discussed in Chapter 11).
Here is a snippet from a data set with 2 groups coded explicitly, and an individual covariate. In
this data fragment, the first 8 contiguous values represent the encounter history, followed by 2 columns
representing the frequencies depending on group: ‘1 0’ indicating group 1, and ‘0 1’ indicating group
2, followed by the value of the covariate:

11111111 1 0 123.211;
11111111 0 1 92.856;
11111110 1 0 122.115;
11111110 1 0 136.460;

So, the first record with an encounter history of ‘11111111’ is in group 1, and has a covariate value
of 123.211. The second individual, also with an encounter history of ‘11111111’, is in group 2, and has a
covariate value of 92.856. The third individual has an encounter history of ‘11111110’, and is in group
1, with a covariate value of 122.115. And so on.

If you wanted to code the group as an individual covariate, this same input file snippet would look
like:

11111111 1 1 123.211;
11111111 1 0 92.856;
11111110 1 1 122.115;
11111110 1 1 136.460;

In this case, following the encounter history, is a column of 1’s, indicating the frequency for each
individual, followed by a column containing a 0/1 dummy code to indicate group (in this example,
we’ve used a 1 to indicate group 1, 0 to indicate group 2), followed by the value of the covariate.
A final example – for three groups where we code for each group explicitly (such that each group
has its own ‘dummy column’ in the input file), an encounter history with individual covariates might
look like:

11111 1 0 0 123.5;
11110 0 1 0 99.8;
11111 0 0 1 115.2;

where the first individual with encounter history ‘11111’ is in group 1 (dummy value of 1 in the first
column after the encounter history, and 0’s in the next two columns) and has a covariate value of 123.5,
second individual with encounter history ‘11110’ is in group 2 (dummy code of 0 in the first column, 1
in the second, and 0 in the third) and a covariate value of 99.8, and a third individual with encounter
history ‘11111’ in group 3 (0 in the first two columns, and a 1 in the third column), with a covariate
value of 115.2.
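Converting between the two group codings is mechanical: the position of the ‘1’ among the dummy columns identifies the group. A Python sketch using the three-group example above (the record layout and names are our own):

```python
# The position of the '1' among the dummy columns identifies the group.
rows = [
    ("11111", (1, 0, 0), 123.5),
    ("11110", (0, 1, 0), 99.8),
    ("11111", (0, 0, 1), 115.2),
]

# recover (history, group number, covariate) from the dummy coding
parsed = [(hist, dummies.index(1) + 1, cov) for hist, dummies, cov in rows]
# parsed -> [('11111', 1, 123.5), ('11110', 2, 99.8), ('11111', 3, 115.2)]
```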
As is noted in the help file (and discussed at length in Chapter 11), it is helpful to scale the values
of covariates so that they fall roughly on the interval [0, 1], to ensure that the numerical optimization algorithm
finds the correct parameter estimates. For example, suppose the individual covariate ‘weight’ is used,
with a range from 1,000 g to 5,000 g. In this case, you could scale the values of weight to be from
0.1 to 0.5 by multiplying each ‘weight’ value by 0.0001. In fact, MARK defaults to doing this sort of
scaling for you automatically (without you even being aware of it). This ‘automatic scaling’ is done by
determining the maximum absolute value of the covariates, and then dividing each covariate by this
value. This results in each column scaled to fall between −1 and 1. This internal scaling is purely for purposes
of ensuring the success of the numerical optimization – the parameter values reported by MARK (i.e.,
in the output that you see) are back-transformed to the original scale. Alternatively, if you prefer that
the ‘scaled’ covariates have a mean of 0, and unit variance (this has some advantages in some cases), you
can use the ‘Standardize Individual Covariates’ option of the ‘Run Window’ to perform the default
standardization method (more on these in subsequent chapters).
More details on how to handle individual covariates in the input file are given in Chapter 11.
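The two scalings described above can be sketched as follows (plain Python, not MARK's internal code; the weight values are illustrative):

```python
# Two common covariate scalings: max-|x| scaling (resembling MARK's
# internal default) and mean-0 / unit-variance standardization.
weights = [1000.0, 2500.0, 3200.0, 4100.0, 5000.0]

# (1) divide by the maximum absolute value -> every value in [-1, 1]
max_abs = max(abs(w) for w in weights)
scaled = [w / max_abs for w in weights]

# (2) standardize to mean 0, unit variance (sample SD)
n = len(weights)
mean = sum(weights) / n
sd = (sum((w - mean) ** 2 for w in weights) / (n - 1)) ** 0.5
standardized = [(w - mean) / sd for w in weights]
```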

Summary

That’s it! You’re now ready to learn how to use MARK. Before you leap into the first major chapter
(Chapter 3), take some time to consider that MARK will always do its ‘best’ to analyze the data you
feed into it. However, it assumes that you will have taken the time to make sure your data are correct.
If not, you’ll be the unwitting victim to perhaps the most telling comment in data analysis: ‘garbage
in...garbage out’. Take some time at this stage to make sure you are confident in how to properly create
and format your files.

Chapter 2. Data formatting: the input file . . .



Addendum: generating .inp files

As noted at the outset of this chapter, MARK has no capability of generating input (.INP) files. This is
something you will need to do for yourself. In this short addendum, we introduce two approaches to
generating .INP files, the first based on ‘Excel pivot tables’, and the second using R. Since there are any
number of different software applications for managing and manipulating data, we state for the record
that we are going to demonstrate creating .INP files using Excel or R, not as a point of advocacy for
using Excel, but owing more to the near ubiquity of one or both software applications.
We also assume that in most cases, the data file that will need to be ‘transformed’ into a MARK-
compatible .INP file consists of what we call a ‘vertical array’ of data – where each individual (identified
by a band or tag number) encounter is entered as a separate row in the data file. One row for each
encounter occasion. So, the minimum data file we consider has at least the band or tag number, and the
time of the encounter (say, the year of the encounter). We seek to ‘transpose’ (or ‘pivot’) this ‘vertical
array’ into a ‘horizontal’ encounter history.

Using Excel

Andrew Sterner, Marine Turtle Research Group, University of Central Florida

We will demonstrate the basic idea using an example where we will reformat an Excel spreadsheet
containing some live encounter data (note: most of what follows applies generally to Access databases
as well). We wish to format these data into an .INP file. The data are contained in the Excel spreadsheet
csj-pivot.xlsx (note, we’re clearly using Excel 2007 or later). Here are what the data look like before
we transform them into an input file.
The file consists of two data columns: TAG (indicating
the individual), and YEAR (the year that the individual was
encountered). This data file contains encounter data for 15
marked individuals, with encounter data collected from 2000 to
2010 (thus, the encounter history will be 11 characters in length).
Our challenge, then, is to take this ‘vertical’ file (one record
per individual each year encountered), and ‘pivot’ it horizontally.
For example, take the first individual in the file, ATS150. It was
first encountered in 2000, again in 2002, and again (for the final
time) in 2003. The second individual, ATS151, was seen for the
first time in 2006, and then not seen again. The third individual,
ATS153, was seen in 2004, and not seen again after that. And so on.
If we had to generate the .INP file by hand for these individuals,
their encounter histories would look like:

/* ATS150 */ 10110000000 1;
/* ATS151 */ 00000010000 1;
/* ATS153 */ 00001000000 1;

As it turns out, we can make use of the ‘pivot table’ in Excel
(and some simple steps involving ‘search and replace’ and the
CONCATENATE function) to generate exactly what we need. The
process can be more involved for more complicated data types
(e.g., robust design), but the basic principle of ‘pivoting’ applies.


Here are the basic steps. First, we select the rows and
columns containing the data. Then, select ‘Insert | PivotTable
| PivotTable’, as shown to the right (make sure you select
PivotTable and not PivotChart).
This will bring up a dialog window (below) related to the
data you want to use, and where you want the pivot table to be
placed. We strongly recommend you put the pivot table into a
‘New Worksheet’ (this is selected by default):

The ‘Table/Range’ field will already be filled with the rows and columns of the data you selected.
Once you click ‘OK’, you will be presented with the template from which you will generate the pivot
table:

All you really need to do at this point is specify the ‘row labels’, the ‘column labels’, and the ‘values’
(on the right hand side of the template).


So, to specify the row labels, we simply select the ‘tag’ field and drag it down to the ‘row labels’
box at the bottom-right, if it isn’t there already (this largely depends on which version of Excel you
might have):

Then, do the same thing for the ‘Year’ field: select ‘Year’, and drag it down to the ‘column labels’
box.

Once you have done this, you will quickly observe that a table (the ‘pivot table’) has been inserted
into the main body of the template (below):


The table has row labels (individual tag numbers) and column labels (the years in your data file), plus
some additional rows and columns for ‘Grand total’ (reflecting the fact that pivot tables were designed
primarily for business applications).
However, at present, there is nothing in the table (all of the cells are blank). Again, this may differ
depending on the version of Excel you’re using. Now, drag the ‘year’ field label down to the ‘values’
box in the lower right-hand corner.

What we see is that the year during which an encounter has occurred for a given individual has been
entered explicitly into the table, in the column corresponding to that year.
But, for an encounter history, we want a ‘1’ to indicate the encounter year, not the year itself, and a
‘0’ to indicate a year when an encounter did not occur. Achieving the first objective is easy. Simply pull
down the ‘Sum of Year’ menu, and select ‘Value Field Settings...’:

Then, switch the ‘Summarize value field by’ selection from ‘Sum’ to ‘Count’, as shown on the top of
the next page.


As soon as you do this, then all of the years in the pivot table will be changed to 1. Why? Simple – all
you’ve told the pivot table to do is count the number of times a year occurs in a given cell. Since the data
file contains only a single record for each individual for each year it was encountered, then it makes
sense that the tabulated ‘Count’ should be a 1. Moreover, now the ‘Grand Total’ rows and columns have
some relevance – they indicate the number of individuals encountered in a given year (column totals),
or the number of times a given individual was caught over the interval from 2000 to 2010 (row totals).
OK, on to the next step – putting a ‘0’ in the blank cells for those years when an individual wasn’t
caught. This sounds easy enough in principle – a reasonable approach would be to select the rows and
columns, and execute a ‘search and replace’, replacing blank cells with ‘0’. In fact, this is exactly what
we want to do. However, for various reasons, you can’t actually edit a pivot table. What you need to do
first is select and copy the rows and columns (including the row labels, but excluding row and column
totals), and paste them into a new worksheet. Then, simply do a ‘Find & Select | Replace’, replacing blanks
(simply leave the ‘Find what’ field empty) with a ‘0’.
The result is shown below:

(Alternatively, if you navigate to ‘PivotTable | PivotTable Name | Options’, you will see an option
to specify what an empty cell should show. Simply change it to a ‘0’).
We’re clearly getting closer. All that remains is to do the following. First, we remember that each line
of the encounter history file must end with a frequency – where each line in the file corresponds to a
single individual, then this frequency is simply ‘1;’. So, we simply enter ‘1;’ into column M (the first empty column after the 11 year columns), and copy it
down for as many rows as there are in the data (there are a number of ways to copy a value down a set
of rows – we’ll assume here you know of at least one way to do this).


Now, for a final step – we ultimately want an encounter history (.INP file) where the encounters
form a contiguous string (i.e., no spaces). We can achieve this relatively easily by using the CONCATENATE
function in Excel. Simply click the top-most cell in the next empty column (column N in our example),
and then go up into the function box, and enter

=CONCATENATE("/* ",A1," */ ",B1,C1,D1,E1,F1,G1,H1,I1,J1,K1,L1," ",M1)

In other words, we want to ‘concatenate’ (merge together without spaces), various elements – some
from within the spreadsheet, others explicitly entered (e.g., the delimiters for comments, so we can
include the tag information, and some spacer elements).
Once you enter this formula, you can copy it down in column N over all rows in the file. If you
manage to do this correctly, you will end up with a spreadsheet looking like the one below:

All that remains is to select column N (which contains the formatted, concatenated encounter histories),
and paste them into an ASCII text file. (A reminder here that you should avoid – as in ‘like the
plague’ – using Word or Notepad as your ASCII editor. Do yourself a favor and get yourself a real ASCII
editor.)

Sampling occasions with no data...

Suppose that you have no encounter data for a particular sampling occasion (meaning, no newly
marked individuals, or encounters with previously marked individuals). Although not common, such
a circumstance can arise for various reasons.
Let’s consider how to handle this sort of situation, using a subset of the data contained in the
csj-pivot.xlsx spreadsheet we used in the preceding section. Here, we’ll simply drop the encounter
data for the individual with the tag ‘ATS207’. We’ll call this edited spreadsheet csj-pivot_missing.xlsx.
After dropping individual ‘ATS207’, we now have 14 individuals.
What about the range of years? If you look at csj-pivot_missing.xlsx, we see that the first encounter
year is 2000, and the last encounter year is still 2010. So, our encounter histories should still be 11
characters long.
We’ll run through the ‘pivot table’ approach we just demonstrated in the preceding section, to see if
in fact it works with the edited spreadsheet. We’ll skip most of the various steps (which are identical to
what was described earlier), and simply look at the pivot table (shown at the top of the next page).


Superficially, this looks very much the same as the pivot table generated for the full spreadsheet data. Ah,
but look closely. Notice that there are columns for the years 2000 → 2006, and 2009 → 2010. But, there
are no columns for 2007 and 2008! Why? Simple – with individual ‘ATS207’ dropped from the data set,
there simply are no encounter data for 2007 and 2008.
In fact, this is a significant issue. If we hadn’t noticed the ‘missing years’, and simply proceeded
with the remaining steps, we’d end up with encounter histories that ‘look’ correct, but which would
be entirely wrong. The only clue we’d have, if we were paying attention, is that instead of being 11
characters long, they’d be only 9 characters long – because of the two missing years.
So, what do you do to deal with this problem? There are at least 2 options. First, you can simply
copy the pivot table to a new sheet, and insert 2 new columns, for 2007 and 2008, respectively (as shown
below – we’ve indicated the two new columns using a red font for the column labels):

For the ‘missing years’, you can choose to either (i) fill them with 0’s, along with the other zero cells of
the table, or (ii) fill the columns for 2007 and 2008 with ‘dots’ (.). For some data types (see the ‘Help | Data
Types’ menu option in MARK), the ‘dot’ is interpreted as a missing occasion, which can be somewhat
more convenient than manually setting the encounter probability to 0 for those years.
Alternatively, you can enter a ‘fake’ individual into your spreadsheet right at the start, before you
begin constructing the pivot table. For this ‘fake’ individual, you simply add as many rows (where each
row corresponds to a sampling occasion) as you have occasions over the whole data set (in this example,
11 occasions, from 2000 → 2010).


So, for this example, we would add 11 rows for the ‘fake’ individual, one for each year from 2000 → 2010.

Why do this? By introducing this ‘fake’ individual, there is an encounter record (i.e., row) for
every possible year in the data set. If you think about it, this should ensure that when we ‘pivot’ the
spreadsheet there is a column for all years. If we try it, this is exactly what we see:

So, we now simply select the ‘real data’ from the pivot table (as shown below), and copy it out to a
new sheet, and proceed from there.


Both approaches are fairly straightforward to implement, and successfully address the problem of
‘missing occasions’. If there is an advantage of the ‘fake individual’ approach, it is that it is easy to
implement as a general practice. Meaning, for any data set, you simply (i) scroll to the bottom, and (ii)
add encounter rows for each occasion in the study for a ‘fake individual’, (iii) run through the process
of pivoting the spreadsheet, and then (iv) copy the ‘real data’ to a new sheet for the final processing.

Other data types

Here we will consider 2 other data types, the robust design, and multi-state. Clearly, there are more
data types in MARK, but these two represent very common data types, and if you understand steps in
formatting .INP files for these two data types, you’ll more than likely be able to figure out other data
types on your own.

multi-state

Here we will demonstrate formatting an .INP file for a multi-state data set (see Chapter 10). The
encounter data we will use are contained in the Excel spreadsheet MS-pivot.xlsx. The file consists
of 3 columns: TAG (indicating the individual), YEAR (the year the individual was encountered), and the
STATE (for this example, there are 3 possible states: F, U, N).
We start by noting that STATE is a character (i.e., a
letter). This might seem perfectly reasonable, since the
most appropriate state name (indicator) might be a
character. Unfortunately, Excel can’t handle characters
in the table cells when you pivot the table. As such, you
first need to (i) select the column containing the state
variable, (ii) copy this into the first empty column, and
(iii) execute a ‘Find and Replace’ in this column, such
that you change F → 1, U → 2, and N → 3. Once finished,
your Excel spreadsheet should look something like what
is shown to the right.
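Incidentally, if you are doing this recode step outside of Excel, the same F → 1, U → 2, N → 3 mapping can be sketched in R with a named lookup vector (a hypothetical illustration; only the state names come from this example):

```r
# recode character states to the numeric codes used in the pivot table
state  <- c("F", "U", "N", "F")     # example state column
lookup <- c(F = 1, U = 2, N = 3)    # mapping: F -> 1, U -> 2, N -> 3
state_num <- unname(lookup[state])  # 1 2 3 1
```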
Next, select the data, and insert a Pivot Table into a new sheet in the spreadsheet. Drag TAG to the
‘Row Labels’ box, YEAR to the ’Column Labels’ box, and State (numeric) to the ‘Values’ box:


Next, copy the TAGS, YEARS and table values to a new worksheet. Then ‘Find and Replace’ all the
blank cells with zeros. At this point, you have a decision to make: you can either (i) ‘Find and Replace’
the states from numeric back to their original character values (i.e., 1 → F, 2 → U and 3 → N), or (ii) leave
the states numeric, and simply inform MARK what the states mean. For this example, we’ll ‘Find and
Replace’ the states from numeric back to their original character values. Finally, add a column of ‘1;’
to the new worksheet.
Then click the top-most cell in the next empty column (column N in our example), and then go up
into the function box, and enter
=CONCATENATE("/* ",A1," */ ",B1,C1,D1,E1,F1,G1,H1,I1,J1,K1,L1," ",M1)

In other words, we want to ‘concatenate’ (merge together without spaces), various elements – some
from within the spreadsheet, others explicitly entered (e.g., the delimiters for comments, so we can
include the tag information, and some spacer elements). Once you enter this formula, you can
copy it down in column N over all rows in the file. The final worksheet should look something like the
one shown at the top of the next page. At this point, you simply copy your concatenated encounter
histories from column N into an editor, and save into an .INP file.

robust design

For our final example, we consider formatting an .INP file for a robust design analysis (the robust design
is covered in Chapter 15). In brief, the robust design combines closed population samples embedded
(nested) within open population samples. Consider the following figure:

time
open open

primary
samples 1 2 3

secondary
samples 1 2 ... k1 1 2 ... k2 1 2 ... k3

closure closure closure


As shown, there are 3 ‘open population’ samples (known as primary period samples). Between open
samples, population abundance can change due to emigration, death, immigration or birth. Within each
open sample period are embedded k ‘closed population’ (or secondary) samples. The trick here is to
encode the encounter history taking into account the presence of both primary and secondary samples
(where the number of secondary samples may vary among primary samples). As you might expect, the
greater complexity of the RD encounter file might require a somewhat higher level of Excel proficiency
than the first two examples we discussed earlier.
In this example (data in RD-pivot.xlsx), we assume primary samples from 2000-2010. Within each
primary period, we have 4 secondary samples, which occur from May 1 to May 15 (secondary sample
1), May 16 to May 30 (secondary sample 2), June 1 to June 15 (secondary sample 3), and June 16 to June
30 (secondary sample 4). For each secondary sample, an encountered individual is recorded only once.
We imagine that your data are stored in the following way. For each individual (TAG), for each primary
sample (YEAR), you have a series of columns, one for each secondary sampling period.

For example, in the preceding figure, we see that individual with tag ‘ATS150’ was observed during
primary samples 2000, 2001, 2002, 2003, 2009, and 2010. In 2000, the individual was not observed during
the first secondary sample (May 1 to May 15), was observed during the second secondary sample (May
16 to May 30), was not observed during the third secondary sample (June 1 to June 15), and was observed
during the fourth and final secondary sample (June 16 to June 30). In contrast, in 2010, the individual
with tag ‘ATS150’ was observed in all 4 secondary samples.
Now, you may be wondering why we’ve entered dates in terms of 2012, even for primary encounter
years <2012. For example, for ‘ATS150’, we enter ‘5/30/12’ as the date for the encounter during the second
secondary sample period. We need to do this in order to make use of some very handy Excel functions.
For example, consider the ‘year’ function. This function extracts the year associated with a given date
(such that if you type in ‘=year(B2)’ and B2 is a date, it will return the year associated with that date). So,
for robust design data, you may have intervals (for a secondary sample period) spanning from 5/1/12
to 5/15/12, and you want to know if the encounter date falls between them.
All you need to do is

• use the AND function to determine if a date falls within a given range. For example, in cell
H3 in the spreadsheet, we enter
=AND(D3>=H1,D3<=H2)


• What you are asking Excel is: “Is D3 (my date of capture) greater than or equal to my first
date, 5/1/12, and less than or equal to 5/15/12”. We do the same thing for each of the other
3 secondary sample periods.
• This may seem a bit odd at first, but keep in mind that Excel treats all dates as a number of
days since January 1, 1900 or 1904 (depending on which version of Excel you are using).
• The AND function will return a TRUE value if the criteria in the parentheses are met, or a
FALSE value if they are not.
• Once you have got all of your TRUE and FALSE values, copy them into a separate set of
columns. Note that instead of just ‘paste’ or ‘ctrl+v’, you want to right-click and ‘paste
special’ and select the ‘Values’ box. This tells Excel to just give you the displayed number,
text, or whatever appears in the box, without any of the underlying formulas.
• Now you can ‘Find and Replace’ TRUE with 1 and FALSE with 0.
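For readers doing this step in R rather than Excel, the same date-window test can be sketched as follows (a hypothetical analog of the AND() formula above, not part of the spreadsheet; the dates are from this example):

```r
# hypothetical R analog of the Excel AND() date test:
# was this capture date inside the secondary sampling window?
capture   <- as.Date("2012-05-10")  # date of capture (cf. cell D3)
win_open  <- as.Date("2012-05-01")  # window opens   (cf. cell H1)
win_close <- as.Date("2012-05-15")  # window closes  (cf. cell H2)
in_window <- capture >= win_open & capture <= win_close
as.integer(in_window)               # 1 if encountered in this window, 0 if not
```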

These steps (and formulas) are shown in worksheet ‘RD within season period trick’. At this
point, you will see something that looks like the following:

At this point, the remaining steps are similar to the same steps we used for CJS and MS data types
(as described earlier). You simply

1. copy the data to a new worksheet (shown in ‘capture data-robust design’)
2. Select the data, then ‘Insert | Pivot Table | Pivot Table’
3. Drag Tag to ’Row Label’, Year to ‘Column Label’
4. Now here is another difference for the RD: there are multiple occasions per year. So just
drag each one to the values box in the order that they occur!
5. concatenate into a contiguous encounter history, and you’re done. Have a look at the
worksheet ‘RD Input Construction’ for what it should look like.


Using R

Brandon Merriell & Cory Snyder, Environmental and Sustainability Sciences, Cornell University

We will demonstrate the basic idea using an example where we will reformat an existing Excel
spreadsheet containing some live encounter data (note: most of what follows applies generally to
Access databases as well). We wish to format these data into an .INP file. The data are contained in
the Excel spreadsheet csj-pivot.xlsx (the same example spreadsheet used in the preceding section
demonstrating the use of a ‘pivot table’ in Excel). Here are what the data look like before we transform
them into an input file:
The file consists of two data columns: TAG (indicating
the individual), and YEAR (the year that the individual was
encountered). This data file contains encounter data for 15
marked individuals, with encounter data collected from 2000 to
2010 (thus, the encounter history will be 11 characters in length).
Our challenge, then, is to take this ‘vertical’ file (one record
per individual each year encountered), and ‘pivot’ it horizontally.
For example, take the first individual in the file, ATS150. It was
first encountered in 2000, again in 2002, and again (for the final
time) in 2003. The second individual, ATS151, was seen for the
first time in 2006, and then not seen again. The third individual,
ATS153, was seen in 2004, and not seen again after that. And so
on.

If we had to generate the .INP file by hand for these individuals, their encounter histories would look
like:

/* ATS150 */ 10110000000 1;
/* ATS151 */ 00000010000 1;
/* ATS153 */ 00001000000 1;

Accomplishing this is relatively straightforward in R, using the capabilities of the reshape package
(Wickham 2007). Basically, the reshape package lets you “melt” data so that each row is a
unique id-variable combination. Then you “cast” the melted data into any shape you would like (see
the package documentation for details, and examples).
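To see what the ‘melt’ and ‘cast’ steps accomplish, the same Tag-by-Year pivot can be sketched on a toy data set using base R’s table() function (an illustration only; the script below uses reshape instead, and the tags here are made up):

```r
# toy 'vertical' data: one row per individual per encounter year
toy <- data.frame(Tag  = c("A1", "A1", "B2"),
                  Year = c(2000, 2002, 2001))
# table() cross-tabulates Tag by Year: 1 where an encounter occurred,
# 0 where it did not -- the core of the 'pivot'
eh <- table(toy$Tag, toy$Year)
eh
#      2000 2001 2002
#   A1    1    0    1
#   B2    0    1    0
```

Note that, just as with an Excel pivot table, a year with no encounters at all would simply not appear as a column – the ‘missing occasions’ problem discussed next.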
As noted in the preceding section on using a pivot table in Excel, one technical ‘hassle’ you might face
is how to handle occasions (say, years in this example) where there are simply no data (i.e., occasions
with no initial capture-marking events, and no reencounter events of previously marked individuals).
The ‘challenge’ is to find a way to (i) identify the missing occasions (years, for this example), (ii) add
encounter columns consisting of all 0’s for those missing years, and (iii) make sure the resulting set of
encounter columns is ordered in the correct chronological (ordinal) sequence.
There are probably several ways to accomplish this in R. A ‘brute force’ approach, as introduced
in the preceding section on using Excel to format a CJS .INP file using pivot tables, is to simply
introduce encounter data for a ‘fake individual’ before formatting the data. We’ll return to that in a
moment. For now, we’ll consider a ‘general approach’ using R, that automatically detects and accounts
for ‘missing occasions’. This approach is based on comparing the set (list) of occasions (years) that are in
the encounter file against a canonical list of all the occasions (years) that occur between the first and last


encounter years. We will first confirm that the script works for a data set without any missing occasions
(csj-pivot.xlsx), and then test the approach to handling ‘missing occasions’ using a subset of these
data, after dropping the encounter data for individual with tag ‘ATS207’ (csj-pivot_missing.xlsx).
Here is the script:

# simple code to generate encounter history .inp files for MARK from CJS data

# clean up a few things, and set wd


rm(list=ls(all=TRUE))
getwd()
setwd("C:/Users/markbook/Desktop")

# load package "reshape" first


library(reshape)

# read in the data - assumes .csv file minimally contains individual identification field
# called ’Tag’ and the encounter year as ’Year’. Following code assumes ’Tag’ is in the
# first column, and ’Year’ is in some other column (say, the second column).
data=read.csv("cjs-pivot.csv") # a .csv export of the example spreadsheet
data=data.frame(data); data

# now, we reshape the data using melt


transform=melt(data, id.vars="Tag")
pivot=cast(transform, Tag ~ value)
pivot[is.na(pivot)]=0

# turns all non-zero matrix elements (for year, not tag) into 1
# following assumes Tag is in the first column of pivot
pivot[,2:ncol(pivot)][pivot[,2:ncol(pivot)] != 0] = 1

## now get everything ready to output the CH ##

# number of year columns plus one (for the ’Tag’ column) - used to index the encounter columns


lh <- max(data$Year)-min(data$Year)+2;

# following code needed to accommodate any missing years of data. Basic logic
# is to identify missing occasions by comparing columns in pivot table with
# a canonical list (occStr) of occasions (years) that should be in the table...

occStr <- seq(min(data$Year),max(data$Year),1); # vector of years you want


occStr <- as.character(occStr); # convert to character string
occStr <- c("Tag",occStr); # pre-pend tag onto occStr

Missing <- setdiff(occStr,names(pivot)) # Find names of missing columns


pivot[Missing] <- 0 # Add them, filled with ’0’s or dots
pivot <- pivot[occStr] # sort in ordinal sequence

#
# now do formatting of encounter history file
#

pivot$eh <- apply(pivot[2:lh],1,paste,collapse="") # concatenates encounter columns into eh


pivot[2:lh] <- NULL # drops individual encounter columns


# create commented tag


pivot$Tag <- paste("/*", pivot$Tag, "*/", sep=" ")

# sort by descending encounter histories


pivot <- pivot[order(pivot$eh,decreasing=TRUE),]

# tack on the frequency for the individual


pivot$end <- "1;";

# write out the input file


write.table(pivot,file="cjs-pivot.inp",sep=" ",quote=F,col.names=F,row.names=F);

The output file cjs-pivot.inp (shown below) is exactly what we want – the commented individual
tag, followed by the encounter history as a contiguous string of length 11, consisting of 1’s and 0’s, followed
by a ‘1;’ to indicate the individual frequency:

Sampling occasions with no data...

Now, does the script work if there are ‘missing occasions’? We’ll look at the data contained in the spread-
sheet csj-pivot_missing.xlsx, which is a subset of the full dataset, after dropping the encounter data
for the individual with tag ‘ATS207’. As a result, in this subset of the data, there are no encounter data for
2007 and 2008.
If we run through the first few lines of the R script, and have a look at the first few rows of
the pivot dataframe before we have the script ‘look’ for the missing occasions, we see that the pivot
dataframe looks much the same as what we saw using Excel – and (looking closely), we see that we’re
missing occasion columns for 2007 and 2008:

> pivot
Tag 2000 2001 2002 2003 2004 2005 2006 2009 2010
1 ATS150 2000 NA 2002 2003 NA NA NA NA NA
2 ATS151 NA NA NA NA NA NA 2006 NA NA
3 ATS153 NA NA NA NA 2004 NA NA NA NA
4 ATS155 NA 2001 NA NA NA 2005 NA 2009 2010
5 ATS156 NA NA NA NA NA NA 2006 NA NA
6 ATS157 2000 NA NA NA NA NA NA NA NA

OK, so how do we fix the problem? First, we’ll change all the non-zero encounter ‘years’ to 1’s, and
all the NA’s to 0’s.


Then, we have the script determine which ‘occasions’ are missing. We do this by generating a
‘canonical’ list of the years that should be in the file (2000 → 2010), which we call ‘occStr’:
occStr <- seq(min(data$Year),max(data$Year),1); # vector of years you want
occStr <- as.character(occStr); # convert to character string
occStr <- c("Tag",occStr);

Now, the key step – we use the setdiff function to compare the canonical list of occasions (occStr),
with the list of occasions that actually are in the data file (using names(pivot)). We then add the columns
to the pivot dataframe, set them to zero (we could use ‘dots’, if desired), and then sort them:
Missing <- setdiff(occStr, names(pivot)) # Find names of missing columns
pivot[Missing] <- 0 # Add them, filled with ’0’s
pivot <- pivot[occStr] # sort in ordinal sequence

The resulting pivot dataframe now has columns for the missing occasions (years): 2007 and 2008 have
been inserted, filled with 0’s, and sorted into the correct ordinal order. Here are the first few rows:

> pivot
Tag 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010
1 ATS150 1 0 1 1 0 0 0 0 0 0 0
2 ATS151 0 0 0 0 0 0 1 0 0 0 0
3 ATS153 0 0 0 0 1 0 0 0 0 0 0
4 ATS155 0 1 0 0 0 1 0 0 0 1 1
5 ATS156 0 0 0 0 0 0 1 0 0 0 0
6 ATS157 1 0 0 0 0 0 0 0 0 0 0

Pretty slick, eh? Note that for the ‘missing years’, you can choose to either (i) fill them with 0’s (as
shown above), or (ii) fill the columns for 2007 and 2008 with ‘dots’ (.). For some data types (see the ‘Help
| Data Types’ menu option in MARK), the ‘dot’ is interpreted as a missing occasion, which can be
somewhat more convenient than manually setting the encounter probability to 0 for those years.
The remaining lines in the script handle the final formatting (concatenating the encounter data into
a contiguous string, adding a few bits like commented tags, and the ‘1;’ frequency column, and finally
outputting everything to the .INP file).
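In isolation, that concatenation step works like this (a toy illustration with made-up column names and values, mirroring the apply(..., paste, collapse="") line in the script):

```r
# toy pivot dataframe: a Tag column followed by encounter columns
pivot <- data.frame(Tag = "ATS150",
                    y2000 = 1, y2001 = 0, y2002 = 1)
# collapse the encounter columns (columns 2:4 here) into one string
pivot$eh <- apply(pivot[2:4], 1, paste, collapse = "")
pivot$eh   # "101"
```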
Of course, none of this is needed if you use the ‘fake individual approach’ noted earlier. As discussed
in the preceding section on using pivot tables in Excel, you simply enter a ‘fake’ individual into your
spreadsheet right at the start, before you begin ‘reshaping’ the dataframe. For this ‘fake’ individual, you
simply add as many rows (where each row corresponds to a sampling occasion) as you have occasions
over the whole data set (in this example, 11 occasions, from 2000 → 2010).
So, for this example, we would add 11 rows for the ‘fake’ individual, one for each year from 2000 → 2010.


Why do this? By introducing this ‘fake’ individual, there is an encounter record (i.e., row) for
every possible year in the data set. If you think about it, this should ensure that when we ‘pivot’ the
spreadsheet there is a column for all years.
This is exactly what we see – if you look at the pivot dataframe before the script starts ‘looking’ for
missing occasions, you’ll see that including the ‘fake individual’ has already accomplished what we
needed – since all occasions (years) are now in the dataframe:

> pivot
Tag 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010
1 ATS150 1 0 1 1 0 0 0 0 0 0 0
2 ATS151 0 0 0 0 0 0 1 0 0 0 0
3 ATS153 0 0 0 0 1 0 0 0 0 0 0
4 ATS155 0 1 0 0 0 1 0 0 0 1 1
5 ATS156 0 0 0 0 0 0 1 0 0 0 0
6 ATS157 1 0 0 0 0 0 0 0 0 0 0
7 ATS158 1 0 0 0 0 0 1 0 0 0 0
8 ATS159 0 0 0 1 0 0 1 0 0 0 0
9 ATS160 0 0 0 0 0 0 1 0 0 0 0
10 ATS161 0 0 0 0 0 0 1 0 0 0 0
11 ATS164 0 0 0 1 0 0 1 0 0 0 0
12 ATS165 1 0 0 0 0 0 1 0 0 0 0
13 ATS166 0 0 0 0 1 0 0 0 0 0 0
14 ATS167 0 1 1 0 0 0 0 0 0 0 0
15 fake 1 1 1 1 1 1 1 1 1 1 1

All that remains is to format the data frame, by concatenating the encounter data into a contiguous
string, adding a few bits like commented tags, and the ‘1;’ frequency column, outputting everything
to the .INP file, and then deleting the row corresponding to the ‘fake individual’.
So, the R code to ‘find and insert’ missing occasions as shown in the script would only be needed if
you didn’t first introduce the ‘fake individual’ into the original spreadsheet. Use whichever approach
you find easiest to understand, and modify.
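If you work in Python rather than R, the same ‘fake individual’ workflow can be sketched with pandas (the tags, years, and output file name below are illustrative, not the chapter’s full data set):

```python
import pandas as pd

# Long-format detections: one row per (tag, year) an animal was seen.
det = pd.DataFrame({
    "Tag":  ["ATS150", "ATS150", "ATS151", "ATS157"],
    "Year": [2000, 2002, 2006, 2000],
})

# Add a 'fake' individual seen in EVERY year, so the pivot is guaranteed
# to produce a column for all 11 occasions (2000-2010), even for years
# with no real detections (e.g., 2007 and 2008).
years = list(range(2000, 2011))
det = pd.concat([det, pd.DataFrame({"Tag": "fake", "Year": years})],
                ignore_index=True)

# Pivot to a Tag x Year table of 1/0 encounter indicators.
pivot = (det.assign(seen=1)
            .pivot_table(index="Tag", columns="Year", values="seen",
                         fill_value=0, aggfunc="max"))

# Drop the 'fake' row, then write MARK-style .inp lines:
#   /* ATS150 */ 10100000000 1;
pivot = pivot.drop(index="fake")
with open("example.inp", "w") as f:
    for tag, row in pivot.iterrows():
        history = "".join(str(int(v)) for v in row)
        f.write(f"/* {tag} */ {history} 1;\n")
```

The ‘fake’ row plays exactly the role described above: it forces the pivot to include a column for every occasion, and is deleted before the file is used.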

Other data types

Here we will consider two other data types: the robust design, and multi-state. Clearly, there are more
data types in MARK, but these two are very common, and if you understand the steps in
formatting .INP files for these two data types, you’ll more than likely be able to figure out other data
types on your own.
Coming soon...



CHAPTER 3

First steps in using Program MARK. . .

In this chapter we will introduce the basic mechanics of running program MARK, using a small data set
consisting of 7 years of capture-recapture data on a small passerine bird, the European Dipper (Cinclus
cinclus). This data set is the same as that used in ‘Examples’ in Lebreton et al. (1992), and consists of
marking and recapture data from 294 breeding adults each year during the breeding period, from early
March to 1 June. All birds in the sample were at least 1 year old when initially banded. We’ll forgo
discussion of GOF (Goodness of Fit) testing for the moment, although we emphasize that, in fact, this
is the prerequisite step before you analyze your data. GOF testing is covered later.
Our main intent here is to show you the basics of running MARK, not to provide great detail on
‘why you are doing what you are doing’. Our experience has shown that perhaps the greatest initial
hurdle to using a new piece of software, especially one as sophisticated as MARK, is the ‘newness’ of
the interface, and the sheer number of options available to the user. In addition, we are starting with
the Dipper data set, since it has been extensively analyzed in several places, and will be very familiar
to many experienced users migrating to MARK from another application.
We also believe that starting with a ‘typical’ mark-recapture problem is a good place to begin – if
you understand how to do a mark-recapture analysis, you’re much of the way to understanding the
principles behind many of the other analytical models incorporated into MARK. The male subset of
the Dipper data set is not one of those which are ‘bundled’ with MARK (although the full dipper data
set is). We’ll assume here that you’ve managed to extract the Dipper data set ED_MALES.INP from the
markdata.zip file that accompanies this book (if you don’t have markdata.zip, you can download it
from the same website you downloaded this book from).

3.1. Starting MARK

Since MARK is a true Windows application, starting it is as simple as double-clicking the MARK icon
(which we assume resides in some specified folder on your desktop). Locate the icon, and double-click
on it. MARK is a fairly large program (although this is now a relative statement with the advent of 100+
MB word processors!), and may take a few moments to start up. If all goes well, you should soon be
presented with the opening ‘splash screen’ (shown at the top of the next page – the particular ‘warm and
fuzzy organism’ you see will depend on which version of MARK you are using), indicating that MARK
is ‘up and running’. If nothing happens (or you get some typically obscure Windows error message),
this is a good indication that something is not working right.
Unfortunately, figuring out the problem depends to a large degree on things like: how many other
applications do you have open? Are you sure you’ve installed the most recent version of MARK? How

© Cooch & White (2019) 08.06.2019



much memory is in your machine? Are you ‘lucky’ enough to still be running Windows Vista? And so
on, and so on. MARK is very robust on most machines, so problems getting it started should be very
infrequent. If it doesn’t start correctly, then try again after first closing all other running applications
(in general, MARK runs ‘best’ when it is the only program running). If that doesn’t work, then try
re-installing from scratch – if you’ve already downloaded the setup.exe file, then this should only take
a few moments (again, when in doubt, try reinstalling – this is often a good way to ‘fix’ minor problems
that might arise).

3.2. starting a new project

The very first thing you need to do is to tell MARK you’re going to start a new project (we’re assuming
that at this stage, you don’t have any existing MARK projects going). Doing this is very easy – all you
need to do is pull down the ‘File’ menu in the upper left-hand corner of the main MARK window, and
select ‘New’ from the drop-down menu:

Once you have selected the ‘New’ option from the drop-down ‘File’ menu, the graphical ‘splash-
screen’ that you saw when you started MARK will be erased, and you will be presented with a new

sub-window – the specification window for program MARK (below):

Clearly, the specification window contains a fair bit of information, but the ‘basics’ can be broken
down into 4 main sections.
First, on the left of the window (highlighted by a bezeled line) is a simple radio-button list for the
various data types MARK can handle. In fact, this is the point at which you tell MARK what kind
of analysis you want to do. In its simplicity, this list belies the sheer scope of analytical coverage
provided by MARK. Whereas most previous applications specialized on one (or a couple) of types
of analysis (for example, live recapture-only, or dead recovery-only), MARK can handle most of the
common analytical designs in use today. While MARK is clearly not a replacement (in some ways) for
a more general purpose approach like SURVIV (or, more recently, R, MATLAB or WinBUGS), it is in
many respects a replacement for much (if not all) of the ‘canned’ software previously in general use.
The ‘live recaptures (CJS)’ data type is selected by default – and, is the data type we want for this
analysis. At the top of the right-hand half of the window is a fill-in element for the ‘Title’ of the project.
This is available as a convenience to the user, and does not have to be filled out. For this example, we
use a title reflecting the fact we’re analyzing the male data from the European Dipper study:

Immediately below the title field is a second fill-in element where the user specifies the file containing
the encounter histories they want to analyze. If the file name and path are known, they can be entered
directly into this box. More typically, you will want to browse for the file (i.e, select it manually), by
clicking on the ‘Click to Select File’ button immediately below the box (you can also double-click

the fill-in box itself). If you click on this button, you will be presented with the standard Windows
interface to finding and selecting a particular file (see below).

One thing to note is that until you have selected a file to analyze, the ‘View File’ button is not available
(i.e., is not active). Once you have selected the file, you will be able to view its contents. This is useful if
you forget certain things about your data (for example, the number of occasions). In our example, we
select the file ED_MALES.INP (obviously, the path shown here reflects the machine we’re running MARK
on, and will not be the same as what might appear on your machine).

Again, note that once the file has been specified, the ‘View File’ button becomes active. If we click on
View File, MARK starts up the default Windows browser (usually the Windows Notepad – if your file
is too large for Notepad to handle, then Windows will prompt you to use an alternative application to
view the file). We see from the Notepad window (pictured below) that we have 7 occasions, and only
1 group (recall from Chapter 2 the basics of formatting data for processing by program MARK).

Finally, the bottom-half of the right hand side. Here, you specify the number of occasions in the file.
Because the .inp files containing the capture histories do not explicitly code for the number of
occasions in the file, you will have to tell MARK how many occasions there are. It defaults to 5 – do not
be fooled into thinking this is the number of occasions in your file. MARK has no way of knowing what
this value is – you have to enter it explicitly. You can also tell MARK how many ‘attribute groups’
are in the data (for example, if your file contained data for both males and females, you would enter 2).

Finally, the number of individual covariates. (Note: the boxes for the ‘number of strata’ and ‘mixtures’
only become available if you select the appropriate data types which require specifying these options.)
To the right of each input box, there is a button which allows you to control some aspects of each
box. For example, to the right of the number of occasions box is a button which lets you set the intervals
between each occasion. The default is ‘1 time period between each occasion’. We’ll talk more about
this particular option later. The other two buttons for ‘group labels’ and ‘individual covariate’
names are (we suspect) fairly self-explanatory. For this analysis, since we have only one attribute group
(males), there is no real need to specify a particular label.
In our Dipper example, we have 7 occasions, and only 1 group (as shown below):

Now, once you’ve got everything set in the specification window, you’re ready to proceed (if not,
simply correct the individual entries as needed, or cancel to start over). To continue, simply click the
‘OK’ button.
The first thing that will happen when you press the ‘OK’ button (assuming that you’ve specified a file
that exists) is that MARK will close the specification window, and present you with a small little pop-up
window telling you that it has created a DBF (data-base format) file to hold the results of your analyses.
The name of the file in this example, is ED_MALES.DBF. MARK uses the prefix of the .inp data file (in
this case, ED_MALES) as the prefix of the DBF file (resulting in ED_MALES.DBF). MARK pauses until you
press the ‘OK’ button, telling it to proceed. If ED_MALES.DBF already existed (i.e., if you’d already done
some analysis on these data), MARK would inform you that it will have to overwrite the existing file,
and will then ask you if this is OK.
Since we haven’t run any analysis on these data before, we simply click the ‘OK’ button. Doing so
causes the pop-up window to close, and a window containing a ‘triangular’ matrix representation of
the survival model structure is presented:


However, you’ll note from the title of the window that the matrix represents only the survival
parameters. For a mark-recapture analysis, we’re also interested in estimating the recapture probability.
What you don’t see here is that, by default, MARK initially presents you with only the survival
parameters, assuming (presuming?) that this is what you’re most interested in working with. However,
clearly we may (and probably should) also be interested in modelling the other parameters which
define the system (in this case, the recapture parameter). For now, let’s get MARK to show us both
parameter matrices. We get MARK to show us any (or all) of the parameter matrices by accessing the
‘Parameter Information (or Index) Matrix’ menu (PIM), and selecting the ‘Open Parameter Index
Matrix’ option:

Once you select this option, you will be presented with a new window containing a list of parameters
which you can ‘open’ – in other words, a list of parameters you can ask MARK to show you in the
‘triangular matrix’ format. In our example, since the survival parameter chart is already open, the only
one which will show up on the list is the recapture parameter. You can either select it individually, or
use the ‘Select All’ button. Once you have selected the recapture parameter matrix, you simply click
‘OK’, to cause MARK to place the recapture PIM in its own window. Initially, they may (or likely will)
be presented in an overlapped way (technically known in Windows-jargon as ‘cascaded’). To see the 2
PIM windows separately, you can access the ‘Window’ menu, and select ‘Tile’. This will cause the PIM
windows to be arranged in a ‘tiled’ (i.e., non-overlapping) fashion.

If you’ve had some background in mark-recapture analysis, you might recognize that these PIM’s
reflect the fully time-dependent Cormack-Jolly-Seber (CJS) model. If you’re new to mark-recapture
analysis, not to worry – all of this gets covered in great detail in subsequent chapters. For the moment,
all you need to know is that in this analysis, the survival parameters are numbered 1 to 6 (there are seven
occasions, so six intervals over which survival is estimated), and there are 6 recapture parameters (numbered 7
through 12). In fact, there are only 10 separable parameters and one ‘product’ parameter estimated in
this analysis (11 total parameters), but we’ll discuss the details concerning this later.


3.3. Running the analysis

Let’s proceed to run the analysis. Not surprisingly, to run an analysis in MARK, you need to access the
‘Run’ menu. In this example (and at this stage), when you access the ‘Run’ menu, MARK will present
you with several options – for now, we want to select the ‘Current Model’ option.
The first thing that MARK does is present you with a new window – the ‘Setup Numerical Estimation
Run’ window (since this is a rather awkward window name, we’ll refer to it simply as the ‘Run’ window).
Much like the specification window we saw earlier, the ‘Run’ window (shown on the next page) contains
a lot of options, so let’s take some time at this point to get you familiar with the components of this
window. As with the specification window, the run window can be broken down into 4 major elements.
First, in the top left, the analysis title (which MARK carries over from what you entered earlier in the
specification window), and the model name (which is initially blank). MARK must have a model title to
refer to (the reason why will be obvious later), so we need to enter something in the model name input
box. Following the naming convention advocated in Lebreton et al. (1992) (which follows closely from
standard linear models notation), we’ll use ‘Phi(t)p(t)’ as the name for this model, reflecting the fact
that the model we’re fitting has full time-dependence for both survival (ϕ – since MARK doesn’t allow
you to enter non-ASCII text, we’re restricted to writing out the Greek symbol as ‘phi’) and recapture
(p). The (t)-notation simply indicates time-dependence.
Next, the lower-left hand side of the run window. This is where we specify the link function (MARK
lets you choose among several different functions – the sin link is the default, while the logit link is
perhaps the most familiar). We’ll talk about link functions in more detail later. For the moment, we’ll
use the default sin link.
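Both links map an unconstrained ‘beta’ parameter onto the [0, 1] probability scale; as a quick sketch, here are the two standard back-transformations (the function names are ours, not MARK’s):

```python
import math

def sin_link_inverse(beta):
    # sin link: real parameter = (sin(beta) + 1) / 2, bounded on [0, 1]
    return (math.sin(beta) + 1.0) / 2.0

def logit_link_inverse(beta):
    # logit link: real parameter = exp(beta) / (1 + exp(beta))
    return 1.0 / (1.0 + math.exp(-beta))

# beta = 0 maps to 0.5 under both links
print(sin_link_inverse(0.0), logit_link_inverse(0.0))
```

Either way, estimation is done on the unconstrained beta scale, and the ‘real’ estimates you see in the output are the back-transformed values.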

To the right of this list of link functions is a list of two ‘Var. Estimation’ options – these
options control how the variance-covariance matrix is estimated. This is important because
this estimation provides both the information needed to compute the standard errors of the estimates,
and is used to calculate the number of estimable parameters in the model (in this example, there
should be 11 estimable parameters). The ‘2nd Part’ option is the default and preferred option, so we’ll
use it here. Finally, just above the link and variance estimation lists is a button which allows you to ‘fix
parameters’. As we’ll see in subsequent chapters, there will be occasions when we need to specify a value

(normally 0 or 1) for some parameters – in this example, there is no need to do so. Fixing parameters that
are logically constrained to be a particular value is useful, since it both reduces the number of estimable
parameters, and helps MARK estimate the remaining parameters more accurately and efficiently.
On the right-hand side of the run window are a series of program options related to the numerical
optimization. Most of the options are fairly self-explanatory, so we won’t go into any of these in detail
here. For now, we’ll leave all of the check boxes on the right-hand side blank.
You’re now ready to run the program. To do this, simply click on the ‘OK to run’ button in the lower
right-hand corner of the run window. If you are using the default preferences for MARK (which you
probably will be at this point), MARK will spawn a box asking you if you want to use the ‘identity
matrix’ (since none was specified). The use of the identity matrix will be covered in much more detail
in Chapter 6 – for now, simply accept the ‘identity matrix’, by clicking ‘Yes’. Immediately after doing
so, MARK spawns a numerical estimation window (black background). This window, which is a shell
to the numerical estimation routines in MARK, allows you to watch the progress of the estimation.
Several things will scroll by pertaining to the estimation. Much of the information being printed to the
estimation window can be also printed out to the results DBF, by specifying the appropriate options
in the run window. You may also notice that MARK has created another window, called the ‘Results
Browser’ – more on this window in a moment. Once the estimation is complete (which should only take
a second or two for this dataset), MARK will shut down this command window, and then present you
with the ‘pop-up’ window shown below:

This window provides a summary of some of the key results of your estimation (the number of
parameters estimated, the model deviance and so forth). The purpose of this summary is to give you
the option of whether or not to append the full results of the estimation to the result database file (DBF).
In theory, you could decide at this point that there was something ‘wrong’, and not append the results.
You might wonder about the **WARNING** statement at the bottom – if you read it carefully, you’ll see
that MARK is ‘warning’ you that at least one pair of encounter histories in the .INP file are duplicates.
In fact, for .INP files where each line represents the encounter history for a different individual (as in the
present example), this is entirely expected. Meaning, there is nothing to worry about in this case. So we
click ‘Yes’, and accept the results. As soon as you click ‘Yes’, two things happen in rapid succession. First,
the results summary window closes. Next, an item is added to the results browser window (below):


3.4. Examining the results

Now that we have something to ‘look at’ in the browser window, we can focus a bit more on the structure
of the window, and what options are available. First, in the main body of the results browser, you have
several columns of information. From left to right: the model title (or simply, ‘Model’), the corrected
or adjusted Akaike’s Information Criterion (AICc ), the ∆AIC value (the difference in the value of the
AIC from the model currently in the browser having the lowest AIC – since we have only one model in
the browser at the moment, ∆AIC = 0), the AIC weight, the model likelihood, the number of estimated
parameters, and the model deviance. The column arrangement can be modified to suit your preferences
by simply dragging the columns to the left or right. The AICc , the number of parameters and the model
deviance (in particular) form the basis for comparison with other models. Since we only have one model
at the moment, we’ll defer discussion of these issues until later. Along the top of the browser window
are a number of buttons, which you can use to access a variety of functions. More on these later.
For now, we want to view the results of our estimation. We can view just the reconstituted parameter
estimates (on the ‘real probability scale’) by clicking on the fourth button (from the left) on the menu,
or by selecting ‘Output | Specific Model Output | Parameter Estimates | Real’. This causes a
Windows Notepad window to open, containing the estimates (shown below).

The MARK output consists of 5 columns: the first column is the parameter index number (1 → 12),
followed by the parameter estimate, the standard error of the parameter, and the lower and upper 95%
confidence limits for the estimates, respectively. The parameter indexing relates to the indexing used
in the PIMs we saw earlier.

3.4.1. MARK, PIMs, and parameter indexing

Let’s stop a moment for a quick introduction to the indexing scheme that MARK uses. Consider the
following figure:

ϕ1 ϕ2 ϕ3 ϕ4 ϕ5 ϕ6
1 −→ 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
p2 p3 p4 p5 p6 p7

In our analysis of the male Dipper data, recall that we have 7 occasions – the initial marking occasion,
and 6 subsequent recapture (or resighting) occasions. The ϕ i values represent the survival probabilities
between successive occasions (i.e., ϕ i is the probability of surviving from occasion i to occasion i + 1),
while the p i values represent the recapture probabilities at the ith occasion. For details, see Lebreton et
al. (1992).
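To make the notation concrete, the probability of any particular encounter history is simply a product of these ϕ and p terms. Here is a sketch (ours, not MARK output), valid for histories whose final ‘1’ falls on the last occasion, so the ‘never seen again’ terms are not needed:

```python
def cjs_history_prob(history, phi, p):
    """Probability of a CJS encounter history, conditional on first release.

    history: string like '1011' (the first '1' is the release occasion)
    phi[i]:  survival from occasion i+1 to i+2 (0-indexed list)
    p[i]:    recapture probability at occasion i+2 (0-indexed list)
    Assumes the animal is detected on the final occasion, so the
    probability of never being seen again is not required.
    """
    first = history.index("1")
    prob = 1.0
    for occ in range(first + 1, len(history)):
        prob *= phi[occ - 1]                          # survive the interval
        seen = history[occ] == "1"
        prob *= p[occ - 1] if seen else (1.0 - p[occ - 1])
    return prob

# History '1011' over 4 occasions: phi1*(1-p2)*phi2*p3*phi3*p4
phi = [0.6, 0.6, 0.6]
p   = [0.5, 0.5, 0.5]
print(cjs_history_prob("1011", phi, p))  # 0.6*0.5*0.6*0.5*0.6*0.5 = 0.027
```

For history ‘1011’ this gives ϕ1 (1 − p2) ϕ2 p3 ϕ3 p4 – the product read directly off the diagram above.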


What MARK does is to substitute a numerical indexing scheme for the individual ϕ i and p i values,
respectively. For example, consider the indexing for the survival parameters ϕ i – there are 6 values, ϕ1
through ϕ6 . How does this connect to the ‘survival PIM’?

ϕ1 ϕ2 ϕ3 ϕ4 ϕ5 ϕ6
1 −→ 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
p2 p3 p4 p5 p6 p7


1 2 3 4 5 6
1 −→ 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
7 8 9 10 11 12

What about individuals captured for the first time and marked at the second occasion? Technically,
we would refer to such individuals as being in the second release cohort (a cohort is simply a group of
individuals that bear something in common – in this case, individuals captured, marked and released
on the second occasion would comprise the second cohort). This is similar for individuals captured,
marked and released on the third occasion, and so forth. All we need to do to account for these different
release cohorts is to start the ‘indexing’ at the appropriate occasion for each cohort – this is shown
schematically, below:



occasions:   1 −→ 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7

             survival indices (ϕ)     recapture indices (p)
cohort 1:     1  2  3  4  5  6         7  8  9 10 11 12
cohort 2:        2  3  4  5  6            8  9 10 11 12
cohort 3:           3  4  5  6               9 10 11 12
cohort 4:              4  5  6                 10 11 12
cohort 5:                 5  6                    11 12
cohort 6:                    6                       12

Now, have another look at the PIMs for survival and recapture:


The numbers 1 to 6 in the PIM correspond to the survival values ϕ1 to ϕ6 , respectively. For the
recapture PIM, the numbers 7 to 12 correspond to the recapture probabilities p2 to p7 , respectively.
Notice that MARK indexes the survival parameters first (1 to 6), followed in numerical sequence by
the index values for the recaptures (7 to 12). In other words, the indexing that MARK uses does not
correspond to the number of the particular interval or occasion involved. For example, the recapture
index 8 corresponds to the recapture probability at occasion 3 (i.e., p3 ). Although it can be a little
confusing at first, with a bit of work you should be able to see the basic connection between the ‘true’
parameter structure of the model, and the way in which MARK indexes it, both internally (which
becomes very important later on as we examine more complex models), and in the output file.
We’ll be using PIMs frequently throughout the rest of the book, so not to worry if you don’t
immediately see the connection – you’ll get lots of practice. But, make sure you spend some time trying
to grasp the connection now. This is very important.
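The triangular indexing pattern generalizes to any number of occasions. A short sketch (the function name is ours) that reproduces the cohort-by-cohort rows of MARK’s default fully time-dependent CJS PIMs:

```python
def cjs_time_dependent_pims(n_occasions):
    """Survival and recapture PIMs for the fully time-dependent CJS model.

    Row i (0-indexed) is release cohort i+1. Survival indices run
    1..(n-1); recapture indices continue at (n-1)+1..2(n-1), matching
    MARK's default numbering scheme.
    """
    n_int = n_occasions - 1                      # number of intervals
    survival = [list(range(cohort + 1, n_int + 1))
                for cohort in range(n_int)]
    recapture = [list(range(n_int + cohort + 1, 2 * n_int + 1))
                 for cohort in range(n_int)]
    return survival, recapture

surv, recap = cjs_time_dependent_pims(7)
print(surv[0])   # [1, 2, 3, 4, 5, 6]
print(recap[0])  # [7, 8, 9, 10, 11, 12]
print(surv[5])   # [6] – cohort 6 spans only the last interval
```

With 7 occasions this yields survival indices 1 to 6 and recapture indices 7 to 12, exactly as in the PIMs above.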
So, getting back to the output:

Note that our survival estimates correspond to parameters 1 to 6, while 7 to 12 correspond to the
recapture probabilities (corresponding to the figures on the preceding page). But, if you look carefully,
you’ll notice that the estimates for index 6 (the survival probability from occasion 6 to occasion 7)
and index 12 (the recapture probability at occasion 7) are identical (0.76377). As discussed in detail in
Lebreton et al. (1992), and elsewhere in this book, this reflects the fact that, for this model, the survival
and recapture rates for the last interval are not individually identifiable. The value of 0.76377 is actually
the square-root of the estimated product ϕ6 p7 (this is denoted as β7 in Lebreton et al. 1992). Thus, MARK
has estimated 11 parameters: 5 survival probabilities (ϕ1 to ϕ5 ), 5 recapture probabilities (p2 to p6 ), and
one related to the product ϕ6 p7 . You may also notice that the SE and 95% CI for one of the estimates
(p̂3 ) is clearly ‘suspect’ – we’ll deal much more with ‘problems with the estimates’ later.
At this point, you can do one of several things. You can print your estimates, you can plot them, or
you can examine aspects of the estimation procedure, and look at the degree to which any given capture
history in your .INP file affects your estimates. We’ll defer all the options for the moment, and simply
close the notepad window with the estimates, and look back at the main browser window itself.
Much of what MARK shows you in the browser is important only in the context of comparing the fit of
one model with that of another. In order to demonstrate this, we’ll continue our exploration of the Dipper
data set, by running 3 additional models: {ϕ t p · }, {ϕ· p t } and {ϕ· p · }, corresponding to time-varying
survival and constant recapture ({ϕ t p · }), constant survival and time-varying recapture ({ϕ· p t }), and
constant survival and recapture ({ϕ· p · }), respectively. There are many more models we could try to fit,
but for the purposes of this exercise (which is intended to give you a more complete sense of how the
results browser works), we’ll run these three.
There are a number of ways you could specify these models in MARK. For the moment, we’ll use
a ‘short-cut’ method which is convenient for some standard models. The short-cut makes use of some
‘built-in’ models in MARK. Note: you will be strongly discouraged from ever using this short-cut again,

but for the moment, it’s convenient.


If you pull down the ‘Run’ menu, you will see that there are several menu options, beside the ‘run
the current model’ we used earlier. The two options of interest at this point are: ‘pre-defined’ models
and ‘file’ models. For the moment, we’re interested only in the pre-defined models.

Once you have selected the ‘pre-defined models’ option from the ‘Run’ menu, you will be dumped
directly back into the ‘Run’ window. However, there is one important, but easily overlooked change to
the window. Where in the first instance there was a button to fix parameter estimates, note that now
this button has been replaced with a ‘Select models’ button. Everything else is the same, so you have
to be observant. The other difference is that the model title box has been eliminated. In a moment you’ll
see why. Click on the ‘Select models’ button.

Once you’ve clicked on the appropriate button, you’ll be presented with a tabbed-window (shown at
the top of the next page), where each tab represents one of the parameters in the models you’re working
with (in this case, ϕ and p). All you need to do is (i) select the parameter by clicking the appropriate tab,
and (ii) select the model structure(s) you want for each parameter in turn. For example, the following
shows the ϕ parameter – we’ve selected only the time-invariant ‘dot’ model {·}. [Don’t worry about the
‘design matrix’ option for the moment – much more on that in Chapter 6.] Selecting both t and · for both
parameters would yield 4 final models: {ϕ t p t }, {ϕ t p · }, {ϕ· p t }, and {ϕ· p · }. For the moment, we’re only
interested in the last 3 (since we’ve already built model {ϕ t p t }).


So, simply select the {·} option for the ϕ parameter, as shown above. Note that at this point, MARK
tells you (in the lower-right hand corner of the window) that the number of models selected to run is 0
– this is because we haven’t yet defined the structure for the other parameter p yet. Click the tab for the
p parameter, and select both the {t} and {·} models. Now, MARK reports that you’ve selected 2 models:
{ϕ· p · } and {ϕ· p t }. Click the ‘OK’ button for the 2 models we’ve just selected. This brings you back to
the ‘Set Up Numerical Estimation Run’ window again. Click the ‘Run’ button, which will run these 2
models, and automatically add the results to the browser. However, we also want to run model {ϕ t p · }.
Simply go through this process again (i.e., running a pre-defined model), except this time, select {t} for
the ϕ parameter, and {·} for the p parameter. Click ‘OK’, and run this model, adding the results to the
browser.
Note that when running pre-defined models, you still do not have the option of fixing parameters, and
there is still no input box to specify the model title. The reason? Simple – MARK figures that if you’re
using the pre-defined models, you’re willing to accept the parameter structure and model names that
MARK uses for defaults. Of course, MARK still allows you to choose among the various link functions,
variance estimation routines, and other options normally associated with the run window. If you’re
satisfied with what is set for these other options, then simply click the ‘OK to run’ button in the lower
right-hand corner of the run window.
Running pre-defined models in MARK is somewhat different from the way they run normally (i.e.,
the way our initial model ran). First, running pre-defined models does not cause MARK to spawn a
command window showing the progress of the numerical estimation. Second, MARK does not ask you
after each model is finished whether or not you want to add the results to the results browser – MARK
assumes you do, and sends the results to the browser automatically.
Once the processing of the models is complete, you will note that they have been added to the results
browser:

However, note that the results are not listed in the order in which the models were processed. In fact,
if you’d been watching the results browser while MARK was processing the list of models, you might
have noticed that the ordering of the models in the list changed with the completion of each model.
MARK is ordering (or sorting) the results on a particular criterion – in this case, in ascending
sequence starting with the model with the lowest AICc (Akaike’s Information Criterion, corrected for small sample size) value (for the
Dipper example, this corresponds to model ‘Phi(.)p(.)’ – constant survival and constant recapture).
The sorting criterion can be controlled using the ‘Order’ menu.
We will talk a lot more about AICc in subsequent chapters. For the moment, accept that the AICc is
a good, well-justified criterion for selecting the most parsimonious model (i.e., the model which best
explains the variation in the data while using the fewest parameters). In a very loose sense, we might
state that the model with the lowest AICc is the ‘best’ model (although clearly, what is ‘best’ or ‘worst’
depends upon the context). The results browser shows you the AICc for each model, as well as the
arithmetic difference between each model and the top model (i.e., the one with the lowest AICc value).
For example, model ‘Phi(t)p(.)’ has an AICc value of 330.06, which is 7.50 units larger than the AICc
for the most parsimonious model (model ‘Phi(.)p(.)’, which has an AICc value of 322.55).
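As a purely illustrative aside (MARK does this sorting internally; the function name below is our own invention, not part of MARK), the browser’s ranking rule is easy to sketch in a few lines of Python using the two AICc values reported above:

```python
def rank_by_aicc(aicc):
    """Sort models in ascending order of AICc, attaching to each model its
    delta-AICc (the arithmetic difference from the lowest-AICc model)."""
    ranked = sorted(aicc.items(), key=lambda kv: kv[1])
    best = ranked[0][1]
    return [(name, value, value - best) for name, value in ranked]

# AICc values reported in the text for two of the Dipper models
dipper = {"Phi(.)p(.)": 322.55, "Phi(t)p(.)": 330.06}
for name, aicc, delta in rank_by_aicc(dipper):
    print(f"{name:12s}  AICc = {aicc:7.2f}  delta = {delta:5.2f}")
```

The most parsimonious model (lowest AICc) sorts to the top, roughly 7.5 AICc units better than the next model, just as the results browser shows.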
Chapter 3. First steps in using Program MARK. . .


3.4.1. MARK, PIMs, and parameter indexing 3 - 14

The right-most column (by default) is the model deviance. In simple terms, the lower the deviance,
the better the model fits. The technical details of the estimation of the likelihood and deviances are
given in Lebreton et al. (1992). We’ll talk more about the deviance (and related statistics) later.
Perhaps more notably, the difference in deviance between ‘nested’ models (models in which one
model differs from another by the elimination of one or more model terms) is asymptotically distributed as a χ2 statistic
with the degrees of freedom given as the difference in the number of estimable parameters between the
two models. This forms the basis of the likelihood ratio test (LRT). In fact, MARK provides a variety of
‘statistical tests’ for comparing among models, including the LRT. To perform a LRT on the models in
the results browser, simply pull down the ‘Tests’ menu, and select ‘LR tests’. (Note: the
relationship between AIC, model selection, and ‘classical’ statistical tests like
the LRT will be presented in more detail in subsequent chapters.)

MARK will then present you with a window allowing you to select the models you want to compare
using LRT. For now, simply ‘Select all’, and then click the ‘OK’ button. You will be shown the results
of the LRT tests in a Notepad window.

Now, what do we note about the results? Most importantly, we see that not all pairwise comparisons
among models are possible – the comparison between model {ϕ t p · } and model {ϕ· p t } is not calculated.
Why? If you recall from the preceding page, LRT may be applied only to ‘nested’ models. We’ll talk
more about ‘nesting’ in the next chapter, but for now, accept that these 2 models are not nested. As such,
we cannot use an LRT to compare the fit of the 2 models. In one sense (although perhaps not the most
appropriate one), this is one of the advantages of using the AIC to compare models – it works regardless
of whether or not the models are nested. [However, as we will discover in later chapters, there may be
far more important reasons to use AIC as an omnibus (general) model selection tool.]
For the moment, concentrate on the comparison of model {ϕ· p · } with model {ϕ t p · }, the first 2 models
noted in the editor window (above). We note that the difference in the model deviances is 3, with a
difference in the number of parameters of 5. Based on a χ2 distribution, this difference is not significant
at the nominal α = 0.05 level (P = 0.700). In other words, both models fit the data equally well (we cannot
differentiate between them statistically). The model with more parameters fits the data better in terms
of the model deviance, but not so much so as to compensate for the fact that it takes more parameters
to achieve this better fit. Thus, we would conclude that model {ϕ· p · } is the more parsimonious model,
which is consistent with the results using the AICc .
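As a check on the arithmetic above (this is an illustrative stdlib-only sketch, not MARK’s internal code; `chi2_sf` is our own helper), the χ2 tail probability for a deviance difference of 3 on 5 degrees of freedom can be computed directly:

```python
import math

def chi2_sf(x, df):
    """Survival function P(X > x) for a chi-square variable with integer df,
    via the regularized upper incomplete gamma function Q(df/2, x/2)."""
    a, t = df / 2.0, x / 2.0
    if df % 2 == 0:
        # even df: Q(k, t) = exp(-t) * sum_{i=0}^{k-1} t**i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= t / i
            total += term
        return math.exp(-t) * total
    # odd df: start from Gamma(1/2, t) = sqrt(pi) * erfc(sqrt(t)), then apply
    # the recurrence Gamma(a + 1, t) = a * Gamma(a, t) + t**a * exp(-t)
    g = math.sqrt(math.pi) * math.erfc(math.sqrt(t))
    aa = 0.5
    while aa + 0.5 <= a:
        g = aa * g + t ** aa * math.exp(-t)
        aa += 1.0
    return g / math.gamma(a)

# LRT for {phi(.) p(.)} vs. {phi(t) p(.)}: deviance difference 3, df = 5
p_value = chi2_sf(3.0, 5)
print(f"{p_value:.3f}")  # 0.700, matching the reported P-value
```

The test statistic is far from the α = 0.05 rejection region, so the extra 5 parameters of the time-dependent model are not statistically justified.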


Of course, at this point you can browse the estimates, plot them, examine residual plots – simply by
selecting the model you’re interested in (by clicking on it in the results browser). You can also re-run
any particular model, using (for example) a different link function, simply by selecting the model from
the list, retrieving the model (the ‘Retrieve’ menu), and then re-running the model (specifying the new
link function, of course). MARK will process the data, and then ask you if you want to add the new
results to the browser. It’s that easy.
Once you’re done working with this project, all you need to do is exit MARK. All the model results
will be stored away in a DBF file (in this case, called ED_MALES.DBF). Then, if you want to continue work
on this analysis, all you’ll need to do is start MARK, and then ‘Open’ the ed_males.dbf file. That’s it!

3.5. Summary

Congratulations! You’ve just finished your first analysis using program MARK. Of course, the fact that
there are many hundreds of pages left in this book should tell you that there is a lot more left to be
covered. But, you’ve at least gotten your feet wet, and run through a ‘typical’ MARK analysis once
– this is an important first step. If you don’t feel comfortable with what we’ve done so far, go back
through the chapter – slowly. Many of the basic mechanics presented in this chapter (in particular,
the relationship between model ‘structure’ and the PIMs) will be used repeatedly throughout the rest
of the book, so it is important to feel comfortable with them before proceeding much further.



CHAPTER 4

Building & comparing models

In this chapter, we introduce several important concepts.∗ First, we introduce the basic concepts and
‘mechanics’ for building models in MARK. Second, we introduce some of the concepts behind the
important questions of ‘model selection’ and ‘multi-model inference’. How to build models in MARK
is ‘mechanics’ – why we build certain models, and what we do with them, is ‘science’. Both are critical
concepts to master in order to use MARK effectively, and are fundamental to understanding everything
else in this book, so take your time.
We’ll begin by re-visiting the male Dipper data we introduced in the last chapter. We will compare
2 different subsets of models: models where either survival or recapture (or both) varies with time, or
models where either survival or recapture (or both) are constant with respect to time. The models are
presented in the following table, using the notation suggested in Lebreton et al. (1992).

model explanation
{ϕ t p t } both survival and encounter probability time dependent
{ϕ· p t } survival constant over time, encounter probability time dependent
{ϕ t p · } survival time dependent, encounter probability constant over time
{ϕ· p · } both survival and encounter probabilities constant over time

In the following, we will go through the steps in fitting each of these 4 models to the data. In fact,
these models are the same ones we fit in Chapter 3. So why do them again? In Chapter 3, our intent was
to give you a (very) gentle run-through of running MARK, using some of the standard options. In this
chapter, the aim is to introduce you to the mechanics of model building, from the ground up. We will
not rely on ‘built-in’ or ‘pre-defined’ models in this chapter (in fact, you’re not likely to ever use them
again). Since you already saw the ‘basics’ of getting MARK up and running in Chapter 3, we’ll omit
some of the more detailed explanations for each step in this chapter.
However, we must emphasize that before you actually use MARK (or any other program) to compare
different models, you need to first confirm that your ‘starting model’ (generally, the most parameterized
or most general model) adequately fits the data. In other words, you must conduct a goodness-of-fit
(GOF) test for your ‘starting model’. GOF testing is discussed in detail in Chapter 5, and periodically
throughout the remainder of this book. For convenience, we’ll assume in this chapter that the ‘starting
model’ does adequately fit the data.


∗ Very important...

© Cooch & White (2019) 08.09.2019



4.1. Building models – parameter indexing & model structures

As in Chapter 3, start MARK by double-clicking the MARK icon. We’re going to use the same data set
we analyzed in Chapter 3 (ED_MALES.INP). At this point, we can do one of 2 things: (1) we can start a
completely new MARK session (i.e., create a completely new *.DBF file), or (2) we can re-open the *.DBF
file we created in Chapter 3, and append new model results to it. Since you already saw in Chapter 3
how to start a ‘new’ project, we’ll focus here on the second possibility – appending new model results
to the existing *.DBF file.
This is very easy to do – from the opening MARK ‘splash screen’, select ‘Open’ from the ‘File’ menu,
and find the ED_MALES.DBF file you created in Chapter 3 (remember that MARK uses the prefix of the
*.INP file – the file containing the encounter histories – as the prefix for the *.DBF file. Thus, analysis of
ED_MALES.INP leads to the creation of ED_MALES.DBF). Once you’ve found the ED_MALES.DBF file, simply
double-click the file to access it. Once you’ve double-clicked the file, the MARK ‘splash screen’ will
disappear, and you’ll be presented with the main MARK window, and the results browser. In the
results browser, you’ll see the models you already fit to these data in the last chapter (there should be
4 models), and their respective AIC and deviance values.
In this chapter, we want to show you how to build these models from scratch. As such, there is no
point in starting with all the results already in the browser! So, take a deep breath and delete all the
models currently in the browser! To do this, simply highlight each of the models in turn, and click the
trash-can icon in the browser toolbar.
Next, bring up the Parameter Index Matrices (PIMs), which (as you may recall from Chapter 3), are
fundamental to determining the structure of the model you are going to fit to the data. So, the first
step is to open the PIMs for both the survival and recapture parameters. To do this, simply pull down
the ‘PIM’ menu, and select ‘Open Parameter Index Matrix’. This will present you with a dialog box
containing two elements: ‘Apparent Survival Parameter (Phi) Group 1’, and ‘Recapture Parameter
(p) Group 1’. You want to select both of them. You can do this either by clicking on both parameters,
or by simply clicking on the ‘Select All’ button on the right-hand side of the dialog box.
Once you’ve selected both PIMs, simply click the ‘OK’ button in the bottom right-hand corner. This
will cause the PIMs for survival and recapture to be added to the MARK window.
Here’s what they look like, for the survival parameters,


and the recapture parameters, respectively:

We’re going to talk a lot more about PIMs, and why they look like they do, later on in this (and
subsequent) chapters. For now, the only thing you need to know is that these PIMs reflect the currently
active model. Since you deleted all the models in the browser, MARK reverts to the default model –
which is always the fully time-dependent model. For mark-recapture data, this means the fully time-
dependent CJS model.
OK, so now you want to fit a model. While there are some ‘built-in’ models in MARK, we’ll
concentrate at this stage on using MARK to manually build the various models we want to fit to our
data. Once you’ve mastered this manual, more general approach, you can then proceed to using ‘short-
cuts’ (such as built-in models). Using short-cuts before you know the ‘general way’ is likely to lead to
one thing – you getting lost!
Looking back at the table on the first page of this chapter, we see that we want to fit 4 models to the
data, {ϕ t p t }, {ϕ t p · }, {ϕ· p t } and {ϕ· p · }. A quick reminder about model syntax – the presence of a ‘t’
subscript means that the model is structured such that estimates for a given parameter are time-specific;
in other words, that the estimates may differ over time. The absence of the ‘t’ subscript (or, the presence
of a ‘dot’) means the model will assume that the parameter is fixed through time (the use of the ‘dot’
subscript leads to such models usually being referred to as ‘dot models’ – naturally).
Let’s consider model {ϕ t p t } first. In this model, we assume that both survival (ϕ) and recapture (p)
can vary through time. How do we translate this into MARK? Pretty easy, in fact. First, recall that in this
data set, we have 7 total occasions: the first occasion is the initial marking (or release) occasion, followed
by 6 subsequent recapture occasions. Now, typically, in each of these subsequent recapture occasions 2
different things can occur.
Obviously, we can recapture some of the individuals previously marked. However, part of the
sample captured on a given occasion is unmarked. What the investigator does with these individuals
differs from protocol to protocol. Commonly, all unmarked individuals are given a unique mark, and
released. As such, on a given recapture occasion, 2 types of individuals are handled and released: those
individuals which have been previously marked, and those which are newly marked.
Whether or not the fate of these two ‘types’ of individuals is the same is something we can test (we will
explore this in a later chapter). In some studies, particularly in some fisheries and insect investigations,
individuals are only marked at the initial release (sometimes known as a ‘batch mark’). There are no
newly marked individuals added to the sample on any subsequent occasions. The distinctions between
these two types of mark-release schemes are important to understanding the structure of the parameter
matrices MARK uses.
Consider our first model, the CJS model {ϕ t p t } with full time-dependence in both survival and
recapture probabilities. Let’s assume there are no age effects (say, for example, all individuals are marked
as adults – we deal with ‘age’ in a later chapter). In Chapter 3, we represented the parameter structure
of this model as shown below:

ϕ1 ϕ2 ϕ3 ϕ4 ϕ5 ϕ6
1 −→ 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
p2 p3 p4 p5 p6 p7

In fact, this representation is incomplete, since it does not record or index the fates of individuals
newly marked and released at each occasion. These are referred to as ‘cohorts’ – groups of animals
marked and released on a particular occasion.
We can do this easily by adding successive rows to our model structure, each row representing the
individuals newly marked on each occasion. Since the occasions obviously occur sequentially, then each
row will be indented from the one above it by one occasion. This is shown below:

ϕ1 ϕ2 ϕ3 ϕ4 ϕ5 ϕ6
cohort 1 1 −→ 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
p2 p3 p4 p5 p6 p7
ϕ2 ϕ3 ϕ4 ϕ5 ϕ6
cohort 2 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
p3 p4 p5 p6 p7
ϕ3 ϕ4 ϕ5 ϕ6
cohort 3 3 −→ 4 −→ 5 −→ 6 −→ 7
p4 p5 p6 p7
ϕ4 ϕ5 ϕ6
cohort 4 4 −→ 5 −→ 6 −→ 7
p5 p6 p7
ϕ5 ϕ6
cohort 5 5 −→ 6 −→ 7
p6 p7
ϕ6
cohort 6 6 −→ 7
p7

Notice that the occasions are numbered from left to right, starting with occasion 1. Survival proba-
bility is the probability of surviving between successive occasions (i.e., between columns). Each release
cohort is listed in the left-hand column.
For example, some individuals are captured and marked on occasion 1, released, and potentially can
survive to occasion 2. Some of these surviving individuals may survive to occasion 3, and so on. At
occasion 2, some of the captured sample are unmarked. These unmarked individuals are newly marked
and released at occasion 2. These animals comprise the second release cohort. At occasion 3, we take a
sample from the population. Some of the sample might consist of individuals marked in the first cohort
(which survived to occasion 3), some would consist of individuals marked in the second cohort (which
survived to occasion 3), while the remainder would be unmarked. These unmarked individuals are
newly marked, and released at occasion 3. These newly marked and released individuals comprise the
third release cohort. And so on.
If we rewrite the cohort structure, showing only the sampling occasion numbers, we get the following
structure:


cohort 1 1 −→ 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
cohort 2 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
cohort 3 3 −→ 4 −→ 5 −→ 6 −→ 7
cohort 4 4 −→ 5 −→ 6 −→ 7
cohort 5 5 −→ 6 −→ 7
cohort 6 6 −→ 7

The first question that needs to be addressed is: does survival vary as a function of which cohort an
individual belongs to, does it vary with time, or both? This will determine the indexing of the survival
and recapture parameters. For example, assume that cohort does not affect survival, but that survival
varies over time. In this case, survival can vary among intervals (i.e., among columns), but over a given
interval (i.e., within a column), survival is the same over all cohorts (i.e., over all rows). Again, consider
the following cohort matrix – but showing only the survival parameters:

ϕ1 ϕ2 ϕ3 ϕ4 ϕ5 ϕ6
cohort 1 1 −→ 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
ϕ2 ϕ3 ϕ4 ϕ5 ϕ6
cohort 2 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
ϕ3 ϕ4 ϕ5 ϕ6
cohort 3 3 −→ 4 −→ 5 −→ 6 −→ 7
ϕ4 ϕ5 ϕ6
cohort 4 4 −→ 5 −→ 6 −→ 7
ϕ5 ϕ6
cohort 5 5 −→ 6 −→ 7
ϕ6
cohort 6 6 −→ 7

Within a given column (i.e., interval), survival is constant over cohorts, but the changing subscripts in ϕ i
indicate that survival may change over time. This is essentially Table 7A in Lebreton et al. (1992). What
MARK does to generate the parameter or model structure matrix is to reproduce the structure and
dimensions of this figure, after first replacing the ϕ i values with a simple numerical indexing scheme,
such that ϕ1 is replaced by the number 1, ϕ2 is replaced by the number 2, and so forth. Thus, the
preceding figure (above) is represented by a triangular matrix of the numbers 1 to 6 (for the 6 survival
probabilities):

1 2 3 4 5 6
2 3 4 5 6
3 4 5 6
4 5 6
5 6
6

This ‘triangular matrix’ (the PIM) represents the way that MARK ‘stores’ the model structure
corresponding to time variation in survival, but no cohort effect (Fig. 7A in Lebreton et al. 1992). Notice
that the dimension of this matrix is (6 rows by 6 columns), rather than (7 rows by 7 columns). This is
because there are 7 capture occasions, but only 6 survival intervals (and, correspondingly, 6 recapture
occasions). This representation is the basis of the PIMs which you see on your screen (it will also be
printed in the output file). Perhaps most importantly, though, this format is the way MARK keeps
track of model structure and parameter indexing. It is essential that you understand the relationships
presented in the preceding figures. A few more examples will help make them clearer.
Let’s consider the recapture probability. If recapture probability is also time-specific, what do you
think the model structure would look like? If you’ve read and understood the preceding, you should
be able to make a reasonable guess. Again, remember that we have 7 sampling occasions – the
initial marking event (occasion 1), and 6 recapture occasions. With time-dependence, and assuming
no differences among cohorts, the model structure for recaptures would be:

cohort 1 1 −→ 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
p2 p3 p4 p5 p6 p7
cohort 2 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
p3 p4 p5 p6 p7
cohort 3 3 −→ 4 −→ 5 −→ 6 −→ 7
p4 p5 p6 p7
cohort 4 4 −→ 5 −→ 6 −→ 7
p5 p6 p7
cohort 5 5 −→ 6 −→ 7
p6 p7
cohort 6 6 −→ 7
p7

Now, what are the corresponding index values for the recapture probabilities? As with survival,
there are 6 parameters, p2 to p7 (corresponding to recapture on the second through seventh occasion,
respectively). With survival probabilities, we simply looked at the subscripts of the parameters, and
built the PIM. However, things are not quite so simple here (although as you’ll see, they’re not very
hard). All you need to know is that the recapture parameter index values start with the first value after
the survival values. Hmmm...let’s try that another way. For survival, we saw there were 6 parameters,
so our survival PIM looked like

1 2 3 4 5 6
2 3 4 5 6
3 4 5 6
4 5 6
5 6
6

The last index value is the number ‘6’ (corresponding to ϕ6 , the apparent survival probability between
occasion 6 and occasion 7). To build the recapture PIM, we start with the first value after the largest value
in the survival PIM. Since ‘6’ is the largest value in the survival PIM, then the first index value used in
the recapture PIM will be the number ‘7’. Now, we build the rest of the PIM. What does it look like?
If you think about it for a moment, you’ll realize that the recapture PIM looks like:

7 8 9 10 11 12
8 9 10 11 12
9 10 11 12
10 11 12
11 12
12
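MARK generates these PIMs for you; purely to make the indexing rule concrete, here is a short Python sketch (the function name `time_pim` is our own, not part of MARK) that reproduces both triangular matrices, with the recapture PIM starting at the first value after the largest survival index:

```python
def time_pim(n_occasions, start=1):
    """Triangular PIM for a time-dependent parameter with no cohort effect:
    one row per release cohort, and every cohort shares the same index
    within a given interval (column)."""
    n = n_occasions - 1  # 7 occasions -> 6 intervals (and 6 recapture occasions)
    return [[start + j for j in range(i, n)] for i in range(n)]

phi_pim = time_pim(7)                  # survival indices 1..6
next_index = max(phi_pim[0]) + 1       # first value after the survival PIM
p_pim = time_pim(7, start=next_index)  # recapture indices 7..12
```

The first row of `phi_pim` is `[1, 2, 3, 4, 5, 6]` and the first row of `p_pim` is `[7, 8, 9, 10, 11, 12]`, matching the two PIMs shown above.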


Do these look familiar? They might – look at the PIMs MARK has generated on the screen. In fact,
we’re now ready to run the fully time-dependent CJS model. We covered this step in Chapter 3,
but let’s go through it again (repetition is a good teacher). In fact, there are a couple of ways you can
proceed. You can either (i) pull down the ‘Run’ menu and ‘Run the current model’ (the model defined
by the PIMs is always the current model), or (ii) click the ‘Run’ icon on the toolbar of either of the PIMs.
This will bring up the ‘Setup’ window for the numerical estimation, which you saw for the first time in
Chapter 3. All you need to do is fill in a name for the model (we’ll use Phi(t)p(t) for this model), and
click the ‘OK to run’ button (lower right-hand corner). Again, as you saw in Chapter 3, MARK will
ask you about the ‘identity matrix’, and then spawn a numerical estimation window. Once it’s finished,
simply add these results to the results browser.
Now, let’s consider model {ϕ t p · } – time-dependent survival, but constant recapture probability.
What would the PIMs for this model look like? The survival PIM would be identical to what we already
have, so no need to do anything there. What about the recapture PIM? Well, in this case, we have constant
recapture probability. What does the parameter structure look like? Look at the following figure:

cohort 1 1 −→ 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
p p p p p p
cohort 2 2 −→ 3 −→ 4 −→ 5 −→ 6 −→ 7
p p p p p
cohort 3 3 −→ 4 −→ 5 −→ 6 −→ 7
p p p p
cohort 4 4 −→ 5 −→ 6 −→ 7
p p p
cohort 5 5 −→ 6 −→ 7
p p
cohort 6 6 −→ 7
p

Note that there are no subscripts for the recapture parameters – this reflects the fact that for this
model, we’re setting the recapture probability to be constant, both among occasions, and over cohorts.
What would the PIM look like for recapture probability? Recall that the largest index value for the
survival PIM is the number ‘6’, so the first index value in the recapture PIM is the number ‘7’. And, since
the recapture probability is constant for this model, then the entire PIM will consist of the number ‘7’:

7 7 7 7 7 7
7 7 7 7 7
7 7 7 7
7 7 7
7 7
7
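Again purely as an illustration (`constant_pim` is our own name, not a MARK function), the ‘dot’ structure is just the same triangular matrix filled with a single index:

```python
def constant_pim(n_occasions, index):
    """PIM for a 'dot' (constant) parameter: every cell holds the same
    index, so one estimate applies across all occasions and cohorts."""
    n = n_occasions - 1  # 6 recapture occasions for 7 sampling occasions
    return [[index] * (n - i) for i in range(n)]

# recapture PIM for model {phi(t) p(.)}: a single index, 7, throughout
p_dot = constant_pim(7, 7)
```

Every cell of `p_dot` holds the value 7, exactly as in the PIM above.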

Now, how do we modify the PIMs in MARK to reflect this structure? As you’ll discover, MARK
gives you many different ways to accomplish the same thing. Modifying PIMs is no exception. The
most obvious (and pretty well fool-proof) way to modify the PIM is to edit the PIM directly, changing
each cell in the PIM, one at a time, to the desired value. For small PIMs, or for some esoteric model
structures we’ll discuss in later chapters, this is not a bad thing to try. However, let’s use one of the
built-in time-savers in MARK to do most of the work for us.
Remember, all we want to do here is modify the recapture PIM. To do this, make that PIM ‘active’
by clicking in the first ‘cell’ (upper left corner of the PIM). You can in fact make a window active by

clicking on it anywhere (it doesn’t matter where – just remember not to click the ‘X’ in the upper right-
hand corner, since this will close the window!), but as we’ll see, there are advantages in clicking in a
specific cell in the PIM. When you’ve successfully selected a cell, you should see a vertical cursor in that
cell.
Once you’ve done this, you can do one of a couple of things. You can pull down the ‘Initial’ menu
on the main MARK parent toolbar. When you do this, you’ll see a number of options – each of them
controlling the value (if you want, the initial value) of some aspect of the active window (in this case, the
recapture PIM). Since we want to have a constant recapture probability, you might guess the ‘Constant’
option on the ‘Initial’ menu would be the right one. You’d be correct. Alternatively, you can right-
click with the mouse anywhere in the recapture PIM window – this will generate the same menu as
you would get if you pull down the ‘Initial’ menu. Use whichever approach you prefer:

Once you’ve done this, you will see that all the values in the recapture PIM are changed to 7.

Believe it or not, you’re now ready to run this model (model {ϕ t p · }). Simply go ahead and click
the ‘Run’ icon in the toolbar of either PIM. For a model name, we’ll use ‘Phi(t)p(.)’. Once MARK is
finished, go ahead and append the results from this run to the results browser.
What you might see is that model ‘Phi(t)p(.)’ (representing model {ϕ t p · }) is listed first, and model
‘Phi(t)p(t)’ second, even though model ‘Phi(t)p(t)’ was actually run first. As you may recall from
our discussions in Chapter 3, the model ordering is determined by a particular criterion (say, the AIC),
and not necessarily the order in which the models were run.
Before we delve too deeply into the results of our analyses so far, let’s finish our list of candidate
models. We have model {ϕ· p t } and model {ϕ· p · } remaining. Let’s start with model {ϕ· p t } – constant
survival, and time-dependent recapture probability. If you think about it for a few seconds, you’ll realize
that this model is essentially the ‘reverse’ of what we just did – constant survival and time-dependent
recapture, instead of the other way around. So, you might guess that all you need to do is reverse the
indexing in the survival and recapture PIMs. Correct again! Start with the survival PIM. Click in the
first cell (upper left-hand corner), and then pull down the ‘Initial’ menu and select ‘Constant’