Object Oriented Data Analysis 1st Edition Marron

The document provides information about the book 'Object Oriented Data Analysis' by J.S. Marron and Ian L. Dryden, which offers a framework for analyzing complex data. It includes various case studies and topics such as data visualization, distance-based methods, and high-dimensional inference. Additionally, it features links to other ebooks available for instant download on ebookmeta.com.

Object Oriented
Data Analysis
MONOGRAPHS ON STATISTICS AND APPLIED PROBABILITY

Editors: F. Bunea, R. Henderson, N. Keiding, L. Levina, R. Smith, W. Wong

Recently Published Titles

Large Covariance and Autocovariance Matrices
Arup Bose and Monika Bhattacharjee 162

The Statistical Analysis of Multivariate Failure Time Data: A Marginal Modeling Approach
Ross L. Prentice and Shanshan Zhao 163

Dynamic Treatment Regimes
Statistical Methods for Precision Medicine
Anastasios A. Tsiatis, Marie Davidian, Shannon T. Holloway, and Eric B. Laber 164

Sequential Change Detection and Hypothesis Testing
General Non-i.i.d. Stochastic Models and Asymptotically Optimal Rules
Alexander Tartakovsky 165

Introduction to Time Series Modeling
Genshiro Kitagawa 166

Replication and Evidence Factors in Observational Studies
Paul R. Rosenbaum 167

Introduction to High-Dimensional Statistics, Second Edition
Christophe Giraud 168

Object Oriented Data Analysis
J.S. Marron and Ian L. Dryden 169

Martingale Methods in Statistics
Yoichi Nishiyama 170

For more information about this series please visit:
https://www.crcpress.com/Chapman–HallCRC-Monographs-on-Statistics–Applied-Probability/book-series/CHMONSTAAPP
Object Oriented
Data Analysis

J.S. Marron and Ian L. Dryden


First edition published 2022
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

and by CRC Press
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

© 2022 Taylor & Francis Group, LLC

CRC Press is an imprint of Taylor & Francis Group, LLC

Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of
their use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and
let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, access
www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive,
Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact
[email protected]

Trademark notice: Product or corporate names may be trademarks or registered trademarks and
are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data


Names: Marron, James Stephen, 1954- author. | Dryden, I. L. (Ian L.),
author.
Title: Object oriented data analysis / J.S. Marron and Ian L. Dryden.
Description: [Boca Raton] : Taylor & Francis Group, LLC, [2021] | Includes
bibliographical references and index. | Summary: “Object Oriented Data
Analysis (OODA) provides a useful general framework for the
consideration of many types of Complex Data. It is deliberately intended
to be particularly useful in the analysis of data in complicated
situations which are typically not easily represented as an
unconstrained matrix of numbers”-- Provided by publisher.
Identifiers: LCCN 2021023347 (print) | LCCN 2021023348 (ebook) | ISBN
9780815392828 (hardback) | ISBN 9781032114804 (paperback) | ISBN
9781351189675 (ebook)
Subjects: LCSH: Object-oriented methods (Computer science) | Quantitative
research. | Statistics--Methodology.
Classification: LCC QA76.9.O35 M369 2021 (print) | LCC QA76.9.O35 (ebook)
| DDC 005.1/17--dc23
LC record available at https://lccn.loc.gov/2021023347
LC ebook record available at https://lccn.loc.gov/2021023348

ISBN: 978-0-8153-9282-8 (hbk)
ISBN: 978-1-032-11480-4 (pbk)
ISBN: 978-1-351-18967-5 (ebk)

DOI: 10.1201/9781351189675

Typeset in Nimbus
by KnowledgeWorks Global Ltd.
Dedication

To our families for their ongoing strong support over the many years it took to
fully develop these ideas, and to the many colleagues who have played a vital role
in shaping this approach to data analysis.

Contents

Preface

1 What Is OODA?
1.1 Case Study: Curves as Data Objects
1.2 Case Study: Shapes as Data Objects
1.2.1 The Segmentation Challenge
1.2.2 General Shape Representations
1.2.3 Skeletal Shape Representations
1.2.4 Bayes Segmentation via Principal Geodesic Analysis

2 Breadth of OODA
2.1 Amplitude and Phase Data Objects
2.2 Tree-Structured Data Objects
2.3 Sounds as Data Objects
2.4 Images as Data Objects

3 Data Object Definition
3.1 OODA Foundations
3.1.1 OODA Terminology
3.1.2 Object and Feature Space Example
3.1.3 Scree Plots
3.1.4 Formalization of Modes of Variation
3.2 Mathematical Notation
3.3 Overview of Object and Feature Spaces
3.3.1 Example: Probability Distributions as Data Objects

4 Exploratory and Confirmatory Analyses
4.1 Exploratory Analysis–Discover Structure in Data
4.1.1 Example: Tilted Parabolas FDA
4.1.2 Example: Twin Arches FDA
4.1.3 Case Study: Lung Cancer Data
4.1.4 Case Study: Pan-Cancer Data
4.2 Confirmatory Analysis–Is It Really There?
4.3 Further Major Statistical Tasks

5 OODA Preprocessing
5.1 Visualization of Marginal Distributions
5.1.1 Case Study: Spanish Mortality Data
5.1.2 Case Study: Drug Discovery Data
5.2 Standardization–Appropriate Linear Scaling
5.2.1 Example: Two Scale Curve Data
5.2.2 Overview of Standardization
5.3 Transformation–Appropriate Nonlinear Scaling
5.4 Registration–Appropriate Alignment

6 Data Visualization
6.1 Heat-Map Views of Data Matrices
6.2 Curve Views of Matrices and Modes of Variation
6.3 Data Centering and Combined Views
6.4 Scatterplot Matrix Views of Scores
6.5 Alternatives to PCA Directions

7 Distance Based Methods
7.1 Fréchet Centers In Metric Spaces
7.2 Multi-Dimensional Scaling For Object Representation
7.3 Important Distance Examples
7.3.1 Conventional Norms
7.3.2 Wasserstein Distances
7.3.3 Procrustes Distances
7.3.4 Generalized Procrustes Analysis
7.3.5 Covariance Matrix Distances

8 Manifold Data Analysis
8.1 Directional Data
8.2 Introduction to Shape Manifolds
8.3 Statistical Analysis of Shapes
8.4 Landmark Shapes
8.4.1 Shape Tangent Space
8.4.2 Case Study: Digit 3 Data
8.4.3 Case Study: DNA Molecule Data
8.4.4 Principal Nested Shape Spaces
8.4.5 Size-and-shape space
8.4.6 Further Methodology
8.5 Central Limit Theory on Manifolds
8.6 Backwards PCA
8.7 Covariance Matrices as Data Objects

9 FDA Curve Registration
9.1 Fisher-Rao Curve Registration
9.1.1 Example: Shifted Betas Data
9.1.2 Introduction to Warping Functions
9.1.3 Fisher-Rao Mathematics
9.2 Principal Nested Spheres Decomposition

10 Graph Structured Data Objects
10.1 Arterial Trees as Data Objects
10.1.1 Combinatoric Approaches
10.1.2 Phylogenetics
10.1.3 Dyck Path
10.1.4 Persistent Homology
10.1.5 Comparison of Tree Analysis Methods
10.2 Networks as Data Objects
10.2.1 Graph Laplacians
10.2.2 Example: A Tale of Two Cities
10.2.3 Extrinsic and Intrinsic Analysis
10.2.4 Case Study: Corpus Linguistics
10.2.5 Labeled versus Unlabeled Nodes

11 Classification–Supervised Learning
11.1 Classical Methods
11.2 Kernel Methods
11.3 Support Vector Machines
11.4 Distance Weighted Discrimination
11.5 Other Classification Approaches

12 Clustering–Unsupervised Learning
12.1 K-Means Clustering
12.2 Hierarchical Clustering
12.3 Visualization Based Methods
12.3.1 Hybrid Clustering Methods

13 High-Dimensional Inference
13.1 DiProPerm–Two Sample Testing
13.2 Statistical Significance in Clustering
13.2.1 High Dimensional SigClust

14 High Dimensional Asymptotics
14.1 Random Matrix Theory
14.2 High Dimension Low Sample Size
14.3 High Dimension Medium Sample Size

15 Smoothing and SiZer
15.1 Why Not Histograms?–Hidalgo Stamps Data
15.2 Smoothing Basics–Bralower Fossils Data
15.3 Smoothing Parameter Selection
15.4 Statistical Inference and SiZer
15.4.1 Case Study: British Family Incomes Data
15.4.2 Case Study: Bralower Fossils Data
15.4.3 Case Study: Mass Flux Data
15.4.4 Case Study: Kidney Cancer Data
15.4.5 Additional SiZer Applications and Variants

16 Robust Methods
16.1 Robustness Controversies
16.2 Robust Methods for OODA
16.2.1 Case Study: Cornea Curvature Data
16.2.2 Case Study: Genome-Wide Association Data
16.3 Other Robustness Areas

17 PCA Details and Variants
17.1 Viewpoints of PCA
17.1.1 Data Centering
17.1.2 Singular Value Decomposition
17.1.3 Gaussian Likelihood View
17.1.4 PCA Computational Issues
17.2 Two Block Decompositions
17.2.1 Partial Least Squares
17.2.2 Canonical Correlations
17.2.3 Joint and Individual Variation Explained

18 OODA Context and Related Areas
18.1 History and Terminology
18.2 OODA Analogy with Object-Oriented Programming
18.3 Compositional Data Analysis
18.4 Symbolic Data Analysis
18.5 Other Research Areas

Bibliography

Index
Preface

This book is intended as a resource for researchers in the development of novel
statistical and data science methodology. At the time of this writing, Big Data is
a very popular area of study. While Big Data does indeed present major statistical
challenges, an even greater challenge is dealing effectively with Complex Data,
which is the main motivation for Object Oriented Data Analysis. The latter is a
framework that facilitates inter-disciplinary research through new terminology for
discussing the often many possible approaches to the analysis of complex data.
Such data arise naturally in a wide variety of areas. This book aims to provide
ways of thinking that enable sensible choices to be made. The main points
are illustrated with many real data examples, based on the authors’ personal
experiences, which have motivated the invention of a wide array of analytic methods.
A generally relevant comment is that most statistical problems can be solved
in many sensible ways. Simon Sheather elegantly summarized that state of affairs
(applicable to the many methods discussed in this book) as: “every dog has its
day”. The point is that any method that has been seriously advocated by someone
has situations where it gives excellent performance, but also situations where it can
be quite poor. The challenge is to understand the properties of each well enough to
guide good choices. The material in this book will provide the reader with useful
insights and a general framework to assist in this process.
A fundamental theme throughout the book, one that has not been deeply explored
elsewhere, is modes of variation. That theme provides a novel terminology and
framework for understanding many aspects of Object Oriented Data Analysis.
While the mathematics goes far beyond the usual in statistics (including differ-
ential geometry and even topology), the book is aimed at accessibility by graduate
students. There is deliberate focus on ideas over mathematical formulas. An ex-
ception is the detailed linear algebra development in Chapter 17. While many
references to various aspects of OODA are given, it should be noted we have
deliberately not attempted to be comprehensive in those. Our aim instead is to
simply provide useful starting points that interested researchers can use for their
own bibliographic searches.
The historical background of the Object Oriented Data Analysis terminology is
discussed in Section 18.1.
Much of the material that went into this book, including data sets and the code
to generate most of the graphics, can be found in the web companion to this
book at Marron (2020). Many of those require Marron’s Matlab Software, available
at Marron (2017b). The companion website also contains further references
and links to other software packages.

Acknowledgments
Many of the ideas and the presentation style were developed during the teaching
of a graduate course entitled Object Oriented Data Analysis, which has been taught
roughly every other year at the University of North Carolina since 2005. There
were some important precursors, including a related course taught at Cornell
University in 2002. Two such courses were offered at the Statistical and Mathematical
Sciences Institute in 2010 and 2011 with lecturers (beyond the authors of this
book) including Hans-Georg Müller, James O. Ramsay, and Jane-Ling Wang. The
course was also taught at the National University of Singapore in 2015.
Several events have played a pivotal role in the development of Object Oriented
Data Analysis. One was the Statistical and Mathematical Sciences Institute pro-
gram on “Analysis of Object Data” during 2010–2011, with co-organizers Hans-
Georg Müller, James O. Ramsay and Jane-Ling Wang. Another pivotal workshop
was the November 2012 “Statistics of Time Warpings and Phase Variations” at the
Mathematical Biosciences Institute, with co-organizers James O. Ramsay, Laura
Sangalli and Anuj Srivastava.
General research in this area by the authors has been supported over the years
by a number of grants from the National Science Foundation, including DMS-
9971649, DMS-0308331, DMS-0606577, DMS-0854908 and IIS-1633074, and
the Engineering and Physical Sciences Research Council grants EP/K022547/1
and EP/T003928/1.
The material on sounds as data objects was kindly provided by Davide Pigoli.
The authors are especially grateful to Stephan F. Huckemann, John T. Kent, James
O. Ramsay, and Anuj Srivastava for providing formal reviews of early drafts that
fundamentally impacted the final version. Additional useful comments on various
drafts have been provided by Iain Carmichael, Benjamin Eltzner, Thomas Keefe,
Carson Mosso, Vic Patrangenaru, Stephen M. Pizer, Davide Pigoli, and Piercesare
Secchi.
The authors are grateful to John Kimmel, for helpful advice at many points, and
for his patience over the long time it took for this book to come together.
CHAPTER 1

What Is OODA?

DOI: 10.1201/9781351189675-1

The fields of human endeavor currently known as statistics, data science, and data
analytics have been radically transformed over the recent past. These transformations
have been driven simultaneously by a massive increase in computational
capabilities coupled with a rapidly growing scientific appetite for ever deeper
understanding and insights. The notion of forming a data matrix provides a useful
paradigm for understanding important aspects of how these fields are evolving.
In particular, the currently popular context of Big Data has several quite different
facets, ranging from low dimension high sample size areas (the basis of classical
mathematical statistical thought, which is perhaps typified by sample survey
and census data), through settings where both dimension and sample size are high
(common for internet scale data sets of many types), and on to high dimension low
sample size contexts (frequently encountered in areas such as genetics, medical
imaging and other types of extremely rich but relatively expensive measurements).
The pressing need to analyze data in this wide array of contexts has generated many
exciting new ideas and approaches.
Yet a deeper look into these developments suggests that the organization of data
into a matrix may itself be imposing limitations. In particular, there is a growing
realization that the challenges presented by Big Data are being eclipsed by the
perhaps far greater challenges of Complex Data, which are typically not easily
represented as an unconstrained matrix of numbers. Object Oriented Data Analysis
(OODA) provides a useful general framework for the consideration of many
types of Complex Data. It is deliberately intended to be particularly useful in the
analysis of data in complicated situations, diverse examples of which are given in
the first two chapters. The phrase OODA in this context was coined by Wang and
Marron (2007). An overview of the area was given in Marron and Alonso (2014).
For more discussion of Big Data and its relation to statistics, see Carmichael and
Marron (2018) and many interesting viewpoints in the special issue edited by
Sangalli (2018).
The OODA viewpoint is easily understood through taking data objects to be
the atoms of a statistical analysis, where atom is meant in the sense of elementary
particle, studied in several contexts of increasing complexity:
• In a first course in statistics atoms are numbers, and the goal is to develop
methods for understanding variation in populations of numbers.
• A more advanced course, termed multivariate analysis in the statistical culture,
generalizes the atoms, i.e. the data objects, from numbers to vectors and
involves a host of methods for managing uncertainty in that context. For example,
see Mardia et al. (1979), Muirhead (1982), and Koch (2014) (for a more up to
date treatment).
• At the time of this writing a fashionable area in statistics is Functional Data
Analysis (FDA), where the goal is to analyze the variation in a population of
curves. A good introduction to this vibrant research area, where functions are
the data objects, can be found in Ramsay and Silverman (2002, 2005). A case
study, illustrating many of the basic concepts of FDA which are useful for
understanding OODA is given in Section 1.1.
• OODA provides the next step in terms of complexity of atoms of a statistical
analysis to a wide array of more complicated objects. The important example
of shapes as data objects is considered in the case study of Section 1.2. A wide
variety of other examples, which highlight the breadth of OODA, appears in
Chapter 2.
Note that each of the above areas can be thought of as containing the preced-
ing ones as special cases. For example, multivariate analysis is the case of FDA
where the functions are discretely supported. Similarly multivariate analysis and
FDA are special cases of OODA. In later chapters it is useful to recall that OODA
includes these predecessors as special cases. This is because often simple multi-
variate examples are used for maximal clarity in the illustration of concepts and
methods, but the ideas are useful more generally for OODA.
A good question is: What is the value added to applied statistics and data sci-
ence from the concept of OODA and its attendant terminology? The terminology
is based on very substantial real-world experience with a wide variety of complex
data sets. A fact that rapidly becomes clear in the course of interdisciplinary re-
search is that there frequently are substantial hurdles in terminology. Especially at
the beginning of such endeavors, it can feel like collaborators are even speaking
different languages, so often serious effort needs to be devoted to the development
of a common set of definitions just to carry on a useful discussion. An added
complication is that for complex data contexts, it is frequently not obvious how to
even “get a handle on the data”. Usually there are many options available, which
are most effectively decided upon through careful discussion between domain
scientists and statisticians. In such discussions, the question “what should be the
data objects?” has proven to frequently lead to useful choices, thus resulting in an
effective and insightful data analysis.
Real data examples demonstrating data object choices in a variety of contexts
are given in the following sections and in Chapter 2. In particular, Section 1.1
introduces curves as data objects. A more complex variant, discussed in Section 2.1,
involves curves with interesting variation in phase in place of, or in addition to,
the usual FDA amplitude variation. A mathematically deeper case is considered in
Section 1.2, where shapes are the data objects. These require special treatment, as
shapes are most naturally viewed as points on a curved manifold. Section 2.2 considers
a perhaps even more challenging data set of tree-structured data objects. The data
objects in Section 2.3 are recordings of sounds, in particular human spoken words,
which bring special challenges in the choice of data objects. Finally, in Section
2.4, a fun example with images of faces as data objects is considered.
It is seen that the notion of data objects provides a particularly useful format
for discussing modes of variation that give insights about population structure.
This term is formally defined in Section 3.1, but until then the meaning should be
intuitively clear from the context.
One more general aspect of OODA is that there are frequently three major
phases of this type of data analysis:
1. Object Definition. This is the phase where the fundamental issue of what should
be the data objects is addressed. A number of examples of this phase are pro-
vided in the rest of this chapter and also in further examples in other sections.
2. Exploratory Analysis. Here the goal is to find perhaps surprising population
structure in data, often using some type of visualization method. A wide variety
of examples and methods for exploratory analysis are given in the rest
of this chapter and in Chapters 2 and 4–10. Exploratory analysis typically
appears only sparingly in classical statistics courses, but is usually more
prominent in machine learning. However, it has a strong statistical tradition,
going back well before the ideas nicely summarized in Tukey (1977).
3. Confirmatory Analysis. While many great discoveries have been made using
exploratory methods, it is also very easy to make discoveries that are not real,
in the sense of being non-replicable sampling artifacts. For this reason it is very
important to validate such discoveries. This critical topic and many variants of
approaches to it are discussed in detail in the very large classical statistical
literature. Some less well known aspects that are particularly relevant to OODA
are discussed in Chapter 13.
A companion website to this book, containing links to available software, the Mat-
lab or R programs used to generate most of the figures in this book, and additional
graphics can be found at Marron (2020).
Further discussion on other ideas and nomenclature related to OODA can be
found in Chapter 18. Additional big picture discussion of data science and statis-
tics can be found in Marron (2017a) and Carmichael and Marron (2018).

1.1 Case Study: Curves as Data Objects


An interesting example of functional data analysis (viewed here as an important
special case of OODA) is the Spanish Mortality data, first studied from an FDA
viewpoint in Section 2 of Marron and Alonso (2014). Such data sets are available
at the Human Mortality Database of Wilmoth and Shkolnikov (2008). For a given
population (e.g. citizens of one country) mortality data are generally a matrix with
rows and columns indexed by years and ages. The matrix entries are the chance of
a person of each age dying in the given year, calculated as the number of deaths
divided by the number of people for that year–age pair. Here we study mortality
of males in Spain, mostly because there are interesting features in the data, due to
recent Spanish history.
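
The construction of the mortality matrix described above can be sketched in a few lines of Python. This is only an illustration on made-up counts: the array names `deaths` and `population` and all of their values are assumptions chosen for the example, not Human Mortality Database figures.

```python
import numpy as np

# Hypothetical counts for 3 years (rows) and 4 ages (columns); not real data.
deaths = np.array([[120, 10, 8, 300],
                   [100,  9, 7, 280],
                   [ 80,  6, 5, 250]], dtype=float)
population = np.array([[1000, 5000, 4000, 1500],
                       [1000, 5000, 4000, 1500],
                       [1000, 5000, 4000, 1500]], dtype=float)

# Each entry is the number of deaths divided by the number of people
# for that year-age pair.
mortality = deaths / population

print(mortality.shape)   # (3, 4): rows indexed by year, columns by age
print(mortality[0, 0])   # 0.12
```

Each row of `mortality` is then one year and each column one age, matching the layout described in the text.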
This data set provides a good illustration of the issue of Object Definition, because
there are several data object choices to be made in the analysis of these data.
First, since these probabilities range over several orders of magnitude, logarithms
are useful to provide good visual separation across a wide range of scales. Particularly
strong interpretability comes from the choice of log10 of the probability (e.g.
a value of −2 corresponds to a probability of 0.01, as opposed to about 0.135 for the
natural log). The utility of this data object choice is demonstrated in Figure 1.1, where
the raw probabilities are shown in the left panel (with much interesting structure
missed since these are very nearly 0 for the important younger age groups) with log10
mortality in the right (highlighting important contrasts among the younger ages).
Second, there are two different ways to turn the matrix of data into functional data.
One is to consider data objects to be curves of mortality as a function of age, with
curves indexed by year. The other (the matrix transpose) views the mortality
as a function of year, with data object curves indexed by age. In this
analysis, the former choice is used, because it gives the best illustration of the
usefulness of OODA concepts and also gives an interesting narrative. The latter
choice is considered in Figure 17.7. An analysis that also integrates female mortality
in an interesting way can be found in Feng et al. (2018). The choice here
results in n = 95 curves corresponding to the years 1908–2002. Ages considered
here are 0 through 98, since larger ages are problematic due to occasional small
population sizes. The raw data are shown as overlaid curves in Figure 1.1. There
the curves are distinguished using the standard graphical technique of a rotating
color palette (in this case the default 7 colors in Matlab).
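
The two data object choices in this paragraph, the log10 scale and the orientation of the matrix, can be illustrated with a short Python sketch. The matrix below is filled with synthetic random values purely as a stand-in; only its dimensions (95 years by 99 ages) come from the text, and the variable names are assumptions.

```python
import math
import numpy as np

# A log10 value of -2 reads directly as a probability of 0.01 ...
print(10 ** -2)                 # 0.01
# ... whereas the same value on the natural log scale is about 0.135.
print(round(math.exp(-2), 3))   # 0.135

# Hypothetical mortality matrix: rows are years, columns are ages (synthetic).
rng = np.random.default_rng(0)
n_years, n_ages = 95, 99        # years 1908-2002 and ages 0-98, as in the text
mortality = rng.uniform(1e-4, 0.4, size=(n_years, n_ages))

log_mortality = np.log10(mortality)

# Choice used in the text: one curve per year, each a function of age.
curves_by_year = log_mortality          # shape (95, 99)
# Alternative (matrix transpose): one curve per age, each a function of year.
curves_by_age = log_mortality.T         # shape (99, 95)
```

Nothing about the numbers changes between the two orientations; only the decision of which axis indexes the data objects differs.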

[Figure 1.1 plot: left panel, mortality against age; right panel, log10(mortality) against age.]
Figure 1.1 Spanish Mortality curves as a function of age. Raw male mortality is in the
left panel, with log10 mortality on the right. Years are distinguished using a rotating color
palette. Shows age effects and large variation (factors of more than 10 for some age groups)
across years, as well as the data object choice of log10 mortality being the more useful
scaling of the data.

This view already shows interesting aspects of the data. For example, being
born is a risky activity, with a high mortality rate. However, the chance of dying
falls off rapidly, up until the teen years when risky behavior tends to begin.
Then through adulthood the death rate slowly increases, becoming quite high in
old age. Also note the bundle of curves is quite thick, with the axes indicating
approximately a 10-fold change over the years, inviting an investigation into how
things have changed over time. This is easily provided in Figure 1.2, by applying
a different color scheme to the curves in the right panel of Figure 1.1. Here time
ordering of the curves is highlighted through coloring with a rainbow scheme to
indicate years, starting with magenta ([1 0 1] in RGB coordinates) for 1908 and
ranging through violet, blue, cyan, green, yellow, and orange to red ([1 0 0] in
RGB) for 2002.
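
One way to generate such a rainbow scheme is sketched below in Python. It assumes the colors are obtained by sweeping the hue of fully saturated HSV colors linearly from magenta down to red; the book's own Matlab code may construct the palette differently, so this is an illustration of the idea rather than a reproduction.

```python
import colorsys
import numpy as np

years = np.arange(1908, 2003)            # 95 years, 1908-2002

# Assumption: hue runs linearly from magenta (300 degrees, i.e. 5/6)
# down to red (0), passing blue, cyan, green, yellow and orange on the way.
hues = np.linspace(5 / 6, 0.0, len(years))
colors = [colorsys.hsv_to_rgb(h, 1.0, 1.0) for h in hues]

print(np.round(colors[0], 3))    # close to [1, 0, 1] (magenta) for 1908
print(np.round(colors[-1], 3))   # [1, 0, 0] (red) for 2002
```

Each entry of `colors` is an RGB triple that could be passed as the line color when plotting the curve for the corresponding year.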

[Figure 1.2 plot: log10(mortality) against age, with a color key for years labeled from 1917 to 1997.]

Figure 1.2 Spanish Mortality curves using a rainbow color scheme to indicate progression
in time (over years 1908–2002). Shows major improvements in mortality over this time
range.

This shows a very clear overall improvement over the years in mortality, due
mostly to improvements in medicine and public health. Note also that these im-
provements have benefited younger people more than the old, as there is not yet
much treatment available for aging. As happens frequently with OODA data, ad-
ditional visual insights come from careful decomposition of the variation present
in these curves, through a Principal Component Analysis (PCA). See Chapters
4 and 17 and Jolliffe (2002) for background information concerning the many
ways this method is used. One important use of PCA is to gain insight into how
data objects relate to each other. Insight comes from considering the data as lying
in an abstract point cloud in d = 99 dimensional space, where low-dimensional
projections frequently visually illustrate key relationships (e.g. clustering of data
objects). An often useful first step of a PCA is mean centering, which essentially
moves the point cloud so that it is centered at the origin. As seen in Figure 1.3,
this centering operation itself can provide an informative decomposition of the
data into the mean and residuals about the mean.
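
The centering step can be sketched in a few lines. The matrix `X` below is a synthetic stand-in (an assumption, not the Spanish mortality data), arranged with one row per year and one column per age, so that the point-wise mean is a column-wise average.

```python
import numpy as np

rng = np.random.default_rng(0)
n_years, n_ages = 95, 99

# Synthetic stand-in: an age effect shared by all years plus a year effect.
ages = np.arange(n_ages)
age_effect = -3.0 + 0.03 * ages
year_effect = np.linspace(1.0, -1.0, n_years)
X = (age_effect[None, :] + year_effect[:, None]
     + 0.05 * rng.standard_normal((n_years, n_ages)))

mean_curve = X.mean(axis=0)      # point-wise mean over the data objects
residuals = X - mean_curve       # mean residuals, one row per curve
```

Adding `mean_curve` back to each row of `residuals` recovers `X` exactly, which is the sense in which centering is a decomposition of the data.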
[Figure 1.3 plot: two panels, each showing log10(mortality) against age.]

Figure 1.3 Left panel is the mean mortality curve. Right panel contains the mean residuals,
where the mean is subtracted from each curve, using the same color scheme. Shows that
age effects are essentially common for all (i.e. over time), in the sense of appearing in the
mean. Improvements over time appear in the residuals, with overall most improvement for
the young.

The left panel of Figure 1.3 shows the mean curve, computed as the point-wise
mean of the curves in Figure 1.2. The right panel contains the mean residuals,
which are computed by subtracting the mean from each of the data curves, while
retaining the original year coloring. Note that the mean curve contains many of the
important features of the raw data, especially those related to age. In particular, the
danger of being born together with low mortality for the young with increasingly
higher mortality for the old are all properties of the mean. These essentially do
not appear in the mean residuals, indicating that these are population properties
which have not changed much over time. A perhaps surprising aspect of the mean
is the occasional blips that appear. One might think these are random noise, but
note that they are quite periodic and in fact appear at decades. This is a function
of historically poor record keeping. The early lack of birth certificates for the full
population led to some uncertainty of age at the time of death for some, with
subsequent rounding to decades which is clearly visible. The mean residuals also
reflect an important aspect of the population structure, being driven by the changes
over time. Most important are the dramatic improvements in mortality that have
been made over the course of this study. This view also makes it clear that the
young have benefited the most with that benefit decreasing as a function of age.
PCA is usefully understood as decomposing the mean centered data in the right
panel of Figure 1.3 into insightful modes of variation (this concept is formally de-
fined in Section 3.1.4). One such mode is the variation revealed by the first princi-
pal component as shown in Figure 1.4. Insight comes from thinking of the above
mentioned point cloud, where each data object (curve in this case) is a point. The
PCA modes of variation are developed by seeking orthogonal directions of maxi-
mal variation within the point cloud. The first PC direction is the unit (i.e. norm 1)
vector, based at the sample mean, which maximizes the variance of the data pro-
jected onto that vector. This direction is easily computed as the first eigenvector of
the sample covariance matrix (defined at (3.5)). The entries of that vector (which
[Figure 1.4 plot: left panel shows log10(mortality) against age; right panel shows the distribution of PC1 scores.]

Figure 1.4 PC1 mode of variation plot (left) and scores distribution plot (right). The former
shows that this dominant mode of variation reflects most of the overall improvement in
mortality. Scores plot shows most of the improvements happened relatively rapidly, plus
highlights the 1918 Flu Pandemic (violet outlier on the right) and the Spanish Civil War
(light blue sharp trend to the right).

indicate how it relates to the variables, i.e. features, of the data set) are called the
loadings. Visual insight into these loadings comes from the mode of variation plot
in the left panel of Figure 1.4. The horizontal axis indexes the variables, which are
ages in this case, and the curves are all multiples of the eigenvector. In particular
the curves are projections of the data curves onto the direction vector. These are
the columns of the rank 1 matrix that is the product of the column vector of load-
ings times the row vector of scores, which are the projection coefficients of each
data object onto the eigenvector. In classical multivariate analysis the scores are
also called the principal components. This matrix is the (least squares) best rank 1
approximation of the mean residual matrix shown in the right panel of Figure 1.3.
This PC1 view highlights the dominant mode of variation, which nicely reflects
the major overall improvement in mortality. In addition, as life and death record
keeping has improved over time, the decline in age rounding effects is reflected in
the decadal spikes pointing upwards (early) and downwards (later). In particular,
the rounding was present earlier, not later, so it shows up partially in the mean in
Figure 1.3, and then as this contrast in Figure 1.4 (left panel).
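
The quantities named above (loadings, scores, and the rank 1 approximation) can be computed directly from the eigen decomposition of the sample covariance matrix. The following is a minimal sketch on synthetic data, assuming rows are data objects and columns are variables, which is the transpose of the column-oriented convention in the text; the function name `first_principal_component` is hypothetical.

```python
import numpy as np

def first_principal_component(X):
    """PC1 of a data matrix X (rows = data objects, columns = variables)."""
    residuals = X - X.mean(axis=0)                      # mean centered data
    cov = residuals.T @ residuals / (X.shape[0] - 1)    # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)              # eigenvalues in ascending order
    loadings = eigvecs[:, -1]                           # unit vector of maximal variance
    scores = residuals @ loadings                       # projection coefficients
    rank1 = np.outer(scores, loadings)                  # rank 1 approximation of residuals
    return loadings, scores, rank1, eigvals[-1]

# Synthetic stand-in data (an assumption): 95 curves observed at 99 ages.
rng = np.random.default_rng(1)
trend = np.linspace(2.0, -2.0, 95)[:, None] * np.linspace(1.0, 0.2, 99)[None, :]
X = trend + 0.1 * rng.standard_normal((95, 99))

loadings, scores, rank1, top_eigval = first_principal_component(X)
residuals = X - X.mean(axis=0)
singular_values = np.linalg.svd(residuals, compute_uv=False)
```

The sample variance of `scores` equals the largest eigenvalue, and the squared Frobenius error of `rank1` equals the sum of the remaining squared singular values, which is the least squares optimality property mentioned above.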
The right panel of Figure 1.4 is the PC1 scores distribution plot, which will be
used frequently in the following to display detailed information as to how the data
objects relate to each other. Each circle represents one score using the same color
scheme. Horizontal coordinates indicate the score and vertical coordinates indi-
cate order in the data set, in this case the year. The magenta color of the top circle
is the year 1908 and the red color of the lowest circle is for the year 2002. The
overall leftward trend again shows the overall improvement in mortality over these
years. The black curve shows a kernel density estimate, which can be thought of as
a smooth histogram. The vertical axis records the heights of this curve. Detailed
discussion of kernel density estimation is in Chapter 15. See Wand and Jones
(1995) for a more in-depth overview. This type of display of one-dimensional dis-
tributions, which includes both the actual data points and the smooth histogram, is
used many other times in the following. In this case it shows much higher density
of scores in the higher and lower regions, which is another way of seeing that
most of the overall transition from higher to lower mortality was relatively rapid.
A couple of smaller-scale aspects are also clear in this scores plot. The violet year,
farthest to the right was the year 1918, when many people around the world died
during a flu pandemic, which until recently was the largest ever well-documented
epidemiological event worldwide. Also notable is the shift toward higher mortal-
ity (i.e. to the right) shown as light blue, which was the time of the Spanish Civil
War, just before World War II (in which Spain was not a combatant).
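
A scores distribution plot combines two ingredients: a kernel density estimate and a vertical position for each circle that encodes order in the data set. The sketch below computes both for synthetic scores. The Gaussian kernel and the rule-of-thumb bandwidth are assumptions made for illustration; the text does not state which bandwidth was used for Figure 1.4.

```python
import numpy as np

def silverman_bandwidth(values):
    """Rule-of-thumb bandwidth (an assumed default, not the book's choice)."""
    n = len(values)
    sd = np.std(values, ddof=1)
    q75, q25 = np.percentile(values, [75, 25])
    return 0.9 * min(sd, (q75 - q25) / 1.34) * n ** (-0.2)

def gaussian_kde(values, grid, bandwidth):
    """Average of Gaussian kernels centered at the data values."""
    z = (grid[:, None] - values[None, :]) / bandwidth
    kernel_sum = np.exp(-0.5 * z ** 2).sum(axis=1)
    return kernel_sum / (len(values) * bandwidth * np.sqrt(2.0 * np.pi))

# Synthetic scores: an early group near +3 followed by a later group near -3.
rng = np.random.default_rng(2)
scores = np.concatenate([rng.normal(3.0, 0.5, 40), rng.normal(-3.0, 0.5, 55)])

h = silverman_bandwidth(scores)
grid = np.linspace(scores.min() - 5 * h, scores.max() + 5 * h, 400)
density = gaussian_kde(scores, grid, h)

# Vertical positions: first data object at the top, last at the bottom.
heights = np.linspace(density.max(), 0.0, len(scores))
```

Plotting `(scores, heights)` as colored circles over the curve `(grid, density)` gives a display in the style described above.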

[Figure 1.5 plot: left panel shows log10(mortality) against age; right panel shows the distribution of PC2 scores.]

Figure 1.5 PC2 mode of variation (left) and scores distribution (right), using the same
format as Figure 1.4. The loadings plot shows this second mode of variation provides a
contrast between the 20–45-year-olds and the rest. The scores plot shows the deep effects
of the flu pandemic, the Spanish Civil War and automotive death rate.

Figure 1.4 showed the first mode of variation in the mortality data called PC1.
An interesting complementary mode of variation is the second PC, as shown in
Figure 1.5. This represents the direction of second strongest variation (in the sense
of being orthogonal to the first direction) measured again in terms of variance of
projections. It is computed as the second eigen direction of the sample covariance
matrix. The PC2 mode of variation plot (left panel) shows that this direction highlights
the contrast between the 20–45-year-old cohort and the union of the young
and the old. The color pattern is harder to interpret in this mode, but is very clear
in the scores distribution plot (right panel). Note that the 20–45-year-olds suffered
even stronger effects from both the pandemic and also the war, as they died at a
substantially higher rate than usual in those times. Another interesting feature is
the growing mortality for this cohort in the 1960s to 1980s (green to orange). This
period corresponds to growing access to automobiles, and apparently reflects the idea that
young males are the group most prone to risky automobile behavior. Note that
in the final years, the direction of this trend has fortunately reversed, which has
been ascribed to much improved car safety (such as seat belts) and also to major
improvements in roads.
The concept of modes of variation as determined by PCA loadings and scores
is explored more deeply in Section 3.1.
[Figure 1.6 plot: PC2 scores (vertical axis) against PC1 scores (horizontal axis).]

Figure 1.6 Scatterplot of PC1 vs. PC2 scores for the Spanish Mortality data. This shows
many of the above historical trends in a single plot.

Figure 1.6 shows a scatterplot of the bivariate distribution of the PC1 and PC2
scores, which provides a useful and concise summary of both modes of variation,
i.e. of much of the structure in this data set. The one-dimensional PC1 scores
distribution in the right panel of Figure 1.4 is on the horizontal axis, while the ver-
tical axis has the corresponding PC2 scores distribution from the right of Figure
1.5. This is the two-dimensional projection of the data onto the plane with max-
imal variation. Note that the circles representing the data objects (i.e. the mor-
tality curves) are connected with line segments in time order, which facilitates
keeping the progression of years in mind when interpreting the plot. The overall
improvement in mortality, with the exceptions of flu and war, is clear from the
main leftwards progression. Variation over time of the contrast between the
20–45-year-olds and the rest is also clear on the vertical axis, nicely highlighting the
flu, war and automobile effects.
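
The coordinates of such a scatterplot are the scores on the first two principal components, which can also be obtained from a singular value decomposition of the mean centered data. A minimal sketch on synthetic data follows; the function name `pc_scores` and the simulated matrix are assumptions for illustration.

```python
import numpy as np

def pc_scores(X, n_components=2):
    """Scores, directions and variance fractions for the leading PCs."""
    residuals = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(residuals, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]      # one row per data object
    explained = s[:n_components] ** 2 / np.sum(s ** 2)   # fraction of total variance
    return scores, Vt[:n_components], explained

# Synthetic stand-in: a dominant time trend plus a weaker second mode.
rng = np.random.default_rng(3)
time = np.linspace(1.0, -1.0, 95)
ages = np.linspace(0.0, 1.0, 99)
X = (3.0 * np.outer(time, np.ones(99))
     + 0.5 * np.outer(time ** 2, np.sin(np.pi * ages))
     + 0.02 * rng.standard_normal((95, 99)))

scores, directions, explained = pc_scores(X)
```

Plotting `scores[:, 0]` against `scores[:, 1]`, with consecutive rows joined by line segments, gives the time-ordered view described above.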
For this data set, the most interesting views are in the first two principal components.
For others, more components can also be quite insightful. A useful summary of
several principal components is a matrix of such scatterplots, with the axes carefully
coordinated over both rows and columns. The diagonal of such a display is most
useful when it shows some sort of 1-d distributional summary, e.g. the combina-
tion of jitter plots (the colored circles) and kernel density estimates used to show
the distribution of scores in the right panels of Figures 1.4 and 1.5. Jitter plots
are discussed in more detail in Section 4.1. Further examples of such matrices of
scatterplots can be found in Figures 4.4, 4.12, 4.13, and many other places in later
chapters.
Mortality rates for other countries can be explored in a similar way. For exam-
ple mortality data from Switzerland (also available in Wilmoth and Shkolnikov
(2008)) show similar flu pandemic and automobile effects as observed here, but
neither the data rounding (due to a longer period of good record keeping) nor the
war-caused mortality effects are visible, as expected.

1.2 Case Study: Shapes as Data Objects


A particularly deep and important example of shapes as data objects is the
Bladder-Prostate-Rectum data, motivated by the challenge of planning radiation
treatment of prostate cancer described in Chaney et al. (2004).

1.2.1 The Segmentation Challenge


Radiation treatment of cancer is quite effective, and administered over the course
of a number of days. The goal is to provide a maximal radiation dose to the
prostate while minimizing the impact on nearby sensitive organs such as the at-
tached bladder and the rectum, which is adjacent. A major radiation treatment
planning challenge is that (even within the same person) the locations of all 3 or-
gans vary widely on the critical time scale of days. Computed Tomography (CT)
images are useful for visually locating these organs on a given day, with CT pre-
ferred over Magnetic Resonance images due to its superior accuracy of location.
However segmentation, i.e. finding the set of voxels (three-dimensional analogs of
pixels) inside each organ, was a challenging problem because of poor contrast and
noise, as shown in Figure 1.7. That is one slice of a 3-d stack of images, showing
a side view of the hip region for one patient. The color scheme of CT is the same
as for x-rays, so dense objects such as bones show up as white. Thus the upper
right of Figure 1.7 shows the tailbone, and a hipbone passes through this slice in
the lower center. Black indicates the least dense regions which are gas bubbles in
the rectum, which is the curved lighter region containing the darkest spots starting
near the top center and curving down below and to the left of the tail bone. The
lighter gray region between the top of the rectum and the small hip bone is the
bladder. The prostate, which is the target of the treatment, is a light gray region
between the hip bone, the bladder and the lowest visible section of the rectum.
Segmentation of the prostate is quite challenging because of very poor contrast
with surrounding objects (it is essentially the same shade of gray and has both
lighter and darker regions nearby) and because of the relatively high noise level.
For these reasons, incorporation of anatomical knowledge is essential to the seg-
mentation process. Manual segmentation achieves this through an anatomically
trained technician drawing the boundary of an object on each slice of the 3-d
image. The union of the interior voxels aggregated over slices then gives a seg-
mentation of the object. An example of that process is in Figure 1.8, which shows
two views of a manual segmentation of the bladder in Figure 1.7. The left panel
shows how voxels are aggregated across slices, using a view orthogonal to that
where the drawing was done. The right panel is a rotated view of the highlighted
collection of blue colored voxels without the CT image, giving a clear impression
of the 3-d object.
While manual segmentation is quite effective at locating these organs for

Figure 1.7 One slice of 3-d CT image in Bladder-Prostate-Rectum data. Bones are white,
black gas bubbles indicate the rectum. Bladder and prostate are light gray near the center
and lower center. This image shows that automatic segmentation is very challenging.

Figure 1.8 Left panel shows the results of a manual segmentation of the bladder, performed
sequentially on orthogonal slices. Right panel shows a rotated view of the same bladder, to
highlight the 3-d aspect of the segmentation.

planning radiation treatment, it is time-consuming and hence it is not practical
to repeat this manual operation many times over the course of radiation treatment
(i.e. in a clinical setting). This has motivated a lot of research on automatic seg-
mentation of these organs, much of which was developed in the references cited
at the end of this section. The key idea is to incorporate anatomical information
into the training process, using a Bayesian statistical model. The starting point for
this is a shape representation, i.e. a parametric model for each organ.

1.2.2 General Shape Representations


In some contexts shape is conveniently represented by landmark configurations,
i.e. a set of points that correspond across members of the data set, which can be
readily found on each. The statistical analysis of landmark configuration shape
data objects was pioneered by Kendall (1984) and Bookstein (1986). For introduction
to the large literature on that, see Dryden and Mardia (2016). The fundamental
idea is illustrated by a toy data set of triangles in R^2 as data objects in
Figure 1.9. An intuitive representation of each triangle is the configuration of the
R^2 coordinates of the vertices (a 6-tuple), which are natural landmarks. However
many triangles with different configurations have the same shape. In particular,
the triangles to the left of the dashed line are all translations, rotations, and scal-
ings of each other, i.e. all have the same shape. Two other sets of common shapes
appear between the vertical lines, and to the right of the dot-dashed line. The math-
ematical device of equivalence relation provides a convenient formulation of the
notion of shape. Calling two triangular configurations equivalent when they are
translations, rotations, and scalings of each other results in equivalence classes.
These are the sets of all triangles which can be translated, rotated, and scaled into
each other, i.e. triangles of the same shape. These equivalence classes of identi-
fied triangles then become the shape data objects. Spaces of equivalence classes
are widely studied in differential geometry, where they are called quotient spaces.
Common synonyms for the equivalence classes are fibers (frequently used here in
Chapter 8) and orbits (appearing often here in Chapter 9). As discussed in Sec-
tion 8.4 and in Section 4.3.4 of Dryden and Mardia (2016), the natural geometry

[Figure 1.9 plot: triangles drawn in the plane, horizontal axis from 0 to 7, vertical axis from 0 to 3.5.]

Figure 1.9 Toy data set of triangles in R^2, to illustrate shapes as data objects. Lines
separate three equivalence classes (i.e. fibers or orbits) with respect to translation, rotation,
and scaling.
of the quotient space of triangle shapes is the sphere S^2 (see (3.2) for a formal
definition). Sections 8.2 and 8.4 contain a broader discussion of shape quotient
spaces, where it is seen that many of those are also curved. This provides strong
motivation for studying data objects lying on curved manifolds, as done below
and in more depth in Section 8.3.
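
The equivalence class idea can be made concrete for planar landmark configurations by writing each landmark as a complex number. Centering removes translation, dividing by the norm removes scaling, and the modulus of a complex inner product is unchanged by rotation. The sketch below uses the standard full Procrustes distance for 2-d shapes; the function names are hypothetical, and reflections are deliberately not identified.

```python
import numpy as np

def preshape(vertices):
    """Remove translation and scale from a (k, 2) landmark configuration."""
    z = vertices[:, 0] + 1j * vertices[:, 1]
    z = z - z.mean()                 # remove translation
    return z / np.linalg.norm(z)     # remove scale

def procrustes_distance(a, b):
    """Full Procrustes distance between the shapes of two 2-d configurations."""
    inner = np.vdot(preshape(a), preshape(b))   # rotation only changes its phase
    return np.sqrt(max(0.0, 1.0 - abs(inner) ** 2))

def similarity_transform(vertices, angle, scale, shift):
    """Rotate, scale and translate a configuration."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s], [s, c]])
    return scale * vertices @ rotation.T + shift

triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
moved = similarity_transform(triangle, angle=0.7, scale=2.5, shift=np.array([4.0, 1.0]))
other = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 2.0]])

same_shape = procrustes_distance(triangle, moved)      # numerically zero
different_shape = procrustes_distance(triangle, other)
```

Here `triangle` and `moved` lie in the same equivalence class, so their distance vanishes up to rounding, while `other` has a different shape and a strictly positive distance.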
While landmark approaches are useful for many tasks, they are typically less
useful in many medical imaging situations, such as soft tissues, where landmarks
that correspond across cases can be hard to find, with often very few obvious
choices apparent. Hence, there has been much research devoted to boundary rep-
resentations. In the computer graphics world a very common boundary represen-
tation is a triangular mesh, see e.g. Owen (1998). A major challenge to the use of
mesh representations in shape statistics is correspondence, i.e. relating the mesh
parameters (e.g. triangle vertices) across instances of shape data objects. Two im-
portant approaches to this are Active Shape Models, see Cootes et al. (1994) for
a good introduction, and the entropy-based ideas of Cates et al. (2007). Another
major formulation of boundary representations is through Fourier methods, e.g.
as in Kelemen et al. (1999). For sufficiently smooth shapes, Kurtek et al. (2013)
have shown that superior representation comes from enhancing boundary repre-
sentations by also including surface normal vectors in the data objects.

1.2.3 Skeletal Shape Representations


As discussed in Siddiqi and Pizer (2008), a medial representation can provide
improvements for a number of imaging tasks. The key idea is to base the rep-
resentation on the more robust concept of 3-d solids instead of on 2-d boundary
surfaces. For the reasons discussed in Chapter 3 of Siddiqi and Pizer (2008), the
concept of medial locus has been generalized to give skeletal representations. As
noted in Pizer et al. (2013) the enhanced flexibility of skeletal representations al-
lows for superior fits to data. A skeletal representation of one bladder, prostate and
rectum instance is illustrated in Figure 1.10.
The left panel of Figure 1.10 shows the interior components of three skeletal
representations, one for each organ. Each has a set of yellow dots, called skeletal
atoms, connected by green line segments, which are a discretization of the skele-
tal sheet, the 2-d surface which is approximately medial in the sense of being
equidistant from both boundaries. Each skeletal atom has spokes, shown as cyan
and magenta line segments, extending from the skeletal sheet to the boundary of
the organ. Skeletal atoms at the edge of the sheet each have one additional spoke
shown in red, extending to the edge of the organ. The central panel of Figure 1.10
adds three colored meshes (yellow for the bladder, green for the prostate, red for
the rectum) which indicate the boundary of each that is implied by the interior
components as a quadrilateral mesh that connects the ends of the spokes. The
right panel shows the boundary more explicitly by coloring the panels of the quad
meshes and using a light source shading in the same colors. The skeletal model is
a parametric model of shape, whose parameters are the 3-d locations of the yellow

Figure 1.10 Skeletal representation of a single bladder-prostate-rectum. Left panel shows
the central skeletal sheets, atoms and spokes for each shape object. Center panel adds the
implied boundaries as quad meshes, using yellow for the bladder, green for the prostate,
red for the rectum. Right panel represents the implied boundaries using a light source
rendering.

atoms, the lengths of the spokes, and the angles of the spokes, each of which is
represented as a point on the sphere S^2.
The data objects in this OODA case study are chosen to be the skeletal models
represented by the locations of k atoms in R^3, l positive spoke lengths in R_+ and
m directions on S^2. For CT images where a manual segmentation has been performed,
the skeletal shape model can be fit to the binary image shown in blue in
Figure 1.8 (i.e. the various parameters estimated), using direct methods such as
least squares. However as discussed above, for clinical applications such as radi-
ation treatment planning, with a need for a technician to perform this operation
several tens of times for one course of treatment, manual segmentation is pro-
hibitively expensive. This motivated the work cited at the end of this section, on
automating fitting of skeletal models (as shown in Figure 1.10) directly to raw CT
images (as shown in Figure 1.7). As discussed above, this requires incorporation
of something akin to anatomical information. That is done using a Bayesian sta-
tistical approach. Essentially some manual segmentations are used to train a prior
distribution using OODA, which is combined with a likelihood that is based on a
new CT image, to generate a posterior distribution which is maximized over the
parameters of the skeletal shape representation, to give an automatic segmentation.
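
The combination of prior and likelihood can be illustrated in the simplest possible setting. The sketch below assumes, purely for illustration, a Gaussian prior on a Euclidean parameter vector and a Gaussian likelihood centered at that vector, in which case the posterior is again Gaussian and its maximizer is the posterior mean. The actual image-based likelihood used for segmentation is not specified here, and the function name `gaussian_posterior` is hypothetical.

```python
import numpy as np

def gaussian_posterior(prior_mean, prior_cov, observation, obs_cov):
    """Posterior of theta for theta ~ N(prior_mean, prior_cov), y | theta ~ N(theta, obs_cov)."""
    prior_precision = np.linalg.inv(prior_cov)
    obs_precision = np.linalg.inv(obs_cov)
    post_cov = np.linalg.inv(prior_precision + obs_precision)
    post_mean = post_cov @ (prior_precision @ prior_mean + obs_precision @ observation)
    return post_mean, post_cov

# Toy example: prior trained at 0, new observation at (2, 4), equal uncertainty.
prior_mean = np.zeros(2)
prior_cov = np.eye(2)
observation = np.array([2.0, 4.0])
obs_cov = np.eye(2)

map_estimate, post_cov = gaussian_posterior(prior_mean, prior_cov, observation, obs_cov)
```

With equal prior and observation covariances, the estimate falls halfway between the prior mean and the observation, showing how the trained prior pulls a noisy fit toward anatomically plausible values.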
1.2.4 Bayes Segmentation via Principal Geodesic Analysis
The Bayes implementation employed in this type of application differs somewhat
from most modern Bayes applications. On one hand, the underlying probability
distributions are very basic, since only conjugate Gaussian priors, likelihood, and
hence posteriors are used. This is a strong contrast with the complicated models
involving Markov chain Monte Carlo methods that are currently very prevalent in
applications of Bayes methods. On the other hand, this Bayes application is relatively
deep in two ways. First, the number of parameters to fit is typically much
higher than the number of training instances, i.e. it lies in the high dimensional
OODA domain discussed in a general way in Chapter 14. The second complication
is the non-Euclidean nature of the reparameterizations, caused mostly by
each spoke naturally lying on the surface of the sphere S^2. As this research has
progressed, the high dimensionality has been handled by a variety of methods
related to PCA. More challenging is that skeletal parameterized data objects are
naturally elements of a space of the form R^{3k} × (R_+)^l × (S^2)^m (i.e. tuples of 3k
real numbers, l positive reals, and m points on the sphere). Such spaces are called
manifolds in differential geometry (see Section 8.2 for an introduction to aspects
of this topic needed for OODA) and are usefully thought of as curved surfaces
(e.g. the surface of a sphere).
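
A data object in such a product space can be held in a small container that checks each factor separately. The class below is a hypothetical sketch, not the representation used in the cited software: atoms are stored as a (k, 3) array, spoke lengths as l positive numbers, and spoke directions as m unit vectors in R^3.

```python
from dataclasses import dataclass
import numpy as np

@dataclass(eq=False)
class SkeletalObject:
    atoms: np.ndarray        # shape (k, 3), atom locations in R^3
    lengths: np.ndarray      # shape (l,), positive spoke lengths
    directions: np.ndarray   # shape (m, 3), unit vectors, i.e. points on S^2

    def __post_init__(self):
        self.atoms = np.asarray(self.atoms, dtype=float)
        self.lengths = np.asarray(self.lengths, dtype=float)
        self.directions = np.asarray(self.directions, dtype=float)
        if self.atoms.ndim != 2 or self.atoms.shape[1] != 3:
            raise ValueError("atoms must have shape (k, 3)")
        if self.lengths.ndim != 1 or np.any(self.lengths <= 0):
            raise ValueError("spoke lengths must be positive")
        if self.directions.ndim != 2 or self.directions.shape[1] != 3:
            raise ValueError("directions must have shape (m, 3)")
        if not np.allclose(np.linalg.norm(self.directions, axis=1), 1.0):
            raise ValueError("directions must be unit vectors")

    @property
    def manifold_dimension(self):
        """Dimension 3k + l + 2m of the product manifold."""
        return 3 * len(self.atoms) + len(self.lengths) + 2 * len(self.directions)

example = SkeletalObject(
    atoms=np.zeros((2, 3)),
    lengths=np.array([1.0, 2.0, 0.5]),
    directions=np.eye(3),
)
```

Each direction is stored with three coordinates but contributes only two dimensions, because the unit length constraint confines it to the curved surface S^2.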
The need to address the first complication (the high dimension) in the bladder-
prostate-rectum segmentation challenge described above has led to a series of de-
velopments in terms of analogs of PCA for data lying on the manifolds of skeletal
representations. The Principal Geodesic Analysis (PGA) of Fletcher et al. (2004)
represents an important early advance in this work. The main idea of PGA is
to consider the Euclidean PCA basis as a set of orthogonal lines that (sequen-
tially) best fit the data. In PGA these best fitting lines are replaced by best fitting
geodesics (e.g. great circles on S^2) which are a natural analog of lines. The results
of a PGA, based upon n = 17 skeletal representations (collected over a sequence
of days) from a single patient are shown in Figure 1.11.
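
The geodesic idea can be sketched on a single sphere S^2. The code below uses the exponential and log maps of the sphere to compute a Fréchet mean by iterative averaging in the tangent space, and then applies ordinary PCA to the log-mapped data. This tangent space approach is a common simplification, assumed here for illustration; it is not claimed to reproduce the analysis behind Figure 1.11, and all function names are hypothetical.

```python
import numpy as np

def sphere_exp(p, v):
    """Follow the great circle from p in tangent direction v for distance |v|."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p.copy()
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def sphere_log(p, q):
    """Tangent vector at p pointing to q, with length equal to the geodesic distance."""
    cos_theta = np.clip(p @ q, -1.0, 1.0)
    w = q - cos_theta * p
    norm_w = np.linalg.norm(w)
    if norm_w < 1e-12:
        return np.zeros(3)
    return np.arccos(cos_theta) * w / norm_w

def frechet_mean(points, n_iter=100, tol=1e-10):
    """Iteratively move toward the average of the log-mapped points."""
    p = points.mean(axis=0)
    p = p / np.linalg.norm(p)
    for _ in range(n_iter):
        step = np.mean([sphere_log(p, q) for q in points], axis=0)
        p = sphere_exp(p, step)
        if np.linalg.norm(step) < tol:
            break
    return p

# Toy data: 17 points spread along the equator, centered at (1, 0, 0).
angles = np.linspace(-0.6, 0.6, 17)
points = np.column_stack([np.cos(angles), np.sin(angles), np.zeros(17)])

mean = frechet_mean(points)
tangent = np.array([sphere_log(mean, q) for q in points])
cov = tangent.T @ tangent / (len(points) - 1)
direction = np.linalg.eigh(cov)[1][:, -1]   # first principal direction in the tangent space
```

For this toy data the mean is (1, 0, 0) and the principal direction is tangent to the equator, so the first principal geodesic is the great circle on which the points were generated.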
Figure 1.11 reveals clinically interesting modes of variation of these organs
within this person. The left column (first mode of variation) seems to reflect verti-
cal shift variation driven by the rectum. The second mode (middle column) shows
twisting, while the third (right column) is about emptying and filling of the blad-
der. This input led to the Bayes segmentation method giving very effective auto-
matic segmentation. That was the basis for the successful start-up company Mor-
phormics, which was subsequently purchased by the radiation treatment equip-
ment manufacturer Accuray.
More recently there has been a series of improvements to PGA, motivated by a
succession of deeper and deeper integrations of statistical ideas with differential
geometry. Detailed discussion of this progression appears in Section 8.3. While
this discussion has focused mostly on segmentation using skeletal shape repre-
sentations, much important related work has been done on classification as dis-
cussed in Chapter 11 and on confirmatory analysis which appears here in Chapter
13. A good overview of the usefulness of skeletal representations, especially in

Figure 1.11 Principal Geodesic Analysis of Bladder-Prostate-Rectum variation within one
person. Columns give visual impression of first 3 PGA modes of variation. All three plots
in the second row are the Fréchet mean (notion of center defined in (7.5)). Top row shows
the three +2 standard deviation departures from the mean, and bottom row shows the cor-
responding -2 standard deviation departures. This gives three interpretable and sensible
modes of variation.

comparison to other types of representations can be found in Pizer et al. (2013,
2014), Schulz et al. (2016), and Hong et al. (2016).
The bladder-prostate-rectum research that lies at the core of the discussion
of this section was developed in a series of papers. That includes Chaney et al.
(2004), Broadhurst et al. (2005), Davis et al. (2005), Pizer et al. (2005a,b, 2006,
2007), Lu et al. (2007), Stough et al. (2007), Jeong et al. (2008), Merck et al.
(2008), and Feng et al. (2010).
CHAPTER 2

Breadth of OODA

This chapter illustrates the breadth of OODA through relatively brief overviews
of quite diverse applications.

2.1 Amplitude and Phase Data Objects


A challenging situation in FDA is when the curve data objects are misaligned,
as shown in the top panel of Figure 2.1 for a Proteomics data set called the TIC
Curves here. Many statistical methods can be strongly impacted by misalignment.
An example of the impact of misalignment on the sample mean in FDA is shown
using the Shifted Betas toy data in Figure 5.17. A quite different type of impact on
PCA appears in Figure 9.2. As noted in Marron et al. (2014b), FDA approaches
to dealing with alignment issues are sometimes called curve registration, because
it is very useful in situations where the curve data objects are clearly misaligned.
There are many approaches to the curve registration challenge, with an
overview provided in the survey paper Marron et al. (2015). Most methods in
the area involve tuning parameters that have proven to be tricky to choose in a
fully automatic way, as illustrated in Figure 9.1. This problem has been solved
using the OODA way of thinking, as discussed in Chapter 9. In particular, that
approach rests on unusually deep mathematical ideas built on the Fisher-Rao
metric, which resulted in a rigorous methodology that is hence useful in a fully
automatic way.
An interesting example of curve registration, from Koch et al. (2014) and Mar-
ron et al. (2014a), is shown in Figure 2.1. The data objects here are proteomics
mass spectrometry profiles from Ho (2011), a larger study of bio-markers in Acute
Myeloid Leukemia. A detailed description of this data set including a number of
pre-processing steps (such as median smoothing and interpolation to an equally
spaced grid) can be found in Koch et al. (2014). Essentially there are 5 patients,
represented as colors, with 3 replicate curves for each patient, thus 15 curves in
all, shown in the top part of the top panel. Each curve shows Total Ion Counts
(TIC), for an equally spaced grid of 2001 mass to charge ratios (horizontal coor-
dinate). The TIC curves have many peaks, which correspond to various peptides.
A common goal of mass spectrometry analyses is curve registration, i.e. finding
deformations, sometimes called warpings (intuitively thought of as stretchings
and compressings of the horizontal axis), to properly align the peaks so that they
chemically correspond. In most contexts it is hard to quantitatively assess the per-
formance of a given registration, but this data set is special because the locations
of several of the actual peptide peaks have been (laboriously) found for each curve

DOI: 10.1201/9781351189675-2
using additional information as detailed in Koch et al. (2014). These peak loca-
tions, for each of the 15 curves, are indicated by peak numbers (1–14), with colors
corresponding to the curves. The peak numbers are sorted vertically by height of
the corresponding peak and connected with gray line segments to give some visual
correspondence. It is hard to see much pattern, showing this to be a challenging
curve registration problem.
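
The action of a warping on a curve can be sketched as a composition evaluated by interpolation. In the example below the grid has 2001 equally spaced points, mirroring the TIC curves, but the single-peak curve and the warp are hypothetical choices made only to show how a stretching of the horizontal axis moves a peak.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 2001)      # equally spaced grid on [0, 1]

def is_valid_warp(warp):
    """A warp must be increasing and must fix both endpoints."""
    increasing = np.all(np.diff(warp) > 0)
    return bool(increasing and np.isclose(warp[0], 0.0) and np.isclose(warp[-1], 1.0))

def apply_warp(curve, warp):
    """Evaluate curve(warp(t)) on the grid by linear interpolation."""
    return np.interp(warp, t, curve)

template = np.exp(-0.5 * ((t - 0.5) / 0.03) ** 2)   # one peak at 0.5
warp = t ** 2                                       # compresses the left, stretches the right
warped = apply_warp(template, warp)

peak_before = t[np.argmax(template)]
peak_after = t[np.argmax(warped)]
```

The peak moves from 0.5 to the point where the warp takes the value 0.5, which is the square root of 0.5, while its height is unchanged. Registration methods search for warps of this kind that bring corresponding peaks into alignment.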
As noted above, there are a number of approaches to this type of data challenge,
with several such analyses of this data set discussed in Marron et al. (2014a).
The bottom panel of Figure 2.1 shows the results of registration of these same

Figure 2.1 Top panel contains raw TIC Curves data, with a labeling of certain important
peaks in the lower part of the panel. Bottom panel shows a Fisher-Rao registration of the
TIC curves. Numbers under the curves indicate peak locations, showing that the registra-
tion has been mostly quite effective.
TIC curves using the Fisher-Rao method proposed in Srivastava et al. (2011) and
Kurtek et al. (2012) (discussed in more detail in Section 9.1), using only the curves
themselves and not the peak location information. The colored numbers reveal that
this is a particularly challenging problem, because the peaks have quite different
heights across patients. Peak 10 is particularly challenging as it is quite low for
the red patient (especially compared to nearby very tall peaks), yet is the highest
peak for other patients. Note the alignment is not perfect for every numbered peak,
but it is still of impressively high quality. Roughly comparable quality has been
obtained using a linear registration approach that is integrated with clustering in
Bernardi et al. (2014b), and by a Bayesian approach in Cheng et al. (2014).
An important point made in the overview of Marron et al. (2015) is that curve
registration methods are useful more generally than simply to align curves. While
in some contexts, such as that of Figure 2.1, the phase component is merely
nuisance variation to be dealt with but of no intrinsic interest, there are many
other situations where the warps themselves represent useful modes of variation.
In such contexts it is insightful to consider different types of data objects for
OODA: amplitude data objects, whose variation is contained in the aligned
curves, and phase data objects, which are the warps used to achieve the alignment.
Depending on the context, either or both choices of data object can be of primary
interest, or either could represent just nuisance variation.
The notions of amplitude and phase data objects are illustrated using the
Bimodal Phase Shift example in Figure 2.2. The upper left panel shows a simulated
functional data set, where every data object (curve) has two peaks and is a multiple
of a beta mixture probability density. A rainbow color scheme is used to
distinguish the curves, in order of how separated the peaks are. The peaks have both
different heights, showing substantial amplitude variation, and quite different
locations, reflecting strong phase variation. These two types of variation
are decomposed in a useful way by the warping functions shown in the bottom
right panel, computed using the Fisher-Rao method described in Section 9.1. The
vertical axis is the same as in the upper left panel. Rescaling that axis using the
magenta warp functions moves the magenta peaks inwards, and using the red warp
functions moves the red peaks outwards. The top right panel shows the
amplitude data objects, i.e. aligned curves. A careful look shows that the random peak
heights are linearly related, with the left peak being high when the right peak is
low. This set of amplitude data objects consists of just a single one-dimensional
mode of variation. The warps in the bottom right panel can be thought of as the
phase data objects, although they are not easy to interpret. Enhanced
interpretation of the variation in the phase data objects comes from the view in the
bottom left panel. That is an application of each of the warps to the Fréchet mean
(discussed in Section 7.7) template from the Fisher-Rao calculation, which nicely
reflects the one-dimensional phase variation.
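
As a rough illustration of the amplitude and phase decomposition described above,
the following Python sketch builds a toy two-peak data set in which the two
components are known by construction. It does not implement the Fisher-Rao
method. The warp family `warp`, the beta parameters, and the mixture weights are
all illustrative assumptions, not the settings behind Figure 2.2.

```python
import math
import numpy as np

def beta_pdf(t, a, b):
    """Beta(a, b) probability density evaluated on an array t in [0, 1]."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * t ** (a - 1) * (1 - t) ** (b - 1)

def template(t, w):
    """Two-peak beta mixture; w sets the left peak height, 1 - w the right."""
    return w * beta_pdf(t, 6, 18) + (1 - w) * beta_pdf(t, 18, 6)

def warp(t, a):
    """Assumed one-parameter warp of [0, 1]; strictly increasing for |a| < 1."""
    return np.clip(t + a * np.sin(2 * np.pi * t) / (2 * np.pi), 0.0, 1.0)

def peak_locations(curve, t):
    """Location of the highest point in each half of [0, 1]."""
    left = t < 0.5
    return t[left][np.argmax(curve[left])], t[~left][np.argmax(curve[~left])]

t = np.linspace(0, 1, 2001)
shifts = np.linspace(-0.6, 0.6, 7)    # phase parameter, one per curve
weights = np.linspace(0.3, 0.7, 7)    # amplitude parameter, one per curve

# Observed curves mix both kinds of variation.
curves = np.array([template(warp(t, a), w) for a, w in zip(shifts, weights)])
# Amplitude data objects: the aligned curves (no warping applied).
amplitude = np.array([template(t, w) for w in weights])
# Phase data objects: the warps themselves.
phase = np.array([warp(t, a) for a in shifts])

for a, c in zip(shifts, curves):
    left, right = peak_locations(c, t)
    print(f"a = {a:+.1f}: peak separation = {right - left:.3f}")
```

In this toy setting the warps change where the peaks sit but not how tall they
are, so the peak heights live entirely in `amplitude` and the peak separation
entirely in `phase`. With real data neither component is known in advance, which
is why a registration method is needed.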
22 BREADTH OF OODA

Figure 2.2 Bimodal Phase Shift data (top left panel) showing decomposition into amplitude
(top right panel) and phase (bottom left panel) modes of variation. Decomposition is based
on the warping functions (bottom right panel). Rainbow color scheme highlights the phase
mode, with red for closest peaks through magenta for farthest peaks.

As clearly demonstrated in Figures 9.2, 9.3, 9.9, and 9.11, decompositions of
the type shown in Figure 2.2 can be much more useful than a standard PCA in
FDA, which tends both to mix the amplitude and phase components and to
spread the phase variation over a large number of components, because it is a
nonlinear mode of variation from that viewpoint. As discussed in Marron et al.
(2014b, 2015), amplitude-phase decomposition is useful in many FDA
applications. As noted above, for some of these, such as the TIC data shown in Figure
2.1, the amplitude data objects are the focus of the analysis, and the phase data
objects can be viewed as nuisance parameters. However, in other situations, for
example when analyzing neural spike train data (as discussed in Wu et al. (2014)),
the phase data objects are of primary interest, and the amplitude data objects are
the nuisance component. In still other situations, both amplitude and phase data
objects are important, as is their joint variation. These include the variation in the
AneuRisk65 artery shape data in Sangalli et al. (2014a), and in the juggling data
discussed in Ramsay et al. (2014).
Figure 2.3 shows some of the analysis of the Juggling data from Lu and
Marron (2014a). The starting point was positional recordings of location over time of
the hand of a juggler, which were reduced to time series of acceleration curves,
as discussed in Ramsay et al. (2014). These traces were cut into cycles and time
registered, to obtain the 113 curves shown in the far left panel of Figure 2.3. Thus
the data objects in this OODA are time registered 1-d acceleration curves. Figure
3 of Lu and Marron (2014a) shows a variety of PCA type scores plots. Most of
these seem to indicate a homogeneous population. The center left panel of Figure
2.3 shows the version based on the method of Principal Nested Spheres (PNS),
from Jung et al. (2012a). As further described in Sections 8.5 and 9.2, PNS makes
special use of the fact that Fisher-Rao warp data objects naturally lie on a high
dimensional sphere. The value added by this method, which takes the curvature
of the sphere properly into account, is that it shows two clear clusters, which are
highlighted using the graphical technique of brushing, i.e. visually separating the
clusters through the use of colors and symbols. See Section 9.2 for more discussion
(based on Yu et al. (2017a)) of how and why PNS provides enhanced statistical
analysis of Fisher-Rao phase data objects. The clusters shown in the center left
panel of Figure 2.3 represent important underlying structure in the data. This is
clear from the two right-hand panels of Figure 2.3, which show actual vertical
and horizontal locations of the paths (orthographic projections) corresponding to
these clusters, using the same colors. There are clearly two quite different types
of motions present in the data, which correspond to “better controlled” and “less
well-controlled” cycles.
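
The remark that warp data objects lie on a sphere can be checked numerically. A
warp of [0, 1] increases from 0 to 1, so the square root of its derivative has
squared L2 norm equal to 1. The Python sketch below verifies this on a grid and
computes the great-circle (arc length) distance between two such points. The
particular warp and the helper names are assumptions made for illustration, and
this is not an implementation of PNS.

```python
import numpy as np

def trapezoid(y, t):
    """Trapezoidal rule for the integral of y over the grid t."""
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(t)))

def sqrt_derivative(gamma, t):
    """Square root of the numerical derivative of a warp sampled on t."""
    deriv = np.gradient(gamma, t, edge_order=2)
    return np.sqrt(np.clip(deriv, 0.0, None))

def sphere_distance(psi1, psi2, t):
    """Arc length between two unit-norm functions on the L2 sphere."""
    inner = trapezoid(psi1 * psi2, t)
    return float(np.arccos(np.clip(inner, -1.0, 1.0)))

t = np.linspace(0, 1, 2001)
identity = t                                               # no warping
warped = t + 0.5 * np.sin(2 * np.pi * t) / (2 * np.pi)     # assumed example warp

psi_id = sqrt_derivative(identity, t)
psi_w = sqrt_derivative(warped, t)

norm_sq = trapezoid(psi_w ** 2, t)          # close to 1: a point on the unit sphere
dist = sphere_distance(psi_id, psi_w, t)    # distance from the identity warp
print(norm_sq, dist)
```

Because every warp maps to a point on the same unit sphere, methods designed for
spherical data can be applied to the phase data objects directly.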

Figure 2.3 Analysis of the Juggling data. Far left panel shows the input acceleration
curves. Center left is the Principal Nested Spheres scatterplot, revealing two distinct
clusters highlighted by brushing. Right panels verify these clusters represent two
different types of cycles.

Figure 2.3 uses parts of Figures 2, 3, and 4 from Lu and Marron (2014a).

2.2 Tree-Structured Data Objects


A very different example of OODA is trees, in the sense of graph theory, as data
objects. An interesting data set, where each data object is a representation of the
set of arteries in one person’s brain, was collected by Bullitt and Aylward (2002);
Aylward and Bullitt (2002). While a long term goal is to study pathologies,
including stroke tendency or brain cancer, such cases were deliberately screened
out of this data set to focus on normal variation within the population. Interesting
quantities that are useful for various comparisons below are age and gender.
Section 10.1 gives an overview of various analytic approaches that feature improving
abilities to distinguish age and gender.
These data objects were acquired using a modality of Magnetic Resonance
Imaging called Magnetic Resonance Angiography (MRA). This modality flags
motion as white, so the flow of blood through the arteries shows up very well.
This is seen in Figure 2.4 as the white spots, where the different panels show
adjacent horizontal slices of the 3-d image.

Figure 2.4 Three adjacent slices of an MRA image for a single subject. Arteries show up
as white dots and curves.

A major contribution of Aylward and Bullitt (2002) was the development of a
3-d tube tracking algorithm which was used to generate a reconstruction of a given
artery tree. At this point the data object is the union of many small spheres, whose
centers follow the central curve of each arterial branch, and whose radii are the
branch radius at that point. This tree representation, from the MRA shown in
Figure 2.4, can be seen in Figure 2.5. The three panels show different rotations of the
same set of arteries. The left panel is a fairly large rotation and the right panel is a
small one, with the closest vessels moved to the left and right, respectively.
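
The union-of-spheres representation described above can be sketched as a small
data structure. The Python example below stores each branch as sphere centers and
radii plus a list of child branches, and tests whether a point lies inside the
tree. The class name `Branch`, its fields, and the toy geometry are hypothetical.
This is not the format produced by the tube tracking algorithm of Aylward and
Bullitt (2002).

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Branch:
    """One arterial branch: spheres along a centerline, plus child branches."""
    centers: np.ndarray                  # shape (k, 3), points on the central curve
    radii: np.ndarray                    # shape (k,), branch radius at each point
    children: list = field(default_factory=list)

    def contains(self, point):
        """True if point lies in any sphere of this branch or its descendants."""
        point = np.asarray(point, dtype=float)
        dist = np.linalg.norm(self.centers - point, axis=1)
        if np.any(dist <= self.radii):
            return True
        return any(child.contains(point) for child in self.children)

    def count_branches(self):
        """Number of branches in the subtree rooted here."""
        return 1 + sum(child.count_branches() for child in self.children)

steps = np.linspace(0.0, 10.0, 101)
zeros = np.zeros_like(steps)

# Toy trunk running up the z axis with radius 1.
trunk = Branch(np.column_stack([zeros, zeros, steps]), np.full(101, 1.0))
# Toy side branch leaving the top of the trunk along the x axis with radius 0.5.
side = Branch(np.column_stack([steps, zeros, np.full(101, 10.0)]), np.full(101, 0.5))
trunk.children.append(side)

print(trunk.count_branches())            # 2
print(trunk.contains([0.5, 0.0, 5.0]))   # True: inside the trunk
print(trunk.contains([3.0, 0.0, 5.0]))   # False: outside both branches
```

A structure of this kind keeps both the branching pattern (the tree) and the
geometry (centers and radii), which is what makes such data objects hard to
average or to feed into standard multivariate methods.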

Figure 2.5 Three views of the arterial tree for the subject in Figure 2.4, showing the 3-d
structure through somewhat different rotations.

Such data object representations have been computed for approximately 100
people, in the Brain Artery data set. For example, three more of these for three
different subjects are shown in Figure 2.6. The original study was a little larger,
but some were deleted due to MRA acquisition problems.

Figure 2.6 Artery tree data objects for three additional subjects.

Data objects of this type present major challenges to doing statistical analysis.
For example, it is really not clear how to define even the sample mean of such a
set of objects. Understanding variation about the mean, e.g. as done by PCA in
Section 1.1, is a further challenge. A number of approaches to this are discussed
in Chapter 10, which studies these trees in the more general context of graphs as
data objects.

2.3 Sounds as Data Objects


Another example of OODA is sounds as data objects, which have been studied in
a particularly deep way in a series of papers analyzing human speech based on
digital recordings. Hadjipantelis et al. (2012, 2015) investigated Mandarin
Chinese using a mixed effects model to develop relations between dialects which were
consistent with linguistic ideas. Coleman et al. (2015) used these methods to
extrapolate back in time to estimate how archaic languages may have sounded. Pigoli
et al. (2018) analyze the relationships between modern Romance languages,
yielding insights well beyond those available from classical textual linguistic analysis
(such as studied in Section 10.2). In addition a transformation is proposed that
provides an estimated reconstruction of how a given speaker would sound speaking
a different language. Tavakoli et al. (2019) combined these analyses with spatial
smoothing to produce a dialect map of the United Kingdom. Shiers et al. (2017)
developed a sound-based evolutionary tree for Romance languages and dialects.
A typical first step in those analyses is to decompose the raw digital recording
of the sound into a spectrogram, which is a moving window version of the Fourier
transform, giving a frequency representation in time, as shown in Figure 2.7, from
the study of Pigoli et al. (2018), kindly provided by Davide Pigoli. The top panel
is the raw recording of one person saying the word “deux” (two) in French.
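
A moving window Fourier transform of the kind just described can be sketched in a
few lines of Python. The window length, hop size, Hann window, and the synthetic
two-tone signal below are assumptions chosen for illustration. They are not the
settings used by Pigoli et al. (2018).

```python
import numpy as np

def spectrogram(signal, sample_rate, window_length=256, hop=128):
    """Power spectrogram: rows are frequencies, columns are window positions."""
    window = np.hanning(window_length)
    n_frames = 1 + (len(signal) - window_length) // hop
    # Cut the signal into overlapping windows and taper each one.
    frames = np.stack([
        signal[i * hop : i * hop + window_length] * window
        for i in range(n_frames)
    ])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(window_length, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + window_length / 2) / sample_rate
    return freqs, times, power.T

# Synthetic one-second "recording": 500 Hz tone, then 1500 Hz tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.where(t < 0.5,
                np.sin(2 * np.pi * 500 * t),
                np.sin(2 * np.pi * 1500 * t))

freqs, times, spec = spectrogram(tone, sample_rate)
dominant = freqs[np.argmax(spec, axis=0)]   # strongest frequency in each window
print(spec.shape, dominant[0], dominant[-1])
```

The `dominant` array tracks the change of pitch over time, which a single Fourier
transform of the whole recording would not show.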
Figure 2.7 Summarization of raw recording of a human speech sound of “deux” in French,
top panel, into a corresponding spectrogram showing time and frequency information with
color coding height, shown in the bottom panel.

Frequently, the focus is on human speech from the viewpoint that aspects such
as pitch and timing are nuisances to be removed from the analysis. For that choice
of data objects, those effects are removed by reducing the spectrogram to
appropriately defined time and frequency covariance matrices (which are finite
representations of covariance functions). Mean vectors also sometimes play an important
role. Color heat-map representation summaries (as discussed in Section 6.1) of
five covariance matrices (with entries colored according to the bars on the right,
all using the same scale to facilitate comparison) from Pigoli et al. (2018) are
shown in Figure 2.8, also from Davide Pigoli. For each language these summaries
are based on aggregating sounds for the spoken digits (1–10). An exploratory
visual comparison of these suggests some similarities (e.g. American and Castilian
Spanish) and also some stark contrasts, such as that between Portuguese and the
others. Confirmatory analysis of these points and a number of others using
permutation testing methods can be found in Pigoli et al. (2018).
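
The text does not give the precise definitions of the time and frequency
covariance matrices, so the following Python sketch shows only one plausible
reading: marginal covariances computed from a stack of equally sized spectrograms
by averaging over the other axis. The function name, the normalization, and the
random toy data are assumptions, and the actual estimator in Pigoli et al. (2018)
may differ.

```python
import numpy as np

def marginal_covariances(spectrograms):
    """Frequency and time covariance matrices from stacked spectrograms.

    spectrograms has shape (n_recordings, n_freq, n_time).
    """
    n, n_freq, n_time = spectrograms.shape
    # Remove the mean spectrogram so only variation between recordings remains.
    residuals = spectrograms - spectrograms.mean(axis=0)
    # Covariance between frequencies, averaged over recordings and time points.
    freq_cov = np.einsum("nft,ngt->fg", residuals, residuals) / ((n - 1) * n_time)
    # Covariance between time points, averaged over recordings and frequencies.
    time_cov = np.einsum("nft,nfs->ts", residuals, residuals) / ((n - 1) * n_freq)
    return freq_cov, time_cov

rng = np.random.default_rng(0)
toy = rng.normal(size=(20, 8, 12))       # 20 recordings, 8 frequencies, 12 times
freq_cov, time_cov = marginal_covariances(toy)
print(freq_cov.shape, time_cov.shape)    # (8, 8) (12, 12)
```

Each matrix is symmetric and non-negative definite, so it can be displayed as a
heat map and compared across languages using distances suited to covariance
matrices.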

Figure 2.8 Covariance representation summaries of speech sound from five different
languages/dialects. Note strong differences between them, with potentially interesting
historical and geographical connections.

In the overall area of sounds as data objects, there is another interesting parallel
to the phenomenon noted in Section 2.1, that depending on the context either phase
or amplitude data objects could be the major focus of the analysis with the other
considered to be nuisance variation. In particular, the above work focused on a
particular type of analysis of sounds as data objects, where the goal was to study
human speech by a variety of speakers. As the human brain does when parsing
speech, they deliberately chose data objects which focused on aspects of the sound
that are about meaning of the words, which means generally treating issues such as
the consideration of the qualities of the steamer in sight, a subject
on which, as seamen, they might better sympathise.
“That’s a droll-looking revenue cutter, after all, Capt. Spike,” he
said—“a craft better fitted to go in a fleet, as a look-out vessel, than
to chase a smuggler in-shore.”
“And no goer in the bargain! I do not see how she gets along, for
she keeps all snug under water; but, unless she can travel faster
than she does just now, the Molly Swash would soon lend her the
Mother Carey’s Chickens of her own wake to amuse her.”
“She has the tide against her, just here, sir; no doubt she would
do better in still water.”
Spike muttered something between his teeth, and jumped down
on deck, seemingly dismissing the subject of the revenue entirely
from his mind. His old, coarse, authoritative manner returned, and
he again spoke to his mate about Rose Budd, her aunt, the “ladies’
cabin,” the “young flood,” and “casting off,” as soon as the last
made. Mulford listened respectfully, though with a manifest distaste
for the instructions he was receiving. He knew his man, and a
feeling of dark distrust came over him, as he listened to his orders
concerning the famous accommodations he intended to give to Rose
Budd and that “capital old lady, her aunt;” his opinion of “the
immense deal of good sea-air and a v’y’ge would do Rose,” and how
“comfortable they both would be on board the Molly Swash.”
“I honor and respect Mrs. Budd, as my captain’s lady, you see,
Mr. Mulford, and intend to treat her accordin’ly. She knows it—and
Rose knows it—and they both declare they’d rather sail with me,
since sail they must, than with any other ship-master out of
America.”
“You sailed once with Capt. Budd yourself, I think I have heard
you say, sir?”
“The old fellow brought me up. I was with him from my tenth to
my twentieth year, and then broke adrift to see fashions. We all do
that, you know, Mr. Mulford, when we are young and ambitious, and
my turn came as well as another’s.”
“Capt. Budd must have been a good deal older than his wife, sir,
if you sailed with him when a boy,” Mulford observed a little drily.
“Yes; I own to forty-eight, though no one would think me more
than five or six-and-thirty, to look at me. There was a great
difference between old Dick Budd and his wife, as you say, he being
about fifty when he married, and she less than twenty. Fifty is a
good age for matrimony, in a man, Mulford; as is twenty in a young
woman.”
“Rose Budd is not yet nineteen, I have heard her say,” returned
the mate, with emphasis.
“Youngish, I will own, but that’s a fault a liberal-minded man can
overlook. Every day, too, will lessen it. Well, look to the cabins, and
see all clear for a start. Josh will be down presently with a cart-load
of stores, and you’ll take ’em aboard without delay.”
As Spike uttered this order, his foot was on the plank-sheer of the
bulwarks, in the act of passing to the wharf again. On reaching the
shore, he turned and looked intently at the revenue steamer, and his
lips moved, as if he were secretly uttering maledictions on her. We
say maledictions, as the expression of his fierce, ill-favored
countenance too plainly showed that they could not be blessings. As
for Mulford, there was still something on his mind, and he followed
to the gangway ladder and ascended it, waiting for a moment, when
the mind of his commander might be less occupied, to speak. The
opportunity soon occurred, Spike having satisfied himself with the
second look at the steamer.
“I hope you don’t mean to sail again without a second mate,
Capt. Spike?” he said.
“I do, though, I can tell you. I hate Dickies—they are always in
the way, and the captain has to keep just as much of a watch with
one as without one.”
“That will depend on his quality. You and I have both been
Dickies in our time, sir; and my time was not long ago.”
“Ay—ay—I know all about it—but you didn’t stick to it long
enough to get spoiled. I would have no man aboard the Swash who
made more than two v’y’ges as second officer. As I want no spies
aboard my craft, I’ll try it once more without a Dicky.”
Saying this in a sufficiently positive manner, Capt. Stephen Spike
rolled up the wharf, much as a ship goes off before the wind, now
inclining to the right, and then again to the left. The gait of the man
would have proclaimed him a sea-dog, to any one acquainted with
that animal, as far as he could be seen. The short squab figure, the
arms bent nearly at right angles at the elbow, and working like two
fins with each roll of the body, the stumpy, solid legs, with the feet
looking in the line of his course and kept wide apart, would all have
contributed to the making up of such an opinion. Accustomed as he
was to this beautiful sight, Harry Mulford kept his eyes riveted on
the retiring person of his commander, until it disappeared behind a
pile of lumber, waddling always in the direction of the more thickly
peopled parts of the town. Then he turned and gazed at the
steamer, which, by this time, had fairly passed the brig, and seemed
to be actually bound through the Gate. That steamer was certainly a
noble-looking craft, but our young man fancied she struggled along
through the water heavily. She might be quick at need, but she did
not promise as much by her present rate of moving. Still, she was a
noble-looking craft, and, as Mulford descended to the deck again, he
almost regretted he did not belong to her; or, at least, to any thing
but the Molly Swash.
Two hours produced a sensible change in and around that
brigantine. Her people had all come back to duty, and what was very
remarkable among seafaring folk, sober to a man. But, as has been
said, Spike was a temperance man, as respects all under his orders
at least, if not strictly so in practice himself. The crew of the Swash
was large for a half-rigged brig of only two hundred tons, but, as her
spars were very square, and all her gear as well as her mould
seemed constructed for speed, it was probable more hands than
common were necessary to work her with facility and expedition.
After all, there were not many persons to be enumerated among the
“people of the Molly Swash,” as they called themselves; not more
than a dozen, including those aft, as well as those forward. A
peculiar feature of this crew, however, was the circumstance that
they were all middle-aged men, with the exception of the mate, and
all thorough-bred sea-dogs. Even Josh, the cabin-boy, as he was
called, was an old, wrinkled, gray-headed negro, of near sixty. If the
crew wanted a little in the elasticity of youth, it possessed the
steadiness and experience of their time of life, every man appearing
to know exactly what to do, and when to do it. This, indeed,
composed their great merit; an advantage that Spike well knew how
to appreciate.
The stores had been brought alongside of the brig in a cart, and
were already stowed in their places. Josh had brushed and swept,
until the ladies’ cabin could be made no neater. This ladies’ cabin
was a small apartment beneath a trunk, which was, ingeniously
enough, separated from the main cabin by pantries and double
doors. The arrangement was unusual, and Spike had several times
hinted that there was a history connected with that cabin; though
what the history was Mulford never could induce him to relate. The
latter knew that the brig had been used for a forced trade on the
Spanish Main, and had heard something of her deeds in bringing off
specie, and proscribed persons, at different epochs in the revolutions
of that part of the world, and he had always understood that her
present commander and owner had sailed in her, as mate, for many
years before he had risen to his present station. Now, all was regular
in the way of records, bills of sale, and other documents; Stephen
Spike appearing in both the capacities just named. The register
proved that the brig had been built as far back as the last English
war, as a private cruiser, but recent and extensive repairs had made
her “better than new,” as her owner insisted, and there was no
question as to her sea-worthiness. It is true the insurance offices
blew upon her, and would have nothing to do with a craft that had
seen her two score years and ten; but this gave none who belonged
to her any concern, inasmuch as they could scarcely have been
underwritten in their trade, let the age of the vessel be what it
might. It was enough for them that the brig was safe, and
exceedingly fast, insurances never saving the lives of the people,
whatever else might be their advantages. With Mulford it was an
additional recommendation, that the Swash was usually thought to
be of uncommonly just proportions.
By half past two, P. M., every thing was ready for getting the
brigantine under way. Her foretopsail—or foretawsail, as Spike called
it—was loose, the fasts were singled, and a spring had been carried
to a post in the wharf that was well forward of the starboard bow,
and the brig’s head turned to the southwest, or down stream, and
consequently facing the young flood. Nothing seemed to connect the
vessel with the land but a broad gangway plank, to which Mulford
had attached life-lines, with more care than it is usual to meet with
on board of vessels employed in short voyages. The men stood
about the decks with their arms thrust into the bosoms of their
shirts, and the whole picture was one of silent, and possibly of
somewhat uneasy expectation. Nothing was said, however; Mulford
walking the quarter-deck alone, occasionally looking up the still little
tenanted streets of that quarter of the suburbs, as if to search for a
carriage. As for the revenue-steamer, she had long before gone
through the southern passage of Blackwell’s, steering for the Gate.
“Dat’s dem, Mr. Mulford,” Josh at length cried, from the look-out
he had taken in a stern-port, where he could see over the low
bulwarks of the vessel. “Yes, dat’s dem, sir. I know dat old gray
horse dat carries his head so low and sorrowful like, as a horse has a
right to do dat has to drag a cab about dis big town. My eye! what a
horse it is, sir!”
Josh was right, not only as to the gray horse that carried his
head “sorrowful like,” but as to the cab and its contents. The vehicle
was soon on the wharf, and in its door soon appeared the short,
sturdy figure of Capt. Spike, backing out, much as a bear descends a
tree. On top of the vehicle were several light articles of female
appliances, in the shape of bandboxes, bags, &c., the trunks having
previously arrived in a cart. Well might that over-driven gray horse
appear sorrowful, and travel with a lowered head. The cab, when it
gave up its contents, discovered a load of no less than four persons
besides the driver, all of weight, and of dimensions in proportion,
with the exception of the pretty and youthful Rose Budd. Even she
was plump, and of a well-rounded person; though still light and
slender. But her aunt was a fair picture of a ship-master’s widow;
solid, comfortable and buxom. Neither was she old, nor ugly. On the
contrary, her years did not exceed forty, and being well preserved, in
consequence of never having been a mother, she might even have
passed for thirty-five. The great objection to her appearance was the
somewhat indefinite character of her shape, which seemed to blend
too many of its charms into one. The fourth person, in the fare, was
Biddy Noon, the Irish servant and factotum of Mrs. Budd, who was a
pock-marked, red-faced, and red-armed single woman, about her
mistress’s own age and weight, though less stout to the eye.
Of Rose we shall not stop to say much here. Her deep-blue eye,
which was equally spirited and gentle, if one can use such
contradictory terms, seemed alive with interest and curiosity, running
over the brig, the wharf, the arm of the sea, the two islands, and all
near her, including the Alms-House, with such a devouring rapidity
as might be expected in a town-bred girl, who was setting out on
her travels for the first time. Let us be understood; we say town-
bred, because such was the fact; for Rose Budd had been both born
and educated in Manhattan, though we are far from wishing to be
understood that she was either very-well born, or highly educated.
Her station in life may be inferred from that of her aunt, and her
education from her station. Of the two, the last was, perhaps, a trifle
the highest.
We have said that the fine blue eye of Rose passed swiftly over
the various objects near her, as she alighted from the cab, and it
naturally took in the form of Harry Mulford, as he stood in the
gangway, offering his arm to aid her aunt and herself in passing the
brig’s side. A smile of recognition was exchanged between the young
people, as their eyes met, and the color, which formed so bright a
charm in Rose’s sweet face, deepened, in a way to prove that that
color spoke with a tongue and eloquence of its own. Nor was
Mulford’s cheek mute on the occasion, though he helped the
hesitating, half-doubting, half-bold girl along the plank with a steady
hand and rigid muscles. As for the aunt, as a captain’s widow, she
had not felt it necessary to betray any extraordinary emotions in
ascending the plank, unless, indeed, it might be those of delight on
finding her foot once more on the deck of a vessel!
Something of the same feeling governed Biddy, too, for, as
Mulford civilly extended his hand to her also, she exclaimed—
“No fear of me, Mr. Mate—I came from Ireland by wather, and
knows all about ships and brigs, I do. If you could have seen the
times we had, and the saas we crossed, you’d not think it nadeful to
say much to the likes iv me.”
Spike had tact enough to understand he would be out of his
element in assisting females along that plank, and he was busy in
sending what he called “the old lady’s dunnage” on board, and in
discharging the cabman. As soon as this was done, he sprang into
the main-channels, and thence, viâ the bulwarks, on deck, ordering
the plank to be hauled aboard. A solitary laborer was paid a quarter
to throw off the fasts from the ring-bolts and posts, and every thing
was instantly in motion to cast the brig loose. Work went on as if the
vessel were in haste, and it consequently went on with activity. Spike
bestirred himself giving his orders in a way to denote he had been
long accustomed to exercise authority on the deck of a vessel, and
knew his calling to its minutiæ. The only ostensible difference
between his deportment to-day and on any ordinary occasion,
perhaps, was in the circumstance that he now seemed anxious to
get clear of the wharf and that in a way which might have attracted
notice in any suspicious and attentive observer. It is possible that
such a one was not very distant, and that Spike was aware of his
presence, for a respectable-looking, well-dressed, middle-aged man
had come down one of the adjacent streets, to a spot within a
hundred yards of the wharf and stood silently watching the
movements of the brig, as he leaned against a fence. The want of
houses in that quarter enabled any person to see this stranger from
the deck of the Swash, but no one on board her seemed to regard
him at all, unless it might be the master.
“Come, bear a hand, my hearty, and toss that bow-fast clear,”
cried the captain, whose impatience to be off seemed to increase as
the time to do so approached nearer and nearer. “Off with it, at
once, and let her go.”
The man on the wharf threw the turns of the hawser clear of the
post, and the Swash was released forward. A smaller line, for a
spring, had been run some distance along the wharves, ahead of the
vessel, and brought in aft. Her people clapped on this, and gave way
to their craft, which, being comparatively light, was easily moved,
and was very manageable. As this was done, the distant spectator
who had been leaning on the fence, moved toward the wharf with a
step a little quicker than common. Almost at the same instant, a
short, stout, sailor-like looking little person, waddled down the
nearest street, seeming to be in somewhat of a hurry, and presently
he joined the other stranger, and appeared to enter into
conversation with him; pointing toward the Swash, as he did so. All
this time, both continued to advance toward the wharf.
In the meanwhile, Spike and his people were not idle. The tide
did not run very strong near the wharves and in the sort of a bight
in which the vessel had lain, but, such as it was, it soon took the
brig on her inner bow, and began to cast her head off shore. The
people at the spring pulled away with all their force, and got
sufficient motion on their vessel to overcome the tide, and to give
the rudder an influence. The latter was put hard a-starboard, and
helped to cast the brig’s head to the southward.
Down to this moment, the only sail that was loose on board the
Swash, was the fore-topsail, as mentioned. This still hung in the
gear, but a hand had been sent aloft to overhaul the buntlines and
clew-lines, and men were also at the sheets. In a minute the sail
was ready for hoisting. The Swash carried a wapper of a fore-and-aft
mainsail, and, what is more, it was fitted with a standing gaff, for
appearance in port. At sea, Spike knew better than to trust to this
arrangement, but in fine weather, and close in with the land, he
found it convenient to have this sail haul out and brail like a ship’s
spanker. As the gaff was now aloft, it was only necessary to let go
the brails to loosen this broad sheet of canvas, and to clap on the
out-hauler, to set it. This was probably the reason why the brig was
so unceremoniously cast into the stream, without showing more of
her cloth. The jib and flying-jibs, however, did at that moment drop
beneath their booms, ready for hoisting.
Such was the state of things as the two strangers came first
upon the wharf. Spike was on the taffrail, overhauling the main-
sheet, and Mulford was near him, casting the fore-topsail braces
from the pins, preparatory to clapping on the halyards.
“I say, Mr. Mulford,” asked the captain, “did you ever see either of
them chaps afore? These jokers on the wharf I mean.”
“Not to my recollection, sir,” answered the mate, looking over the
taffrail to examine the parties. “The little one is a burster! The
funniest looking little fat old fellow I’ve seen in many a day.”
“Ay, ay, them fat little bursters, as you call ’em, are sometimes
full of the devil. I don’t like either of the chaps, and am right glad we
are well cast, before they got here.”
“I do not think either would be likely to do us much harm, Capt.
Spike.”
“There’s no knowing, sir. The biggest fellow looks as if he might
lug out a silver oar at any moment.”
“I believe the silver oar is no longer used, in this country at
least,” answered Mulford, smiling. “And if it were, what have we to
fear from it? I fancy the brig has paid her reckoning.”
“She don’t owe a cent, nor ever shall for twenty-four hours after
the bill is made out, while I own her. They call me ready-money
Stephen, round among the ship-chandlers and caulkers. But I don’t
like them chaps, and what I don’t relish I never swallow, you know.”
“They’ll hardly try to get aboard us, sir; you see we are quite
clear of the wharf, and the mainsail will take now, if we set it.”
Spike ordered the mate to clap on the out-hauler, and spread
that broad sheet of canvas at once to the little breeze there was.
This was almost immediately done, when the sail filled, and began to
be felt on the movement of the vessel. Still, that movement was very
slow, the wind being so light, and the vis inertiæ of so large a body
remaining to be overcome. The brig receded from the wharf, almost
in a line at right angles to its face, inch by inch, as it might be,
dropping slowly up with the tide at the same time. Mulford now
passed forward to set the jibs, and to get the topsail on the craft,
leaving Spike on the taffrail, keenly eyeing the strangers, who, by
this time, had got down nearly to the end of the wharf, at the berth
so lately occupied by the Swash. That the captain was uneasy was
evident enough, that feeling being exhibited in his countenance,
blended with a malignant ferocity.
“Has that brig any pilot?” asked the larger and better-looking of
the two strangers.
“What’s that to you, friend?” demanded Spike, in return. “Have
you a Hell-Gate branch?”
“I may have one, or I may not. It is not usual for so large a craft
to run the Gate without a pilot.”
“Oh! my gentleman’s below, brushing up his logarithms. We shall
have him on deck to take his departure before long, when I’ll let him
know your kind inquiries after his health.”
The man on the wharf seemed to be familiar with this sort of
sea-wit, and he made no answer, but continued that close scrutiny of
the brig, by turning his eyes in all directions, now looking below, and
now aloft, which had in truth occasioned Spike’s principal cause for
uneasiness.
“Is not that Capt. Stephen Spike, of the brigantine Molly Swash?”
called out the little, dumpling-looking person, in a cracked, dwarfish
sort of a voice, that was admirably adapted to his appearance. Our
captain fairly started; turned full toward the speaker; regarded him
intently for a moment, and gulped the words he was about to utter,
like one confounded. As he gazed, however, at little dumpy,
examining his bow-legs, red broad cheeks, and coarse snub nose, he
seemed to regain his self-command, as if satisfied the dead had not
really returned to life.
“Are you acquainted with the gentleman you have named?” he
asked, by way of answer. “You speak of him like one who ought to
know him.”
[Illustration: Josh educating a Pig. Philadelphia, 1847.]

“A body is apt to know a shipmate. Stephen Spike and I sailed
together twenty years since, and I hope to live to sail with him
again.”
“You sail with Stephen Spike? when and where, may I ask, and in
what v’y’ge, pray?”
“The last time was twenty years since. Have you forgotten little
Jack Tier, Capt. Spike?”
Spike looked astonished, and well he might, for he had supposed
Jack to be dead fully fifteen years. Time and hard service had
greatly altered him, but the general resemblance in figure, stature,
and waddle, certainly remained. Notwithstanding, the Jack Tier Spike
remembered was quite a different person from this Jack Tier. That
Jack had worn his intensely black hair clubbed and curled, whereas
this Jack had cut his locks into short bristles, which time had turned
into an intense gray. That Jack was short and thick, but he was flat
and square; whereas this Jack was just as short, a good deal thicker,
and as round as a dumpling. In one thing, however, the likeness still
remained perfect. Both Jacks chewed tobacco, to a degree that
became a distinct feature in their appearance.
Spike had many reasons for wishing Jack Tier were not
resuscitated in this extraordinary manner, and some for being glad to
see him. The fellow had once been largely in his confidence, and
knew more than was quite safe for any one to remember but himself,
while he might be of great use to him in his future operations. It is
always convenient to have one at your elbow who thoroughly
understands you, and Spike would have lowered a boat and sent it
to the wharf to bring Jack off, were it not for the gentleman who
was so inquisitive about pilots. Under the circumstances, he
determined to forego the advantages of Jack’s presence, reserving
the right to hunt him up on his return.
The reader will readily enough comprehend that the Molly Swash
was not absolutely standing still while the dialogue related was going
on, and the thoughts we have recorded were passing through her
master’s mind. On the contrary, she was not only in motion, but that
motion was gradually increasing, and by the time all was said that
has been related, it had become necessary for those who spoke to
raise their voices to an inconvenient pitch in order to be heard. This
circumstance alone would soon have put an end to the conversation,
had not Spike’s pausing to reflect brought about the same result, as
mentioned.
In the mean time, Mulford had got the canvas spread. Forward,
the Swash showed all the cloth of a full-rigged brig, even to royals
and flying jib; while aft, her mast was the raking, tall, naked pole
of an American schooner. There was a taut top-mast, too, to which a
gaff-topsail was set, and the gear proved that she could also show,
at need, a staysail in this part of her, if necessary. As the Gate was
before them, however, the people had set none but the plain,
manageable canvas.
The Molly Swash kept close on a wind, luffing athwart the broad
reach she was in, until far enough to weather Blackwell’s, when she
edged off to her course, and went through the southern passage.
Although the wind remained light, and a little baffling, the brig was
so easily impelled, and was so very handy, that there was no
difficulty in keeping her perfectly in command. The tide, too, was
fast increasing in strength and velocity, and the movement from this
cause alone was getting to be sufficiently rapid.
As for the passengers, of whom we have lost sight in order to get
the brig under way, they were now on deck again. At first, they had
all gone below, under the care of Josh, a somewhat rough groom of
the chambers, to take possession of their apartment, a sufficiently
neat, and exceedingly comfortable cabin, supplied with every thing
that could be wanted at sea, and, what was more, lined on two of its
sides with state-rooms. It is true, all these apartments were small,
and the state-rooms were very low, but no fault could be found with
their neatness and general arrangements, when it was recollected
that one was on board a vessel.
“Here ebbery t’ing heart can wish,” said Josh, exultingly, who,
being an old-school black, did not disdain to use some of the old-
school dialect of his caste. “Yes, ladies, ebbery t’ing. Let Capt. Spike
alone for dat! He won’erful at accommodation! Not a bed-bug aft—
know better dan come here; jest like de people, in dat respects, and
keep deir place forrard. You nebber see a pig come on the quarter-
deck, nudder.”
“You must maintain excellent discipline, Josh,” cried Rose, in one
of the sweetest voices in the world, which was easily attuned to
merriment—“and we are delighted to learn what you tell us. How do
you manage to keep up these distinctions, and make such creatures
know their places so well?”
“Nuttin easier, if you begins right, miss. As for de pig, I teach
dem wid scaldin’ water. Whenever I sees a pig come aft, I gets a
little water from de copper, and just scald him wid it. You can’t t’ink,
miss, how dat mend his manners, and make him squeel fuss, and
t’ink arter. In that fashion I soon gets de ole ones in good trainin’,
and den I has no more trouble with dem as comes fresh aboard; for
de ole hog tell de young one, and ’em won’erful cunnin’, and know
how to take care of ’emself.”
Rose Budd’s sweet eyes were full of fun and expectation, and she
could no more repress her laugh than youth and spirits can always
be discreet.
“Yes, with the pigs,” she cried, “that might do very well; but how
is it with those—other creatures?”
“Rosy, dear,” interrupted the aunt, “I wish you would say no more
about such shocking things. It’s enough for us that Capt. Spike has
ordered them all to stay forward among the men, which is always
done on board well disciplined vessels. I’ve heard your uncle say, a
hundred times, that the quarter-deck was sacred, and that might be
enough to keep such animals off it.”
It was barely necessary to look at Mrs. Budd in the face to get a
very accurate general notion of her character. She was one of those
inane, uncultivated beings, who seem to be protected by a
benevolent Providence in their pilgrimage on earth, for they do not
seem to possess the power to protect themselves. Her very
countenance expressed imbecility and mental dependence, credulity
and a love of gossip. Notwithstanding these radical weaknesses, the
good woman had some of the better instincts of her sex, and was
never guilty of any thing that could properly convey reproach. She
was no monitress for Rose, however, the niece much oftener
influencing the aunt than the aunt influencing the niece. The latter
had been fortunate in having had an excellent instructress, who,
though incapable of teaching her much in the way of
accomplishments, had imparted a great deal that was respectable
and useful. Rose had character, and strong character, too, as the
course of our narrative will show; but her worthy aunt was a pure
picture of as much mental imbecility as at all comported with the
privileges of self-government.
The conversation about “those other creatures” was effectually
checked by Mrs. Budd’s horror of the “animals,” and Josh was called
on deck so shortly after as to prevent its being renewed. The
females staid below a few minutes, to take possession, and then
they re-appeared on deck, to gaze at the horrors of the Hell-Gate
passage. Rose was all eyes, wonder and admiration of every thing
she saw. This was actually the first time she had ever been on the
water, in any sort of craft, though born and brought up in sight of
one of the most thronged havens in the world. But there must be a
beginning to every thing, and this was Rose Budd’s beginning on the
water. It is true the brigantine was a very beautiful, as well as an
exceedingly swift vessel, but all this was lost on Rose, who would
have admired a horse-jockey bound to the West Indies, in this the
incipient state of her nautical knowledge. Perhaps the exquisite
neatness that Mulford maintained about every thing that came under
his care, and that included every thing on deck, or above board, and
about which neatness Spike occasionally muttered an oath, as so
much senseless trouble, contributed somewhat to Rose’s pleasure;
but her admiration would scarcely have been less with anything that
had sails, and seemed to move through the water with a power
approaching that of volition.
It was very different with Mrs. Budd. She, good woman, had
actually made one voyage with her late husband, and she fancied
that she knew all about a vessel. It was her delight to talk on
nautical subjects, and never did she really feel her great superiority
over her niece, so very unequivocally, as when the subject of the
ocean was introduced, about which she did know something, and
touching which Rose was profoundly ignorant, or as ignorant as a
girl of lively imagination could remain with the information gleaned
from others.
“I am not surprised you are astonished at the sight of the vessel,
Rosy,” observed the self-complacent aunt at one of her niece’s
exclamations of admiration. “A vessel is a very wonderful thing, and
we are told what extr’orny beings they are that ‘go down to the sea
in ships.’ But you are to know this is not a ship at all, but only a half-
jigger rigged, which is altogether a different thing.”
“Was my uncle’s vessel, The Rose In Bloom, then, very different
from the Swash?”
“Very different, indeed, child! Why, The Rose In Bloom was a full-
jiggered ship, and had twelve masts—and this is only a half-jiggered
brig, and has but two masts. See, you may count them—one—two!”
Harry Mulford was coiling away a top-gallant-brace, directly in
front of Mrs. Budd and Rose, and, at hearing this account of the
wonderful equipment of The Rose In Bloom, he suddenly looked up,
with a lurking expression about his eye that the niece very well
comprehended, while he exclaimed, without much reflection, under
the impulse of surprise⁠—
“Twelve masts! Did I understand you to say, ma’am, that Capt.
Budd’s ship had twelve masts?”
“Yes, sir, twelve! and I can tell you all their names, for I learnt
them by heart—it appearing to me proper that a ship-master’s wife
should know the names of all the masts in her husband’s vessel. Do
you wish to hear their names, Mr. Mulford?”
Harry Mulford would have enjoyed this conversation to the top of
his bent, had it not been for Rose. She well knew her aunt’s general
weakness of intellect, and especially its weakness on this particular
subject, but she would suffer no one to manifest contempt for either,
if in her power to prevent it. It is seldom one so young, so mirthful,
so ingenuous and innocent in the expression of her countenance,
assumed so significant and rebuking a frown as did pretty Rose Budd
when she heard the mate’s involuntary exclamation about the
“twelve masts.” Harry, who was not easily checked by his equals, or
any of his own sex, submitted to that rebuking frown with the
meekness of a child, and stammered out, in answer to the well-
meaning, but weak-minded widow’s question⁠—
“If you please, Mrs. Budd—just as you please, ma’am—only
twelve is a good many masts—” Rose frowned again—“that is—more
than I’m used to seeing—that’s all.”
“I dare say, Mr. Mulford—for you sail in only a half-jigger; but
Capt. Budd always sailed in a full-jigger—and his full-jiggered ship
had just twelve masts, and, to prove it to you, I’ll give you the
names—first, then, there were the fore, main, and mizzen masts⁠—”
“Yes—yes—ma’am,” stammered Harry, who wished the twelve
masts and The Rose In Bloom at the bottom of the ocean, since her
owner’s niece still continued to look coldly displeased—“that’s right, I
can swear!”
“Very true, sir, and you’ll find I am right as to all the rest. Then,
there were the fore, main, and mizzen top-masts—they make six, if I
can count, Mr. Mulford?”
“Ah!” exclaimed the mate, laughing, in spite of Rose’s frowns, as
the manner in which the old sea-dog had quizzed his wife became
apparent to him. “I see how it is—you are quite right, ma’am—I dare
say The Rose In Bloom had all these masts, and some to spare.”
“Yes, sir—I knew you would be satisfied. The fore, main and
mizzen top-gallant-masts make nine—and the fore, main and mizzen
royals make just twelve. Oh, I’m never wrong in any thing about a
vessel, especially if she is a full-jiggered ship.”
Mulford had some difficulty in restraining his smiles each time the
full-jigger was mentioned, but Rose’s expression of countenance
kept him in excellent order—and she, innocent creature, saw nothing
ridiculous in the term, though the twelve masts had given her a little
alarm. Delighted that the old lady had got through her enumeration
of the spars with so much success, Rose cried, in the exuberance of
her spirits⁠—
“Well, aunty, for my part, I find a half-jigger vessel so very, very
beautiful, that I do not know how I should behave were I to go on
board a full-jigger.”
Mulford turned abruptly away, the circumstance of Rose’s making
herself ridiculous giving him sudden pain, though he could have
laughed at her aunt by the hour.
“Ah, my dear, that is on account of your youth and inexperience
—but you will learn better in time. I was just so, myself, when I was
of your age, and thought the fore-rafters were as handsome as the
squared-jiggers, but soon after I married Capt. Budd I felt the
necessity of knowing more than I did about ships, and I got him to
teach me. He didn’t like the business, at first, and pretended I would
never learn; but, at last, it came all at once like, and then he used to
be delighted to hear me ‘talk ship,’ as he called it. I’ve known him
laugh, with his cronies, as if ready to die, at my expertness in sea-
terms, for half an hour together—and then he would swear—that
was the worst fault your uncle had, Rosy—he would swear,
sometimes, in a way that frightened me, I do declare!”
“But he never swore at you, aunty?”
“I can’t say that he did exactly do that, but he would swear all
round me, even if he didn’t actually touch me, when things went
wrong—but it would have done your heart good to hear him laugh!
He had a most excellent heart, just like your own, Rosy dear; but,
for that matter, all the Budds have excellent hearts, and one of the
commonest ways your uncle had of showing it was to laugh,
particularly when we were together and talking. Oh, he used to
delight in hearing me converse, especially about vessels, and never
failed to get me at it when he had company. I see his good-natured,
excellent-hearted countenance at this moment, with the tears
running down his fat, manly cheeks, as he shook his very sides with
laughter. I may live a hundred years, Rosy, before I meet again with
your uncle’s equal.”
This was a subject that invariably silenced Rose. She
remembered her uncle, herself, and remembered his affectionate
manner of laughing at her aunt, and she always wished the latter to
get through her eulogiums on her married happiness, as soon as
possible, whenever the subject was introduced.
All this time the Molly Swash kept in motion. Spike never took a
pilot when he could avoid it, and his mind was too much occupied
with his duty, in that critical navigation, to share at all in the
conversation of his passengers, though he did endeavor to make
himself agreeable to Rose, by an occasional remark, when a
favorable opportunity offered. As soon as he had worked his brig
over into the south or weather passage of Blackwell’s, however,
there remained little for him to do, until she had drifted through it, a
distance of a mile or more, and this gave him leisure to do the
honors. He pointed out the castellated edifice on Blackwell’s as the
new penitentiary, and the hamlet of villas, on the other shore, as
Ravenswood, though there is neither wood nor ravens to authorize
the name. But the “Sunswick,” which satisfied the Delafields and
Gibbses of the olden time, and which distinguished their lofty halls
and broad lawns, was not elegant enough for the cockney tastes of
these later days, so “wood” must be made to usurp the place of
cherries and apples, and “ravens” that of gulls, in order to satisfy its
cravings. But all this was lost on Spike. He remembered the shore as
it had been twenty years before, and he saw what it was now, but
little did he care for the change. On the whole, he rather preferred
the Grecian Temples, over which the ravens would have been
compelled to fly, had there been any ravens in that neighborhood, to
the old fashioned and highly respectable residence that once alone
occupied the spot. The point he did understand, however, and on
the merits of which he had something to say, was a little farther
ahead. That, too, had been re-christened—the Hallet’s Cove of the
mariner being converted into Astoria—not that bloody-minded place
at the mouth of the Oregon, which has come so near bringing us to
blows with our “ancestors in England,” as the worthy denizens of
that quarter choose to consider themselves still, if one can judge by
their language. This Astoria was a very different place, and is one of
the many suburban villages that are shooting up, like mushrooms, in
a night, around the great Commercial Emporium. This spot Spike
understood perfectly, and it was not likely that he should pass it
without communicating a portion of his knowledge to Rose.
“There, Miss Rose,” he said, with a didactic sort of air, pointing
with his short, thick finger at the little bay which was just opening to
their view; “there’s as neat a cove as a craft need bring up in. That
used to be a capital place to lie in, to wait for a wind to pass the
Gate; but it has got to be most too public for my taste. I’m rural, I
tell Mulford, and love to get in out-of-the-way berths with my brig,
where she can see salt-meadows, and smell the clover. You never
catch me down in any of the crowded slips, around the markets, or
any where in that part of the town, for I do love country air. That’s
Hallet’s Cove, Miss Rose, and a pretty anchorage it would be for us,
if the wind and tide didn’t sarve to take us through the Gate.”
“Are we near the Gate, Capt. Spike?” asked Rose, the fine bloom
on her cheek lessening a little, under the apprehension that
formidable name is apt to awaken in the breasts of the
inexperienced.
“Half a mile, or so. It begins just at the other end of this island
on our larboard hand, and will be all over in about another half mile,
or so. It’s no such bad place, a’ter all, is Hell-Gate, to them that’s
used to it. I call myself a pilot in Hell-Gate, though I have no
branch.”
“I wish, Capt. Spike, I could teach you to give that place its
proper and polite name. We call it Whirl-Gate altogether now,” said
the relict.
“Well, that’s new to me,” cried Spike. “I have heard some
chicken-mouthed folk say Hurl-Gate, but this is the first time I ever
heard it called Whirl-Gate—they’ll get it to Whirlagig-Gate next. I
don’t think that my old commander, Capt. Budd, called the passage
any thing but honest, up and down Hell-Gate.”
“That he did—that he did—and all my arguments and reading
could not teach him any better. I proved to him that it was Whirl-
Gate, as any one can see that it ought to be. It is full of Whirlpools,
they say, and that shows what Nature meant the name to be.”
“But, aunty,” put in Rose, half reluctantly, half anxious to speak,
“what has gate to do with whirlpools? You will remember it is called
a gate—the gate to that wicked place I suppose is meant.”
“Rose, you amaze me! How can you, a young woman of only
nineteen, stand up for so vulgar a name as Hell-Gate?”
“Do you think it as vulgar as Hurl-Gate, aunty? To me it always
seems the most vulgar to be straining at gnats.”
“Yes,” said Spike, sentimentally, “I’m quite of Miss Rose’s way of
thinking—straining at gnats is very ill-manners, especially at table. I
once knew a man who strained in this way, until I thought he would
have choked, though it was with a fly to be sure; but gnats are
nothing but small flies, you know, Miss Rose. Yes, I’m quite of your
way of thinking, Miss Rose; it is very vulgar to be straining at gnats
and flies, more particularly at table. But you’ll find no flies or gnats
aboard here, to be straining at, or brushing away, or to annoy you.
Stand by there, my hearties, and see all clear to run through Hell-
Gate. Don’t let me catch you straining at any thing, though it should
be the fin of a whale!”
The people forward looked at each other, as they listened to this
novel admonition, though they called out the customary “ay, ay, sir,”
as they went to the sheets, braces and bowlines. To them the
passage of no Hell-Gate conveyed the idea of any particular terror,
and with the one they were about to enter, they were much too
familiar to care any thing about it.
The brig was now floating fast, with the tide, up abreast of the
east end of Blackwell’s, and in two or three more minutes she would
be fairly in the Gate. Spike was aft, where he could command a view
of every thing forward, and Mulford stood on the quarter-deck, to
look after the head-braces. An old and trustworthy seaman, who
acted as a sort of boatswain, had the charge on the forecastle, and
was to tend the sheets and tack. His name was Rove.
“See all clear,” called out Spike. “D’ye hear there, for’ard! I shall
make a half-board in the Gate, if the wind favor us, and the tide
prove strong enough to hawse us to wind’ard sufficiently to clear the
pot—so mind your⁠—”
The captain breaking off in the middle of this harangue, Mulford
turned his head, in order to see what might be the matter. There
was Spike, leveling a spy-glass at a boat that was pulling swiftly out
of the north channel, and shooting like an arrow directly athwart the
brig’s bows into the main passage of the Gate. He stepped to the
captain’s elbow.
“Just take a look at them chaps, Mr. Mulford,” said Spike, handing
his mate the glass.
“They seem in a hurry,” answered Harry, as he adjusted the glass
to his eye, “and will go through the Gate in less time than it will take
to mention the circumstance.”
“What do you make of them, sir?”
“The little man who called himself Jack Tier is in the stern-sheets
of the boat, for one,” answered Mulford.
“And the other, Harry—what do you make of the other?”
“It seems to be the chap who hailed to know if we had a pilot.
He means to board us at Riker’s Island, and make us pay pilotage,
whether we want his services or not.”
“Blast him and his pilotage too! Give me the glass”—taking
another long look at the boat, which by this time was glancing,
rather than pulling, nearly at right angles across his bows. “I want
no such pilot aboard here, Mr. Mulford. Take another look at him—
here, you can see him, away on our weather bow, already.”
Mulford did take another look at him, and this time his
examination was longer and more scrutinising than before.
“It is not easy to cover him with the glass,” observed the young
man—“the boat seems fairly to fly.”
“We’re forereaching too near the Hog’s Back, Capt. Spike,” roared
the boatswain, from forward.
“Ready about—hard a-lee,” shouted Spike. “Let all fly, for’ard—
help her round, boys, all you can, and wait for no orders! Bestir
yourselves—bestir yourselves.”
It was time the crew should be in earnest. While Spike’s attention
had been thus diverted by the boat, the brig had got into the
strongest of the current, which, by setting her fast to windward, had
trebled the power of the air, and this was shooting her over toward
one of the greatest dangers of the passage on a flood tide. As
everybody bestirred themselves, however, she was got round and
filled on the opposite tack, just in time to clear the rocks. Spike
breathed again, but his head was still full of the boat. The danger he
had just escaped as Scylla met him as Charybdis. The boatswain
again roared to go about. The order was given as the vessel began
to pitch in a heavy swell. At the next instant she rolled until the
water came on deck, whirled with her stern down the tide, and her
bows rose as if she were about to leap out of water. The Swash had
hit the Pot Rock.

——
PART II.
Watch. If we know him to be a thief, shall we not lay hands on him?
Dogb. Truly, by your office, you may; but I think they that touch
pitch will be defiled: the most peaceable way for you, if you do take a
thief, is, to let him show himself what he is, and steal out of your
company.
Much Ado About Nothing.

We left the brigantine of Capt. Spike in a very critical situation,
and the master himself in great confusion of mind. A thorough
seaman, this accident would never have happened, but for the
sudden appearance of the boat and its passengers; one of whom
appeared to be a source of great uneasiness to him. As might be
expected, the circumstance of striking a place as dangerous as the
Pot Rock in Hell-Gate, produced a great sensation on board the
vessel. This sensation betrayed itself in various ways, and according
to the characters, habits, and native firmness of the parties. As for
the ship-master’s relict, she seized hold of the main-mast, and
screamed so loud and perseveringly, as to cause the sensation to
extend itself into the adjacent and thriving village of Astoria, where
it was distinctly heard by divers of those who dwelt near the water.
Biddy Noon had her share in this clamor, lying down on the deck in
order to prevent rolling over, and possibly to scream more at her
leisure, while Rose had sufficient self-command to be silent, though
her cheeks lost their color.
Nor was there any thing extraordinary in females betraying this
alarm, when one remembers the somewhat astounding signs of
danger by which these persons were surrounded. There is always
something imposing in the swift movement of a considerable body of
water. When this movement is aided by whirlpools and the other
similar accessories of an interrupted current, it frequently becomes
startling, more especially to those who happen to be on the element
itself. This is peculiarly the case with the Pot Rock, where, not only
does the water roll and roar as if agitated by a mighty wind, but
where it even breaks, the foam seeming to glance up stream, in the
rapid succession of wave to wave. Had the Swash remained in her
terrific berth more than a second or two, she would have proved
what is termed a “total loss;” but she did not. Happily the Pot Rock
lies so low, that it is not apt to fetch up any thing of a light draught
of water; and the brigantine’s fore-foot had just settled on its
summit, long enough to cause the vessel to whirl round and make
her obeisance to the place, when a succeeding swell lifted her clear,
and away she went down stream, rolling as if scudding in a gale,
and, for a moment, under no command whatever. There lay another
danger ahead, or it would be better to say astern, for the brig was
drifting stern foremost, and that was in an eddy under a bluff, which
bluff lies at an angle in the reach, where it is no uncommon thing for
craft to be cast ashore, after they have passed all the more imposing
and more visible dangers above. It was in escaping this danger, and
in recovering the command of his vessel, that Spike now manifested
the sort of stuff of which he was really made, in emergencies of this
sort. The yards were all sharp up when the accident occurred, and
springing to the lee-braces, just as a man winks when his eye is
menaced, he seized the weather fore-brace with his own hands, and
began to round in the yard, shouting out to the man at the wheel to
“port his helm” at the same time. Some of the people flew to his
assistance, and the yards were not only squared, but braced a little
up on the other tack, in much less time than we have taken to relate
the evolution. Mulford attended to the main-sheet, and succeeded in
getting the boom out in the right direction. Although the wind was in
truth very light, the velocity of the drift filled the canvas, and taking
the arrow-like current on her lee bow, the Swash, like a frantic steed
that is alarmed with the wreck made by his own madness, came
under command, and sheered out into the stream again, where she
could drift clear of the apprehended danger astern.
“Sound the pumps,” called out Spike to Mulford, the instant he
saw he had regained his seat in the saddle. Harry sprang amidships
to obey, and the eye of every mariner in that vessel was on the
young man, as, in the midst of a death-like silence, he performed
this all-important duty. It was like the physician’s feeling the pulse of
his patient before he pronounces on the degree of his danger.
“Well, sir?” cried out Spike, impatiently, as the rod re-appeared.
“All right, sir,” answered Harry, cheerfully—“the well is nearly
empty.”
“Hold on a moment longer, and give the water time to find its
way amidships, if there be any.”
The mate remained perched up on the pump, in order to comply,
while Spike and his people, who now breathed more freely again,
improved the leisure to brace up and haul aft, to the new course.
“Biddy,” said Mrs. Budd, considerately, during this pause in the
incidents, “you needn’t scream any longer. The danger seems to be
past, and you may get up off the deck now. See, I have let go of the
mast. The pumps have been sounded, and are found tight.”
Biddy, like an obedient and respectful servant, did as directed,
quite satisfied if the pumps were tight. It was some little time, to be
sure, before she was perfectly certain whether she were alive or not
—but, once certain of this circumstance, her alarm very sensibly
abated, and she became reasonable. As for Mulford, he dropped the
sounding rod again, and had the same cheering report to make.
“The brig is tight as a bottle, sir.”
“So much the better,” answered Spike. “I never had such a whirl
in her before in my life, and I thought she was going to stop and
pass the night there. That’s the very spot on which ‘The Hussar’
frigate was wrecked.”
“So I have heard, sir. But she drew so much water that she hit
slap against the rock, and started a butt. We merely touched on its
top, with our fore-foot, and slid off.”
This was the simple explanation of the Swash’s escape, and
every body being now well assured that no harm had been done,
things fell into their old and regular train again. As for Spike, his
gallantry, notwithstanding, was upset for some hours, and glad
enough was he when he saw all three of his passengers quit the
deck to go below. Mrs. Budd’s spirits had been so much agitated that
she told Rose she would go down into the cabin and rest a few
minutes on its sofa. We say sofa, for that article of furniture, now-a-
days, is far more common in vessels than it was thirty years ago in
the dwellings of the country.
“There, Mulford,” growled Spike, pointing ahead of the brig, to an
object on the water that was about half a mile ahead of them,
“there’s that bloody boat—d’ye see? I should like of all things to give
it the slip. There’s a chap in that boat I don’t like.”
“I don’t see how that can be very well done, sir, unless we
anchor, repass the gate at the turn of the tide, and go to sea by the
way of Sandy Hook.”
“That will never do. I’ve no wish to be parading the brig before
the town. You see, Mulford, nothing can be more innocent and
proper than the Molly Swash, as you know from having sailed in her
these twelvemonths. You’ll give her that character, I’ll be sworn?”
“I know no harm of her, Capt. Spike, and hope I never shall.”
“No, sir—you know no harm of her, nor does any one else. A
nursing infant is not more innocent than the Molly Swash, or could
have a clearer character, if nothing but truth was said of her. But the
world is so much given to lying, that one of the old saints, of whom
we read in the good book, such as Calvin and John Rogers, would be
vilified if he lived in these times. Then, it must be owned, Mr.
Mulford, whatever may be the raal innocence of the brig, she has a
most desperate wicked look.”
“Why, yes, sir—it must be owned she is what we sailors call a
wicked-looking craft. But some of Uncle Sam’s cruisers have that
appearance also.”
“I know it—I know it, sir, and think nothing of looks myself. Men
are often deceived in me, by my looks, which have none of your
long-shore softness about ’em, perhaps; but my mother used to say
I was one of the most tender-hearted boys she had ever heard
spoken of—like one of the babes in the woods, as it might be. But
mankind go so much by appearances, that I do not like to trust the
brig too much afore their eyes. Now, should we be seen in the lower
bay, waiting for a wind, or for the ebb tide to make, to carry us over
the bar, ten to one but some philotropic or other would be off with a
complaint to the District Attorney, that we looked like a slaver, and
have us all fetched up to be tried for our lives as pirates. No, no—I
like to keep the brig in out-of-the-way places, where she can give no
offence to your ’tropics, whether they be philos, or of any other
sort.”
“Well, sir, we are to the eastward of the Gate, and all’s safe. That
boat cannot bring us up.”
“You forget, Mr. Mulford, the revenue craft that steamed up, on
the ebb. That vessel must be off Sands’ Point by this time, and she
may hear something to our disparagement from the feller in the
boat, and take it into her smoky head to walk us back to town. I
wish we were well to the eastward of that steamer! But there’s no
use in lamentations. If there is really any danger, it’s some distance
ahead yet, thank Heaven!”
“You have no fears of the man who calls himself Jack Tier, Capt.
Spike?”
“None in the world. That feller, as I remember him, was a little
bustlin’ chap that I kept in the cabin, as a sort of steward’s mate.
There was neither good nor harm in him, to the best of my
recollection. But Josh can tell us all about him—just give Josh a call.”
The best thing in the known history of Spike was the fact that his
steward had sailed with him for more than twenty years. Where he
had picked up Josh no one could say, but Josh and himself, and
neither chose to be very communicative on the subject. But Josh
had certainly been with him as long as he had sailed the Swash, and
that was from a time actually anterior to the birth of Mulford. The
mate soon had the negro in the council.
“I say, Josh,” asked Spike, “do you happen to remember such a
hand aboard here as one Jack Tier?”
“Lor’ bless you, yes, sir—’members he as well as I do the pea-
soup that was burnt, and which you t’rowed all over him to scald
him for punishment.”
“I’ve had to do that so often, to one careless fellow or other, that
the circumstance doesn’t recall the man. I remember him, but not as
clear as I could wish. How long did he sail with us?”
“Sebberal v’y’ge, sir, and got left ashore down on the Main, one
night, when ’e boat war obliged to shove off in a hurry. Yes,
’members little Jack, right well I does.”
