Object Oriented Data Analysis 1st Edition Marron
Object Oriented
Data Analysis
MONOGRAPHS ON STATISTICS AND APPLIED
PROBABILITY
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of
their use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and
let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access
www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive,
Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact
[email protected]
Trademark notice: Product or corporate names may be trademarks or registered trademarks and
are used only for identification and explanation without intent to infringe.
DOI: 10.1201/9781351189675
Typeset in Nimbus
by KnowledgeWorks Global Ltd.
Dedication
To our families for their ongoing strong support over the many years it took to
fully develop these ideas, and to the many colleagues who have played a vital role
in shaping this approach to data analysis.
Contents
Preface
1 What Is OODA?
1.1 Case Study: Curves as Data Objects
1.2 Case Study: Shapes as Data Objects
1.2.1 The Segmentation Challenge
1.2.2 General Shape Representations
1.2.3 Skeletal Shape Representations
1.2.4 Bayes Segmentation via Principal Geodesic Analysis
2 Breadth of OODA
2.1 Amplitude and Phase Data Objects
2.2 Tree-Structured Data Objects
2.3 Sounds as Data Objects
2.4 Images as Data Objects
5 OODA Preprocessing
5.1 Visualization of Marginal Distributions
5.1.1 Case Study: Spanish Mortality Data
5.1.2 Case Study: Drug Discovery Data
5.2 Standardization–Appropriate Linear Scaling
5.2.1 Example: Two Scale Curve Data
5.2.2 Overview of Standardization
5.3 Transformation–Appropriate Nonlinear Scaling
5.4 Registration–Appropriate Alignment
6 Data Visualization
6.1 Heat-Map Views of Data Matrices
6.2 Curve Views of Matrices and Modes of Variation
6.3 Data Centering and Combined Views
6.4 Scatterplot Matrix Views of Scores
6.5 Alternatives to PCA Directions
Bibliography
Index
Preface
Acknowledgments
Many of the ideas, and the presentation style, were developed while teaching
a graduate course entitled Object Oriented Data Analysis, which has been taught
roughly every other year at the University of North Carolina since 2005. There
were some important precursors, including a related course taught at Cornell Uni-
versity in 2002. Two such courses were offered at the Statistical and Mathemati-
cal Sciences Institute in 2010 and 2011 with lecturers (beyond the authors of this
book) including Hans-Georg Müller, James O. Ramsay, and Jane-Ling Wang. The
course was also taught at the National University of Singapore in 2015.
Several events have played a pivotal role in the development of Object Oriented
Data Analysis. One was the Statistical and Mathematical Sciences Institute pro-
gram on “Analysis of Object Data” during 2010–2011, with co-organizers Hans-
Georg Müller, James O. Ramsay and Jane-Ling Wang. Another pivotal workshop
was the November 2012 “Statistics of Time Warpings and Phase Variations” at the
Mathematical Biosciences Institute, with co-organizers James O. Ramsay, Laura
Sangalli and Anuj Srivastava.
General research in this area by the authors has been supported over the years
by a number of grants from the National Science Foundation, including DMS-
9971649, DMS-0308331, DMS-0606577, DMS-0854908 and IIS-1633074, and
the Engineering and Physical Sciences Research Council grants EP/K022547/1
and EP/T003928/1.
The material on sounds as data objects was kindly provided by Davide Pigoli.
The authors are especially grateful to Stephan F. Huckemann, John T. Kent, James
O. Ramsay, and Anuj Srivastava for providing formal reviews of early drafts that
fundamentally impacted the final version. Additional useful comments on various
drafts have been provided by Iain Carmichael, Benjamin Elztner, Thomas Keefe,
Carson Mosso, Vic Patrangenaru, Stephen M. Pizer, Davide Pigoli, and Piercesare
Secchi.
The authors are grateful to John Kimmel, for helpful advice at many points, and
for his patience over the long time it took for this book to come together.
CHAPTER 1
What Is OODA?
The fields of human endeavor currently known as statistics, data science, and data
analytics have been radically transformed over the recent past. These transforma-
tions have been driven simultaneously by a massive increase in computational
capabilities coupled with a rapidly growing scientific appetite for ever deeper un-
derstanding and insights. The notion of forming a data matrix provides a useful
paradigm for understanding important aspects of how these fields are evolving.
In particular, the currently popular context of Big Data has several quite different
facets, ranging from low dimension high sample size areas (the basis of classi-
cal mathematical statistical thought, which is perhaps typified by sample survey
and census data), through contexts with both high dimension and high sample size (common for
internet scale data sets of many types), and on to high dimension low sample size
contexts (frequently encountered in areas such as genetics, medical imaging and
other types of extremely rich but relatively expensive measurements). The press-
ing need to analyze data in this wide array of contexts has generated many exciting
new ideas and approaches.
Yet a deeper look into these developments suggests that the organization of data
into a matrix may itself be imposing limitations. In particular, there is a growing
realization that the challenges presented by Big Data are being eclipsed by the
perhaps far greater challenges of Complex Data, which are typically not easily
represented as an unconstrained matrix of numbers. Object Oriented Data Anal-
ysis (OODA) provides a useful general framework for the consideration of many
types of Complex Data. It is deliberately intended to be particularly useful in the
analysis of data in complicated situations, diverse examples of which are given in
the first two chapters. The phrase OODA in this context was coined by Wang and
Marron (2007). An overview of the area was given in Marron and Alonso (2014).
For more discussion of Big Data and its relation to statistics, see Carmichael and
Marron (2018) and many interesting viewpoints in the special issue edited by San-
galli (2018).
The OODA viewpoint is easily understood through taking data objects to be
the atoms of a statistical analysis, where atom is meant in the sense of elementary
particle, studied in several contexts of increasing complexity:
• In a first course in statistics atoms are numbers, and the goal is to develop
methods for understanding of variation in populations of numbers.
• A more advanced course, termed multivariate analysis in the statistical culture,
generalizes the atoms, i.e. the data objects, from numbers to vectors and involves
a host of methods for managing uncertainty in that context. For example,
see Mardia et al. (1979), Muirhead (1982), and Koch (2014) (for a more up-to-date
treatment).
• At the time of this writing a fashionable area in statistics is Functional Data
Analysis (FDA), where the goal is to analyze the variation in a population of
curves. A good introduction to this vibrant research area, where functions are
the data objects, can be found in Ramsay and Silverman (2002, 2005). A case
study illustrating many of the basic concepts of FDA that are useful for
understanding OODA is given in Section 1.1.
• OODA provides the next step in terms of complexity of atoms of a statistical
analysis to a wide array of more complicated objects. The important example
of shapes as data objects is considered in the case study of Section 1.2. A wide
variety of other examples, which highlight the breadth of OODA, appears in
Chapter 2.
Note that each of the above areas can be thought of as containing the preced-
ing ones as special cases. For example, multivariate analysis is the case of FDA
where the functions are discretely supported. Similarly multivariate analysis and
FDA are special cases of OODA. In later chapters it is useful to recall that OODA
includes these predecessors as special cases. This is because often simple multi-
variate examples are used for maximal clarity in the illustration of concepts and
methods, but the ideas are useful more generally for OODA.
A good question is: What is the value added to applied statistics and data sci-
ence from the concept of OODA and its attendant terminology? The terminology
is based on very substantial real-world experience with a wide variety of complex
data sets. A fact that rapidly becomes clear in the course of interdisciplinary re-
search is that there frequently are substantial hurdles in terminology. Especially at
the beginning of such endeavors, it can feel like collaborators are even speaking
different languages, so often serious effort needs to be devoted to the develop-
ment of a common set of definitions just to carry on a useful discussion. An added
complication is that for complex data contexts, it is frequently not obvious how to
even “get a handle on the data”. Usually there are many options available, which
are most effectively decided upon through careful discussion between domain scientists
and statisticians. In such discussions, the question "What should be the data
objects?" has frequently proven to lead to useful choices, thus resulting in an effective
and insightful data analysis.
Real data examples demonstrating data object choices in a variety of contexts
are given in the rest of this chapter and in Chapter 2. In particular, Section 1.1 introduces
curves as data objects. A more complex variant, discussed in Section 2.1, involves curves
with interesting variation in phase in place of, or in addition to, the usual FDA amplitude
variation. A mathematically deeper case is considered in Section
1.2, where shapes are the data objects. These require special treatment, as shapes
are most naturally viewed as points on a curved manifold. Section 2.2 considers
a perhaps even more challenging data set of tree-structured data objects. The data
objects in Section 2.3 are recordings of sounds, in particular human spoken words,
which bring special challenges in the choice of data objects. Finally, in Section
2.4, a fun example with images of faces as data objects is considered.
It is seen that the notion of data objects provides a particularly useful format
for discussing modes of variation that give insights about population structure.
This term is formally defined in Section 3.1, but until then the meaning should be
intuitively clear from the context.
One more general aspect of OODA is that there are frequently three major
phases of this type of data analysis:
1. Object Definition. This is the phase where the fundamental issue of what should
be the data objects is addressed. A number of examples of this phase are pro-
vided in the rest of this chapter and also in further examples in other sections.
2. Exploratory Analysis. Here the goal is to find perhaps surprising population
structure in data, often using some type of visualization method. A wide variety
of examples and methods for exploratory analysis are given in the rest
of this chapter and in Chapters 2, 4, and 5–10. Exploratory analysis frequently
appears only sparingly in classical statistics courses, but is usually more
prominent in machine learning. However, it has a strong statistical tradition,
going back well before the ideas nicely summarized in Tukey (1977).
3. Confirmatory Analysis. While many great discoveries have been made using
exploratory methods, it is also very easy to make discoveries that are not real,
in the sense of being non-replicable sampling artifacts. For this reason it is very
important to validate such discoveries. This critical topic and many variants of
approaches to it are discussed in detail in the very large classical statistical literature.
Some less well-known aspects that are particularly relevant to OODA
are discussed in Chapter 13.
A companion website to this book, containing links to available software, the Mat-
lab or R programs used to generate most of the figures in this book, and additional
graphics can be found at Marron (2020).
Further discussion on other ideas and nomenclature related to OODA can be
found in Chapter 18. Additional big picture discussion of data science and statis-
tics can be found in Marron (2017a) and Carmichael and Marron (2018).
[Figure 1.1: two panels plotting mortality (left) and log10(mortality) (right) against age.]
Figure 1.1 Spanish Mortality curves as a function of age. Raw male mortality is in the
left panel, with log10 mortality on the right. Years are distinguished using a rotating color
palette. Shows age effects and large variation (factors of more than 10 for some age groups)
across years, as well as the data object choice of log10 mortality being the more useful
scaling of the data.
This view already shows interesting aspects of the data. For example, being
born is a risky activity, with a high mortality rate. However, the chance of dy-
ing falls off rapidly, up until the teen years when risky behavior tends to begin.
Then through adulthood the death rate slowly increases, becoming quite high in
old age. Also note the bundle of curves is quite thick, with the axes indicating
approximately a 10-fold change over the years, which invites an investigation into how
things have changed over time. This is easily provided in Figure 1.2, by applying
a different color scheme to the curves in the right panel of Figure 1.1. Here time
ordering of the curves is highlighted through coloring with a rainbow scheme to
indicate years, starting with magenta ([1 0 1] in RGB coordinates) for 1908 and
ranging through violet, blue, cyan, green, yellow, and orange to red ([1 0 0] in
RGB) for 2002.
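The rainbow coloring just described can be sketched in a few lines. This is only an illustration: the function name `rainbow_color` and the use of a linear sweep of HSV hue are assumptions, and the exact color map used for the figures in this book may differ. The endpoints are chosen to match the text, with magenta ([1 0 1]) for 1908 and red ([1 0 0]) for 2002.

```python
import colorsys

def rainbow_color(year, first_year=1908, last_year=2002):
    """Map a year to an RGB triple running from magenta to red (hypothetical helper)."""
    # Fraction of the way through the time range, in [0, 1].
    frac = (year - first_year) / (last_year - first_year)
    # Hue 5/6 (300 degrees) is magenta and hue 0 is red. Decreasing the hue
    # passes through violet, blue, cyan, green, yellow and orange on the way.
    hue = (5.0 / 6.0) * (1.0 - frac)
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)

# One color per mortality curve, indexed by year.
colors = {year: rainbow_color(year) for year in range(1908, 2003)}
```

Each curve would then be drawn in the color assigned to its year, so that time order can be read directly from the plot.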
[Figure 1.2: log10(mortality) plotted against age, with a color legend marking years from 1917 to 1997.]
Figure 1.2 Spanish Mortality curves using a rainbow color scheme to indicate progression
in time (over years 1908–2002). Shows major improvements in mortality over this time
range.
This shows a very clear overall improvement over the years in mortality, due
mostly to improvements in medicine and public health. Note also that these im-
provements have benefited younger people more than the old, as there is not yet
much treatment available for aging. As happens frequently with OODA data, ad-
ditional visual insights come from careful decomposition of the variation present
in these curves, through a Principal Component Analysis (PCA). See Chapters
4 and 17 and Jolliffe (2002) for background information concerning the many
ways this method is used. One important use of PCA is to gain insight into how
data objects relate to each other. Insight comes from considering the data as lying
in an abstract point cloud in d = 99 dimensional space, where low-dimensional
projections frequently visually illustrate key relationships (e.g. clustering of data
objects). An often useful first step of a PCA is mean centering, which essentially
moves the point cloud so that it is centered at the origin. As seen in Figure 1.3,
this centering operation itself can provide an informative decomposition of the
data into the mean and residuals about the mean.
[Figure 1.3: two panels plotting log10(mortality) against age.]
Figure 1.3 Left panel is the mean mortality curve. Right panel contains the mean residuals,
where the mean is subtracted from each curve, using the same color scheme. Shows that
age effects are essentially common for all (i.e. over time), in the sense of appearing in the
mean. Improvements over time appear in the residuals, with overall most improvement for
the young.
The left panel of Figure 1.3 shows the mean curve, computed as the point-wise
mean of the curves in Figure 1.2. The right panel contains the mean residuals,
which are computed by subtracting the mean from each of the data curves, while
retaining the original year coloring. Note that the mean curve contains many of the
important features of the raw data, especially those related to age. In particular, the
danger of being born, the low mortality of the young, and the increasingly
higher mortality of the old are all properties of the mean. These essentially do
not appear in the mean residuals, indicating that these are population properties
which have not changed much over time. A perhaps surprising aspect of the mean
is the occasional blips that appear. One might think these are random noise, but
note that they are quite periodic and in fact appear at decades. This is a consequence
of historically poor record keeping. The early lack of birth certificates for the full
population led to some uncertainty about age at the time of death for some people, with
subsequent rounding to decades, which is clearly visible. The mean residuals also
reflect an important aspect of the population structure, being driven by the changes
over time. Most important are the dramatic improvements in mortality that have
been made over the course of this study. This view also makes it clear that the
young have benefited the most with that benefit decreasing as a function of age.
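The decomposition into mean and mean residuals is simple enough to sketch directly. The following is a minimal illustration, assuming each curve is stored as a list of values on a common grid of ages; the function names and the small toy data set are hypothetical and are not the Spanish mortality data.

```python
def mean_curve(curves):
    """Point-wise mean of a list of equal-length curves."""
    n = len(curves)
    d = len(curves[0])
    return [sum(curve[j] for curve in curves) / n for j in range(d)]

def mean_residuals(curves):
    """Return the mean curve and the residuals of each curve about that mean."""
    mean = mean_curve(curves)
    residuals = [[x - m for x, m in zip(curve, mean)] for curve in curves]
    return mean, residuals

# Hypothetical toy data: 3 "years" (data objects) observed at 4 "ages".
curves = [
    [-1.0, -3.0, -2.5, -1.5],
    [-1.5, -3.5, -2.5, -2.0],
    [-2.0, -4.0, -2.5, -2.5],
]
mean, residuals = mean_residuals(curves)
```

In this toy example the third "age" does not vary across curves, so its residuals are all zero, which mirrors how features common to all years appear only in the mean.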
PCA is usefully understood as decomposing the mean centered data in the right
panel of Figure 1.3 into insightful modes of variation (this concept is formally defined
in Section 3.1.4). One such mode is the variation revealed by the first principal
component as shown in Figure 1.4. Insight comes from thinking of the above
mentioned point cloud, where each data object (curve in this case) is a point. The
PCA modes of variation are developed by seeking orthogonal directions of maximal
variation within the point cloud. The first PC direction is the unit (i.e. norm 1)
vector, based at the sample mean, which maximizes the variance of the data projected
onto that vector. This direction is easily computed as the first eigenvector of
the sample covariance matrix (defined at (3.5)). The entries of that vector (which
indicate how it relates to the variables, i.e. features, of the data set) are called the
loadings. Visual insight into these loadings comes from the mode of variation plot
in the left panel of Figure 1.4. The horizontal axis indexes the variables, which are
ages in this case, and the curves are all multiples of the eigenvector. In particular,
the curves are projections of the data curves onto the direction vector. These are
the columns of the rank 1 matrix that is the product of the column vector of loadings
times the row vector of scores, which are the projection coefficients of each
data object onto the eigenvector. In classical multivariate analysis the scores are
also called the principal components. This matrix is the (least squares) best rank 1
approximation of the mean residual matrix shown in the right panel of Figure 1.3.
[Figure 1.4: left panel plots log10(mortality) against age; right panel plots the PC1 scores.]
Figure 1.4 PC1 mode of variation plot (left) and scores distribution plot (right). The former
shows that this dominant mode of variation reflects most of the overall improvement in
mortality. Scores plot shows most of the improvements happened relatively rapidly, plus
highlights the 1918 Flu Pandemic (violet outlier on the right) and the Spanish Civil War
(light blue sharp trend to the right).
This PC1 view highlights the dominant mode of variation, which nicely reflects
the major overall improvement in mortality. In addition, as life and death record
keeping has improved over time, the decline in age rounding effects is reflected in
the decadal spikes pointing upwards (early) and downwards (later). In particular,
the rounding was present earlier, not later, so it shows up partially in the mean in
Figure 1.3, and then as this contrast in Figure 1.4 (left panel).
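The relationship between loadings, scores and the rank 1 approximation can be made concrete with a small sketch. Several choices here are assumptions made only for illustration: the first eigenvector is found by power iteration (any eigen solver would do), the sample covariance uses divisor n - 1, data objects are stored as rows rather than columns, and the toy curves are hypothetical. Power iteration also assumes that the starting vector is not orthogonal to the first eigenvector.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(curves):
    """Return the point-wise mean and the mean residuals."""
    n, d = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(d)]
    return mean, [[x - m for x, m in zip(c, mean)] for c in curves]

def cov_times(residuals, v):
    """Multiply v by the sample covariance matrix without forming the matrix."""
    n, d = len(residuals), len(v)
    out = [0.0] * d
    for r in residuals:
        s = dot(r, v)
        for j in range(d):
            out[j] += s * r[j] / (n - 1)
    return out

def first_pc(residuals, iterations=500):
    """Unit vector of maximal projected variance, via power iteration."""
    d = len(residuals[0])
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit starting vector
    for _ in range(iterations):
        w = cov_times(residuals, v)
        length = math.sqrt(dot(w, w))
        if length == 0.0:  # no variation at all
            break
        v = [x / length for x in w]
    return v

# Hypothetical toy data: 4 data objects (rows) observed at 4 "ages".
curves = [
    [-1.0, -3.0, -2.5, -1.5],
    [-1.5, -3.4, -2.6, -2.0],
    [-2.0, -4.0, -2.5, -2.5],
    [-2.4, -4.5, -2.4, -3.1],
]
mean, residuals = center(curves)
loadings = first_pc(residuals)                   # entries of the PC1 direction
scores = [dot(r, loadings) for r in residuals]   # projection coefficients
rank_one = [[s * a for a in loadings] for s in scores]  # PC1 mode of variation
```

Each row of `rank_one` is a multiple of the loadings vector, which is what the curves in a mode of variation plot display. Because objects are rows here, `rank_one` is the transpose of the loadings-times-scores product described in the text.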
The right panel of Figure 1.4 is the PC1 scores distribution plot, which will be
used frequently in the following to display detailed information as to how the data
objects relate to each other. Each circle represents one score using the same color
scheme. Horizontal coordinates indicate the score and vertical coordinates indi-
cate order in the data set, in this case the year. The magenta color of the top circle
is the year 1908 and the red color of the lowest circle is for the year 2002. The
overall leftward trend again shows the overall improvement in mortality over these
years. The black curve shows a kernel density estimate, which can be thought of as
a smooth histogram. The vertical axis records the heights of this curve. Detailed
discussion of kernel density estimation is in Chapter 15. See Wand and Jones
(1995) for a more in-depth overview. This type of display of one-dimensional dis-
tributions, which includes both the actual data points and the smooth histogram, is
used many other times in the following. In this case it shows much higher density
of scores in the higher and lower regions, which is another way of seeing that
most of the overall transition from higher to lower mortality was relatively rapid.
A couple of smaller-scale aspects are also clear in this scores plot. The violet circle
farthest to the right is the year 1918, when many people around the world died
during a flu pandemic, which until recently was the largest ever well-documented
epidemiological event worldwide. Also notable is the shift toward higher mortal-
ity (i.e. to the right) shown as light blue, which was the time of the Spanish Civil
War, just before World War II (in which Spain was not a combatant).
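The "smooth histogram" interpretation of the kernel density estimate in the scores distribution plot can be illustrated with a short sketch. It assumes a Gaussian kernel and a hand-picked bandwidth, neither of which is specified in this section, and the scores are hypothetical values chosen to be dense at both ends and sparse in the middle.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function giving the Gaussian kernel density estimate at x."""
    n = len(data)
    scale = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Average of Gaussian bumps, one centered at each data point.
        return scale * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density

# Hypothetical scores (not the actual PC1 scores of the mortality data).
scores = [-4.1, -3.8, -3.5, -0.2, 3.0, 3.4, 3.9, 6.5]
density = gaussian_kde(scores, bandwidth=0.8)  # bandwidth chosen by hand

step = 0.05
grid = [-10.0 + step * k for k in range(401)]  # evaluation grid on [-10, 10]
heights = [density(x) for x in grid]           # curve heights to be plotted
```

Plotting `heights` against `grid` gives a curve of the type shown in black in the scores plots, and overlaying the data points themselves gives the combined display. The bandwidth controls how much smoothing is applied.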
[Figure 1.5: left panel plots log10(mortality) against age; right panel plots the PC2 scores.]
Figure 1.5 PC2 mode of variation (left) and scores distribution (right), using the same
format as Figure 1.4. The loadings plot shows this second mode of variation provides a
contrast between the 20–45-year-olds with the rest. The scores plot shows the deep effects
of the flu pandemic, the Spanish civil war and automotive death rate.
Figure 1.4 showed the first mode of variation in the mortality data, called PC1.
An interesting complementary mode of variation is the second PC, as shown in
Figure 1.5. This represents the direction of second strongest variation (subject to
being orthogonal to the first direction), measured again in terms of variance of
projections. It is computed as the second eigen direction of the sample covariance
matrix. The PC2 mode of variation plot (left panel) shows that this direction highlights
differences between the 20–45-year-old cohort and the union of the young
and the old. The color pattern is harder to interpret in this mode, but is very clear
in the scores distribution plot (right panel). Note that the 20–45-year-olds suffered
even stronger effects from both the pandemic and the war, as they died at a
substantially higher rate than usual in those times. Another interesting feature is
the growing mortality for this cohort in the 1960s to 1980s (green to orange). This
period corresponds to growing access to automobiles, and apparently reflects the idea that
young males are the group most prone to risky automobile behavior. Note that
in the final years, the direction of this trend has fortunately reversed, which has
been ascribed to much improved car safety (such as seat belts) and also to major
improvements in roads.
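The orthogonality constraint that defines the second PC direction can be illustrated as follows. This sketch assumes power iteration with the previously found directions projected out at every step, which is one of several ways to obtain the second eigen direction. The function name `leading_direction` and the already centered toy residuals are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def leading_direction(residuals, previous=(), iterations=1000):
    """Direction of maximal projected variance, orthogonal to `previous`."""
    d = len(residuals[0])
    v = [1.0 + 0.1 * j for j in range(d)]  # arbitrary starting vector
    for _ in range(iterations):
        # Multiply by the covariance matrix. The 1/(n - 1) factor is
        # omitted because the result is renormalized anyway.
        w = [0.0] * d
        for r in residuals:
            s = dot(r, v)
            for j in range(d):
                w[j] += s * r[j]
        # Enforce orthogonality to the directions already found.
        for p in previous:
            c = dot(w, p)
            w = [x - c * y for x, y in zip(w, p)]
        length = math.sqrt(dot(w, w))
        if length == 0.0:  # no variation left
            break
        v = [x / length for x in w]
    return v

# Hypothetical mean residuals: 4 data objects (rows), 3 variables.
residuals = [
    [2.0, 2.0, 0.5],
    [1.0, 1.0, -0.5],
    [-1.0, -1.0, -0.5],
    [-2.0, -2.0, 0.5],
]
pc1 = leading_direction(residuals)
pc2 = leading_direction(residuals, previous=(pc1,))
scores1 = [dot(r, pc1) for r in residuals]
scores2 = [dot(r, pc2) for r in residuals]
```

In this toy example the first direction captures the common trend in the first two variables, while the second direction isolates the third variable, whose scores contrast the first and last objects with the middle two. Pairing `scores1` with `scores2` gives the coordinates for a scatterplot of PC1 versus PC2 scores.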
The concept of modes of variation as determined by PCA loadings and scores
is explored more deeply in Section 3.1.
[Figure 1.6: scatterplot with PC1 scores on the horizontal axis and PC2 scores on the vertical axis.]
Figure 1.6 Scatterplot of PC1 vs. PC2 scores for the Spanish Mortality data. This shows
many of the above historical trends in a single plot.
Figure 1.6 shows a scatterplot of the bivariate distribution of the PC1 and PC2
scores, which provides a useful and concise summary of both modes of variation,
i.e. of much of the structure in this data set. The one-dimensional PC1 scores
distribution in the right panel of Figure 1.4 is on the horizontal axis, while the ver-
tical axis has the corresponding PC2 scores distribution from the right of Figure
1.5. This is the two-dimensional projection of the data onto the plane with max-
imal variation. Note that the circles representing the data objects (i.e. the mor-
tality curves) are connected with line segments in time order, which facilitates
keeping the progression of years in mind when interpreting the plot. The overall
improvement in mortality, with the exceptions of flu and war, is clear from the
main leftwards progression. Variation over time of the contrast between the
20–45-year-olds and the rest is also clear on the vertical axis, nicely highlighting the
flu, war and automobile effects.
For this data set, the most interesting views are in the first two principal components.
For others, more components can also be quite insightful. A useful summary of
several principal components is a matrix of such scatterplots, with the axes carefully
coordinated over both rows and columns. The diagonal of such a display is most
useful when it shows some sort of 1-d distributional summary, e.g. the combina-
tion of jitter plots (the colored circles) and kernel density estimates used to show
the distribution of scores in the right panels of Figures 1.4 and 1.5. Jitter plots
are discussed in more detail in Section 4.1. Further examples of such matrices of
scatterplots can be found in Figures 4.4, 4.12, 4.13, and many other places in later
chapters.
Mortality rates for other countries can be explored in a similar way. For example,
mortality data from Switzerland (also available in Wilmoth and Shkolnikov
(2008)) show similar flu pandemic and automobile effects as observed here but,
as expected, neither the data rounding (due to a longer period of good record keeping) nor the
war-caused mortality effects are visible.
Figure 1.7 One slice of 3-d CT image in Bladder-Prostate-Rectum data. Bones are white,
black gas bubbles indicate the rectum. Bladder and prostate are light gray near the center
and lower center. This image shows that automatic segmentation is very challenging.
Figure 1.8 Left panel shows the results of a manual segmentation of the bladder, performed
sequentially on orthogonal slices. Right panel shows a rotated view of the same bladder, to
highlight the 3-d aspect of the segmentation.
Figure 1.9 Toy data set of triangles in $\mathbb{R}^2$, to illustrate shapes as data objects. Lines
separate three equivalence classes (i.e. fibers or orbits) with respect to translation, rotation,
and scaling.
of the quotient space of triangle shapes is the sphere $S^2$ (see (3.2) for a formal
definition). Sections 8.2 and 8.4 contain a broader discussion of shape quotient
spaces, where it is seen that many of those are also curved. This provides strong
motivation for studying data objects lying on curved manifolds, as done below
and in more depth in Section 8.3.
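Membership in the same equivalence class with respect to translation, rotation, and scaling can be checked numerically for planar triangles. The sketch below uses a standard device that is an assumption here rather than something stated in the text: vertices are treated as complex numbers, translation and scale are removed by centering and normalizing, and two normalized configurations differ by a rotation exactly when their complex inner product has modulus 1. Vertices are assumed to be listed in corresponding order, reflections are treated as different shapes, and the function names are hypothetical.

```python
import math

def preshape(triangle):
    """Remove translation and scale from a triangle given as (x, y) vertices."""
    z = [complex(x, y) for x, y in triangle]
    centroid = sum(z) / len(z)
    z = [w - centroid for w in z]                     # remove translation
    size = math.sqrt(sum(abs(w) ** 2 for w in z))     # assumed nonzero
    return [w / size for w in z]                      # remove scale

def same_shape(tri_a, tri_b, tol=1e-9):
    """True when the triangles differ only by translation, rotation, scaling."""
    a, b = preshape(tri_a), preshape(tri_b)
    # A rotation multiplies every vertex by one unit complex number, so the
    # modulus of this inner product is 1 exactly in that case.
    inner = sum(x * y.conjugate() for x, y in zip(a, b))
    return abs(abs(inner) - 1.0) < tol

# Hypothetical triangles: `moved` is `base` rotated by 30 degrees, scaled
# by 2.5 and shifted by (3, 1); `other` has different proportions.
base = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
c, s = math.cos(math.pi / 6), math.sin(math.pi / 6)
moved = [(2.5 * (c * x - s * y) + 3.0, 2.5 * (s * x + c * y) + 1.0)
         for x, y in base]
other = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
```

Here `same_shape(base, moved)` holds while `same_shape(base, other)` does not, so `base` and `moved` lie in one equivalence class and `other` lies in another.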
While landmark approaches are useful for many tasks, they are typically less
useful in many medical imaging situations, such as soft tissues, where landmarks
that correspond across cases can be hard to find, with often very few obvious
choices apparent. Hence, there has been much research devoted to boundary rep-
resentations. In the computer graphics world a very common boundary represen-
tation is a triangular mesh, see e.g. Owen (1998). A major challenge to the use of
mesh representations in shape statistics is correspondence, i.e. relating the mesh
parameters (e.g. triangle vertices) across instances of shape data objects. Two im-
portant approaches to this are Active Shape Models, see Cootes et al. (1994) for
a good introduction, and the entropy-based ideas of Cates et al. (2007). Another
major formulation of boundary representations is through Fourier methods, e.g.
as in Kelemen et al. (1999). For sufficiently smooth shapes, Kurtek et al. (2013)
have shown that superior representation comes from enhancing boundary repre-
sentations by also including surface normal vectors in the data objects.
atoms, the lengths of the spokes, and the angles of the spokes, each of which is
represented as a point on the sphere $S^2$.
The data objects in this OODA case study are chosen to be the skeletal models
represented by the locations of $k$ atoms in $\mathbb{R}^3$, $l$ positive spoke lengths in $\mathbb{R}_+$ and
$m$ directions on $S^2$. For CT images where a manual segmentation has been performed,
the skeletal shape model can be fit to the binary image shown in blue in
Figure 1.8 (i.e. the various parameters estimated), using direct methods such as
least squares. However as discussed above, for clinical applications such as radi-
ation treatment planning, with a need for a technician to perform this operation
several tens of times for one course of treatment, manual segmentation is pro-
hibitively expensive. This motivated the work cited at the end of this section, on
automating fitting of skeletal models (as shown in Figure 1.10) directly to raw CT
images (as shown in Figure 1.7). As discussed above, this requires incorporation
of something akin to anatomical information. That is done using a Bayesian sta-
tistical approach. Essentially some manual segmentations are used to train a prior
distribution using OODA, which is combined with a likelihood that is based on a
new CT image, to generate a posterior distribution which is maximized over the
parameters of the skeletal shape representation, to give an automatic segmentation.
1.2.4 Bayes Segmentation via Principal Geodesic Analysis
The Bayes implementation employed in this type of application differs somewhat
from most modern Bayes applications. On one hand, the underlying probability
distributions are very basic, since only conjugate Gaussian priors, likelihood, and
hence posteriors are used. This is a strong contrast with the complicated models
involving Markov chain Monte Carlo methods that are currently very prevalent in
applications of Bayes methods. On the other hand, this Bayes application is rel-
atively deep in two ways. First the number of parameters to fit is typically much
higher than the number of training instances, i.e. it lies in the high dimensional
OODA domain discussed in a general way in Chapter 14. The second compli-
cation is the non-Euclidean nature of the reparameterizations, caused mostly by
each spoke naturally lying on the surface of the sphere S^2. As this research has
progressed, the high dimensionality has been handled by a variety of methods
related to PCA. More challenging is that skeletal parameterized data objects are
naturally elements of a space of the form R^{3k} × R_+^l × (S^2)^m (i.e. tuples of 3k
real numbers, l positive reals, and m points on the sphere). Such spaces are called
manifolds in differential geometry (see Section 8.2 for an introduction to aspects
of this topic needed for OODA) and are usefully thought of as curved surfaces
(e.g. the surface of a sphere).
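To make the product structure concrete, here is a minimal sketch of a distance between two toy skeletal objects, built component by component. The dictionary layout, the function names, and the choice of a log-scale distance on the positive lengths are assumptions made for illustration; only the great-circle distance on S^2 and the Euclidean distance on atom locations follow directly from the geometry described above.

```python
import numpy as np

def sphere_dist(u, v):
    """Great-circle distance between unit vectors u and v on S^2."""
    return np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))

def srep_dist(a, b):
    """Product-metric distance between two toy skeletal objects.

    Each object is a dict with 'atoms' (k x 3 array), 'lengths' (l positive
    values) and 'dirs' (m x 3 array of unit vectors).
    """
    d_atoms = np.linalg.norm(a["atoms"] - b["atoms"])            # Euclidean part
    d_len = np.linalg.norm(np.log(a["lengths"]) - np.log(b["lengths"]))  # assumed log scale
    d_dir = np.sqrt(sum(sphere_dist(u, v) ** 2
                        for u, v in zip(a["dirs"], b["dirs"])))  # spherical part
    return np.sqrt(d_atoms ** 2 + d_len ** 2 + d_dir ** 2)

# Two hypothetical objects with k = l = m = 1.
obj_a = {"atoms": np.array([[0.0, 0.0, 0.0]]),
         "lengths": np.array([1.0]),
         "dirs": np.array([[0.0, 0.0, 1.0]])}
obj_b = {"atoms": np.array([[3.0, 0.0, 0.0]]),
         "lengths": np.array([np.e]),
         "dirs": np.array([[1.0, 0.0, 0.0]])}
d_ab = srep_dist(obj_a, obj_b)
```

The point of the sketch is only that the spoke directions cannot be compared by ordinary subtraction: the distance between two directions is an angle measured along the sphere.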
The need to address the first complication (the high dimension) in the bladder-
prostate-rectum segmentation challenge described above has led to a series of de-
velopments in terms of analogs of PCA for data lying on the manifolds of skeletal
representations. The Principal Geodesic Analysis (PGA) of Fletcher et al. (2004)
represents an important early advance in this work. The main idea of PGA is
to consider the Euclidean PCA basis as a set of orthogonal lines that (sequen-
tially) best fit the data. In PGA these best fitting lines are replaced by best fitting
geodesics (e.g. great circles on S^2), which are a natural analog of lines. The results
of a PGA, based upon n = 17 skeletal representations (collected over a sequence
of days) from a single patient are shown in Figure 1.11.
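The idea of replacing best fitting lines by best fitting geodesics can be illustrated on a single sphere S^2. The sketch below uses a common linearization rather than an exact geodesic fit: it computes a Fréchet mean, maps the data to the tangent plane at that mean, and runs ordinary PCA there. The function names and the simulated n = 17 points are hypothetical, and the log map as written is only suitable for data concentrated away from the antipode of the base point.

```python
import numpy as np

def log_map(p, q):
    """Tangent vector at p pointing toward q, with length equal to the
    great-circle distance from p to q (not valid for q = -p)."""
    cos_t = np.clip(np.dot(p, q), -1.0, 1.0)
    w = q - cos_t * p
    norm_w = np.linalg.norm(w)
    if norm_w < 1e-12:
        return np.zeros(3)
    return np.arccos(cos_t) * w / norm_w

def exp_map(p, v):
    """Point reached by following the geodesic from p with initial velocity v."""
    t = np.linalg.norm(v)
    if t < 1e-12:
        return p
    return np.cos(t) * p + np.sin(t) * v / t

def frechet_mean(points, n_iter=100, tol=1e-10):
    """Iteratively average in the tangent plane and map back to the sphere."""
    mu = points.mean(axis=0)
    mu = mu / np.linalg.norm(mu)
    for _ in range(n_iter):
        step = np.mean([log_map(mu, q) for q in points], axis=0)
        mu = exp_map(mu, step)
        mu = mu / np.linalg.norm(mu)   # guard against rounding drift
        if np.linalg.norm(step) < tol:
            break
    return mu

def tangent_pca(points):
    """PCA of the log-mapped data: rows of `directions` are tangent vectors at mu."""
    mu = frechet_mean(points)
    tangent = np.array([log_map(mu, q) for q in points])
    _, s, directions = np.linalg.svd(tangent, full_matrices=False)
    return mu, directions, s ** 2 / len(points)

# Hypothetical data: 17 points spread along the equator with small noise.
rng = np.random.default_rng(0)
angles = np.linspace(-0.5, 0.5, 17)
pts = np.column_stack([np.cos(angles), np.sin(angles),
                       0.02 * rng.standard_normal(17)])
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
mu, directions, variances = tangent_pca(pts)
```

Here the first tangent direction lies along the equator, so the corresponding geodesic through the mean is the great circle that the simulated points were scattered around.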
Figure 1.11 reveals clinically interesting modes of variation of these organs
within this person. The left column (first mode of variation) seems to reflect verti-
cal shift variation driven by the rectum. The second mode (middle column) shows
twisting, while the third (right column) is about emptying and filling of the blad-
der. This input led to the Bayes segmentation method giving very effective auto-
matic segmentation. That was the basis for the successful start-up company Mor-
phormics, which was subsequently purchased by the radiation treatment equip-
ment manufacturer Accuray.
More recently there has been a series of improvements to PGA, motivated by a
succession of deeper and deeper integrations of statistical ideas with differential
geometry. Detailed discussion of this progression appears in Section 8.3. While
this discussion has focused mostly on segmentation using skeletal shape repre-
sentations, much important related work has been done on classification as dis-
cussed in Chapter 11 and on confirmatory analysis which appears here in Chapter
13. A good overview of the usefulness of skeletal representations, especially in
Breadth of OODA
This chapter illustrates the breadth of OODA through relatively brief overviews
of quite diverse applications.
DOI: 10.1201/9781351189675-2
using additional information as detailed in Koch et al. (2014). These peak loca-
tions, for each of the 15 curves, are indicated by peak numbers (1–14), with colors
corresponding to the curves. The peak numbers are sorted vertically by height of
the corresponding peak and connected with gray line segments to give some visual
correspondence. It is hard to see much pattern, showing this to be a challenging
curve registration problem.
As noted above, there are a number of approaches to this type of data challenge,
with several such analyses of this data set discussed in Marron et al. (2014a).
The bottom panel of Figure 2.1 shows the results of registration of these same
Figure 2.1 Top panel contains raw TIC Curves data, with a labeling of certain important
peaks in the lower part of the panel. Bottom panel shows a Fisher-Rao registration of the
TIC curves. Numbers under the curves indicate peak locations, showing that the registra-
tion has been mostly quite effective.
TIC curves using the Fisher-Rao method proposed in Srivastava et al. (2011) and
Kurtek et al. (2012) (discussed in more detail in Section 9.1), using only the curves
themselves and not the peak location information. The colored numbers reveal that
this is a particularly challenging problem, because the peaks have quite different
heights across patients. Peak 10 is particularly challenging as it is quite low for
the red patient (especially compared to nearby very tall peaks), yet is the highest
peak for other patients. Note the alignment is not perfect for every numbered peak,
but it is still of impressively high quality. Roughly comparable quality has been
obtained using a linear registration approach that is integrated with clustering in
Bernardi et al. (2014b), and by a Bayesian approach in Cheng et al. (2014).
An important point made in the overview of Marron et al. (2015) is that curve
registration methods are useful more generally than simply to align curves. While
in some contexts, such as that of Figure 2.1, the phase component is merely nui-
sance variation to be dealt with but of no intrinsic interest, there are many other
situations where the warps themselves represent useful modes of variation. In
such contexts it is insightful to consider different types of data objects for OODA.
In particular, these are amplitude data objects, whose variation is contained in the aligned
curves, and phase data objects, which are the warps used to achieve the alignment.
Depending on the context either or both choices of data object can be of primary
interest or either could represent just nuisance variation.
The notions of amplitude and phase data objects are illustrated using the Bi-
modal Phase Shift example in Figure 2.2. The upper left panel shows a simulated
functional data set, where every data object (curve) has two peaks and is a multiple
of a beta mixture probability density. A rainbow color scheme is used to distin-
guish the curves, in order of how separated the peaks are. The peaks have both
different heights showing substantial amplitude variation, and also quite different
locations reflecting strong phase variation. These two types of mode of variation
are decomposed in a useful way by the warping functions shown in the bottom
right panel, computed using the Fisher-Rao method, described in Section 9.1. The
vertical axis is the same as in the upper left panel. Rescaling that axis using the ma-
genta warp functions moves the magenta peaks inwards, and using the red warp
functions moves the red peaks outwards. The top right panel shows the ampli-
tude data objects, i.e. aligned curves. A careful look shows that the random peak
heights are linearly related with the left peak being high when the right peak is
low. This set of amplitude data objects consists of just a single one-dimensional
mode of variation. The warps in the lower right panel can be thought of as the
phase data objects, although they are not easy to interpret. Enhanced interpreta-
tion of the variation in the phase data objects comes from the view in the lower left
panel. That is an application of each of the warps to the Fréchet mean (discussed
in Section 7.7) template from the Fisher-Rao calculation, which nicely reflects the
one-dimensional phase variation.
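The split of a curve into an amplitude part and a phase part can be sketched with a warp that is known in advance. This is not the Fisher-Rao method, which has to estimate the warps from the curves; the two-bump template (a stand-in for the beta mixture), the sine-based warp family, and all names below are assumptions chosen only to show how composition with a warp moves peaks and how composition with its inverse aligns them.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 501)

def template(s):
    """Bimodal template: sum of two Gaussian bumps centered at 0.3 and 0.7."""
    return (np.exp(-0.5 * ((s - 0.3) / 0.05) ** 2)
            + np.exp(-0.5 * ((s - 0.7) / 0.05) ** 2))

def warp(s, a):
    """Increasing warp of [0, 1] that fixes both endpoints; requires |a| < 1."""
    return s + a * np.sin(2 * np.pi * s) / (2 * np.pi)

def invert_warp(gamma_vals, grid):
    """Numerical inverse of an increasing warp sampled on `grid`."""
    return np.interp(grid, gamma_vals, grid)

a = 0.6
gamma = warp(t, a)                      # phase data object (the warp)
f = template(gamma)                     # observed curve: template composed with the warp
gamma_inv = invert_warp(gamma, t)
aligned = np.interp(gamma_inv, t, f)    # amplitude data object: f composed with the inverse warp

left_peak_observed = t[np.argmax(f[:250])]
left_peak_aligned = t[np.argmax(aligned[:250])]
```

With this particular warp the observed peaks sit farther apart than the template peaks, and undoing the warp brings them back to 0.3 and 0.7. All of the phase variation is then carried by `gamma`, while `aligned` carries only amplitude.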
Figure 2.2 Bimodal Phase Shift data (top left panel) showing decomposition into amplitude
(top right panel) and phase (bottom left panel) modes of variation. Decomposition is based
on the warping functions (bottom right panel). Rainbow color scheme highlights the phase
mode, with red for closest peaks through magenta for farthest peaks.
Figure 2.3 Analysis of the Juggling data. Far left panel shows the input acceleration
curves. Center left is the Principal Nested Spheres scatterplot, revealing two distinct clus-
ters highlighted by brushing. Right panels verify these clusters represent two different types
of cycles.
Figure 2.3 uses parts of Figures 2, 3, and 4 from Lu and Marron (2014a).
Figure 2.4 Three adjacent slices of an MRA image for a single subject. Arteries show up
as white dots and curves.
Figure 2.5 Three views of the arterial tree for the subject in Figure 2.4, showing the 3-d
structure through somewhat different rotations.
Such data object representations have been computed for approximately 100
people, in the Brain Artery data set. For example, three more of these for three
different subjects are shown in Figure 2.6. The original study was a little larger,
but some were deleted due to MRA acquisition problems.
SOUNDS AS DATA OBJECTS 25
Figure 2.6 Artery tree data objects for three additional subjects.
Data objects of this type present major challenges to doing statistical analysis.
For example, it is really not clear how to define even the sample mean of such a
set of objects. Understanding variation about the mean, e.g. as done by PCA in
Section 1.1, is a further challenge. A number of approaches to this are discussed
in Chapter 10, which studies these trees in the more general context of graphs as
data objects.
Figure 2.7 Summarization of raw recording of a human speech sound of “deux” in French,
top panel, into a corresponding spectrogram showing time and frequency information with
color coding height, shown in the bottom panel.
all using the same scale to facilitate comparison) from Pigoli et al. (2018) are
shown in Figure 2.8, also from Davide Pigoli. For each language these summaries
are based on aggregating sounds for the spoken digits (1–10). An exploratory visual
comparison of these suggests some similarities (e.g. American and Castilian
Spanish) and also some stark contrasts such as Portuguese from the others. Confir-
matory analysis of these points and a number of others using permutation testing
methods can be found in Pigoli et al. (2018).
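The general logic of a permutation test for comparing two groups through their covariance structure can be sketched as follows. This is a generic illustration and not the procedure of Pigoli et al. (2018): the Frobenius distance between sample covariance matrices, the function names, and the simulated three-dimensional data are all assumptions.

```python
import numpy as np

def cov_distance(x, y):
    """Frobenius distance between the sample covariance matrices of x and y."""
    return np.linalg.norm(np.cov(x, rowvar=False) - np.cov(y, rowvar=False), ord="fro")

def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Share of label permutations whose statistic is at least the observed one."""
    rng = np.random.default_rng(seed)
    observed = cov_distance(x, y)
    pooled = np.vstack([x, y])
    n_x = len(x)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))          # reshuffle group labels
        stat = cov_distance(pooled[idx[:n_x]], pooled[idx[n_x:]])
        count += int(stat >= observed)
    return (count + 1) / (n_perm + 1)

# Hypothetical groups: same mean, different variance in the third coordinate.
rng = np.random.default_rng(1)
group_a = rng.standard_normal((60, 3))
group_b = rng.standard_normal((60, 3)) * np.array([1.0, 1.0, 4.0])
p_value = permutation_pvalue(group_a, group_b)
```

A small p-value indicates that the observed difference in covariance is larger than what random relabeling of the pooled observations tends to produce.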
Figure 2.8 Covariance representation summaries of speech sound from five different lan-
guages/dialects. Note strong differences between them, with potentially interesting histori-
cal and geographical connections.
In the overall area of sounds as data objects, there is another interesting parallel
to the phenomenon noted in Section 2.1, that depending on the context either phase
or amplitude data objects could be the major focus of the analysis with the other
considered to be nuisance variation. In particular, the above work focused on a
particular type of analysis of sounds as data objects, where the goal was to study
human speech by a variety of speakers. As the human brain does when parsing
speech, they deliberately chose data objects which focused on aspects of the sound
that are about meaning of the words, which means generally treating issues such as
the consideration of the qualities of the steamer in sight, a subject
on which, as seamen, they might better sympathise.
“That’s a droll-looking revenue cutter, after all, Capt. Spike,” he
said—“a craft better fitted to go in a fleet, as a look-out vessel, than
to chase a smuggler in-shore.”
“And no goer in the bargain! I do not see how she gets along, for
she keeps all snug under water; but, unless she can travel faster
than she does just now, the Molly Swash would soon lend her the
Mother Carey’s Chickens of her own wake to amuse her.”
“She has the tide against her, just here, sir; no doubt she would
do better in still water.”
Spike muttered something between his teeth, and jumped down
on deck, seemingly dismissing the subject of the revenue entirely
from his mind. His old, coarse, authoritative manner returned, and
he again spoke to his mate about Rose Budd, her aunt, the “ladies’
cabin,” the “young flood,” and “casting off,” as soon as the last
made. Mulford listened respectfully, though with a manifest distaste
for the instructions he was receiving. He knew his man, and a
feeling of dark distrust came over him, as he listened to his orders
concerning the famous accommodations he intended to give to Rose
Budd and that “capital old lady, her aunt;” his opinion of “the
immense deal of good sea-air and a v’y’ge would do Rose,” and how
“comfortable they both would be on board the Molly Swash.”
“I honor and respect Mrs. Budd, as my captain’s lady, you see,
Mr. Mulford, and intend to treat her accordin’ly. She knows it—and
Rose knows it—and they both declare they’d rather sail with me,
since sail they must, than with any other ship-master out of
America.”
“You sailed once with Capt. Budd yourself, I think I have heard
you say, sir?”
“The old fellow brought me up. I was with him from my tenth to
my twentieth year, and then broke adrift to see fashions. We all do
that, you know, Mr. Mulford, when we are young and ambitious, and
my turn came as well as another’s.”
“Capt. Budd must have been a good deal older than his wife, sir,
if you sailed with him when a boy,” Mulford observed a little drily.
“Yes; I own to forty-eight, though no one would think me more
than five or six-and-thirty, to look at me. There was a great
difference between old Dick Budd and his wife, as you say, he being
about fifty when he married, and she less than twenty. Fifty is a
good age for matrimony, in a man, Mulford; as is twenty in a young
woman.”
“Rose Budd is not yet nineteen, I have heard her say,” returned
the mate, with emphasis.
“Youngish, I will own, but that’s a fault a liberal-minded man can
overlook. Every day, too, will lessen it. Well, look to the cabins, and
see all clear for a start. Josh will be down presently with a cart-load
of stores, and you’ll take ’em aboard without delay.”
As Spike uttered this order, his foot was on the plank-sheer of the
bulwarks, in the act of passing to the wharf again. On reaching the
shore, he turned and looked intently at the revenue steamer, and his
lips moved, as if he were secretly uttering maledictions on her. We
say maledictions, as the expression of his fierce, ill-favored
countenance too plainly showed that they could not be blessings. As
for Mulford, there was still something on his mind, and he followed
to the gangway ladder and ascended it, waiting for a moment, when
the mind of his commander might be less occupied, to speak. The
opportunity soon occurred, Spike having satisfied himself with the
second look at the steamer.
“I hope you don’t mean to sail again without a second mate,
Capt. Spike?” he said.
“I do, though, I can tell you. I hate Dickies—they are always in
the way, and the captain has to keep just as much of a watch with
one as without one.”
“That will depend on his quality. You and I have both been
Dickies in our time, sir; and my time was not long ago.”
“Ay—ay—I know all about it—but you didn’t stick to it long
enough to get spoiled. I would have no man aboard the Swash who
made more than two v’y’ges as second officer. As I want no spies
aboard my craft, I’ll try it once more without a Dicky.”
Saying this in a sufficiently positive manner, Capt. Stephen Spike
rolled up the wharf, much as a ship goes off before the wind, now
inclining to the right, and then again to the left. The gait of the man
would have proclaimed him a sea-dog, to any one acquainted with
that animal, as far as he could be seen. The short squab figure, the
arms bent nearly at right angles at the elbow, and working like two
fins with each roll of the body, the stumpy, solid legs, with the feet
looking in the line of his course and kept wide apart, would all have
contributed to the making up of such an opinion. Accustomed as he
was to this beautiful sight, Harry Mulford kept his eyes riveted on
the retiring person of his commander, until it disappeared behind a
pile of lumber, waddling always in the direction of the more thickly
peopled parts of the town. Then he turned and gazed at the
steamer, which, by this time, had fairly passed the brig, and seemed
to be actually bound through the Gate. That steamer was certainly a
noble-looking craft, but our young man fancied she struggled along
through the water heavily. She might be quick at need, but she did
not promise as much by her present rate of moving. Still, she was a
noble-looking craft, and, as Mulford descended to the deck again, he
almost regretted he did not belong to her; or, at least, to any thing
but the Molly Swash.
Two hours produced a sensible change in and around that
brigantine. Her people had all come back to duty, and what was very
remarkable among seafaring folk, sober to a man. But, as has been
said, Spike was a temperance man, as respects all under his orders
at least, if not strictly so in practice himself. The crew of the Swash
was large for a half-rigged brig of only two hundred tons, but, as her
spars were very square, and all her gear as well as her mould
seemed constructed for speed, it was probable more hands than
common were necessary to work her with facility and expedition.
After all, there were not many persons to be enumerated among the
“people of the Molly Swash,” as they called themselves; not more
than a dozen, including those aft, as well as those forward. A
peculiar feature of this crew, however, was the circumstance that
they were all middle-aged men, with the exception of the mate, and
all thorough-bred sea-dogs. Even Josh, the cabin-boy, as he was
called, was an old, wrinkled, gray-headed negro, of near sixty. If the
crew wanted a little in the elasticity of youth, it possessed the
steadiness and experience of their time of life, every man appearing
to know exactly what to do, and when to do it. This, indeed,
composed their great merit; an advantage that Spike well knew how
to appreciate.
The stores had been brought alongside of the brig in a cart, and
were already stowed in their places. Josh had brushed and swept,
until the ladies’ cabin could be made no neater. This ladies’ cabin
was a small apartment beneath a trunk, which was, ingeniously
enough, separated from the main cabin by pantries and double
doors. The arrangement was unusual, and Spike had several times
hinted that there was a history connected with that cabin; though
what the history was Mulford never could induce him to relate. The
latter knew that the brig had been used for a forced trade on the
Spanish Main, and had heard something of her deeds in bringing off
specie, and proscribed persons, at different epochs in the revolutions
of that part of the world, and he had always understood that her
present commander and owner had sailed in her, as mate, for many
years before he had risen to his present station. Now, all was regular
in the way of records, bills of sale, and other documents; Stephen
Spike appearing in both the capacities just named. The register
proved that the brig had been built as far back as the last English
war, as a private cruiser, but recent and extensive repairs had made
her “better than new,” as her owner insisted, and there was no
question as to her sea-worthiness. It is true the insurance offices
blew upon her, and would have nothing to do with a craft that had
seen her two score years and ten; but this gave none who belonged
to her any concern, inasmuch as they could scarcely have been
underwritten in their trade, let the age of the vessel be what it
might. It was enough for them that the brig was safe, and
exceedingly fast, insurances never saving the lives of the people,
whatever else might be their advantages. With Mulford it was an
additional recommendation, that the Swash was usually thought to
be of uncommonly just proportions.
By half past two, P. M., every thing was ready for getting the
brigantine under way. Her foretopsail—or foretawsail, as Spike called
it—was loose, the fasts were singled, and a spring had been carried
to a post in the wharf that was well forward of the starboard bow,
and the brig’s head turned to the southwest, or down stream, and
consequently facing the young flood. Nothing seemed to connect the
vessel with the land but a broad gangway plank, to which Mulford
had attached life-lines, with more care than it is usual to meet with
on board of vessels employed in short voyages. The men stood
about the decks with their arms thrust into the bosoms of their
shirts, and the whole picture was one of silent, and possibly of
somewhat uneasy expectation. Nothing was said, however; Mulford
walking the quarter-deck alone, occasionally looking up the still little
tenanted streets of that quarter of the suburbs, as if to search for a
carriage. As for the revenue-steamer, she had long before gone
through the southern passage of Blackwell’s, steering for the Gate.
“Dat’s dem, Mr. Mulford,” Josh at length cried, from the look-out
he had taken in a stern-port, where he could see over the low
bulwarks of the vessel. “Yes, dat’s dem, sir. I know dat old gray
horse dat carries his head so low and sorrowful like, as a horse has a
right to do dat has to drag a cab about dis big town. My eye! what a
horse it is, sir!”
Josh was right, not only as to the gray horse that carried his
head “sorrowful like,” but as to the cab and its contents. The vehicle
was soon on the wharf, and in its door soon appeared the short,
sturdy figure of Capt. Spike, backing out, much as a bear descends a
tree. On top of the vehicle were several light articles of female
appliances, in the shape of bandboxes, bags, &c., the trunks having
previously arrived in a cart. Well might that over-driven gray horse
appear sorrowful, and travel with a lowered head. The cab, when it
gave up its contents, discovered a load of no less than four persons
besides the driver, all of weight, and of dimensions in proportion,
with the exception of the pretty and youthful Rose Budd. Even she
was plump, and of a well-rounded person; though still light and
slender. But her aunt was a fair picture of a ship-master’s widow;
solid, comfortable and buxom. Neither was she old, nor ugly. On the
contrary, her years did not exceed forty, and being well preserved, in
consequence of never having been a mother, she might even have
passed for thirty-five. The great objection to her appearance was the
somewhat indefinite character of her shape, which seemed to blend
too many of its charms into one. The fourth person, in the fare, was
Biddy Noon, the Irish servant and factotum of Mrs. Budd, who was a
pock-marked, red-faced, and red-armed single woman, about her
mistress’s own age and weight, though less stout to the eye.
Of Rose we shall not stop to say much here. Her deep-blue eye,
which was equally spirited and gentle, if one can use such
contradictory terms, seemed alive with interest and curiosity, running
over the brig, the wharf, the arm of the sea, the two islands, and all
near her, including the Alms-House, with such a devouring rapidity
as might be expected in a town-bred girl, who was setting out on
her travels for the first time. Let us be understood; we say town-
bred, because such was the fact; for Rose Budd had been both born
and educated in Manhattan, though we are far from wishing to be
understood that she was either very-well born, or highly educated.
Her station in life may be inferred from that of her aunt, and her
education from her station. Of the two, the last was, perhaps, a trifle
the highest.
We have said that the fine blue eye of Rose passed swiftly over
the various objects near her, as she alighted from the cab, and it
naturally took in the form of Harry Mulford, as he stood in the
gangway, offering his arm to aid her aunt and herself in passing the
brig’s side. A smile of recognition was exchanged between the young
people, as their eyes met, and the color, which formed so bright a
charm in Rose’s sweet face, deepened, in a way to prove that that
color spoke with a tongue and eloquence of its own. Nor was
Mulford’s cheek mute on the occasion, though he helped the
hesitating, half-doubting, half-bold girl along the plank with a steady
hand and rigid muscles. As for the aunt, as a captain’s widow, she
had not felt it necessary to betray any extraordinary emotions in
ascending the plank, unless, indeed, it might be those of delight on
finding her foot once more on the deck of a vessel!
Something of the same feeling governed Biddy, too, for, as
Mulford civilly extended his hand to her also, she exclaimed—
“No fear of me, Mr. Mate—I came from Ireland by wather, and
knows all about ships and brigs, I do. If you could have seen the
times we had, and the saas we crossed, you’d not think it nadeful to
say much to the likes iv me.”
Spike had tact enough to understand he would be out of his
element in assisting females along that plank, and he was busy in
sending what he called “the old lady’s dunnage” on board, and in
discharging the cabman. As soon as this was done, he sprang into
the main-channels, and thence, viâ the bulwarks, on deck, ordering
the plank to be hauled aboard. A solitary laborer was paid a quarter
to throw off the fasts from the ring-bolts and posts, and every thing
was instantly in motion to cast the brig loose. Work went on as if the
vessel were in haste, and it consequently went on with activity. Spike
bestirred himself giving his orders in a way to denote he had been
long accustomed to exercise authority on the deck of a vessel, and
knew his calling to its minutiæ. The only ostensible difference
between his deportment to-day and on any ordinary occasion,
perhaps, was in the circumstance that he now seemed anxious to
get clear of the wharf and that in a way which might have attracted
notice in any suspicious and attentive observer. It is possible that
such a one was not very distant, and that Spike was aware of his
presence, for a respectable-looking, well-dressed, middle-aged man
had come down one of the adjacent streets, to a spot within a
hundred yards of the wharf and stood silently watching the
movements of the brig, as he leaned against a fence. The want of
houses in that quarter enabled any person to see this stranger from
the deck of the Swash, but no one on board her seemed to regard
him at all, unless it might be the master.
“Come, bear a hand, my hearty, and toss that bow-fast clear,”
cried the captain, whose impatience to be off seemed to increase as
the time to do so approached nearer and nearer. “Off with it, at
once, and let her go.”
The man on the wharf threw the turns of the hawser clear of the
post, and the Swash was released forward. A smaller line, for a
spring, had been run some distance along the wharves, ahead of the
vessel, and brought in aft. Her people clapped on this, and gave way
to their craft, which, being comparatively light, was easily moved,
and was very manageable. As this was done, the distant spectator
who had been leaning on the fence, moved toward the wharf with a
step a little quicker than common. Almost at the same instant, a
short, stout, sailor-like looking little person, waddled down the
nearest street, seeming to be in somewhat of a hurry, and presently
he joined the other stranger, and appeared to enter into
conversation with him; pointing toward the Swash, as he did so. All
this time, both continued to advance toward the wharf.
In the meanwhile, Spike and his people were not idle. The tide
did not run very strong near the wharves and in the sort of a bight
in which the vessel had lain, but, such as it was, it soon took the
brig on her inner bow, and began to cast her head off shore. The
people at the spring pulled away with all their force, and got
sufficient motion on their vessel to overcome the tide, and to give
the rudder an influence. The latter was put hard a-starboard, and
helped to cast the brig’s head to the southward.
Down to this moment, the only sail that was loose on board the
Swash, was the fore-topsail, as mentioned. This still hung in the
gear, but a hand had been sent aloft to overhaul the buntlines and
clew-lines, and men were also at the sheets. In a minute the sail
was ready for hoisting. The Swash carried a wapper of a fore-and-aft
mainsail, and, what is more, it was fitted with a standing gaff, for
appearance in port. At sea, Spike knew better than to trust to this
arrangement, but in fine weather, and close in with the land, he
found it convenient to have this sail haul out and brail like a ship’s
spanker. As the gaff was now aloft, it was only necessary to let go
the brails to loosen this broad sheet of canvas, and to clap on the
out-hauler, to set it. This was probably the reason why the brig was
so unceremoniously cast into the stream, without showing more of
her cloth. The jib and flying-jibs, however, did at that moment drop
beneath their booms, ready for hoisting.
Such was the state of things as the two strangers came first
upon the wharf. Spike was on the taffrail, overhauling the main-
sheet, and Mulford was near him, casting the fore-topsail braces
from the pins, preparatory to clapping on the halyards.
“I say, Mr. Mulford,” asked the captain, “did you ever see either of
them chaps afore? These jokers on the wharf I mean.”
“Not to my recollection, sir,” answered the mate, looking over the
taffrail to examine the parties. “The little one is a burster! The
funniest looking little fat old fellow I’ve seen in many a day.”
“Ay, ay, them fat little bursters, as you call ’em, are sometimes
full of the devil. I don’t like either of the chaps, and am right glad we
are well cast, before they got here.”
“I do not think either would be likely to do us much harm, Capt.
Spike.”
“There’s no knowing, sir. The biggest fellow looks as if he might
lug out a silver oar at any moment.”
“I believe the silver oar is no longer used, in this country at
least,” answered Mulford, smiling. “And if it were, what have we to
fear from it? I fancy the brig has paid her reckoning.”
“She don’t owe a cent, nor ever shall for twenty-four hours after
the bill is made out, while I own her. They call me ready-money
Stephen, round among the ship-chandlers and caulkers. But I don’t
like them chaps, and what I don’t relish I never swallow, you know.”
“They’ll hardly try to get aboard us, sir; you see we are quite
clear of the wharf, and the mainsail will take now, if we set it.”
Spike ordered the mate to clap on the out-hauler, and spread
that broad sheet of canvas at once to the little breeze there was.
This was almost immediately done, when the sail filled, and began to
be felt on the movement of the vessel. Still, that movement was very
slow, the wind being so light, and the vis inertiæ of so large a body
remaining to be overcome. The brig receded from the wharf, almost
in a line at right angles to its face, inch by inch, as it might be,
dropping slowly up with the tide at the same time. Mulford now
passed forward to set the jibs, and to get the topsail on the craft,
leaving Spike on the taffrail, keenly eyeing the strangers, who, by
this time, had got down nearly to the end of the wharf, at the berth
so lately occupied by the Swash. That the captain was uneasy was
evident enough, that feeling being exhibited in his countenance,
blended with a malignant ferocity.
“Has that brig any pilot?” asked the larger and better-looking of
the two strangers.
“What’s that to you, friend?” demanded Spike, in return. “Have
you a Hell-Gate branch?”
“I may have one, or I may not. It is not usual for so large a craft
to run the Gate without a pilot.”
“Oh! my gentleman’s below, brushing up his logarithms. We shall
have him on deck to take his departure before long, when I’ll let him
know your kind inquiries after his health.”
The man on the wharf seemed to be familiar with this sort of
sea-wit, and he made no answer, but continued that close scrutiny of
the brig, by turning his eyes in all directions, now looking below, and
now aloft, which had in truth occasioned Spike’s principal cause for
uneasiness.
“Is not that Capt. Stephen Spike, of the brigantine Molly Swash?”
called out the little, dumpling-looking person, in a cracked, dwarfish
sort of a voice, that was admirably adapted to his appearance. Our
captain fairly started; turned full toward the speaker; regarded him
intently for a moment, and gulped the words he was about to utter,
like one confounded. As he gazed, however, at little dumpy,
examining his bow-legs, red broad cheeks, and coarse snub nose, he
seemed to regain his self-command, as if satisfied the dead had not
really returned to life.
“Are you acquainted with the gentleman you have named?” he
asked, by way of answer. “You speak of him like one who ought to
know him.”
Josh educating a Pig
Philadelphia 1847
——
PART II.
Watch. If we know him to be a thief, shall we not lay hands on him?
Dogb. Truly, by your office, you may; but I think they that touch
pitch will be defiled: the most peaceable way for you, if you do take a
thief, is, to let him show himself what he is, and steal out of your
company.
Much Ado About Nothing.