PubH 7470:
STATISTICS
FOR TRANSLATIONAL & CLINICAL RESEARCH
DIAGNOSTICS:
THE OPTIMIZATION PROBLEM
We need an optimal cutpoint; but what do we mean by optimal? Good, but good for what? There may be more than one solution because there are different criteria.
For a continuous marker/predictor such as PSA, the basic question is "How high is high?" or "How low is low?". In practice, cutpoints are formed arbitrarily because we fail to form and justify a criterion or criteria.
KEY PARAMETERS IN
THE APPLICATIONAL STAGE
Let D and T denote the true diagnosis and the test result. D is unknown in this stage; the key parameters are 2 other conditional probabilities:
Positive Predictive Value, $P^+ = \Pr(D=+ \mid T=+)$
Negative Predictive Value, $P^- = \Pr(D=- \mid T=-)$
Positive predictive value, or positive predictivity, is the probability that a positive result is accurate; negative predictive value is the probability that a negative result is accurate.
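As a minimal sketch (not part of the original slides), both predictive values can be computed from a test's sensitivity and specificity and the disease prevalence by Bayes' theorem. The function name `predictive_values` and the numbers used are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a dichotomous test via Bayes' theorem."""
    # Pr(T=+): true positives plus false positives
    p_test_pos = prevalence * sensitivity + (1 - prevalence) * (1 - specificity)
    ppv = prevalence * sensitivity / p_test_pos               # Pr(D=+|T=+)
    npv = (1 - prevalence) * specificity / (1 - p_test_pos)   # Pr(D=-|T=-)
    return ppv, npv

# Hypothetical test: 90% sensitive, 80% specific, 10% prevalence
ppv, npv = predictive_values(0.90, 0.80, 0.10)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Even a fairly accurate test can have a modest positive predictive value when the disease is rare, which is why these two parameters matter in the applicational stage.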
YOUDEN'S INDEX
Youden's Index (Cancer, 1950) is defined simply as:
\[ J = 1 - (\alpha + \beta) = S^+ + S^- - 1 \]
where $\alpha$ and $\beta$ are the two error rates (false positive and false negative), $S^+$ is the sensitivity and $S^-$ is the specificity.
It is not influenced by the disease prevalence and its value is large when both sensitivity and specificity are high.
There are a number of other important reasons that make Youden's index J even more special!
Reason #1:
IT DETERMINES PROCESS QUALIFICATION
The minimum criterion a process must pass to qualify as a test is that it detects disease better than by chance alone. That is, a process can only qualify as a test if it selects diseased persons with higher probability:
\[ P^+ = \Pr(D=+ \mid T=+) > \pi = \Pr(D=+); \]
that is, knowledge of (T=+) helps. Because $P^+ - \pi = \pi(1-\pi)\,J / \Pr(T=+)$,
\[ J > 0 \iff P^+ > \pi . \]
In other words, Youden's Index J is a measure of basic association: D and T are independent if and only if J = 0.
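The equivalence between a positive Youden's Index and a positive predictive value above the prevalence can be checked numerically. The sketch below uses three hypothetical (sensitivity, specificity) pairs and a hypothetical prevalence of 0.25; the helper names are illustrative only.

```python
def youden_index(sensitivity, specificity):
    return sensitivity + specificity - 1

def ppv(sensitivity, specificity, prevalence):
    # Pr(D=+|T=+) by Bayes' theorem
    p_test_pos = prevalence * sensitivity + (1 - prevalence) * (1 - specificity)
    return prevalence * sensitivity / p_test_pos

prevalence = 0.25
# A useful test, a coin flip, and a misleading process (all hypothetical)
scenarios = [(0.90, 0.80), (0.50, 0.50), (0.40, 0.50)]

results = []
for sens, spec in scenarios:
    j = youden_index(sens, spec)
    gain = ppv(sens, spec, prevalence) - prevalence   # P+ minus prevalence
    results.append((j, gain))
    print(f"J = {j:+.2f}   P+ - prevalence = {gain:+.3f}")
```

In each scenario the sign of J matches the sign of the gain, and both vanish for the coin flip.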
Reason #2:
IT MEASURES DIAGNOSTIC POWER
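In using a diagnostic marker/test, the gains are
(i) $[P^+ - \pi]$, and
(ii) $[P^- - (1-\pi)]$,
where $\pi = \Pr(D=+)$ is the prevalence.
Youden's Index J is some kind of weighted average of those gains:
\[ \frac{1}{\pi(1-\pi)J} = \frac{1}{P^+ - \pi} + \frac{1}{P^- - (1-\pi)} \]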
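A quick numerical check of the relation between J and the two gains, written as a sketch with hypothetical values for sensitivity, specificity and prevalence. It compares 1/(pi(1-pi)J) with the sum of the reciprocals of the two gains.

```python
def gains(sensitivity, specificity, prevalence):
    """Return the two gains (P+ - pi) and (P- - (1 - pi))."""
    p_test_pos = prevalence * sensitivity + (1 - prevalence) * (1 - specificity)
    ppv = prevalence * sensitivity / p_test_pos
    npv = (1 - prevalence) * specificity / (1 - p_test_pos)
    return ppv - prevalence, npv - (1 - prevalence)

sens, spec, prev = 0.90, 0.80, 0.10   # hypothetical values
gain_pos, gain_neg = gains(sens, spec, prev)
j = sens + spec - 1

lhs = 1 / (prev * (1 - prev) * j)     # 1 / (pi (1 - pi) J)
rhs = 1 / gain_pos + 1 / gain_neg     # sum of reciprocal gains
print(lhs, rhs)
```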
Reason #3:
IT MEASURES PRECISION IN PREVALENCE SURVEY
We have a screening test T; its sensitivity $S^+$ and specificity $S^-$ have been independently established. A prevalence survey is conducted in order to estimate the disease prevalence.
Results: x of n subjects are found positive by the test; $p_t = x/n$ is a point estimate of $\pi_t = \Pr(T=+)$; but the prevalence $\pi = \Pr(D=+)$ is estimated by:
\[ p = \frac{p_t + S^- - 1}{J} \]
This follows from $\pi_t = \pi S^+ + (1-\pi)(1-S^-) = \pi J + (1 - S^-)$.
PRECISION OF ESTIMATOR
\[ p = \frac{p_t + S^- - 1}{J} \]
\[ \mathrm{Var}(p) = \frac{\mathrm{Var}(p_t)}{J^2} \]
\[ \mathrm{SE}(p) = \frac{1}{J}\sqrt{\frac{p_t(1-p_t)}{n}} \]
The precision of estimation of the prevalence depends only on Youden's index; the larger the Index, the smaller the standard error.
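The estimator and its standard error can be sketched in a few lines. The function name `estimate_prevalence` and the survey counts are hypothetical; sensitivity and specificity are assumed known, as in the slide.

```python
import math

def estimate_prevalence(x, n, sensitivity, specificity):
    """Prevalence estimate p = (p_t + S- - 1)/J and its standard error."""
    j = sensitivity + specificity - 1            # Youden's Index
    p_t = x / n                                  # proportion testing positive
    p = (p_t + specificity - 1) / j              # corrected prevalence estimate
    se = math.sqrt(p_t * (1 - p_t) / n) / j      # SE(p) = SE(p_t) / J
    return p, se

# Hypothetical survey: 270 of 1000 subjects test positive
p, se = estimate_prevalence(x=270, n=1000, sensitivity=0.90, specificity=0.80)
# Same survey result with a weaker test (smaller J)
p_weak, se_weak = estimate_prevalence(x=270, n=1000, sensitivity=0.70, specificity=0.80)
print(p, se, se_weak)
```

With the same survey counts, the test with the smaller J gives the larger standard error.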
Basic Strategy/Criterion: To determine an optimal cutpoint for a continuous marker by maximizing Youden's Index of the dichotomized test.
With this strategy, the resulting dichotomized test, when used in a prevalence survey, yields an estimate with minimal error. There are other gains too.
CONTINUOUS MARKERS
In many cases, the larger values of the marker Y are associated with the diseased population. For others, the smaller values of the marker Y are associated with the diseased population.
We will assume that larger values of Y are associated with the diseased population.
A SIMPLE PLAUSIBLE MODEL
FOR A CONTINUOUS MARKER
Marker Y is normally distributed in both populations, with the same variance but different means. No matter where you cut, both errors result! More important, the sizes of these errors depend on the cutpoint Y=y.
[Figure: the two normal densities of Y with a cutpoint; T=- below the cutpoint, T=+ above it.]
SENSITIVITY & SPECIFICITY
With our assumption that larger values of Y are associated with the diseased population:
Sensitivity:
\[ S^+(y) = \Pr(T=+ \mid D=+) = 1 - F_+(y) \]
where $F_+(y)$ is the cdf of Y for the diseased population.
Specificity:
\[ S^-(y) = \Pr(T=- \mid D=-) = F_-(y), \qquad 1 - S^-(y) = 1 - F_-(y) \]
where $F_-(y)$ is the cdf of Y for the non-diseased or healthy population.
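To make the dependence on the cutpoint concrete, here is a small sketch under an assumed equal-variance normal model. The means 0 and 2 and the common standard deviation 1 are hypothetical choices, not values from the lecture.

```python
from statistics import NormalDist

# Hypothetical equal-variance normal model for the marker Y
healthy = NormalDist(mu=0.0, sigma=1.0)    # non-diseased population, cdf F-
diseased = NormalDist(mu=2.0, sigma=1.0)   # diseased population, cdf F+

def sensitivity(y):
    """S+(y) = 1 - F+(y): diseased subjects with Y above the cutpoint."""
    return 1 - diseased.cdf(y)

def specificity(y):
    """S-(y) = F-(y): healthy subjects with Y at or below the cutpoint."""
    return healthy.cdf(y)

for y in (0.0, 1.0, 2.0):
    print(f"cutpoint {y:.1f}: sensitivity {sensitivity(y):.3f}, "
          f"specificity {specificity(y):.3f}")
```

Raising the cutpoint trades sensitivity for specificity; at the midpoint of the two means the two are equal in this symmetric model.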
ROC FUNCTION & ROC CURVE
The ROC function maps (1 - Specificity) to Sensitivity:
\[ R[1 - S^-(y)] = S^+(y) \]
The graph of R(.) is called the ROC curve; the graph is generated as y moves through its range of possible values.
FORMAL EXPRESSION
(1-cdf) is called the Survival Function, S(t); let D and H denote the diseased and healthy populations:
\[ S_D(t) = 1 - F_D(t) \]
\[ S_H(t) = 1 - F_H(t) \]
\[ R[S_H(t)] = S_D(t) \]
\[ R(u) = S_D[S_H^{-1}(u)]; \quad u \in [0,1] \]
[Figure: ROC curve from (0,0) to (1,1). Horizontal axis: 1-specificity, $1 - F_-(y) = 1 - S^-(y)$ (false positive rate). Vertical axis: sensitivity, $1 - F_+(y) = S^+(y)$ (true positive rate).]
Note that what is on each axis is: (1-cdf) = Survival Function
Youden Index and ROC Curve:
J is equal to the quantity on the vertical axis
minus the quantity on the horizontal axis
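The ROC function and J(u) = R(u) - u can be evaluated directly once the two distributions are specified. The sketch below assumes a hypothetical equal-variance normal model (means 0 and 2, standard deviation 1) and finds the maximum of J by a simple grid search; it is an illustration, not the lecture's own procedure.

```python
from statistics import NormalDist

healthy = NormalDist(mu=0.0, sigma=1.0)    # hypothetical healthy population
diseased = NormalDist(mu=2.0, sigma=1.0)   # hypothetical diseased population

def roc(u):
    """R(u) = S_D[S_H^{-1}(u)] for 0 < u < 1."""
    y = healthy.inv_cdf(1 - u)      # cutpoint whose survival value S_H(y) is u
    return 1 - diseased.cdf(y)      # S_D(y), the sensitivity at that cutpoint

# Grid over the horizontal axis (1 - specificity)
grid = [i / 1000 for i in range(1, 1000)]
best_u = max(grid, key=lambda u: roc(u) - u)   # maximize J(u) = R(u) - u
best_j = roc(best_u) - best_u
best_cutpoint = healthy.inv_cdf(1 - best_u)
print(best_u, best_j, best_cutpoint)
```

In this symmetric model the maximizing cutpoint falls at the midpoint of the two means, as expected.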
SOLUTION #1: EMPIRICAL
Pool the two samples and arrange in increasing order.
At each point midway between two adjacent data points, calculate the Sensitivity $S^+$ and Specificity $S^-$; then Youden's Index $J = S^+ + S^- - 1$.
Locate the cutpoint corresponding to the maximum value of J.
Note: It's hard to determine the standard error.
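The empirical procedure can be sketched as follows. The function name `empirical_youden_cutpoint` and the two small samples are hypothetical; subjects with values above the cutpoint are classified as positive, following the lecture's assumption that larger values indicate disease.

```python
def empirical_youden_cutpoint(controls, cases):
    """Scan midpoints of the pooled, ordered data and return the best cutpoint."""
    values = sorted(set(controls) | set(cases))     # pooled distinct values
    best = None
    for low, high in zip(values, values[1:]):
        cut = (low + high) / 2                      # midway between data points
        sens = sum(y > cut for y in cases) / len(cases)
        spec = sum(y <= cut for y in controls) / len(controls)
        j = sens + spec - 1
        if best is None or j > best["J"]:           # keep the first maximum found
            best = {"cutpoint": cut, "J": j,
                    "sensitivity": sens, "specificity": spec}
    return best

controls = [1, 2, 3, 4, 5]      # hypothetical non-diseased sample
cases = [4, 6, 7, 8, 9]         # hypothetical diseased sample
best = empirical_youden_cutpoint(controls, cases)
print(best)
```

When several midpoints tie for the maximum, this sketch keeps the smallest one; other tie-breaking rules are equally defensible.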
SOLUTION #2: NON-PARAMETRIC
The ROC function R(.) maps $U = 1 - S^-$ on the horizontal axis to $V = S^+$ on the vertical axis: V = R(U).
Youden's Index, $J = S^+ + S^- - 1 = R(U) - U$, is maximized when $0 = R'(U) - 1$, or $R'(U) = 1$.
Process:
(i) Smooth the empirical estimate by any smoothing technique (e.g. Lowess),
(ii) Locate the point with (slope = 1) to obtain the specificity, then
(iii) Go to the control sample to get the cut-point.
It may require lots of data, and it's still very hard to determine the standard error.
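[Figure: ROC curve from (0,0) to (1,1); horizontal axis 1-specificity, $1 - F_-(y) = 1 - S^-(y)$; vertical axis sensitivity, $1 - F_+(y) = S^+(y)$.]
If the ROC curve is symmetric between (0,0) and (1,1), that is, about the line joining (0,1) and (1,0), the point on the curve with slope = 1 is closest to the corner (0,1).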
Alternative Solution: Instead of maximizing the Youden Index, one could minimize the distance to the upper left corner (0,1).
Weaknesses: The characteristics of the resulting test are not known, and it's hard to determine the standard error.
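The two criteria can be compared numerically. The sketch below assumes a hypothetical equal-variance normal model (means 0 and 2, standard deviation 1), whose ROC curve is symmetric, so the two cutpoints are expected to coincide; with an asymmetric curve they generally differ.

```python
import math
from statistics import NormalDist

healthy = NormalDist(mu=0.0, sigma=1.0)    # hypothetical healthy population
diseased = NormalDist(mu=2.0, sigma=1.0)   # hypothetical diseased population

def operating_point(y):
    """(false positive rate, true positive rate) at cutpoint y."""
    return 1 - healthy.cdf(y), 1 - diseased.cdf(y)

def youden(y):
    fpr, tpr = operating_point(y)
    return tpr - fpr

def distance_to_corner(y):
    fpr, tpr = operating_point(y)
    return math.hypot(fpr, 1 - tpr)        # distance from (fpr, tpr) to (0, 1)

cutpoints = [-2 + 0.01 * i for i in range(601)]   # grid from -2 to 4
cut_youden = max(cutpoints, key=youden)
cut_corner = min(cutpoints, key=distance_to_corner)
print(cut_youden, cut_corner)
```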
EXAMPLE: Choline as a marker
It can even be used to assess the role of a confounder.
SOLUTION #3: SEMI-PARAMETRIC
We are still looking for the point on the curve with (Slope = 1), but first fit the empirical data with a smooth curve
\[ V = R(U \mid \theta) \]
because it would take less data to do a better job than nonparametric smoothing (we need a model but can check for goodness-of-fit).
The two components needed are:
(i) Choosing a meaningful parameter $\theta$, and
(ii) Choosing a functional form for R(.)
AREA UNDER ROC CURVE
The ROC curve is a graphical device to show all possible combinations of sensitivity and specificity, but it is still desirable to reduce the entire curve to one single quantity. One popular choice is the area under the ROC curve.
This area has a powerful interpretation, and it is related to other well-known statistics, making it easier to learn its statistical properties.
Suppose that an observation $Y_1$ is randomly sampled from the diseased population and another random observation $Y_0$ is independently sampled from the non-diseased population:
\[ A = \Pr(Y_1 > Y_0) = \int_0^1 R(u)\,du = \text{Area under ROC curve} \]
It's the probability of correct ranking; the probability of separating a case from a control, a measure of separation power.
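The "probability of correct ranking" interpretation and its link to the rank-sum statistic can be illustrated with a small sketch. The samples are hypothetical, and counting a tied pair as one half is an assumption made here to match the use of mid-ranks.

```python
def auc_by_pairs(controls, cases):
    """Proportion of (case, control) pairs ranked correctly; ties count one half."""
    score = 0.0
    for y1 in cases:
        for y0 in controls:
            if y1 > y0:
                score += 1.0
            elif y1 == y0:
                score += 0.5
    return score / (len(cases) * len(controls))

def rank_sum(sample, pooled):
    """Sum of the mid-ranks of `sample` within the pooled data."""
    ordered = sorted(pooled)
    total = 0.0
    for value in sample:
        first = ordered.index(value) + 1       # rank of the first occurrence
        ties = ordered.count(value)            # number of tied observations
        total += first + (ties - 1) / 2        # mid-rank
    return total

controls = [1, 2, 3, 4, 5]      # hypothetical non-diseased sample
cases = [4, 6, 7, 8, 9]         # hypothetical diseased sample
n0, n1 = len(controls), len(cases)

w = rank_sum(cases, controls + cases)                   # rank sum of the cases
auc_from_ranks = (w - n1 * (n1 + 1) / 2) / (n0 * n1)    # area from the rank sum
auc_direct = auc_by_pairs(controls, cases)
print(w, auc_from_ranks, auc_direct)
```

Both routes give the same area, which is why the area inherits the well-studied properties of the rank-sum statistic.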
PROPORTIONAL HAZARDS MODEL
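Since what we have on the axes of the ROC curve are two survival functions, one possibility is the Proportional Hazards Model, also called Lehmann's Alternatives:
\[ 1 - F_+(y) = [1 - F_-(y)]^{\theta}, \quad 0 \le \theta \le 1, \]
or, with $u = 1 - F_-(y)$ on the horizontal axis and $v = 1 - F_+(y)$ on the vertical axis,
\[ v = R(u) = u^{\theta} . \]
Area under the ROC curve:
\[ A = \int_0^1 u^{\theta}\,du = \frac{1}{\theta + 1} \]
Youden's Index:
\[ J(u) = v - u = u^{\theta} - u \]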
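As a numerical check, take u to be (1 - specificity), so that the proportional hazards relation between the two survival functions gives the ROC function R(u) = u^theta. Under that reading, the area should equal 1/(theta + 1) and J(u) = R(u) - u should peak at u = theta^(1/(1 - theta)). The value theta = 0.4 is a hypothetical choice, and the notation is an assumption of this sketch.

```python
theta = 0.4   # hypothetical PH parameter, 0 < theta < 1

def roc(u):
    """ROC function under the PH model, with u = 1 - specificity."""
    return u ** theta

# Midpoint-rule approximation of the area under the ROC curve
n = 100_000
area = sum(roc((i + 0.5) / n) for i in range(n)) / n

# Grid search for the maximizer of J(u) = R(u) - u
grid = [i / 10_000 for i in range(1, 10_000)]
u_best = max(grid, key=lambda u: roc(u) - u)

u_closed_form = theta ** (1 / (1 - theta))   # root of dJ/du = 0
print(area, 1 / (theta + 1), u_best, u_closed_form)
```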
SUMMARY:
FOUR STEPS FOR SOLUTION #3
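(1) Model the ROC function by PHM: $R(u) = u^{\theta}$, $0 \le \theta \le 1$, where $u = 1 - F_-(y)$.
(2) Maximize Youden's Index $J(u) = u^{\theta} - u$ to obtain the optimal $u = 1 - F_-(y)$:
\[ \frac{dJ(u)}{du} = \theta u^{\theta - 1} - 1 = 0 \;\Rightarrow\; u^{1-\theta} = \theta \;\Rightarrow\; u = \theta^{1/(1-\theta)} \]
(3) Solve for the optimal cutpoint: $y = (F_-)^{-1}(1 - u)$.
(4) In the result, $\theta$ is estimated from $A = 1/(\theta + 1)$, that is $\theta = (1-A)/A$, and A is obtained from Wilcoxon's rank sum W of the $n_1$ cases in the combined sample of $n_1$ cases and $n_0$ controls:
\[ A = \frac{1}{n_0 n_1}\left\{ W - \frac{1}{2}\, n_1 (n_1 + 1) \right\} \]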
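The chain from the rank sum to the optimal operating point can be sketched as below. Here u denotes (1 - specificity), so the PH model reads R(u) = u^theta and theta = (1 - A)/A; this notation is an assumption of the sketch, chosen because it reproduces the values reported in this lecture's prostate cancer example (rank sum 689 for 20 cases and 33 controls). The function name is hypothetical.

```python
def ph_optimal_point(rank_sum_cases, n_cases, n_controls):
    """From the cases' rank sum to the Youden-optimal point under the PH model.

    Assumes 0.5 < A < 1, so that 0 < theta < 1.
    """
    # Area under the ROC curve from the Wilcoxon rank sum
    auc = (rank_sum_cases - n_cases * (n_cases + 1) / 2) / (n_cases * n_controls)
    theta = (1 - auc) / auc                    # from A = 1 / (theta + 1)
    u_op = theta ** (1 / (1 - theta))          # root of dJ/du = 0
    return {
        "A": auc,
        "theta": theta,
        "u_op": u_op,                          # optimal 1 - specificity
        "specificity": 1 - u_op,
        "sensitivity": u_op ** theta,          # R(u_op)
    }

result = ph_optimal_point(rank_sum_cases=689, n_cases=20, n_controls=33)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The last step, turning the optimal specificity into a cutpoint on the marker scale, needs the control sample itself.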
DETAILS
The value obtained is the optimal value of (1-cdf) for the control group; knowing the value of u, and having the sample of controls, leads to the optimal cutpoint for the marker Y.
First, get the rank sum W (by SAS, say)
Then the Area A under the ROC curve
Then $\theta$, the parameter of the PHM
Then u, which is the (1-cdf) of the controls
Then the optimal cutpoint
Note: It is still not easy to obtain the standard error, but it is possible by the Delta method.
Focus on sensitivity for goodness-of-fit: compare the observed value versus the fitted value under the PH model.
MEDICAL IMAGING STUDY
                                   Rating by Reader
Disease Status   Def. Normal (1)   Prob. Normal (2)   Questionable (3)   Prob. Abnormal (4)   Def. Abnormal (5)   Total
Normal                 33                  6                  6                  11                    2             58
Abnormal                3                  2                  2                  11                   33             51
cdf F-               0.569              0.672              0.776               0.966                   1
From the area A under the ROC curve we obtain $\theta$ and then
\[ u = .014 \text{ approximately}, \qquad F_-(x) = 1 - u = .986 \]
$F_-(x)$ is .986; the optimal cutpoint is between 4 and 5 (56/58 = .966): classify as abnormal only those with rating 5.
(Resulting test is 99% specific but only 63% sensitive)
The result is similar to the ACS recommendation to classify as abnormal/tumor only those mammograms with rating 5 = definitely abnormal.
EXAMPLE: PROSTATE CANCER
There were 53 patients with prostate cancer; 20 of them with nodal
involvement and 33 without. We examined level of acid
phosphatase in blood serum (x100). Data are reproduced from
Miller et al (1980) and are as follows:
Patients without Nodal Involvement: 40, 40, 46, 47, 48, 48, 49,
49, 50, 50, 50, 50, 50, 52, 52, 55, 55, 56, 59, 62, 62, 63, 65, 66, 71,
75, 76, 78, 83, 95, 98, 102, 187.
Patients with Nodal Involvement: 48(6), 49(9), 51(16), 56(21.5),
67(30), 67(30), 67(30), 70(32.5), 70(32.5), 72(35), 76(37.5),
78(40), 81(41), 82(42.5), 82(42.5), 84(45), 89(46), 99(49), 126(51),
136(52).
Numbers in parentheses are the ranks in the combined sample; mid-ranks are used for tied observations.
RESULTS
We found the following results:
Rank Sum for cases: W = 689
Area under the ROC curve: A = .726
\[ \theta = .378 \]
\[ u_{op} = .209 \]
This leads to an optimal cutpoint of .75 (75 on the x100 scale of the data listed above): subjects with a level of acid phosphatase in blood serum greater than .75 are classified as involved; this corresponds to a specificity of .791 (which is $1 - u_{op}$) and a sensitivity of .554.
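The final step, going from the optimal value u_op = .209 to a cutpoint on the marker scale, can be sketched with the control data listed in this example (patients without nodal involvement, x100 scale). Picking the observed control value whose empirical cdf is closest to 1 - u_op is one possible convention and is an assumption here; the function name is hypothetical.

```python
# Acid phosphatase (x100) for the 33 patients without nodal involvement
controls = [40, 40, 46, 47, 48, 48, 49, 49, 50, 50, 50, 50, 50, 52, 52, 55, 55,
            56, 59, 62, 62, 63, 65, 66, 71, 75, 76, 78, 83, 95, 98, 102, 187]

def cutpoint_from_controls(controls, u_op):
    """Observed control value whose empirical cdf is closest to 1 - u_op."""
    target = 1 - u_op                  # optimal specificity under the model
    n = len(controls)

    def ecdf(y):
        return sum(c <= y for c in controls) / n

    return min(sorted(set(controls)), key=lambda y: abs(ecdf(y) - target))

cutpoint = cutpoint_from_controls(controls, u_op=0.209)
print(cutpoint)   # on the x100 scale
```

Subjects with values above the returned cutpoint would then be classified as involved.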
SOLUTION #4: PARAMETRIC
\[ S_D(t) = 1 - F_D(t) = S^+ \]
\[ S_H(t) = 1 - F_H(t) = 1 - S^- \]
\[ R[S_H(t)] = S_D(t) \]
\[ R(u) = S_D[S_H^{-1}(u)]; \quad u \in [0,1] \]
\[ J = R(u) - u; \quad u = 1 - S^- \]
\[ J' = R'(u) - 1 \]
\[ J' = 0 \;\Rightarrow\; R'(u) = 1 \]
LOG-LOGISTIC DISTRIBUTION
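If ln(X) is distributed as logistic, X is distributed as log-logistic; the log-logistic distribution is similar to the log-normal distribution but with thicker tails, so it fits real non-negative measurements better.
\[ S(t) = \frac{1}{1 + (\lambda t)^{\alpha}}; \]
\[ \lambda = e^{-\mu}, \text{ where } \mu \text{ is the Mean}, \]
\[ \alpha = \frac{1}{\sigma}, \text{ where } \sigma \text{ is the St Deviation} \]
(both for ln(X); $\sigma$ is the logistic scale parameter, which is proportional to the standard deviation).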
BOTH LOG-LOGISTIC DISTRIBUTIONS
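Suppose the marker is log-logistic in both populations, with the same $\sigma$ but different means $\mu_D$ and $\mu_H$:
\[ S_D(t) = \frac{1}{1 + (\lambda_D t)^{1/\sigma}}, \qquad S_H(t) = \frac{1}{1 + (\lambda_H t)^{1/\sigma}}, \qquad \lambda_D = e^{-\mu_D},\ \lambda_H = e^{-\mu_H} \]
Then:
\[ R(u) = \frac{u}{u + (1-u)\exp\{-(\mu_D - \mu_H)/\sigma\}} = \frac{u}{u + \beta(1-u)} \]
\[ \beta = \exp\left( -\frac{\mu_D - \mu_H}{\sigma} \right); \quad 0 < \beta < 1 \text{ for } (\mu_D > \mu_H) \]
\[ R'(u) = \frac{\beta}{[u + \beta(1-u)]^2} \]
\[ R'(u) = 1: \quad u = \frac{\sqrt{\beta}}{1 + \sqrt{\beta}} \]
\[ \text{Optimal: } S^- = 1 - u = S^+ = \frac{1}{1 + \sqrt{\beta}} = \frac{\exp(d/2)}{1 + \exp(d/2)}, \quad \text{where } d = \frac{\mu_D - \mu_H}{\sigma} \]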
SCREENING VALUE OF BIOMARKERS
d     S- = S+
1     62%
2     73%
3     82%
4     88%
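The table can be reproduced from the closed form for two log-logistic distributions with a common scale: with beta = exp(-d), the ROC function is R(u) = u / (u + beta(1 - u)), and the Youden-optimal point has sensitivity = specificity = 1 / (1 + sqrt(beta)). The sketch below evaluates this and also checks that the ROC slope equals 1 at the optimum; the function names are hypothetical.

```python
import math

def optimal_accuracy(d):
    """Common value of sensitivity and specificity at the Youden-optimal point."""
    beta = math.exp(-d)                    # beta = exp(-(mu_D - mu_H) / sigma)
    return 1 / (1 + math.sqrt(beta))

def roc_slope(u, d):
    """Derivative R'(u) of R(u) = u / (u + beta (1 - u))."""
    beta = math.exp(-d)
    return beta / (u + beta * (1 - u)) ** 2

table = {d: round(100 * optimal_accuracy(d)) for d in (1, 2, 3, 4)}
for d, percent in table.items():
    u_opt = 1 - optimal_accuracy(d)        # optimal 1 - specificity
    print(f"d = {d}: S- = S+ = {percent}%   slope at optimum = {roc_slope(u_opt, d):.3f}")
```

Read this way, d measures how many scale units separate the two populations, and the table shows how slowly the attainable accuracy grows with that separation.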
Recall the strategy: Maximize Youden's Index; the implicit assumption is that false positives and false negatives are equally important.
Alternative: One could minimize the weighted average error,
\[ \omega \Pr[T=- \mid D=+] + (1-\omega) \Pr[T=+ \mid D=-]; \quad 0 < \omega < 1 \]
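The effect of the weight can be seen in a small sketch under a hypothetical equal-variance normal model (means 0 and 2, standard deviation 1). The weight, called `omega` here, is the importance given to false negatives; with omega = 0.5 the criterion is equivalent to maximizing Youden's Index.

```python
from statistics import NormalDist

healthy = NormalDist(mu=0.0, sigma=1.0)    # hypothetical healthy population
diseased = NormalDist(mu=2.0, sigma=1.0)   # hypothetical diseased population

def weighted_error(y, omega):
    false_negative = diseased.cdf(y)        # Pr(T=-|D=+)
    false_positive = 1 - healthy.cdf(y)     # Pr(T=+|D=-)
    return omega * false_negative + (1 - omega) * false_positive

def best_cutpoint(omega):
    grid = [-2 + 0.01 * i for i in range(601)]    # cutpoints from -2 to 4
    return min(grid, key=lambda y: weighted_error(y, omega))

cut_equal = best_cutpoint(0.5)       # equal weights
cut_fn_heavy = best_cutpoint(0.8)    # false negatives weighted more heavily
print(cut_equal, cut_fn_heavy)
```

Weighting false negatives more heavily lowers the cutpoint, so more subjects are called positive.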
EXERCISES
In a previous study (Anderson et al, 2001) of environmental tobacco smoke, we compared two groups of non-smoking women: $n_1$ = 23 women who had male partners who smoke in the home and $n_0$ = 22 women who had male partners who did not smoke. Urine samples were obtained and analyzed, and the comparison was based on a number of chemicals, among them cotinine (a metabolite of nicotine, in nmol/mL) and NNAL and its glucuronide, NNAL-Gluc (NNAL and NNAL-Gluc are metabolites of the tobacco-specific lung carcinogen called NNK, in pmol/mL). Data (cotinine, NNAL+NNAL-Gluc) are given below (ND is for not detectable; the limit of detection for cotinine is .003 nmol/mL and for NNAL and NNAL-Gluc is .005 pmol/mL; one case has a missing value for NNAL+NNAL-Gluc):
Non-exposed women: (ND,ND), (ND,ND), (ND,ND), (ND,ND),
(ND,ND), (ND,.008), (.003,ND), (.003,.015), (.006,ND),
(.007,ND), (.007,ND), (.007,.018), (.008,ND), (.008,ND),
(.009,ND), (.01,ND), (.012,ND), (.016,ND), (.017,ND),
(.019,.047), (.025,ND), and (.03,ND).
Exposed women: (ND,.067), (.003,.009), (.003,.012), (.007,.039),
(.008,ND), (.008,.010), (.008,.011), (.009,ND), (.011,.037),
(.017,.072), (.018,-), (.021,.083), (.036,.022), (.037,.032),
(.042,.063), (.046,ND), (.053,.210), (.076,.041), (.099,.018),
(.101,.031), (.111, .018), (.122,.282), and (.200,.027).
17.1 Determine the optimal cutpoint for cotinine and the corresponding sensitivity and specificity.
17.2 Determine the optimal cutpoint for NNAL+NNAL-Gluc and the corresponding sensitivity and specificity.
17.3 Compare the two resulting tests.
REFERENCES
Anderson KE, Carmella SG, Ye M, Bliss RL, Le CT, Murphy L, Hecht SS
(2001). Journal of the National Cancer Institute 93: 378-381.
Bamber D (1975). Journal of Mathematical Psychology 12: 387-415.
Begg CB (1991). Statistics in Medicine 10: 1887-1895.
Chou TC (1976). Journal of Theoretical Biology 59: 233-276.
Gart JJ, Buck AA (1966). American Journal of Epidemiology 83: 593-602.
Le CT (1997). Biometrics 53: 998-1007.
Le CT (2006). Statistical Methods in Medical Research 15(6): 571-584.
Simpson AJ, Fitter MJ (1973). Psychological Bulletin 80: 481-488.
Steck GB and Zimmer (1972). Proceedings 1972 NATO conference on
reliability evaluation and reliability testing; The Hague, Netherlands.
Youden WJ (1950). Index for rating diagnostic tests. Cancer 3: 32-35.