Analytical Methods Validation Guide

The document outlines the process of analytical methods validation, emphasizing the importance of establishing documented evidence to ensure methods yield accurate results. It details various validation characteristics such as accuracy, precision, specificity, and acceptance criteria, while also referencing guidelines from organizations like ICH, FDA, and Health Canada. Additionally, it provides a structured approach for preparing validation protocols and includes specific methodologies for testing and documenting results.

ANALYTICAL METHODS VALIDATION

TABLE OF CONTENTS

Step-by-Step Analytical Methods Validation and Protocol in the Quality System Compliance Industry

Validation of Analytical Methods Used in Cleaning Validation

Good Analytical Method Validation Practice: Deriving Acceptance Criteria for the AMV Protocol: Part II

Good Analytical Method Validation Practice: Setting up for Compliance and Efficiency: Part I

Validating Immunoassays Using the Fluorescence Polarization Assay for the Diagnosis of Brucellosis
Analytical Methods Validation
Step-by-Step Analytical Methods Validation and Protocol in the Quality System Compliance Industry

BY GHULAM A. SHABIR


Introduction

Methods Validation: Establishing documented evidence that provides a high degree of assurance that a specific method, and the ancillary instruments included in the method, will consistently yield results that accurately reflect the quality characteristics of the product tested.

Method validation is an important requirement for any package of information submitted to international regulatory agencies in support of new product marketing or clinical trials applications. Analytical methods should be validated, including methods published in the relevant pharmacopoeia or other recognized standard references. The suitability of all test methods used should always be verified under the actual conditions of use and should be well documented.

Methods should be validated to include consideration of characteristics included in the International Conference on Harmonization (ICH) guidelines1,2 addressing the validation of analytical methods. Analytical methods outside the scope of the ICH guidance should always be validated.

ICH is concerned with harmonization of technical requirements for the registration of products among the three major geographical markets of the European Community (EC), Japan, and the United States (U.S.) of America. The recent U.S. Food and Drug Administration (FDA) methods validation guidance document,3-5 as well as the United States Pharmacopoeia (USP),6 both refer to ICH guidelines.

The most widely applied typical validation characteristics for various types of tests are accuracy, precision (repeatability and intermediate precision), specificity, detection limit, quantitation limit, linearity, range, and robustness (Figure 1). In addition, methods validation information should also include stability of analytical solutions and system suitability.7

Health Canada (HC) has also issued guidance on methods validation entitled Acceptable Methods Guidance.8 HC has been an observer of ICH, and has adopted ICH guidelines subsequent to their reaching Step Four of the ICH process. The Acceptable Methods guidance predates ICH, and HC plans to revise this guidance to reflect current ICH terminology.

Figure 2 shows the data required for different types of analysis for method validation. Where areas of the Acceptable Methods Guidance are superseded by ICH Guidelines Q2A and Q2B,1,2 HC accepts the requirements of either the ICH or Acceptable Methods Guidance; however, for method validation, ICH acceptance criteria are preferred. HC's Acceptable Methods Guidance provides useful guidance on methods not covered by the ICH guidelines (e.g., dissolution, biological methods), and provides acceptance criteria for validation parameters and system suitability tests for all methods.

HC has also issued templates recommended as an approach for summarizing analytical methods and validation data. ICH terminology was used when developing these templates.

This paper suggests one technique of validating methods. There are numerous other ways to validate methods, all
Institute of Validation Technology

Figure 1
ICH, USP, and FDA Methods Validation Characteristics Requirements for Various Types of Tests

Validation Characteristic   Assay   Impurities:    Impurities:   Identification
                                    Quantitative   Limit
Accuracy                    Yes     Yes            No            No
Precision - Repeatability   Yes     Yes            No            No
Precision - Intermediate    Yes*    Yes*           No            No
Specificity                 Yes     Yes            Yes           Yes
Detection limit             No      No             Yes           No
Quantitation limit          No      Yes            No            No
Linearity                   Yes     Yes            No            No
Range                       Yes     Yes            No            No
Robustness                  Yes     Yes            No            No

* In cases where reproducibility has been performed, intermediate precision is not needed.7

Figure 2
Health Canada Methods Validation Parameter Requirements for Various Types of Tests

Validation                 Identity   Active Ingredients:     Impurities / Degradation      Physico-Chemical
Parameters                 Tests      Drug       Drug         Products:                     Tests
                                      Substance  Product      Quantitative   Limit Tests
Precision (of the system)  No         Yes        Yes          Yes            1              Yes
Precision (of the method)  No         1          Yes          Yes            1              Yes
Linearity                  No         Yes        Yes          Yes            No             Yes
Accuracy                   No         Yes        Yes          Yes            1              Yes
Range                      No         1          Yes          Yes            No             Yes
Specificity                Yes        1          Yes          Yes            Yes            *
Detection Limit            1          No         No           Yes            Yes            *
Quantitation Limit         No         No         No           Yes            No             *
Ruggedness                 1          Yes        Yes          Yes            Yes            Yes

* May be required depending upon the nature of the test.

equally acceptable when scientifically justified.

Prepare a Protocol

The first step in method validation is to prepare a protocol, preferably written, with the instructions in a clear step-by-step format, and approved prior to initiation. This approach is discussed in this paper. The suggested acceptance criteria may be modified depending on the method used, the required accuracy, and the required sensitivity. (Note: Most of the acceptance criteria come from the characterization study.) Furthermore, some tests may be omitted, and the number of replicates may be reduced or increased based on scientifically sound judgment.

A test method is considered validated when it meets the acceptance criteria of a validation protocol. This paper is a step-by-step practical guide for preparing protocols and performing test methods validation with reference to High Performance Liquid Chromatography (HPLC) (use similar criteria for all other instrumental test method validation) in the quality system compliance industry.

Methods validation must have a written and approved protocol prior to its initiation. A project controller will select a validation Cross-Functional Team (CFT) from various related departments and functional areas. The project controller assigns responsibilities. The following tables illustrate one suggested way of documenting and preserving a record of the approvals granted at the various phases of the validation:

Analytical Methods Validation Protocol Approval Cover Page

Summary Information
Organization name
Site location
Department performing validation
Protocol title
Validation number
Equipment
Revision number

Project Controller

Name                    Signature                    Date

Document Approval

Department / Functional Area        Name        Signature        Date
Technical Reviewer
End Lab Management
Health & Safety
Quality Assurance
Documentation Control
(reviewed and archived by)

Revision History
Revision No. Date Description of change Author


of the validation:

Writing a Test Method Validation Protocol

Analytical method validations should contain the following information in detail:

Purpose: This section provides a short description of what is to be accomplished by the study.

Project scope: Identify the test methods and which products are within the scope of the validation.

Overview: This section contains the following: a general description of the test method, a summary of the characterization studies, identification of method type and validation approach, test method applications and validation protocol, the intended use of each test method application, and the analytical performance characteristics for each test method application.

Resources: This section identifies the following: the end user laboratory where the method validation is to be performed; equipment to be used in the method validation; software to be used in the method validation; materials to be used in the method validation; and special instructions on handling, stability, and storage for each material.

Appendices: This section contains references, a signature and review worksheet for all personnel, their specific tasks, and the documentation of their training. Listings of all equipment and software necessary to perform the method validation should be found here, along with document and materials worksheets used in the method validation and in the test method procedure(s).

1. Analytical Performance Characteristics Procedure

Before undertaking the task of methods validation, it is necessary that the analytical system itself be adequately designed, maintained, calibrated, and validated. All personnel who will perform the validation testing must be properly trained. The method validation protocol must be agreed upon by the CFT and approved before execution. For each of the previously stated validation characteristics (Figure 1), this document defines the test procedure, documentation, and acceptance criteria. Specific values are taken from the ICH, U.S. FDA, USP, HC, and pertinent literature as references. (See the References section at the end of this article for further definitions and explanations.)

1.1. Specificity

1.1.1. Test procedure
The specificity of the assay method will be investigated by injecting the extracted placebo to demonstrate the absence of interference with the elution of the analyte.

1.1.2. Documentation
Print chromatograms.

1.1.3. Acceptance criteria
The excipient compounds must not interfere with the analysis of the targeted analyte.

1.2. Linearity

1.2.1. Test procedure
Standard solutions will be prepared at six concentrations, typically 25, 50, 75, 100, 150, and 200% of the target concentration. Three individually prepared replicates at each concentration will be analyzed. The method of standard preparation and the number of injections will be the same as used in the final procedure.

1.2.2. Documentation
Record results on a datasheet. Calculate the mean, standard deviation, and Relative Standard Deviation (RSD) for each concentration. Plot concentration (x-axis) versus mean response (y-axis) for each concentration. Calculate the regression equation and coefficient of determination (r2). Record these calculations on the datasheet.

1.2.3. Acceptance criteria
The correlation coefficient for six concentration levels will be ≥ 0.999 for the range of 80 to 120% of the target concentration. The y-intercept must be ≤ 2% of the target concentration response. A plot of response factor versus concentration must show all values within 2.5% of the target level response factor, for concentrations between 80 and 120% of the target concentration.9,10 HC states that the coefficient of determination for active ingredients should be ≥ 0.997, for impurities 0.98, and for biologics 0.95.8

1.3. Range

1.3.1. Test procedure
The data obtained during the linearity and accuracy studies will be used to assess the range of the method.
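The regression and acceptance checks described in 1.2.2 and 1.2.3 can be sketched in a few lines of code. This is an illustrative least-squares fit only; the function name and all concentration/response figures below are invented, not taken from the article.

```python
# Hypothetical linearity check: fit mean peak response vs. concentration,
# then test r^2 >= 0.999 and |y-intercept| <= 2% of the 100%-target response.

def linearity_stats(conc, resp):
    """Return slope, intercept, and coefficient of determination (r^2)."""
    n = len(conc)
    mx = sum(conc) / n
    my = sum(resp) / n
    sxx = sum((x - mx) ** 2 for x in conc)
    sxy = sum((x - mx) * (y - my) for x, y in zip(conc, resp))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(conc, resp))
    ss_tot = sum((y - my) ** 2 for y in resp)
    r2 = 1 - ss_res / ss_tot
    return slope, intercept, r2

# Mean responses at 25-200% of a 20 mg/ml target (illustrative numbers)
conc = [5, 10, 15, 20, 30, 40]            # mg/ml
resp = [251, 498, 752, 1003, 1501, 1998]  # mean peak area of three injections

slope, intercept, r2 = linearity_stats(conc, resp)
target_resp = slope * 20 + intercept      # predicted response at 100% target
passes = r2 >= 0.999 and abs(intercept) <= 0.02 * target_resp
```

The same datasheet quantities (mean, SD, RSD per level) would be computed on the raw replicate responses before averaging.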


Linearity - Data Sheet                    Electronic file name:

Concentration    Concentration as %    Peak Area (mean of    Peak Area
(mg/ml)          of Analyte Target     three injections)     RSD (%)
5 (e.g.)         25
10               50
15               75
20               100
30               150
40               200

Equation for regression line =
Correlation coefficient (r2) =

Range - Data Sheet                        Electronic file name:

Record range:

Accuracy - Data Sheet                     Electronic file name:

Sample    Percent of Nominal          Amount of Standard (mg)    Recovery (%)
          (mean of three injections)  Spiked        Found
1         75 (e.g.)
2         100
3         150
Mean
SD
RSD%
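The recovery column in the accuracy datasheet is computed as amount found / amount spiked x 100, with the mean checked against the 90 to 110% criterion of 1.4.3. A minimal sketch, with invented spiked/found amounts:

```python
# Illustrative recovery calculation for the accuracy study; the spiked and
# found figures below are made-up examples, not data from the article.

spiked = [15.0, 20.0, 30.0]   # mg added at 75, 100, and 150% of nominal
found = [14.8, 20.3, 29.6]    # mg measured (mean of three injections)

recoveries = [f / s * 100 for f, s in zip(found, spiked)]
mean_recovery = sum(recoveries) / len(recoveries)
passes = 90.0 <= mean_recovery <= 110.0
```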

Repeatability - Data Sheet Electronic file name:


Injection No. Retention Time (min) Peak Area Peak Height

Replicate 1
Replicate 2
Replicate 3
Replicate 4
Replicate 5
Replicate 6
Replicate 7
Replicate 8
Replicate 9
Replicate 10
Mean
SD
RSD%
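The mean/SD/RSD arithmetic behind the repeatability datasheet reduces to a few lines; the ten peak areas below are invented, and the ≤ 1% limit follows the FDA/HC figures quoted in 1.5.3.

```python
# Sketch of the repeatability (1.5) roll-up for ten replicate injections.
from statistics import mean, stdev

areas = [1002, 998, 1005, 1001, 997, 1003, 999, 1004, 1000, 996]

m = mean(areas)
sd = stdev(areas)     # sample standard deviation (n - 1)
rsd = sd / m * 100    # relative standard deviation, %
passes = rsd <= 1.0   # FDA/HC repeatability criterion for drug substances
```

The same calculation would be repeated for retention time and peak height.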


The precision data used for this assessment is the precision of the three replicate samples analyzed at each level in the accuracy studies.

1.3.2. Documentation
Record the range on the datasheet.

1.3.3. Acceptance criteria
The acceptable range will be defined as the concentration interval over which linearity and accuracy are obtained per the above criteria, and that, in addition, yields a precision of ≤ 3% RSD.9

1.4. Accuracy

1.4.1. Test procedure
Spiked samples will be prepared at three concentrations over the range of 50 to 150% of the target concentration. Three individually prepared replicates at each concentration will be analyzed. When it is impossible or difficult to prepare known placebos, use a low concentration of a known standard.

1.4.2. Documentation
For each sample, report the theoretical value, assay value, and percent recovery. Calculate the mean, standard deviation, RSD, and percent recovery for all samples. Record results on the datasheet.

1.4.3. Acceptance criteria
The mean recovery will be within 90 to 110% of the theoretical value for non-regulated products. For the U.S. pharmaceutical industry, 100 ± 2% is typical for an assay of an active ingredient in a drug product over the range of 80 to 120% of the target concentration.9 Lower percent recoveries may be acceptable based on the needs of the method. HC states that the required accuracy is a bias of ≤ 2% for dosage forms and ≤ 1% for drug substance.8

1.5. Precision - Repeatability

1.5.1. Test procedure
One sample solution containing the target level of analyte will be prepared. Ten replicates will be made from this sample solution according to the final method procedure.

1.5.2. Documentation
Record the retention time, peak area, and peak height on the datasheet. Calculate the mean, standard deviation, and RSD.

1.5.3. Acceptance criteria
The FDA states that the typical RSD should be 1% for drug substances and drug products, and ± 2% for bulk drugs and finished products. HC states that the RSD should be 1% for drug substances and 2% for drug products. For minor components, it should be ± 5% but may reach 10% at the limit of quantitation.8

1.6. Intermediate Precision

1.6.1. Test procedure
Intermediate precision (within-laboratory variation) will be demonstrated by two analysts, using two HPLC systems on different days and evaluating the relative percent purity data across the two HPLC systems at three concentration levels (50%, 100%, 150%) that cover the analyte assay method range of 80 to 120%.

1.6.2. Documentation
Record the relative % purity (% area) of each concentration on the datasheet. Calculate the mean, standard deviation, and RSD for the operators and instruments.

1.6.3. Acceptance criteria
The assay results obtained by two operators using two instruments on different days should have a statistical RSD ≤ 2%.9,10

1.7. Limit of Detection

1.7.1. Test procedure
The lowest concentration of the standard solution will be determined by sequentially diluting the sample. Six replicates will be made from this sample solution.

1.7.2. Documentation
Print the chromatogram and record the lowest detectable concentration and RSD on the datasheet.

1.7.3. Acceptance criteria
The ICH references a signal-to-noise ratio of 3:1.2 HC recommends a signal-to-noise ratio of 3:1. Some analysts calculate the standard deviation of the signal (or response)


Intermediate Precision - Datasheet        Electronic file name:

Relative % Purity (% area)
                      Instrument 1                  Instrument 2
Sample                S1       S2       S3          S1       S2       S3
                      (50%)    (100%)   (150%)      (50%)    (100%)   (150%)
Operator 1, day 1
Operator 1, day 2
Operator 2, day 1
Operator 2, day 2
Mean (Instrument)
Mean (Operators)
RSD%                  S1 + S1  S2 + S2  S3 + S3
Instruments
Operators
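The 1.6.3 roll-up across operators, instruments, and days amounts to a pooled RSD per concentration level. The sketch below uses invented purity values and pools the eight results at a single level against the ≤ 2% criterion; how results are grouped (per level, per operator, per instrument) is a protocol choice, not specified by the snippet.

```python
# Pooled intermediate-precision RSD at one concentration level (illustrative).
from statistics import mean, stdev

# rows: operator 1 day 1, operator 1 day 2, operator 2 day 1, operator 2 day 2
# cols: instrument 1, instrument 2 (values are made-up relative % purity)
purity = [
    [99.1, 98.9],
    [99.0, 99.2],
    [98.8, 99.1],
    [99.2, 98.9],
]

results = [p for row in purity for p in row]
rsd = stdev(results) / mean(results) * 100
passes = rsd <= 2.0
```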

Limit of Detection - Data Sheet Electronic file name:


Record sample data results: (e.g., concentration, S/N ratio, RSD%)

Limit of Quantitation - Data Sheet Electronic file name:


Record sample data results: (e.g., concentration, S/N ratio, RSD%)
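The 3:1 and 10:1 signal-to-noise figures cited in 1.7.3 and 1.8.3 can be checked with a trivial helper. The peak heights and baseline noise below are assumed values for illustration, not data from the article.

```python
# Simple peak-height / baseline-noise ratio for LOD (S/N ~3:1) and
# LOQ (S/N ~10:1) screening; all numeric values are invented.

def signal_to_noise(peak_height, baseline_noise):
    """Return the peak-height to baseline-noise ratio."""
    return peak_height / baseline_noise

noise = 0.4                                  # baseline noise (arbitrary units)
lod_ok = signal_to_noise(1.3, noise) >= 3    # candidate LOD-level peak
loq_ok = signal_to_noise(4.2, noise) >= 10   # candidate LOQ-level peak
```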

of a number of blank samples and then multiply this number by two to estimate the signal at the limit of detection.

1.8. Limit of Quantitation

1.8.1. Test procedure
Establish the lowest concentration at which an analyte in the sample matrix can be determined with the accuracy and precision required for the method in question. This value may be the lowest concentration in the standard curve. Make six replicates from this solution.

1.8.2. Documentation
Print the chromatogram and record the lowest quantified concentration and RSD on the datasheet. Provide data that demonstrates the accuracy and precision required in the acceptance criteria.

1.8.3. Acceptance criteria
The limit of quantitation for chromatographic methods has been described as the concentration that gives a signal-to-noise ratio of 10:1 (a peak with height at least ten times as high as the baseline noise level).2 HC states that the quantitation limit is the best estimate of a low concentration that gives an RSD of approximately 10% for a minimum of six replicate determinations.8

1.9. System Suitability

1.9.1. Test procedure
System suitability tests will be performed on both HPLC systems to determine the accuracy and precision of the system by making six injections of a solution containing the analyte at 100% of the test concentration. The following parameters will be determined: plate count, tailing factor, resolution, and reproducibility (percent RSD of retention time, peak area, and peak height for six injections).

1.9.2. Documentation
Print the chromatogram and record the data on the datasheet.

1.9.3. Acceptance criteria
Retention factor (k): the peak of interest should be well resolved from other peaks and the void volume; generally, k should be ≥ 2.0. Resolution (Rs): Rs should be ≥ 2 between the peak of interest and the closest eluted peak,


System Suitability - Data Sheet           Electronic file name:

System Suitability Parameter       Acceptance     Results              Criteria Met /
                                   Criteria       HPLC 1    HPLC 2     Not Met
Injection precision for
retention time (min)               RSD ≤ 1%
Injection precision for
peak area (n = 6)                  RSD ≤ 1%
Injection precision for
peak height                        RSD ≤ 1%
Resolution (Rs)                    Rs ≥ 2.0
USP tailing factor (T)             T ≤ 2.0
Capacity factor (k)                k ≥ 2.0
Theoretical plates (N)             N ≥ 2000
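The parameters in the table above are routinely computed from the chromatogram. A hedged sketch using common pharmacopeial formulas — half-height plate count N = 5.54 (tR / w1/2)^2, tailing T = W0.05 / (2f), and resolution Rs = 2 (t2 - t1) / (W1 + W2) — with invented retention times and peak widths:

```python
# Illustrative system-suitability calculations; all chromatographic
# measurements below (times in minutes, widths in minutes) are made up.

def plates(t_r, w_half):
    """Theoretical plates from the half-height peak width."""
    return 5.54 * (t_r / w_half) ** 2

def tailing(w05, front):
    """Tailing factor: width at 5% height over twice the front half-width."""
    return w05 / (2 * front)

def resolution(t1, w1, t2, w2):
    """Resolution between two adjacent peaks from baseline widths."""
    return 2 * (t2 - t1) / (w1 + w2)

n = plates(6.0, 0.25)              # retention 6.0 min, half-height width 0.25 min
t = tailing(0.30, 0.14)            # 5%-height width 0.30 min, front half 0.14 min
rs = resolution(6.0, 0.45, 7.2, 0.50)

suitable = n >= 2000 and t <= 2.0 and rs >= 2.0
```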

Robustness - Data Sheet Electronic file name:


Explain / record sample data:

which is potentially interfering (an impurity, excipient, or degradation product). Reproducibility: the RSD for peak area, peak height, and retention time will be ≤ 1% for six injections. Tailing factor (T): T should be ≤ 2. Theoretical plates (N): N should be ≥ 2000.3

1.10. Robustness

As defined by the USP, robustness measures the capacity of an analytical method to remain unaffected by small but deliberate variations in method parameters. Robustness provides some indication of the reliability of an analytical method during normal usage. The parameters that will be investigated are percent organic content in the mobile phase or gradient ramp, pH of the mobile phase, buffer concentration, temperature, and injection volume. These parameters may be evaluated one factor at a time or simultaneously as part of a factorial experiment.

The chromatography obtained for a sample containing representative impurities, when using modified parameter(s), will be compared to the chromatography obtained using the target parameters. The effects of the following changes in chromatographic conditions will be determined: methanol content in the mobile phase adjusted by ± 2%, mobile phase pH adjusted by ± 0.1 pH units, and column temperature adjusted by ± 5°C. If these changes are within the limits that produce acceptable chromatography, they will be incorporated in the method procedure.9,10

2. Appendices

List all appendices associated with this protocol. Each appendix needs to be labeled and paginated separately

Article Acronym Listing

CFT: Cross-Functional Team
EC: European Community
FDA: Food and Drug Administration
HC: Health Canada
HPLC: High Performance Liquid Chromatography
ICH: International Conference on Harmonization
RSD: Relative Standard Deviation
U.S.: United States
USP: United States Pharmacopoeia
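For the factorial option mentioned in 1.10, the set of robustness runs can be enumerated programmatically. Only the ± deltas (methanol ± 2%, pH ± 0.1, temperature ± 5°C) come from the article; the nominal conditions below are invented for illustration.

```python
# Enumerate a full factorial robustness design (2^3 = 8 runs) around
# hypothetical nominal chromatographic conditions.
from itertools import product

nominal = {"methanol_pct": 60, "ph": 3.0, "temp_c": 30}
deltas = {"methanol_pct": 2, "ph": 0.1, "temp_c": 5}

runs = [
    dict(zip(nominal, combo))
    for combo in product(*[(nominal[k] - deltas[k], nominal[k] + deltas[k])
                           for k in nominal])
]
# Each of the 8 runs is then compared against the target-parameter run.
```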


List of Appendices

Appendix No.    Document Title    Total Pages

Appendix 1
Method Validation Personnel Signature and Review Worksheet

Analyst Name    Dept.    Validation Protocol Activity Reference    Analyst Signature    Date

Comments:

Completed By:    Signature:    Date:

Appendix 2
Equipment and Software Used in Method Validation Worksheet

Equipment Name / Module #    Last Calibration Date    Next Calibration Date    Software Name and Version    Validation Reference

Comments:

Completed By:    Signature:    Date:


Appendix 3
Document and Materials Used in Method Validation Worksheet

Complete Pre-Protocol Execution

Document Name / Ref. No.    Edition/Version Number    Material Name    Supplier / Lot Number    Expiration Date

Comments:

Completed By:    Signature:    Date:

Appendix 4
Analytical Test Method Procedure

This procedure should include the entire testing method and all procedures associated with it. This appendix can appear in any format, but it should always be included in the documentation.

from the body of the document. The following information must be found on every page of each appendix: validation protocol number; validation protocol title; appendix number (e.g., 1, 2, 3, … or A, B, C, …); and page X of Y. ❏

Acknowledgements

I thank Abbott Laboratories and MediSense for permission to publish this article. I also thank Dr. Alison Ingham (Health Canada) for his comments on the text.

About the Author

Ghulam Shabir is a Principal Scientist at Abbott Laboratories, MediSense UK. His group is responsible for materials characterization, analytical methods development, and validation and equipment qualification. Ghulam is a Fellow of the Institute of Quality Assurance and a Companion of the Institute of Manufacturing with 17 years of broad-based experience in the areas of production, quality control, quality assurance, in-process control, R&D, and validation in the pharmaceutical industry. Ghulam has received many technical excellence industrial awards as well as academic awards for authoring 'best scientific papers.' Ghulam's work has appeared in many publications. He has given several presentations at international conferences as well as having organized and moderated at symposiums. He holds a Master's Degree in Chemistry and Pharmaceutical Sciences. He can be reached by phone at 44-1993-863099, by fax at 44-1235-467737, or by e-mail at [Link]@[Link].

References

1. International Conference on Harmonization (ICH), Q2A: Text on Validation of Analytical Procedures, March 1995.
2. International Conference on Harmonization (ICH), Q2B: Validation of Analytical Procedures: Methodology, May 1997.
3. U.S. Center for Drug Evaluation and Research, Reviewer Guidance: Validation of Chromatographic Methods, November 1994.


4. U.S. FDA, Guidance for Submitting Samples and Analytical Data for Methods Validation, Rockville, MD, USA, Center for Drugs and Biologics, Department of Health and Human Services, February 1987.
5. U.S. FDA DHHS, 21 CFR Parts 210 and 211, Current Good Manufacturing Practice of Certain Requirements for Finished Pharmaceuticals, Proposed Rule, May 1996.
6. Validation of Compendial Methods, <1225>, U.S. Pharmacopoeia 26-National Formulary 21, United States Pharmacopeial Convention, Rockville, MD, 2003.
7. U.S. FDA, Guidance for Industry: Analytical Procedures and Methods Validation: Chemistry, Manufacturing and Controls Documentation, August 2000.
8. Drugs Directorate Guidelines, Acceptable Methods, National Health and Welfare, Health Protection Branch, Canada, July 1994. (This guidance is available from HC as a print copy, but is soon to be released on the website [Link])
9. M.J. Green, Anal. Chem., Vol. 68, 1996, p. 305A.
10. G.A. Shabir, J. Chromatogr. A, Vol. 987, 2003, p. 57.

Validation of Analytical Methods Used in Cleaning Validation

BY HERBERT J. KAISER, PH.D. & BRUCE RITTS, M.S.


The information used to establish a positive cleaning validation is based on the results of validated analytical measurements. There must be a high degree of confidence in these results, as human safety depends on the lack of residues remaining on equipment. This article will describe various aspects regarding the validation of analytical methods used in cleaning validations. The validation elements are explored from both a theoretical point of view and through examples. References are provided to guide the reader to more in-depth information.

An analytical method is one of the deciding factors in establishing the cleanliness of pharmaceutical manufacturing equipment. It is, therefore, important that there be a high level of confidence in the results obtained using the method. This high level of confidence is established by testing and defining the usefulness of the analytical method. A properly developed cleaning validation strategy includes the analytical method validation, which defines the method parameters necessary to provide a high level of confidence in the cleaning results. The analytical method validation study demonstrates to scientific staffs, manufacturing personnel, and regulatory agencies that the method performs as required, and that the results are reliable. There are many articles available that address analytical method validation within and outside of the pharmaceutical industry, both domestically and worldwide.1,2,3,4,5,6,7,8

Personnel other than analytical chemists may not understand the need for analytical method validation, let alone the extent to which these methods need to be evaluated. They may not understand that analytical method validation, as well as cleaning validation, has an important impact on everyday pharmaceutical manufacturing.9 As is the case with cleaning validation, analytical methods may need to be revalidated.10 This revalidation may arise from changes in instrumentation, analytes, manufacturing methods, or cleaning processes that affect the ability of the analytical method to determine the correct analyte level.

Measuring cleanliness is a difficult task. Essentially, trace residues on surfaces are the target analytes. The residue must first be extracted from a surface, recovered from the extraction medium, and then suitably quantitated. Residue analysis is quite different from analyzing bulk or formulated drug actives, as obtainable precisions and accuracies may be larger than the analyst is accustomed to. Sensitivity levels of the techniques employed need to be considered for linearity, precision, and accuracy. The first decision to be made is which residue will be measured. This residue could be the active drug, formulation excipients, or a component of the cleaner.11 In most cases, the residue being analyzed has the potential to be a combination of all of these. The next step is to decide on the allowed residue limit,12,13 followed by the choice of whether to use a specific or non-specific technique. It is only after these decisions have been made that the analytical technique can be selected.

Analytical method validation is the analysis of the reproducibility of the method developed. There should not be any surprises in a validation study. All of the parameters required in a validation study need to be preliminarily evaluated during method development. Method development is the process by which the analytical chemist obtains the initial information to establish the limits and goals that are listed in a validation protocol, e.g., precision of 5% between analysts, linearity, accuracy, repeatability, etc. Understanding the required parameters during method development is a requirement for successful analytical method validation.


Regulations

The requirement for analytical method validation is identified in the Good Manufacturing Practice (GMP) regulations (21 CFR 211). The United States Pharmacopoeia (USP) provides a widely used standard for analytical method validation, and is probably the most often used reference regarding the subject.14 The Food and Drug Administration (FDA) submitted guidelines for analytical method validation in 199515 that correlate with the recommendations of the International Conference on Harmonization (ICH).16 The ICH then issued a document describing different approaches that can be used in analytical method validation.17 The FDA has also issued a guide for the validation of cleaning processes that states the need for validated analytical methods in cleaning validations.18

The ICH documents, along with the USP document, describe validation guidelines for methods used in different applications. The USP describes four different categories of methods. Category I methods involve the quantitation of major components of bulk drug substances or active ingredients in finished pharmaceutical products. Category II methods involve the determination of impurities in bulk drug substances, or degradation compounds in finished pharmaceutical products. These include quantitative and limit tests. Category III methods are used for the evaluation of performance characteristics, and Category IV methods are identification tests. The ICH guidelines define the same categories as the USP, except for USP Category III. Table 1 lists the ICH categories and required parameters.

What parameters apply to analytical methods used in cleaning validations? The residues being determined are potential impurities. Therefore, the parameters that should be evaluated are most closely associated with Category II requirements (quantitative analysis of residues). It could be argued that the most important parameters are the limits of quantitation and detection, because these are the measures of sensitivity of the analytical method.

Chromatographic versus Non-Chromatographic Methods

While various chromatographic methods, specifically High Performance Liquid Chromatography (HPLC), may be the more common methods used in analytical laboratories, there are certainly other applicable methods.19,20 Some examples are Total Organic Carbon (TOC),21 capillary electrophoresis,22 Atomic Absorption (AA), Inductively Coupled Plasma (ICP), titrations, ultraviolet spectroscopy, near infrared,23 enzymatic,24 etc. These "other" methods also require

Table 1
__________________________________________________________________
ICH method parameters.

Parameter                  Identity   Impurities        Assay
                                      Quant.   Limit
Accuracy                      -          +        -        +
Precision: Repeatability      -          +        -        +
Precision: Intermediate       -          +        -        +
Specificity                   +          +        +        +
LOD                           -          -*       +        -
LOQ                           -          +        -        -
Linearity                     -          +        -        +
Range                         -          +        -        +

* May be needed for some applications
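Table 1 can also be expressed as a small lookup structure, which is convenient when drafting validation protocols or checklists. The sketch below is ours, not part of any guideline; it simply encodes the table, with the footnoted LOD entry marked "sometimes".

```python
# ICH validation parameters by method category (encoding of Table 1).
# True = normally required, False = normally not required,
# "sometimes" = may be needed for some applications (table footnote).
ICH_PARAMETERS = {
    "identity": {
        "accuracy": False, "repeatability": False,
        "intermediate_precision": False, "specificity": True,
        "lod": False, "loq": False, "linearity": False, "range": False,
    },
    "impurities_quantitative": {
        "accuracy": True, "repeatability": True,
        "intermediate_precision": True, "specificity": True,
        "lod": "sometimes", "loq": True, "linearity": True, "range": True,
    },
    "impurities_limit": {
        "accuracy": False, "repeatability": False,
        "intermediate_precision": False, "specificity": True,
        "lod": True, "loq": False, "linearity": False, "range": False,
    },
    "assay": {
        "accuracy": True, "repeatability": True,
        "intermediate_precision": True, "specificity": True,
        "lod": False, "loq": False, "linearity": True, "range": True,
    },
}

def required_parameters(category):
    """Return the parameter names normally required for a category."""
    return sorted(p for p, needed in ICH_PARAMETERS[category].items()
                  if needed is True)
```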

Institute of Validation Technology

validation, and while the USP and ICH parameters may appear more suitable for chromatographic methods, they are certainly applicable to non-chromatographic methods.25 A good understanding of how to adapt and measure the required parameters using the specified analytical technique is all that is required. In fact, various methods can be used to validate each other, i.e., a mass spectroscopic technique could be used to determine the specificity of another technique.26

Table 2
_____________________________________________
Data used for linearity evaluations

[Active X], ppm    Response    Standard Deviation    %RSD
 0.10                  613            72.5           11.80
 0.40                  900           100.0           11.00
 0.70                  998            33.3            3.33
 1.1                  1472            51.1            3.50
 1.7                  2398            53.0            2.21
 2.5                  3398            55.3            1.63
 5.0                  6800           100.0            1.50
10.0                 13467           153.0            1.10
20.0                 27133           208.0            0.77
30.0                 40417           284.0            0.70
40.0                 49500           500.0            1.00

Specificity

Specificity is the ability of a method to measure the analyte in the presence of components which may be expected to be present. For cleaning validation methods, the potential presence of drug actives, formulation excipients, impurities, known degradation products, and cleaner components (if any) should be anticipated.27 Experiments must be conducted that demonstrate the absence of interferences when the analyte is in a typical matrix.

Co-elution of components in chromatographic methods is typically the primary concern here. If HPLC with diode array capabilities is utilized, peak purity can be evaluated by examining the spectra across the peak. Most HPLC software programs will automatically calculate peak purity. Poor peak purity may be an indication of the presence of excipients, degradants, or cleaning components within the peak of interest.

Cleaning validation methods should be carefully evaluated for interferences. Studies should be conducted involving the analyte in the presence of the cleaning agent. If the cleaning agent is being quantitated, the effect of the drug active and formulation components should be evaluated. If the drug active or a formulation component is being analyzed, it must be shown that the cleaning process does not affect the analyte. This means that the cleaning process does not change the analyte in such a manner that it is no longer analyzable using the method being validated. One approach is to perform a recovery study of the analyte by exposing it to the cleaning agent at use concentrations, time, and temperature. If suitable recovery can be obtained, then the method can be used. If not, the method must be modified, or a new method developed.

Linearity and Range

The linearity of a method is the ability of an assay to elicit a direct and proportional response to changes in analyte concentration. Some detectors produce, or have the ability to produce, non-linear responses (e.g., gas chromatography with flame photometric detectors, and others such as evaporative light scattering or mass spectrometers, may have limited linear ranges when compared to flame ionization or UV-visible detectors). However, specific ranges may be found within the non-linear response that approach near linearity. If a non-linear curve must be used, a suitable number of points needs to be utilized that will accurately describe the curve.

The range of a method is the interval between the upper and lower concentrations of analyte for which the method has been shown to have suitable precision, linearity, and accuracy. ICH recommends a range of 80 to 120% of the test concentration for finished drug products, 70 to 130% of the test concentration for content uniformity, and up to 120% of the reporting level for impurities, ensuring that the detection and quantitation limits are lower than the controlled level. For cleaning validation analysis, the range will potentially be much greater than what ICH recommends for drug actives and impurities. Generally, the range will extend from the limit of quantitation to perhaps 200% or greater of the allowable residue in the sample. The wide range is important


Figure 1
________________________________________________________________________
Linear evaluation of all data points
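The %RSD screening step applied to the Table 2 data in this example can be sketched in a few lines of Python. The 5% criterion, and the rule that all levels below a failing level must also be dropped, follow the worked example in the text; the variable names are ours.

```python
# Table 2 data: concentration (ppm) -> (mean response, standard deviation).
levels = {
    0.10: (613, 72.5),    0.40: (900, 100.0),   0.70: (998, 33.3),
    1.1:  (1472, 51.1),   1.7:  (2398, 53.0),   2.5:  (3398, 55.3),
    5.0:  (6800, 100.0),  10.0: (13467, 153.0), 20.0: (27133, 208.0),
    30.0: (40417, 284.0), 40.0: (49500, 500.0),
}

# %RSD at each level, from the tabulated mean response and SD.
pct_rsd = {c: 100.0 * sd / mean for c, (mean, sd) in levels.items()}

# Levels failing the example's 5% criterion may not be dropped selectively:
# everything at or below the highest failing level is eliminated.
cutoff = max(c for c, r in pct_rsd.items() if r >= 5.0)
kept = sorted(c for c in levels if c > cutoff)
```

With the Table 2 numbers, the 0.10 and 0.40 ppm levels fail the screen, leaving the 0.70 to 40 ppm levels for the regression step.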

in order to allow monitoring of the residue. If the low end of the range were equal to the allowable residue limit, process monitoring would not allow for early warning of potential problems. Again, the key requirement for the range is that it is suitable with regard to precision, linearity, and accuracy.

There are several methods described in industry literature to determine the linearity and the range of a method.28,29,30 These methods range from simple observation to comprehensive statistical treatments. Statistical methods should be used to verify observed results. The method of choice will normally be dictated by a pharmaceutical company's policies. The specifics should be described in the validation protocol, if not in Standard Operating Procedures (SOPs).

The linearity criteria are to be set prior to the validation. For example, for an HPLC analysis, the criteria for linearity may be that injections at each level must have an RSD < 3%, and that the regression analysis yield an R2 value greater than 0.999 over the defined range. It should be noted that these requirements will vary greatly by method type and instrumentation in use. Acceptable %RSD values will also vary greatly, depending on the level of analyte. This will be covered in detail later.

The linearity experiment should include a minimum of five concentration levels. Each concentration level should be analyzed minimally in duplicate. Three analyses for each point are generally utilized, but more are preferred. The %RSD is then calculated for each injected level. These data are the base information set that will be utilized in establishing the linearity and range of the method.

Two simple methods for determining linearity will be described here. Both methods involve preparing a series of solutions containing known concentrations of the analyte of interest. This series encompasses the range of results expected from the analysis of actual samples. Usually, these solutions are free of expected matrix components. For the following examples, the pre-established (and somewhat arbitrary) criteria applied were that all points must have a %RSD < 5 and an R2 > 0.999. These solutions would then be analyzed by


Figure 2
________________________________________________________________________
Linear evaluation less the first two data points

the method of choice. Table 2 (see prior page) presents the data that will be used for both examples. These data are presented only by way of example. A graph of the response versus the concentration was shown earlier in Figure 1.

The first step would be to examine the %RSD values for each point. The very low concentrations of 0.10 and 0.40 ppm produce %RSDs of 12 and 11, respectively. Since these %RSDs are greater than the 5% RSD required, these data would be eliminated from the curve. The rest of the data meet the criteria. It should be noted that data cannot be arbitrarily eliminated. For example, if the 0.10 ppm data point produced an acceptable %RSD, but the 0.40 ppm point did not, the 0.40 ppm point alone could not be eliminated. All data below 0.40 ppm would have to be eliminated. There may also be a need to investigate the discrepancy in RSD values.

Figure 2 shows the resulting graph after the lowest concentration levels have been eliminated. The R2 value has decreased, due to the shortening of the range. Since the lower concentration levels have been eliminated, it is not surprising that the y-intercept has also increased. The y-intercept value is a good indication of bias. If the y-intercept is 0, no bias exists; bias exists if the y-intercept deviates from 0. Statistical significance testing of the regression can provide evidence of bias. Examination of the y-intercept can be used to justify single point calibration criteria. If there is no statistical evidence of bias, or the bias is judged to be small, then single point calibration can be successfully used. In making this evaluation of the statistical evidence, it is the P-value and the upper and lower 95% confidence intervals for the y-intercept that are important to examine. These are typically values obtained from automated regression analysis, as is shown in Figure 3 (see next page). If the P-value is large and 0 is included in the confidence interval, there is no evidence that the y-intercept is anything other than 0.

Close examination of this calibration curve reveals that the highest concentration point (40 ppm) may be deviating from linearity. This concentration level is eliminated from the data set because the inclusion of this data


Figure 3
________________________________________________________________________
Typical output from a regression
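The quantities shown in a regression output like Figure 3 can also be computed by hand. The sketch below (names ours) fits the Table 2 levels that survived the %RSD screen and builds a 95% confidence interval for the y-intercept; the t critical value for 7 degrees of freedom is hard-coded, and in practice a statistics package reports the P-value and confidence bounds directly.

```python
import math

# Table 2 levels that passed the %RSD screen (0.70 - 40 ppm).
xs = [0.70, 1.1, 1.7, 2.5, 5.0, 10.0, 20.0, 30.0, 40.0]
ys = [998, 1472, 2398, 3398, 6800, 13467, 27133, 40417, 49500]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sxx = sum((x - mx) ** 2 for x in xs)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
slope = sxy / sxx
intercept = my - slope * mx

# Residual standard deviation of the fit.
resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
s = math.sqrt(sum(r * r for r in resid) / (n - 2))

# Standard error and 95% confidence interval for the y-intercept.
se_intercept = s * math.sqrt(1.0 / n + mx * mx / sxx)
t_crit = 2.365  # two-sided 95% t value for n - 2 = 7 degrees of freedom
ci_low = intercept - t_crit * se_intercept
ci_high = intercept + t_crit * se_intercept

# 0 inside the interval -> no statistical evidence of intercept bias.
no_evidence_of_bias = ci_low < 0.0 < ci_high
```

For these data the interval comfortably brackets 0, the situation the text describes as providing no evidence that the y-intercept is anything other than 0.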

Figure 4
________________________________________________________________________
Linear evaluation less first two and the last data point. This produced quite acceptable results.
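The elimination decision illustrated by Figures 2 and 4 can be checked numerically. The sketch below (names ours) fits the screened Table 2 responses with and without the 40 ppm point and compares R2 against the 0.999 criterion.

```python
# Table 2 mean responses for the levels that passed the %RSD screen.
concs = [0.70, 1.1, 1.7, 2.5, 5.0, 10.0, 20.0, 30.0, 40.0]
resps = [998, 1472, 2398, 3398, 6800, 13467, 27133, 40417, 49500]

def fit(xs, ys):
    """Least-squares slope, intercept, and R^2 for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    return slope, my - slope * mx, sxy * sxy / (sxx * syy)

slope_all, int_all, r2_all = fit(concs, resps)               # includes 40 ppm
slope_trim, int_trim, r2_trim = fit(concs[:-1], resps[:-1])  # 40 ppm removed
```

With these data the full fit falls short of the 0.999 criterion, while the trimmed fit exceeds it, and the trimmed intercept lands near 48, consistent with the y-intercept values discussed in the text.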


Figure 5
_______________________________________________________________
Graphical representation of linearity evaluation (bottom) with an exploded view
(top) of questionable data area. The dashed lines represent 95 and 105% values of
the line of constant response. The line of constant response was calculated by
averaging the most similar five points.
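The Figure 5 technique reduces to comparing each level's response factor (response divided by concentration) with a line of constant response. In the sketch below (ours), that line is taken as the mean of the tightest window of five response factors, one reading of the caption's "most similar five points".

```python
# Table 2 data: concentration (ppm) -> mean response.
data = {0.10: 613, 0.40: 900, 0.70: 998, 1.1: 1472, 1.7: 2398,
        2.5: 3398, 5.0: 6800, 10.0: 13467, 20.0: 27133,
        30.0: 40417, 40.0: 49500}

# Response factor (quotient of response and concentration) per level.
rf = {c: resp / c for c, resp in data.items()}

# Line of constant response: average of the five most similar response
# factors, found here as the tightest window of five in sorted order.
sorted_rf = sorted(rf.values())
windows = [sorted_rf[i:i + 5] for i in range(len(sorted_rf) - 4)]
best = min(windows, key=lambda w: w[-1] - w[0])
constant_response = sum(best) / len(best)

# Flag levels whose response factor falls outside 95-105% of that line.
outside = sorted(c for c, f in rf.items()
                 if not 0.95 * constant_response <= f <= 1.05 * constant_response)
```

On the Table 2 numbers this flags 0.10, 0.40, and 40 ppm, and also puts 0.70 ppm just past the 105% line, matching the "borderline" judgment in the text.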

point results in a failure to meet the regression acceptance criteria (R2 = 0.999). Figure 4 shows the resulting graph after eliminating the lowest two concentration points, along with the highest concentration point. The R2 value now equals 1.000. This meets the pre-established criteria, and the linearity experiment has been completed. Note that the y-intercept has now also dramatically dropped. The use of a single point calibration for future analysis may be justified after checking for significance of the intercept. Based on the results of this example analysis, the linearity of the method has been established for the range of 0.70 ppm up to 30 ppm.

The responses of analytes across a linear range should be relatively constant.28 The plot of the quotients of the responses and their respective concentrations versus the concentrations is shown in Figure 5. A line of constant response is drawn that will fit the more similar points. Two lines representing 95% and 105% of the constant response line are then drawn. Points that lie above 105% or below 95% are


considered to be outside of the linear range. As in the previous technique, the lowest two data points are obviously not linear, and are eliminated from the data set. The highest concentration point is also outside of this range, and can be eliminated. The third lowest concentration is borderline acceptable, as is shown in the top graph of Figure 5. The inclusion or elimination of this point is insignificant to the line. The elimination of this point does not change the slope, and only decreases the y-intercept from 48.232 to 46.439. The results of this second technique are in close agreement with the first technique for determining linearity.

Accuracy

The accuracy of measurements must be established. Accuracy is also referred to as "recovery" or "trueness." Accuracy is a measurement of how close the analytical result is to the actual value. This parameter is obtained through the evaluation of known analyte concentrations, or of results from another accepted method. The result is typically expressed as a percentage of the calculated concentration. The accuracy needs to be determined across the analysis range. Table 4 presents example data for a recovery or accuracy experiment.

Table 4
__________________________________
Recovery data

Found              Theoretical          %
[Active X], ppm    [Active X], ppm      Recovered
 0.691              0.701                98.6
 0.710              0.701               101.0
 0.693              0.701                98.9
 9.970              9.990                99.8
 9.950              9.990                99.6
10.100              9.990               101.0
29.800             30.100                99.0
30.500             30.100               101.0
29.600             30.100                98.3
                   Average:              99.7
                   Standard Deviation:    1.17

Recovery studies provide assurances that an analyte can be analyzed in a matrix. The matrix can be a process sample containing a variety of reactants, a formulation, a swab, a rinse sample, a surface, a body fluid, etc. The goal is to accurately define the amount of the analyte that can be recovered from a sample. In some cases, this amount will need to be as close to 100% as possible. In other cases, an acceptable recovery may be as low as 50%.31

In cleaning validation studies, both specific and nonspecific methods are commonly used. If a nonspecific method is being used (i.e., TOC), then the total response is usually assumed to be that of the analyte. Sampling is sometimes done by indirect methods (i.e., rinse solutions32), but direct methods (swabbing) are usually preferred. The choice is usually dependent on the accessibility of the surfaces to be sampled and the solubility of the residues.

When using analyte-specific methods, i.e., HPLC, IMS, for rinse samples (indirect sampling), it is generally sufficient to show that the presence of matrix components does not interfere with the determination of the analyte over the range required. Matrix components can be defined as excipients used in the formulation of a formulated drug, or impurities/solvents present in the bulk drug, depending on what is being manufactured. If a cleaning agent is used, the components of the cleaning agent should be considered. One approach for evaluating recovery is to spike solutions containing common or perhaps maximum possible levels of drug impurities, drug excipients, and cleaning formulation components with the analyte of choice, and demonstrate that the method results are close to the amounts calculated. The solvents used in these experiments should be representative of those used in the final rinse of the equipment being cleaned. Usually, these are aqueous-based solutions, but they can be aqueous/solvent (such as alcohol) combinations or other solvents used in the manufacturing process. Several solutions are prepared having the appropriate concentrations of impurities, excipients, and cleaner components. The analyte is then spiked into these solutions at various concentrations. These concentrations should bracket the concentration that represents the allowable limit of the analyte. The range used is typically 25 to 150% of the allowable limit of analyte, although values ranging from 10 to 500% of the allowable limit have been used. Again, the stability of the analyte in these solutions should be documented.

For swab samples, the chore becomes a little more tedious, as the ability of the swab to remove the analyte from the surface being cleaned, the ability of the sample preparation procedure to remove the analyte from the swab, and the potential interferences from impurities, formulation excipients, cleaning agents, and the swabbing material itself all need to be considered. Assuming that the analytical method is thought to have sufficient accuracy, all of these parameters can be studied as a whole. If poor results are obtained, then separate testing of each step can not only identify the step re-


sponsible for the poor results, but also allow the testing of alternate solvents or swab materials that lead to improved recovery. Ideally, the accuracy of these different steps is considered during method development, so that the analyst is not surprised during the validation process. As mentioned before, there should not be any surprises during validation. If there are, then sufficient method development was not done.

In a typical experiment, the surface area that is to be swabbed during the cleaning validation is established. This is usually based on the levels of analyte expected and/or the sensitivity of the method. Coupons representing the composition and size of this surface(s) are obtained (e.g., 5x5 inch sections of 304 stainless steel, 316 stainless steel, Teflon®, Tygon®, or other product contact materials). Solutions of the analyte and matrix components are prepared, and aliquots evenly applied to the coupon surface. The coupon is allowed to dry for a specified period (e.g., overnight). As an example, assume that the active ingredient of a formulation or a cleaner is the target analyte, with an allowable limit in the final prepared sample of 20 µg/mL. Solutions containing 1500, 1000, 250, and 0 µg/mL of the analyte, as well as appropriate amounts of cleaner components, formulation excipients, and/or impurities, are prepared. Since the solvents are evaporated, the solvent composition of this solution is usually less important than in rinse samples, but it should be similar to those used in the cleaning process. Typically, a 100 µL aliquot of each solution is applied as evenly as possible to separate coupons. This can be accomplished by slowly dragging the pipette tip across the coupon surface in a uniform row pattern. The residue is allowed to dry for several hours, typically overnight. Care should be taken to ensure the coupons dry on a flat surface, and are protected from contamination during the drying process. If the equipment being cleaned is heated, then the coupons can be placed in an oven to simulate the temperatures seen during the cleaning process.

Once dry, the analyte is recovered for analysis by rubbing a swab uniformly across the surface of the coupon. Cotton and synthetic swabs have been cited in industry literature.21 The type of swab used will depend on the method of analysis. Prior to swabbing the surface, the swab is usually dipped into the solvent that will be used to extract the residue. The excess solvent is removed by squeezing the swab against the inside of the vial containing the extraction solvent. The swab is usually rubbed along the coupon surface side-to-side, from both a vertical and a horizontal perspective. The swab is periodically rotated, so that both sides of the swab surface are used. In some instances, a second swab can be used to pick up any remaining analyte from the coupon surface. Oftentimes, the second swab is dry, and picks up excess solvent and residue left by the first swab. Sometimes, the second swab is wetted with the same extraction solvent to help pick up stubborn residuals still dried to the coupon surface. Any swabbing pattern can be used, as long as the same pattern is used both in the recovery study from the coupon and in the actual sampling of the equipment. The important consideration is that it is done consistently each and every time. Insufficient training is often a source of errors. Appropriate diagrams should be placed in SOPs, accompanied by specific training and qualification of the person(s) performing the swabbing.

The analyte can then be extracted from the swab by placing the swab in the same extraction vial used to wet the swab. This vial contains the appropriate amount of extraction solvent. Ideally, this solvent is appropriate for direct use in the analysis. For TOC, ultra-pure water is used as the solvent. With an HPLC method, water, mobile phase, or a solvent weaker than the mobile phase can be used. If a stronger solvent is used, dilution or other modifications may be necessary prior to analysis. In this example, 5 mL of water or mobile phase would be used. Assuming 100% recovery, these final solutions would represent 150, 100, 25, and 0% of the allowable limit. It is common practice to cut the handle from the swab (swabs are available that break upon bending), and allow the portion of the swab that came in contact with the coupon to be fully immersed in a minimum amount of extraction solvent. The swab is often allowed to sit immersed for long periods, or sonicated, to produce the highest recoveries. Studies should be conducted to be certain that samples are stable if long periods are expected between the sampling time and the time of analysis. The extraction solvent can then be filtered (if necessary) and analyzed. TOC samples are generally not filtered. Results are expressed as % recovery, by comparing the experimentally found results with the theoretical values. The 0 concentration (blank) experiment can be used to determine whether any interferences are present in the system. Ideally, a system is found where the blank has no effect on the results. It is acceptable, however, for the blank values to be subtracted from the other results to adjust for possible interferences.

Acceptable levels of recovery for analytical methods used in cleaning validation are a source of debate. The FDA's guideline on cleaning validation18 states that the % recovery should be determined, and lists 50% and 90% recoveries as examples. The recovery value should either be used in calculating the found value, or factored into the acceptance criteria, but not included as a factor in both. Acceptance crite-


ria based on the results obtained with a particular designated method are a logical approach.

For example, suppose the recovery for a specific sampling method was determined to be 75%, and the acceptable limit had been previously calculated to be 5 ppm in the analytical sample (independent of recovery). A sample is analyzed and found to contain 4 ppm of the target analyte. Utilizing the recovery value, the actual value for the sample is 4 ppm / 0.75 = 5.33 ppm. This is above the acceptance criterion, and the sample fails. Alternatively, the recovery factor can be applied to the acceptance criterion: 5 ppm x 0.75 = 3.75 ppm. This would then be the new limit per analytical sample. Again, be certain not to apply the factor to both the analytical result and the limit criteria.

If the % recoveries are lower than desired, additional recovery experiments should be conducted in such a way that the major source of the reduced recovery can be ascertained. A common experiment is to spike the swab material with the solutions that were used to prepare the coupons. If the spiking solution is different than the extraction solution, the swabs are allowed to dry, and then placed directly into the extraction solvent. If this experiment yields low recoveries, stronger extraction solvents or alternate swabbing materials can be tried. If the recoveries are acceptable, the problem can be assumed to arise from removing the residue from the coupon itself. Again, alternate swabbing materials, stronger swabbing/extraction solvents, or a variation in the swabbing technique (i.e., more swabs, wet/dry variations, more forceful swabbing) can be tried. A reexamination of the potential volatility of the analyte should not be overlooked. This is especially true if a heating or vacuum oven is used in the drying process.

Precision

Precision is a measurement of how close a series of numbers are to each other, with no regard to the actual value (accuracy). Precision is broken down into three subgroups: repeatability, intermediate precision, and reproducibility. Precision is typically expressed as the percent relative standard deviation (%RSD).

Repeatability refers to the precision under the same operating conditions over a short interval of time. For example, the same sample may be injected six consecutive times and the results evaluated for each injection with respect to one another. This is a measurement of instrument precision, such as the injector and integrator performance. Table 5 presents example data obtained for a repeatability experiment using six analyses.

Table 5
______________________
Repeatability data

Injection    Result
1            0.691
2            0.705
3            0.699
4            0.705
5            0.715
6            0.690
Average:             0.701
Standard Deviation:  0.00952
%RSD:                1.36

If an auto-injector is utilized, care must be taken to assure that the septa are not damaged from repeated penetrations. This may introduce errors into the precision experiments. It is wise to limit the number of injections from a single vial. For non-chromatographic methods that utilize a sampling device, such as a cell, the repeatability can be measured by loading the cell, scanning, emptying the cell, and repeating this procedure for up to six replicates. Concentration may not be the only parameter measured for precision. The wavelength, retention time, etc., may be important parameters to characterize.

Intermediate precision evaluates intra-laboratory variations. For example, samples may be analyzed on different days, by various analysts, using different equipment (or at least a different column if a chromatographic method is being evaluated). Intermediate precision evaluates the precision between analysts and instruments. It has the potential to account for often-overlooked environmental conditions. Fluctuations in lab temperature, or even humidity, may add to (or subtract from) the intermediate precision. Intermediate precision can be determined by having a second analyst repeat the accuracy and/or repeatability study, performing three replicates at each level being determined, or six replicates at the target value. If duplicate equipment is available, the analyst should use a different piece of equipment (i.e., a second HPLC or TOC). If a chromatographic method is employed, a second column, preferably from a different lot or even a different supplier, should be used. For swab studies, alternate sources or lots of swabs may be included. Table 6 (see next page) presents example data for an intermediate precision experiment.

Reproducibility evaluates the precision between laboratories. This is utilized when methods are transferred, or when collaborative method standardizations are being made. One such example is the transfer of an analytical method from a Research and Development (R&D) laboratory to a quality control laboratory.

In most instances, since materials of construction and cleaning procedures are generally site specific, validations of analytical methods used in cleaning validations will be concerned with repeatability
%RSD: 1.36
six consecutive times and the results tions will be concerned with repeatability


Table 3
____________________________________________________________________________
Simplistic evaluation of precision as a function of analyte level

Level 1 Level 2 Level 3 Level 4 Level 5


Injection 1 100 75 50 25 5
Injection 2 101 76 51 26 6
Injection 3 99 74 49 24 4
Average: 100 75 50 25 5
Standard Deviation: 1.0 1.0 1.0 1.0 1.0
%RSD: 1.0 1.3 2.0 4.0 20
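The Table 3 arithmetic is easy to reproduce; the sketch below computes the sample %RSD at each level and shows the same constant absolute spread turning into a rising relative spread as the analyte level falls.

```python
import math

# Table 3: three injections at each of five analyte levels.
levels = {1: [100, 101, 99], 2: [75, 76, 74], 3: [50, 51, 49],
          4: [25, 26, 24], 5: [5, 6, 4]}

def pct_rsd(values):
    """Percent relative standard deviation (sample SD / mean * 100)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return 100.0 * sd / mean

rsds = {level: pct_rsd(vals) for level, vals in levels.items()}
```

The standard deviation is 1.0 at every level, yet the %RSD climbs from 1% at the 100 ppm level to 20% at the 5 ppm level, the point the table is making.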

and intermediate precision. If another facility within an organization does not have the exact same procedures and equipment, reproducibility may be of concern. Including other facilities in the validation, or performing a method transfer after validation, should be considered.

ICH guidelines recommend that a minimum of nine determinations throughout the range of interest, or six determinations at the test concentration, be used to determine repeatability. The limits of quantitation are oftentimes being approached during validation of the analytical method. Repeating the accuracy testing outlined above, so that three data points are collected for each level, allows separate estimates of %RSD to be calculated at each of the three levels. Additional data points at each level are a benefit, and additional statistical testing can be used to determine if the different levels have different %RSDs.

It is not uncommon for the accuracy and repeatability of the method to change with the amount of analyte being determined. It is also common for the %RSD to be a function of the accuracy. If there are significant changes in accuracy as the amount of analyte determined decreases, it is not unusual to also see the repeatability of the method decrease with decreasing amounts of analyte.31,33

A simple example of this is shown in Table 3. If five levels of an analyte are determined, and in each case the three injection values differ by only 1 ppm from the average, the precision can still vary significantly. In this case, the repeatability ranged from 1% for a 100 ppm sample up to 20% for a 5 ppm sample. This is something that is often overlooked when establishing acceptance criteria for trace levels of analyte. It is a rare case indeed that identical precision will be obtained for both bulk analysis and trace analysis of the same analyte.

Table 6
__________________________________________
Intermediate precision data

Sample                Analyst A   Analyst B   Combined
                      Result      Result
1                     0.715       0.702
2                     0.725       0.720
3                     0.730       0.699
4                     0.715       0.691
5                     0.709       0.698
6                     0.720       0.695
Average:              0.719       0.701       0.710
Standard Deviation:   0.00762     0.0101      0.0128
%RSD:                 1.06        1.44        1.80

Limits of Detection and Quantitation

The American Chemical Society (ACS) has extensively addressed the concepts of Limits of Detection (LOD) and Limits of Quantitation (LOQ).34,35 Figure 6 (see next page) presents a graphical representation of LOD and LOQ. The LOD of an individual analytical procedure is the lowest amount of analyte in a sample that can be detected, but not necessarily quantitated as an exact value. The LOQ of an individual analytical procedure is the lowest amount of analyte in a sample that can be quantitatively determined with suitable precision and accuracy.

In routine assays of bulk or formulated analytes, the LOD and LOQ values are generally not of great concern. This is because the levels of analytes are generally large, and the methods are applied well above the LOQ value. In cleaning validation analysis, this is not the case, because the analyses

Analytical Methods Validation
Herbert J. Kaiser, Ph.D. & Bruce Ritts, M.S.

Figure 6: Graphical comparison of the limits of detection and quantitation relative to the signal from a blank.
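Numerically, the blank-referenced limits that Figure 6 compares are commonly expressed as the mean blank signal plus 3 (LOD) or 10 (LOQ) standard deviations of the blank. A minimal sketch of that convention; the blank readings below are hypothetical, not data from the article:

```python
import statistics

# Hypothetical replicate blank readings (instrument signal units).
blank_signals = [0.98, 1.02, 1.05, 0.97, 1.01, 1.03, 0.99]

mean_blank = statistics.mean(blank_signals)
sd_blank = statistics.stdev(blank_signals)

# Blank-referenced convention: LOD at blank + 3*SD, LOQ at blank + 10*SD.
lod_signal = mean_blank + 3 * sd_blank
loq_signal = mean_blank + 10 * sd_blank

print(f"blank mean = {mean_blank:.3f}, blank SD = {sd_blank:.3f}")
print(f"LOD signal = {lod_signal:.3f}, LOQ signal = {loq_signal:.3f}")
```

These are limits in signal units; converting them to concentrations requires the calibration slope, which is where the regression-based approaches discussed in the text come in.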

are typically looking for trace residues, and the results are oftentimes “bumping” against these values. These values need to be defined during the validation of the analytical methods used for cleaning validation.

The traditional method is based on signal-to-noise ratio. The majority of chromatographic and spectroscopic data analysis software available today incorporates a means of measuring the noise. The LOD would then be the concentration equivalent to three times the signal-to-noise ratio. The LOQ would be the concentration equivalent to ten times the signal-to-noise ratio. A common method for establishing LOD and LOQ is the same as that used for establishing the lower limit for the range of the method. The LOQ is the concentration that meets a certain pre-established precision and satisfies the linearity requirement. In the example used for the linearity and range experiment, the LOQ was found to be 0.70 ppm. The LOD would then be (3/10) x LOQ, or 0.21 ppm.

There are numerous approaches to calculating the LOD and LOQ values.17,36,37,38 There is, however, no specific method recommended in any regulatory document. The establishment of these values is up to the analyst. The method used for determining LOD and LOQ values needs to be written in an official document, such as the analytical method or an SOP. Residue limits should not be based on LOD values, since values measured below the LOQ have an inherently high degree of uncertainty. Likewise, if a process is to be monitored with alert and action limits in place, residue limits should not be based on the LOQ, if at all possible. A single approach or methodology of LOD and LOQ determination should never be blindly followed. The good judgment of a well-trained, experienced analyst should be regarded as a valuable asset. The discussion of several methodologies found in the literature follows.

Figure 7 shows an example of one such method for the determination of LOQ and LOD using confidence intervals.39 Data are collected approaching the LOD, plotted, and fitted with a regression line. The upper and lower 95% Confidence Intervals (CI) for the regression are then plotted. This example shows the CIs to be linear, but they typically are not. A line is then drawn horizontally from the end of the upper CI to the lower CI, parallel to the x-axis. A line is then dropped vertically to the x-axis. The intersection point on the x-axis is regarded as the LOD. The

Institute of Validation Technology

Figure 7: Graphical determination of the limit of detection (LOD). A horizontal line is drawn from the lower end of the upper 95% CI line to the lower 95% CI. A vertical line is then dropped from this point to the x-axis. The point on the x-axis is the LOD.

LOQ is then 3.3 x LOD. For this example, the LOD was found to be 0.33 ppm and the LOQ was 1.1 ppm.

Another approach makes use of the slope and standard error of the regression line.17,37 Figure 8 provides an example using the same data as Figure 7. The LOD and LOQ values of 0.84 and 2.5 ppm, respectively, are defined by Equation 1 and Equation 2, where σ is the standard error of the regression, and S is the slope of the regression line.

Equation 1: LOD calculation from linear fit
LOD = 3.3σ/S

Equation 2: LOQ calculation from linear fit
LOQ = 10σ/S

All of the methods used here (see Equations 1 and 2) for the LOD and LOQ calculations were based on the same set of hypothetical data. These calculations produced different results. The statistical significance of these differences can be argued, but in actuality they are quite similar. In most real-world cases, the experimentally determined LOQ will be higher than the values that are calculated.

The LOD value, being free of stringent precision requirements, will be primarily based on instrumental or technique considerations, whereas LOQ values will not only be based on the instrument or technique, but also on all of the steps and manipulations included in the method. In addition, all of these literature LOD and LOQ techniques assume that there is a constant matrix that contributes to a constant noise level. In cleaning validations, where residues are being recovered from the surfaces of manufacturing equipment, a constant matrix cannot be assumed (this is one reason why residue limits based on the LOD or LOQ of a method are not advisable).
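A minimal sketch of the slope and standard-error approach behind Equations 1 and 2 (LOD = 3.3σ/S, LOQ = 10σ/S): the calibration concentrations and responses below are hypothetical, and the line is fit by ordinary least squares:

```python
import math

# Hypothetical calibration data (not the article's data set).
conc   = [0.5, 1.0, 2.0, 4.0, 8.0]          # ppm spiked
signal = [10.4, 20.1, 41.0, 79.6, 160.3]    # detector response

n = len(conc)
mx, my = sum(conc) / n, sum(signal) / n
S = sum((x - mx) * (y - my) for x, y in zip(conc, signal)) / sum((x - mx) ** 2 for x in conc)
b = my - S * mx

# Residual standard error of the regression (n - 2 degrees of freedom).
sigma = math.sqrt(sum((y - (S * x + b)) ** 2 for x, y in zip(conc, signal)) / (n - 2))

lod = 3.3 * sigma / S   # Equation 1
loq = 10 * sigma / S    # Equation 2
print(f"slope = {S:.2f}, sigma = {sigma:.3f}, LOD = {lod:.3f} ppm, LOQ = {loq:.3f} ppm")
```

Note that the LOQ/LOD ratio from this approach is fixed at 10/3.3, so the two limits always move together with the scatter of the calibration data.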


Figure 8: Data plotted and fitted with a linear regression line. The slope and standard error obtained are used in the calculations for LOD and LOQ.

Ruggedness

All analytical methods that are used routinely by a variety of personnel should be rugged. That is, a method should not be affected by small changes that may occur on a day-to-day basis. For example, a chromatographic method that ordinarily does not use a column heater should not be affected by changes of temperature in a laboratory. If it is, then this should be evaluated and defined.

There are no regulatory requirements for ruggedness, and ruggedness studies do not need to be included in a validation report. However, it is assumed by regulatory agencies that the effects of changes to various parameters of the method have been explored during method development, that those parameters that do affect the method are controlled, and that these controls are included in the method. Again, the ruggedness of a method is usually explored during method development. Ruggedness can be explored through experimental designs40 or by simply varying parameters in a stepwise process. Examples of parameters that could be studied are column temperature, injection size, flow rates, buffer concentrations, pH, solvent concentrations, etc.

Summary

The output of an analytical method validation is an analytical method validation report. It is suggested (but not required) that the report follow the same format as other documents in the validation process. The similar formatting will make it easier for both internal and external agencies to understand and follow. The report should contain all of the parameters studied, along with whether or not the pre-established criteria were met.

The validation of an analytical method is the last step in the process prior to implementation. The various parameters required in an analytical method validation are first explored during method development, and then rigorously defined


during the method validation process. These parameters are dictated by various regulatory bodies around the world, and necessitated by the need to understand the process. The validation of analytical methods should not only be understood by the chemists who perform them, but also by the personnel involved in cleaning validation, as it is an integral part of the process. It is important that all personnel involved in a cleaning validation understand that changes in the overall process may affect the method and lead to the necessity for revalidation. Industry literature is full of information, resources, and examples that should be called upon and utilized. A chemist is only as good as her/his resources. ❏

Article Acronym Listing

AA: Atomic Absorption
ACS: American Chemical Society
CI: Confidence Interval
FDA: Food and Drug Administration
GMP: Good Manufacturing Practice
HPLC: High Performance Liquid Chromatography
ICH: International Conference on Harmonization
ICP: Inductively Coupled Plasma
LOD: Limits of Detection
LOQ: Limits of Quantitation
R&D: Research and Development
RSD: Relative Standard Deviation
SOP: Standard Operating Procedure
TOC: Total Organic Carbon
USP: United States Pharmacopoeia
UV: Ultra Violet

About the Author

Herbert J. Kaiser, Ph.D., Senior Manager – Analytical Services and Development at STERIS Corporation, has twenty-two years of experience in cleaning and surface technologies. Dr. Kaiser received his B.A. degree from St. Mary’s University in San Antonio, Texas, his M.S.(R) from St. Louis University, and his Ph.D. from the University of Missouri. He is a member of the American Chemical Society, and the editorial advisory board for the Journal of Validation Technology.

Bruce Ritts, M.S., Senior Scientist – Analytical Services and Development at STERIS Corporation, has twenty years of experience in the development and validation of analytical methods. Mr. Ritts received both his B.S. and M.S. degrees from the University of Missouri. He is a member of the American Chemical Society.

References

1. “Guidelines for Single-Laboratory Validation of Analytical Methods for Trace-Level Concentrations of Organic Chemicals” Special Publication - Royal Society of Chemistry: Principles and Practices of Method Validation, 2000, 256, 179-252.
2. Chowdary, K. P. R.; Rao, G. D.; Himabindu, G. “Validation of Analytical Methods” Eastern Pharmacist, 1999, 42(497), 39-41.
3. Hsu, H.; Chien, C. “Validation of Analytical Methods: A Simple Model for HPLC Assay Methods” Yaowu Shipin Fenxi, 1994, 2(3), 161-76.
4. Collard, J.; Kishi, Y.; Costanza, C. “Analytical Method Validation in the Semiconductor Industry Applied to High Purity Hydrogen Peroxide” 2001 Semiconductor Pure Water and Chemicals Conference, 20th, 190-213.
5. Seno, S.; Ohtake, S.; Kohno, H. “Analytical Validation in Practice at a Quality Control Laboratory in the Japanese Pharmaceutical Industry” Accreditation and Quality Assurance, 1997, 2(3), 140-145.
6. Clarke, G. S. “The Validation of Analytical Methods for Drug Substances and Drug Products in UK Pharmaceutical Laboratories” Journal of Pharmaceutical and Biomedical Analysis, 1994, 12(5), 643-52.
7. Kirsch, R. B. “Validation of Analytical Methods Used in Pharmaceutical Cleaning Assessment and Validation” Pharmaceutical Technology (Supplement), 1998, 40-46.
8. Frewein, E.; Geisshuesler, S.; Sibler, E. “Pharmaceutical Quality Assurance. Validation of Analytical Methods in Vitamin Assays” Swiss Chem, 1996, 18(7/8), 10-16.
9. Hokanson, G.C. “A Life Cycle Approach to the Validation of Analytical Methods During Pharmaceutical Product Development. Part I: The Initial Method Validation Process” Pharmaceutical Technology, 1994, 18(9), 118-130.
10. Hokanson, G.C. “A Life Cycle Approach to the Validation of Analytical Methods During Pharmaceutical Product Development. Part II: Changes and the Need for Additional Validation” Pharmaceutical Technology, 1994, 18(10), 92-100.
11. Kaiser, H. J.; Tirey, J. F.; LeBlanc, D. A. “Measurement of Organic and Inorganic Residues Recovered from Surfaces” Journal of Validation Technology, 1999, 6(1), 424-436.
12. Fourman, G. L.; Mullen, M. V. “Determining Cleaning Validation Acceptance Limits for Pharmaceutical Manufacturing Operations” Pharmaceutical Technology, 1993, 17(4), 54-60.
13. LeBlanc, D. A. “Establishing Scientifically Justified Acceptance Criteria for the Cleaning Validation of APIs” Pharmaceutical Technology, 2000, 24(10), 160-168.
14. “<1225> Validation of Compendial Methods” United States Pharmacopoeia XXVI, 2003.
15. Food and Drug Administration “International Conference on Harmonization; Guideline on Validation of Analytical Procedures: Definition and Terminology; Availability” Federal Register, 1995, 60(40), 11260-11262.
16. International Conference on Harmonization of Technical Requirements for the Registration of Pharmaceuticals for Human Use “Validation of Analytical Procedures” ICH-Q2A, Geneva, 1994.
17. International Conference on Harmonization of Technical Requirements for the Registration of Pharmaceuticals for Human Use “Validation of Analytical Procedures: Methodology” ICH-Q2B, Geneva, 1996.
18. Food and Drug Administration “Guide to Inspections, Validation of Cleaning Processes” Office of Regulatory Affairs, FDA, Rockville, MD, 1993.
19. Kaiser, H. J.; Minowitz, M. “Analyzing Cleaning Validation Samples: What Method?” Journal of Validation Technology, 2001, 7(3), 226-236.
20. Kaiser, H. J. “Methods for Pharmaceutical Cleaning Validations” in Surface Contamination and Cleaning; Mittal, K.L., Ed.; VSP BV: Utrecht, The Netherlands, 2003; Vol. 1, pp 75-84.
21. Jenkins, K. M.; Vanderwielen, A. J.; Armstrong, J. A.; Leonard, L. M.; Merphy, G. P.; Piros, N. A. “Application of Total Organic Carbon Analysis to Cleaning Validation” PDA Journal of Pharmaceutical Science & Technology, 1996, 50, 6-15.
22. Altria, K. D.; Chanter, Y. L. “Validation of a Capillary Electrophoresis Method for the Determination of a Quinolone Antibiotic and Its Related Impurities” Journal of Chromatography A, 1993, 652, 459-463.
23. Ciurczak, E. “Validation of Spectroscopic Methods in Pharmaceutical Analysis” Pharmaceutical Technology, 1998, 22(3), 92-102.
24. Goez, C. E.; Luu, T. D.; Meier, D. J. “The Luciferin-Luciferase Assay. An Example of Validating an Enzymic Analytical Method” Swiss Chem, 1996, 18(6), 9-11.
25. Brittain, H. G. “Validation of Nonchromatographic Analytical Methodology” Pharmaceutical Technology, 1998, 22(3), 82-90.
26. Van Zoonen, P.; Van ’t Klooster, H.; Hoogerbrugge, R.; Gort, S. M.; Van De Wiel, H. J. “Validation of Analytical Methods and Laboratory Procedures for Chemical Measurements” Arhiv za Higijenu Rada i Toksikologiju, 1998, 49(4), 355-370.
27. Vessman, J. “Selectivity or Specificity? Validation of Analytical Methods from the Perspective of an Analytical Chemist in the Pharmaceutical Industry” Journal of Pharmaceutical and Biomedical Analysis, 1996, 14(8-10), 867-869.
28. Huber, L. “Validation of Analytical Methods: Review and Strategy” LC-GC International, February 1998, 96-105.
29. Boque, R.; Maroto, A.; Riu, J.; Rius, F. X. “Validation of Analytical Methods” Grasas y Aceites, 2002, 53(1), 128-143.
30. Kirsch, R. “Validation of Methods Used in Pharmaceutical Cleaning Validation” Pharmaceutical Technology, 1998 (Supplement), 40-46.
31. Ellis, R. L. “Validation and Harmonization of Analytical Methods for Residue Detection at the International Level” in Residues of Veterinary Drugs and Mycotoxins in Animal Products; Enne, G.; Kuiper, H. A.; Valentini, A., Eds.; Wageningen Pers: Wageningen, The Netherlands, 1996; pp 52-62.
32. LeBlanc, D. A. “Rinse Sampling for Cleaning Validation Studies” Pharmaceutical Technology, 1998, 22(5), 66-74.
33. AOAC Peer Verified Methods Program “Manual on Policies and Procedures” Arlington, Virginia, 1993.
34. American Chemical Society Committee Report “Guidelines for Data Acquisition and Data Quality Evaluation in Environmental Chemistry” Analytical Chemistry, 1980, 52(14), 2242-2249.
35. American Chemical Society Committee Report “Principles of Environmental Analysis” Analytical Chemistry, 1983, 55(14), 2242-2249.
36. Paine, T. C.; Moore, A. D. “Determination of the LOD and LOQ of an HPLC Method Using Four Different Techniques” Pharmaceutical Technology, 1999, 23(10), 86-90.
37. Krull, I.; Swartz, M. “Determining Limits of Detection and Quantitation” LC-GC, 1998, 16(10), 922-924.
38. Thomsen, V.; Schatzlein, D.; Mercuro, D. “Limits of Detection in Spectroscopy” Spectroscopy, 2003, 18(12), 112-114.
39. Hubaux, A.; Vos, G. “Decision and Detection Limits for Linear Calibration Curves” Analytical Chemistry, 1970, 42(8), 849-855.
40. Virlichie, J. L.; Ayache, A. “A Ruggedness Test Model and Its Application for HPLC Method Validation” S.T.P. Pharma Pratiques, 1995, 5(1), 49-60.

Good Analytical Method
Validation Practice
Deriving Acceptance Criteria for
the AMV Protocol: Part II
Stephan O. Krause, Ph.D.
Bayer HealthCare Corporation

To avoid potential inspection observations for test method validations by the regulatory agencies, it has become critical for pharmaceutical companies to derive reasonable acceptance criteria for the Analytical Method Validation (AMV) protocol. Part I of Good Analytical Method Validation Practice (GAMVP) (November 2002 issue, Journal of Validation Technology) mostly emphasized ground rules for the AMV department to be compliant and efficient within the Quality Control (QC) unit. The scope of this article provides more detail on how to systematically derive reasonable acceptance criteria for AMVs, and how to integrate those into the AMV protocol. One specific example to describe the process of deriving AMV acceptance criteria is provided. This example summarizes most aspects to be considered in order to generate an AMV protocol that can readily be executed, and lead to a solid AMV report.

For successful AMVs, available data and other supporting information for the test method to be validated must be carefully reviewed against current in-process or product specifications. This process takes time and requires a certain expertise, since acceptance criteria should balance method performance expectations with method requirements (from product specifications) and AMV execution conditions (conducted by QC analysts under routine QC testing). In general, the time required to prepare the AMV protocol should account for about 50% of the total time allocated to the complete (approved) validation. Less time spent on the protocol may result in time-consuming discrepancy reports and validation retesting when acceptance criteria fail during execution. Or, the acceptance criteria may not sufficiently challenge the test system suitability, so the validation fails to demonstrate that the method will yield accurate and reliable results under normal testing conditions. In addition, invalid and potential Out-Of-Specification (OOS) results may be obtained when test system suitability is not properly demonstrated. Management should keep in mind that a rigorous AMV program, employing reasonable acceptance criteria, may prevent discrepancy reports, OOS results, and potential product loss, since unsuitable test methods should not be used for routine QC testing.

Selecting Assay Categories

When an AMV is generated to demonstrate test system suitability to bring a routine testing procedure into compliance, an assay category must be selected. Guidelines published by the International Conference on Harmonization (ICH), United States Pharmacopeia (USP), and the Food and Drug Administration (FDA) are similar in content and terminology used. Following ICH guidelines is advisable when product is distributed worldwide. This article will only focus on following ICH guidelines. The FDA accepts these guidelines as long as those are consistently followed, as intended by the ICH. The ICH Q2A guidelines list four assay categories:

• Category I: Identification Tests
• Category II: Quantitation of Impurities
• Category III: Qualitative Limit Test for Impurities
• Category IV: Quantitation of Active Ingredients

Once an assay category is appropriately chosen, all validation parameters required for that category must be included in the AMV protocol. All product or in-process specifications can be classified within five specification “codes.”

1. Match/No Match (Yes/No)
2. No More Than (NMT; i.e., ≤ 1.0%)
3. Less Than (i.e., < 1%)
4. No Less Than (NLT; i.e., ≥ 85%)
5. Range (i.e., 80 – 120 units/mL)

Specification code no. 1 (Match/No Match) will require validation as per ICH category I. Specification code no. 2 (≤ 1.0%) will require ICH category II validation, because results are numerically reported (quantitated). Code no. 3 requires ICH category III, since results are reported as “less than” (< 1%). Codes no. 4 and 5 (≥ 85% and 80 – 120 units/mL) require validation per ICH category IV. The relevant required validation parameters (i.e., Accuracy) for each product specification code and ICH category are listed in Figure 1.

Figure 1: Required Validation Parameters for ICH Assay Categories and Specification Codes

Specification Code        1    2    3    4 and 5
ICH Category              I    II   III  IV
Accuracy                  No   Yes  No   Yes
Repeatability Precision   No   Yes  No   Yes
Intermediate Precision    No   Yes  No   Yes
Specificity               Yes  Yes  Yes  Yes
Linearity                 No   Yes  No   Yes
Assay Range               No   Yes  No   Yes
Limit of Detection        No   No   Yes  No
Limit of Quantitation     No   Yes  No   No

Three out of five specification codes (nos. 2, 4, and 5) require numerical (quantitative) results. Those are graphically illustrated in Figure 2. In this figure, product specifications are related to ICH Q2B and method capability expectations.

Figure 2: Numerical Product Specifications and Assay Range (Quantitative Product Specifications)
1a) Product Specification Code Number 2 (Target: NMT 20%, Range: 0-20%)
1b) Product Specification Code Number 5 (Target: 60%, Range: 50-70%)
1c) Product Specification Code Number 4 (Target: NLT 80%, Range: 80% +)
x-axis: Percentages (0 – 100)
Legend/Definitions: Graphical (Quantitative) Representation of Product Specifications; ICH Q2B Required Demonstration of Assay Range (within the Assay Range, Results must be Accurate, Precise, and Linear); Method Capability = Method Performance Expectations

All factors should be evaluated and integrated to derive acceptance criteria. Product specifications for qualitative assays are generally “coded” as Match/No Match (or pass/fail, present/absent, etc.), and should be qualified or validated on a case-by-case basis. Many microbiological assays have abnormal (non-Gaussian) data distributions (usually well-described by Poisson statistics), and are more difficult to generally classify for validation.

ICH Validation Parameters

When an AMV protocol is generated, the assay category must be selected first. Then, the scientific approach to demonstrate assay suitability for each required validation parameter must be described in detail in the protocol. General guidance and considerations are described for each validation parameter. These should be followed when acceptance criteria are derived. Additional information can be found in the specific example used in this article.

Accuracy is usually demonstrated by spiking an accepted reference standard into the product matrix. Percent recovery (observed/expected x 100%) should ideally be demonstrated over the entire assay range by using multiple data points for each selected analyte concentration. In practice, the demonstration of accuracy is mostly affected by how well systematic errors can be controlled. When deriving acceptance criteria, one must keep in mind that in addition to ideal accuracy expectations (assuming expected equals true), potential systematic error (i.e., different response factor of spiked reference material) must be evaluated and factored into the acceptance criteria, unless the AMV protocol permits ‘normalization,’ if required. To keep systematic error at a minimum, common scientific sense should be used when describing spike sample preparation in the protocol (i.e., large volumes for spiked stock solutions, only calibrated equipment).

Many quantitative assays have ranges for their product specifications (code no. 5). The midpoint of this range is the target concentration that was either set historically from testing results, or as a manufacturing process target. When deriving acceptance criteria, one should consider that test system suitability must be demonstrated for this target range, which is exactly half of the specification range (target range = target concentration ± 0.5 x specification range). During QC routine testing, the test system must be capable of readily meeting this target range, and this capability must be demonstrated in the AMV. It must therefore be demonstrated that the combined effects of lack of accuracy and reliability (precision) within the assay range can routinely be limited in order to support results within and outside product specifications (OOS). In other words, the acceptance criteria for accuracy and precision, combined within the assay range, should at maximum be no wider than half of the product specification range, because one would otherwise fail to demonstrate test system suitability for this product. Intermediate precision should ideally be used here, since all routine testing samples could be tested by any trained operator on any qualified instrument on any given day. Repeatability precision (less variability) simply would not reflect this overall assay variability. The derivation of acceptance criteria for the remaining quantitative assays (code nos. 2 and 4) should also be dealt with in a similar manner.

Given what was mentioned above, there are several ways to derive acceptance criteria for accuracy. One way is: intermediate precision acceptance criteria could be derived first from historical data (Analytical Method Development [AMD] or QC testing). The numerical limits for intermediate precision are then


subtracted from the target range, and the remaining (coefficient of variation, CV = SD/Mean x 100%), and
difference will set the maximum permissible accep- should be used as the absolute limit for the AMV data,
tance criteria range for accuracy. This is illustrated in since historical data (several operators, instruments,
the AMV acceptance criteria example (Figure 6). days) should have less precision (greater CV) than
It may be more advisable not to use statistical ap- AMV data.
proaches to demonstrate accuracy, such as t-statis- Intermediate Precision should be demonstrated by
tics (comparing means of observed versus expected generating a sufficiently large data set that includes re-
percent recoveries of various spike concentrations). plicate measurements of 100% product (analyte) con-
The reason is that a potential systematic error is not centration. This data should ideally be generated by
accounted for in the expected recovery (mean = three operators on each of three days, on each of three
100%, variance = 0). The expected recovery will instruments. Different analyte concentrations to de-
then be compared to the observed recovery (mean monstrate intermediate precision over the entire assay
≠100%, variance ≠ 0), so that a statistical difference range could be used, but results must be converted to
(i.e., t-test at 95% confidence) is likely to occur, al- percent recoveries before those can be compared. A
though this difference may not be significant when data matrix where the total amount of samples can be
compared to a numerical limit (percent or units). It limited, but differences among or between variability
may therefore be more practical to give numerical factors, such as operators and days, can still be differ-
limits for accurate acceptance criteria. entiated, is illustrated in Figure 3.
Data generated for accuracy may be used to cover The complete data set should then be statistically
required data for other validation parameters, such evaluated by an Analysis of Variance (ANOVA), where
as, repeatability precision, linearity, assay range, and results are grouped by each operator, day, and instru-
Limit of Quantitation (LOQ). ment, but analyzed in one large table. Acceptance cri-
Repeatability Precision indicates how precise the teria state no significant difference at 95% confidence
test results are under ideal conditions (same sample, (p > 0.05) of data sets evaluated by ANOVA. It is ad-
operator, instrument, and day). Repeatability preci- visable to include a numerical limit (or percentage) be-
sion should be demonstrated over the entire assay cause the likelihood of observing statistical differences
range, just like accuracy and data generated for this increases with the precision of the test method. In ad-
parameter may be used. This has the advantage that dition, some differences among various instruments,
fewer samples will have to be run. Even more impor- operator performances, and days (i.e., sample stability
tant, when acceptance criteria are derived and con- or different sample preparations for each day) are nor-
nected, only one data set will be used, therefore, de- mal. The overall intermediate precision allowed should
creasing potential random error introduced by multi- be relative to the expected accuracy, and must be within
ple sample preparations. The demonstration of repeat- the combined limits for accuracy and intermediate pre-
ability precision is mostly affected by how well ran-
Figure 3
dom errors in sample preparation can be controlled.
Random experimental errors can only be controlled to Intermediate Precision
some degree, since the Standard Operating Procedure Sample Matrix
(SOP) and AMV protocol should be followed as writ- Sample Day Operator Instrument
ten by operators routinely generating QC testing re- Number Number Number
sults. 3x 100% Conc. 1 1 1
When using AMD data, the actual generation con- 3x 100% Conc. 1 2 2
ditions of this data must be evaluated and put into per- 3x 100% Conc. 1 3 3
spective to set AMV acceptance criteria. When using 3x 100% Conc. 2 1 2
QC routine testing data, data for the assay control can 3x 100% Conc. 2 2 3
be summarized and used as a worse-case scenario for 3x 100% Conc. 2 3 1
the AMV protocol. The Standard Deviation (SD) of 3x 100% Conc. 3 1 3
this historical data can be expressed as confidence lim- 3x 100% Conc. 3 2 1
its (i.e., 95% confidence ≅ 2 x SD), units, or percent 3x 100% Conc. 3 3 2

34 I n s t i t u t e o f Va l i d a t i o n Te c h n o l o g y
Stephan O. Krause, Ph.D.

precision. Additional F-tests and t-tests should be performed if the overall p-value is less than 0.05, to evaluate the differences among factors and within factors. More detail will be provided in Part III of GAMVP: Data Analysis and the AMV Report.

Specificity of an assay is usually ensured by demonstrating no, or insignificant, matrix and analyte interference. The matrix may interfere with assay results by increasing the background signal (noise). Or, matrix components may bind to the analyte of interest, therefore potentially decreasing the assay signal. Spiking the analyte of interest into the product (liquid), and comparing the net assay response increase versus the same spike in a neutral liquid (i.e., water or buffer), provides information on potential matrix interference. A reasonable acceptance criterion is: no observed statistical difference (t-test at 95% confidence) between assay responses of spiked samples of product matrix versus those of buffer matrix. If the assay precision is relatively high, it is advisable to also include a numerical limit, in case p < 0.05, which should be similar to the limit stated under the validation parameter repeatability precision. This has the advantage that, in case a statistical difference is observed, a reasonably derived numerical limit should be able to compensate for differences in sample preparation.

Other analytes potentially present in the product matrix should be spiked in proportional concentrations into the product matrix (keeping final analyte concentrations constant). Results of unspiked versus spiked product should also be compared by a t-test, and the acceptance criteria should be the same as those for matrix interference.

Linearity of the assay response demonstrates proportionality of assay results to analyte concentration. Data from accuracy may be used to evaluate this parameter. Linearity should be evaluated through a linear regression analysis, plotting individual results of either analyte concentration versus assay results, or observed versus expected results. The latter approach should ideally yield a linear regression line slope of one (1). A slope smaller than one indicates a decreasing assay response with increasing analyte concentrations, and vice versa. A y-intercept significantly greater or less than 0 with a slope of one suggests a systematic error (i.e., sample preparation or spiked sample response factor ≠ 1). A correlation coefficient less than one may reflect a lack of linearity, inaccuracy, imprecision, or all of the above. ICH Q2B requires reporting the regression line y-intercept, slope, correlation coefficient, and the residual sum of squares. Only acceptance criteria for the slope and the correlation coefficient need to be reported for linearity. Deriving these from accuracy and precision expectations is rather complex, and may not be practical. Depending on the sample preparation and the method capabilities for accuracy and precision, reasonable acceptance criteria should be stated in the AMV protocol. Reasonable criteria are: r ≥ 0.98 (98% curve fit), and the 95% confidence interval of the regression line slope should contain 1.

The Assay Range of a method must bracket the product specifications. By definition, the LOQ constitutes the lowest point of the assay range, and is the lowest analyte concentration that can be quantitated with accuracy and precision. In addition to the required accuracy and precision for all analyte concentration points within the assay range, the assay response must also be linear, as indicated by the regression line coefficient. Data for the assay range may be generated in the AMV protocol under accuracy. Again, the advantages are a limited sample size to be run and evaluated, and the ability to evaluate this and other validation parameters from one set of prepared samples. Acceptance criteria for assay range should therefore be identical to those of accuracy, repeatability precision, and linearity.

Limit of Detection (LOD) of an analyte may be described as that concentration giving a signal significantly different from the blank or background signal. ICH Q2B suggests three different approaches to determine the LOD. Other approaches may also be acceptable when these can be justified. Per ICH, the LOD may be determined by visual inspection (A), signal-to-noise ratio (B), or the SD of the response and the slope (C).

Visual inspection should only be used for qualitative assays where no numerical results are reported. The signal-to-noise approach (B) may be used whenever analyte-free product matrix is available. The analyte should then be spiked at low concentrations in small increasing increments into the product matrix. The LOD is then determined as the signal-to-noise ratio that falls between 2:1 and 3:1. This is the simplest and most straightforward quantitative approach.
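The two linearity criteria suggested above (r ≥ 0.98, and a 95% confidence interval of the slope that contains 1) can be checked with an ordinary least-squares fit of observed versus expected results. A minimal pure-Python sketch, with made-up spike data and the Student t value hard-coded for n - 2 = 4 degrees of freedom:

```python
import math

# Hypothetical expected (spiked) vs. observed analyte concentrations (%).
expected = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
observed = [0.6, 1.1, 2.1, 5.2, 10.3, 20.1]

n = len(expected)
mx = sum(expected) / n
my = sum(observed) / n
sxx = sum((x - mx) ** 2 for x in expected)
syy = sum((y - my) ** 2 for y in observed)
sxy = sum((x - mx) * (y - my) for x, y in zip(expected, observed))

slope = sxy / sxx
intercept = my - slope * mx
r = sxy / math.sqrt(sxx * syy)  # correlation coefficient

# Standard error of the slope from the residual sum of squares
rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(expected, observed))
se_slope = math.sqrt(rss / (n - 2) / sxx)
t_crit = 2.776  # two-sided 95% Student t for 4 degrees of freedom
ci_low, ci_high = slope - t_crit * se_slope, slope + t_crit * se_slope

print(f"slope = {slope:.4f}, r = {r:.4f}")
print(f"95% CI of slope: ({ci_low:.4f}, {ci_high:.4f})")
print("linearity criteria met:", r >= 0.98 and ci_low <= 1.0 <= ci_high)
```

In practice the same fit would be run once per regression line (each impurity and the active component), with results compared against the protocol's stated criteria.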

Acceptance criteria derived for approach B should be similar to those based on repeatability precision. Criteria could be, for a desired signal-to-noise ratio of 3:1, three times the SD of repeatability precision.

Approach C uses the following formula: LOD = 3.3 s/m, where s is the SD of the response, and m is the slope of the calibration or spiked-product regression line. An estimate of the LOD is then obtained by the principle of the method of standard additions. This is graphically represented in Figure 4. If an assay simultaneously quantitates the active product and the impurity, data generated in the accuracy section and evaluated in linearity may be used to estimate the LOD using the regression line approach. Sufficient low analyte (impurity) concentrations must be included in the initial data set for accuracy to evaluate the LOD from one sample preparation set. The LOD acceptance criteria for approach C should be identical to those based on repeatability precision if the identical data set was used. When linearity data is evaluated by regression analysis, the LOD must not exceed the repeatability precision criteria when the predicted SD of the regression line y-intercept is multiplied by 3.3 and divided by the regression line slope (slope ≅ 1).

For approach A, B, or C, and any other justified approaches, the LOD acceptance criteria must be significantly lower than the product specifications and the LOQ. Selecting and justifying a particular approach should be done with a knowledge of method capabilities, in particular the level of precision. One cannot expect to determine a relatively low LOD, as the variance within low analyte concentrations is relatively high.

Figure 4: Expected versus Observed Spike Concentration
(Plot of observed analyte concentration versus expected [spiked] analyte concentration, both in percentages, for an analyte present at 1.15% at 0% spike. Regression line: y = 1.033x + 1.15; r = 0.9994; total n = 18; SD on y-intercept = 0.217%. Annotations: LOD = 0% spike + 3.3 SD (1.84%); LOQ = 0% spike + 10 SD (3.15%); LOD = 3.3 x 0.217% / 1.033 = 0.69%; LOQ = 10 x 0.217% / 1.033 = 2.1%.)

Limit of Quantitation (LOQ) is by definition the lowest analyte concentration that can be quantitated with accuracy and precision. Since the LOQ constitutes the beginning of the assay range, the assay range criteria for linearity must be passed for the particular analyte concentration determined to be the LOQ. The determination of the LOQ involves the same approaches (A, B, and C) as those for LOD. The only difference is the extension of the required signal-to-noise ratio to 10:1 (approach B), or the change in the formula (approach C) to: LOQ = 10 s/m. The acceptance criteria for LOQ should therefore be set proportionally similar to those indicated for LOD. In addition, the LOQ acceptance criteria should contain the same limits for accuracy, repeatability precision, and linearity, as set for each of these validation parameters.

Two reasons for caution should be considered when following ICH approach C. One, the determination of particular analyte concentrations for LOD and LOQ is independent of sample size, but sample size should be ≥ 6. Individual results plotted for each analyte concentration tested (instead of averages) generally yield higher SDs, and therefore higher LODs and LOQs. Two, approach C only delivers acceptable LODs and LOQs when the assay response is highly linear, precise, and accurate over the plotted concentration range. In addition, the spiked sample preparation must be accurately performed to prevent further random deviations from the regression line. If any of these raised issues may be a real concern, a different justified approach should be chosen.

Robustness should be addressed during method development. The main reason is that a method and its governing SOP are not to be changed for routine testing and the validation of that SOP. The SOP controls operational limits within the overall system suitability criteria that are set during AMD. Deliberate small changes to the test system should be done during development, because significant differences in the AMV results may not be easily explained in the AMV report.

System Suitability should be demonstrated by showing

that a complete test system is capable of delivering accurate and reliable results over time when used under routine QC testing conditions. All materials to be tested or used in testing should be stable in excess of the duration of the test procedure. Appropriate reference material (standards and/or controls) should be used to establish and control system suitability. Standards and controls should have reasonable acceptance limits properly derived from historical data. These limits should be regularly monitored and adjusted to account for minor changes, such as those potentially expected from switching reagents.

Overall test system suitability is generally demonstrated by passing the acceptance criteria of all AMV parameters evaluated. During the AMV execution, all invalids, repeats, and OOS results generated should be evaluated in the AMV report. More detail will be provided in Part III of GAMVP.

AMV Acceptance Criteria Example

Once it has been decided that a test method must be validated, as per standard practice instructions (see also GAMVP, Part I, November 2002 issue, Journal of Validation Technology), a successful AMV approach should be thoroughly planned. Provided below is an example of how to select the appropriate assay categories (and therefore the required validation parameters), develop and describe a validation strategy, and systematically derive reasonable acceptance criteria for the AMV protocol.

Hypothetical Scenario

The formulation of a therapeutic protein will be changed (minor) at a late stage of the purification process. Several final container test methods require complete revalidations (current method) or validations (new method), while some will require only partial revalidations, depending on the formulation change impact on each test method. It was decided that the purity test requires a complete revalidation. Quantitative Capillary Zone Electrophoresis (CZE) is used to simultaneously provide results for the active protein and the impurities present in low, but reported, concentrations. All protein components present are quantitated as Relative Percent Area (RPA) out of all components present (100%). Final container product specifications are NLT 90% for active protein, NMT 5% of protein impurity A, and NMT 10% of protein impurity B.

Approach

The CZE test method must be validated for content/potency (major component) and for quantitation of impurities. From the information listed in Figure 1, the CZE test method must be validated simultaneously for ICH category I and II. The required validation parameters are accuracy, repeatability precision, intermediate precision, specificity, linearity, assay range, LOD, and LOQ.

The next step is to analyze product specifications, and compare those to the historical assay performance. In general, the historical assay performance can be evaluated from AMD data, previous validation data, historical product final container QC testing data, and historical assay control data. Since we are revalidating this CZE test procedure without having changed test method system parameters besides our minor product reformulation, there is no need to evaluate AMD and previous validation data. Assuming that there were no recent minor changes (i.e., change in reagent manufacturer) that could have shifted historical results for the assay control (and product), historical QC data for final containers of product, and for the assay control, of the last several months (n ≥ 30) should be evaluated. Historical product results will contain lot-to-lot variation due to an expected lack of complete product uniformity. These results are therefore expected to have a greater variation than those of the assay control. The historical QC testing data for the control and product are listed in Figure 5.

Figure 5: Historical Testing Data for the Assay Control and Product Over the Last Six Months

Sample/Statistic            Percent Purity     Percent Impurity A   Percent Impurity B
                            Prod.    Cont.     Prod.    Cont.       Prod.    Cont.
Product Specifications      90%      -         5%       -           10%      -
n                           90       90        90       90          90       90
Mean (in %)                 94.1     91.4      2.0      2.8         3.9      5.8
Standard Deviation (in %)   1.32     1.14      0.43     0.31        0.55     0.39
CV (in %)                   1.41     1.25      28.6     11.1        13.8     6.72

KEY: Prod. = Product; Cont. = Control
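The Figure 5 numbers already determine how tight the accuracy criteria can be. As a sketch of the worst-case derivation used in the Figure 6 Accuracy rationale (subtracting two historical assay-control SDs from the distance between the product mean and its specification):

```python
# Worst-case accuracy limit for the therapeutic protein, derived from
# the Figure 5 historical data (purity columns).
product_mean = 94.1   # historical product mean purity (%)
spec = 90.0           # specification: NLT 90% purity
control_sd = 1.14     # historical assay-control SD (%)

# Reserve 2 SDs for intermediate precision; what remains bounds accuracy.
limit = ((product_mean - spec) - 2 * control_sd) / spec * 100

print(f"maximum accuracy acceptance limit: 100 +/- {limit:.2f}%")
```

This reproduces the 100±2% recovery criterion assigned to the therapeutic protein in Figure 6; the analogous calculation with the impurity columns yields the much wider impurity limits.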

The data of Figure 5 may then be used to generate the acceptance criteria for all required validation parameters. Figure 6 lists each validation parameter with the relevant AMV design, brief sample preparation, reported results, acceptance criteria, and a rationale for the acceptance criteria.

The Validation Protocol

The AMV protocol may consist of the sections listed in Figure 7. In general, the protocol should have sufficient detail to be executed by the operators routinely performing the test procedure to be validated. The SOP (or draft version) must be followed as written, unless specified and reasoned otherwise in the protocol. This is important because the SOP, which includes sample preparation and instructions as to how results are generated and reported, should be validated as a complete test system.

Following a signature page and a list of content sections, the reasons and scope of the AMV, as well as previous or supporting validations, should be mentioned in the introduction section. A brief description of the principle of the test methodology should be given in the principle section. Materials, equipment, and instrumentation to be used must be described in detail, including Certificates of Analysis (CoA) for all reference materials, instrument ID numbers, and all products or in-process material to be tested. Historical assay performance should be summarized from analytical method development data (new method) or routine testing results (revalidation), and integrated into the acceptance criteria. The selected assay classification (i.e., category I assay validation to be used for the quantitation of the main drug component) should be clearly given in the beginning of the section on validation parameters and design. The validation approach used to demonstrate system suitability for each validation parameter should be described and justified, and reported results and their acceptance criteria should be provided. In addition, detailed instructions for sample preparation, AMV execution, and validation result generation should be included. A data analysis section should indicate which (validated) software should be used to statistically evaluate results versus acceptance criteria.

A table (validation execution matrix) should be included in the protocol, listing which validation parameter will be executed by which operator, on which day, and on which instrument. This table will demonstrate to the reader of this document that the proposed validation is well-planned, and should furthermore prevent execution deviations by the operators. A validation execution matrix example is given in Figure 8.

A list of references to the governing Standard Practice (SP) and supporting documents assures the reader that all relevant procedures are followed, and that relevant supporting documents (CoA, product specifications, historical data, and supporting reports) were consulted. All supporting documents should be attached (list of attachments) and filed with the protocol. A final section, AMV matrix and acceptance criteria, in which the reader can refer to a table where each validation parameter's validation approach, reported results, and acceptance criteria are summarized, will be helpful. Information can be copied from the validation parameter section.

Significant Digits of Reported Results

Final container and in-process product specifications should report test results with the appropriate number of significant digits. AMVs should generate this number by consistently following a designated SP. Test results must be reported reflecting the uncertainty in these results. This uncertainty can be expressed by using the appropriate number of significant digits based on assay precision. How exactly this is to be done depends on definitions and instructions within the SP(s). One relatively simple way of dealing with this issue is to use a widely accepted SP, such as E 29-02, published by the American Society for Testing and Materials (ASTM E 29-02).1 This practice gives clear instructions on how to generate significant digits from repeatability precision, which is required of quantitative AMVs, as per ICH, USP, and FDA guidelines. The reason that AMVs should deliver the appropriate reported uncertainty for test results lies mostly in the fact that, by the time an AMV is executed, at a minimum a draft version of the SOP is already in place on which QC operators have been trained. Following this ASTM E 29-02 practice, in which the definition for repeatability precision matches those of ICH, USP, and FDA, provides the advantage of having reference to an accepted federal document.
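How the repeatability SD maps onto reported digits must ultimately come from the designated SP; the helper below is an illustrative convention, not the ASTM E 29-02 rule verbatim. It rounds each result to the decimal place of the leading significant digit of the repeatability SD, using the round-half-even rule that both ASTM E 29 and Python's built-in round() apply:

```python
import math

def round_to_precision(result: float, sd: float) -> float:
    """Round a test result to the decimal place of the leading
    significant digit of its repeatability SD (illustrative rule;
    round() uses round-half-even, as ASTM E 29 prescribes)."""
    place = math.floor(math.log10(abs(sd)))
    return round(result, -place)

# With a repeatability SD of 0.217%, results carry one decimal place:
print(round_to_precision(94.132, 0.217))
# With an SD of 1.14%, results are reported to the whole percent:
print(round_to_precision(94.132, 1.14))
```

The point of anchoring the rule in code (or in a validated spreadsheet) is consistency: every operator reporting against the same SOP produces the same number of digits for the same precision data.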

38 I n s t i t u t e o f Va l i d a t i o n Te c h n o l o g y
Stephan O. Krause, Ph.D.

Figure 6
Summary of Overall AMV Design and Acceptance Criteria
Validation AMV Design Sample Minimum Reported Acceptance Rationale for
Parameter Preparation ICH Q2B Results Criteria Acceptance
Requirements Criteria
Pre-require- Identification and Follow corre- N/A Mean purity Identification We cannot expect
ment (1) purity of commer- sponding SOPs (n=3) in %, of commer- 100% purity of com-
cially purchased for other tests. identification cially pur- mercial proteins. Less
protein impurity A (n=3): Yes/no chased pro- than 100% purity can
and B must be teins must be normalized for per-
determined using match impu- cent recovery calcula-
complimentary rity protein A tions. Identification(s)
tests (other meth- and B, respec- must match because
ods such as SDS- tively. response factors for
PAGE, HLPC, impurity protein A and
HPSEC, MS, B (Accuracy) can oth-
Western Blot). erwise not be vali-
Run in triplicates. dated.
Pre-require- Potential response Follow SOP for N/A Mean area (Caution must be
ment (2) factor differences CZE. Ideally, pro- counts for None exerted here be-
for protein impurity tein impurity A each of impu- cause we are cur-
A and B must be and B should be rity A and B. rently using the
determined. Differ- tested individu- Response fac- CZE test (validated
ences in purity ally at product tors. for final product re-
and/or response specification lease testing).
factors must be concentration,
‘normalized’ for and final con-
percent recovery tainer product lot
calculations. Run (A) should be
in triplicates. tested at 100%.
Accuracy Percent recoveries Spike commer- Data: three repli- Mean percent The combination
of commercially cially purchas- cates over three recoveries Mean spike (worst-case sce-
purchased refer- ed protein im- concentrations (n=3) for each recoveries for nario) of assigned
ence material for purity A and B covering the spiked con- impurity A and limits for Interme-
protein impurity A each into refor- Assay Range. centration impurity B for diate Precision and
and B will be de- mulated final (n=7) for impu- each spike Accuracy must be
termined from in- container prod- rity A, impurity concentration no greater than the
creasing spike uct (lot A) with B, and the cor- (n=7) must fall difference between
concentrations by increasing con- responding within historical mean prod-
using Relative centrations (0.0, percent recov- 100±40% and uct results (n=3, see
Percent Area 0.5, 1.0, 2.0, eries for the 100+ -20%, Table 3) and their
(RPA). RPAs for 5.0, 10.0, 15.0, therapeutic respectively. corresponding prod-
each protein impu- 20.0 %) keep- protein will be Each corre- uct specifications
rity and corre- ing final protein tabulated. sponding (n=3). A worst-case
sponding thera- concentration mean spike limit of historically
peutic protein will constant. recovery recorded 2 SDs (of
be determined (n=2x7) for assay control, see
using individual the therapeu- Intermediate Pre-
response factors (if tic protein cision) has been as-
required). All spike must fall signed to Interme-
concentrations will within 98- diate Precision. This
be run in triplicates 102%. limit is then sub-
by Operator 1 on tracted from the
Day 1 using Instru- product specifica-
ment 1. tions, and constitutes
Percent Recovery the maximum value
= (Observed for the acceptance
RPA/Expected criteria for Accuracy.
RPA) x 100%. An example for the
therapeutic protein
Continued
A n a l y t i c a l M e t h o d s Va l i d a t i o n 39
Stephan O. Krause, Ph.D.

Figure 6 (Continued)
Summary of Overall AMV Design and Acceptance Criteria
Validation AMV Design Sample Minimum Reported Acceptance Rationale for
Parameter Preparation ICH Q2B Results Criteria Acceptance
Requirements Criteria
Accuracy recovery is given
here: {[(94.1% -
90.0%) - (2 x
1.14%)] / 90.0%} x
100% = 2.02%.
Therefore, percent
recovery = 100±2%.
Repeatability Data will be gen- Follow SOP for Data: Nine deter- From Accur-
Mean CVs CVs may differ over
Precision erated in Accur- CZE and test minations over acy data: CVs
(n=8) from Ac- the studied assay
acy to demon- one final prod- Assay Range (in %), means
curacy data range, and we have
strate precision uct container (e.g., three repli-
(n=3), SDs,
must be within very limited data
over the entire lot (A) at 100%. cates over three CIs (p=0.05)
the following points (n=3) for
Assay Range. concentrations). for means, for
limits (in each test concen-
In addition, Op- six determina- % therapeutic
RPA): % ther- tration. Therefore,
erator 1 on Day 1 tions at 100% protein, pro-
apeutic pro- we must keep
using Instrument 1 test concentra- tein impurity A,
tein: NMT 2.5, mean CVs as wide
will generate tion. and protein
% impurity A: as possible to avoid
n=15 data points impurity B.
NMT 22. % failing acceptance
using one final impurity B: criteria.
Report: Standard From Re-
NMT 13.
product container Deviation (SD), peatability CVs from samples
lot. This extensive Coefficient of data:
CVs (n=3) at 100% test con-
data set for Re- Variation (CV), from 15 data
CV (in %), centrations (n=15
peatability Pre- points must
Confidence Inter- mean (n=15), data points) shall
cision will be used val (CI). be within the
SD, CI (p=0.05) be no greater than
to generate the ap- following limits
for mean, for those of the histori-
propriate number (in RPA): %
% therapeutic cal assay control
of significant digits therapeutic
protein, pro- because these data
to be reported for protein: NMT
tein impurity were generated
test results. 1.3, % impu-
A, and protein over six months by
rity A: NMT
impurity B. different operators
11. % impurity on different instru-
B: NMT 6.7. ments.
Intermediate One unspiked final Follow SOP for Data/Report: No Overall and P-value of The means and pre-
Precision product container CZE and test specific require- individual P- ANOVA must cision variabilities
lot (A) will be one final prod- ments. Variations values of fac- be NLT 0.05. If among and between
tested in triplicates uct container (factors) to be tors (opera- p < 0.05, addi- factors should not
on each of three lot (A) at 100%. studied (in a ma- tors etc.) from tional F-tests be statistically differ-
days by each of trix) are days, ANOVA. Over- and T-tests will ent at 95% confi-
three operators on operators, and all and factor be performed dence. The likeli-
each of three in- equipment. CV(s) and to isolate fac- hood of observing
struments. Inter- SD(s) for % tors with statis- statistical differ-
mediate Precision therapeutic tically different ence(s) increases
will be determined protein, pro- means and/or with assay precision,
for each purity tein impurity variations. An and may not impact
and integrity char- A, and protein investigation system suitability. It
acteristic by using impurity B. must demon- is therefore advis-
an Analysis of Var- strate that each able to set an “es-
iance (ANOVA). different factor cape clause” by
Any statistical dif- mean (at generating numeri-
ferences (at the p=0.05) will not cal limits for overall
95% confidence affect assay CV (2 SDs of assay
level) between performance control) and factor
and within factors and overall sys- CVs (1 SD of assay
(operators, days, tem suitability. control) from the
instruments) will historical data. It is
Continued
40 I n s t i t u t e o f Va l i d a t i o n Te c h n o l o g y
Stephan O. Krause, Ph.D.

Figure 6 (Continued)
Summary of Overall AMV Design and Acceptance Criteria
Validation AMV Design Sample Minimum Reported Acceptance Rationale for
Parameter Preparation ICH Q2B Results Criteria Acceptance
Requirements Criteria
Intermediate be investigated. (A Overall CV more meaningful to
Precision matrix for Interme- must comply use the historical
diate Precision is with the follow- assay control data
illustrated in Table ing limits: % (see Table 3) here
2 of this article) therapeutic because product
protein (in data includes nor-
RPA): NMT mal variation among
2.5, % impu- different product lots.
rity A: NMT
22. % impurity
B: NMT 13.
Factor CVs
must comply
with the follow-
ing limits: %
therapeutic pro-
tein (in RPA):
NMT 1.3, %
impurity A:
NMT 11.% im-
purity B: NMT
6.7.
Specificity Matrix interference: Matrix interfer- No specific re- Individual and No statistical The means and
Matrix interference ence: All sam- quirements. mean (n=3) significant dif- precision variabili-
will be evaluated by ples (constant RPAs and cor- ference (at ties among and be-
comparing results final concentra- responding 95% confi- tween factors
for each impurity- tions) will each percent recov- dence level) should not be sta-
spiked (A and B) be spiked with eries for shall be tistically different at
sample, spiked into 5% of protein spiked sam- obtained 95% confidence.
final product con- impurity A and ples (n=6) will (p > 0.05) in Similar to Inter-
tainer (lot A), to B. be reported. ANOVA. If p < mediate Precision,
those of spiked An ANOVA 0.05, additional the likelihood of ob-
assay control, and table will be F-tests and t- serving statistical
spiked current final presented. tests will be difference(s) in-
product (lot B). Per- performed to creases with assay
cent recoveries will isolate spiked precision, and may
be compared by samples with not impact system
ANOVA and, if re- statistically dif- suitability. In addi-
quired, by t-tests to ferent means tion, we should ac-
evaluate potential and/or varia- count for potential
differences be- tions. An inves- differences in re-
tween product lot tigation must sults due to sample
(lot A), the assay demonstrate preparations. It is
control, and current that each dif- therefore advisable
final product (lot B). ferent factor to set an “escape
One operator will mean (at clause” by generat-
run all samples on p=0.05) will not ing numerical limits
one day on one in- affect assay for difference limit
strument. The fol- performance (1 SD of assay
lowing samples will and overall control) from the
be prepared: Three system suit- historical data. It is
spiked sample pre- ability. more meaningful to
parations of each The differ- use the historical
impurity (n=2) for ence(s) among assay control data
each sample spiked (see Table 3) here

Continued
A n a l y t i c a l M e t h o d s Va l i d a t i o n 41
Stephan O. Krause, Ph.D.

Figure 6 (Continued)
Summary of Overall AMV Design and Acceptance Criteria
Validation AMV Design Sample Minimum Reported Acceptance Rationale for
Parameter Preparation ICH Q2B Results Criteria Acceptance
Requirements Criteria
Specificity (n=3). All samples matrices (lots A because product
will be run three and B, and data includes nor-
times (total runs: assay control) mal variation among
n=3x2x3x3=54). for each spiked different product
Analyte interfer- impurity (n=2), lots. A reason of
ence: Analyte in- must be no caution is how well
terference can be greater than the differences in sam-
inferred from the following limits ple preparation can
matrix interfer- (in RPA): NMT be controlled.
ence studies. 1.3, % impurity
A: NMT 11. %
impurity B:
NMT 6.7.
Linearity Linearity will be See Accuracy. Correlation coeffi- Regression Correlation co- Because lack of
determined at the cient(s), y-inter- line slopes, in- efficient ≥ 0.98 Accuracy, Repeat-
low percentage cept(s), slope(s) of tercepts, corre- for each of ability Precision,
range (approx. 0- regression line(s), lation coeffi- three re- and differences in
20 RPA) to cover a and Residual cients, RSS for gression lines. sample prepara-
potential impurity Sum(s) of Squares each regres- All three CIs tion(s) may con-
range (NMT 5% (RSS) should be sion line. (at 95% confi- tribute to a de-
impurity A; NMT reported. Plots (n=3) of dence) for each crease in regres-
10% impurity B), A plot of the data the regression regression line sion line fit (lower
and at the high (regression line) lines of individ- slope must correlation coeffi-
percentage range to be provided. ual RPA re- contain 1. cient), a generally
(approx. 75 to 95 sults (n=3 for acceptable correla-
RPA) to cover the NLT 5 concentra- tion coefficient (≥
tions to be each spiked
product specifica- concentration) 0.98) should be
tions for the thera- tested. used here. The
for each
peutic protein (NLT spiked con- confidence limits of
90 %). Three re- centration (0.0, the slope should
gression lines will 0.5, 1.0, 2.0, contain 1 since oth-
then be generated, 5.0, 10.0, 15.0, erwise assay re-
one each for the 20.0%) versus sponse may not be
two low (impurity A actual spike sufficiently propor-
and B), and one for concentrations tional to support
the high (therapeu- (in RPA) pre- quantitative results
tic protein) percent- sent will be over the entire
age ranges. In- provided. assay range.
dividual RPA re-
sults (n=3 for
each spiked con-
centration) for each
spiked concentra-
tion (0.0, 0.5, 1.0,
2.0, 5.0, 10.0,
15.0, 20.0%) will
be plotted against
actual spike con-
centrations (in
RPA) present.
Assay Range Assay Range will See Accuracy. For therapeutic Regression Correlation co- All results generated
be determined at protein: line slopes, in- efficients for within the deter-
the low percent- 80 to 120% of tercepts, cor- each of three mined Assay Range
age range (ap- test concentra- relation coeffi- regression must be accurate
prox. 0-20 RPA) to tion. cients, RSS lines. All three and precise. The
Continued
42 I n s t i t u t e o f Va l i d a t i o n Te c h n o l o g y
Stephan O. Krause

Figure 6 (Continued)
Summary of Overall AMV Design and Acceptance Criteria
Validation AMV Design Sample Minimum Reported Acceptance Rationale for
Parameter Preparation ICH Q2B Results Criteria Acceptance
Requirements Criteria
Assay Range cover a potential For impurity A for each re- regression line assay response
impurity range and B: gression line slope CIs within the Assay
(NMT 5% impurity From reporting will be reported. (95% confi- Range must be lin-
A; NMT 10% im- level to 120% of All coefficients dence) must ear. For further de-
purity B), and at specification of variation contain 1. All tails, see sections
the high percent- (CV) for RPA acceptance Accuracy, Repeat-
age range (ap- for each spiked criteria for Ac- ability Precision,
prox. 75 to 95%) concentration curacy, Re- and Linearity.
to cover the prod- will be reported. peatability
uct specifications An overall CV Precision, and
for the therapeutic for each of the Linearity must
protein (NLT 90 three spiked be passed.
%). For details, samples series
see Linearity sec- (impurity A, B,
tion. and therapeutic
protein) will be
reported.
Validation Parameter: Limit of Detection
AMV Design: The LOD will be determined for each impurity (A and B) concentration from data generated in the Accuracy section and evaluated in the Linearity section. For details, refer to the Linearity section. Since final product container lot (A) may contain significant levels of each of impurity A and B (> 1%), the LOD will be determined from the regression lines generated for impurity A and B in the Linearity section, as per section VII.C.1 of the ICH Guidance to Industry document Q2B: LOD = (3.3 x σ) / S. The slopes (S) will be determined from the linear regression data for each impurity (A and B). The standard deviation (σ) of the response will be determined from the RPA results for each impurity (A and B) in the Repeatability Precision section.
Sample Preparation: See Accuracy and Repeatability Precision.
Minimum ICH Q2B Requirements: Approach C (see section LOD of this article): LOD = (3.3 x σ) / S, where σ = SD of response and S = regression line slope.
Reported Results: All concentrations and results (in RPA) will be tabulated. The apparent LODs (in RPA) for each impurity (n=2) will be reported.
Acceptance Criteria: The LODs for impurity A and B must be NMT 0.4% and 0.9%, respectively.
Rationale for Acceptance Criteria: In general, this ICH-recommended approach to determine LOD may yield relatively high values for LOD (and LOQ) versus some alternative approaches. The level of Accuracy, Repeatability Precision, and Linearity in results generated by this test system will be reflected in the LOD (and LOQ). The LOD should be less (33%) than the LOQ, which in turn must be significantly less than the historical product impurity means. See also LOQ.

Analytical Methods Validation 43

Validation Parameter: Limit of Quantitation
AMV Design: The LOQ will be determined for each impurity (A and B) concentration from data generated in the Accuracy section and evaluated in the Linearity section. For details, refer to the Linearity section. Since final product container lot (A) may contain significant levels of each of impurity A and B (> 1%), the LOQ will be determined from the regression lines generated for impurity A and B in the Linearity section, as per section VIII.C.1 of the ICH Guidance to Industry document Q2B: LOQ = (10 x σ) / S. The slopes (S) will be determined from the linear regression data for each impurity (A and B). The standard deviation (σ) of the response will be determined from the RPA results for each impurity (A and B) in the Repeatability Precision section.
Sample Preparation: See Accuracy and Repeatability Precision.
Minimum ICH Q2B Requirements: Approach C (see section LOQ of this article): LOQ = (10 x σ) / S, where σ = SD of response and S = regression line slope.
Reported Results: All concentrations and results (in RPA) will be tabulated. The apparent LOQs (in RPA) for each impurity (n=2) will be reported.
Acceptance Criteria: The LOQs for impurity A and B must be NMT 1.1% and 2.8%, respectively.
Rationale for Acceptance Criteria: The LOQ should be significantly less than the historical mean impurity results (2.0% and 3.9% for impurity A and B, respectively; see Table 3). We can determine the LOQ (and therefore the LOD) by subtracting 2 SDs for product impurity results from the historical mean impurity results (e.g., impurity A: 2.0% - 2 x 0.43% = 1.14%). See also the rationale under LOD.

Validation Parameter: System Suitability
AMV Design: All current criteria for system suitability (per SOP) must be satisfied in order for each test to be considered valid. Each failing test will be repeated per SOP until the current criteria are met. System suitability will be evaluated by listing invalid tests. The appropriate number of significant digits in reported results will be determined following ASTM E 29-02.
Sample Preparation: See all sections.
Minimum ICH Q2B Requirements: No specific requirements.
Reported Results: Number of valid and invalid tests. Appropriate number of significant digits to be used for final result reporting.
Acceptance Criteria: As per SOP. No acceptance criteria for number of invalids and appropriate number of significant digits.
Rationale for Acceptance Criteria: System suitability will be demonstrated by passing all acceptance criteria. System suitability criteria of the SOP may change, depending on the number of valids/invalids generated.
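The Approach C formulas used in the LOD and LOQ rows (LOD = 3.3 σ / S, LOQ = 10 σ / S) can be sketched as follows. Note one simplification: σ is taken here as the residual SD of the fitted line, whereas the protocol above takes σ from the Repeatability Precision RPA results; the spiking series is likewise invented for illustration:

```python
import numpy as np

def lod_loq_from_regression(conc, response):
    """Estimate LOD and LOQ per the ICH Q2B regression approach:
    LOD = 3.3 * sigma / S and LOQ = 10 * sigma / S, where S is the
    regression line slope and sigma is the SD of the response
    (here taken as the residual SD of the fitted line)."""
    conc = np.asarray(conc, dtype=float)
    response = np.asarray(response, dtype=float)
    # Least-squares fit: response = S * conc + intercept
    S, intercept = np.polyfit(conc, response, 1)
    residuals = response - (S * conc + intercept)
    # Residual standard deviation with n - 2 degrees of freedom
    sigma = np.sqrt(np.sum(residuals ** 2) / (len(conc) - 2))
    return 3.3 * sigma / S, 10.0 * sigma / S

# Hypothetical spiked-impurity series (percent spiked vs. RPA measured)
conc = [0.5, 1.0, 2.0, 4.0, 6.0, 8.0]
resp = [0.52, 0.98, 2.05, 3.96, 6.10, 7.95]
lod, loq = lod_loq_from_regression(conc, resp)
print(f"LOD = {lod:.2f} RPA, LOQ = {loq:.2f} RPA")
```

By construction the ratio of the two estimates is fixed at 3.3/10, i.e., the LOD is 33% of the LOQ, matching the rationale given in the LOD row.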

Acceptance Criteria System

When acceptance criteria are determined for each validation parameter, the fact that these are connected may often be overlooked. Each quantitative test system has certain capabilities to yield accurate, precise, and analyte-specific results over the desired assay range. Since every test system has certain limits on its capabilities, the acceptance criteria that ideally should define these limits should be connected. Test performance expectations should be reflected in an Acceptance Criteria System (ACS) where all acceptance criteria for the required validation parameters (as per assay classification) are meaningful, and will focus on permissible worst-case conditions.

Like most concepts, the ACS has several drawbacks. One, it takes time and experience to evaluate and integrate all assay performance expectations into one system for all validation parameters, especially when validation data will be generated under QC routine testing
Figure 7
Suggested AMV Protocol Sections

Section Number | Section | Subsections
N/A | Protocol Approval | Protocol Title; Signatures with Job Titles
N/A | List of Protocol Sections | Table of Contents; List of Figures (if applicable); List of Tables
1 | Introduction | N/A
2 | Principle | N/A
3 | Materials, Equipment, and Instrumentation | Materials; Equipment; Instrumentation
4 | Historical Assay Performance | Historical Data for Assay Control; Historical Data for Samples (if available); Product Specifications
5 | Validation Parameters and Design | Test Method Description (summarizes SOP); Validation Pre-Requirements (if applicable); Validation Parameters
6 | Validation Execution Matrix | See Table 5
7 | Data Analysis | Calculation Samples; Statistical Software
8 | List of References | N/A
9 | List of Attachments | N/A
10 | AMV Matrix and Acceptance Criteria | Table with Column Headings: Validation Parameters, Validation Approach, Sample Preparation, Reported Results, Acceptance Criteria

A n a l y t i c a l M e t h o d s Va l i d a t i o n 45
Stephan O. Krause, Ph.D.

Figure 8
Validation Execution Matrix
Validation Parameter | Op. Number | Day Number | Ins. Number | Run Number | Sample (Spike Conc.)
Accuracy | 1 | 1 | 1 | 1 | (3x): 5, 10, 20, 40, 60, 80, 100, 120%
Repeatability | 1 | 1 | 1 | 1 | As Accuracy
Int. Precision | 1 | 2 | 1 | 2 | 3x 100% Conc.
Int. Precision | 2 | 2 | 2 | 3 | 3x 100% Conc.
Int. Precision | 3 | 2 | 3 | 4 | 3x 100% Conc.
Int. Precision | 1 | 3 | 2 | 5 | 3x 100% Conc.
Int. Precision | 2 | 3 | 3 | 6 | 3x 100% Conc.
Int. Precision | 3 | 3 | 1 | 7 | 3x 100% Conc.
Int. Precision | 1 | 4 | 3 | 8 | 3x 100% Conc.
Int. Precision | 2 | 4 | 1 | 9 | 3x 100% Conc.
Int. Precision | 3 | 4 | 2 | 10 | 3x 100% Conc.
Specificity | 1 | 5 | 1 | 11 | Matrix Interference
Specificity | 1 | 5 | 1 | 12 | Analyte Interference
Linearity | 1 | 1 | 1 | 1 | As Accuracy
Assay Range | 1 | 1 | 1 | 1 | As Accuracy
LOD | 1 | 1 | 1 | 1 | As Accuracy
LOQ | 1 | 1 | 1 | 1 | As Accuracy
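The Int. Precision rows above vary operator, day, and instrument (three levels each, 3x at 100% concentration); such designs are typically evaluated with ANOVA. A minimal one-way sketch for a single factor, with invented results rather than actual validation data:

```python
import numpy as np

def one_way_anova_F(groups):
    """One-way ANOVA F statistic for k groups of results, e.g. the
    100%-concentration runs split by operator (or day, or instrument)
    in an intermediate-precision study."""
    data = [np.asarray(g, dtype=float) for g in groups]
    grand_mean = np.concatenate(data).mean()
    k = len(data)
    n = sum(len(g) for g in data)
    # Between-group (factor) and within-group (residual) sums of squares
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in data)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in data)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Invented 100%-concentration results, three runs per operator
operators = [[99.1, 100.4, 99.8], [100.2, 99.6, 100.9], [98.7, 99.9, 99.2]]
F = one_way_anova_F(operators)
# The factor is significant at 95% confidence if F exceeds the tabulated
# F critical value for (k-1, n-k) = (2, 6) degrees of freedom
print(f"F(2, 6) = {F:.2f}")
```

In practice each factor would be tested this way, and a significant F would be followed up with F-tests and t-tests on individual variability contributors, such as a particular operator.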

conditions. Two, systematic errors introduced during sample preparation for spiking studies (initially small errors could also be magnified at the end of a dilution series) to determine accuracy (percent recovery) may not be accounted for when the ACS is solely developed using historical data and method capabilities. Three, when one validation parameter fails its acceptance criteria, in general, all validation parameters will fail, leading to potential complete failure to demonstrate test system suitability. On the other hand, the opposite must then also be true, meaning that all criteria within the complete ACS will be passed when one acceptance criterion is passed.

Although the ACS may only be a concept at this point, and may not be applicable for all AMVs, the potential advantages of a well-developed ACS should outweigh the drawbacks, because the ACS is solid as a system, and can easily be justified and defended. Each individual acceptance criterion is now meaningful, related to all others, and reflects the test system performance capabilities. The concept of the ACS should be considered for accuracy, precision (repeatability and intermediate), assay range, LOQ, and specificity. However, deriving acceptance criteria for the linearity parameter will be difficult, since estimating the potential worst-case combination(s) of regression line slope, y-intercept, and regression coefficient becomes very complex.

With a well-developed ACS, auditors can no longer criticize acceptance criteria. Acceptance criteria are now derived as part of the ACS, which in turn demonstrates method capabilities with respect to product specifications, historical data, and method capabilities. Furthermore, the ACS is a dynamic system that can be readily adapted as a unit to changes to the system, or for other reasons for revalidation. With experience, it will become easier and faster to set up an ACS, even for the AMV of a new test method.

Conclusion

Deriving reasonable acceptance criteria requires experience and a deep understanding of the method capabilities, product specifications, and historical data. This article provides a detailed approach to derive these criteria, which can now be justified and easily defended in an audit. The AMV can now accurately demonstrate that the test system is suitable for its intended use. ❏


About the Author

Stephan O. Krause, Ph.D. is managing the QC Analytical Validation department within the Biological Products division of Bayer HealthCare Corporation. He received a doctorate degree in bioanalytical chemistry from the University of Southern California. Dr. Krause can be reached by phone at 510-705-4191, and by e-mail at [Link].b@[Link].

Acknowledgement

I would like to thank my colleague, Christopher Fisher, for his helpful comments and critical review of this article.

Reference

1. As per ASTM E 29-02 Section 7.4, the following instructions are given: "A suggested rule relates the significant digits of the test result to the precision of the measurement expressed as the standard deviation σ. The applicable standard deviation is the repeatability standard deviation (see Terminology E 456). Test results should be rounded to not greater than 0.5 σ or not less than 0.05 σ, provided that this value is not greater than the unit specified in the specification (see 6.2). When only an estimate, s, is available for σ, s may be used in place of σ in the preceding sentence. Example: A test result is calculated as 1.45729. The standard deviation of the test method is estimated to be 0.0052. Rounded to 1.457 since this rounding unit, 0.001, is between 0.05 σ = 0.00026 and 0.5 σ = 0.0026."
For the rationale for deriving this rule, refer to ASTM E 29-02. For definitions, refer to ASTM E 456.

Suggested Reading

1. Krause, S. O. "Good Analytical Method Validation Practice, Part I: Setting-Up for Compliance and Efficiency." Journal of Validation Technology. Vol. 9, No. 1. November, 2002. pp. 23-32.
2. International Conference on Harmonization (ICH), Q2A. "Validation of Analytical Procedures." Federal Register. Vol. 60. 1995.
3. ICH, Q2B. "Validation of Analytical Procedures: Methodology." Federal Register. Vol. 62. 1996.
4. United States Pharmacopoeia. USP 25 <1225>. "Validation of Compendial Methods."
5. American Society for Testing and Materials (ASTM) E 29-02. "Standard Practice for Using Significant Digits in Test Data to Determine Conformance with Specifications." July, 2002.
6. ASTM E 456-96. "Standard Terminology Relating to Quality and Statistics." September, 1996.
7. Miller, J. C. and Miller, J. N. "Statistics for Analytical Chemistry." (2nd ed.). Ellis Horwood Ltd., England. 1988.

Article Acronym Listing

ACS: Acceptance Criteria System
AMD: Analytical Method Development
ANOVA: Analysis of Variance
AMV: Analytical Method Validation
ASTM: American Society for Testing and Materials
CI: Confidence Interval
CoA: Certificates of Analysis
CV: Coefficient of Variation
CZE: Capillary Zone Electrophoresis
FDA: Food and Drug Administration
GAMVP: Good Analytical Method Validation Practice
HPLC: High Performance Liquid Chromatography
HPSC: High Performance Size Exclusion Chromatography
ICH: International Conference on Harmonization
LOD: Limit of Detection
LOQ: Limit of Quantitation
MS: Mass Spectrometry
NLT: Not Less Than
OOS: Out-Of-Specification
QC: Quality Control
RPA: Relative Percent Area
RSS: Residual Sum(s) of Squares
SD: Standard Deviation
SDS-PAGE: Sodium Dodecyl Sulphate-Polyacrylamide Gel Electrophoresis
SOP: Standard Operating Procedure
SP: Standard Practice
USP: United States Pharmacopeia
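The ASTM E 29-02 rounding rule quoted in the Reference can be sketched in Python. Picking the largest power of ten not exceeding 0.5 s is one convenient way to land inside the permitted 0.05 s to 0.5 s window; that choice, and the helper name, are illustrative rather than part of the standard:

```python
import math

def astm_e29_round(result, s):
    """Round a test result per the ASTM E 29-02 Section 7.4 suggested rule:
    use a rounding unit between 0.05*s and 0.5*s, where s estimates the
    repeatability standard deviation. Here we take the largest power of
    ten not exceeding 0.5*s."""
    unit = 10 ** math.floor(math.log10(0.5 * s))
    # A power of ten at or below 0.5*s is always at least 0.05*s,
    # so the chosen unit sits inside the permitted window.
    decimals = max(0, -math.floor(math.log10(unit)))
    return round(round(result / unit) * unit, decimals), unit

# Worked example from the reference: 1.45729 with s = 0.0052
value, unit = astm_e29_round(1.45729, 0.0052)
print(value, unit)
```

Run on the reference's worked example (1.45729 with s = 0.0052), this reproduces the rounding unit 0.001 and the reported result 1.457.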

Good Analytical Method
Validation Practice
Setting Up for Compliance
and Efficiency: Part I
By Stephan O. Krause, Ph.D.
Bayer HealthCare Corporation

In recent years, Analytical Method Validation (AMV) has become an increasing concern and focus of regulatory agencies. Agencies expect companies to continuously modify test methods along the long path from drug development to final release testing of the licensed product. The level of detail and validity within test method qualification and validation documents increases along this path. Although directed towards Quality Control (QC) departments of mid- to large-sized manufacturers of biological and pharmaceutical products, most of the suggestions presented here may also be applied to other industries. The scope of this article is to illustrate a rigorous AMV program that, when consistently applied, should maintain compliance with regulatory agencies in the future. This article will provide recommendations on how to incorporate compliance requirements into a practical and efficient approach to produce analytical validations.

Good Analytical Method Validation Practice (GAMVP) ensures accurate and reliable test results, and therefore the safety and quality of the product.

How many validation resources a company decides to devote to the AMV program is a business decision, like the purchase of insurance for the shipped licensed product. Companies can invest a minimum level of resources, and wait to eventually be hit by the U.S. Food and Drug Administration (FDA) with FD-483 observations and potential Warning Letters. On the other hand, companies may invest more resources into the AMV program and supporting departments, and this would then pay off later when FD-483 observations and Warning Letters could be avoided. Once the FDA has audited AMVs and found general compliance, the Agency will certainly be more confident and trusting in the overall testing results. This impression may carry on for years to come, and will furthermore ensure product safety to the patient.

Efforts towards a compliant and efficient AMV program include the generation of a method validation master plan, which aligns the timelines of the regulatory and manufacturing units with validation project completion. To be efficient, the AMV department must be well-organized, and staffed with experienced scientists who are sufficiently trained to consistently plan and execute validation projects. It is important to realize that the continuously increasing number of required validations must also be counterbalanced by an equivalent staffing increase of Quality Assurance (QA) personnel delegated for protocol and report review and approval. In order to cope with increasing expectations, the AMV department must be integrated effectively within the QC unit. However, the set-up of the AMV program starts with the generation of the Standard Practice/Procedure (SP) document, which governs the process of AMVs. The SP clearly defines all required AMV procedures and responsibilities.

The AMV Standard Practice/Procedure

Qualification versus Validation

Differences in the definitions for analytical method qualifications versus validations among regulatory agencies, and often even within a company's functional units, have somehow led to confusion for the user. Unfortunately, there is currently no guidance available on why and when methods must be qualified or validated, with the exception of non-compendial methods used for release testing, which must be validated. It is important that these terms are clearly defined in the governing SP. This document should state at which drug development or process stage methods must be qualified or validated. Since regulatory agencies still have no harmonized classification for the terminology, it is not important how companies define these terms, as long as these are well-defined in the SP, and these definitions are understood and followed by all users.

Method qualification is demonstrating test system suitability. The method qualification report should contain an assessment of the accuracy and precision of the test results, and the appropriate number of samples to be tested. Method qualifications should be performed for all compendial methods used. Pre-set acceptance criteria for system suitability (accuracy, precision, and sample size) should be used in the qualification protocol, but may be "wider" than those in the validation protocol, since method qualification is not intended to challenge a compendial or otherwise validated test method.

Method validation is also demonstrating test system suitability, but requires a more rigorous approach. When following regulatory expectations set forth in the SP, a series of validation performance characteristics (see section on Validation Parameters) must be strictly followed. All non-compendial test methods must be validated, in addition to those previously qualified or validated methods which have been automated or otherwise changed. Method validation must challenge system suitability, and should therefore have "tighter" pre-set acceptance criteria for each validation parameter in the validation protocol. Suggested applications for test method qualifications and validations are given in Figure 1 and Figure 2, respectively.

Figure 1
Method Qualification
[Boxes feeding into Method Qualification: Compendial Method; Minor Changes to Compendial Method; Minor Changes to Validated Method; Requalification]
Method Qualification = Demonstration of system suitability
Minor changes to (qualified) compendial or validated test system: Reagents, materials, measurements, equipment, etc.
Requalification of test system: Test system failure (i.e., invalid tests >10%), or required per SP to be performed (i.e., routinely every two years)

Test methods could also be defined as qualified when developed by the Analytical Method Development (AMD) group. This definition would then also correlate with the equipment and instrument terminology (qualification), and the AMD report could then be used as the Performance Qualification (PQ) for the relevant instrument(s) after the Design Qualification (DQ), Installation Qualification (IQ), and Operation Qualification (OQ) are executed and approved.

All test method qualification and validation protocols should contain acceptance criteria for each studied validation parameter. Not only validation, but also qualification protocols should be generated with clear expectations of what criteria an instrument or method must routinely produce. Qualifications may

include "wider" acceptance criteria than validations because, in general, less history of system suitability and overall assay performance may be available.

Figure 2
Method Validation
[Boxes feeding into Method Validation: Not a Compendial Method; Major Changes to Compendial Method; Major Changes to Validated Method; Changes to Product or Matrix; Revalidation; Analytical Method Development]
Method Validation = Performance characteristics as per ICH/USP/FDA guidances and SP
Major changes to (qualified) compendial or validated test system: Different procedures, data analysis, standards, full automation
Revalidation of test system: Test system failure (i.e., invalid tests >10%), or required per SP to be performed (i.e., routinely every two years)

Assigning Responsibilities

Recent trends of auditors criticizing companies for the lack of assigned responsibility and accountability should be considered when these areas are defined in the SP. The method development reports should be reviewed by the AMV department and approved by QA management. The AMV protocol and report should be prepared by experienced AMV scientists, reviewed by the AMV manager, QC manager or supervisor, and by the leading QC operator. Review by the leading QC operator ensures that all technical details of the protocol correspond to the Standard Operating Procedure (SOP) and are followed by the operator(s), and that all data and deviations are properly documented. QA management must approve all documents required by the U.S. Code of Federal Regulations (CFR) Title 21. The AMV department is responsible for the timely completion (i.e., six months) of the validation, data, information integrity, and compiling and archiving of the AMV package. Figure 3 illustrates the assigned department responsibilities, interfaces, and the overall process flow of a new analytical test method.

AMV Documentation

Nothing demonstrates control and overall integrity more than serialized (in red) pages for the AMV protocols and reports. This is truly the ultimate document control, ensuring that acceptance criteria and results cannot be changed during or after the validations are performed. Although not currently expected from regulatory agencies, this cumbersome effort will be rewarded with an overall positive impression on the regulatory agency auditor that AMVs are performed at the highest quality level.

AMV protocols and reports should have unique identification numbers, such as AMVP-02-001 for an Analytical Method Validation Protocol generated chronologically as the first one in the year 2002 (AMVR-02-001 would be the corresponding report). An addition to the protocol or report is defined as an addendum, and should be denoted as such (i.e., AD = Addendum, therefore: AMVP-02-001-AD). A correction or change to the protocol or report is defined as an amendment, and should also be denoted as such (i.e., AM = Amendment, so AMVP-02-001-AM). All identification numbers used for AMV documentation should be defined in the SP and correlate with general company document numbering rules. Ideally, the numbering could start with the analytical method development report (AMDR-02-001), but this may not be practical, since the AMV and AMD departments may report into different departments (i.e., QC versus Research & Development [R&D]).

Validation data must be generated using only acceptable data recording means, such as validated software applications and serial numbered laboratory


notebooks. Validation data should be, if possible, real- cate problems with the method itself (AMD is re-
time reviewed (same day) by the operator and the rele- sponsible). The efficiency in AMV project comple-
vant QC supervisor. Only original data must be in- tions is just as important as the overall quality, since
cluded in the AMV package for archiving. Recom- even a poorly executed AMV is still better than a non-
mended documents to be included in the AMV pack- existing AMV.
age are listed below:
Agency Guidance Documents
• Original Validation Data Depending on where the licensed product will be
• SOP sold, published guidance from the FDA, United States
• Change Request for SOP (if required) Pharmacopoeia (USP), and/or International Con-
• Certificates of Analysis (C of A) ference on Harmonization (ICH) should be referred to
• Other Relevant Reagent and Reference Information in the SP, and the minimum requirements in the rele-
• Relevant Sample Information vant guidelines should be integrated. Management
• In-process and/or Product Specifications should keep in mind that regulatory agencies will audit
• Operator Training Records the AMV department based upon current guidance
• Instrument Calibration Records documents and the company’s governing SP. It is be-
• Instrument User Logs yond the scope of this article to discuss regulatory
• Historical Data guidelines (Part II of this article will contain recom-
• Statistical Analysis mendations on how to integrate current guidelines into
• Operator Notebook Copies (if used) the AMV protocol). The reader can readily find these
• Analytical Method Development Report guidelines over the Internet ([Link]/cber/guide-
• Analytical Method Validation Protocol [Link]). Regulatory agencies will become more ex-
• Analytical Method Validation Report perienced in GAMVP, and may expect more solid and
• Addenda and/or Amendments (if done) stringent AMVs in the future. It may therefore be ad-
visable, in order to achieve long-term compliance, to
Once all documents have been reviewed by the deliver AMV packages now that can stand up to future
AMV study originator, and are organized in a binder, expectations.
the AMV package should then be archived in a limited Although listed in the ICH Q2B guidelines for
access and secure location to avoid any losses of these method validation and USP 25 <1225>, the robust-
critical documents. Clearly, nothing indicates disorga- ness of a test method should be demonstrated and
nization and incompetence more than lost validation documented during method development. The vali-
documents, which are a company’s proof for the valid- dation parameter ‘Ruggedness,’ in USP 25 <1225> is
ity of all test results. Auditors will use the SP to com- equivalent to ICH Q2B’s ‘Reproducibility,’ in that
pare it to the actual AMV documents, and will search they describe overall precision among different labo-
for any lack of detail and clear guidance, and de- ratories. Both uses of terminology are different from
viations or inconsistencies between the SP and AMV ‘Robustness’ (deliberate changes to test conditions)
documents. The SP, AMV protocol, and AMV report and ‘Inter-assay Precision’ or ‘Intermediate Pre-
should be well-structured, detailed documents that cision’ (overall precision within one laboratory). Un-
should be concisely written, well-formatted, and with- fortunately, there is still some confusion among in-
out grammatical or spelling errors. dustry experts when and how exactly to apply which
A limit should be set for the maximum time al- terminology. Anyhow, ‘Robustness’ should be cov-
lowed between the approvals of the AMV protocol ered during development. This is clearly more eco-
and report (i.e., six months). Clearly, an exceeded nomical, and makes more sense, since the validation
time limit suggests problems with either the protocol, scientist must know the level of robustness before a
execution, actual results obtained (AMV is respon- method is validated. A method should not be modi-
sible), the lack of resources or subject matter ex- fied once the method development work is completed
pertise of delegated personnel from QA management. (including robustness), and the development report is
In the worst case, an excessive time delay may indi- approved. An unfinished method is unlikely to be ro-

A n a l y t i c a l M e t h o d s Va l i d a t i o n 51
Stephan O. Krause, Ph.D.

bust enough to yield consistent test system suitability • Robustness (if not done during AMD)
over time, and may cause test system failures and po- • System Suitability
tential cumbersome and expensive efforts to fix the Attention to assay classifications should be paid
system. The function of the AMV department should when one assay simultaneously tests for the main
clearly be to qualify and validate, not to modify and product and impurities (i.e., an electrophoresis assay
develop test methods. may yield concentrations for the desired therapeutic
protein, and also for other proteins not completely sep-
Biological versus Biochemical/Chemical Methods arated during purification). The SP should state when
Biological assays are generally less quantitative and how each validation parameter is to be executed.
than biochemical or chemical assays. In addition, less Guidance should be given how to derive acceptance
industry guidance on how to validate biological test criteria and use statistics (i.e., Analysis Of Variance
methods is currently available. Although Agency ex- [ANOVA], t-tests) to express confidence in each vali-
pectations may therefore be less stringent, this how- dation parameter. In-process or product specifications,
ever, does not make the task of biological method val- historical data (assay development and/or routine test-
idation easier or more economical. Due to this general ing), and guidelines set forth in the SP should be con-
lack of experience by industry experts, the demonstra- sidered and integrated into each acceptance criterion.
tion of test system suitability is often far from trivial. Other factors, as listed under Figure 4, should also be
The SP should differentiate between biological me- considered, since different methodologies and instru-
thod qualifications (compendial) or validations (non- ments may yield different assay performance expecta-
compendial), and biochemical or chemical methods tions. More detail will be provided in Part II: Deriving
(generally more quantitative). Protocol acceptance cri- Acceptance Criteria for the AMV Protocol.
teria must be thoroughly determined to incorporate ex-
pected results from qualitative versus quantitative tests, and normal versus non-normal (i.e., Poisson statistics) result distributions. Commonly used validation guidance documents (FDA, USP, ICH) should only be used with appropriate test methods. It may not be appropriate to incorporate the validation parameters quantitation limit, linearity, and assay range when qualifying or validating a microbiological test procedure.

Validation Parameters

Assay performance criteria, such as accuracy and precision, are defined for chemical and biochemical methods in current guidelines, and should be covered according to assay classification (i.e., quantitative test, limit test). The validation parameters below are listed per ICH guidelines (Q2B):

• Specificity
• Linearity
• Assay Range
• Accuracy
• Repeatability Precision
• Intermediate Precision
• Reproducibility
• Limit of Detection
• Limit of Quantitation

Acceptance Criteria and Statistics

The SP should state guidelines for when and how descriptive statistics (i.e., mean, standard deviation) and comparative statistics (i.e., F-test, t-test) are to be used for AMVs, and how they are to be incorporated into acceptance criteria. For example, when using assay variability factors, such as multiple instruments, operators, and days, to demonstrate intermediate assay precision (within one laboratory), it is recommended to use, whenever possible, n=3 of each factor. To demonstrate that none of these factors contributes significantly (at 95% confidence) more than the others, an ANOVA should be used to evaluate intermediate precision. If significant differences among factors are observed, a series of F-tests and Student's t-tests should be used to isolate not only particular factors, but also individual variability contributors, such as a particular operator.

However, statistics should only be applied when the conclusions are meaningful. Due to sample spiking and other non-routine sample handling during the validation execution, systematic errors may not be accounted for with simple statistics, since those assume random data distributions. Acceptance criteria should be derived thoroughly, connected to product specifications, sample size, and assay performance, to clearly demonstrate test
52 Institute of Validation Technology
Stephan O. Krause, Ph.D.

Figure 3
Process Flow for Development and Validation of a New Analytical Method


system suitability (see also Figure 4 for more detail). These criteria may therefore be more convincingly derived from numerical limits readily available (i.e., instrument specifications set by the manufacturer). The SP should further include guidelines for how the uncertainty of reported results (and therefore the number of significant digits used) is to be determined in AMVs. Whether this uncertainty estimation is derived from a propagation-of-error analysis, overall assay precision (intermediate precision), repeatability precision, or a combination of precision and accuracy (using an acceptable reference standard) may, at least for now, not be as important as the consistent implementation of and adherence to this procedure. It may, therefore, be advisable to keep instructions simple, so that they can be consistently understood and followed by everyone involved. Management must realize that this procedure does impact how final release results and product specifications are to be reported. The author will provide more detail in Part II.

Requalifications and Revalidations

The SP should define when and to what extent test systems are to be requalified or revalidated (depending also on the initial definition for qualification versus validation). Figure 1 and Figure 2 provide applications for each of these two classifications. A good practice would be to set timelines for test methods to be requalified or revalidated even when test systems perform as expected, as indicated by quarterly or annual QC trend reports. The requalifications or revalidations may then be performed only as partial requalifications and revalidations, demonstrating mainly accuracy and precision.

Whenever changes to the test system are required (i.e., an increase in invalid tests), a full requalification or revalidation may be required. Once compendial (USP) methods are modified or automated, these methods should then be validated. Changes to the product, product formulation, or product matrix are considered major changes by the regulatory agencies and require test method validations, since these changes may impact test system suitability. It is, at least partially, a business decision how these conditions are defined in the SP, since any additional AMV projects will require more resources from the AMV, QC, and QA staff, and may potentially interfere with a continuous production process. However, management should keep in mind that the expectations of regulatory agencies

Figure 4
Deriving Acceptance Criteria for the AMV Protocol
[Diagram: acceptance criteria for the AMV protocol are derived from major contributors (must be considered) through minor contributors, including ICH/USP/FDA guidelines; in-process and/or product specifications; instrument specifications; historical data; historical data of an alternative (equivalent) method; the maximum range of uncertainty in the test result; AMD data; QC routine testing data, previous validation data, and previous acceptance criteria (if revalidation); and the maximum number of reported significant figures. Groups involved: AMD, QA review and approval, AMV, QC laboratories, analytical support.]
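The intermediate-precision evaluation described under Acceptance Criteria and Statistics — an ANOVA across assay variability factors such as operators — can be sketched in a few lines. This is an illustration only: the operator names and recovery values below are invented, and a real AMV would apply the acceptance criteria defined in the SP.

```python
from statistics import mean

# Hypothetical intermediate-precision data: assay recovery (%) from
# three operators, n=3 runs each (values are illustrative only).
groups = {
    "operator_1": [99.1, 100.2, 99.6],
    "operator_2": [98.9, 99.8, 100.1],
    "operator_3": [99.4, 99.9, 100.3],
}

# One-way ANOVA by hand: partition total variability into
# between-operator and within-operator components.
all_values = [v for g in groups.values() for v in g]
grand_mean = mean(all_values)
k = len(groups)              # number of factor levels (operators)
n = len(all_values)          # total number of observations

ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
# Compare against the 95% F critical value for (k-1, n-k) = (2, 6) df, 5.14:
print(f"F = {f_stat:.3f}; operator effect significant: {f_stat > 5.14}")
```

If the F statistic exceeded the critical value, pairwise F-tests and Student's t-tests would follow to isolate the offending factor, as the text describes.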


are likely to be more demanding in the near future.

The AMV-AMD Interface

These two departments should be separate quality units, since one department alone should not be allowed to control both development and validation. This is simply the same GMP principle, from a compliance standpoint, by which QC validations must be approved by the QA department. Depending on the company size and requirements for AMVs, the AMV-AMD interface could be defined in the SP. Although it may not be necessary to include this in the SP, this interface should be carefully analyzed from a compliance and workflow (economic) point of view.

The AMV department should be involved in the review and approval process for developed methods, since it is expected to be the best scientific resource on which methods can be validated and will be sufficiently robust to withstand years of problem-free routine QC testing. Problematic assays should never reach the AMV department, since those will be difficult, if not impossible, to validate, and would cause many problems during routine QC testing. On the other hand, the AMD scientists are the method experts, and should be involved in the generation of acceptance criteria and other technical aspects of the validation protocol. Ultimately, the AMV department is responsible for the integrity of all data generated leading to the approved validations. A friendly and professional work environment between these departments is critical for efficiency, workflow, compliance, and the overall quality of final method validation packages, thus ensuring product quality and safety to the patient.

The AMV-QA Interface

The QA department oversees and controls the AMV department through its approval of all AMV documents. Whenever the number of AMV projects increases, the staffing of delegated personnel from QA management should increase proportionally to avoid hold-ups of AMV approvals. Although this may be logically concluded, companies often overlook this fact, and must then consequently pay the price of having AMV projects not completed on time, therefore potentially operating out of compliance. The QA department should clearly communicate its expectations for the AMV program, since it has the final word on why, when, and how things are to be completed.

In case the AMV department is permitted to run, according to the SP, with a relatively high level of autonomy, the department should educate QA on the principles of AMVs as performed. In many companies this may be the more likely case, since AMV scientists may be more experienced in how to practically plan and execute AMVs. The QA department will, and should always, have a certain level of control, as it naturally is the department that assures that activities are conducted with the utmost adherence to compliance and integrity. For the AMV department to be efficient, it should establish and maintain a professional work atmosphere, staffed with qualified personnel, where documents are reviewed timely, carefully, and objectively, since this is critical for timely project completion and compliance.

The validation protocol should be executed by trained QC operators who routinely perform the test procedure in the appropriate QC laboratory, using only qualified instruments and equipment. This ensures that validation samples are tested blindly, and that validation results are directly comparable to future routine testing results. Whenever validation projects lag behind manufacturing, or when revalidations are required (new formulation, matrix change, etc.), historical QC assay performance must be integrated into the AMV protocols.

The AMV department should also be involved in the analysis of periodic (i.e., yearly) QC test method evaluations to ensure that assays maintain suitability, as demonstrated in the AMV report. This is important because not all QC test methods may contain appropriate system suitability criteria (i.e., blanks, standards, controls, etc.). In addition, assay controls may "drift" over time from the historical average towards the control limits (i.e., ± 3 standard deviations), thereby not only potentially increasing the number of invalid results, but also indicating uncertainty in the overall test system suitability. Whenever test system suitability becomes uncertain (i.e., loss of accuracy and/or precision), the overall test system must be re-evaluated. Assay control drifts, for example, may be caused by unstable control material, or by changes in reagents or testing equipment (i.e., chromatography columns). In either case, overall test system suitability may be affected, and any required modifications to


qualified or validated test methods should be supported by corresponding requalifications or revalidations.

The SP should also state which procedure to follow in case validation samples are Out-of-Specification (OOS). This is important, since OOS results for validation samples or routine samples may otherwise be classified as (failing) retesting.

Process and method validation project planning and execution should be well orchestrated and run chronologically in parallel, to ensure that at some point the complete production and testing system is validated, and therefore compliant. The process of achieving overall compliance is somewhat analogous to the 'chicken and egg' paradox. Technically, the complete process cannot be validated as run during large-scale production without validated test procedures for in-process and final container testing. At the same time, test procedures cannot be validated for routine production samples when those are not produced under final validated production conditions.

Inspections by Regulatory Agencies

The category of the drug product (i.e., biological) will determine which agency branch (i.e., Center for Biologics Evaluation and Research [CBER], Team Biologics) will be responsible for conducting inspections. The FDA usually sends a team of inspectors, which may include a local representative from the Agency, depending on the district and the company being inspected. When an inspection will take place, and by which team members, can be predicted to some degree. It is also important to realize that different agencies communicate their inspection notes, and schedule their audits among themselves. Finally, inspection records should be provided to the AMV department, so that it can anticipate follow-up requests.

Once the inspectors are on-site, and have eventually devoted attention to the QC laboratories, senior management should have organized and delegated AMV or QA personnel to defend AMVs at that time. Delegated personnel should be familiar with the content of the AMV SP, and should understand all definitions therein. The inspectors usually (but not always) proceed systematically, requesting documents and clarifications, moving from general to specific procedures. The personnel presenting documents and clarifications to the inspectors have an impact on where, and in what detail, the inspection is heading. This is why it is so important to have in place a solid SP that is consistently followed. The inspectors may 'spot-check' particular AMVs once they have familiarized themselves with the SP(s). At this time, a list of all in-process and final container assays (customized for each particular agency), and their qualification/validation status (as defined in the SP), should be provided to the inspectors upon request.

Commitments to the agencies should be conservative, since these will usually require additional resources. As with the level of insurance invested through a solid AMV program, it is again a delicate balance to what degree companies should commit to complete AMVs, since instantly 'pleasing' the inspectors may not add any value to the AMV program and, therefore, not add to the safety of the patient. Potentially severe penalties (i.e., consent decree) may be imposed, and a loss of trust may develop if these commitments are not completed by the committed dates. With a solid SP in hand, the defending company delegate may not have to extensively commit to additional AMV projects.

Conclusions

GAMVP is a continuous improvement process. The industry must set new standards to cope with the increasing requirements set forth by regulatory agencies, which have, and will themselves, become more educated and experienced. The AMV SP must ensure that all test methods provide accurate and reliable results. The SP will be reviewed by auditors from reg-


ulatory agencies. Auditors will identify any lack of detail and guidance, and evident inconsistencies between the SP and the AMV package. The SP will demonstrate to the auditors that the AMV department can consistently control the quality of the validation process. This document directs the number and detail of qualifications and validations to be expected under normal operating conditions. Management must anticipate the level of AMVs required for continuous compliance. The validity and integrity of the test results must be properly demonstrated. Only then may companies produce and sell their product. ❏

Article Acronym Listing

ANOVA: Analysis Of Variance
AMD: Analytical Method Development
AMV: Analytical Method Validation
CBER: Center for Biologics Evaluation and Research
CFR: Code of Federal Regulations
C of A: Certificate of Analysis
DQ: Design Qualification
FDA: Food and Drug Administration
ICH: International Conference on Harmonization
IQ: Installation Qualification
GAMVP: Good Analytical Method Validation Practice
OOS: Out-of-Specification
OQ: Operational Qualification
PQ: Performance Qualification
QA: Quality Assurance
QC: Quality Control
R&D: Research & Development
SOP: Standard Operating Procedure
SP: Standard Practice/Procedure
USP: United States Pharmacopoeia

About the Author

Stephan O. Krause is the manager of the QC Analytical Validation department within the Biological Products division of Bayer HealthCare Corporation. He received a doctorate degree in bioanalytical chemistry from the University of Southern California. Dr. Krause can be reached by phone at 510-705-4191.

Suggested Reading

1. Code of Federal Regulations, 21 CFR Part 211. Current Good Manufacturing Practice for Finished Pharmaceuticals. 2001.
2. FDA. Guidelines for Validation. 1994.
3. FDA. Guide to Inspections of Pharmaceutical Quality Control Laboratories.
4. International Conference on Harmonization (ICH), Q2A. "Validation of Analytical Procedures." Federal Register. Vol. 60. 1995.
5. ICH, Q2B. "Validation of Analytical Procedures: Methodology." Federal Register. Vol. 62. 1996.
6. United States Pharmacopoeia, USP 25 <1225>. "Validation of Compendial Methods."
7. Pharmacopoeial Forum, PF <1223>. "Validation of Alternative Microbiological Methods." Vol. 28, No. 1. 2002.
8. Huber, L. Validation and Qualification in Analytical Laboratories. Interpharm Press. Englewood, CO. 1999.
INTERNATIONAL CONTRIBUTOR

Validating Immunoassays Using the Fluorescence Polarization Assay for the Diagnosis of Brucellosis
An Example and an Application of ISO Standards 9000 and 17025

By David Gall and Klaus Nielsen
Canadian Food Inspection Agency, Animal Disease Research Institute

The statistical methods presented in this paper are readily available in standard epidemiological texts. Unfortunately, many researchers, diagnosticians, and laboratorians either are unaware of or do not use these statistical techniques. Their use in validating immunoassays would greatly enhance confidence in the acceptance of a new assay at national and international levels. For regulatory agencies to accept and approve a new serological test at national or international levels, they must be assured that a new assay can distinguish true positive from true negative samples with minimum false positive and false negative results, with more accuracy than the currently used test. The best way to accomplish this is by using the various statistics (i.e., sensitivity, specificity, confidence limits, sample size, and kappa) that are available from the authors in an Excel® 97 template. For the purposes of this manuscript, brucellosis will be used as an example, with the fluorescence polarization assay as the new test to be validated. A review of the literature suggests that most of the current in-use assays for the diagnosis of brucellosis, which are often incorrectly called gold standards, have not been validated. Yet, new assays are often compared to these current in-use assays, resulting in misunderstanding or incorrect conclusions about the new assay. This may result in the non-acceptance of a perfectly useful assay, due to poor data or choice of statistics. This can be avoided with better test design and the use of appropriate statistical methods to compare and validate new tests against current in-use tests. With access to computers and software, it is now possible and easier to use statistics such as Receiver Operating Characteristics (ROC).
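As a minimal sketch of what ROC software computes, the area under the ROC curve (AUC, discussed later in this paper) can be obtained from ranked scores alone via the Mann-Whitney formulation. The assay readings below are invented for illustration only:

```python
# Hypothetical assay readings (e.g., millipolarization units); the values
# are invented for illustration only.
positives = [95, 110, 120, 130, 140]   # samples from infected animals
negatives = [70, 80, 85, 90, 100]      # samples from healthy animals

# AUC via the Mann-Whitney statistic: the probability that a randomly
# chosen positive sample scores higher than a randomly chosen negative.
wins = 0.0
for p in positives:
    for n in negatives:
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5   # ties count half
auc = wins / (len(positives) * len(negatives))
print(f"AUC = {auc:.2f}")   # AUC = 0.96
```

An AUC near 1.0 indicates nearly complete separation of the reference positive and negative populations; 0.5 corresponds to the chance line described later in the text.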


Thousands of samples can be measured, tabulated, and analyzed. An example is the validation of the fluorescence polarization assay, with a sensitivity of 99.4% (n = 1084; 98.7–99.7, 95% Confidence Limits [CL]) and a specificity of 99.8% (n = 23,754; 99.8–99.9, 95% CL). Meeting internationally accepted criteria for validation and measurements of uncertainty, as defined in ISO standards 9000 and 17025, will increase the likelihood of acceptance by international and national regulatory agencies. Failure to properly validate a new test may increase the opportunity for litigation.

What is validation? A review of scientific literature regarding general principles of validation is not common, but papers discussing validation of specific procedures or tests are numerous.1-7 In International Organization for Standardization (ISO) standard 8402, validation is defined as "the confirmation by examination and the provision of objective evidence that the particular requirements for a specific intended use are fulfilled."8 This definition is used in ISO standards 9000 and 17025, and in the earlier reference manual on validation of test methods proposed by the European Cooperation for Accreditation of Laboratories.9 For serological assays, Jacobson defined validation as an assay that consistently provided test results identifying animals as positive or negative and, by inference, accurately determining the disease status of animals with statistical certainty.10 In other words, using objective evidence, confirm through examination that the test has fulfilled its intended use.

The United States Animal Health Association (USAHA) proposed criteria for evaluating experimental brucellosis tests for specificity and sensitivity.11 However, new tests would have been subjected to criteria more rigorous than those required of official in-use tests by the United States Department of Agriculture (USDA). For the acceptance of new tests, the association recommended that new tests be more sensitive and/or specific than current in-use tests. If the sensitivity and/or specificity were comparable to the in-use tests, then they should be less costly, easier to conduct, and adaptable to automation. Similarly, Wright et al. suggested that a new test must be equal or superior concerning diagnostic performance.12 Nielsen et al. validated the Fluorescence Polarization Assay (FPA) for detection of antibodies to Brucella abortus using defined reference positive and negative samples.13 Previous to the FPA, the Indirect Enzyme Immunoassay (IELISA) and the Competitive Enzyme Immunoassay (CELISA) were similarly validated.14 The data were compared with the current in-use tests, thus meeting the ISO definition of examination and objective evidence; the USAHA criteria for better sensitivity, specificity, cost, performance, and adaptation to automation; and superior diagnostic performance as suggested by Wright et al.12

Before validation of an assay can commence, a developed test must undergo various verification procedures designed to provide objective evidence that specified requirements have been fulfilled.8 Optimization and standardization of an assay would be an essential part of this verification.14 Optimization of a serological assay would include optimizing concentrations or dilutions of reagents, determining variation between replicate samples (i.e., quality control and test sera), determining variation in background activity, determining an initial cutoff and, finally, determining initial test performance.14 Standardization would include a standardized assay format; standardized preparation of all reagents such as buffers, chemicals, and biological reagents; and strict adherence to protocols and procedures between laboratories using the same assay.14 Laboratories accredited under ISO 9000 and ISO 17025 standards are required to produce and show compliance with approved technical and operating procedures, thus facilitating validation of serological assays and technology transfer.

Test performance can be objectively measured in terms of repeatability and accuracy. Repeatability (a form of precision) is the ability of an assay or procedure to produce consistent results in repeated tests. Consistent results with quality control sera in a serological assay would be a good example of repeatability. Variation between replicate test samples and quality control sera would result in poor repeatability. The accuracy of an assay is the ability to identify positive and negative samples correctly. Noting that assays can be repeatable without being accurate, but not the reverse, is important.15 Using defined reference samples, the accuracy of an assay can be described as sensitivity and specificity.

Sensitivity is the ability of a test to produce a positive result when the sample is from a diseased animal. It is measured as the percentage of the population with the disease that has a positive result.16 Similarly,


specificity is the ability of a test to produce a negative result when the sample is from a healthy animal, measured as the percentage of the healthy population that has produced a negative result. The underlying prevalence of a disease does not influence sensitivity or specificity.16

Relative sensitivity and relative specificity should not be confused with sensitivity and specificity. Relative sensitivity is the ability of a test to produce a positive result compared with another test, or series of tests, producing a positive result. Relative specificity is the ability of a test to produce a negative result relative to another test or series of tests. The other tests' sensitivities and specificities should approach 100% if used to classify animals into the positive or negative category, resulting in a close approximation of the actual sensitivity and specificity.17, 18 If ascertaining the actual disease status is difficult or costly, relative sensitivity and specificity may be used.18 In comparing a new test that may be superior to another test or series of tests, the resultant relative sensitivity or specificity may be lower, resulting in false or wrong conclusions.

A known number of reference positive and negative samples, representative of the population under study, must be assembled for estimation of sensitivity and specificity. Reference positive and negative samples can be defined in any way, provided the definition is explicit.19 In veterinary medicine, reference positive and negative samples are usually determined using a "gold standard," which could be another test or procedure, or multiple tests or procedures.16 Ideally, reference positive samples should be obtained from individual animals of known disease status, or from animals whose herd status was known through disease status.12 However, obtaining samples of known disease status may not be possible, so the best available method, such as current in-use tests, could be used.16 For example, using immunologically based tests to examine a new immunologically based test is acceptable, but may introduce a bias against the new test: because the evaluation of the new test was limited to those animals selected by the other test, it may not be measuring the same antibody population, depending on the antigen used.10, 20 Similarly, a combination of other tests or procedures could be used to define reference negative samples. Once assembled, testing of the samples should be conducted blindly.

How many samples are required when assembling reference samples for validation? Failure to consider sample size may affect sensitivity and specificity estimates. Often, more samples are required to demonstrate small differences in sensitivity or specificity.19 For example, 15,000 samples would be required to distinguish between tests with specificity differences of 0.1%.21 More important, sample size affects the confidence limits for calculated values of sensitivities and specificities. Too few samples result in wide confidence limits, negating any usefulness of the data or resulting in the wrong conclusions about the new test. The greater the sample size, the better the confidence limits for sensitivity and specificity. More data and tighter confidence limits raise the confidence in the test to distinguish between positive and negative, especially near the cutoff. As well, a larger sample size increases the probability of being representative of the field population.

Associated with test performance is uncertainty of measurement, which describes the dispersion or variation of data about a value. Examples of measures of dispersion are range, standard deviation, and confidence intervals. Sensitivity and specificity estimates are subject to sampling variation, and as such should have CL, which are a measure of uncertainty as mentioned in ISO standard 17025.

Materials and Methods

Reference Samples

The FPA is an appropriate example of test validation.22 The defined positives (n = 1084) were samples selected from animals from which Brucella abortus was isolated from various tissues or secretions (gold standard). The defined negatives (n = 23,754) were randomly selected Canadian samples submitted for routine testing from animals with no previous clinical or epidemiological evidence of brucellosis. Canada was officially declared free of bovine brucellosis in 1985.

Statistical Data Analysis

Sensitivity, relative sensitivity, specificity, and relative specificity are determined as indicated in Figure 1 and Figure 2. A template in Excel® 97 is available from the authors.

A simple formula for estimating the lower and


upper confidence limits is presented in Figure 3.19, 23 A template in Excel 97 is available from the authors. The formula is a normal approximation that calculates 95 percent confidence limits, providing symmetrical confidence limits about the point estimate. When the point estimate approaches 100%, the confidence limit often exceeds 100%. This formula may be used when the sensitivity and specificity values are of moderate size (0.3 ≤ p ≤ 0.7).24 A better estimate of confidence limits is the non-symmetrical limits presented in Figure 4, especially if the sensitivity or specificity value is near zero or 100%.24 A template in Excel 97 is available from the authors for the non-symmetrical limits.

Frequently, results from known reference positive and negative samples overlap, creating a certain amount of uncertainty regarding the choice of a cutoff and the resultant sensitivity and specificity, as depicted in Figure 5.25 The exact placement of the cutoff is subjective. An otherwise excellent assay's chances for acceptance are reduced through a poor choice of cutoff. An ROC curve, plotting sensitivity against specificity results at various cutoff points (Figure 6), removes the subjectivity inherent in the frequency distributions shown in Figure 5. This is the only measure available that is uninfluenced by decision biases and prior probabilities, comparing different assays on a common, easily understandable scale.26 Figure 6 is also a graphic representation of the relationship between sensitivity and specificity estimates that is easily determined using ROC software.27 Each point on an ROC curve represents a two-by-two table of sensitivity and specificity estimates associated with a cutoff value, from the lowest to the highest value (Figure 6). Along the diagonal line, a true positive response equals a false positive response; this is often called the chance line. The greater the curve above this chance line, the better the discrimination between the reference positive and negative samples. As well, the Area Under the ROC Curve (AUC) is a good measure of detection, and is useful when distribution assumptions cannot be made or do not hold.25, 28 An AUC of 0.91 implies that a randomly selected sample from the positive group will test higher than a randomly selected sample from the negative group 91% of the time, with 95% certainty.29 ROC curve analysis can also compare the assay performance of two or more tests.30

A simple formula for determining sample size for sensitivity or specificity is presented in Figure 7, and is available from the authors in an Excel 97 template.10, 31 This formula is useful for surveys when an estimate of allowable error, sensitivity, or specificity is available, but could also be used to help a researcher determine an appropriate sample size for validating

Figure 1
Determination of Sensitivity and Specificity (Accuracy)

Test/Disease Status       Disease (+)             Disease (-)
New Test Positive (+)     A (true positives) a,e  B (false positives) c
New Test Negative (-)     C (false negatives) d   D (true negatives) b,f

a. True positives are those samples testing positive that have the disease.
b. True negatives are those samples testing negative that do not have the disease.
c. False positives are those samples testing positive that do not have the disease.
d. False negatives are those samples testing negative that have the disease.
e. Sensitivity (in percent) = (A / (A + C)) x 100
f. Specificity (in percent) = (D / (B + D)) x 100

Figure 2
Determination of Relative Sensitivity or Specificity

New Test/In Use Test(s)   In Use Test(s) (+)      In Use Test(s) (-)
New Test Positive (+)     A (true positives) a,e  B (false positives) c
New Test Negative (-)     C (false negatives) d   D (true negatives) b,f

a. True positives are those samples testing positive and classified positive by the in-use test(s).
b. True negatives are those samples testing negative and classified negative by the in-use test(s).
c. False positives are those samples testing positive and classified negative by the in-use test(s).
d. False negatives are those samples testing negative and classified positive by the in-use test(s).
e. Relative sensitivity (in percent) = (A / (A + C)) x 100
f. Relative specificity (in percent) = (D / (B + D)) x 100
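The Figure 1 arithmetic can be sketched in a few lines. The reference-positive counts below are the FPA example from the text (1,077 of 1,084 detected); the true-negative and false-positive counts are hypothetical, chosen only to be consistent with the 99.8% specificity reported for the 23,754 reference negatives.

```python
def sensitivity_specificity(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Sensitivity and specificity (in percent) from the 2x2 layout of Figure 1."""
    sensitivity = tp / (tp + fn) * 100   # (A / (A + C)) x 100
    specificity = tn / (fp + tn) * 100   # (D / (B + D)) x 100
    return sensitivity, specificity

# tp and fn are from the text; fp and tn are illustrative values consistent
# with the reported specificity.
sens, spec = sensitivity_specificity(tp=1077, fp=48, fn=7, tn=23706)
print(f"sensitivity = {sens:.1f}%  specificity = {spec:.1f}%")   # 99.4% / 99.8%
```

The same function applied to the Figure 2 counts would yield relative sensitivity and relative specificity, since only the interpretation of the columns changes.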


Figure 3
Calculation of Symmetrical 95% Confidence Limits for Sensitivity or Specificity, Using the Fluorescence Polarization Assay (FPA) as an Example

L = 1.96 x sqrt((p x q) / n)

L = the half-width of the 95% confidence interval; the lower and upper limits for p are p - L and p + L
p = the observed proportion (i.e., sensitivity or specificity value)
q = 1 - p
n = total number of samples

Example: In a reference positive population of 1084 animals infected with Brucella abortus, 1077 tested positive on the FPA while seven tested negative. The sensitivity point estimate (p) was 1077/1084 or 99.4%.
p = 0.994, q = 0.006, n = 1084
L = 1.96 x sqrt((0.994 x 0.006) / 1084) = 0.005
Lower 95% limit for p: 0.994 - 0.005 = 0.989 or 98.9%
Upper 95% limit for p: 0.994 + 0.005 = 0.999 or 99.9%

Figure 4
Calculation of Non-Symmetrical 95% Confidence Limits for Sensitivity or Specificity, Using the Fluorescence Polarization Assay (FPA) as an Example

PL = [2np + C² - 1 - C x sqrt(C² - 2 - 1/n + 4p(nq + 1))] / [2(n + C²)]
PU = [2np + C² + 1 + C x sqrt(C² + 2 - 1/n + 4p(nq - 1))] / [2(n + C²)]

PL = the lower 95% limit for p
PU = the upper 95% limit for p
p = the observed proportion (i.e., sensitivity or specificity value)
q = 1 - p
n = total number of samples
C = Cα/2, the 95% confidence coefficient of 1.96

Example: In a reference positive population of 1084 animals infected with Brucella abortus, 1077 tested positive on the new test while seven tested negative. The sensitivity point estimate (p) was 1077/1084 or 99.4%.
p = 0.994, q = 0.006, n = 1084
PL = 98.7%
PU = 99.7%

The allowable error is a percentage error, expressed as a decimal, allowed for the estimate of sensitivity or specificity. Using the formula, the data presented in Figure 8 are the estimates of the number of reference positive animals required for each sensitivity listed at the top of the figure. The allowable error is listed in the left column of the table (e.g., the expected estimate for FPA sensitivity is within 1% of the true level 95% of the time). In brackets are the calculated non-symmetrical 95% confidence limits, using the formula presented in Figure 4, for the expected sensitivities or relative sensitivities. The limits get wider as the sample size decreases. Similarly, the formula can be used to estimate the sample size required for specificity or relative specificity, as presented in Figure 9.10, 20, 32 Since specificity or relative specificity is expected to be higher than sensitivity, the ranges of specificity and allowable error are different. Due to the greater sample size, the confidence limits are narrow.

Results

The 95% CL presented in Figure 3 are calculated using the template provided in Excel 97. The limits are symmetrical and should only be used when either sensitivity or specificity is of moderate value (between 30% and 70%). If 99% confidence limits are required, 2.56 can be substituted for 1.96. From the example for the FPA, the lower 99% limit for p (the sensitivity point estimate) would be 98.8% versus 98.9% for the 95% CL, and the upper 99% limit for p would be 100% versus 99.9% for the 95% CL.

For the same data presented in Figure 3, the non-symmetrical limits calculated in Figure 4 for p at 95% CL are 98.7% for the lower limit and 99.7% for the upper limit. When the sensitivity or specificity value approaches 0% or 100%, non-symmetrical confidence limits should be used.

Figure 5
Frequency Distribution of Reference Positive and Negative Samples
Exact placement of the cutoff is subjective, affecting the sensitivity and specificity of the assay. [The figure shows overlapping frequency distributions of reference negative and reference positive samples under two cutoffs, (a) Cutoff 1 and (b) Cutoff 2, with the resulting true positive (TP), true negative (TN), false positive (FP), and false negative (FN) regions.]

Figure 6
Relationship of Sensitivity and Specificity Estimates; Fluorescence Polarization Assay (FPA), Optimal Cutoff 90 mP
Each point on a Receiver Operating Characteristics (ROC) curve represents a 2 x 2 table of sensitivity and specificity estimates associated with a cutoff value, from the lowest to the highest value for the FPA. The enlarged view more clearly shows the relationship of the sensitivity and specificity estimates. [The figure plots sensitivity against specificity for cutoff values of 67.6, 71.5, 77.2, 78.5, 90, 180.8, 218.2, 260.1, and 274 mP.]

Each point on a ROC curve represents a two by two table of true positive and false positive estimates associated with a cutoff value, from the lowest to the highest value, as presented in Figure 6 for the FPA. The sensitivity and specificity for the optimal cutoff value of 90 mP are 99.4% (98.7 - 99.7) and 99.8% (99.8 - 99.9), respectively. If sensitivity were favored over specificity, a cutoff value lower than 90 mP could be chosen, as illustrated in Figure 6, removing any possibility of false negatives. The converse would be true for specificity. For instance, the sensitivity and specificity for cutoff value 67.6 mP are 100% (100 - 100) and 20% (20.3 - 21.3), respectively, while for cutoff value 274 mP, the sensitivity and specificity are 20% (17.7 - 22.5) and 100% (100 - 100), respectively. The numbers in brackets after each point estimate for sensitivity and specificity are the 95% confidence intervals as determined by software.27 The AUC for the FPA, as presented in Figure 6 and determined by software, was 0.999 (0.999 - 1.000), indicating that a randomly selected sample from the positive group will test higher than a randomly selected sample from the negative group 99.9% of the time with 95% certainty.

Using the data presented in Figure 8, the sample size required for the FPA (expected sensitivity equal to 99 ± 1%) was 396 samples, giving confidence limits of 97.3% to 99.7%. The actual size of 1084 samples exceeded the number calculated, resulting in a sensitivity of 99.4% (98.7% to 99.7% CL). Similarly, the sample size required with an allowable error of 0.1% for a specificity of 99% was 39,600, with confidence limits of 98.9% to 99.1%, from Figure 9. The actual sample size was 23,754, giving a specificity of 99.8% with confidence limits of 99.8% to 99.9%.

As previously mentioned, sample size also affects the confidence limits for sensitivities and specificities. For example, using the formula illustrated in Figure 3, the range between the lower confidence limit (98.8%) and the upper confidence limit (100%) is 1.2% for a sample size of 600. Using the same


data, a sample size of 1084 results in a range of 1% between the lower (98.9%) and upper (99.9%) confidence limits, resulting in a higher confidence in the data. For a sample size of 10,000, the range is further reduced to 0.4%. For the FPA, the range is very tight for sensitivities and specificities, since the sample sizes for the reference positive and reference negative populations are large. To decrease the range of the confidence limits from 1% to 0.4% required increasing the sample size approximately nine-fold.

The flowchart presented in Figure 10 summarizes the steps required to validate an assay such as the FPA.

Figure 7
Formula for Calculation of the Number of Samples Required for Expected Sensitivity or Specificity
The allowable error is a percentage error expressed as a decimal for each estimate of sensitivity or specificity.

n = (4 x p x q) / L²

Where:
n = the number of samples required
p = the expected proportion
q = 1 - p
L = the allowable error
The number 4 is the approximate square of Z = 1.96, which provides a 95% confidence level; for a 99% confidence level, the number 6.6 (Z = 2.56) should be substituted.

Example: n = (4 x 0.99 x 0.01) / (0.01)² = 396

Conclusion

Not covered in this article is another statistic, known as "kappa," used to determine agreement between tests. Kappa may be used when the disease status is unknown and assembling known reference positive and negative samples is not possible. A kappa value of 0 suggests no agreement beyond chance, while a kappa value of 1 reflects perfect agreement. Generally, excellent agreement is greater than 0.7.16 The possibility exists, however, that agreement can occur by chance alone for two tests being compared when both exceed 50% in sensitivity and specificity.10 As well, the kappa value is affected by the prevalence of the disease in the population of interest. Two poorly validated tests may have good agreement simply because they are measuring the same antibody or antigen. Consequently, kappa statistics should be used cautiously. A template in Excel 97 is available from the authors for determination of kappa.

The other statistical techniques presented in this paper are available in standard epidemiological texts.31 However, the use of these techniques for validating immunoassays is sparse; both Martin and Swets et al. have alluded to this.18, 33 Use of these statistical techniques would increase the likelihood of successfully validating a new test, and of acceptance by regulatory agencies responsible for approving new diagnostic assays. Most new assays, and even assays currently in use for the serological diagnosis of brucellosis, such as the Rose Bengal Plate Test (RBPT) and the Tube Agglutination Test (TAT), have not been, or are not, validated.

Of the statistical techniques presented in this paper, the formula for calculation of sample size is the most important. Insufficient sample size could result in wrong conclusions about a newly developed assay or comparisons to other assays. At the very minimum, 300 samples should be selected for estimates of sensitivity or specificity.10 Obtaining the desired number or sample type is not always possible due to resource or logistical limitations, or prohibitive costs. Even when the ideal number and type of samples are not available, most assays can still be validated if the sample definitions are clear. Conclusions can be drawn from ROC analysis of as few as 100 samples.19 The CLs for the resulting sensitivities and specificities will, of course, be wider because of the fewer samples.

Another, longer-term approach to validation of an assay is the banking of samples to obtain the required number and type of samples.20 This approach has the added advantage of linking the past, present, and future, producing more reliable validation data, and may be the only method available for assays where collection of samples is difficult, hazardous, or costly.

Other factors that can influence validation of an assay are calibration and maintenance of equipment, analyst proficiency, training, and laboratory conditions.
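The kappa computation described above can be sketched from the two-test agreement table. This is an illustrative reimplementation (the argument names are our own labels; the authors' Excel 97 template remains the reference):

```python
def kappa(both_pos, only_test1, only_test2, both_neg):
    """Cohen's kappa: observed agreement between two tests beyond that
    expected by chance, from the four cells of the 2 x 2 agreement table."""
    n = both_pos + only_test1 + only_test2 + both_neg
    observed = (both_pos + both_neg) / n
    # Chance agreement is computed from the marginal totals of the table.
    expected = ((both_pos + only_test1) * (both_pos + only_test2)
                + (both_neg + only_test1) * (both_neg + only_test2)) / n ** 2
    return (observed - expected) / (1 - expected)

print(kappa(50, 0, 0, 50))    # 1.0 -- perfect agreement
print(kappa(25, 25, 25, 25))  # 0.0 -- no agreement beyond chance
```

By the rule of thumb quoted above, values greater than 0.7 would indicate excellent agreement, subject to the prevalence caveat noted in the text.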


Figure 8
Expected Sensitivity or Relative Sensitivity
Number of reference positive samples required for the expected estimates for sensitivity or relative sensitivity of a new test, with an allowable error for sensitivity or relative sensitivity within a percentage of the true level 95% of the time.

Allowable Error   85%a                90%                 95%a                99%
1%b               5100c               3600                1900                396e
                  (83.98 - 85.96%)d   (88.96 - 90.95%)    (93.90 - 95.92%)    (97.27 - 99.68%)
2%                1275                900                 475                 99
                  (82.89 - 86.89%)    (87.81 - 91.84%)    (92.53 - 96.71%)    (93.71 - 99.95%)
5%                204                 144                 76                  16
                  (79.18 - 89.46%)    (83.61 - 94.17%)    (86.71 - 98.44%)    (74.52 - 99.77%)
10%               51                  36                  19                  4
                  (71.65 - 92.99%)    (74.34 - 96.98%)    (72.23 - 99.77%)    (38.75 - 98.09%)

a. Estimates of sensitivity.
b. Allowable error for estimate of sensitivity or relative sensitivity within a chosen percentage of the true level 95% of the time.
c. Estimate of the number of reference positive samples required to achieve the expected sensitivity.
d. Non-symmetrical 95% confidence limits for each estimate of sensitivity or relative sensitivity.
e. Example from Figure 7.

Figure 9
Expected Specificity or Relative Specificity
Number of reference negative samples required for the expected estimates for specificity or relative specificity of a new test, with an allowable error for specificity or relative specificity within a percentage of the true level 95% of the time.

Allowable Error   90%a              95%               97%a              98%a              99%
0.01%b            36,000,000        19,000,000        11,640,000        7,840,000         3,960,000
                  (89.99 - 90.01)   (94.99 - 95.01)   (96.99 - 97.01)   (97.99 - 98.01)   (98.99 - 99.01)
0.1%              360,000           190,000           116,400           78,400            39,600
                  (89.90 - 90.10)   (94.90 - 95.10)   (96.90 - 97.10)   (97.90 - 98.10)   (98.90 - 99.09)
0.5%              14,400            7,600             4,656             3,136             1,584
                  (89.50 - 90.48)   (94.48 - 95.47)   (96.46 - 97.46)   (97.43 - 98.45)   (98.34 - 99.41)
1%                3,600c            1,900             1,164             784               396
                  (88.96 - 90.95)d  (93.90 - 95.92)   (95.81 - 97.87)   (96.68 - 98.82)   (97.27 - 99.68)

a. Estimates of specificity.
b. Allowable error for estimate of specificity or relative specificity within a chosen percentage of the true level 95% of the time.
c. Estimate of the number of reference negative samples required to achieve the expected specificity.
d. Non-symmetrical 95% confidence limits for each specificity and allowable error.
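The bracketed limits in Figures 8 and 9 can be reproduced from the formulas of Figures 3 and 4. The sketch below reimplements the printed formulas (the non-symmetrical form matches the continuity-corrected limits in Fleiss, reference 24); it is an illustration, not the authors' template:

```python
import math

def symmetric_limits(p, n, z=1.96):
    """Figure 3: p +/- z * sqrt(p * q / n)."""
    half = z * math.sqrt(p * (1.0 - p) / n)
    return p - half, p + half

def nonsymmetric_limits(p, n, z=1.96):
    """Figure 4: continuity-corrected score limits PL and PU."""
    q = 1.0 - p
    denom = 2.0 * (n + z ** 2)
    lower = (2*n*p + z**2 - 1
             - z * math.sqrt(z**2 - 2 - 1/n + 4*p*(n*q + 1))) / denom
    upper = (2*n*p + z**2 + 1
             + z * math.sqrt(z**2 + 2 - 1/n + 4*p*(n*q - 1))) / denom
    return lower, upper

# FPA example (p = 0.994, n = 1084): symmetrical limits 98.9-99.9%,
# non-symmetrical limits 98.7-99.7%, as quoted in the Results.
lo, hi = nonsymmetric_limits(0.994, 1084)
print(round(100 * lo, 1), round(100 * hi, 1))  # 98.7 99.7

# Figure 8 cell: sensitivity 99%, allowable error 1% -> n = 396,
# non-symmetrical limits (97.27 - 99.68%).
lo8, hi8 = nonsymmetric_limits(0.99, 396)
```

Running the non-symmetrical formula over each (expected proportion, sample size) pair regenerates the confidence limits tabulated in footnote d of both figures.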

Data produced from instrumentation that is not well maintained or calibrated is suspect, and will affect the validation outcome. The estimates of sensitivity and specificity, and the ROC analysis, could be erroneous because of poorly maintained and calibrated equipment. In the wrong hands, a new test that performed well in the researcher's laboratory fails because the analyst evaluating the test was not properly trained


regarding the test, or does not have the suitable background for evaluating the test. As a result, the test does not gain acceptance. The analyst may not be proficient due to poor laboratory techniques, such as improper mixing or pipetting of samples; this can be addressed by using proficiency panels to help the analyst become proficient. Laboratory conditions can also affect the validation of assays: improper storage of samples, for example, affects the quality of the samples and of the resultant data produced. The above factors are covered in the ISO 9000 and ISO 17025 standards. Laboratories accredited to these standards are required to maintain records, thus improving the likelihood of proper validation.

Figure 10
Flowchart for Validating an Immunoassay with Statistical Confidence
Assemble Reference Sera (Both -ve and +ve) ->
Test Sera on New and Standard Assays ->
Input Data Into Database ->
Perform ROC Analysis Using Software to Decide Cutoff, Sensitivity and Specificity ->
Make a Decision Regarding the Value of the New Test

The terms validation and validated assay elicit various interpretations and responses.34-36 In veterinary medicine, a validated assay is one that consistently provides results that correctly identify samples as positive or negative.10 To other researchers, validation is either a time-limited process or an ongoing monitoring of assay performance. Validation is also defined as "the confirmation by examination and the provision of objective evidence that the particular requirements for a specific intended use are fulfilled."8, 37, 38 The Food and Drug Administration (FDA) defines the term as "establishing documented evidence which provides a high degree of assurance that a specific process will consistently produce a product meeting its predetermined specifications and quality attributes."39 Regardless of which definition is used, statistical analysis of the data increases the scientific integrity of the assay, thereby helping it gain national and international acceptance. ❏

About the Author
David Gall and Klaus Nielsen of the Canadian Food Inspection Agency, Animal Disease Research Institute, have been involved in the development, optimization, standardization, quality control, validation, and technology transfer of primary binding assays, such as enzyme immunoassays (ELISA) and the fluorescence polarization assay (FPA), for the past 20 years. Gall and Nielsen have been involved in the successful transfer and acceptance of immunoassays at the national and international levels. They can be reached by phone at 613-228-6698, and by fax at 613-228-6667. David Gall can be reached by e-mail at gall@[Link].

References
1. Biancifiori, F., Garrido, F., Nielsen, K., Moscati, L., Duran, M. and Gall, D. "Assessment of a Monoclonal Antibody-Based Competitive Enzyme Linked Immunosorbent Assay (CELISA) for Diagnosis of Brucellosis in Infected and REV 1 Vaccinated Sheep and Goats." Microbiologica. Vol. 23. 2000. pp. 399-406.
2. Gall, D., Nielsen, K., Forbes, L., Davis, D., Elzer, P., Olsen, S., Balsevicius, S., Kelly, L., Smith, P., Tan, S. and Joly, D. "Validation of the Fluorescence Polarization Assay and Comparison to Other Serological Assays for the Detection of Serum Antibodies to Brucella abortus in Bison." Journal of Wildlife Diseases. Vol. 36, No. 3. 2000. pp. 469-476.
3. Paulo, P. S., Vigliocco, A., Ramondino, R. F., Marticorena, D., Bissi, E., Briones, G., Gorchs, C., Gall, D. and Nielsen, K. "Evaluation of Primary Binding Assays for Presumptive Serodiagnosis of Swine Brucellosis in Argentina." Clinical and Diagnostic Laboratory Immunology. Vol. 7, No. 5. 2000. pp. 828-831.
4. Samartino, L., Gregoret, R., Gall, D. and Nielsen, K. "Fluorescence Polarization Assay: Application to the Diagnosis of Bovine Brucellosis in Argentina." Journal of Immunoassay. Vol. 20, No. 3. 1999. pp. 115-126.
5. Gall, D., Colling, A., Marino, O., Moreno, E., Nielsen, K., Perez, B. and Samartino, L. "Enzyme Immunoassays for Serological Diagnosis of Bovine Brucellosis: A Trial in Latin America." Clinical and Diagnostic Laboratory Immunology. Vol. 5, No. 5. 1998. pp. 654-661.
6. Vanzini, V. R., Aguirre, N., Lugaresi, C. I., de Echaide, S. T., de Canavesio, V. G., Guglielmone, A. A., Marchesino, M. D. and Nielsen, K. "Evaluation of Indirect ELISA for the Diagnosis of Bovine Brucellosis in Milk and Serum Samples in Dairy Cattle in Argentina." Preventive Veterinary Medicine. Vol. 36. 1998. pp. 211-217.
7. Nielsen, K. H., Kelly, L., Gall, D., Balsevicius, S., Bossé, J., Nicoletti, P. and Kelly, W. "Comparison of Enzyme Immunoassays for the Diagnosis of Bovine Brucellosis." Preventive Veterinary Medicine. Vol. 26. 1996b. pp. 17-32.
8. International Organization for Standardization (ISO 8402, 1994). Quality Management and Quality Assurance: Vocabulary.
9. European Cooperation for Accreditation of Laboratories (1997). Validation of Test Methods: General Principles and Concepts. EA-2/06. 37, rue de Lyon, FR-75012 Paris, France.
10. Jacobson, R. H. "Validation of Serological Assays for Diagnosis of Infectious Diseases." Rev. Sci. Tech. Off. Int. Epiz. Vol. 17, No. 2. 1998. pp. 469-486.
11. The United States Animal Health Association, Brucellosis Scientific Advisory Committee. Critique of Proposed "Criteria for Evaluating Experimental Brucellosis Tests for Specificity and Sensitivity." Appendix F. 1987. pp. 290-293.
12. Wright, P. F., Nilsson, E., Van Rooij, E. M. A., Lelenta, M. and Jeggo, M. H. "Standardisation and Validation of Enzyme-Linked Immunosorbent Assay Techniques for the Detection of Antibody in Infectious Disease Diagnosis." Rev. Sci. Tech. Off. Int. Epiz. Vol. 12, No. 2. 1993. pp. 435-450.
13. Nielsen, K., Gall, D., Jolley, M., Leishman, G., Balsevicius, S., Smith, P., Nicoletti, P. and Thomas, F. "A Homogeneous Fluorescence Polarization Assay for Detection of Antibody to Brucella abortus." Journal of Immunological Methods. 1996c. pp. 161-168.
14. Nielsen, K., Gall, D., Kelly, W., Vigliocco, A., Henning, D. and Garcia, M. (1996a). Immunoassay Development: Application to Enzyme Immunoassay for the Diagnosis of Brucellosis. Agriculture and Agri-Food Canada, 3851 Fallowfield Road, Nepean, Ontario, Canada K2H 8P9. ISBN 0-662-24163-0.
15. Cannon, R. M. and Roe, R. T. (1982). Livestock Disease Surveys: A Field Manual for Veterinarians. Australian Government Publishing Service, Canberra, Australia. ISBN 0-644-02101-2.
16. Martin, S. W. "The Interpretation of Laboratory Results." Veterinary Clinics of North America: Food Animal Practice. Vol. 4, No. 1. 1988. pp. 61-78.
17. Baldock, F. C. (1988). "Epidemiological Evaluation of Immunological Tests." In: ELISA Technology in Diagnosis and Research. Ed. Graham Burgess. James Cook University of North Queensland, Townsville, Australia. pp. 217-22.
18. Martin, S. W. (1977). "The Evaluation of Tests." Canadian Journal of Comparative Medicine. Vol. 41, No. 1. pp. 19-25.
19. Metz, C. "Basic Principles of ROC Analysis." Seminars in Nuclear Medicine. Vol. 8, No. 4. 1978. pp. 283-298.
20. McNab, B. (1991). General Concepts for Evaluating Test Performance and Making Decisions: Notes for Program Designers, Managers and Researchers. Agriculture and Agri-Food Canada, Animal and Plant Health Directorate, Nepean, Ontario, Canada. (Unpublished.)
21. Crofts, N., Maskill, W. and Gust, I. D. (1988). "Evaluation of Enzyme-Linked Immunosorbent Assays: A Method of Data Analysis." Journal of Virological Methods. Vol. 22. p. 139.
22. Nielsen, K. and Gall, D. "Fluorescence Polarization Assay for the Diagnosis of Brucellosis: A Review." Journal of Immunoassay and Immunochemistry. Vol. 22, No. 3. 2001. pp. 183-201. (In publication.)
23. Remington, R. D. and Schork, M. A. (1970). Statistics with Applications to the Biological and Health Sciences. Prentice-Hall Inc., Englewood Cliffs, N.J., USA.
24. Fleiss, J. L. (1981). Statistical Methods for Rates and Proportions. 2nd Edition. Wiley Series in Probability and Mathematical Statistics. New York, USA.
25. Erdreich, L. S. and Lee, E. T. "Use of Relative Operating Characteristic Analysis in Epidemiology: A Method for Dealing with Subjective Judgement." American Journal of Epidemiology. Vol. 114, No. 5. 1981. pp. 649-662.
26. Swets, J. A. "Measuring the Accuracy of Diagnostic Systems." Science. Vol. 240, No. 4857. 1988. pp. 1285-1293.
27. Schoonjans, F., Zalata, A., Depuydt, C. E. and Comhaire, F. H. "MedCalc: A New Computer Program for Medical Statistics." Computer Methods and Programs in Biomedicine. Vol. 48. 1995. pp. 257-262.
28. Swets, J. A. "ROC Analysis Applied to the Evaluation of Medical Imaging Techniques." Investigative Radiology. Vol. 14. 1979. pp. 109-121.
29. Zweig, M. H. and Campbell, G. "Receiver-Operating Characteristic (ROC) Plots: A Fundamental Evaluation Tool in Clinical Medicine." Clinical Chemistry. Vol. 39. 1993. pp. 561-577.
30. Griner, P. F., Mayewski, R. J., Mushlin, A. I. and Greenland, P. "Selection and Interpretation of Diagnostic Tests and Procedures." Annals of Internal Medicine. Vol. 94, No. 4. 1981. pp. 555-600.
31. Martin, S. W., Meek, A. H. and Willeberg, P. (1987). Veterinary Epidemiology: Principles and Methods. Iowa State University Press, Ames, Iowa, USA. ISBN 0-8138-1856-7.
32. McNab, B. (1997). Basic Principles of Evaluating Test Performance for Making Decisions. Ministry of Agriculture, Food and Rural Affairs, Epidemiology and Risk Assessment, Guelph, Ontario, Canada. [Link] frameworks/[Link].
33. Swets, J. A., Dawes, R. M. and Monahan, J. "Better Decisions Through Science." Scientific American. 2000. pp. 82-85.
34. Green, C. "Biological Method Validation: A Practical Approach." Journal of Validation Technology. Vol. 4, No. 3. 1998.
35. Noy, R. J. "Quality Assurance of Validation Data." Journal of Validation Technology. Vol. 3, No. 2. 1997. pp. 148-156.
36. Possa, C. "How Much Validation is Enough? The Balancing Act." Journal of Validation Technology. Vol. 3, No. 2. 1997. p. 188.
37. Mathers, D. D., Marshall, R. T., Caillibot, P. F. and Phipps, M. J. (1994). The ISO 9000 Essentials: A Practical Handbook for Implementing the ISO 9000 Standards. Canadian Standards Association, Mississauga, Ontario, Canada. ISBN 0-921347-40-5.
38. General Requirements for the Competence of Testing and Calibration Laboratories (2000). Document CAN-P-4D (ISO 17025). Standards Council of Canada, Ottawa, Ontario, Canada.
39. Ferrante, M. "A Simple Way to Establish Acceptance Criteria for Validation Studies." Journal of Validation Technology. Vol. 3, No. 4. 1999.

Article Acronym Listing

AUC: Area Under the ROC Curve
CELISA: Competitive Enzyme-Linked Immunosorbent Assay
CL: Confidence Limits
FDA: Food and Drug Administration
FPA: Fluorescence Polarization Assay
IELISA: Indirect Enzyme-Linked Immunosorbent Assay
ISO: International Organization for Standardization
RBPT: Rose Bengal Plate Test
ROC: Receiver Operating Characteristics
TAT: Tube Agglutination Test
USAHA: United States Animal Health Association
USDA: United States Department of Agriculture

