insight commentary

Virtual screening of chemical libraries


Brian K. Shoichet
Department of Pharmaceutical Chemistry, University of California, 600 16th Street, San Francisco, California 94143-2240, USA
(e-mail: [email protected])

Virtual screening uses computer-based methods to discover new ligands on the basis of biological
structures. Although widely heralded in the 1970s and 1980s, the technique has since struggled to meet
its initial promise, and drug discovery remains dominated by empirical screening. Recent successes in
predicting new ligands and their receptor-bound structures, and better rates of ligand discovery compared
to empirical screening, have re-ignited interest in virtual screening, which is now widely used in drug
discovery, albeit on a more limited scale than empirical screening.

The dominant technique for the identification of new lead compounds in drug discovery is the physical screening of large libraries of chemicals against a biological target (high-throughput screening). An alternative approach, known as virtual screening, is to computationally screen large libraries of chemicals for compounds that complement targets of known structure, and experimentally test those that are predicted to bind well. Such receptor-based virtual screening faces several fundamental challenges, including sampling the various conformations of flexible molecules and calculating absolute binding energies in an aqueous environment. Nevertheless, the field has recently had important successes: new ligands have been predicted along with their receptor-bound structures, in several cases with hit rates (ligands discovered per molecules tested) significantly greater than with high-throughput screening. Even with its current limitations, virtual screening accesses a large number of possible new ligands, most of which may then be simply purchased and tested. For those who can tolerate its false-positive and false-negative predictions, virtual screening offers a practical route to discovering new reagents and leads for pharmaceutical research.

Problems with virtual screening
A founding idea in molecular biology was that biological function follows from molecular form. If you knew the molecular structure of a receptor (defined here as a biological macromolecule that converts ligand binding into an activity), you could understand and predict its function. This notion has underpinned a 70-year project to determine receptor structures to atomic resolution. From the early X-ray diffraction studies of pepsin and of haemoglobin, to those of macromolecular assemblies like the ribosome and to structural genomics, the taxonomic part of this enterprise (that is, cataloguing receptor structures) has been extraordinarily successful. But still largely unfulfilled is the promise of exploiting receptor structures to discover new ligands that modulate the activities of these molecules and macromolecular assemblies.

As early as the mid-1970s, investigators suggested that computational simulations of receptor structures and the chemical forces that govern their interactions would enable ‘structure-based’ ligand design and discovery1,2. Ligands could be designed on the basis of the receptor structure alone, which would free medicinal chemistry from the tyranny of empirical screening, substrate-based design and incremental modification. Since then, structure-based design has contributed to and even motivated the development of marketed drugs3,4, such as the human immunodeficiency virus (HIV) protease inhibitor Viracept and the anti-influenza drug Relenza, typically through cycles of modification and subsequent experimental structure determination. Computational modelling has been used extensively in these efforts5,6 and indeed in non-receptor-based methods; for example, when searching for new ligands on the basis of their chemical similarity to a known ligand or when matching candidate molecules to a ‘pharmacophore’ that represents the chemical properties of a series of known ligands7. But until recently there have been few instances of completely new ligands (not resembling those previously known) discovered directly from receptor-based computation. Although there are now many more and much better receptor structures than there were in the 1970s and 1980s, and computer speed has grown exponentially, drug discovery and chemical biology remain dominated by empirical screening and substrate-based design.

Three problems have impeded progress in receptor-guided explorations of ligand chemistry. First, chemical space is vast but most of it is biologically uninteresting: blank, lightless galaxies exist within it into which good ideas at their peril wander. Constraining the number of chemical compounds that are searched to biologically relevant and synthetically accessible molecules remains an area of active research. Second, receptor structures are complicated, resembling “tangled knot(s) of viscera”8. They consist of several thousand atoms, each of which is more or less free to move, and they frequently change shape and solvent structure upon binding to a ligand. To predict what molecules might be recognized by a given receptor, energetically accessible receptor and ligand conformations should be calculated. Unfortunately, the number of possible conformations rises exponentially with the number of rotatable bonds, of which there are thousands in a protein–ligand complex, and the full sampling of conformations involves a set of computational problems for which no general solution is known. Third, calculating ligand–receptor binding energies is difficult9. Binding affinity in an aqueous environment is determined by the solvation energies of the individual molecules (high solvation energies typically disfavour binding), and by the interaction energies between them (high interaction energies favour binding). Solvation and interaction energies are both typically much larger in magnitude than the net affinity, making calculation of the latter problematic. Although it has been possible to calculate accurately the differential affinity between two related ligands using thermodynamic integration methods, doing so is time consuming. Calculating the absolute affinities for many thousands of unrelated molecules necessary to encode new chemical functionality remains beyond our reach. So in principle, it could be argued that structure-based computational screens for new ligands do not work at all.
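The second and third problems can be made concrete with back-of-the-envelope arithmetic. The sketch below is illustrative only and is not taken from any docking program. It assumes three torsional states per rotatable bond, a sign convention in which favourable interaction energy is negative and the desolvation cost is positive, and hypothetical energy magnitudes. The function names are invented for this example.

```python
def conformer_count(rotatable_bonds: int, states_per_bond: int = 3) -> int:
    """Number of discrete conformers if each rotatable bond independently
    adopts one of `states_per_bond` torsional states (an idealization)."""
    if rotatable_bonds < 0 or states_per_bond < 1:
        raise ValueError("need rotatable_bonds >= 0 and states_per_bond >= 1")
    return states_per_bond ** rotatable_bonds


def net_affinity(interaction: float, desolvation: float) -> float:
    """Toy net binding energy in kcal/mol. Sign convention (assumed here):
    favourable interaction energy is negative, desolvation cost is positive."""
    return interaction + desolvation


def worst_case_error(terms, relative_error: float) -> float:
    """Largest possible error in a sum when each term is off by up to
    `relative_error` of its own magnitude."""
    return relative_error * sum(abs(t) for t in terms)


if __name__ == "__main__":
    # Exponential growth of the conformational search space.
    for bonds in (5, 10, 20):
        print(f"{bonds} rotatable bonds: {conformer_count(bonds):,} conformers")

    # Hypothetical magnitudes, chosen only to show two large terms cancelling.
    interaction, desolvation = -60.0, 52.0
    net = net_affinity(interaction, desolvation)
    err = worst_case_error([interaction, desolvation], relative_error=0.05)
    print(f"net affinity {net:.1f} kcal/mol, worst-case error {err:.1f} kcal/mol")
```

Under these assumed numbers, a 5% error in each large term gives an uncertainty of the same order as the net affinity itself. This is why a small difference between two large energies is hard to calculate reliably.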
862 NATURE | VOL 432 | 16 DECEMBER 2004 | www.nature.com/nature
©2004 Nature Publishing Group

Figure 1 Complexes predicted from virtual screening compared to X-ray crystallographic structures that were subsequently determined. a, Predicted (carbons in grey) and experimental (green) structures for Sustiva in HIV reverse transcriptase10. b, Predicted (magenta) and experimental (carbons in grey) structures of 2,3,4-trimethylthiazole in the W191G cavity of cytochrome c peroxidase11. c, Predicted (carbons in green)12 and experimental structure (carbons in grey) of an amprenavir mimic in HIV protease (ligands with thick bonds, enzyme residues with thin bonds; structure determined by A. Wlodawer, A. Olson, personal communication).

Successes from virtual screening
However, genuinely novel ligands have been discovered using structure-based computation. Recently, the structures of known ligands in complex with their receptors have been correctly predicted computationally using the structures of the independent receptor and ligand molecules10–12 (Fig. 1). From the standpoint of exploring chemical space, computational screens of chemical databases have identified new ligands for over 50 receptors of known or even, in some cases, computer-modelled structures13,14 (for reviews of recent studies and methods see refs 15 and 16). In these virtual or ‘docking’ screens, large libraries of organic molecules are docked into receptor structures and ranked by the calculated affinity (Fig. 2). Although the energy calculations are crude, the compounds in the library are readily available, making experimental testing easy and false-positives tolerable5.

Figure 2 Virtual screening for new ligands. [Figure panels: Dock; Test predictions.] Large libraries of available, often purchasable, compounds are docked into the structure of receptor targets by a docking computer program. Each compound is sampled in thousands to millions of possible configurations and scored on the basis of its complementarity to the receptor. Of the hundreds of thousands of molecules in the library, tens of top-scoring predicted ligands (hits) are subsequently tested for activity in an experimental assay.

Even relatively simple receptor-based constraints can improve the likelihood of finding ligands from among the many possible structures in a library, if only by screening out those that are unlikely to bind the receptor17. In library design, for instance, pre-calculation of possible side chains that would complement a receptor structure resulted in structure-based libraries that were tenfold more likely to contain ligands than random18 or diverse17 libraries constructed at the same time. Similarly, virtual and high-throughput screening have been deployed simultaneously to discover new ligands from libraries of several-hundred-thousand diverse molecules. The virtual screens had ‘hit rates’ (defined as the number of compounds that bind at a particular concentration divided by the number of compounds experimentally tested) that were 100-fold to 1,000-fold higher than those achieved by empirical screens19,20 (Table 1); intriguingly, each technique discovered classes of ligands that the other technique had overlooked19, suggesting that the two screening approaches (virtual and empirical) can be complementary.

In a few cases the structures of the new ligands in complex with the receptors have been subsequently determined experimentally, typically by X-ray crystallography. Although the docking-derived hits are very different from natural ligands for a given receptor, they often bind at the active site, interacting with conserved receptor groups, as predicted by the docking program21–24 (Fig. 3). From a molecular recognition perspective, this suggests that the structural ‘code’ for binding is plastic in that multiple ligand scaffolds can be recognized by the same receptor site. Methodologically, these structures suggest that although virtual screens are plagued by false-positives, in favourable circumstances they can predict genuinely novel ligands and do so for the right reasons.

How can these successes be reconciled with the field’s methodological weaknesses? Virtual screening avoids the problem of broad searches of chemical space by restricting itself to libraries of specific, accessible compounds (often those that can simply be purchased). This avoids costly syntheses and restricts the search to compounds that are interesting enough biologically to have been previously made, albeit for another reason. Filters may be applied to ensure that the library meets some standard of biological relevance or ‘drug-likeness’25,26. Progress in both the number and quality of molecules in docking libraries has contributed to the increasingly drug-like character of docking hits in recent studies19. Although the problems of sampling molecular conformations and of calculating affinities remain acute, progress has been made both algorithmically16 and in the computer resources available for these calculations. Moreover, we can define success in virtual screening as ‘finding some interesting new ligands’, and not as ‘correctly ranking all the molecules in the library’ or ‘finding all the possible ligands in a library’. Virtual
Table 1 Hit rates and drug-like properties for inhibitors discovered with high-throughput screening (HTS) and virtual screening (docking) against the enzyme PTP-1B (ref. 19)

Technique   Compounds tested   Hits with IC50 < 100 µM   Hits with IC50 < 10 µM   Lipinski-compliant hits*   Hit rate†
HTS         400,000            85                        6                        23                         0.021%
Docking     365‡               127                       18                       57                         34.8%

*Number of 100 µM or better inhibitors that passed all four of the drug-like criteria identified in Lipinski’s ‘rule of five’25.
†The number of compounds with IC50 values of 100 µM or less divided by the number of compounds experimentally tested.
‡The number of top-scoring docking hits that were experimentally tested.
IC50, the concentration of inhibitor at which the enzyme is 50% inhibited.
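The hit-rate column of Table 1 follows directly from the counts in the table. As a minimal check, the sketch below recomputes it as hits at IC50 of 100 µM or better divided by compounds tested. The names `hit_rate` and `TABLE_1` are illustrative and do not come from the cited study.

```python
def hit_rate(hits: int, tested: int) -> float:
    """Fraction of experimentally tested compounds that were hits."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    if not 0 <= hits <= tested:
        raise ValueError("hits must lie between 0 and tested")
    return hits / tested


# (hits with IC50 <= 100 uM, compounds tested), copied from Table 1.
TABLE_1 = {
    "HTS": (85, 400_000),
    "Docking": (127, 365),
}

if __name__ == "__main__":
    for technique, (hits, tested) in TABLE_1.items():
        rate_percent = 100 * hit_rate(hits, tested)
        print(f"{technique}: {hits} hits / {tested:,} tested = {rate_percent:.3f}%")
```

Rounded to the precision used in the table, these counts give 0.021% for HTS and 34.8% for docking.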

screening thus adopts the same logic as high-throughput screening: as long as some interesting ligands are found, false-negatives are tolerated. Indeed, the two techniques, because of their emphasis on large libraries, share other similarities: both accept limited accuracy in return for screening on a large scale; both look to enrich a list of likely-but-not-certain candidates for further quantitative study; and both are dogged by curious false-positive hits27. Although high-throughput screening remains the dominant technique, virtual screening is now commonly used in pharmaceutical research.

Finally, it must be admitted that these successes retain an episodic character. Even expert practitioners are frequently surprised and sometimes disappointed. Geometries of true ligands may be slightly (Fig. 3e)28 or conspicuously (Fig. 3f)29 mis-predicted and hit rates can vary greatly. We have had hit rates as high as 35% (ref. 19) against an enzyme, protein tyrosine phosphatase 1B (PTP1B), with which we had little experience, and as low as 5% (ref. 22) against an enzyme, AmpC β-lactamase, that we had studied intensely. For many medicinal chemists and structural biologists, such unpredictability lends a whiff of sulphur to an enterprise that has been advertised as ‘rational drug design’.

Prospects
Notwithstanding these caveats, virtual screening will be an ever more important tool for exploring biologically relevant chemical space. Large high-throughput screens have liabilities of their own, and are inaccessible to many investigators (although this will begin to change with the advent of screening resource centres30). In contrast, virtual screening processes large libraries (in principle, libraries that are larger than any library used by empirical screening) and any receptor for which there is a structure at little cost. What advances

Figure 3 Comparing the structures of new ligands predicted from virtual screening to the structures subsequently determined experimentally. a, The docked (carbons in orange) versus the crystallographic structure (carbons in grey) of the 8.3 µM inhibitor 4-aminophthalhydrazide bound to transfer RNA guanine transglycosylase (ligand in the centre surrounded by enzyme residues)21. b, The docked (carbons in cyan) versus the crystallographic structure (carbons in grey) of the 100 µM ligand phenol bound to a cavity site in T4 lysozyme (ligand in the centre surrounded by the molecular surface of the surrounding protein residues)24. c, The docked (carbons in green) versus the crystallographic structure (carbons in red) of the 26 µM inhibitor 3-((4-chloroanilino)-sulphonyl)-thiophene-2-carboxylate bound to AmpC β-lactamase (enzyme carbons in grey)22. d, The docked (carbons in magenta), re-scored (carbons in cyan) and crystallographic (carbons in grey) structures of a 0.25 µM inhibitor bound to carbonic anhydrase (enzyme carbons in grey)23. Oxygen atoms in red, sulphurs in yellow, nitrogens in blue. e, The docked (ligand carbons in grey) versus the crystallographic structure (ligand carbons in orange) for a new inhibitor of aldose reductase (enzyme carbons in green). Electron density maps for the ligand are shown in blue. The ordered water (red sphere) observed in the experimental structure was not considered in the docking28 (H. Steuber and G. Klebe, unpublished work). f, The docked (carbons in cyan) versus the crystallographic structure (carbons in yellow) of the new inhibitor of TEM-1 β-lactamase (enzyme in magenta)29. The experimentally observed binding mode, 16 Å from the active site targeted in the docking calculations, occurs in a cryptic site absent from the native structure.

might be anticipated to make virtual screening reliable and accessible enough to be widely used?

Improved sampling and ‘scoring functions’ (calculations of ligand–receptor energetics) will undoubtedly help. The good news is that the fundamentals of molecular interactions are well understood, and so the field has a clear way forward. But the challenge, as always, will be to implement good physical models for hundreds of thousands of possible ligands, each one sampled in many thousands of possible receptor complexes. Indeed, accurate calculation of absolute binding affinity in screens of large, diverse libraries will remain beyond us for the foreseeable future; even predicting the rank order of affinity for disparate ligands in a hit list will be difficult. What we may anticipate are improved explorations of conformational states for ligand and receptor, and scoring functions that use more sophisticated models of solvation and a better balance of electrostatic and non-polar terms. An interesting strategy will be the use of higher-level, typically much slower methods to re-score initial hits from virtual screening, using the screening calculation as a fast first filter31. From these we can hope for better hit rates and better predictions of geometries23 (Fig. 3d), which are the first and most important goals of virtual screening.

To bring virtual screening to a wide community it will be important to democratize the resources on which it depends. Receptor structures are already available through the Protein Data Bank or PDB (for experimental structures), and through databases such as MODBASE (for a much larger number of structures from computer-based modelling32). Several groups provide docking programs without charge to the academic community, although these programs often require some effort to learn. Programs less demanding of expert knowledge, perhaps as a web-accessible resource, would bring docking to many interested non-specialists. Finally, community-accessible chemical libraries are needed. The National Cancer Institute (NCI) provides calculated structures for about 140,000 of its compounds, and will provide at least some of these for experimental testing (http://cactus.nci.nih.gov/). MDL Inc. sells the Available Chemicals Directory (ACD; http://www.mdl.com/products/experiment/available_chem_dir/index.jsp) of commercially available compounds and the ACD-SC for screening collections. To use these libraries in docking screens, molecular properties such as protonation, charge, stereochemistry, accessible conformations and solvation must be calculated. Even details such as stereochemistry, tautomerization and protonation, which we frequently take for granted, are often ambiguous, or can change on binding to a receptor. Recently, about one million commercially accessible molecules have become available through the ZINC database (http://blaster.docking.org/zinc/). ZINC is a free, web-accessible database constructed with docking, substructure searching and compound purchasing in mind.

In the immediate future, virtual screening is mature enough to benefit from an aggressive programme of experimental testing. As more docking predictions are evaluated, and sometimes falsified, the methods will improve, especially if care is taken to remove the false-positives that have plagued both high-throughput and virtual screening27. Subsequent solution of receptor–ligand complex structures will be particularly informative; so far, too few of these have been determined. For those who can tolerate its false-positives, structure-based virtual screening is reliable enough to justify its use in active ligand discovery projects, providing an important complementary approach to empirical screening. For some projects, especially those centred in academic laboratories, virtual screening will be the best way to access a large chemical space without the commitment in time, material and infrastructure that an empirical screen demands. ■

doi:10.1038/nature03197

1. Beddell, C. R., Goodford, P. J., Norrington, F. E., Wilkinson, S. & Wootton, R. Compounds designed to fit a site of known structure in human haemoglobin. Br. J. Pharmacol. 57, 201–209 (1976).
2. Cohen, S. S. A strategy for the chemotherapy of infectious disease. Science 197, 431–432 (1977).
3. Itzstein, M. V. et al. Rational design of potent sialidase-based inhibitors of influenza virus replication. Nature 363, 418–423 (1993).
4. Varney, M. D. et al. Crystal-structure-based design and synthesis of Benz[cd]indole-containing inhibitors of thymidylate synthase. J. Med. Chem. 35, 663–676 (1992).
5. Kuntz, I. D. Structure-based strategies for drug design and discovery. Science 257, 1078–1082 (1992).
6. Jorgensen, W. L. The many roles of computation in drug discovery. Science 303, 1813–1818 (2004).
7. Stahura, F. L. & Bajorath, J. Virtual screening methods that complement HTS. Comb. Chem. High Throughput Screen 7, 259–269 (2004).
8. Perutz, M. F. The hemoglobin molecule. Sci. Am. 211, 64–76 (1964).
9. van Gunsteren, W. F. & Berendsen, H. J. C. Computer simulation of molecular dynamics: methodology, applications, and perspectives in chemistry. Angew. Chem. Int. Ed. Engl. 29, 992–1023 (1990).
10. Rizzo, R., Wang, D., Tirado-Rives, J. & Jorgensen, W. Validation of a model for the complex of HIV-1 reverse transcriptase with sustiva through computation of resistance profiles. J. Am. Chem. Soc. 122, 12898–12900 (2000).
11. Rosenfeld, R. J. et al. Automated docking of ligands to an artificial active site: augmenting crystallographic analysis with computer modeling. J. Comput. Aided Mol. Des. 17, 525–536 (2003).
12. Brik, A. et al. Rapid diversity-oriented synthesis in microtiter plates for in situ screening of HIV protease inhibitors. Chembiochem. 4, 1246–1248 (2003).
13. Schapira, M. et al. Discovery of diverse thyroid hormone receptor antagonists by high-throughput docking. Proc. Natl Acad. Sci. USA 100, 7354–7359 (2003).
14. Evers, A. & Klebe, G. Ligand-supported homology modeling of G-protein-coupled receptor sites: models sufficient for successful virtual screening. Angew. Chem. Int. Ed. Engl. 43, 248–251 (2004).
15. Shoichet, B. K., McGovern, S. L., Wei, B. & Irwin, J. J. Lead discovery using molecular docking. Curr. Opin. Chem. Biol. 6, 439–446 (2002).
16. Schneidman-Duhovny, D., Nussinov, R. & Wolfson, H. J. Predicting molecular interactions in silico: II. Protein-protein and protein-drug docking. Curr. Med. Chem. 11, 91–107 (2004).
17. Wyss, P. C. et al. Novel dihydrofolate reductase inhibitors. Structure-based versus diversity-based library design and high-throughput synthesis and screening. J. Med. Chem. 46, 2304–2312 (2003).
18. Kick, E. K. et al. Structure-based design and combinatorial chemistry yield low nanomolar inhibitors of cathepsin D. Chem. Biol. 4, 297–307 (1997).
19. Doman, T. N. et al. Molecular docking and high-throughput screening for novel inhibitors of protein tyrosine phosphatase-1B. J. Med. Chem. 45, 2213–2221 (2002).
20. Paiva, A. M. et al. Inhibitors of dihydrodipicolinate reductase, a key enzyme of the diaminopimelate pathway of Mycobacterium tuberculosis. Biochim. Biophys. Acta. 1545, 67–77 (2001).
21. Gradler, U. et al. A new target for shigellosis: rational design and crystallographic studies of inhibitors of tRNA-guanine transglycosylase. J. Mol. Biol. 306, 455–467 (2001).
22. Powers, R. A., Morandi, F. & Shoichet, B. K. Structure-based discovery of a novel, noncovalent inhibitor of AmpC beta-lactamase. Structure (Camb.) 10, 1013–1023 (2002).
23. Gruneberg, S., Stubbs, M. T. & Klebe, G. Successful virtual screening for novel inhibitors of human carbonic anhydrase: strategy and experimental confirmation. J. Med. Chem. 45, 3588–3602 (2002).
24. Wei, B. Q., Baase, W. A., Weaver, L. H., Matthews, B. W. & Shoichet, B. K. A model binding site for testing scoring functions in molecular docking. J. Mol. Biol. 322, 339–355 (2002).
25. Lipinski, C. A., Lombardo, F., Dominy, B. W. & Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Deliv. Rev. 23, 3–25 (1997).
26. Oprea, T. I. Current trends in lead discovery: are we looking for the appropriate properties? Mol. Divers 5, 199–208 (2002).
27. McGovern, S. L., Caselli, E., Grigorieff, N. & Shoichet, B. K. A common mechanism underlying promiscuous inhibitors from virtual and high-throughput screening. J. Med. Chem. 45, 1712–1722 (2002).
28. Krämer, O., Hazemann, I., Podjarny, A. D. & Klebe, G. Virtual screening for inhibitors of human aldose reductase. Proteins 55, 814–823 (2004).
29. Horn, J. R. & Shoichet, B. K. Allosteric inhibition through core disruption. J. Mol. Biol. 336, 1283–1291 (2004).
30. Kaiser, J. NIH Gears up for chemical genomics. Science 304, 1728 (2004).
31. Kalyanaraman, C., Bernacki, K. & Jacobson, M. P. Virtual screening against highly charged active sites: Identifying substrates of alpha-beta barrel enzymes. Biochemistry (in the press).
32. Pieper, U., Eswar, N., Stuart, A. C., Ilyin, V. A. & Sali, A. MODBASE, a database of annotated comparative protein structure models. Nucleic Acids Res. 30, 255–259 (2002).

Acknowledgements I thank G. Klebe, A. Olson, and W. Jorgensen for contributing figures and comments, and I. D. Kuntz, M. Jacobson, A. Sali, K. Dill and J. Irwin for many insightful conversations. My laboratory’s research in docking is supported by NIGMS.

Competing interests statement The author declares competing financial interests: details accompany the paper on www.nature.com/nature.
