FIELD OF THE INVENTION
The present invention relates generally to methods for synthesizing very
large collections of diverse molecules and for identifying and isolating
compounds with useful and desired activities from such collections. The
invention also relates to the incorporation of identification tags in such
collections to facilitate identification of compounds with desired
properties. The invention therefore relates to the fields of chemistry,
biology, pharmacology, and related fields.
BACKGROUND OF THE INVENTION
Ligands for macromolecular receptors can be identified by screening diverse
collections of peptides produced through either molecular biological or
synthetic chemical techniques. Recombinant peptide libraries have been
generated by inserting degenerate oligonucleotides into genes encoding
capsid proteins of filamentous bacteriophage and the DNA-binding protein
Lac I. See Cwirla et al., 1990, Proc. Natl. Acad. Sci. USA 87: 6378-6382;
Scott & Smith, 1990, Science 249: 386-390; Devlin et al., 1990, Science
249: 404-406; Cull et al., 1992, Proc. Natl. Acad. Sci. USA 89: 1865-1869;
and PCT publication Nos. WO 91/17271, WO 91/19818, WO 93/08278, each of
which is incorporated herein by reference. These random libraries may
contain more than 10^9 different peptides, each fused to a larger
protein sequence that is physically linked to the genetic material
encoding it. Such libraries are efficiently screened for interaction with
a receptor by several rounds of affinity purification, the selected
exposition or display vectors being amplified in E. coli and the DNA of
individual clones sequenced to reveal the identity of the peptide
responsible for receptor binding. See also PCT publication Nos. WO
91/05058 and WO 92/02536.
Chemical approaches to generating peptide or other molecular libraries are
not limited to syntheses using just the 20 genetically coded amino acids.
By expanding the building block set to include unnatural amino acids and
other molecular building blocks, the accessible sequence and structural
diversity is dramatically increased. In several of the strategies
described for creating synthetic molecular libraries, the reaction
products are spatially segregated and the identity of individual library
members is unambiguously defined by the nature of the synthesis. See Geysen
et al., 1984, Proc. Natl. Acad. Sci. USA 81: 3998-4002; Geysen et al.,
1986, in Synthetic Peptides as Antigens; Ciba Foundation Symposium 119,
eds. Porter, R. & Whelan, J. (Wiley, New York) pp. 131-146; Fodor et al.,
1991, Science 251: 767-773; U.S. Pat. No. 5,143,854; and PCT patent
publication Nos. WO 84/03564; 86/00991; 86/06487; 90/15070; and 92/10092,
each of which is incorporated herein by reference.
Libraries of more than 30 million soluble peptides have been prepared by
the "tea-bag" method of multiple peptide synthesis. See Houghten, 1985,
Proc. Natl. Acad. Sci. USA 82: 5131-5135; and U.S. Pat. No. 4,631,211,
each of which is incorporated herein by reference. Each library is
synthesized and screened as degenerate peptide mixtures in which
individual amino acids within the sequence are explicitly defined. An
iterative process of screening (e.g. in a competition binding assay) and
resynthesis is used to fractionate these mixtures and define the most
active peptides within the library. See Houghten et al., 1991, Nature 354:
84-86; Pinilla et al., 1992, Peptide Research 5: 351-358; Blake, J. &
Litzi-Davis, 1992, Bioconjugate Chem. 3: 510-513; and PCT patent
publication No. WO 92/09300, each of which is incorporated herein by
reference.
Using the split-synthesis protocol of Furka et al., 1988, Abstr. 14th Int.
Congr. Biochem., Prague, Czech. 5: 47 (see also Furka et al., 1991, Int. J.
Peptide Protein Res. 37: 487-493; and Sebestyen et al., 1993, Bioorg. Med.
Chem. Lett. 3: 413-418), Lam and coworkers have prepared libraries
containing ~10^6 peptides attached to 100-200 µm diameter
resin beads. See Lam et al., 1991, Nature 354: 82-84; Lam et al., 1993,
Bioorg. Med. Chem. Lett. 3: 419-424; and PCT patent publication No. WO
92/00091, each of which is incorporated herein by reference. The bead
library is screened by incubation with a labelled receptor: beads binding
to the receptor are identified by visual inspection and are selected with
the aid of a micromanipulator. Each bead contains 50-200 pmol of a single
peptide sequence, which can be determined directly by either Edman
degradation or mass spectrometry analysis. In principle, one could create
libraries of greater diversity using this approach by reducing the
dimensions of the beads. The sensitivity of peptide sequencing techniques
is limited to ~1 pmol, however, placing a clear limitation on the
scope of direct peptide sequencing analysis. Moreover, neither analytical
method provides for straightforward and unambiguous sequence analysis when
the library building block set is expanded to include D- or other
non-natural amino acids or other chemical building blocks.
High throughput screening of collections of chemically synthesized
molecules and of natural products (such as microbial fermentation broths)
has traditionally played a central role in the search for lead compounds
for the development of new pharmacological agents. The remarkable surge of
interest in combinatorial chemistry and the associated technologies for
generating and evaluating molecular diversity represent significant
milestones in the evolution of this paradigm of drug discovery. See Pavia
et al., 1993, Bioorg. Med. Chem. Lett. 3: 387-396, incorporated herein by
reference. To date, peptide chemistry has been the principal vehicle for
exploring the utility of combinatorial methods in ligand identification.
See Jung & Beck-Sickinger, 1992, Angew. Chem. Int. Ed. Engl. 31: 367-383,
incorporated herein by reference. This may be ascribed to the availability
of a large and structurally diverse range of amino acid monomers, a
relatively generic, high-yielding solid phase coupling chemistry and the
synergy with biological approaches for generating recombinant peptide
libraries. Moreover, the potent and specific biological activities of many
low molecular weight peptides make these molecules attractive starting
points for therapeutic drug discovery. See Hirschmann, 1991, Angew. Chem.
Int. Ed. Engl. 30: 1278-1301, and Wiley & Rich, 1993, Med. Res. Rev. 13:
327-384, each of which is incorporated herein by reference. Unfavorable
pharmacokinetic properties such as poor oral bioavailability and rapid
clearance in vivo have, however, limited the more widespread development of
peptidic compounds as drugs. This realization has recently inspired workers
to extend the concepts of combinatorial organic synthesis beyond peptide
chemistry to create libraries of known pharmacophores like benzodiazepines
(see Bunin & Ellman, 1992, J. Amer. Chem. Soc. 114: 10997-10998,
incorporated herein by reference) as well as polymeric molecules such as
oligomeric N-substituted glycines ("peptoids") and oligocarbamates. See
Simon et al., 1992, Proc. Natl. Acad. Sci. USA 89: 9367-9371; Zuckermann
et al., 1992, J. Amer. Chem. Soc. 114: 10646-10647; and Cho et al., 1993,
Science 261: 1303-1305, each of which is incorporated herein by reference.
Despite the great value that large libraries of molecules can have for
identifying useful compounds or improving the properties of a lead
compound, the difficulties of screening such libraries, particularly large
libraries, have limited the impact that access to such libraries should have
had in reducing the costs of, e.g., drug discovery and development.
Consequently, the development of methods for generating and screening
libraries of molecules in which each member of the library is tagged with
a unique identifier tag to facilitate identification of compounds (see PCT
patent publication No. WO 93/06121, incorporated herein by reference; see
also U.S. patent application Ser. Nos. 946,239, filed Sep. 16, 1992, and
762,522, filed Sep. 18, 1991, supra) met with great enthusiasm. In the
method, products of a chemical synthesis procedure, typically a
combinatorial synthesis on resin beads, are explicitly specified by
attachment of an identifier tag to the beads coincident with each coupling
or other product generating reaction step in the synthesis. Each tag
specifies what happened in a reaction step of interest, e.g. which amino
acid monomer was coupled in a particular step of a peptide synthesis
procedure. The structure or identity of a compound, e.g. the sequence of a
peptide, on any bead can be deduced by reading the set of tags on that
bead. Ideally, such tags have a high information content, are amenable to
very high sensitivity detection and decoding, and are stable to reagents
used in the synthesis. The concept of an oligonucleotide-encoded chemical
synthesis was also proposed by Brenner and Lerner, 1992, Proc. Natl. Acad.
Sci. USA 89: 5181-5183, incorporated herein by reference.
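By way of illustration only, the bookkeeping performed by the identifier tags described above can be sketched in a few lines of Python. The two-base code in `MONOMER_TO_TAG`, the `Bead` class, and the `decode` function are hypothetical names and values chosen for this sketch; they are not the encoding scheme of the invention or of any cited reference.

```python
# Hypothetical two-base code for four amino acid monomers (illustration only).
MONOMER_TO_TAG = {"G": "CA", "A": "CT", "L": "TA", "F": "TC"}
TAG_TO_MONOMER = {tag: monomer for monomer, tag in MONOMER_TO_TAG.items()}


class Bead:
    """A solid support carrying a growing compound and its identifier tag."""

    def __init__(self):
        self.compound = []  # monomers coupled so far, in order
        self.tag = ""       # tag units appended coincident with each coupling

    def couple(self, monomer):
        # Each coupling step is recorded by extending the tag.
        self.compound.append(monomer)
        self.tag += MONOMER_TO_TAG[monomer]


def decode(tag, unit=2):
    """Read the tag back, one unit per reaction step, to deduce the compound."""
    return "".join(TAG_TO_MONOMER[tag[i:i + unit]] for i in range(0, len(tag), unit))


bead = Bead()
for monomer in "GAFL":
    bead.couple(monomer)

print(bead.tag, decode(bead.tag))
```

In this sketch the compound on a bead never needs to be analyzed directly; reading the tag alone is sufficient to recover the sequence of coupling steps.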
The encoding method has been employed to show that, starting with an
orthogonally differentiated diamine linker, parallel combinatorial
synthesis can be used to generate a library of soluble chimeric peptides
comprising a "binding" strand and a "coding" strand. See Kerr et al.,
1993, J. Amer. Chem. Soc. 115: 2529-2531, incorporated herein by
reference. The coupling of either natural or unnatural amino acid monomers
to the binding strand was recorded by building an amino acid code
comprised of four L-amino acids on the "coding" strand. Compounds were
selected from equimolar peptide mixtures by affinity purification on a
receptor and were resolved by HPLC. The sequence of the coding strand of
individual purified molecules was then determined by Edman degradation to
reveal the structure of the binding strand. An analogous peptidic coding
scheme was also recently reported by Nikolaiev et al., 1993, Peptide
Research 6: 161-170.
Constraints on the sensitivity and throughput of the Edman procedure will
ultimately restrict the scope of this aspect of the encoding method to
analyzing libraries of limited diversity. The use of oligonucleotide tags
offers greater promise, but improved methods for synthesizing
oligonucleotide-tagged molecular libraries are needed. Moreover, there
remains a need for alternate methodology for synthesizing and screening
very large tagged molecular libraries. The present invention meets these
and other needs.
SUMMARY OF THE INVENTION
The present invention provides methods and reagents for tagging the
products of combinatorial chemical processes to construct encoded
synthetic chemical libraries. In one important embodiment, the invention
provides a method for performing peptide and oligonucleotide synthesis on
microscopic beads through an alternating and compatible synthetic
procedure. The large oligonucleotide-encoded synthetic peptide library
produced by this combinatorial synthesis is composed of many beads, each
of which contains many copies of a single peptide (with a defined
sequence) and a single-stranded DNA tag whose sequence artificially and
unambiguously codes for the structure of the associated peptide. The
library can be efficiently interrogated for interaction with
fluorescently-labeled biological receptors by flow cytometry, and
individual beads selected by exploiting the ability of FACS
instrumentation to sort single beads. The DNA tag on a sorted bead is
amplified by the PCR and sequenced to determine the structure of the
encoded peptide ligand. The library can be used, for example, to find high
affinity (nanomolar) ligands for a receptor such as an anti-peptide
monoclonal antibody.
A synthetic molecular library of the invention can be produced by
synthesizing on each of a plurality of solid supports a compound, the
compound being different for different solid supports. The compound is
synthesized in a process comprising the steps of: (a) apportioning the
supports in a stochastic manner among a plurality of reaction vessels; (b)
exposing the supports in each reaction vessel to a first chemical building
block; (c) pooling the supports; (d) apportioning the supports in a
stochastic manner among the plurality of reaction vessels; (e) exposing
the supports in each reaction vessel to a chemical building block; and (f)
repeating steps (a) through (e) from at least one to twenty times.
Typically, substantially equal numbers of solid supports will be
apportioned to each reaction vessel. In one embodiment of the method, the
chemical building blocks are chosen from the set of amino acids, and the
resulting compound is a peptide oligomer.
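The apportion, expose, and pool cycle recited above can be simulated as follows. This is a minimal sketch under stated assumptions: one reaction vessel per building block, an equal share of supports per vessel, and a hypothetical function name `split_and_pool`; none of these details is prescribed by the method itself.

```python
import random


def split_and_pool(n_beads, building_blocks, n_rounds, seed=0):
    """Simulate split synthesis; returns the compound carried by each bead."""
    rng = random.Random(seed)
    beads = [[] for _ in range(n_beads)]  # each bead starts with no compound
    n_vessels = len(building_blocks)      # assumption: one vessel per block
    for _ in range(n_rounds):
        # Apportion the supports stochastically among the reaction vessels.
        rng.shuffle(beads)
        vessels = [beads[i::n_vessels] for i in range(n_vessels)]
        # Expose the supports in each vessel to that vessel's building block.
        for block, vessel in zip(building_blocks, vessels):
            for bead in vessel:
                bead.append(block)
        # Pool the supports before the next round.
        beads = [bead for vessel in vessels for bead in vessel]
    return ["".join(bead) for bead in beads]


library = split_and_pool(n_beads=2000, building_blocks="ACDE", n_rounds=3)
distinct = set(library)
print(len(library), "beads,", len(distinct), "distinct compounds")
```

Each simulated bead carries exactly one compound, and the number of distinct compounds cannot exceed the number of building blocks raised to the number of rounds (here 4^3 = 64).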
More particularly, the invention relates to certain improvements in the
coupling chemistries associated with such methods. One such improvement
relates to the chemistry used to remove the Fmoc protecting group from the
alpha-amino group of a bead, linker, or growing peptide chain in such
syntheses. Preferably, such removal is effected by treatment with 5 to
15%, preferably 10%, piperidine for 5 to 60 minutes, preferably 5 to 10
minutes, although other conditions may be employed, e.g., 15 to 30%
piperidine for 5 to 30 minutes. Other improvements relate to the
activation chemistry of the peptide coupling reactions, in that when
certain automated instrumentation is used to perform the synthesis of an
oligonucleotide tagged peptide library, the invention provides for a
simple mixture of HOBt/HBTU to reduce reagent supply bottles.
In another aspect, the invention relates to methods and instrumentation for
synthesizing encoded synthetic chemical libraries on beads too small to be
separated on conventional flow cytometry instrumentation. Such small beads
allow the resulting library size to increase from the more typical range
of 10^9 to 10^13 for bead-based libraries up to a size of
10^18 members for bead-free libraries. The invention also relates to
methods for screening such libraries.
The invention also relates to methods for screening encoded synthetic
libraries to identify useful compounds. In one important aspect, the
invention provides advances in the field of natural product
screening relating to methods for generating, tagging, and screening
natural product libraries to characterize and identify compounds with
useful activity.
In another aspect, the invention relates to an improved process for rapidly
and efficiently identifying a pool of compounds from a molecular library
of the invention. In this method, the oligonucleotide tags from a pool of
tagged compounds that exhibit a desired property (e.g., binding to a
receptor) are concatemerized and cloned to facilitate sequencing of a
plurality of tags in a single sequencing reaction. If the tagged compounds
are peptides, and an encoding scheme based on the genetic code is
employed, then one can subclone individual tags from the concatemer into
other selection and expression systems, such as the plasmid and
phage-based systems described in the background section above, for further
analysis of the peptide.
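For tags that follow the genetic code, decoding a sequenced concatemer reduces to splitting the read into individual tags and translating each one. The following sketch assumes, purely for illustration, that tags are joined by a fixed linker sequence (`LINKER`, a hypothetical choice) that does not occur within any tag; the function names are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

LINKER = "GAATTC"  # hypothetical linker assumed to separate tags in the concatemer


def split_concatemer(concatemer, linker=LINKER):
    """Recover the individual tags from one sequencing read of a concatemer."""
    return [tag for tag in concatemer.split(linker) if tag]


def translate(tag):
    """Translate a tag codon by codon; trailing partial codons are ignored."""
    usable = len(tag) - len(tag) % 3
    return "".join(CODON_TABLE[tag[i:i + 3]] for i in range(0, usable, 3))


read = "ATGAAA" + LINKER + "TTTGGG"
peptides = [translate(tag) for tag in split_concatemer(read)]
print(peptides)
```

A single read thus yields the structures of several selected compounds at once, which is the practical benefit of concatemerizing the tags before sequencing.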
In general, the invention provides improved methods for generating and
screening molecular libraries in which the individual molecules in the
library are tagged with unique, easily decoded identifier tags.
BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 shows a device for synthesizing combinatorial chemical libraries on
microscopic beads. The device is composed of a vacuum manifold or magnetic
plate attached to a solid substrate whose synthesis surface has an
array of reaction sites at which compounds can be synthesized. The
partition block is composed of an array of reaction wells corresponding to
said reaction sites and is used to partition library members after each
mixing step. The device can also be used to aid the synthesis of tagged
chemical libraries.
DESCRIPTION OF THE SPECIFIC EMBODIMENTS
The present invention relates generally to improved methods for generating
and screening tagged chemical libraries. To appreciate the value of the
improvements, one must understand not only the basic methodology for
making and using tagged libraries but also how the various steps of
synthesis and screening interact and how the selection of reagents impacts
the results achieved. Tagged chemical libraries are often synthesized on a
solid support, and the choice of support and linker is critical to
success. A linker can be used to attach the support to the tag, to attach
the support to a library molecule, or, in an embodiment where there is no
solid support, to attach the tag to a library molecule. The choices
relating to chemical building blocks, tags, and synthesis methods can be
equally critical and are also impacted by the nature of the solid supports
and linkers available. The assays and applications for which the tagged
libraries are intended also impact these choices, as well as the
instrumentation and reagents available. The description of the invention
is therefore provided as indicated by the following outline.
OUTLINE
I. Overview of a Synthesis of a Tagged Chemical Library
II. The Solid Support
A. Types
B. Linkers
C. Molecular Supports
III. The Chemical Building Blocks
A. Oligomers and Monomers
B. Other Building Blocks
IV. The Tag
V. Synthesis Methods
A. Oligonucleotide Tagged Peptide Libraries
B. Improved Method for Synthesizing Oligonucleotide-Tagged Peptide
Libraries
C. Methods for Generating Soluble Libraries
VI. Assay Methods
A. Screening Assays for Bead-based Libraries
B. Screening Soluble Molecules
C. Screening Natural Product Libraries
VII. Instrumentation and Reagents
Examples
End of Outline
In addition to the outline above, the following glossary is provided to
facilitate the description of the invention, and a number of abbreviations
and terms are defined to have the general meanings indicated as used
herein to describe the invention.
Abbreviations: HBTU, O-(benzotriazol-1-yl)-1,1,3,3-tetramethyluronium
hexafluorophosphate; HOBt, 1-hydroxybenzotriazole; TFA, trifluoroacetic
acid; TCA, trichloroacetic acid; DIEA, diisopropylethylamine; DMF,
dimethylformamide; Fmoc, 9-fluorenylmethyloxycarbonyl; DMT,
dimethoxytrityl; Trt, trityl; Bz, benzoyl; Pmc,
2,2,5,7,8-pentamethylchroman-6-sulfonyl; tBoc,
tert-butyloxycarbonyl; PBS, phosphate-buffered saline; BSA, bovine serum
albumin; mAb, monoclonal antibody.
Complementary or substantially complementary: These terms refer to the
ability of one compound to bind to another, e.g. as a ligand binds to its
complementary receptor. Typically, these terms are used in connection with
a description of base pairing between nucleotides of nucleic acids, such
as, for instance, between the two strands of a double stranded DNA
molecule or between an oligonucleotide primer and a primer binding site on
a single stranded nucleic acid to be sequenced or amplified.
"Complementary" nucleotides are, generally, A and T (or A and U), and C
and G, but there are a wide variety of synthetic or modified nucleotides
with binding properties known to those of skill in the art. "Substantial
complementarity" exists when an RNA or DNA strand will hybridize under
selective hybridization conditions to a complementary nucleic acid.
Typically, hybridization will occur when there is at least about 55%
complementarity over a stretch of at least 14 to 25 nucleotides, but more
selective hybridization will occur as complementarity increases to 65%,
75%, 90%, and 100%. See Kanehisa, 1984, Nucl. Acids Res. 12: 203,
incorporated herein by reference. Highly selective hybridization
conditions are known as "stringent hybridization conditions", defined
below.
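The percentage figures above can be made concrete with a short calculation. The sketch below considers only the standard A-T and C-G pairs over two DNA strands of equal length read antiparallel; the function names are hypothetical, and the sketch does not model hybridization conditions, modified nucleotides, or gaps.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand):
    """Return the perfectly complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def percent_complementarity(strand, other):
    """Percent of positions at which `strand` pairs with `other` (antiparallel)."""
    if not strand or len(strand) != len(other):
        raise ValueError("strands must be non-empty and of equal length")
    partner = other[::-1]  # align the second strand antiparallel to the first
    paired = sum(COMPLEMENT.get(a) == b for a, b in zip(strand, partner))
    return 100.0 * paired / len(strand)


primer = "AAGGCCTTAACCGG"  # 14 nucleotides, an arbitrary example sequence
print(percent_complementarity(primer, reverse_complement(primer)))
print(percent_complementarity("AAAA", "TTTA"))
```

The first call compares a strand with its exact reverse complement; the second introduces a single mismatch in a four-nucleotide stretch.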
Epitope: This term is used to describe a portion of an antigen molecule
delineated by the area of interaction with the subclass of receptors known
as antibodies.
Identifier tag: In the most general sense, this term is used to denote a
physical attribute that provides a means whereby one can identify a
chemical reaction, such as a monomer addition reaction, that an individual
solid support has experienced in the synthesis of an oligomer on that solid
support. The identifier tag serves to record a step in a series of
reactions used in the synthesis of a chemical library. The identifier tag
may have any recognizable feature, including for example: a
microscopically or otherwise distinguishable shape, size, mass, color,
optical density, etc.; a differential absorbance or emission of light;
chemical reactivity; magnetic or electronic properties; or any other
distinctive mark capable of encoding the required information, and
decipherable at the level of one (or a few) molecules. A preferred example
of such an identifier tag is an oligonucleotide, because the nucleotide
sequence of an oligonucleotide is a robust form of encoded information. An
"identifier tag" can be coupled directly to the oligomer synthesized,
whether or not a solid support is used in the synthesis. In this latter
embodiment, the identifier tag can conceptually be viewed as also serving
as the "support" for oligomer synthesis.
Ligand: This term is used to denote a molecule that is recognized by,
typically by binding to, a particular receptor. The agent bound by or
reacting with a receptor is called a "ligand", a term which is
definitionally meaningful only in terms of its counterpart receptor. The
term "ligand" does not imply any particular molecular size or other
structural or compositional feature other than that the substance in
question is capable of binding or otherwise interacting with the receptor.
Also, a "ligand" may serve either as the natural ligand to which the
receptor binds, or as a functional analogue that may act as an agonist or
antagonist. Ligands that can be investigated by this invention include,
but are not restricted to, agonists and antagonists for cell membrane
receptors, toxins and venoms, viral epitopes, hormones, sugars, cofactors,
peptides, enzyme substrates, drugs (e.g., opiates, steroids,
etc.), and proteins.
Monomer: This term is used to denote any member of a set of molecules that
can be joined together to form another molecule or set of molecules, such
as a set of oligomers or polymers. Sets of monomers useful in the present
invention include, but are not restricted to, for the example of peptide
synthesis, the set of L-amino acids, D-amino acids, or synthetic amino
acids. As used herein, "monomer" refers to any member of a basis set for
synthesis of an oligomer. For example, dimers of L-amino acids form a
basis set of 400 "monomers" for synthesis of polypeptides. Different basis
sets of monomers may be used at successive steps in the synthesis of a
polymer. Those of skill in the art will recognize that a "monomer" is
simply one type of "chemical building block" and that any type of chemical
building block can be employed in the present method, regardless of
whether one is synthesizing an oligomer or a small organic molecule or
some other molecule.
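The dimer basis set mentioned above can be enumerated directly. The sketch below uses the standard single letter abbreviations for the 20 genetically coded L-amino acids; the variable names are chosen for illustration only.

```python
from itertools import product

# Single letter codes for the 20 genetically coded amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Every ordered pair of amino acids is one "monomer" of the dimer basis set.
dimer_basis = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]

print(len(dimer_basis))
```

With such a basis set, each coupling step adds two residues, so an oligomer built in n steps is drawn from 400^n possible sequences.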
Oligomer or Polymer: These terms are used to denote molecules that are
formed by a process involving the chemical or enzymatic addition of
monomer subunits. Such oligomers include, for example, linear,
cyclic, and branched polymers of nucleic acids, polysaccharides,
phospholipids, and peptides having either alpha-, beta-, or omega-amino
acids, heteropolymers, polyurethanes, polyesters, polycarbonates,
polyureas, polyamides, polyethyleneimines, polyarylene sulfides,
polysiloxanes, polyimides, polyacetates, or other polymers, as will be
readily apparent to one skilled in the art upon review of this disclosure.
Peptide: This term is used to denote an oligomer in which the monomers are
alpha amino acids joined together through amide bonds. A "peptide" can
also be referred to as a "polypeptide." In the context of this invention,
one should appreciate that the amino acids may be the L-optical isomer or
the D-optical isomer. Peptides are more than two amino acid monomers long,
but more often are more than 5 to 10 amino acid monomers long and can be
even longer than 20 amino acids, although peptides longer than 20 amino
acids are more likely to be called "polypeptides." Standard single letter
abbreviations for amino acids are used (e.g., P for proline). These
abbreviations are included in Stryer, Biochemistry, Third Ed. (1988),
which is incorporated herein by reference.
Oligonucleotides: This term is used to denote a single-stranded DNA or RNA
molecule, typically prepared by synthetic means. Oligonucleotides employed
in the present invention will usually be 50 to 150 nucleotides in length,
preferably from 80 to 120 nucleotides, although oligonucleotides of
different length may be appropriate in some circumstances. For instance,
an oligonucleotide tag can be built nucleotide-by-nucleotide in
coordination with the monomer-by-monomer addition steps used to synthesize
the oligomer. In addition, very short, i.e., 2 to 10 nucleotides,
oligonucleotides may be used to extend an existing oligonucleotide tag to
identify a monomer coupling step. Suitable oligonucleotides may be
prepared by the phosphoramidite method described by Beaucage and
Caruthers, 1981, Tetr. Lett. 22: 1859-1862, or by the triester method,
according to Matteucci et al., 1981, J. Am. Chem. Soc. 103: 3185, both
incorporated herein by reference, or by other methods such as by using
commercial automated oligonucleotide synthesizers.
Operably linked: This term refers to a functional relationship between one
segment of a nucleic acid and another. For instance, a promoter (or
enhancer) is "operably linked" to a coding sequence if the promoter causes
or otherwise positively influences the transcription of the coding
sequence. Generally, operably linked means that the nucleic acid segments
or sequences being linked are contiguous and, where necessary to join two
protein coding regions, contiguous and in reading frame.
Receptor: This term refers to a molecule that has a specific affinity for a
given ligand. Receptors may be naturally occurring or synthetic molecules.
Receptors can be employed in their unaltered natural or isolated state or
as aggregates with other species. Receptors may be attached, covalently or
noncovalently, to other substances. Examples of receptors that can be
employed in the method of the present invention include, but are not
restricted to, antibodies, cell membrane receptors, monoclonal antibodies,
antisera reactive with specific antigenic determinants (such as on
viruses, cells, or other materials), polynucleotides, nucleic acids,
lectins, polysaccharides, cells, cellular membranes, and organelles.
Receptors are also known as "anti-ligands." A "ligand-receptor pair" is
formed when two molecules, typically macromolecules, have combined through
molecular recognition to form a complex. Other examples of receptors
include, but are not restricted to, specific transport proteins or enzymes
essential to survival of microorganisms for which antibiotics are needed;
the binding site of any enzyme; the ligand-binding site on an antibody
molecule; a nucleic acid; catalytic polypeptides as described in Lerner
et al., 1991, Science 252: 659, incorporated herein by reference; and
hormone receptors such as the receptors for insulin and growth hormone.
Substrate or Solid Support: These terms denote a material having a rigid or
semi-rigid surface. Such materials will preferably take the form of small
beads, pellets, disks, or other convenient forms, although other forms may
be used. In some embodiments, at least one surface of the substrate can be
substantially flat.