BACKGROUND OF THE INVENTION
The present invention relates to the field of molecular synthesis. More specifically, the invention provides systems and methods for directed synthesis of diverse molecular sequences on substrates.
Methods for preparing different polymers are well known. For example, the "Merrifield" method, described in Atherton et al., "Solid Phase Peptide Synthesis," IRL Press, 1989, which is incorporated herein by reference for all purposes, has been
used to synthesize peptides on a solid support. In the Merrifield method, an amino acid is covalently bonded to a support made of an insoluble polymer or other material. Another amino acid with an alpha protecting group is reacted with the covalently
bonded amino acid to form a dipeptide. After washing, the protecting group is removed and a third amino acid with an alpha protecting group is added to the dipeptide. This process is continued until a peptide of a desired length and sequence is
obtained.
Other techniques have also been described. These methods include the synthesis of peptides on 96 plastic pins which fit the format of standard microtiter plates. Advanced techniques for synthesizing large numbers of molecules in an efficient
manner have also been disclosed. Most notably, U.S. Pat. No. 5,143,854 (Pirrung et al.) and PCT Application No. 92/10092 disclose improved methods of molecular synthesis using light directed techniques. According to these methods, light is directed
to selected regions of a substrate to remove protecting groups from the selected regions of the substrate. Thereafter, selected molecules are coupled to the substrate, followed by additional irradiation and coupling steps.
SUMMARY OF THE INVENTION
Methods, devices, and compositions for synthesis and use of diverse molecular sequences on a substrate are disclosed, as well as applications thereof.
A preferred embodiment of the invention provides for the synthesis of an array of polymers in which individual monomers in a lead polymer are systematically substituted with monomers from one or more basis sets of monomers. The method requires a
limited number of masks and a limited number of processing steps. According to one specific aspect of the invention, a series of masking steps is conducted to first place the first monomer in the lead sequence on a substrate at a plurality of synthesis
sites. The second monomer in the lead sequence is then added to the first monomer at a portion of the synthesis sites, while different monomers from a basis set are placed at discrete other synthesis sites. The process is repeated to produce all or a
significant number of the monosubstituted polymers based on the lead polymer using a given basis set of monomers. According to a preferred aspect of the invention, the technique uses light directed techniques, such as those described in Pirrung et al.,
U.S. Pat. No. 5,143,854.
Another aspect of the invention provides for efficient synthesis and screening of cyclic molecules. According to a preferred aspect of the invention, cyclic polymers are synthesized in an array in which the polymers are coupled to the substrate
at different positions on the cyclic polymer ring. Therefore, a particular polymer may be presented in various "rotated" forms on the substrate for later screening. Again, the cyclic polymers are formed according to most preferred embodiments with the
techniques of Pirrung et al.
The resulting substrates will have a variety of uses including, for example, screening polymers for biological activity. To screen for biological activity, the substrate is exposed to one or more receptors such as an antibody, oligonucleotide,
whole cells, receptors on vesicles, lipids, or any one of a variety of other receptors. The receptors are preferably labeled with, for example, a fluorescent marker, a radioactive marker, or a labeled antibody reactive with the receptor. The location
of the marker on the substrate is detected with, for example, photon detection or auto-radiographic techniques. Through knowledge of the sequence of the material at the location where binding is detected, it is possible to quickly determine which
polymer(s) are complementary with the receptor. The technique can be used to screen large numbers of peptides or other polymers quickly and economically.
A further understanding of the nature and advantages of the inventions herein may be realized by reference to the remaining portions of the specification and the attached drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIGS. 1A to 1C illustrate a systematic substitution masking strategy;
FIG. 2 illustrates additional aspects of a systematic substitution masking strategy;
FIGS. 3, 4, and 5 illustrate rotated cyclic polymer groups;
FIGS. 6A to 6E illustrate formation of rotated cyclic polymers;
FIGS. 7A to 7C illustrate formation of rotated and substituted cyclic polymers;
FIG. 8 illustrates the array of cyclic polymers resulting from the synthesis;
FIG. 9 illustrates masks used in the synthesis of cyclic polymer arrays;
FIGS. 10A and 10B illustrate coupling of a tether in two orientations;
FIGS. 11A and 11B illustrate masks used in another embodiment;
FIGS. 12A and 12B show a tripeptide used in a fluorescence energy-transfer substrate assay and that substrate after cleavage;
FIGS. 13A to 13H illustrate donor/quencher pairs;
FIG. 14 illustrates sequence versus normalized fluorescence intensity for the ten possible single deletion peptides binding to the D32.39 antibody. A blank space represents a deleted amino acid relative to the full length kernel sequence
(FLRRQFKVVT) (SEQ ID NO: 1) shown on the bottom. Error bars represent the standard deviation of the averaged signals from four replicates. All peptides are acetylated on the amino terminus and are linked to the surface via an amide bond to the carboxyl
terminus; and
FIG. 15 illustrates sequence versus normalized fluorescence intensity for the terminally truncated peptides. The full length kernel sequence (FLRQFKVVT) (SEQ ID NO: 2) is shown in the center of the graph. Error bars represent the standard
deviation of the averaged signals from a minimum of four replicates. All peptides are acetylated on the amino terminus and are linked to the surface via an amide bond to the carboxyl terminus.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
CONTENTS
I. Definitions
II. Synthesis
A. Systematic Substitution
B. Cyclic Polymer Mapping
III. Data Collection
A. CCD Data Collection System
B. Trapping Low Affinity Interactions
C. Fluorescence Energy-Transfer Substrate Assays
IV. Examples
A. Example
B. Example
V. Conclusion
I. Definitions
Certain terms used herein are intended to have the following general definitions:
1. Complementary: This term refers to the topological compatibility or matching together of interacting surfaces of a ligand molecule and its receptor. Thus, the receptor and its ligand can be described as complementary, and furthermore, the
contact surface characteristics are complementary to each other.
2. Epitope: An epitope is that portion of an antigen molecule which is delineated by the area of interaction with the subclass of receptors known as antibodies.
3. Ligand: A ligand is a molecule that is recognized by a particular receptor. Examples of ligands that can be investigated by this invention include, but are not restricted to, agonists and antagonists for cell membrane receptors, toxins and
venoms, viral epitopes, hormones, hormone receptors, peptides, enzymes, enzyme substrates, cofactors, drugs (e.g., opiates, steroids, etc.), lectins, sugars, oligonucleotides (such as in hybridization studies), nucleic acids, oligosaccharides, proteins,
benzodiazepines, prostaglandins, beta-turn mimetics, and monoclonal antibodies.
4. Monomer: A monomer is a member of the set of smaller molecules which can be joined together to form a larger molecule. The set of monomers includes but is not restricted to, for example, the set of common L-amino acids, the set of D-amino
acids, the set of natural or synthetic amino acids, the set of nucleotides and the set of pentoses and hexoses. As used herein, monomer refers to any member of a basis set for synthesis of a larger molecule. A selected set of monomers forms a basis set
of monomers. For example, dimers of the 20 naturally occurring L-amino acids form a basis set of 400 monomers for synthesis of polypeptides. Different basis sets of monomers may be used in any of the successive steps in the synthesis of a polymer.
Furthermore, each of the sets may include protected members which are modified after synthesis.
5. Peptide: A peptide is a polymer in which the monomers are natural or unnatural amino acids and which are joined together through amide bonds, alternatively referred to as a polypeptide. In the context of this specification, it should be
appreciated that the amino acids may be, for example, the L-optical isomer or the D-optical isomer. Specific implementations of the present invention will result in the formation of peptides with two or more amino acid monomers, often 4 or more amino
acids, often 5 or more amino acids, often 10 or more amino acids, often 15 or more amino acids, and often 20 or more amino acids. Standard abbreviations for amino acids are used (e.g., P for proline). These abbreviations are included in Stryer,
Biochemistry, Third Ed., 1988, which is incorporated herein by reference for all purposes.
6. Radiation: Radiation is energy which may be selectively applied, including energy having a wavelength of between 10.sup.-14 and 10.sup.4 meters including, for example, electron beam radiation, gamma radiation, x-ray radiation, light such as
ultra-violet light, visible light, and infrared light, microwave radiation, and radio waves. "Irradiation" refers to the application of radiation to a surface.
7. Receptor: A receptor is a molecule that has an affinity for a given ligand. Receptors may be naturally-occurring or synthetic molecules. Also, they can be employed in their unaltered state, in derivative forms, or as aggregates with other
species. Receptors may be attached, covalently or noncovalently, to a binding member, either directly or via a specific binding substance. Examples of receptors which can be employed by this invention include, but are not restricted to, antibodies,
cell membrane receptors, monoclonal antibodies and antisera reactive with specific antigenic determinants (such as on viruses, cells, or other materials), drugs, oligonucleotides, polynucleotides, nucleic acids, peptides, cofactors, lectins, sugars,
polysaccharides, cells, cellular membranes, and organelles. Receptors are sometimes referred to in the art as anti-ligands. As the term receptors is used herein, no difference in meaning is intended. A "Ligand Receptor Pair" is formed when two
molecules have combined through molecular recognition to form a complex.
Other examples of receptors which can be investigated by this invention include but are not restricted to microorganism receptors, enzymes, catalytic polypeptides, hormone receptors, and opiate receptors.
8. Substrate: A substrate is a material having a rigid or semi-rigid surface, generally insoluble in a solvent of interest such as water, porous and/or non-porous. In many embodiments, at least one surface of the substrate will be substantially
flat, although in some embodiments it may be desirable to physically separate synthesis regions for different polymers with, for example, wells, raised regions, etched trenches, or the like. According to other embodiments, small beads may be provided on
the surface which may be released upon completion of the synthesis.
9. Protecting group: A protecting group is a material which is chemically bound to a monomer unit or polymer and which may be removed upon selective exposure to an activator such as electromagnetic radiation or light, especially ultraviolet and
visible light. Examples of protecting groups with utility herein include those comprising ortho-nitro benzyl derivatives, nitropiperonyl, pyrenylmethoxy-carbonyl, nitroveratryl, nitrobenzyl, dimethyl dimethoxybenzyl, 5-bromo-7-nitroindolinyl,
o-hydroxy-.alpha.-methyl cinnamoyl, and 2-oxymethylene anthraquinone.
10. Predefined Region: A predefined region is a localized area on a surface which is, was, or is intended to be activated for formation of a molecule using the techniques described herein. The predefined region may have any convenient shape,
e.g., circular, rectangular, elliptical, wedge-shaped, etc. For the sake of brevity herein, "predefined regions" are sometimes referred to simply as "regions." A predefined region may be illuminated in a specified step, along with other regions of a
substrate.
11. Substantially Pure: A molecule is considered to be "substantially pure" within a predefined region of a substrate when it exhibits characteristics that distinguish it from other predefined regions. Typically, purity will be measured in
terms of biological activity or function as a result of uniform sequence. Such characteristics will typically be measured by way of binding with a selected ligand or receptor. Preferably the region is sufficiently pure such that the predominant species
in the predefined region is the desired sequence. According to preferred aspects of the invention, the molecules formed are 5% pure, more preferably more than 10% pure, preferably more than 20% pure, more preferably more than 80% pure, more preferably
more than 90% pure, more preferably more than 95% pure, where purity for this purpose refers to the ratio of the number of ligand molecules formed in a predefined region having a desired sequence to the total number of molecules formed in the predefined
region.
12. Activator: An activator is a material or energy source adapted to render a group active and which is directed from a source to at least a predefined location on a substrate, such as radiation. A primary illustration of an activator is light,
such as visible, ultraviolet, or infrared light. Other examples of activators include ion beams, electric fields, magnetic fields, electron beams, x-ray, and the like.
13. Combinatorial Synthesis Strategy: A combinatorial synthesis strategy is an ordered strategy for parallel synthesis of diverse polymer sequences by sequential addition of reagents which may be represented by a reactant matrix and a switch
matrix, the product of which is a product matrix. A reactant matrix is a 1 column by m row matrix of the building blocks to be added. The switch matrix is all or a subset of the binary numbers, preferably ordered, between 1 and m arranged in columns.
A "binary strategy" is one in which at least two successive steps illuminate a portion, often half, of a region of interest on the substrate. In a binary synthesis strategy, all possible compounds which can be formed from an ordered set of reactants are
formed. In most preferred embodiments, binary synthesis refers to a synthesis strategy which also factors a previous addition step. An example is a strategy in which a switch matrix for a masking strategy halves regions that were previously illuminated,
illuminating about half of the previously illuminated region and protecting the remaining half (while also protecting about half of previously protected regions and illuminating about half of previously protected regions). It will be recognized that
binary rounds may be interspersed with non-binary rounds and that only a portion of a substrate may be subjected to a binary scheme. A combinatorial "masking" strategy is a synthesis which uses light or other spatially selective deprotecting or
activating agents to remove protecting groups from materials for addition of other materials such as amino acids.
14. Linker: A linker is a molecule or group of molecules attached to a substrate, and spacing a synthesized polymer from the substrate for exposure/binding to a receptor.
15. Systematically Substituted: A position in a target molecule has been systematically substituted when the molecule is formed at a plurality of synthesis sites, with the molecule having a different member of a basis set of monomers at the
selected position of the molecule within each of the synthesis sites on the substrate.
16. Abbreviations: The following frequently used abbreviations are intended to have the following meanings:
BOC: t-butyloxycarbonyl.
BOP: benzotriazol-1-yloxytris-(dimethylamino) phosphonium hexafluorophosphate.
DCC: dicyclohexylcarbodiimide.
DCM: dichloromethane; methylene chloride.
DDZ: dimethoxydimethylbenzyloxy.
DIEA: N,N-diisopropylethylamine.
DMAP: 4-dimethylaminopyridine.
DMF: dimethyl formamide.
DMT: dimethoxytrityl.
FMOC: fluorenylmethyloxycarbonyl.
HBTU: 2-(1H-benzotriazol-1-yl) -1,1,3,3-tetramethyluronium hexafluorophosphate.
HOBT: 1-hydroxybenzotriazole.
NBOC: 2-nitrobenzyloxycarbonyl.
NMP: N-methylpyrrolidone.
NPOC: 6-nitropiperonyloxycarbonyl.
NV: 6-nitroveratryl.
NVOC: 6-nitroveratryloxycarbonyl.
PG: protecting group.
TFA: trifluoroacetic acid.
THF: tetrahydrofuran.
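The binary strategy of definition 13 can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure of the specification: the function name `binary_strategy` and the three placeholder reactants are assumptions, and the switch matrix is modeled simply as every column of 0/1 switches of length m, where a 1 means the site is illuminated in that step and so receives that step's reactant.

```python
from itertools import product

def binary_strategy(reactants):
    """Return the switch-matrix columns and the product formed by each column.

    Each column is a tuple of 0/1 switches, one per synthesis step.
    A 1 means the site is illuminated (deprotected) in that step and
    therefore couples that step's reactant; a 0 means it stays protected.
    """
    m = len(reactants)
    switch_columns = list(product((0, 1), repeat=m))
    products = [
        "".join(r for r, on in zip(reactants, column) if on)
        for column in switch_columns
    ]
    return switch_columns, products

# Hypothetical ordered set of three reactants.
columns, products = binary_strategy(["A", "B", "C"])
for column, compound in zip(columns, products):
    print(column, compound or "(no coupling)")
```

With m ordered reactants the sketch yields 2^m products (including the site that is never illuminated), and each step illuminates exactly half of the sites, which is the sense in which the strategy is "binary."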
II. Synthesis
The present invention provides synthetic strategies and devices for the creation of large scale chemical diversity. Solid-phase chemistry, photolabile protecting groups, and photolithography are brought together to achieve light-directed
spatially-addressable parallel chemical synthesis in preferred embodiments.
The invention is described herein for purposes of illustration primarily with regard to the preparation of peptides and nucleotides but could readily be applied in the preparation of other molecules. Such molecules include, for example, both
linear and cyclic polymers of nucleic acids, polysaccharides, phospholipids, and peptides having either .alpha.-, .beta.-, or .omega.-amino acids, heteropolymers in which a known drug is covalently bound to any of the above, polyurethanes, polyesters,
polycarbonates, polyureas, n-alkylureas, polyamides, polyethyleneimines, polyarylene sulfides, polysiloxanes, polyimides, carbamates, sulfones, sulfoxides, polyacetates, or other polymers which will be apparent upon review of this disclosure. It will be
recognized further that peptide illustrations herein are primarily with reference to C- to N-terminal synthesis, but the invention could readily be applied to N- to C-terminal synthesis without departing from the scope of the invention. Methods for
forming cyclic and reversed polarity peptides and other polymers are described in copending application Ser. No. 796,727, filed Nov. 22, 1991, and previously incorporated herein by reference. Other molecules that are not conventionally viewed as
polymers but which are formed from a basis set of monomers or building blocks may also be formed according to the invention herein.
The prepared substrate may, for example, be used in screening a variety of polymers as ligands for binding with a receptor, although it will be apparent that the invention could be used for the synthesis of a receptor for binding with a ligand.
The substrate disclosed herein will have a wide variety of uses. Merely by way of example, the invention herein can be used in determining peptide and nucleic acid sequences that bind to proteins, finding sequence-specific binding drugs, identifying
epitopes recognized by antibodies, and evaluating a variety of drugs for clinical and diagnostic applications, as well as combinations of the above.
The invention preferably provides for the use of a substrate "S" with a surface. Linker molecules "L" are optionally provided on a surface of the substrate. The purpose of the linker molecules, in some embodiments, is to facilitate receptor
recognition of the synthesized polymers.
Optionally, the linker molecules are chemically protected for storage or synthesis purposes. A chemical protecting group such as t-BOC (t-butyloxycarbonyl) is used in some embodiments. Such chemical protecting groups would be chemically removed
upon exposure to, for example, an acidic solution and could serve, inter alia, to protect the surface during storage and be removed prior to polymer preparation.
When a polymer sequence to be synthesized is, for example, a polypeptide, amino groups at the ends of linkers attached to a glass substrate are derivatized with, for example, nitroveratryloxycarbonyl (NVOC), a photoremovable protecting group.
The linker molecules may be, for example, aryl acetylene, ethylene glycol oligomers containing from 2-10 monomers, diamines, diacids, amino acids, or combinations thereof.
According to one aspect of the invention, on the substrate or a distal end of the linker molecules, a functional group with a protecting group P.sub.0 is provided. The protecting group P.sub.0 may be removed upon exposure to an activator such as
a chemical reagent, radiation, electric fields, electric currents, or other activators to expose the functional group. In a preferred embodiment, the activator is ultraviolet (UV), infrared (IR), or visible light, or a basic or acidic reagent. In still
further alternative embodiments, ion beams, electron beams, or the like may be used for deprotection.
Photodeprotection is effected by illumination of the substrate through, for example, a mask wherein the pattern produces illuminated regions with dimensions of, for example, less than 1 cm.sup.2, 10.sup.-1 cm.sup.2, 10.sup.-2 cm.sup.2, 10.sup.-3
cm.sup.2, 10.sup.-4 cm.sup.2, 10.sup.-5 cm.sup.2, 10.sup.-6 cm.sup.2, 10.sup.-7 cm.sup.2, 10.sup.-8 cm.sup.2, or 10.sup.-10 cm.sup.2. In a preferred embodiment, the regions are between about 10.times.10 .mu.m and 500.times.500 .mu.m. According to some
embodiments, the masks are arranged to produce a checkerboard array of polymers, although any one of a variety of geometric configurations may be utilized.
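As a rough check on the preferred region sizes stated above, the following Python sketch converts a square region's side length into its area in cm.sup.2 and into the number of such regions that would tile one square centimeter. The helper names are assumptions made for illustration, and the calculation ignores any spacing between regions.

```python
UM_PER_CM = 10_000  # micrometers per centimeter

def region_area_cm2(side_um):
    """Area in cm^2 of a square region with the given side length in micrometers."""
    side_cm = side_um / UM_PER_CM
    return side_cm ** 2

def regions_per_cm2(side_um):
    """Number of square regions of that side length tiling 1 cm^2 with no gaps."""
    return (UM_PER_CM // side_um) ** 2

for side in (10, 500):
    print(side, region_area_cm2(side), regions_per_cm2(side))
```

Under these assumptions a 10.times.10 .mu.m region has an area of about 10.sup.-6 cm.sup.2 (one million regions per cm.sup.2), while a 500.times.500 .mu.m region has an area of 2.5.times.10.sup.-3 cm.sup.2 (400 regions per cm.sup.2).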
Concurrently with or after exposure of a known region of the substrate to light or another activator, the surface is contacted with a first monomer unit M.sub.1 which reacts with the functional group that has been exposed by the deprotection
step. The first monomer includes a protecting group P.sub.1. P.sub.1 may or may not be the same as P.sub.0.
Accordingly, after a first cycle, first regions of the surface comprise the sequence:
S-L-M.sub.1 -P.sub.1
while remaining regions of the surface comprise the sequence:
S-L-P.sub.0
Thereafter, one or more second regions of the surface (which may include all or part of the first region, as well as other regions) are exposed to light and contacted with a second monomer M.sub.2 (which may or may not be the same as M.sub.1)
having a protecting group P.sub.2. P.sub.2 may or may not be the same as P.sub.0 and P.sub.1. After this second cycle, different regions of the substrate may comprise one or more of the following sequences:
S-L-M.sub.1 -M.sub.2 -P.sub.2
S-L-M.sub.2 -P.sub.2
S-L-M.sub.1 -P.sub.1
S-L-P.sub.0
The above process is repeated until the substrate includes desired polymers of desired lengths. By controlling the locations of the substrate exposed to light and the reagents exposed to the substrate following exposure, one knows the location
of each sequence.
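The bookkeeping implied by the cycles described above can be sketched as a small Python simulation. It is a minimal model under stated assumptions, not the apparatus of the invention: sites are numbered, each cycle is a pair of (set of exposed sites, monomer), every exposed site is assumed to couple completely, and the name `run_cycles` is hypothetical.

```python
def run_cycles(n_sites, cycles):
    """Simulate masked deprotection followed by coupling.

    cycles: list of (exposed_sites, monomer) pairs applied in order.
    Returns one string per site, e.g. "S-L-M1-M2-P2".
    """
    chains = [[] for _ in range(n_sites)]  # monomers coupled at each site
    caps = ["P0"] * n_sites                # protecting group now on each site
    for step, (exposed, monomer) in enumerate(cycles, start=1):
        for site in exposed:
            chains[site].append(monomer)   # exposed site couples the monomer
            caps[site] = f"P{step}"        # the monomer carries its own group
    return ["-".join(["S", "L", *chain, cap]) for chain, cap in zip(chains, caps)]

# Two cycles on four sites: M1 at sites 0 and 1, then M2 at sites 1 and 2.
result = run_cycles(4, [({0, 1}, "M1"), ({1, 2}, "M2")])
for site, sequence in enumerate(result):
    print(site, sequence)
```

Because the exposure pattern and the reagent are recorded for every cycle, the sequence at each site follows directly from the synthesis history, which is why the location of each sequence is known.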
According to some embodiments of the invention, multiple protecting groups are utilized. For example, when light-labile protecting groups are utilized to protect the growing polymer chain, it will be desirable in some embodiments to provide
different protecting groups on at least selected side groups of the various monomers. For example, acid or base labile protecting groups may be particularly desirable when light labile protecting groups are used on the growing polymer chain. As a
specific example, in the case of amino acids, the sulfhydryl groups of cysteine side chains can form disulfide bonds with one another. Accordingly, it will sometimes be desirable to protect such side groups with an acid or base labile protecting group,
or a protecting group that is removed with a wavelength of light different from that which removes the protecting group on the growing polymer chain. Then, one can selectively couple these side chains by removing the appropriate protecting groups.
Thereafter, the protecting groups are removed from some or all of the substrate and the sequences are, optionally, capped with a capping unit C. The process results in a substrate having a surface with a plurality of polymers of the following
general formula:
S-[L]-M.sub.i . . . M.sub.x -[C]
where square brackets indicate optional groups, and M.sub.i . . . M.sub.x indicates any sequence of monomers. The number of monomers could cover a wide variety of values, but in a preferred embodiment they will range from 2 to 100.
In some embodiments, a plurality of locations on the substrate polymers contain a common monomer subsequence. For example, it may be desired to synthesize a sequence S-M.sub.1 -M.sub.2 -M.sub.3 at first locations and a sequence S-M.sub.4
-M.sub.2 -M.sub.3 at second locations. The process would commence with irradiation of the first locations followed by contacting with M.sub.1 -P, resulting in the sequence S-M.sub.1 -P at the first location. The second locations would then be
irradiated and contacted with M.sub.4 -P, resulting in the sequence S-M.sub.4 -P at the second locations. Thereafter both the first and second locations would be irradiated and contacted with monomers M.sub.2 and M.sub.3 (or with the dimer M.sub.2
-M.sub.3), resulting in the sequence S-M.sub.1 -M.sub.2 -M.sub.3 at the first locations and S-M.sub.4 -M.sub.2 -M.sub.3 at the second locations. Of course, common subsequences of any length could be utilized including those in a range of 2 or more
monomers, such as 2 to 10 monomers, 2 to 20 monomers, or 2 to 100 monomers.
The polymers prepared on a substrate according to the above methods will have a variety of uses including, for example, screening for biological activity, i.e., such as ability to bind to a receptor. In such screening activities, the substrate
containing the sequences is exposed to an unlabeled or labeled receptor such as an antibody, a receptor on a cell, a phospholipid vesicle, or any one of a variety of other receptors. In one preferred embodiment, the polymers are exposed to a first,
unlabeled or labeled receptor of interest and thereafter exposed to a labeled receptor-specific recognition element, which is, for example, an antibody. This process can provide signal amplification in the detection stage.
The receptor molecules may or may not bind with one or more polymers on the substrate. The presence (or lack thereof) of the labeled receptor and, therefore, the presence of a sequence which binds with the receptor is detected in a preferred
embodiment through the use of autoradiography, detection of fluorescence with a charge-coupled device, fluorescence microscopy, or the like. The sequence of the polymer at the locations where the receptor binding is detected may be used to determine all
or part of a sequence which is complementary to the receptor.
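The decoding step described above amounts to a lookup from detected locations to known sequences. The Python sketch below illustrates this under stated assumptions: the layout, the signal values, and the threshold are hypothetical placeholders chosen for illustration, not measured data, and the function name `binders` is likewise an assumption.

```python
def binders(layout, signal, threshold):
    """Return (sequence, signal) pairs at or above the threshold, strongest first."""
    hits = [
        (layout[site], value)
        for site, value in signal.items()
        if value >= threshold
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical layout (site -> sequence) and readings, for illustration only.
layout = {(0, 0): "M1-M2", (0, 1): "M1-M3", (1, 0): "M2-M3", (1, 1): "M3-M3"}
signal = {(0, 0): 0.05, (0, 1): 0.92, (1, 0): 0.10, (1, 1): 0.61}

hits = binders(layout, signal, threshold=0.5)
print(hits)
```

In practice the threshold would be chosen relative to background and replicate variation; the sketch only shows how positional knowledge converts a detected signal into a candidate sequence.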
Use of the invention herein is illustrated primarily with reference to screening for binding to a complementary receptor. The invention will, however, find many other uses. For example, the invention may be used in information storage (e.g., on
optical disks), production of molecular electronic devices, production of stationary phases in separation sciences, production of dyes and brightening agents, photography, and in immobilization of cells, proteins, lectins, nucleic acids, polysaccharides,
and the like in patterns on a surface via molecular recognition of specific polymer sequences. By synthesizing the same compound in adjacent, progressively differing concentrations, one can establish a gradient to control chemotaxis or to develop
diagnostic "dipsticks," which, for example, titrate an antibody against an increasing amount of antigen. By synthesizing several catalyst molecules in close proximity, one can achieve more efficient multistep conversions by "coordinate immobilization."
Coordinate immobilization also may be used for electron transfer systems, as well as to provide both structural integrity and other desirable properties to materials, such as lubrication, wetting, etc.
According to alternative embodiments, molecular biodistribution or pharmacokinetic properties may be examined. For example, to assess resistance to intestinal or serum proteases, polymers may be capped with a fluorescent tag and exposed to
biological fluids of interest.
A high degree of miniaturization is possible, because the density of compounds on the surface is determined largely with regard to spatial addressability of the activator, in one case the diffraction of light. Each compound is physically
accessible and its position is precisely known. Hence, the array is spatially-addressable, and its interactions with other molecules can be assessed.
According to one aspect of the invention, reactions take place in an appropriate reaction chamber that includes isolated fluid flow paths for heating or cooling liquids that are used to maintain the reaction chamber temperature at a desired
level. In still further embodiments the reaction chamber is placed on a rotating "centrifuge" to reduce the volume of reactants needed for the various coupling/deprotection steps disclosed herein. In a centrifuge flow cell, the substrate is placed in
the centrifuge such that, for example, when a monomer solution passes over the surface of the substrate a relatively thin film of the material is formed on the substrate due to the higher gravitational forces acting on the substrate. Accordingly, the
volume of various reagents needed in the synthesis will be substantially reduced.
A. Systematic Substitution
According to one preferred embodiment of the invention, a "lead" sequence is identified using either the light directed techniques described herein, or more conventional methods such as those described in Geysen, J. Imm. Methods (1987)
102:259-274, incorporated herein by reference for all purposes, or through other knowledge of the structure of the receptor in question, such as through computer modeling information. As used herein, a "lead" or "kernel" sequence is a molecule having a
monomer sequence which has been shown to exhibit at least limited binding affinity with a receptor or class of receptors.
Thereafter, a series of molecules related to the lead sequence are generated by systematic substitution, deletion, addition, or a combination of these processes at one or more positions of the molecule. A sequence with a binding affinity higher
than the lead sequence can be (or may be) identified through evaluation of the molecules produced by these processes.
One aspect of the invention herein provides for improved methods for forming molecules with systematically substituted monomers or groups of monomers using a limited number of synthesis steps. Like the other embodiments of the invention
described herein, this aspect of the invention has applicability not only to the evaluation of peptides, but also other molecules, such as oligonucleotides and polysaccharides. Light-directed techniques are utilized in preferred embodiments because of
the significant savings in time, labor, and the like.
According to one aspect of the invention, a lead polymer sequence is identified using conventional techniques or the more sophisticated light directed techniques described herein. The lead sequence is generally represented herein by:
ABCDEFG
where the various letters refer to amino acids or other monomers in their respective positions in the lead sequence. Although a polymer with seven monomers is used herein for the purpose of illustration, a larger or smaller number of monomers
will typically be found in the lead polymer in most embodiments of the invention.
Using a selected basis set of monomers, such as twenty amino acids or four nucleotides, one generates the following series of systematically substituted polymers. The sequence of the molecules generated is determined with reference to the
columns of the map. In other words, the "map" below can be viewed as a cross-section of the substrate:
______________________________________
G G G G G G X
F F F F F X F
E E E E X E E
D D D X D D D
C C X C C C C
B X B B B B B
X A A A A A A
______________________________________
where X represents the monomers in a basis set of monomers such as twenty amino acids. For example, the twenty polymers XBCDEFG are generated within 20 individual synthesis sites on the substrate. In the case of 7-monomer lead peptides, the
total number of peptides generated with all twenty monomers in the basis set is 140, i.e., 7*20, with 134 unique sequences being made and 7 occurrences of the lead sequence.
One of the least efficient ways to form this array of polymers would be via conventional synthesis techniques, which would require about 938 coupling steps (134 peptides * 7 residues each). At the other extreme, each of these sequences could be
made in 7 steps, but the sequences would be physically mixed, requiring separation after screening.
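The counts given above can be checked with a short enumeration. The following is a minimal sketch only: the twenty basis monomers are represented by the abstract labels A through T (an assumption made for illustration, not amino acid codes), so that the lead ABCDEFG is drawn from the basis set, and the names `mono_substitutions`, `LEAD`, and `BASIS` are hypothetical.

```python
from string import ascii_uppercase

LEAD = "ABCDEFG"
# Assumption: 20 abstract monomer labels A..T stand in for the basis set.
BASIS = ascii_uppercase[:20]


def mono_substitutions(lead, basis):
    """Return one list per lead position, each holding every single substitution."""
    return [
        [lead[:i] + x + lead[i + 1:] for x in basis]
        for i in range(len(lead))
    ]


columns = mono_substitutions(LEAD, BASIS)
sequences = [s for column in columns for s in column]

total = len(sequences)                 # 7 positions * 20 monomers
unique = len(set(sequences))           # repeated copies of the lead collapse to one
lead_copies = sequences.count(LEAD)    # the lead recurs once per position
conventional_couplings = unique * len(LEAD)  # one coupling per residue per peptide

print(total, unique, lead_copies, conventional_couplings)
```

Under these assumptions the four values correspond to the 140 peptides, 134 unique sequences, 7 occurrences of the lead, and roughly 938 conventional coupling steps described above.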
By contrast, this aspect of the invention provides for efficient synthesis of substituted polymers. FIGS. 1A-1C illustrate the masking strategy for the 7-monomer lead polymer. The particular masking strategy illustrated in FIGS. 1A-1C utilizes
rectangular masks, but it will be apparent that other shapes of masks may also be used without departing from the scope of the invention herein. Masking techniques wherein regions of a substrate are selectively activated by light are described herein by
way of a preferred embodiment. The inventions herein are not so limited, however, and other activation techniques may be utilized. For example, mechanical techniques of activation/coupling such as described in copending application Ser. No. 07/796,243
are used in some embodiments.
As shown in FIG. 1A, the process begins by exposing substantially all of a predefined region of the substrate to light with a mask 291, exposing approximately 6/7 of the region of interest 289. This step is followed by exposure of the substrate
to monomer A. Thereafter, a mask 292 is used to expose approximately 5/7 of the region of interest, followed by coupling of B. It will be recognized that mask 292 may in fact be the same mask as 291 but translated across the substrate. Accordingly,
regions indicated by dashed line 290 may also be exposed to light in this step, as well as in later steps. Thereafter, subsequent masking steps expose 4/7, 3/7, 2/7, and 1/7 of the area of interest on the substrate, each mask being used to couple a
different monomer (C, D, E, F) to the substrate. The resulting substrate is schematically illustrated in the bottom portion of FIG. 1A along with the resulting polymer sequences thereon. Again, the composition of the sequences on the substrate is given
by the vertical column such as "ABCDEF." As seen, 1- to 6-membered truncated portions of the target ABCDEFG are formed.
FIG. 1B illustrates the next series of masking steps. As shown in FIG. 1B, the same mask 293 is used for each masking step, but the mask is translated with respect to the substrate in each step. In each step, the mask illuminates a portion of
each of the "stripes" of polymers formed in FIG. 1A, and in each step a different one of the monomers in a basis set is coupled to the substrate. The mask is then translated downwards for irradiation of the substrate and coupling of the next monomer.
In the first step, the mask exposes the top 1/20 of each "stripe" of polymers shown in FIG. 1A, and monomer X.sub.1 is coupled to this region. In the second step, the mask is translated downwards and X.sub.2 is coupled, etc. The resulting substrate is
shown in the bottom portion of FIG. 1B. An additional stripe 295 is formed adjacent the region addressed in FIG. 1A, this region containing a series of subregions, each containing one of the 20 monomers in this particular basis set.
Accordingly, after the steps shown in FIG. 1B, the substrate contains the following polymer sequences on the surface thereof (columns again indicating the sequences formed on the substrate):
______________________________________
            X
          X F
        X E E
      X D D D
    X C C C C
  X B B B B B
X A A A A A A
______________________________________
where X indicates that an individual region contains one of each of the monomers in the basis set. Accordingly, for example, each of the 20 dimers AX are generated when the basis set is the 20 natural L-amino acids typically found in proteins.
The 20 trimers of ABX, 4-mers of ABCX, etc., are also formed at predefined regions on the substrate.
Thereafter, as shown in FIG. 1C, the process continues, optionally using the same mask(s) used in FIG. 1A. The masks differ only in that they have been translated with respect to the substrate. In step 27, monomer B is added to the substrate
using the mask that illuminates only the right 1/7 of the region of the substrate of interest. In step 28, the right 2/7 of the substrate is exposed and monomer C is coupled, etc. As shown in the bottom portion of FIG. 1C, the process results in the
generation of all possible polymers based on the polymer ABCDEFG, wherein each monomer position is systematically substituted with all possible monomers from a basis set.
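The three masking phases of FIGS. 1A-1C can be modeled as couplings applied to a grid of synthesis sites. The sketch below is illustrative only: it assumes an idealized 20-row by 7-column grid, abstract monomer labels A through T for the basis set, and a column ordering that follows the tables herein rather than the left/right orientation of the figures. The function name `synthesize` is hypothetical.

```python
from string import ascii_uppercase

LEAD = "ABCDEFG"
# Assumption: 20 abstract monomer labels A..T stand in for the basis set.
BASIS = ascii_uppercase[:20]


def synthesize(lead, basis):
    """Simulate the three masking phases; return the site grid and the step count."""
    n, m = len(lead), len(basis)
    grid = [[""] * n for _ in range(m)]  # grid[row][col] is the polymer at that site
    steps = 0

    def couple(monomer, rows, cols):
        # One masking step: expose the selected sites, then couple one monomer.
        nonlocal steps
        for r in rows:
            for c in cols:
                grid[r][c] += monomer
        steps += 1

    all_rows = range(m)
    # Phase 1 (FIG. 1A): A on 6/7 of the columns, B on 5/7, ..., F on 1/7.
    for k in range(n - 1):
        couple(lead[k], all_rows, range(k + 1, n))
    # Phase 2 (FIG. 1B): one horizontal stripe per basis monomer, across all columns.
    for r, x in enumerate(basis):
        couple(x, [r], range(n))
    # Phase 3 (FIG. 1C): B on 1/7 of the columns, C on 2/7, ..., G on 6/7.
    for k in range(1, n):
        couple(lead[k], all_rows, range(k))
    return grid, steps


grid, steps = synthesize(LEAD, BASIS)
print(steps, grid[0][0], grid[19][6])
```

In this model, column c ends up holding the lead with position c replaced by each basis monomer in turn, and the whole array is produced in 6 + 20 + 6 coupling steps.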
A number of variations of the above technique will be useful in some applications. For example, in some embodiments the process is varied slightly to form disubstitutions of a lead polymer in which the substitutions are in adjacent locations in
the polymer. Such arrays are formed by one of a variety of techniques, but one simple technique provides for each of the masks illustrated in steps 7-26 of FIG. 1B to overlap the previous mask by some fraction, e.g., 1/3, of its height. According to
such embodiments, the following array of polymers would be generated, in addition to the previous array of 134:
______________________________________
G G G G G G X
F F F F F X X
E E E E X X F
D D D X X E E
B X X C C C C
X X B B B B B
X A A A A A A
______________________________________
The scheme may be expanded to produce tri-substituted, tetra-substituted, etc. molecules. Accordingly, the present invention provides a method of forming all molecules in which at least one location in the polymer is systematically substituted
with all possible monomers from a basis set.
According to a preferred aspect of the invention, masks are formed and reused to minimize the number of masks used in the process. FIG. 2 illustrates how masks may be designed in this manner. For simplicity, a 5-monomer synthesis is
illustrated. The masks are illustrated from "above" in FIG. 2, with the cross hatching indicating light-transmissive regions. The resulting substrate is shown in the bottom portion of FIG. 2, with the region of primary interest for mono-substitutions
indicated by the arrows.
FIG. 2 illustrates how to generate and systematically substitute all of the 5-mers contained in a 6-mer lead. A limited set of the possible 2-, 3-, and 4-mers is also synthesized. A 4-pattern mask was used. To make the 6-mers in a 7-mer
kernel, a 5-pattern mask is used. To make the 7-mers in an 8-mer kernel, a 6-pattern mask is used, etc. To make all the 7-mers in a 12-mer kernel, a 6-pattern mask is still all that is needed. As a "bonus," all of the truncation sequences are generated
with this strategy of letting the masks extend beyond the "desired" regions.
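The relationship between the length of the contained sequences and the number of mask patterns can be sketched as follows. This is an inference from the four examples given above (5-mers with a 4-pattern mask, 6-mers with a 5-pattern mask, 7-mers with a 6-pattern mask regardless of kernel length), not a rule stated herein, and the function names `contained_kmers` and `mask_patterns` are hypothetical.

```python
def contained_kmers(kernel, k):
    """List every contiguous k-mer contained in the kernel sequence."""
    return [kernel[i:i + k] for i in range(len(kernel) - k + 1)]


def mask_patterns(k):
    """Assumed rule, inferred from the examples in the text: a k-mer needs k - 1 patterns."""
    return k - 1


# 5-mers contained in a 6-mer lead, as in FIG. 2.
print(contained_kmers("ABCDEF", 5), mask_patterns(5))

# 7-mers contained in a 12-mer kernel: more windows, same pattern count as for an 8-mer kernel.
print(len(contained_kmers("ABCDEFGHIJKL", 7)), mask_patterns(7))
```

Under this assumed rule the pattern count depends only on the length of the sequences being substituted, which is consistent with the statement that a 6-pattern mask suffices for 7-mers in either an 8-mer or a 12-mer kernel.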
For example, FIG. 2 illustrates that in a 6-monomer sequence the 1- to 5-position substituted polymers are formed in the primary regions of interest (indicated by arrows), the 2- to 6-position substituted polymers are formed in region 403, the 3-
to 6-position substituted polymers are formed in region 404, the 4- to 6-position substituted polymers in region 405, etc. The 1- to 4-position substituted polymers are formed in region 406, the 1- to 3-position substituted polymers in region 407, etc.
In some embodiments, the substrate is only as large as the region indicated by arrows. It will often be desirable, however, to synthesize all of the molecules illustrated in FIG. 2 since the deletion sequences and others found outside of the
region delineated by arrows will often provide additional valuable binding information.
As shown, a single mask 401 is used for all of steps 1-5, while another mask 402 is used for steps 6-11. The same mask 401 is used for steps 12-16, but the mask is rotated preferably 180 degrees with respect to the substrate. The light
transmissive regions of the mask 401 extend the full length ("Y") of the area of interest in the y-direction. As shown, in step 1, the mask is placed above the substrate, the substrate is exposed, and the A monomer is coupled to the substrate in selected
portions of the substrate corresponding at least to the region 401a. Monomer A may also be placed at other locations on the substrate at positions corresponding to mask regions 401b, 401c, and 401d. Alternatively, these regions of the mask may simply
illuminate regions that are off of the substrate or otherwise not of interest. If these regions correspond to regions of the substrate, various truncated analogs of the sequences will be formed.
Thereafter, as shown in step 2, the same mask is utilized, but it is translated to the right. The substrate is exposed by the mask, and the monomer B is then coupled to the substrate. Thereafter, coupling steps 3, 4, and 5 are conducted to
couple monomers C, D and E, respectively. These steps also use the same mask translated to the right in the manner shown.
Thereafter, mask 402 is used for the "X" coupling steps. The mask 402 contains a single stripe that extends the full length ("X") of the area of interest in the x-direction. Mask 402 is preferably a single, linear stripe that will normally be
of width Y divided by the number of monomers in the substitution basis set. For example, in the case of 6 amino acids as a basis set for the monosubstitution of peptides, the stripe will have a width of Y/6. The mask is repeatedly used to couple each
of the monomers in the basis set of monomers and is translated downwards (or upwards) after each coupling step. For example, the mask may be placed at the top of the region of interest for the first coupling step, followed by translation downwards by
1/6 of the Y dimension for each successive coupling step when 6 monomers are to be substituted into the target. FIG. 2 shows only 2 mask steps for simplicity, but a greater number will normally be used.
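The stripe geometry for mask 402 can be expressed as a small calculation. The sketch below assumes an idealized, unit-free Y dimension measured from the top of the region of interest; the function name `stripe_positions` is hypothetical.

```python
from fractions import Fraction


def stripe_positions(y_extent, basis_size):
    """Return (top, bottom) offsets of the stripe for each successive coupling step."""
    width = Fraction(y_extent) / basis_size  # stripe width is Y / (number of monomers)
    return [(i * width, (i + 1) * width) for i in range(basis_size)]


# Six monomers in the basis set: each stripe is Y/6 wide and the mask
# is translated by Y/6 after every coupling step.
stripes = stripe_positions(1, 6)
for step, (top, bottom) in enumerate(stripes, start=1):
    print(step, top, bottom)
```

Exact fractions are used so that the last stripe ends precisely at Y, with no gaps or overlaps between successive mask positions.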
Thereafter the mask 401 is again utilized for the remaining coupling steps. As shown, the mask 401 is rotated, preferably 180 degrees, for the following coupling steps. The succeeding coupling steps 12-16 are used to couple monomers B-F,
respectively. The resulting substrate is shown in "cross section" in the bottom portion of FIG. 2. Again, the primary area of interest is designated by arrows and may be the only region used for synthesis on the substrate. The truncated substitutions
outside of this region will also provide valuable information, however.
One extension of this method provides for the synthesis of all the possible double substitutions of a kernel sequence. For a kernel sequence 7 residues long, there are 8400 peptides that make up all possible disubstitutions of 20 amino acids,
not considering replication (8380 unique). These peptides can be synthesized in 55 steps with 17 masks. The synthesized sequences are shown below:
__________________________________________________________________________
G G G G G X G G G G X G G G X G G X G X X
F F F F X F F F F X F F F X F F X F X F X
E E E X E E E E X E E E X E E X E E X X E
D D X D D D D X D D D X D D D X X X D D D
C X C C C C X C C C C X X X X C C C C C C
X B B B B B X X X X X B B B B B B B B B B
X X X X X X A A A A A A A A A A A A A A A
__________________________________________________________________________
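The total of 8400 disubstituted peptides follows from the 21 ways of choosing two positions out of seven (the 21 columns of the table above) multiplied by the 20 * 20 monomer pairs. A minimal sketch of that enumeration is given below, again assuming abstract monomer labels A through T for the basis set; the function name `double_substitutions` is hypothetical.

```python
from itertools import combinations, product
from string import ascii_uppercase

LEAD = "ABCDEFG"
# Assumption: 20 abstract monomer labels A..T stand in for the 20 amino acids.
BASIS = ascii_uppercase[:20]


def double_substitutions(lead, basis):
    """Substitute every pair of positions with every ordered pair of basis monomers."""
    seqs = []
    for i, j in combinations(range(len(lead)), 2):
        for x, y in product(basis, repeat=2):
            s = list(lead)
            s[i], s[j] = x, y
            seqs.append("".join(s))
    return seqs


seqs = double_substitutions(LEAD, BASIS)
total = len(seqs)                       # 21 position pairs * 400 monomer pairs
lead_copies = seqs.count(LEAD)          # the lead recurs once per position pair
without_extra_leads = total - (lead_copies - 1)
distinct = len(set(seqs))               # every repeated sequence collapsed

print(total, lead_copies, without_extra_leads, distinct)
```

In this model the figure of 8380 is obtained when only the surplus copies of the lead sequence are discounted. Collapsing every repeated sequence, including mono-substituted sequences that recur across position pairs, gives the smaller value held in `distinct`.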
The systematic substitution of three or more positions of the kernel sequence is also easily derived. The optimum polymer identified from the above strategy can then serve as the new kernel sequence in further iterations of this process. The
present method may be used for any desired systematic substitution set, such as all 8-mers in a 12-mer kernel, substitution in cyclic polymers, and the like. This method provides a powerful technique for the optimization of ligands that bind to a
molecular recognition element.
B. Cyclic Polymer Mapping
Copending application Ser. No. 07/796,727 (Entitled "Polymer Reversal on Solid Surfaces"), incorporated herein by reference for all purposes, discloses a method for forming cyclic polymers on a solid surface. According to one aspect of the
present invention, improved strategies for forming systematically varied cyclic polymers are provided.
In the discussion below, "P" refers to a protecting group, and X, Y, and Z refer to the various reactive sites on a tether molecule T. A, B, C, D, E, and F refer to various monomers or groups of monomers. To synthesize a cyclic polymer according
to one aspect of the invention herein, the process is conducted on a substrate. A tether molecule T is coupled to a surface of the substrate. T may be one of the monomers in the polymer, such as glutamic acid in the case of amino acids. Other examples
of amino acid tether molecules include, but are not limited to, serine, threonine, cysteine, aspartic acid, glutamic acid, tyrosine, 4-hydroxyproline, homocysteine, cysteinesulfinic acid, homoserine, ornithine, and the like. The tether molecule includes
one or more reactive sites such as a reactive site Z which is used to couple the tether to the substrate. The tether also includes a reactive site X having a protecting group P.sub.2 thereon. The tether molecule further includes a reactive site Y with
a protecting group P.sub.1 thereon.
In a first step, a polymer synthesis is carried out on the reactive site Y. According to some embodiments, conventional polymer synthesis techniques are utilized such as those described in Atherton et al., previously incorporated herein by
reference for all purposes. A wide variety of techniques may be used in alternative embodiments. For example, according to one embodiment, a variety of polymers with different monomer sequences are synthesized on the substrate. Such techniques may
involve the sequential addition of monomers or groups of monomers on the growing polymer chain, each monomer of which may also have a reactive site protected by a protecting group.
A variety of such methods are available for synthesizing different polymers on a surface. For example, Geysen et al., "Strategies for Epitope Analysis Using Peptide Synthesis," J. Imm. Meth., (1987) 102:259-274, incorporated herein by reference
for all purposes, describes one commonly used technique for synthesizing different peptides using a "pin" technique. Other techniques include those of Houghten et al., Nature (1991) 354:84-86, incorporated herein by reference. In some embodiments,
advanced techniques for synthesizing polymer arrays are utilized, such as those described in copending application Ser. No. 07/796,243, or light-directed, spatially-addressable techniques disclosed in Pirrung et al., U.S. Pat. No. 5,143,854; U.S.
application Ser. No. 07/624,120; and Fodor et al., "Light-Directed Spatially-Addressable Parallel Chemical Synthesis," Science (1991) 251:767-773, all incorporated herein by reference for all purposes, such techniques being referred to herein for
purposes of brevity as VLSIPS.TM. (Very Large Scale Immobilized Polymer Synthesis) techniques.
During polymer synthesis, the activator used to remove P.sub.1 (if any) on the Y reactive site, and on reactive sites of the growing polymer chain, should be different from the activator used to remove the X protecting group P.sub.2. Merely by
way of example, the activator used to remove P.sub.2 may be a first chemical reagent, while the activator used to remove the protecting group P.sub.1 may be a second, different chemical reagent such as acid or base. By way of further example, the
activator used to remove one of the protecting groups may be light, while the activator used to remove the other protecting group may be a chemical reagent, or both activators may be light, but of different wavelengths. Of course, other combinations
will be readily apparent to those of skill in the art on review of this disclosure.
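The requirement that the two protecting groups respond to different activators can be illustrated with a small data model. The sketch below is an assumption-laden illustration, not a description of any actual chemistry: the `Site` class, the `deprotect` and `check_orthogonal` functions, and the choice of "light" for P1 and "chemical reagent" for P2 are hypothetical and merely follow one of the example pairings given above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Site:
    """A reactive site on the tether; protecting_group is None when unprotected."""
    protecting_group: Optional[str] = None
    chain: List[str] = field(default_factory=list)


def check_orthogonal(activators: Dict[str, str]) -> None:
    """Raise if two protecting groups would be removed by the same activator."""
    if len(set(activators.values())) != len(activators):
        raise ValueError("protecting groups must use different activators")


def deprotect(sites: Dict[str, Site], activator: str,
              activators: Dict[str, str]) -> List[str]:
    """Remove only the protecting groups that respond to the given activator."""
    removed = []
    for name, site in sites.items():
        group = site.protecting_group
        if group is not None and activators[group] == activator:
            site.protecting_group = None
            removed.append(name)
    return removed


# Assumed pairing for illustration: P1 removed by light, P2 by a chemical reagent.
ACTIVATORS = {"P1": "light", "P2": "chemical reagent"}
tether = {"X": Site("P2"), "Y": Site("P1"), "Z": Site()}  # Z is bound to the substrate

check_orthogonal(ACTIVATORS)
exposed = deprotect(tether, "light", ACTIVATORS)
print(exposed, tether["X"].protecting_group)
```

Exposure to the P1 activator frees only site Y for polymer synthesis, while site X keeps its protecting group P2 until the second activator is applied.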
By virtue of proper protecting group selection and exposure to only the P.sub.1 activato