Claims
I claim:
1. A method for determining the sequence of a nucleic acid, comprising the
steps of:
a) generating at least two conditioned, base-specifically terminated
nucleic acid fragments from a nucleic acid to be sequenced;
b) determining the molecular weight value of each base-specifically
terminated fragment by mass spectrometry, wherein the molecular weight
values of at least two base-specifically terminated fragments are
determined concurrently; and
c) determining the sequence of the nucleic acid by aligning the
base-specifically terminated nucleic acid fragments according to molecular
weight.
2. The method according to claim 1 wherein the nucleic acid fragments are
purified before the step of determining the molecular weight values by
mass spectrometry.
3. The method according to claim 2 wherein the nucleic acid fragments are
purified, comprising the steps of:
i) reversibly immobilizing the nucleic acid fragments on a solid support;
and
ii) washing out all remaining reactants and by-products.
4. The method according to claim 3, further comprising the step of removing
the nucleic acid fragments from the solid support.
5. The method according to claim 3, wherein each fragment is coupled by the
linking group (L) to a functionality (L') on the support creating a
temporary and cleavable attachment of the nucleic acid fragments to the
support.
6. The method according to claim 5, wherein the base-specifically
terminated nucleic acid fragments are cleaved from the solid support prior
to mass spectrometry.
7. The method according to claim 5, wherein the base-specifically
terminated nucleic acid fragments are cleaved from the solid support
during mass spectrometry.
8. The method according to claim 5, wherein the temporary and cleavable
attachment can be cleaved enzymatically, chemically or physically.
9. The method according to claim 8, wherein the temporary and cleavable
attachment is selected from the group consisting of a photocleavable bond,
a bond based on strong electrostatic interaction, a tritylether bond, a
β-benzoylpropionyl group, a levulinyl group, a disulfide bond, an
arginine/arginine bond, a lysine/lysine bond, a pyrophosphate bond, and a
bond created by Watson-Crick base pairing.
10. The method according to claim 1, wherein step a), the nucleic acid
fragments are conditioned by cation exchange.
11. The method according to claim 1, wherein step a), the nucleic acid
fragments are conditioned by mass modification.
12. The method according to claim 11, wherein each nucleic acid fragment is
synthesized using a nucleic acid primer; in the presence of
chain-terminating and chain-elongating deoxynucleotides; and wherein at
least one chain-elongating deoxynucleotide is selected from the group
consisting of deoxyadenosine triphosphate dATP, deoxythymidine
triphosphate dTTP, deoxyguanosine triphosphate dGTP, deoxycytidine
triphosphate dCTP, deoxyinosine triphosphate dITP, a 7-deazadeoxyguanosine
triphosphate c⁷dGTP, a 7-deazadeoxyadenosine triphosphate c⁷dATP, and a
7-deazadeoxyinosine triphosphate c⁷dITP; at least one
chain-terminating dideoxynucleotide selected from the group consisting of
dideoxyadenosine triphosphate ddATP, dideoxythymidine triphosphate ddTTP,
dideoxyguanosine triphosphate ddGTP, and dideoxycytidine triphosphate
ddCTP; and a DNA polymerase.
13. The method according to claim 12, wherein the nucleic acid primer
further includes a linking group (L) for reversibly immobilizing the
primer on a solid support.
14. The method according to claim 11, wherein each nucleic acid fragment is
synthesized using chain terminating and chain elongating nucleotides and
wherein at least one chain elongating nucleotide is selected from the
group consisting of adenosine triphosphate (ATP), uridine triphosphate
(UTP), guanosine triphosphate (GTP), cytidine triphosphate (CTP), inosine
triphosphate (ITP), a 7-deazaadenosine triphosphate (c⁷ATP), a
7-deazaguanosine triphosphate (c⁷GTP), and a 7-deazainosine
triphosphate (c⁷ITP); and at least one chain-terminating
3'-deoxynucleotide selected from the group consisting of deoxyadenosine
triphosphate 3'-dATP, deoxyuridine triphosphate 3'-dUTP, deoxyguanosine
triphosphate 3'-dGTP, and deoxycytidine triphosphate 3'-dCTP; and an RNA
polymerase.
15. The method according to claim 1, wherein the molecular weight value of
each nucleic acid fragment is determined by matrix-assisted laser
desorption/ionization mass spectrometry (MALDI-MS).
16. The method according to claim 1 in which the molecular weight value of
each nucleic acid fragment is determined by electrospray mass spectrometry
(ES-MS).
17. The method of claim 1, wherein step b) is performed without first
performing an electrophoretic separation.
18. A method of claim 1, wherein the base-specifically terminated nucleic
acid fragments are conditioned by removing the negative charge from the
phosphodiester backbone.
19. A method of claim 1, wherein the base-specifically terminated nucleic
acid fragments are conditioned by purification.
20. The method according to claim 1, wherein more than one species of
nucleic acid are concurrently sequenced by multiplex mass spectrometric
nucleic acid sequencing employing tag probes, nucleic acid primers,
chain-elongating nucleotides, and chain-terminating nucleotides, wherein
one of the sets of base-specifically terminated fragments is unmodified
and the other sets of base-specifically terminated fragments are mass
modified, and each of the sets of base-specifically terminated nucleic
acid fragments has a sufficient mass difference to be distinguished from
the others by mass spectrometry.
21. The method according to claim 20, wherein at least one of the sets of
mass-modified base-specifically terminated fragments is modified with a
mass-modifying functionality (M) at a heterocyclic base of at least one
nucleotide.
22. The method according to claim 21, wherein the heterocyclic
base-modified nucleotide is selected from the group consisting of a
cytosine nucleotide modified at C-5, a thymine nucleotide modified at C-5,
a thymine nucleotide modified at the C-5 methyl group, a uracil nucleotide
modified at C-5, an adenine nucleotide modified at C-8, a
c⁷-deazaadenine modified at C-8, a c⁷-deazaadenine modified at C-7, a
guanine nucleotide modified at C-8, a c⁷-deazaguanine modified at
C-8, a c⁷-deazaguanine modified at C-7, a hypoxanthine modified at
C-8, a c⁷-deazahypoxanthine modified at C-7, and a
c⁷-deazahypoxanthine modified at C-8.
23. The method according to claim 20, wherein at least one of the sets of
mass-modified base-specifically terminated nucleic acid fragments is
modified with a mass-modifying functionality (M) attached to one or more
phosphate moieties of the internucleotidic linkages of the fragments.
24. The method according to claim 20, wherein at least one of the sets of
mass-modified base-specifically terminated nucleic acid fragments is
modified with a mass-modifying functionality (M) attached to one or more
sugar moieties of nucleotides within the set of mass modified
base-specifically terminated fragments at at least one sugar position
selected from the group consisting of a C-2' position, an external C-3'
position, and an external C-5' position.
25. The method according to claim 20, wherein at least one of the sets of
mass-modified base-specifically terminated nucleic acid fragments is
modified with a mass-modifying functionality (M) attached to the sugar
moiety of a 5'-terminal nucleotide and wherein the mass-modifying function
(M) is the linking functionality (L).
26. The method according to claim 20, wherein a mass-modifying
functionality (M) is attached to a set of base-specifically terminated
nucleic acid fragments subsequent to generating the base-specifically
terminated nucleic acid fragments and prior to determining the molecular
weight values for the nested fragments by mass spectrometry.
27. The method according to claim 26, wherein the base-specifically
terminated nucleic acid fragments are generated using at least one reagent
selected from the group consisting of a nucleic acid primer, a
chain-elongating nucleotide, a chain-terminating nucleotide, a tag probe
which has been modified with a precursor of the mass-modifying
functionality, M; and a subsequent step comprises modifying the precursor
of the mass-modifying functionality, M, to generate the mass-modifying
functionality, M, prior to mass spectrometric analysis.
28. The method according to claim 20, wherein mass differentiation of the
tag probes is achieved by changing the nucleotide composition of at least
one of the tag probes and complementary tag sequence in the species of
nucleic acid.
29. The method according to claim 20, wherein the tag probes are covalently
bound to the corresponding complementary tag sequence prior to mass
spectrometric analysis.
30. The method according to claim 29, wherein binding between the tag
probes and the corresponding complementary tag sequences is achieved
photochemically via photoactivatable groups.
31. A method of sequencing a nucleic acid, comprising the steps of:
a) reversibly linking an oligonucleotide primer to a solid support;
b) generating at least two conditioned, base-specifically terminated
nucleic acid fragments;
c) determining the molecular weight value of each nested fragment in each
of the four sets of base-specifically terminated fragments by matrix
assisted laser desorption/ionization mass spectrometry wherein the
molecular weight values of at least two base-specifically terminated
fragments are determined concurrently and wherein the nested fragments are
cleaved from the solid support by a laser during mass spectrometry; and
d) determining the nucleotide sequence by aligning the base specifically
terminated fragments according to molecular weight.
32. The method according to claim 31, wherein the base-specifically
terminated fragments are cleaved from the solid support prior to mass
spectrometry.
33. The method according to claim 31, wherein the base-specifically
terminated fragments are cleaved from the solid support during mass
spectrometry.
34. The method according to claim 31, wherein the reversible linkage is a
photocleavable bond.
35. The method according to claim 31, wherein step b), the nucleic acid
fragments are conditioned by cation exchange.
36. The method according to claim 31, wherein step b), the nucleic acid
fragments are conditioned by mass modification.
37. The method according to claim 31, wherein the base-specifically
terminated fragments are conditioned by purification.
38. The method according to claim 31, wherein the base-specifically
terminated fragments are conditioned by removal of the negative charge of
the phosphodiester backbone.
39. A method of multiplex analysis of nucleic acid sequences, comprising
the steps of:
a) reversibly linking a nucleic acid primer to a solid support;
b) generating at least two conditioned, base-specifically terminated
nucleic acid fragments;
c) determining the molecular weight value of each fragment by matrix
assisted laser desorption/ionization mass spectrometry wherein the
molecular weight values of at least two base-specifically terminated
fragments are determined concurrently and wherein the fragments are
cleaved from the solid support by a laser during mass spectrometry; and
d) determining the nucleotide sequence by aligning the fragments according
to molecular weight; wherein at least one reagent selected from a group
consisting of, a nucleic acid primer, a chain-elongating nucleotide, and a
chain-terminating nucleotide which has been mass-modified; wherein each
set of base-specifically terminated fragments has a sufficient mass
difference from the other sets of base-specifically terminated fragments
so as to be unique; and wherein the molecular weight values of the nested
fragments of two or more sets of unseparated base-specifically terminated
fragments are determined concurrently.
40. The method according to claim 39, wherein the reversible linkage is a
photocleavable bond.
41. The method according to claim 39, wherein the base-specifically
terminated fragments are cleaved from the solid support prior to mass
spectrometry.
42. The method according to claim 39, wherein the base-specifically
terminated fragments are cleaved from the solid support during mass
spectrometry.
Description
BACKGROUND OF THE INVENTION
Since the genetic information is represented by the sequence of the four
DNA building blocks deoxyadenosine-(dpA), deoxyguanosine-(dpG),
deoxycytidine-(dpC) and deoxythymidine-5'-phosphate (dpT), DNA sequencing
is one of the most fundamental technologies in molecular biology and the
life sciences in general. The ease and the rate by which DNA sequences can
be obtained greatly affects related technologies such as development and
production of new therapeutic agents and new and useful varieties of
plants and microorganisms via recombinant DNA technology. In particular,
unraveling the DNA sequence helps in understanding human pathological
conditions including genetic disorders, cancer and AIDS. In some cases,
very subtle differences such as a one nucleotide deletion, addition or
substitution can create serious, in some cases even fatal, consequences.
Recently, DNA sequencing has become the core technology of the Human
Genome Sequencing Project (e.g., J. E. Bishop and M. Waldholz, 1991,
Genome: The Story of the Most Astonishing Scientific Adventure of Our
Time--The Attempt to Map All the Genes in the Human Body, Simon &
Schuster, New York). Knowledge of the complete human genome DNA sequence
will certainly help to understand, to diagnose, to prevent and to treat
human diseases. To be able to tackle successfully the determination of the
approximately 3 billion base pairs of the human genome in a reasonable
time frame and in an economical way, rapid, reliable, sensitive and
inexpensive methods need to be developed, which also offer the possibility
of automation. The present invention provides such a technology.
Recent reviews of today's methods together with future directions and
trends are given by Barrell (The FASEB Journal 5, 40-45 (1991)), and
Trainor (Anal. Chem. 62, 418-26 (1990)).
Currently, DNA sequencing is performed by either the chemical degradation
method of Maxam and Gilbert (Methods in Enzymology 65, 499-560 (1980)) or
the enzymatic dideoxynucleotide termination method of Sanger et al. (Proc.
Natl. Acad. Sci. U.S.A. 74, 5463-67 (1977)). In the chemical method, base
specific modifications result in a base specific cleavage of the
radioactive or fluorescently labeled DNA fragment. With the four separate
base specific cleavage reactions, four sets of nested fragments are
produced which are separated according to length by polyacrylamide gel
electrophoresis (PAGE). After autoradiography, the sequence can be read
directly since each band (fragment) in the gel originates from a base
specific cleavage event. Thus, the fragment lengths in the four "ladders"
directly translate into a specific position in the DNA sequence.
In the enzymatic chain termination method, the four base specific sets of
DNA fragments are formed by starting with a primer/template system
elongating the primer into the unknown DNA sequence area and thereby
copying the template and synthesizing a complementary strand by DNA
polymerases, such as Klenow fragment of E. coli DNA polymerase I, a DNA
polymerase from Thermus aquaticus, Taq DNA polymerase, or a modified T7
DNA polymerase, Sequenase (Tabor et al., Proc. Natl. Acad. Sci. U.S.A. 84,
4767-4771 (1987)), in the presence of chain-terminating reagents. Here,
the chain-terminating event is achieved by incorporating into the four
separate reaction mixtures in addition to the four normal deoxynucleoside
triphosphates, dATP, dGTP, dTTP and dCTP, only one of the
chain-terminating dideoxynucleoside triphosphates, ddATP, ddGTP, ddTTP or
ddCTP, respectively, in a limiting small concentration. The four sets of
resulting fragments produce, after electrophoresis, four base specific
ladders from which the DNA sequence can be determined.
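The way four base-specific ladders translate into a sequence can be
illustrated with a minimal sketch. It assumes idealized data in which each
fragment length (counted in nucleotides added to the primer) appears in
exactly one of the four lanes; the function name read_ladders and the
example lengths are hypothetical and serve only as an illustration.

```python
def read_ladders(ladders):
    """Reconstruct the newly synthesized strand from four base-specific ladders.

    `ladders` maps each terminating base to the fragment lengths seen in
    that lane. A fragment of length p ending in base b means that position
    p of the synthesized strand is b.
    """
    position_to_base = {}
    for base, lengths in ladders.items():
        for length in lengths:
            if length in position_to_base:
                raise ValueError(f"position {length} appears in two lanes")
            position_to_base[length] = base
    # Shortest fragment first: this reads the strand from its 5' end.
    return "".join(position_to_base[p] for p in sorted(position_to_base))


# Hypothetical lane contents for a six-nucleotide extension product.
example = {"A": [2, 5], "C": [1, 6], "G": [3], "T": [4]}
sequence = read_ladders(example)
print(sequence)
```

Sorting by length is the step that electrophoresis performs physically;
the sequence read this way is that of the synthesized strand, which is
complementary to the template.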
A recent modification of the Sanger sequencing strategy involves the
degradation of phosphorothioate-containing DNA fragments obtained by using
alpha-thio dNTP instead of the normally used ddNTPs during the primer
extension reaction mediated by DNA polymerase (Labeit et al., DNA 5,
173-177 (1986); Amersham, PCT-Application GB86/00349; Eckstein et al.,
Nucleic Acids Res. 16, 9947 (1988)). Here, the four sets of base-specific
sequencing ladders are obtained by limited digestion with exonuclease III
or snake venom phosphodiesterase, subsequent separation on PAGE and
visualization by radioisotopic labeling of either the primer or one of the
dNTPs. In a further modification, the base-specific cleavage is achieved
by alkylating the sulphur atom in the modified phosphodiester bond
followed by a heat treatment (Max-Planck-Gesellschaft, DE 3930312 A1).
Both methods can be combined with the amplification of the DNA via the
Polymerase Chain Reaction (PCR).
On the upfront end, the DNA to be sequenced has to be fragmented into
sequencable pieces of currently not more than 500 to 1000 nucleotides.
Starting from a genome, this is a multi-step process involving cloning and
subcloning steps using different and appropriate cloning vectors such as
YAC, cosmids, plasmids and M13 vectors (Sambrook et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, 1989).
Finally, for Sanger sequencing, the fragments of about 500 to 1000 base
pairs are integrated into a specific restriction site of the replicative
form I (RF I) of a derivative of the M13 bacteriophage (Vieira and
Messing, Gene 19, 259 (1982)) and then the double-stranded form is
transformed to the single-stranded circular form to serve as a template
for the Sanger sequencing process having a binding site for a universal
primer obtained by chemical DNA synthesis (Sinha, Biernat, McManus and
Koster, Nucleic Acids Res. 12, 4539-57 (1984); U.S. Pat. No. 4,725,677)
upstream of the restriction site into which the unknown DNA fragment has
been inserted. Under specific conditions, unknown DNA sequences integrated
into supercoiled double-stranded plasmid DNA can be sequenced directly by
the Sanger method (Chen and Seeburg, DNA 4, 165-170 (1985) and Lim et
al., Gene Anal. Techn. 5, 32-39 (1988)), and, with the Polymerase Chain
Reaction (PCR) (PCR Protocols: A Guide to Methods and Applications, Innis
et al., editors, Academic Press, San Diego (1990)) cloning or subcloning
steps could be omitted by directly sequencing off chromosomal DNA by first
amplifying the DNA segment by PCR and then applying the Sanger sequencing
method (Innis et al., Proc. Natl. Acad. Sci. U.S.A. 85, 9436-9440 (1988)).
In this case, however, the DNA sequence in the region of interest must be
known at least well enough to bind a sequencing primer.
In order to be able to read the sequence from PAGE, detectable labels have
to be used in either the primer (very often at the 5'-end) or in one of
the deoxynucleoside triphosphates, dNTP. Using radioisotopes such as
³²P, ³³P, or ³⁵S is still the most frequently used
technique. After PAGE, the gels are exposed to X-ray films and silver
grain exposure is analyzed. The use of radioisotopic labeling creates
several problems. Most labels useful for autoradiographic detection of
sequencing fragments have relatively short half-lives which can limit the
useful time of the labels. The emission of high energy beta radiation,
particularly from ³²P, can lead to breakdown of the products via
radiolysis so that the sample should be used very quickly after labeling.
In addition, high energy radiation can also cause a deterioration of band
sharpness by scattering. Some of these problems can be reduced by using
the less energetic isotopes such as ³³P or ³⁵S (see, e.g.,
Ornstein et al., Biotechniques 3, 476 (1985)). Here, however, longer
exposure times have to be tolerated. Above all, the use of radioisotopes
poses significant health risks to the experimentalist and, in heavy
sequencing projects, decontamination and handling the radioactive waste
are other severe problems and burdens.
In response to the above mentioned problems related to the use of
radioactive labels, non-radioactive labeling techniques have been explored
and, in recent years, integrated into partly automated DNA sequencing
procedures. All these improvements utilize the Sanger sequencing strategy.
The fluorescent label can be tagged to the primer (Smith et al., Nature
321, 674-679 (1986) and EPO Patent No. 87300998.9; Du Pont De Nemours EPO
Application No. 0359225; Ansorge et al. J. Biochem. Biophys. Methods 13,
325-32 (1986)) or to the chain-terminating dideoxynucleoside triphosphates
(Prober et al. Science 238, 336-41 (1987); Applied Biosystems, PCT
Application WO 91/05060). Based on either labeling the primer or the
ddNTP, systems have been developed by Applied Biosystems (Smith et al,
Science 235, G89 (1987); U.S. Pat. Nos. 570,973 and 689,013), Du Pont De
Nemours (Prober et al., Science 238, 336-341 (1987); U.S. Pat. Nos.
881,372 and 57,566), Pharmacia-LKB (Ansorge et al. Nucleic Acids Res. 15,
4593-4602 (1987) and EMBL Patent Application DE P3724442 and P3805808.1)
and Hitachi (JP 1-90844 and DE 4011991 A1). A somewhat similar approach
was developed by Brumbaugh et al. (Proc. Natl. Acad. Sci. U.S.A. 85, 5610-14
(1988) and U.S. Pat. No. 4,729,947). An improved method for the Du Pont
system using two electrophoretic lanes with two different specific labels
per lane is described (PCT Application WO92/02635). A different approach
uses fluorescently labeled avidin and biotin labeled primers. Here, the
sequencing ladders ending with biotin are reacted during electrophoresis
with the labeled avidin which results in the detection of the individual
sequencing bands (Brumbaugh et al, U.S. Pat. No. 594,676).
More recently, even more sensitive non-radioactive labeling techniques for
DNA using chemiluminescence that can be triggered and amplified by enzymes
have been developed (Beck, O'Keefe, Coull and Koster, Nucleic Acids Res. 17,
5115-5123 (1989) and Beck and Koster, Anal. Chem. 62, 2258-2270 (1990)).
These labeling methods were combined with multiplex DNA sequencing (Church
et al. Science 240, 185-188 (1988)) to provide for a strategy aimed at high
throughput DNA sequencing (Koster et al., Nucleic Acids Res. Symposium
Ser. No. 24, 318-321 (1991), University of Utah, PCT Application No. WO
90/15883); this strategy still suffers from the disadvantage of being very
laborious and difficult to automate.
In an attempt to simplify DNA sequencing, solid supports have been
introduced. In most cases published so far, the template strand for
sequencing (with or without PCR amplification) is immobilized on a solid
support most frequently utilizing the strong biotin-avidin/streptavidin
interaction (Orion-Yhtyma Oy, U.S. Pat. No. 277,643; M. Uhlen et al.
Nucleic Acids Res. 16, 3025-38 (1988); Cemu Bioteknik, PCT Application No.
WO 89/09282 and Medical Research Council, GB, PCT Application No. WO
92/03575). The primer extension products synthesized on the immobilized
template strand are purified of enzymes, other sequencing reagents and
by-products by a washing step and then released under denaturing
conditions by breaking the hydrogen bonds between the Watson-Crick base
pairs and subjected to PAGE separation. In a different approach, the
primer extension products (not the template) from a DNA sequencing
reaction are bound to a solid support via biotin/avidin (Du Pont De
Nemours, PCT Application WO 91/11533). In contrast to the above mentioned
methods, here, the interaction between biotin and avidin is overcome by
employing denaturing conditions (formamide/EDTA) to release the primer
extension products of the sequencing reaction from the solid support for
PAGE separation. As solid supports, beads, (e.g., magnetic beads
(Dynabeads) and Sepharose beads), filters, capillaries, plastic dipsticks
(e.g., polystyrene strips) and microtiter wells are being proposed.
All methods discussed so far have one central step in common:
polyacrylamide gel electrophoresis (PAGE). In many instances, this
represents a major drawback and limitation for each of these methods.
Preparing a homogeneous gel by polymerization, loading of the samples, the
electrophoresis itself, detection of the sequence pattern (e.g., by
autoradiography), removing the gel and cleaning the glass plates to
prepare another gel are very laborious and time-consuming procedures.
Moreover, the whole process is error-prone, difficult to automate, and, in
order to improve reproducibility and reliability, highly trained and
skilled personnel are required. In the case of radioactive labeling,
autoradiography itself can consume from hours to days. In the case of
fluorescent labeling, at least the detection of the sequencing bands is
being performed automatically when using the laser-scanning devices
integrated into commercially available DNA sequencers. One problem related
to the fluorescent labeling is the influence of the four different
base-specific fluorescent tags on the mobility of the fragments during
electrophoresis and a possible overlap in the spectral bandwidth of the
four specific dyes reducing the discriminating power between neighboring
bands, hence, increasing the probability of sequence ambiguities.
Artifacts are also produced by base-specific interactions with the
polyacrylamide gel matrix (Frank and Koster, Nucleic Acids Res. 6, 2069
(1979)) and by the formation of secondary structures which result in "band
compressions" and hence do not allow one to read the sequence. This
problem has, in part, been overcome by using 7-deazadeoxyguanosine
triphosphates (Barr et al., Biotechniques 4, 428 (1986)). However, the
reasons for some artifacts and conspicuous bands are still under
investigation and need further improvement of the gel electrophoretic
procedure.
A recent innovation in electrophoresis is capillary zone electrophoresis
(CZE) (Jorgenson et al., J. Chromatography 352, 337 (1986); Gesteland et
al., Nucleic Acids Res. 18, 1415-1419 (1990)) which, compared to slab gel
electrophoresis (PAGE), significantly increases the resolution of the
separation, reduces the time for an electrophoretic run and allows the
analysis of very small samples. Here, however, other problems arise due to
the miniaturization of the whole system such as wall effects and the
necessity of highly sensitive on-line detection methods. Compared to PAGE,
another drawback is created by the fact that CZE is only a "one-lane"
process, whereas in PAGE samples in multiple lanes can be electrophoresed
simultaneously.
Due to the severe limitations and problems related to having PAGE as an
integral and central part in the standard DNA sequencing protocol, several
methods have been proposed to do DNA sequencing without an electrophoretic
step. One approach calls for hybridization or fragmentation sequencing
(Bains, Biotechnology 10, 757-58 (1992) and Mirzabekov et al., FEBS
Letters 256, 118-122 (1989)) utilizing the specific hybridization of known
short oligonucleotides (e.g., octadeoxynucleotides, which give 65,536
different sequences) to a complementary DNA sequence. Positive
hybridization reveals a short stretch of the unknown sequence. Repeating
this process by performing hybridizations with all possible
octadeoxynucleotides should theoretically determine the sequence. In a
completely different approach, rapid sequencing of DNA is done by
unilaterally degrading one single, immobilized DNA fragment by an
exonuclease in a moving flow stream and detecting the cleaved nucleotides
by their specific fluorescent tag via laser excitation (Jett et al., J.
Biomolecular Structure & Dynamics 7, 301-309, (1989); United States
Department of Energy, PCT Application No. WO 89/03432). In another system
proposed by Hyman (Anal. Biochem. 174, 423-436 (1988)), the pyrophosphate
generated when the correct nucleotide is attached to the growing chain on
a primer-template system is used to determine the DNA sequence. The
enzymes used and the DNA are held in place by solid phases (DEAE-Sepharose
and Sepharose) either by ionic interactions or by covalent attachment. In
a continuous flow-through system, the amount of pyrophosphate is
determined via bioluminescence (luciferase). A synthesis approach to DNA
sequencing is also used by Tsien et al. (PCT Application No. WO 91/06678).
Here, the incoming dNTP's are protected at the 3'-end by various blocking
groups such as acetyl or phosphate groups and are removed before the next
elongation step, which makes this process very slow compared to standard
sequencing methods. The template DNA is immobilized on a polymer support.
To detect incorporation, a fluorescent or radioactive label is
additionally incorporated into the modified dNTP's. The same patent
application also describes an apparatus designed to automate the process.
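The hybridization approach mentioned above can be illustrated with a
minimal sketch. It confirms the count of 65,536 possible
octadeoxynucleotides and then rebuilds a short sequence from the set of
probes that hybridize. For brevity it makes two simplifying assumptions
that are not part of the cited methods: probes are written in the same
sense as the strand being read (rather than as its complement), and every
overlap between consecutive probes is unique. All function names and the
test sequence are hypothetical.

```python
from itertools import product


def all_probes(k):
    """Every possible oligonucleotide of length k over the four bases."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def spectrum(target, k):
    """The set of length-k probes that would hybridize (idealized)."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def reconstruct(start, probes_hit, k):
    """Greedy extension by (k-1)-base overlaps.

    Stops as soon as the next probe is missing or ambiguous, so it only
    recovers sequences whose overlaps are all unique.
    """
    seq = start
    remaining = set(probes_hit) - {start}
    while remaining:
        suffix = seq[-(k - 1):]
        candidates = [p for p in remaining if p.startswith(suffix)]
        if len(candidates) != 1:
            break
        seq += candidates[0][-1]
        remaining.remove(candidates[0])
    return seq


probe_count = len(all_probes(8))   # 4**8 octamers
target = "ATGGCGTGCA"              # hypothetical test sequence
hits = spectrum(target, 4)         # tetramers keep the example small
rebuilt = reconstruct(target[:4], hits, 4)
print(probe_count, rebuilt)
```

Repeated subsequences make the overlaps ambiguous, which is one reason the
text says the method "should theoretically" determine the sequence.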
Mass spectrometry, in general, provides a means of "weighing" individual
molecules by ionizing the molecules in vacuo and making them "fly" by
volatilization. Under the influence of combinations of electric and
magnetic fields, the ions follow trajectories depending on their
individual mass (m) and charge (z). In the range of molecules with low
molecular weight, mass spectrometry has long been part of the routine
physical-organic repertoire for analysis and characterization of organic
molecules by the determination of the mass of the parent molecular ion. In
addition, by arranging collisions of this parent molecular ion with other
particles (e.g., argon atoms), the molecular ion is fragmented forming
secondary ions by the so-called collision induced dissociation (CID). The
fragmentation pattern/pathway very often allows the derivation of detailed
structural information. Many applications of mass spectrometric methods are
known in the art, particularly in the biosciences, and can be found
summarized in Methods in Enzymology, Vol. 193: "Mass Spectrometry" (J. A.
McCloskey, editor), 1990, Academic Press, New York.
Due to the apparent analytical advantages of mass spectrometry in providing
high detection sensitivity, accuracy of mass measurements, detailed
structural information by CID in conjunction with an MS/MS configuration
and speed, as well as on-line data transfer to a computer, there has been
considerable interest in the use of mass spectrometry for the structural
analysis of nucleic acids. Recent reviews summarizing this field include
K. H. Schram, "Mass Spectrometry of Nucleic Acid Components," Biomedical
Applications of Mass Spectrometry 34, 203-287 (1990); and P. F. Crain,
"Mass Spectrometric Techniques in Nucleic Acid Research," Mass
Spectrometry Reviews 9, 505-554 (1990). The biggest hurdle to applying
mass spectrometry to nucleic acids is the difficulty of volatilizing these
very polar biopolymers. Therefore, "sequencing" has been limited to low
molecular weight synthetic oligonucleotides by determining the mass of the
parent molecular ion and through this, confirming the already known
sequence, or alternatively, confirming the known sequence through the
generation of secondary ions (fragment ions) via CID in an MS/MS
configuration utilizing, in particular, for the ionization and
volatilization, the method of fast atom bombardment (FAB mass
spectrometry) or plasma desorption (PD mass spectrometry). As an example,
the application of FAB to the analysis of protected dimeric blocks for
chemical synthesis of oligodeoxynucleotides has been described (Koster et
al. Biomedical Environmental Mass Spectrometry 14, 111-116 (1987)).
Two more recent ionization/desorption techniques are electrospray/ionspray
(ES) and matrix-assisted laser desorption/ionization (MALDI). ES mass
spectrometry has been introduced by Fenn et al. (J. Phys. Chem. 88,
4451-59 (1984); PCT Application No. WO 90/14148) and current applications
are summarized in recent review articles (R. D. Smith et al., Anal. Chem.
62, 882-89 (1990) and B. Ardrey, Electrospray Mass Spectrometry,
Spectroscopy Europe, 4, 10-18 (1992)). The molecular weights of the
tetradecanucleotide d(CATGCCATGGCATG) (SEQ ID NO:1) (Covey et al. "The
Determination of Protein, Oligonucleotide and Peptide Molecular Weights by
Ionspray Mass Spectrometry," Rapid Communications in Mass Spectrometry, 2,
249-256 (1988)), of the 21-mer d(AAATTGTGCACATCCTGCAGC) (SEQ ID NO:2) and,
without giving details, that of a tRNA with 76 nucleotides (Methods in
Enzymology, 193, "Mass Spectrometry" (McCloskey, editor), p. 425, 1990,
Academic Press, New York) have been published. As a mass analyzer, a
quadrupole is most frequently used. The determination of molecular weights
in femtomole amounts of sample is very accurate due to the presence of
multiple ion peaks which all could be used for the mass calculation.
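The benefit of multiple ion peaks can be illustrated with a short sketch. It assumes, purely for illustration, that every peak arises from the same molecule having lost a different number of protons (negative-ion mode), so that m/z = M/n - 1.00728. The function names and the example mass of 4260 Da are hypothetical and are not taken from the cited references. Each pair of adjacent peaks gives an independent estimate of M, and the estimates are averaged.

```python
PROTON_MASS = 1.00728  # Da, approximate mass of a proton


def mz_for_charge(mass, n):
    """m/z of a molecule of neutral mass `mass` after losing n protons (assumed model)."""
    return mass / n - PROTON_MASS


def estimate_mass(peaks):
    """Average the neutral-mass estimates obtained from adjacent charge-state peaks.

    `peaks` holds m/z values of consecutive charge states of one molecule.
    """
    ordered = sorted(peaks, reverse=True)  # highest m/z = lowest charge
    estimates = []
    for hi, lo in zip(ordered, ordered[1:]):
        # hi carries charge n, lo carries charge n + 1
        n = round((lo + PROTON_MASS) / (hi - lo))
        estimates.append(n * (hi + PROTON_MASS))
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Hypothetical molecule of 4260 Da observed in charge states 3, 4 and 5
    observed = [mz_for_charge(4260.0, n) for n in (3, 4, 5)]
    print(estimate_mass(observed))
```

With real spectra each peak carries its own measurement error, so averaging over several charge states would be expected to tighten the estimate; for ions formed by protonation the sign of the proton term is reversed.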
MALDI mass spectrometry, in contrast, can be particularly attractive when a
time-of-flight (TOF) configuration is used as a mass analyzer.
MALDI-TOF mass spectrometry was introduced by Hillenkamp et al.
("Matrix Assisted UV-Laser Desorption/Ionization: A New Approach to Mass
Spectrometry of Large Biomolecules," Biological Mass Spectrometry
(Burlingame and McCloskey, editors), Elsevier Science Publishers,
Amsterdam, pp. 49-60, 1990). Since, in most cases, no multiple molecular
ion peaks are produced with this technique, the mass spectra, in
principle, look simpler compared to ES mass spectrometry. Although DNA
molecules up to a molecular weight of 410,000 daltons could be desorbed
and volatilized (Williams et al., "Volatilization of High Molecular Weight
DNA by Pulsed Laser Ablation of Frozen Aqueous Solutions," Science, 246,
1585-87 (1989)), this technique has so far only been used to determine the
molecular weights of relatively small oligonucleotides of known sequence,
e.g., oligothymidylic acids up to 18 nucleotides (Huth-Fehre et al.,
"Matrix-Assisted Laser Desorption Mass Spectrometry of
Oligodeoxythymidylic Acids," Rapid Communications in Mass Spectrometry, 6,
209-13 (1992)) and a double-stranded DNA of 28 base pairs (Williams et
al., "Time-of-Flight Mass Spectrometry of Nucleic Acids by Laser Ablation
and Ionization from a Frozen Aqueous Matrix," Rapid Communications in Mass
Spectrometry, 4, 348-351 (1990)). In one publication (Huth-Fehre et al.,
1992, supra), it was shown that a mixture of all the oligothymidylic acids
from n=12 to n=18 nucleotides could be resolved.
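To give a sense of the mass scale involved in resolving such a mixture, the following sketch computes approximate average molecular weights for oligothymidylic acids from n=12 to n=18. The residue mass (about 304.2 Da for C10H13N2O7P) and the terminal correction are standard approximate values, not figures from the cited publication, and the sketch assumes the free-acid form with 5'- and 3'-hydroxyl ends. All function names are hypothetical.

```python
DT_RESIDUE = 304.195  # Da, approximate average mass of one dT residue (C10H13N2O7P)
# Assumed termini: add one H2O for the chain ends, remove the 5'-phosphate (HPO3)
TERMINAL_CORRECTION = 18.015 - 79.979


def oligo_dt_mass(n):
    """Approximate average mass of d(T)n with 5'-OH and 3'-OH ends."""
    return n * DT_RESIDUE + TERMINAL_CORRECTION


def ladder(first=12, last=18):
    """Map chain length to approximate mass for a mixture of oligomers."""
    return {n: oligo_dt_mass(n) for n in range(first, last + 1)}


def min_resolving_power(masses):
    """Largest m / delta-m needed to separate neighbouring members of the ladder."""
    ordered = sorted(masses)
    return max(hi / (hi - lo) for lo, hi in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    masses = ladder()
    for n, m in masses.items():
        print(f"d(T){n}: {m:.1f} Da")
    print(f"required m/dm: {min_resolving_power(masses.values()):.1f}")
```

Neighbouring oligomers differ by one residue mass, so under these assumptions the demand on resolution grows only linearly with chain length.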
In U.S. Pat. No. 5,064,754, RNA transcripts extended by DNA, both of which
are complementary to the DNA to be sequenced, are prepared by incorporating
NTP's, dNTP's and, as terminating nucleotides, ddNTP's which are
substituted at the 5'-position of the sugar moiety with one or a
combination of the isotopes .sup.12 C, .sup.13 C, .sup.14 C, .sup.1 H,
.sup.2 H, .sup.3 H, .sup.16 O, .sup.17 O and .sup.18 O. The
polynucleotides obtained are degraded to 3'-nucleotides and cleaved at the
N-glycosidic linkage, the isotopically labeled 5'-functionality is removed
by periodate oxidation, and the resulting formaldehyde species is determined
by mass spectrometry. A specific combination of isotopes serves to
discriminate base-specifically between internal nucleotides originating
from the incorporation of NTP's and dNTP's and terminal nucleotides caused
by linking ddNTP's to the end of the polynucleotide chain. A series of
RNA/DNA fragments is produced, and in one embodiment, separated by
electrophoresis, and, with the aid of the so-called matrix method of
analysis, the sequence is deduced.
In Japanese Patent No. 59-131909, an instrument is described which detects
nucleic acid fragments separated either by electrophoresis, liquid
chromatography or high speed gel filtration. Mass spectrometric detection
is achieved by incorporating into the nucleic acids atoms which normally
do not occur in DNA such as S, Br, I or Ag, Au, Pt, Os, Hg. The method,
however, is not applied to sequencing of DNA using the Sanger method. In
particular, it does not propose a base-specific correlation of such
elements to an individual ddNTP.
PCT Application No. WO 89/12694 (Brennan et al., Proc. SPIE-Int. Soc. Opt.
Eng. 1206, (New Technol. Cytom. Mol. Biol.), pp. 60-77 (1990); and
Brennan, U.S. Pat. No. 5,003,059) employs the Sanger methodology for DNA
sequencing by using a combination of either the four stable isotopes
.sup.32 S, .sup.33 S, .sup.34 S, .sup.36 S or .sup.35 Cl, .sup.37 Cl,
.sup.79 Br, .sup.81 Br to specifically label the chain-terminating
ddNTP's. The sulfur isotopes can be located either in the base or at the
alpha-position of the triphosphate moiety whereas the halogen isotopes are
located either at the base or at the 3'-position of the sugar ring. The
sequencing reaction mixtures are separated by an electrophoretic technique
such as CZE and transferred to a combustion unit in which the sulfur isotopes
of the incorporated ddNTP's are oxidized at about 900.degree. C. in an
oxygen atmosphere. The SO.sub.2 generated with masses of 64, 65, 66 or 68
is determined on-line by mass spectrometry using, e.g., as mass analyzer,
a quadrupole with a single ion-multiplier to detect the ion current.
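The four SO.sub.2 masses follow directly from the sulfur mass numbers, as the sketch below shows. It assumes that only .sup.16 O contributes to the detected species and uses nominal (integer) masses. The assignment of each isotope to a particular base is hypothetical, since the passage above does not state one, and the function names are illustrative only.

```python
SULFUR_ISOTOPES = (32, 33, 34, 36)  # stable sulfur isotopes named in the text
OXYGEN_16 = 16                      # assumption: only 16O is present in the SO2


def so2_nominal_mass(sulfur_mass_number):
    """Nominal mass of SO2 formed from one sulfur isotope and two 16O atoms."""
    return sulfur_mass_number + 2 * OXYGEN_16


# Hypothetical isotope-to-terminator assignment, for illustration only
ISOTOPE_TO_BASE = {32: "A", 33: "C", 34: "G", 36: "T"}
MASS_TO_BASE = {so2_nominal_mass(s): base for s, base in ISOTOPE_TO_BASE.items()}


def call_bases(detected_masses):
    """Translate SO2 masses, in order of elution, into terminating bases."""
    return "".join(MASS_TO_BASE[m] for m in detected_masses)


if __name__ == "__main__":
    print([so2_nominal_mass(s) for s in SULFUR_ISOTOPES])
    print(call_bases([64, 68, 66, 65]))
```

In this scheme the mass spectrometer only identifies the terminator; the order of the bases still comes from the electrophoretic separation that precedes combustion.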
A similar approach is proposed in U.S. Pat. No. 5,002,868 (Jacobson et
al., Proc. SPIE-Int. Soc. Opt. Eng. 1435, (Opt. Methods Ultrasensitive
Detect. Anal. Tech. Appl.), 26-35 (1991)) using Sanger sequencing with
four ddNTP's specifically substituted at the alpha-position of the
triphosphate moiety with one of the four stable sulfur isotopes as
described above and subsequent separation of the four sets of nested
sequences by tube gel electrophoresis. The only difference is the use of
resonance ionization spectroscopy (RIS) in conjunction with a magnetic
sector mass analyzer as disclosed in U.S. Pat. No. 4,442,354 to detect the
sulfur isotopes corresponding to the specific nucleotide terminators,
thereby allowing the assignment of the DNA sequence.
EPO Patent Applications No. 0360676 A1 and 0360677 A1 also describe Sanger
sequencing using stable isotope substitutions in the ddNTP's such as D,
.sup.13 C, .sup.15 N, .sup.17 O, .sup.18 O, .sup.32 S, .sup.33 S, .sup.34
S, .sup.36 S, .sup.19 F, .sup.35 Cl, .sup.37 Cl, .sup.79 Br, .sup.81 Br
and .sup.127 I or functional groups such as CF.sub.3 or Si(CH.sub.3).sub.3
at the base, the sugar or the alpha position of the triphosphate moiety
according to chemical functionality. The Sanger sequencing reaction
mixtures are separated by tube gel electrophoresis. The effluent is
converted into an aerosol by the electrospray/thermospray nebulizer method
and then atomized and ionized by a hot plasma (7000.degree. to
8000.degree. K.) and analyzed by a simple mass analyzer. An instrument is
proposed which enables one to automate the analysis of the Sanger
sequencing reaction mixture consisting of tube ele