Description
BACKGROUND OF THE INVENTION
This invention relates to the sequencing of DNA strands.
In one class of techniques for sequencing DNA, identical cloned strands of
DNA are marked. The strands are separated into four batches and either
individually cleaved at or synthesized to one of the four base types,
which are adenine, guanine, cytosine and thymine (hereinafter A, G, C and
T). The adenine-, guanine-, cytosine- and thymine- cleaved batches are
then electrophoresed for separation. The rate of electrophoresis indicates
the DNA sequence.
In a prior art sequencing technique of this class, the DNA strands are
marked with a radioactive marker and cleaved at a different base type in
each aliquot. After the strands are separated by electrophoresis, film is
exposed to the gel and developed to indicate the sequence of the bands. The range of
lengths and resolution of this type of static detection is limited by the
size of the apparatus.
In another prior art sequencing technique of this class, single strands are
synthesized to a different base type in each aliquot, and the strands are
marked radioactively for later detection.
It is also known in the prior art to use fluorescent markers for marking
proteins and to pulse the fluorescent markers with light to receive an
indication of the presence of a particular protein from the fluorescence.
The prior art techniques for DNA sequencing have several disadvantages such
as: (1) they are relatively slow; (2) they are at least partly manual; and
(3) they are limited to relatively short strands of DNA.
SUMMARY OF THE INVENTION
Accordingly, it is an object of the invention to provide a novel technique
for DNA sequencing.
It is a still further object of the invention to provide novel apparatus
and methods for sequencing relatively large chains of DNA.
It is a still further object of the invention to provide apparatus and
methods for sequencing cloned DNA fragments of 100 bases or more.
It is a still further object of the invention to provide a technique for
continuous sequencing of DNA.
It is a still further object of the invention to continuously sequence DNA
without the spatial limitations of range of lengths and resolution.
It is a still further object of the invention to provide a technique for
sequencing of DNA.
It is a still further object of the invention to provide a novel technique
for continuously sequencing DNA using fluorescent detection.
It is a still further object of the invention to provide a novel technique
for DNA sequencing using a fluorescent marker attached to the DNA, or the
inherent fluorescence of the DNA itself.
It is a still further object of the invention to provide a novel technique
for continuously sequencing DNA marked with fluorescence which more
clearly distinguishes marked DNA fragments from background fluorescent
noise.
It is a still further object of the invention to provide a novel technique
for continuously sequencing DNA using radioactive detection.
In accordance with the above and further objects of the invention, one
embodiment of apparatus for sequencing DNA includes at least four
electrophoresis channels each adapted to receive cloned DNA strands
labeled at one end with biotin and cleaved at the other end at a given
type of base. Each of the channels has a gel path and electrical field
across it identical in its characteristics to the gel path of the other
channels and electrical fields across the other channels.
To provide marking, means are provided for introducing biotin into the DNA
fragments prior to their being electrophoresed into the gel with the gel
and field being selected so that strands being electrophoresed towards the
terminal end of the gel channel are fully resolved prior to the resolution
of longer strands towards the beginning of the channel, and so on, in a
continuous process over a period of time.
At the terminal end of this separating gel, there is provided means for
applying avidin to the strands to further mark the strands individually
while maintaining the strands in each channel separate from the strands in
other channels. The avidin is pre-marked with multiple fluorescent
molecules and therefore provides multiple fluorescent markers for each
separated strand. The application of avidin to the strands may be during
further electrophoresing in a second section of the gel, in which
unattached avidin is stationary, but the fluorescein- avidin- biotin- DNA
complex continues to move.
In another embodiment, strands are synthesized with primers that are an
inverted complementary sequence. These primers are synthesized by a
commercially available DNA synthesizer such as that manufactured by
Applied Biosystems. After separation, the inverted complementary sequence
forms a hairpin in which ethidium bromide intercalates.
In another embodiment, after separation, ethidium bromide intercalates in
duplex DNA formed by palindromes of unprimed DNA or in the single stranded
DNA.
In another embodiment, the inherent fluorescence of DNA may be used as a
suitable detection mechanism. Thus, it is not necessary to mark one end of
the strands with biotin nor mark them with fluorescein nor attach primers
with inverted complementary sequences.
In another embodiment, radioactive markers attached directly to DNA may be
used as a suitable detection mechanism.
The gel electrophoresis may be provided in conventional gel slabs with
input sections for each of the four channels for A, G, T and C, in
addition to any timing channels that may be needed. As alternatives: (1) four
chromatography tubes may be used with gel in them so as to provide more
uniform temperature control and eliminate the need for timing channels;
(2) open capillary tubes may be used, avoiding the need for gel and
making cleaning more convenient; or (3) high performance liquid
chromatography (HPLC) columns such as ion-exchange columns or reverse
phase columns may be used in conjunction with high pressure instead of
high voltage for separating the strands within each channel or batch. In
using HPLC, sequencing would be performed on smaller strands of DNA called
oligonucleotides with typical lengths of 10-50 bases, using one column for
each aliquot or at least four columns.
The detection of the strands is accomplished by moving the strands by bulk
flow after electrophoresis or HPLC separation while scanning them with a
source of light. Means are provided for detecting the bands individually
from each channel in accordance with their time of exit from the gel to
indicate the sequence of the A, G, C and T strands of different lengths.
Advantageously, an additional channel may be utilized as a calibration
channel through the electrophoresis of DNA strands of known, but different
lengths. These DNA strands are also marked and thereby indicate a time
base.
The scanning apparatus includes a light source, such as a laser or
mercury-arc lamp or other suitable source, which emits light in the
optimum absorption spectrum of the marker. The light may be split by the
use of fiber optics or other conventional optical components, so that
there is a source for each of the 4 sample channels as well as any
calibration channels.
The detector includes a filtering system for passing selectively the
optimum emission band of the fluorescent marker to a light sensor which is
preferably a photomultiplier. The photomultiplier or other light
controlled mechanism selectively detects the fluorescence using techniques
which enhance the signal/noise ratio. One technique is the use of laser
pulses of less than five nanoseconds duration, with detection
in a time window. The length of such window and its delay from the pulse
are optimized to discriminate against background fluorescence. Another
technique is to modulate the laser source with an electro-optic modulator,
with detection by a lock-in amplifier. There is a detector for each
channel, and the combination thereof indicates: (1) if the type of base
termination or nucleotide cleavage is A, G, C or T; and (2) the time of
emergence of each strand from each channel of the electrophoresis gel or
HPLC column to indicate the overall sequence of strands.
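The lock-in technique mentioned above can be illustrated with a minimal Python sketch of generic synchronous demodulation. It is not the circuit of this invention: the sinusoidal modulation, the sampling rate, the modulation frequency and the function name lock_in_amplitude are all assumptions chosen for illustration. The sketch multiplies the detector samples by in-phase and quadrature references at the modulation frequency and averages, which recovers the modulated component and rejects an unmodulated offset.

```python
import math


def lock_in_amplitude(samples, mod_freq_hz, sample_rate_hz):
    """Estimate the amplitude of the component of `samples` at mod_freq_hz.

    Generic digital lock-in: mix with cosine and sine references, then average.
    Works best when the record spans a whole number of modulation cycles.
    """
    n = len(samples)
    i_sum = 0.0
    q_sum = 0.0
    for k, sample in enumerate(samples):
        phase = 2.0 * math.pi * mod_freq_hz * k / sample_rate_hz
        i_sum += sample * math.cos(phase)  # in-phase product
        q_sum += sample * math.sin(phase)  # quadrature product
    return 2.0 * math.hypot(i_sum, q_sum) / n


# Hypothetical detector record: constant offset of 5.0 plus a small
# modulated signal of amplitude 0.2 at 1 kHz, sampled at 100 kHz.
SAMPLE_RATE_HZ = 100_000
MOD_FREQ_HZ = 1_000
record = [
    5.0 + 0.2 * math.sin(2.0 * math.pi * MOD_FREQ_HZ * k / SAMPLE_RATE_HZ)
    for k in range(10_000)
]
print(lock_in_amplitude(record, MOD_FREQ_HZ, SAMPLE_RATE_HZ))
```

In this example the constant offset, although much larger than the modulated signal, contributes essentially nothing to the recovered amplitude.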
To use the apparatus to sequence DNA strands, cloned strands are normally
formed of a length greater than 100 bases. In one embodiment, the strands
are marked by biotin at one end. The strands are divided into four
aliquots and the strands within each aliquot are cleaved at a different
base type. In another embodiment, strands are synthesized to selected base
types. These four batches are then electrophoresed through identical
channels to separate strands such that the shorter strands are resolved
towards the end of the gel prior to resolution of the longer strands,
which still are near the beginning of the gel. This occurs in a continuous
process so a substantial number of different length strands may be
resolved in a relatively short gel. This methodology takes advantage of
time-resolved bands, as opposed to the limitations of spatial-resolved
bands.
The gel size, electric field and DNA mobilities are such that the first
bands to be moved completely through the gel are fully resolved while the
last bands are yet unresolved in a continuous process such that at least
ten percent of the bands are resolved and electrophoresed through the gel
while the lesser mobile bands are yet unresolved near the entrance end of
the gel. These lesser mobile bands become resolved little by little over
time in a continuous fashion without interruption of the movement of these
bands through the gel.
In one embodiment, near the end of the gel, the biotin terminated
fragments are further combined with avidin. The avidin, being a relatively
large molecule, may have a plurality of fluorescent markers for each
avidin molecule to provide signal amplification. The combination of biotin
and avidin may take place either within a second section gel or in liquid
after the bands leave the gel.
To attach the avidin within the second section of the gel, the pH of this
section may be different from that of the first section. In such a
gradient gel the biotin-marked strands contact the avidin during
electrophoresis. Marked avidin is stationary at a gel pH that is dependent
on the number of fluorescein molecules attached to it, whereas DNA is
mobile at a gel pH above 4. The electrophoresis of the DNA is done in a
first section of the gel having a pH of approximately 7-8, while a band of
avidin is located in a second section having its pH in which the
fluorescein marked avidin is stationary. In the preferred embodiment,
three fluorescein markers are used for each molecule of avidin and the
fluoresceinated-avidin has a pI of approximately 8. The avidin should be
pure and not contain any DNA or else non-specific staining may occur. The
distance to the second section is sufficiently long so that the DNA
strands are resolved into bands before reaching the avidin.
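The relation between gel pH and the isoelectric point (pI) that keeps the marked avidin stationary can be sketched as follows. This is a simplified illustration only: it uses the general rule that a molecule carries net positive charge below its pI and net negative charge above it, and the function name migration_direction and the tolerance value are hypothetical.

```python
def migration_direction(gel_ph, pi, tolerance=0.1):
    """Return the expected direction of electrophoretic movement.

    Assumption: within `tolerance` pH units of the pI the molecule is treated
    as having no net charge and therefore as stationary.
    """
    if abs(gel_ph - pi) <= tolerance:
        return "stationary"
    if gel_ph > pi:
        # Net negative charge: attracted to the positive electrode.
        return "toward positive electrode"
    # Net positive charge: attracted to the negative electrode.
    return "toward negative electrode"


# Fluoresceinated avidin with a pI of approximately 8 in a pH 8 gel section.
print(migration_direction(gel_ph=8.0, pi=8.0))
# The same avidin in a section whose pH is slightly below its pI.
print(migration_direction(gel_ph=7.5, pi=8.0))
```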
The markers are detected by transmitting light in one embodiment to the
fluorescent-avidin-biotin-DNA complexes, in another embodiment to the
ethidium-bromide-DNA hairpin complex, and in another embodiment, to an
ethidium bromide unmarked DNA complex and in yet another embodiment, to
plain DNA, using wavelengths in a narrow wavelength bandwidth in the
optimum absorption spectrum of the markers on DNA and detecting emitted
fluorescent light either during a time period in which the markers'
fluorescence has not yet decayed to an insignificant amount but the
background fluorescence has or by modulating the light source and
detecting using lock-in techniques. The detection is made in a wavelength
band including at least as a principal portion of its energy, the high
emission spectrum of the fluorescent marker. For the gated window
technique, the light is transmitted from pulsed lasers in approximately
three nanosecond pulses. Readings are taken within a window period after
an initial delay; both period and delay are optimized for best results.
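The effect of the delay in the gated window technique can be illustrated with a minimal Python sketch. It assumes that both the marker fluorescence and the background decay as single exponentials after each pulse and that the background decays faster; the lifetimes, amplitudes and the function name gated_signal are hypothetical values chosen for illustration, not measurements.

```python
import math


def gated_signal(amplitude, lifetime_ns, delay_ns, width_ns):
    """Integrate amplitude * exp(-t / lifetime) over [delay, delay + width]."""
    return amplitude * lifetime_ns * (
        math.exp(-delay_ns / lifetime_ns)
        - math.exp(-(delay_ns + width_ns) / lifetime_ns)
    )


# Hypothetical values: a weak marker with a 4 ns lifetime and a background
# ten times brighter with a 1 ns lifetime, read in a 10 ns window.
MARKER = {"amplitude": 1.0, "lifetime_ns": 4.0}
BACKGROUND = {"amplitude": 10.0, "lifetime_ns": 1.0}

for delay_ns in (0.0, 2.0, 5.0):
    marker = gated_signal(delay_ns=delay_ns, width_ns=10.0, **MARKER)
    background = gated_signal(delay_ns=delay_ns, width_ns=10.0, **BACKGROUND)
    print(f"delay {delay_ns:4.1f} ns: marker/background = {marker / background:.2f}")
```

Under these assumptions a longer delay sacrifices some marker light but removes a much larger share of the background, which is why the delay and the window length are adjusted together.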
In another embodiment, radioactive marked strands, after being separated,
are combined with scintillation liquid whereby detection of the presence
of the strands is accomplished by an appropriate photodetector.
From the above summary, it can be understood that the sequencing technique
of this invention has several advantages, such as: (1) it takes advantage
of resolution over time, as opposed to space; (2) it is continuous; (3) it
is automatic; (4) it is capable of sequencing relatively long strands
including strands of more than 100 bases; and (5) it is relatively
economical and easy to use.
SUMMARY OF THE DRAWINGS
The above noted and other features of the invention will be better
understood from the following detailed description when considered with
reference to the accompanying drawings in which:
FIG. 1 is a block diagram of an embodiment of the invention;
FIG. 2 is a block diagram of another embodiment of the invention;
FIG. 3 is a simplified schematic of a portion of the embodiment of FIGS. 1
and 2;
FIG. 4 is an alternative embodiment of the portion of FIG. 3;
FIG. 5 is another alternative embodiment of the portion of FIG. 3;
FIG. 6 is a block diagram of a portion of the embodiments of FIGS. 1 and 2;
FIG. 7 is a logical circuit diagram of a portion of the block diagram of
FIG. 3; and
FIG. 8 is a schematic circuit diagram of a portion of the embodiments of
FIGS. 1 and 2.
DETAILED DESCRIPTION
In FIG. 1, there is shown a block diagram of a DNA sequencing system 10
having a biotin labeling system 11, a DNA cleavage system 12, a separating
system 14, a detection and processing system 16 and a source of standard
length DNA 18. Biotin labeling takes place before dividing the DNA cloned
strands into 4 aliquots.
The biotin from any suitable commercial source is added to the cloned
strands of more than 100 bases in a container as indicated at 11. The
biotin preparation must be sufficient to mark at least one end of a
substantial proportion of the DNA fragments with the biotin in a manner
known in the art.
Biotin is selected because of its affinity to avidin and because it is not
a large molecule; a large molecule, when added to the DNA fragments,
might substantially dominate the mobility of the DNA fragments during
electrophoresis. Being a small molecule, it does not prevent the
discrimination between different DNA fragments within the separating
system 14.
Although biotin has been selected as a marker which may be combined later
with a larger molecule such as avidin, other markers may be used. They
must have characteristics which enable them to be attached to a DNA
fragment and to have a strong affinity to a larger molecule which may be
marked with a fluorescein or other suitably detectable material. They must
also be of such a size and have such chemical characteristics to not
obscure the normal differences in the mobilities between the different
fragments due to cleavages at different ones of the adenine, guanine,
cytosine and thymine bases.
In addition, a radioactive marker such as radioactive phosphorus or
radioactive sulfur, radioactive carbon or tritium may be incorporated into
the DNA molecules such that after separation, strands are combined with
scintillation liquid.
The DNA cleavage system 12 communicates in four paths and the source of
standard length DNA 18 communicates in one path within the separating
system 14 to permit passage of DNA fragments and standard fragments
thereto in separate paths. The separating system 14, which sequences
strands by separation, communicates with the detection and processing
system 16 which analyzes the fragments by comparison with each other and
the standard from the source of standard length DNA 18 to derive
information about the DNA sequence of the original fragments.
The DNA cleavage system 12 includes four sources 20A, 20G, 20C, 20T of
fragments of the same cloned DNA strand. This DNA strand is normally
greater than 100 bases in length and is then further cleaved by chemical
treatment to provide different lengths of fragments in each of four
containers 20A, 20G, 20C and 20T.
In one embodiment, the container 20A contains fragments of DNA strands
randomly cleaved by a chemical treatment for A; the container 20G contains
fragments of DNA strands randomly cleaved by a chemical treatment for G;
container 20C contains fragments of DNA strands randomly cleaved by a
chemical treatment for C; and container 20T contains fragments of DNA
strands randomly cleaved by a chemical treatment for T. Thus, identical
fragments in each container have been cleaved at different bases of a
given base type by the appropriate chemical treatment.
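As an illustration of how the four containers partition the cleavage sites of one strand, the following minimal Python sketch lists, for each base type, the positions at which cleavage can occur. The strand used, the function name cleavage_positions and the 1-based numbering from the labeled end are assumptions for illustration; the chemistry itself is not modeled.

```python
def cleavage_positions(strand):
    """Map each base type to the 1-based positions where it occurs.

    Positions are counted from the labeled end of the strand, so each
    position corresponds to one possible labeled fragment in that container.
    """
    positions = {base: [] for base in "AGCT"}
    for index, base in enumerate(strand.upper(), start=1):
        if base not in positions:
            raise ValueError(f"unexpected base {base!r} at position {index}")
        positions[base].append(index)
    return positions


batches = cleavage_positions("GATTC")  # hypothetical five-base strand
for base in "AGCT":
    print(f"container 20{base}: cleavage positions {batches[base]}")
```

Every position of the strand appears in exactly one container, which is what allows the four sets of fragments, once ordered by length, to spell out the sequence.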
The fragments in the containers are respectively referred to as A-DNA
fragments, G-DNA fragments, C-DNA fragments and T-DNA fragments from the
containers 20A, 20G, 20C and 20T respectively. These fragments are flowed
from the containers 20A, 20G, 20C and 20T through corresponding ones of
the conduits 22A, 22G, 22C and 22T into contact with the separating system
14.
The source of standard length DNA 18 includes a source of reference DNA
fragments of known but different lengths which are flowed through a
conduit 22S to the separating system 14. These reference fragments have
known lengths and therefore their time of movement through the separating
system 14 forms a clock source or timing source as explained hereinafter.
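One way the reference fragments could serve as a time base is sketched below in Python. The sketch assumes, purely for illustration, that fragment length varies linearly with exit time between neighbouring standards; the function name estimate_length and the example times and lengths are hypothetical.

```python
from bisect import bisect_left


def estimate_length(exit_time, ref_times, ref_lengths):
    """Interpolate a fragment length from the exit times of known standards."""
    pairs = sorted(zip(ref_times, ref_lengths))
    times = [t for t, _ in pairs]
    if not times[0] <= exit_time <= times[-1]:
        raise ValueError("exit time lies outside the calibrated range")
    i = bisect_left(times, exit_time)
    if times[i] == exit_time:
        return float(pairs[i][1])
    (t0, n0), (t1, n1) = pairs[i - 1], pairs[i]
    # Linear interpolation between the two bracketing standards (assumption).
    return n0 + (n1 - n0) * (exit_time - t0) / (t1 - t0)


# Hypothetical standards: 50, 100 and 150 bases exiting at 10, 20 and 30 minutes.
print(estimate_length(25.0, ref_times=[10.0, 20.0, 30.0], ref_lengths=[50, 100, 150]))
```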
While in the preferred embodiment the cloned strands of more than 100 bases are
marked with biotin before being divided into four batches, they may be
marked instead after dividing into four batches but before the selected
chemical treatment.
The separating system 14 includes five electrophoresis channels 26S, 26A,
26G, 26C and 26T. The electrophoresis channels 26S, 26A, 26G, 26C and 26T
include in the preferred embodiment, gel electrophoresis apparatus with
each path length of gel being identical and having the same field applied
across it to move samples continuously through five channels. The gels and
fields are selected to provide a mobility to DNA strands that does not
differ from channel to channel by more than 5% in velocity. In addition,
the field may be varied over time to enhance the speed of larger molecules
after smaller molecules have been detected, as well as to adjust the
velocities in each channel based on feedback from the clock channel to
compensate for differences in each channel such that the mobilities in
each channel are within the accuracy required to maintain synchronism
among the channels.
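The 5% velocity tolerance and the feedback adjustment of the field can be illustrated with a minimal Python sketch. It assumes that band velocity is proportional to field strength, so that the field in a channel can be scaled by the ratio of the reference velocity to the measured velocity; the velocity values, the channel labels used as dictionary keys and the function name field_corrections are hypothetical.

```python
def field_corrections(velocities, reference="S", tolerance=0.05):
    """Compare each channel with the reference channel.

    Returns, per channel, whether its velocity is within `tolerance` of the
    reference and the factor by which its field would be scaled to match,
    assuming velocity is proportional to field strength.
    """
    v_ref = velocities[reference]
    corrections = {}
    for channel, velocity in velocities.items():
        deviation = abs(velocity - v_ref) / v_ref
        corrections[channel] = {
            "within_tolerance": deviation <= tolerance,
            "field_scale": v_ref / velocity,
        }
    return corrections


# Hypothetical measured velocities in arbitrary units.
measured = {"S": 1.00, "A": 1.04, "G": 0.90, "C": 1.00, "T": 0.97}
for channel, result in field_corrections(measured).items():
    print(channel, result)
```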
Preferably the gels are of the same materials, chemical derivatives and
lengths and the electric fields are within 5% of the intermediates of each
other in each channel. However, more than one reference channel can be
used such that a reference channel is adjacent to a sample channel in
order to minimize the requirements for uniformity of DNA movement in the
gel for all channels.
The electrophoresis channel 26S receives fragments of known length DNA
marked with biotin and moves them through the gel. Similarly, each of the
electrophoresis channels 26A, 26G, 26C and 26T receives biotin-labeled
fragments from the containers 20A, 20G, 20C and 20T and moves them in
sequence through the sample electrophoresis channels, with each being
moved in accordance with its mobility under a field identical to that of
the reference electrophoresis channel 26S.
To provide information concerning the DNA sequence, the detection and
processing system 16 includes five avidin sources 30S, 30A, 30G, 30C and
30T; five detection systems 32S, 32A, 32G, 32C and 32T and a correlation
system 34. Each of the avidin sources 30S, 30A, 30G, 30C and 30T is
connected to the detecting systems 32S, 32A, 32G, 32C and 32T. Each of the
outputs from corresponding ones of the electrophoresis channels 26S, 26A,
26G, 26C and 26T within the separating system 14 is connected to a
corresponding one of the detection systems 32S, 32A, 32G, 32C and 32T. In
the detection system, avidin with fluorescent markers attached and DNA
fragments are combined to provide avidin marked DNA fragments with
fluorescent markers attached to the avidin to a sample volume within the
detection system for the detection of bands indicating the presence or
absence of the fragments, which over time relates to their length.
The output from each of the detection systems 32S, 32A, 32G, 32C and 32T
is electrically connected through conductors to the correlation system 34
which may be a microprocessor system for correlating the information from
each of the detection systems to provide information concerning the DNA
sequence.
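The kind of correlation such a system might perform can be sketched in Python as follows. This is a simplified illustration, not the microprocessor program of the invention: it assumes that the four sample channels are already synchronized, that each band has been reduced to a single exit time, and that shorter fragments exit first. The function name call_sequence and the example times are hypothetical.

```python
def call_sequence(band_times):
    """Merge band exit times from the four channels into one base order.

    `band_times` maps a base letter to the exit times of the bands detected
    in that base's channel. Bands are read in order of increasing exit time.
    """
    events = [(time, base) for base, times in band_times.items() for time in times]
    events.sort()
    return "".join(base for _, base in events)


# Hypothetical exit times in minutes for each channel.
exit_times = {"A": [12.1], "G": [10.0], "C": [18.2], "T": [14.3, 16.0]}
print(call_sequence(exit_times))  # GATTC
```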
The avidin sources 30S, 30A, 30G, 30C and 30T each contain avidin purchased
from known suppliers, with each avidin molecule in the preferred
embodiment combined with three fluorescein molecules. The avidin sources
are arranged to contact the DNA fragments and may be in a section of gel
placed adjacent to the electrophoresis channel. In this case, this section
of the gel should have a pH of approximately 8 to avoid movement of the
three fluorescein-marked avidin by electrophoresis. When the biotinylated
DNA strands reach the section of gel that has a pH of 8, they will pick up
the fluoresceinated avidin which moves very slowly or is stationary in
this section of the gel.
To prepare the second section of gel with fluoresceinated avidin, the
fluoresceinated avidin may be electrophoresed from the exit end of the
channel inwardly. In this embodiment, it moves in this direction slowly
because its pI is slightly higher than the pH of the second section of
gel. Alternatively, it may be mixed in gel.
Because the fluorescein-avidin-biotin-DNA complex molecule is acidic in the
pH 8 gel, it will continue to move out of this section of the gel where it
is then passed to a sample volume within the detection system by an
eluant. The sequences of separation determined before the attachment of
avidin are maintained and not substantially altered. In the alternative,
the bands of DNA fragments may be electrophoresed into a more mobile
liquid containing fluorescein marked avidin for combination with the
avidin. The avidin binds selectively to the biotin attached to the ends of
the DNA fragments and unreacted fluoresceinated avidin is separated from
the fluorescein-avidin-biotin-DNA complex by standard techniques such as
chromatography.
The detection systems each include an optical system for detecting the
presence or absence of bands and converting the detection of them to
electrical signals which are applied electrically to the correlation
system 34 indicating the sequence of the fragments with respect to both
the standard fragments from the source of standard length DNA 18 and the
A, G, C and T fragments from the containers 20A, 20G, 20C and 20T
respectively.
In FIG. 2, there is shown a simplified block diagram of another embodiment
of DNA sequencing apparatus A10. This apparatus is similar to the DNA
sequencing apparatus 10 of FIG. 1 and the components are identified in a
similar manner with the reference numbers being prefixed by the letter A.
In this embodiment, instead of the containers for DNA and chemical
treatment for A, G, C and T of the embodiment of DNA sequencing system 10
shown at 20A, 20G, 20C and 20T in FIG. 1, the DNA sequencing apparatus A10
includes containers for treatment of the DNA in accordance with the method
of Sanger described by F. Sanger, S. Nicklen and A. R. Coulson, "DNA
Sequencing with Chain-Terminating Inhibitors," Proceedings of the National
Academy of Sciences, USA, Vol. 74, No. 12, 5463-5467, 1977, indicated in
the embodiment A10 of FIG. 2 at A20A, A20G, A20C and A20T shown as a group
generally at A12.
In this method, the strands are separated and used as templates to
synthesize DNA with synthesis terminating at given base types A, G, C or a
T in a random manner so as to obtain a plurality of different molecular
weight strands. The limited synthesis is obtained by using nucleotides
which will terminate synthesis and is performed in separate containers,
one of which has the special A nucleotide, another the special G
nucleotide, another the special C nucleotide and another the special T
nucleotide. These special nucleotides may be dideoxy nucleotides or marked
nucleotides, both of which would terminate synthesis. So, each of the four
batches will be terminated at a different one of the types of bases A, G,
C and T randomly.
In this embodiment, the fragments may be marked by biotin at one end in the
manner shown in FIG. 1. However, in the preferred embodiment of FIG. 2,
instead of labeling with biotin, the fragments are labeled by an inverted
complementary repeat of DNA as shown at A11 before being applied to the
channels indicated at A12 in FIG. 2. The design of inverted complementary
repeat takes advantage of the process of designing small DNA fragments
known as oligonucleotides. This process is widely described in the
literature as well as such patents as Phosphoramidite Components and
Processes (U.S. Pat. No. 4,415,732), the disclosure of which is
incorporated herein.
After the electrophoresis, the inverted complementary repeat forms a
hairpin from a palindrome of duplex DNA, which is then combined with
ethidium bromide and detected by the detection system using a wavelength
of light appropriate to the intercalated ethidium bromide rather than
wavelengths of light appropriate to the fluorescein marking. If one uses
highly sensitive detection techniques, the inverted repeat would not be
used and detection would be accomplished either by sensing ethidium
bromide that intercalated between portions of the unknown DNA that
happened to form duplex DNA, or by ethidium bromide that attached to
single stranded DNA, or by the inherent fluorescence of the DNA itself. If
one used radioactive markers, detection would be accomplished by sensing
light given off by the combination of the radioactive marker and
scintillation fluid.
In FIG. 3, there is shown a separating system which includes a slab of gel
27 as known in the art with five sample dispensing tubes indicated
generally at 29A terminating in aligned slots 51 in the gel 27 on one end,
with such slots in contact with a negative potential buffer well 29 having
a negative electrode 47A, and five exit tubes at the other end located at
31A terminating in apertures in the gel 27, as well as a positive
potential buffer well 31 having a positive electrode 53A. The material to
be electrophoresed is inserted into slots 51 through tubes 29A and due to
the field across the gel 27 moves from top to bottom in the gel and into
the appropriate corresponding exit tubes of the group 31A. The gel slab 27
has glass plates 27A and 27B on either side to confine the sample and gel.
Buffer fluid from the buffer well 31 is pumped at right angles to the gel
27 from a source at 57 by pumps connected to tubes 31A to pull fluid
therethrough. The buffer fluid picks up any DNA that is electrophoresed
into the exit holes 31A and makes its way to sensing equipment to be
described hereinafter or to provide communication with other gel slabs for
further electrophoresis of the DNA strands being electrophoresed from the
slab 27.
In FIG. 4, there is shown another embodiment B26A of gel electrophoresis
having a negative-potential buffer for the A channel indicated generally
at B29A, a gel electrophoresis channel for A terminated DNA indicated at
B27A and a positive potential buffer for the channel indicated at B31A.
This embodim