Description
This invention relates to DNA polymerases suitable for DNA sequencing.
DNA sequencing involves the generation of four populations of single
stranded DNA fragments having one defined terminus and one variable
terminus. The variable terminus always terminates at a specific given
nucleotide base (either guanine (G), adenine (A), thymine (T), or cytosine
(C)). The four different sets of fragments are each separated on the basis
of their length, on a high resolution polyacrylamide gel; each band on the
gel corresponds colinearly to a specific nucleotide in the DNA sequence,
thus identifying the positions in the sequence of the given nucleotide
base.
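The relationship between fragment length and base position described above can be illustrated with a short Python sketch. The function names and the ten-base example sequence are hypothetical and chosen only for illustration; the sketch idealizes gel mobility as exact fragment length.

```python
# Illustrative sketch only: build the four base-specific fragment sets
# and read the sequence back from the fragment lengths.

BASES = "GATC"


def fragment_lengths(sequence):
    """For each base, list the lengths of fragments ending at that base.

    A fragment of length n runs from the defined terminus (position 1)
    to position n, where position n carries the given base.
    """
    return {
        base: [i + 1 for i, b in enumerate(sequence) if b == base]
        for base in BASES
    }


def read_gel(lanes):
    """Rebuild the sequence by ordering all bands from shortest to longest."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


if __name__ == "__main__":
    example = "GATTACAGGC"  # hypothetical sequence, for illustration only
    lanes = fragment_lengths(example)
    for base in BASES:
        print(base, lanes[base])
    print(read_gel(lanes))
```

Each lane holds only the lengths at which its base occurs, so merging the four lanes by length recovers the order of bases along the strand.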
Generally there are two methods of DNA sequencing. One method (Maxam and
Gilbert sequencing) involves the chemical degradation of isolated DNA
fragments, each labeled with a single radiolabel at its defined terminus,
each reaction yielding a limited cleavage specifically at one or more of
the four bases (G, A, T or C). The other method (dideoxy sequencing)
involves the enzymatic synthesis of a DNA strand. Four separate syntheses
are run, each reaction being caused to terminate at a specific base (G, A,
T or C) via incorporation of the appropriate chain terminating
dideoxynucleotide. The latter method is preferred since the DNA fragments
are uniformly labelled (instead of end labelled) and thus the larger DNA
fragments contain increasingly more radioactivity. Further, .sup.35
S-labelled nucleotides can be used in place of .sup.32 P-labelled
nucleotides, resulting in sharper definition; and the reaction products
are simple to interpret since each lane corresponds only to either G, A, T
or C. The enzyme used for most dideoxy sequencing is the Escherichia coli
DNA-polymerase I large fragment ("Klenow"). Another polymerase used is AMV
reverse transcriptase.
SUMMARY OF THE INVENTION
In one aspect the invention features a method for determining the
nucleotide base sequence of a DNA molecule, comprising annealing the DNA
molecule with a primer molecule able to hybridize to the DNA molecule;
incubating separate portions of the annealed mixture in at least four
vessels with four different deoxynucleoside triphosphates, a processive
DNA polymerase having less than 500 units of exonuclease activity per mg
of polymerase, and a DNA synthesis terminating agent which terminates DNA
synthesis at a specific nucleotide base. The agent terminates at a
different specific nucleotide base in each of the four vessels. The DNA
products of the incubating reaction are separated according to their size
so that at least a part of the nucleotide base sequence of the DNA
molecule can be determined.
In preferred embodiments the polymerase remains bound to the DNA molecule
for at least 500 bases before dissociating, most preferably for at least
1,000 bases; the polymerase is substantially the same as one in cells
infected with a T7-type phage (i.e., phage in which the DNA polymerase
requires host thioredoxin as a subunit); for example, the T7-type phage is
T7, T3, .PHI.I, .PHI.II, H, W31, gh-1, Y, A1122, or Sp6; the polymerase is
non-discriminating for dideoxy nucleotide analogs; the polymerase is
modified to have less than 50 units of exonuclease activity per mg of
polymerase, more preferably less than 1 unit, even more preferably less
than 0.1 unit, and most preferably has no detectable exonuclease activity;
the polymerase is able to utilize primers of as short as 10 bases or
preferably as short as 4 bases; the primer comprises four to forty
nucleotide bases, and is single stranded DNA or RNA; the annealing step
comprises heating the DNA molecule and the primer to above 65.degree. C.,
preferably from 65.degree. C. to 100.degree. C., and allowing the heated
mixture to cool to below 65.degree. C., preferably to 10.degree. C. to
30.degree. C.; the incubating step comprises a pulse and a chase step,
wherein the pulse step comprises mixing the annealed mixture with all four
different deoxynucleoside triphosphates and a processive DNA polymerase,
wherein at least one of the deoxynucleoside triphosphates is labelled;
most preferably the pulse step is performed under conditions in which the
polymerase does not exhibit its processivity and is for 30 seconds to 20
minutes at 0.degree. C. to 20.degree. C. or where at least one of the
nucleotide triphosphates is limiting; and the chase step comprises adding
one of the chain terminating agents to four separate aliquots of the
mixture after the pulse step; preferably the chase step is for 1 to 60
minutes at 30.degree. C. to 50.degree. C.; the terminating agent is a
dideoxynucleotide, or a limiting level of one deoxynucleoside
triphosphate; one of the four deoxynucleotides is chosen from dITP or
deazaguanosine; and labelled primers are used so that no pulse step is
required, preferably the label is radioactive or fluorescent.
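The preferred annealing, pulse and chase ranges recited above can be collected in one place, for example to check a planned protocol against them. The following Python sketch is only an illustration; the key names, the `Range` class and the `out_of_range` helper are hypothetical and are not part of the described method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed interval [low, high]."""
    low: float
    high: float

    def contains(self, value):
        return self.low <= value <= self.high


# Preferred ranges as recited in the text (temperatures in degrees C,
# durations in minutes; 30 seconds is written as 0.5 minute).
PREFERRED = {
    "anneal_heat_c": Range(65, 100),
    "anneal_cool_c": Range(10, 30),
    "pulse_minutes": Range(0.5, 20),
    "pulse_temp_c": Range(0, 20),
    "chase_minutes": Range(1, 60),
    "chase_temp_c": Range(30, 50),
}


def out_of_range(protocol):
    """Return the names of settings that fall outside the preferred ranges."""
    return [
        name
        for name, value in protocol.items()
        if name in PREFERRED and not PREFERRED[name].contains(value)
    ]


if __name__ == "__main__":
    # Hypothetical protocol: the pulse is run too warm.
    print(out_of_range({"pulse_temp_c": 37, "chase_temp_c": 37}))
```

Settings not named in `PREFERRED` are ignored, so the check stays limited to the ranges the text actually states.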
In other aspects the invention features (a) a method for producing blunt
ended double-stranded DNA molecules from a linear DNA molecule having no
3' protruding termini, using a processive DNA polymerase free from
exonuclease activity; (b) a method of amplification of a DNA sequence
comprising annealing a first and second primer to opposite strands of a
double stranded DNA sequence and incubating the annealed mixture with a
processive DNA polymerase having less than 500 units of exonuclease
activity per mg of polymerase, preferably less than 1 unit, wherein the
first and second primers anneal to opposite strands of the DNA sequence;
in preferred embodiments the primers have their 3' ends directed toward
each other; and the method further comprises, after the incubation step,
denaturing the resulting DNA, annealing the first and second primers to
the resulting DNA and incubating the annealed mixture with the polymerase;
preferably the cycle of denaturing, annealing and incubating is repeated
from 10 to 40 times; (c) a method for in vitro mutagenesis of cloned DNA
fragments, comprising providing a cloned fragment and synthesizing a DNA
strand using a processive DNA polymerase having less than 1 unit of
exonuclease activity per mg of polymerase; (d) a method of producing
active T7-type DNA polymerase from cloned DNA fragments under the control
of non-leaky promoters (see below) in the same cell comprising inducing
expression of the genes only when the cells are in logarithmic growth
phase, or stationary phase, and isolating the polymerase from the cell;
preferably the cloned fragments are under the control of a promoter
requiring T7 RNA polymerase for expression; (e) a gene encoding a T7-type
DNA polymerase, the gene being genetically modified to reduce the activity
of naturally occurring exonuclease activity; (f) the product of the gene
encoding genetically modified polymerase; (g) a method of purifying T7 DNA
polymerase from cells comprising a vector from which the polymerase is
expressed, comprising the steps of lysing the cells, and passing the
polymerase over a sizing column, over a DE52 DEAE column, a
phosphocellulose column, and a hydroxyapatite column; preferably prior to
the passing step the method comprises precipitating the polymerase with
ammonium sulfate; the method further comprises the step of passing the
polymerase over a Sephadex DEAE50 column; and the sizing column is a DE52
DEAE column; (h) a method of inactivating exonuclease activity in a DNA
polymerase solution comprising incubating the solution in a vessel
containing oxygen, a reducing agent and a transition metal; (i) a kit for
DNA sequencing, comprising a processive DNA polymerase having less than
500 units of exonuclease activity per mg of polymerase, wherein the
polymerase is able to exhibit its processivity in a first environmental
condition, and unable to exhibit its processivity in a second
environmental condition, and a reagent necessary for the sequencing,
selected from a deoxynucleotide, a chain terminating agent, or an
oligonucleotide primer; preferably the deoxynucleotide is dITP; (j) a
method for labelling the 3' end of a DNA fragment comprising incubating
the DNA fragment with a processive DNA polymerase having less than 500
units of exonuclease activity per mg of polymerase, and a labelled
deoxynucleotide; (k) a method for in vitro mutagenesis of a cloned DNA
fragment comprising providing a primer and a template, the primer and the
template having a specific mismatched base, and extending the primer with
a processive DNA polymerase; and (l) a method for in vitro mutagenesis of
a cloned DNA fragment comprising providing the cloned fragment and
synthesizing a DNA strand using a processive DNA polymerase, having less
than 50 units of exonuclease activity, under conditions which cause
misincorporation of a nucleotide base.
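As a rough illustration of the amplification method of aspect (b), the following Python sketch estimates the fold increase in copies of the target sequence after a number of cycles of denaturing, annealing and incubating. It assumes, as an idealization not stated in the text, that each cycle multiplies the copy number by a fixed factor (a doubling when every template is copied); the `efficiency` parameter and the function name are hypothetical.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Idealized fold increase after the given number of cycles.

    efficiency is the assumed fraction of templates copied per cycle
    (1.0 means every template is copied, i.e. a doubling per cycle).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # The text prefers 10 to 40 cycles.
    for n in (10, 20, 40):
        print(n, fold_amplification(n))
```

Under these assumptions the lower end of the preferred range (10 cycles) already gives roughly a thousandfold increase; real yields would be lower.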
This invention provides a DNA polymerase which is processive,
non-discriminating, and can utilize short primers. Further, the polymerase
has no associated exonuclease activity. These are ideal properties for the
above described methods, and in particular for DNA sequencing reactions,
since the background level of radioactivity in the polyacrylamide gels is
negligible, there are few or no artifactual bands, and the bands are
sharp--making the DNA sequence easy to read. Further, such a polymerase
allows novel methods of sequencing long DNA fragments, as is described in
detail below.
Other features and advantages of the invention will be apparent from the
following description of the preferred embodiments thereof and from the
claims.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
The drawings will first briefly be described.
DRAWINGS
FIGS. 1-3 are diagrammatic representations of the vectors pTrx-2, mGP1-2,
and pGP5-5 respectively;
FIG. 4 is a graphical representation of the selective oxidation of T7 DNA
polymerase;
FIG. 5 is a graphical representation of the ability of modified T7
polymerase to synthesize DNA in the presence of etheno-dATP; and
FIG. 6 is a diagrammatic representation of the enzymatic amplification of
genomic DNA using modified T7 DNA polymerase.
FIGS. 7, 8 and 9 are the nucleotide sequences of pTrx-2, a part of pGP5-5
and mGP1-2 respectively.
DNA Polymerase
In general the DNA polymerase of this invention is processive, has no
associated exonuclease activity, does not discriminate against nucleotide
analog incorporation, and can utilize small oligonucleotides (such as
tetramers, hexamers and octamers) as specific primers. These properties
will now be discussed in detail.
Processivity
By processivity is meant that the DNA polymerase is able to continuously
incorporate many nucleotides using the same primer-template without
dissociating from the template. The degree of processivity varies with
different polymerases: some incorporate only a few bases before
dissociating (e.g. Klenow, T4 DNA polymerase, and reverse transcriptase)
while others, such as those of the present invention, will remain bound
for at least 500 bases and preferably at least 1,000 bases under suitable
environmental conditions. Such environmental conditions include having
adequate supplies of all four deoxynucleoside triphosphates and an
incubation temperature from 10.degree. C.-50.degree. C. Processivity is
greatly enhanced in the presence of E. coli single stranded binding (ssb)
protein.
With processive enzymes termination of a sequencing reaction will occur
only at those bases which have incorporated a chain terminating agent,
such as a dideoxynucleotide. If the DNA polymerase is non-processive, then
artifactual bands will arise during sequencing reactions, at positions
corresponding to the nucleotide where the polymerase dissociated. Frequent
dissociation creates a background of bands at incorrect positions and
obscures the true DNA sequence. This problem is partially corrected by
incubating the reaction mixture for a long time (30-60 min) with a high
concentration of substrates, which "chase" the artifactual bands up to a
high molecular weight at the top of the gel, away from the region where
the DNA sequence is read. This is not an ideal solution since a
non-processive DNA polymerase has a high probability of dissociating from
the template at regions of compact secondary structure, or hairpins.
Reinitiation of primer elongation at these sites is inefficient and the
usual result is the formation of bands at the same position for all four
nucleotides, thus obscuring the DNA sequence.
Analog discrimination
The DNA polymerases of this invention do not discriminate significantly
between dideoxy-nucleotide analogs and normal nucleotides. That is, the
chance of incorporation of an analog is approximately the same as that of
a normal nucleotide. The polymerases of this invention also do not
discriminate significantly against some other analogs. This is important
since, in addition to the four normal deoxynucleoside triphosphates (dGTP,
dATP, dTTP and dCTP), sequencing reactions require the incorporation of
other types of nucleotide derivatives such as: radioactively- or
fluorescently-labelled nucleoside triphosphates, usually for labeling the
synthesized strands with .sup.35 S, .sup.32 P, or other chemical agents.
When a DNA polymerase does not discriminate against analogs the same
probability will exist for the incorporation of an analog as for a normal
nucleotide. For labelled nucleoside triphosphates this is important in
order to efficiently label the synthesized DNA strands using a minimum of
radioactivity. Further, lower levels of analogs are required with such
enzymes, making the sequencing reaction cheaper than with a discriminating
enzyme.
Discriminating polymerases show a different extent of discrimination when
they are polymerizing in a processive mode versus when stalled, struggling
to synthesize through a secondary structure impediment. At such
impediments there will be a variability in the intensity of different
radioactive bands on the gel, which may obscure the sequence.
Exonuclease Activity
The DNA polymerase of the invention has less than 50%, preferably less than
1%, and most preferably less than 0.1%, of the normal or naturally
associated level of exonuclease activity (amount of activity per
polymerase molecule). By normal or naturally associated level is meant the
exonuclease activity of unmodified T7-type polymerase. Normally the
associated activity is about 5,000 units of exonuclease activity per mg of
polymerase, measured as described below by a modification of the procedure
of Chase et al. (249 J. Biol. Chem. 4545, 1974). Exonucleases increase the
fidelity of DNA synthesis by excising any newly synthesized bases which
are incorrectly basepaired to the template. Such associated exonuclease
activities are detrimental to the quality of DNA sequencing reactions.
They raise the minimal required concentration of nucleotide precursors
which must be added to the reaction since, when the nucleotide
concentration falls, the polymerase activity slows to a rate comparable
with the exonuclease activity, resulting in no net DNA synthesis, or even
degradation of the synthesized DNA.
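The relative limits above can be converted into units per mg using the stated normal level of about 5,000 units of exonuclease activity per mg of polymerase. The Python sketch below does only this arithmetic; the function name is hypothetical, and the 5,000 figure is the approximate value given in the text.

```python
NORMAL_UNITS_PER_MG = 5000.0  # approximate level for unmodified T7-type polymerase


def units_per_mg(percent_of_normal, normal=NORMAL_UNITS_PER_MG):
    """Convert a percentage of the normal exonuclease level to units per mg."""
    if percent_of_normal < 0:
        raise ValueError("percentage must be non-negative")
    return normal * percent_of_normal / 100.0


if __name__ == "__main__":
    for pct in (50, 1, 0.1):
        print(pct, "% of normal is about", units_per_mg(pct), "units per mg")
```

On these figures the 50%, 1% and 0.1% limits correspond to roughly 2,500, 50 and 5 units per mg respectively.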
More importantly, associated exonuclease activity will cause a DNA
polymerase to idle at regions in the template with secondary structure
impediments. When a polymerase approaches such a structure its rate of
synthesis decreases as it struggles to pass. An associated exonuclease
will excise the newly synthesized DNA when the polymerase stalls. As a
consequence numerous cycles of synthesis and excision will occur. This may
result in the polymerase eventually synthesizing past the hairpin (with no
detriment to the quality of the sequencing reaction); or the polymerase
may dissociate from the synthesized strand (resulting in an artifactual
band at the same position in all four sequencing reactions); or, a chain
terminating agent may be incorporated at a high frequency and produce a
wide variability in the intensity of different fragments in a sequencing
gel. This happens because the frequency of incorporation of a chain
terminating agent at any given site increases with the number of
opportunities the polymerase has to incorporate the chain terminating
nucleotide, and so the DNA polymerase will incorporate a chain-terminating
agent at a much higher frequency at sites of idling than at other sites.
An ideal sequencing reaction will produce bands of uniform intensity
throughout the gel. This is essential for obtaining the optimal exposure
of the X-ray film for every radioactive fragment. If there is variable
intensity of radioactive bands, then fainter bands have a chance of going
undetected. To obtain uniform radioactive intensity of all fragments, the
DNA polymerase should spend the same interval of time at each position on
the DNA, showing no preference for either the addition or removal of
nucleotides at any given site. This occurs if the DNA polymerase lacks any
associated exonuclease, so that it will have only one opportunity to
incorporate a chain terminating nucleotide at each position along the
template.
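The effect of idling on band intensity can be illustrated with a simple probability model. The Python sketch below assumes, purely for illustration, that each opportunity to add a nucleotide at a given site carries the same independent probability of incorporating a chain terminating agent; the function name and the numerical values are hypothetical.

```python
def termination_probability(p_per_opportunity, opportunities):
    """Probability that a chain terminates at a site.

    Assumes each of the given number of opportunities independently
    incorporates a terminator with probability p_per_opportunity.
    """
    if not 0.0 <= p_per_opportunity <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if opportunities < 0:
        raise ValueError("opportunities must be non-negative")
    return 1.0 - (1.0 - p_per_opportunity) ** opportunities


if __name__ == "__main__":
    p = 0.01  # hypothetical chance of termination per opportunity
    print("single pass:", termination_probability(p, 1))
    print("20 cycles of idling:", termination_probability(p, 20))
```

Under this model a polymerase lacking exonuclease has one opportunity per site, so every site terminates with the same probability, whereas a site revisited many times during idling terminates far more often and gives a disproportionately intense band.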
Short primers
The DNA polymerase of the invention is able to utilize primers of 10 bases
or less, as well as longer ones, most preferably of 4-20 bases. The
ability to utilize short primers offers a number of important advantages
to DNA sequencing. The shorter primers are cheaper to buy and easier to
synthesize than the usual 15-20-mer primers. They also anneal faster to
complementary sites on a DNA template, thus making the sequencing reaction
faster. Further, the ability to utilize small (e.g., six or seven base)
oligonucleotide primers for DNA sequencing permits strategies not
otherwise possible for sequencing long DNA fragments. For example, a kit
containing 80 random hexamers could be generated, none of which are
complementary to any sites in the cloning vector. Statistically, one of
the 80 hexamer sequences will occur an average of every 50 bases along the
DNA fragment to be sequenced. The determination of a sequence of 3000
bases would require only five sequencing cycles. First, a "universal"
primer (e.g., Biolabs #1211, sequence 5' GTAAAACGACGGCCAGT 3') would be
used to sequence about 600 bases at one end of the insert. Using the
results from this sequencing reaction, a new primer would be picked from
the kit homologous to a region near the end of the determined sequence. In
the second cycle, the sequence of the next 600 bases would be determined
using this primer. Repetition of this process five times would determine
the complete sequence of the 3000 bases, without necessitating any
subcloning, and without the chemical synthesis of any new oligonucleotide
primers. The use of such short primers is enhanced by including gene 2.5
and 4 protein of T7 in the sequencing reaction.
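The figures in the hexamer example above follow from simple arithmetic, sketched below in Python. The sketch assumes a random sequence in which all four bases are equally likely at every position, and it considers sites on one strand only; the function names are hypothetical.

```python
import math


def mean_spacing(kit_size, primer_length):
    """Average distance in bases between sites matching any primer in the kit.

    There are 4**primer_length possible primers of the given length, so a
    given position matches one of kit_size distinct primers with
    probability kit_size / 4**primer_length.
    """
    return 4 ** primer_length / kit_size


def cycles_needed(total_bases, bases_per_cycle):
    """Number of sequencing cycles needed to cover the whole fragment."""
    return math.ceil(total_bases / bases_per_cycle)


if __name__ == "__main__":
    print(mean_spacing(80, 6))       # 4096 possible hexamers / 80 in the kit
    print(cycles_needed(3000, 600))  # about 600 bases read per cycle
```

With 4,096 possible hexamers, a kit of 80 gives a matching site about every 51 bases on average, consistent with the figure of 50 quoted in the text, and reading about 600 bases per cycle covers 3,000 bases in five cycles.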
DNA polymerases of this invention (i.e., having the above properties)
include modified T7-type polymerases. That is, the DNA polymerases require
host thioredoxin as a subunit, and they are substantially identical to a
modified T7 DNA polymerase or to equivalent enzymes isolated from related
phage, such as T3, .PHI.I, .PHI.II, H, W31, gh-1, Y, A1122 and Sp6. Each
of these enzymes can be modified to have properties similar to those of
the modified T7 enzyme. It is possible to isolate the enzyme from phage
infected cells directly, but preferably the enzyme is isolated from cells
which overproduce it. By substantially identical is meant that the enzyme
may have amino acid substitutions which do not affect the overall
properties of the enzyme. One example of a particularly desirable amino
acid substitution is one in which the natural enzyme is modified to remove
any exonuclease activity. This modification may be performed at the
genetic or chemical level (see below).
Cloning T7 polymerase
As an example of the invention we shall describe the cloning,
overproduction, purification, modification and use of T7 DNA polymerase.
This enzyme consists of two polypeptides tightly complexed in a one to one
stoichiometry. One is the phage T7-encoded gene 5 protein of 84,000
daltons (Modrich et al. 250 J. Biol. Chem. 5515, 1975), the other is the
E. coli encoded thioredoxin, of 12,000 daltons (Tabor et al., 82 Proc.
Natl. Acad. Sci. 1074, 1985). The thioredoxin is an accessory protein and
attaches the gene 5 protein (the actual DNA polymerase) to the primer
template. The natural DNA polymerase has a very active 3' to 5' exonuclease
associated with it. This activity makes the polymerase useless for DNA
sequencing and must be inactivated or modified before the polymerase can
be used. This is readily performed, as described below, either chemically,
by local oxidation of the exonuclease domain, or genetically, by modifying
the coding region of the polymerase gene encoding this activity.
pTrx-2
In order to clone the trxA (thioredoxin) gene of E. coli, wild type E. coli
DNA was partially cleaved with Sau3A and the fragments ligated to
BamHI-cleaved T7 DNA isolated from strain T7 ST9 (Tabor et al., in
Thioredoxin and Glutaredoxin Systems: Structure and Function (Holmgren et
al., eds) pp. 285-300, Raven Press, NY; and Tabor et al., supra). The
ligated DNA was transfected into E. coli trxA.sup.- cells, the mixture
plated onto trxA.sup.- cells, and the resulting T7 plaques picked. Since
T7 cannot grow without an active E. coli trxA gene only those phages
containing the trxA gene could form plaques. The cloned trxA genes were
located on a 470 base pair HincII fragment.
In order to overproduce thioredoxin a plasmid, pTrx-2, was constructed.
Briefly, the 470 base pair HincII fragment containing the trxA gene was
isolated by standard procedure (Maniatis et al., Cloning: A Laboratory
Manual, Cold Spring Harbor Labs., Cold Spring Harbor, N.Y.), and ligated
to a derivative of pBR322 containing a Ptac promoter (ptac-12, Amann et
al., 25 Gene 167, 1983). Referring to FIG. 1, ptac-12, containing
.beta.-lactamase and ColE1 origin, was cut with PvuII, to yield a
fragment of 2290 bp, which was then ligated to two tandem copies of trxA
(HincII fragment) using commercially available linkers (SmaI-BamHI
Polylinker), to form pTrx-2. The complete nucleotide sequence of pTrx-2 is
shown in FIG. 7. Thioredoxin production is now under the control of the
tac promoter, and thus can be specifically induced, e.g. by IPTG
(isopropyl .beta.-D-thiogalactoside).
pGP5-5 and mGP1-2
Some gene products of T7 are lethal when expressed in E. coli. An
expression system was developed to facilitate cloning and expression of
lethal genes, based on the inducible expression of T7 RNA polymerase. Gene
5 protein is lethal in some E. coli strains and an example of such a
system is described by Tabor et al. 82 Proc. Natl. Acad. Sci. 1074 (1985)
where T7 gene 5 was placed under the control of the .PHI.10 promoter, and
is only expressed when T7 RNA polymerase is present in the cell.
Briefly, pGP5-5 (FIG. 3) was constructed by standard procedures using
synthetic BamHI linkers to join T7 fragment from 14306 (NdeI) to 16869
(AhaIII), containing gene 5, to the 560 bp fragment of T7 from 5667
(HincII) to 6166 (Fnu4HI) containing both the .PHI.1.1A and .PHI.1.1B
promoters, which are recognized by T7 RNA polymerase, and the 3 kb
BamHI-HincII fragment of pACYC177 (Chang et al., 134 J. Bacteriol. 1141,
1978). The nucleotide sequence of the T7 inserts and linkers is shown in
FIG. 8. In this plasmid gene 5 is only expressed when T7 RNA polymerase is
provided in the cell.
Referring to FIG. 2, T7 RNA polymerase is provided on phage vector mGP1-2.
This is similar to pGP1-2 (Tabor et al., id.) except that the fragment of
T7 from 3133 (HaeIII) to 5840 (HinfI), containing T7 RNA polymerase was
ligated, using linkers (BglII and SalI respectively), to BamHI-SalI cut
M13 mp8, placing the polymerase gene under control of the lac promoter.
The complete nucleotide sequence of mGP1-2 is shown in FIG. 9.
Since pGP5-5 and pTrx-2 have different origins of replication (respectively
a P15A and a ColE1 origin) they can be transformed into one cell
simultaneously. pTrx-2 expresses large quantities of thioredoxin in the
presence of IPTG. mGP1-2 can coexist in the same cell as these two
plasmids and be used to regulate expression of T7-DNA polymerase from
pGP5-5, simply by causing production of T7-RNA polymerase by inducing the
lac promoter with, e.g., IPTG.
Overproduction of T7 DNA polymerase
There are several potential strategies for overproducing and reconstituting
the two gene products of trxA and gene 5. The same cell strains and
plasmids can be utilized for all the strategies. In the preferred strategy
the two genes are co-overexpressed in the same cell. (This is because gene
5 is susceptible to proteases until thioredoxin is bound to it.) As
described in detail below, one procedure is to place the two genes
separately on each of two compatible plasmids in the same cell.
Alternatively, the two genes could be placed in tandem on the same
plasmid. It is important that the T7-gene 5 is placed under the control of
a non-leaky inducible promoter, such as .PHI.1.1A, .PHI.1.1B and .PHI.10
of T7, as the synthesis of even small quantities of the two polypeptides
together is toxic in most E. coli cells. By non-leaky is meant that less
than 500 molecules of the gene product are produced, per cell generation
time, from the gene when the promoter, controlling the gene's expression,
is not activated. Preferably the T7 RNA polymerase expression system is
used although other expression systems which utilize inducible promoters
could also be used. A leaky promoter, e.g., plac, allows more than 500
molecules of protein to be synthesized, even when not induced, thus cells
containing lethal genes under the control of such a promoter grow poorly
and are not suitable in this invention. It is of course possible to
produce these products in cells where they are not lethal, for example,
the plac promoter is suitable in such cells.
In a second strategy each gene can be cloned and overexpressed separately.
Using this strategy, the cells containing the individually overproduced
polypeptides are combined prior to preparing the extracts, at which point
the two polypeptides form an active T7 DNA polymerase.
EXAMPLE 1
Production of T7 DNA polymerase
E. coli strain JM103 (Messing et al., 9 Nuc. Acid Res. 309, 1981) is used
for preparing stocks of mGP1-2. JM103 is stored in 50% glycerol at
-80.degree. C. and is streaked on a standard minimal media agar plate. A
single colony is grown overnight in 25 ml standard M9 media at 37.degree.
C., and a single plaque of mGP1-2 is obtained by titering the stock using
freshly prepared JM103 cells. The plaque is used to inoculate 10 ml
2.times. LB (2% Bacto-Tryptone, 1% yeast extract, 0.5% NaCl, 8 mM NaOH)
containing JM103 grown to an A.sub.590 =0.5. This culture will provide the
phage stock for preparing a large culture of mGP1-2. After 3-12 hours, the
10 ml culture is centrifuged, and the supernatant used to infect the large
(2 L) culture. For the large culture, 4.times.500 ml 2.times. LB is
inoculated with 4.times.5 ml 71.18 cells grown in M9, and is shaken at
37.degree. C. When the large culture of cells has grown to an A.sub.590
=1.0 (approximately three hours), they ar | | |