WikiPatents - Community Patent Review
DNA sequencing by stepwise ligation and cleavage    
United States Patent 5552278
Link to this page: http://www.wikipatents.com/5552278.html
Inventor(s): Brenner; Sydney (Cambridge, GB2)
Abstract: The invention provides a method of nucleic acid sequence analysis based on repeated cycles of ligation to and cleavage of probes at the terminus of a target polynucleotide. At each such cycle one or more terminal nucleotides are identified and one or more nucleotides are removed from the end of the target polynucleotide, such that further cycles of ligation and cleavage can take place. At each cycle the target sequence is shortened by one or more nucleotides until the nucleotide sequence of the target polynucleotide is determined. The method obviates electrophoretic separation of similarly sized DNA fragments and eliminates the difficulties associated with the detection and analysis of spatially overlapping bands of DNA fragments in a gel, or like medium. The invention further obviates the need to generate DNA fragments from long single stranded templates with a DNA polymerase.



Owner/Assignee     Spectragen, Inc. (Hayward, CA)
Publication Date     September 3, 1996
Application Number     08/280,441
Filing Date     July 25, 1994
US Classification     435/6 435/18 435/91.52 435/91.53 536/24.3
Int'l Classification     C07H 021/04 C12P 019/34 C12Q 001/68
Examiner     Jones; W. Gary
Assistant Examiner     Tran; Paul B.
Attorney/Law Firm     Macevicz; Stephen C.
Parent Case     This is a continuation-in-part of U.S. patent application Ser. No. 08/222,300 filed 4 Apr. 1994, now abandoned, which application is incorporated by reference.
USPTO Field of Search     435/6 435/18 435/91.52 435/91.53 536/24.3 935/77 935/78
Patent Tags     dna sequencing stepwise ligation cleavage
   

U.S. References

Patent No.    Inventor    U.S. Class    Date
5202231       Drmanac     435/6         Apr 1993
5118605       Urdea       435/6         Jun 1992
5114839       Blocker     435/6         May 1992
5102785       Livak       435/6         Apr 1992
5093245       Keith       435/91.2      Mar 1992
5002867       Macevicz    435/6         Mar 1991
4775619       Urdea       435/6         Oct 1988
4321365       Wu          536/24.2      Mar 1982
4293652       Cohen       435/91.1      Oct 1981
4237224       Cohen       435/69.1      Dec 1980
4683202       Mullis      435/91.2      Jul 1987
5242794       Whiteley    435/6         Sep 1993
Claims
 


I claim:

1. A method for determining the nucleotide sequence of a polynucleotide, the method comprising the steps of:

(a) providing the polynucleotide in double stranded form such that the polynucleotide has a protruding strand at at least one end;

(b) ligating a probe from a mixture of probes to an end of the polynucleotide having a protruding strand to form a ligated complex, each probe having an end with a complementary protruding strand to that of the polynucleotide and each probe having a nuclease recognition site of a nuclease whose cleavage site is separate from its recognition site;

(c) identifying one or more nucleotides in the protruding strand of the polynucleotide;

(d) cleaving the ligated complex with said nuclease that recognizes said nuclease recognition site and cuts the ligated complex to give an augmented probe and a new protruding strand on the polynucleotide; and

(e) repeating steps (a) through (d) until the nucleotide sequence of the polynucleotide is determined.

2. The method of claim 1 wherein the step of providing further includes blocking the recognition sites of said nuclease on said polynucleotide.

3. The method of claim 2 wherein said nuclease is a type IIs restriction endonuclease and wherein said blocking said recognition sites includes treating said polynucleotide with a methylase.

4. The method of claim 3 wherein said step of ligating includes treating said polynucleotide with a ligase.

5. The method of claim 4 wherein said polynucleotide contains a 5'-phosphoryl group on said end having said protruding strand and said probe lacks a 5'-phosphoryl group on said end having said complementary protruding strand.

6. The method of claim 5 wherein said step of ligation includes treating said polynucleotide in succession with a ligase, a kinase, and a ligase.

7. The method of claim 1 further including the step of removing unligated probe from said ligated complex after said step of ligating.

8. The method of claim 1 wherein said step of identifying includes identifying said one or more nucleotides in said protruding strand of said polynucleotide by the identity of said ligated probe.

9. The method of claim 1 wherein said protruding strand of said polynucleotide includes a 3' recessed strand and wherein step of identifying includes extending the 3' recessed strand by a nucleic acid polymerase in the presence of at least one chain-terminating nucleoside triphosphate so that at least one chain-terminating nucleoside triphosphate is incorporated into the 3' recessed strand.

10. The method of claim 9 wherein said chain-terminating nucleoside triphosphate is a labeled dideoxynucleoside triphosphate and wherein said step of identifying includes identifying said one or more nucleotides by the label on the labeled dideoxynucleoside triphosphate incorporated into said 3' recessed strand of said polynucleotide as a labeled dideoxynucleotide.

11. The method of claim 10 further including the steps of excising said labeled dideoxynucleotide and extending said recessed strand with a nucleic acid polymerase.

12. A method for determining the nucleotide sequence of a polynucleotide, the method comprising the steps of:

(a) ligating a probe from a mixture of probes to an end of a polynucleotide having a protruding strand to form a ligated complex, each probe having an end with a complementary protruding strand to that of the polynucleotide and each probe having a nuclease recognition site of a nuclease whose cleavage site is separate from its recognition site;

(b) removing unligated probe from the ligated complex;

(c) cleaving the ligated complex with said nuclease, said nuclease recognizing the recognition site and cleaving the ligated complex to give an augmented probe and a new protruding strand on the polynucleotide;

(d) identifying one or more nucleotides in the protruding strand of the polynucleotide; and

(e) repeating steps (a) through (d) until the nucleotide sequence of the polynucleotide is determined.

13. The method of claim 12 further including the step of providing said polynucleotide with said recognition sites of said nuclease blocked.

14. The method of claim 13 wherein said nuclease is a type IIs restriction endonuclease and wherein said blocking said recognition sites includes treating said polynucleotide with a methylase.

15. The method of claim 14 wherein said step of ligating includes treating said polynucleotide with a ligase.

16. The method of claim 15 wherein said polynucleotide contains a 5'-phosphoryl group on said end having said protruding strand and said probe lacks a 5'-phosphoryl group on said end having said complementary protruding strand.

17. The method of claim 16 wherein said step of ligation includes treating said polynucleotide in succession with a ligase, a kinase, and a ligase.

18. A method for determining the nucleotide sequence of a polynucleotide, the method comprising the steps of:

(a) ligating a probe from a mixture of probes to an end of a polynucleotide having a protruding strand to form a ligated complex, the probe having an end with a complementary protruding strand to that of the polynucleotide and the probe having a nuclease recognition site of a nuclease whose cleavage site is separate from its recognition site;

(b) identifying one or more nucleotides in the protruding strand of the polynucleotide;

(c) cleaving the ligated complex with said nuclease so that the polynucleotide is shortened by one or more nucleotides; and

(d) repeating steps (a) through (c) until the nucleotide sequence of said polynucleotide is determined.

19. The method of claim 18 further including the step of providing said polynucleotide with said recognition sites of said nuclease blocked.

20. The method of claim 19 wherein said nuclease is a type IIs restriction endonuclease and wherein said blocking said recognition sites includes treating said polynucleotide with a methylase.

21. The method of claim 20 wherein said step of ligating includes treating said polynucleotide with a ligase.

22. The method of claim 21 wherein said polynucleotide contains a 5'-phosphoryl group on said end having said protruding strand and said probe lacks a 5'-phosphoryl group on said end having said complementary protruding strand.

23. The method of claim 22 wherein said step of ligation includes treating said polynucleotide in succession with a ligase, a kinase, and a ligase.

24. The method of claim 23 wherein the identity of a single nucleotide is determined in said step of identifying and said polynucleotide is shortened by a single nucleotide in said step of cleaving.

25. The method of claim 18 further including the step of removing unligated probe from said ligated complex after said step of ligating.

26. The method of claim 18 wherein said step of identifying includes identifying said one or more nucleotides in said protruding strand of said polynucleotide by the identity of said ligated probe.

27. The method of claim 18 wherein said protruding strand of said polynucleotide includes a 3' recessed strand and wherein step of identifying includes extending the 3' recessed strand by a nucleic acid polymerase in the presence of at least one chain-terminating nucleoside triphosphate so that at least one chain-terminating nucleoside triphosphate is incorporated into the 3' recessed strand.

28. The method of claim 27 wherein said chain-terminating nucleoside triphosphate is a labeled dideoxynucleoside triphosphate and wherein said step of identifying includes identifying said one or more nucleotides by the label on the labeled dideoxynucleoside triphosphate incorporated into said 3' recessed strand of said polynucleotide.

29. A method for determining the identity of a terminal nucleotide of a polynucleotide, the method comprising the steps of:

(a) ligating a probe from a mixture to an end of a polynucleotide having a protruding strand to form a ligated complex, the probe having an end with a complementary protruding strand to that of the polynucleotide and the probe having a nuclease recognition site of a nuclease whose cleavage site is separate from its recognition site;

(b) removing unligated probe from the ligated complex; and

(c) identifying the terminal nucleotide of the polynucleotide by the identity of the ligated probe.

30. A method for determining the nucleotide sequence of a polynucleotide, the method comprising the steps of:

(a) providing the polynucleotide in double stranded form such that the polynucleotide has a protruding strand at at least one end;

(b) ligating a probe to an end of the polynucleotide having a protruding strand to form a ligated complex, the probe having a nuclease recognition site of a nuclease whose cleavage site is separate from its recognition site and an end with a complementary protruding strand to that of the polynucleotide;

(c) identifying one or more nucleotides in the protruding strand of the polynucleotide;

(d) cleaving the ligated complex with a nuclease that recognizes the recognition site and cuts the ligated complex to give an augmented probe and a new protruding strand on the polynucleotide; and

(e) repeating steps (a) through (d) until the nucleotide sequence of the polynucleotide is determined.

31. The method of claim 30 wherein the step of providing further includes blocking the recognition sites of said nuclease on said polynucleotide.

32. The method of claim 31 wherein said nuclease is a type IIs restriction endonuclease and wherein said blocking said recognition sites includes treating said polynucleotide with a methylase.

33. The method of claim 32 wherein said step of ligating includes treating said polynucleotide with a ligase.

34. The method of claim 33 wherein said polynucleotide contains a 5'-phosphoryl group on said end having said protruding strand and said probe lacks a 5'-phosphoryl group on said end having said complementary protruding strand.

35. The method of claim 34 wherein said step of ligation includes treating said polynucleotide in succession with a ligase, a kinase, and a ligase.

36. The method of claim 30 further including the step of removing unligated probe from said ligated complex after said step of ligating.

37. The method of claim 30 wherein said step of identifying includes identifying said one or more nucleotides in said protruding strand of said polynucleotide by the identity of said ligated probe.

38. The method of claim 30 wherein said protruding strand of said polynucleotide includes a 3' recessed strand and wherein step of identifying includes extending the 3' recessed strand by a nucleic acid polymerase in the presence of at least one chain-terminating nucleoside triphosphate so that at least one chain-terminating nucleoside triphosphate is incorporated into the 3' recessed strand.

39. The method of claim 38 wherein said chain-terminating nucleoside triphosphate is a labeled dideoxynucleoside triphosphate and wherein said step of identifying includes identifying said one or more nucleotides by the label on the labeled dideoxynucleoside triphosphate incorporated into said 3' recessed strand of said polynucleotide as a labeled dideoxynucleotide.

40. The method of claim 39 further including the steps of excising said labeled dideoxynucleotide and extending said recessed strand with a nucleic acid polymerase.

41. A method for determining the nucleotide sequence of a polynucleotide, the method comprising the steps of:

(a) ligating a probe to an end of a polynucleotide having a protruding strand to form a ligated complex, the probe having an end with a complementary protruding strand to that of the polynucleotide and the probe having a nuclease recognition site of a nuclease whose cleavage site is separate from its recognition site;

(b) removing unligated probe from the ligated complex;

(c) cleaving the ligated complex with a nuclease, the nuclease recognizing the recognition site and cleaving the ligated complex to give an augmented probe and a new protruding strand on the polynucleotide;

(d) identifying one or more nucleotides in the protruding strand of the polynucleotide; and

(e) repeating steps (a) through (d) until the nucleotide sequence of the polynucleotide is determined.

42. The method of claim 41 further including the step of providing said polynucleotide with said recognition sites of said nuclease blocked.

43. The method of claim 42 wherein said nuclease is a type IIs restriction endonuclease and wherein said blocking said recognition sites includes treating said polynucleotide with a methylase.

44. The method of claim 43 wherein said step of ligating includes treating said polynucleotide with a ligase.

45. The method of claim 44 wherein said polynucleotide contains a 5'-phosphoryl group on said end having said protruding strand and said probe lacks a 5'-phosphoryl group on said end having said complementary protruding strand.

46. The method of claim 45 wherein said step of ligation includes treating said polynucleotide in succession with a ligase, a kinase, and a ligase.

47. A method for determining the nucleotide sequence of a polynucleotide, the method comprising the steps of:

(a) ligating a probe to an end of a polynucleotide having a protruding strand to form a ligated complex, the probe having an end with a complementary protruding strand to that of the polynucleotide and the probe having a nuclease recognition site of a nuclease whose cleavage site is separate from its recognition site;

(b) identifying one or more nucleotides in the protruding strand of the polynucleotide;

(c) cleaving the ligated complex with said nuclease so that the polynucleotide is shortened by one or more nucleotides; and

(d) repeating steps (a) through (c) until the nucleotide sequence of the polynucleotide is determined.

48. The method of claim 47 further including the step of providing said polynucleotide with said recognition sites of said nuclease blocked.

49. The method of claim 48 wherein said nuclease is a type IIs restriction endonuclease and wherein said blocking said recognition sites includes treating said polynucleotide with a methylase.

50. The method of claim 49 wherein said step of ligating includes treating said polynucleotide with a ligase.

51. The method of claim 50 wherein said polynucleotide contains a 5'-phosphoryl group on said end having said protruding strand and said probe lacks a 5'-phosphoryl group on said end having said complementary protruding strand.

52. The method of claim 51 wherein said step of ligation includes treating said polynucleotide in succession with a ligase, a kinase, and a ligase.

53. The method of claim 52 wherein the identity of a single nucleotide is determined in said step of identifying and said polynucleotide is shortened by a single nucleotide in said step of cleaving.

54. The method of claim 53 further including the step of removing unligated probe from said ligated complex after said step of ligating.

55. The method of claim 47 wherein said step of identifying includes identifying said one or more nucleotides in said protruding strand of said polynucleotide by the identity of said ligated probe.

56. The method of claim 47 wherein said protruding strand of said polynucleotide includes a 3' recessed strand and wherein step of identifying includes extending the 3' recessed strand by a nucleic acid polymerase in the presence of at least one chain-terminating nucleoside triphosphate so that at least one chain-terminating nucleoside triphosphate is incorporated into the 3' recessed strand.

57. The method of claim 56 wherein said chain-terminating nucleoside triphosphate is a labeled dideoxynucleoside triphosphate and wherein said step of identifying includes identifying said one or more nucleotides by the label on the labeled dideoxynucleoside triphosphate incorporated into said 3' recessed strand of said polynucleotide.
Description
 


FIELD OF THE INVENTION

The invention relates generally to methods for determining the nucleotide sequence of a polynucleotide, and more particularly, to a method of step-wise removal and identification of terminal nucleotides of a polynucleotide.

BACKGROUND

Analysis of polynucleotides with currently available techniques provides a spectrum of information ranging from the confirmation that a test polynucleotide is the same or different than a standard sequence or an isolated fragment to the express identification and ordering of each nucleoside of the test polynucleotide. Not only are such techniques crucial for understanding the function and control of genes and for applying many of the basic techniques of molecular biology, but they have also become increasingly important as tools in genomic analysis and a great many non-research applications, such as genetic identification, forensic analysis, genetic counselling, medical diagnostics, and the like. In these latter applications both techniques providing partial sequence information, such as fingerprinting and sequence comparisons, and techniques providing full sequence determination have been employed, e.g. Gibbs et al, Proc. Natl. Acad. Sci., 86:1919-1923 (1989); Gyllensten et al, Proc. Natl. Acad. Sci, 85:7652-7656 (1988); Carrano et al, Genomics, 4:129-136 (1989); Caetano-Anolles et al, Mol. Gen. Genet., 235:157-165 (1992); Brenner and Livak, Proc. Natl. Acad. Sci., 86:8902-8906 (1989); Green et al, PCR Methods and Applications, 1:77-90 (1991); and Versalovic et al, Nucleic Acids Research, 19:6823-6831 (1991).

Native DNA consists of two linear polymers, or strands of nucleotides. Each strand is a chain of nucleosides linked by phosphodiester bonds. The two strands are held together in an antiparallel orientation by hydrogen bonds between complementary bases of the nucleotides of the two strands: deoxyadenosine (A) pairs with thymidine (T) and deoxyguanosine (G) pairs with deoxycytidine (C).
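
The pairing rule just described can be illustrated with a short Python sketch. It is not part of the patent; the function name `reverse_complement` and the single-letter strand representation are assumptions made only for illustration.

```python
# Watson-Crick pairing as stated in the text: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand: str) -> str:
    """Return the antiparallel partner of `strand`, written 5'->3'.

    Hypothetical helper: the strand is read backwards (antiparallel
    orientation) and each base is replaced by its complement.
    """
    return "".join(PAIR[base] for base in reversed(strand))

print(reverse_complement("GATTC"))  # GAATC
```

Applying the function twice returns the original strand, which reflects the symmetry of the pairing rule.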

Presently there are two basic approaches to DNA sequence determination: the dideoxy chain termination method, e.g. Sanger et al, Proc. Natl. Acad. Sci., 74:5463-5467 (1977); and the chemical degradation method, e.g. Maxam et al, Proc. Natl. Acad. Sci., 74:560-564 (1977). The chain termination method has been improved in several ways, and serves as the basis for all currently available automated DNA sequencing machines, e.g. Sanger et al, J. Mol. Biol., 143:161-178 (1980); Schreier et al, J. Mol. Biol., 129:169-172 (1979); Smith et al, Nucleic Acids Research, 13: 2399-2412 (1985); Smith et al, Nature, 321:674-679 (1987); Prober et al, Science, 238:336-341 (1987); Section II, Meth. Enzymol., 155:51-334 (1987); Church et al, Science, 240:185-188 (1988); Hunkapiller et al, Science, 254:59-67 (1991); Bevan et al, PCR Methods and Applications, 1: 222-228 (1992).

Both the chain termination and chemical degradation methods require the generation of one or more sets of labeled DNA fragments, each having a common origin and each terminating with a known base. The set or sets of fragments must then be separated by size to obtain sequence information. In both methods, the DNA fragments are separated by high resolution gel electrophoresis, which must have the capacity of distinguishing very large fragments differing in size by no more than a single nucleotide. Unfortunately, this step severely limits the size of the DNA chain that can be sequenced at one time. Sequencing using these techniques can reliably accommodate a DNA chain of up to about 400-450 nucleotides, Bankier et al, Meth. Enzymol., 155:51-93 (1987); and Hawkins et al, Electrophoresis, 13:552-559 (1992).

Several significant technical problems have seriously impeded the application of such techniques to the sequencing of long target polynucleotides, e.g. in excess of 500-600 nucleotides, or to the sequencing of high volumes of many target polynucleotides. Such problems include i) the gel electrophoretic separation step which is labor intensive, is difficult to automate, and introduces an extra degree of variability in the analysis of data, e.g. band broadening due to temperature effects, compressions due to secondary structure in the DNA sequencing fragments, inhomogeneities in the separation gel, and the like; ii) nucleic acid polymerases whose properties, such as processivity, fidelity, rate of polymerization, rate of incorporation of chain terminators, and the like, are often sequence dependent; iii) detection and analysis of DNA sequencing fragments which are typically present in fmol quantities in spatially overlapping bands in a gel; iv) lower signals because the labelling moiety is distributed over the many hundred spatially separated bands rather than being concentrated in a single homogeneous phase, and v) in the case of single-lane fluorescence detection, the availability of dyes with suitable emission and absorption properties, quantum yield, and spectral resolvability, e.g. Trainor, Anal. Biochem., 62:418-426 (1990); Connell et al, Biotechniques, 5:342-348 (1987); Karger et al, Nucleic Acids Research, 19:4955-4962 (1991); Fung et al, U.S. Pat. No. 4,855,225; and Nishikawa et al, Electrophoresis, 12: 623-631 (1991).

Another problem exists with current technology in the area of diagnostic sequencing. An ever widening array of disorders, susceptibilities to disorders, prognoses of disease conditions, and the like, have been correlated with the presence of particular DNA sequences, or the degree of variation (or mutation) in DNA sequences, at one or more genetic loci. Examples of such phenomena include human leukocyte antigen (HLA) typing, cystic fibrosis, tumor progression and heterogeneity, p53 proto-oncogene mutations, ras proto-oncogene mutations, and the like, e.g. Gyllensten et al, PCR Methods and Applications, 1:91-98 (1991); Santamaria et al, International application PCT/US92/01675; Tsui et al, International application PCT/CA90/00267; and the like. A difficulty in determining DNA sequences associated with such conditions to obtain diagnostic or prognostic information is the frequent presence of multiple subpopulations of DNA, e.g. allelic variants, multiple mutant forms, and the like. Distinguishing the presence and identity of multiple sequences with current sequencing technology is virtually impossible, without additional work to isolate and perhaps clone the separate species of DNA.

A major advance in sequencing technology could be made if an alternative approach were available for sequencing DNA that did not require high resolution separations, generated signals more amenable to analysis, and provided a means for readily analyzing DNA from heterozygous genetic loci.

SUMMARY OF THE INVENTION

The invention provides a method of nucleic acid sequence analysis based on repeated cycles of ligation and cleavage of probes at the terminus of a target polynucleotide. Preferably, at each such cycle a terminal nucleotide is identified and removed from the end of the target polynucleotide, such that further cycles of ligation, cleavage, and identification can take place. That is, in each cycle the target sequence is shortened by a single nucleotide and the cycles are repeated until the nucleotide sequence of the target polynucleotide is determined. An important feature of the invention is providing a target polynucleotide, that is, the nucleic acid whose sequence is to be determined, with a protruding strand.

Another important feature of the invention is the probe employed in the ligation and cleavage events. A probe of the invention is a double stranded polynucleotide (i) containing a recognition site for a nuclease and (ii) having a protruding strand capable of forming a duplex with the protruding strand of the target polynucleotide. Preferably, at each cycle, only those probes whose protruding strands form perfectly matched duplexes with the protruding strand of the target polynucleotide are ligated to the end of the target polynucleotide to form a ligated complex. After removal of the unligated probe, a nuclease recognizing the probe cuts the ligated complex at a site one or more nucleotides from the ligation site along the target polynucleotide leaving a protruding strand capable of participating in the next cycle of ligation and cleavage. An important feature of the nuclease is that its recognition site be separate from its cleavage site. As is described more fully below, in the course of such cycles of ligation and cleavage, the terminal nucleotides of the target polynucleotide are identified.

In one aspect of the invention, more than one nucleotide at the terminus of a target polynucleotide can be identified and/or cleaved during each cycle of the method.

Generally, the method of the invention comprises the following steps: (a) ligating a probe to an end of the polynucleotide having a protruding strand to form a ligated complex, the probe having a complementary protruding strand to that of the polynucleotide and the probe having a nuclease recognition site; (b) identifying one or more nucleotides in the protruding strand of the polynucleotide; (c) cleaving the ligated complex with a nuclease; and (d) repeating steps (a) through (c) until the nucleotide sequence of the polynucleotide is determined. As is described more fully below, the order of steps (a) through (c) may vary with different embodiments of the invention. For example, identifying the one or more nucleotides can be carried out either before or after cleavage of the ligated complex from the target polynucleotide. Likewise, ligating a probe to the end of the polynucleotide may follow the step of identifying in some preferred embodiments of the invention. Preferably, the method further includes a step of removing the unligated probe after the step of ligating.
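
The cycle of steps (a) through (d) can be sketched as a toy Python simulation. This is a simplified illustration under stated assumptions, not the patented procedure: the protruding strand is modeled as a plain string, exactly one nucleotide is identified and removed per cycle, each probe carries a hypothetical `label` naming the target base it pairs with, and enzyme geometry (recognition site position, overhang length) is ignored. All names are invented for the example.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def ligate(terminal_base, probes):
    """Step (a): pick the probe whose end is perfectly matched to the terminal base."""
    for probe in probes:
        if probe["overhang"] == PAIR[terminal_base]:
            return probe
    raise ValueError("no perfectly matched probe in the mixture")

def sequence_by_ligation_and_cleavage(target):
    """Toy model: read `target` one terminal nucleotide per cycle."""
    # Assumed probe mixture: one probe per base, labeled with the target
    # base that its overhang is complementary to.
    probes = [{"overhang": b, "label": PAIR[b]} for b in "ACGT"]
    remaining = target
    called = []
    while remaining:                           # step (d): repeat until done
        probe = ligate(remaining[0], probes)   # step (a): ligation
        called.append(probe["label"])          # step (b): identification
        remaining = remaining[1:]              # step (c): cleavage shortens the target
    return "".join(called)

print(sequence_by_ligation_and_cleavage("GATTACA"))  # GATTACA
```

In this sketch identification happens before cleavage; as the text notes, other embodiments may order these steps differently.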

Preferably, whenever natural protein endonucleases are employed as the nuclease, the method further includes a step of methylating the target polynucleotide at the start of a sequencing operation.

The present invention overcomes many of the deficiencies inherent to current methods of DNA sequencing: there is no requirement for the electrophoretic separation of closely-sized DNA fragments; no difficult-to-automate gel-based separations are required; no polymerases are required; detection and analysis are greatly simplified because signal-to-noise ratios are much more favorable on a nucleotide-by-nucleotide basis, permitting smaller sample sizes to be employed; and for fluorescent-based detection schemes, analysis is further simplified because fluorophores labeling different nucleotides may be separately detected in homogeneous solutions rather than in spatially overlapping bands.

The present invention is readily automated, both for small-scale serial operation and for large-scale parallel operation, wherein many target polynucleotides or many segments of a single target polynucleotide are sequenced simultaneously. Unlike present sequencing approaches, the progressive nature of the method--that is, determination of a sequence nucleotide-by-nucleotide--permits one to monitor the progress of the sequencing operation in real time which, in turn, permits the operation to be curtailed, or re-started, if difficulties arise, thereby leading to significant savings in time and reagent usage. Also unlike current approaches, the method permits the simultaneous determination of allelic forms of a target polynucleotide: as described more fully below, if a population of target polynucleotides consists of several subpopulations of distinct sequences, e.g. polynucleotides from a heterozygous genetic locus, then the method can identify the proportion of each nucleotide at each position in the sequence.
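The idea of reading out the proportion of each nucleotide at each position of a mixed population can be illustrated with a short sketch. It assumes, purely for illustration, that every molecule in the population is read in register and that the detected signal is proportional to the number of molecules carrying each base; the function name and the example population are hypothetical.

```python
from collections import Counter


def nucleotide_proportions(population):
    """Return, for each position, the fraction of molecules carrying A, C, G, T.

    `population` is a list of equal-register target sequences (strings).
    """
    length = min(len(seq) for seq in population)
    proportions = []
    for i in range(length):
        counts = Counter(seq[i] for seq in population)
        total = sum(counts.values())
        # Counter returns 0 for bases that do not occur at this position.
        proportions.append({base: counts[base] / total for base in "ACGT"})
    return proportions


# Hypothetical heterozygous locus: two alleles differing at the second position.
population = ["ATGC"] * 2 + ["ACGC"] * 2
profile = nucleotide_proportions(population)
```

In this example the second position of `profile` shows equal fractions of T and C, while every other position is dominated by a single base.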

Generally, the method of the invention is applicable to all tasks where DNA sequencing is employed, including medical diagnostics, genetic mapping, genetic identification, forensic analysis, molecular biology research, and the like.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1a illustrates a preferred structure of a labeled probe of the invention.

FIG. 1b illustrates a probe and terminus of a target polynucleotide wherein a separate labeling step is employed to identify one or more nucleotides in the protruding strand of a target polynucleotide.

FIG. 1c illustrates steps of an embodiment wherein a nucleotide of the target polynucleotide is identified by extension with a polymerase in the presence of labeled dideoxynucleoside triphosphates followed by their excision, strand extension, and strand displacement.

FIG. 2 illustrates the relative positions of the nuclease recognition site, ligation site, and cleavage site in a ligated complex.

FIGS. 3a through 3h diagrammatically illustrate the embodiment referred to herein as "double stepping," or the simultaneous use of two different nucleases in accordance with the invention.

DEFINITIONS

As used herein "sequence determination" or "determining a nucleotide sequence" in reference to polynucleotides includes determination of partial as well as full sequence information of the polynucleotide. That is, the term includes sequence comparisons, fingerprinting, and like levels of information about a target polynucleotide, as well as the express identification and ordering of each nucleoside of the test polynucleotide.

"Perfectly matched duplex" in reference to the protruding strands of probes and target polynucleotides means that the protruding strand from one forms a double stranded structure with the other such that each nucleotide in the double stranded structure undergoes Watson-Crick basepairing with a nucleotide on the opposite strand. The term also comprehends the pairing of nucleoside analogs, such as deoxyinosine, nucleosides with 2-aminopurine bases, and the like, that may be employed to reduce the degeneracy of the probes.

The term "oligonucleotide" as used herein includes linear oligomers of nucleosides or analogs thereof, including deoxyribonucleosides, ribonucleosides, and the like. Usually oligonucleotides range in size from a few monomeric units, e.g. 3-4, to several hundreds of monomeric units. Whenever an oligonucleotide is represented by a sequence of letters, such as "ATGCCTG," it will be understood that the nucleotides are in 5'→3' order from left to right and that "A" denotes deoxyadenosine, "C" denotes deoxycytidine, "G" denotes deoxyguanosine, and "T" denotes thymidine, unless otherwise noted.

As used herein, "nucleoside" includes the natural nucleosides, including 2'-deoxy and 2'-hydroxyl forms, e.g. as described in Kornberg and Baker, DNA Replication, 2nd Ed. (Freeman, San Francisco, 1992). "Analogs" in reference to nucleosides includes synthetic nucleosides having modified base moieties and/or modified sugar moieties, e.g. described generally by Scheit, Nucleotide Analogs (John Wiley, New York, 1980). Such analogs include synthetic nucleosides designed to enhance binding properties, reduce degeneracy, increase specificity, and the like.

DETAILED DESCRIPTION OF THE INVENTION

The invention provides a method of sequencing nucleic acids which obviates electrophoretic separation of similarly sized DNA fragments and which eliminates the difficulties associated with the detection and analysis of spatially overlapping bands of DNA fragments in a gel or like medium. Moreover, the invention obviates the need to generate DNA fragments from long single stranded templates with a DNA polymerase.

As mentioned above, an important feature of the invention is the probes ligated to the target polynucleotide. In one aspect of the invention, probes have the form illustrated in FIG. 1a. Probes are double stranded segments of DNA having a protruding strand at one end 10, at least one nuclease recognition site 12, and a spacer region 14 between the recognition site and the protruding end 10. Preferably, probes also include a label 16, which in this particular embodiment is illustrated at the end opposite the protruding strand. The probes may be labeled by a variety of means and at a variety of locations, the only restriction being that the labeling means selected does not interfere with the ligation step or with the recognition of the probe by the nuclease.

Preferably, this embodiment of the invention comprises the following steps: (a) ligating a probe to an end of the polynucleotide having a protruding strand to form a ligated complex, the probe having a complementary protruding strand to that of the polynucleotide and the probe having a nuclease recognition site; (b) removing unligated probe from the ligated complex; (c) identifying one or more nucleotides in the protruding strand of the polynucleotide by the identity of the ligated probe; (d) cleaving the ligated complex with a nuclease; and (e) repeating steps (a) through (d) until the nucleotide sequence of the polynucleotide is determined. The step of identifying can take place either before or after the step of cleaving. Preferably, the one or more nucleotides in the protruding strand of the polynucleotide are identified prior to cleavage.

It is not critical whether protruding strand 10 of the probe is a 5' or 3' end. However, in this embodiment, it is important that the protruding strands of the target polynucleotide and probes be capable of forming perfectly matched duplexes to allow for specific ligation. If the protruding strands of the target polynucleotide and probe are of different lengths, the resulting gap can be filled in by a polymerase prior to ligation, e.g. as in "gap LCR" disclosed in Backman et al, European patent application 91100959.5. Preferably, the number of nucleotides in the respective protruding strands is the same so that both strands of the probe and target polynucleotide are capable of being ligated without a filling step. Preferably, the protruding strand of the probe is from 2 to 6 nucleotides long. As indicated below, the greater the length of the protruding strand, the greater the complexity of the probe mixture that is applied to the target polynucleotide during each ligation and cleavage cycle.
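The growth in probe-mixture complexity with protruding-strand length can be made concrete with a small calculation. The sketch assumes a fully degenerate probe set built from the four natural nucleotides, so that a protruding strand of n nucleotides requires 4^n distinct probe sequences; the function name is hypothetical, and degeneracy-reducing analogs would lower these counts.

```python
def probe_mixture_complexity(overhang_len, alphabet_size=4):
    """Number of distinct protruding-strand sequences in a fully degenerate set."""
    return alphabet_size ** overhang_len


# Complexity over the preferred range of 2 to 6 nucleotides.
complexity_by_length = {n: probe_mixture_complexity(n) for n in range(2, 7)}
# -> {2: 16, 3: 64, 4: 256, 5: 1024, 6: 4096}
```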

In another aspect of the invention, the primary function of the probe is to provide a site for a nuclease to bind to the ligated complex so that the complex can be cleaved and the target polynucleotide shortened. In this aspect of the invention, identification of the nucleotides can take place separately from probe ligation and cleavage. This embodiment provides several advantages: First, correct sequence determination does not require that the protruding strand of the ligated probe be perfectly complementary to the protruding strand of the target polynucleotide, thereby permitting greater flexibility in the control of hybridization stringency. Second, one need not provide a fully degenerate set of probes based on the four natural nucleotides. So-called "wild card" nucleotides, or "degeneracy reducing analogs" can be provided to significantly reduce, or even eliminate, the complexity of the probe mixture employed in the ligation step, since specific binding is not critical to nucleotide identification in this embodiment.

Preferably, this embodiment of the invention comprises the following steps: (a) providing a polynucleotide having a 3' recessed strand and a 5' protruding strand; (b) identifying one or more nucleotides in the protruding strand; (c) ligating a probe having a 5' protruding strand to an end of the polynucleotide to form a ligated complex, the probe having a complementary protruding strand to that of the polynucleotide and the probe having a nuclease recognition site; (d) cleaving the ligated complex with a nuclease; and (e) repeating steps (a) through (d) until the nucleotide sequence of the polynucleotide is determined. A nuclease is employed that produces a 3'-recessed strand and 5' protruding strand at the terminus of the target polynucleotide.

An example of this embodiment is illustrated in FIG. 1b: The 3' recessed strand of polynucleotide (15) is extended with a nucleic acid polymerase in the presence of the four dideoxynucleoside triphosphates, each carrying a distinguishable fluorescent label, so that the 3' recessed strand is extended by one nucleotide (11), which permits its complementary nucleotide in the 5' protruding strand of polynucleotide (15) to be identified. Probe (9) having recognition site (12), spacer region (14), and complementary protruding strand (10), is then ligated to polynucleotide (15) to form ligated complex (17). Ligated complex (17) is then cleaved at cleavage site (19) to release a labeled fragment (21) and augmented probe (23). A shortened polynucleotide (15) with a regenerated 3' recessed strand is then ready for the next cycle of identification, ligation, and cleavage.
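The identification-by-extension cycle described above can be sketched as a toy model. Several simplifications are assumed here and are not taken from the text: the 5' protruding strand is represented as a string read 5' to 3', the protruding strand keeps a fixed length, and each cleavage shortens the target by exactly one nucleotide. The function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def identify_by_extension(template, overhang_len, cycles):
    """Toy model of identification by labeled dideoxynucleotide extension.

    `template` is the strand bearing the 5' protruding end, written 5'->3'.
    In each cycle the 3' recessed strand is extended by one labeled
    dideoxynucleotide, which reports the template base adjacent to the duplex.
    """
    calls = []
    start = 0  # index of the current 5' end of the template strand
    for _ in range(cycles):
        adjacent = start + overhang_len - 1  # overhang base next to the duplex
        if adjacent >= len(template):
            break
        # The polymerase incorporates the labeled ddNTP complementary to the template.
        incorporated = COMPLEMENT[template[adjacent]]
        # The label on the incorporated ddNTP identifies its complement in the template.
        calls.append(COMPLEMENT[incorporated])
        # Assumed: ligation and cleavage shorten the target by one nucleotide.
        start += 1
    return "".join(calls)
```

Under these assumptions, three cycles on the template "ATGCCTG" with a four-nucleotide protruding strand report the bases at the fourth, fifth, and sixth positions in turn.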

In such embodiments, the first nucleotide of the 5' protruding strand adjacent to the double stranded portion of the target polynucleoti