WikiPatents - Community Patent Review
Characterization of individual polymer molecules based on monomer-interface interactions    
United States Patent 6015714
Link to this page: http://www.wikipatents.com/6015714.html
Inventor(s): Baldarelli; Richard (Bar Harbor, ME); Branton; Daniel (Lexington, MA); Church; George (Brookline, MA); Deamer; David W. (Santa Cruz, CA); Akeson; Mark (Santa Cruz, CA); Kasianowicz; John (Darnestown, MD)
Abstract: A method for sequencing a nucleic acid polymer by (1) providing two separate, adjacent pools of a medium and an interface between the two pools, the interface having a channel so dimensioned as to allow sequential monomer-by-monomer passage from one pool to the other pool of only one nucleic acid polymer at a time; (2) placing the nucleic acid polymer to be sequenced in one of the two pools; and (3) taking measurements as each of the nucleotide monomers of the nucleic acid polymer passes through the channel so as to sequence the nucleic acid polymer.



Owner/Assignee     The United States of America as represented by the Secretary of Commerce (Washington, DC) The Regents of the University of California (Oakland, CA)
Publication Date     January 18, 2000
Application Number     09/098,142
Filing Date     June 16, 1998
Examiner     Patterson Jr.; Charles L.
Attorney/Law Firm     Fish & Richardson P. C.
Parent Case     CROSS REFERENCE TO RELATED APPLICATIONS This application is a continuation-in-part of U.S. Ser. No. 08/405,735 filed on Mar. 17, 1995 now U.S. Pat. No. 5,795,782.
CLAIMS
 


What is claimed is:

1. A method for sequencing a nucleic acid polymer, the method comprising:

providing two separate, adjacent pools of a medium and an interface between the two pools, the interface having a channel so dimensioned as to allow sequential monomer-by-monomer passage from one pool to the other pool of only one nucleic acid polymer at a time;

placing the nucleic acid polymer to be sequenced in one of the two pools; and

taking measurements as each of the nucleotide monomers of the nucleic acid polymer passes through the channel so as to sequence the nucleic acid polymer.

2. The method of claim 1, wherein the medium is electrically conductive.

3. The method of claim 2, wherein the medium is an aqueous solution.

4. The method of claim 3, further comprising applying a voltage across the interface.

5. The method of claim 4, wherein ionic flow between the two pools is measured.

6. The method of claim 5, wherein the duration of ionic flow blockage is measured.

7. The method of claim 5, wherein the amplitude of ionic flow blockage is measured.

8. The method of claim 2, further comprising applying a voltage across the interface.

9. The method of claim 8, wherein ionic flow between the two pools is measured.

10. The method of claim 9, wherein the duration of ionic flow blockage is measured.

11. The method of claim 9, wherein the amplitude of ionic flow blockage is measured.

12. The method of claim 1, wherein the nucleic acid polymer interacts with an inner surface of the channel.

13. The method of claim 12, wherein the medium is electrically conductive.

14. The method of claim 13, wherein the medium is an aqueous solution.

15. The method of claim 14, further comprising applying a voltage across the interface.

16. The method of claim 15, wherein ionic flow between the two pools is measured.

17. The method of claim 16, further comprising applying a voltage across the interface.

18. The method of claim 17, wherein ionic flow between the two pools is measured.

19. The method of claim 18, wherein the duration of ionic flow blockage is measured.

20. The method of claim 18, wherein the amplitude of ionic flow blockage is measured.

21. The method of claim 1, further comprising providing a polymerase or exonuclease in one of the two pools, wherein the polymerase or exonuclease draws the nucleic acid polymer through the channel.

22. The method of claim 21, wherein the medium is an aqueous solution.

23. The method of claim 22, wherein ionic flow between the two pools is measured.

24. The method of claim 21, wherein ionic flow between the two pools is measured.

DESCRIPTION
 


BACKGROUND OF THE INVENTION

Rapid, reliable, and inexpensive characterization of polymers, particularly nucleic acids, has become increasingly important. One notable project, known as the Human Genome Project, has as its goal sequencing the entire human genome, which is over three billion nucleotides.

Typical current nucleic acid sequencing methods depend either on chemical reactions that yield multiple length DNA strands cleaved at specific bases, or on enzymatic reactions that yield multiple length DNA strands terminated at specific bases. In each of these methods, the resulting DNA strands of differing length are then separated from each other and identified in strand length order. The chemical or enzymatic reactions, as well as the technology for separating and identifying the different length strands, usually involve tedious, repetitive work. A method that reduces the time and effort required would represent a highly significant advance in biotechnology.

SUMMARY OF THE INVENTION

The invention relates to a method for rapid, easy characterization of individual polymer molecules, for example polymer size or sequence determination. Individual molecules in a population may be characterized in rapid succession.

Stated generally, the invention features a method for evaluating a polymer molecule which includes linearly connected (sequential) monomer residues. Two separate pools of a medium and an interface between the pools are provided. The interface between the pools is capable of interacting sequentially with the individual monomer residues of a single polymer present in one of the pools. Interface dependent measurements are continued over time, as individual monomer residues of a single polymer interact sequentially with the interface, yielding data suitable to infer a monomer-dependent characteristic of the polymer. Several individual polymers, e.g., in a heterogenous mixture, can be characterized or evaluated in rapid succession, one polymer at a time, leading to characterization of the polymers in the mixture.

The method is broadly useful for characterizing polymers that are strands of monomers which, in general (if not entirely), are arranged in linear strands. The method is particularly useful for characterizing biological polymers such as deoxyribonucleic acids, ribonucleic acids, polypeptides, and oligosaccharides, although other polymers may be evaluated. In some embodiments, a polymer which carries one or more charges (e.g., nucleic acids, polypeptides) will facilitate implementation of the invention.

The monomer-dependent characterization achieved by the invention may include identifying physical characteristics such as the number and composition of monomers that make up each individual molecule, preferably in sequential order from any starting point within the polymer or its beginning or end. A heterogenous population of polymers may be characterized, providing a distribution of characteristics (such as size) within the population. Where the monomers within a given polymer molecule are heterogenous, the method can be used to determine their sequence.

The interface between the pools is designed to allow passage of the monomers of one polymer molecule at a time. As described in greater detail below, the useful portion of the interface may be a passage in or through an otherwise impermeable barrier, or it may be an interface between immiscible liquids.

The medium used in the invention may be any fluid that permits adequate polymer mobility for interface interaction. Typically, the medium will be liquids, usually aqueous solutions or other liquids or solutions in which the polymers can be distributed. When an electrically conductive medium is used, it can be any medium which is able to carry electrical current. Such solutions generally contain ions as the current conducting agents, e.g., sodium, potassium, chloride, calcium, cesium, barium, sulfate, or phosphate. Conductance across the pore or channel is determined by measuring the flow of current across the pore or channel via the conducting medium. A voltage difference can be imposed across the barrier between the pools by conventional means. Alternatively, an electrochemical gradient may be established by a difference in the ionic composition of the two pools of medium, either with different ions in each pool, or different concentrations of at least one of the ions in the solutions or media of the pools. In this embodiment of the invention, conductance changes are measured and are indicative of monomer-dependent characteristics.
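The electrochemical gradient produced by unequal ion concentrations in the two pools can be estimated with the standard Nernst equation. The sketch below is illustrative only and is not part of the patent: the function name, the pool labels, the default temperature of 298.15 K, and the example concentrations are all assumptions.

```python
import math

GAS_CONSTANT = 8.314462618   # J / (mol K)
FARADAY = 96485.33212        # C / mol


def nernst_potential_mV(conc_pool_a: float, conc_pool_b: float,
                        valence: int, temperature_K: float = 298.15) -> float:
    """Potential of pool A relative to pool B (in mV) at which one ion
    species shows no net flux: E = (R*T / (z*F)) * ln(c_B / c_A).

    Pool labels and default temperature are hypothetical choices.
    """
    if conc_pool_a <= 0 or conc_pool_b <= 0:
        raise ValueError("concentrations must be positive")
    if valence == 0:
        raise ValueError("valence must be non-zero")
    volts = (GAS_CONSTANT * temperature_K / (valence * FARADAY)
             * math.log(conc_pool_b / conc_pool_a))
    return 1000.0 * volts


# Hypothetical example: a monovalent cation at 0.1 M in pool A and 1.0 M in pool B.
example_mV = nernst_potential_mV(0.1, 1.0, valence=1)
```

Equal concentrations give zero potential, so in this sketch a gradient exists only when the two pools differ in composition, as the paragraph above describes.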

The term "ion permeable passages" used in this embodiment of the invention includes ion channels, ion-permeable pores, and other ion-permeable passages, and all are used herein to include any local site of transport through an otherwise impermeable barrier. For example, the term includes naturally occurring, recombinant, or mutant proteins which permit the passage of ions under conditions where ions are present in the medium contacting the channel or pore. Synthetic pores are also included in the definition. Examples of such pores can include, but are not limited to, chemical pores formed, e.g., by nystatin, ionophores, or mechanical perforations of a membranous material. Proteinaceous ion channels can be voltage-gated or voltage independent, including mechanically gated channels (e.g., stretch-activated K+ channels), or recombinantly engineered or mutated voltage dependent channels (e.g., Na+ or K+ channels constructed as is known in the art).

Another type of channel is a protein which includes a portion of a bacteriophage receptor which is capable of binding all or part of a bacteriophage ligand (either a natural or functional ligand) and transporting bacteriophage DNA from one side of the interface to the other. The polymer to be characterized includes a portion which acts as a specific ligand for the bacteriophage receptor, so that it may be injected across the barrier/interface from one pool to the other.

The protein channels or pores of the invention can include those translated from one or more natural and/or recombinant DNA molecule(s) which includes a first DNA which encodes a channel or pore forming protein and a second DNA which encodes a monomer-interacting portion of a monomer polymerizing agent (e.g., a nucleic acid polymerase or exonuclease). The expressed protein or proteins are capable of non-covalent association or covalent linkage (any linkage herein referred to as forming an "assemblage" of "heterologous units"), and when so associated or linked, the polymerizing portion of the protein structure is able to polymerize monomers from a template polymer, close enough to the channel forming portion of the protein structure to measurably affect ion conductance across the channel. Alternatively, assemblages can be formed from unlike molecules, e.g., a chemical pore linked to a protein polymerase; these assemblages fall under the definition of a "heterologous" assemblage.

The invention also includes the recombinant fusion protein(s) translated from the recombinant DNA molecule(s) described above, so that a fusion protein is formed which includes a channel forming protein linked as described above to a monomer-interacting portion of a nucleic acid polymerase. Preferably, the nucleic acid polymerase portion of the recombinant fusion protein is capable of catalyzing polymerization of nucleotides. Preferably, the nucleic acid polymerase is a DNA or RNA polymerase, more preferably T7 RNA polymerase.

The polymer being characterized may remain in its original pool, or it may cross the passage. Either way, as a given polymer molecule moves in relation to the passage, individual monomers interact sequentially with the elements of the interface to induce a change in the conductance of the passage. The passages can be traversed either by polymer transport through the central opening of the passage so that the polymer passes from one of the pools into the other, or by the polymer traversing across the opening of the passage without crossing into the other pool. In the latter situation, the polymer is close enough to the channel for its monomers to interact with the passage and bring about the conductance changes which are indicative of polymer characteristics. The polymer can be induced to interact with or traverse the pore, e.g., as described below, by a polymerase or other template-dependent polymer replicating catalyst linked to the pore which draws the polymer across the surface of the pore as it synthesizes a new polymer from the template polymer, or by a polymerase in the opposite pool which pulls the polymer through the passage as it synthesizes a new polymer from the template polymer. In such an embodiment, the polymer replicating catalyst is physically linked to the ion-permeable passage, and at least one of the conducting pools contains monomers suitable to be catalytically linked in the presence of the catalyst. A "polymer replicating catalyst," "polymerizing agent" or "polymerizing catalyst" is an agent that can catalytically assemble monomers into a polymer in a template dependent fashion--i.e., in a manner that uses the polymer molecule originally provided as a template for reproducing that molecule from a pool of suitable monomers. Such agents include, but are not limited to, nucleotide polymerases of any type, e.g., DNA polymerases, RNA polymerases, tRNA and ribosomes.

The characteristics of the polymer can be identified by the amplitude or duration of individual conductance changes across the passage. Such changes can identify the monomers in sequence, as each monomer will have a characteristic conductance change signature. For instance, the volume, shape, or charges on each monomer will affect conductance in a characteristic way. Likewise, the size of the entire polymer can be determined by observing the length of time (duration) that monomer-dependent conductance changes occur. Alternatively, the number of monomers in a polymer (also a measure of size) can be determined as a function of the number of monomer-dependent conductance changes for a given polymer traversing a passage. The number of monomers may not correspond exactly to the number of conductance changes, because there may be more than one conductance level change as each monomer of the polymer passes sequentially through the channel. However, there will be a proportional relationship between the two values which can be determined by preparing a standard with a polymer of known sequence.
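The amplitude and duration measurements described above can be sketched as a simple threshold detector over a sampled current trace. This is a minimal illustration under stated assumptions, not the inventors' procedure: the `Blockade` record, the `find_blockades` function, the 50% threshold, and the sample trace are all hypothetical, and the current is assumed to be given as positive magnitudes.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Blockade:
    start_s: float     # time at which the current first falls below threshold
    duration_s: float  # how long the current stays below threshold
    amplitude: float   # open level minus mean blocked level (input units)


def _make_event(current: Sequence[float], start: int, end: int,
                dt_s: float, open_level: float) -> Blockade:
    segment = current[start:end]
    mean_blocked = sum(segment) / len(segment)
    return Blockade(start * dt_s, (end - start) * dt_s, open_level - mean_blocked)


def find_blockades(current: Sequence[float], dt_s: float, open_level: float,
                   threshold_fraction: float = 0.5) -> List[Blockade]:
    """Return one Blockade per contiguous run of samples below the threshold."""
    threshold = open_level * threshold_fraction
    events: List[Blockade] = []
    start = None
    for i, value in enumerate(current):
        below = value < threshold
        if below and start is None:
            start = i                      # a blockade begins
        elif not below and start is not None:
            events.append(_make_event(current, start, i, dt_s, open_level))
            start = None                   # the blockade has ended
    if start is not None:                  # trace ended while still blocked
        events.append(_make_event(current, start, len(current), dt_s, open_level))
    return events


# Hypothetical trace sampled every 1 ms, with an open-channel level of 100.
trace = [100, 100, 12, 12, 12, 100, 100, 10, 100]
events = find_blockades(trace, dt_s=0.001, open_level=100)
```

Counting the returned events, or the sub-levels within each event, gives the number of conductance changes that the text relates to monomer count through a standard of known sequence.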

The mixture of polymers used in the invention does not need to be homogenous. Even when the mixture is heterogenous, only one molecule interacts with a passage at a time, yielding a size distribution of molecules in the mixture, and/or sequence data for multiple polymer molecules in the mixture.

In other embodiments, the channel is a natural or recombinant bacterial porin molecule that is relatively insensitive to an applied voltage and does not gate. Preferred channels for use in the invention include the α-hemolysin toxin from S. aureus and maltoporin channels.

In other preferred embodiments, the channel is a natural or recombinant voltage-sensitive or voltage gated ion channel, preferably one which does not inactivate (whether naturally or through recombinant engineering as is known in the art). "Voltage sensitive" or "gated" indicates that the channel displays activation and/or inactivation properties when exposed to a particular range of voltages.

In an alternative embodiment of the invention, the pools of medium are not necessarily conductive, but are of different compositions so that the liquid of one pool is not miscible in the liquid of the other pool, and the interface is the immiscible surface between the pools. In order to measure the characteristics of the polymer, a polymer molecule is drawn through the interface of the liquids, resulting in an interaction between each sequential monomer of the polymer and the interface. The sequence of interactions as the monomers of the polymer are drawn through the interface is measured, yielding information about the sequence of monomers that characterize the polymer. The measurement of the interactions can be by a detector that measures the deflection of the interface (caused by each monomer passing through the interface) using reflected or refracted light, or a sensitive gauge capable of measuring intermolecular forces. Several methods are available for measurement of forces between macromolecules and interfacial assemblies, including the surface forces apparatus (Israelachvili, Intermolecular and Surface Forces, Academic Press, New York, 1992), optical tweezers (Ashkin et al., Opt. Lett., 11: 288, 1986; Kuo and Sheetz, Science, 260: 232, 1993; Svoboda et al., Nature 365: 721, 1993), and atomic force microscopy (Quate, F. Surf. Sci. 299: 980, 1994; Mate et al., Phys. Rev. Lett. 59: 1942, 1987; Frisbie et al., Science 265: 71, 1994; all hereby incorporated by reference).

The interactions between the interface and the monomers in the polymer are suitable to identify the size of the polymer, e.g., by measuring the length of time during which the polymer interacts with the interface as it is drawn across the interface at a known rate, or by measuring some feature of the interaction (such as deflection of the interface, as described above) as each monomer of the polymer is sequentially drawn across the interface. The interactions can also be sufficient to ascertain the identity of individual monomers in the polymer.

The invention further features a method for sequencing a nucleic acid polymer, which can be double stranded or single stranded, by (1) providing two separate, adjacent pools of a medium and an interface (e.g., a lipid bilayer) between the two pools, the interface having a channel (e.g., bacterial porin molecules) so dimensioned as to allow sequential monomer-by-monomer passage from one pool to another of only one nucleic acid polymer at a time; (2) placing the nucleic acid polymer to be sequenced in one of the two pools; and (3) taking measurements (e.g., ionic flow measurements, including measuring duration or amplitude of ionic flow blockage) as each of the nucleotide monomers of the nucleic acid polymer passes through the channel, so as to sequence the nucleic acid polymer. The interface can include more than one channel in this method. In some cases, the nucleic acid polymer can interact with an inner surface of the channel. The sequencing of a nucleic acid, as used herein, is not limited to identifying specific nucleotide monomers, but can include distinguishing one type of monomer from another type of monomer (e.g., purines from pyrimidines).

The two pools can contain an electrically conductive medium (e.g., an aqueous solution), in which case a voltage can be optionally applied across the interface to facilitate movement of the nucleic acid polymer through the channel and the taking of measurements. Such measurements are interface-dependent, i.e., the measurements are spatially or temporally related to the interface. For example, ionic measurements can be taken when the polymer traverses an internal limiting (in size or conductance) aperture of the channel. In this case, the flow of ions through the channel, and especially through the limiting aperture of the channel, is affected by the size or charge of the polymer and the inside surface of the channel. These measurements are spatially related to the interface because one measures the ionic flow through the interface as specific monomers pass a specific portion (the limiting aperture) of the interface channel.

To maximize the signal to noise ratio when ionic flow measurements are taken, the interface surface area facing a chamber is preferably less than 0.02 mm². In general, the interface containing the channels should have a design which minimizes the total access resistance to less than 20% of the theoretical (calculated) minimal convergence resistance. The total access resistance is the sum of the resistance contributed by the electrode/electrolyte interface, salt bridges, and the medium in the channel. The resistance of the medium in the channel includes the bulk resistance, the convergence resistance at each end of the channel, and the intra-channel resistance.
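The resistance budget in the preceding paragraph amounts to a sum and a comparison. The sketch below only restates that arithmetic; the function names and every numeric value are hypothetical placeholders, and the convergence resistance is taken as an input rather than calculated.

```python
def total_access_resistance(electrode_interface_ohm: float,
                            salt_bridge_ohm: float,
                            channel_medium_ohm: float) -> float:
    """Sum the three contributions named in the text (all in ohms)."""
    return electrode_interface_ohm + salt_bridge_ohm + channel_medium_ohm


def meets_access_target(total_access_ohm: float,
                        min_convergence_ohm: float,
                        max_fraction: float = 0.20) -> bool:
    """True when total access resistance is below 20% (by default) of the
    calculated minimal convergence resistance."""
    return total_access_ohm < max_fraction * min_convergence_ohm


# Hypothetical component values, chosen only to exercise the check.
total = total_access_resistance(1e5, 2e5, 5e5)   # 8e5 ohms
ok = meets_access_target(total, 5e6)             # 8e5 < 1e6
too_high = meets_access_target(total, 3e6)       # 8e5 is not < 6e5
```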

In addition, measurements can be temporally related to the interface, such as when a measurement is taken at a pre-determined time or range of times before or after each monomer passes into or out of the channel.

As an alternative to voltage, a nucleic acid polymerase or exonuclease can be provided in one of the chambers to draw the nucleic acid polymer through the channel as discussed below.

This invention offers advantages in nucleotide sequencing, e.g., reduced number of sequencing steps, higher speed of sequencing, and increased length of the polymer to be sequenced. The speed of the method and the size of the polymers it can sequence are particular advantages of the invention. The linear polymer may be very large, and this advantage will be especially useful in reducing template preparation time, sequencing errors and analysis time currently needed to piece together small overlapping fragments of a large gene or stretch of polymer.

Other features and advantages of the invention will be apparent from the following description of the preferred embodiments thereof, and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic representation of an embodiment of DNA characterization by the method of the invention. The unobstructed ionic current (illustrated for the channel at the top of the diagram), is reduced as a polymeric molecule begins its traversal through the pore (illustrated for the channel at the bottom of the diagram).

FIG. 2 is a schematic representation of an implementation of DNA sequencing by the method of the invention. In this embodiment, the polymer is drawn across the opening of the channel, but is not drawn through the channel. The channel, e.g., a porin, is inserted in the phospholipid bilayer. A polymerase domain is fused by its N-terminus to the C-terminus of one of the porin monomers (the porin C-termini are on the periplasmic side of the membrane in both Rhodobacter capsulatus and LamB porins). Fusions on the other side of the membrane can also be made. Malto-oligosaccharides can bind and block current from either side. The polymerase is shown just prior to binding to the promoter. A non-glycosylated base is shown near a pore opening, while a penta-glycosylated cytosine is shown 10 bp away. The polymerase structure represented is that of DNA polymerase I (taken from Ollis et al., 1985, Nature, 313: 762-66), and the general porin model is from Jap (1989, J. Mol. Biol., 205: 407-19).

FIG. 3 is a schematic representation of DNA sequencing results by the method of the invention. The schematic depicts, at very high resolution, one of the longer transient blockages such as can be seen in FIG. 4. The monomeric units of DNA (bases G, A, T, and C) interfere differentially with the flow of ions through the pore, resulting in discrete conductance levels that are characteristic of each base. The order of appearance of the conductance levels sequentially identifies the monomers of the DNA.

FIG. 4 is a recording of the effect of polyadenylic acid (poly A) on the conductance of a single α-hemolysin channel in a lipid bilayer between two aqueous compartments containing 1 M NaCl, 10 mM Tris, pH 7.4. Before addition of RNA, the conductance of the channel was around 850 pS. The cis compartment, to which poly A is added, is -120 mV with respect to the trans compartment. After adding poly A to the cis compartment, the conductance of the α-hemolysin channel begins to exhibit transient blockages (conductance decreases to about 100 pS) as individual poly A molecules are drawn across the channel from the cis to the trans compartment. When viewed at higher resolution (expanded time scale, at top), the duration of each transient blockage is seen to vary between less than 1 msec up to 10 msec. Arrows point to two of the longer duration blockages. See FIGS. 5A and 5B for histograms of blockage duration.
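For orientation, the conductances quoted for FIG. 4 can be converted to current magnitudes with Ohm's law (I = G x V). This is a worked illustration only; the function name is an assumption, and the sign of the applied voltage is ignored so that only magnitudes are compared.

```python
def current_pA(conductance_pS: float, voltage_mV: float) -> float:
    """Ohm's law, I = G * V. pS times mV gives 1e-15 A, so divide by 1000 for pA."""
    return conductance_pS * voltage_mV / 1000.0


# Values taken from the FIG. 4 description: about 850 pS open, about 100 pS blocked, 120 mV.
open_current = current_pA(850, 120)      # about 102 pA
blocked_current = current_pA(100, 120)   # about 12 pA
fraction_blocked = 1 - 100 / 850         # about 0.88 of the conductance is lost
```

Under these figures a blockage removes roughly 88% of the open-channel current, which is why the transient blockages stand out clearly in the recording.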

FIGS. 5A and 5B are comparisons of blockage duration with purified RNA fragments of 320 nt (FIG. 5A) and 1100 nt (FIG. 5B) lengths. The absolute number of blockades plotted in the two histograms are not comparable because they have not been normalized to take into account the different lengths of time over which the data in the two graphs were collected.
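If the recording time for each histogram were known, the counts could be made comparable by converting them to events per second. The sketch below shows that normalization only; the bin labels, counts, and recording times are invented for illustration and do not come from FIGS. 5A or 5B.

```python
from typing import Dict


def blockade_rates(counts_by_bin: Dict[str, int], recording_s: float) -> Dict[str, float]:
    """Divide each bin count by the recording time to get events per second."""
    if recording_s <= 0:
        raise ValueError("recording time must be positive")
    return {label: count / recording_s for label, count in counts_by_bin.items()}


# Hypothetical histograms collected over different lengths of time.
short_run = blockade_rates({"0-1 ms": 30, "1-2 ms": 60}, recording_s=60.0)
long_run = blockade_rates({"0-1 ms": 90, "1-2 ms": 45}, recording_s=180.0)
```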

FIGS. 6A, 6B, and 6C are plots of current measurements versus time according to a method of the invention. FIG. 6A illustrates the current blockages when polycytidylic oligonucleotides traverse a channel. FIG. 6B illustrates the current blockages when polyadenylic oligonucleotides traverse the channel. FIG. 6C illustrates the current blockages when polycytidylic and polyadenylic oligonucleotides traverse a channel.

FIG. 7 is a plot of current measurements versus time according to a method of the invention, illustrating the current blockages when polyA₃₀C₇₀ oligonucleotides traverse a channel.

DETAILED DESCRIPTION

As summarized above, we have determined a new method for rapidly analyzing polymers such as DNA and RNA. We illustrate the invention with two primary embodiments. In one embodiment, the method involves measurements of ionic current modulation as the monomers (e.g., nucleotides) of a linear polymer (e.g., nucleic acid molecule) pass through or across a channel in an artificial membrane. During polymer passage through or across the channel, ionic currents are reduced in a manner that reflects the properties of the polymer (length, concentration of polymers in solution, etc.) and the identities of the monomers. In the second embodiment, an immiscible interface is created between two immiscible liquids, and, as above, polymer passage through the interface results in monomer interactions with the interface which are sufficient to identify characteristics of the polymer and/or the identity of the monomers.

The description of the invention will be primarily concerned with sequencing nucleic acids, but this is not intended to be limiting. It is feasible to size and sequence polymers other than nucleic acids by the method of the invention, including linear protein molecules which include monomers of amino acids. Other linear arrays of monomers, including chemicals (e.g., biochemicals such as polysaccharides), may also be sequenced and characterized by size.

I. Polymer Analysis Using Conductance Changes Across An Interface

Sensitive single channel recording techniques (i.e., the patch clamp technique) can be used in the invention, as a rapid, high-resolution approach allowing differentiation of nucleotide bases of single DNA molecules, and thus a fast and efficient DNA sequencing technique or a method to determine polymer size or concentration (FIGS. 1 and 2). We will describe methods to orient DNA to a pore molecule in two general configurations (see FIGS. 1 and 2) and record conductance changes across the pore (FIG. 3). One method is to use a pore molecule such as the receptor for bacteriophage lambda (LamB) or α-hemolysin, and to record the process of DNA injection or traversal through the channel pore when that channel has been isolated on a membrane patch or inserted into a synthetic lipid bilayer (FIG. 1). Another method is to fuse a DNA polymerase molecule to a pore molecule and allow the polymerase to move DNA over the pore's opening while recording the conductance across the pore (FIG. 2). A third method is to use a polymerase on the trans side of the membrane/pore divider to pull a single stranded nucleic acid through the pore from the cis side (making it double stranded) while recording conductance changes. A fourth method is to establish a voltage gradient across a membrane containing a channel (e.g., α-hemolysin) through which a single stranded or double stranded DNA is electrophoresed.

The apparatus used for this embodiment includes 1) an ion-conducting pore or channel, perhaps modified to include a linked or fused polymerizing agent, 2) the reagents necessary to construct and produce a linear polymer to be characterized, or the polymerized molecule itself, and 3) an amplifier and recording mechanism to detect changes in conductance of ions across the pore as the polymer traverses its opening.

A variety of electronic devices are available which are sensitive enough to perform the measurements used in the invention, and computer acquisition rates and storage capabilities are adequate for the rapid pace of sequence data accumulation.

A. Characteristics Identified by the Methods

1) Size/Length of Molecules

The size or length of a polymer can be determined by measuring its residence time in the pore or channel, e.g., by measuring duration of transient blockade of current. The relationship between this time period and the length of the polymer can be described by a reproducible mathematical function which depends on the experimental condition used. The function is likely a linear function for a given type of polymer (e.g., DNA, RNA, polypeptide), but if it is described by another function (e.g., sigmoidal or exponential), accurate size estimates may be made by first preparing a standard curve using known sizes of like linear molecules.
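For example, the standard-curve approach can be sketched as follows. This is a minimal illustration that assumes the linear case; the calibration lengths and blockade durations are made-up values rather than measurements from this disclosure, and the function names `fit_line` and `estimate_length` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_length(duration_us, slope, intercept):
    """Invert the calibration line: blockade duration -> polymer length."""
    return (duration_us - intercept) / slope


# Hypothetical standards: (length in monomers, mean blockade duration in microseconds).
standards = [(100, 210.0), (200, 410.0), (400, 810.0), (800, 1610.0)]
lengths = [length for length, _ in standards]
durations = [duration for _, duration in standards]

slope, intercept = fit_line(lengths, durations)

# Size an unknown polymer of the same type from its measured blockade duration.
unknown_length = estimate_length(1010.0, slope, intercept)
print(f"estimated length: {unknown_length:.0f} monomers")
```

If the relationship is sigmoidal or exponential rather than linear, the same calibration data could instead be interpolated or fitted with the appropriate function.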

2) Identity of Residues/Monomers

The chemical composition of individual monomers is sufficiently variant to cause characteristic changes in channel conductance as each monomer traverses the pore due to physical configuration, size/volume, charge, interactions with the medium, etc. For example, our experimental data suggest that polyC RNA reduces conductance more than does polyA RNA, indicating a measurable physical difference between pyrimidines and purines that is one basis of nucleotide identification in this invention.

The nucleotide bases of DNA will influence pore conductance during traversal, but if the single channel recording techniques are not sensitive enough to detect differences between normal bases in DNA, it is practical to supplement the system's specificity by using modified bases. The modifications should be asymmetrical (on only one strand of double stranded template), to distinguish otherwise symmetrical base pairs.

Modified bases are readily available. These include: 1) methylated bases (lambda can package and inject DNA with or without methylated A's and C's), 2) highly modified bases found in the DNA of several bacteriophage (e.g. T4, SP15), many of which involve glycosylations coupled with other changes (Warren, 1980, Ann. Rev. Microbiol., 34: 137-58), and 3) the modified nucleotide triphosphates that can be incorporated by DNA polymerase (e.g. biotinylated, digoxigenated, and fluorescently tagged triphosphates).

In order to identify the monomers, conditions should be appropriate to avoid secondary structure in the polymer to be sequenced (e.g., nucleic acids); if necessary, this can be achieved by using a recording solution which is denaturing. Using single stranded DNA, single channel recordings can be made in up to 40% formamide and at temperatures as high as 45 °C using, e.g., the α-hemolysin toxin protein in a lipid bilayer. These conditions are not intended to exclude use of any other denaturing conditions. One skilled in the art of electrophysiology will readily be able to determine suitable conditions by 1) observing incorporation into the bilayer of functional channels or pores, and 2) observing transient blockades of conductance uninterrupted by long-lived blockades caused by polymers becoming stuck in the channel because of secondary structure. Denaturing conditions are not always necessary for the polymerase-based methods or for double stranded DNA methods of the invention. They may not be necessary for single stranded methods either, if the pore itself is able to cause denaturation, or if the secondary structure does not interfere.

3) Concentration of Polymers in Solutions

Concentration of polymers can be rapidly and accurately assessed by using relatively low resolution recording conditions and analyzing the number of conductance blockade events in a given unit of time. This relationship should be linear and proportional (the greater the concentration of polymers, the more frequent the current blockage events), and a standardized curve can be prepared using known concentrations of polymer.
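For example, counting blockade events in a digitized current record can be sketched as follows. This is an illustrative sketch only: the trace is synthetic, and the threshold, sampling rate, and calibration constant `HYPOTHETICAL_EVENTS_PER_SEC_PER_UM` are assumed values that would in practice come from a standard curve prepared with known concentrations.

```python
def find_blockades(current_pa, threshold_pa):
    """Return (start, end) sample indices of runs where current is below threshold."""
    events = []
    start = None
    for i, value in enumerate(current_pa):
        if value < threshold_pa and start is None:
            start = i                      # blockade begins
        elif value >= threshold_pa and start is not None:
            events.append((start, i))      # blockade ends (end index is exclusive)
            start = None
    if start is not None:                  # trace ended during a blockade
        events.append((start, len(current_pa)))
    return events


def event_rate_hz(events, n_samples, sample_rate_hz):
    """Blockade events per second over the whole record."""
    return len(events) / (n_samples / sample_rate_hz)


# Synthetic record: open-channel current of 120 pA with three blockades to 20 pA.
trace = ([120.0] * 10 + [20.0] * 3 + [120.0] * 10 + [20.0] * 5
         + [120.0] * 10 + [20.0] * 2 + [120.0] * 10)

events = find_blockades(trace, threshold_pa=60.0)
rate = event_rate_hz(events, len(trace), sample_rate_hz=100_000.0)

# Assumed proportional calibration (events per second per micromolar of polymer).
HYPOTHETICAL_EVENTS_PER_SEC_PER_UM = 3000.0
concentration_um = rate / HYPOTHETICAL_EVENTS_PER_SEC_PER_UM
print(events, rate, concentration_um)
```

The same event list also yields blockade durations (end minus start, divided by the sampling rate), which is the quantity used for size determination.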

B. Principles and Techniques

1) Recording Techniques

The conductance monitoring methods of the invention rely on an established technique, single-channel recording, which detects the activity of molecules that form channels in biological membranes. When a voltage potential difference is established across a bilayer containing an open pore molecule, a steady current of ions flows through the pore from one side of the bilayer to the other. The nucleotide bases of a DNA molecule, for example, passing through or over the opening of a channel protein, disrupt the flow of ions through the pore in a predictable way. Fluctuations in the pore's conductance caused by this interference can be detected and recorded by conventional single-channel recording techniques. Under appropriate conditions, with modified nucleotides if necessary, the conductance of a pore can change to unique states in response to the specific bases in DNA.

This flux of ions can be detected, and the magnitude of the current describes the conductance state of the pore. Multiple conductance states of a channel can be measured in a single recording as is well known in the art. By recording the fluctuations in conductance of the maltoporin (LamB) pore, for example, when DNA is passed through it by phage lambda injection or over its opening by the action of a polymerase fused to the surface of the LamB protein, we estimate that a sequencing rate of 100-1000 bases/sec/pore can be achieved.

The monitoring of single ion channel conductance is an inexpensive, viable method that has been successful for the last two decades and is in very widespread current use. It directly connects movements of single ions or channel proteins to digital computers via amplifiers and analog to digital (A to D, A/D) converters. Single channel events taking place in the range of a few microseconds can be detected and recorded (Hamill et al., 1981, Pfluegers Arch. Eur. J. Physiol., 391: 85-100). This level of time resolution ranges from just sufficient to orders of magnitude greater than the level we need, since the time frame for movement of nucleotide bases relative to the pore for the sequencing method is in the range of microseconds to milliseconds. The level of time resolution required depends on the voltage gradient or the enzyme turnover number if the polymer is moved by an enzyme. Other factors controlling the level of time resolution include medium viscosity, temperature, etc.

The characteristics and conductance properties of any pore molecule that can be purified can be studied in detail using art-known methods (Sigworth et al., supra; Heinemann et al., 1988, Biophys. J., 54: 757-64; Wonderlin et al., 1990, Biophys. J., 58: 289-97). These optimized methods are ideal for our polymer sequencing application. For example, in the pipette bilayer technique, an artificial bilayer containing at least one pore protein is attached to the tip of a patch-clamp pipette by applying the pipette to a preformed bilayer reconstituted with the purified pore protein in advance. Due to the very narrow aperture diameter of the patch pipette tip (2 microns), the background noise for this technique is significantly reduced, and the limit for detectable current interruptions is about 10 microseconds (Sigworth et al., supra; Heinemann et al., 1990, Biophys. J., 57: 499-514). Purified channel protein can be inserted in a known orientation into preformed lipid bilayers by standard vesicle fusion techniques (Schindler, 1980, FEBS Letters, 122: 77-79), or any other means known in the art, and high resolution recordings are made. The membrane surface away from the pipette is easily accessible while recording. This is important for the subsequent recordings that involve added DNA. The pore can be introduced into the solution within the patch pipette rather than into the bath solution.

An optimized planar lipid bilayer method has recently been introduced for high resolution recordings in purified systems (Wonderlin et al., supra). In this method, bilayers are formed over very small diameter apertures (10-50 microns) in plastic. This technique has the advantage of allowing access to both sides of the bilayer, and involves a slightly larger bilayer target for reconstitution with the pore protein. This optimized bilayer technique is an alternative to the pipette bilayer technique.

Instrumentation is needed which can apply a variable range of voltages from about +400 mV to -400 mV across the channel/membrane, assuming that the trans compartment is established to be 0 mV; also needed are a very low-noise amplifier and current injector, an analog to digital (A/D) converter, data acquisition software, and an electronic storage medium (e.g., computer disk, magnetic tape). Equipment meeting these criteria is readily available, such as from Axon Instruments, Foster City, Calif. (e.g., Axopatch 200 A system; pClamp 6.0.2 software).

Preferred methods of large scale DNA sequencing involve translating from base pairs to electronic signals as directly and as quickly as possible in a way that is compatible with high levels of parallelism, miniaturization and manufacture. The method should allow long stretches (even stretches over 40 kbp) to be read so that errors associated with assembly and repetitive sequence can be minimized. The method should also allow automatic loading of (possibly non-redundant) fresh sequences.

2) Channels and Pores Useful in the invention

Any channel protein which has the characteristics useful in the invention (e.g., pore size up to about 9 nm) may be employed. Pore sizes across which polymers can be drawn may be quite small and do not necessarily differ for different polymers. Pore sizes through which a polymer is drawn will be e.g., approximately 0.5-2.0 nm for single stranded DNA; 1.0-3.0 nm for double stranded DNA; and 1.0-4.0 nm for polypeptides. These values are not absolute, however, and other pore sizes might be equally functional for the polymer types mentioned above.

Examples of bacterial pore-forming proteins which can be used in the invention include Gramicidin (e.g., Gramicidin A from Bacillus brevis; available from Fluka, Ronkonkoma, N.Y.); LamB (maltoporin), OmpF, OmpC, or PhoE from Escherichia coli, Shigella, and other Enterobacteriaceae, alpha-hemolysin (from S. aureus), Tsx, the F-pilus, lambda exonuclease, and mitochondrial porin (VDAC). This list is not intended to be limiting.

A modified voltage-gated channel can also be used in the invention, as long as it does not inactivate quickly, e.g., in less than about 500 msec (whether naturally or following modification to remove inactivation) and has physical parameters suitable for e.g., polymerase attachment (recombinant fusion proteins) or has a pore diameter suitable for polymer passage. Methods to alter inactivation characteristics of voltage gated channels are well known in the art (see e.g., Patton, et al., Proc. Natl. Acad. Sci. USA, 89: 10905-09 (1992); West, et al., Proc. Natl. Acad. Sci. USA, 89: 10910-14 (1992); Auld, et al., Proc. Natl. Acad. Sci. USA, 87: 323-27 (1990); Lopez, et al., Neuron, 7: 327-36 (1991); Hoshi, et al., Neuron, 7: 547-56 (1991); Hoshi, et al., Science, 250: 533-38 (1990), all hereby incorporated by reference).

Appropriately sized physical or chemical pores may be induced in a water-impermeable barrier (solid or membranous) up to a diameter of about 9 nm, which should be large enough to accommodate most polymers (either through the pore or across its opening). Any methods and materials known in the art may be used to form pores, including track etching and the use of porous membrane templates which can be used to produce pores of the desired material (e.g., scanning-tunneling microscope or atomic force microscope related methods).

Chemical channels or pores can be formed in a lipid bilayer using chemicals (or peptides) such as Nystatin, as is well known in the art of whole-cell patch clamping ("perforated patch" technique); and peptide channels such as Alamethicin.

Template-dependent nucleic acid polymerases and free nucleotides can be used as a motor to draw the nucleic acids through the channel. For example, the DNA to be sequenced is placed in one chamber; RNA polymerases, nucleotides, and optionally primers are placed in the other chamber. As the 3' end of the DNA passes through the channel (via a voltage pulse or diffusion, for example), the RNA polymerase captures and begins polymerization. If the polymerase is affixed to the chamber or is physically blocked from completely passing through the channel, the polymerase can act as a ratchet to draw the DNA through the channel.

Similarly, lambda exonuclease, which is itself shaped as a pore with a dimension similar to α-hemolysin, can operate as a motor, controlling the movement of the nucleic acid polymer through the channel. The exonuclease has the added benefit of allowing access to one strand of a double stranded polymer. As the double stranded polymer passes through the pore, the exonuclease grabs onto the 5' single-stranded overhang of a first strand (via endonuclease digestion or breathing of the double stranded DNA ends) and sequentially cleaves the complementary second strand at its 3' end. During the sequential cleavage, the exonuclease progresses 5' to 3' down the first strand, pulling the double stranded DNA through the channel at a controlled rate. Thus, the exonuclease can operate as a pore as well as a motor for drawing the nucleic acid polymer through the channel.

To produce pores linked with polymerase or exonuclease, synthetic/recombinant DNA coding for a fusion protein can be transcribed and translated, then inserted into an artificial membrane in vitro. For example, the C-terminus of E. coli DNA polymerase I (and by homology, T7 DNA polymerase) is very close to the surface of the major groove of the newly synthesized DNA. If the C-terminus of a polymerase is fused to the N-terminus of a pore forming protein such as colicin E1 and the colicin is inserted into an artificial membrane, one opening of the colicin pore should face the DNA's major groove and one should face the opposite side of the lipid bilayer. For example, the colicin molecule can be modified to achieve a pH optimum compatible with the polymerase as in Shiver et al. (J. Biol. Chem., 262: 14273-14281 1987, hereby incorporated by reference). Both pore and polymerase domains can be modified to contain cysteine replacements at points such that disulfide bridges form to stabilize a geometry that forces the pore opening closer to the major groove surface and steadies the polymer as it passes the pore opening. The loops of the pore domain at this surface can be systematically modified to maximize sensitivity to changes in the DNA sequence.

C. General Considerations for Conductance Based Measurements

1) Electrical/Channel Optimization

The conductance of a pore at any given time is determined by its resistance to ions passing through the pore (pore resistance) and by the resistance to ions entering or leaving the pore (access resistance). For a pore's conductance to be altered in discrete steps, changes in one or both of these resistance factors will occur by unit values. The base pairs of a DNA molecule represent discrete units that are distinct from each other along the phosphate backbone. As long as the orientation of DNA to the pore remains relatively constant, and the membrane potential does not change, as each base pair passes by (or through) the pore, it is likely to interfere with a reproducible number of ions. Modifications made to the individual bases would influence the magnitude of this effect.

To resolve stretches of repeating identical bases accurately, and to minimize reading errors in general, it may be useful for the pore to register a distinct (probably higher) level of conductance in between the bases. This can take place naturally in the pore-polymerase system with helix rotation during polymerization, or in the phage system between entry of base pairs into the pore, or when the regions in between base pairs pass by a rate limiting site for ion flux inside the pore. Modified bases used to distinguish nucleotide identities may also contribute significantly to this issue, because they should magnify the conductance effect of the bases relative to the effect of regions in between the bases. With single strand passage through a pore, charged phosphates may punctuate the passage of each base by brief, higher conductance states. Also, if the rate of movement is constant, then punctuation between bases may not be required to resolve stretches of repeating identical bases.

Altered conductance states have been described for many channels, including some LamB mutants (Dargent et al., 1988, supra). A mutant may be a valuable alternative to a wild type channel protein if its fluctuation to a given state is sensitive to nucleotide bases in DNA. Alternative systems can also be developed from other channel proteins that are known to have multiple single channel conductance states. Examples of these are the alamethicin channel, which under certain conditions fluctuates through at least 20 discrete states (Taylor et al., 1991, Biophys. J., 59: 873-79), and the OmpF porin, which shows gating of its individual monomers giving rise to four discrete states (Lakey et al., 1989, Eur. J. Biochem., 186: 303-308).

Since channel events can be resolved in the microsecond range with the high resolution recording techniques available, the limiting issue for sensitivity with the techniques of our invention is the amplitude of the current change between bases. Resolution limits for detectable current are in the 0.2 pA range (1 pA = $6.24 \times 10^{6}$ ions/sec). Each base affecting pore current by at least this magnitude is detected as a separate base. It is the function of modified bases to affect current amplitude for specific bases if the bases by themselves are poorly distinguishable.
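The conversion between current and ion flux quoted above follows from dividing the current by the charge carried per ion. A minimal sketch of that arithmetic is given below, assuming monovalent ions; the dwell time of 1 msec per base is an illustrative assumption, and the function names are hypothetical.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per elementary charge


def ions_per_second(current_pa, valence=1):
    """Ion flux corresponding to a current given in picoamperes."""
    return current_pa * 1e-12 / (abs(valence) * ELEMENTARY_CHARGE_C)


def ions_per_dwell(current_pa, dwell_s, valence=1):
    """Number of ions carried by a current change during one dwell period."""
    return ions_per_second(current_pa, valence) * dwell_s


# 1 pA of monovalent ions is about 6.24e6 ions/sec.
print(f"{ions_per_second(1.0):.3e} ions/sec per pA")

# A 0.2 pA change sustained for an assumed 1 msec dwell corresponds to roughly 1250 ions.
print(f"{ions_per_dwell(0.2, 1e-3):.0f} ions")
```

The second figure illustrates why slower passage helps: the shorter the dwell per base, the fewer ions make up the signal that distinguishes it.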

One skilled in the art will recognize that there are many possible configurations of the sequencing method described herein. For instance, lipid composition of the bilayer may include any combination of non-polar (and polar) components which is compatible with pore or channel protein incorporation. Any configuration of recording apparatus may be used (e.g., bilayer across aperture, micropipette patches, intra-vesicular recording) so long as its limit of signal detection is below about 0.5 pA, or in a range appropriate to detect monomeric signals of the polymer being evaluated. If polymeric size determination is all that is desired, the resolution of the recording apparatus may be much lower.

A Nernst potential difference, following the equation

$$E_{\mathrm{ion}} = \frac{RT}{zF}\,\ln\frac{[\mathrm{ion}]_{o}}{[\mathrm{ion}]_{i}}$$

where $E_{\mathrm{ion}}$ is the solvent ion (e.g., potassium ion) equilibrium potential across the membrane, R is the gas constant, T is the absolute temperature, z is the valency of the ion, F is Faraday's constant, $[\mathrm{ion}]_{o}$ is the outside and $[\mathrm{ion}]_{i}$ is the inside ionic concentration (or trans and cis sides of the bilayer, respectively),

can be established across the bilayer to force polymers across the pore without supplying an external potential difference across the membrane. The membrane potential can be varied ionically to produce more or less of a differential or "push." The recording and amplifying apparatus is capable of reversing the gradient electrically to clear blockages of pores caused by secondary structure or cross-alignment of charged polymers.
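For example, the magnitude of the ionically established potential can be estimated from the standard Nernst relation, E = (RT/zF) ln([ion]o/[ion]i). The sketch below is illustrative only: the tenfold concentration ratio and the temperature of 295.15 K are assumed values, and the function name is hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618   # J / (mol K)
FARADAY = 96485.33212        # C / mol


def nernst_potential_mv(conc_out, conc_in, valence=1, temperature_k=295.15):
    """Equilibrium potential in millivolts for the given outside/inside concentrations."""
    volts = (GAS_CONSTANT * temperature_k / (valence * FARADAY)) * math.log(conc_out / conc_in)
    return 1000.0 * volts


# Assumed tenfold gradient of a monovalent cation (e.g., 1.0 M outside, 0.1 M inside).
print(f"{nernst_potential_mv(1.0, 0.1):.1f} mV")

# Reversing the gradient reverses the sign; equal concentrations give no potential.
print(f"{nernst_potential_mv(0.1, 1.0):.1f} mV")
print(f"{nernst_potential_mv(0.5, 0.5):.1f} mV")
```

Under these assumptions a tenfold gradient gives a potential somewhat under 60 mV, so larger "push" would require a steeper concentration ratio or a supplementary applied voltage.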

2) Optimization of Methods

In an operating system of the invention, one can demonstrate that the number of transient blockades observed is quantitatively related to the number of polymer molecules that move through the channel from the cis to the trans compartment. By sampling the trans compartment solution after observing one to several hundred transient blockades and using quantitative, competitive PCR assays (e.g., as in Piatak et al., 1993, BioTechniques, 14: 70-79) it is possible to measure the number of molecules that have traversed the channel. Procedures similar to those used in competitive PCR can be used to include an internal control that will distinguish between DNA that has moved through the channel and contaminating or aerosol DNA.

Further steps to optimize the method may include:

1. Slowing the passage of polynucleotides so that individual nucleotides can be sensed. Since the blockade durations we observed are in the millisecond range, each nucleotide in a one or two thousand monomer-long polynucleotide occupies the channel for just a few microseconds. To measure effects of individual nucleotides on the conductance, reducing the velocity may offer substantial improvement. Approaches to accomplish this include: (a) increasing the viscosity of the medium, (b) establishing the lower limit of applied potential that will move polynucleotides into the channel, and (c) use of high processivity polymerase in the trans compartment to "pull" DNA through the pore in place of voltage gradients. Using enzymes to pull the DNA through the pore may also solve another potential problem (see 3, below).

2. Making a channel in which an individual nucleotide modulates current amplitude. While α-toxin may give rise to distinguishable current amplitudes when different mono-polynucleotides pass through the channel, 4-5 nucleotides in the strand necessarily occupy the length of its approximately 50 Å long channel at any given time. Ionic current flow may therefore reflect the sum of the nucleotide effects, making it difficult to distinguish monomers. To determine current modulation attributable to individual monomers, one may use channels containing a limiting aperture that is much shorter than the full length of the overall channel (Weiss et al.). For example, one can modify α-hemolysin by standard molecular biological techniques such that portions of the pore leading to and away from the constriction are widened.

3. Enhancing movement of DNA in one direction. If a DNA molecule is being pulled through a channel by a voltage gradient, the probability of its moving backward against the gradient will be given by

$$e^{-(\text{energy to move against the voltage gradient})/kT}$$

where kT is the energy associated with thermal fluctuations. For example, using reasonable assumptions for the effective charge density of the DNA polyelectrolyte in buffer (Manning, 1969, J. Chem. Phys., 51: 924-33), at room temperature the probability of thermal energy moving the DNA molecule backward 10 Å against a 100 mV voltage gradient is approximately $e^{-4}$, or about one in fifty. Should this problem exist, some kind of ratchet mechanism, possibly a polymerase or other DNA binding protein, may be useful in the trans chamber to prevent backward movements of the DNA.
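The arithmetic behind steps 1 and 3 above can be sketched as follows. This is an illustration under stated assumptions, not a model taken from the cited reference: the backward step is assumed to cost one elementary effective charge moved across the full 100 mV, which is simply a choice that reproduces the quoted figure, and the 2 msec blockade, the temperature of 295 K, and the function names are likewise assumed.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23
ELEMENTARY_CHARGE_C = 1.602176634e-19


def dwell_per_nucleotide_us(blockade_ms, n_nucleotides):
    """Average time each nucleotide spends in the channel, in microseconds."""
    return blockade_ms * 1000.0 / n_nucleotides


def backward_probability(potential_mv, effective_charge_e=1.0, temperature_k=295.0):
    """Boltzmann factor exp(-qV/kT) for a thermally driven step against the gradient."""
    energy_j = effective_charge_e * ELEMENTARY_CHARGE_C * potential_mv * 1e-3
    return math.exp(-energy_j / (BOLTZMANN_J_PER_K * temperature_k))


# Step 1: an assumed 2 msec blockade for a 1000-nucleotide strand leaves about
# 2 microseconds per nucleotide.
print(dwell_per_nucleotide_us(2.0, 1000))

# Step 3: probability of a backward step against 100 mV at room temperature,
# on the order of one in fifty under the assumed effective charge.
p = backward_probability(100.0)
print(f"p = {p:.4f}, about 1 in {1.0 / p:.0f}")
```

Because the factor is exponential in the applied potential, lowering the voltage to slow the polymer (step 1) also makes backward steps more likely, which is one reason an enzyme acting as a ratchet may be attractive.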

3) Advantages of Single Channel Sequencing

The length of continuous DNA sequence obtainable from the methods described herein will only be limited in certain embodiments (e.g., by the packaging limit of phage lambda heads (~50 kb) or by the size of the template containing polymerase promoter sequences). Other embodiments (e.g., voltage gradients) have no such limitation and should even make it possible to sequence DNA