FIELD OF THE INVENTION
This invention relates to cloned DNA sequences that are specifically
complementary to prechosen target regions within individual chromosomes of
a genome which is typically multi-chromosomal, to processes for making the
same, to processes for producing probes therefrom and to usage of such
probes.
BACKGROUND OF THE INVENTION
Probes containing DNA sequences which are complementary to specific
chromosomal alphoid DNA are known to be useful as enumerators in in situ
hybridization assays. The best known members of the prior art alphoid DNA
sequences preparation methods use a common approach for isolating the
alphoid DNA sequences. An enrichment based on a physical characteristic of
repeated DNA is applied, DNA from the enriched pool is cloned and
individual clones from this enriched pool are individually analyzed for
utility in in situ hybridization assays. Searching such a pool has proven
to be an inefficient and unreliable method for obtaining a sequence with
high chromosomal specificity for a predetermined chromosome. The first
such scheme to obtain cloned alphoid DNA used the buoyant density
characteristics of alphoid DNA to produce an enriched pool of DNA
sequences. (See Manuelidis, L., et al. in Chromosoma 66:23-32 (1978)).
Other schemes for obtaining alphoid DNA clones have used the distribution
of DNA restriction sites or the rapid renaturation of alphoid DNA relative
to non-repeated species in the genome as the basis for producing enriched
pools of alphoid DNA. (See Yang, T. P., et al., Proc. Natl. Acad. Sci.
USA. 79:6593-6597 (1982), and Moyzis, R. K., et al., Chromosoma 95:375-386
(1987)). However, these methods are inherently relatively inefficient and
are not well suited for rapid commercial development of enumerator probes.
Also, prior art probes prepared from such sequences were indirect label
probes and so required post-hybridization processing in order to achieve
hybrid detection in contrast to direct label probes which require, for
example, only one probe penetration step of a slide mounted specimen
during in situ hybridization. Indirect label probes require the successful
diffusion into the slide mounted specimen of the various protein reagents
(antibodies, avidins, enzymes and the like) during an in situ
hybridization multi-step procedure.
Prior art methods for labeling such prior art chromosome regionally
specific complementary DNA sequences present difficulties in controlling
the number of label moieties attached to individual sequences.
Improved DNA segments which are complementary to specific chromosomal DNA
repeated segments existing in a particular chromosomal region, such as,
for example, alphoid DNA in a specific chromosome, and improved
methodology for making direct labeled probes therefrom, would be very
useful. The present invention provides both such segments and such
methodology.
SUMMARY OF THE INVENTION
This invention provides (a) a new and very useful class of cloned DNA
sequences which incorporate DNA repeated segments and which are
specifically complementary to prechosen regions of individual chromosomes
of a genome which is typically multi-chromosomal, and (b) processes for
making such sequences and for converting them to probe compositions,
especially by chaotropic transamination.
The invention avoids the problem of individually testing large numbers of
clones derived from a large pool and enhances the capacity to produce
specific sequences which are complementary to a desired prechosen
chromosome.
More particularly, in one aspect, the present invention provides (a)
methodology for making individual cloned DNA sequences which incorporate
DNA repeated segments and which are complementary to sequential DNA
sequences that occur uniquely in only one selected region of one selected
chromosome of a multi-chromosomal genome, and (b) the cloned DNA sequences
so made. This methodology utilizes specific combinations of:
(a) enzymatic amplification of template DNA that is comprised of DNA
sequences which together comprise a selected starting single whole
chromosome of a multi-chromosomal genome using as primers synthesized
oligonucleotides that are known to exist commonly and repetitively within
or between adjacent DNA repeated segments which are present in one
selected region in such template chromosome;
(b) clone colony production and sampling using either the so enzymatically
amplified DNA repeated segments or DNA repeated segments separated from
genomic DNA after identification thereof by hybridization using probes
formed with the so enzymatically amplified DNA repeated segments; and
(c) hybridization of probes formed by labeling sampled, cultured and
extracted colony-derived vector DNA sequences with selected samples of
genomic DNA target sequences.
From the resulting hybrids, at least one individual cloned DNA sequence is
selected that contains a plurality of copies of at least one DNA repeated
segment that occurs in, and that is complementary to, a DNA sequence or
sequences which occur(s) in the one selected region of the selected
starting single chromosome. Each selected cloned DNA sequence is then
cultured to produce a plurality of replicates thereof.
Thus, the present invention provides a new class of cloned DNA sequences
wherein each sequence produced as indicated above is complementary to a
preselected one region of a preselected chromosome. Also, each such
sequence contains at least one DNA repeated segment which occurs in such
one region.
These novel cloned and replicated DNA complementary sequences can be
labeled to produce new and useful probes for hybridization assays of
specimens for which karyotypic information is desired. In general, probes
produced from these cloned DNA sequences can be classified as repeat
sequence based probes.
In another aspect, the present invention provides methodology for making
intermediates useful in the production of direct label probe compositions.
The methodology uses as starting materials (1) at least one starting DNA
sequence such as taught herein, (2) linking group compounds and (3)
fluorophore group containing compounds. This probe composition-making
methodology preferably utilizes a combination of:
(a) fragmenting of the starting DNA sequence(s) into DNA segments;
(b) transaminating the DNA segments to introduce linking groups thereinto;
and
(c) covalently bonding fluorophore groups to the so introduced linking
groups.
In a present transamination procedure, the linking compound is
difunctional. One functional moiety thereof is an amino group, the other a
group that is reactive with another reactive group that is present in the
starting fluorescent compound. This transamination procedure is conducted
under aqueous liquid phase, ambient temperature conditions in the presence
of a bisulfite catalyst. Controlled transamination of the deoxycytidine
nucleotides present in the selected regional DNA sequences and/or
fragments thereof is accomplished without otherwise substantially altering
sequence structure or complementary character so that the resulting
transaminated polynucleotides retain their capacity to hybridize to
complementary target DNA sequences that incorporate the segments in the
selected chromosomal region.
In another aspect of the present invention, a novel transamination
technique is provided by which polynucleotides are maintained in a single
stranded condition during such a transamination procedure. This technique
utilizes the presence of trihaloacetate chaotrope anions in the bisulfite
catalyzed aqueous reaction medium together with the reactants. Such
chaotrope anions induce and, particularly, maintain, nucleotide sequence
denaturation as desired during the transamination without inducing
crystallization of reactants and without reacting with reactants. This
technique is also advantageous because it permits synthesis of relatively
large batches of transaminated DNA sequences and/or segments, if such are
desired, without the high cost and low reliability of prior art enzymatic
labeling methods.
A class of new and very useful transaminated DNA segments is produced by
the indicated chaotropic bisulfite catalyzed transamination procedure.
Direct label probe compositions, particularly those prepared from such new
class of chaotropically transaminated segments, display excellent
hybridization capacity, and the hybrids produced thereby have excellent
signal strength production capability.
Other and further features, objects, aims, purposes, advantages,
applications, embodiments and the like will be apparent to those skilled
in the art from the teachings of the present specification taken with the
accompanying drawings.
DETAILED DESCRIPTION
(A) Definitions
The term "sequence" refers to a chain or interconnected series of DNA
nucleotides.
The term "fragment," "segment" or "DNA segment" indicates generally only a
portion of a larger DNA polynucleotide or DNA sequence such as occurs in
one chromosome or one region thereof. A polynucleotide, for example, can
be broken up, or fragmented into, a plurality of segments.
The term "DNA repeated segment" refers to the fact that a particular DNA
segment, or almost the same segment, occurs a plurality (i.e., at least
two) of times in a particular DNA sequence or in a particular plurality of
DNA sequences. Individual DNA segment size and/or DNA repeated segment
size can vary greatly. For example, in the case of the human genome, each
DNA repeated segment is now believed to be typically in the approximate
size range of about 5 to about 3,000 bp. Illustratively, a single alphoid
DNA sequence may incorporate at least about five different DNA repeated
segments. As is known, a chromosome characteristically contains regions
which have DNA sequences that contain DNA repeated segments. Small
sequential variations in individual segment repeats may possibly occur;
see, for example, Waye, J. S. et al., Molecular and Cellular Biology
6:3156-3165 (1986).
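The definition of a "DNA repeated segment" can be illustrated with a minimal Python sketch. It is not part of the disclosed method: the function name `find_repeated_segments`, the toy sequence and the window length `k` are hypothetical, and only exact repeats are detected (segments that are "almost the same" would need approximate matching).

```python
from collections import defaultdict


def find_repeated_segments(sequence: str, k: int) -> dict:
    """Return each length-k segment that occurs at least twice,
    mapped to the list of its start positions (0-based)."""
    positions = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        positions[sequence[start:start + k]].append(start)
    # Keep only segments that occur a plurality (two or more) of times.
    return {seg: starts for seg, starts in positions.items() if len(starts) >= 2}


# Hypothetical toy sequence: one 8-nt segment repeated in tandem, then "GG".
toy_sequence = "ACGTTGCAACGTTGCAGG"
print(find_repeated_segments(toy_sequence, 8))  # {'ACGTTGCA': [0, 8]}
```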
The term "genome" designates or denotes the complete, single-copy set of
genetic instructions for an organism as coded into DNA of the organism. In
the practice of the present invention, the particular genome under
consideration is typically multi-chromosomal so that such DNA is
cellularly distributed among a plurality of individual chromosomes (which
number, for example, in man 22 pairs plus a gender associated XX pair or
an XY pair).
In the practice of this invention, the genome involved in any given
instance is preferably from a primate, and the DNA sequences containing
the DNA repeated segments are preferably alphoid or are associated with
the centromere of a chromosome type. As used herein, the term "alphoid" or
"alpha satellite" in reference to DNA has reference to the complex family
of tandemly repeated DNA segments found in primate genomes. Long tandem
arrays of alpha satellite DNA based on a monomer repeat length of about
171 base pairs are located principally at the centromeres of primate
chromosomes.
The term "chromosome" refers to the heredity-bearing gene carrier of a
living cell which is derived from chromatin and which comprises DNA and
protein components (especially histones). The conventional internationally
recognized individual human genome chromosome numbering identification
system is employed herein. The size of an individual chromosome can vary
from one type to another within a given multi-chromosomal genome and from
one genome to another. In the case of the (preferred) human genome, the
entire DNA mass of a given chromosome is usually greater than about
100,000,000 bp. For example, the size of the entire human genome is about
3×10^9 bp. The largest chromosome, chromosome no. 1, contains
about 2.4×10^8 bp while the smallest chromosome, chromosome no.
22, contains about 5.3×10^7 bp (Yunis, J. J. in Science
191:1268-1270 (1976), and Kavenoff, et al. in Cold Spring Harbor Symposia
on Quantitative Biology 38:1-8 (1973)).
The term "region" indicates a portion of a chromosome which contains DNA repeated
segments that are preferably alphoid or associated with the centromere.
The actual physical size or extent of such an individual region can vary
greatly. An exact quantification of such a region cannot now be made for
all possible regions. Usually, a region is at least large enough to
include at least one DNA sequence that (a) incorporates a plurality of
copies of at least one DNA repeated segment and that (b) is identifiable
and preferably enumeratable optically by fluorescence microscopic
examination after formation of fluorophore labeled hybrids in such region
following an in situ hybridization procedure with a direct label probe or
probe composition. Presently available information suggests that a region
may contain more than a single such DNA sequence with each such DNA
sequence containing one or more DNA repeated segments. Each DNA sequence
that occurs in a region may typically contain perhaps from about 70,000 to
about 20,000,000 bp, with a present preferred regional DNA sequence size
estimate being in the range of about 80,000 to about 225,000 bp, and with
a presently most preferred such regional DNA sequence size estimate being
in the range of about 100,000 to about 200,000 bp. However, larger and
smaller DNA sequences can occur in a single region of a chromosome.
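As a rough worked example, the preferred regional sequence sizes can be related to the alpha satellite monomer length of about 171 bp given in the definition of "alphoid". The sketch below assumes, purely for illustration, that a regional sequence consists entirely of tandem monomers of that length; real arrays are less regular, and the function name is hypothetical.

```python
ALPHA_MONOMER_BP = 171  # approximate monomer repeat length stated in the text


def approx_monomer_copies(sequence_bp: int, monomer_bp: int = ALPHA_MONOMER_BP) -> int:
    """Whole number of monomer copies that fit in a sequence of the given
    size, under the idealized assumption of a purely tandem array."""
    return sequence_bp // monomer_bp


# Most preferred regional DNA sequence size estimates from the text.
print(approx_monomer_copies(100_000))  # 584
print(approx_monomer_copies(200_000))  # 1169
```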
A "region" is typically and characteristically a chromosome fragment
which comprises less DNA mass or size than the entire DNA mass or size of
a given chromosome. As is known, not all the DNA of a given chromosome or
chromosome region is arranged as DNA sequences containing or comprised of
DNA repeated segments. A region, for example, can have a size which
encompasses about 2×10^6 to about 40×10^6 bp, which
size region encompasses, for example, centromeres of the human
chromosomes. Such a size is thus a substantial fraction of the size of a
single human chromosome. Such a region size is presently preferred as a
region size in the practice of this invention although larger and smaller
region sizes can be used. A centromeric region of even a small human
chromosome is a microscopically visible large portion of the chromosome,
and a region comprising DNA repeated segments (not alphoid or centromeric)
on the Y chromosome occupies the bulk of the chromosome and is
microscopically visible.
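The statement that such a region is a substantial fraction of a chromosome can be checked with simple arithmetic on the approximate sizes quoted in this section. The sketch below is illustrative only; the constant and function names are hypothetical, and the figures are the rounded estimates given in the text.

```python
GENOME_BP = 3_000_000_000   # entire human genome, about 3 x 10^9 bp
CHR1_BP = 240_000_000       # largest chromosome, about 2.4 x 10^8 bp
CHR22_BP = 53_000_000       # smallest chromosome, about 5.3 x 10^7 bp
REGION_MIN_BP = 2_000_000   # lower end of the illustrative region size
REGION_MAX_BP = 40_000_000  # upper end of the illustrative region size


def percent_of(part_bp: int, whole_bp: int) -> float:
    """Size of `part_bp` expressed as a percentage of `whole_bp`."""
    return 100.0 * part_bp / whole_bp


print(f"chromosome 1 / genome:       {percent_of(CHR1_BP, GENOME_BP):.1f}%")
print(f"smallest region / chr 1:     {percent_of(REGION_MIN_BP, CHR1_BP):.2f}%")
print(f"largest region / chr 22:     {percent_of(REGION_MAX_BP, CHR22_BP):.1f}%")
```

On these estimates the region spans from under 1 percent of the largest chromosome to roughly three quarters of the smallest.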
In general, the term "region" is not definitive of a particular one (or
more) genes because a "region" does not take into specific account the
particular coding segments (exons) of an individual gene. Rather, a
"region" as used herein in reference to a chromosome is unique to a given
chromosome by reason of the particular configuration of DNA segments
therein for present probe composition formation and use purposes.
The term "centromere" refers to a heterochromatic region of the eucaryotic
chromosome which is the chromosomal site of attachment of the kinetochore.
The centromere divides just before replicated chromosomes separate, and
until then holds together the paired chromatids.
The term "gene" designates or denotes a DNA sequence along a chromosome
that codes for a functional product (either RNA or its translation
product, a polypeptide). A gene contains a coding region and includes
regions preceding and following the coding region (termed respectively
"leader" and "trailer"). The coding region is comprised of a plurality of
coding segments ("exons") and intervening sequences ("introns") between
individual coding segments.
The term "probe" or "probe composition" refers to a polynucleotide or a
mixture of polynucleotides, such as DNA sequence(s), or DNA segment(s),
which has (or have) been chemically combined (i.e., associated) with
individual label containing moieties. Each such polynucleotide of a probe
is typically single stranded at the time of hybridization to a target.
The term "label" or "label containing moiety" refers in a general sense to
a moiety, such as a radioactive isotope or group containing same, and
nonisotopic labels, such as enzymes, biotin, avidin, streptavidin,
digoxigenin, luminescent agents, dyes, haptens, and the like. Luminescent
agents, depending upon the source of exciting energy, can be classified as
radioluminescent, chemiluminescent, bioluminescent, and photoluminescent
(or fluorescent).
Preferably probe compositions made from the chromosomal regional sequences
provided herewith contain DNA segments that are chemically bound to
label-containing moieties. Each label-containing moiety contains at least
one fluorophore (fluorescent) group, and each label-containing moiety is
derived from a monofunctional radical-containing, and also fluorophore
group-containing, fluorescent starting compound. Such a fluorophore group
is covalently bound to a linking group that is itself transaminated as
taught herein to a DNA segment.
The term "direct label probe" (or "direct label probe composition")
designates or denotes a nucleic acid probe whose label after hybrid
formation with a target is detectable without further reactive processing
of the hybrid. Conventionally, a direct label probe incorporates either a
fluorophore group or a radioisotope as an individual label moiety.
The term "indirect label probe" (or "indirect label probe composition")
designates or denotes a nucleic acid probe whose label after hybrid
formation with a target must be further reacted in subsequent processing
with one or more reagents to associate therewith one or more moieties that
finally result in a detectable entity.
The term "target", "DNA target" or "DNA target region" refers to at least
one nucleotide sequence, such as a DNA sequence or a DNA segment, all or a
portion of which is complementary to and hybridizable with the nucleotide
sequence(s) of a given probe. Each such sequence or portion is typically
single stranded at the time of hybridization. When the target
nucleotide sequences are located only in a single region or fraction of a
given chromosome, the term "target region" is sometimes applied. When a
given specimen or sample is merely suspected of containing one or more
target complementary nucleotide sequences relative to a probe composition,
a general term such as "target" or "target composition" is sometimes used
herein.
The term "hybrid" refers to the product of a hybridization procedure
between a probe and a target. Typically, a hybrid is a molecule that
includes a double stranded, helically configured portion comprised of
complementarily paired single stranded molecules, such as two DNA
molecules, one of which is a target DNA nucleotide sequence, and the other
of which is the labeled DNA nucleotide sequence of a probe.
The term "fluorescent" (and equivalent terms) has general reference to the
property of a substance (such as a fluorophore) to produce light while it
is being acted upon by radiant energy, such as ultraviolet light or
x-rays.
The term "fluorescent compound" or "fluorophore group" as used herein
generally refers to an organic moiety. A fluorescent compound is capable
of reacting, and a fluorophore group may have already reacted, with a
linking group.
The term "linking compound" or "linking group" refers to a
hydrocarbonaceous moiety. A linking compound is capable of reacting, and a
linking group may have already reacted, with a nucleotide (or nucleotide
sequence). A linking compound is also capable of reacting, and a linking
group may have already reacted with a fluorescent compound.
The term "in situ hybridization" refers to hybridization, and preferably
detection, of a probe to a target that exists within a cytological or
histological specimen. As a result of an in situ hybridization procedure,
hybrids are produced between a probe (or probe composition) and a target
or targets. This term "in situ hybridization" may also be inclusive herein
of a hybrid or probe detection procedure which is practiced after
hybridization of a probe to a target. A specimen can be adhered as a layer
upon a slide surface, and a specimen can, for example, comprise or contain
individual chromosomes or chromosomal regions which have been treated to
maintain their morphology under, for example, denaturing conditions and
conditions such as typically exist during flow cytometric analyses
subsequent to hybridization of a probe to a target. The term "in situ
hybridization" may include use of a counterstain. In the case of the
inventive fluorophore labeled probes or probe compositions, the detection
method can involve fluorescence microscopy, flow cytometry, and the like.
The term "hybridizing conditions" has general reference to the
combinations of conditions that are employable in a given hybridization
procedure to produce hybrids, such conditions typically involving
controlled temperature, liquid phase, and contacting between a probe (or
probe composition) and a target composition. Conveniently and preferably,
at least one denaturation step precedes a step wherein a probe or probe
composition is contacted to a target. Alternatively, a probe can be
contacted with a specimen comprising a DNA target region and both
subjected to denaturing conditions together as described by Bhatt, et al.
in Nucleic Acids Research 16:3951-3961. The presence of an agent or agents
which in effect lower the temperature required for denaturation and
subsequent hybridization between probe (or probe composition) and target
is generally desirable, and a presently most preferred such agent is
formamide. Using, for example, about a 50:50 weight ratio mixture of water
and formamide, an illustrative temperature for thermal denaturation is in
the range of about 35° to about 70° C applied for times
that are illustratively in the range of about 1 to about 10 minutes, and
an illustrative temperature for contacting and hybridization between probe
(or probe composition) and target is in the range of about 35° to
about 55° C applied for times that are illustratively in the range
of about 1 to about 16 hours. Other hybridizing conditions can be
employed. The ratio of the number of probes to the number of target sequences or
segments can vary widely, but generally the higher this ratio, the higher
the probability of hybrid formation under hybridizing conditions within
limits.
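The illustrative conditions above (for about a 50:50 water/formamide mixture) can be captured as a small data structure and range check. This is a hedged sketch rather than a protocol: the names `Range`, `ILLUSTRATIVE_RANGES` and `outside_illustrative_ranges` are hypothetical, the bounds are the approximate ("about") figures from the text, and a value outside them is not necessarily unusable, since other hybridizing conditions can be employed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Approximate illustrative ranges quoted in the text.
ILLUSTRATIVE_RANGES = {
    "denaturation_temp_c": Range(35.0, 70.0),
    "denaturation_minutes": Range(1.0, 10.0),
    "hybridization_temp_c": Range(35.0, 55.0),
    "hybridization_hours": Range(1.0, 16.0),
}


def outside_illustrative_ranges(protocol: dict) -> list:
    """Names of known protocol settings that fall outside the illustrative ranges."""
    return sorted(
        name
        for name, value in protocol.items()
        if name in ILLUSTRATIVE_RANGES and not ILLUSTRATIVE_RANGES[name].contains(value)
    )


# Hypothetical protocol: hybridization temperature is above the illustrative range.
example_protocol = {
    "denaturation_temp_c": 70.0,
    "denaturation_minutes": 5.0,
    "hybridization_temp_c": 60.0,
    "hybridization_hours": 16.0,
}
print(outside_illustrative_ranges(example_protocol))  # ['hybridization_temp_c']
```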
The term "lower" as used herein in reference to an individual compound,
group or radical means that such compound, group or radical contains less
than 6 carbon atoms.
The term "clone", "cloning" or equivalent refers to the process wherein a
particular nucleotide segment or sequence is inserted into an appropriate
vector, the vector is then transported into a host cell, and the vector
within the host cell is then caused to reproduce itself in a culturing
process, thereby producing numerous copies of each vector and the
respective nucleotide sequence that it carries. Cloning results in the
formation of a colony or clone (i.e., group) of identical host cells
wherein each contains one or more copies of a vector incorporating a
particular nucleotide segment or sequence. The nucleotide segment or
sequence is now said to be "cloned", and the product nucleotide segments
or sequences can be called "clones."
The term "library" is used herein in its conventional sense to refer to a
set of cloned DNA fragments which together represent an entire genome or a
specified fragment thereof, such as a single chromosome. Various libraries
are known to the prior art and are available from various repositories,
and techniques for genome and genome fragment preparation, and for cloning
libraries therefrom, are well known. A present procedural preference is to
fragment a selected one chromosome that was separated by flow sorting or
the like. Fragmentation prior to cloning is preferably achieved by
digestion with restriction endonucleases or the like. This procedure
produces fragment ends which are particularly amenable to insertion into
vectors. However, those skilled in the art will appreciate that any
conventional or convenient technique for fragmentation can be used. The
fragments are then conventionally cloned to produce a chromosome library.
(B) Starting Materials
(1) The Starting Oligonucleotides
Conveniently and preferably, at least one oligonucleotide is used in the
practice of making a regionally specific cloned DNA sequence of this
invention. Each such oligonucleotide is complementary to a location in a
DNA sequence which occurs in a preselected region of a chromosome and
which is located approximately between adjacent DNA repeated segments that
occur in such preselected region. While only a single oligonucleotide is
sufficient, an oligonucleotide mixture of at least two structurally
differing short (i.e., oligomeric) common DNA repeated segments which
bound (i.e. terminate) DNA repeated segments specific to a preselected
region of a given chromosome is presently preferred.
For individual human chromosomes, the structures of such degenerate (i.e.
synthesizable) commonly occurring oligonucleotide segments which occur in
such a DNA sequence are generally known, as are methods for their
identification. See, for example, Koch J. E., et al., Chromosoma
98:259-265 (1989). Typically, suitable synthesized oligonucleotides
complementary to such DNA repeated segments can contain about 17 to about
50 bp, preferably about 15 to about 30 bp, but larger and smaller
oligonucleotides can be prepared and used, if desired.
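Because each starting oligonucleotide is complementary to a location in the target DNA sequence, its base order can be derived from the target strand by taking the reverse complement. The sketch below shows that derivation for unambiguous bases only; the function name and the example target are hypothetical and are not sequences from this disclosure.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand: str) -> str:
    """Return the reverse complement of a DNA strand written 5' to 3'.
    Only the unambiguous bases A, C, G and T are accepted."""
    strand = strand.upper()
    invalid = set(strand) - set("ACGT")
    if invalid:
        raise ValueError(f"unsupported bases: {sorted(invalid)}")
    return strand.translate(_COMPLEMENT)[::-1]


# Hypothetical 9-nt target site, for illustration only.
print(reverse_complement("AAGCTTGGC"))  # GCCAAGCTT
```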
The known identification methods can be readily used for identifying the
DNA repeated segments that are present in a given region of a chromosome,
such as alphoid DNA in the centromere region, in any multi-chromosomal
genome, as those skilled in the art will readily appreciate. From such an
identification, desired complementary oligonucleotide segments can be
derived and synthesized for a given chromosome. The complete nucleotide
structure of the DNA sequence wherein such DNA repeated segments naturally
occur need not be known and, indeed, usually is not known, as those
skilled in the art will appreciate.
Once derived (i.e., identified), the oligonucleotides are readily
synthesized using conventional, commercially available nucleotide sequence
generating apparatus and methods. See, for example, M. H. Caruthers, in
Science, 281-285 (1985). One presently preferred DNA synthesizing machine
is the Applied Biosystems Model 380B DNA Synthesizer available
commercially from Applied Biosystems (Foster City, Calif.). Such a machine
was employed for synthesis of starting oligonucleotides employed in the
examples described herein.
(2) The Starting Chromosomal Template DNA
The starting chromosomal DNA template sequences used in the practice of
this invention comprise DNA from a preselected whole chromosome (of a
multi-chromosomal genome) wherein a preselected region occurs. This
template DNA is typically in the form of a plurality of DNA sequences
which taken together contain a multiplicity of DNA segments that
individually occur at various locations in and throughout such chromosome
and that are reasonably representative of DNA occurring in the preselected
chromosome. Although in their naturally occurring state, such starting DNA
sequences may typically have a size much greater than about one million
base pairs, at the time of availability for use as a starting material in
the practice of this invention, such sequences may already be somewhat
fragmented, depending upon such factors as the methods used in separation,
isolation and the like. Preferably, such chromosome is from the human
genome.
For purposes of preparing a cloned DNA sequence of this invention, the
starting chromosomal DNA sequence(s) can be obtained by various
techniques. Thus, such can be derived or obtained from (a) DNA of a
preselected chromosome that is separated by flow sorting or the like and
purified from component intracellular material of an organism; (b) a
library of a preselected chromosome; and (c) an interspecies hybrid which
incorporates DNA from a preselected chromosome. A presently preferred
starting chromosomal DNA is a chromosome library that has been prepared by
standard methods and is available from traditional sources known to those
in the art, such as the American Type Culture Collection (ATCC) or other
repositories of human or other cloned genetic material. While a large
number of specific chromosome libraries are available from the ATCC,
representative libraries are shown in Table I below:
TABLE I
______________________________________________________
HUMAN CHROMOSOME LIBRARIES
Human                       Human
Chromosome                  Chromosome
Library       ATCC No.      Library       ATCC No.
______________________________________________________
1             57738         13            57757
1             57753         14            57739
1             57754         14            57706
2             57716         14/15         57707
2             57744         15            57729
3             57717         15            57740
3             57748         15            57737
3             57751         16            57765
4             57719         16            57730
4             57718         16            57749
4             57700         16            57758
4             57745         17            57741
5             57720         17            57759
5             57746         18            57742
6             57721         18            57710
6             57701         19            57731
7             57722         19            57766
7             57755         19            57711
8             57723         20            57732
8             57707         20            57712
9             57724         21            57743
9             57705         21            57713
10            57725         22            57733
10            57736         22            57714
11            57726         X             57750
11            57704         X             57734
12            57727         X             57752
12            57736         X             57747
13            57728         Y             57735
13            57705         Y             57715
______________________________________________________
The ATCC deposits of Table I are available from the American Type Culture
Collection, 12301 Parklawn Drive, Rockville, Md.
Examples of prior art teachings illustrating the preparation of suitable
starting preselected chromosomal template sequences for making region
specific DNA sequences of this invention include (but are not limited to):
1. Physically separated chromosomes or libraries derived from same, as in
M. A. Van Dilla, et al. in Bio/Technology 4:537-552 (1986).
2. A microdissected chromosome, fragment of a chromosome or cloned library
derived from same, as in Ludecke, H. J., et al. in Nature 338:348-350
(1989).
3. Single human chromosomes or fragments thereof, which are propagated in
rodent cell lines. A method for the generation of human, mono-chromosomal
hybrid lines is described in: Carlock, L. R., et al. in Somatic Cell Mol.
Genet. 12:163-174 (1986).
The starting chromosomal DNA typically contains about 18 to about 25 mole
percent deoxycytidine nucleotides based on the total number of
deoxynucleotides present therein. Typically, the starting template
chromosomal DNA of the preselected single chromosome wherein the
preselected region exists displays a wide variation in molecular size, for
example, the sizes can be in the range of about 150 to about 20,000,000
bp.
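The deoxycytidine content matters because deoxycytidine nucleotides are the sites of transamination. A minimal sketch of the mole percent calculation follows, with the assumptions stated plainly: the input is a single strand written as text, the function names and example sequence are hypothetical, and the 300-nucleotide fragment length is an arbitrary illustration rather than a figure from this disclosure.

```python
def deoxycytidine_mole_percent(strand: str) -> float:
    """Deoxycytidine residues as a percentage of all nucleotides in the strand."""
    strand = strand.upper()
    if not strand:
        raise ValueError("empty sequence")
    return 100.0 * strand.count("C") / len(strand)


def estimated_dc_sites(length_nt: int, mole_percent: float) -> int:
    """Approximate number of deoxycytidine residues (potential
    transamination sites) in a fragment of the given length."""
    return round(length_nt * mole_percent / 100.0)


# Hypothetical 10-nt strand containing two C residues.
print(deoxycytidine_mole_percent("ACGTTAGCAT"))  # 20.0

# Potential sites in an arbitrary 300-nt fragment across the typical 18-25 mole percent range.
print(estimated_dc_sites(300, 18.0), estimated_dc_sites(300, 25.0))  # 54 75
```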
(3) The Starting Linking Compound
A starting linking compound employed in the practice of this invention is a
difunctional organic compound, that is, such contains two
functional (i.e., reactive) substituents per starting linking compound
molecule.
At least one of such functional substituents per linking compound molecule
is reactive with deoxycytidine nucleotides in a polynucleotide under
bisulfite catalyzed aqueous transamination conditions (such as provided
herein, for example). Examples of such substituents include alkyl amino
(primary and secondary), hydrazido, semicarbazido, thiosemicarbazido, and
the like. Amino groups are presently most preferred.
When the amino group is secondary, the secondary substituent is preferably
a lower alkyl group, but other non-blocking such secondary substituents
can be used, if desired.
The second or other of such two functional substituents per linking
compound molecule is reactive with a third functional substituent which is
itself incorporated into a starting fluorescent compound (as herein
described). Such second functional substituent can itself be either
blocked or unblocked. In either case, it is substantially non-reactive with
the other substances that are present in the transamination medium
(especially polynucleotides) during transamination.
Examples of suitable unblocked second functional substituent groups include
amino, carboxyl, phosphate, sulfonate, hydroxyl, hydrazido, semicarbazido,
thiosemicarbazido and the like. Presently most preferred unblocked second
functional substituents include amino (primary or secondary) and carboxyl
groups.
The carboxyl group preferably is either in the salt form or in the acid
form, but can sometimes be in the ester form. When in the salt form,
presently preferred cations are alkali metals, such as sodium and
potassium.
Examples of suitable blocked second functional substituent groups include
blocked sulfonate, blocked phosphate, blocked sulfhydryl, and the like.
Examples of suitable blocking substituents include lower alkyl groups such
as methyl, ethyl, propyl, etc.
The first and the second functional substituents are interconnected
together through a linker (or linking) moiety. This linking moiety can
have any convenient structure but such is non-reactive with other
substances that are present in the transamination medium during
transamination. A present preference is that the linking moiety be a
hydrocarbonaceous divalent group which is acyclic or cyclic and which
can optionally incorporate other atoms.
The two functional substituents present in such a difunctional linking
compound can be respective substituents of the linking moiety. Such
substituents can be on adjacent carbon atoms relative to each other, or
they can be spaced from one another in a linking compound molecule by a
plurality of intervening interconnected atoms (preferably carbon atoms).
Preferably these functional groups are in an alpha, omega relationship to
one another (that is, each is at a different opposite end region) in a
given linking compound molecule.
Thus, the two functional radicals in a linking compound are each bonded to
an organic linking group moiety which is either entirely hydrocarbonaceous
(that is, composed only of carbon and hydrogen atoms), or is comprised of
carbon and hydrogen atoms plus at least one additional atom or group which
contains at least one atom selected from the group consisting of oxygen,
sulfur, nitrogen, phosphorus, or the like. Preferably such additional
atom(s) are so associated with such organic moiety as to be substantially
less reactive than either one of such above indicated two functional
radicals that are present in a given starting linking compound.
Hydrocarbonaceous organic moieties that are saturated aliphatic are
presently preferred, and more preferably such moiety is a divalent
alkylene radical containing from 2 through 12 carbon atoms, inclusive.
However, if desired, such a saturated aliphatic radical can incorporate
either at least one ether group (--O--) or at least one thio-ether group
(--S--), but it is presently more preferred that only one of such ether or
thio-ether groups be present. It is presently preferred that a linking
compound incorporates an organic radical that contains at least two and
not more than a total of about 20 carbon atoms, although more carbon
atoms per molecule can be present, if desired.
Presently preferred are linking compounds in which each of such functional
radicals is an amino radical. Both acyclic and cyclic diamino compounds
can be used.
Examples of suitable aliphatic primary diamines include alkylene primary
diamines wherein the alkylene group is propylene, butylene, pentylene,
hexylene, nonylene, and the like.
Examples of suitable aliphatic secondary diamines include CH.sub.3
NH(CH.sub.2).sub.2 NH.sub.2, CH.sub.3 NH(CH.sub.2).sub.2 NHCH.sub.3, and
the like.
Diamino compounds incorporating hydroxylated hydrocarbons can be used.
Examples of acyclic such compounds include 1,3-diamino-2-hydroxypropane;
1,4-diamino-2,3-dihydroxybutane; 1,5-diamino-2,3,4-trihydroxypentane;
1,6-diamino-1,6-dideoxy-D-mannitol (or D-glucitol or D-galactitol),
1,6-diamino-2,3,4,5-tetrahydroxy hexane, and the like.
Examples of suitable polyhydroxylated cyclic diamines include cis or
trans cyclic diamino compounds where the diamines are constrained in a
ring, such as 1,4-diamino-2,3,5,6-tetrahydroxy cyclohexane, cis and trans
1,2-diaminocyclohexane, cis and trans 1,2-diaminocyclopentane, and
hydroxylated derivatives thereof, such as
1,2-diamino-3,4,5,6-tetrahydroxycyclohexane, 1,2-diamino-3,4,5-trihydroxy
cyclopentane, 3,6-diamino-3,6-dideoxy-derivatives of myo-inositol, such as
##STR1##
and the like.
Examples of suitable heterocyclic diamines include piperazine, N,N'-bis
(3-aminopropyl) piperazine, derivatives thereof, and the like.
Examples of suitable ether-group containing diamines include
3-oxa-1,5-pentanediamine, 3,6-dioxa-1,8-diaminooctane, and the like.
Examples of suitable linking compounds containing both an amino radical and
a carboxyl radical include amino acids, such as sarcosine
(N-methylglycine), and alpha amino acids, such as glycine, alanine,
glutamic acid, aspartic acid, proline, pipecolinic acid
(piperidine-2-carboxylic acid), isopipecolinic acid
(piperidine-4-carboxylic acid), glucosaminic acid and derivatives thereof,
and the like.
Examples of alpha, omega aminocarboxylic acids (in addition to the above
identified amino acids) include 4-aminobutyric acid, 6-aminohexanoic acid,
8-aminooctanoic acid, and the like.
Examples of phosphorus-containing difunctional linking compounds include
alpha, omega aminoalkyl phosphoric acid monoesters, such as
O-(2-aminoethyl) phosphate disodium salt and the like.
Examples of suitable sulfur containing difunctional linking compounds
include alpha, omega aminoalkyl sulfonic acids, such as taurine
(2-aminoethyl sulfonic acid) and the like.
One presently more preferred class of difunctional linking compounds is
represented by the following generic formula:
##STR2##
wherein:
X is a divalent radical selected from the class consisting of:
##STR3##
wherein:
R is an alkylene radical containing from 2 through 12 carbon atoms
inclusive, or a carbocyclic ring or hydroxylated carbocyclic ring, and
R.sub.1 and R.sub.2 are each independently selected from the class
consisting of hydrogen and lower alkyl.
Preferably, in Formula (1), X is
##STR4##
R.sub.1 and R.sub.2 are each hydrogen, and R contains less than 7 carbon
atoms.
Mixtures of different linking compounds can be used, such as linking
compounds containing a mixture of mono and/or diamines, but such mixtures
are not preferred because of associated problems in transamination control
and usage.
Diamines which are characterized by having a large proportion thereof that
exists as a free unprotonated species at pH values of about 7 appear to
enhance the present transamination reaction. Ethylene diamine (pK of about
7.6) is presently most preferred for use as the reactive difunctional
amine because of this property.
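The significance of the pK value can be illustrated with the Henderson-Hasselbalch relationship, under which the fraction of an amino group present in the free, unprotonated form at a given pH is 1/(1 + 10^(pK - pH)). The following minimal sketch applies that relationship to a single ionizable group, which is a simplifying assumption for a diamine. The comparison pK of 10.5 is an illustrative figure typical of simple primary alkylamines and is not taken from this specification.

```python
def unprotonated_fraction(pk: float, ph: float) -> float:
    """Fraction of a single amino group that is unprotonated at the given pH.

    Henderson-Hasselbalch: [B] / ([B] + [BH+]) = 1 / (1 + 10**(pK - pH)).
    Assumption: the group is treated as one independent ionizable site.
    """
    return 1.0 / (1.0 + 10.0 ** (pk - ph))


if __name__ == "__main__":
    ph = 7.0
    # pK of about 7.6, as stated in the text for ethylene diamine.
    print(f"pK 7.6:  {unprotonated_fraction(7.6, ph):.4f}")
    # Illustrative comparison only: an amine with a pK of about 10.5.
    print(f"pK 10.5: {unprotonated_fraction(10.5, ph):.6f}")
```

Under these assumptions roughly one fifth of the groups having a pK of about 7.6 are unprotonated at pH 7, compared with well under one tenth of one percent for a group having a pK of about 10.5, which is consistent with the preference stated above.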
When, for example, such a linking compound is bonded to a DNA sequence
using a transamination reaction, as hereinbelow described, the
transamination reaction is carried out so that an amino radical in the
linking compound bonds to the sequence or segment. Then, in the resulting
linking group, one functional group remains free to undergo further
reaction. Thus, when the second functional radical is an amino radical,
such radical remains free thereafter to undergo further reaction with the
fluorescent compound, as hereinbelow described. When the second functional
radical is a carboxyl radical, such radical remains free thereafter to
undergo such a further reaction with the fluorescent compound, as
hereinbelow described.
(C) Production of Cloned Regional Chromosomal Sequence
The present invention provides a process for producing a cloned DNA
sequence that (a) is complementary to a DNA sequence which occurs in one
selected region of one selected chromosome that is preferably of a
multi-chromosomal genome, and also that (b) incorporates a plurality of
copies of at least one DNA repeated segment which occurs in such one
selected region.
Briefly, this process involves, as a first step, synthesizing at least one
starting oligonucleotide (as above described). Each such oligonucleotide
contains a nucleotide sequence that is complementary to at least one DNA