FIELD OF THE INVENTION
This invention relates to nucleic acid and amino acid sequences of an
ATP-dependent RNA helicase and to the use of these sequences in the
diagnosis, prevention, and treatment of cancer, neurological disorders,
and immune disorders.
BACKGROUND OF THE INVENTION
Nucleic acid helicases are a large family of enzymes that unwind
double-stranded DNA and RNA and use the energy derived from the hydrolysis
of a nucleoside 5'-triphosphate (usually ATP) to drive the unwinding
process. ATP-dependent DNA helicases are needed to provide single-stranded
DNA for DNA replication, repair, recombination, and transcription.
ATP-dependent RNA helicases are needed to provide single-stranded RNA for
mRNA splicing, translation, and ribosomal assembly.
RNA helicases from a wide variety of sources including bacteria, yeast, and
mammals share a number of highly conserved sequences and structural
features suggesting that RNA helicase activity is of fundamental
importance to cells of all types. For example, yeast Drs1 protein is
involved in ribosomal RNA processing; yeast TIF1 and TIF2 and mammalian
eIF-4A are essential to the initiation of RNA translation; and human p68
antigen regulates cell growth and division (Ripmaster, T. L. et al. (1992)
Proc. Natl. Acad. Sci. 89:11131-35; Chang, T-H et al. (1990) Proc. Natl.
Acad. Sci. 87:1571-75). These RNA helicases demonstrate strong sequence
homology over a stretch of some 420 amino acids. Included among these
conserved sequences are the sequence DX.sub.4 AX.sub.4 GKT typical for the
A motif of an ATP binding protein, the "DEAD box" sequence
(aspartate-glutamate-alanine-aspartate) associated with ATPase activity,
the sequence SAT associated with the actual helicase unwinding region, and
the sequence H/QRXGRXXR required for RNA binding and ATP hydrolysis
(Pause, A. et al. (1993) Mol. Cell Biol. 13:6789-98). Moreover, these
sequences are similarly spaced apart in members of the helicase family and
most often found in the middle region of the protein. Differences outside
of the conserved regions are believed to reflect differences in the
functional roles of individual proteins (Chang et al., supra).
The discovery of a new ATP-dependent RNA helicase and the polynucleotides
encoding it satisfies a need in the art by providing new compositions
which are useful in the diagnosis, prevention, and treatment of cancer,
neurological disorders, and immune disorders.
SUMMARY OF THE INVENTION
The invention features a substantially purified polypeptide, ATP-dependent
RNA helicase (ADRH-1), having the amino acid sequence shown in SEQ ID NO:1,
or fragments thereof.
The invention further provides an isolated and substantially purified
polynucleotide sequence encoding the polypeptide comprising the amino acid
sequence of SEQ ID NO:1 or fragments thereof and a composition comprising
said polynucleotide sequence. The invention also provides a polynucleotide
sequence which hybridizes under stringent conditions to the polynucleotide
sequence encoding the amino acid sequence SEQ ID NO:1, or fragments of
said polynucleotide sequence. The invention further provides a
polynucleotide sequence comprising the complement of the polynucleotide
sequence encoding the amino acid sequence of SEQ ID NO:1, or fragments or
variants of said polynucleotide sequence.
The invention also provides an isolated and purified sequence comprising
SEQ ID NO:2 or variants thereof. In addition, the invention provides a
polynucleotide sequence which hybridizes under stringent conditions to the
polynucleotide sequence of SEQ ID NO:2.
The invention also provides a polynucleotide sequence comprising the
complement of SEQ ID NO:2 or fragments or variants thereof.
The present invention further provides an expression vector containing at
least a fragment of any of the claimed polynucleotide sequences. In yet
another aspect, the expression vector containing the polynucleotide
sequence is contained within a host cell.
The invention also provides a method for producing a polypeptide comprising
the amino acid sequence of SEQ ID NO:1 or a fragment thereof, the method
comprising the steps of: a) culturing the host cell containing an
expression vector containing at least a fragment of the polynucleotide
sequence encoding ADRH-1 under conditions suitable for the expression of
the polypeptide; and b) recovering the polypeptide from the host cell
culture.
The invention also provides a pharmaceutical composition comprising a
substantially purified ADRH-1 having the amino acid sequence of SEQ ID
NO:1 in conjunction with a suitable pharmaceutical carrier.
The invention also provides a purified antagonist of the polypeptide of SEQ
ID NO:1. In one aspect the invention provides a purified antibody which
binds to a polypeptide comprising the amino acid sequence of SEQ ID NO:1.
Still further, the invention provides a purified agonist of the polypeptide
of SEQ ID NO:1.
The invention also provides a method for treating or preventing a
neurological disorder comprising administering to a subject in need of
such treatment an effective amount of a pharmaceutical composition
comprising purified ADRH-1.
The invention also provides a method for treating or preventing cancer
comprising administering to a subject in need of such treatment an
effective amount of a purified antagonist of ADRH-1.
The invention also provides a method for treating or preventing an immune
disorder comprising administering to a subject in need of such treatment
an effective amount of a purified antagonist of ADRH-1.
The invention also provides a method for detecting a polynucleotide which
encodes ADRH-1 in a biological sample comprising the steps of: a)
hybridizing the complement of the polynucleotide sequence which encodes
SEQ ID NO:1 to nucleic acid material of a biological sample, thereby
forming a hybridization complex; and b) detecting the hybridization
complex, wherein the presence of the complex correlates with the presence
of a polynucleotide encoding ADRH-1 in the biological sample. In one
aspect the nucleic acid material of the biological sample is amplified by
the polymerase chain reaction prior to hybridization.
BRIEF DESCRIPTION OF THE FIGURES
FIGS. 1A, 1B, 1C, 1D, 1E, 1F, 1G, and 1H show the amino acid sequence (SEQ
ID NO:1) and nucleic acid sequence (SEQ ID NO:2) of ADRH-1. The alignment
was produced using MacDNASIS PRO.TM. software (Hitachi Software
Engineering Co. Ltd. San Bruno, Calif.).
FIGS. 2A, 2B, 2C, and 2D show the amino acid sequence alignments among
ADRH-1 (SEQ ID NO:1), ATP-dependent RNA helicase from mouse (GI 1335873;
SEQ ID NO:3) and an ATP-dependent RNA helicase-like protein from
Caenorhabditis elegans (GI 1707046; SEQ ID NO:4), produced using the
multisequence alignment program of DNASTAR.TM. software (DNASTAR Inc,
Madison Wis.).
FIGS. 3A and 3B show the hydrophobicity plots for ADRH-1 (SEQ ID NO:1) and
ATP-dependent RNA helicase from mouse (SEQ ID NO:3), respectively; the
positive X axis reflects amino acid position, and the negative Y axis,
hydrophobicity (MACDNASIS PRO software).
DESCRIPTION OF THE INVENTION
Before the present proteins, nucleotide sequences, and methods are
described, it is understood that this invention is not limited to the
particular methodology, protocols, cell lines, vectors, and reagents
described, as these may vary. It is also to be understood that the
terminology used herein is for the purpose of describing particular
embodiments only, and is not intended to limit the scope of the present
invention which will be limited only by the appended claims.
It must be noted that as used herein and in the appended claims, the
singular forms "a", "an", and "the" include plural reference unless the
context clearly dictates otherwise. Thus, for example, reference to "a
host cell" includes a plurality of such host cells, reference to the
"antibody" is a reference to one or more antibodies and equivalents
thereof known to those skilled in the art, and so forth.
Unless defined otherwise, all technical and scientific terms used herein
have the same meanings as commonly understood by one of ordinary skill in
the art to which this invention belongs. Although any methods and
materials similar or equivalent to those described herein can be used in
the practice or testing of the present invention, the preferred methods,
devices, and materials are now described. All publications mentioned
herein are incorporated herein by reference for the purpose of describing
and disclosing the cell lines, vectors, and methodologies which are
reported in the publications which might be used in connection with the
invention. Nothing herein is to be construed as an admission that the
invention is not entitled to antedate such disclosure by virtue of prior
invention.
Definitions
ADRH-1, as used herein, refers to the amino acid sequences of substantially
purified ADRH-1 obtained from any species, particularly mammalian,
including bovine, ovine, porcine, murine, equine, and preferably human,
from any source whether natural, synthetic, semi-synthetic, or
recombinant.
The term "agonist", as used herein, refers to a molecule which, when bound
to ADRH-1, increases or prolongs the duration of the effect of ADRH-1.
Agonists may include proteins, nucleic acids, carbohydrates, or any other
molecules which bind to and modulate the effect of ADRH-1.
An "allele" or "allelic sequence", as used herein, is an alternative form
of the gene encoding ADRH-1. Alleles may result from at least one mutation
in the nucleic acid sequence and may result in altered mRNAs or
polypeptides whose structure or function may or may not be altered. Any
given natural or recombinant gene may have none, one, or many allelic
forms. Common mutational changes which give rise to alleles are generally
ascribed to natural deletions, additions, or substitutions of nucleotides.
Each of these types of changes may occur alone, or in combination with the
others, one or more times in a given sequence.
"Altered" nucleic acid sequences encoding ADRH-1 as used herein, include
those with deletions, insertions, or substitutions of different
nucleotides resulting in a polynucleotide that encodes the same or a
functionally equivalent ADRH-1. Included within this definition are
polymorphisms which may or may not be readily detectable using a
particular oligonucleotide probe of the polynucleotide encoding ADRH-1,
and improper or unexpected hybridization to alleles, with a locus other
than the normal chromosomal locus for the polynucleotide sequence encoding
ADRH-1. The encoded protein may also be "altered" and contain deletions,
insertions, or substitutions of amino acid residues which produce a silent
change and result in a functionally equivalent ADRH-1. Deliberate amino
acid substitutions may be made on the basis of similarity in polarity,
charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues as long as the biological or immunological activity
of ADRH-1 is retained. For example, negatively charged amino acids may
include aspartic acid and glutamic acid; positively charged amino acids
may include lysine and arginine; and amino acids with uncharged polar head
groups having similar hydrophilicity values may include leucine,
isoleucine, and valine; glycine and alanine; asparagine and glutamine;
serine and threonine; and phenylalanine and tyrosine.
"Amino acid sequence", as used herein, refers to an oligopeptide, peptide,
polypeptide, or protein sequence, and fragment thereof, and to naturally
occurring or synthetic molecules. Fragments of ADRH-1 are preferably about
5 to about 15 amino acids in length and retain the biological activity or
the immunological activity of ADRH-1. Where "amino acid sequence" is
recited herein to refer to an amino acid sequence of a naturally occurring
protein molecule, "amino acid sequence" and like terms are not meant to
limit the amino acid sequence to the complete, native amino acid sequence
associated with the recited protein molecule.
"Amplification", as used herein, refers to the production of additional
copies of a nucleic acid sequence and is generally carried out using
polymerase chain reaction (PCR) technologies well known in the art
(Dieffenbach, C. W. and G. S. Dveksler (1995) PCR Primer, a Laboratory
Manual, Cold Spring Harbor Press, Plainview, N.Y.).
The term "antagonist", as used herein, refers to a molecule which, when
bound to ADRH-1, decreases the amount or the duration of the effect of the
biological or immunological activity of ADRH-1. Antagonists may include
proteins, nucleic acids, carbohydrates, antibodies or any other molecules
which decrease the effect of ADRH-1.
As used herein, the term "antibody" refers to intact molecules as well as
fragments thereof, such as Fab, F(ab').sub.2, and Fv, which are capable of
binding the epitopic determinant. Antibodies that bind ADRH-1 polypeptides
can be prepared using intact polypeptides or fragments containing small
peptides of interest as the immunizing antigen. The polypeptide or
oligopeptide used to immunize an animal can be derived from the
translation of RNA or synthesized chemically and can be conjugated to a
carrier protein, if desired. Commonly used carriers that are chemically
coupled to peptides include bovine serum albumin, thyroglobulin, and
keyhole limpet hemocyanin. The coupled peptide is then used to immunize
the animal (e.g., a mouse, a rat, or a rabbit).
The term "antigenic determinant", as used herein, refers to that fragment
of a molecule (i.e., an epitope) that makes contact with a particular
antibody. When a protein or fragment of a protein is used to immunize a
host animal, numerous regions of the protein may induce the production of
antibodies which bind specifically to a given region or three-dimensional
structure on the protein; these regions or structures are referred to as
antigenic determinants. An antigenic determinant may compete with the
intact antigen (i.e., the immunogen used to elicit the immune response)
for binding to an antibody.
The term "antisense", as used herein, refers to any composition containing
nucleotide sequences which are complementary to a specific DNA or RNA
sequence. The term "antisense strand" is used in reference to a nucleic
acid strand that is complementary to the "sense" strand. Antisense
molecules include peptide nucleic acids and may be produced by any method
including synthesis or transcription. Once introduced into a cell, the
complementary nucleotides combine with natural sequences produced by the
cell to form duplexes and block either transcription or translation. The
designation "negative" is sometimes used in reference to the antisense
strand, and "positive" is sometimes used in reference to the sense strand.
The term "biologically active", as used herein, refers to a protein having
structural, regulatory, or biochemical functions of a naturally occurring
molecule. Likewise, "immunologically active" refers to the capability of
the natural, recombinant, or synthetic ADRH-1, or any oligopeptide
thereof, to induce a specific immune response in appropriate animals or
cells and to bind with specific antibodies.
The terms "complementary" or "complementarity", as used herein, refer to
the natural binding of polynucleotides under permissive salt and
temperature conditions by base-pairing. For example, the sequence "A-G-T"
binds to the complementary sequence "T-C-A". Complementarity between two
single-stranded molecules may be "partial", in which only some of the
nucleic acids bind, or it may be complete when total complementarity
exists between the single stranded molecules. The degree of
complementarity between nucleic acid strands has significant effects on
the efficiency and strength of hybridization between nucleic acid strands.
This is of particular importance in amplification reactions, which depend
upon binding between nucleic acid strands, and in the design and use of
PNA molecules.
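The base-pairing rule in the example above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of this disclosure; only the four unambiguous DNA bases are handled, and "partial" complementarity is modeled, as an assumption, as the fraction of positions that pair when two equal-length strands are written in pairing orientation (as "A-G-T" and "T-C-A" are written above).

```python
# Illustrative sketch only; the function names are assumptions, not from the disclosure.
_PAIRS = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the base-by-base complement (A<->T, G<->C) of a DNA string."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("only the bases A, C, G, and T are supported")
    return seq.translate(_PAIRS)


def reverse_complement(seq: str) -> str:
    """Return the complement read in the opposite (antiparallel) direction."""
    return complement(seq)[::-1]


def fraction_paired(a: str, b: str) -> float:
    """Fraction of positions at which strand b pairs with strand a.

    Both strands are assumed to be written in pairing orientation,
    so position i of a is compared with position i of b.
    """
    if len(a) != len(b) or not a:
        raise ValueError("strands must be non-empty and of equal length")
    expected = complement(a)
    return sum(x == y for x, y in zip(expected, b.upper())) / len(a)
```

Under these assumptions, complement("AGT") gives "TCA", matching the example in the text, while fraction_paired("AGT", "TCC") gives two thirds, a case of partial complementarity.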
A "composition comprising a given polynucleotide sequence", as used herein,
refers broadly to any composition containing the given polynucleotide
sequence. The composition may comprise a dry formulation or an aqueous
solution. Compositions comprising polynucleotide sequences encoding ADRH-1
(SEQ ID NO:1) or fragments thereof (e.g., SEQ ID NO:2 and fragments
thereof) may be employed as hybridization probes. The probes may be stored
in freeze-dried form and may be associated with a stabilizing agent such
as a carbohydrate. In hybridizations, the probe may be deployed in an
aqueous solution containing salts (e.g., NaCl), detergents (e.g., SDS) and
other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA,
etc.).
"Consensus", as used herein, refers to a nucleic acid sequence which has
been resequenced to resolve uncalled bases, has been extended using
XL-PCR.TM. (Perkin Elmer, Norwalk, Conn.) in the 5' and/or the 3'
direction and resequenced, or has been assembled from the overlapping
sequences of more than one Incyte Clone using a computer program for
fragment assembly (e.g., GELVIEW.TM. Fragment Assembly system, GCG,
Madison, Wis.). Some sequences have been both extended and assembled to
produce the consensus sequence.
The term "correlates with expression of a polynucleotide", as used herein,
indicates that the detection of the presence of ribonucleic acid that is
similar to SEQ ID NO:2 by northern analysis is indicative of the presence
of mRNA encoding ADRH-1 in a sample and thereby correlates with expression
of the transcript from the polynucleotide encoding the protein.
A "deletion", as used herein, refers to a change in the amino acid or
nucleotide sequence and results in the absence of one or more amino acid
residues or nucleotides.
The term "derivative", as used herein, refers to the chemical modification
of a nucleic acid encoding or complementary to ADRH-1 or the encoded
ADRH-1. Such modifications include, for example, replacement of hydrogen
by an alkyl, acyl, or amino group. A nucleic acid derivative encodes a
polypeptide which retains the biological or immunological function of the
natural molecule. A derivative polypeptide is one which is modified by
glycosylation, pegylation, or any similar process which retains the
biological or immunological function of the polypeptide from which it was
derived.
The term "homology", as used herein, refers to a degree of complementarity.
There may be partial homology or complete homology (i.e., identity). A
partially complementary sequence that at least partially inhibits an
identical sequence from hybridizing to a target nucleic acid is referred
to using the functional term "substantially homologous." The inhibition of
hybridization of the completely complementary sequence to the target
sequence may be examined using a hybridization assay (Southern or northern
blot, solution hybridization and the like) under conditions of low
stringency. A substantially homologous sequence or hybridization probe
will compete for and inhibit the binding of a completely homologous
sequence to the target sequence under conditions of low stringency. This
is not to say that conditions of low stringency are such that non-specific
binding is permitted; low stringency conditions require that the binding
of two sequences to one another be a specific (i.e., selective)
interaction. The absence of non-specific binding may be tested by the use
of a second target sequence which lacks even a partial degree of
complementarity (e.g., less than about 30% identity). In the absence of
non-specific binding, the probe will not hybridize to the second
non-complementary target sequence.
Human artificial chromosomes (HACs) are linear microchromosomes which may
contain DNA sequences of 10K to 10M in size and contain all of the
elements required for stable mitotic chromosome segregation and
maintenance (Harrington, J. J. et al. (1997) Nat Genet. 15:345-355).
The term "humanized antibody", as used herein, refers to antibody molecules
in which amino acids have been replaced in the non-antigen binding regions
in order to more closely resemble a human antibody, while still retaining
the original binding ability.
The term "hybridization", as used herein, refers to any process by which a
strand of nucleic acid binds with a complementary strand through base
pairing.
The term "hybridization complex", as used herein, refers to a complex
formed between two nucleic acid sequences by virtue of the formation of
hydrogen bonds between complementary G and C bases and between
complementary A and T bases; these hydrogen bonds may be further
stabilized by base stacking interactions. The two complementary nucleic
acid sequences hydrogen bond in an antiparallel configuration. A
hybridization complex may be formed in solution (e.g., C.sub.0 t or
R.sub.0 t analysis) or between one nucleic acid sequence present in
solution and another nucleic acid sequence immobilized on a solid support
(e.g., paper, membranes, filters, chips, pins or glass slides, or any
other appropriate substrate to which cells or their nucleic acids have
been fixed).
An "insertion" or "addition", as used herein, refers to a change in an
amino acid or nucleotide sequence resulting in the addition of one or more
amino acid residues or nucleotides, respectively, as compared to the
naturally occurring molecule.
"Microarray" refers to an array of distinct polynucleotides or
oligonucleotides synthesized on a substrate, such as paper, nylon or other
type of membrane, filter, chip, glass slide, or any other suitable solid
support.
The term "modulate", as used herein, refers to a change in the activity of
ADRH-1. For example, modulation may cause an increase or a decrease in
protein activity, binding characteristics, or any other biological,
functional or immunological properties of ADRH-1.
"Nucleic acid sequence", as used herein, refers to an oligonucleotide,
nucleotide, or polynucleotide, and fragments thereof, and to DNA or RNA of
genomic or synthetic origin which may be single- or double-stranded, and
represent the sense or antisense strand.
"Fragments" are those nucleic acid sequences which are greater than 60
nucleotides in length, and most preferably include fragments that
are at least 100 nucleotides, at least 1000 nucleotides, or at least
10,000 nucleotides in length.
The term "oligonucleotide" refers to a nucleic acid sequence of at least
about 6 nucleotides to about 60 nucleotides, preferably about 15 to 30
nucleotides, and more preferably about 20 to 25 nucleotides, which can be
used in PCR amplification or a hybridization assay, or a microarray. As
used herein, oligonucleotide is substantially equivalent to the terms
"amplimers", "primers", "oligomers", and "probes", as commonly defined in
the art.
"Peptide nucleic acid", PNA, as used herein, refers to an antisense
molecule or anti-gene agent which comprises an oligonucleotide of at least
five nucleotides in length linked to a peptide backbone of amino acid
residues which ends in lysine. The terminal lysine confers solubility to
the composition. PNAs may be pegylated to extend their lifespan in the
cell where they preferentially bind complementary single stranded DNA and
RNA and stop transcript elongation (Nielsen, P. E. et al. (1993)
Anticancer Drug Des. 8:53-63).
The term "portion", as used herein, with regard to a protein (as in "a
portion of a given protein") refers to fragments of that protein. The
fragments may range in size from five amino acid residues to the entire
amino acid sequence minus one amino acid. Thus, a protein "comprising at
least a portion of the amino acid sequence of SEQ ID NO:1" encompasses the
full-length ADRH-1 and fragments thereof.
The term "sample", as used herein, is used in its broadest sense. A
biological sample suspected of containing nucleic acid encoding ADRH-1, or
fragments thereof, or ADRH-1 itself may comprise a bodily fluid, extract
from a cell, chromosome, organelle, or membrane isolated from a cell, a
cell, genomic DNA, RNA, or cDNA (in solution or bound to a solid support), a
tissue, a tissue print, and the like.
The terms "specific binding" or "specifically binding", as used herein,
refer to that interaction between a protein or peptide and an agonist, an
antibody, or an antagonist. The interaction is dependent upon the presence
of a particular structure (i.e., the antigenic determinant or epitope) of
the protein recognized by the binding molecule. For example, if an
antibody is specific for epitope "A", the presence of a protein containing
epitope A (or free, unlabeled A) in a reaction containing labeled "A" and
the antibody will reduce the amount of labeled A bound to the antibody.
The terms "stringent conditions" or "stringency", as used herein, refer to
the conditions for hybridization as defined by the nucleic acid, salt, and
temperature. These conditions are well known in the art and may be altered
in order to identify or detect identical or related polynucleotide
sequences. Numerous equivalent conditions comprising either low or high
stringency depend on factors such as the length and nature of the sequence
(DNA, RNA, base composition), nature of the target (DNA, RNA, base
composition), milieu (in solution or immobilized on a solid substrate),
concentration of salts and other components (e.g., formamide, dextran
sulfate and/or polyethylene glycol), and temperature of the reactions
(within a range from about 5.degree. C. below the melting temperature of
the probe to about 20.degree. C. to 25.degree. C. below the melting
temperature). One or more factors may be varied to generate conditions
of either low or high stringency different from, but equivalent to, the
above listed conditions.
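The temperature range given above can be made concrete with a minimal Python sketch. The melting-temperature estimate uses the Wallace rule of thumb (2 degrees C per A or T, 4 degrees C per G or C), which is an assumption introduced here for illustration and is not taught in this disclosure; it is generally regarded as suitable only for short oligonucleotides and ignores salt and formamide effects. The function names are hypothetical.

```python
# Illustrative sketch only; the Wallace rule and all names are assumptions.
from typing import Tuple


def wallace_tm(oligo: str) -> float:
    """Estimate the melting temperature (degrees C) of a short DNA oligonucleotide."""
    s = oligo.upper()
    if not s or set(s) - set("ACGT"):
        raise ValueError("expected a non-empty string of A, C, G, and T")
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def hybridization_window(tm: float) -> Tuple[float, float]:
    """Return (lowest, highest) reaction temperature in degrees C.

    The window follows the range stated in the text: from about
    25 degrees C below the melting temperature up to about 5 degrees C below it.
    """
    return (tm - 25.0, tm - 5.0)
```

For a 20-nucleotide probe with ten A/T and ten G/C bases, this estimate gives a melting temperature of 60 degrees C and a window of 35 to 55 degrees C; temperatures near the upper end correspond to higher stringency.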
The term "substantially purified", as used herein, refers to nucleic or
amino acid sequences that are removed from their natural environment,
isolated or separated, and are at least 60% free, preferably 75% free, and
most preferably 90% free from other components with which they are
naturally associated.
A "substitution", as used herein, refers to the replacement of one or more
amino acids or nucleotides by different amino acids or nucleotides,
respectively.
"Transformation", as defined herein, describes a process by which exogenous
DNA enters and changes a recipient cell. It may occur under natural or
artificial conditions using various methods well known in the art.
Transformation may rely on any known method for the insertion of foreign
nucleic acid sequences into a prokaryotic or eukaryotic host cell. The
method is selected based on the type of host cell being transformed and
may include, but is not limited to, viral infection, electroporation, heat
shock, lipofection, and particle bombardment. Such "transformed" cells
include stably transformed cells in which the inserted DNA is capable of
replication either as an autonomously replicating plasmid or as part of
the host chromosome. They also include cells which transiently express the
inserted DNA or RNA for limited periods of time.
A "variant" of ADRH-1, as used herein, refers to an amino acid sequence
that is altered by one or more amino acids. The variant may have
"conservative" changes, wherein a substituted amino acid has similar
structural or chemical properties, e.g., replacement of leucine with
isoleucine. More rarely, a variant may have "nonconservative" changes,
e.g., replacement of a glycine with a tryptophan. Analogous minor
variations may also include amino acid deletions or insertions, or both.
Guidance in determining which amino acid residues may be substituted,
inserted, or deleted without abolishing biological or immunological
activity may be found using computer programs well known in the art, for
example, DNASTAR software.
The Invention
The invention is based on the discovery of a new human ATP-dependent RNA
helicase (hereinafter referred to as "ADRH-1"), the polynucleotides
encoding ADRH-1, and the use of these compositions for the diagnosis,
prevention, or treatment of cancer, neurological disorders, and immune
disorders.
Nucleic acids encoding the ADRH-1 of the present invention were first
identified in Incyte Clone 1321876 from the normal bladder cDNA library
(BLADNOT04) using a computer search for amino acid sequence alignments. A
consensus sequence, SEQ ID NO:2, was derived from the following
overlapping and/or extended nucleic acid sequences: Incyte Clones
368897/SYNORAT01, 1321876/BLADNOT04, 1382885/BRAITUT08, and
1393885/THYRNOT03.
In one embodiment, the invention encompasses a polypeptide comprising the
amino acid sequence of SEQ ID NO:1, as shown in FIGS. 1A, 1B, 1C, 1D, 1E,
1F, 1G and 1H. ADRH-1 is 859 amino acids in length and has a potential
ATP/GTP binding site (motif A) at D.sub.32 ILGAAETGSGKT. The "DEAD box"
sequence is found at D.sub.471 EAD, the SAT (helicase) region is found at
S.sub.507 AT, and the RNA binding-ATP hydrolysis sequence, H/QRXGRXXR, is
found at H.sub.673 RSGRTAR. A potential N-linked glycosylation site is
found at N.sub.493, and various potential protein phosphorylation sites
are found for protein kinase A at T.sub.503, T.sub.526, S.sub.604, and
T.sub.841, and for protein tyrosine kinase at Y.sub.740. As shown in FIGS.
2A, 2B, 2C, and 2D, ADRH-1 has chemical and structural homology with RNA
helicase from mouse (GI 1335873; SEQ ID NO:3) and C. elegans (GI 1707046;
SEQ ID NO:4). In particular, ADRH-1 shares 85% and 36% identity with the
mouse and C. elegans RNA helicases, respectively. Moreover, ADRH-1
and the C. elegans RNA helicase share the ATP/GTP binding site, and both
the mouse and C. elegans RNA helicase share the "DEAD box", the SAT
(helicase), and the RNA binding-ATP hydrolysis sequences found in ADRH-1.
The N-linked glycosylation site and the protein kinase A phosphorylation
sites at T.sub.503 and S.sub.604 in ADRH-1 are shared by the mouse
helicase, as is the tyrosine kinase phosphorylation site at Y.sub.740.
ADRH-1 differs from the mouse and C. elegans helicases primarily in the
N-terminal region of approximately 400 amino acids, which may confer different
functional characteristics to the molecule. As illustrated by FIGS. 3A and
3B, ADRH-1 and the mouse RNA helicase have rather similar hydrophobicity
plots when the latter protein is compared with the C-terminal half of
ADRH-1. Northern analysis shows the expression of this sequence in various
libraries, at least 29% of which are immortalized or cancerous, at least
20% of which are associated with the brain and neural tissues, and at least 19% of
which involve inflammation and the immune response.
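The motif positions reported above could be located with a simple pattern scan, sketched below in Python. The regular expressions are a reading of the motif notation used in this description: the motif A pattern is inferred from the ADRH-1 site DILGAAETGSGKT and is an assumption. The sequence TOY is a short synthetic string assembled from the motif fragments quoted above; it is not SEQ ID NO:1, and the function and variable names are hypothetical.

```python
# Illustrative sketch only; patterns and names are assumptions, TOY is synthetic.
import re

MOTIFS = {
    "motif A (ATP/GTP binding)": r"D.{4}A.{4}GKT",
    "DEAD box": r"DEAD",
    "SAT (helicase)": r"SAT",
    "RNA binding-ATP hydrolysis": r"[HQ]R.GR..R",
}

# Synthetic test string built from the fragments quoted in the text.
TOY = "MK" + "DILGAAETGSGKT" + "LL" + "DEAD" + "GG" + "SAT" + "PP" + "HRSGRTAR" + "K"


def find_motifs(protein: str) -> dict:
    """Map each motif name to a list of (1-based start position, matched text)."""
    protein = protein.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping occurrences are also reported.
        hits[name] = [
            (m.start() + 1, m.group(1))
            for m in re.finditer(f"(?=({pattern}))", protein)
        ]
    return hits
```

Calling find_motifs(TOY) reports each of the four motifs once, with 1-based start positions comparable in form to the residue numbering used above (for example D.sub.471 for the "DEAD box").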
The invention also encompasses ADRH-1 variants. A preferred ADRH-1 variant
is one having at least 80%, and more preferably at least 90%, amino acid
sequence identity to the ADRH-1 amino acid sequence (SEQ ID NO:1) and
which retains at least one biological, immunological or other functional
characteristic or activity of ADRH-1. A most preferred ADRH-1 variant is
one having at least 95% amino acid sequence identity to SEQ ID NO:1.
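The identity thresholds above can be illustrated with a minimal Python sketch that scores two sequences that have already been aligned. The definition of percent identity used here (identical columns divided by all aligned columns, with columns that are gaps in both sequences ignored) is one common convention and is an assumption; the disclosure does not specify a formula. The function names are hypothetical.

```python
# Illustrative sketch only; the identity convention and names are assumptions.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Drop columns in which both sequences carry a gap.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def variant_tier(identity: float) -> str:
    """Classify an identity value against the 80%, 90%, and 95% thresholds."""
    if identity >= 95.0:
        return "most preferred"
    if identity >= 90.0:
        return "more preferred"
    if identity >= 80.0:
        return "preferred"
    return "below the preferred thresholds"
```

Sequence identity alone does not establish that a sequence is a variant as described above; retention of at least one biological, immunological, or other functional characteristic of ADRH-1 would have to be assessed separately.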
The invention also encompasses polynucleotides which encode ADRH-1.
Accordingly, any nucleic acid sequence which encodes the amino acid
sequence of ADRH-1 can be used to produce recombinant molecules which
express ADRH-1. In a particular embodiment, the invention encompasses the
polynucleotide comprising the nucleic acid sequence of SEQ ID NO:2 as
shown in FIGS. 1A, 1B, 1C, 1D, 1E, 1F, 1G and 1H.
It will be appreciated by those skilled in the art that as a result of the
degeneracy of the genetic code, a multitude of nucleotide sequences
encoding ADRH-1, some bearing minimal homology to the nucleotide sequences
of any known and naturally occurring gene, may be produced. Thus, the
invention contemplates each and every possible variation of nucleotide
sequence that could be made by selecting combinations based on possible
codon choices. These combinations are made in accordance with the standard
triplet genetic code as applied to the nucleotide sequence of naturally
occurring ADRH-1, and all such variations are to be considered as being
specifically disclosed.
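As a rough illustration of how quickly degeneracy multiplies the number of encoding sequences, the following sketch counts the distinct nucleotide sequences that encode a given peptide under the standard triplet code. The function name is hypothetical, and the example peptides are the short conserved helicase motifs mentioned in the background rather than any part of SEQ ID NO:1.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Number of synonymous codons available for each amino acid.
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences for the peptide (stop codon excluded)."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide)


print(count_encodings("DEAD"))  # 32  (2 * 2 * 4 * 2)
print(count_encodings("SAT"))   # 96  (6 * 4 * 4)
```

Because the count is a product over residues, it grows exponentially with peptide length, which is why the text speaks of a multitude of possible sequences for a full-length protein.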
Although nucleotide sequences which encode ADRH-1 and its variants are
preferably capable of hybridizing to the nucleotide sequence of the
naturally occurring ADRH-1 under appropriately selected conditions of
stringency, it may be advantageous to produce nucleotide sequences
encoding ADRH-1 or its derivatives possessing a substantially different
codon usage. Codons may be selected to increase the rate at which
expression of the peptide occurs in a particular prokaryotic or eukaryotic
host in accordance with the frequency with which particular codons are
utilized by the host. Other reasons for substantially altering the
nucleotide sequence encoding ADRH-1 and its derivatives without altering
the encoded amino acid sequences include the production of RNA transcripts
having more desirable properties, such as a greater half-life, than
transcripts produced from the naturally occurring sequence.
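A minimal sketch of host-directed codon selection follows. It assumes a table of codon-usage fractions for the intended host is available; the table shown here contains invented placeholder numbers for an unnamed host and covers only three amino acids, and the function names are hypothetical. The sketch simply picks the most frequent codon for each residue and confirms that the encoded amino acid sequence is unchanged.

```python
# Placeholder codon-usage fractions for an unnamed host. These numbers are
# invented for illustration and are not measurements from any organism.
HYPOTHETICAL_USAGE = {
    "D": {"GAT": 0.37, "GAC": 0.63},
    "E": {"GAA": 0.31, "GAG": 0.69},
    "A": {"GCT": 0.18, "GCC": 0.27, "GCA": 0.21, "GCG": 0.34},
}


def recode(peptide, usage):
    """Encode each residue with the host's most frequently used codon."""
    best = {aa: max(codons, key=codons.get) for aa, codons in usage.items()}
    return "".join(best[aa] for aa in peptide)


def translate(dna, usage):
    """Translate back using only the codons listed in the usage table."""
    codon_to_aa = {c: aa for aa, codons in usage.items() for c in codons}
    return "".join(codon_to_aa[dna[i:i + 3]] for i in range(0, len(dna), 3))


recoded = recode("DEAD", HYPOTHETICAL_USAGE)
print(recoded)  # GACGAGGCGGAC
# The nucleotide sequence changed, the encoded amino acid sequence did not.
assert translate(recoded, HYPOTHETICAL_USAGE) == "DEAD"
```

Always choosing the single most frequent codon is only one possible strategy; it is used here because it is the simplest reading of selecting codons according to host usage frequency.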
The invention also encompasses production of DNA sequences, or fragments
thereof, which encode ADRH-1 and its derivatives, entirely by synthetic
chemistry. After production, the synthetic sequence may be inserted into
any of the many available expression vectors and cell systems using
reagents that are well known in the art. Moreover, synthetic chemistry may
be used to introduce mutations into a sequence encoding ADRH-1 or any
fragment thereof.
Also encompassed by the invention are polynucleotide sequences that are
capable of hybridizing to the claimed nucleotide sequences, and in
particular, those shown in SEQ ID NO:2, under various conditions of
stringency as taught in Wahl, G. M. and S. L. Berger (1987; Methods
Enzymol. 152:399-407) and Kimmel, A. R. (1987; Methods Enzymol.
152:507-511).
Methods for DNA sequencing which are well known and generally available in
the art may be used to practice any of the embodiments of the
invention. The methods may employ such enzymes as the Klenow fragment of
DNA polymerase I, SEQUENASE (U.S. Biochemical Corp, Cleveland, Ohio), Taq
polymerase (Perkin Elmer), thermostable T7 polymerase (Amersham, Chicago,
Ill.), or combinations of polymerases and proofreading exonucleases such
as those found in the ELONGASE Amplification System marketed by Gibco/BRL
(Gaithersburg, Md.). Preferably, the process is automated with machines
such as the Hamilton Micro Lab 2200 (Hamilton, Reno, Nev.), Peltier
Thermal Cycler (PTC200; MJ Research, Watertown, Mass.) and the ABI
Catalyst and 373 and 377 DNA Sequencers (Perkin Elmer).
The nucleic acid sequences encoding ADRH-1 may be extended utilizing a
partial nucleotide sequence and employing various methods known in the art
to detect upstream sequences such as promoters and regulatory elements.
For example, one method which may be employed, "restriction-site" PCR,
uses universal primers to retrieve unknown sequence adjacent to a known
locus (Sarkar, G. (1993) PCR Methods Applic. 2:318-322). In particular,
genomic DNA is first amplified in the presence of a primer to a linker
sequence and a primer specific to the known region. The amplified
sequences are then subjected to a second round of PCR with the same linker
primer and another specific primer internal to the first one. Products of
each round of PCR are transcribed with an appropriate RNA polymerase and
sequenced using reverse transcriptase.
Inverse PCR may also be used to amplify or extend sequences using divergent
primers based on a known region (Triglia, T. et al. (1988) Nucleic Acids
Res. 16:8186). The primers may be designed using commercially available
software such as OLIGO 4.06 Primer Analysis software (National Biosciences
Inc., Plymouth, Minn.), or another appropriate program, to be 22-30
nucleotides in length, to have a GC content of 50% or more, and to anneal
to the target sequence at temperatures of about 68° to 72° C. The
method uses several restriction enzymes to generate a suitable fragment in
the known region of a gene. The fragment is then circularized by
intramolecular ligation and used as a PCR template.
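The length and GC-content criteria stated above for primer design can be checked with a short sketch. The function names and the example primer sequences are hypothetical and are not derived from SEQ ID NO:2. The annealing temperature criterion is deliberately not modeled, since the text does not specify how the cited software estimates it.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    if not primer:
        raise ValueError("primer must not be empty")
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def meets_design_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 0.50) -> bool:
    """True if the primer is 22-30 nucleotides long with GC content >= 50%."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


# Hypothetical candidate primers.
print(meets_design_criteria("GCGATCGCCTGAGGCTAGCCATGC"))  # True  (24 nt, 16 G/C)
print(meets_design_criteria("ATATATATATATATATATATATAT"))  # False (24 nt, no G/C)
print(meets_design_criteria("GCGCGCGC"))                  # False (too short)
```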
Another method which may be used is capture PCR which involves PCR
amplification of DNA fragments adjacent to a known sequence in human and
yeast artificial chromosome DNA (Lagerstrom, M. et al. (1991) PCR Methods
Applic. 1:111-119). In this method, multiple restriction enzyme digestions
and ligations may also be used to place an engineered double-stranded
sequence into an unknown fragment of the DNA molecule before performing
PCR.
Another method which may be used to retrieve unknown sequences is that of
Parker, J. D. et al. (1991; Nucleic Acids Res. 19:3055-3060).
Additionally, one may use PCR, nested primers, and PROMOTERFINDER
libraries to walk genomic DNA (Clontech, Palo Alto, Calif.). This process
avoids the need to screen libraries and is useful in finding intron/exon
junctions.
When screening for full-length cDNAs, it is preferable to use libraries
that have been size-selected to include larger cDNAs. Also, random-primed
libraries are preferable, in that they will contain more sequences which
contain the 5' regions of genes. Use of a randomly primed library may be
especially preferable for situations in which an oligo d(T) library does
not yield a full-length cDNA. Genomic libraries may be useful for
extension of sequence into 5' non-transcribed regulatory regions.
Capillary electrophoresis systems which are commercially available may be
used to analyze the size or confirm the nucleotide sequence of sequencing
or PCR products. In particular, capillary sequencing may employ flowable
polymers for electrophoretic separation, four different fluorescent dyes
(one for each nucleotide) which are laser activated, and detection of the
emitted wavelengths by a charge coupled device camera. Output/light
intensity may be converted to an electrical signal using appropriate software
(e.g. GENOTYPER and SEQUENCE NAVIGATOR, Perkin Elmer) and the entire
process from loading of samples to computer analysis and electronic data
display may be computer controlled. Capillary electrophoresis is
especially preferable for the sequencing of small pieces of DNA which
might be present in limited amounts in a particular sample.
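The four-dye detection scheme can be pictured with a highly simplified sketch in which each detected peak is assigned the nucleotide whose dye channel gives the strongest signal. This is not the algorithm used by the commercial software named above; the channel order, the ambiguity threshold, the function name, and the intensity values are all assumptions made for illustration.

```python
# Assumed channel order: one fluorescent dye per nucleotide.
CHANNELS = ("A", "C", "G", "T")


def call_bases(peaks, min_ratio=2.0):
    """Call one base per peak from four channel intensities.

    A peak is called only if its strongest channel is at least `min_ratio`
    times the second strongest; otherwise it is reported as "N" (ambiguous).
    """
    calls = []
    for intensities in peaks:
        ranked = sorted(zip(intensities, CHANNELS), reverse=True)
        (top, base), (second, _) = ranked[0], ranked[1]
        calls.append(base if top > 0 and top >= min_ratio * second else "N")
    return "".join(calls)


# Hypothetical intensities (A, C, G, T) for four successive peaks.
peaks = [(900, 40, 30, 20), (10, 15, 850, 25), (300, 280, 20, 10), (5, 10, 20, 700)]
print(call_bases(peaks))  # AGNT
```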
In another embodiment of the invention, polynucleotide sequences or
fragments thereof which encode ADRH-1 may be used in recombinant DNA
molecules to direct expression of ADRH-1, fragments or functional
equivalents thereof, in appropriate host cells. Due to the inherent
degeneracy of the genetic code, other DNA sequences which encode
substantially the same or a functionally equivalent amino acid sequence
may be produced, and these sequences may be used to clone and express
ADRH-1.
As will be understood by those of skill in the art, it may be advantageous
to produce ADRH-1-encoding nucleotide sequences possessing non-naturally
occurring codons. For example, codons preferred by a particular
prokaryotic or eukaryotic host can be selected to increase the rate of
protein expression or to produce an RNA transcript having desirable
properties, such as a half-life which is longer than that of a transcript
generated from the naturally occurring sequence.
The nucleotide sequences of the present invention can be engineered using
methods generally known in the art in order to alter ADRH-1 encoding
sequences for a variety of reasons, including but not limited to,
alterations which modify the cloning, processing, and/or expression of the
gene product. DNA shuffling by random fragmentation and PCR reassembly of
gene fragments and synthetic oligonucleotides may be used to engineer the
nucleotide sequences. For example, site-directed mutagenesis may be used
to insert new restriction sites, alter glycosylation patterns, change
codon preference, produce splice variants, introduce mutations, and so
forth.
In another embodiment of the invention, natural, modified, or recombinant
nucleic acid sequences encoding ADRH-1 may be ligated to a heterologous
sequence to encode a fusion protein. For example, to screen peptide
libraries for inhibitors of ADRH-1 activity, it may be useful to encode a
chimeric ADRH-1 protein that can be recognized by a commercially available
antibody. A fusion protein may also be engineered to contain a cleavage
site located between the ADRH-1 encoding sequence and the heterologous
protein sequence, so that ADRH-1 may be cleaved and purified away from the
heterologous moiety.
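The arrangement of a heterologous moiety, a cleavage site, and the protein of interest can be sketched at the level of amino acid strings. The particular tag (a FLAG-style epitope, DYKDDDDK) and cleavage site (a Factor Xa-style site, IEGR, cut after the arginine) are illustrative choices not named in this specification, the payload is a placeholder peptide rather than ADRH-1, and the function names are hypothetical.

```python
TAG = "DYKDDDDK"         # assumed epitope tag recognized by a commercial antibody
SITE = "IEGR"            # assumed protease site; cleavage occurs after the last residue
PAYLOAD = "MKTAYIAKQR"   # placeholder peptide standing in for the protein of interest


def build_fusion(tag: str, site: str, payload: str) -> str:
    """Heterologous moiety, then cleavage site, then the protein of interest."""
    return tag + site + payload


def cleave(fusion: str, site: str):
    """Cut once, immediately after the first occurrence of the cleavage site."""
    index = fusion.find(site)
    if index < 0:
        raise ValueError("cleavage site not found")
    cut = index + len(site)
    return fusion[:cut], fusion[cut:]


fusion = build_fusion(TAG, SITE, PAYLOAD)
removed, released = cleave(fusion, SITE)
print(released)  # MKTAYIAKQR
```

In practice the site would be chosen so that it does not also occur within the protein of interest, since an internal occurrence would be cut as well; the sketch does not check for this.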
In another embodiment, sequences encoding ADRH-1 may be synthesized, in
whole or in part, using chemical methods well known in the art (see
Caruthers, M. H. et al. (1980) Nucl. Acids Res. Symp. Ser. 215-223; Horn,
T. et al. (1980) Nucl. Acids Res. Symp. Ser. 225-232). Alternatively, the
protein itself may be produced using chemical methods to synthesize the
amino acid sequence of ADRH-1, or a fragment thereof. For example, peptide
synthesis can be performed using various solid-phase techniques (Roberge,
J. Y. et al. (1995) Science 269:202-204) and automated synthesis may be
achieved, for example, using the ABI 431A Peptide Synthesizer (Perkin
Elmer).
The newly synthesized peptide may be substantially purified by preparative
high performance liquid chromatography (e.g., Creighton, T. (1983)
Proteins, Structures and Molecular Principles, WH Freeman and Co., New
York, N.Y.). The composition of the synthetic peptides may be confirmed by
amino acid analysis or sequencing (e.g., the Edman degradation procedure;
Creighton, supra). Additionally, the amino acid sequence of ADRH-1, or any
part thereof, may be altered during direct synthesis and/or combined using
chemical methods with sequences from other proteins, or any part thereof,
to produce a variant polypeptide.
In order to express a biologically active ADRH-1, the nucleotide sequences
encoding ADRH-1 or functional equivalents may be inserted into an
appropriate expression vector, i.e., a vector which contains the necessary
elements for the transcription and translation of the inserted coding
sequence.
Methods which are well known to those skilled in the art may be used to
construct expression vectors containing sequences encoding ADRH-1 and
appropriate transcriptional and translational control elements. These
methods include in vitro recombinant DNA techniques, synthetic techniques,
and in vivo genetic recombination. Such techniques are described in
Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory Manual, Cold
Spring Harbor Press, Plainview, N.Y., and Ausubel, F. M. et al. (1989)
Current Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y.
A variety of expression vector/host systems may be utilized to contain and
express sequences encoding ADRH-1. These include, but are not limited to,
microorganisms such as bacteria transformed with recombinant
bacteriophage, plasmid, or cosmid DNA expression vectors; yeast
transformed with yeast expression vectors; insect cell systems infected
with virus expression vectors (e.g., baculovirus); plant cell systems
transformed with virus expression vectors (e.g., cauliflower mosaic virus,
CaMV; tobacco mosaic virus, TMV) or with bacterial expression vectors
(e.g., Ti or pBR322 plasmids); or animal cell systems. The invention is
not limited by the host cell employed.
The "control elements" or "regulatory sequences" are those non-translated
regions of the vector--enhancers, promoters, 5' and 3' untranslated
regions--which interact with host cellular proteins to carry out
transcription and translation. Such elements may vary in their strength
and specificity. Depending on the vector system and host utilized, any
number of suitable transcription and translation elements, including
constitutive and inducible promoters, may be used. For example, when
cloning in bacterial systems, inducible promoters such as the hybrid lacZ
promoter of the BLUESCRIPT phagemid (Stratagene, La Jolla, Calif.) or
PSPORT1 plasmid (Gibco BRL) and the like may be used. The baculovirus
polyhedrin promoter may be used in insect cells. Promoters or enhancers
derived from the genomes of plant cells (e.g., heat shock, RUBISCO, and
storage protein genes) or from plant viruses (e.g., viral promoters or
leader sequences) may be cloned into the vector. In mammalian cell
systems, promoters from mammalian genes or from mammalian v