| United States Patent | 5708806 |
| Link to this page | http://www.wikipatents.com/5708806.html |
| Inventor(s) | DeRose; Steven (East Providence, RI);
Vogel; Jeffrey (Providence, RI) |
| Abstract | A data processing system and method for generating a representation of an
electronic document, for indexing the electronic document, for navigating
the electronic document using its representation and for displaying the
electronic document on an output device. The system and method are used
with electronic documents having descriptive markup which describes the
content or meaning of the document rather than its appearance. Such
documents may be represented by a tree. Each markup element defines a node
or element in a tree. The tree is represented by providing a unique
identifier for each element and for accessing a descriptor of the element.
An element descriptor preferably includes indications of the parent, first
child, last child, left sibling, right sibling, type name and text
location for the element. The document representation is used to
facilitate navigation of the text for constructing navigational aids such
as a table of contents and full text indexing. A document is also provided
with a style sheet for specifying desired formatting characteristics for
each type of element in the document. To display the document, a suitable
starting point is found on the basis of a selected starting point. The
document is displayed beginning with the suitable starting point and the
format characteristics for each element displayed are retrieved from the
style sheet and applied to the text of the displayed element. |
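The element descriptor described in the abstract can be pictured with a minimal Python sketch. It is illustrative only and is not the patent's implementation: the names `ElementDescriptor`, `add_element`, `children` and `ancestors` are hypothetical, the directory is assumed to be a plain list indexed by element identifier, and the text location is assumed to be an (offset, length) pair.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class ElementDescriptor:
    """One record per element, addressed by its element identifier."""
    type_name: str
    parent: Optional[int] = None
    first_child: Optional[int] = None
    last_child: Optional[int] = None
    left_sibling: Optional[int] = None
    right_sibling: Optional[int] = None
    text_location: Optional[Tuple[int, int]] = None  # assumed (offset, length)


def add_element(directory: List[ElementDescriptor], type_name: str,
                parent: Optional[int] = None) -> int:
    """Append a descriptor and link it to its parent and left sibling."""
    eid = len(directory)  # identifier = position in the directory
    desc = ElementDescriptor(type_name=type_name, parent=parent)
    directory.append(desc)
    if parent is not None:
        p = directory[parent]
        if p.first_child is None:
            p.first_child = eid
        else:
            desc.left_sibling = p.last_child
            directory[p.last_child].right_sibling = eid
        p.last_child = eid
    return eid


def children(directory: List[ElementDescriptor], eid: int) -> Iterator[int]:
    """Yield child identifiers from first child through right siblings."""
    child = directory[eid].first_child
    while child is not None:
        yield child
        child = directory[child].right_sibling


def ancestors(directory: List[ElementDescriptor], eid: int) -> Iterator[int]:
    """Yield identifiers from the parent up to the root."""
    parent = directory[eid].parent
    while parent is not None:
        yield parent
        parent = directory[parent].parent


directory: List[ElementDescriptor] = []
book = add_element(directory, "BOOK")
ch1 = add_element(directory, "CHAPTER", book)
ch2 = add_element(directory, "CHAPTER", book)
title = add_element(directory, "TITLE", ch2)
```

With this layout, a navigational aid such as a table of contents could be assembled by walking `children` from the root and keeping the elements whose type name is of interest.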
Title Information  |
Data processing system and method for generating a representation for
and for representing electronically published structured documents |
| Publication Date |
January 13, 1998 |
| Parent Case |
This application is a division of application Ser. No. 08/419,051 filed
Apr. 7, 1995, pending, which is a file-wrapper continuation of application
Ser. No. 07/733,204, filed Jul. 19, 1991, entitled DATA PROCESSING SYSTEM
AND METHOD FOR REPRESENTING, GENERATING A REPRESENTATION OF AND RANDOM
ACCESS RENDERING OF ELECTRONIC DOCUMENTS, abandoned. |
References  |
U.S. References
| Patent No. | Inventor | Class | Date |
| 5644776 | DeRose | | Jul 1997 |
| 5557720 | Brown, Jr. | | Sep 1996 |
| 5553284 | Barbara | 707/4 | Sep 1996 |
| 5428529 | Hartrick | 715/513 | Jun 1995 |
| 5367621 | Cohen | 715/501.1 | Nov 1994 |
| 5293473 | Hesse | 715/529 | Mar 1994 |
| 5285526 | Bennett, III | 715/516 | Feb 1994 |
| 5276793 | Borgendale | 715/513 | Jan 1994 |
| 5241671 | Reed | 707/104.1 | Aug 1993 |
| 5202977 | Pasetes, Jr. | 703/27 | Apr 1993 |
| 5185698 | Hesse | 715/531 | Feb 1993 |
| 5146552 | Cassorla | 715/512 | Sep 1992 |
| 5142615 | Levesque | 345/595 | Aug 1992 |
| 5140676 | Langelaan | 715/515 | Aug 1992 |
| 5113341 | Kozol | 715/531 | May 1992 |
| 5108206 | Yoshida | 400/61 | Apr 1992 |
| 5089956 | MacPhail | | Feb 1992 |
| 5079700 | Kozoll | | Jan 1992 |
| 5068809 | Verhelst | | Nov 1991 |
| 5008853 | Bly | | Apr 1991 |
| 5001654 | Winiger | 715/529 | Mar 1991 |
| 4992972 | Brooks | | Feb 1991 |
| 4876665 | Iwai | 707/200 | Oct 1989 |
| 4823303 | Terasawa | 715/515 | Apr 1989 |
| 4803643 | Hickey | 715/513 | Feb 1989 |
| 4716404 | Tabata | 345/625 | Dec 1987 |
| 4710885 | Litteken | 715/513 | Dec 1987 |
| 4608664 | Bartlett | 358/1.2 | Aug 1986 |
| 4594674 | Boulia | 345/471 | Jun 1986 |
| 4587633 | Wang | 709/234 | May 1986 |
| 4539653 | Bartlett | 715/520 | Sep 1985 |
Claims  |
What is claimed is:
1. A data processing system for storing a representation of an electronic
document, including first means for storing the electronic document, the
electronic document having descriptive markup defining a plurality of
hierarchical elements, wherein each element except a root element has an
ancestor element and wherein each element has a type name and wherein an
element may have a child element, a left sibling element, a right sibling
element, and wherein at least one element contains text content, the data
processing system comprising:
second means for storing a value indicative of a parent element for each
element having a parent element;
third means for storing a value indicative of a first child element for
each element having a child element; and
fourth means for storing a value indicative of a left sibling element, for
each element having a left sibling element.
2. The data processing system as set forth in claim 1, further comprising:
fifth means for storing the text content of the electronic document;
for each element containing text content, sixth means for storing a value
indicative of the location of the text content in the fifth means for
storing the text content; and
seventh means for storing, for each element, the type name of the markup
element.
3. The data processing system as set forth in claim 2, further comprising:
means for parsing the electronic document to provide a sequence of element
events in response to detection of elements and text events in response to
detection of text content;
means for assigning a unique element identifier to each element event and
to each text event;
means for constructing the type name of each element from the descriptive
markup defining the element;
means for generating an element descriptor for each element event and for
each text event, addressable by the element identifier assigned to the
event, wherein the element descriptor for an element event includes a
combination of the second through seventh means for storing for the
element.
4. The data processing system as set forth in claim 3, wherein the
indication of the parent, child and sibling elements in each of the means
for storing is the element identifier of the element event provided in
response to detection of the parent, child and sibling elements.
5. The data processing system as set forth in claim 4, wherein the means
for assigning element identifiers assigns numbers in a sequence for the
sequence of events provided by the means for parsing.
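The parsing and numbering steps of claims 3 to 5 might be sketched as follows. This is a hedged illustration, not the patented system: it uses Python's standard `xml.parsers.expat` on a small XML string as a stand-in for an SGML parser, and the function name `build_directory`, the dictionary keys, the `#TEXT` type name for text events, the (offset, length) text location and the choice to skip whitespace-only text are all assumptions made for the example.

```python
import xml.parsers.expat


def build_directory(source):
    """Parse `source`; return (directory, text_store).

    Each element event and each text event gets the next sequential
    identifier, which is also its index in `directory`.
    """
    directory = []    # element descriptors, indexed by identifier
    text_chunks = []  # text content, stored apart from the markup
    stack = []        # identifiers of the currently open elements
    offset = 0        # running position in the text store

    def new_descriptor(type_name, text_location=None):
        eid = len(directory)
        parent = stack[-1] if stack else None
        desc = {"type": type_name, "parent": parent,
                "first_child": None, "last_child": None,
                "left_sibling": None, "right_sibling": None,
                "text": text_location}
        if parent is not None:
            p = directory[parent]
            if p["first_child"] is None:
                p["first_child"] = eid
            else:
                desc["left_sibling"] = p["last_child"]
                directory[p["last_child"]]["right_sibling"] = eid
            p["last_child"] = eid
        directory.append(desc)
        return eid

    def on_start(name, attrs):
        stack.append(new_descriptor(name))

    def on_end(name):
        stack.pop()

    def on_text(data):
        nonlocal offset
        if not data.strip():  # assumption: ignore whitespace-only text
            return
        new_descriptor("#TEXT", (offset, len(data)))
        text_chunks.append(data)
        offset += len(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver each run of text in one call
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(source, True)
    return directory, "".join(text_chunks)


DOC = "<BOOK><TITLE>Trees</TITLE><P>Each element is a node.</P></BOOK>"
directory, text_store = build_directory(DOC)
```

Because identifiers follow the order of parser events, the descriptor for any element can be fetched by indexing `directory` directly with its identifier.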
6. The data processing system of claim 1, further comprising:
means for parsing the electronic document to provide a sequence of element
events in response to detection of elements and text events in response to
detection of text content;
means for assigning a unique element identifier to each element event and
to each text event;
means for constructing the type name of each element from the descriptive
markup defining the element;
means for generating an element descriptor for each element event and for
each text event, addressable by the element identifier assigned to the
event, wherein the element descriptor for an element event includes a
combination of the second through fourth means for storing for the
element.
7. The data processing system of claim 1, further comprising:
means for accessing the second through fourth means for storing for an
element using a unique identifier for the element.
8. The data processing system of claim 1, further comprising:
means for storing an index of at least the text content of the electronic
document including a record for each word in the text content, addressable
by an indication of the word, and for storing an indication of the element
containing the text content containing the word, and for storing an
indication of the number of occurrences in the element of the word.
9. The data processing system of claim 8, wherein the descriptive markup
for an element may include an attribute, and wherein the index further
includes a record for each word in the attribute.
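The index of claim 8 can be illustrated with a short sketch. It assumes, purely for the example, that the text of each text-bearing element is already available in a dictionary keyed by element identifier, that words are lowercase runs of letters and digits, and that a record is a list of (element identifier, occurrence count) pairs; `build_word_index` is a hypothetical name, and attribute indexing (claim 9) is not shown.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def build_word_index(element_text: Dict[int, str]) -> Dict[str, List[Tuple[int, int]]]:
    """Map each word to (element identifier, occurrences in that element)."""
    index = defaultdict(list)
    for eid in sorted(element_text):
        # Assumed tokenization: lowercase alphanumeric runs.
        words = re.findall(r"[a-z0-9]+", element_text[eid].lower())
        for word, count in sorted(Counter(words).items()):
            index[word].append((eid, count))
    return dict(index)


element_text = {
    2: "Trees",
    4: "A tree is a set of nodes; each tree has a root.",
}
index = build_word_index(element_text)
```

Looking up `index["tree"]` then names the containing element and how often the word occurs in it, which is the information a full text search needs to rank or locate hits by element.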
10. The data processing system of claim 1, further comprising:
for selected elements, a record for storing the element identifier of the
element and the element identifier of a next element in the electronic
document which is not contained within the element.
11. The data processing system of claim 10, wherein the record for an
element includes an indication of the type name of the element.
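One way to read the record of claims 10 and 11 is as a table that lets a reader skip over an element's entire contents. The sketch below is an assumption-laden illustration: the directory is a hand-built list of dictionaries, `SkipRecord` and `next_outside` are hypothetical names, and the "next element not contained within the element" is computed as the right sibling of the element or of its nearest ancestor that has one.

```python
from typing import Dict, List, NamedTuple, Optional


class SkipRecord(NamedTuple):
    element_id: int
    next_outside: Optional[int]  # None when nothing follows the element
    type_name: str


def next_outside(directory: List[Dict], eid: int) -> Optional[int]:
    """Return the first following element that is not inside `eid`."""
    current: Optional[int] = eid
    while current is not None:
        sibling = directory[current]["right_sibling"]
        if sibling is not None:
            return sibling
        current = directory[current]["parent"]
    return None


# Hand-built directory: BOOK(0) > CHAPTER(1) > TITLE(2), P(3); CHAPTER(4) > TITLE(5)
directory = [
    {"type": "BOOK", "parent": None, "right_sibling": None},
    {"type": "CHAPTER", "parent": 0, "right_sibling": 4},
    {"type": "TITLE", "parent": 1, "right_sibling": 3},
    {"type": "P", "parent": 1, "right_sibling": None},
    {"type": "CHAPTER", "parent": 0, "right_sibling": None},
    {"type": "TITLE", "parent": 4, "right_sibling": None},
]

# Build records only for selected elements, here the chapters.
records = [
    SkipRecord(eid, next_outside(directory, eid), desc["type"])
    for eid, desc in enumerate(directory)
    if desc["type"] == "CHAPTER"
]
```

If identifiers are assigned in document order, the elements contained in a recorded element are those whose identifiers lie strictly between its own identifier and its `next_outside` value.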
12. The data processing system of claim 1,
wherein the descriptive markup for an element may include an attribute; and
wherein the element descriptor of the element stores an indication of the
attribute as the indication of any text content contained within the
element including the attribute.
13. A method for generating a representation of an electronic document, the
electronic document having descriptive markup defining a plurality of
hierarchical elements, wherein each element except a root element has an
ancestor element and wherein each element has a type name and may have a
child element, a left sibling element, a right sibling element, and
wherein at least one element contains text content, the method comprising
the steps of:
parsing the document to provide a sequence of element events in response to
detection of elements and text events in response to detection of text
content;
assigning, in response to an element event, a unique element identifier to
each element event;
constructing the type name for each element from the descriptive markup
defining the element; and
generating an element descriptor for each element, addressable by the
unique element identifier assigned to the element, wherein the element
descriptor stores an indication of any parent element, any first child
element, and any left sibling element for the element.
14. The method as set forth in claim 13, wherein the element descriptor for
an element includes an indication of the location of the text content, for
each element containing text content, and, for each element, the type name
of the element.
15. The method as set forth in claim 14, wherein the indication of the
parent, child and left sibling elements in each element descriptor is the
element identifier of the parent, child and left sibling elements.
16. The method as set forth in claim 15, wherein the assigned unique
element identifiers are sequential numbers assigned according to the
sequence of events provided by the step of parsing.
17. The method of claim 13, further comprising the step of:
accessing the element descriptor of an element using the element identifier
assigned to the element.
18. The method of claim 13, further comprising the steps of:
indexing at least the text content of the document to provide a sequence of
word events in response to detection of a word in the text content;
constructing, in response to a word event, a record, addressable by an
indication of the detected word, for storing an indication of the element
containing the text content containing the detected word, and for storing
an indication of the number of occurrences of the detected word in the
element.
19. The method of claim 18, wherein the descriptive markup for an element
may include an attribute, and wherein the step of indexing at least
the text content further includes indexing the attribute.
20. The method of claim 13, further comprising the step of,
for each element, storing in a record the element identifier of the element
and the element identifier of any next element in the electronic document
which is not contained within the element.
21. The method of claim 20, wherein the record for an element includes an
indication of the type name of the element.
22. The method of claim 13,
wherein the descriptive markup for an element may include an attribute; and
wherein the step of constructing an element descriptor includes storing an
indication of the attribute as the indication of any text content of the
element containing the attribute.
23. A data processing system for constructing a representation of an
electronic document, the electronic document having descriptive markup
defining a plurality of hierarchical elements, wherein each element except
a root element has an ancestor element and wherein each element has a type
name and may have a child element, a left sibling element and a right
sibling element and wherein at least one element contains text content,
the data processing system comprising:
a parser for providing a sequence of element events in response to
detection of elements and text events in response to text content;
means, responsive to an element event, for assigning a unique element
identifier to the element event;
means, responsive to an element event, for constructing the type name for
the element event from the descriptive markup defining the element event;
and
means, responsive to an element event, for constructing an element
descriptor, addressable by the element identifier assigned to the element
event, wherein the element descriptor stores an indication of the type
name and any parent element, first child element, last child element, left
sibling element, right sibling element, and text content contained within
the element.
24. The data processing system of claim 23, further comprising:
means for storing the text content of the electronic document separate from
the electronic document and without descriptive markup, and
wherein the indication of the text content contained within an element
stored in the element descriptor of the element is indicative of the
location of the text content in the separate means for storing.
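The separate text storage of claim 24 can be sketched as a single string holding the text without markup, with each text-bearing descriptor pointing into it. The layout below is an assumption for illustration: the (offset, length) location, the `#TEXT` type name, the dictionary keys and the helper `element_text` are hypothetical and not taken from the patent.

```python
from typing import Dict, List


def element_text(directory: List[Dict], text_store: str, eid: int) -> str:
    """Return the text contained in element `eid`, including descendants."""
    desc = directory[eid]
    if desc["text"] is not None:
        offset, length = desc["text"]
        return text_store[offset:offset + length]
    parts = []
    child = desc["first_child"]
    while child is not None:
        parts.append(element_text(directory, text_store, child))
        child = directory[child]["right_sibling"]
    return "".join(parts)


# Text stored apart from the markup; descriptors keep only locations.
text_store = "TreesEach element is a node."
directory = [
    {"type": "BOOK", "first_child": 1, "right_sibling": None, "text": None},
    {"type": "TITLE", "first_child": 2, "right_sibling": 3, "text": None},
    {"type": "#TEXT", "first_child": None, "right_sibling": None, "text": (0, 5)},
    {"type": "P", "first_child": 4, "right_sibling": None, "text": None},
    {"type": "#TEXT", "first_child": None, "right_sibling": None, "text": (5, 23)},
]
```

Keeping the text apart from the descriptors means the structure can be navigated without touching the text, and the text for an element is read only when it is about to be displayed.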
25. The data processing system of claim 23, wherein the indications of any
parent, child or sibling elements in the element descriptors are
indicative of the element identifiers of the parent, child or sibling
elements.
26. The data processing system of claim 23, wherein the means for assigning
element identifiers assigns numbers according to a sequence matching the
sequence of events provided by the parser.
27. The data processing system of claim 23, further comprising:
means for indexing at least the text content of the electronic document
including means for providing a sequence of word events in response to
detection of each word in the text content;
means, responsive to a word event, for constructing a record, addressable
by an indication of the detected word, and for storing an indication of
the element containing the text content containing the detected word, and
for storing an indication of a number of occurrences in the element of the
detected word.
28. The data processing system of claim 27, wherein the descriptive markup
for an element may include an attribute, and wherein the means for
indexing at least the text content further indexes the attribute.
29. The data processing system of claim 23, further comprising:
for selected elements, a record for storing the element identifier of the
element and the element identifier of a next element in the electronic
document which is not contained within the element.
30. The data processing system of claim 29 wherein the record for an
element includes an indication of the type name of the element.
31. The data processing system of claim 23,
wherein the descriptive markup for an element may include an attribute; and
wherein the means for constructing an element descriptor stores an
indication of the attribute as the indication of any text content
contained within the element including the attribute.
32. The data processing system of claim 23, further comprising:
means for accessing the element descriptor of an element using the element
identifier assigned to the element.
33. A method for constructing a representation of an electronic document,
the electronic document having descriptive markup defining a plurality of
hierarchical elements, wherein each element except a root element has an
ancestor element and wherein each element has a type name and may have a
child element, a left sibling element and a right sibling element, and
wherein at least one element contains text content, the method comprising
the steps of:
parsing the electronic document to provide a sequence of element events in
response to detection of elements and text events in response to detection
of text content;
assigning a unique element identifier to each element;
constructing the type name for each element from the descriptive markup
defining the element; and
constructing, for each element, an element descriptor addressable by the
element identifier assigned to the element, wherein the element descriptor
stores an indication of the type name and any parent element, first child
element, last child element, left sibling element, right sibling element,
and text content contained within the element.
34. The method of claim 33 further comprising the step of storing the text
content of the electronic document separate from the electronic document
and without descriptive markup, and wherein the indication of the text
content contained within an element stored in the element descriptor of
the element is indicative of the location of the text content contained
within the element in the separately stored text content.
35. The method of claim 33 wherein the indications of any parent, child or
sibling elements in the element descriptors are indicative of the element
identifiers of the parent, child or sibling elements.
36. The method of claim 33 wherein the step of assigning element
identifiers includes assigning sequential numbers according to the
sequence of events provided by the step of parsing.
37. The method of claim 33 further comprising the steps of:
indexing at least the text content of the document to provide a sequence of
word events in response to detection of a word in the text content;
constructing, in response to a word event, a record, addressable by an
indication of the detected word, for storing an indication of the element
containing the text content containing the detected word, and for storing
an indication of the number of occurrences of the detected word in the
element.
38. The method of claim 37 wherein the descriptive markup for an element
may include an attribute, and wherein the step of indexing at least
the text content further includes indexing the attribute.
39. The method of claim 33 further comprising the step of,
for each element, storing in a record the element identifier of the element
and the element identifier of any next element in the electronic document
which is not contained within the element.
40. The method of claim 39 wherein the record for an element includes an
indication of the type name of the element.
41. The method of claim 33,
wherein the descriptive markup for an element may include an attribute; and
wherein the step of constructing an element descriptor includes storing an
indication of the attribute as the indication of any text content of the
element containing the attribute.
42. The method of claim 33, further comprising the step of:
accessing the element descriptor of an element using the element identifier
assigned to the element.
43. A digital information product containing an electronic document,
wherein the electronic document has descriptive markup defining a
plurality of hierarchical elements, wherein each element except a root
element has an ancestor element and wherein each element has a type name
and wherein an element may have a child element, a left sibling element, a
right sibling element, and wherein at least one element contains text
content, the digital information product comprising a computer-readable medium
on which computer-readable signals are stored, wherein the
computer-readable signals define an element directory containing an
element descriptor for each element wherein each element descriptor is
accessible using an identifier of the element descriptor for an element
and includes a value indicative of a parent element for each element
having a parent element, a value indicative of a first child element for
each element having a child element, and a value indicative of a left
sibling element for each element having a left sibling element.
44. The digital information product of claim 43,
wherein the element descriptor of each element further includes a value
indicative of any last child element for the element.
45. The digital information product of claim 44,
wherein the element descriptor of each element further includes a value
indicative of any right sibling element for the element.
46. The digital information product of claim 43,
wherein the element descriptor of each element further includes a value
indicative of any right sibling element for the element.
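An element directory stored on a computer-readable medium, as in claims 43 to 46, could take many forms. The sketch below assumes one possibility that the claims do not prescribe: fixed-size binary records of five 32-bit integers (parent, first child, last child, left sibling, right sibling), with -1 standing for "none", so that a descriptor can be reached by seeking to identifier times record size. `RECORD`, `write_directory` and `read_descriptor` are hypothetical names.

```python
import io
import struct

# Assumed layout: parent, first_child, last_child, left_sibling, right_sibling
RECORD = struct.Struct("<5i")
NONE = -1  # assumed sentinel for a missing relative


def write_directory(stream, descriptors):
    """Write each descriptor as one fixed-size record, in identifier order."""
    for desc in descriptors:
        stream.write(RECORD.pack(*(NONE if v is None else v for v in desc)))


def read_descriptor(stream, eid):
    """Seek straight to the record for `eid` and decode it."""
    stream.seek(eid * RECORD.size)
    fields = RECORD.unpack(stream.read(RECORD.size))
    return tuple(None if v == NONE else v for v in fields)


# Root (0) with two children (1 and 2).
descriptors = [
    (None, 1, 2, None, None),
    (0, None, None, None, 2),
    (0, None, None, 1, None),
]
medium = io.BytesIO()  # stands in for a file on the storage medium
write_directory(medium, descriptors)
second_child = read_descriptor(medium, 2)
```

Fixed-size records trade some space for constant-time access by identifier, which suits random access into a large document; a variable-length format would need a separate offset table.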
47. The digital information product of claim 43,
wherein the element descriptor of each element further includes a value
indicative of any text contained within the element.
48. The digital information product of claim 47, wherein the text content
of the electronic document is defined by the computer-readable signals
separately from the element directory and wherein for each element the
value indicative of the text content contained within the element is a
value indicative of the location of the text content contained within the
element in the text content stored in the computer-readable medium.
49. The digital information product of claim 43,
wherein the element descriptor of each element further includes a value
indicative of the type name for the element.
50. The digital information product of claim 43, wherein the
computer-readable signals further define an index of the text content of
the electronic document including a record for each word in the text
content, addressable by an indication of the word, wherein the record
includes an indication of each eleme