| United States Patent | 5737599 |
| Link to this page | http://www.wikipatents.com/5737599.html |
| Inventor(s) | Rowe; Edward R. (701 W. 32nd St. #14, Los Angeles, CA 90007);
Priyadarshan; Eswar (1054 Heatherston Ave., Sunnyvale, CA 94087);
Anderson; Kenneth S. (4133 Amaranta, Palo Alto, CA 94306);
Al-Shamma; Nabeel A. (1525 Gretel La., Mountain View, CA 94040);
Taft; Edward A. (1147 Sladky Ave., Mountain View, CA 94040);
McQuarrie; Elizabeth M. (15882 Ravine Rd., Los Gatos, CA 95030);
Cohn; Richard J. (575 N. California Ave., Palo Alto, CA 94301) |
| Abstract | A method and apparatus for providing an optimized page-based electronic
document file and downloading the optimized file. An optimized document
file is created from a non-optimized electronic document. Page contents
are contiguously written in the optimized file and a page offset table is
provided in the optimized file that includes page offset information used
to locate individual pages and objects of the document. Shared objects,
such as fonts, are included in the file after the page contents. When
downloading the optimized file from a host, the page offset information is
read early and is used to download a specific page requested by the user
without downloading other pages in the document. In one embodiment, a
viewer downloads a first portion of the requested page, while all
remaining portions of the requested page are located and requested by a
finder process using the page offset table. In alternative embodiments,
all objects for a full page may be requested at once. The requested page
can thus be downloaded with only one connection to the host. Shared
objects can optionally be downloaded interleaved between portions of the
page contents that reference the shared objects. Alternatively, with the
use of hint tables, shared and other objects can be read in one
transaction identifying byte ranges in the document. The requested page is
displayed to the user on an output display device. The order in which
elements are displayed provides quick access to useful information and to
active elements. |
| |
|
Title Information  |
Method and apparatus for downloading multi-page electronic documents
with hint information |
|
|
|
|
| Publication Date |
April 7, 1998 |
| Filing Date |
December 7, 1995 |
| Parent Case |
CROSS REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part of earlier filed U.S.
application Ser. Nos. 08/533,875, filed Sep. 26, 1995, now abandoned, and
08/533,177, filed Sep. 25, 1995, now pending, each of which is
incorporated herein by reference in its entirety and each of which is the
basis of a claim for priority under 35 U.S.C. § 120. |
Claims  |
|
|
What is claimed is:
1. A method for downloading a multi-page electronic document which contains
page offset information hints, hints being optional information added to
the document to optimize operations, the method comprising:
reading the page offset information hints early during the downloading
process;
receiving a request to download a specific page of the document, the
document having elements defining the appearance of the specific page, the
elements being stored in the document in a non-contiguous manner;
finding the non-contiguous elements of the specific page in the document
using the page offset information hints which were downloaded early during
the downloading process; and
downloading the elements defining the appearance of the specific page.
2. The method of claim 1 wherein the page offset information hints are
located at a predetermined location in the document.
3. The method of claim 1 wherein the page offset information hints are read
before the downloading of more than one page of the document has been
completed.
4. The method of claim 1 wherein the beginning and ending offsets of each
element needed to display a page of the document can be derived from the
page offset information hints.
5. The method of claim 1 wherein, prior to reading the page offset
information hints, an offset value is read from the document and used as
an index into the document to locate the page offset information hints in
the document.
6. The method of claim 5 wherein the offset value is read before or during
the reading of the first page of the document.
7. The method of claim 1 wherein the page offset information hints are
contained on or after the last page of the document.
8. The method of claim 1 wherein the page offset information hints are
contained before the second page of the document.
9. The method of claim 1 further comprising a step of displaying the
specific page requested by the user on an output display device.
10. The method of claim 9 wherein the specific page requested by the user
is downloaded in one request-response transaction with a host computer
which stores the multi-page document.
11. The method of claim 10 wherein the specific page includes page contents
and shared objects, where the shared objects are downloaded interleaved
between portions of the page contents.
12. The method of claim 9 wherein the specific page is requested by a user
and is downloaded during one connection to a host computer which stores
the multi-page document.
13. The method of claim 12 wherein said downloading is accomplished by a
viewer program implemented on a client computer system operated by the
user.
14. The method of claim 13 wherein a portion of the specific page is
downloaded by the viewer while all remaining portions of the specific page
are located by a finder process using the page offset information hints
and downloaded during the one connection with the host computer.
15. The method of claim 14 wherein the remaining portions of the specific
page include shared objects referenced by page contents of the specific
page.
16. The method of claim 15 wherein the shared objects are downloaded
interleaved between portions of the page contents that reference the
shared objects.
17. The method of claim 1 wherein the specific page includes page contents
and shared objects, where the shared objects are downloaded interleaved
between portions of the page contents.
18. The method of claim 1 wherein the page offset information is stored in
a page offset hint table in the document.
19. The method of claim 1 further comprising:
downloading shared object hints for the document, shared object hints
providing information enabling a downloading process to locate shared
objects in the document.
20. The method of claim 1 further comprising:
downloading bookmark hints for the document, bookmark hints providing
information enabling a downloading process to locate bookmarks in the
document; and
downloading thumbnail hints for the document, thumbnail hints providing
information enabling a downloading process to locate thumbnails in the
document.
21. The method of claim 1 further comprising:
downloading article thread hints for the document, article thread hints
providing information enabling a downloading process to locate article
threads and beads in the document.
22. The method of claim 1 wherein:
the document includes at least one category of objects associated with
particular pages and at least one category of objects associated with the
document as a whole; and
each category of objects associated with the document as a whole has a
corresponding hint table;
the method further comprising downloading the corresponding hint table.
23. A method for downloading a document including a plurality of pages from
a remote host computer, comprising:
placing within the document page offset information hints specifying the
exact location of the elements defining the appearance of each page of the
document, hints being optional information not logically part of the
structure or content of the document;
reading the page offset information hints early during the downloading
process; and
using the page offset information hints to obtain and download a specific
page requested by a user, the elements defining the appearance of the
specific page being stored in the document in a non-contiguous manner.
24. The method of claim 23 wherein the page offset information is read
before the downloading of more than one page of the document has been
completed.
25. The method of claim 23 further comprising displaying the specific page
requested by the user on an output display device.
26. The method of claim 23 further comprising downloading page content
information and shared objects for a first page of the document before
said step of reading the page offset information hints.
27. The method of claim 26 further comprising initially downloading a range
table to determine the location of the page content information and shared
objects for the first page in the document.
28. The method of claim 26 further comprising a step of displaying a first
page of the document as the page content information and shared objects
for the first page are downloaded.
29. The method of claim 26 wherein the specific page requested by the user
is downloaded in one request-response transaction with a host computer
which stores the document.
30. The method of claim 26 wherein the specific page requested by the user
is downloaded with one connection to a host computer which stores the
document.
31. The method of claim 30 wherein a portion of the specific page is
downloaded by a viewer process implemented by a client computer system
operated by the user, while all remaining portions of the specific page
are located in the document by a finder process using the page offset
information hints, the remaining portions being downloaded during the one
connection with the host computer.
32. The method of claim 31 wherein the remaining portions of the specific
page include shared objects referenced by page contents of the specific
page, wherein the shared objects are downloaded interleaved between
portions of the page contents that reference the shared objects.
33. A computer readable storage medium including program instructions for
performing steps during a downloading process for interleaving page
contents of a page of a document with a shared object referenced by the
page, the steps comprising:
downloading a first portion of page content of a page-based document stored
on a host computer, the portion of page content including a reference to a
shared object;
downloading the shared object referenced by the first portion of the page;
and
downloading a second portion of page content of the page of the page-based
document;
wherein the second portion of page content is stored contiguously with the
first portion of page content on the host computer and the shared object
is not stored contiguously with either the first portion of page content
or the second portion of page content on the host computer.
34. A computer readable storage medium as recited in claim 33 wherein the
shared object is stored on the host computer in the page-based document.
35. A computer readable storage medium as recited in claim 33 wherein the
shared object is referenced by a plurality of pages of the document.
36. A computer readable storage medium as recited in claim 33 wherein the
shared object is forced to be a shared object regardless of whether the
object is referenced by a plurality of pages, and wherein only
predetermined types of objects are forced to be shared objects.
37. A computer readable storage medium as recited in claim 33 wherein the
program instructions further perform a step of deriving the locations of
the first portion of page content, the second portion of page content, and
the shared object in the page-based document utilizing a page offset table
downloaded from the page-based document.
38. A computer readable storage medium as recited in claim 33 wherein the
program instructions further perform a step of displaying the first
portion and the second portion of page content on an output display
device, and wherein the downloading of the shared object is necessary to
display the first portion of page content.
39. A computer readable storage medium as recited in claim 38 wherein the
page content includes text to be displayed, and wherein the shared object
is a font object needed to display the text.
40. A computer readable storage medium as recited in claim 33 wherein the
first portion and the second portion of page contents are predetermined
fractions of the entire page contents for the page in the page-based
document.
41. A computer readable storage medium as recited in claim 40 wherein a
plurality of the fractions of page content are downloaded up to the
fraction that includes the reference to the shared object.
42. A computer readable storage medium as recited in claim 41 wherein the
shared object is stored in a shared object cache, wherein the shared
object cache is checked to determine if a shared object has been
previously downloaded before downloading the shared object.
43. An apparatus for downloading a page-based document stored on a host,
the document containing page offset information hints located at a
predetermined location, the apparatus comprising:
a digital processor;
a memory device coupled to the digital processor;
a display screen coupled to the digital processor;
means for displaying the page-based document on the display screen, the
means for displaying connecting with the host to download the page offset
information hints, to download a specific page of the page-based document
requested by the user without the necessity of downloading other pages in
the document, and to display the downloaded page on the display screen;
and
a finder for utilizing the page offset information hints to provide a
location of the specific page in the document to the means for displaying
so that the means for displaying can download the specific page; wherein
when the means for displaying is downloading a portion of a specific page
during a connection with the host, the finder additionally requests
additional portions of the specific page and the additional portions of
the page are downloaded to the means for displaying to be displayed, the
additional portions including page contents and shared objects, the page
contents including text and the shared objects including font objects, the
finder requesting a portion of the page contents to be downloaded that
includes a reference to a particular shared object and requesting the
particular shared object to be downloaded after the portion of the page
contents.
44. The apparatus of claim 43 wherein the page offset information hints are
read before the downloading of more than one page of the document has been
completed.
45. The apparatus of claim 43 wherein the finder can derive the beginning
offsets of each page of the document from the page offset information
hints.
46. The apparatus of claim 43 wherein the means for displaying includes a
viewer process for selecting and displaying pages of the page-based
document on the display screen.
47. Apparatus comprising a computer-readable storage medium tangibly
embodying computer program instructions comprising instructions to:
connect with a host computer to download page offset information hints
located at a predetermined location in a document;
download shared object hints from the host computer;
using the page offset information hints and the shared object hints, locate
a first portion of page content of a page of the document, a second
portion of page content of the page of the document, and a shared object
in the document, the first portion of page content including a reference
to the shared object;
download the located first portion of page content;
download the located shared object; and
download the located second portion of content.
48. The apparatus of claim 47 wherein when a display process is downloading
a portion of a specific page during a connection with the host computer, a
finder process additionally requests additional portions of the specific
page, and the additional portions of the page are downloaded to the
display process to be displayed. |
Description  |
|
|
BACKGROUND OF THE INVENTION
The present invention relates generally to the storage and retrieval of
data for a computer system, and more particularly to a method and
apparatus for optimizing page-based data documents for fast retrieval over
networks, and to a method and apparatus for accessing such optimized
documents. The present invention also relates to methods and apparatus for
the processing and display of electronic documents, and more particularly
to the processing and display of such documents when retrieved over
networks.
It has become increasingly common to create, transmit, and display
documents in electronic form. Electronic documents have a number of
advantages over paper documents including their ease of transmission,
their compact storage, and their ability to be edited and/or
electronically manipulated. An electronic document typically has
information content (such as text, graphics, and pictures) and formatting
information that directs how the content is to be displayed. With recent
advances in multimedia technology, documents can now also include sound,
full motion video, and other multimedia content.
An electronic document is provided by an author, distributor or publisher
(referred to as "publisher" herein) who often desires that the document be
viewed with the appearance with which it was created. This, however,
creates a problem in that electronic documents are typically widely
distributed and, therefore, can be viewed on a great variety of hardware
and software platforms. For example, the video monitors being used to view
the document can vary in size, resolution, etc. Furthermore, the various
software platforms such as DOS, Microsoft Windows™, and Macintosh™
all have their own display idiosyncrasies. Also, each user or "reader" of
the electronic document will have his or her own personal viewing
preferences, which should be accommodated, if possible.
A solution to this problem is to provide a "portable electronic document"
that can be viewed and manipulated on a variety of different platforms and
can be presented in a predetermined format where the appearance of the
document as viewed by a reader is as it was intended by the publisher. One
such predetermined format is the Portable Document Format™ (PDF™)
developed by Adobe Systems, Inc. of Mountain View, Calif. An example of
page-based software for creating, reading, and displaying PDF documents is
the Acrobat™ software, also of Adobe Systems, Inc. The Adobe Acrobat
software is based on Adobe's PostScript® technology, which describes
formatted pages of a document in a device-independent fashion. An Acrobat
program on one platform can create, display, edit, print, annotate, etc. a
PDF document produced by another Acrobat program running on a different
platform, regardless of the type of computer platform used. A document in
a certain format or language can be translated into a PDF document using
Acrobat. A PDF document can be quickly displayed on any computer platform
having the appearance intended by the publisher, allowing the publisher to
control the final appearance of the document.
One relatively new application for portable electronic documents is the
retrieval of such documents from the "Internet", the globally-accessible
network of computers that collectively provides a large amount and variety
of information for users. From services of the Internet such as the World
Wide Web, users may retrieve or "download" data from Internet network
sites and display the data that includes information presented as text in
various fonts, graphics, images, and the like having an appearance
intended by the publisher. A file format such as PDF that allows any
platform to view a document having an appearance as intended by a
publisher is thus of great value when downloading files from such
widely-accessible and platform-independent network sources such as the
Internet.
One problem with previous page-based data downloading processes is that all
of the data of a document is typically downloaded before any portion of
the document is displayed to the user. Thus, the user must wait for an
entire document to download before seeing a page or other portion of the
document on the display screen. This can be inconvenient when the user
wishes to use only a portion of the document, i.e., view only specific
pages or a specific number of contiguous pages of a document. Some
searching processes allow a word to be searched in a document and will
download only the portion of the document that includes the searched word.
However, this portion of the document is an isolated, separate portion
that has no connection with the rest of the document. If the user wishes
to view the next page after the downloaded portion, he or she must
inconveniently either download the entire document or specify a search
term on the next page of the document.
Acrobat and similar programs for displaying portable electronic documents
such as PDF documents are often page-based, which means that the program
typically organizes and displays a desired page of the document at a time.
Typically, the entire document is downloaded at once and the desired pages
are then displayed. Acrobat is nonetheless conducive to downloading one
page of a document at a time from a document file, while still allowing a
user to select other pages of the document conveniently. However, for such
page-based formats, the document data usually is not stored contiguously
in a page order within a file, data structure, or other collection of
document data ("document file" as referred to herein). For example, a
document file in the PDF format may store a page having objects such as a
page contents object (including text, graphics shapes, display
instructions, etc.) and image objects. However, the objects may be stored
in the document in a scattered or disjointed manner. For example, portions
of the page contents object can be scattered in different places in a
document file, and shared objects such as fonts can be stored anywhere in
the file. Shared objects such as fonts can also be stored in files
distinct from the document file, and even on a separate computer, or be
made available through a resource service such as a font server. Since the
output display device displays the page contents and shared objects based
upon pointers to related objects, objects do not have to be stored
sequentially or contiguously in the document file, and are typically
stored in a disjointed manner.
This disjointed data storage for pages can lead to problems when attempting
to download a specific page of a document desired by the user. One major
problem is time delays caused by making multiple connections (or multiple
request-response transactions) when downloading data. For example, a
viewing program for displaying page-based data at a client computer begins
downloading a PDF (or similar format) file from a remote host computer.
The viewing program makes one connection to (or initiates one transaction
with) the host and downloads data from the first portion of the page, then
must make another connection to (or initiate another transaction with) the host to
retrieve the next, disjointed portion of the page. This has the effect of
slowing down the downloading of the page, since each connection (and each
transaction) has a time delay and overhead associated with it. The user
requesting the page thus may have to wait several seconds before the
viewer receives all of the data for the page and displays the page. This
problem is compounded when fonts or other such referenced objects are
included on the page, since yet another connection must be made to (or
transaction made with) the host to retrieve these objects before the page
can be displayed.
The time delays for downloading a page can become even lengthier when a
randomly-accessed page is desired to be viewed by the user. In PDF files,
objects are provided in a "page tree" which the viewer consults to
determine where in the document file the root of a randomly-accessed page
is positioned. The page tree is a data structure in which every node must
be visited in order to determine all the children objects in the tree.
Thus, many page nodes may need to be visited to determine where a page
root object is located in the document file. The page tree can thus be
quite large, and downloading it from the document slows the downloading
process. In addition, the page tree is often so large or disjointed that
multiple connections to (or transactions with) the host are required to
download it.
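The cost of locating a page through a tree, as described above, can be sketched as follows. This is a minimal illustration, not the PDF page tree format: the `Node` class, its `offset` field, and both function names are hypothetical, and the walk simply counts every node it touches before reaching the requested page. The second function shows the flat lookup that a page offset table would allow instead.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    offset: int                               # hypothetical byte offset of the object
    kids: list = field(default_factory=list)  # empty list marks a leaf (a page)


def find_page(root, page_index):
    """Depth-first walk; returns (page_offset, offsets_of_nodes_visited)."""
    visited = []
    remaining = page_index
    stack = [root]
    while stack:
        node = stack.pop()
        visited.append(node.offset)  # each visit may mean reading a new file region
        if not node.kids:
            if remaining == 0:
                return node.offset, visited
            remaining -= 1
        else:
            stack.extend(reversed(node.kids))
    raise IndexError(page_index)


def find_page_in_table(page_offsets, page_index):
    """Flat alternative: one lookup in a table that was read early."""
    return page_offsets[page_index]
```

In this sketch the visited offsets are scattered through the file, which is what forces the extra connections or transactions the text describes.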
Therefore, there is a need for a method and apparatus for providing
optimized page-based documents and downloading desired pages from such
documents without causing an excessive delay before displaying a page, or
portions of a page, to the user.
SUMMARY OF THE INVENTION
The present invention provides a method and apparatus for optimizing a
page-based electronic document and downloading and displaying desired
pages, or portions of a page, from the optimized document without
excessive time delays.
A method of the present invention provides an optimized document file from
a non-optimized electronic document having one or more pages. Page content
information that describes individual pages of the document is written in
the optimized document file. The page content information may be written
contiguously. Page offset information used to locate individual pages of
the document may also be provided in the optimized document file. Objects
shared by multiple pages are also provided for in the optimized document
file, contiguously located after all of the page content information, and
the page offset information includes offsets (locations) to these shared
objects. The page content information includes text and graphics, and the
shared objects can include font objects and image objects. To provide the
page contents and shared objects contiguously in the file, an internal
list of non-shared objects and shared objects in the document file is
created. A list of pages that share objects is also created that includes
the shared objects used by each sharing page and, for each such shared
object, a portion of the page contents in which the shared object is
referenced. In addition, in one aspect, first page offset information may
be provided in a range table for a first page of the optimized document
file. Such first page offset information describes the locations of all
portions of the first page in the document file. The offsets to page
content for this page may be interleaved in the range table with offsets
to shared objects referenced by the page content for the first page.
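The layout described in this paragraph can be sketched as below. This is an illustrative model only: `build_optimized`, the dictionary keys, and the in-memory tables are assumptions made for the example, and the sketch returns the offset tables alongside the file body rather than encoding them inside the file as the invention does.

```python
def build_optimized(pages, shared):
    """pages: list of (content_bytes, [shared_object_names]).
    shared: dict mapping shared object name -> bytes.
    Returns (body, page_table, shared_table) with half-open byte ranges."""
    body = bytearray()
    page_table = []
    # Page contents are written contiguously, in page order.
    for content, refs in pages:
        start = len(body)
        body += content
        page_table.append({"start": start, "end": len(body), "shared": list(refs)})
    # Shared objects (fonts, images) follow all of the page contents.
    shared_table = {}
    for name, data in shared.items():
        start = len(body)
        body += data
        shared_table[name] = (start, len(body))
    return bytes(body), page_table, shared_table
```

With the tables in hand, a reader can locate any page, and the shared objects it needs, without scanning the rest of the file.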
Another method of the present invention efficiently downloads a page-based
optimized document created as described above. The page offset information
is read early during the downloading process. Beginning and ending offsets
of each page of the document can be derived from the page offset
information. Using the page offset information, a specific page requested
by the user is downloaded, and any page desired by the user can readily be
downloaded without the necessity of downloading other pages in the
document. In one aspect of the method, the page offset information may be
read before the downloading of more than one page of the document has been
completed. In one aspect, the document file has a pointer that points to
the location of the page offset information, which pointer is read ahead
of, or during, the reading of the first page of the document.
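One way beginning and ending offsets could be derived from compact page offset information is sketched here. The encoding is an assumption for illustration (a first-page offset plus one byte length per page, with pages stored contiguously); the text above does not specify how the information is encoded, and `page_ranges` is a hypothetical name.

```python
from itertools import accumulate


def page_ranges(first_offset, page_lengths):
    """Return a half-open (start, end) byte range for every page."""
    # Running sum of lengths gives each page's starting offset.
    starts = accumulate(page_lengths, initial=first_offset)
    return [(start, start + length) for start, length in zip(starts, page_lengths)]
```

For example, `page_ranges(100, [40, 10, 25])` yields the range of the third page directly, so that page can be requested without touching the first two.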
In another aspect, when a user requests a specific page of an optimized
document, the specific page is downloaded to a client computer system in
only one connection with a host that stores the optimized document file.
In another aspect, the specific page is downloaded in only one transaction
with the host. The requested page, while being downloaded, may be
displayed to the user on an output display device, such as a display
screen, monitor, or printer. The downloading can be accomplished by a
viewer program on the client computer system. When connecting and
downloading, the viewer may download a first portion of the requested
page, while all remaining portions of the requested page are located and
requested by a finder process on the client computer using the page offset
table. These additional portions are downloaded during the client
computer's one connection with the host, thus saving time and overhead by
avoiding multiple transactions or connections. The additional portions of
the specific page may include shared objects referenced by page contents
of the specific page. Shared objects are downloaded in an interleaved
order between portions of the page contents that reference the shared
objects. In another aspect, the requested page is downloaded to a client
computer system in only one transaction with a host that stores the
optimized document file, the transaction being constructed by a process
using a page offset hint table and any other hint tables available in the
document.
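A single transaction covering several byte ranges might be constructed as in the following sketch. The text does not name a protocol at this point; the example assumes an HTTP-style `Range` request header with multiple byte ranges, and `range_header` is a hypothetical helper. The sketch sorts and merges ranges for simplicity, which gives up control over the order in which the pieces arrive.

```python
def range_header(ranges):
    """ranges: iterable of half-open (start, end) byte ranges.
    Returns a multi-range header value such as 'bytes=100-149,900-949'."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Touching or overlapping ranges collapse into one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    # HTTP byte ranges are inclusive, so the last byte is end - 1.
    return "bytes=" + ",".join(f"{s}-{e - 1}" for s, e in merged)
```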
If shared objects are downloaded in an interleaved order, the interleaving
process includes downloading a first portion of page content from the
requested page, where the first portion of page content includes a
reference to a shared object. The first portion may include all contiguous
page content of the document until the (approximate) point of reference to
the shared object. Then, the shared object referenced by the first portion
of the page is downloaded. The shared object is, for example, a font or
similar referenced object that is needed to display the first portion of
page content. A second portion of page content from the requested page is
then downloaded, where the second portion is contiguous with the first
portion of page content. The locations of the first and second portions of
page content and the shared objects in the page-based document are derived
using the page offset table. Alternatively, a surrogate, such as a
substitute font, is used to display the first portion of page content,
thereby allowing the process to defer the downloading of the referenced
object and thereby to download and to display more quickly the second
portion of page content.
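The interleaved ordering described above can be sketched as a request plan. All names here are hypothetical: `refs` pairs each shared object with the file offset at which the content portion containing its reference ends, `shared_table` maps names to byte ranges, and `cache` stands in for shared objects that were downloaded earlier and need not be fetched again.

```python
def interleave(page_start, page_end, refs, shared_table, cache=()):
    """Return an ordered list of ("content" | "shared", start, end) requests."""
    plan = []
    cursor = page_start
    seen = set(cache)
    for portion_end, name in sorted(refs):
        if name in seen:
            continue  # already downloaded or already scheduled
        seen.add(name)
        if portion_end > cursor:
            # Content up to and including the reference goes first...
            plan.append(("content", cursor, portion_end))
            cursor = portion_end
        # ...followed immediately by the object it needs.
        plan.append(("shared", *shared_table[name]))
    if cursor < page_end:
        plan.append(("content", cursor, page_end))
    return plan
```

Because each shared object follows the content that references it, a viewer working through the plan in order can display the first portion as soon as its font or image arrives.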
Another method of the present invention provides for the displaying on a
display device of a computer an electronic document, such as a portable
electronic document, having text to be displayed on top of a large object,
such as a bitmap image. In general, in one aspect, the method includes
delaying the display of the large object in favor of displaying the
overlying text, displaying the overlying text on the display device, and
at least as to that portion of the large object that appears underneath
the overlying text, drawing the underneath portion into an off-screen
buffer, drawing the overlying text over the object in the off-screen
buffer and copying the off-screen buffer to be displayed on the display
device. In another aspect, the acts of displaying an object and of
displaying text include rendering a bitmap of at least one bit per pixel
into a display buffer of random access memory. In another aspect, the
display buffer and the off-screen buffer have the same pixel depths and
color definitions. In another aspect, the invention provides for creating
a blocking mask corresponding to the displayed appearance of the text and
then displaying the portion of the object that is specified to appear as
if drawn underneath the text under control of the blocking mask so that
displaying the object does not overwrite the displayed text.
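The blocking-mask aspect can be illustrated with a toy frame buffer. This is a sketch under simple assumptions: buffers are lists of rows, `mask[y][x]` is true wherever text has already been drawn, and `draw_under_mask` is a hypothetical name. A real implementation would operate on bitmaps with matching pixel depth and color definitions, as the paragraph notes.

```python
def draw_under_mask(display, image, mask):
    """Paint a late-arriving image into the display buffer without
    overwriting pixels that the mask marks as text. Mutates and
    returns `display`."""
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if not mask[y][x]:
                display[y][x] = pixel
    return display
```

The visible result is the same as if the image had been drawn first and the text on top of it, even though the text reached the screen earlier.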
Another method of the present invention is implemented in a viewing program
to display to a user an electronic document, such as a portable electronic
document, that contains an interactive element responsive to user input.
In one aspect, the method includes changing the appearance of the cursor
of the viewing program's graphical user interface to indicate when it is
located in a position where the interactive element will be displayed, and
making the interactive element responsive to input from the user without
waiting for the interactive element to be displayed. In another aspect,
the interactive element is a hypertext link. In another aspect, the
interactive element is an annotation in a PDF format electronic document.
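Making an interactive element responsive before it has been drawn amounts to hit-testing against its known rectangle rather than against rendered pixels. The sketch below assumes the viewer already holds a list of link rectangles for the page; the function names, the rectangle layout, and the cursor names are all hypothetical.

```python
def hit_test(links, x, y):
    """links: list of ((x0, y0, x1, y1), target). Returns the target of the
    first link whose rectangle contains the point, or None."""
    for (x0, y0, x1, y1), target in links:
        if x0 <= x < x1 and y0 <= y < y1:
            return target
    return None


def cursor_for(links, x, y):
    """Choose the cursor shape from geometry alone, so it can change
    before the link's appearance has been rendered."""
    return "hand" if hit_test(links, x, y) is not None else "arrow"
```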
Another method of the present invention provides for displaying on a
computer display device an electronic document, such as a portable
electronic document, that has text in a desired font, without waiting for
the desired font to be available. In one aspect, the method includes
initially drawing on the display device at least a part of the text in a
substitute font different from the desired font, obtaining the desired
font for use on the computer with the display device, and redrawing with
the desired font the area of display in which the substitute font had been
used initially. In another aspect, the method includes reading font
description metrics for the desired font and using them to create a
substitute font. In another aspect, the method also includes adopting a
font from available font resources as the substitute font. In another
aspect, the desired font is a font included as an embedded font in the
document. In another aspect, the desired font is obtained from a font
server.
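The draw-then-redraw behavior for fonts can be sketched as two small steps. The data shapes are assumptions for illustration: each text run is a `(text, desired_font)` pair, draw operations are recorded as tuples instead of being rendered, and the function names and the `"substitute"` placeholder are hypothetical.

```python
def render_text(runs, available_fonts, fallback="substitute"):
    """First pass: draw every run immediately, using a substitute font
    where the desired font is not yet available.
    Returns (draw_ops, indices_of_runs_to_redraw)."""
    ops, pending = [], []
    for i, (text, font) in enumerate(runs):
        if font in available_fonts:
            ops.append((i, text, font))
        else:
            ops.append((i, text, fallback))
            pending.append(i)
    return ops, pending


def redraw_when_ready(runs, pending, arrived_font):
    """Second pass: once a desired font arrives, redraw only the runs
    that were drawn with the substitute in its place."""
    return [(i, runs[i][0], arrived_font) for i in pending if runs[i][1] == arrived_font]
```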
An apparatus of the present invention provides for efficiently downloading
a page-based document stored on a host, as described above. The apparatus
includes a digital processor, a memory device, and a display screen.
Furthermore, a mechanism for displaying the page-based document on the
display screen is included which connects with the host to download the
page offset information and/or to download a specific page of the document
requested by the user without downloading other pages in the document. A
downloaded page can be displayed on the display screen. A finder uses the
page offset information to provide a location of the specific page in the
document to the displaying mechanism so that the specific page can be
downloaded. |