WikiPatents - Community Patent Review
Retrieving documents transitively linked to an initial document    

United States Patent 6789080
Link to this page: http://www.wikipatents.com/6789080.html
Inventor(s): Sweet; Richard Eric (Mountain View, CA), Rowe; Edward Royce Warren (Sunnyvale, CA)
Abstract: A method for displaying hypertext data comprises displaying a first document represented in a first markup representation and containing at least one hypertext link, and in response to a user input selecting a first hypertext link in the first document, accessing an external document represented in a second markup representation, the first hypertext link having an original state pointing to the external document. The method further comprises converting the second markup representation of the external document into a first markup representation of the external document, and incorporating the first markup representation of the external document into the first document. The method can also include modifying the first hypertext link from the original state to a second state having an internal link pointed to the first markup representation of the external document in the first document.
Owner/Assignee: Adobe Systems Incorporated (San Jose, CA)
Publication Date: September 7, 2004
Application Number: 10/388,093
Filing Date: March 13, 2003
US Classification: 707/10 707/2 707/3
Examiner: Kindred; Alford
Attorney/Law Firm: Fish & Richardson P.C.
Parent Case: CROSS-REFERENCE TO RELATED APPLICATIONS This application is a continuation application of and claims priority to U.S. patent application Ser. No. 10/071,762, filed on Feb. 6, 2002, which is a divisional of U.S. patent application Ser. No. 08/970,743, filed Nov. 14, 1997, now U.S. Pat. No. 6,415,278, issued on Jul. 2, 2002, both of which are hereby incorporated by reference in their entireties for all purposes.
USPTO Field of Search: 707/1 707/7 707/10 707/200 707/201 707/202 707/203 709/200 709/217 709/218 709/219 715/501.1 715/513 715/500
   

CLAIMS

What is claimed is:

1. A method for displaying hypertext data, the method comprising: displaying a first document represented in a first markup representation and containing at least one hypertext link; in response to a user input selecting a first hypertext link in the first document, accessing an external document represented in a second markup representation, the first hypertext link having an original state pointing to the external document, the external document being external to the first document; converting the second markup representation of the external document into a first markup representation of the external document; and incorporating the first markup representation of the external document into the first document.

2. The method of claim 1, further comprising: modifying the first hypertext link from the original state to a second state having an internal link pointed to the first markup representation of the external document in the first document.

3. The method of claim 2, further comprising: saving information about the original state of the first hypertext link.

4. The method of claim 3, further comprising: in response to an action deleting a portion of the first document that included the first markup representation of the external document, using the information saved about the original state of the first hypertext link to reset the first hypertext link to the original state.

5. The method of claim 1, wherein the first markup representation comprises a physical markup language representation, the second markup representation comprises a semantic markup language representation, and wherein converting the second markup representation of the external document into a first markup representation includes: calculating a logical minimum width equal to the minimum width required to display all screen objects within the external document at their normal size; creating a physical markup representation of the external document, the physical markup representation having a width at least as wide as the logical minimum width; and conforming the physical markup representation to a target size, including a target width, wherein conforming the physical markup representation comprises scaling the width of the physical markup representation by a scaling factor derived from the ratio of an element of the target size to the logical minimum width.

6. A computer program product, tangibly stored on a machine-readable medium, comprising instructions operable to cause a programmable processor to: display a first document represented in a first markup representation and containing at least one hypertext link; in response to a user input selecting a first hypertext link in the first document, access an external document represented in a second markup representation, the first hypertext link having an original state pointing to the external document, the external document being external to the first document; convert the second markup representation of the external document into a first markup representation of the external document; and incorporate the first markup representation of the external document into the first document.

7. The computer program product of claim 6, further comprising instructions to: modify the first hypertext link from the original state to a second state having an internal link pointed to the first markup representation of the external document in the first document.

8. The computer program product of claim 7, further comprising instructions to: save information about the original state of the first hypertext link.

9. The computer program product of claim 8, further comprising instructions to: in response to an action deleting a portion of the first document that included the first markup representation of the external document, use the information saved about the original state of the first hypertext link to reset the first hypertext link to the original state.

10. The computer program product of claim 6, wherein the first markup representation comprises a physical markup language representation, the second markup representation comprises a semantic markup language representation, and wherein converting the second markup representation of the external document into a first markup representation includes: calculating a logical minimum width equal to the minimum width required to display all screen objects within the external document at their normal size; creating a physical markup representation of the external document, the physical markup representation having a width at least as wide as the logical minimum width; and conforming the physical markup representation to a target size, including a target width, conforming the physical markup representation comprising: scaling the width of the physical markup representation by a scaling factor derived from the ratio of an element of the target size to the logical minimum width.
DESCRIPTION

BACKGROUND

The invention relates to capturing hypertext web pages for convenient viewing.

The World Wide Web ("the web") of the Internet has become in recent years a popular means of publishing documentary information. In particular, it is now common for users with access to the web to browse through collections of linked documents through the use of hypertext browsers, such as Netscape Navigator™ or Microsoft Internet Explorer™, whereby selection by the user of certain screen objects in a displayed document causes the contents of another document to be retrieved and displayed to the user.

Many of the documents on the web are encoded using a markup language known as the Hypertext Markup Language (HTML). HTML Version 3.2 with Frame Extensions is described in Graham, HTML Sourcebook, Third Edition, published by Wiley Computer Publishing, 1997. A markup language is a set of codes or tags that can be embedded within a document to describe how it should be displayed on a display device, such as a video screen or a printer. HTML is what is known as a "semantic" markup language. This means that, while it is possible to use HTML to dictate certain physical characteristics of a document (such as line spacing or font size), many HTML tags merely identify the logical features of the document, such as titles, paragraphs, lists, tables, and the like. The precise manner in which these logical features are displayed is then left to the browser software to determine at the time the document is displayed.

Because HTML tags often do not specify a fixed physical size of a document or its components, the precise appearance of a particular document displayed by a browser will often depend on the size of the browser window in which it is displayed. For example, FIGS. 1 and 2 show two views of the home web page of the US Patent and Trademark Office (specified by Uniform Resource Locator (URL) http://www.uspto.gov/ in September of 1997). In FIG. 2, the web browser window is significantly smaller than that in FIG. 1 and, as can be seen, the web page as seen through the two windows differs in its overall appearance, for example with respect to the width of the title 30 and list element 40.

One important feature of HTML is the ability, within an HTML document, to refer to external data resources. One way that such references are used within HTML is to identify auxiliary documents that are sources of content to be displayed as part of the display of the HTML document. For example, the HTML tag "IMG" specifies that the contents of a specified image document should be displayed within a portion of the display of the HTML document in which the IMG tag is found. Similarly, the tag "FRAME" within an HTML document specifies that the content of a specified document should be displayed within a particular frame of a frame set defined by the HTML document. The use of frames and frame sets within HTML is explained in more detail below.

HTML also features the ability to have a hypertext link within an HTML document. A hypertext link within an HTML document creates an association between a screen object (e.g., a word or an image) and an external resource. When the HTML document is displayed by a browser, a user may select the screen object, and the browser will respond by retrieving and displaying content from the external resource. A hypertext link may be specified within an HTML document with, for example, the HTML anchor tag with an HREF attribute.
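As an illustration of the anchor tag with an HREF attribute described above, the following is a minimal sketch that collects the link targets from an HTML document using Python's standard html.parser module. The names LinkCollector and extract_links are hypothetical and are not taken from the patent.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the HREF targets of anchor tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lowercase.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the hypertext link targets found in html_text, in order."""
    collector = LinkCollector()
    collector.feed(html_text)
    return collector.links
```

Each returned target names an external resource that a browser would retrieve when the user selects the associated screen object.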

The use of such external references within HTML facilitates distributed document storage on a wide area network (WAN). A large document may be broken up and stored as a set of smaller documents logically associated by external references. For example, it is common for the graphical images in an HTML document to be stored as separate documents (e.g., in the GIF or JPEG format). It is also common to store sections of a large text as separate documents, and to facilitate easy movement from one section to another through the use of hypertext links.

In addition, a set of pre-existing documents may be linked together with HTML tags to form a coherent whole. For example, an HTML document may be created containing hypertext links to a set of pre-existing documents relating to a common subject, thus facilitating the systematic review of such documents by a user.

A characteristic of HTML documents is that they are not paginated. That is, the displayed "height" of an HTML document is determined solely by the amount and arrangement of the screen objects defined within it, as displayed by the browser used to view it, and not by any fixed page size associated with the document. (Here "page size" does not necessarily refer to physical pages printed on paper, for example, but is simply a characteristic of an electronic document in which the content of the document is divided into a sequence of regions with fixed dimensions.) If the displayed document does not fit within the height of the browser window, the browser permits scrolling of the web page to permit additional content to be viewed. FIG. 3 shows the home web page of the US Patent and Trademark Office displayed within the same browser window as in FIG. 2, except that the page has been scrolled somewhat to reveal additional material.

A recent extension to HTML permits multiple scrollable and resizable "frames" to be displayed within a single browser window. A frame is defined by a special type of HTML document known as a "frame set". A frame set provides information giving the size and orientation of frames in a window, and specifies the contents of each frame. The contents of a frame may be either the contents of an HTML document, or a subsidiary frame set (i.e., a frame set, the entire contents of which appear within a single frame of the larger frame set). As with other HTML screen objects, the height or width of a frame may be specified in absolute or relative terms.

FIGS. 4, 5 and 6 illustrate the operation of frames in HTML. FIG. 4 shows a browser window displaying a frame set containing two frames. Frame 50 is a narrow vertical column on the left hand side of the screen. Frame 55 is a wider column to the right of frame 50. Frame 50 contains an HTML document that is as long as the browser window is high, while frame 55 contains a document that is longer than the browser window's height. As can be seen in FIG. 5, frame 55 can be scrolled independently of frame 50 to display the remainder of the HTML document contained within it.

In the above example, frame 50 is defined to have a fixed width of 115 pixels, whereas the width of frame 55 is defined relative to the width of frame 50--its width is set equal to the browser window's width, less the 115 pixels used by frame 50. As can be seen in FIG. 6, when the browser window is made smaller, frame 55 shrinks accordingly, while frame 50 remains at a fixed width.
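The width arithmetic in this example can be sketched as follows. The function name frame_widths is hypothetical, and clamping the remaining width at zero for very narrow windows is an assumption not stated in the text.

```python
def frame_widths(window_width, fixed_width=115):
    """Return (fixed frame width, relative frame width) in pixels.

    The fixed frame keeps its width; the relative frame takes whatever
    remains of the window. Clamping at zero is an assumption.
    """
    remaining = max(window_width - fixed_width, 0)
    return fixed_width, remaining
```

For an 800 pixel window this gives 115 and 685 pixels; shrinking the window to 400 pixels leaves the first frame at 115 pixels and reduces the second to 285.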

As explained above, the ultimate appearance of an HTML document being displayed by a browser will usually depend on the size of the browser window (or frame) in which it is to be displayed. In general, a web browser will extract from an HTML document a series of screen objects (e.g., words, images, lists, frames or tables), and place them sequentially in rows on the screen. When a row has been filled, the next object is placed in a successive row. This process continues until all screen objects within the HTML document have been placed.

This general principle, however, is limited by the constraint that the width of the displayed HTML document cannot be narrower than the minimum width of the widest screen object contained within it. Under this constraint, if the minimum width of a screen object is wider than the width of the browser window, parts of the document will remain off screen (to the left or right) when viewed through the browser window, and a horizontal scroll bar will typically be displayed to permit the user to shift views of the document to the left or right.
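The row-filling behavior and the width constraint described above can be sketched roughly as follows. This is a simplified model that assumes every screen object has a single fixed width in pixels; the function name place_in_rows is hypothetical, and real browsers apply many more rules.

```python
def place_in_rows(object_widths, window_width):
    """Greedily place objects in rows and report the displayed width.

    Returns (rows, display_width). display_width exceeds window_width
    when some object is wider than the window, which is the case where
    a horizontal scroll bar would typically appear.
    """
    rows, current, used = [], [], 0
    for width in object_widths:
        # Start a new row when the next object would overflow this one.
        if current and used + width > window_width:
            rows.append(current)
            current, used = [], 0
        current.append(width)
        used += width
    if current:
        rows.append(current)
    display_width = max(window_width, max(object_widths, default=0))
    return rows, display_width
```

With objects of widths 40, 50, 30, 120 and 20 in a 100 pixel window, the 120 pixel object occupies a row of its own and forces the displayed width to 120 pixels.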

HTML screen objects may have either a fixed or a variable width. For example, the width of a single word of text in an HTML document is fixed (given the font chosen by the browser in which to display it). Its width is determined by the characters in the word and the size of the font in which they will be displayed. Similarly, the width of a cell in an HTML table may be made fixed by explicitly specifying its width as a certain number of pixels.

By contrast, the width of a variable width screen object will vary, depending on the width of the browser window in which it appears. However, even a variable width screen object will have a minimum width. For example, the width of a paragraph of text will generally vary according to the size of the browser window; however, it can be no narrower than the widest word contained within the paragraph. Similarly, a table containing images may have cells whose widths are defined in relative terms, but the table nonetheless cannot be narrower than the sum of the widths of the images within its widest row.
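The minimum-width rules in the two paragraphs above can be expressed as a short sketch. The function names are hypothetical, widths are assumed to be in pixels, and table borders, padding, and spacing are ignored.

```python
def paragraph_min_width(word_widths):
    """A paragraph can be no narrower than its widest word."""
    return max(word_widths, default=0)


def table_min_width(rows):
    """A table can be no narrower than the widest row, where a row's
    width is the sum of the minimum widths of its cells."""
    return max((sum(row) for row in rows), default=0)


def document_min_width(object_min_widths):
    """A document can be no narrower than its widest screen object."""
    return max(object_min_widths, default=0)
```

For instance, a table with one row holding a 200 pixel image and a cell whose widest word is 72 pixels has a minimum width of 272 pixels, and a document containing that table can be displayed no narrower than that.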

This constraint is illustrated in FIGS. 7, 8, 9 and 10. In each of FIGS. 7, 8 and 9, an identical HTML document is displayed in a browser window 65. An excerpt of the underlying HTML code is shown in FIG. 10. Referring to FIGS. 7 and 10, the document being displayed includes a table 80 having two cells aligned to the top, one cell 85 containing a client-side image map and the other cell 90 containing the heading "US Patent and Trademark Office", a horizontal line, and an unordered list with the heading "New on the PTO site:". In FIG. 8, the window 65 is narrower than in FIG. 7, but wider than the minimum width of any object on the screen. Therefore, each line of the document is adjusted to be as wide as the window 65 and nothing is hidden from the user to the right of the browser window. By contrast, in FIG. 9, window 65 is narrower than the minimum width of table 80, since the fixed width of the image map in cell 85 plus the width of the widest word in cell 90 (the word "trademark") is greater than the width of the browser window 65. Therefore, the resulting display width of the document is wider than window 65, resulting in the rightmost part of the document being hidden from view.

While collections of visual display data on the web are typically stored as sets of linked HTML documents, it is also common and convenient for visual display data to be stored as a single document, having a fixed page size, using a physical markup language such as the portable document format (PDF). PDF is described in the publication Adobe Systems, Inc., Portable Document Format Reference Manual, Addison-Wesley Publishing Co., 1993.

SUMMARY

In general, in one aspect, the invention features a method for converting a semantic markup representation of a document into a physical markup representation of the document. The method includes calculating a logical minimum width equal to the minimum width required to display all screen objects within the document at their normal size, creating a physical markup representation of the document, the physical markup representation having a width at least as wide as the logical minimum width, and conforming the physical markup representation to a target size, including a target width, such that conforming the physical markup representation includes scaling the width of the physical markup representation by a scaling factor derived from the ratio of an element of the target size to the logical minimum width. Preferred embodiments of the invention include one or more of the following features. The physical markup representation is incorporated into a newly created document. The physical markup representation is incorporated into an existing document. The element of the target size is the target width. The physical markup representation is a paginated representation including pages each having a respective physical width and a respective physical height. The target size includes a target height. The target size is a standard paper size. The standard paper size is one of 8.5 × 11 inches, 8.5 × 14 inches, A4, A5, and 11 × 17 inches. The pages of the physical markup representation have the same aspect ratio as the target size. The height of the physical markup representation is scaled by the scaling factor. The page height of the physical markup representation is scaled by the scaling factor. The element of the target size is the target height. The pages of the physical markup representation are rotated by plus or minus 90 degrees. The ratio of the target width to the logical minimum width is tested to determine whether it is less than a specified threshold. The document is a frame set specifying a plurality of frames. The document contains at least one hypertext link, the physical markup representation is displayed in a viewer, and an external document is accessed when a hypertext link is selected by a user from the displayed markup. The hypertext link is a server-side image map. The semantic markup representation is HTML. The physical markup representation is PDF. After the physical markup representation is conformed to the target size, the physical markup representation is scaled by the inverse of the scaling factor and the result is displayed in a viewer.
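The scaling step in this aspect can be illustrated with a small sketch. It assumes the element of the target size is the target width, that the factor is simply the ratio itself, and that all lengths share one unit; the function name conform_to_target is hypothetical.

```python
def conform_to_target(logical_min_width, content_height, target_width):
    """Scale a layout of logical_min_width to fit target_width.

    Returns (scale, scaled_width, scaled_height). Taking the factor to
    be exactly target_width / logical_min_width is an assumption; the
    text only says the factor is derived from that ratio.
    """
    if logical_min_width <= 0:
        raise ValueError("logical_min_width must be positive")
    scale = target_width / logical_min_width
    return scale, logical_min_width * scale, content_height * scale
```

For a document whose logical minimum width is 816 units and a target width of 612 units, the factor is 0.75, so a content height of 2000 units becomes 1500. Scaling the result by the inverse of the factor (1 / 0.75) recovers the original dimensions for display in a viewer.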

In general, in another aspect, the invention features a method for displaying hypertext data. The method includes displaying in a viewer a first document represented in a physical markup representation and containing at least one hypertext link, accessing an external document when a hypertext link is selected by a user from the displayed first document, converting the semantic markup representation of the external document into a physical markup representation, and incorporating the physical markup representation of the external document into the first document. Preferred embodiments of the invention include one or more of the following features. A hypertext link is modified to point to the physical markup representation of the external document. The original state of the hypertext link is saved. In response to an action deleting a portion of the first document, a hypertext link that pointed to the deleted portion is restored to its original state. The external document is digested to create a digest of the external document, and the digest of the external document is tested to determine whether the physical markup representation of the external document has already been incorporated into the first document. The external document comprises a primary document and one or more auxiliary documents. Each auxiliary document is digested to create a respective auxiliary document digest, and the digital digest of each auxiliary document is tested to determine whether the physical markup representation of the external document has already been incorporated into the first document. The digital digest is a compound digest.
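The link bookkeeping described in this aspect (redirecting a link to the incorporated content, saving its original state, and restoring it on deletion) might be modeled as in the following sketch. The CompositeDocument class and its tuple-based link targets are hypothetical illustrations, not the patent's data structures, and the conversion step itself is omitted.

```python
class CompositeDocument:
    """Toy model of a first document that absorbs external documents."""

    def __init__(self):
        self.sections = {}   # section id -> converted content
        self.links = {}      # link id -> ("external", url) or ("internal", section id)
        self._original = {}  # link id -> saved original external target

    def add_link(self, link_id, url):
        self.links[link_id] = ("external", url)

    def incorporate(self, link_id, section_id, converted_content):
        """Add converted content and point the link at it."""
        self._original[link_id] = self.links[link_id]  # save original state
        self.sections[section_id] = converted_content
        self.links[link_id] = ("internal", section_id)

    def delete_section(self, section_id):
        """Delete content and reset any link that pointed to it."""
        del self.sections[section_id]
        for link_id, target in self.links.items():
            if target == ("internal", section_id):
                self.links[link_id] = self._original.pop(link_id)
```

After incorporate, selecting the link leads to the section inside the first document; after delete_section, the link again points to the external resource.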

In general, in another aspect, the invention features a method for creating a distinguishing identifier of a collection of data comprising a primary document and one or more auxiliary documents. The method includes digesting each auxiliary document to create a respective auxiliary document digest and creating a distinguishing identifier by digesting a concatenation of the primary document with all auxiliary document digests. Preferred embodiments of the invention include one or more of the following features. A digital digest algorithm is applied. The digital digest algorithm is the MD5 Message Digest Algorithm.
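A minimal sketch of such a compound digest, using MD5 from Python's standard hashlib module, is shown below. The function name compound_digest is hypothetical, and the choices to concatenate the raw 16-byte digests in the order the auxiliary documents are supplied, and to return a hexadecimal string, are assumptions.

```python
import hashlib


def compound_digest(primary, auxiliaries):
    """Digest a primary document together with its auxiliary documents.

    primary is bytes; auxiliaries is a sequence of bytes objects.
    Each auxiliary document is digested first, then the primary
    document is concatenated with those digests and digested again.
    """
    aux_digests = [hashlib.md5(aux).digest() for aux in auxiliaries]
    return hashlib.md5(primary + b"".join(aux_digests)).hexdigest()
```

Because the auxiliary digests feed into the final digest, a change to any image or frame document changes the identifier even when the primary document is byte-for-byte identical.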

In general, in another aspect, the invention features a method for retrieving documents transitively linked to an initial document on a hierarchical file system. The method includes retrieving the initial document and retrieving only those other documents for which there is a transitive link from the initial document to the other document and for which the transitive link includes documents which are all within the same directory path as the initial document. Preferred embodiments of the invention include one or more of the following features. The hierarchical file system is distributed on a network. The hierarchical file system is distributed on an internet.
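The retrieval rule can be sketched as a breadth-first traversal over a link graph. The sketch reads "within the same directory path" as "at or below the directory containing the initial document", which is an interpretation rather than something the text states; the function name and the dictionary-based link graph are hypothetical stand-ins for fetching and parsing real documents.

```python
from collections import deque
import posixpath


def retrieve_within_directory(initial, links):
    """Return documents transitively linked from initial, following
    only chains of links that stay under initial's directory.

    links maps a document path to the list of paths it links to.
    """
    base = posixpath.dirname(initial).rstrip("/") + "/"
    retrieved = [initial]
    seen = {initial}
    queue = deque([initial])
    while queue:
        doc = queue.popleft()
        for target in links.get(doc, []):
            # Documents outside the directory path are neither retrieved
            # nor followed, so links reached only through them are skipped.
            if target not in seen and target.startswith(base):
                seen.add(target)
                retrieved.append(target)
                queue.append(target)
    return retrieved
```

A document that lies in the directory but is reachable only through an outside document is not retrieved, since the chain of links leading to it leaves the directory path.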

In general, in another aspect, the invention features a computer program, residing on a computer-readable medium, for converting a semantic markup representation of a document into a physical markup representation of the document, having instructions for causing a computer to calculate a logical minimum width equal to the minimum width required to display all screen objects within the document at their normal size, create a physical markup representation of the document, the physical markup representation having a width at least as wide as the logical minimum width, and conform the physical markup representation to a target size, including a target width, the instructions for causing a computer to conform the physical markup representation including instructions for causing a computer to scale the width of the physical markup representation by a scaling factor derived from the ratio of an element of the target size to the logical minimum width. Preferred embodiments of the invention include one or more of the following features. The program includes instructions for causing a computer to incorporate the physical markup representation into a newly created document. The program includes instructions for causing a computer to incorporate the physical markup representation into an existing document. The element of the target size is the target width. The physical markup representation is a paginated representation including pages each having a respective physical width and a respective physical height. The target size includes a target height. The target size is a standard paper size. The standard paper size is one of 8.5 × 11 inches, 8.5 × 14 inches, A4, A5, and 11 × 17 inches. The pages of the physical markup representation have the same aspect ratio as the target size. The program includes instructions for causing a computer to scale the height of the physical markup representation by the scaling factor. The program includes instructions for causing a computer to scale the page height of the physical markup representation by the scaling factor. The element of the target size is the target height. The program includes instructions for causing a computer to rotate the pages of the physical markup representation by plus or minus 90 degrees. The program includes instructions for causing a computer to test whether the ratio of the target width to the logical minimum width is less than a specified threshold. The document is a frame set specifying a plurality of frames. The document contains at least one hypertext link and the program includes instructions for causing a computer to display the physical markup representation in a viewer and access an external document when a hypertext link is selected by a user from the displayed markup. The hypertext link is a server-side image map. The semantic markup representation is HTML. The physical markup representation is PDF. The program includes instructions for causing a computer to, after conforming the physical markup representation to the target size, scale the physical markup representation by the inverse of the scaling factor and display the result in a viewer. The program includes instructions for causing a computer to display in a viewer a first document represented in a physical markup representation and containing at least one hypertext link, access an external document when a hypertext link is selected by a user from the displayed first document, convert the semantic markup representation of the external document into a physical markup representation, and incorporate the physical markup representation of the external document into the first document. The program includes instructions for causing a computer to modify a hypertext link to point to the physical markup representation of the external document. The program includes instructions for causing a computer to save the original state of the hypertext link. The program includes instructions for causing a computer to, in response to an action deleting a portion of the first document, restore a hypertext link that pointed to the deleted portion to its original state. The program includes instructions for causing a computer to digest the external document to create a digest of the external document, and test the digest of the external document to determine whether the physical markup representation of the external document has already been incorporated into the first document. The external document comprises a primary document and one or more auxiliary documents. The program includes instructions for causing a computer to digest each auxiliary document to create a respective auxiliary document digest and test the digital digest of each auxiliary document to determine whether the physical markup representation of the external document has already been incorporated into the first document. The digital digest is a compound digest.

In general, in another aspect, the invention features a computer program, residing on a computer readable medium, for creating a distinguishing identifier of a collection of data comprising a primary document and one or more auxiliary documents, the program having instructions for causing a computer to digest each auxiliary document to create a respective auxiliary document digest, and to create a distinguishing identifier by digesting a concatenation of the primary document with all auxiliary document digests. Preferred embodiments of the invention include one or more of the following features. The program includes instructions for causing a computer to apply a digital digest algorithm. The digital digest algorithm is the MD5 Message Digest Algorithm.
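As an illustration only, the compound digest described above might be computed as follows. The function name `compound_digest` is hypothetical, and the sketch assumes the auxiliary digests are concatenated in the order the auxiliary documents are supplied, which the text does not specify.

```python
import hashlib


def compound_digest(primary: bytes, auxiliaries: list) -> str:
    """Digest each auxiliary document, then digest the primary document
    concatenated with all auxiliary digests (MD5, as named in the text)."""
    aux_digests = [hashlib.md5(aux).digest() for aux in auxiliaries]
    return hashlib.md5(primary + b"".join(aux_digests)).hexdigest()
```

A change to any auxiliary document (for example, an image referenced by an HTML page) changes its digest and therefore the identifier of the whole collection, even when the primary document is byte-for-byte unchanged.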

In general, in another aspect, the invention features a computer program, residing on a computer readable medium, for retrieving documents transitively linked to an initial document on a hierarchical file system, having instructions for causing a computer to retrieve the initial document and retrieve only those other documents for which there is a transitive link from the initial document to the other document and for which the transitive link includes documents which are all within the same directory path as the initial document. Preferred embodiments of the invention include one or more of the following features. The hierarchical file system is distributed on a network. The hierarchical file system is distributed on an internet.
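The retrieval rule above can be sketched as a breadth-first traversal. This is a sketch under stated assumptions, not the patented routine: the function `retrieve_transitive` and the in-memory `links` mapping (document path to the paths it links to) are hypothetical, paths are assumed to be absolute and slash-separated, and "within the same directory path" is read as "at or below the directory containing the initial document".

```python
import posixpath
from collections import deque


def retrieve_transitive(initial: str, links: dict) -> list:
    """Return the initial document plus every document reachable from it
    through a chain of links that never leaves the initial directory path."""
    prefix = posixpath.dirname(initial).rstrip("/") + "/"
    seen = {initial}
    order = [initial]
    queue = deque([initial])
    while queue:
        doc = queue.popleft()
        for target in links.get(doc, []):
            # Skip documents already retrieved or outside the directory path;
            # links found in skipped documents are never followed.
            if target in seen or not target.startswith(prefix):
                continue
            seen.add(target)
            order.append(target)
            queue.append(target)
    return order
```

Because documents outside the directory path are never enqueued, a document inside the path that is reachable only through an outside document is not retrieved, which matches the requirement that every document along the transitive link lie within the path.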

Among the advantages of the invention are one or more of the following. Web pages written in a semantic markup language, such as HTML, can be integrated into a single paginated document described in a physical markup language, such as PDF. Web pages can be converted to a format having fixed page dimensions, without losing information because of space constraints. A virtually unique single identifier can be created for a primary document and associated auxiliary documents. All of the documents that are linked to a document and also in the same directory path can be retrieved from a file system.

Other features and advantages of the invention will become apparent from the following description and from the claims.

DESCRIPTION OF DRAWINGS

FIG. 1 is a view of a web page displayed in a conventional web browser.

FIG. 2 is a view of a web page displayed in a conventional web browser.

FIG. 3 is a view of a web page displayed in a conventional web browser.

FIG. 4 is a view of a web page containing frames in a conventional web browser.

FIG. 5 is a view of a web page containing frames in a conventional web browser.

FIG. 6 is a view of a web page containing frames in a conventional web browser.

FIG. 7 is a view of a web page displayed in a conventional web browser.

FIG. 8 is a view of a web page displayed in a conventional web browser.

FIG. 9 is a view of a web page displayed in a conventional web browser.

FIG. 10 shows a portion of the underlying HTML code for the web page displayed in FIGS. 7-9.

FIG. 11 is a block diagram of a computer system programmed in accordance with the present invention.

FIGS. 12, 12a and 12b are a flowchart of a method of incorporating web pages into a single paginated document.

FIG. 13 is a flowchart showing steps of a routine FetchAndIncorporate.

FIG. 14 is a flowchart showing steps of a routine FetchDoc.

FIG. 15 is a flowchart showing steps of a routine ConvertToPDF.

FIG. 16 shows the logical relationship between a LayoutRegion and content of an associated PDF document.

FIGS. 17, 17a, and 17b are a flowchart showing steps taken by a routine LayoutElement.

FIG. 18 is a view of a web page displayed in a conventional web browser.

FIG. 19 is a view of a web page displayed in a conventional web browser.

FIG. 20 shows a PDF page produced by the present invention.

FIG. 21 shows PDF pages produced by the present invention.

Like reference symbols in the various drawings indicate like elements.

DETAILED DESCRIPTION

Referring to FIG. 11, a user computer 100 running client software is connected over a communications link 102 to web servers, such as web server 140. Web servers are linked (statically or dynamically) to data stores, such as data store 142, containing web pages, such as page 144. The client software (which may include one or more separate programs, as well as plug-in modules and operating system extensions) typically displays information on a display device such as a monitor 104 and receives user input from a keyboard (not shown) and a cursor positioning device such as a mouse 106. The computer 100 is generally programmed so that movement by a user of the mouse 106 results in corresponding movement of a displayed cursor graphic on the display 104.

The programming of computer 100 includes an interface 108 that receives position information from the mouse 106 and provides it to applications programs running on computer 100. Among such applications programs are a web browser 110, and a PDF viewer 120. Also running on computer 100 is a web page integrator 135, which may be part of the PDF viewer 120. In response to a request from the user, the PDF viewer may request the web page integrator 135 to retrieve, from one or more web servers (such as web server 140), an initial document specified by a URL supplied by the user, and other documents which are linked, directly or indirectly, to the initial document. When the requested documents are retrieved, the web page integrator integrates them into a single PDF document, which is then displayed by the PDF viewer 120.

The PDF document which is displayed by the PDF viewer may have hypertext links to web pages, as well as to internal pages within the PDF document. When the user selects a hypertext link in the PDF document, e.g., with the mouse, if the link is to a page within the PDF document, that page is displayed by the PDF viewer. However, if the hypertext link is to a web page, that page is either displayed by the browser, or integrated into the PDF document and displayed by the PDF viewer, depending on a mode set by the user.
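The link-handling behavior just described amounts to a three-way decision, which can be sketched as below. The names `Mode` and `handle_link` and the returned action strings are hypothetical, chosen only for illustration.

```python
from enum import Enum


class Mode(Enum):
    BROWSER = "browser"      # external links open in the web browser
    INTEGRATE = "integrate"  # external links are integrated into the PDF document


def handle_link(target: str, internal_pages: set, mode: Mode) -> str:
    """Decide what to do when the user selects a hypertext link."""
    if target in internal_pages:
        return "show_in_viewer"
    if mode is Mode.BROWSER:
        return "open_in_browser"
    return "integrate_and_show"
```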

FIGS. 12, 12a, and 12b are a flowchart of a method of incorporating web pages into a single paginated document, which will be described as implemented in a programmed computer system. First, the system queries the user to provide the name of an existing PDF document, or a URL along with web traversal criteria (step 200). If the user provides the name of a PDF document, the document becomes the "target document" (step 210). The target document is displayed in the PDF viewer and user input is awaited (step 220). If the user provides a URL with web traversal criteria, then a new, empty, PDF document is created. This document becomes the target document. Parameters of the target document are set which specify a target width and a target height of pages within the document (collectively the "target size" of the document), according to either a default value or input from the user. Then, the routine FetchAndIncorporate is called, which incorporates a starting document specified by the URL, as well as other documents which are linked to the starting document and which satisfy the web traversal criteria, into the target document (step 230). The target document is then displayed by the PDF viewer and the system waits for user input (step 220).
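Steps 200 through 230 can be outlined in a short sketch. The names `TargetDoc`, `start_session`, and `DEFAULT_SIZE` are hypothetical; the default of 612 by 792 points (US Letter) is an assumption, since the text says only that a default value or user input is used; and the PDF loader and the FetchAndIncorporate routine are passed in as callables because their internals are described elsewhere.

```python
from dataclasses import dataclass, field

DEFAULT_SIZE = (612.0, 792.0)  # assumed default target size: US Letter, in points


@dataclass
class TargetDoc:
    width: float
    height: float
    pages: list = field(default_factory=list)


def start_session(pdf_name, url, criteria, open_pdf, fetch_and_incorporate,
                  size=DEFAULT_SIZE):
    """Return the target document for the session (steps 200 to 230).

    pdf_name: name of an existing PDF document, or None if a URL was given.
    url, criteria: starting URL and web traversal criteria.
    open_pdf, fetch_and_incorporate: caller-supplied stand-ins for the
    PDF loader and the FetchAndIncorporate routine.
    """
    if pdf_name is not None:
        # Step 210: the existing document becomes the target document.
        return open_pdf(pdf_name)
    # Otherwise create a new, empty target document with its target size set,
    # then incorporate the starting document and qualifying linked documents.
    target = TargetDoc(width=size[0], height=size[1])
    fetch_and_incorporate(target, url, criteria)  # step 230
    return target  # the caller displays it and awaits user input (step 220)
```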

The pages of the target document are normally displayed in their target size, i.e., the size of the pages as specified in