| United States Patent | 6029182 |
| Link to this page | http://www.wikipatents.com/6029182.html |
| Inventor(s) | Nehab; Smadar (Palo Alto, CA), Wickramaratne; Manjula G. (Fremont, CA), Klark; Paul L. (Mountain View, CA) |
| Abstract | A World Wide Web site data retrieval system includes an input device for
inputting data and commands to access the World Wide Web, and a memory for
storing a Web site data retrieval driver which includes a Web reader,
stored Web site address information, stored Web site commands, and stored
format information. The memory also stores process steps to connect to a
Web site and to issue commands within the connected Web site, and a
connection to the World Wide Web. The system includes a processor for
launching the Web site data retrieval driver in response to a command to
access the World Wide Web. The Web site retrieval driver, upon being
launched, (1) launches the Web reader to connect to the World Wide Web via
the connection, (2) retrieves the Web site address information and Web
site commands, (3) instructs the Web reader to access the Web site based
on the Web site address information and Web site commands, (4) downloads
Web site data from the Web site based on the Web site commands, (5) stores
the Web site data in a linear document, (6) repeats steps 1 through 5
until all addresses in the stored Web site address information have been
accessed, and (7) formats the linear document into a personalized document
based on the format information. |
Title Information  |
| Publication Date |
February 22, 2000 |
| Filing Date |
October 4, 1996 |
References  |
U.S. References |
| Patent No. | Inventor | Class | Date |
| 5890152 | Rapaport | | Mar 1999 |
| 5886683 | Tognazzini | 715/700 | Mar 1999 |
| 5877766 | Bates | 715/854 | Mar 1999 |
| 5764906 | Edelstein | 709/219 | Jun 1998 |
| 5761662 | Dasan | 707/10 | Jun 1998 |
| 5758361 | van Hoff | | May 1998 |
| 5754939 | Herz | | May 1998 |
| 5737560 | Yohanan | 715/847 | Apr 1998 |
| 5649186 | Ferguson | 707/10 | Jul 1997 |
| 5530852 | Meske, Jr. | 709/206 | Jun 1996 |
| 5423043 | Fitzpatrick | 719/317 | Jun 1995 |
| 5408655 | Oren | 715/501.1 | Apr 1995 |
| 5392428 | Robins | | Feb 1995 |
| 5347632 | Filepp | 709/202 | Sep 1994 |
| 5327554 | Palazzi, III | 725/110 | Jul 1994 |
| 5267155 | Buchanan | 715/540 | Nov 1993 |
| 5181162 | Smith | 715/530 | Jan 1993 |
| 4965763 | Zamora | 704/1 | Oct 1990 |
| 4959769 | Cooper | 707/200 | Sep 1990 |
Foreign References |
Other References |
| Reference |
| At www.e.g.bucknell.edu/boulter/crayon "Crayon--Create Your Own Newspaper", Jun. 26, 1995. |
| Online User, "Personal Journal--Daily News on Your Virtual Doorstep", pp. 50-54, Oct./Nov. 1995. |
| Beretta, Giordano, "W³+Structure=Knowledge", Hewlett Packard Laboratories Technical Report HPL-96-99, Jun., 1996. |
| Advertisement by Dow Jones and Company, "News Retrieval for Windows", no date available. |
| Advertisement by the San Jose Mercury, "Newshound User Guide", no date available. |
| Collins, Regina S., ed., "Journalist(TM): Your Personal Newspaper for Compuserve(R)", Cupertino, Ca.: PED Software Corporation, pp. 1-143, Jan. 1993. |
Claims  |
What is claimed is:
1. An automated method for formatting data into a personalized newspaper from at least one hypermedia document, comprising the steps of:
an accessing step to access the at least one hypermedia document;
a traversing step to traverse selectively links in the hypermedia document;
a retrieving step to retrieve data from the hypermedia document and/or traversed links into an extracted data tree, wherein the data is retrieved based on a structure of the hypermedia document and/or links in the hypermedia document;
a flattening step to flatten the extracted data tree into a linear document; and
a formatting step to format the linear document into a formatted personalized newspaper consisting of text and/or images, wherein a number of links traversed in the traversing step can be limited to a predefined number of links.
2. The method of claim 1, further comprising the step of printing the formatted document.
3. The method of claim 1, wherein said hypermedia document is located on the World Wide Web.
4. The method of claim 1, wherein said hypermedia document is located on the Internet.
5. The method of claim 1, wherein said hypermedia document is located on an intranet.
6. The method of claim 1, wherein said accessing step, said retrieving step, said flattening step, and said formatting step are performed in accordance with a personal-news-profile.
7. An automated method for retrieving articles from a hypermedia-linked computer network and for formatting the articles into a personalized newspaper, the method comprising the steps of:
retrieving a stored personal-news-profile which comprises address data for a site on the hypermedia-linked computer network, command data for accessing data from the site, and newspaper layout commands;
contacting the site based on address data stored in the personal-news-profile;
traversing selectively links in the site;
downloading articles from the site and/or links in the site based on command data stored in the personal-news-profile;
flattening the articles into a linear document; and
formatting the linear document into the personalized newspaper according to layout commands stored in the personal-news-profile, the personalized newspaper consisting of text and/or images,
wherein a number of links traversed in the traversing step can be limited to a predefined number of links based on command data in the personal-news-profile.
8. The method of claim 7, further comprising the step of printing the personalized newspaper.
9. The method of claim 7, wherein said hypermedia-linked computer network is the World Wide Web.
10. The method of claim 7, wherein said hypermedia-linked computer network is on the Internet.
11. The method of claim 7, wherein said hypermedia-linked computer network is on an intranet.
12. The method of claim 7, wherein the command data for accessing data includes data for selecting articles based on a structure of the site.
13. The method of claim 12, wherein the command data for accessing data also includes data for selecting articles based on a content of the articles.
14. Computer executable process steps stored on a computer-readable medium, said steps for accessing World Wide Web sites for retrieving data at the sites and for formatting the data into a personalized newspaper, said steps comprising:
a connecting step to connect to the World Wide Web;
a retrieving step to retrieve user-defined Web site address information, user-defined Web site commands, and user-defined formatting commands;
an activating step to activate a Web reader so as to access a Web site based on the user-defined Web site address information, a traversing step for traversing selectively links in the Web site, and retrieving data from within the Web site and/or
links based on the user-defined Web site commands;
a downloading step to download the retrieved Web site data and/or link data from the accessed Web site into an extracted data tree;
a flattening step to flatten the extracted data tree into a linear document;
a step to repeat the downloading step and the flattening step until all addresses/links in the user-defined Web site address information have been accessed; and
a formatting step to format the stored data into the personalized document based on the user-defined formatting commands, said personalized document consisting of text and/or images,
wherein a number of links traversed in the Web site can be limited to a predefined number of links based on the user-defined Web site commands.
15. The computer executable process steps of claim 14, further comprising a spooling step to spool the personalized document to an output device.
16. The computer executable process steps of claim 15, wherein the output device is a printer.
17. The computer executable process steps of claim 15, further comprising an output step to output the personalized document to a display.
18. The computer executable process steps of claim 14, wherein the user-defined Web site commands include commands for selecting data based on a structure of the Web site.
19. The computer executable process steps of claim 18, wherein the user-defined Web site commands also include commands for selecting data based on a content of the Web site.
20. An apparatus for automatically retrieving news articles from on-line news services on the World Wide Web and formatting the news articles into a personalized newspaper, the apparatus comprising:
first storage means for storing (1) a personal-news-profile which comprises address data and command data for accessing data from a Web site, and (2) newspaper format commands;
retrieval means for retrieving the stored personal-news-profile and accessing data stored therein;
activating means for activating a Web reader to contact a Web site based on address data stored in the personal-news-profile;
traversing means for traversing selectively links in the Web site;
downloading means for downloading news articles from the contacted Web site and/or links based on command data stored in the personal-news-profile;
second storage means for storing the downloaded news articles; and
formatting means for flattening the downloaded news articles into a linear document and for formatting the linear document into the personalized newspaper based on the newspaper format commands stored in the personal-news-profile, said personal
newspaper consisting of text and/or images,
wherein a number of links traversed by the traversing means can be limited to a predefined number of links based on command data stored in the personal-news-profile.
21. The apparatus of claim 20, further comprising spooling means for spooling the personalized newspaper to a printer. |
Description  |
BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates to a data retrieval system which automatically traverses hypermedia documents on a computer network and automatically retrieves information from those documents based on a match between the structure of the documents and a
personalized data retrieval structure. More particularly, the invention can retrieve articles from a news service, from a magazine service, or from a combination of both services which are located on the World Wide Web, a private computer network that
supports hypermedia links, or any other hypermedia-linked computer system.
For example, there exists a Web site for retrieving news articles from the New York Times and a Web site for retrieving articles from People magazine. The retrieval system of the invention can traverse through such Web sites and select articles
based on a personalized data retrieval structure. The personalized data retrieval structure may include commands to retrieve a full text of the front page only, headlines of the business section, headlines of the stock section and sports section, etc.
In addition, the personalized data retrieval structure may include content-based rules to retrieve articles with certain keywords, to exclude articles with certain keywords, or to include articles based on a rule-based content analysis. The invention
also provides a method for synthesizing all retrieved news articles and printing the synthesized news articles into a newspaper-type format in which each of the articles is arranged based on a user's predefined layout.
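The content-based rules mentioned above can be illustrated with a simple keyword filter. This is a minimal sketch, not the implementation described by the invention: the `Article` type, the `matches_rules` function, and the rule semantics (reject on any excluded keyword, otherwise accept on any included keyword, using case-insensitive substring matching) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical container for one retrieved article."""
    headline: str
    body: str


def matches_rules(article, include=(), exclude=()):
    """Return True if the article passes the keyword rules.

    Assumed semantics: the article is rejected if any exclude keyword
    appears; otherwise it is kept if no include keywords are given or
    at least one include keyword appears. Matching is a
    case-insensitive substring test.
    """
    text = f"{article.headline} {article.body}".lower()
    if any(word.lower() in text for word in exclude):
        return False
    if not include:
        return True
    return any(word.lower() in text for word in include)


articles = [
    Article("Markets rally", "Stocks rose sharply on Tuesday."),
    Article("Local team wins", "The home side took the championship."),
    Article("Stocks and scandal", "A trading scandal hit the exchange."),
]
selected = [a.headline for a in articles
            if matches_rules(a, include=["stocks"], exclude=["scandal"])]
```

A rule-based content analysis, as also mentioned above, would replace the substring test with something richer; the filter shape would stay the same.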
While the above example is in the context of the Web, hypermedia documents can reside on other types of networks besides the Web, such as an intranet. An intranet is a private computer network that is not connected to outside computer networks.
For example, a company's own computer network could be an intranet with hypermedia documents on it. For brevity, the following discussion is made with respect to the World Wide Web. However, it should be understood that the invention applies equally
well to any type of computer network that contains hypermedia documents, such as an intranet, different hypermedia-linked computer networks that reside on the Internet other than the Web, etc.
A hypermedia document on the Web can span multiple Web sites. Such documents can be newspapers, news articles, magazines, catalogs, manuals, memoranda, and the like. For brevity, the following discussion is made with respect to sources of news
information. However, it should be understood that the invention applies equally well to any other type of hypermedia document.
2. Description of the Related Art
The World Wide Web is an on-line source of hypermedia documents containing hypermedia text and images that act as links to other documents, Web sites, etc. As a result, documents on the Web are not organized sequentially. Rather, a user is
automatically linked to other documents or Web sites to complete the viewing of a document by selecting a hypermedia link, such as a text link or an image link, within the document. Accordingly, an entire document cannot be viewed by scrolling through
text.
One popular use of the Web is on-line publication and distribution of magazines and newspapers. Currently, many Web news services, such as the New York Times, allow the user to define keywords of interest and to receive news information, daily
or hourly, that contains text matching the keywords. The news information can then be delivered to the user's computer via modem or E-mail. However, most Web news site newspapers, like the New York Times, include too much information, most of which is
of no interest to the user since the information is retrieved based only on a keyword match.
Other sources of news information are provided through information suppliers like "Individual Inc." Individual Inc. supplies users with a brief summary of the top twenty most relevant articles based on a user's predefined keywords. This
subscription news service allows the user to specify five to ten areas of interest based on keywords, which are then prioritized by the user. The information service searches the Web for magazines and newspapers which contain any of the keywords. Based
on the keyword searches, twenty of the most relevant articles are selected, compiled into a brief one-page summary, and transmitted to the user via facsimile for the user's review. However, in order to review an entire document rather than the summary,
the user must log onto a specific Web site containing the document in order to retrieve and review the document.
There are yet other services which permit the user to personalize a newspaper to be displayed at the user's terminal by storing links to various news articles from various news sources on the Web. For example, CRAYON "Create Your Own Newspaper"
permits a user to select specific sections from among links to over twenty-five different on-line newspapers, and to compose the selections into a personalized newspaper. Using CRAYON, it is possible to compose a personalized newspaper containing, for
example, links to the international section of the New York Times, the business section of the Wall Street Journal, and the sports section of the Chicago Tribune. The HTML (hypertext markup language) source file for this newspaper is then stored to mass
media storage for later use.
While the foregoing news and information services provide convenient ways to keep updated on the news, they do not allow a user to access and view the news in the way that people naturally read a real-world newspaper. Namely, people naturally
read a newspaper by scanning the pages of sections that they find interesting and then reading those articles that grab their attention. In other words, people use a structural approach to decide what pages to look at initially (e.g., the first page of
the Business and World sections, and the comics page of the Arts section). They then scan the selected pages for articles.
In sum, conventional news and information services do not allow a user to access data from a hypermedia document on the basis of the structure of the document, and then to format that data in a manner that allows the user to scan and read the
data in a natural fashion.
SUMMARY OF THE INVENTION
The invention addresses the above deficiencies in the art by accessing at least one hypermedia document, retrieving data from the hypermedia document into an extracted data tree, with the data retrieved based on a structure of the hypermedia
document, flattening the extracted data tree into a linear document, and formatting the linear document into a formatted document.
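The flattening and formatting steps described above can be sketched as follows. This is an illustration under stated assumptions: the `Node` type, the depth-first pre-order walk, and the indented plain-text output are choices made for the example, and the source does not specify any of them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node of an extracted data tree."""
    title: str
    text: str = ""
    children: list = field(default_factory=list)


def flatten(node, depth=0):
    """Flatten the tree into a list of (depth, title, text) entries.

    Assumes a depth-first, pre-order walk, so each section is followed
    immediately by its own subsections and articles.
    """
    entries = [(depth, node.title, node.text)]
    for child in node.children:
        entries.extend(flatten(child, depth + 1))
    return entries


def format_document(entries):
    """Render the linear document as indented plain text."""
    lines = []
    for depth, title, text in entries:
        indent = "  " * depth
        lines.append(f"{indent}{title}")
        if text:
            lines.append(f"{indent}  {text}")
    return "\n".join(lines)


tree = Node("Front Page", children=[
    Node("Business", children=[Node("Markets rally", "Stocks rose.")]),
    Node("Sports", children=[Node("Local team wins", "A late goal.")]),
])
linear = flatten(tree)
page = format_document(linear)
```

Keeping the depth in each flattened entry lets the formatter recover the section structure without walking the tree again.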
In another aspect, the invention creates a personal-news-profile for retrieving data from a hypermedia-linked computer network. The hypermedia-linked computer network is accessed, a learning mode is started, the hypermedia-linked computer
network is traversed with commands, at least one rule is extracted from the commands, and the rule(s) is compiled into the personal-news-profile.
In yet another aspect, the invention creates a personalization profile for a Web site data retrieval system. Data and commands are input to access the World Wide Web and a connection is made to the World Wide Web. A Web reader is
launched, and the Web reader accesses the Web via the connection. In response to user commands, a learning mode is entered. Commands are sent to traverse the World Wide Web, and at least one rule is extracted from the commands. The rule(s) is
compiled into a personalization profile, which is stored.
In yet another aspect, the invention retrieves articles from a hypermedia-linked computer network and formats the articles into a personalized newspaper. A stored personal-news-profile is retrieved. The personal-news-profile includes address
data for a site on the hypermedia-linked computer network, command data for accessing data from the site, and newspaper layout commands. The site is accessed based on address data stored in the personal-news-profile, and articles at the site are
downloaded based on command data stored in the personal-news-profile. The downloaded articles are flattened into a linear document, and the linear document is formatted into the personalized newspaper according to newspaper layout commands stored in the
personal-news-profile.
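One way to picture the three parts of a personal-news-profile (address data, command data, and newspaper layout commands) is as a small data structure. The sketch below is hypothetical: the class names, the field names, the example.com addresses, and keys such as `max_links` and `columns` are assumptions for illustration and are not taken from the source.

```python
from dataclasses import dataclass, field


@dataclass
class SiteEntry:
    """Address data and command data for one site (hypothetical layout)."""
    address: str
    commands: dict = field(default_factory=dict)


@dataclass
class PersonalNewsProfile:
    """A profile: the sites to visit plus newspaper layout commands."""
    sites: list = field(default_factory=list)
    layout: dict = field(default_factory=dict)


profile = PersonalNewsProfile(
    sites=[
        SiteEntry("http://news.example.com/",
                  {"sections": ["front page", "business"], "max_links": 10}),
        SiteEntry("http://magazine.example.com/",
                  {"sections": ["sports"], "headlines_only": True}),
    ],
    layout={"columns": 3, "paper": "letter"},
)

# The retrieval loop would visit each stored address in turn.
addresses = [site.address for site in profile.sites]
```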
In yet another aspect, the invention retrieves data from a World Wide Web site and formats the data into a personalized document. A Web site data retrieval driver which includes a Web reader, stored Web site address information, stored Web site
commands, and stored format information is accessed. The invention (1) launches the Web reader to connect to the World Wide Web via a connection to the Web, (2) retrieves the Web site address information and Web site commands, (3) instructs the Web
reader to access the Web site based on the Web site address information and Web site commands, (4) downloads Web site data from the Web site based on the Web site commands, wherein the data is downloaded with reference to a linked list so as to avoid
hypermedia-links that form loops and so as to avoid repetitious downloading of data that has already been downloaded, (5) stores the Web site data in a linear document, (6) repeats steps 2 through 5 until all addresses in the stored Web site address
information have been accessed, and (7) formats the linear document into the personalized document based on the format information.
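The loop-avoiding download in step (4) can be sketched as a traversal that records every address it has already seen. The source refers to a linked list for this record; the sketch below substitutes a Python set and a queue, which is an assumption made for brevity. The in-memory `SITE` table, the `crawl` function, and the `max_links` parameter are likewise hypothetical, and no network access is performed.

```python
from collections import deque

# Hypothetical in-memory site: address -> (page text, outgoing links).
SITE = {
    "/front": ("Front page", ["/business", "/sports"]),
    "/business": ("Business news", ["/front", "/markets"]),
    "/sports": ("Sports news", ["/front"]),
    "/markets": ("Market report", ["/business"]),
}


def crawl(start, fetch, max_links=None):
    """Download pages reachable from start, visiting each address once.

    `visited` records every address already queued, which prevents
    loops and repeated downloads. `max_links`, if given, caps the
    number of pages downloaded.
    """
    visited = {start}
    queue = deque([start])
    linear_document = []
    while queue:
        if max_links is not None and len(linear_document) >= max_links:
            break
        address = queue.popleft()
        text, links = fetch(address)
        linear_document.append((address, text))
        for link in links:
            if link not in visited:
                visited.add(link)
                queue.append(link)
    return linear_document


pages = crawl("/front", SITE.__getitem__)
limited = crawl("/front", SITE.__getitem__, max_links=2)
```

Although every page in `SITE` links back to an earlier one, each address is downloaded only once, and the limited run stops after two pages.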
In yet another aspect, the invention accesses and retrieves data at World Wide Web sites and formats the data into a personalized document. The invention connects to the World Wide Web, retrieves user defined Web site address information, user
defined Web site commands, and user defined formatting commands, and activates a Web reader so as to access a Web site based on the user defined Web site address information. The Web reader is used to download data from the Web based on the user defined
Web site commands, and the data is downloaded into an extracted data tree. The downloading continues until all addresses in the user defined Web site address information have been accessed. The extracted data tree is flattened into a linear document,
and the flattened document is formatted into the personalized document based on the user defined formatting commands.
In yet another aspect, the invention retrieves news articles from on-line news services on the World Wide Web and formats the news articles into a personalized newspaper. The invention stores a personal-news-profile which comprises address
data and command data for accessing data from a Web site and newspaper format commands, retrieves the stored personal-news-profile and accesses the data stored therein, activates a Web reader to contact a Web site based on address data stored in the
personal-news-profile, downloads news articles at the contacted Web site based on command data stored in the personal-news-profile, stores the downloaded news articles, and formats the stored news articles into the personalized newspaper based on the
newspaper format commands stored in the personal-news-profile.
In yet another aspect, the invention formats a hypermedia document into a personalized document. A location of the hypermedia document is specified, a type of the hypermedia document is specified, a scope of data to be retrieved from the
hypermedia document is specified, wherein the scope is based on a structure of the hypermedia document, and a format is specified for formatting the data retrieved from the hypermedia document into the personalized document. The hypermedia document
found at the specified location is accessed, data is retrieved from the hypermedia document in accordance with the specified hypermedia document type and in accordance with the specified scope, and the data is formatted into the personalized document in
accordance with the specified format.
In yet another aspect, the invention is a system for processing a hypermedia document. The system accesses the hypermedia document, extracts addresses from the hypermedia document, and stores the addresses extracted from the hypermedia document
in a container. The system activates a processing function to process data stored at the addresses stored in the container, downloads the data stored at the addresses stored in the container into a memory, and extracts predetermined data from downloaded
data in accordance with predetermined configuration information. The predetermined data is then formatted in accordance with predefined formatting settings to generate a formatted document, and the formatted document is processed in accordance with the
processing function.
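Extracting addresses from a hypermedia document into a container, as described above, might look like the following for an HTML page. This is a sketch that assumes the addresses of interest are the `href` values of anchor tags and that a plain list serves as the container; the `LinkExtractor` class and the example.com URL are hypothetical. It uses only the Python standard library (`html.parser.HTMLParser` and `urllib.parse.urljoin`).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute link addresses from an HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.addresses = []  # the "container" of extracted addresses

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the document's address.
                absolute = urljoin(self.base_url, value)
                if absolute not in self.addresses:
                    self.addresses.append(absolute)


html = """
<html><body>
<a href="/business/markets.html">Markets</a>
<a href="sports.html">Sports</a>
<a href="/business/markets.html">Markets again</a>
</body></html>
"""
extractor = LinkExtractor("http://news.example.com/today/index.html")
extractor.feed(html)
container = extractor.addresses
```

The duplicate link is stored only once, so a later processing function downloads each address a single time.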
In preferred embodiments, the system inputs the formatting settings and configuration information via a graphical user interface. The graphical user interface comprises plural processing icons, one of which activates the processing function. By
virtue of the graphical user interface, a user can interactively set a document's format and change that format should a change be desired.
In particularly preferred embodiments, the graphical user interface is displayed in plural modes. The plural modes comprise (1) a fully-functional mode in which the graphical user interface displays formatting fields, processing options, menus
and the processing icons, and (2) a minimizing mode in which the graphical user interface displays only the processing icons. Typically, the graphical user interface displayed in the minimizing mode is displayed during browsing of the hypermedia document.
By displaying the graphical user interface in plural modes, the present invention facilitates operation of the invention during browsing of the hypermedia document.
This summary has been provided so that the nature of the invention may be understood quickly. A more complete understanding of the invention can be obtained by reference to the following detailed description of the preferred embodiments thereof
in connection with the attached drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a perspective view showing the outward appearance of the personal news retrieval system according to the invention.
FIG. 2 is a block diagram of the personal news retrieval system shown in FIG. 1.
FIG. 3, comprised of FIGS. 3A, 3B, 3C and 3D, shows representational diagrams illustrating an example of the transformation of information from the Web (FIG. 3A) to an extracted data tree (FIG. 3B), then to a flattened document (FIG. 3C), and
finally to a formatted document (FIG. 3D) according to the invention.
FIG. 4 is a representational block diagram of the manner by which a personal-news-profile for retrieving news articles via the Web is created or edited according to the invention.
FIG. 5, comprised of FIGS. 5A and 5B, shows flow diagrams describing how a personal-news-profile is created or edited.
FIG. 6 is a representational block diagram of the manner by which news articles are retrieved from the Web and formatted with reference to a personal-news-profile according to the invention.
FIG. 7 is a flow diagram describing how news articles are retrieved from the Web with reference to a personal-news-profile.
FIG. 8 is a flow diagram showing how retrieved news articles are formatted with reference to a personal-news-profile and sent to a print device interface.
FIGS. 9A to 9E depict a graphical user interface used with the second embodiment of the present invention.
FIG. 10 is a flow diagram describing the operation of the second embodiment of the invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
FIG. 1 is a view showing the outward appearance of a representative embodiment of the invention. Shown in FIG. 1 is computing equipment 1, such as a Macintosh or an IBM PC or a PC-compatible computer, having a windowing environment, such as
Microsoft Windows. Provided with computing equipment 1 is display screen 2, such as a color monitor or a monochromatic monitor, keyboard 3 for entering text data and user commands, and a pointing device such as mouse 4 for pointing and for manipulating
objects displayed on display 2. Computing equipment 1 also includes a mass storage device such as disk drive 5. Image data can be input into computing equipment 1 from a variety of sources such as a network interface 11a or from external devices via
facsimile/modem interface 6. Network interface 11a is used to connect computing equipment 1 to a local area network (LAN) or to a wide area network (WAN) such as the World Wide Web.
FIG. 2 is a detailed block diagram showing the internal construction of computing equipment 1. As shown in FIG. 2, computing equipment 1 includes central processing unit (CPU) 8 interfaced with computer bus 9. Also interfaced with computer bus
9 is printer interface 10, fax/modem interface 6, display interface 11, network interfa