|
Claims  |
|
|
What is claimed is:
1. In a computer network incorporating a plurality of servers used to store documents, each said document having a unique document identifier, and a client computer having a
browser configured to request and receive said documents over said network, a system for providing annotation overlays for a requested document, said system comprising:
at least one stored overlay group associated with one of said servers, said overlay group encapsulating annotation overlays regarding at least one of said documents, each said stored overlay group having a unique source identifier;
an annotation overlay proxy configured to form a merged document by merging said requested document from a first server with associated annotation overlays regarding said requested document from specified overlay groups and to relay said merged
document to a receiver selected from another proxy or said browser; and
an annotation directory associated with said annotation overlay proxy, wherein said annotation directory stores said annotation overlays from said specified overlay groups, and wherein said annotation overlay proxy is configured, in response to a
document request issued by said browser, to retrieve a set of associated annotation overlays related to said requested document from said annotation directory prior to merging said set of associated annotations and said requested document;
wherein said annotation directory is created by said annotation overlay proxy, which, upon receiving a list of said specified overlay groups identified by their unique source identifiers, is configured to retrieve all of said annotation overlays
from said specified overlay groups and store said retrieved annotation overlays in an electronic memory coupled to said annotation overlay proxy, said stored annotation overlays forming said annotation directory.
2. The system of claim 1, wherein at least one overlay group encapsulates a plurality of distinct annotation overlays regarding one of said documents.
3. The system of claim 1, wherein each overlay group encapsulates annotation overlays provided by a single commentator.
4. The system of claim 1, wherein at least a subset of said annotation overlays each have associated therewith data indicating with which document or documents said each annotation overlay is to be associated.
5. The system of claim 1, wherein each of said annotation overlays has an associated unique overlay identifier so that each of said annotation overlays can be annotated in same manner as said documents.
6. The system of claim 1, wherein said annotation overlay proxy is configured so as to be operable in series with other proxies, said other proxies including other said annotation overlay proxies and firewall proxies.
7. The system of claim 1, wherein said annotation overlays are defined using hypertext mark up language (HTML).
8. The system of claim 1, wherein said annotation overlay comprises:
a document id specifying a set of documents to which said annotation overlay is applicable;
a pattern in said set of documents;
an action code specifying an action to be taken by said annotation overlay proxy with regard to said pattern in said set of documents when creating said merged document; and
an argument supplying additional information to assist said annotation overlay proxy in executing said action.
9. The system of claim 8, wherein said action code further comprises:
a location field that indicates a part of said set of documents, said part being selected from a document, section, paragraph, sentence and word containing said pattern; and
a where field that indicates where, with respect to said part, said action is to be implemented.
10. The system of claim 8, wherein, when said action code is a replacement code and said requested document is within said set of documents, said annotation overlay proxy forms said merged document by substituting for each occurrence of said
pattern in said requested document said additional information in a corresponding part of said merged document.
11. The system of claim 8, wherein, when said action code is an insertion code and said requested document is within said set of documents, said annotation overlay proxy associates said additional information with each occurrence in said merged
document of said pattern.
12. The system of claim 8, wherein, when said action code is a deletion code and said requested document is within said set of documents, said annotation overlay proxy prevents from appearing in said merged document any occurrence of said
pattern.
13. The system of claim 8, wherein, when said action code is a program code and said requested document is within said set of documents, said annotation overlay proxy executes a filter program on each occurrence of said pattern in said requested
document so that a corresponding part of said merged document includes a transformation of said pattern operated on by said filter program.
14. In a computer network incorporating a plurality of servers used to store documents, each said document having a unique document identifier, and a client computer having a browser configured to request and receive said documents over said
network, a method for providing annotation overlays for a requested document, said method comprising the steps of:
associating at least one stored overlay group with one of said servers, said overlay group encapsulating annotation overlays regarding at least one of said documents, each said stored overlay group having a unique source identifier;
forming a merged document by merging said requested document from a first server with associated annotation overlays regarding said requested document from specified overlay groups; and
relaying said merged document to a receiver selected from a proxy or said browser,
wherein all said requested documents are routed through an annotation overlay proxy, said annotation overlay proxy performing said merging and said relaying steps, and, before said merging step, forming an annotation directory, wherein said
annotation directory stores said annotation overlays from said specified overlay groups;
said annotation overlay proxy forming said annotation directory by:
receiving a list of said specified overlay groups identified by their unique source identifiers;
retrieving all of said annotation overlays from said specified overlay groups; and
storing said retrieved annotation overlays in an electronic memory coupled to said annotation overlay proxy.
15. The method of claim 14, wherein said annotation overlay comprises:
a document id specifying a set of documents to which said annotation overlay is applicable;
a pattern in said set of documents;
an action code specifying an action to be taken with respect to said set of documents when forming said merged document; and
an argument supplying additional information for said action code.
16. The method of claim 15, wherein, when said action code is a replacement code and said requested document is within said set of documents, said action comprises substituting in said merged document said additional information for parts of
said requested document having a specified relationship to said pattern.
17. The method of claim 15, wherein, when said action code is an insertion code and said requested document is within said set of documents, said action comprises inserting in said merged document said additional information at a position set
wherein each position in said position set has a specified relationship to said pattern.
18. The method of claim 15, wherein, when said action code is a deletion code and said requested document is within said set of documents, said action comprises preventing from appearing in said merged document any occurrence of a part of said
requested document having a specified relationship to said pattern.
19. The method of claim 15, wherein, when said action code is said program code and said requested document is within said set of documents, said action comprises executing a specified filter program on parts of said requested document having a
specified relationship to said pattern so that corresponding parts of said merged document include a transformation of said parts operated on by said filter program.
20. A computer-readable memory that can be used to direct a computer to merge stored annotation overlays with documents stored on a computer network to which said computer is coupled, said computer-readable memory comprising:
(1) an annotation overlay including:
(a) a document id specifying a set of documents to which said annotation overlay is applicable;
(b) a pattern in said set of documents;
(c) an action code specifying an action to be taken by said computer with regard to said pattern in said set of documents when creating said merged document; and
(d) an argument supplying additional information to assist proxy procedures in executing said action with regard to said pattern;
(2) said proxy procedures for forming a merged document by merging said annotation overlay with a requested document within said set of documents, said merging including performing said action specified by said action code; and
(3) an annotation directory associated with said proxy procedures, wherein said annotation directory stores said annotation overlays from specified overlay groups, and wherein said proxy procedures are configured, in response to a document
request issued by said computer, to retrieve a set of associated annotation overlays related to said requested document from said annotation directory prior to merging said set of associated annotations and said requested document;
wherein said annotation directory is created by said proxy procedures, which, upon receiving a list of said specified overlay groups identified by their unique source identifiers, are configured to retrieve all of said annotation overlays from
said specified overlay groups and store said retrieved annotation overlays in an electronic memory coupled to said proxy procedures, said stored annotation overlays forming said annotation directory.
21. The computer readable memory of claim 20, wherein, when said action code is a replacement code and said requested document is within said set of documents, said proxy procedures substitute in said merged document said additional information
for parts of said requested document having a specified relationship to said pattern.
22. The computer readable memory of claim 20, wherein, when said action code is an insertion code and said requested document is within said set of documents, said proxy procedures insert in said merged document said additional information at a
set of insertion points wherein each of said insertion points has a specified relationship to said pattern.
23. The computer readable memory of claim 20, wherein, when said action code is a deletion code and said requested document is within said set of documents, said proxy procedures prevent from appearing in said merged document any occurrence of a
part of said requested document having a specified relationship to said pattern.
24. The computer readable memory of claim 20, wherein, when said action code is said program code and said requested document is within said set of documents, said proxy procedures execute a specified filter program on parts of said requested
document having a specified relationship to said pattern so that a corresponding part of said merged document includes a transformation of said part operated on by said filter program. |
|
|
|
|
Claims  |
|
|
Description  |
|
|
The
present invention relates generally to computer networks, and particularly to proxy servers used to provide annotation overlays for documents stored on computer networks.
BACKGROUND OF THE INVENTION
The World-Wide Web ("WWW") links many of the servers making up the Internet, each storing documents identified by unique universal resource locators (URLs). Many of the documents stored on Web servers are written in a standard document
description language called HTML (hypertext markup language). Using HTML, a designer of Web documents can associate hypertext links or annotations with specific words or phrases in a document (these hypertext links identify the URLs of other Web
documents or other parts of the same document providing information related to the words or phrases ) and specify visual aspects and the content of a Web page.
A user accesses documents stored on the WWW using a Web browser (a computer program designed to display HTML documents and communicate with Web servers) running on a Web client connected to the Internet. Typically, this is done by the user
selecting a hypertext link (typically displayed by the Web browser as a highlighted word or phrase) within a document being viewed with the Web browser. The Web browser then issues a HTTP (hypertext transfer protocol) request for the requested document
to the Web server identified by the requested document's URL. In response, the designated Web server returns the requested document to the Web browser, also using the HTTP.
Many entities, especially corporations that allow access from corporate systems to the Web, modify this document access process by providing a firewall proxy running on a proxy server situated between the Web client running the browser and the
various Web servers hosting the requested documents. In this modified situation, all HTTP requests issued by the browser and all documents returned by the Web servers simply routed through the firewall proxy, which implements a proxy server
communications protocol that is a subset of the HTTP. Apart from providing a buffer between the Web client and servers, a pure firewall proxy performs no additional operations on the transferred information. Another common type of firewall proxy is a
caching firewall proxy, which caches requested documents to provide faster subsequent access to those documents.
The ease of access and page design provided by the Web has proved attractive to many types of uses; e.g., individuals and corporations, who have not traditionally used the Internet. Additionally, the WWW is increasingly being used for commercial
purposes, such as advertising and sales. Together, the new users and uses mean that an information explosion is occurring on the Web. With this information explosion it is becoming increasingly important that Web users be able to comment on the content
of Web document, view the commentary of others, or filter information in Web pages. For example, a competitor or industry critic might wish to comment on product announcements made by another competitor, buyers of a specific service might want to access
the commentary of certain critics (but not others) regarding that service and parents might wish to block their children's access to all documents classified as inappropriate by a review board with whose opinions they agree. Ideally, these features
would be implemented in a manner that is compatible with existing Web browsers and the HTTP.
One system that provides a subset of these features by taking advantage of the proxy server protocol is the Open Software Foundation's World Wide Web Agent Toolkit, or OreO. OreO allows users to build personal agents that can perform filtering
functions on requested documents before they are viewed using the Web browser. The agents created with OreO can be used in pipeline anywhere between a traditional Web client (i.e., Web browser) and a Web server to perform more complex and varied
filtering of Web transactions. For example, a user could connect an obscenity filter in series with a violence filter to ensure appropriate Web browsing for their children. OreO makes this pipelining possible by providing agent interfaces that make
each agent look like a traditional Web client on one side and a proxy server on the other.
However, because the OreO toolkit does not address the creation of source libraries of commentary associated with known commentators and critics, OreO agents are not well-suited to merge commentary by sources other than the creator of a requested
document with the requested document. Moreover, OreO agents can only perform filtering by parsing all requested documents looking for occurrences of certain key phrases or patterns then deleting or replacing those key phrases or patterns. Clearly then,
unless a commentator creates a new agent for each new document or class of documents and makes those agents available to all interested Web users, which would be extremely unwieldy, many alternate words and phrases equivalent to the key words and phrases
would be missed. Finally, because the agents are not Web documents, it would not be possible to provide overlays to comment on the changes made by a first agent without another commentator creating a second agent and making that second agent available
for users to insert between the Web browser and the first agent.
Therefore, there is a need for a system that introduces a proxy server between Web servers and clients that allows parts of requested documents to be annotated, filtered, transformed or deleted before the documents are viewed with a Web browser.
Unlike the OreO agent, this system should perform the aforementioned annotating, filtering, transforming and deleting based on sources of commentary associated with Web servers that might be completely unrelated to the author of the requested document.
Ideally, a user should be able to indicate to the proxy server specific overlay sources to merge. Then, when the user requests a document, that request should be relayed through the proxy, which merges the requested document with overlays from the
user-specified sources that reference the requested document. The resulting merged document should be viewable with any existing Web browser.
Alternatively, the system should allow a user of the proxy to direct the proxy to form a library of annotations from a specific set of sources. Then, when a user requests a document, the proxy should be able to merge comments in the library of
annotations with the requested document, eliminating the need to search the Web for the appropriate annotations. Ideally, each of the overlays should have their own URL's so they could be easily annotated by other commentators.
SUMMARY OF THE INVENTION
In summary, the present invention is a system and method for merging annotations from various sources with documents requested over the Web in such a way that the merged document is displayable by existing Web browsers.
Specifically, the present invention is a system for providing annotation overlays for documents requested over a computer network that incorporates a plurality of servers to store the documents. Each stored document has a unique document
identifier and is viewable from a client computer having a browser configured to request and receive documents over the network. Features of the present invention include at least one stored overlay group associated with one of the servers. Each such
overlay group encapsulates annotation overlays regarding at least one document and has a unique source identifier. Another feature of the present invention is an annotation overlay proxy (AOP), which is a software routine configured to merge a requested
document from a first server with associated annotation overlays regarding the requested document from specified overlay groups. The annotation overlay proxy then relays the merged document to a receiver unit that is selected from another proxy
(possibly a firewall proxy or another annotation overlay proxy) or the browser, which ultimately displays the merged document.
The present invention is also a method usable in the same type of computer network for providing annotation overlays for a requested document. As a first step, at least one stored overlay group is associated with a network document server. A
merged document is then formed by merging a requested document stored on a first server with annotation overlays regarding the requested document from specific overlay groups. This merged document is then relayed to a receiver selected from another
proxy or said browser.
BRIEF DESCRIPTION OF THE DRAWINGS
Additional objects and features of the invention will be more readily apparent from the following detailed description and appended claims when taken in conjunction with the drawings, in which:
FIG. 1 is a block diagram of a preferred embodiment of the present invention.
FIG. 2 is a block diagram of the preferred embodiment showing the situation where two annotation overlay proxies are connected in series.
FIG. 3 is a flow chart of a preferred method for merging annotation overlays and requested documents.
FIG. 4 is a diagram illustrating how the annotation overlay proxy merges annotations with a requested document.
FIG. 5 is an illustration of the appearance of a merged document displayed by a web browser.
FIG. 6 is a block diagram of an alternative embodiment of the present invention where the annotation overlay proxy does not build or use the annotation directory of FIG. 1.
FIG. 7 is a block diagram of an alternative embodiment of the present invention where each annotation overlay can refer to one or more documents .
DESCRIPTION OF THE PREFERRED EMBODIMENT
Referring to FIG. 1, there is shown a block diagram of a preferred embodiment 100, which includes a Web client 110, a proxy server 130 coupled to the Web client 110 and Web servers 140, 142, 144, each of which are coupled to the proxy server 130.
A Web browser 112 executes in the Web client 110, while an annotation overlay proxy (AOP) 114 executes in the proxy server 130. Information output by the Web browser 112 to the AOP 114 includes an overlay sources message 116 and a requested
document message 117, while the AOP 114 provides the browser 112 with a merged document 120. Data structures employed by the present invention, which can be stored in any form of electronic or magnetic memory, include an annotation directory 118
associated with the proxy server 130 and overlay groups 150, 152 associated with the Web servers 140 and 144, respectively. Each of the overlay groups 150, 152 can include multiple annotation overlays 151a-d, 153e-f that include cross-references to
documents such as the documents 146 (Doc1) and 148 (Doc2), which are stored on the Web servers 142 and 144, respectively. Where large numbers of annotation overlays are associated with a particular overlay group, in the interests of scalability, that
group can be split among several Web servers. The annotation directory 118 includes a plurality of annotation overlays 119a-f drawn from one or more of the overlay groups 150, 152.
The AOP 114 is coupled to the Web servers 140-44 and communicates with the Web servers 140-44 using standard Internet (TCP/IP) and WWW (HTTP) protocols. The AOP 1 14 relays all commands from the Web browser 112 to the Web servers 140-44 and
receives from the Web servers a copy of the requested document 115. Actions of the AOP 114 are directed by local AOP procedures 114a and programs 113.
Please note that the numbers of components shown in FIG. 1 are merely exemplary. Also, the preferred embodiment depicted in FIG. 1 is a generic configuration intended to illustrate the basic principles of the present invention. Consequently,
the following descriptions of the preferred embodiment are applicable to other configurations of the elements shown, including a configuration where the annotation directory 118 and the annotation overlay proxy 114 are resident in the Web client 110.
Having set out the elements of the present invention, these elements are now described in greater detail. The following descriptions, while general, will in places be directed to the situation where a user has issued a request to view the document 146,
which is associated with the Web server 142.
A. Web Browser:
The Web browser 112, which displays the merged document 120, is functionally identical to prior art Web browsers. Thus, as set out in the background section, a user of the Web browser 112 accesses a document 146 or 148 stored on the Web by first
selecting a hypertext link (i.e., a highlighted word or phrase) within a document currently being displayed by the Web browser 112. Alternatively, a user can issue a document request by entering the desired document's URL in the Web browser. Similarly
to existing browsers, the Web browser 112 acts on a user's document request by issuing a HTTP document request message 117 specifying the URL of the requested document. Unlike existing browsers, the browser 112 issues the document request message 117 to
the AOP 114, rather than to a firewall proxy or to a group of Web servers. However, in all respects, the HTTP document request message 117 is identical to one issued by existing browsers.
The Web browser 112 also issues an HTTP sources message 116 to the AOP 114. This is a new message (i.e., a message not currently used by existing Web browsers) that specifies the URLs of the overlay groups containing information to be merged by
the AOP 114 with the document identified in the document request message 117. For example, in FIG. 1, the sources message 117 indicates that the user wants to view annotation overlays from overlay groups 150 and 152, corresponding to Sources 1 and 2,
respectively. This message can be issued by the Web browser 112 at any time after the AOP 114 has been initialized.
The Web browser 112 can initialize the AOP 114 in two ways. First, the user can enter the various overlay groups they wish to view on a command line from the browser 112 or even when they start the browser 112. The browser then initializes the
AOP 114 and immediately issues a corresponding sources message 116, causing the AOP 114 to build the annotation directory 118. Alternatively, a list of sources can be submitted to the browser 112 using a common gateway interface (CGI), after which the
browser initializes the AOP 114 and issues the appropriate sources message 116 to the AOP 114.
B. Annotation Overlay Processor (AOP):
The AOP 114, like any other proxy (e.g, a firewall proxy), communicates with entities connected to the Web, such as the Web client 110 or the Web servers 140-42, using the standard HTTP proxy server communications protocol. All functions of the
AOP 114 are directed by a set of local AOP procedures 114a. Like a firewall proxy, much of the AOP's job involves merely passing along messages between the Web browser 112 and the Web servers 140-44. For example, upon receiving a HTTP document request
message 117 from the Web browser 112, the AOP 114 simply relays that message to the Web servers 140-44, the appropriate one of which returns the requested document.
However, in contrast to a firewall proxy, which acts only as a kind of glass wall between the Web client 112 and servers 140-144, the AOP 114 can also transform the data being returned by the Web servers 140-144 to the Web browser 112. In the
preferred embodiment illustrated in FIG. 1, this transformation involves the AOP 114 merging a requested document returned by one of the Web servers 140-44 with annotation overlays 119 from the annotation directory 118 that are associated with the
requested document.
The AOP 114 builds an annotation directory 118 by searching the Web servers 140-44 for all annotation overlays contributed by the sources, or overlay groups, designated by the Web browser 112 in an HTTP sources message 116, and then storing those
annotations in the annotation directory 118 as annotation overlays 119. Thus, in FIG. 1, the annotation directory 118 includes all annotations from Source 1 (overlay group 150) and Source 2 (overlay group 152), which were specified in the message 116.
Generally, the AOP 114 builds the annotation directory 118 only upon receiving the HTTP sources message 116.
Each annotation overlay 119 has five fields: (1) document URL, (2) source, (3) pattern, (4) action and (5) arg (short for argument), which respectively tell the AOP 114:
(1) the URL of the document to which the annotated pertains;
(2) which source contributed the annotation overlay;
(3) what specific part (or pattern) of the returned document the overlay pertains to;
(4) the action to take with respect to the pattern; and
(5) any additional information to associate in the merged document with the pattern. This additional information can include text or graphics to be inserted in the merged document or a designation of a "type" annotation, such as grammar error
("gr"), spelling error ("sp"), "agree", or "disagree".
To promote efficient retrieval of overlays, the AOP 114 orders the annotation overlays 119 on the URL of the annotated documents, although any other ordering of the overlays 119 is possible. These fields will be described in greater depth below.
Upon receiving the image of a requested document from a Web server responding to an HTTP document request message 117 issued by the browser 112, the AOP 114 first identifies the set of annotation overlays 119 in the annotation directory 118 that
are associated with the requested document's URL. For example, assuming that the document requested and returned was the document 146, this set would be the annotation overlays 119a,b,e,f (FIG. 1). This task is made particularly easy in the preferred
embodiment, where all annotation overlays 119 are organized by document URL. The AOP 114 then creates the merged document 120 by transforming the returned image 115 of the requested document 146 according to the information from the annotations
119a,b,e,f. This transformation is effected by the AOP 114 adding HTML-formatted content to the merged document 120 so that the annotations are seamlessly integrated with the requested document and viewable using existing Web browsers such as the
browser 112.
In the preferred embodiment, AOPs can be serially connected. Such an arrangement is shown in FIG. 2, where two AOPs 180, 182 are connected in series, with the AOP 180 being linked to the Web browser 112 and the AOP 182 to the Web servers 140-44. As in FIG. 1, each AOP has access to an annotation directory, 118a and 118b containing annotations derived from sources specified by a user using the HTTP sources message 116. However, unlike the arrangement shown in FIG. 1, the annotation directories
118a and 118b include annotation overlays only from Source 1 and 2, respectively. The need to serially connect AOPs can arise in several situations, two of which are now described.
In a first situation, the annotation overlays from Source 1 and 2 might be formatted differently; consequently, the AOPs 180, 182 are specialized to read the annotation overlays from Sources 1 and 2, respectively. When the user asks for source
information from diverse sources, such as Sources 1 and 2, the Web browser 112 opens as many different types of AOP as necessary to handle the different annotations. The merged document 184 returned to the Web browser 112 is then created in two steps.
First, the AOP 182 merges the annotations from Source 2 with the returned document 115, then the AOP 180 merges the annotations from Source 1 with the intermediate document 115'.
A second situation where two AOPs are serially connected is where a source of annotations, such as Source 1, largely provides commentary on annotations from another source, such as Source 2. This is possible in the preferred embodiment as each
annotation overlay has an associated URL that corresponds to the unique reference numbers 119a,b,e,f; 151a-d and 153e-f. When this situation arises, two AOPs are created so that the annotations from Source 1 are merged only after all of the annotations
from Source 2.
C. Overlay Groups
The same type of overlay information provided in the annotation directory 118 is also provided by the overlay groups 150, 152, which, as set out above, are the original sources for the annotation overlays copied by the AOP 114 to the annotation
directory 118. The only real difference between the overlay groups 150, 152 and the annotation directory 118 is that the overlay groups contain annotations for a multiplicity of documents contributed by a single source rather than a collection of
annotations from a set of designated sources (i.e., the sources designated to the AOP 114 using the sources message 116). Consequently, the overlay groups are organized differently from the annotation directory.
In the preferred embodiment, each annotation overlay group 150, 152 corresponds to a single source (i.e., Source 1 and Source 2, respectively) and is formatted as shown in Table 1.
TABLE 1 ______________________________________ document.sub.-- url 1 pattern a, action a arg a pattern b, action b arg b . . . document.sub.-- url 2 pattern c, action c arg a pattern d, action d arg b . . . document.sub.-- url 3
pattern e, action e arg e . . . document.sub.-- url 4 pattern e, action e arg e ______________________________________
In this representation, each document.sub.-- url corresponds to a document URL in the annotation directory 118 or overlay group 150, 152 (FIG. 1). The other information fields: pattern (3), action (4) and arg (5), are the same as for a record
119, 151, 153 in the annotation directory 118 or overlay groups 150, 152, respectively.
For example, the overlay group 150 (FIG. 1) includes four annotation overlays 151a-d provided by Source 1. Of these overlays, the first two, 151a-b, pertain to the document 146 (Doc 1), so they are included under a heading referencing the URL of
Doc1. Similarly, the second two overlays 151c-d, pertain to the document 148 (Doc2) so they reference the URL of Doc2. The overlay group 152 is formatted similarly to the overlay group 150 and includes annotation overlays 153e-f provided by Source 2
that are associated only with the document 146 (Doc1).
In the preferred embodiment, the overlay groups 150, 152 are written in HTML. At all times document URLs are maintained in alphabetically sorted order. For example, some appropriate HTML for the overlay document of Table 1 is shown in Table 2.
TABLE 2 ______________________________________ <UL> <LI> document.sub.-- url 1 <DL> <DT> pattern a, action a <DD> arg a <DT> pattern b, action b <DD> arg b . . . </DL> <LI>
document.sub.-- url 2 <DL> <DT> pattern c, action c <DD> arg a <DT> pattern d, action d <DD> arg b . . . </DL> <LI> document.sub.-- url 3 <DL> <DT> pattern e, action e <DD> arg e . .
. </DL> <LI> document.sub.-- url 4 <DL> <DT> pattern e, action e <DD> arg e . . . </DL> . . . </UL> ______________________________________
In Table 2, the terms between the paired "<" and ">" symbols are standard HTML commands, or tags, which allow the AOP to parse annotation overlays for the various fields. The tags used in Table 2 have the following meanings:
______________________________________ <UL> begin unordered list; <LI> list item; </UL> end unordered list; <DL> begin definition list; <DT> definition list term; <DD> definition of prior definition list
term; and </DL> end definition list. ______________________________________
Having described the structure of the annotation directory 118 and the overlay groups 150, 152, the fields of the annotation overlays, common to both the annotation directory and the overlay groups, are now described in detail.
D. Annotation Overlays:
In the preferred embodiment, the AOP 114 is designed to read annotation overlays that are written in HTML, which allows authors to create annotations using widely available HTML authoring tools. Alternatively, special annotation authoring tools
could be provided.
Typically, overlay authors will place annotation overlays on the Web according to one of many authoring models. For example, in a cooperative (coop) model, a group of individuals contribute annotation overlays directly to one or more overlay
groups 150, 152 associated with a cooperative to which the authors belong. For example, members of different political parties might contribute overlays to the overlay groups 150 and 152, respectively. In a magazine model, paid authors submit
annotations to a centralized editor who then edits and publishes the authors' annotation overlays in an overlay group(s) managed by the editor. In this magazine model, it is envisioned that publishers of overlay groups will sell subscriptions or, like
commercial broadcasters, advertising, to cover the costs of publication.
As described above and shown in FIG. 1, in addition to specifying the document being annotated and the source of the annotations, each annotation overlay 119, 151, 153 includes the following information fields that allow the AOP 114 to perform
the correct transformation on the correct part of the requested document:
pattern (3);
action (4); and
arg (5).
Each of these fields is described below in reference to the exemplary annotation overlay shown in Table 3, which represents a member of an overlay group such as the group 150. Please note that field 2 from the annotation overlays 119 is missing
because source (2) is synonymous with the overlay group identifier. E.g., the overlay group 150 contains only annotation overlays authored by Source 1.
TABLE 3 ______________________________________ (1) http://info.cern.sh/hypertext/WWW/Daemon/User (3)(4) `Files can be real or synthesized` [Insert after sentence] (5) Unfortunately, there is no way to tell the difference between synthesized
and real files; this makes it extremely difficult to reliably cache HTML documents using the CERN server. ______________________________________
In the example of Table 3, the identifier field (1) indicates the URL (or document URL) of one respective document to be annotated. The remaining fields (3)-(5) represent the pattern, action and arg for the same annotation overlay. We proceed
to describe fields (3), (4) and (5).
1. Annotation Overlays--Pattern Field
A pattern (3) is a pattern of words or pixels in the requested document that the AOP 114 must operate on. The pattern field is necessary as the document URL in an annotation overlay does not provide fine location within a document but merely a
pointer to the document as a whole. In the preferred embodiment, the pattern syntax consists of a list of words or pixels making up the pattern set off by balanced single quotes. For example, in the illustration above, the pattern the AOP 114 must
search for in the document identified by the document URL http://info.cern.sh/hypertext/WWW/Daemon/User is: `Files can be real or synthesized`. The remaining fields (4) and (5) tell the AOP 114 what actions to take with regard to occurrences in the
corresponding document.
2. Annotation Overlays--Action Field:
The action field (4) defines the action that the AOP 114 must take when merging an annotation overlay at the specified pattern in the requested document. In the preferred embodiment, this action can be selected from one of four basic operations:
______________________________________ Insert Insert the contents of the arg field into the requested document at a specified location relative to the pattern. Delete Delete the specified pattern or a range of words/images surrounding the
specified pattern.. Replace Replace the specified pattern or a designated part of the document including the specified pattern with the contents of the arg field. Run.sub.-- Program Execute the identified program, which corresponds to a routine
113 that is associated with the AOP 114. ______________________________________
In addition to one of the four basic operations, the action field (4) also includes several additional parameters that specify how the action is to be implemented with respect to the pattern field (3). In the preferred embodiment, all of the
arguments composing the action field follow the pattern field (3) and are enclosed in balanced square brackets. The syntax of the action field (4) is set out in TABLE 4.
TABLE 4 ______________________________________ operation [match] insert .linevert split. [match] delete .linevert split. [match] replace .linevert split. [match] run.sub.-- program match match decimal number insert insert where
.linevert split. delete delete from where to where replace replace where .linevert split. run.sub.-- program run.sub.-- program program.sub.-- id where .linevert split. where before location .linevert split. after location location document
.linevert split. section .linevert split. paragraph .linevert split. sentence .linevert split. word decimal number .linevert split. ______________________________________
In Table 4, the bolded terms are keywords that are used in a particular instruction. The unbolded terms represent variables or parts of an instruction. Optional parts of the action field, such as `match` are shown enclosed in square brackets.
For example, in the exemplary annotation overlay in Table 3, the entire action is: [Insert after sentence]. In this example, the operation is "insert", the where part of the instruction is "after" and the location is "sentence". The vertical bar ("I")
stands for "OR"; e.g, the location field can have one value selected from document, OR section OR paragraph, etc.
The various parts of action field are defined in Table 5.
TABLE 5 ______________________________________ operation As mentioned above, an operation is the action to be implemented with respect to the pattern and is selected from one of insert, delete, replace and run.sub.-- program. Insert, and
replace operations take a where argument, which indicates to the AOP 114 the specific part to be operated on of a document including the pattern. Delete operations take an argumen | | |