Claims
We claim:
1. A computer system having one or more memories and one or more central
processing units, the computer capable of accessing one or more database
memories, a plurality of objects stored in one or more of the database
memories and each object identified by an index, the computer system
further comprising:
a user interface for entering a query;
a hit list, stored in one or more of the memories, the hit list being a
topically relevant set of a plurality of topically relevant objects, the
topically relevant set being selected from one or more of the database
memories and ranked by a topical search engine using the index, the rank
indicating a topical relevance to the query;
a relationship data structure containing information about how two or more
of the objects are related to one another by one or more structural
relationships, the structural relationships being directed; and
a hierarchical view generator that selects a structurally relevant set of
the objects, each object in the structurally relevant set being a
candidate parent of one or more of the topically relevant objects
according to one or more of the structural relationships, said
hierarchical view generator dynamically determining which of the
candidates for parents are selected as actual parents for said query
entered by said user interface, based on said one or more structural
relationships,
each of the topically relevant objects being a child of one or more of the
selected parents, the hierarchical view generator organizing each of the
child objects under one or more of its respective selected parents to
create a directional hierarchy.
2. A computer system, as in claim 1, further comprising a display that
renders a presentation of one or more of the selected parents and the
child objects associated with the respective selected parent.
3. A computer system, as in claim 1, where the database memory resides on
one or more connected computers that are connected to the computer system
by a network.
4. A computer system, as in claim 3, where the query accesses the computer
system from the network.
5. A computer system, as in claim 1, where the hierarchical view generator
subselects a subset of the most relevant topically relevant objects
according to rank as the topically relevant set.
6. A computer system, as in claim 1, where the information in the
relationship data structure includes weights for the structural
relationships and the hierarchical view generator subselects a subset of
the structurally relevant objects as the structurally relevant set as a
function of the weights.
7. A computer system, as in claim 1, where one or more relationship data
structures defining one or more structural relationships are selected
using information in the query.
8. A computer system, as in claim 1, where the structural relationships
include any one or more of the following: the hyperlink relationship, the
subclass or categorization relationship, the geographic location
relationship, the location within a file system relationship, the
conceptual relationship, and the is-a-part-of relationship.
9. The system of claim 1, wherein said hit-list is organized into a
hierarchical structure of parent and children objects,
the connections in the structure corresponding to one or more of the
structural relationships, and said structured hit-list allowing a user to
understand a nature of the results of the query, and to find an
appropriate object to satisfy the user's information need.
10. The system of claim 1, wherein a topical search, responsive to the
query, comprises a content-based similarity search where the query does
not define a precise set of objects,
the search engine computing the similarity of the query to each object in
the database to generate a ranked list of objects ordered by relevance to
the query.
11. The system of claim 1, wherein said objects comprise data objects, and
said data objects are stored in a database with relationships defined
between the objects,
said hierarchical view generator selecting and processing the relationships
to identify structurally relevant parents from an original child set.
12. The system of claim 1, wherein said hierarchical view generator
includes an iterator, said iterator processing structural relationships to
build a parent hierarchy based on topical and structural relevance,
said iterator selecting structural relationships in a query context and
ranking parent objects based on their structural relevance to the query.
13. The system of claim 1, wherein the hierarchical structure groups
objects for answering a user's query, and provides insight into the
relevant structure of the database, to satisfy the user's information
need.
14. The system of claim 1, wherein the user's query selects certain
relationships for finding structurally relevant objects and for organizing
and displaying the result list accordingly,
wherein said objects are ranked based on degrees of topical relevance, and
a subset of the topically relevant objects is selected for further
relationship processing.
15. The system of claim 1, wherein said objects on the result list are
organized based on their structural relationships, and display of objects
at certain levels of the hierarchy is suppressed based on the
relationships in which they participate,
wherein objects in the topically relevant set are ranked, and the children
thereof are organized under their parents according to a structural
relationship.
16. A computer system having one or more memories and one or more central
processing units, the computer capable of accessing one or more database
memories, a plurality of objects stored in one or more of the database
memories and identified by an index, the computer system further
comprising:
a user interface for entering a query;
a hit list, stored in one or more of the memories, the hit list being a
topically relevant set of a plurality of topically relevant objects, the
topically relevant set being selected from one or more of the database
memories and ranked by a topical search engine using the index, the rank
indicating a topical relevance to the query,
a relationship data structure containing information about how two or more
of the objects are related to one another by one or more structural
relationships, the structural relationships being directed;
a hierarchical view generator that selects a structurally relevant set of
the objects, each object in the structurally relevant set being a
candidate parent of one or more of the topically relevant objects
according to one or more of the structural relationships, said
hierarchical view generator dynamically determining which of the
candidates for parents are selected as actual parents for said query
entered by said user interface, based on said one or more structural
relationships,
each of the topically relevant objects being a child of one or more of the
selected parents, the hierarchical view generator organizing each of the
child objects under one or more of its respective selected parents to
create a directional hierarchy; and
an iterator in the hierarchical view generator that treats the selected
parents as topically relevant objects and runs the hierarchical view
generator zero or more times, each time creating a parent level in the
directional hierarchy.
17. A system, as in claim 16, where a different set of one or more
structural relationships is selected in one or more iterations of the
iterator.
18. A computer system, as in claim 16, further comprising:
a display that renders a presentation of one or more selected parents in
one of the parent levels and one or more of the objects, being displayed
objects, structurally related to one of the selected parents and at a
lower level than the parent level in the directional hierarchy.
19. A system, as in claim 18, where one or more of the displayed objects is
suppressed in the display.
20. A system, as in claim 19, where the displayed object is suppressed due
to its structural relationship with the respective selected parent.
21. A computer system having one or more memories and one or more central
processing units, the computer capable of accessing one or more database
memories, a plurality of objects stored in one or more of the database
memories and each object identified by an index, the computer system
further comprising:
means for selecting a topically relevant set of two or more of the objects;
means for ranking the objects in the topically relevant set;
means for identifying one or more structural relationships between one or
more of the objects in the topically relevant set, being children, and one
or more of the objects in the database memories, being parents, the
structural relationships being directional from the parent to the child,
said identifying means including means for dynamically determining which
of the candidates for parents are selected as actual parents for said
query entered by said user interface based on said one or more structural
relationships; and
means for organizing one or more of the children under each of the
respective selected parents in a structural hierarchy.
22. A method of hierarchically grouping a plurality of objects stored in
one or more database memories of a computer system, comprising steps of:
a. selecting a topically relevant set of two or more of the objects;
b. ranking the objects in the topically relevant set;
c. identifying one or more structural relationships between one or more of
the objects in the topically relevant set, being children, and one or more
of the objects in the database memories, being parents, the structural
relationships being directional from the parent to the child, said step of
identifying including dynamically determining which of the candidates for
parents are selected
as actual parents for said query entered by said user interface based on
said one or more structural relationships; and
d. organizing one or more of the children under each of the respective
selected parents.
23. A method, as in claim 22, further comprising steps of:
e. deselecting the children and identifying the parents as children; and
f. repeating steps c and d zero or more times, each time creating a next
hierarchical level.
Description
FIELD OF THE INVENTION
This invention relates to the field of searching and navigating a large
object collection, particularly a hypermedia database in a networking
environment. More specifically, the invention relates to a system and
method for generating grouped hierarchical views (with ranking) for a set
of (hypermedia) objects in a query context based on one or more
relationships.
BACKGROUND OF THE INVENTION
A hypermedia object database is a collection of hypermedia objects stored
electronically as files on one or more computers. Hypermedia objects
contain information in the form of text, images, sound, or video.
Hypermedia objects may also participate in relationships, where a
relationship identifies one or more hypermedia objects that are related
somehow (a hypermedia object may be related to itself). One common
relationship is the hyperlink relationship. Two hypermedia objects are
related by a (directed) hyperlink relationship if one of the objects
contains a hyperlink pointer to the other object. A hyperlink pointer is a
reference to a hypermedia object that allows the target object to be
accessed directly from the source object.
Users access a hypermedia object database to locate objects of interest and
retrieve those objects for processing (e.g., reading, viewing, listening,
analysis). Finding objects of interest by manually inspecting every object
in a large database is impractical. Instead, users typically search the
database for interesting objects using a search system. A search system
allows a user to express an information need in the form of a query. The
system's search engine processes the query and returns to the user a
hit-list of relevant objects. The user then selects interesting objects
from the hit-list and retrieves those objects.
A relational database management system (RDBMS) may be used to index and
search arbitrary hypermedia objects based on their attributes. Attributes
include items such as size, creation date, author, and title. Searching
for objects in this fashion is well known. In addition to attribute-based
searching, users may want to search for hypermedia objects based on their
content. The algorithms and data structures used by a content-based search
system depend on the kind of object being searched. Text objects are
typically searched using an information retrieval (IR) system (e.g., IBM
Search Manager/2, a trademark of the IBM Corporation). Image objects are
typically searched using an image indexing and retrieval system (e.g., IBM
QBIC, a trademark of the IBM Corporation). Content-based search techniques
for video and sound exist and have been incorporated into prototype
systems, but this technology is less mature than text and image search.
Objects found using an attribute-based or content-based search system are
said to be "topically relevant" to the query.
Some prior art content-based search systems attempt to improve the search
results for hypermedia object databases by refining object relevance
scores based on the structural relationships (e.g., hyperlinks) between
the objects. Three representative techniques are used by these systems.
The first technique is a form of "spreading activation," where object
relevance scores are propagated along outbound hyperlink pointers to
neighboring objects and used to modify the relevance scores of those
objects (see Cohen, P. R., and Kjeldsen, R. "Information Retrieval by
Constrained Spreading Activation in Semantic Networks," Information
Processing & Management, 23(2), pp. 255-268, 1987; Savoy, J. "Citation
Schemes in Hypertext Information Retrieval," in M. Agosti and A. Smeaton
(Eds.), Information Retrieval and Hypertext, Boston, Kluwer Academic
Publishers, pp. 99-120, 1996). This procedure is typically iterated until
a steady-state is reached or some terminating condition is met.
The objects are then sorted by their final relevance scores and returned on
a flat hit-list (i.e., the hit-list simply enumerates the objects without
describing any structural relationships).
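The spreading-activation idea described above can be illustrated with a small sketch. It is not the algorithm of the cited references: the decay factor, the iteration cap, the convergence tolerance, and the name `spread_activation` are assumptions chosen only to show scores flowing along outbound links and ending up on a flat hit-list.

```python
def spread_activation(scores, links, decay=0.5, max_iters=20, tol=1e-6):
    """Propagate relevance along outbound hyperlinks until scores settle.

    scores: dict mapping object id -> initial (content-based) relevance score.
    links:  dict mapping object id -> list of ids it links to (outbound).
    decay, max_iters, tol: illustrative parameters, not taken from the source.
    """
    current = dict(scores)
    for _ in range(max_iters):
        # Each object keeps its initial score and receives a decayed share
        # of the current score of every object that links to it.
        updated = dict(scores)
        for source, targets in links.items():
            for target in targets:
                updated[target] = (updated.get(target, 0.0)
                                   + decay * current.get(source, 0.0))
        converged = all(abs(updated[k] - current.get(k, 0.0)) < tol
                        for k in updated)
        current = updated
        if converged:  # steady state reached
            break
    # Flat hit-list: sorted by final score, no structure retained.
    return sorted(current.items(), key=lambda item: item[1], reverse=True)
```

Note that the returned list carries no information about which links contributed to each score, which is the limitation the text points out.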
In the second technique, it is assumed that the hypermedia objects are
organized in a given hierarchy, such that every object has at most one
parent and the children of a given object are explicitly identified. An
object's relevance score is then calculated as a function of its
content-based relevance score and the relevance scores of its children.
Relevance scores must be propagated from the leaves of a hierarchy to the
root (see Frisse, M. E. "Searching for Information in a Hypertext Medical
Handbook," Communications of the ACM, 31(7), pp. 880-886, 1988). The
objects are then sorted by their final relevance scores and returned on a
flat hit-list.
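A minimal sketch of this leaf-to-root propagation follows. The source says only that an object's score is "a function of" its own score and its children's scores; the weighted sum used here, the `child_weight` parameter, and the name `propagate_up` are assumptions made for illustration.

```python
def propagate_up(content_scores, children, root, child_weight=0.5):
    """Combine content scores bottom-up in a fixed, single-parent hierarchy.

    content_scores: dict object id -> content-based relevance score.
    children:       dict object id -> list of its explicitly identified children.
    child_weight:   assumed damping applied to the children's combined scores.
    """
    combined = {}

    def visit(node):
        # Children are scored first, so values flow from the leaves to the root.
        child_total = sum(visit(child) for child in children.get(node, []))
        combined[node] = content_scores.get(node, 0.0) + child_weight * child_total
        return combined[node]

    visit(root)
    return combined
```

The sketch depends on the hierarchy being supplied in advance through `children`, which is the pre-determined structure this technique requires.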
In the third technique, the content of neighboring objects is added to the
content of the current object when determining the relevance score for the
current object (see Croft et al. "Retrieving Documents by Plausible
Inference: an Experimental Study," Information Processing & Management,
25(6), pp. 599-614, 1989). Neighboring objects are those objects to which
the current object contains hyperlink pointers. As in the previous two
techniques, objects are sorted by their relevance scores and returned on a
flat hit-list.
The above cited references are incorporated by reference in their entirety.
Regardless of the search technology being used, most search systems follow
the same basic procedure for indexing and searching a hypermedia object
database. First, the objects to be searched must be input to the search
system for indexing. Next, attributes and/or contents are extracted from
the objects and processed to create an index. An index consists of data
that is used by the search system to process queries and identify relevant
objects. After the index is built, queries may be submitted to the search
system. The query represents the user's information need and is expressed
using a query language and syntax defined by the search system. The search
system processes the query using the index data for the database and a
suitable similarity ranking algorithm, and returns a hit-list of topically
relevant objects. The user may then select relevant objects from the
hit-list for viewing and processing.
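The index-then-query procedure above can be sketched with a toy inverted index. This is an illustration only: real search systems use far richer ranking than the term-overlap count shown here, and the names `build_index` and `search` are hypothetical.

```python
from collections import defaultdict


def build_index(objects):
    """Map each term to the set of object ids whose text contains it."""
    index = defaultdict(set)
    for object_id, text in objects.items():
        for term in text.lower().split():
            index[term].add(object_id)
    return index


def search(index, query):
    """Return a hit-list ranked by the number of distinct query terms matched."""
    hits = defaultdict(int)
    for term in set(query.lower().split()):
        for object_id in index.get(term, ()):
            hits[object_id] += 1
    # Sort by score (descending), then by id for a deterministic hit-list.
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))
```

For example, after `index = build_index({"d1": "hypermedia search engine", "d2": "image search"})`, the call `search(index, "search engine")` ranks `d1` ahead of `d2`.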
A user may also use objects on the hit-list as navigational starting
points. Navigation is the process of moving from one hypermedia object to
another hypermedia object by traversing a hyperlink pointer between the
objects. This operation is typically facilitated by a user interface that
displays hypermedia objects, highlights the hyperlinks in those objects,
and provides a simple mechanism for traversing a hyperlink and displaying
the referent object. One such user interface is a Web browser (see below).
By navigating, a user may find other objects of interest.
In a networking environment, the components of a hypermedia object database
system may be spread across multiple computers. A computer comprises a
Central Processing Unit (CPU), main memory, disk storage, and software
(e.g., a personal computer (PC) like the IBM ThinkPad). (ThinkPad is a
trademark of the IBM Corporation.) A networking environment consists of
two or more computers connected by a local or wide area network (e.g.,
Ethernet, Token Ring, the telephone network, and the Internet). (See, for
example, U.S. Pat. No. 5,371,852 to Attanasio et al. issued on Dec. 6,
1994 which is herein incorporated by reference in its entirety.) A user
accesses the hypermedia object database using a client application on the
user's computer. The client application communicates with a search server
(the hypermedia object database search system) on either the user's
computer (e.g. a client) or another computer (e.g. one or more servers) on
the network. To process queries, the search server needs to access just
the database index, which may be located on the same computer as the
search server or yet another computer on the network. The actual objects
in the database may be located on any computer on the network.
A Web environment, such as the World Wide Web on the Internet, is a
networking environment where Web servers, e.g. Netscape Enterprise Server
and IBM Internet Connection Server, and browsers, e.g. Netscape Navigator
and IBM WebExplorer, are used. (Netscape Navigator is a trademark of the
Netscape Communications Corporation and WebExplorer is a trademark of the
IBM Corporation.) Users can make hypermedia objects publicly available in
a Web environment by registering the objects with a Web server. Moreover,
users can create arbitrary relationships between these objects, even if
the objects were created by another user. Other users in the Web
environment can then retrieve these objects using a Web browser. The
collection of objects retrievable in a Web networking environment can be
considered as a large hypermedia object database.
To create an index for a hypermedia object database in a Web networking
environment, the prior art often uses Web crawlers, also called robots,
spiders, wanderers, or worms (e.g., WebCrawler, WWWWorm), to gather the
available objects and submit them to the search system indexer. Web
crawlers make use of the (physical) hyperlinks stored in objects. All of
the objects are gathered by identifying a few key starting points,
retrieving those objects for indexing, retrieving and indexing all objects
referenced by the objects just indexed (via hyperlinks), and continuing
recursively until all objects reachable from the starting points have been
retrieved and indexed. The graph of objects in a Web environment is
typically well connected, such that nearly all of the available objects
can be found when appropriate starting points are chosen.
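The gathering procedure can be sketched as a graph traversal. The sketch below uses an explicit queue rather than recursion and does no network access: `get_links` is a hypothetical callable standing in for retrieving an object and extracting its hyperlinks.

```python
from collections import deque


def crawl(start_points, get_links):
    """Return object ids in the order they are first retrieved.

    start_points: iterable of object ids used as starting points.
    get_links:    callable returning the ids an object links to (assumed
                  stand-in for fetching the object and parsing its links).
    """
    queue = deque(dict.fromkeys(start_points))  # drop duplicate starting points
    seen = set(queue)
    order = []
    while queue:
        object_id = queue.popleft()
        order.append(object_id)  # a real crawler would submit it for indexing here
        for target in get_links(object_id):
            if target not in seen:  # each object is retrieved only once
                seen.add(target)
                queue.append(target)
    return order
```

Objects that no starting point can reach are never visited, which is why the choice of starting points matters.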
Having gathered and indexed all of the objects available in the Web
environment, the index can then be used, as described above, to search for
objects in the Web. Again, the index may be located independently of the
objects, the client, and even the search server. A hit-list, generated as
the result of searching the index, will typically identify the locations
of the relevant objects on the Web, and the user will retrieve those
objects directly with their Web browser.
STATEMENT OF PROBLEMS WITH THE PRIOR ART
When searching a hypermedia object database, the prior art fails to create
a hierarchically structured search result based on arbitrary relationships
between the objects in the database. Some prior art systems consider
relationships when ranking objects, but the final result in most of these
systems is still a simple hit-list (comparable to systems that ignore
relationships altogether) that conveys no information about the
relationships between the objects. Those prior art systems that can
display a hierarchically structured search result require that a single,
pre-determined hierarchy be specified for all of the objects in the
hypermedia database. For large databases (e.g., the World Wide Web), this
requirement is impossible to satisfy, and in general, this requirement is
restrictive and inflexible.
No prior art system fully exploits the relationships between objects in a
hypermedia object database to 1) produce a search result that includes
both topically and structurally relevant objects ("structurally relevant"
objects are objects that may not be topically relevant, but are still
relevant to the query due to their structural relationships with topically
relevant objects), and 2) present the search results such that the end
user can easily identify and exploit the relationships between the objects
on the hit-list.
The prior art performs even worse in a networked multi-user environment,
e.g. the World Wide Web, where documents are created by multiple authors
and are poorly cross referenced. For example, when a hypertext document is
created by a single author, its pages are typically well cross referenced
and provide links to next, previous, parent, index, and table of contents
pages. Even though a prior art search system ignores these links and
identifies only topically relevant pages in the document, since the
document is well cross referenced, the user can navigate to structurally
relevant pages in the document by following the cross reference links.
However, the user does not have this option in a networked multi-user
environment where a second, independent author could create a related
document with hypertext links to the document created by the first author.
A relationship now exists between these two documents, but there are no
good cross references from the first document to the second document,
preventing the user from navigating to the second document. If the second
document is not topically relevant to the user's query, the user will fail
to find the second document even though it is structurally relevant to the
query due to its relationship to the first document.
OBJECTS OF THE INVENTION
An object of this invention is a system and method that generates a
hierarchical grouping of topically and structurally relevant objects in a
query context.
An object of this invention is a system and method that generates a ranked,
hierarchical grouping of topically and structurally relevant objects in a
query context.
An object of this invention is a system and method that generates a
hierarchical grouping of relevant objects in a query context based on one
or more structural relationships.
An object of this invention is a system and method that generates ranked,
hierarchical groupings of topically and structurally relevant objects in a
query context in a networking environment.
An object of this invention is a system and method that generates a
hierarchical grouping of topically and structurally relevant objects in a
query context by: identifying objects that are structurally relevant,
organizing structurally and topically relevant objects into meaningful
groups, identifying navigational starting points, and presenting the
groups to a user in a meaningful way.
SUMMARY OF THE INVENTION
The present invention is a system and method for identifying and
hierarchically grouping one or more (hypermedia) objects that are
topically relevant to a user query and/or have a structural relation with
one another.
Topically relevant objects are first identified using any generally known
methods to obtain a set of topically relevant objects (topically relevant
set). Parents, and in alternative embodiments other ancestors, of one or
more of the topically relevant objects are identified according to
directional structural relationships that the parents have with respect to
the topically relevant objects. These objects form a set of structurally
relevant objects (structurally relevant set). In some embodiments, the
user query identifies one or more of these structural relationships. The
topically relevant objects are then organized under one or more of their
respective parents to form a hierarchy level of both (topically relevant
and structurally relevant) sets of objects. In some preferred embodiments,
the process can iterate to create more than one hierarchy level.
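The grouping step summarized above can be sketched as follows. This is an illustration under simplifying assumptions, not the claimed implementation: every candidate parent is selected, no ranking or weighting is applied, and the directed parent-to-child relationship is stored inverted (child to parents) for lookup. The names `group_under_parents` and `build_levels` are hypothetical.

```python
def group_under_parents(children, parent_links):
    """Build one hierarchy level: map each parent to the children under it.

    children:     iterable of object ids (initially the topically relevant set).
    parent_links: dict child id -> list of candidate parent ids, i.e. the
                  directed parent-to-child relationship indexed by child.
    """
    level = {}
    for child in children:
        for parent in parent_links.get(child, []):
            level.setdefault(parent, []).append(child)
    return level


def build_levels(topical_set, parent_links, iterations=1):
    """Iterate the grouping; each pass treats the parents as the next child set."""
    levels = []
    current = list(topical_set)
    for _ in range(iterations):
        level = group_under_parents(current, parent_links)
        if not level:  # no further parents found
            break
        levels.append(level)
        current = list(level)  # selected parents become children of the next level
    return levels
```

A child with several parents appears under each of them, consistent with the statement that objects are organized under "one or more" of their respective parents.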
In alternative embodiments, subsets of the topically relevant set can be
selected, e.g., by rank, as the topically relevant set. Also, subsets of
the structurally relevant set, e.g., by weighting, can be selected as the
structurally relevant set. For example, the weight can be based on the
number of hyperlink paths that connect the ancestor to one or more of the
topically relevant objects, the strengths of these hyperlink paths, and
other attributes of the ancestor, such as the total number of hyperlinks
originating at the ancestor and the topical relevance of the ancestor.
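One way to combine the factors listed above into a single weight is sketched below. The source names the inputs but gives no formula in this passage, so the specific combination (strength-weighted child ranks, divided by the parent's total outbound link count, plus the parent's own topical score) is an assumption, as are the names `score_parents` and `select_top`.

```python
def score_parents(child_ranks, links, out_degree, parent_relevance=None):
    """Assign an illustrative weight to each candidate parent.

    child_ranks:      dict child id -> topical rank score.
    links:            list of (parent, child, strength) directed relationships.
    out_degree:       dict parent id -> total number of links originating at it.
    parent_relevance: optional dict parent id -> the parent's own topical score.
    """
    parent_relevance = parent_relevance or {}
    totals = {}
    for parent, child, strength in links:
        if child in child_ranks:  # only links to topically relevant objects count
            totals[parent] = totals.get(parent, 0.0) + strength * child_ranks[child]
    scores = {}
    for parent, total in totals.items():
        # Dividing by out-degree keeps a parent that links to everything
        # from dominating (an assumed normalization).
        degree = max(out_degree.get(parent, 1), 1)
        scores[parent] = total / degree + parent_relevance.get(parent, 0.0)
    return scores


def select_top(scores, limit):
    """Subselect the highest-weighted parents as the structurally relevant set."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [parent for parent, _ in ranked[:limit]]
```

With this formula, a parent whose links mostly lead to topically relevant objects outranks a large index page that merely happens to link to one of them.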
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing and other objects, aspects and advantages will be better
understood from the following detailed description of preferred
embodiments of the invention with reference to the drawings that include
the following:
FIG. 1 is a block diagram of the computing environment in which the present
invention is used in a non-limiting preferred embodiment.
FIG. 2, comprising FIGS. 2A and 2B, is a block diagram of an example object
collection, in particular a hypermedia object database.
FIG. 3 is a block diagram of an example hypermedia object database that is
hierarchically grouped by the present invention.
FIG. 4 is a block diagram of an example hypermedia object database that is
hierarchically grouped as defined by a first structural relationship.
FIG. 5 is a block diagram of an example hypermedia object database that is
hierarchically grouped as defined by a second structural relationship.
FIG. 6 includes prior art FIGS. 6A and 6B, where FIG. 6A shows an object
catalog and FIG. 6B shows a table of named attribute values for objects in
the catalog.
FIG. 7 comprises FIGS. 7A and 7B, where FIG. 7A shows a structured query
and FIG. 7B shows an object hit-list.
FIG. 8 shows a relationship catalog and several of the relationship tables
to which the relationship catalog refers.
FIG. 9 shows a children table.
FIG. 10 is a flow chart showing the steps of one preferred hierarchical
view generator process executed by the present invention.
FIG. 11 is a flow chart showing the steps of a preferred process for
scoring parent objects based on structural relationships.
FIG. 12, comprising FIGS. 12A and 12B, is a flow chart showing the steps of
a preferred process for displaying the results in a ranked hierarchical
order.
FIG. 13 is a screen dump showing a sample output result of the present
invention.
DETAILED DESCRIPTION OF THE INVENTION
FIG. 1 is a block diagram of the computing environment in which the present
invention is used in a non-limiting preferred embodiment. The figure shows
some of the possible hardware, software, and networking configurations
that make up the computing environment.
The computing environment or system 100 comprises one or more general
purpose computers 170, 175, 180, 185, 190, and 195 interconnected by a
network 105. Examples of general purpose computers include the IBM Aptiva
personal computer, the IBM RISC System/6000 workstation, and the IBM POWER
parallel SP2. (These are trademarks of the IBM Corporation.) The network
105 may be a local area network (LAN), a wide area network (WAN), or the
Internet. Moreover, the computers in this environment may support the Web
information exchange protocol (HTTP) and be part of a local Web or the
World Wide Web (WWW). Some computers (e.g., 195) may occasionally or
always be disconnected 196 from the network and operate as stand-alone
computers.
Hypermedia objects 140 are items such as books, articles, reports,
pictures, movies, or recordings that contain text, images, video, audio,
or any other multimedia object and/or information. One or more hypermedia
objects are stored on one or more computers in the environment.
To find a particular hypermedia object in the environment, a query (see
FIG. 7A) is submitted for processing to a topical search engine 120
running on a computer in the environment. The topical search engine uses
an index 130 to identify hypermedia objects that are relevant to the
query. The topical search engine creates an index by indexing a particular
set of hypermedia objects in the environment, called a hypermedia object
database 141. A hypermedia object database 141 may comprise hypermedia
objects located anywhere in the computing environment, e.g., spread across
two or more computers. The relevant hypermedia objects identified by the
index are ranked and returned by the topical search engine in the form of
a hit-list (see FIG. 7B). The process is well known in the prior art.
An example of a topical search engine is Search Manager/2 (a trademark of
the IBM Corporation).
The result of the search is further processed by submitting the query and
topical search engine results to a novel hierarchical view generator 110.
The hierarchical view generator 110 uses structural relationships in the
index 130 (see FIG. 8) to analyze the relationships in the database, that
is, to select and rank parents, and thereby improve the search result. The
structural relationships are stored in the index by the hierarchical view
generator at indexing time. In some preferred embodiments, the
hierarchical view generator selects structurally relevant objects by
calculating parent ranks based on the ranks of the topically relevant
objects (generated by the topical search engine) and/or the (weighted)
structural relationships in which the topically relevant objects
participate. Structurally relevant objects have a structural relationship
(see FIG. 8) with one or more of the topically relevant objects.
(Structurally relevant objects may or may not be topically relevant.) The
hierarchical view generator then aggregates topically relevant objects
based on their relationships and generates ranked hierarchies of both the
topically relevant objects and structurally relevant objects to present to
the user. For convenience, the topical search engine 120 and hierarchical
view generator 110 are shown here as separate components. Note, however,
that both systems may be internal components of a general hypermedia
object search system.
Hypermedia objects 140 and/or indexes 130 on one computer may be accessed
over the network by another computer using the Web (HTTP) protocol, a
networked file system protocol (e.g., NFS, AFS), or some other protocol.
Services on one computer (e.g., topical search engine 120) may be invoked
over the network by another computer using the Web protocol, a remote
procedure call (RPC) protocol, or some other protocol.
A number of possible configurations for accessing hypermedia objects,
indexes, and services locally or remotely are depicted in the present
figure. These possibilities are described further below.
One configuration is a stand-alone workstation 195 that may or may not be
connected to a network 105. The stand-alone system 195 has hypermedia
objects 140 and an index 130 stored locally. The stand-alone system 195
also has a topical search engine 120 and hierarchical view generator 110
installed locally. When the system is used, a query is input to the
workstation 195 and processed by the local topical search engine 120 and
hierarchical view generator 110 using the index 130. The results from the
topical search engine are output by the workstation 195.
A second configuration is 185, a workstation with hypermedia objects and
indexes connected to a network 105. This configuration is similar to the
stand-alone workstation 195, except that 185 is always connected to the
network 105. Also, the local index 130 may be derived from local
hypermedia objects 140 and/or remote hypermedia objects accessed via the
network 105, and created by either a local topical search engine 120 and
hierarchical view generator 110 or a remote topical search engine 120
and/or a remote hierarchical view generator 110 accessed via the network
105. When queries are input at the workstation 185, they may be processed
locally at 185 using the local topical search engine 120, local
hierarchical view generator 110, and local index 130. Alternatively, the
local topical search engine 120 and hierarchical view generator 110 may
access a remote index 130 (e.g., on system 175) via the network 105.
Alternatively, the workstation 185 may access a remote topical search
engine 120 and/or hierarchical view generator 110 via the network 105.
Another possible configuration is 175, a workstation with an index only.
Computer 175 is similar to computer 185 with the exception that there are
no local hypermedia objects 140. The local index 130 is derived from
hypermedia objects 140 accessed via the network 105. Otherwise, as in
computer 185, the index 130, topical search engine 120, and hierarchical
view generator 110 may be accessed locally or remotely via the network 105
when processing queries.
Another possible configuration is computer 180, a workstation with
hypermedia objects only. The hypermedia objects 140 stored locally at
computer 180 may be accessed by remote topical search engines 120 and
hierarchical view generators 110 via the network 105. When queries are
entered at computer 180, the topical search engine 120, hierarchical view
generator 110, and index 130 must all be accessed remotely via the network
105.
Another possible configuration is computer 190, a client station with no
local hypermedia objects 140, index 130, topical search engine 120, or
hierarchical view generator 110. When queries are entered at computer 190,
the topical search engine 120, hierarchical view generator 110, and index
130 must all be accessed remotely via the network 105.
Another possible configuration is computer 170, a typical web server.
Queries are entered at another workstation (e.g., 175, 180, 185, or
possibly 195) or a client station (e.g., 190) and sent for processing to
the web server 170 via the network 105. The web server 170 uses a remote
topical search engine 120, hierarchical view generator 110, and index 130
(accessed via the network 105) to process the query. Alternatively, one or
more of these functions (110, 120, and 130) can reside on the web server
170. The results are returned to the workstation or client station from
which the query was originally sent.
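The configurations described above can be summarized by recording which
components each station holds locally. The sketch below is illustrative
only: the component names and the LOCAL table are hypothetical labels, and
the web server 170 is omitted because its components may be either local or
remote. The table records local availability only; a station such as 185
may still choose to use remote components.

```python
COMPONENTS = ("objects", "index", "search_engine", "view_generator")

# Components stored or installed locally, keyed by reference numeral.
LOCAL = {
    195: {"objects", "index", "search_engine", "view_generator"},
    185: {"objects", "index", "search_engine", "view_generator"},
    175: {"index", "search_engine", "view_generator"},
    180: {"objects"},
    190: set(),
}


def remote_components(station):
    """Return the components a station can only reach via the network."""
    return [c for c in COMPONENTS if c not in LOCAL[station]]


needs_network = {station: remote_components(station) for station in LOCAL}
```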
FIGS. 2 and 3 give an intuitive description of the current invention. The
current invention operates on an object collection, which consists of
objects and directed relationships between those objects. (The hypermedia
object database 141 in FIG. 1 is an example of an object collection.) An
object is an identifiable entity that typically contains data, called the
object's content. The nature of this data depends on the kind of object.
For example, a document object would contain text data, while an image
object would contain image data.
A directed relationship describes how objects in the object collection are
related to each other. A directed relationship R consists of a set of
instances of R, where an instance of relationship R, denoted R(o1,o2),
indicates that relationship R holds from object o1 to object o2. Each
relationship instance may optionally have a weight associated with it,
e.g., R(o1,o2,w), where w is the weight of the instance of relationship R
between objects o1 and o2. If R(o1,o2), then object o1 is a parent of
object o2, and object o2 is a child of object o1. If object o2 can be
reached from object o1 by following one or more instances of relationship
R, then there is a path in relationship R from object o1 to object o2. The
weight of a path is a function (e.g., the sum) of the weights of the
relationship instances that make up the path. Because a directed
relationship defines a structural organization of the objects in the
collection, a directed relationship is also called a structural
relationship.
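The definitions above (instances R(o1,o2,w), parents, children, paths, and
path weights) can be made concrete with a small sketch. The class name
DirectedRelationship, its method names, and the default weight of 1.0 are
assumptions made for illustration; the path weight uses the sum, which the
text gives as one example of a suitable function.

```python
from collections import defaultdict


class DirectedRelationship:
    """A set of weighted instances R(o1, o2, w) of one relationship R."""

    def __init__(self, name):
        self.name = name
        self._children = defaultdict(dict)  # parent -> {child: weight}

    def add(self, o1, o2, w=1.0):
        """Record that R holds from o1 (parent) to o2 (child)."""
        self._children[o1][o2] = w

    def children(self, o):
        return sorted(self._children.get(o, {}))

    def parents(self, o):
        return sorted(p for p, kids in self._children.items() if o in kids)

    def path_weights(self, start, goal):
        """Summed weights of every cycle-free path from start to goal."""
        results = []

        def walk(node, total, seen):
            for child, w in self._children.get(node, {}).items():
                if child in seen:
                    continue  # skip objects already on this path
                if child == goal:
                    results.append(total + w)
                else:
                    walk(child, total + w, seen | {child})

        walk(start, 0.0, {start})
        return sorted(results)


r = DirectedRelationship("contains")
r.add("a", "b", 1.0)
r.add("b", "c", 2.0)
r.add("a", "c", 5.0)
```

Here object a is a parent of both b and c, and there are two paths from a
to c: the direct instance and the path through b.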
Note that an undirected relationship is supported by representing it as a
directed relationship that holds in both directions, e.g., undirected
relationship U between object o1 and object o2 would be represented by two
instances of the directed relationship U', e.g., U'(o1,o2) and U'(o2,o1).
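The representation of an undirected relationship as two directed instances
can be sketched as follows. The helper name add_undirected and the use of a
set of (o1, o2, weight) tuples are assumptions made for illustration.

```python
def add_undirected(instances, o1, o2, weight=None):
    """Store undirected U(o1, o2) as directed U'(o1, o2) and U'(o2, o1)."""
    instances.add((o1, o2, weight))
    instances.add((o2, o1, weight))
    return instances


u_prime = add_undirected(set(), "o1", "o2")
```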
An example of a directed relationship is