WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Apparatus and methods for retrieving information by modifying query plan based on description of information sources    
United States Patent5600831   
Link to this pagehttp://www.wikipatents.com/5600831.html
Inventor(s)Levy; Alon Y. (New Providence, NJ); Srivastava; Divesh (New Providence, NJ)
AbstractTechniques for optimizing queries in a system in which executing the query requires retrieval of information from a number of different data bases which are accessible via a network. In the techniques, a query results in a query plan which includes subplans for querying the data bases which contain the required information. When a subplan is executed in one of the data bases, the data base returns not only the information which results from the execution of the subplan, but also source and constraint information about the data in the data base. The source and constraint information is then used to optimize the query plan by pruning redundant subplans. An embodiment is disclosed in which queries are made to a domain model implemented using a knowledge base system. The domain model includes a world view of the data, a set of descriptions of the data bases, and a set of descriptions of how to access the data. The information in the domain model is used to formulate the query plan.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 5600831
Apparatus and methods for retrieving information by modifying query plan

     based on description of information sources - US Patent 5600831 Drawing
Apparatus and methods for retrieving information by modifying query plan based on description of information sources
Inventor     Levy; Alon Y. (New Providence, NJ); Srivastava; Divesh (New Providence, NJ)
Owner/Assignee     Lucent Technologies Inc. (Murray Hill, NJ)
Patent assignment
All assignments
Publication Date     February 4, 1997
Application Number     08/347,016
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     November 30, 1994
US Classification     707/2 706/45
Int'l Classification     G06F 017/30 G06F 007/00
Examiner     Black; Thomas G.
Assistant Examiner     Alam; Hosain T.
Attorney/Law Firm    
Address
Parent Case     RELATED APPLICATIONS The present application is a continuation-in-part of U.S. Ser. No. 08/203082, Apparatus and Methods for Retrieving Information, Thomas Kirk, et al., filed Feb. 28, 1994 and owned by the assignee of the present application.
Priority Data    
USPTO Field of Search     395/600 395/575 395/425
Patent Tags     methods retrieving information modifying query plan based description information sources
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
5408655
Oren
715/501.1
Apr,1995

[0 after 0 votes]
5388196
Pajak
715/751
Feb,1995

[0 after 0 votes]
5386556
Hedin
707/4
Jan,1995

[0 after 0 votes]
5379419
Heffernan
707/4
Jan,1995

[0 after 0 votes]
5379366
Noyes
706/55
Jan,1995

[0 after 0 votes]
5355474
Thuraisngham
707/9
Oct,1994

[0 after 0 votes]
5355320
Erjavic
716/4
Oct,1994

[0 after 0 votes]
5335345
Frieder

Aug,1994

[0 after 0 votes]
5315703
Matheny
715/700
May,1994

[0 after 0 votes]
5301317
Lohman
707/2
Apr,1994

[0 after 0 votes]
5197005
Shwartz
707/2
Mar,1993

[0 after 0 votes]
5117349
Tirfing
707/3
May,1992

[0 after 0 votes]
5091852
Tsuchida
707/2
Feb,1992

[0 after 0 votes]
5021989
Fujisawa
345/661
Jun,1991

[0 after 0 votes]
4933800
Yang
361/29
Jun,1990

[0 after 0 votes]
4769772
Dwyer
707/2
Sep,1988

[0 after 0 votes]
4408302
Fessel
715/531
Oct,1983

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


What is claimed is:

1. An improved information system for retrieving query result information from one or more information sources in response to a query, the information system including descriptions of the information sources and the improvement comprising query execution means including:

query plan generating means for receiving the query and the descriptions of the information sources and responding thereto by generating a query plan for retrieving the query result information from the information sources; and

query plan execution means for receiving and responding to the query plan by retrieving the query result information from the information sources,

the query plan execution means retrieving additional information from the information sources in addition to the query result information and providing the additional information to the query plan generation means in the course of executing the query plan, the additional information being type information indicating a type of the retrieved query result information; and

the query plan generating means responding to the additional information and to the descriptions of the information sources by making a modification of the query plan and providing the modification to the query plan execution means for execution while the query plan execution means is executing the query plan, the query plan generating means further responds to the type information and the type description in making the modification of the query plan.

2. The improved information system set forth in claim 1, wherein

a description of an information source includes a type description indicating a type of the information in the information source.

3. The improved information system set forth in any of claims 1, 2, or 3 wherein:

the information sources are located remotely from the information system.

4. The improved information system set forth in any of claims 1, 2, or 3 further comprising:

a knowledge base including concepts and wherein

the descriptions of the information sources describe the information sources in terms of the concepts.

5. The improved information system set forth in claim 4 wherein:

the concepts include concepts relating to the information in the information sources and

the additional information is an instance of a concept and the query plan generation means is further responsive to the instance as required by the query and the concepts relating to the information.

6. The improved information system set forth in claim 5 wherein:

the concepts in the knowledge base are ordered in a hierarchy; and

the knowledge base responds to a new concept or a new instance by ordering the new concept or new instance in the hierarchy.

7. The improved information system set forth in claim 6 wherein:

the information sources are accessible by means of a plurality of protocols;

the concepts in the knowledge base include concepts which describe the protocols; and

the query plan generation means is further responsive to the concepts which describe the protocols.

8. A memory medium readable by a computer system, the memory medium being characterized in that:

the memory medium contains code which, when executed by the computer system, implements the improved information system of claim 1 for retrieving query result information from one or more information sources in response to a query.

9. An information system for retrieving query result information from one or more information sources in response to a query, the information system comprising:

a knowledge base including concepts related to information in the information sources, the knowledge base receiving the query and responding thereto by producing an information access description which describes what information is to be accessed in terms of the concepts;

query plan generating means for receiving the information access description and responding to the information access description and the concepts by generating a query plan describing how the query result information is to be obtained from the information sources; and

query plan execution means for receiving the query plan and executing the query plan to retrieve the query result information and additional information from the information sources and provide the additional information to the query plan generating means in the course of executing the query plan, the query plan generating means responding thereto and to the concepts by making a modification of the query plan and providing the modification to the query plan execution means for execution, the additional information being type information indicating a type of the retrieved query result information, the query plan generating means further responds to the type information and the type description in making the modification of the query plan.

10. The information system set forth in claim 9 wherein:

The information sources are accessed by a plurality of protocols; and

the concepts in the knowledge base include concepts related to the protocols by means of which the information sources are accessed.

11. A method practiced in an information system which retrieves query result information from one or more information sources in response to a query, the method comprising the steps of:

receiving the query and the descriptions of the information sources and responding thereto by generating a query plan for retrieving the query result information from the information sources; and

receiving and responding to the query plan by retrieving the query result information from the information sources;

retrieving additional information from the information sources in addition to the query result information and providing the additional information in the course of executing the query plan, the additional information being type information indicating a type of the retrieved query result informations; and

responding to the additional information and to the descriptions of the information sources by making a modification of the query plan while executing the query plan, and further responding to the type information and the type description in making the modification of the query plan.

12. An improved information system for retrieving query result information from one or more information sources in response to a query, the information system including descriptions of the information sources that include a source description indicating a source of the information in the information source, and the improvement comprising query execution means including:

query plan generating means for receiving the query and the descriptions of the information sources and responding thereto by generating a query plan for retrieving the query result information from the information sources; and

query plan execution means for receiving and responding to the query plan by retrieving the query result information from the information sources,

the query plan execution means retrieving additional information from the information sources in addition to the query result information and providing the additional information to the query plan generation means in the course of executing the query plan, the additional information being source information indicating a source of the retrieved query result information; and

the query plan generating means responding to the additional information and to the descriptions of the information sources by making a modification of the query plan and providing the modification to the query plan execution means for execution while the query plan execution means is executing the query plan, the query plan generating means further responds to the source information and the source description in making the modification of the query plan.
 Description Submit all comments and votes
 


BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention concerns information retrieval generally. More specifically, the invention concerns optimization of query plans for retrieving information from a number of information sources.

2. Description of the Prior Art

Networks now connect computers with information sources located anywhere in the world. The Internet, for example, provides access to a large and diverse body of information, such as technical papers, public domain software, directory services and various databases (e.g., airline schedules, stock market listings). It is thus now possible to speak of global information systems.

Being aware that interesting and useful information exists is insufficient if one cannot find the relevant information sources. The large variety of information sources, and the disparity of interfaces among them renders the task of locating and accessing information over the network even more difficult. In order to address some of these problems, it is important to understand the characteristics of the available information sources.

Autonomy: The first characteristic is the autonomy of the information sources. This means that the information sources (i.e., sites) maintain and update their own data, and they are not willing to change their operations to suit the needs of the global information system. At best, an information source is willing to provide a description of its contents.

Dynamic nature: The second characteristic of information sources is their dynamic nature. Specifically, new information sources are added, while existing information sources disappear or arc no longer maintained.

Number of sources: The third characteristic is the very large number of information sources.

Cost of access: The fourth characteristic is that accessing an information source over the network is expensive (both in time and possibly in money).

The first characteristic distinguishes global information systems from distributed databases, where the information sources are not autonomous, but under the control of co-operating database administrators. The second characteristic sets apart global information systems from enterprise-wide databases, where the set of information sources are relatively stable (though the contents may change, of course). The third characteristic differentiates global information systems from current day multidatabases, that is, systems in which the information is contained in a number of different kinds of data base systems.

These characteristics of the information sources necessitate the following features in an architecture for global information systems.

World-view: A consequence of the very large number of information sources is that it is unreasonable to expect users to interact separately with each source. The users need a conceptually uniform view of the information space, against which they can formulate queries. However, there does not have to be a single such view of the information, but there can be many user and domain-specific world-views. In order to relate the contents of the information sources with the world-view, we need site descriptions.

Expressive site descriptions: A consequence of the large number of information sources and the high cost of accessing these sources is that in answering queries, a global information system must minimize the number of information sources (i.e., sites) that are accessed. Therefore, a key requirement of the site descriptions is that they be rich enough to express various constraints that enable the system to prune the sources accessed.

Extensibility: A consequence of the dynamic nature of the information sources is that it should be possible to easily extend the world-view to manage new kinds of information provided by the sources.

Query only: A consequence of the autonomy of information sources is that while a global information system might be able to support global querying, it is unreasonable to expect that it will support global updating.

The parent of the present patent application disclosed an information retrieval system having some of the above features.

That information retrieval system, shown as system 101 in FIG. 1 of the present patent application, has a knowledge base system 109 which includes a domain model 111. Domain model 111 is a model of information from a specific domain. Domain model 111 has three components: world view 115, information source descriptions 113, and system-network view 117. All of these components include concepts belonging to domain model 111. World view 115 is the part of domain model 111 which is visible to a user of system 101. World view 115 is a conceptually unified view of the information space. In a preferred embodiment, world view 115 is implemented as an expressive object/relational data model. The user can pose queries in terms of the objects and the relations in the world-view, unburdened by details of data location and access. World-views are purely conceptual; all the data required to answer queries is present only in site relations at external information sources. World view 115 is related to the information in information sources 123 by means of information source descriptions 113 which provide descriptions of the contents of information sources 123 and by means of system-network view 117, which describes how the information in a particular information source 123 is accessed.

World view 115, information descriptions 113, and system-network view 117 all include concepts in domain model 111. Knowledge base system 109 is able to classify new concepts and add them to the hierarchy of concepts already in domain model 111. Thus, if knowledge base system 109 receives information about a new information source 123, it automatically classifies new concepts relating to the information source in domain model 111.

Access plan generation and execution 119 in FIG. 1 poses sub-queries to the external sources that contain information relevant to answering the query and combines the answers to these sub-queries to answer the user query. Accessing information sources over the network is expensive, and so an important problem in generating access plans is minimizing the number of external information sources 123. It is an object of the present patent application to provide improved techniques for such minimization.

SUMMARY OF THE INVENTION

The basis for the improved minimization techniques of the present invention is an expansion of the data model to include many relations as well as concepts and roles. The expanded data model in turn permits a site description language which provides the information needed for the minimization techniques. A site description language relates the contents of a site (information source 123) with the world-view. Key aspects of the site description language that are useful in answering queries efficiently are the following: (1) Relating the semantic contents of relations in sites to relations in world-view 105 (note that, in particular, relating semantic content includes relating schema information), (2) Stating that a site relation contains complete information about a fragment of the world-view, and (3) Specifying the query forms that an information source can answer efficiently.

The site descriptions of the invention, finally permit novel query optimization techniques that minimize the number of site relations accessed. The optimization techniques are the following: (1) using constraints in the site descriptions and the query to prune the set of site relations that arc irrelevant to the query, and (2) using information about completeness of site relations to prune redundant site relations.

An important aspect of the optimization techniques is that optimization is done dynamically. In traditional database query optimization, the query plan is generated completely at compile-time, and is not modified at run-time. As pointed out in the parent of the present patent application and disclosed in detail herein, it is crucial to have dynamic query plans, where the query plan generation phase interacts with the plan execution phase. Also disclosed is an algorithm for producing a dynamic query plan.

Other objects and advantages of the apparatus and methods disclosed herein will be apparent to those of ordinary skill in the art upon perusal of the following Drawing and Detailed Description, wherein:

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is a conceptual overview of the information retrieval system described in the parent of the present application;

FIG. 2 is a detail of a site description in a preferred embodiment;

FIG. 3 shows the algorithm employed in the preferred embodiment to generate query subplans;

FIG. 4 shows the algorithm employed in the preferred embodiment for dynamically generating a query plan; and

FIG. 5 is a detailed block diagram of access plan generation and execution component 119 of information retrieval system 101 in the preferred embodiment.

Reference numbers in the Drawing have two parts: the two least-significant digits are the number of an item in a figure; the remaining digits are the number of the figure in which the item first appears. Thus, an item with the reference number 201 first appears in FIG. 2.

DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT

The following Detailed Description contains material from the detailed description of the parent of the present application through the section titled Site Description Language; the new material which is being added in the present application begins at the section titled Improvements in System 101.

Architecture

Architecture Overview

FIG. 1 presents an overview of an information retrieval apparatus 101 which incorporates the principles of the invention. A preferred embodiment of information retrieval apparatus is implemented using a digital computer system and information sources which arc accessible via the Internet communications network.

The central component of apparatus 101 is a knowledge base 109 built upon a description logic based knowledge representation system (CLASSIC in the preferred embodiment) which is capable of performing inferences of classification, subsumption, and completion. Knowledge-base systems are described generally in Jeffery D. Ullman, Principles of Database and Knowledge-base Systems, Vols. I-II, Computer Science Press, Rockville, Md., 1989. Descriptions of CLASSIC may be found in Alex Borgida, Ronald Brachman, Deborah McGuinness, and Lori Resnick, "CLASSIC: A Structural Data Model for Objects", in Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data, pp. 59-67, 1989, R. J. Brachman, et al., "Living with CLASSIC", in: J. Sowa, ed., Principles of Semantic Networks: Explorations in the Representations of Knowledge, Morgan-Kaufmann, 1991, pp. 401-456, and L. A. Resnick, et al., CLASSIC: The CLASSIC User's Manual, AT&T Bell Laboratories Technical Report, 1991.

Knowledge base, 109 is used to construct a domain model 111 which organizes information accessible via apparatus 101 into a set of concepts which fit the manner in which the user of system 101 is intending to view and use the information. In system 101, domain model 111 has three components: world view 115, which contains concepts corresponding to the way in which a user of the system looks at the information being retrieved, system/network view 117, which contains concepts corresponding to the way in which the information is described in the context of the data bases which contain it and the communications protocols through which it is accessed, and information source descriptions 113, which contains concepts describing the information sources at a conceptual level. System/network view 117 and information source descriptions 113 are normally not visible to the user. The concepts in these portions of domain model 111 do, however, participate fully in the reasoning processes that determine how to satisfy a query.

An important benefit of using a description logic system like CLASSIC is that as new information is added to the system, much of the work of organizing the new information with respect to the concepts already in knowledge base 109 is done automatically. Only a description of the known attributes of the information must be specified; CLASSIC's inference mechanisms then automatically classify these descriptions into appropriate places in the concept hierarchy.

User interaction with the system is accomplished through browsing and querying operations in terms of high-level concepts (concepts that are meaningful to a user unsophisticated in the details for information location and access). These concepts are intended to reflect the terms in which the user thinks about the type and content of information being queried. By working with these high-level concepts, the user is unburdened with the details of the location and distribution of information across multiple remote information servers.

Information sources 123 are generally (though not limited to) network-based information servers that are accessed by standard internet communication protocols. Sources can also include databases, ordinary files and directories, and other knowledge bases.

User Interface

The user interacts with the system through a graphical user interface 103. The two primary modes of interaction supported by this interface are querying and browsing. In both cases the user expresses both browsing and querying operations in terms of concepts from "world view" portion 115 of domain model 111.

A knowledge base browser in CLASSIC 109 allows the user to view and interactively explore the concept taxonomy. The concept taxonomy is represented graphically as a directed graph 105, where the nodes correspond to concepts and edges indicate parent/child relationships among concepts.

To support extension of the concept taxonomy, the knowledge base browser also serves as an editor, allowing the user to define new concepts in terms of existing ones. The classification inferences in knowledge representation system 109 automatically place new concepts at the correct place in the taxonomy with respect to existing concepts.

Since both the high-level world concepts 115 and low-level system concepts 117 coexist in a single domain model 111, an important role of user interface 103 is to filter the system concepts out of the view seen by the user in query results and in the taxonomy browser.

Query Translator 107

The query language used in system 101 is based on CLASSIC, but has additional constructors that enable the user to express queries more easily. The query is formulated in terms of the concepts and objects that appear in the world view part 115 of the knowledge base. Query translator 107 translates queries expressed in the query language into CLASSIC description language expressions which are used to consult the knowledge base. Due to the limited expressive power of the description language and the need for special purpose query operators, the query language may contain elements not expressible in the description language of knowledge representation system 109. After partial translation to a description language expression, the remaining fragments of the query are translated to procedural code that is executed as part of the query evaluation.

Knowledge Representation System 109

The knowledge base is a virtual information store in the sense that the information artifacts themselves remain external to the knowledge base; the system instead stores detailed information (in terms of domain model 111) about the location of these information artifacts and how to retrieve them. Retrieval of a particular piece of information is done on demand, when it is needed to satisfy part of a query. The types of information managed in this manner include files, directories, indexes, databases, etc.

The domain model embodied in the knowledge base is logically decomposed into world view 115, system/network view 117, and information source descriptions 113 World view 115 is the set of concepts with which the user interacts and queries are expressed. System/network view 117 concerns low level details which, though essential for generating successful query results, are normally of no interest to the user. Information source descriptions 113 is a collection of concepts for describing information sources. These information source descriptions are expressed in terms of both world and system concepts. The purpose of encoding information source descriptions 113 in the domain model is to make it possible for CLASSIC to reason about what information sources must be consulted in order to satisfy a query.

We define system concepts comprising system/network view 117 as those concepts that describe the low-level details of information access. This includes concepts related to network communication protocols, location addressing, storage formats, index types, network topology and connectivity, etc. Since the knowledge base generally merely retrieves information instead of storing previously-retrieved information, system/network view 117 includes all those concepts relevant to determining attributes like location, retrieval methods, and content format.

Continuing in more detail, concepts within world view 115 describe things with which the user is familiar; they are the concepts that describe characteristics of information artifacts of interest to users. Concepts within information source descriptions 113 relate the concepts in world view 115 to concepts concerning the semantic content of information sources. Thus, given a query which employs concepts in world view 115, knowledge representation system 109 can employ the concepts in information source descriptions 113 to relate the concepts used in the query to actual information sources and can employ system/network view 117 to relate the concepts used in the query to an access plan which describes how to retrieve information from the sources as required to answer the query.

Access Plan Generation and Execution

When a user wishes to obtain information, the user inputs a query in system 101's query language at graphical user interface 103. System 101 then answers the query. There are several steps involved. First, query translator 107 translates the query into a form to which knowledge representation system 109 can respond. Then the translated query is analyzed in knowledge base system 109 to decide which of the external information sources are relevant to the query, and which subqueries need to be sent to each information source. This step uses world view 115 and system/network view 117. The information in system/network view 117 is expressed in a site description language which will be described in more detail later.

Knowledge base 109 uses the conceptual information from world view 115 and system/network view 117 to produce an information access description describing how to access the information required for the query in information sources 123. Knowledge base 109 provides the information access description to access plan generation and execution component 119, which formulates an access plan including the actual commands needed to retrieve the information from sources 123.

1. Plan formulation: Given the information access description, planner 119 decides on the order in which to access sources 123 and how the partial answers will be combined in order to answer the user's query. The key distinction between this step and traditional database techniques is that planner 119 can change the plan after partial answers are obtained. Replanning may of course involve inferences based on concepts from information source descriptions 113 and/or system/network view 117 and the results of the search thus far.

2. Plan materialization: The previous step produced a plan at the level of logical source accesses. This step takes these logical accesses and translates them to specific network commands. This phase has two aspects:

Format translation: the description of the sites is given at a logical level. However, to actually access the site, one must conform to a syntax of a specific query language. In this step, these translations are done.

Specific network commands are generated to access the sites. Here, information from the system/network view is taken into account. Depending on the site being accessed, the system will generate the appropriate commands for performing the access.

The translations to service and site-specific access commands are performed by Information Access Protocol Modules 121 (0 . . . n), described in the following section.

Several points should be noted about the above process:

In executing the plan, system 101 uses a work space in the computer system upon which system 101 is implemented to store its intermediate results.

After executing part of the plan, system 101 may decide to replan for the rest of the query.

Information Access Protocol Modules 121

Access to information sources is done using a variety of standard information access protocols. The purpose of these modules is to translate generic information access operations (retrieval, listing collections, searching indexes) into corresponding operations of the form expected by the information source. For many standard Internet access protocols, the translation is straightforward.

Examples of access protocols supported by these modules include several network protocols defined by Internet RFC draft standard documents, including FTP (File Transfer Protocol), Gopher, NNTP (Network News Transfer Protocol), HTTP (Hypertext Transfer Protocol). In addition, other modules support access to local (as opposed to network-based) information repositories, such as local file-systems and databases.

Site Description Language

As previously pointed out, the concepts in information source descriptions 113 relate concepts in world view 115 to information sources 123. These relationships are expressed using a site description language. CLASSIC and related knowledge representation systems employ description languages which can function as site description languages, but such site description languages do not permit efficient reasoning. In a preferred embodiment, efficiency has been substantially increased by the use of a site description language which extends CLASSIC.

The following discussion of the site description language employed in the preferred embodiment employs the example below:

Consider an application in which we can obtain information about airline flights from various travel agents. We have access to fares given by specific travel agents and to telephone directory information to obtain their phone numbers. In practice, the information about price qu