WikiPatents - Community Patent Review
Apparatus and method for providing users with transparent integrated access to heterogeneous database management systems    
United States Patent 5596744
Link to this page: http://www.wikipatents.com/5596744.html
Inventor(s): Dao; Son K. (Northridge, CA); Ebeid; Nader (Westlake Village, CA)
Abstract: Disclosed is a federated architecture and system which are extensible and flexible for integrated access to heterogeneous DataBase Management Systems (DBMS) dispersed over a long haul network, allowing transparent access to a wide variety of DBMS while maintaining the local autonomy of the underlying DBMS. In addition, the system can run on top of different hardware, operating systems, network communications, and DBMS. The system can include new target DBMS with minimum changes and is not limited to integrating relational DBMS; it can also integrate legacy DBMS such as hierarchical or network DBMS, spatial information systems, and text retrieval systems.
Inventor     Dao; Son K. (Northridge, CA); Ebeid; Nader (Westlake Village, CA)
Owner/Assignee     Hughes Aircraft Company (Los Angeles, CA)
Publication Date     January 21, 1997
Application Number     08/064,690
Filing Date     May 20, 1993
US Classification     707/10
Int'l Classification     G06F 017/30
Examiner     Black; Thomas G.
Assistant Examiner     Choules; Jack M.
Attorney/Law Firm     Duraiswamy; V. D., Denson-Low; W. K.
USPTO Field of Search     395/600
Claims
 


What is claimed is:

1. In a computer data network having a communications medium commonly connecting a plurality of local site databases containing data with a plurality of users each capable of generating a global data query for accessing and retrieving data from said databases in accord with a single global query protocol, a method for controlling and directing a transmission of the user generated global data query to individual ones of the plurality of databases and for receiving and integrating the requested data received from the databases into a single response and for transmitting the integrated single response to the requesting user, said method comprising the following steps:

creating a smart data dictionary local site database profile containing data representing schema, data distribution, local site configuration and inter-site relationships of data among the local site databases in the network, for each local site database in the network;

communicating with said smart data dictionary local site database profile to retrieve data therefrom for decomposing the global data query into a local-site execution plan for retrieval of data from each local site database having data responsive to the global data query in accord with the data contained in said smart data dictionary local site database profile;

decomposing the global data query into a local-site execution plan for retrieval of data from each local site database having data responsive to the global data query in accord with the data contained in said smart data dictionary local site database profile;

transmitting that portion of said local-site execution plan to be executed to an appropriate said local site database for execution;

receiving data from each local site database responsive to said local-site execution plan and creating a global response database containing such responsive data received from each local site database;

providing the user access to the global response database in accord with the single global query protocol.
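The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `SmartDataDictionary`, `decompose`, `execute_plan` and `global_query`, the in-memory site data, and the reduction of a "global query" to one table name plus a row predicate are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, int]


@dataclass
class SmartDataDictionary:
    """Hypothetical profile: which local sites hold rows of which table."""
    table_sites: Dict[str, List[str]] = field(default_factory=dict)

    def sites_for(self, table: str) -> List[str]:
        return self.table_sites.get(table, [])


def decompose(table: str, dictionary: SmartDataDictionary) -> Dict[str, str]:
    """Build a local-site execution plan: one portion per site holding the table."""
    return {site: table for site in dictionary.sites_for(table)}


def execute_plan(plan: Dict[str, str],
                 local_sites: Dict[str, Dict[str, List[Row]]],
                 predicate: Callable[[Row], bool]) -> List[Row]:
    """Run each plan portion at its site and merge the rows into one response."""
    global_response: List[Row] = []
    for site, table in plan.items():
        rows = local_sites[site].get(table, [])
        global_response.extend(row for row in rows if predicate(row))
    return global_response


# Toy local site databases (assumed data, not from the patent).
LOCAL_SITES = {
    "site_a": {"stations": [{"id": 1, "temp": 12}, {"id": 2, "temp": 31}]},
    "site_b": {"stations": [{"id": 3, "temp": 28}]},
}
DICTIONARY = SmartDataDictionary({"stations": ["site_a", "site_b"]})


def global_query(table: str, predicate: Callable[[Row], bool]) -> List[Row]:
    """The user sees one query and one integrated response."""
    plan = decompose(table, DICTIONARY)
    return execute_plan(plan, LOCAL_SITES, predicate)


hot = global_query("stations", lambda row: row["temp"] > 25)
```

In this sketch the caller never names a site; the dictionary lookup alone decides where each portion of the plan runs, which is the transparency the claim describes.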

2. A method as in claim 1 further including the steps of:

creating for each of said local site databases a plurality of local information manager means, each communicating with said smart data dictionary local site database profile for controlling data flow to and from a specified local site database in the network in response to that portion of said local-site execution plan received by the local site database;

generating, in accord with the data contained in said smart data dictionary local site database profile, a data retrieval request for execution by any other local site databases necessary to complete that portion of said local-site execution plan received by it for execution.

3. The method of claim 1 wherein the step of decomposing the global query further comprises:

parsing and validating the syntax of the global query;

identifying an appropriate said local site database containing the requested data by interfacing with the smart data dictionary local site database profile to retrieve information about a local query protocol and data semantic schema established for each individual local site database and generating a plurality of local site database queries for retrieving responsive data from each of the local site databases containing such responsive data;

optimizing the plurality of local site queries to generate a local-site execution plan that optimizes a total time for executing the global query by minimizing the amount of data needed to be transferred among local sites and by choosing an appropriate home local site for processing the local-site execution plan;

coordinating the execution of that portion of the local-site execution plan that applies to each of the local site databases by:

(1) creating for each of said local site databases at least one local information manager means;

(2) sending each said local information manager means that portion of said local-site execution plan applicable to data requests from its associated local site database;

(3) sending each said local information manager means a request to execute a local reduction plan for reducing queries;

(4) sending each local information manager means a request to replicate responsive data fragments;

(5) sending each local information manager means a request to execute asynchronously a local data query on its associated local site database;

(6) sending each local information manager means, except a local information manager for a home site database, a request to synchronously send data responsive to the local data query request to the local information manager for the home site;

(7) archiving the received data for transmission to the requesting user.
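The optimization step of claim 3 names two goals: minimizing the data moved among local sites and choosing a home site. The claim does not specify a cost model, so the following Python sketch assumes the simplest one: each site has an estimated result size, every non-home site ships its result to the home site, and the home site is therefore the one with the largest estimated result. The function names and the size estimates are hypothetical.

```python
from typing import Dict, List, Tuple


def choose_home_site(result_sizes: Dict[str, int]) -> Tuple[str, int]:
    """Pick the site whose local result is largest, so it never has to move.

    Returns the home site and the total amount of data the other sites ship.
    """
    if not result_sizes:
        raise ValueError("no local sites in the plan")
    home = max(result_sizes, key=lambda site: result_sizes[site])
    shipped = sum(result_sizes.values()) - result_sizes[home]
    return home, shipped


def shipping_plan(result_sizes: Dict[str, int]) -> List[Tuple[str, str, int]]:
    """List (source site, home site, size) transfers for every non-home site."""
    home, _ = choose_home_site(result_sizes)
    return [(site, home, size)
            for site, size in result_sizes.items() if site != home]


# Assumed per-site result size estimates, e.g. in rows.
ESTIMATES = {"site_a": 100, "site_b": 4000, "site_c": 250}
home, shipped = choose_home_site(ESTIMATES)
plan = shipping_plan(ESTIMATES)
```

A fuller optimizer would also weigh link speeds and local processing costs; this sketch isolates only the home-site choice.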

4. In a computer data network having a communications medium commonly connecting a plurality of databases containing data with a plurality of users each capable of generating a global data query for accessing and retrieving data from said databases in accord with a single query protocol, a globally integrated data retrieval controller architecture for controlling and directing a transmission of the user generated global data query to individual ones of the plurality of databases and for receiving and integrating the requested data received from the databases into a single response and for transmitting the integrated single response to the requesting user, said globally integrated data retrieval controller architecture comprising:

a smart data dictionary means containing a database of data representing schema, data distribution, local site configuration and inter-site relationships of data among the databases in the network, for each database in the network;

a data information manager means communicating both with said smart data dictionary means to retrieve data therefrom, and with said user to receive said global data query therefrom and to transmit responsive data thereto, for decomposing the global data query into a local-site execution plan for retrieval of data from each database having data responsive to the global data query in accord with the data contained in said smart data dictionary means, and for transmitting that portion of said local-site execution plan to be executed to an appropriate said database for execution, and receiving data therefrom responsive to said local-site execution plan;

a plurality of local information manager means, each communicating with said data information manager means and said smart data dictionary means, for controlling data flow to and from a specified database in the network in response to that portion of said local-site execution plan received from said data information manager means and for transmitting retrieved data responsive to that portion of said local-site execution plan to said data information manager means,

each said local information manager means further adapted for generating, in accord with the data contained in said smart data dictionary means, a data retrieval request for execution by another local information manager means and for receiving data therefrom in response thereto, in order to complete that portion of said local-site execution plan received by it for execution.
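The last element of claim 4, in which one local information manager requests data from another to complete its own portion of the plan, can be sketched as follows. This is an illustrative assumption rather than the disclosed design: the class `LocalInformationManager`, its `fetch` and `join` methods, the dictionary that maps each table to a single owning site, and the sample rows are all hypothetical.

```python
from typing import Dict, List

Row = Dict[str, object]


class LocalInformationManager:
    """Hypothetical manager for one site's tables, able to ask peer managers."""

    def __init__(self, site: str, tables: Dict[str, List[Row]],
                 dictionary: Dict[str, str]) -> None:
        self.site = site
        self.tables = tables          # table name -> rows held locally
        self.dictionary = dictionary  # table name -> owning site
        self.peers: Dict[str, "LocalInformationManager"] = {}

    def fetch(self, table: str) -> List[Row]:
        """Return a table's rows, asking the owning peer when it is not local."""
        if table in self.tables:
            return self.tables[table]
        owner = self.dictionary[table]
        return self.peers[owner].fetch(table)

    def join(self, left: str, right: str, key: str) -> List[Row]:
        """Join two tables on `key`, wherever each of them lives."""
        right_by_key = {row[key]: row for row in self.fetch(right)}
        return [{**row, **right_by_key[row[key]]}
                for row in self.fetch(left) if row[key] in right_by_key]


# Assumed dictionary contents and sample rows.
DICTIONARY = {"orders": "site_a", "parts": "site_b"}
lim_a = LocalInformationManager(
    "site_a",
    {"orders": [{"order": 1, "part_id": 7}, {"order": 2, "part_id": 9}]},
    DICTIONARY,
)
lim_b = LocalInformationManager(
    "site_b", {"parts": [{"part_id": 7, "name": "valve"}]}, DICTIONARY
)
lim_a.peers["site_b"] = lim_b
lim_b.peers["site_a"] = lim_a

joined = lim_a.join("orders", "parts", "part_id")
```

Here `lim_a` completes its join by retrieving `parts` from `lim_b`, without the caller knowing that the table is remote.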

5. A globally integrated data retrieval controller architecture as in claim 4 further comprising:

at least one local information manager means controlling data flow to and from at least two local databases and adapted to decompose that portion of said local site execution plan received from said data information manager means into a sub-local site execution plan for retrieval of data responsive to that portion of said local site execution plan received from said data information manager means from each of said controlled local databases.

6. A globally integrated data retrieval controller architecture as in claim 4 wherein said data information manager means comprises:

syntactic and semantic parser means interfacing with said smart data dictionary means for retrieving data representing local schema information and the inter-relationships among data, and for parsing and validating the syntax of the global data request using such data retrieved from said smart data dictionary means.

7. A globally integrated data retrieval controller architecture as in claim 4 wherein said data information manager means comprises:

optimizer means interfacing with said smart data dictionary means for retrieving data representing local schema information and the inter-relationships among data, and for minimizing the amount of data needed to be transferred among local site databases and for choosing an appropriate said local site database for processing each portion of the local-site execution plan.

8. A globally integrated data retrieval controller architecture as in claim 4 wherein said data information manager means comprises:

local site execution plan control means interfacing with each of said local site databases to send each local site database that portion of said local site execution plan necessary to extract responsive data from each said local site database.

9. A globally integrated data retrieval controller architecture as in claim 4 wherein said local information manager means further includes:

local controller means for controlling the execution of that portion of the local site execution plan sent by the data information manager means by coordinating all internal operations.

10. A globally integrated data retrieval controller architecture as in claim 9 wherein said local controller means operates synchronously.

11. A globally integrated data retrieval controller architecture as in claim 9 wherein said local controller means operates asynchronously.

12. In a computer data network having a communications medium commonly connecting a plurality of databases containing data with a plurality of users each capable of generating a global data query for accessing and retrieving data from said databases in accord with a single query protocol, a globally integrated data retrieval controller architecture for controlling and directing a transmission of the user generated global data query to individual ones of the plurality of databases and for receiving and integrating the requested data received from the databases into a single response and for transmitting the integrated single response to the requesting user, said globally integrated data retrieval controller architecture comprising:

a smart data dictionary means containing a database of data representing schema, data distribution, local site configuration and inter-site relationships of data among the databases in the network, for each database in the network;

a data information manager means communicating both with said smart data dictionary means to retrieve data therefrom, and with said user to receive said global data query therefrom and to transmit responsive data thereto, for decomposing the global data query into a local-site execution plan for retrieval of data from each database having data responsive to the global data query in accord with the data contained in said smart data dictionary means, and for transmitting that portion of said local-site execution plan to be executed to an appropriate said database for execution, and receiving data therefrom responsive to said local-site execution plan,

said data information manager means including syntactic and semantic parser means interfacing with said smart data dictionary means for retrieving data representing local schema information and the interrelationships among data, and for parsing and validating the syntax of the global data request using such data retrieved from said smart data dictionary means,

said data information manager means further including optimizer means interfacing with said smart data dictionary means for retrieving data representing local schema information and the inter-relationships among data, and for minimizing the amount of data needed to be transferred among local site databases and for choosing an appropriate said local site database for processing each portion of the local-site execution plan,

said data information manager means also including local-site execution plan control means interfacing with each of said local site databases to send each local site database that portion of said local-site execution plan necessary to extract responsive data from each said local site database;

a plurality of local information manager means, each communicating with said data information manager means and said smart data dictionary means, for controlling data flow to and from a specified database in the network in response to that portion of said local-site execution plan received from said data information manager means and for transmitting retrieved data responsive to that portion of said local-site execution plan to said data information manager means,

each said local information manager means further adapted for generating, in accord with the data contained in said smart data dictionary means, a data retrieval request for execution by another local information manager means and for receiving data therefrom in response thereto, in order to complete that portion of said local-site execution plan received by it for execution,

at least one local information manager means controlling data flow to and from at least two local databases and adapted to decompose that portion of said local-site execution plan received from said data information manager means into a sub-local-site execution plan for retrieval of data responsive to that portion of said local-site execution plan received from said data information manager means from each of said controlled local databases,

said local information manager means further includes local controller means for controlling the execution of that portion of the local-site execution plan sent by the data information manager means by coordinating all internal operations.

13. A globally integrated data retrieval controller architecture as in claim 12 wherein said local controller means operates synchronously.

14. A globally integrated data retrieval controller architecture as in claim 12 wherein said local controller means operates asynchronously.

15. A computer data network controlling and directing transmission of a user generated global data query to individual ones of a plurality of nodes and associated databases and for receiving and integrating the requested data received from the databases through the nodes into a single response and for transmitting the integrated single response to the requesting user, comprising:

a communication medium connecting the nodes with a plurality of users, each capable of generating a global data request for accessing and retrieving data from the databases through their associated nodes in accord with a single query protocol;

a smart data dictionary node connected to the computer data network and controlling input/output access to a database of data representing schema, data distribution, local site configuration and inter-site relationships of data among the nodes and their associated databases in the network, for each node and associated database in the network;

a data information manager controller communicating both with said smart data dictionary node to retrieve data therefrom, and with said user to receive said global data query therefrom and to transmit responsive data thereto, for decomposing the global data query into a local-site execution plan for retrieval of data from each database through its associated node, said database having data responsive to the global data query in accord with the data contained in said database associated with said smart data dictionary node, and for transmitting that portion of said local-site execution plan to be executed to an appropriate node and associated database for execution, and receiving data therefrom responsive to said local-site execution plan;

a plurality of local information manager controllers, each communicating with said data information manager controller and said smart data dictionary node, for controlling data flow to and from a specified database in the network in response to that portion of said local-site execution plan received from said data information manager controller and for transmitting retrieved data responsive to that portion of said local-site execution plan to said data information manager controller,

each said local information manager controller further adapted for generating, in accord with the data contained in said database associated with said smart data dictionary node, a data retrieval request for execution by another local information manager controller and for receiving data therefrom in response thereto, in order to complete that portion of said local-site execution plan received by it for execution.

16. A computer data network as in claim 15 further comprising:

at least one local information manager controller controlling data flow to and from at least two local databases and adapted to decompose that portion of said local site execution plan received from said data information manager controller into a sub-local site execution plan for retrieval of data responsive to that portion of said local site execution plan received from said data information manager controller from each of said controlled local databases.

17. A computer data network as in claim 15 wherein said data information manager controller comprises:

a syntactic and semantic parser device interfacing with said smart data dictionary node for retrieving data representing local schema information and the inter-relationships among data, and for parsing and validating the syntax of the global data request using such data retrieved from said smart data dictionary node.

18. A computer data network as in claim 15 wherein said data information manager controller comprises:

a data optimizer unit interfacing with said smart data dictionary node for retrieving data representing local schema information and the inter-relationships among data, and for minimizing the amount of data needed to be transferred among local site databases and for choosing an appropriate said local site database for processing each portion of the local-site execution plan.

19. A computer data network as in claim 15 wherein said data information manager controller comprises:

local-site execution plan control controllers interfacing with each of said local site databases to send each local site database that portion of said local-site execution plan necessary to extract responsive data from each said local site database.

20. A computer data network as in claim 15 wherein said local information manager controller further includes:

a local controller for controlling the execution of that portion of the local site execution plan sent by the data information manager controller by coordinating all internal operations.

21. A computer data network as in claim 20 wherein said local controller operates synchronously.

22. A computer data network as in claim 20 wherein said local controller operates asynchronously.
Description
 


BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates in general to an architecture and method useful in computer data networks, and, more particularly, to a federated (global) architecture and system which are extensible and flexible for providing users with transparent integrated access to heterogeneous DataBase Management Systems (DBMS) dispersed over a long haul network.

2. Description of the Related Art

During the past decade, large scale organizations and environments have adopted heterogeneous and incompatible information systems in an uncoordinated way, independently of each other and without considering that one day they might need to be integrated. As a result, information systems have become more and more complex, and are characterized by several types of heterogeneity. For example, different DataBase Management System (DBMS) models may be used to represent data, such as the hierarchical, network, and relational models. Aside from databases, many software systems (such as spreadsheets, multi-media databases and knowledge bases) store other types of data, each with its own data model. Furthermore, the same data may be seen by various users at different levels of abstraction. Because of such differences, users find it difficult to understand the meaning of all the types of data presented to them. Analysts, operators and current data processing technology are not able to organize, process and intelligently analyze these diverse and massive quantities of information. This inefficiency often results in late reports to decision makers, missed intelligence opportunities and unexploited data.

One of the needs is to access and manage existing and new earth science data. The data have been collected and stored within a number of different DBMSs and image files for the purpose of monitoring global earth processes. The earth science data, including data concerning climate, land, ocean, etc., are collected by different information systems, which are composed of relational databases, images, and files. These systems were designed independently and operate in completely different ways as to how the data are stored and accessed. Moreover, they are tailored to different hardware platforms. So, in order to access the data, the users must learn how to access different systems. This increases training costs and reduces user productivity. In addition, the majority of users do not have the level of computer science expertise necessary to learn the different individual systems within a short period of time, which discourages them from accessing dispersed data and, in some instances, keeps them from even knowing what data are available for their use.

The same problems occur in the Computer Integrated Manufacturing (CIM) environment. CIM is a very complex network of physical activities, decision making and information flow. Most manufacturing facilities contain independently designed and dispersed information bases. In such an environment, improvement in manufacturing productivity can be obtained by providing timely access to all essential data, local or distributed. Present CIM systems lack a federated, i.e., global, database which contains information required for all phases of manufacturing, that is, design, process, assembly and inspection. Usually, manufacturing processes are treated independently from the other phases. This is undesirable in the sense that data or knowledge from one process is not available for use by another. There is a need to integrate data so that they can be made globally available to the users and processes of a CIM system.

In conclusion, there is an urgent need to integrate these dispersed data to provide uniform access to the data, to maintain integrity of the data, and to control its access and use. Rather than requiring users to learn a variety of interfaces in order to access different databases, it is preferable that a single interface be made available which provides access to each of the DBMSs and supports queries which reference data managed by more than one information system.

Past and current research and development in distributed databases allows integrated access by providing a homogenizing layer on top of the underlying information systems (UISs). Common approaches for supporting this layer focus on defining a single uniform database language and data model that can accommodate all features of the UISs. The two main approaches are known as view integration and multi-database language.

The view integration approach advocates the use of a relational, an Object-Oriented (OO), or a logic model both for defining views (virtual or snapshot) on the schemas of more than one target database and for formulating queries against the views. The view integration approach is one mechanism for homogenizing the schema incompatibilities of the UISs. In this framework, all UISs are converted to the equivalent schemas in the standard relational, OO, or logical data model. The choice of the uniform data model is based on its expressiveness, its representation power and its supported environment. This technique is very powerful from the user's point of view. It insulates the user from the design and changes of the underlying Information Management System (IMS). Thus, it allows the user to spend more time in an application environment. However, the view integration approach has limited applicability (low degree of heterogeneity) because there are many situations in which the semantics of the data depend deeply on the way the applications manipulate the data, and are only partially expressed by the schema. Many recent applications in areas where traditional DBMSs are not usable fall into this situation (multi-media applications involving text, graphics and images are typical examples). In addition, there are no available tools to semi-automate the building and the maintenance of the unified view, which is vital to the success of this technique.

In the multi-database language approach, a user, or application, must understand the contents of each UIS in order to access the shared information and to resolve conflicts of facts in a manner particular to each application. There is no global schema to provide advice about the meta-data. Ease of maintenance and the ability to deal with inconsistent databases make this approach very attractive. The major drawback of this approach is that the burden of understanding the underlying IMSs lies on the user. Accordingly, there is a tradeoff between this multi-database language approach and the view integration approach discussed above. This invention addresses the deficiencies of both approaches.

OBJECTS AND SUMMARY OF THE INVENTION

Therefore, it is an object of the present invention to provide a federated (global) architecture and system which are extensible and flexible for providing users with transparent integrated access to heterogeneous DBMS dispersed over a long haul network.

It is another object of the present invention to provide an architecture and system for use in multiple large information management systems that are geographically dispersed, such as Command and Control, Computer Integrated Manufacturing, Medical Information Management, and many applications in intelligent analysis and decision support domains, that will enable more effective and transparent access to existing high data volume sources that are collected and stored within different geographically dispersed DBMSs.

It is still another object of the present invention to provide a federated information management architecture and system where the users have only to learn one single interface and one unified view of the data.

The invention, which resides in the Federated Information Management (FIM) architecture described and claimed herein, allows the end-user to access multiple geographically dispersed information management systems. It provides the end-user with a unified view of the underlying information management systems. Data distribution and location transparencies are supported by the FIM architecture of the present invention. This means that the end-user does not need to know how the data are distributed or where they are located in order to share and access relevant data. In addition, the FIM architecture of the present invention can integrate both existing and new information management systems.

Among the advantages of the invention are the following:

1) The invention allows distributed access without a change to the underlying existing databases;

2) The invention allows a decrease in training cost and time for learning different DBMSs leading to an improvement in user productivity;

3) The invention is able to utilize, share and combine data that is otherwise dispersed in many different physical and logical locations;

4) The invention allows the overall system to evolve and include new information management systems with minimum change; and,

5) The invention is able to adapt to and interface with DBMSs from different, normally incompatible database vendors.

One novel aspect of the invention therefore is the federated architecture coupled with the Inter-Site Transaction Service (ISTS) architecture to allow transparent access to a wide variety of DBMSs while maintaining the local autonomy of the underlying DBMSs. With this architecture, the FIM of the present invention can run on top of different hardware, operating systems, communication networks, and DBMSs. In addition, the system of the present invention can evolve to include new target DBMSs with minimum changes. The federated architecture of the present invention is not limited to integrating relational DBMSs, but may also integrate legacy DBMSs such as hierarchical or network DBMSs, spatial information systems or geographical information systems, and text retrieval systems.

The present invention, therefore, provides an Intelligent Integration of Information environment to support seamless access to large scale heterogeneous information management systems, which include relational, spatial, and text systems. The invention includes the following features to support this environment:

(1) A federated architecture that supports transparent access to multiple database systems. It provides the end-user with a unified view of the underlying database systems. Local autonomy of the underlying database systems is fully maintained in the federated architecture. This means that the users can still use the same application to access the local databases, and only minimum change to the local database system is required for sharing and remotely accessing relevant data. The architecture includes several distributed query optimization methods for fragmented and replicated data. This optimization ability lowers the total query cost by reducing the transmission and the processing costs of the overall system. Also, the architecture uses information about fragment dependencies, a technique called semantic query optimization, to further reduce the total cost. A high layer of distributed transaction services is also provided to separate the lower layer network communication protocols from the distributed query processing protocols. Detailed design of this architecture and the Federated Information Management (FIM) are