| United States Patent | 5596744 |
| Link to this page | http://www.wikipatents.com/5596744.html |
| Inventor(s) | Dao; Son K. (Northridge, CA);
Ebeid; Nader (Westlake Village, CA) |
| Abstract | Disclosed is a federated architecture and system which are extensible and
flexible for integrated access to heterogeneous DataBase Management
Systems (DBMS) dispersed over a long haul network, allowing transparent
access to a wide variety of DBMS while maintaining the local autonomy of
the underlying DBMS. In addition, the system can run on top of different
hardware, operating systems, network communications, and DBMS. The system
can include new target DBMS with minimum changes and is not limited to
integrating relational DBMS, but can also integrate legacy DBMS such as
hierarchical or network DBMS, spatial information or text retrieval
systems. |
| Publication Date |
January 21, 1997 |
References  |
U.S. References |
| Reference | Inventor | Class | Date |
| 5369761 | Conley | 707/2 | Nov 1994 |
| 5345586 | Hamala | 707/10 | Sep 1994 |
| 5339434 | Rusis | 709/246 | Aug 1994 |
| 5329619 | Page | | Jul 1994 |
| 5295261 | Simonetti | | Mar 1994 |
| 5278978 | Demers | 707/101 | Jan 1994 |
| 5257369 | Skeen | 719/312 | Oct 1993 |
| 5257366 | Adair | 707/4 | Oct 1993 |
| 5247664 | Thompson | 707/10 | Sep 1993 |
| 5239648 | Nukui | 707/10 | Aug 1993 |
| 5235701 | Ohler | | Aug 1993 |
| 5202996 | Sugino | 717/107 | Apr 1993 |
| 5202983 | Orita | 707/4 | Apr 1993 |
| 5187787 | Skeen | 719/314 | Feb 1993 |
| 5058000 | Cox | 707/10 | Oct 1991 |
| 4930071 | Tou | 707/4 | May 1990 |
| 4881166 | Thompson | 707/8 | Nov 1989 |
| 4769772 | Dwyer | 707/2 | Sep 1988 |
| 4714995 | Materna | 707/201 | Dec 1987 |
Claims  |
What is claimed is:
1. In a computer data network having a communications medium commonly
connecting a plurality of local site databases containing data with a
plurality of users each capable of generating a global data query for
accessing and retrieving data from said databases in accord with a single
global query protocol, a method for controlling and directing a
transmission of the user generated global data query to individual ones of
the plurality of databases and for receiving and integrating the requested
data received from the databases into a single response and for
transmitting the integrated single response to the requesting user, said
method comprising the following steps:
creating a smart data dictionary local site database profile containing
data representing schema, data distribution, local site configuration and
inter-site relationships of data among the local site databases in the
network, for each local site database in the network;
communicating with said smart data dictionary local site database profile
to retrieve data therefrom for decomposing the global data query into a
local-site execution plan for retrieval of data from each local site
database having data responsive to the global data query in accord with
the data contained in said smart data dictionary local site database
profile;
decomposing the global data query into a local-site execution plan for
retrieval of data from each local site database having data responsive to
the global data query in accord with the data contained in said smart data
dictionary local site database profile;
transmitting that portion of said local-site execution plan to be executed
to an appropriate said local site database for execution,
receiving data from each local site database responsive to said local-site
execution plan and creating a global response database containing such
responsive data received from each local site database;
providing the user access to the global response database in accord with
the single global query protocol.
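By way of illustration only, the steps of claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names `SiteProfile`, `SmartDataDictionary`, `decompose`, and `execute`, the in-memory dictionaries standing in for local site databases, and the sample data are all hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass, field


@dataclass
class SiteProfile:
    """Hypothetical dictionary entry: the schema held at one local site."""
    site: str
    tables: dict  # table name -> list of column names


@dataclass
class SmartDataDictionary:
    """Hypothetical stand-in for the smart data dictionary profile store."""
    profiles: dict = field(default_factory=dict)

    def register(self, profile):
        self.profiles[profile.site] = profile

    def sites_for(self, table):
        # Data distribution lookup: which sites hold this table?
        return sorted(s for s, p in self.profiles.items() if table in p.tables)


def decompose(global_query, sdd):
    """Build a local-site execution plan: site -> tables to read there."""
    plan = {}
    for table in global_query["tables"]:
        for site in sdd.sites_for(table):
            plan.setdefault(site, []).append(table)
    return plan


def execute(plan, local_databases, predicate):
    """Run each portion at its site and collect one global response."""
    global_response = []
    for site in sorted(plan):
        for table in plan[site]:
            rows = local_databases[site][table]
            global_response.extend(r for r in rows if predicate(r))
    return global_response


# Assumed sample data: two sites, both holding a "climate" table.
sdd = SmartDataDictionary()
sdd.register(SiteProfile("site_a", {"climate": ["region", "temp"]}))
sdd.register(SiteProfile("site_b", {"climate": ["region", "temp"],
                                    "ocean": ["region", "salinity"]}))
local_databases = {
    "site_a": {"climate": [{"region": "north", "temp": 12}]},
    "site_b": {"climate": [{"region": "south", "temp": 27}],
               "ocean": [{"region": "south", "salinity": 35}]},
}
plan = decompose({"tables": ["climate"]}, sdd)
response = execute(plan, local_databases, lambda row: row["temp"] > 20)
```

In this sketch the user names only the table and the filter; the dictionary lookup decides which sites are contacted, which is one simple way to picture the transparency the claim describes.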
2. A method as in claim 1 further including the steps of:
creating for each of said local site databases a plurality of local
information manager means, each communicating with said smart data
dictionary local site database profile for controlling data flow to and
from a specified local site database in the network in response to that
portion of said local-site execution plan received by the local site
database;
generating, in accord with the data contained in said smart data dictionary
local site database profile, a data retrieval request for execution by any
other local site databases necessary to complete that portion of said
local-site execution plan received by it for execution.
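The inter-site retrieval request of claim 2 may be pictured with the following Python sketch. It assumes a hypothetical `LocalInformationManager` class, a plain `directory` mapping (table name to owning site) standing in for the smart data dictionary profile, and a simple equality join as the portion of the plan to be completed; none of these names or choices come from the patent.

```python
class LocalInformationManager:
    """Hypothetical local information manager for one site."""

    def __init__(self, site, tables, directory):
        self.site = site
        self.tables = tables        # table name -> rows held locally
        self.directory = directory  # assumed dictionary: table name -> owning site
        self.peers = {}             # site name -> LocalInformationManager

    def fetch(self, table):
        # Serve the table locally if possible; otherwise issue a retrieval
        # request to the manager of the site that the directory names.
        if table in self.tables:
            return list(self.tables[table])
        owner = self.directory[table]
        return self.peers[owner].fetch(table)

    def run_portion(self, left, right, key):
        """Join two tables on `key`, pulling any non-local table from its owner."""
        right_by_key = {row[key]: row for row in self.fetch(right)}
        return [{**row, **right_by_key[row[key]]}
                for row in self.fetch(left) if row[key] in right_by_key]


# Assumed sample data: orders live at site_a, parts at site_b.
directory = {"orders": "site_a", "parts": "site_b"}
lim_a = LocalInformationManager(
    "site_a",
    {"orders": [{"part_id": 1, "qty": 4}, {"part_id": 9, "qty": 1}]},
    directory,
)
lim_b = LocalInformationManager(
    "site_b", {"parts": [{"part_id": 1, "name": "bolt"}]}, directory
)
lim_a.peers["site_b"] = lim_b
joined = lim_a.run_portion("orders", "parts", "part_id")
```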
3. The method of claim 1 wherein the step of decomposing the global query
further comprises:
parsing and validating the syntax of the global query;
identifying an appropriate said local site database containing the
requested data by interfacing with the smart data dictionary local site
database profile to retrieve information about a local query protocol and
data semantic schema established for each individual local site database
and generating a plurality of local site database queries for retrieving
responsive data from each of the local site databases containing such
responsive data;
optimizing the plurality of local site queries to generate a local-site
execution plan that optimizes a total time for executing the global query
by minimizing the amount of data needed to be transferred among local
sites and by choosing an appropriate home local site for processing the
local-site execution plan;
coordinating the execution of that portion of the local-site execution plan
that applies to each of the local site databases by:
(1) creating for each of said local site databases at least one local
information manager means;
(2) sending each said local information manager means that portion of said
local-site execution plan applicable to data requests from its associated
local site database;
(3) sending each said local information manager means a request to execute
a local reduction plan for reducing queries;
(4) sending each local information manager means a request to replicate
responsive data fragments;
(5) sending each local information manager means a request to execute
asynchronously a local data query on its associated local site database;
(6) sending each local information manager means, except a local
information manager for a home site database, a request to synchronously
send data responsive to the local data query request to the local
information manager for the home site;
(7) archiving the received data for transmission to the requesting user.
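One simple way to picture the home-site choice in the optimizing step of claim 3 is sketched below in Python. The heuristic shown (pick the site with the largest estimated partial result, so that every other site ships less data) is an assumption made for illustration; the claim does not specify the optimizer's cost model, and the function name `choose_home_site` and the row estimates are hypothetical.

```python
def choose_home_site(estimated_rows):
    """Return (home_site, rows_transferred) for a dict of site -> estimated rows.

    Assumed cost model: every site except the home site ships its whole
    partial result to the home site, so the transfer volume is the total
    minus the home site's own share.
    """
    if not estimated_rows:
        raise ValueError("no candidate sites")
    # Iterate in sorted order so ties resolve to the alphabetically first site.
    home = max(sorted(estimated_rows), key=lambda site: estimated_rows[site])
    transferred = sum(estimated_rows.values()) - estimated_rows[home]
    return home, transferred


home, transferred = choose_home_site({"site_a": 100, "site_b": 5000, "site_c": 300})
```

Under this assumed model, choosing `site_b` as the home site leaves only the 400 rows from the other two sites to cross the network.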
4. In a computer data network having a communications medium commonly
connecting a plurality of databases containing data with a plurality of
users each capable of generating a global data query for accessing and
retrieving data from said databases in accord with a single query
protocol, a globally integrated data retrieval controller architecture for
controlling and directing a transmission of the user generated global data
query to individual ones of the plurality of databases and for receiving
and integrating the requested data received from the databases into a
single response and for transmitting the integrated single response to the
requesting user, said globally integrated data retrieval controller
architecture comprising:
a smart data dictionary means containing a database of data representing
schema, data distribution, local site configuration and inter-site
relationships of data among the databases in the network, for each
database in the network;
a data information manager means communicating both with said smart data
dictionary means to retrieve data therefrom, and with said user to receive
said global data query therefrom and to transmit responsive data thereto,
for decomposing the global data query into a local-site execution plan for
retrieval of data from each database having data responsive to the global
data query in accord with the data contained in said smart data dictionary
means, and for transmitting that portion of said local-site execution plan
to be executed to an appropriate said database for execution, and
receiving data therefrom responsive to said local-site execution plan;
a plurality of local information manager means, each communicating with
said data information manager means and said smart data dictionary means,
for controlling data flow to and from a specified database in the network
in response to that portion of said local-site execution plan received
from said data information manager means and for transmitting retrieved
data responsive to that portion of said local-site execution plan to said
data information manager means,
each said local information manager means further adapted for generating,
in accord with the data contained in said smart data dictionary means, a
data retrieval request for execution by another local information manager
means and for receiving data therefrom in response thereto, in order to
complete that portion of said local-site execution plan received by it for
execution.
5. A globally integrated data retrieval controller architecture as in claim
4 further comprising:
at least one local information manager means controlling data flow to and
from at least two local databases and adapted to decompose that portion of
said local site execution plan received from said data information manager
means into a sub-local site execution plan for retrieval of data
responsive to that portion of said local site execution plan received from
said data information manager means from each of said controlled local
databases.
6. A globally integrated data retrieval controller architecture as in claim
4 wherein said data information manager means comprises:
syntactic and semantic parser means interfacing with said smart data
dictionary means for retrieving data representing local schema information
and the inter-relationships among data, and for parsing and validating the
syntax of the global data request using such data retrieved from said
smart data dictionary means.
7. A globally integrated data retrieval controller architecture as in claim
4 wherein said data information manager means comprises:
optimizer means interfacing with said smart data dictionary means for
retrieving data representing local schema information and the
inter-relationships among data, and for minimizing the amount of data
needed to be transferred among local site databases and for choosing an
appropriate said local site database for processing each portion of the
local-site execution plan.
8. A globally integrated data retrieval controller architecture as in claim
4 wherein said data information manager means comprises:
local site execution plan control means interfacing with each of said local
site databases to send each local site database that portion of said local
site execution plan necessary to extract responsive data from each said
local site database.
9. A globally integrated data retrieval controller architecture as in claim
4 wherein said local information manager means further includes:
local controller means for controlling the execution of that portion of the
local site execution plan sent by the data information manager means by
coordinating all internal operations.
10. A globally integrated data retrieval controller architecture as in
claim 9 wherein said local controller means operates synchronously.
11. A globally integrated data retrieval controller architecture as in
claim 9 wherein said local controller means operates asynchronously.
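The difference between the synchronous operation of claim 10 and the asynchronous operation of claim 11 can be illustrated with the following Python sketch. It is only an analogy under stated assumptions: the function names are hypothetical, short sleeps stand in for local DBMS latency, and Python's `asyncio` is used as a convenient modern mechanism rather than anything described in the patent.

```python
import asyncio
import time


def query_site_blocking(rows, delay=0.01):
    time.sleep(delay)  # stands in for local DBMS latency
    return rows


def run_synchronously(site_data):
    """Query one site at a time; each call finishes before the next starts."""
    merged = []
    for rows in site_data.values():
        merged.extend(query_site_blocking(rows))
    return merged


async def query_site_async(rows, delay=0.01):
    await asyncio.sleep(delay)  # yields control while "waiting" on the site
    return rows


async def run_asynchronously(site_data):
    """Start all site queries together and merge the results on completion."""
    results = await asyncio.gather(
        *(query_site_async(rows) for rows in site_data.values())
    )
    merged = []
    for rows in results:  # gather returns results in submission order
        merged.extend(rows)
    return merged


site_data = {"site_a": [1, 2], "site_b": [3]}
sync_result = run_synchronously(site_data)
async_result = asyncio.run(run_asynchronously(site_data))
```

Both variants return the same merged data; the asynchronous variant simply overlaps the waiting time of the individual sites.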
12. In a computer data network having a communications medium commonly
connecting a plurality of databases containing data with a plurality of
users each capable of generating a global data query for accessing and
retrieving data from said databases in accord with a single query
protocol, a globally integrated data retrieval controller architecture for
controlling and directing a transmission of the user generated global data
query to individual ones of the plurality of databases and for receiving
and integrating the requested data received from the databases into a
single response and for transmitting the integrated single response to the
requesting user, said globally integrated data retrieval controller
architecture comprising:
a smart data dictionary means containing a database of data representing
schema, data distribution, local site configuration and inter-site
relationships of data among the databases in the network, for each
database in the network;
a data information manager means communicating both with said smart data
dictionary means to retrieve data therefrom, and with said user to receive
said global data query therefrom and to transmit responsive data thereto,
for decomposing the global data query into a local-site execution plan for
retrieval of data from each database having data responsive to the global
data query in accord with the data contained in said smart data dictionary
means, and for transmitting that portion of said local-site execution plan
to be executed to an appropriate said database for execution, and
receiving data therefrom responsive to said local-site execution plan,
said data information manager means including syntactic and semantic parser
means interfacing with said smart data dictionary means for retrieving
data representing local schema information and the interrelationships
among data, and for parsing and validating the syntax of the global data
request using such data retrieved from said smart data dictionary means,
said data information manager means further including optimizer means
interfacing with said smart data dictionary means for retrieving data
representing local schema information and the inter-relationships among
data, and for minimizing the amount of data needed to be transferred among
local site databases and for choosing an appropriate said local site
database for processing each portion of the local-site execution plan,
said data information manager means also including local-site execution
plan control means interfacing with each of said local site databases to
send each local site database that portion of said local-site execution
plan necessary to extract responsive data from each said local site
database;
a plurality of local information manager means, each communicating with
said data information manager means and said smart data dictionary means,
for controlling data flow to and from a specified database in the network
in response to that portion of said local-site execution plan received
from said data information manager means and for transmitting retrieved
data responsive to that portion of said local-site execution plan to said
data information manager means,
each said local information manager means further adapted for generating,
in accord with the data contained in said smart data dictionary means, a
data retrieval request for execution by another local information manager
means and for receiving data therefrom in response thereto, in order to
complete that portion of said local-site execution plan received by it for
execution,
at least one local information manager means controlling data flow to and
from at least two local databases and adapted to decompose that portion of
said local-site execution plan received from said data information manager
means into a sub-local-site execution plan for retrieval of data
responsive to that portion of said local-site execution plan received from
said data information manager means from each of said controlled local
databases,
said local information manager means further includes local controller
means for controlling the execution of that portion of the local-site
execution plan sent by the data information manager means by coordinating
all internal operations.
13. A globally integrated data retrieval controller architecture as in
claim 12 wherein said local controller means operates synchronously.
14. A globally integrated data retrieval controller architecture as in
claim 12 wherein said local controller means operates asynchronously.
15. A computer data network controlling and directing transmission of a
user generated global data query to individual ones of a plurality of
nodes and associated databases and for receiving and integrating the
requested data received from the databases through the nodes into a single
response and for transmitting the integrated single response to the
requesting user, comprising:
a communication medium connecting the nodes with a plurality of users, each
capable of generating a global data request for accessing and retrieving
data from the databases through their associated nodes in accord with a
single query protocol;
a smart data dictionary node connected to the computer data network and
controlling input/output access to a database of data representing schema,
data distribution, local site configuration and inter-site relationships
of data among the nodes and their associated databases in the network, for
each node and associated database in the network;
a data information manager controller communicating both with said smart
data dictionary node to retrieve data therefrom, and with said user to
receive said global data query therefrom and to transmit responsive data
thereto, for decomposing the global data query into a local-site execution
plan for retrieval of data from each database through its associated node,
said database having data responsive to the global data query in accord
with the data contained in said database associated with said smart data
dictionary node, and for transmitting that portion of said local-site
execution plan to be executed to an appropriate node and associated
database for execution, and receiving data therefrom responsive to said
local-site execution plan;
a plurality of local information manager controllers, each communicating
with said data information manager controller and said smart data
dictionary node, for controlling data flow to and from a specified
database in the network in response to that portion of said local-site
execution plan received from said data information manager controller and
for transmitting retrieved data responsive to that portion of said
local-site execution plan to said data information manager controller,
each said local information manager controller further adapted for
generating, in accord with the data contained in said database associated
with said smart data dictionary node, a data retrieval request for
execution by another local information manager controller and for
receiving data therefrom in response thereto, in order to complete that
portion of said local-site execution plan received by it for execution.
16. A computer data network as in claim 15 further comprising:
at least one local information manager controller controlling data flow to
and from at least two local databases and adapted to decompose that
portion of said local site execution plan received from said data
information manager controller into a sub-local site execution plan for
retrieval of data responsive to that portion of said local site execution
plan received from said data information manager controller from each of
said controlled local databases.
17. A computer data network as in claim 15 wherein said data information
manager controller comprises:
a syntactic and semantic parser device interfacing with said smart data
dictionary node for retrieving data representing local schema information
and the inter-relationships among data, and for parsing and validating the
syntax of the global data request using such data retrieved from said
smart data dictionary node.
18. A computer data network as in claim 15 wherein said data information
manager controller comprises:
a data optimizer unit interfacing with said smart data dictionary node for
retrieving data representing local schema information and the
inter-relationships among data, and for minimizing the amount of data
needed to be transferred among local site databases and for choosing an
appropriate said local site database for processing each portion of the
local-site execution plan.
19. A computer data network as in claim 15 wherein said data information
manager controller comprises:
local-site execution plan control controllers interfacing with each of said
local site databases to send each local site database that portion of said
local-site execution plan necessary to extract responsive data from each
said local site database.
20. A computer data network as in claim 15 wherein said local information
manager controller further includes:
a local controller for controlling the execution of that portion of the
local site execution plan sent by the data information manager controller
by coordinating all internal operations.
21. A computer data network as in claim 20 wherein said local controller
operates synchronously.
22. A computer data network as in claim 20 wherein said local controller
operates asynchronously.
Description  |
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates in general to an architecture and method useful in
computer data networks, and, more particularly, to a federated (global)
architecture and system which are extensible and flexible for providing
users with transparent integrated access to heterogeneous DataBase
Management Systems (DBMS) dispersed over a long haul network.
2. Description of the Related Art
During the past decade, large scale organizations and environments have
adopted heterogeneous and incompatible information systems in an
uncoordinated way, independently of each other and without considering
that one day they may need to be integrated. As a result, information
systems have become more and more complex, and are characterized by
several types of heterogeneity. For example, different DataBase Management
Systems (DBMS) models may be used to represent data, such as the
hierarchical, network, and relational models. Aside from databases, many
software systems (such as spreadsheets, multi-media databases and
knowledge bases) store other types of data, each with its own data model.
Furthermore, the same data may be seen by various users at different
levels of abstraction. Because of such differences, users find it
difficult to understand the meaning of all the types of data presented to
them. Analysts, operators and current data processing technology are not
able to organize, process and intelligently analyze these diverse and
massive quantities of information. Their inefficiency often results in
late reports to decision makers, missed intelligence opportunities and
unexploited data.
One of the needs is to access and manage existing and new earth science
data. The data are being collected and stored within a number of different
DBMSs and image files for the purpose of monitoring global earth
processes. The earth science data are collected by different information
systems including data concerning: climate, land, ocean, etc., which are
composed of relational databases, images, and files. These systems were
designed independently and operate in completely different ways as to how
the data are stored and accessed. Moreover, they are tailored to different
hardware platforms. So, in order to access the data, the users must learn
how to access different systems. This increases training costs and reduces
user productivity. In addition, the majority of users do not have the
level of computer science expertise necessary to learn the different
individual systems within a short period of time, thus discouraging them
from accessing dispersed data, or in some instances from even knowing what
data are available for their use.
The same problems occur in the Computer Integrated Manufacturing (CIM)
environment. CIM is a very complex network of physical activities,
decision making and information flow. Most manufacturing facilities
contain independently designed and dispersed information bases. In such an
environment, improvement in manufacturing productivity can be obtained by
providing timely access to all essential data, local or distributed.
Present CIM systems lack a federated, i.e., global, database which
contains information required for all phases of manufacturing, that is,
design, process, assembly and inspection. Usually, manufacturing processes
are treated independently from the other phases. This is undesirable in
the sense that data or knowledge from one process is not available for use
by another. There is a need to integrate data so that they can be made
globally available to the users and processes of a CIM system.
In conclusion, there is an urgent need to integrate these dispersed data to
provide uniform access to the data, to maintain integrity of the data, and
to control its access and use. Rather than requiring users to learn a
variety of interfaces in order to access different databases, it is
preferable that a single interface be made available which provides access
to each of the DBMSs and supports queries which reference data managed by
more than one information system.
Past and current research and development in distributed databases allows
integrated access by providing a homogenizing layer on top of the
underlying information systems (UISs). Common approaches for supporting
this layer focus on defining a single uniform database language and data
model that can accommodate all features of the UISs. The two main
approaches are known as view integration and multi-database language.
The view integration approach advocates the use of a relational, an
Object-Oriented (OO), or a logic model both for defining views (virtual or
snapshot) on the schemas of more than one target database and for
formulating queries against the views. The view integration approach is
one mechanism for homogenizing the schema incompatibilities of the UISs.
In this framework, all UISs are converted to the equivalent schemas in the
standard relational, OO, or logical data model. The choice of the uniform
data model is based on its expressiveness, its representation power and
its supported environment. This technique is very powerful from the user's
point of view. It insulates the user from the design and changes of the
underlying Information Management System (IMS). Thus, it allows the user
to spend more time in an application environment. However, the view
integration approach has a limited applicability (low degree of
heterogeneity) because there are many situations when the semantics of the
data are deeply dependent on the way in which the applications manipulate
it, and are only partially expressed by the schema. Many recent
applications in areas where traditional DBMSs are not usable fall into
this situation (multi-media applications involving Text, Graphics and
Images are typical examples). In addition, there are no available tools to
semi-automate the building and the maintenance of the unified view which
is vital to the success of this technique.
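As a rough illustration of the view integration approach described above, the following Python sketch defines one unified view over two differently structured tables held in separate databases. It uses the standard `sqlite3` module with two in-memory databases purely for convenience; the table names, column names, sample values, and the Fahrenheit-to-Celsius conversion are assumptions invented for the example.

```python
import sqlite3

# Two in-memory SQLite databases stand in for two target databases.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS site_b")

# The two sites use different table names, column names, and units.
conn.execute("CREATE TABLE main.stations (name TEXT, temp_c REAL)")
conn.execute("CREATE TABLE site_b.readings (station TEXT, temp_f REAL)")
conn.execute("INSERT INTO main.stations VALUES ('alpha', 20.0)")
conn.execute("INSERT INTO site_b.readings VALUES ('beta', 212.0)")

# A temporary view may span attached databases; it homogenizes the
# schema differences so that queries see a single relation.
conn.execute("""
    CREATE TEMP VIEW unified_temps AS
    SELECT name AS station, temp_c FROM main.stations
    UNION ALL
    SELECT station, (temp_f - 32.0) * 5.0 / 9.0 AS temp_c FROM site_b.readings
""")

rows = conn.execute(
    "SELECT station, temp_c FROM unified_temps ORDER BY station"
).fetchall()
```

The user queries `unified_temps` without knowing which database holds which station. The limitation noted above still applies: the view captures only what the schemas express, not semantics buried in the applications.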
In the multi-database language approach, a user, or application, must
understand the contents of each UIS in order to access the shared
information and to resolve conflicts of facts in a manner particular to
each application. There is no global schema to provide advice about the
meta-data. Ease of maintenance and ability to deal with inconsistent
databases make this approach very attractive. The major drawback of this
approach is that the burden of understanding the underlying IMSs lies on
the user. Accordingly, there is a tradeoff between this multi-database
language approach and the view integration approach discussed above. This
invention addresses the deficiencies of the above two
approaches.
OBJECTS AND SUMMARY OF THE INVENTION
Therefore, it is an object of the present invention to provide a federated
(global) architecture and system which are extensible and flexible for
providing users with transparent integrated access to heterogeneous DBMS
dispersed over a long haul network.
It is another object of the present invention to provide an
architecture and system for use in multiple large information management
systems that are geographically dispersed, such as Command and Control,
Computer Integrated Manufacturing, Medical Information Management, and
many applications in intelligent analysis and decision support domains
that will enable more effective and transparent access to existing high
data volume sources that are collected and stored with different
geographically dispersed DBMSs.
It is still another object of the present invention to provide a federated
information management architecture and system where the users have only
to learn one single interface and one unified view of the data.
The invention, residing in the Federated Information Management (FIM)
architecture described and claimed herein, allows the end-user to access
multiple geographically dispersed information management systems. It
provides the end-user with a unified view of the underlying information
management systems. Data distribution and location transparencies are
supported by the FIM architecture of the present invention. This means
that the end-user does not need to know how the data is distributed and
its location in order to share and access relevant data. In addition, the
FIM architecture of the present invention can integrate both existing and
new information management systems.
Among the advantages of the invention are the following:
1) The invention allows distributed access without a change to the
underlying existing databases;
2) The invention allows a decrease in training cost and time for learning
different DBMSs leading to an improvement in user productivity;
3) The invention is able to utilize, share and combine data that is
otherwise dispersed in many different physical and logical locations;
4) The invention allows the overall system to evolve and include new
information management systems with minimum change; and,
5) The invention is able to adapt to and interface with DBMSs from
different, normally incompatible database vendors.
One novel aspect of the invention therefore is the federated architecture
coupled with the Inter-Site Transaction Service (ISTS) architecture to
allow transparent access to a wide variety of DBMSs while maintaining the
local autonomy of the underlying DBMSs. With this architecture,
the FIM of the present invention can run on top of different hardware,
operating systems, communication networks, and DBMSs. In addition, the
system of the present invention can evolve to include new target DBMSs
with minimum changes. The federated architecture of the present invention
is not limited to integrating relational DBMSs, but may also integrate
legacy DBMSs such as hierarchical or network DBMSs, spatial information
systems or Geographical Information Systems, and text retrieval systems.
The present invention, therefore, provides an Intelligent Integration of
Information environment to support seamless access to large scale
heterogeneous information management systems which include relational,
spatial, and text systems. The invention includes the following features
to support this environment:
(1) A federated architecture that supports transparent access to multiple
database systems. It provides the end-user with a unified view of the
underlying database systems. Local autonomy of the underlying database
systems is fully maintained in the federated architecture. This means
that the users can still use the same application to access the local
databases, and only minimum change to the local database system is
required for sharing and remotely accessing relevant data. The architecture
includes several distributed query optimization methods for fragmented and
replicated data. This optimization ability improves the total query cost
by reducing the transmission and the processing costs of the overall
system. Also, the architecture uses fragment dependency information, a
technique called semantic query optimization, to improve the total cost. A high
layer of distributed transaction services is also provided to separate the
lower layer network communication protocols from the distributed query
processing protocols. Detail design of this architecture and the Federated
Information Management (FIM) are | | |