|
Description  |
|
|
CROSS-REFERENCE TO OTHER APPLICATIONS
The-following applications of common assignee contain some common
disclosure, and are believed to have an effective filing date identical
with that of the present application:
U.S. patent application entitled "Performance Optimization In a
Heterogeneous, Distributed Database Environment", Ser. No. 08/314,643,
filed on Sep. 29, 1994, now abandoned.
U.S. patent application entitled "Pass Through In a Distributed
Multi-Database System", Ser. No. 08/310,799, filed on Sep. 29, 1994, now
pending.
U.S. patent application entitled "Push Down Optimization in a Distributed,
Multi-Database System" Ser. No. 08/310,797, filed on Sep. 29, 1994, now
pending.
The above-listed applications are incorporated herein by reference in their
entireties.
DESCRIPTION
1. Technical Field
The present invention relates generally to computer database systems, and
more particularly to functional compensation in a heterogeneous,
distributed database environment.
2. Background Art
In modern data processing environments, a client's data is often
distributed among a plurality of heterogeneous database systems (more
precisely, a client's data is distributed among a plurality of database
instances which were instantiated from different database management
systems). "Heterogeneous database systems" are database systems that have
different data definition and manipulation procedures, security
procedures, system management approaches, capabilities, etc. Examples of
"heterogeneous database systems" include DB2 produced by International
Business Machines (IBM) Corporation, Oracle produced by Oracle Corp.,
Sybase produced by Sybase Inc., etc. Such heterogeneous database systems,
when used together, collectively represent a heterogeneous, distributed
database environment (or system). Heterogeneous, distributed database
systems are also sometimes called federated database systems and/or
multi-database systems.
In order to enhance user-friendliness, it is preferred that clients be
provided with a common interface to all of the heterogeneous database
systems (heterogeneous database systems to which a client is not directly
connected are called back-end database systems, or simply back-ends). In
other words, it is preferred that clients be under the illusion that they
are interacting with a single database system.
One conventional approach for achieving this goal is to introduce an
interface module between the clients and the back-end database systems.
This interface module, also called database middleware or data access
middleware, attempts to provide to clients transparent access to the
back-end database systems. Generally speaking, the interface module
receives data definition and manipulation instructions from clients. The
interface module translates these instructions such that they are
understandable to the appropriate back-end database systems, and then
transfers the translated instructions to the appropriate back-end database
systems. Similarly, the interface module translates information and
messages received from the back-end database systems such that they are
understandable to the appropriate clients, and then transfers the
translated information and messages to the appropriate clients.
Generally, back-end database systems support different sets of functions.
For example, some back-end database systems (such as DB2) support the
declaration of cursors "with hold". Other back-end database systems (such
as current versions of Oracle and Sybase) do not support this function.
Some conventional interface modules address this functional dissimilarity
problem by relying on a "least-common denominator" approach wherein the
only functions that are supported are those functions that are supported
by all of the back-ends. This is not an optimal approach, however, because
it does not allow clients to take advantage of all of the functions
offered by all of the back-ends.
Other conventional interface modules address the functional dissimilarity
problem by disallowing the use of functions when operating with back-ends
that do not support the functions. Such functions are allowed when
operating with back-ends that do support the functions. This is not an
optimal approach, however, since it violates location transparency. That
is, clients must be aware of which back-ends they are interacting with.
Thus, what is needed is an improved system and method for addressing the
functional dissimilarity problem in a heterogeneous, distributed database
environment.
DISCLOSURE OF INVENTION
The present invention is directed to a system and method of compensating
for functional differences between heterogeneous database management
systems, wherein data associated with a client is distributed among the
heterogeneous database management systems. The present invention simulates
support of multiple pending actions on a single connection in any of the
heterogeneous database management systems which does not support multiple
pending actions on a single connection. Also, the present invention: (1)
simulates support of cursors declared "with hold" in any of the
heterogeneous database management systems which does not support cursors
declared "with hold"; (2) simulates support of positioned update actions
in any of the heterogeneous database management systems which does not
support positioned update actions; (3) simulates support of host variables
in any of the heterogeneous database management systems which does not
support host variables; and (4) compensates for security log-in procedure
differences between the heterogeneous database management systems.
Further features and advantages of the present invention, as well as the
structure and operation of various embodiments of the present invention,
are described in detail below with reference to the accompanying drawings.
In the drawings, like reference numbers indicate identical or functionally
similar elements.
BRIEF DESCRIPTION OF DRAWINGS
The present invention will be described with reference to the accompanying
drawings, wherein:
FIGS. 1 and 2 are block diagrams of a heterogeneous, distributed database
system according to a preferred embodiment of the present invention; and
FIGS. 3-8 are flowcharts depicting the operation of the present invention.
BEST MODE FOR CARRYING OUT THE INVENTION
1. Overview of the Present Invention
For illustrative purposes, the present invention is sometimes described
herein using well known SQL concepts, statements, and syntax. As will be
appreciated, SQL (structured query language) is a well known database
language originally developed by International Business Machines (IBM)
Corporation, and now supported by many relational database vendors, and
defined by various standards bodies, such as the American National
Standards Institute (ANSI). It should be understood, however, that
reference is made to SQL for convenience purposes only. The present
invention is intended and adapted to operate with database management
systems which do not support SQL.
FIG. 1 is a block diagram of a heterogeneous, distributed database system
102 according to a preferred embodiment of the present invention. The
heterogeneous, distributed database system 102 includes one or more client
application processes (also called, simply, "clients"), collectively
represented by client 104.
Client 104 is connected to a plurality of instances of back-end database
management systems (DBMS) (such as database instances 110A-110F) via an
interface module 106. The database instances 110A-110F represent
instantiations of a plurality of heterogeneous database management
systems, such as DB2 produced by International Business Machines (IBM)
Corporation, Oracle produced by Oracle Corp., Sybase produced by Sybase
Inc., as well as other relational DBMS. Such heterogeneous database
management systems may also include non-relational DBMS, flat files, etc.
The database instances 110A-110F execute on a plurality of servers
108A-108C. In the example of FIG. 1, database instances 110A and 110B
execute on server 108A, database instances 110C, 110D, and 110E execute on
server 108A, and database instance 110F executes on server 108F.
The interface module 106 provides clients 104 with a common interface to
all of the database instances 110A (i.e., all of the back-end database
systems). By operation of the interface module 106, the clients 104 are
placed under the illusion that they are interacting with a single database
system containing all of the data stored in database instances 110A-110F.
Generally, the database instances 110A-110F support different sets of
functions (more precisely, the database management systems from which the
database instances 110A-110F were instantiated support different sets of
functions). For example, database instances 110A and 110B may support
functions that are not supported by database instances 110C-110F.
According to the present invention, the interface module 106 compensates
for such functional deficiencies (i.e., the interface module 106 performs
functional compensation). Specifically, the interface module 106 simulates
the existence of certain functions in database instances 110A-110F that
are instantiated from database management systems that do not natively
support such functions. Consequently, the present invention is superior to
conventional interface modules since the present invention does not
compromise location and functional transparency.
The interface module 106 is discussed in detail below.
2. Preferred Structure of the Present Invention
Referring to FIG. 2, the client 104 is preferably implemented as a client
application program 212 operating on a computer platform comprising a
computer 202 and an operating system 210. The computer 202 includes
various hardware components, such as one or more central processing units
(CPU) 204, a random access memory (RAM) 206, and an input/output (I/O)
interface 208. The client application program 212 includes instructions
for defining and manipulating data in databases maintained by the database
instances 110A-110F. The operating system 210 may be implemented using any
well known operating system suitable for executing the client application
program 212, such as DOS, DOS/Windows, AIX, OS/2, HP-UX, and Sun OS. The
computer 202 may be implemented using any well known computers that
support these operating systems. It should be understood, however, that
other computers and operating systems could alternatively be used without
departing from the scope and spirit of the present invention.
The interface module 106 is preferably implemented as a data joiner
application program 224 (preferably written in the C computer programming
language) operating on a computer platform comprising a computer 214 and
an operating system 222. The data joiner application program 224, when
executed, enables the computer 214 to perform the features of the
interface module 106 discussed herein. Thus, the data joiner application
program 224 could be considered a controller of the computer 214.
The computer 214 includes various hardware components, such as one or more
central processing units (CPU) 216, a random access memory (RAM) 218, and
an input/output (I/O) interface 220.
Preferably, the computer 214 is the well known RISC System/6000 family of
computers produced by IBM. Alternatively, the computer 214 is any computer
that can run DB2 (produced by IBM). The operating system 222 is preferably
the well known AIX operating system produced by IBM. It should be
understood, however, that other computers and operating systems could
alternatively be used without departing from the scope and spirit of the
present invention.
The computer 214 is connected to a number of peripheral devices, such as
one or more storage devices. The storage devices may represent floppy
drive units, hard drive units, tape backup units, etc. One such storage
device, designated using reference number 226, is shown in FIG. 2.
A computer program product comprising a computer readable media having
computer program logic recorded thereon, wherein the computer program
logic when executed in the computer 214 enables the computer 214 to
perform the functions of the present invention, may be read by and/or
stored on the storage device 226. The computer program logic may then be
loaded into the RAM 218, and executed by the CPU 216 of the computer 214.
When executing in the CPU 216, the computer program logic is represented
by the data joiner application program 224 as shown in FIG. 2.
As stated above, the database instances 110A-110F execute on a plurality of
servers 108A-108C. Server 108A is shown in FIG. 2 (servers 108B and 108C
are similar to server 108A). The server 108A is preferably implemented as
a computer platform comprising a computer 228 and an operating system 236.
The computer 228 includes various hardware components, such as one or more
central processing units (CPU) 230, a random access memory (RAM) 232, and
an input/output (I/O) interface 234. The operating system 210 may be
implemented using any well known operating system suitable for executing
the database instances 110A and 110B, such as MVS, VM, VSE, OS/400, OS/2,
AIX, HP-UX, SUN OS, etc. The computer 228 may be implemented using any
well known computers that support these operating systems. It should be
understood, however, that other computers and operating systems could
alternatively be used without departing from the scope and spirit of the
present invention.
The computers 202, 214, and 228 are connected to a communication network
238, which may be implemented as a local area network (LAN) or a wide area
network (WAN), for example. It should be understood, however, that other
types of networks, such as a metropolitan area network (MAN), could
alternatively be used without departing from the scope and spirit of the
present invention. The client 104, interface module 106, and database
instances 110A-110F communicate with each other over this network 238.
3. Compensation for Different Connection and Cursor Characteristics
The interface module 106 functionally compensates for a database management
system's lack of support of multiple pending database actions (such as SQL
actions) on a single connection. Also, the interface module 106
functionally compensates for a database management system's lack of
support of declaring cursors "with hold". These capabilities of the
interface module 106 shall now be described in detail.
As is well known, a connection is a logical link between a client and a
database instance. Referring to FIG. 1, for example, before the client 104
can access data in a table maintained by database instance 110C, the
client 104 must create a connection between itself and database instance
110C. (As is well known, a database comprises one or more tables, and a
table comprises one or more rows, or touples.) The well known SQL
"connect" statement can be used to create a connection between a client
and a database instance (as is well known, the connect operation is often
performed implicitly).
As is well known, a cursor is a window into a table. A cursor can be viewed
as a pointer to rows in a table. One or more cursors may be associated
with each connection. Referring again to FIG. 1, for example, before the
client 104 can access data in a table maintained by database instance
110C, and after the client 104 has created a connection to database
instance 110C, the client 104 must declare and then open a cursor. The
well known SQL "declare cursor" statement is used to declare cursors. The
well known SQL "open" statement is used to open cursors.
Some database management systems, such as DB2, support multiple actions on
a single connection. For example, consider the following example code
segment (Code Segment 1). Code Segment 1 is part of the client application
program 212. It is written in pseudocode that is based on the SQL database
language. Line numbers are provided to the left of each statement for
reference purposes.
1 declare cursor C1 with hold
2 select name from emp.sub.-- table
3 open C1
4 while more data
5 fetch C1
6 delete where name="Smith"
7 commit
8 end while
9 close C1
Code Segment 1
The statement at line 1 declares a cursor called "C1" on an existing
connection (the statement that created this connection is not shown). The
statement at line 2 indicates that the table "emp.sub.-- table"
(containing employee data) is to be used, and the statement at line 3
opens C1 as a window into emp.sub.-- table. The statements at lines 4-8
represent a loop that iterates through the rows of emp.sub.-- table. In
particular, the statement at line 5 operates to fetch the row that is
pointed to by cursor C1. The statement at line 6 deletes all rows in
emp.sub.-- table where the name field is equal to "Smith". Line 7 is a
commit statement. The statement at line 9 closes the cursor C1.
In Code Segment 1, the database statements at lines 1, 3, 5, and 9 are
explicitly associated with a particular cursor (i.e., cursor C1).
Consequently, these database statements are explicitly associated with a
particular connection (i.e., the connection associated with cursor C1).
The database statement at line 6 is implicitly associated with a cursor
and a connection.
In some database management systems, such as DB2, multiple actions on a
single connection are supported. In such database management systems, the
database statement at line 6 would be associated with the same connection
as the database statements at lines 1, 3, 5, and 9. Consequently, two
actions would be pending on this connection, the "open/fetch/close" action
(lines 3, 5, and 9), and the delete action (line 6).
In other database management systems, however, multiple actions on a single
connection are not supported. In such database management systems, any
attempts to execute Code Segment 1 would result in an error. This is the
case, since the database statement at line 6 could not be associated with
the same connection as the database statements at lines 1, 3, 5, and 9. It
order to execute without any errors, it would be necessary to modify Code
Segment 1 to create an additional connection before the delete statement
at line 6.
As noted above, the interface module 106 functionally compensates for a
database management system's lack of support of multiple pending database
actions (such as SQL actions) on a single connection. In particular, the
data joiner 224 simulates the existence of this function (i.e., support of
multiple pending database actions on a single connection) in those
database instances 110A-110F that were instantiated from database
management systems that do not natively support this function.
Some database management systems, such as DB2, support the declaration of
cursors "with hold". If a cursor is declared "with hold" then it is not
closed as a consequence of a commit operation (commit operations are well
known to persons skilled in the relevant art). Conversely, cursors that
are not declared "with hold" are closed as a consequence of a commit
operation. The closing of a cursor operates to close the window into the
table. The position of the cursor (i.e., the pointer to the table) is lost
when the cursor is closed. Thus, by declaring a cursor "with hold", the
position of the cursor is not lost when a commit operation is performed,
since the cursor is not closed.
As noted above, the interface module 106 functionally compensates for a
database management system's lack of support of declaring cursors "with
hold". In particular, the data joiner 224 simulates the existence of this
function (i.e., allowing cursors to be declared "with hold") in those
database instances 110A-110F that were instantiated from database
management systems that do not natively support this function.
Flowchart 302 in FIG. 3 depicts the manner in which the interface module
106 simulates the existence of these functions (i.e., allowing multiple
database actions to be pending on a single connection, and allowing
cursors to be declared "with hold") in database instances instantiated
from database management systems that do not natively support these
functions. Flowchart 302 (and all flowcharts contained herein) is of
sufficient detail to enable one skilled in the relevant art to generate a
computer program or a computer program product in accordance with the
present invention. The interface module 106 performs the steps of
flowchart 302 for each database statement (i.e., each SQL data
manipulation statement) for which a connection has not already been
defined. The interface module 106 receives this database statement from
the client 104. Flowchart 302 begins with step 304, where control
immediately passes to step 306.
In step 306, after receiving a database statement from the client 104 (this
is called the "current database statement" for reference purposes), the
interface module 106 determines whether any connections exist. A
distinction must be made between the connections which are made via
"connect" (or equivalent) statements contained in the client 104, and the
connections which the interface module 106 in accordance with the present
invention. For reference purposes, the former connections are called
"user-created connections" and the latter connections are called "data
joiner-created connections" or "DJ-created connections". Multiple
DJ-created connections may be created by the interface module 106 to
support each user-created connection. In step 306, the interface module
106 determines whether any DJ-created connections exist.
If no more DJ-created connections exist (or all existing DJ-created
connections have already been processed), then step 318 is performed
(described below). Otherwise, step 308 is performed.
In step 308, the interface module 106 selects one of the existing
DJ-created connections for processing.
In step 310, the interface module 106 determines in a well known manner
whether the selected connection is free (i.e., no pending actions on the
selected connection), or whether the selected connection is associated
with a database management system that supports multiple pending actions
on each connection. If either of these conditions is true, then it might
be possible to used this selected connection to support (i.e., to execute)
the current database statement. Thus, control passes to step 312 to
further process the selected connection (discussed below). However, if
neither of these conditions is true, then it is not possible to used this
selected connection to support the curren | | |