Claims
What is claimed is:
1. A method of providing access to object data stored in an object server
in response to a database query, comprising the steps of:
receiving the database query comprising a relational operation from a
client, said relational operation comprising at least one object surrogate
identifying object data stored in an object server;
transforming the database query into relational database commands;
transmitting the relational database commands to a relational database
management system;
receiving a response table from the relational database management system;
compiling an answer set from the response table, the answer set comprising
an object locator identifying the object data stored in the object server
and responsive to the database query;
transmitting the answer set to the client;
wherein the transforming step comprises the steps of
parsing the database command into a parse tree of object structures
comprising relational database commands and object server commands, said
parse tree comprising parse tree nodes;
obtaining cost parameters for the parse tree object structures from a
global data dictionary;
binding the cost parameters to the parse tree nodes;
iterating over the parse tree object structures to evaluate parse tree
options according to the cost parameters and a selection criteria; and
selecting the parse tree option based upon the cost parameters and selected
criteria.
2. The method of claim 1, wherein the global data dictionary (GDD)
comprises a GDD cache which is updated when changes are made to the GDD.
3. The method of claim 2, wherein the GDD cache is user configurable and
definable.
4. The method of claim 2, wherein the GDD cache is subdivided into boundary
areas which are allocated to a GDD table whose values are replaced using a
least recently used (LRU) algorithm.
5. The method of claim 1, wherein the cost parameters are selected from the
group comprising statistical, static cost, and historical usage
information of the parse tree object structures.
6. An apparatus for providing access to object data stored in an object
server in response to a database query, comprising:
a computer having a processor, the computer coupled to a data storage
device;
means, performed by the computer, for receiving the database query from a
client, the database query comprising a relational operation having at
least one object surrogate identifying object data stored in an object
server;
means, performed by the computer, for transforming the database query into
relational database commands;
means, performed by the computer, for transmitting the relational database
commands to a relational database management system;
means, performed by the computer, for receiving a response table from the
relational database management system;
means, performed by the computer, for compiling an answer set from the
response table, the answer set comprising an object locator identifying
the object data stored in the object server and responsive to the database
query;
means, performed by the computer, for transmitting the answer set to the
client;
wherein the means for transforming the database query into relational
database commands comprises
means, performed by the computer, for parsing the database command into a
parse tree of object structures comprising relational database commands
and object server commands, said parse tree comprising parse tree nodes;
means, performed by the computer, for obtaining cost parameters for the
parse object structures from a global data dictionary;
means, performed by the computer, for binding the cost parameters to the
parse tree nodes;
means, performed by the computer, for iterating over the parse tree object
structures to evaluate parse tree options according to the cost parameters
and a selection criteria; and
means, performed by the computer, for selecting the parse tree option based
upon the cost parameters and selected criteria.
7. The apparatus of claim 6, wherein the global data dictionary (GDD)
comprises a GDD cache which is updated when changes are made to the GDD.
8. The apparatus of claim 7, wherein the GDD cache is user configurable and
definable.
9. The apparatus of claim 7, wherein the GDD cache is subdivided into
boundary areas which are allocated to a GDD table whose values are
replaced using a least recently used (LRU) algorithm.
10. The apparatus of claim 6, wherein the cost parameters are selected from
the group comprising statistical, static cost, and historical usage
information of the parse tree object structures.
11. A program storage device, readable by a computer having a processor,
the computer coupled to a data storage device, tangibly embodying one or
more programs of instructions executable by the computer to perform method
steps of providing access to object data stored in an object server in
response to a database query, the method comprising the steps of:
receiving the database query comprising a relational operation from a
client, said relational operation comprising at least one object surrogate
identifying object data stored in an object server;
transforming the database query into relational database commands;
transmitting the relational database commands to a relational database
management system;
receiving a response table from the relational database management system;
compiling an answer set from the response table, the answer set comprising
an object locator identifying the object data stored in the object server
and responsive to the database query;
transmitting the answer set to the client;
wherein the transforming step comprises the method steps of
parsing the database command into a parse tree of object structures
comprising relational database commands and object server commands, said
parse tree comprising parse tree nodes;
obtaining cost parameters for the parse object structures from a global
data dictionary;
binding the cost parameters to the parse tree nodes;
iterating over the parse tree object structures to evaluate parse tree
options according to the cost parameters and a selection criteria; and
selecting the parse tree option based upon the cost parameters and selected
criteria.
12. The program storage device of claim 11, wherein the global data
dictionary (GDD) comprises a GDD cache which is updated when changes are
made to the GDD.
13. The program storage device of claim 12, wherein the GDD cache is user
configurable and definable.
14. The program storage device of claim 12, wherein the GDD cache is
subdivided into boundary areas which are allocated to a GDD table whose
values are replaced using a least recently used (LRU) algorithm.
15. The program storage device of claim 11, wherein the cost parameters are
selected from the group comprising statistical, static cost, and
historical usage information of the parse tree object structures.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
The foregoing application is related to the commonly assigned applications
now pending before the United States Patent and Trademark Office, all of
which are incorporated by reference herein:
Method and Apparatus for Extending Existing Database Management System for
New Data Types, Ser. No. 08/546,101, by Felipe Carino Jr. et al., filed on
same date herewith;
Method and Apparatus for Providing Shared Data to a Requesting Client, Ser.
No. 08/546,466, by Felipe Carino Jr. et al., filed on same date herewith;
Method and Apparatus for Parallel Execution of User-Defined Functions in an
Object-Relational Database Management System, Ser. No. 08/546,465, by
Felipe Carino Jr., filed on same date herewith;
Method and Apparatus for Providing Access to Shared Data to Non-Requesting
Clients, Ser. No. 08/546,070, by Felipe Carino Jr. et al., filed on same
date herewith; and
Method and Apparatus for Extending a Database Management System to Operate
With Diverse Object Servers, Ser. No. 08/546,059, by Felipe Carino Jr. et
al., filed on same date herewith.
BACKGROUND OF THE INVENTION
1. Field of Invention
The present invention relates generally to database management systems, and
in particular to a federated database management system that provides
users and application developers with large object processing and
retrieval capabilities within an SQL-based operating environment.
2. Description of Related Art
Large-scale integrated database management systems provide an efficient,
consistent, and secure means for storing and retrieving vast amounts of
data. This ability to manage massive amounts of information has become a
virtual necessity in business today.
At the same time, wider varieties of data are available for storage and
retrieval. In particular, multimedia applications are being introduced and
deployed for a wide range of business and entertainment purposes,
including multimedia storage, retrieval, and content analysis. Properly
managed, multimedia information technology can be used to solve a wide
variety of business problems.
For example, multimedia storage and retrieval capability could be used to
store check signature images in a banking system. These images may then be
retrieved to verify signatures. In addition, the authenticity of the
signatures could be confirmed using content-based analysis of the data to
confirm that the customer's signature is genuine. However, practical
limitations have stymied development of large multimedia database
management systems.
Multimedia database information can be managed by ordinary relational
database management systems (RDBMS), or by object-oriented database
management systems (OODBMS). Each of these options presents problems that
have thus far stymied development.
Object-oriented database management systems are unpopular because they
require a large initial capital investment and are incompatible with
existing RDBMSs. Further, maintaining two separate data repositories in an
RDBMS and an OODBMS is inconsistent with the database management philosophy
of maintaining a secure consistent central repository for all data.
RDBMSs such as the TERADATA® system are vastly more popular than
OODBMSs. However, existing RDBMSs cannot effectively handle large
multimedia objects. Also, although RDBMS database features and functions
apply equally well to alphanumeric or multimedia data types, multimedia
objects introduce new semantics problems, and require new strategies for
manipulating and moving extremely large objects, which would otherwise
overwhelm RDBMS computational capacity and the I/O capability of the
computer implementing the RDBMS.
Accordingly, there is a need to extend existing RDBMSs to efficiently
manipulate and move extremely large objects, especially multimedia
objects. The present invention satisfies this need with a method and
apparatus for extending an object relational database management system
and its supporting functions to store and retrieve very large multimedia
objects by augmenting RDBMS answer sets generated in response to RDBMS
queries with data surrogates associated with multimedia objects stored in
a separate object server.
SUMMARY OF THE INVENTION
To overcome the limitations in the prior art described above, and to
overcome other limitations that will become apparent upon reading and
understanding the present specification, the present invention discloses a
method of providing access to object data stored in an object server in
response to a database query.
The method comprises the steps of receiving a database query comprising a
relational operation from a client, the relational operation comprising at
least one object surrogate identifying object data stored in an object
server, transforming the database query into relational database commands,
transmitting the relational database commands to a relational database
management system, receiving a response table from the relational database
management system, compiling an answer set from the response table, the
answer set comprising an object locator identifying object data stored in
the object server and responsive to the database query, and transmitting
the answer set to the client.
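The sequence of steps summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the disclosed implementation: the standard-library `sqlite3` module stands in for the relational database management system, the transforming step is reduced to a pass-through, and the names `make_rdbms`, `transform`, `handle_query`, and `object_server_212`, as well as the shape of the locator, are hypothetical.

```python
import sqlite3


def make_rdbms():
    """Build a toy relational table whose mri column holds surrogates only."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE patient (patient_name TEXT, age INTEGER, mri TEXT)")
    db.executemany(
        "INSERT INTO patient VALUES (?, ?, ?)",
        [("Ames", 52, "mri-001"), ("Baker", 38, "mri-002")],
    )
    return db


def transform(query):
    """Transforming step, reduced here to a pass-through (assumption)."""
    return query["sql"], query["params"]


def handle_query(db, query):
    """Receive a query, run it on the RDBMS, and compile the answer set."""
    sql, params = transform(query)
    # Transmit the relational commands and receive the response table.
    response_table = db.execute(sql, params).fetchall()
    # Compile the answer set: each surrogate becomes an object locator that
    # tells the client where the object data can be fetched from.
    answer_set = []
    for patient_name, surrogate in response_table:
        locator = {"server": "object_server_212", "object_id": surrogate}
        answer_set.append({"patient_name": patient_name, "mri": locator})
    return answer_set
```

In this sketch the large object never passes through the relational path; the client receives only a small locator and retrieves the object data from the object server separately.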
BRIEF DESCRIPTION OF THE DRAWINGS
Referring now to the drawings in which like reference numbers represent
corresponding parts throughout:
FIG. 1 is a conceptual illustration of the object-relational database
structure of the present invention;
FIG. 2 is a block diagram showing the architectural elements of one
embodiment of the present invention;
FIG. 3a is a block diagram showing the component modules of the federated
coordinator of the present invention;
FIG. 3b is a block diagram of one embodiment of an object server; and
FIGS. 4a-4p are flow charts describing the operations and transaction flow
for the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
In the following description of the preferred embodiment, reference is made
to the accompanying drawings which form a part hereof, and in which is
shown by way of illustration a specific embodiment in which the invention
may be practiced. It is understood that other embodiments may be utilized
and structural changes may be made without departing from the scope of the
present invention.
1. System Concept/Components
FIG. 1 presents a conceptual illustration of the object-relational database
structure of the present invention, showing a representation of a generic
multimedia table instantiation 101, and an object-relational database
table instantiation 102. Either database structure may be stored in the
memory of one or more computers or processors, or in a related data
storage device. Conceptually, database instances may include both
alphanumeric data and large object instances. In the generic multimedia
table instantiation 101, data instances may include a variety of data
types, including a character string 106, a byte string 108, video data
110, integer data 112, a large bit string 114, a document 116, audio data
118, and another character string 120. These items could all be
stored in the database directly as shown. However, this makes database
processing difficult, because the large data objects such as the video
110, the large bit string 114, the document 116, and the audio 118 must be
processed along with other table elements, overwhelming the
database processing capability.
These problems are avoided by using an object-relational database table
instantiation 102. In the object-relational structure, each object in the
row is instantiated with a portion in an object-relational table
instantiation 102, and a portion in an object storage device 104. Two data
types are therefore associated with large object instances: an object
surrogate or identifier type (such as the video object identifier 126, the
large bit string object identifier 130, the document object identifier
132, and the audio object identifier 134) and an object value type
(the video object 138, large bit string object 140, document object 142,
and audio object 144, respectively). This allows a "lightweight" surrogate
to "represent" data where the objects are too large to be casually copied
or moved from one place to another.
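The split between an object surrogate and an object value can be sketched as follows. This is an illustrative Python sketch only; the class names `ObjectSurrogate` and `ObjectStorage` and their methods are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectSurrogate:
    """Lightweight identifier kept in the table row in place of the object."""
    object_type: str
    object_id: int


class ObjectStorage:
    """Toy stand-in for an object storage device holding the object values."""

    def __init__(self):
        self._values = {}
        self._next_id = 1

    def put(self, object_type, value):
        """Store an object value and return the surrogate that identifies it."""
        surrogate = ObjectSurrogate(object_type, self._next_id)
        self._values[surrogate] = value
        self._next_id += 1
        return surrogate

    def get(self, surrogate):
        """Return the object value identified by a surrogate."""
        return self._values[surrogate]


storage = ObjectStorage()
video = storage.put("video", b"\x00" * 1_000_000)
# The table row carries ordinary alphanumeric data plus the small surrogate;
# the megabyte-sized value stays in object storage.
row = {"name": "clip A", "length_s": 42, "video": video}
```

Copying or moving `row` touches only a few small fields, while the large value is fetched with `storage.get(row["video"])` only when it is actually needed.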
FIG. 2 is a diagram showing one implementation of the object relational
database structure described above. The primary components of the present
invention are a client 220, a receiver client 258, a relational database
management system (RDBMS) 210, an object server 212, a primary network
204, a federated coordinator 206, and a virtual network 218.
The client 220 is where user 221 requests are submitted and where results
are normally displayed. Architecturally, the client 220 may be an
independent computer with sufficient buffering and processing to support
the presentation of results to the user 221. Functionally, the client 220
hosts end-user application programs, sends application structured query
language (SQL) and general purpose call level interface (CLI) requests to
the federated coordinator 206, participates in object transport
connections set up by object server 212, and receives result set elements
and stages them for display, playback, or further processing by client
applications.
The client interface 202 provides an interface between the client 220, the
federated coordinator 206, and the virtual network 218. The client
interface 202 may be resident in the same computer system as the federated
coordinator 206, the client 220 or a separate computer, and comprises an
open database connectivity module (ODBC) 227 and an object server
connectivity module (OSC) 229. In the preferred embodiment, the ODBC
module uses MICROSOFT® Open Database Connectivity technology, which
is well known in the art. The ODBC 227 provides an interface between the
client 220 and the federated coordinator 206. Since a command from a
client 220 could be either a direct SQL command or a command in another
language from an application resident at the client, the ODBC 227
translates object-relational database (ORDB) commands from the client 220
into a form suitable for the federated coordinator 206. In one embodiment,
these ORDB commands are translated into Multimedia-SQL (M-SQL), an object
relational database language compatible with and derived from SQL. Of
course, the actual language implementation is unimportant, and those
skilled in the art will recognize that many different languages and
protocols can be selected, so long as the ORDB commands from
potentially multiple sources are interpreted and translated into commands
that can be understood by the federated coordinator 206. As described
herein, the OSC 229 and ODBC 227 are parallel, but not independent,
because the ODBC 227 also uses the OSC 229 to redirect object instance
data streams to the ODBC 227 control interface to preserve ODBC 227
application interface semantics and to hide the fact that the object data
resides on a different data source (such as object server 212) from the
RDBMS 210.
Receiver clients 258 are client instances that receive subsets of queries
submitted by another client instance. A request from a client 220 can
result in some portion of a result set being transported to a receiving
agent other than the client. This feature allows multimedia objects
selected from the database to be down-loaded to a special playback device,
such as a video server 216. This feature also allows work-sharing between
clients, in which context one client is treated as a receiver by another.
Functionally, receiver clients 258 are capable of a subset of client 220
functions. Receiver clients 258 can participate in object transport
connections, and receive result set elements and stage them for display,
playback, or further processing by client applications.
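One way to picture the receiver client feature is as a routing step over the result set. The following Python sketch is hypothetical: the function `route_answer_set`, the per-column `receivers` mapping, and the destination names are assumptions made for illustration, not the mechanism disclosed here.

```python
def route_answer_set(answer_set, receivers, requesting_client):
    """Group result set elements by the destination that should receive them.

    receivers maps a column name to a receiver; elements of any other
    column are delivered to the requesting client.
    """
    deliveries = {}
    for row in answer_set:
        for column, element in row.items():
            destination = receivers.get(column, requesting_client)
            deliveries.setdefault(destination, []).append((column, element))
    return deliveries
```

For example, routing the `video` column of a one-row result to a playback device leaves the title with the requesting client and sends only the video element to the receiver.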
The RDBMS 210 in FIG. 2 is analogous to the object relational database
table instantiation 102 described in FIG. 1. The RDBMS 210 is used to
store, retrieve, and process alphanumeric data and the object identifiers
described above. Architecturally, the RDBMS 210 logical database component
can be any relational database system, such as the TERADATA® Database
Version 2. In one embodiment, communications between the client 220 and
the RDBMS 210 are in SQL, the American National Standards Institute
(ANSI) and International Organization for Standardization (ISO) standard database
management language. The object-relational database management system
enhances this SQL to provide access to non-traditional data types as well
as normal RDBMS data types using the object identifier paradigm to create
an object-relational database system. The present invention enhances the
RDBMS 210 by adding abstract data type functions such as those envisioned
for SQL-3 and user-defined functions for content analysis of objects.
However, the functions performed by the RDBMS 210 are operationally no
different than those which would be expected of any database product. The
extensions provided by the present invention do not change the semantics
of how the relational data in the RDBMS 210 are defined, managed, or used.
The object server 212 stores and manages objects, executes user-defined
functions on those objects, performs connection operations through the
virtual network 218 to transport selected objects, including real time
session control, and participates in distributed transactions, as directed
by the federated coordinator 206, and is analogous to the object storage
104 in FIG. 1. The extent of these object handling capabilities will
depend significantly upon the type of objects and client application
needs. In one embodiment, the object server 212 provides scalable
processing over very large collections of object values in a query
processing context. Other object servers, such as the auxiliary object
server 214 and the video server 216 are also supported, but are not
required. A wide variety of object servers can be supported by the present
invention. For example, the video server 216 may provide movie-on-demand
applications, and the auxiliary object server 214 may provide special
search engines for text-processing applications. Initially, the video
server 216 may be limited to storage and retrieval of objects, but may
later be expanded to allow content based operations on the data stored
therein. Another possible auxiliary object server would be a dedicated
text processor for independently storing a text index and performing text
searches against the index. In this configuration, the object server 212
could store the actual text documents as data objects.
The federated coordinator 206 comprises a session management/plan
generation module 236 which handles all aspects of primary sessions
between clients 220 and the host. The session management/plan generation module 236
also accepts requests from clients 220, and transforms them into execution
plans which are executed by the RDBMS 210 and the object servers 212, 214,
and 216, performs database administration, establishes sessions with the
client 220 (including maintaining accounting information and termination),
and interprets client SQL or CLI requests and transforms them into
execution plans.
The plan execution module 238 manages repositories of large object values,
including multimedia data. The plan execution module 238 is the host agent
responsible for executing plans built by the federated coordinator 206,
and interacts directly with the RDBMS 210 and the object server 212
instances. Relational data and data dictionary tables are stored by the
RDBMS 210, while object values are stored on the object servers 212, 214,
216. The plan execution module processes the execution plans generated by
the session management/plan generation module 236, participates in
distributed transactions, stores result sets and participates in sending
the result sets to clients 220, manages object servers 212,
participates in setting up transport connections, and controls execution of
user-defined functions in the object servers 212.
In the present invention, the RDBMS 210 is used as a resource that may be
shared with other applications. As such, the RDBMS 210 may contain
database and table instances that have nothing whatever to do with the
object-relational database or any of its functions or operations.
Accordingly, it is desirable for native database operations and interfaces
to be supported concurrently with the object-relational database
management system. The present invention provides this capability by
providing a separate native interface to the RDBMS 210, and by
automatically applying associated database access control mechanisms to
keep non-object-relational database applications from modifying physical
structures which contain object data values or their locators. This native
connectivity is also illustrated in FIG. 2. The RDBMS 210 can interface
directly with the client 220 via the native RDBMS interface 244 or via the
multimedia database system of the present invention. Database commands are
supplied to native RDBMS interface 224, and passed to the RDBMS 210 via
the primary network 204. Database responses from the RDBMS 210 are
supplied to the client along the same data path.
Communications between the components of the present invention include (1)
primary communications, (2) transport (or secondary) communications, and
(3) internal messaging communications.
Communications between the user 221 and the object-relational database
management system are provided via the client interface 202, and these
communications will generally be internal to a computer implementing the
client interface 202. Communications between the client interface 202 and
the federated coordinator 206 are provided via a primary network 204 by a
first or primary communication path 234, which provides an electronic or
optical pathway for communication signals. Since it is not necessary for
large object data instances to be transported via this communication path,
the primary communication path 234 need not be a high bandwidth
communications link.
Internal messaging communications are provided through internal messaging
data paths between the federated coordinator 206, the virtual network 218,
and the RDBMS 210, object server 212, and auxiliary object server 214.
Since large object data instances are not transmitted via these paths,
they also need not offer high bandwidth communications.
The client 220 also communicates with the object server 212, and
optionally, an auxiliary object server 214 and/or a video server 216 by
establishing transport connections 254, 256, 250 using the virtual network
218. Optionally, these communications are established according to
selected performance criteria indicated by a quality of service (QOS)
parameter selected by the client 220. Normally, these communications are
established only for the period of time required to transmit object data
to the selected destination. These transport communications could be
established by Asynchronous Transfer Mode (ATM), Fiber Distributed Data
Interface (FDDI), Switched Multi-megabit Data Services (SMDS), or another
high bandwidth network. ATM is a high-speed switching network technology
for local area networks (LANs) and wide area networks (WANs) that handles
data and real time voice and video. It combines the high efficiency of
packet switching used in data networks with the guaranteed bandwidth of
the circuit switching used in voice networks, and provides "bandwidth on
demand" by charging customers for the amount of data they send rather than
for fixed-cost digital lines that often go under-utilized. SMDS is a high
speed switched data communications service offered by local telephone
companies to interconnect LANs. It uses IEEE 802.6 DQDB MAN network
technology at rates up to 45 megabits per second (Mbps). Of course, the
present invention is not limited to the particular embodiments of the
virtual network described herein. Any network that provides enough
bandwidth and nominal latency can be used to practice the present
invention.
2. Federated Coordinator
A diagram showing the component modules of the federated coordinator 206
and their interrelationships is presented in FIG. 3a. The federated
coordinator 206 comprises an ODBC Application Program Interface (API)
manager 302 such as the MICROSOFT® ODBC API coupled to a session
manager 304, a parser 306, a resolver 308, and an answer set manager 350.
The ODBC API manager 302 coordinates activities between the session
manager 304, the parser 306, the resolver 308, and the answer set manager
350.
The ODBC API manager 302 handles ODBC requests to establish a session,
parse a query, and resolve a query plan. The ODBC API manager 302 also
handles first-pass answer set data that is obtained from the answer set
manager 350 and sent to the requesting client 220.
The session manager 304 creates a session that is used to communicate with
the client 220, and assigns a session identifier. This session handles
incoming requests and sends back responses to the client 220. However, the
session manager does not participate in object data transport activities.
The parser 306 checks the syntax of the commands from the ODBC 227 and uses
a grammar definition 307 (M-SQL, for example) to generate a high-level
collection of object structures that will be later optimized and converted
into a query execution plan. This is accomplished by defining language
protocol classes (objects) that represent the parse tree. In one
embodiment, these objects are defined according to the C++ protocol. For
example, suppose the client 220 wanted to retrieve data comprising a
magnetic resonance image (MRI) for patients who are older than 45 years of
age and who have a tumor greater than 0.13 centimeters in diameter.
Further suppose that the information is stored in a "patient" DBMS table
such as the object relational database table instantiation 102 shown in
FIG. 1, which includes object identifiers to MRI data in object storage.
An SQL command responsive to this client request is as follows:
SELECT patient_name, MRI FROM patient WHERE age > 45 AND TumorSize(MRI) > 0.13
The parser 30