|
|
|
| United States Patent | 5504900 |
| Link to this page | http://www.wikipatents.com/5504900.html |
| Inventor(s) | Raz; Yoav (Newton, MA) |
| Abstract | Serializability across distributed transactions is guaranteed by
selectively committing and aborting or delaying transactions to enforce an
order of commitment that is the same as an order of performance of
conflicting component operations of the transactions. A first memory
access operation in a first transaction, for example, conflicts with a
second memory access operation in a second transaction when the two memory
access operations reference the same memory location and at least one of
the operations is a write operation. The transaction processing system may
permit a second transaction to read data written by a write operation of a
first transaction before the first transaction is committed. In this case,
depending on the respective order in which the two conflicting operations
occur, the order of commitment is enforced, possibly by aborting either of
the two transactions, to ensure that the order of commitment is the same
as the order of the operations. The conflicts, for example, are detected
when the addresses are determined during preparation of the transactions.
The component operations may be scheduled for most efficient use of the
computer system capabilities. In a multiprocessor system in which a global
coordinator communicates with a plurality of transaction processors by way
of "prepare" and "commit" commands, the commitment order is referenced to
delay acknowledging that a transaction has been "prepared" until the
transaction's "abort set" has been minimized. |
|
|
|
Title Information  |
|
|
|
|
|
Drawing from US Patent 5504900 |
|
|
Commitment ordering for guaranteeing serializability across distributed
transactions |
|
|
|
|
|
| Publication Date |
April 2, 1996 |
|
|
|
|
|
| Filing Date |
December 5, 1994 |
|
|
|
|
|
|
|
|
|
|
|
| Parent Case |
This application is a continuation of application Ser. No. 07/703,394 filed
May 21, 1991, now abandoned. |
|
|
|
|
|
|
|
|
|
|
|
|
|
Title Information  |
|
|
References  |
|
|
| *references marked with an asterisk below are user-added references |
|
U.S. References |
|
|
|
|
|
|
U.S. References |
|
|
Foreign References |
|
|
|
|
|
|
Foreign References |
|
|
Other References |
|
|
|
|
|
|
Other References |
|
|
|
|
|
References  |
|
|
|
|
|
| Market Size |
|
Estimate the gross annual revenues of the relevant market
sector:
|
| | |
| |
|
|
| Market Share |
|
Estimate the percentage of the relevant market sector this invention will capture:
|
| | |
| |
|
|
| Reasonable Royalty |
|
What percentage of gross sales should the inventor or assignee be paid?
|
| | |
| |
|
|
|
Public's "Guesstimation" of Royalty Value
|
| Market Size | N/A | [No votes] | | x | Market Share | N/A | [No votes] | | x | Reasonable Royalty | N/A | [No votes] |
| | N/A | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
Market Review  |
|
|
Technical Review  |
|
|
Claims  |
|
|
What is claimed is:
1. A computer-implemented method of processing transactions in a computing
system, said method comprising the steps of:
a) beginning preparation of results of said transactions;
b) checking for memory access conflicts between said transactions, not all
of said transactions having memory access conflicts, and when said
checking for memory access conflicts finds that one of said transactions
has a first operation that conflicts with a second operation in another
one of said transactions, recording in memory of said computing system an
order of performance for the transactions having the first conflicting
operation and the second conflicting operation;
c) after results of said transactions having the first conflicting
operation and the second conflicting operation have been prepared so that
each one of said transactions having the first conflicting operation and
the second conflicting operation is ready to be committed, selecting one
of said transactions having the first conflicting operation and the second
conflicting operation to be committed and thereby identifying a selected
transaction, and determining an abort set of transactions for which
commitment after commitment of said selected transaction would be contrary
to said order of performance having been recorded in said memory of said
computing system such that said abort set includes all transactions not
yet committed nor aborted that have a performed operation conflicting with
a later performed operation in said selected transaction, and in response
to the selecting of said selected transaction:
(i) committing to memory state of said computing system the prepared
results of said selected transaction; and
(ii) aborting all transactions in said abort set, wherein said abort set
includes at least one transaction having results prepared and being ready
to be committed and having a performed operation conflicting with a later
performed operation in said selected transaction.
2. The method as claimed in claim 1, wherein said step of committing to
memory state includes updating a data base file.
3. The method as claimed in claim 1, wherein said step of committing to
memory state includes changing memory state of a data object.
4. The method as claimed in claim 1, wherein the transactions that are
ready to be committed have predetermined priorities, one of the
transactions that are ready to be committed has a highest one of the
predetermined priorities, and the step of selecting said selected
transaction includes comparing the predetermined priorities of the
transactions that are ready to be committed and selecting the transaction
having the highest one of the predetermined priorities among the
transactions that are ready to be committed so that said selected
transaction has the highest one of said predetermined priorities among the
transactions that are ready to be committed.
5. The method as claimed in claim 1, wherein the step of selecting said
selected transaction is based on ordering of said transactions in a list
so that said selected transaction is a first transaction in said list
among transactions in said list that are ready to be committed.
6. The method as claimed in claim 1, wherein the step of selecting said
selected transaction is performed in response to a commit command from a
coordinator.
7. The method as claimed in claim 1, wherein each of said transactions
having said first conflicting operation and said second conflicting
operation has an abort set of transactions for which commitment after
commitment of said each of said transactions having said first conflicting
operation and said second conflicting operation would be contrary to said
order of performance, the abort set for one of said transactions having
said first conflicting operation and said second conflicting operation
having fewer members than the abort set for the other of said transactions
having said first conflicting operation and said second conflicting
operation, and wherein the step of selecting said selected transaction
includes accessing said order of performance in said memory, and from said
order of performance in said memory, determining which abort set of said
transactions having said first conflicting operation and said second
conflicting operation has fewer members, and selecting to be committed the
one of said transactions having the abort set having fewer members, in
order to minimize the number of transactions that are aborted in said step
(ii).
8. The method as claimed in claim 1, which includes receiving from a
coordinator a request specifying a transaction to be prepared and thereby
identifying a specified transaction, and wherein preparation of results
for said specified transaction is begun in said step (a) in response to
receiving from said coordinator the request to prepare said specified
transaction, and wherein the method further includes the step of delaying
acknowledgement to said coordinator of completion of preparation of said
specified transaction until none of the transactions not yet committed nor
aborted have a performed operation conflicting with a later performed
operation in said specified transaction.
9. The method as claimed in claim 8, further comprising the step of
terminating said delaying when said delaying persists for a predetermined
duration of time.
10. The method as claimed in claim 8, further comprising the step of
terminating said delaying upon receipt of a termination signal from a
coordinator.
11. The method as claimed in claim 1, wherein said checking for memory
access conflicts between transactions is performed during the preparation
of results for transactions having conflicting operations.
12. The method as claimed in claim 1, wherein said checking for memory
access conflicts between transactions is performed after computing an
address of a memory access operation.
13. The method as claimed in claim 1, wherein said checking for memory
access conflicts between transactions includes comparing an address of a
memory access operation for one transaction to addresses of memory access
operations for other transactions.
14. The method as claimed in claim 1, wherein a read operation of a second
one of said transactions reads write data written by a write operation of
a first one of said transactions before said first one of said
transactions is committed, and wherein said method further comprises the
step of aborting all of said transactions that have read data written by
aborted transactions.
15. A computer-implemented method of processing transactions in a computing
system, said method comprising the steps of:
a) receiving requests for processing said transactions;
b) beginning preparation of results of said transactions;
c) checking for memory access conflicts between said transactions, not all
of said transactions having memory access conflicts, and when said
checking for memory access conflicts finds that one of said transactions
has a first operation that conflicts with a second operation in another
one of said transactions, storing in memory of said computing system data
indicating an order of performance for the transactions having the first
conflicting operation and the second conflicting operation, said checking
for memory access conflicts between transactions being performed during
the preparation of results for transactions having conflicting operations;
d) after results of said transactions having the first conflicting
operation and the second conflicting operation have been prepared so that
each one of said transactions having the first conflicting operation and
the second conflicting operation is ready to be committed, selecting one
of said transactions having the first conflicting operation and the second
conflicting operation to be committed and thereby identifying a selected
transaction, and in response to the selecting of said selected
transaction:
(i) committing to memory state of said computing system the prepared
results of said selected transaction; and
(ii) for each transaction, other than said selected transaction, for which
preparation of results has begun and which has not yet been committed nor
aborted,
inspecting said data stored in memory to determine whether commitment of
said each transaction after commitment of said selected transaction would
be contrary to said order of performance such that said each transaction
has a performed operation conflicting with a later performed operation in
said selected transaction, and
when commitment of said each transaction after commitment of said selected
transaction would be contrary to said order of performance such that said
each transaction has a performed operation conflicting with a later
performed operation in said selected transaction, aborting said each
transaction;
wherein at least one transaction, which has results prepared and is ready
to be committed, is aborted in the step (ii) when commitment of said at
least one transaction after commitment of said selected transaction would
be contrary to said order of performance such that said at least one
transaction has a performed operation conflicting with a later performed
operation in said selected transaction.
16. The method as claimed in claim 15, wherein said data is stored in said
memory in the form of a directed graph, and wherein said method further
comprises the step of removing from said graph data with respect to
committed and aborted transactions.
17. The method as claimed in claim 16, further comprising the step of
adding to said graph data indicating memory access conflict with respect
to at least one of said transactions arising after beginning preparation
of results of said at least one of said transactions.
18. The method as claimed in claim 15, wherein the transactions that are
ready to be committed have predetermined priorities, one of the
transactions that are ready to be committed has a highest one of the
predetermined priorities, and the step of selecting said selected
transaction includes comparing the predetermined priorities of the
transactions that are ready to be committed and selecting the transaction
having the highest one of the predetermined priorities among the
transactions that are ready to be committed so that said selected
transaction has the highest one of said predetermined priorities among the
transactions that are ready to be committed.
19. The method as claimed in claim 15, wherein the step of selecting said
selected transaction is based upon an order of said transactions in a list
so that said selected transaction is a first transaction in said list
among transactions in said list that are ready to be committed.
20. The method as claimed in claim 15, wherein the step of selecting said
selected transaction is performed in response to a commit command from a
coordinator.
21. The method as claimed in claim 15, wherein each of said transactions
having said first conflicting operation and said second conflicting
operation has an abort set of transactions for which commitment after
commitment of said each of said transactions having said first conflicting
operation and said second conflicting operation would be contrary to said
order of performance, the abort set for one of said transactions having
said first conflicting operation and said second conflicting operation
having fewer members than the abort set for the other of said transactions
having said first conflicting operation and said second conflicting
operation, and wherein the step of selecting said selected transaction
includes accessing said data in said memory, and from said data in said
memory, determining which abort set of said transactions having said first
conflicting operation and said second conflicting operation has fewer
members, and selecting to be committed the one of said transactions having
the abort set having fewer members, in order to minimize the number of
transactions that are aborted in said step (ii).
22. The method as claimed in claim 15, which includes receiving from a
coordinator a request specifying a transaction to be prepared and thereby
identifying a specified transaction, and wherein preparation of results
for said specified transaction is begun in said step (a) in response to
receiving from said coordinator the request to prepare said specified
transaction, and wherein the method further comprises the step of delaying
acknowledgement to said coordinator of completion of preparation of said
specified transaction until none of the transactions not yet committed nor
aborted have a performed operation conflicting with a later performed
operation in said specified transaction.
23. The method as claimed in claim 22, further comprising the step of
terminating said delaying when said delaying persists for a predetermined
duration of time.
24. The method as claimed in claim 23, further comprising the step of
terminating said delaying upon receipt of a termination signal from a
coordinator.
25. The method as claimed in claim 15, wherein said checking for memory
access conflicts between transactions is performed after computing an
address of a memory access operation.
26. The method as claimed in claim 15, wherein said checking for memory
access conflicts between transactions includes comparing an address of a
memory access operation for one transaction to addresses of memory access
operations for other transactions,
27. The method as claimed in claim 15, wherein a read operation of a second
one of said transactions reads write data written by a write operation of
a first one of said transactions before said first one of said
transactions is committed, and wherein said method further comprises the
step of aborting all of said transactions that have read data written by
aborted transactions.
28. A computer-implemented method of processing transactions in a computing
system, said method comprising the steps of:
a) receiving requests to perform transactions;
b) preparing results of said transactions by scheduling and performing
operations of said transactions on a real-time basis such that operations
of some transactions are performed in accordance with availability of
resources of said computing system before commitment of other
transactions;
c) checking for memory access conflicts between said transactions, not all
of said transactions having memory access conflicts, and when said
checking for memory access conflicts finds that one of said transactions
has a first operation that conflicts with a second operation in another
one of said transactions, recording in memory of said computing system an
order of performance for the transactions having the first conflicting
operation and the second conflicting operation, said checking for memory
access conflicts between transactions being performed during the
preparation of results for transactions having conflicting operations; and
d) after results of said transactions having the first conflicting
operation and the second conflicting operation have been prepared so that
each one of said transactions having the first conflicting operation and
the second conflicting operation is ready to be committed, selecting one
of said transactions having the first conflicting operation and the second
conflicting operation to be committed and thereby identifying a selected
transaction, and determining an abort set of transactions for which
commitment after commitment of said selected transaction would be contrary
to said order of performance having been recorded in said memory of said
computing system such that said abort set includes all transactions not
yet committed nor aborted that have a performed operation conflicting with
a later performed operation in said selected transaction, and in response
to the selecting of said selected transaction:
(i) committing to memory state of said computing system the prepared
results of said selected transaction; and
(ii) aborting all transactions in said abort set, wherein said abort set
includes at least one transaction having results prepared and being ready
to be committed and having a performed operation conflicting with a later
performed operation in said selected transaction.
29. The method as claimed in claim 28, further including the step of
delaying commitment of a transaction having prepared results so long as
there are any other transactions not yet committed nor aborted having a
performed operation conflicting with a later performed operation of said
transaction having prepared results.
30. The method as claimed in claim 29, wherein said delaying of commitment
is terminated after a predetermined duration of time by aborting every one
of said transactions not yet committed nor aborted having a performed
operation conflicting with a later performed operation of said transaction
having prepared results.
31. The method as claimed in claim 28, wherein said requests are received
from a coordinator, and said method further includes delaying
acknowledgement to said coordinator of completion of preparation of a
requested one of said transactions until none of said transactions that
are not yet committed nor aborted have any performed operations that
conflict with any later performed operations of said requested one of said
transactions.
32. The method as claimed in claim 31, further comprising the step of
terminating said delaying acknowledgement upon receipt of a termination
signal from a coordinator.
33. The method as claimed in claim 28, wherein said checking for memory
access conflicts between transactions is performed after computing an
address of a memory access operation.
34. The method as claimed in claim 28, wherein said checking for memory
access conflicts between transactions includes comparing an address of a
memory access operation for one transaction to addresses of memory access
operations for other transactions.
35. The method as claimed in claim 28, wherein a read operation of a second
one of said transactions reads write data written by a write operation of
a first one of said transactions before said first one of said
transactions is committed, and wherein said method further comprises the
step of aborting all of said transactions that have read data written by
aborted transactions.
36. A digital computer system for processing transactions, said digital
computer system comprising, in combination:
a) means for preparing results of said transactions by scheduling and
performing operations of said transactions on a real-time basis such that
operations of some transactions are performed in accordance with
availability of resources of said digital computer system before
commitment of other transactions;
b) means for checking for memory access conflicts between said
transactions, not all of said transactions having memory access conflicts,
and means responsive to said means for checking and operative when one of
said transactions has a first operation that conflicts with a second
operation in another one of said transactions for recording in memory of
said computing system an order of performance for the transactions having
the first conflicting operation and the second conflicting operation; and
c) means operative after results of said transactions having the first
conflicting operation and the second conflicting operation have been
prepared so that each one of said transactions having the first
conflicting operation and the second conflicting operation is ready to be
committed for:
(i) selecting one of said transactions having the first conflicting
operation and the second conflicting operation to be committed and thereby
identifying a selected transaction, and committing to memory state of said
computing system the prepared results of said selected transaction; and
(ii) determining an abort set of transactions for which commitment after
commitment of said selected transaction would be contrary to said order of
performance having been recorded in said memory of said digital computer
system such that said abort set includes all transactions not yet
committed nor aborted that have a performed operation conflicting with a
later performed operation of said selected transaction, and aborting all
transactions in said abort set, wherein said abort set includes at least
one transaction having results prepared and being ready to be committed
and having a performed operation conflicting with a later performed
operation of said selected transaction.
37. The digital computer system as claimed in claim 36, further including
means for delaying commitment of said selected transaction when any
transactions not yet committed nor aborted have a performed operation
conflicting with a later performed operation of said selected transaction.
38. The digital computer system as claimed in claim 36, further including
means for receiving from a coordinator a request specifying a transaction
to be prepared and thereby identifying a specified transaction, and means
for delaying acknowledgement to said coordinator of completion of
preparation of said specified transaction until none of said transactions
that are not yet committed nor aborted have any performed operations that
conflict with any later performed operations of said specified
transaction.
39. The digital computer system as claimed in claim 36, wherein said means
for checking for memory access conflicts between said transactions
includes means for detecting the performance of a memory access operation
that conflicts with a memory access operation previously performed.
40. The digital computer system as claimed in claim 39, wherein said means
for checking for memory access conflicts between said transactions
includes means for comparing an address of a memory access operation for
one transaction to addresses of memory access operations previously
performed for other transactions.
41. The digital computer system as claimed in claim 36, wherein said means
operative after results of said transactions having the first conflicting
operation and the second conflicting operation have been prepared includes
means for aborting all of said transactions that have read data written by
aborted transactions.
42. A computer-implemented method of processing transactions in a computing
system, said method comprising the steps of:
a) beginning preparation of results of said transactions;
b) when one of said transactions has a first operation that conflicts with
a second operation in another one of said transactions, recording in
memory of said computing system an order of performance for the
transactions having the first conflicting operation and the second
conflicting operation;
c) after results of said transactions having the first conflicting
operation and the second conflicting operation have been prepared so that
each one of said transactions having the first conflicting operation and
the second conflicting operation is ready to be committed, selecting one
of said transactions having the first conflicting operation and the second
conflicting operation to be committed and thereby identifying a selected
transaction, and determining an abort set of transactions for which
commitment after commitment of said selected transaction would be contrary
to said order of performance having been recorded in said memory of said
computing system such that said abort set includes all transactions not
yet committed nor aborted that have a performed operation conflicting with
a later performed operation in said selected transaction, and in response
to the selecting of said selected transaction:
(i) committing to memory state of said computing system the prepared
results of said selected transaction; and
(ii) aborting all transactions in said abort set,
wherein each of said transactions having said first conflicting operation
and said second conflicting operation has an abort set of transactions for
which commitment after commitment of said each of said transactions having
said first conflicting operation and said second conflicting operation
would be contrary to said order of performance, the abort set for one of
said transactions having said first conflicting operation and said second
conflicting operation having fewer members than the abort set for the
other of said transactions having said first conflicting operation and
said second conflicting operation, and wherein the step of selecting said
selected transaction includes accessing said order of performance in said
memory, and from said order of performance in said memory, determining
which abort set of said transactions having said first conflicting
operation and said second conflicting operation has fewer members, and
selecting to be committed the one of said transactions having the abort
set having fewer members, in order to minimize the number of transactions
that are aborted in said step (ii). |
|
|
|
|
Claims  |
|
|
Description  |
|
|
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to distributed computing, and more
particularly to a transaction processing system in which component
operations in related transactions are distributed so that at least one
operation in a second transaction is performed before a first transaction
having a conflicting operation is committed. The present invention
specifically concerns a method and apparatus for scheduling the
performance of the conflicting operations according to available resources
and ensuring that the results of the conflicting operations are committed
in the same order as the order of performance of the conflicting
operations.
2. Description of the Background Art
A desirable feature of a computing system is the ability to recover from
partial system failures that interrupt memory write operations. If an
application program has a memory write operation in progress at the time
of the system failure, it is most likely that the memory record will
become erroneous. To enable the recovery of memory records after a partial
system failure, it is necessary for the application program to keep backup
copies of the records in nonvolatile memory. When the computing system is
restarted, the memory records to be recovered are replaced with the backup
copies.
To facilitate the making of backup copies and the recovery of memory
records, the operating system typically provides an established set of
memory management procedures that can be invoked or called from an
application program to define a "recovery unit." The recovery unit
consists of program statements between a "START" statement and a "COMMIT"
statement. All of the statements in the "recovery unit" must be completed
before the memory records modified by the statements in the recovery unit
are made available for subsequent processing. The "START" statement
corresponds to the making of a backup copy in nonvolatile memory, and the
"COMMIT" statement corresponds to switching of the backup copy with a
modified version. The statements in the "recovery unit" specify operations
in a single "transaction." Upon recovering from a partial system error,
inspection of the nonvolatile memory will reveal that the operations in
the single "transaction" are either all completed, or none of them are
completed.
In a distributed computing system, the operations in a single transaction
may modify files in different data bases, and the files may be shared by
other processes. During the operation of the transaction, the files may be
inconsistent for a time, although the files will be consistent upon
completion of the transaction. A typical example is a transfer of funds
from one account to another, in which a first account is debited, and at a
slightly later time, another account is credited. During the interim, the
two accounts are inconsistent because the sum of the two accounts does not
represent the total funds in the two accounts. Due to inconsistency when
files are being modified by a transaction, it is known to prevent other
processes from accessing the files until the modification is finished.
Recoverability can be assured in this example by performing commitment for
both files at the same time and place. By changing a single flag, for
example, the backup copies of each file can be replaced at the same time
with the modified versions of the files. In many instances, however, it is
desirable to distribute the operations in a transaction among multiple
processors or processes in a computing system, and to commit the
transaction by committing the operations in each process or processor
while permitting some variability between the times of commitment. In
these instances, an "atomic commitment protocol" is typically used to
ensure recoverability. The protocol requires the exchange of information
about the state of the transaction between the processors or processes. To
identify the transaction being performed, the transaction is typically
assigned a unique "transaction identification number."
A widely used atomic commitment protocol is known as the "two-phase commit
protocol." In a somewhat elementary example of this protocol, one
processor or process in the computing system is assigned the role of a
coordinator which initiates a transaction. To begin a transaction, the
coordinator sends a prepare command to all of the processors or processes
participating in the transaction.
Upon receipt of the "prepare" command, each processor or process
participating in the transaction performs a "START" operation by first
placing "write locks" on memory accessed by the transaction, writes the
transaction identification number into permanent memory to remember that
it is prepared for the transaction, and then sends an acknowledgement back
to the coordinator processor, but does not yet perform its part of the
transaction. The coordinator waits for acknowledgements from all of the
participants. When the coordinator receives acknowledgements from all of
the participants, the coordinator records in permanent memory a list of
the participants and a notation that the transaction is now being
completed, and then the coordinator sends "commit" commands to all of the
participants. The coordinator, however, may receive a message from a
participant indicating that it cannot prepare for the transaction, or the
coordinator may fail to receive acknowledgements from all of the
participants after a predetermined time period, possibly after the
coordinator has retransmitted the "prepare" command. In this case the
coordinator transmits an "abort" command to all of the participants.
Upon receipt of the "commit" command, each participant checks its permanent
memory for the transaction identification number to determine whether the
participant has prepared for the transaction, and if it has, it performs
its part of the transaction, and then performs a "COMMIT" operation to
update the state of permanent memory and clear the transaction ID from
permanent memory in one "atomic" step, and erase the write locks. Then the
participant sends an acknowledgement back to the coordinator. When the
coordinator receives acknowledgments from all of the participants, it
erases the list of participants from permanent memory, and the transaction
is finished.
In a many distributed computing systems, the processors or processes are
permitted to perform multiple transactions simultaneously. In the usual
case each processor or process performs transactions that are local to the
processor or process, and also performs portions of global transactions.
In a distributed data base system, for example, local data base queries
and edits may occur locally, and some of the modifications may be made
globally. A direct application of the two-phase commit protocol described
above may perform satisfactorily in such a system, so long as the global
transactions can be given a high priority with respect to the local
transactions. But use of the read and write locks may unnecessarily
restrict local transactions that could be processed concurrently.
Additional complexity is introduced when it is desired to process global
transactions concurrently across multiple processors or processes in a
distributed computing system. It is impractical to permit a processor or
process to view a global picture of all the conflicts in all of the other
processors or processes. Without a global picture, however, it is
difficult for a processor or process to ensure that there is a correlation
between its seriablility order and the seriability orders of the other
processors or processes. Time-stamping of transaction requests and data
updates is one method that has been used to address this problem of
concurrency control. In general, concurrency control in a distributed
computing system has been achieved at the expense of restricted autonomy
of the local processors or processes, or by locking.
The problem of global deadlock also has to be addressed whenever global
transactions are performed concurrently. One known solution is to provide
a global transaction manager that decides whether or not to dispatch
concurrent global transaction requests. An example is described Y.
Breitbart et al., "Reliable Transaction Management in a Multidatabase
System", Proc. of the ACM SIGMOD conf. on Management of Data, Atlantic
City, N.J., June 1990, pp. 215-224. The global scheduler keeps track of
global transaction requests for local locks on data items by using a
global lock mechanism. Each global data item has a global lock associated
with it. A global transaction that needs only to read a data item requests
a global read-lock. Locks are conflicting if they are requested by two
different transactions on the same data item and at least one of the
requested locks is a write-lock. If two global transactions request
conflicting global locks, the scheduler will prevent one of the
transactions from proceeding because it knows that the two transactions
will cause a conflict at the local site. The scheduler uses strict
two-phase locking for allocating global locks to global transactions, and
maintains a global "wait for graph." The "global wait for graph" is a
directed graph G=(V,E) whose set of vertices V is a set of global
transactions and an edge T.sub.i .fwdarw.T.sub.j belongs to E if and only
if global transaction T.sub.i waits for a global lock allocated to global
transaction T.sub.j. If a global transaction waits for a global lock, then
the transaction state becomes "blocked" and the transaction is included in
the "global wait for graph." The transaction becomes active again only
after it can obtain global locks that it was waiting for. To avoid global
deadlocks, the "global wait for graph" is always made acyclic. To ensure
data consistency in the presence of failures, the scheduler also uses a
"commit graph" and a "wait-for-commit graph" to determine when to schedule
a commit operation. The commit graph CG=<TS,E> is an undirected bipartite
graph whose set of nodes TS consists of a set of global transactions
(transaction nodes) and a set of local sites (site nodes). Edges from E
may connect only transaction nodes with site nodes. An edge
(T.sub.i,S.sub.j) is in E if and only if transaction T.sub.i was executing
at site S.sub.j, and the commit operation for T.sub.i has been scheduled
for processing. After the commit operation for T.sub.i is completed,
T.sub.i is removed from the commit graph along with all edges incidental
to T.sub.i. Global database consistency is assured if the commit graph
does not contain any loops. The wait-for-commit graph is a directed graph
G=(V,E) whose set of vertices V consists of a set of global transactions.
An edge T.sub.i .fwdarw.T.sub.j is in E if and only if T.sub.i has
finished its execution, but its commit operation is still pending and
T.sub.j is a transaction whose commit operation should be completed or
aborted before the commit of T.sub.i can be scheduled. The scheduler uses
the following algorithm for constructing the wait-for-commit graph, and in
scheduling a commit operation of transaction T.sub.i :
1. For each site S.sub.k in which T.sub.i is executing, temporarily add the
edge T.sub.i .fwdarw.S.sub.k to the commit graph.
2. If the augmented commit graph does not contain a cycle, then the global
commit operation is submitted for processing, and the temporary edges
become permanent.
3. If the augmented commit graph contains a cycle then:
a) The edges T.sub.i .fwdarw.T.sub.i1, . . . , T.sub.i .fwdarw.T.sub.im are
inserted into the wait-for-commit graph. The set {T.sub.i1, T.sub.i2, . .
. ,T.sub.im } consists of all the transactions which appear in the cycle
which was created as a result of adding the new edges to the commit graph.
b) Remove the temporary edges from the commit graph.
The transaction T.sub.i, however, need not necessarily wait for the
completion of every transaction T.sub.ik such that T.sub.i
.fwdarw.T.sub.ik. It may be ready to be scheduled for a commit operation
after some of transactions T.sub.ik such that T.sub.i .fwdarw.T.sub.il
(0<1<r) successfully commit (and in some cases, a successful commit of
only one such transaction would be sufficient to schedule the
transaction's commit ).
SUMMARY OF THE INVENTION
The present invention guarantees serializability across distributed
transactions in a computing system by selectively committing and aborting
the transactions to enforce an order of commitment that is the same as an
order of performance of conflicting component operations of the
transactions. When the transaction is committed, results of the component
operations are committed to state memory. When the operation is aborted,
the results of the component operations are discarded. A first memory
access operation in a first transaction, for example, conflicts with a
second memory access operation in a second transaction when the two memory
access operations reference the same memory location and at least one of
the operations is a write operation.
In a typical transaction processing system, a second transaction can read
data written by a first transaction only after the second transaction has
been committed. This restriction is a sufficient condition to ensure
recoverability of the system. To practice the present invention in this
case, when a second transaction performs a read operation before a
conflicting write operation of a first transaction is committed at a time
when the second transaction has not yet committed, the second transaction
is aborted to ensure that the order in which the transactions are
committed is not different from the order in which the conflicting
operations are performed.
The present invention, however, permits the construction of a transaction
processing system in which a second transaction may read data written by a
write operation of a first transaction before the first transaction is
committed. In this case, depending on the respective order in which the
two conflicting operations occur, either of the two transactions may be
aborted to ensure that the order of commitment is the same as the order of
performance of the conflicting operations. Moreover, to insure
recoverability, both of the transactions should be aborted in the case of
the read operation following the write operation and the read operation
being performed before aborting of the write operation. In general, in a
transaction processing system in which a second transaction may read data
written by a write operation of a first transaction, recoverability is
enforced by a process of cascading aborts; the aborting of a transaction
requires the additional aborting of all other transactions that have read
data written by aborted transactions.
In cases where memory addresses of memory access operations are known prior
to preparing the transactions, the required commitment order may be
determined prior to preparation of the transactions. Otherwise, conflicts
are detected when the memory addresses are determined during preparation
of the transaction | | |