Claims
Having thus described our invention, what we claim as new, and desire to
secure by Letters Patent is:
1. A coordination method for writing shared data in a computer sysplex
having a plurality of central processor complexes (CPCs) having local
storages containing local caches, direct access storage devices (sysplex
DASDs) connected to the CPCs for storing data items sharable among the
CPCs in the sysplex, and at least one shared electronic storage (SES)
connected to some or all CPCs in the sysplex, the write coordination
method comprising the steps of:
command signalling a write command by a CPC to a SES to access a SES
directory entry associated with a new data version of a data item
contained in the CPC,
writing an unchanged state indication in the SES directory entry for the
new version in response to the write command, writing the new version of
the data item from the CPC to the sysplex DASD in coordination with the
SES directory entry, and optionally writing the data item in a SES data
area in the SES if indicated by the write command,
SES requesting CPC invalidation of each copy (complement copy) of the data
item in the CPCs for coordinating the execution of the write command,
except not invalidating a copy associated with the signalling step in the
CPC sending the write command,
setting into the SES directory entry a write-for-castout indication and a
lock indication for identifying the CPC sending the write command and to
lock the SES directory entry,
executing by the CPC an I/O transfer of the data item from the CPC to the
sysplex DASD in response to the CPC receiving a command acceptance signal
from the SES,
lock-release signalling by the CPC to the SES in response to the CPC
receiving a successful I/O write completion signal from the sysplex DASD,
and
resetting by the SES to an unlocked state the write-for-castout indication
and the lock indication in the SES directory entry in response to the
lock-release signalling step to allow other CPC commands to access the SES
directory entry and to control accessing of any copy of the data item in
the SES which may or may not be storing a copy of the data item.
2. A write coordination method in a sysplex as defined in claim 1, further
comprising the step of:
sensing for a locked state in the SES directory entry in response to the
command signalling step, SES rejecting the command if a lock state is
found, and SES accepting the command if acceptance conditions are
indicated in the SES directory entry including an unlocked state, and
response signalling by SES to the CPC the SES command acceptance signal or
a SES command rejection response signal in response to the sensing step.
3. A write coordination method in a sysplex as defined in claim 2, the
command signalling step further comprising the step of:
executing a write-and-register command or a write-when-registered command
as the write command.
4. A write coordination method in a sysplex as defined in claim 1, the SES
writing step further comprising the step of:
SES writing in the SES directory entry an unchanged indication in a change
field for the write command, and
writing a lock indication in a castout-lock-state field in the SES
directory entry.
5. A write coordination method in a sysplex as defined in claim 4, further
comprising the steps of:
sensing the castout-lock-state field in the SES directory entry when the
write command is received by SES from a CPC for the SES directory entry,
and
rejecting the write command by SES if a write-for-castout state indication
is found in the castout-lock-state field for indicating the entry is
currently locked for another command doing a store-multiple write
operation.
6. A write coordination method in a sysplex as defined in claim 4, further
comprising the steps of:
rejecting by SES of commands specifying the setting of the castout-lock
indication while the SES directory entry is locked.
7. A write coordination method in a sysplex as defined in claim 4, further
comprising the steps of:
command signalling a read command by a CPC to SES to access the data item
in the SES data area associated with the SES directory entry for reading
an associated data item from SES to the CPC, and
returning to the CPC a state indication for indicating the content of the
castout-lock-state field in the SES directory entry.
8. A write coordination method in a sysplex as defined in claim 7, further
comprising the steps of:
blocking use of the data item received by the CPC from SES when the
returned state indication for the castout-lock-state field is a
write-for-castout indication.
9. A write coordination method in a sysplex as defined in claim 1, the
invalidating step further comprising the steps of:
SES identifying a CPC location for each complementary copy registered with
the SES directory entry, and
SES sending an invalidate-complement-copy signal to each identified CPC
location having a complement copy to be invalidated except the CPC
location associated with the write command.
10. A write coordination method in a sysplex as defined in claim 4, further
comprising the steps of:
detecting by a non-failing CPC of a failing CPC in the sysplex,
sending by the non-failing CPC of a detach-local-cache (DLC) command to
SES,
examining by the DLC command of a collection of SES directory entries in
each SES directory to locate each SES directory entry containing a lock
indication identifying the failing CPC,
interrogating the castout-lock-state field in each located SES directory
entry, and
if the castout-lock-state field is set to a read-for-castout indication,
resetting the castout-lock field and the castout-lock state,
and setting the SES directory entry to contain a changed indication.
11. A write coordination method in a sysplex as defined in claim 4, further
comprising the steps of:
detecting by a non-failing CPC of a failing CPC in the sysplex,
sending by the non-failing CPC of a detach-local-cache (DLC) command to
SES,
examining by the DLC command of a collection of SES directory entries in
each SES directory to locate each SES directory entry containing a lock
indication identifying the failing CPC,
interrogating the castout-lock-state field in each located SES directory
entry, and
if the castout-lock-state field is set to a write-for-castout indication,
invalidating the SES directory entry and invalidating any copy of the data
item in every CPC.
12. A write coordination method in a sysplex as defined in claim 4, further
comprising the steps of:
CPC sending a read-for-castout (RFC) command to SES,
SES responding to the RFC command by setting a read-for-castout indication
in the castout-lock-state field,
setting a lock indication in the castout-lock field for identifying the
CPC-source of the command, and
sending to the CPC any SES-stored data item associated with the SES
directory entry only if the lock indication in the SES directory entry is
set to a reset state indicating the SES directory entry is not currently
locked, and SES not sending the data to the CPC if the SES directory entry
contains the lock indication.
Description
SPECIFICATIONS INCORPORATED BY REFERENCE
The entire specifications of the following listed applications are
completely incorporated by reference as part of the subject application.
Each of the following listed applications is owned by the same assignee as
the subject application. They are:
"Sysplex Shared Data Coherency Method and Means" by D. A. Elko et al, Ser.
No. 07/860,805, pending;
"Storage Element For A Shared Electronic Storage Cache" by D. A. Elko et al,
Ser. No. 07/860,807, pending;
"Management of Data Movement from a SES Cache to DASD" by D. A. Elko et al,
Ser. No. 07/860,806, now U.S. Pat. No. 5,493,668;
"Communicating Messages Between Processors and a Coupling Facility" by D.
A. Elko et al, Ser. No. 07/860,380, pending;
"Message Path Mechanism For Managing Connections Between Processors and a
Coupling Facility" by D. A. Elko et al, Ser. No. 07/860,646, pending;
"Management Of Data Objects Used To Maintain State Information For Shared
Data At A Local Complex" by J. A. Frey et al, Ser. No. 07/860,797, now
U.S. Pat. No. 5,388,266; and
"Method And Apparatus For Coupling Data Processing Systems" by D. A. Elko
et al, Ser. No. 07/860,803, now U.S. Pat. No. 5,317,739.
INTRODUCTION
This invention provides mechanisms and controls for using a shared
electronic storage (SES) device as a store-multiple shared cache. The
invention concerns the cache coherency, serialization and recovery
constructs associated with SES cache store-multiple protocols. The term
"store-multiple" cache is meant to convey a caching model wherein updates
to data are made by writing the modified data to the SES cache and then
writing the modified data to the backing DASD as well.
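The store-multiple write sequence recited in claim 1 can be illustrated with a minimal Python sketch. The class, field and function names below (SesDirectoryEntry, write_command, unlock, store_multiple_write) are hypothetical and are not SES command names. The sketch models only one directory entry, and it leaves out the optional write of the data item into the SES data area.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class SesDirectoryEntry:
    """Hypothetical model of one SES directory entry."""
    name: str
    changed: bool = False
    castout_lock: Optional[str] = None        # CPC holding the lock, or None
    castout_lock_state: Optional[str] = None  # e.g. "write-for-castout"
    registered: Set[str] = field(default_factory=set)  # CPCs with a local copy

    def write_command(self, cpc: str) -> bool:
        """Accept or reject a store-multiple write command from `cpc`."""
        if self.castout_lock is not None:
            return False                  # entry is locked: reject the command
        self.changed = False              # data is written in the unchanged state
        self.castout_lock = cpc           # lock indication identifies the writer
        self.castout_lock_state = "write-for-castout"
        self.registered = {cpc}           # complement copies are invalidated
        return True

    def unlock(self, cpc: str) -> None:
        """Release the lock once the CPC reports a completed DASD write."""
        if self.castout_lock != cpc:
            raise PermissionError("lock is held by another CPC")
        self.castout_lock = None
        self.castout_lock_state = None


def store_multiple_write(entry, cpc, dasd, data):
    """Write `data` to DASD under the entry lock; return True if accepted."""
    if not entry.write_command(cpc):
        return False
    dasd[entry.name] = data   # stands in for the I/O transfer to sysplex DASD
    entry.unlock(cpc)
    return True
```

In this sketch a second write command that arrives while the first writer still holds the lock is rejected, which corresponds to the sensing and rejection steps of claim 2.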
BACKGROUND
A shared electronic storage (SES) in a sysplex is disclosed in
previously-filed patent applications Ser. Nos. 07/860,805, 07/860,807,
07/860,806, and 07/860,803.
The referenced patent applications disclose mechanisms which allow a SES
cache to be shared across one or more central processor complexes
(CPCs), providing cache coherency and storage management functions which
support utilization of the SES cache as a "store-in" cache. As used here,
the term "store-in" cache is meant to convey a caching model wherein
updates to data are made by writing the data to the SES cache in a
"changed" state, reflecting the fact that the SES data element contents
are more current than the copy of the data on backing sysplex DASD. The
changed state for the data element is conveyed to the SES by setting the
change bit command operand for the Write When Registered (WWR) or Write
And Register (WAR) commands to the changed state. Once the updated data is
stored in the SES cache, the write operation is considered complete
without first requiring the update to be written to DASD.
In the store-in model, changed data is periodically cast out from the SES
cache to the backing DASD storage. This is achieved by reading the data
for a changed SES entry into local CPC memory via a Read For Castout
command, which sets the castout lock in the SES directory entry and resets
the change bit in the SES directory entry to the unchanged state. Then the
data is written to DASD. Finally, the castout lock is released via the
Unlock Castout Locks command.
The castout process is independent of, and not serialized against, mainline
program read or write access to data resident in the SES store-in cache.
While a version of the data element in the SES is in the process of being
retrieved and cast out to DASD, concurrent writes of new updates are
allowed to proceed against the SES data element, as are read operations
which may fetch an updated version of the data before the castout of the
prior version is completed.
However, it is necessary to serialize the castout process against another
castout process for the same data element. If this were not provided, a
castout process which fetched an earlier version of the data element might
not complete its I/O operation to DASD until after a later castout process
had fetched a more-current version of the data and succeeded in writing
that version to DASD first. In such a scenario the more recent update of
the data would be lost. The SES store-in cache castout mechanism described
in the referenced art provides a SES entry-level castout serialization
function, which is defined as being "non-blocking" with respect to
non-castout read/write operations against the same data element.
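The store-in castout sequence and its entry-level serialization, as described above, can be sketched in Python as follows. This is an illustration under stated assumptions, not the SES implementation: StoreInEntry, CastoutLocked and cast_out are hypothetical names, and a dictionary stands in for the backing DASD.

```python
class CastoutLocked(Exception):
    """Raised when a castout is attempted while the castout lock is held."""


class StoreInEntry:
    """Hypothetical model of one changed entry in a store-in SES cache."""

    def __init__(self, data):
        self.data = data
        self.changed = True        # SES copy is newer than the DASD copy
        self.castout_lock = None   # holder of the castout lock, if any

    def read_for_castout(self, cpc):
        """Set the castout lock, reset the change bit, and return the data."""
        if self.castout_lock is not None:
            raise CastoutLocked(self.castout_lock)
        self.castout_lock = cpc
        self.changed = False
        return self.data

    def write(self, data):
        """Mainline write: not serialized against an in-progress castout."""
        self.data = data
        self.changed = True

    def unlock_castout_lock(self, cpc):
        """Release the castout lock if `cpc` holds it."""
        if self.castout_lock == cpc:
            self.castout_lock = None


def cast_out(entry, cpc, dasd, name):
    """Read for castout, write the data to DASD, then release the lock."""
    data = entry.read_for_castout(cpc)
    dasd[name] = data
    entry.unlock_castout_lock(cpc)
```

Because a second read_for_castout is refused while the lock is held, an older version cannot reach DASD after a newer one, while write() remains available to mainline programs throughout.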
The provided entry-level castout serialization scope supports programs
which provide record-level locking protocols for write operations
(allowing concurrent write operations to proceed against different records
in the same page), avoiding the necessity to obtain a page-level or higher
locking scope when performing castout of data to DASD.
With respect to maintaining data coherency for data pages written in a
changed state to the SES store-in cache, the aforementioned record-level
update locking programs still rely on entry-level data coherency controls
to prevent down-level versions of the data page from being written to a
SES data element. Such controls are provided through use of the Read And
Register (RAR) command and the Write When Registered (WWR) command as
described in patent application Ser. No. 07/860,805. Once a program
registers interest in a SES directory entry via the Read And Register
command, retrieving the then-current copy of the associated data element
from the SES cache, that program will be allowed to write back an updated
version of the page to the SES as long as the program still has a valid
registered interest in the SES directory entry for that data. If a second
program has successfully written an updated version of the page (modifying
a different record in the page) via the WWR command since the first
program issued its RAR command for the data, the attempt to then write the
down-level version of the page by the first program will fail. This is due
to the successful execution of the Write When Registered command by the
second program, having specified the data should be written in a changed
state. As a result, the WWR command execution of the second program causes
all registered interest in the directory entry for the data element by all
sharing programs other than the WWR command issuer to be de-registered and
a cross-invalidate signal to be issued to each of the other registered
users to invalidate their local cache buffer copies of the data element.
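The registration check just described can be illustrated with a short Python sketch. The names SesEntry, read_and_register and write_when_registered are hypothetical stand-ins for the RAR and WWR commands, and the invalidate callback stands in for the cross-invalidate signal; none of this is taken from the referenced applications.

```python
class SesEntry:
    """Hypothetical directory entry tracking registered interest per program."""

    def __init__(self, data=None):
        self.data = data
        self.changed = False
        self.registered = set()

    def read_and_register(self, program):
        """Register interest and return the then-current copy of the data."""
        self.registered.add(program)
        return self.data

    def write_when_registered(self, program, data, invalidate):
        """Write only if `program` still has valid registered interest."""
        if program not in self.registered:
            return False               # down-level write is rejected
        for other in sorted(self.registered - {program}):
            invalidate(other)          # cross-invalidate each other user
        self.registered = {program}    # all other users are de-registered
        self.data = data
        self.changed = True
        return True
```

For example, if programs P1 and P2 both register, and P2 then writes successfully, a later write_when_registered call by P1 returns False until P1 registers again and picks up the version written by P2.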
The referenced patent applications, in particular Ser. Nos. 07/860,805 and
07/860,806, thus demonstrate that the SES store-in cache model provides
the entry-level cache coherency and castout serialization
scope to support exploitation by data base manager programs that implement
locking for update with a record-level serialization scope.
The following background information details pertinent aspects of a sysplex
configuration providing SES caching functions as disclosed in the
referenced patent applications.
Sysplex Hardware Structure:
FIG. 1 shows a representation of a sysplex system. It contains a plurality
of central processor complexes (CPCs), CPC-1 through CPC-M, which
represent any number of CPCs from one to a large number, all being
connected to one or more shared electronic storage (SES) devices (of which
one SES device 101 is shown in FIG. 1).
Each CPC is of the type shown in FIG. 2, which may be a multiprocessor such
as the presently commercial IBM ES/9000 model 900, which is designed
according to the IBM ESA/390 architecture specified in the Enterprise
Systems Architecture/390 (ESA/390) Principles of Operation (POP), which is
orderable from IBM by form number SA22-7201-00, and is incorporated herein
by reference in its entirety. Each CPC has one or more operating systems.
If any CPC has more than one operating system, its resources are logically
partitioned among plural operating systems using the IBM PR/SM feature.
Inter-system channels (ISCs) are connected between SES (101) and CPCs 1
through M. An ISC (106-1 thru 106-M) connecting to a CPC communicates
signals to/from microcode/hardware in the CPC.
Each CPC in the sysplex operates with a storage hierarchy, which for
example may include a private high-speed hardware cache in each CPU of a
CPC (201-1 and 201-N), a shared hardware cache accessible to all of the
private processor caches (202), a main storage (termed central storage in
other literature) (MS) shared by all processors in the CPC (204), a
hardware storage area (HSA) associated with MS but not having MS
addressability (205) and an expanded storage ES (206). However, the DASD
is grouped by DASD controls that allow any CPC in the sysplex to access
any DASD in the group, which is referenced herein by the term "sysplex
DASD" (207).
The MS/ES storage combination may be considered as a single random access
storage unit internal to the CPC. This is because pages (4 kilobyte units)
in MS are backed by ES and DASD pages, which programs can swap using
instructions such as "pagein/pageout" or "move page" to quickly move pages
back and forth between MS and ES. This eliminates the need to distinguish
between records in ES or MS when they belong to the same user operation,
and they may be viewed as being in the same CPC cache.
An expanded storage (ES) 206 is connected to the MS 204, and stores data
which is addressed in 4 kilobyte page units. A DASD director 207 is
connected to the MS (204) for controlling the storing of data on disk
storage devices, DASD-1 through DASD-K. The DASD director 207 controls
the data flows between all CPCs in the sysplex and all the DASD in the
illustrated bank of DASD, so that any CPC can access any record on any
DASD, including records written by other CPCs in the sysplex. Each of
these storages has a speed/capacity characteristic which places it at the
position shown in the illustrated storage hierarchy.
A sysplex may have any mix of OSs running on its different CPCs, some CPCs
each having one OS, and other CPCs each having a plurality of OSs running
independently of each other. One or more subsystems may be running under
any OS in any CPC, including the IBM DB2, DFP, IMS, VSAM, etc. subsystems.
Different copies of the same data base subsystem program may be running
simultaneously and independently in the different CPCs. These different
programs may be accessing the same or different data elements or records
in the data base, which may simultaneously be in MS/ES local caches (LCs)
of the different CPCs.
The CPC/SES physical connection (208) may be provided by a respective
channel connected at one end to an MS controller in a respective CPC, and
connected at its other end to a SES device. The respective channel bus may
be made of serial optical fibers. The bus may be a single fiber, but it
may be made of a plurality of fibers operating in parallel by "striping"
(interleaving) data among them.
In a hardware sense, a SES may be considered to be a large random access
memory that may be used in common by all CPCs connected to the SES. The
connected CPCs may use the SES to store shared data records and files on a
temporary or semi-permanent basis. Hence, SES may be considered to be a
component of the storage hierarchy in the system, having a hierarchy level
common to all CPCs attached to the SES, and roughly corresponding to the
ES level in the CPCs.
In a sysplex, one or more SES entities may be physically connected to the
MS/ES in every CPC in the sysplex. It is not required that all CPCs in a
sysplex be connected to a SES. For example, a SES may be attached only to
a subset of CPCs operating the same programming subsystem. And different
subsets of CPCs may be connected to different SESs in a sysplex for
running different programming subsystems.
SES may be used as a high-speed cache for data normally stored in the
sysplex common DASD, although the CPC/SES/DASD physical connections may
not be in a direct hierarchical path. Any CPC in the sysplex can access a
record much faster from SES than it can from the common DASD storage. That
is, a data element or record can be quickly accessed in SES without the
electro-mechanical delays found with DASD, such as waiting for head
movement between tracks and waiting for track spin to reach a requested
DASD record.
Special commands are provided to allocate the SES cache. Also, a plurality
of caches may be allocated within the same SES, such as having a
respective cache handle the data shared by attached subsets of CPCs using
different programs.
Each SES cache includes a directory (102), data area (103), local cache
register (104), and cache controls (105). If the data area part of the
cache is not going to be used, it may be made zero in size. Each valid
directory entry in a SES cache contains a name of a data element
registered in SES by any of its attached CPCs. SES may or may not contain
a copy of the data named in the registered element. The SES registered
name is also the name of one or more copies of the data element in one or
more CPCs in the sysplex. Furthermore, this directory name also identifies
a copy of the data element stored in (or to be stored in) one of the DASDs
1-K in the bank of DASD connected to director 207.
The data element name is used by SES to control data coherence for the data
element regardless of whether the data element is stored in the SES. The
data element name must be registered in SES for SES to provide data
coherence for the data element in the sysplex. It is not necessary to
store the data element itself in SES for SES to provide data coherence.
Maintaining Shared Data Coherency:
Patent application Ser. No. 07/860,805 describes a store-in cache approach
having program controlled caches (buffers) in the memories of different
central processing complexes (CPCs) and in one or more shared electronic
storage (SES) entities. The size of any of these caches may be different
from each other and may be changed without changing the hardware
structures of the entities containing the caches. Data coherence is
maintained among these caches. Moreover, the set of caches may reside on
all or a subset of CPCs with connections to a SES cache. Data coherence is
maintained among all local caches that have attached to the SES cache via
an explicit attach-local-cache command executed for each local cache when
the local cache is created and continues until the local cache is
detached. The number of local caches may change over time.
Patent application Ser. No. 07/860,805 further describes a method and
structure in a shared data, multi-computer system which guarantees that
any page of data in a shared cache (aka SES) will not be overwritten by an
earlier version of that page obtained from any other shared data storage
resource. In a multi-system, data-sharing complex, a database system
executing on a first computer system could be caching an updated page in a
shared cache while another database system could be trying to cache a
down-level copy of the same page obtained from a DASD. The described
mechanisms detect such a condition and, without a serialization mechanism
such as locking, bar entry of the down-level copy obtained from the DASD.
Local Cache Structure:
The executing program in any CPC uses one or more allocated local cache
buffers (LCBs 0-N of 107A, 0-K of 107B, and 0-L of 107C) in the local
caches (107A, 107B, and 107C) of the CPC's MS/ES to contain the data
elements and records in recent use that have been generated, backed-up,
retrieved and/or stored-in by the CPC. And any CPC in the sysplex may
have more than one type of program (for example DB2 and IMS) currently
executing on the central processors and I/O processors of that CPC. Then,
these plural executing programs in a CPC may have their own LCBs allocated
in the CPC's MS/ES which contain their most recently accessed data
elements and records. Hence, a complexity of different LCBs may be used by
different types of programs, and they may simultaneously exist in the
MS/ES of any one or more CPCs in a sysplex. The term "local cache" (LC) is
used herein to refer to a collection of LCBs set up and used by a CPC
programming subsystem.
The allocation and size of each LCB is dependent on the respective program
being used in the CPC. The LCBs may have different sizes and different
numbers in the different LCs. Any local cache buffer may be changed to a
different size during program execution.
Invalidation signalling controls in SES are provided to request the
coherence of data shared among the MS/ES buffers in the different CPCs, so
that each CPC may determine the validity of local cache buffers when
access to the data contained in a buffer is requested by programming.
During initialization of a local cache, operating system services are
invoked to authorize access of the program to the SES cache. These
operating system services assign a local cache identifier to be used to
uniquely identify the local cache and its attachment to the SES cache.
A local cache is attached to a SES cache through execution of the Attach
Local Cache (ALC) command. The ALC command identifies the local cache to
the SES cache structure, initializes the local-cache controls, and assigns
the specified local-cache identifier (LCID), which is saved in the local
cache controls (105). Local cache controls are used by SES to maintain
information regarding each attached local cache.
When the connection between a local cache and the SES cache is no longer
required, either as a result of normal or abnormal termination cleanup,
the Detach Local Cache (DLC) command is executed. The DLC command cleans
up resources in the SES cache associated with the store-in local cache
and, if requested, releases the LCID for re-use.
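For the abnormal-termination case, the castout-lock cleanup recited in claims 10 and 11 can be sketched in Python as follows. The dictionary layout, the field names and the invalidate_all_copies callback are assumptions made for illustration; only the two branches on the castout-lock state come from the claims.

```python
def detach_local_cache(directory, failing_cpc, invalidate_all_copies):
    """Clean up entries whose lock indication identifies the failing CPC.

    `directory` maps a data item name to a dict with the hypothetical keys
    "castout_lock", "lock_state" and "changed".
    """
    for name in list(directory):
        entry = directory[name]
        if entry["castout_lock"] != failing_cpc:
            continue                      # lock not held by the failing CPC
        if entry["lock_state"] == "read-for-castout":
            # Claim 10: reset the lock and mark the entry changed again.
            entry["castout_lock"] = None
            entry["lock_state"] = None
            entry["changed"] = True
        elif entry["lock_state"] == "write-for-castout":
            # Claim 11: invalidate the entry and every CPC copy of the item.
            invalidate_all_copies(name)
            del directory[name]
```

Entries locked by other CPCs are left untouched, so the scan can run while the rest of the sysplex continues to use the cache.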
The operating system services also invoke a CPU instruction (define vector
DV) to cause a bit vector (termed a coherence vector (V)) to be created
(108A, 108B, and 108C) in the Hardware System Area (HSA). Completion of
the instruction to define this coherency bit vector returns a local cache
token (LCT). A new entry for each new coherence vector bit is placed in an
HSA table T (209) (shown as table T in HSA 205 in FIG. 2) to represent the
locations of the new coherence vector bits in the HSA associated with a
particular LC in the CPC. The LCT and LCID are provided to SES by the
operating system when a local cache is attached to the SES cache and
stored in the local cache controls (105).
A local cache is managed in its CPC by a local cache directory (LCD)
comprised of a plurality of entries which contain pointers (addresses) to
respective LCBs accessed through the respective LCD entries. Any CPC is
able to store a data element in each LCB and assign a unique name to the
data element; the name is put into the LCD entry associated with the LCB.
This data element name is used to communicate with SES as the address of
the data element when it is registered in SES, regardless of the size of
the data element. The operating system service invoked to interface with
the SES uses a CPU instruction (set vector entry SVE) to set the coherence
vector bit when the name of a data item is successfully registered at the
SES cache.
When a local cache buffer contains a data element shared among the CPCs in
the sysplex, programming associates a bit in the coherency vector with the
LCB. The entry within the coherency vector is termed the local cache entry
number (LCEN). Each LCD entry has a respective local cache entry number
(LCEN) which is used by the CPC and SES to distinguish between the LCBs of
a local cache. Each LCD entry indicates the valid/invalid state of its LCB
contents.
The CPC is physically connected to a SES through a storage controller (not
shown) to its MS 204 and HSA 205. This storage controller has microcode
addressability to HSA 205 for accessing a vector bit position in HSA by
LCT and LCEN values being signalled by the channel (208 in FIG. 2 and
106-1 thru 106-M in FIG. 1) from the attached SES. A table (T in FIG. 2 at
209) in the HSA 205 translates the LCT/LCEN values received from the
channel into a corresponding coherence vector bit position in HSA.
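The LCT/LCEN translation performed through table T can be pictured with the following Python sketch. The layout is an assumption: the text describes a table entry for each coherence vector bit, whereas the sketch stores one base offset per vector and adds the LCEN to it, purely for brevity. HsaTable, define_vector and bit_position are hypothetical names.

```python
class HsaTable:
    """Hypothetical table T: maps a local cache token to a vector location."""

    def __init__(self):
        self._vectors = {}    # LCT -> (base bit offset in HSA, number of bits)
        self._next_free = 0   # next unused bit offset in the modelled HSA
        self._next_lct = 1    # next token value to hand out

    def define_vector(self, n_bits):
        """Stand-in for the define vector (DV) instruction; returns an LCT."""
        lct = self._next_lct
        self._next_lct += 1
        self._vectors[lct] = (self._next_free, n_bits)
        self._next_free += n_bits
        return lct

    def bit_position(self, lct, lcen):
        """Translate an (LCT, LCEN) pair into an HSA bit position."""
        base, n_bits = self._vectors[lct]
        if not 0 <= lcen < n_bits:
            raise IndexError("LCEN outside the coherence vector")
        return base + lcen
```

With two vectors of 8 and 4 bits defined in that order, LCEN 2 of the second vector maps to bit position 10 in this model.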
Local Cache Coherency:
The mechanisms described in patent application Ser. No. 07/860,805 prevent
multiple-version contamination of sysplex data when plural copies of a
data element concurrently exist in different LCBs in the CPCs.
Multiple copies of a record are allowed in different LCBs as long as all
CPCs are only reading the records. Contamination occurs when a data
element is changed in any LCB while a second copy is allowed to
concurrently exist unchanged in any other LCB, since two different
versions then exist for the same data element. Then, any changes made
to the second copy will not be cumulative with the changes made to the
first copy, making all copies incorrect, including the copy in the common
DASD which is not updated until one of the new versions is committed by
being stored back to DASD.
This multiple-version data contamination can be avoided if all LCB copies
are invalidated except for one LCB copy which is changed. Then, the one
remaining copy receives all changes and represents the latest version of
the data element existing in any CPC, since only it can receive any
changes and only it can be stored back to the common DASD to represent the
latest copy of the record in the system.
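A small worked example in Python may make the contamination scenario concrete. The page contents and record names are invented for illustration; the two CPCs are represented only by their local copies of one page.

```python
def update_record(page, key, value):
    """Return a new version of `page` with one record changed."""
    new_page = dict(page)
    new_page[key] = value
    return new_page


dasd_page = {"rec1": "a", "rec2": "b"}

# Without invalidation: both CPCs change their own copy of the same version.
copy_1 = update_record(dasd_page, "rec1", "A")
copy_2 = update_record(dasd_page, "rec2", "B")
contaminated = copy_2   # the last copy stored back lacks the change to rec1

# With invalidation: the second copy is invalidated when the first changes,
# so the second CPC must start from the latest version before changing it.
copy_1 = update_record(dasd_page, "rec1", "A")
copy_2 = update_record(copy_1, "rec2", "B")
coherent = copy_2       # carries both changes
```

Here contaminated holds the new rec2 but the old rec1, while coherent holds both updates.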
The avoidance of multiple-version contamination is herein called
"maintaining data coherence". Invalidation of all outstanding copies of a
record except the copy being changed is the generally accepted solution
for preventing multiple-version contamination in a record.
To prevent multiple-version contamination, any CPC, wanting to access (read
or write) a record in the sysplex common DASD, must first register the
record in a SES directory, and preferably read the record from the SES
cache if it exists there.
In FIG. 1, CPC-1 is shown with local caches LC-1 and LC-2, which have
associated coherency vectors V1 and V2, respectively. LC-1 has buffer
entries LCB-0 through LCB-N, which are respectively associated with
coherency vector V1 bits 0-N, and LC-2 has entries LCB-0 through LCB-K,
which are associated with V2 bits 0-K. CPC-M is shown with a single LC(1),
having LCBs 0-L
associated with V(1) coherency bits 0-L.
Different versions may exist of the same record among these multiple copies
in their different locations. The latest version exists in SES (if a copy
is written in SES), and in the CPC making the last change in the record.
Generally, in a store-in cache, the DASD copy is a more out-of-date version
than the SES copy, because this copy is the last to be updated due to the
slow nature of accessing the electro-mechanical DASD.
SES can only set these vector bits to an invalid state through a
communication channel between SES and the CPC. When SES sets any coherence
vector bit to the invalid state in a CPC's HSA, the CPC programming is not
disturbed by such setting when it happens. The CPC programming continues
without interruption due to the SES setting of any coherence vector bit.
It is up to the CPC to determine when it will test the state of the vector
bits to determine if any invalidation has been indicated. This manner of
operation even gives the CPC the option of not using the sysplex coherence
controls if the CPC has a situation in which coherence is not needed.
When programming determines that data in a local cache buffer is to be
used, the CPC uses a "test vector entry" (TVE) CPU instruction to test the
current state of the data coherence bit for that LCB in the CPC. The TVE
instruction operates to invoke microcode which, based on the specified LCT
and LCEN, locates the bit in the HSA memory area and tests its state. The
TVE microcode then records the result of the instruction execution in a
condition code (CC) of the instruction for use by subsequent executing
instructions in the CPC program that can conform the state of the
corresponding LCB valid/invalid state.
The CPC operating system can set the state of a vector bit to either a
valid or invalid setting at any time by means of a CPC instruction (set
vector entry SVE) which activates microcode/hardware to perform the bit
setting.
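The interplay of the SVE and TVE instructions with SES invalidation can be modelled with a minimal Python sketch. The class and method names are hypothetical, and the condition code values (0 for valid, 1 for invalid) are an assumption chosen for the example rather than the architected values.

```python
class CoherenceVector:
    """Hypothetical model of one coherence vector (1 = valid, 0 = invalid)."""

    def __init__(self, n_bits):
        self._bits = [0] * n_bits   # every entry starts out invalid

    def set_vector_entry(self, lcen, valid):
        """Stand-in for SVE, issued by the CPC operating system."""
        self._bits[lcen] = 1 if valid else 0

    def ses_invalidate(self, lcen):
        """Stand-in for SES setting a bit invalid; nothing is interrupted."""
        self._bits[lcen] = 0

    def test_vector_entry(self, lcen):
        """Stand-in for TVE; returns an assumed condition code.

        0 means the LCB contents are valid, 1 means they are invalid.
        """
        return 0 if self._bits[lcen] else 1
```

A program would call test_vector_entry immediately before using a local cache buffer and refresh the buffer when the returned code indicates the invalid state. Nothing in the model notifies the program when ses_invalidate runs, which mirrors the polling behavior described above.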
SES Structure:
The SES cache is a structure in SES consisting of a collection of data-area
elements, a directory, a local cache register, and local cache controls. A
SES cache structure is created at the request of programs accessing data
shared among CPCs where those programs require coherency and integrity for
locally cached copies of data items.
SES Directory:
A SES cache directory is an essential component of a SES device for
obtaining SES coherency control. Having a SES data area with a SES cache
enhances sysplex performance, but is optional. Without a SES data area,
the data records in the sysplex would be accessible only from the sysplex
DASD. The data base software operations would lose the performance
provided by fast access to shared records in a SES data area. The SES
local cache register (associated with SES directory entries) would still
identify which CPC local cache buffers (LCBs) in the sysplex have copy(s)
of a shared data element.
SES Local Cache Register:
Entries in the local cache register identify the attached local caches
which contain copies of the data element identified by the associated
directory entry. Each entry in the local cache register provides
sufficient information to locate the cache coherency vector associated
with the local cache and the local cache entry within the coherency vector
used to represent the validity of locally cached copies of the data
element.
SES Command Processing:
SES commands, including read and register and write and register, cause
registration of a locally cached copy of a data element in the SES
directory. At the completion of successful registration, the local cache
register contains sufficient information to locate locally cached copies
of a data element, provide the CPC information regarding the location of
the coherency vector associated with the local cache, and provide
identification of the bit within the coherency vector being used by the
program to represent the validity of the local cache buffer.
SES command processing may cause invalidation of the bit within the
coherency vector being used by the program to represent the validity of a
locally cached data item. The write when registered, write and register,
invalidate complement copies, and invalidate name commands cause cross
invalidation processing to be performed.
Cache Coherency Processing Flow Overview:
The data element name is used by SES to control data coherence for the data
element regardless of whether the data element is stored in the SES. The
data element name must be assigned to the cache directory in SES and each
local copy of the data element must be registered in the local cache
register (LCR) associated with the directory entry for SES to provide data
coherence for the data element in the sysplex. It is not necessary to
store the data element itself in SES for SES to provide data coherence,
but it is necessary that each local copy of the data be registered by the
CPC.
For a CPC to know if the data element is cached in SES or not and to
register its local copy, the CPC can use a "read and register" (RAR)
command. The RAR command checks the SES directory for an existing assignment
of the requested record name, and registers the local copy in the
associated local cache register. SES reports back to the requesting CPC on
whether the data element is cached or not in SES, and when cached, returns
the data element.
If a RAR command finds the SES cache does not have the name of the data
element assigned, SES may assign a directory entry and register the local
copy in the associated local-cache register. SES reports back to the
requesting CPC that the data element was not cached. Then the CPC can
issue a read command to the sysplex DASD to read its archived copy to the
CPC designated LCB.
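The RAR behavior described in the two preceding paragraphs may be sketched as follows. The class names, the return values and the dictionary layout are hypothetical. The sketch also assumes that SES always assigns a directory entry on a miss, whereas the description only states that SES may do so.

```python
class DirectoryEntry:
    """Hypothetical SES directory entry for one data element name."""

    def __init__(self):
        self.data = None        # None means the element is not cached in SES
        self.registered = {}    # local cache register: LCID -> LCEN


class SesCache:
    def __init__(self):
        self.directory = {}     # data element name -> DirectoryEntry

    def read_and_register(self, name, lcid, lcen):
        """Register the local copy and report whether SES holds the data."""
        entry = self.directory.get(name)
        if entry is None:
            # Assumption: a directory entry is always assigned on a miss.
            entry = DirectoryEntry()
            self.directory[name] = entry
        entry.registered[lcid] = lcen
        if entry.data is None:
            # The CPC would now read its copy from the sysplex DASD.
            return ("not cached", None)
        return ("cached", entry.data)
```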
If a CPC is generating a new data element or knows it has the current copy
of the data element, the CPC can use a "write and register" (WAR) command
to both register the local copy in the local-cache register and write the
data element into the directory entry's data area.
If a CPC is changing (updating) a shared data element in which the CPC has
previously registered interest, the CPC uses a "write when registered"
(WWR) command. WWR checks in the SES directory for an existing
registration of a local copy of the data element for the CPC. Only if the
registration is found is the data element written into the entry's
corresponding SES cache location. SES sets the directory change bit equal
to the CPC specified change indication. An invalidation operation is
performed only if the data element is marked as changed.
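A minimal sketch of the WWR checks follows. The dictionary layout and the return values are assumptions for illustration. The sketch only lists the other registered copies that would receive an invalidation request and leaves the local cache register unchanged.

```python
def write_when_registered(entry, lcid, data, changed):
    """Hypothetical WWR handler.

    entry is a dict with keys 'data', 'change_bit' and 'registered',
    where 'registered' maps LCID -> LCEN for one directory entry.
    Returns a status and the list of (LCID, LCEN) copies to invalidate.
    """
    if lcid not in entry["registered"]:
        # No registration found for this CPC: nothing is written.
        return ("not registered", [])
    entry["data"] = data
    # The change bit is set equal to the change indication given by the CPC.
    entry["change_bit"] = changed
    to_invalidate = []
    if changed:
        # Invalidation is performed only when the element is marked changed.
        for other, lcen in entry["registered"].items():
            if other != lcid:
                to_invalidate.append((other, lcen))
    return ("written", to_invalidate)
```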
Each RAR or WAR command received by SES causes SES to register the
requesting CPC local cache entry number (LCEN) in a local-cache register
associated with the respective SES cache directory entry accessed by the
command. The SES LCR records the LCENs of all LCs which accessed the data
element in SES. SES uses this LCR data to invalidate any or all copies of
a data element in the LCs having the data element. Recording which
specific local caches contain a copy of the data element enables SES to
avoid sending an invalidation request signal to any local cache not
containing a copy of the data element.
Three separate processes invalidate local-cache entries registered in a SES
cache directory entry. Local-copy invalidation invalidates all local
copies registered in the local-cache register of the SES directory entry.
Complement-copy invalidation invalidates all the local copies except for
the local cache identified by the LCID request operand on a command that
causes complement-copy invalidation to occur. Single-copy invalidation
invalidates only the local cache copy of a specified local cache.
SES makes an invalidation request to any attached LC by sending a "cross
invalidate command" on the channel to the CPC and specifying the LCT and
LCEN values identifying the vector and its bit to be set to invalid state.
The transmitted LCT/LCEN values result from information retrieved from the
local cache controls (105) and local cache register.
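The three invalidation processes differ only in which registered local caches are selected. The sketch below illustrates that selection and the LCT/LCEN pair carried by each cross invalidate command. The function name, the mode strings and the two dictionaries are hypothetical and are not taken from the described embodiment.

```python
def cross_invalidate_requests(register, controls, mode, lcid=None):
    """Return the (LCT, LCEN) pairs SES would send as cross invalidate commands.

    register: LCID -> LCEN, the local cache register of one directory entry
    controls: LCID -> LCT, an assumed layout for the local cache controls
    mode: "local", "complement" or "single"
    """
    if mode == "local":
        targets = list(register)                            # every registered copy
    elif mode == "complement":
        targets = [c for c in register if c != lcid]        # all except the requester
    elif mode == "single":
        targets = [c for c in register if c == lcid]        # only the named cache
    else:
        raise ValueError("unknown invalidation mode: " + mode)
    return [(controls[c], register[c]) for c in sorted(targets)]
```

Because only registered local caches are considered, no request is generated for a local cache that does not hold a copy of the data element.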
When a data element is stored in the SES cache, and the write request
indicates that the data element is changed from the current copy of the
data element registered in SES, SES performs complement-copy invalidation
processing.
An "invalidate complement copies" (ICC) command to SES specifically causes
SES to perform complement-copy invalidation processing for the named data
element. The ICC command is particularly useful when records are not
stored in SES, but are only stored in DASD, where the ICC command is used
after committing (storing) the changed data element in the DASD.
Another command used to maintain SES usage and coherence is an "invalidate
name" (IN) command, which is sent to SES with a data element name by any
CPC that wants to purge a data element from the SES. SES looks up the data
element name in its directory and finds the location of all local caches
currently having a copy of the data element. Then SES performs local-copy
invalidation, sending invalidation requests to all CPCs having an LCB
containing a copy. When responses are received from all requested CPCs,
SES deletes that data element name from its directory and deregisters the
local copies from the LCR.
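The ordering of the IN command, in which the name is deleted only after all invalidation responses have arrived, can be sketched as below. The function and parameter names are hypothetical. The callback send_cross_invalidate is assumed to return only once the addressed CPC has responded.

```python
def invalidate_name(directory, name, send_cross_invalidate):
    """Hypothetical IN handler.

    directory maps a data element name to a dict whose 'registered'
    key maps LCID -> LCEN. Returns the LCIDs that were invalidated.
    """
    entry = directory.get(name)
    if entry is None:
        return []
    notified = []
    # Local-copy invalidation: every registered copy is invalidated.
    for lcid, lcen in sorted(entry["registered"].items()):
        send_cross_invalidate(lcid, lcen)   # assumed to block until the CPC responds
        notified.append(lcid)
    # Only after all responses: delete the name, which also drops the registrations.
    del directory[name]
    return notified
```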
Managing Data Movement From A SES Cache To DASD:
Patent application Ser. No. 07/860,806 describes a method and structure in
a shared data, multi-computer system which guarantees that any page of
data which is changed in a shared cache (aka SES) and which is being cast
out to DASD prior to making the page available for SES storage reclamation
will not result in deletion of a later version of the page written to the
shared memory (aka SES) while the castout operation is in progress.
The described mechanisms do not require higher level locking or
serialization and queuing mechanisms in a shared memory to guarantee
consistency between page versions when removing a page from the shared
memory for entry into secondary storage (DASD).
In a multi-system, data-sharing complex, a database system executing on a
first computer system could be reading a modified page in a shared cache
as a first step to write the page to secondary storage while another
database system could be trying to cache an even more recently updated
version of the same page in the shared cache. The described mechanisms
detect such a condition and, without a blocking mechanism such as locking,
bar deletion of the updated copy of the page from the cache after the
first computer system has stored the prior version in secondary storage.
Patent application Ser. No. 07/860,806 describes a technique for operating
a shared cache that does not require any additional serialization
mechanisms such as higher-level locking to guarantee that a more recent
version of a page is not deleted from the cache while an earlier version
is written to secondary storage. The SES cache includes a directory
containing a directory entry for each page stored in the cache. A castout
lock field is provided in the directory entry for each page in the cache.
The castout lock field contains the local cache identification of an
attached cache user currently performing a castout operation. This field
operates in conjunction with a change field used to indicate whether the
page has been changed. If the change field indicates that the page has
been changed during an ongoing castout operation, it prevents deletion of
the page, thereby preserving the latest version of the page in the shared
cache for a following castout operation. A castout process is allowed to
proceed only if the castout ID field is zero, indicating that no castout is
in progress, and reclaim of the page is allowed only if both the castout ID
and change bit fields are zero. The mechanisms include a castout command
operation, namely a "read for cast out" operation that enters the
identification of the requestor into the castout ID field, sets the change
indication to zero and returns the data to the requestor.
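The interplay of the castout ID field and the change bit may be illustrated by the following sketch. It is a simplified model and not the implementation of the referenced application. The class, the function names and the use of non-zero integers as requestor identifications are assumptions.

```python
class Page:
    """Hypothetical directory entry state for one cached page."""

    def __init__(self, data):
        self.data = data
        self.change_bit = 1    # page holds changes not yet written to DASD
        self.castout_id = 0    # 0 means no castout is in progress


def read_for_castout(page, requester_id):
    """Proceeds only if no castout is in progress; requester_id must be non-zero."""
    if page.castout_id != 0:
        return None
    page.castout_id = requester_id
    page.change_bit = 0
    return page.data


def write_page(page, data):
    """A later version written to the cache sets the change bit again."""
    page.data = data
    page.change_bit = 1


def unlock_castout(page, requester_id):
    """Releases the castout serialization held by requester_id."""
    if page.castout_id == requester_id:
        page.castout_id = 0


def can_reclaim(page):
    """Reclaim is allowed only if both fields are zero."""
    return page.castout_id == 0 and page.change_bit == 0
```

If write_page is called between read_for_castout and unlock_castout, can_reclaim remains false after the unlock, so the later version stays in the cache for a following castout operation.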
Castout Processing Flow Overview:
The SES-cache storage is normally smaller than the DASD storage. Thus
periodically the changed data must be transferred from the SES cache to
the backing DASD. This process, called castout, is controlled by the
program and involves the following operations:
A SES-read for castout operation is issued that sets the castout
serialization and copies the data block to main storage.
An I/O operation is executed that copies the data block to DASD.
A SES-unlock operation is issued that releases the castout serialization.
Related changed data items are maintained in castout cla