| United States Patent | 5404483 |
| Inventor(s) | Stamm; Rebecca L. (Wellesley, MA);
Bahar; Ruth I. (Lincoln, NE);
Wade; Nicholas D. (Folsom, CA) |
| Abstract | A processor and method for delaying the processing of cache coherency
transactions during outstanding cache fills in a multi-processor system
using a shared memory. A first processor fetches data having a specified
address by addressing a cache memory, and when the specified address is
not in the cache, saving the specified address in a fill address memory,
and sending a fill request to the shared memory. Before return of fill
data, the first processor receives a cache coherency request including the
specified address from a second processor requesting invalidation of an
addressed block of data. The first processor responds by checking whether
the fill address memory includes the specified address, and upon finding
the specified address in the fill address memory, delaying execution of
the cache coherency request until the fill data is returned, and when the
fill data is returned, using the fill data without retaining a validated
block of the fill data in the cache. In a preferred embodiment, the fill
memory is a content-addressable memory including a plurality of entries,
and each entry has a fill address, an ownership fill bit (OREAD), an
ownership-read invalidate pending bit (OIP), and a read invalidate pending
bit (RIP). The OIP or RIP bit is set when execution of a cache coherency
request is delayed, and these bits are read upon completion of a fill to
execute the delayed request. |
| Title | Processor and method for delaying the processing of cache coherency
transactions during outstanding cache fills |
| Publication Date | April 4, 1995 |
| Filing Date | June 22, 1992 |
| Parent Case |
RELATED CASES
The present application is a continuation-in-part of Ser. No. 07/547,699,
filed Jun. 29, 1990, entitled BUS PROTOCOL FOR HIGH-PERFORMANCE PROCESSOR,
by Rebecca L. Stamm et al., now abandoned in favor of continuation
application Ser. No. 08/034,581, filed Mar. 22, 1993, entitled PROCESSOR
SYSTEM WITH WRITEBACK CACHE USING WRITEBACK AND NON WRITEBACK TRANSACTIONS
STORED IN SEPARATE QUEUES, by Rebecca L. Stamm, et al., issued on May 31,
1994, as U.S. Pat. No. 5,317,720, and Ser. No. 07/547,597, filed Jun. 29,
1990, entitled ERROR TRANSITION MODE FOR MULTI-PROCESSOR SYSTEM, by
Rebecca L. Stamm et al., issued on Oct. 13, 1992, as U.S. Pat. No.
5,155,843, incorporated herein by reference. The present application is
related to Stamm et al., "Preventing Access to Locked Memory Block By
Recording Lock in Content Addressable Memory with Outstanding Cache
Fills," Ser. No. 07/902,122, filed Jun. 22, 1992, concurrently with the
present application. |
U.S. References |
| Patent No. | Inventor | U.S. Class | Date |
| 5276852 | Callander | 711/143 | Jan. 1994 |
| 5265232 | Gannon | 711/124 | Nov. 1993 |
| 5249284 | Kass | 711/141 | Sep. 1993 |
| 5228136 | Shimizu | 711/141 | Jul. 1993 |
| 5226143 | Baird | 711/145 | Jul. 1993 |
| 5226144 | Moriwaki | 711/121 | Jul. 1993 |
| 5222224 | Flynn | 711/144 | Jun. 1993 |
| 5155843 | Stamm | 714/5 | Oct. 1992 |
| 5045996 | Barth | 711/143 | Sep. 1991 |
| 5016168 | Liu | 712/216 | May 1991 |
| 4875160 | Brown, III | 712/228 | Oct. 1989 |
| 4875155 | Iskiyan | 711/113 | Oct. 1989 |
| 4858116 | Gillett, Jr. | 711/155 | Aug. 1989 |
| 4858111 | Steps | 711/143 | Aug. 1989 |
| 4768148 | Keeley | 711/141 | Aug. 1988 |
| 4654819 | Stiffler | 711/162 | Mar. 1987 |
| 4622631 | Frank | 707/201 | Nov. 1986 |
| 4587610 | Rodman | 711/207 | May 1986 |
| 4527238 | Ryan | 711/123 | Jul. 1985 |
| 4502110 | Saito | 711/123 | Feb. 1985 |
| 4445174 | Fletcher | 711/121 | Apr. 1984 |
| 4410944 | Kronies | 711/147 | Oct. 1983 |
| 4197580 | Chang | 711/144 | Apr. 1980 |
| 4195340 | Joyce | 711/133 | Mar. 1980 |
| 4142234 | Bean | 711/144 | Feb. 1979 |
Claims  |
|
|
We claim:
1. A method of operating a first processor in a multi-processor digital
computer system having said first processor, a second processor, and a
system memory accessed by both of said first and second processors over a
system bus operating in accordance with a block ownership cache coherency
protocol, said first processor having a cache memory for storing blocks of
data in association with memory addresses, said method comprising the
steps of:
a) fetching data having a specified memory address for a data processing
operation by searching said cache memory for said specified memory
address, and when said specified memory address is not found in said cache
memory, storing the specified memory address in a content addressable
memory, and sending a fill data request including said specified memory
address to the system memory;
b) before receipt of fill data from the system memory,
i) receiving a cache coherency request from said second processor in
accordance with said block ownership cache coherency protocol, said cache
coherency request including said specified memory address and requesting
invalidation of a block of data having the specified memory address, and
ii) checking whether said specified memory address is stored in said
content addressable memory, delaying execution of said cache coherency
request until said fill data is received from said system memory; and
c) receiving said fill data from said system memory, and using said fill
data for said data processing without retaining a validated block of said
fill data in said cache memory.
2. The method as claimed in claim 1, wherein said data processing operation
includes the writing of data to said specified memory address, and wherein
said execution of said cache coherency request causes the data to be
written in accordance with said data processing operation back to said
system memory.
3. The method as claimed in claim 1, wherein said cache coherency request
is a read invalidate request requesting said first processor to relinquish
any exclusive privilege to write to said block of data having said
specified memory address, and wherein said method includes clearing an
indication of ownership associated with said block of said fill data in
said cache memory so that a block of said fill data validated for writing
is not retained in said cache memory, and writing said block of said fill
data in said cache memory back to said system memory.
4. The method as claimed in claim 3, wherein said data processing operation
includes the writing of data to said specified memory address, and wherein
said writing of said block of said fill data in said cache memory back to
said system memory writes the data written in accordance with said data
processing operation back to said system memory.
5. The method as claimed in claim 1, wherein said step of delaying
execution of said cache coherency request includes setting an indication
associated with said specified memory address in said content addressable
memory so as to indicate a delayed invalidate operation, and upon
receiving said fill data for said specified memory address from said
system memory, checking whether a delayed invalidate operation is
indicated in association with said specified memory address in said
content addressable memory, and upon finding that a delayed invalidate
operation is indicated, not retaining a validated block of said fill data
in said cache memory.
6. The method as claimed in claim 5, further comprising the step of loading
said fill data into a cache block in said cache memory, and invalidating
the cache block into which said fill data was loaded so as not to retain a
validated block of said fill data in said cache memory.
7. The method as claimed in claim 1, wherein said cache coherency request
is an ownership invalidate request requesting said first processor to
refrain from writing to or reading from said block of data having the
specified memory address, and wherein said method includes the step of
clearing an indication of validity associated with said block of said fill
data in said cache memory.
8. The method as claimed in claim 7, wherein said data processing operation
includes the writing of data to said specified memory address, said fill
data request includes a request for an exclusive privilege to write to
said block of data having said specified memory address, and wherein said
method further includes writing said block of said fill data in said cache
memory back to said system memory.
9. The method as claimed in claim 8, wherein said data processing operation
includes the writing of data to said specified memory address, and wherein
said writing of said block of said fill data in said cache memory back to
said system memory writes the data written in accordance with said data
processing operation back to said system memory.
10. A method of operating a first processor in a multi-processor digital
computer system having said first processor, a second processor, and a
system memory accessed by both of said first and second processors over a
system bus operating in accordance with a block ownership cache coherency
protocol, said first processor having a cache memory for storing blocks of
data and a memory address associated with each of said blocks of data,
said method comprising the steps of:
a) fetching data having a specified memory address for a data processing
operation by said first processor by searching said cache memory for said
specified memory address, and when said specified memory address is not
found in said cache memory, storing the specified memory address in an
entry of a content addressable memory having a plurality of entries, and
sending a fill data request including said specified memory address to
said system memory;
b) before receipt of fill data from the system memory,
i) receiving a cache coherency request from said second processor in
accordance with said block ownership cache coherency protocol, said cache
coherency request including said specified memory address, and
ii) addressing said content addressable memory with the specified memory
address of said cache coherency request, and upon finding that the
specified memory address of said cache coherency request is in said entry
of said content addressable memory, setting in said content addressable
memory an indication that said cache coherency request is pending for said
specified memory address; and
c) receiving said fill data from said system memory, using said fill data
for said data processing operation by said first processor, checking said
entry of said content addressable memory for said indication that said
cache coherency request is pending for said specified memory address,
executing said cache coherency request.
11. The method as claimed in claim 10, wherein said cache coherency request
is an ownership-read invalidate request requesting said first processor to
refrain from reading or writing to a block of data having said specified
memory address, said method includes storing said fill data in a cache
block in said cache memory, and wherein the execution of said cache
coherency request includes clearing an indication of validity associated
with said cache block in said cache memory.
12. The method as claimed in claim 10, wherein said data processing
operation includes the writing of data to said specified memory address,
said fill data request includes a request for an exclusive privilege to
write to said specified memory address, said method includes setting in
said entry of said content addressable memory an indication of said
request for an exclusive privilege to write to said specified memory
address, said cache coherency request is a read invalidate request
requesting said first processor to relinquish any exclusive privilege to
write to a block of data having said specified memory address, said step
(b) (ii) further includes checking whether said entry indicates said
request for an exclusive privilege to write to said specified memory
address, and wherein the setting in said content addressable memory of an
indication that said cache coherency request is pending for said specified
memory address is performed upon finding that said entry indicates said
request for an exclusive privilege to write to said specified memory
address.
13. The method as claimed in claim 10, wherein said data processing
operation includes the writing of data to said specified memory address,
said fill data request includes a request for an exclusive privilege to
write to said specified memory address, said method includes setting in
said entry of said content addressable memory an indication of said
request for an exclusive privilege to write to said specified memory
address, said cache coherency request is an ownership-read invalidate
request requesting said first processor to refrain from reading or writing
to a block of data having said specified memory address, and wherein step
(c) further includes checking whether said entry indicates said request
for an exclusive privilege to write to said specified memory address, and
upon finding that said entry indicates said request for an exclusive
privilege to write to said specified memory address, writing a block of
fill data including data written in accordance with said data processing
operation back to said system memory.
14. The method as claimed in claim 10, wherein said cache coherency request
is a read invalidate request requesting said first processor to relinquish
any exclusive privilege to write to a block of data having said specified
memory address, said method includes storing said fill data in a cache
block in said cache memory, and wherein the execution of said cache
coherency request includes clearing an indication of ownership associated
with said cache block in said cache memory storing said fill data, and
writing said cache block of fill data back to said system memory.
15. The method as claimed in claim 14, wherein said data processing
operation includes the writing of data to said specified memory address,
and wherein said writing of the cache block back to said system memory
writes the data written in accordance with said data processing operation
back to said system memory.
16. A processor for a multi-processor computer system, said multi-processor
computer system having a system bus for coupling processors to a system
memory, said system bus operating in accordance with a block ownership
cache coherency protocol, said processor comprising, in combination:
instruction decoding means for decoding computer program instructions to
generate requests for reading data at specified read addresses;
instruction execution means connected to said instruction decoding means for
executing the computer program instructions decoded by said instruction
decoding means to generate requests for writing data at specified write
addresses;
a cache memory for storing blocks of data, and in association with each
block of data, a memory address, an indication of whether each block is
valid for providing data from said memory address in response to said
requests for reading data, and an indication of whether each block is
valid for receiving data from said requests for writing data to said
memory address;
a content addressable memory including a plurality of entries and means for
storing in each entry a fill address of a fill request to a system memory
in said multi-processor system requesting fill data from said fill address
in said shared memory, an indication of whether the fill address is
associated with a request for validation for writing data to said fill
address, an indication of whether a read invalidate request was received,
before return of said fill data, from another processor in said
multi-processor system requesting invalidation of any indication that a
cache block having said fill address in said cache memory is valid for
receiving write data, and an indication of whether an ownership-read invalidate request was
received, before return of said fill data, from another processor in said
multi-processor system requesting invalidation of any indication that a
cache block having said fill address is valid for providing read data;
means, responsive to a request for reading data from a specified read
address, for addressing said cache memory with said read address, for
reading data from said cache memory when said cache memory contains a
cache block having said read address and indicated as valid for providing
read data, and when said cache memory does not contain a cache block
having said read address and indicated as valid for providing read data,
for sending a fill request to said main memory including said read address
and for storing said read address in said content addressable memory;
means, responsive to a request for writing data to a specified write
address, for writing data to said cache memory when said cache memory
contains a cache block having said write address and indicated as valid
for receiving write data, and when said cache memory does not contain a
cache block having said write address and indicated as valid for receiving
write data, for sending a fill request to said system memory including
said write address and a request for validation for a write operation, and
for storing in said content addressable memory said write address together
with an indication that the fill address is associated with a request for
validation for a write operation;
means, responsive to receiving from another processor in said
multi-processor system a read invalidate request having a specified read
invalidate address, for addressing said content addressable memory with
said specified read invalidate address, and when a fill address matching
said specified read invalidate address is found in said content
addressable memory, for setting the indication of whether a read
invalidate request was received from another processor in said
multi-processor system before return of said fill data;
means, responsive to receiving from another processor in said
multi-processor system an ownership-read invalidate request having a
specified ownership-read invalidate address, for addressing said content
addressable memory with said specified ownership-read invalidate address,
and when a fill address matching said specified ownership-read
invalidating address is found in said content addressable memory, for
setting the indication of whether an ownership-read invalidate request was
received from another processor in said multi-processor system before
return of said fill data;
first means, responsive to return of said fill data for checking said
indication in said content addressable memory of whether an ownership-read
invalidate request was received before return of said fill data, and when
an ownership-read invalidate request was received before return of said
fill data, for invalidating an indication that a cache block having the
fill address in said cache memory is valid for providing read data; and
second means, responsive to return of said fill data, for checking the
indication in said content addressable memory of whether a read invalidate
request was received before return of said fill data, and when a read
invalidate request was received before return of said fill data, for
invalidating an indication that a cache block having the fill address in
said cache memory is valid for receiving write data.
17. The processor as claimed in claim 16, wherein said means, responsive to
receiving from another processor in said multi-processing system a read
invalidate request, further includes means for checking whether the
matching fill address is associated with an indication of a request for
validation for a write operation, and wherein said means for setting the
indication of whether a read invalidate request was received does not set
said indication of whether a read invalidate request was received when the
matching fill address is not associated with an indication of a request
for validation for a write operation.
18. The processor as claimed in claim 16, wherein said first means
responsive to return of said fill data includes means for checking whether
the fill address of said fill data is associated with a request for
validation for a write operation, and when the fill address of said fill
data is associated with a request for validation of a write operation and
an ownership-read invalidate request was received before return of said
fill data, filling a cache block with said fill data and writing back to
said system memory data in the cache block having been filled with said
fill data.
19. The processor as claimed in claim 18, further including writing write
data of said write operation to said cache block having been filled with
said fill data before writing back to the system memory said data in the
cache block having been filled with said fill data. |
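
For illustration only, the fill bookkeeping recited in claims 10 through 19 can be sketched in Python. This is a simplified model under stated assumptions, not the claimed hardware: the names Processor, FillEntry, CacheLine, miss, snoop, and fill_return are hypothetical, a block is reduced to a single integer, and a coherency request that finds no outstanding fill is merely reported as not deferred rather than executed.

```python
from dataclasses import dataclass


@dataclass
class FillEntry:
    """One fill-CAM entry; bit names follow the abstract (OREAD, OIP, RIP)."""
    oread: bool = False  # fill also requested ownership (write privilege)
    oip: bool = False    # ownership-read invalidate pending
    rip: bool = False    # read invalidate pending


@dataclass
class CacheLine:
    data: int
    valid: bool  # block may supply read data
    owned: bool  # block may accept write data


class Processor:
    """Hypothetical model of the first processor's cache and fill CAM."""

    def __init__(self):
        self.cache = {}       # block address -> CacheLine
        self.fill_cam = {}    # block address -> FillEntry (outstanding fills)
        self.writebacks = []  # (address, data) pairs written back to system memory

    def miss(self, addr, for_write=False):
        # Cache miss: record the fill address and whether ownership was requested.
        self.fill_cam[addr] = FillEntry(oread=for_write)

    def snoop(self, addr, kind):
        """Handle a request from another processor; return True if it was deferred."""
        entry = self.fill_cam.get(addr)
        if entry is None:
            return False  # no outstanding fill for this address (not modeled further)
        if kind == "oread":
            entry.oip = True
            return True
        if kind == "read" and entry.oread:
            # Deferred only when the outstanding fill requested ownership.
            entry.rip = True
            return True
        return False

    def fill_return(self, addr, data, write_value=None):
        """Fill data arrives: use it, then execute any deferred request."""
        entry = self.fill_cam.pop(addr)
        if entry.oread and write_value is not None:
            data = write_value  # the local write that caused the miss completes first
        line = CacheLine(data=data, valid=True, owned=entry.oread)
        if entry.oread and (entry.oip or entry.rip):
            self.writebacks.append((addr, data))  # return the block to system memory
            line.owned = False
        if entry.oip:
            line.valid = False  # no validated copy is retained
        self.cache[addr] = line
        return data
```

In this sketch, a write miss followed by a deferred ownership-read invalidate still lets the local write complete when the fill returns; the block is then written back and left invalid, so no validated copy remains in the cache.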
Description  |
|
|
The present application is related to Stamm et al., "Preventing Access to
Locked Memory Block By Recording Lock in Content Addressable Memory with
Outstanding Cache Fills," Ser. No. 07/902,122, filed Jun. 22, 1992,
concurrently with the present application.
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention is directed to digital computers, and more particularly to
cache coherency transactions in a multi-processor system following a cache
ownership protocol. Specifically, the invention relates to maintaining
cache coherency in such a system having a "pended" bus permitting cache
coherency transactions to be transmitted among processors during
outstanding fills.
2. Description of the Background Art
Processors in a multi-processor computer system typically communicate via a
shared memory. To improve system performance, each processor has a cache
memory for temporarily storing copies of data being accessed. Such a
hierarchical memory system may follow either a "write-through" or a
"writeback" protocol. In a write-through protocol, a processor immediately
writes data to the shared memory so that any other processor may fetch the
most recent memory state from the shared memory. In a writeback
protocol, a processor writes data to its cache, but this new memory state
is written back to the shared memory only when the memory space in the
cache needs to be used for different addresses in a cache fill operation,
or when another processor needs the new memory state. The writeback
protocol therefore reduces the number of accesses to the shared memory
when the new memory state is not needed by the other processors. In
general, the write-through protocol is preferred when the different
processors frequently access the same shared memory addresses, and the
writeback protocol is preferred when they access the same shared memory
addresses infrequently.
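
The difference in shared-memory traffic between the two protocols can be illustrated with a small Python sketch. The class names SharedMemory, WriteThroughCache, and WritebackCache, and the function count_memory_writes, are hypothetical and chosen only for this example; eviction and reads are not modeled.

```python
class SharedMemory:
    """Counts how many write operations reach the shared memory."""

    def __init__(self):
        self.data = {}
        self.writes = 0

    def write(self, addr, value):
        self.data[addr] = value
        self.writes += 1


class WriteThroughCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory.write(addr, value)  # every store goes to shared memory at once


class WritebackCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)  # shared memory is now stale for this address

    def write_back(self, addr):
        # Performed on replacement, or when another processor needs the data.
        if addr in self.dirty:
            self.memory.write(addr, self.lines[addr])
            self.dirty.discard(addr)


def count_memory_writes(n_stores):
    """Store n_stores times to one address under each protocol."""
    wt_mem, wb_mem = SharedMemory(), SharedMemory()
    wt, wb = WriteThroughCache(wt_mem), WritebackCache(wb_mem)
    for value in range(n_stores):
        wt.write(0x100, value)
        wb.write(0x100, value)
    wb.write_back(0x100)
    return wt_mem.writes, wb_mem.writes
```

Under these assumptions, ten stores to one address cost ten shared-memory writes with write-through and a single write with writeback, which is the saving described above when no other processor needs the intermediate values.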
Whenever processors communicate via a shared memory, it is desirable to
require the processors to follow a protocol ensuring that a memory address
is not written to simultaneously by more than one processor, or else the
result of one processor will be nullified by the result of another
processor. Such synchronization of memory access is commonly achieved by
requiring a processor to obtain an exclusive privilege to write to an
addressed portion of the shared memory, before executing a write
operation. In a multi-processor system employing writeback caches, such an
exclusive privilege gives rise to a cache coherency problem in which data
written in the cache of a processor having such an exclusive privilege
might be the only valid copy of data for the addressed portion of memory.
A cache coherency protocol is required which permits a processor to obtain
readily the valid copy of data as well as the privilege to write to it.
One known cache coherency protocol for a multi-processor system employing
writeback caches is based on the concept of block ownership; an addressed
portion of memory the size of a cache block is either owned by the shared
memory or it is owned by one of the writeback caches. Only one of the
processors, or the shared memory, may own the block of memory at any given
time, and this ownership is indicated by an ownership bit for each block
in the shared memory and in each of the caches. A processor may write to a
block only when the processor owns the block. Therefore the ownership bits
always identify a unique "valid" block in the system. Shared read-only
access to a block is permitted only when the shared memory owns the block.
To indicate whether a processor may read a block, each of the caches
includes, for each block, a "valid" bit. When a processor desires to read
a block that is not valid in its cache, it issues a read transaction to
the shared memory, requesting the shared memory to fill its cache with
valid data. When a processor desires to write to a block which it does not
own, it issues an ownership-read transaction to the shared memory,
requesting ownership as well as a fill. From the perspective of the other
processors, these transactions are cache coherency transactions, which
request any other processor having ownership to give up ownership and
write back the data of the requested block, and, in the case of an
ownership-read transaction, further request the other processors to
invalidate any copies of the requested block.
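
The block ownership rules just described can be sketched in Python as follows. The sketch assumes that each transaction completes atomically (that is, no pended bus), that a block is a single value, and that unwritten memory reads as 0; the names System, Cache, read, and write are hypothetical.

```python
class Cache:
    def __init__(self):
        self.data = {}      # block address -> value
        self.valid = set()  # blocks this processor may read
        self.owned = set()  # blocks this processor may write


class System:
    def __init__(self, n_procs):
        self.memory = {}
        self.caches = [Cache() for _ in range(n_procs)]

    def _coherency(self, requester, addr, invalidate):
        # How the other processors see a read or ownership-read transaction.
        for pid, cache in enumerate(self.caches):
            if pid == requester:
                continue
            if addr in cache.owned:
                self.memory[addr] = cache.data[addr]  # write back the valid copy
                cache.owned.discard(addr)             # give up ownership
            if invalidate:
                cache.valid.discard(addr)             # ownership-read only

    def read(self, pid, addr):
        cache = self.caches[pid]
        if addr not in cache.valid:
            # Read transaction: fill the cache with valid data.
            self._coherency(pid, addr, invalidate=False)
            cache.data[addr] = self.memory.get(addr, 0)
            cache.valid.add(addr)
        return cache.data[addr]

    def write(self, pid, addr, value):
        cache = self.caches[pid]
        if addr not in cache.owned:
            # Ownership-read transaction: obtain ownership as well as a fill.
            self._coherency(pid, addr, invalidate=True)
            cache.data[addr] = self.memory.get(addr, 0)
            cache.valid.add(addr)
            cache.owned.add(addr)
        cache.data[addr] = value
```

For example, after processor 0 writes a block, the shared memory is stale until processor 1 reads the block; that read forces processor 0 to write the block back and give up ownership while keeping a readable copy.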
Typically the time for a cache coherency transaction to be transmitted over
a system bus is much shorter than the time for fill data to be retrieved
from the shared memory. Therefore system performance can be improved by
use of a pended bus (i.e., a bus which permits more than one transaction
to be pending on the bus at any given time). The use of such a "pended"
bus, however, leads to a problem of data coherency where a processor may
issue a read transaction to fill a cache block, but before receiving the
fill data