WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Processor and method for preventing access to a locked memory block by recording a lock in a content addressable memory with outstanding cache fills    
United States Patent5404482   
Link to this pagehttp://www.wikipatents.com/5404482.html
Inventor(s)Stamm; Rebecca L. (Wellesley, MA); Wade; Nicholas D. (Folsom, CA)
AbstractA processor and method for preventing access to a locked memory block in a multiprocessor computer system. The processor has a cache memory and records a memory lock in a content-addressable memory separate from the cache memory. Preferably, outstanding cache fills are recorded in the same content addressable memory as memory locks, and a memory lock or an outstanding cache fill delays the execution of a cache coherency request upon the same memory block. When a cache coherency request is received from another processor, the address of the cache coherency request is compared to addresses stored in the content addressable memory, and when there is a match, a bit in the matching entry is set to indicate a delayed request that is executed after the lock is unlocked or the cache is refilled. In a specific embodiment, a memory lock or an outstanding cache fill also stalls a processor read or write to the same memory block.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 5404482
Processor and method for preventing access to a locked memory block by

     recording a lock in a content addressable memory with outstanding cache

     fills - US Patent 5404482 Drawing
Processor and method for preventing access to a locked memory block by recording a lock in a content addressable memory with outstanding cache fills
Inventor     Stamm; Rebecca L. (Wellesley, MA); Wade; Nicholas D. (Folsom, CA)
Owner/Assignee     Digital Equipment Corporation (Maynard, MA)
Patent assignment
All assignments
Publication Date     April 4, 1995
Application Number     07/902,122
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     June 22, 1992
US Classification     711/145 711/108
Int'l Classification     G06F 012/00 G06F 013/00
Examiner     Gossage; Glenn
Assistant Examiner    
Attorney/Law Firm     Maloney; Denis G. Fisher; Arthur W. ,
Address
Parent Case     RELATED CASES The present application is a continuation-in-part of Ser. No. 07/547,699, filed Jun. 29, 1990, entitled BUS PROTOCOL FOR HIGH-PERFORMANCE PROCESSOR, by Rebecca L. Stamm et al., now abandoned in favor of continuation application Ser. No. 08/034,581, filed Mar. 22, 1993, entitled PROCESSOR SYSTEM WITH WRITEBACK CACHE USING WRITEBACK AND NON WRITEBACK TRANSACTIONS STORED IN SEPARATE QUEUES, by Rebecca L. Stamm, et al., issued on May 31, 1994, as U.S. Pat. No. 5,317,720, and Ser. No. 07/547,597, filed Jun. 29, 1990, entitled ERROR TRANSITION MODE FOR MULTI-PROCESSOR SYSTEM, by Rebecca L. Stamm et al., issued on Oct. 13, 1992, as U.S. Pat. No. 5,155,843, incorporated herein by reference. The present application is related to Ser. No. 07/902,156, filed Jun. 22, 1992, concurrently herewith, entitled DELAYING THE PROCESSING OF CACHE COHERENCY TRANSACTIONS DURING OUTSTANDING CACHE FILLS, by Rebecca L. Stamm, Ruth I. Bahar, and Nicholas D. Wade.
Priority Data    
USPTO Field of Search     395/425 395/400 395/725
Patent Tags     processor preventing access locked memory block by recording lock content addressable memory outstanding cache fills
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
5276847
Kohn
711/163
Jan,1994

[0 after 0 votes]
5155843
Stamm
714/5
Oct,1992

[0 after 0 votes]
5148536
Witek
711/140
Sep,1992

[0 after 0 votes]
5142676
Fried
711/152
Aug,1992

[0 after 0 votes]
4984153
Kregness
711/152
Jan,1991

[0 after 0 votes]
4977498
Rastegar
711/128
Dec,1990

[0 after 0 votes]
4875160
Brown, III
712/228
Oct,1989

[0 after 0 votes]
4875155
Iskiyan
711/113
Oct,1989

[0 after 0 votes]
4858116
Gillett, Jr.
711/155
Aug,1989

[0 after 0 votes]
4858111
Steps
711/143
Aug,1989

[0 after 0 votes]
4768148
Keeley
711/141
Aug,1988

[0 after 0 votes]
4654819
Stiffler
711/162
Mar,1987

[0 after 0 votes]
4622631
Frank
707/201
Nov,1986

[0 after 0 votes]
4587610
Rodman
711/207
May,1986

[0 after 0 votes]
4561051
Rodman
711/152
Dec,1985

[0 after 0 votes]
4527238
Ryan
711/123
Jul,1985

[0 after 0 votes]
4502110
Saito
711/123
Feb,1985

[0 after 0 votes]
4445174
Fletcher
711/121
Apr,1984

[0 after 0 votes]
4410944
Kronies
711/147
Oct,1983

[0 after 0 votes]
4197580
Chang
711/144
Apr,1980

[0 after 0 votes]
4195340
Joyce
711/133
Mar,1980

[0 after 0 votes]
4142234
Bean
711/144
Feb,1979

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


We claim:

1. A method of operating a digital computer to place a memory lock upon a memory lock address and to release said memory lock, said digital computer having a cache memory including cache block entries for storing blocks of data in association with memory addresses, said digital computer also having a system bus operating in accordance with a block ownership cache coherency protocol, said method comprising the steps of:

performing execution of instructions by reading source operands of said instructions from said cache memory when said source operands are stored in said cache memory, and when said source operands are not stored in said cache memory, filling said cache memory with fill data from a system memory, and writing destination operands in said cache memory; and

when said execution of instructions determines that said memory lock address is specified:

a) storing said memory lock address in an entry of a content addressable memory separate and distinct from said cache memory, and

b) upon receiving from another digital computer a cache coherency request in accordance with said block ownership cache coherency protocol, specifying a cache coherency request address, addressing said content addressable memory with said cache coherency request address to find whether there is a match between said cache coherency request address and said memory lock address, and when a match is found, delaying execution of said cache coherency request until said execution of instructions causes a release of said memory lock upon said memory lock address.

2. The method as claimed in claim 1, wherein said content addressable memory has a small number of entries and said cache memory has a large number of cache block entries in comparison to said small number of entries, and wherein said execution of instructions is temporarily stalled upon finding that said memory lock address is specified when all of said entries in said content addressable memory are filled with valid addresses.

3. The method as claimed in claim 1, wherein execution of said cache coherency request is delayed by setting an indication of said cache coherency request in said entry of said content addressable memory, and when the execution of instructions releases said memory lock, reading said indication in said entry of said content addressable memory, and upon finding that said entry of said content addressable memory includes the indication of said cache coherency request, executing said cache coherency request.

4. The method as claimed in claim 1, further comprising the step of removing said entry of said content addressable memory from said content addressable memory when said execution of instructions releases said memory lock.

5. The method as claimed in claim 4, wherein said entry of said content addressable memory is removed by setting an indication in said entry of said content addressable memory to indicate that said entry of said content addressable memory is invalid.

6. The method as claimed in claim 1, wherein said method includes storing in another entry of said content addressable memory an address in said system memory from which said fill data is being retrieved.

7. The method as claimed in claim 6, wherein said method further includes setting in each entry of said content addressable memory an indication of whether there is a memory lock upon an address stored in that entry.

8. The method as claimed in claim 7, wherein said method further includes setting in each entry of said content addressable memory an indication of whether fill data is being retrieved from said main memory at an address stored in that entry.

9. The method as claimed in claim 8, wherein said method further includes setting in each entry of said content addressable memory an indication of whether that entry is valid.

10. A method of operating a digital computer to place a memory lock upon a memory lock address and to release said memory lock, said digital computer having a cache memory including cache block entries for storing blocks of data in association with memory addresses, said digital computer also having a system bus operating in accordance with a block ownership cache coherency protocol, said method comprising the steps of:

performing execution of instructions by reading source operands of said instructions from said cache memory when said source operands are stored in said cache memory, and when said source operands are not stored in said cache memory, storing, in an entry of a content addressable memory separate and distinct from said cache memory, an address of a system memory from which fill data is retrieved including said source operand, and also setting in said entry of a content addressable memory an indication that a fill operation is in progress for said entry of a content addressable memory, retrieving said fill data from said system memory, filling said cache memory with said fill data from said system memory, and invalidating said entry of a content addressable memory, and writing destination operands in said cache memory; and

when said execution of instructions determines that said memory lock address is specified, storing said memory lock address in another entry of said content addressable memory, and

when said execution of instructions determines that release of said memory lock is specified, invalidating said another entry, and processing a cache coherency request from a digital computer to a specified memory address by addressing said content addressable memory with said specified memory address to find whether said content addressable memory includes a valid entry containing an address matching said specified address, and upon finding a valid entry containing an address matching said specified address, delaying said cache coherency request until said valid entry is invalidated.

11. The method as claimed in claim 10, wherein execution of said cache coherency request is delayed by setting an indication of said cache coherency request in said entry of said content addressable memory, and when said valid entry is being invalidated, reading the indication in said entry of said content addressable memory, and upon finding that said entry of said content addressable memory includes the indication of said cache coherency request, executing said cache coherency request.

12. The method as claimed in claim 10, wherein said content addressable memory has a small number of entries and said cache memory has a large number of cache block entries in comparison to said small number of entries, and wherein said execution of instructions is temporarily stalled upon finding that said memory lock address is specified when all of said entries in said content addressable memory are filled with valid addresses.

13. The method as claimed in claim 10, wherein said method further includes setting in each valid entry of said content addressable memory an indication of whether there is a memory lock upon an address stored in that entry.

14. The method as claimed in claim 10, wherein said method further includes setting in each entry of said content addressable memory an indication of whether fill data is being retrieved from said system memory at an address stored in that entry.

15. A processor for a multi-processor computer system, said multi-processor computer system having system bus for coupling processors to a system memory, said system bus operating in accordance with a block ownership cache coherency protocol, said processor comprising, in combination:

instruction decoding means for decoding computer program instructions to generate requests for reading data at specified read addresses;

instruction execution means connected to said instruction decoding means for executing the computer program instructions decoded by said instruction decoding means to generate requests for writing data at specified write addresses;

means, coupled to said instruction decoding means and to said instruction execution means, for generating a memory lock request for a memory lock upon a specified memory lock address, and for generating a corresponding memory unlock request;

a cache memory for storing blocks of data, and in association with each block of data, a memory address, an indication of whether each block is valid for providing data from said memory address in response to said requests for reading data, and an indication of whether each block is valid for receiving data from said requests for writing data to said memory address;

a content addressable memory including a plurality of entries and means for storing in each entry said memory lock address together with an indication of a read lock in progress or a fill address of a fill request to a shared memory in said multi-processor system requesting fill data from said fill address in said shared memory, an indication of whether the fill request is in progress, an indication of whether the fill address is associated with a request for validation for writing data to said fill address, an indication of whether a read invalidate request was received, before return of said fill data, from another processor in said multi-processor system requesting invalidation of any indication that a cache block having said fill address in said cache memory is valid for receiving write data, and an indication of whether an ownership-read invalidate request was received, before return of said fill data, from another processor in said multi-processor system requesting invalidation of any indication that a cache block having said fill address is valid for providing read data,

means, responsive to a request for reading data from a specified read address, for addressing said cache memory with said read address, for reading data from said cache memory when said cache memory contains a cache block having said read address and indicated as valid for providing read data, and when said cache memory does not contain a cache block having said read address and indicated as valid for providing read data, for sending a fill request to said system memory including said read address and for storing said read address in said content addressable memory,

means, responsive to a request for writing data to a specified write address, for writing data to said cache memory when said cache memory contains a cache block having said write address and indicated as valid for receiving write data, and when said cache memory does not contain a cache block having said write address and indicated as valid for receiving write data, for sending a fill request to said system memory including said write address and a request for validation for a write operation, and for storing in said content addressable memory said write address together with an indication that the fill address is associated with a request for validate for a write operation;

means, responsive to receiving from another processor in said multi-processor system a read invalidate request having a specified read invalidate address, for addressing said content addressable memory with said specified read invalid data address, and when a fill address matching said specified read invalid address is found in said content addressable memory, for setting said indication of whether a read invalidate request was received from another processor in said multi-processor system before release of said memory lock or return of said fill data;

means, responsive to receiving from another processor in said multi-processor system an ownership-read invalidate request having a specified ownership-read invalidate address, for addressing said content addressable memory with said specified ownership-read invalidate address, and when a fill address matching said specified ownership-read invalidating address is found in said content addressable memory, for setting said indication of whether an ownership-read invalidate request was received from another processor in said multi-processor system before release of said memory lock or return of said fill data;

first means, responsive to return of said fill data or release of said memory lock, for checking said indication in a corresponding entry of said content addressable memory of whether an ownership-read invalidate request was received during said memory lock or before return of said fill data, and when an ownership-read invalidate request was received during said memory lock or before return of said fill data, for invalidating an indication that a cache block having the fill address in said cache memory is valid for providing read data;

second means, responsive to return of said fill data or release of said memory lock, for checking said indication in a corresponding entry of said content addressable memory of whether a read invalidate request was received during said memory lock or before return of said fill data, and when a read invalidate request was received during said memory lock or before return of said fill data, for invalidating an indication that a cache block having the fill address in said cache memory is valid for receiving write data.
 Description Submit all comments and votes
 


BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention is directed to digital computers, and more particularly to a multi-processor system in which the processors share a common memory, but in which a processor may obtain an exclusive right to access a respective block of memory. The invention specifically relates to such a multi-processor system in which each processor has a cache memory and follows a cache coherency protocol.

2. Description of the Background Art

Processors in a multi-processor computer system typically communicate via a shared memory. To improve system performance, each processor has a cache memory for temporarily storing copies of data being accessed. Such a hierarchical memory system may follow either a "write through" or a "write back" protocol. In a "write through" protocol, a processor immediately writes data to the shared memory so that any other processor may fetch the most recent memory state from the shared memory. In a "writeback" protocol, a processor writes data to its cache, but this new memory state is written back to the shared memory only when the memory space in the cache needs to be used for different addresses in a cache fill operation, or when another processor needs the new memory state. Therefore, the writeback protocol reduces the number of memory access operations to the shared memory when the new memory state is not needed by the other processors. In general, the write through protocol is preferred when the processors frequently access the same memory addresses, and the write back protocol is preferred when the processors infrequently access the same memory addresses.

Whenever processors communicate via a shared memory, it is desirable to require the processors to follow a protocol insuring that a memory address is not written to simultaneously by more than one processor, or else the result of one processor will be nullified by the result of another processor. Such synchronization of memory access is commonly achieved by requiring a processor to obtain an exclusive privilege to write to an addressed portion of the shared memory, before executing a write operation. In a multi-processor system employing writeback caches, such an exclusive privilege gives rise to a cache coherency problem in which data written in the cache of a processor having such an exclusive privilege might be the only valid copy of data for the addressed portion of memory. A cache coherency protocol is required which permits a processor to obtain readily the valid copy of data as well as the privilege to write to it.

One known cache coherency protocol for a multi-processor system employing writeback caches is based on the concept of block ownership; an addressed portion of memory the size of a cache block is either owned by the shared memory or it is owned by one of the writeback caches. Only one of the processors, or the shared memory, may own the block of memory at any given time, and this ownership is indicated by an ownership bit for each block in the shared memory and in each of the caches. A processor may write to a block only when the processor owns the block. Therefore the ownership bits always identify a unique "valid" block in the system. Shared read-only access to a block is permitted only when the shared memory owns the block. To indicate whether a processor may read a block, each of the caches includes, for each block, a "valid" bit. When a processor desires to read a block that is not valid in its cache, it issues a read transaction to the shared memory, requesting the shared memory to fill its cache with valid data. When a processor desires to write to a block which it does not own, it issues an ownership-read transaction to the shared memory, requesting ownership as well as a fill. From the perspective of the other processors, these transactions are cache coherency transactions, which request any other processor having ownership to give up ownership and writeback the data of the requested block, and in the case of an ownership read transaction, further request the other processors to invalidate any copies of the requested block.

The architecture of the instruction set for a digital computer typically includes certain "interlocked" instructions that are guaranteed to perform atomic operations upon memory in a multi-processing environment. Such "interlocked" instructions, for example, include operands having an access type of "modify", wherein an operand is first read from memory as a source operand, modified, and written back to memory as a destination operand. In a multi-processor system, the access type of "modify" raises the possibility that another CPU might access memory between the time that an operand is read from memory and the time that the operand is modified and written back to memory, leading to a result in memory which might not appear consistent under certain program sequences. To prevent this problem, interlocked instructions have been executed in a multi-processor environment by using the execution unit to request fetching of the operand and to request a memory "read lock" when fetching the second operand from memory, and to request a memory "write unlock" when putting the result back to memory.

Typically, the time for a cache coherency transaction to be transmitted over a system bus is much shorter than the time for fill data to be retrieved from the shared memory. Therefore system performance can be improved by use of a pended bus (i.e., a bus which permits more than one transaction to be pending on the bus at any given time). The use of such a "pended" bus, however, raises the possibility of receiving a number of cache coherency transactions when a cache fill is outstanding. To obtain the improvement in performance of the pended bus, the cache coherency transactions should be executed as soon as possible, but the cache coherency transactions may conflict with the outstanding fill as well as an outstanding lock.

An outstanding lock should be removed as quickly as possible, because it prevents other processors from gaining ownership to the locked memory block. In this regard, a lock is a kind of ownership that is released upon affirmative action by the processor asserting the lock, in contrast to the typical ownership of a cache consistency protocol that releases ownership upon the request of another processor without any affirmative action by the original owner. The requirement of affirmative action by a processor to release its lock increases the likelihood of deadlock in the system, and therefore locking is not used where ownership would suffice.

Although the locking of a memory block could be recorded in the cache in a "lock bit" for each cache block in a fashion analogous to the ownership bits, this technique would increase the size of the cache memory and would burden the cache with additional access cycles for setting and clearing the lock bits.

SUMMARY OF THE INVENTION

In accordance with a basic aspect of the present invention, a processor has a cache memory and records a memory lock in a content-addressable memory (CAM) separate from the cache memory. The use of locks is disfavored, and therefore the number of outstanding locks need be only a fraction of the number of blocks in the cache memory.

Preferably outstanding cache fills are recorded in the same content addressable memory as the memory locks, and a memory lock or an outstanding cache fill delays the execution of a cache coherency request upon the same memory block. Therefore the same memory hardware and the same control logic is used for delaying execution of a cache coherency request during a memory lock or outstanding cache fill.

In a specific embodiment, a memory lock or an outstanding cache fill also stalls a processor read or write to the same memory block (except a special kind of write operation for unlocking the lock is not stalled) so that the same control logic is also used for stalling a read or write operation for a memory lock or an outstanding fill.

In a preferred control sequence, a specified memory block is locked and unlocked by a READ LOCK--WRITE UNLOCK command sequence originated by an instruction execution unit in the processor. Only one READ LOCK--WRITE UNLOCK sequence is outstanding at any time. Upon receipt of the READ LOCK, an address specifying the memory block is stored in an entry in the content addressable memory, regardless of whether the READ LOCK hits in the cache. In the entry, the following control bits are set: RDLK, to indicate that a READ LOCK is in progress; OREAD, to indicate that the READ LOCK is an ownership-read type of transaction; TO MBOX, if returning fill data is to be sent to the execution unit; and VALID, to indicate that the entry is currently valid. While the READ LOCK is in progress and before the WRITE UNLOCK is done, a cache coherency request may arrive from another processor. Such a request may eventually result in either an invalidation of a cache block or a deallocation of an owned cache block. Such a result must be prevented so long as the READ LOCK is in progress upon the cache block referenced by the cache coherency request. When such a request is received, its address is compared to any valid address in the content addressable memory, including any READ LOCK address or any fill address. If the comparison indicates a match, then either RIP (Read Invalidate Pending) or OIP (Oread Invalidate Pending) is set in the entry having the matching address, so that execution of the cache coherency request is deferred until the entry is removed from the fill CAM. If a fill CAM entry is for a READ LOCK, then the entry is not removed from the fill CAM until the corresponding WRITE UNLOCK is executed. Therefore, a cache coherency request is deferred by the READ LOCK is not executed until the corresponding WRITE UNLOCK is executed.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as other features and advantages thereof, will be best understood by reference to the detailed description of a specific embodiment, when read in conjunction with the accompanying drawings wherein:

FIG. 1 is a block diagram of a multi-processor computer system incorporating the present invention;

FIG. 2 is a block diagram of a primary cache memory of the CPU of FIG. 1;

FIG. 3 is a diagram of the format of data stored in the primary cache of FIG. 2;

FIG. 4 is a block diagram of a writeback cache controller used in the processors of the computer system of FIG. 1;

FIG. 5 is a timing diagram of events occurring on the CPU bus in the system of FIG. 1;

FIG. 6 is a schematic diagram of the conductors used in the CPU bus in the system of FIG. 1;

FIG. 7 is a block diagram of the bus interface and arbiter unit of the computer system of FIG. 1;

FIG. 8 is a block diagram of the invalidate queue and the return queue in the bus interface and arbiter unit of FIG. 7;

FIG. 9 is a schematic diagram showing internal organization of the invalidate queue and the return queue in the bus interface and arbiter unit of FIG. 7.

FIG. 10 is a schematic diagram of a write queue used in the write back cache controller of FIG. 4;

FIG. 11 is a block diagram showing a format of data stored in an entry in the write queue of FIG. 10;

FIG. 12 is a schematic diagram of logic for generating a signal indicating when the write queue of FIGS. 11-12 is full;

FIG. 13 is a schematic diagram of logic for generating a signal indicating a write-read conflict of a data stream read request with preceding writes