WikiPatents - Community Patent Review
Methods and apparatus for creating a pending write-back controller for a cache controller on a packet switched memory bus employing dual directories    

United States Patent 5434993
Link to this page: http://www.wikipatents.com/5434993.html
Inventor(s): Liencres; Bjorn (Palo Alto, CA); Lee; Douglas (San Francisco, CA); Sindhu; Pradeep S. (Mountain View, CA); Pham; Tung (San Jose, CA)

Abstract

A write-back cache control system having a pending write-back cache controller in a multiprocessor cache memory structure. The processor subsystems in the multiprocessor system are coupled together using a high-speed synchronous packet switching bus called a memory bus. Each processor subsystem has an associated cache control system. When a processor's cache control system does not have a required memory location in the cache memory, it broadcasts a memory request packet across the memory bus for the required data. If an owned cache line is being replaced, the cache control system copies the old cache line data to the pending write-back cache controller which is responsible for the write-backs of owned cache lines to main memory. The cache control system then transfers ownership of the old replaced cache line to the pending write-back controller. When the cache control system receives the new cache line information from the memory bus, it immediately replaces the cache line and allows the processing to continue. By buffering the old cache line in the pending write-back controller, the cache control system allows the new cache line to be requested before the old cache line is written back to main memory thereby reducing the cache line replacement latency period.

Owner/Assignee: Sun Microsystems, Inc. (Mountain View, CA); Xerox Corporation (Stamford, CT)
Publication Date: July 18, 1995
Application Number: 07/973,309
Filing Date: November 9, 1992
Examiner: Rudolph; Rebecca L.
Assistant Examiner: Kim; Matthew M.
Attorney/Law Firm: Blakely Sokoloff Taylor & Zafman

References

U.S. References

Patent No.   Inventor     U.S. Class   Date
5355471      Weight       714/10       Oct. 1994
5353424      Partovi      711/128      Oct. 1994
5347648      Stamm        714/5        Sep. 1994
5325504      Tipley       711/128      Jun. 1994
5313609      Baylor       711/121      May 1994
5303362      Butts, Jr.   711/121      Apr. 1994
5265235      Sindhu       711/120      Nov. 1993
5265233      Frailong     711/118      Nov. 1993
5263144      Zurawski     711/121      Nov. 1993
5247648      Watkins      711/143      Sep. 1993

Claims
 


We claim:

1. A computer system including a first processor subsystem and a main memory, said first processor subsystem and main memory coupled via a packet-switched memory bus, said first processor subsystem having a write-back cache memory system, said write-back cache memory system comprising:

a cache memory, said cache memory comprising more than one cache line, each of said cache lines storing information;

a cache directory, said cache directory describing said information stored in said cache memory, said cache directory comprising an address tag and a plurality of status bits for each cache line in said cache memory, said plurality of status bits including a valid bit and an owned bit;

a cache control system, said cache control system coupled to said cache memory and said cache directory, said cache control system further coupled to said packet-switched memory bus, said cache control system maintaining said cache directory;

a pending write-back controller, said pending write-back controller coupled to said cache control system, said pending write-back controller comprising:

a data memory for buffering a plurality of owned cache lines, each of said owned cache lines having a corresponding main memory address;

a write-back address tag and a plurality of status bits for each of said plurality of owned cache lines stored in the data memory, said write-back address tag defining a main memory address of said owned cache line;

a write-back memory, said write-back memory storing an encoded list of owned cache lines stored in said data memory which must be written back to said main memory; and

a pending write-back control logic unit for controlling said pending write-back controller such that each owned cache line listed in said encoded list in said write-back memory is written back to said main memory after each owned cache line is replaced.

2. The computer system including write-back cache memory system as claimed in claim 1 wherein said pending write-back controller further comprises:

a read request memory, said read request memory storing a list of read requests received from a second processor subsystem coupled to said packet-switched memory bus, said read requests requesting the owned cache lines in said data memory; and

said pending write-back control logic unit sending a read request reply to said second processor subsystem for each read request stored in said read request memory.

3. The write-back cache memory system as claimed in claim 2 wherein said plurality of status bits in said pending write-back controller comprise a valid bit and an owned bit.

4. The write-back cache memory system as claimed in claim 3 wherein said pending write-back control logic unit further comprises means for resetting the owned status bit for an owned cache line stored in said data memory when a write transaction occurs on said packet-switched memory bus to the corresponding main memory address of the owned cache line such that said pending write-back controller does not write-back said owned cache line.

5. The write-back cache memory system as claimed in claim 4 wherein said write-back memory in said pending write-back control logic unit comprises a first-in-first-out (FIFO) memory.

6. The write-back cache memory system as claimed in claim 5 wherein said read request memory in said pending write-back control logic unit comprises a first-in-first-out (FIFO) memory.

7. The write-back cache memory system as claimed in claim 6 wherein said cache control system comprises:

a bus cache controller, said bus cache controller interfacing said cache control system to the packet-switched memory bus, said bus cache controller having a first cache directory;

a processor cache controller, said processor cache controller interfacing said cache memory to a first processor in said first processor subsystem, said processor cache controller having a second cache directory; and

a packet-switched cache bus, said packet-switched cache bus coupling said bus cache controller and said processor cache controller.

8. The write-back cache memory system as claimed in claim 7 wherein each of said cache lines are further subdivided into M subblocks, each of said subblocks having separate status bits, each of said subblocks of said cache lines handled as individual memory blocks.

9. In a computer system including a first processor subsystem and a main memory, said first processor subsystem and main memory coupled via a packet-switched memory bus, a write-back cache memory system in said first processor subsystem, said write-back cache memory system comprising:

a cache memory, said cache memory comprising more than one cache line, each of said cache lines storing lines of information, each of said lines of information having a corresponding main memory address;

a cache directory, said cache directory storing an address tag and a set of status bits for each cache line in said cache memory, said address tag being the corresponding main memory address for said cache line;

a write-back cache control system, said write-back cache control system maintaining said cache memory and said cache directory;

means for requesting a new line of information for a cache line storing an owned line in said cache memory over said packet-switched memory bus;

means for copying the owned line of information from the cache line in the cache memory to a pending write-back controller, said pending write-back controller comprising

means for buffering said owned line of information from said cache memory;

means for storing a write-back address tag and a plurality of status bits for said owned line of information, said write-back address tag being the corresponding main memory address of said owned line of information;

means for writing back said owned line of information to said main memory when no read requests for said owned line of information are pending;

means for replacing said owned line of information with said requested new line of information in the cache line of the cache memory; and

means for writing back said owned line of information in the pending write-back controller to said main memory.

10. The write-back cache memory system as claimed in claim 9 wherein said pending write-back controller further comprises:

means for receiving and storing at least one read request from a second processor subsystem, said read request requesting the owned line of information;

means for replying to said read request from said second processor subsystem with a read request reply to said second processor subsystem, said read request reply containing said owned line of information.

11. The write-back cache memory system as claimed in claim 10 wherein said plurality of status bits in said pending write-back controller comprise a valid bit and an owned bit.

12. The write-back cache memory system as claimed in claim 11 wherein said pending write-back controller further comprises:

means for receiving a plurality of write requests, said write requests writing to the main memory address of the owned line of information;

means for responding to said write requests by resetting the owned bit of the owned line of information such that said owned line of information is not written back to main memory.

13. The write-back cache memory system as claimed in claim 12 wherein each of said cache lines are further subdivided into M subblocks, each of said subblocks of said cache lines handled as individual memory blocks.

14. A method for a computer system including a first processor subsystem and a main memory coupled via a packet-switched memory bus, said first processor subsystem having a write-back cache memory system comprising a cache memory having more than one cache line, a method of replacing the cache lines in the first processor subsystem cache memory, said method for computer system comprising the steps of:

requesting a new line of information for a first cache line in said first processor subsystem cache memory over said packet-switched memory bus;

transferring an owned line of information from the first cache line of said processor subsystem cache memory to a pending write-back controller, said pending write-back controller buffering said owned line of information;

receiving said requested new line of information into the first cache line of said first processor subsystem cache memory;

writing back said owned line of information buffered in the pending write-back controller to the main memory, said step of writing back said owned line of information occurring after said step of requesting said new line of information.

15. The method for computer system including method of replacing the cache lines in the first processor subsystem cache memory as claimed in claim 14 wherein said step of writing back said owned line of information in the pending write-back controller further comprises:

receiving read requests for the owned line of information from a second processor subsystem coupled to said packet-switched memory bus; and

responding to said read requests for the owned line of information from said second processor subsystem by sending a read request reply to said second processor subsystem, said read request reply containing the owned line of information.

16. The method of replacing the cache lines in the first processor subsystem cache memory as claimed in claim 15 wherein said step of writing back said owned line of information in the pending write-back controller further comprises:

receiving write requests to the owned line of information from a second processor subsystem coupled to said packet-switched memory bus; and

responding to said write requests to the owned line of information from said second processor subsystem by not writing back said owned line of information to the main memory.

17. The method of replacing the cache lines in the first processor subsystem cache memory as claimed in claim 16 wherein each of said cache lines are further subdivided into M subblocks, each of said subblocks of said cache lines handled as individual memory blocks.
Description
 


BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to the field of cache memory structures for multiprocessor computer systems. More particularly, the present invention relates to a pending write-back cache controller in a cache control system for a multiprocessor computer system using a packet switched bus.

2. Art Background

In a typical computer system, the processing unit operates at a substantially faster speed than the main memory. When the processing unit executes instructions faster than memory can supply them, the processing unit must remain idle while it waits for the memory to retrieve the next instruction. Processing unit idle time adversely affects system performance. To avoid unnecessary processing unit idle time while awaiting data or instructions from the main memory, a cache memory capable of operating at a higher speed than a main memory is often used to buffer the data and the instructions between the main memory and the processing unit. The cache memory is typically much smaller than the main memory.

The data and instructions from the main memory are mapped into the cache memory in uniform units referred to as cache lines. Each cache line represents an aligned continuous segment of main memory. Since the cache memory is usually much smaller than the main memory, it can store only a limited subset of the main memory. Therefore the cache memory needs to store a portion of the data's main memory address. This portion of the address is called the address tag, and there is one address tag per cache line. Each cache line may be further subdivided into smaller uniform increments referred to as subblocks. Access to a cache line in the cache memory is typically made using a cache directory which stores the address tags and a set of status bits associated with the cache line.
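
The role of the address tag can be illustrated with a minimal Python sketch. The patent text does not fix a cache organization at this point, so the direct-mapped layout, the constants LINE_SIZE and NUM_LINES, and the helper names split_address and lookup are all assumptions made purely for illustration.

```python
LINE_SIZE = 64     # bytes per cache line (assumed value, not from the patent)
NUM_LINES = 1024   # number of cache lines (assumed value, not from the patent)


def split_address(addr: int) -> tuple[int, int, int]:
    """Split a main memory address into (tag, line index, byte offset).

    Assumes a direct-mapped cache: each aligned memory line can live in
    exactly one cache line, chosen by the low bits of the line number.
    """
    offset = addr % LINE_SIZE
    line_number = addr // LINE_SIZE
    index = line_number % NUM_LINES
    tag = line_number // NUM_LINES
    return tag, index, offset


def lookup(directory: dict, addr: int) -> bool:
    """Return True on a hit: the stored tag matches and the line is valid."""
    tag, index, _ = split_address(addr)
    entry = directory.get(index)
    return entry is not None and entry["valid"] and entry["tag"] == tag


# The directory keeps one address tag plus status bits per cache line.
directory = {141: {"tag": 1, "valid": True}}
print(split_address(0x12345))                            # (1, 141, 5)
print(lookup(directory, 0x12345))                        # True
print(lookup(directory, 0x12345 + LINE_SIZE * NUM_LINES))  # False
```

Two addresses that differ by a multiple of LINE_SIZE * NUM_LINES select the same cache line in this sketch, which is why the stored tag is needed to tell them apart.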

Recently, computer systems having multiple processors have become common as a means of increasing processing speed. In a multiprocessor system, each of the processor subsystems may have its own individual cache memory. In order for a multiprocessor system with individual cache memories to operate properly, the system must maintain proper correspondence of the data stored in the cache memories since each processor may alter the data stored in its local cache memory. Correspondence of the data in the various caches is termed "cache consistency". A cache system is deemed "consistent" when the value returned from a "load from memory" operation is always the value written by the latest "store to memory" operation to the same memory address.

To maintain cache consistency, several status bits that reflect the current state of the information in each cache line are usually maintained in the cache directory. Common status bits maintained include a "valid" bit, a "shared" bit, and an "owned" bit. A "valid" bit reflects whether the information stored in the cache line is currently valid. A "shared" status bit indicates whether the information in the cache line is shared by other cache memories. If a cache line is "shared" it cannot be modified without first invalidating the cache line in the other cache memories or updating the cache line in the other cache memories. An "owned" status bit indicates that the information in the cache line has been modified without being written back to the main memory. A line of memory can be "owned" by only one processor subsystem at a time. If a processor needs to modify the contents of one of its cache lines, the processor must first change the status of the cache line to make it "owned". Owned cache lines must be written back to main memory before they are replaced with new information.

An example of a multiprocessor system maintaining cache consistency is illustrated in FIGS. 1a through 1d. In FIG. 1a, the main memory unit has an address A that contains a value of 1. Processors 2 and 3 perform load A operations to obtain the value of A. During each processor's load operation, the value of A is stored in the processor's local cache memory. Processors 2 and 3 now "share" memory location A and both caches have "valid" data. In FIG. 1b, processor 1 has written a value of 2 to location A. This is permitted since neither processor 2 nor processor 3 "owned" memory location A. In order to change the contents of memory location A, processor 1 broadcasts a message across the memory bus informing other memory devices that the contents of memory location A have changed. This message causes the cache memories of processors 2 and 3 to change the status of memory location A to "invalid". The main memory unit does not maintain a set of status bits for each memory line. Instead, the main memory monitors a control line on the memory bus that is asserted whenever a request is made for a memory line that is "owned" by a processor subsystem. When the "owned" control line is asserted, the main memory learns that the line is owned by some processor subsystem and therefore does not respond to the request. Cache memory 1 now "owns" location A since it modified the contents of memory location A without updating the main memory. In FIG. 1c, processor 1 has changed the contents of memory location A to the value of 3. Since processor 1 does not share memory location A with any other processor, processor 1 does not need to send a message across the memory bus. However, in FIG. 1d, processor 3 requires the value of memory location A for a load operation. Processor 3 must therefore send a request across the bus requesting the value of memory location A. Since processor 1 "owns" memory location A, it must respond to the request with a reply containing the contents of memory location A.

Memory location A is now represented in the cache memories of processors 1 and 3. Although memory location A is still "owned" by processor 1, it must now "share" memory location A with processor 3. Therefore, any further changes to memory location A by processor 1 must be forwarded to processor 3. Processor 1 must eventually write back the changed contents of memory location A to main memory.
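
The sequence of FIGS. 1a through 1d can be traced with a small Python sketch. This is only a toy model under stated assumptions: the class name ToySystem, its methods, and the bus_messages counter are hypothetical, a store to a line that is not exclusively owned is modeled as an invalidating broadcast (as in FIG. 1b), and the forwarding of later updates to sharers is not modeled.

```python
class ToySystem:
    """Toy model of per-processor caches with valid/shared/owned bits."""

    def __init__(self, processors, memory):
        self.memory = dict(memory)
        self.caches = {p: {} for p in processors}
        self.bus_messages = 0  # counts request/broadcast packets only

    def _owner(self, addr):
        for p, cache in self.caches.items():
            line = cache.get(addr)
            if line and line["valid"] and line["owned"]:
                return p
        return None

    def load(self, p, addr):
        line = self.caches[p].get(addr)
        if line and line["valid"]:
            return line["value"]              # cache hit, no bus traffic
        self.bus_messages += 1                # read request on the bus
        owner = self._owner(addr)
        # The owner replies if there is one; otherwise main memory does.
        value = self.caches[owner][addr]["value"] if owner else self.memory[addr]
        shared = False
        for q, cache in self.caches.items():
            if q != p and addr in cache and cache[addr]["valid"]:
                cache[addr]["shared"] = True
                shared = True
        self.caches[p][addr] = {"value": value, "valid": True,
                                "shared": shared, "owned": False}
        return value

    def store(self, p, addr, value):
        line = self.caches[p].get(addr)
        exclusive = bool(line and line["valid"] and line["owned"]
                         and not line["shared"])
        if not exclusive:
            self.bus_messages += 1            # broadcast: contents changed
            for q, cache in self.caches.items():
                if q != p and addr in cache:
                    cache[addr]["valid"] = False
                    cache[addr]["owned"] = False
        self.caches[p][addr] = {"value": value, "valid": True,
                                "shared": False, "owned": True}


system = ToySystem(processors=[1, 2, 3], memory={"A": 1})
system.load(2, "A"); system.load(3, "A")   # FIG. 1a: processors 2 and 3 share A
system.store(1, "A", 2)                    # FIG. 1b: processor 1 now owns A
system.store(1, "A", 3)                    # FIG. 1c: no bus message needed
value = system.load(3, "A")                # FIG. 1d: owner replies, not memory
print(value, system.memory["A"], system.bus_messages)
```

In this model main memory still holds the original value after FIG. 1d, which reflects the statement that processor 1 must eventually write the owned line back.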

In computer systems implementing a cache memory system, the cache memory is first searched when a processor requests data from a memory address. A cache controller examines the address tags in the cache directory for the requested memory address. If an address tag in the cache directory matches the memory address needed and the cache line is valid, there is a cache "hit" and the data is transferred from the cache memory to the processor. If the processor subsequently modifies the data stored in a cache line, the cache line becomes an "owned" cache line. As illustrated above, the modified or "owned" cache line must eventually be written back to the main memory. If the cache controller always updates the main memory immediately after a cache line is modified, the system is referred to as a "write-through" cache. It is called a "write-through" cache since the cache system always writes through the cache memory and into the main memory.

On the other hand, when a processor makes a read request for data from a memory address and none of the address tags in the cache directory match the requested memory address or an address match occurs but the cache line is invalid, a cache "miss" occurs. The cache controller must therefore retrieve the data from the main memory or from another processor's cache memory which owns the data. During the retrieval of the memory line, the processing unit usually must remain idle until the retrieval is completed.

When a cache controller retrieves a line of data from the main memory or from another processor's cache memory for the local processor, the line is placed into the local cache memory. If no empty cache line is available, the cache must replace one of the currently used cache lines. The cache line chosen to be replaced is typically referred to as the displaced or victim line. If the cache system is a "write-through" cache system, the cache controller replaces the victim line immediately. The victim line in a "write-through" cache system can be immediately replaced since the main memory already has the same contents as the victim line. However, if the processor modified the contents of the cache line (an "owned" cache line), the cache controller must first write back the contents of the cache line to main memory before the cache line can be replaced. Cache systems which only write back the contents of an owned cache line when the cache line needs to be replaced are referred to as "write-back" caches. "Write-back" cache systems update main memory less frequently than "write-through" systems since consecutive writes by the processor to the same owned cache line will not result in multiple writes to main memory. Since "write-back" cache systems update the main memory less frequently, they are more efficient than "write-through" cache systems.
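
The difference in main memory traffic between the two policies can be shown with a deliberately tiny Python sketch. The single-line cache, the class name OneLineCache, and the helper count_writes are assumptions made for illustration; they are not structures described in the patent.

```python
class OneLineCache:
    """A cache holding a single line in front of a dict acting as main memory."""

    def __init__(self, memory, policy):
        self.memory = memory
        self.policy = policy   # "write-through"; anything else acts as write-back
        self.addr = None
        self.value = None
        self.owned = False     # modified and not yet written back
        self.memory_writes = 0

    def _write_memory(self, addr, value):
        self.memory[addr] = value
        self.memory_writes += 1

    def store(self, addr, value):
        if self.addr != addr:
            self.evict()       # the current line becomes the victim line
            self.addr = addr
        self.value = value
        if self.policy == "write-through":
            self._write_memory(addr, value)   # memory updated on every store
        else:
            self.owned = True                 # memory updated only on eviction

    def evict(self):
        if self.owned:
            self._write_memory(self.addr, self.value)
        self.addr, self.value, self.owned = None, None, False


def count_writes(policy, stores):
    """Run the stores, flush the cache, and return (memory writes, memory)."""
    memory = {}
    cache = OneLineCache(memory, policy)
    for addr, value in stores:
        cache.store(addr, value)
    cache.evict()
    return cache.memory_writes, memory


stores = [("A", 1), ("A", 2), ("A", 3), ("B", 9)]
print(count_writes("write-through", stores))  # (4, {'A': 3, 'B': 9})
print(count_writes("write-back", stores))     # (2, {'A': 3, 'B': 9})
```

Both policies leave main memory in the same final state here; the write-back run simply collapses the three consecutive stores to A into one memory write.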

FIG. 2 illustrates a prior art multiprocessor system with individual write-back cache memories for each processor subsystem. The multiprocessor system of FIG. 2 maintains cache consistency by using a set of cache directories 28 located in each cache controller 29. The cache directories 28 contain the tag addresses for each cache line and the status bits which specify if a cache line is valid (contains valid data), owned (modified and not written back to main memory yet), and/or shared (represented in another processor's cache memory).

When a processor in the multiprocessor system of FIG. 2 needs to read information not currently stored in the local cache memory, it must often replace a currently used cache line. If the cache line to be replaced is "owned", the contents of the cache line must be written back to main memory 23. In a typical write-back cache memory system, the cache controller 29 first writes back the "owned" cache line to main memory 23 and, after the write-back is completed, requests the new line of data from main memory 23. Although requesting the new cache line only after writing back the owned cache line results in a simple design, this method creates a long latency period while the owned cache line is written back and the new cache line is obtained. During this latency period, the processor 21 usually remains idle while it waits for the needed data. Consequently, this long latency period required for cache line replacement degrades the efficiency of the multiprocessor computer system. This is especially true in large cache memories 37 where cache lines tend to be long and several owned subblocks may need to be written to memory before the new desired cache line data is requested.

SUMMARY OF THE INVENTION

Apparatus and methods for implementing a dual directory cache control system having a pending write-back cache controller in a cache memory structure supporting multiple processing units are disclosed. The processing units in the multiprocessor system are coupled together using a high-speed synchronous packet switching bus called a memory bus. Each processing unit has an associated write-back cache control system. Each write-back cache control system is divided into two separate cache controllers: the bus cache controller and the processor cache controller. The bus cache controller and the processor cache controller are coupled to one another over a second high-speed synchronous packet switching bus called a cache bus. The bus cache controller and the processor cache controller each maintain a separate directory containing tag addresses and status bits.

The processor cache controller is closely coupled to the actual processing unit. The processor cache controller services memory requests made by the processing unit. When the processor cache controller does not have a required memory location in the cache memory, it sends a request across the cache bus to the bus cache controller. If a cache line must be replaced, the processor cache controller then immediately proceeds to send the owned subblocks from the cache line that will be replaced to the bus cache controller.

The bus cache controller is connected directly to the memory bus and handles all the memory bus transactions for the processing unit. The bus cache controller contains a pending write-back controller which is responsible for handling the write-backs of owned cache lines to main memory. When the bus cache controller receives a memory request from the processor cache controller caused by a cache miss, it quickly broadcasts a corresponding memory request packet on the memory bus. If an owned cache line is to be replaced, the processor cache controller sends the subblocks from the owned cache line to the bus cache controller which buffers the owned subblocks in the pending write-back controller. When the bus cache controller receives the new cache line information from the memory bus, it immediately sends the new cache line information to the processor cache controller which replaces the cache line and allows the processing to continue. By buffering the owned cache line, the pending write-back controller allows the new cache line to be requested and replaced before the owned cache line is written back to main memory. On average, this substantially reduces the cache miss latency period.
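
The latency argument can be made concrete with a rough back-of-the-envelope model in Python. The patent gives no timing figures at this point, so the cycle counts, the function names, and the assumption that copying owned subblocks into the buffer overlaps with the outstanding read request are all hypothetical.

```python
def stall_write_back_first(owned_subblocks, t_write, t_read):
    """Prior-art ordering: write back every owned subblock, then fetch."""
    return owned_subblocks * t_write + t_read


def stall_with_pending_write_back(owned_subblocks, t_copy, t_read):
    """Buffered ordering: the read request goes out first.

    Assumes the copy of owned subblocks into the pending write-back buffer
    proceeds while the read is outstanding, so the processor waits for
    whichever of the two finishes last.
    """
    return max(t_read, owned_subblocks * t_copy)


# Illustrative numbers only (arbitrary time units, not from the patent).
prior = stall_write_back_first(owned_subblocks=4, t_write=10, t_read=20)
buffered = stall_with_pending_write_back(owned_subblocks=4, t_copy=2, t_read=20)
print(prior, buffered)  # 60 20
```

Under these assumed numbers the write-backs to main memory still happen, but they move off the processor's critical path.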

The pending write-back controller in the bus cache controller acts as an intelligent write-back buffer for the bus cache controller. Once the bus cache controller has sent out a read request for new cache line data, the old cache line data from the owned cache line is given to the pending write-back controller to be written back to main memory. While the pending write-back controller is in control of an owned cache line which has not been written back yet, it must respond to read requests directed to that cache line. If another device on the memory bus performs a write to a cache line owned by the pending write-back controller, the pending write-back controller must not perform the write-back since the buffered cache line now contains stale data.
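
The three behaviors described above (buffer the owned line, answer reads while the write-back is pending, and cancel the write-back after a foreign write) can be sketched as a small Python model. This is a behavioral illustration only: the class and method names are hypothetical, main memory is a plain dict, and the FIFO ordering follows the dependent claims rather than any circuit detail.

```python
from collections import deque


class PendingWriteBackController:
    """Behavioral sketch of an intelligent write-back buffer."""

    def __init__(self, memory):
        self.memory = memory
        self.entries = {}              # address tag -> data and status bits
        self.writeback_fifo = deque()  # addresses awaiting write-back, in order

    def accept(self, addr, data):
        """Take ownership of a replaced owned line from the cache."""
        self.entries[addr] = {"data": data, "valid": True, "owned": True}
        self.writeback_fifo.append(addr)

    def snoop_read(self, addr):
        """Reply to a read request for a line still owned by the buffer."""
        entry = self.entries.get(addr)
        if entry and entry["valid"] and entry["owned"]:
            return entry["data"]
        return None                    # not ours: someone else must reply

    def snoop_write(self, addr):
        """Another device wrote the line: the buffered copy is now stale."""
        entry = self.entries.get(addr)
        if entry and entry["valid"]:
            entry["owned"] = False     # reset owned bit, suppress write-back

    def drain_one(self):
        """Process the oldest pending entry; return its address if written."""
        if not self.writeback_fifo:
            return None
        addr = self.writeback_fifo.popleft()
        entry = self.entries.pop(addr, None)
        if entry and entry["owned"]:
            self.memory[addr] = entry["data"]
            return addr
        return None


memory = {0x100: "old", 0x200: "old"}
pwb = PendingWriteBackController(memory)
pwb.accept(0x100, "dirty-1")
pwb.accept(0x200, "dirty-2")
print(pwb.snoop_read(0x100))  # dirty-1 (served from the buffer)
pwb.snoop_write(0x200)        # foreign write cancels this write-back
pwb.drain_one(); pwb.drain_one()
print(memory[0x100], memory[0x200])
```

The sketch does not model the data carried by the foreign write itself; it only shows that the stale buffered copy is never written over it.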

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be understood more fully from the detailed description given below and from the accompanying drawings of the preferred embodiment of the invention in which:

FIGS. 1a-1d provide an illustration of maintaining cache consistency in a multiprocessor system where each processor subsystem has its own cache memory.

FIG. 2 is a block diagram of a multiprocessor system with prior art write-back cache memory systems for each processor subsystem.

FIG. 3a is a block diagram of a multiprocessor system with the cache control system of the present invention where the cache controller is divided into a processor cache controller and a bus cache controller.

FIG. 3b is a block diagram of an alternate embodiment of the cache memory system of the present invention where the multiprocessor system has multiple memory buses.

FIG. 4 is a block diagram depicting the internals of the bus cache controller which are related to the pending write-back controller of the present invention.

FIG. 5 is a block diagram of the cache memory structure used in the present invention.

FIG. 6 is an electrical diagram depicting the subblock logic in the pending write-back controller of the present invention.

FIG. 7a is a first electrical diagram depicting the cache line logic in the pending write-back controller of the present invention.

FIG. 7b is a second electrical diagram depicting the cache line logic in the pending write-back controller of the present invention.

FIG. 8a is a first electrical diagram depicting the global pending write-back controller logic in the present invention.

FIG. 8b is a second electrical diagram depicting the global pending write-back controller logic in the present invention.

FIG. 9 is a block diagram illustrating the interconnections between the various hierarchical logic levels of the pending write-back controller of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Apparatus and methods for implementing a dual directory cache control system having a pending write-back cache controller are disclosed. In the following description, for purposes of explanation, specific numbers, times, signals etc., are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well known circuits and devices are shown in block diagram form in order not to obscure the present invention unnecessarily.

Reference is now made to FIG. 3a which depicts a block diagram overview of a multiple processor high performance computer system incorporating the teachings of the present invention. In FIG. 3a, a main memory unit 23 is shown coupled to a memory bus 25. Although only one main memory unit is illustrated in FIG. 3a, the main memory address space may be broken into several distinct memory units. Therefore it is possible to have more than one memory unit connected to memory bus 25. Also shown coupled to the memory bus 25 are a pair of processor subsystems 20. The processor subsystems 20 read data from and write data to the main memory 23 over the memory bus 25. More than two processor subsystems 20 can be coupled to the memory bus 25 to provide additional processing power.

The memory bus 25 of FIG. 3a is a high speed synchronous packet switching bus used to transfer data between a plurality of devices on the memory bus 25. Details for implementing a packet-switched memory bus are given in the U.S. patent application Ser. No. 08/188,660, filed Jan. 30, 1994, which is a continuation of U.S. patent application Ser. No. 07/620,508, filed Nov. 30, 1990, entitled "Consistent Packet Switched Memory Bus For Shared Memory Multiprocessors" which is hereby incorporated by reference. Most transactions on the memory bus 25 consist of a request packet sent by a first device followed an arbitrary time period later by a reply packet sent by a second device. For example, a processor subsystem 20 on the memory bus 25 may send a read request packet requesting a subblock of memory. The main memory 23 (or another processor subsystem 20 that "owns" the subblock) would then reply back to the requesting processor subsystem with a read reply packet containing the requested memory subblock. A few memory bus 25 transactions consist only of a request packet, such as a write request, with no corresponding reply packet.
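The request and reply pairing described above can be illustrated with a minimal sketch. The Packet type, its field names, and the device names are assumptions made for illustration only; the actual packet format is defined in the incorporated application and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    # All field names are hypothetical; the real packet layout is not given here.
    kind: str                     # "read_request", "read_reply", or "write_request"
    source: str                   # name of the device that sent the packet
    address: int                  # memory address the packet refers to
    data: Optional[bytes] = None  # payload, if the packet carries one


def expects_reply(packet: Packet) -> bool:
    """A read request is answered later by a reply; a write request is not."""
    return packet.kind == "read_request"


def make_reply(request: Packet, responder: str, data: bytes) -> Packet:
    """Build the read reply that a second device sends for a read request."""
    if not expects_reply(request):
        raise ValueError("this request type has no corresponding reply packet")
    return Packet("read_reply", responder, request.address, data)
```

In this sketch a reply carries the same address as the request it answers, which is how the requester would match the two packets after the arbitrary delay.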

The processor subsystems 20 of FIG. 3a are comprised of a bus cache controller 31, a cache bus 33, and a processor module 32. The processor module 32 performs the actual processing. The bus cache controller 31 performs all the required memory bus 25 transactions for the associated processor module 32. The bus cache controller 31 and the processor module 32 communicate with one another over a high speed synchronous packet switching bus referred to as the cache bus 33. The cache bus 33 is similar to the memory bus 25 in that each transaction consists of a request packet followed an arbitrary time period later by a reply packet.

The cache bus 33 can be used by the processor module 32 to support multiple bus cache controllers coupled to separate memory buses. Referring to FIG. 3b, an alternate embodiment of the present invention with two memory buses is illustrated. In the embodiment of FIG. 3b, a processor module 32 is coupled to a cache bus 33 which has two separate bus cache controllers 30 and 31. Each bus cache controller 30 and 31 controls bus transactions on a separate memory bus. The separate memory buses each have their own associated main memory units 22 and 24.

Referring back to FIG. 3a, the processor module 32 contains a processor 21, a processor cache controller 35, and a cache memory 37. The processor cache controller 35 maintains a processor cache directory 34 containing address tags and status bits for the cache lines stored in the processor cache memory 37. The processor cache controller 35 is responsible for acting as an interface between the processor cache memory 37 and the processor 21.

The bus cache controller 31 performs a number of cache control operations for the processor subsystem 20. The main purpose of the bus cache controller 31 is to perform all the required memory bus 25 transactions for the processor subsystem 20. The bus cache controller 31 maintains a cache directory 46 containing the address tags and status bits for the data in the cache memory 37. The bus cache controller 31 includes a pending write-back controller 40 which is responsible for writing back owned cache lines which have been replaced with new information as will be explained later. The functionality of the bus cache controller 31 is best explained with reference to the transactions it manages on the memory bus 25. The bus cache controller 31 performs three types of bus transactions on the memory bus 25: reads, writes, and write-backs. Each transaction type will be addressed separately.

Read Transactions

When a memory request by the processor 21 cannot be fulfilled by the data in the processor cache memory 37, the processor cache controller 35 sends a read request packet across the cache bus 33 to the bus cache controller 31. The bus cache controller 31 proceeds to broadcast a corresponding read request packet across the memory bus 25. The read transaction initiated by the bus cache controller 31 consists of two packets: a read request packet sent by the bus cache controller 31 on the memory bus 25 and a read reply packet sent by another device on the memory bus. The read request packet contains the address of the memory requested by the processor cache controller 35 and is broadcast to all entities on the memory bus 25. A device on the memory bus 25 that contains the requested memory address responds to the read request packet with a read reply packet containing the subblock which includes the requested memory address. The read reply packet is generally issued by the main memory 23 except when the desired memory address is "owned" by another processor subsystem 20. In that case, the processor subsystem that owns the subblock must generate a read reply packet with the requested data.
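The choice of which device issues the read reply can be sketched as follows. The function name, the owners map, and the device labels are hypothetical; the sketch only restates the rule that main memory replies unless another processor subsystem owns the subblock.

```python
def choose_responder(address, owners, main_memory="main memory 23"):
    """Return the name of the device that should send the read reply.

    owners is a hypothetical map from subblock address to the name of the
    processor subsystem that currently owns that subblock.
    """
    # An owning subsystem must reply; otherwise main memory supplies the data.
    return owners.get(address, main_memory)
```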

Write Transactions

When the cache memory system for a processor subsystem "owns" a particular cache line, it is allowed to modify the contents of the cache line. If the processor 21 modifies a cache line which is shared with other cache memories, the bus cache controller 31 performs a write transaction to update the information in the cache memories that share the cache line. If the cache line is subdivided into subblocks, not all modifications to data in a cache line result in a write transaction. In a system with subblocks it is only necessary to broadcast those subblocks which have been modified and reside in other caches as well. "Shared" flags are required for each subblock to keep this information, but will not be discussed here. The disclosure of U.S. patent application Ser. No. 07/620,496, filed Nov. 30, 1990, entitled "Consistency Protocols For Shared Memory Multiprocessors", now U.S. Pat. No. 5,265,235, issued Nov. 23, 1993, is incorporated by reference.
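The subblock filtering described above can be sketched with two per-subblock flag lists. The function and parameter names are assumptions; the text states only that a subblock needs to be broadcast when it has been modified and also resides in other caches.

```python
def subblocks_to_broadcast(modified, shared):
    """Return the indices of subblocks that require a write transaction.

    modified and shared are hypothetical per-subblock flag lists of equal
    length for one cache line.
    """
    return [
        index
        for index, (is_modified, is_shared) in enumerate(zip(modified, shared))
        if is_modified and is_shared
    ]
```

For a four-subblock line with modified flags [True, True, False, False] and shared flags [True, False, True, False], only subblock 0 would be broadcast.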

Write-Back Transactions

Write-back transactions are issued by the bus cache controller 31 when updating main memory 23 with owned subblocks from cache lines that are no longer needed by the processor 21. Write-back transactions on the memory bus 25 are directed to the main memory 23 and are ignored by the other processor subsystems on the memory bus.

To completely explain the write-back transaction, it is best to explain step-by-step the events that take place when there is a cache miss and no empty cache lines are available. In such a case there is both a write-back transaction which gets rid of the old information in the cache line, and a read transaction which obtains the new information for the new cache line.

Referring to FIG. 3a, when the processor 21 requires data that is not stored in the local cache memory 37, a cache miss occurs. The processor cache controller 35 issues a read request packet containing the required memory address to the bus cache controller 31 through the cache bus 33. As discussed in the read transaction section, the bus cache controller 31 responds to the read request packet by broadcasting a corresponding read request packet across the memory bus 25. The appropriate memory unit or processor subsystem on the memory bus 25 should eventually respond to the read request packet with a read reply packet containing the requested data. After issuing the read request packet for the new information, the bus cache controller 31 gives the pending write-back controller 40 the responsibility of writing back the old owned subblocks from the cache line being replaced. The pending write-back controller 40 acts as an intelligent buffer which handles all read requests for the old cache line data until it is written back to main memory 23.

After the processor cache controller 35 has sent the read request packet to the bus cache controller 31, and independently of whether or not the bus cache controller 31 issued the read request packet, the processor cache controller 35 begins to send the bus cache controller 31 any owned subblocks in the cache line which will be replaced by the new information. The bus cache controller 31 directs these owned subblocks to the pending write-back controller 40 which stores them into a data RAM. After the processor cache controller 35 has sent all the owned subblocks to the pending write-back controller 40, the processor cache controller 35 marks the cache line as invalid and is ready to accept the new cache line data.

After the pending write-back controller 40 has received all the owned subblocks from the processor cache controller 35 and the bus cache controller 31 has issued a read request for the desired data, the pending write-back controller 40 begins sending write-back packets containing owned subblocks to main memory 23. The write-backs to main memory 23 occur independently of whether the bus cache controller has received a read reply packet.
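The start condition for the write-backs can be written as a small predicate. The function and argument names are hypothetical; the sketch encodes only the two preconditions stated above and shows that receipt of the read reply plays no part in the decision.

```python
def may_start_write_back(all_owned_subblocks_received,
                         read_request_issued,
                         read_reply_received=False):
    """Return True once the pending write-back controller may begin writing back.

    read_reply_received is accepted but deliberately ignored: the write-backs
    proceed whether or not the read reply has arrived.
    """
    return all_owned_subblocks_received and read_request_issued
```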

While these transactions are in progress, other processor subsystems may issue write transactions to the subblocks being written back to main memory 23. Similarly, other processors may issue read requests to these same subblocks. In order to maintain cache consistency, the following rules must be adhered to by the pending write-back controller 40:

1. If another processor subsystem issues a write transaction to an address corresponding to an owned subblock before the subblock is written back to main memory by the pending write-back controller 40, the write-back must not occur since the pending write-back controller 40 contains "stale" data.

2. If another processor subsystem issues a read request packet to an address corresponding to a subblock owned by the pending write-back controller 40 before the subblock is written back to main memory, or before another processor issues a write transaction to an address corresponding to the same subblock, then the pending write-back controller 40 must reply with a read reply packet.
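The two rules above can be illustrated with a minimal behavioral sketch. The class, its method names, and the dictionary used in place of the data RAM are assumptions for illustration; this is not the logic of FIGS. 6 through 9.

```python
class PendingWriteBack:
    """Behavioral model of the buffer that holds owned subblocks awaiting write-back."""

    def __init__(self):
        # Hypothetical stand-in for the data RAM: subblock address -> data.
        self.owned = {}

    def accept(self, address, data):
        """Take ownership of an owned subblock from the replaced cache line."""
        self.owned[address] = data

    def snoop_write(self, address):
        """Rule 1: another subsystem wrote the subblock, so the buffered copy
        is stale and its write-back must not occur."""
        self.owned.pop(address, None)

    def snoop_read(self, address):
        """Rule 2: reply with the data while the subblock is still owned.
        Returns None when this controller no longer owns the subblock."""
        return self.owned.get(address)

    def write_back_next(self):
        """Write back one owned subblock and give up ownership of it.
        Returns (address, data), or None when nothing is pending."""
        if not self.owned:
            return None
        address = next(iter(self.owned))
        return address, self.owned.pop(address)
```

In this model a subblock leaves the buffer in exactly two ways: it is written back, or a write from another subsystem cancels it. A read request is answered only between acceptance and either of those events.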

It can be seen from FIG. 3a that the write-back cache system of the present invention maintains three separate cache directories: the processor cache directory 34, the bus cache controller directory 46, and a small directory in the pending write-back controller.

The processor cache directory 34 and the bus cac