WikiPatents - Community Patent Review
Parallelized coherent read and writeback transaction processing system for use in a packet switched cache coherent multiprocessor system    
United States Patent 5581729
Link to this page: http://www.wikipatents.com/5581729.html
Inventor(s): Nishtala; Satyanarayana (Cupertino, CA); Ebrahim; Zahir (Mountain View, CA); Van Loo; William C. (Palo Alto, CA); Loewenstein; Paul (Palo Alto, CA); Lee; Sue K. (San Mateo, CA); Coffin III; Louis F. (San Jose, CA)
Abstract: A multiprocessor computer system is provided having a multiplicity of sub-systems and a main memory coupled to a system controller. An interconnect module interconnects the main memory and sub-systems in accordance with interconnect control signals received from the system controller. At least two of the sub-systems are data processors, each having a respective cache memory that stores multiple blocks of data and a respective master cache index. Each master cache index has a set of master cache tags (Etags), including one cache tag for each data block stored by the cache memory. Each data processor includes a master interface having master classes for sending memory transaction requests to the system controller. The system controller includes memory transaction request logic for processing each memory transaction request by a data processor. The system controller maintains a duplicate cache index having a set of duplicate cache tags (Dtags) for each data processor. Each data processor has a writeback buffer for storing the data block previously stored in a victimized cache line until its respective writeback transaction is completed and an Nth+1 Dtag for storing the cache state of a cache line associated with a read transaction which is executed prior to an associated writeback transaction of a read-writeback transaction pair. Accordingly, upon a cache miss, the interconnect may execute the read and writeback transactions in parallel, relying on the writeback buffer or Nth+1 Dtag to accommodate any ordering of the transactions.
   














Owner/Assignee: Sun Microsystems, Inc. (Mountain View, CA)
Publication Date: December 3, 1996
Application Number: 08/414,763
Filing Date: March 31, 1995
US Classification: 711/143, 711/144, 711/145
Int'l Classification: G06F 012/08, G06F 013/00
Examiner: Swann; Tod R.
Assistant Examiner: King Jr.; Conley B.
Attorney/Law Firm: Flehr, Hohbach, Test, Albritton & Herbert
USPTO Field of Search: 395/468, 395/470, 395/471, 395/474, 395/472
   

Claims
 


What is claimed is:

1. A computer system, comprising:

a system controller;

a main memory coupled to said system controller;

a data processor having a cache memory having N cache lines for storing N data blocks, where N is an integer greater than 4, N master cache tags (Etags), including one Etag for each said cache line in said cache memory, and a writeback buffer for storing a dirty victim data block displaced from said cache memory until it is written back into said main memory; said Etag for each cache line storing an address index and an Etag state value that indicates whether said data block stored in said cache line includes data modified by said data processor;

said data processor including a master interface, coupled to said system controller, for sending memory transaction requests to said system controller, said memory transaction requests including read requests and writeback requests; each memory transaction request specifying an address for an associated data block to be read or written;

said master interface further including cache coherence logic for responding to a cache miss on any cache line in said cache memory by (A) generating a read request, and (B) when said cache miss requires a cache line to be victimized and said victim cache line includes modified data, according to the Etag state value in the corresponding Etag, storing the data block having said modified data in said writeback buffer and generating a writeback request;

said system controller including a set of N duplicate cache tags (Dtags), each Dtag corresponding to one of said Etags and storing a Dtag state value and the same address index as the corresponding Etag; said Dtag state value indicating whether said data block stored in the corresponding cache line includes data modified by said data processor;

said system controller further including an N+1th Dtag;

said system controller including memory transaction request logic for processing each said memory transaction request by said data processor;

said system controller's memory transaction request logic including writeback logic for processing said writeback request by writing the data block in said writeback buffer into said main memory and invalidating the state value in the corresponding Dtag;

said system controller's memory transaction request logic including read logic for processing said read request by (A) identifying a victim cache line, if any, in said cache memory and accessing the Dtag corresponding to said victim cache line to determine whether processing said read request will displace from said cache memory a data block that includes modified data, (B) retrieving a data block from said main memory corresponding to said read request and providing it to said data processor for storage in said data processor's cache memory, (C) storing a Dtag state value and address tag in the Dtag corresponding to said victim cache line when processing said read request does not displace from said cache memory a modified data block and when said corresponding Dtag's state value is invalid, (D) storing said Dtag state value and address tag for said read request in said N+1th Dtag when processing said read request does displace from said cache memory a modified data block and said corresponding Dtag's state value is not invalid, and (E) transferring said N+1th Dtag into said Dtag corresponding to said victim cache line when said writeback logic invalidates said Dtag state value in said corresponding Dtag; and

wherein said memory transaction request logic processes said read request and writeback request such that processing of either of said read request and writeback request may be completed prior to the other in accordance with resource availability for processing said requests.
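
The Dtag bookkeeping in steps (C) through (E) of claim 1 can be illustrated with a small sketch. It is a hedged model, not the patented logic: the names `Dtag`, `SystemControllerDtags`, `process_read`, `process_writeback`, `pending_index` and `final_dtag` are hypothetical, state values are plain strings, and the cases the claim does not spell out (a clean victim, or a writeback that has already invalidated the Dtag) are assumed to write the Dtag directly.

```python
from dataclasses import dataclass
from typing import List, Optional

INVALID = "I"


@dataclass
class Dtag:
    state: str = INVALID
    addr_tag: Optional[int] = None


class SystemControllerDtags:
    """Hypothetical model: N Dtags plus one spare (the N+1th Dtag)."""

    def __init__(self, n_lines: int) -> None:
        self.dtags: List[Dtag] = [Dtag() for _ in range(n_lines)]
        self.spare = Dtag()                        # the N+1th Dtag
        self.pending_index: Optional[int] = None   # line the spare waits on

    def process_read(self, index: int, addr_tag: int, new_state: str,
                     displaces_modified: bool) -> None:
        victim = self.dtags[index]
        if displaces_modified and victim.state != INVALID:
            # Step (D): the writeback has not been processed yet, so park
            # the new tag and state in the N+1th Dtag.
            self.spare = Dtag(new_state, addr_tag)
            self.pending_index = index
        else:
            # Step (C), plus the assumed cases noted in the lead-in.
            self.dtags[index] = Dtag(new_state, addr_tag)

    def process_writeback(self, index: int) -> None:
        # Writeback logic: invalidate the Dtag for the victim line.
        self.dtags[index] = Dtag()
        if self.pending_index == index:
            # Step (E): move the parked entry into the freed Dtag.
            self.dtags[index] = self.spare
            self.spare = Dtag()
            self.pending_index = None


def final_dtag(read_first: bool) -> Dtag:
    """Run one read-writeback pair on line 3 in either order."""
    sc = SystemControllerDtags(n_lines=8)
    sc.dtags[3] = Dtag("M", 0x10)   # dirty victim currently cached
    if read_first:
        sc.process_read(3, 0x20, "S", displaces_modified=True)
        sc.process_writeback(3)
    else:
        sc.process_writeback(3)
        sc.process_read(3, 0x20, "S", displaces_modified=True)
    return sc.dtags[3]
```

In this model both orderings leave line 3 holding the new tag, so `final_dtag(True) == final_dtag(False)`, which is the property the final "wherein" clause relies on.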

2. The computer system of claim 1,

each said read request including a DVP flag that has a first value when said read request corresponds to a cache fill operation that displaces a modified data block from said cache memory, said data block displacement being represented by said writeback request; said DVP flag having a second value, distinct from said first value, when said read request corresponds to a cache fill operation that does not displace a modified data block from said cache memory; and

said transaction request logic including logic for storing a Dtag state value and address tag in the Dtag corresponding to said victim cache line when said DVP flag in said read request has said second value and when said corresponding Dtag's state value is invalid, and storing said Dtag state value and address tag for said read request in said N+1th Dtag when said DVP flag in said read request has said first value and said corresponding Dtag's state value is not invalid.
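
As a rough illustration of how a data processor might set the DVP flag when it handles a cache miss, consider the sketch below. The request classes, the function `requests_for_miss`, and the mapping of the claim's "first value" and "second value" to `True` and `False` are all assumptions made for the example. Treating both M and O as states that hold modified data is likewise an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

# Etag states assumed to mark a line as holding modified data.
MODIFIED_STATES = {"M", "O"}


@dataclass
class ReadRequest:
    addr: int
    dvp: bool   # True (assumed "first value"): fill displaces a modified block


@dataclass
class WritebackRequest:
    addr: int


def requests_for_miss(miss_addr: int, victim_addr: Optional[int],
                      victim_etag_state: str
                      ) -> List[Union[ReadRequest, WritebackRequest]]:
    """Build the request(s) a cache miss generates in this sketch."""
    dirty_victim = (victim_addr is not None
                    and victim_etag_state in MODIFIED_STATES)
    requests: List[Union[ReadRequest, WritebackRequest]] = [
        ReadRequest(miss_addr, dvp=dirty_victim)
    ]
    if dirty_victim:
        # The displaced block is represented by a separate writeback request.
        requests.append(WritebackRequest(victim_addr))
    return requests
```

A miss that evicts a line in state M yields a read request with the flag set plus a writeback request; a miss that evicts a clean line yields only a read request with the flag clear.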

3. The computer system of claim 1,

said Etag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Shared Modified (O), Exclusive Clean (E), Shared Clean (S), and Invalid (I);

said Dtag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Shared Modified (O), Shared Clean (S), and Invalid (I); and

wherein said Dtag state stored in said Dtags never indicates said Exclusive Clean (E) state and when each data processor modifies data stored in its cache memory in a cache line whose Etag thereby transitions from said E state to said M state, said data processor does not generate a corresponding transaction request and the corresponding Dtag remains unchanged with a Dtag state equal to said M state.
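
One way to read claim 3 is that the system controller records the M state for any line the processor holds in the E state, so a later silent E-to-M transition in the Etag needs no Dtag update. The sketch below encodes that reading only; the function name `dtag_state_for` and the tuple constants are hypothetical.

```python
ETAG_STATES = ("M", "O", "E", "S", "I")
DTAG_STATES = ("M", "O", "S", "I")


def dtag_state_for(etag_state: str) -> str:
    """Map an Etag state to the Dtag state assumed to be recorded for it."""
    if etag_state not in ETAG_STATES:
        raise ValueError(f"unknown Etag state: {etag_state!r}")
    # Dtags never hold E; an Exclusive Clean line is tracked as M.
    return "M" if etag_state == "E" else etag_state
```

Under this reading, `dtag_state_for("E")` and `dtag_state_for("M")` return the same value, which is why the E-to-M transition generates no transaction request.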

4. The computer system of claim 1,

wherein said main memory is a reflective memory;

said Etag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Exclusive Clean (E), Shared Clean (S), and Invalid (I);

said Dtag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Shared Clean (S), and Invalid (I); and

wherein said Dtag state stored in said Dtags never indicates said Exclusive Clean (E) state and when each data processor modifies data stored in its cache memory in a cache line whose Etag thereby transitions from said E state to said M state, said data processor does not generate a corresponding transaction request and the corresponding Dtag remains unchanged with a Dtag state equal to said M state.

5. A computer system, comprising:

a system controller;

a main memory coupled to said system controller;

a data processor having a cache memory having N cache lines for storing N data blocks, where N is an integer greater than 4, N master cache tags (Etags), including one Etag for each said cache line in said cache memory, and a writeback buffer for storing a dirty victim data block displaced from said cache memory until it is written back into said main memory; said Etag for each cache line storing an address index and an Etag state value that indicates whether said data block stored in said cache line includes data modified by said data processor;

said data processor including a master interface, coupled to said system controller, for sending memory transaction requests to said system controller, said master interface including at least two parallel outgoing request queues for storing memory transaction requests to be sent to said system controller; said memory transaction requests including read requests and writeback requests; each memory transaction request specifying an address for an associated data block to be read or written;

said master interface further including cache coherence logic for responding to a cache miss on any cache line in said cache memory by (A) storing a read request in a first one of said outgoing request queues, and (B) when said cache miss requires a cache line to be victimized and said victim cache line, according to the Etag state value in the corresponding Etag, includes modified data, storing the data block having said modified data in said writeback buffer and storing a writeback request in a second one of said outgoing request queues;

said system controller including a set of N duplicate cache tags (Dtags), each Dtag corresponding to one of said Etags and storing a Dtag state value and the same address index as the corresponding Etag; said Dtag state value indicating whether said data block stored in the corresponding cache line includes data modified by said data processor;

said system controller further including an N+1th Dtag;

said system controller including memory transaction request logic for processing each said memory transaction request by said data processor;

said system controller's memory transaction request logic including writeback logic for processing said writeback request by writing the data block in said writeback buffer into said main memory and invalidating said state value in the corresponding Dtag;

said system controller's memory transaction request logic including read logic for processing said read request by (A) identifying a victim cache line in said cache memory, if any, and accessing the Dtag corresponding to said victim cache line to determine whether processing said read request will displace from said cache memory a data block that includes modified data, (B) retrieving a data block from said main memory corresponding to said read request and providing it to said data processor for storage in said data processor's cache memory, (C) storing a Dtag state value and address tag in the Dtag corresponding to said victim cache line when processing said read request does not displace from said cache memory a modified data block and when the Dtag state value corresponding to the victim cache line is invalid, (D) storing said Dtag state value and address tag for said retrieved data block in said N+1th Dtag when processing said read request does displace from said cache memory a modified data block and said corresponding Dtag's state value is not invalid, and (E) transferring said N+1th Dtag into said Dtag corresponding to said victim cache line when said writeback logic invalidates said Dtag state value in said corresponding Dtag;

wherein said memory transaction request logic processes said read request and writeback request such that processing of either of said read request and writeback request may be completed prior to the other in accordance with resource availability for processing said requests.
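
The two parallel outgoing request queues of claim 5 can be pictured with the following sketch. It is illustrative only: `MasterInterface`, `on_miss` and `next_request` are hypothetical names, the `read_ready` and `writeback_ready` flags stand in for the "resource availability" mentioned in the claim, and giving reads priority when both are ready is an arbitrary choice.

```python
from collections import deque
from typing import Deque, Optional, Tuple

Request = Tuple[str, int]   # (kind, address)


class MasterInterface:
    """Hypothetical master interface with one queue per request kind."""

    def __init__(self) -> None:
        self.read_queue: Deque[Request] = deque()
        self.writeback_queue: Deque[Request] = deque()

    def on_miss(self, miss_addr: int, victim_addr: int,
                victim_dirty: bool) -> None:
        # (A) the read request goes into the first queue.
        self.read_queue.append(("read", miss_addr))
        if victim_dirty:
            # (B) the writeback request goes into the second queue.
            self.writeback_queue.append(("writeback", victim_addr))

    def next_request(self, read_ready: bool,
                     writeback_ready: bool) -> Optional[Request]:
        """Issue from whichever queue has a ready resource, if any."""
        if read_ready and self.read_queue:
            return self.read_queue.popleft()
        if writeback_ready and self.writeback_queue:
            return self.writeback_queue.popleft()
        return None
```

Because the requests sit in separate queues, a writeback can be issued while the read is still waiting, and the reverse also holds.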

6. The computer system of claim 5,

each said read request including a DVP flag that has a first value when said read request corresponds to a cache fill operation that displaces a modified data block from said cache memory, said data block displacement being represented by said writeback request; said DVP flag having a second value, distinct from said first value, when said read request corresponds to a cache fill operation that does not displace a modified data block from said cache memory; and

said transaction request logic including logic for storing a Dtag state value and address tag in the Dtag corresponding to said victim cache line when said DVP flag in said read request has said second value and when said corresponding Dtag's state value is invalid, and storing said Dtag state value and address tag for said read request in said N+1th Dtag when said DVP flag in said read request has said first value and said corresponding Dtag's state value is not invalid.

7. The computer system of claim 5,

said Etag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Shared Modified (O), Exclusive Clean (E), Shared Clean (S), and Invalid (I);

said Dtag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Shared Modified (O), Shared Clean (S), and Invalid (I); and

wherein said Dtag state stored in said Dtags never indicates said Exclusive Clean (E) state and when each data processor modifies data stored in its cache memory in a cache line whose Etag thereby transitions from said E state to said M state, said data processor does not generate a corresponding transaction request and the corresponding Dtag remains unchanged with a Dtag state equal to said M state.

8. The computer system of claim 5,

wherein said main memory is a reflective memory;

said Etag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Exclusive Clean (E), Shared Clean (S), and Invalid (I);

said Dtag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Shared Clean (S), and Invalid (I); and

wherein said Dtag state stored in said Dtags never indicates said Exclusive Clean (E) state and when each data processor modifies data stored in its cache memory in a cache line whose Etag thereby transitions from said E state to said M state, said data processor does not generate a corresponding transaction request and the corresponding Dtag remains unchanged with a Dtag state equal to said M state.

9. A method for parallelizing writeback and read transactions in a packet switched cache coherent multiprocessor system having a system controller coupled to a main memory and to a data processor having a cache memory comprising the steps of:

storing master cache tags (Etags) in said data processor, including one Etag for each cache line in said cache memory, said Etag for each cache line storing an address index and an Etag state value that indicates whether a data block stored in said cache line includes data modified by said data processor;

storing in a writeback buffer of said data processor a dirty victim data block displaced from said cache memory until it is written back into said main memory;

storing a set of N duplicate tags (Dtags) for said cache memory in said system controller, each Dtag corresponding to one of said Etags including a Dtag state value and the same address index as the corresponding Etag; said Dtag state value indicating whether said data block stored in the corresponding cache line includes data modified by said data processor;

sending memory transaction requests from said data processor to said system controller, said memory transaction requests including read requests and writeback requests;

responding to a cache miss in said cache memory by (A) generating a read request, and (B) when said cache miss requires victimizing a data block that, according to the Etag state value in a corresponding Etag, includes modified data, storing the data block having said modified data in a writeback buffer and generating a writeback request;

processing writeback requests by writing the data block in said writeback buffer into said main memory and invalidating the state value in the corresponding Dtag; and

processing said read request by:

(A) identifying a victim cache line in said cache memory, if any, and accessing the Dtag corresponding to said victim cache line to determine whether processing said read request will displace from said cache memory a data block that includes modified data;

(B) retrieving a data block from said main memory corresponding to said read request and providing it to said data processor for storage in said data processor's cache memory at said victim cache line;

(C) storing a Dtag state value and address tag in the Dtag corresponding to said victim cache line when processing said read request does not displace from said cache memory a modified data block and when said corresponding Dtag's state value is invalid;

(D) storing said Dtag state value and address tag for said retrieved data block in an N+1th Dtag when processing said read request does displace from said cache memory a modified data block and said corresponding Dtag's state value is not invalid; and

(E) transferring said N+1th Dtag into said Dtag corresponding to said victim cache line when said writeback processing step invalidates said Dtag state value in said corresponding Dtag;

wherein memory transaction request logic processes said read request and writeback request such that processing of either of said read request and writeback request may be completed prior to the other in accordance with resource availability for processing said requests.

10. The method of claim 9,

each said read request including a DVP flag that has a first value when said read request corresponds to a cache fill operation that displaces a modified data block from said cache memory, said data block displacement being represented by said writeback request; said DVP flag having a second value, distinct from said first value, when said read request corresponds to a cache fill operation that does not displace a modified data block from said cache memory; and

said read request processing step including storing a Dtag state value and address tag in the Dtag corresponding to said victim cache line when said DVP flag in said read request has said second value and when said corresponding Dtag's state value is invalid, and storing said Dtag state value and address tag for said read request in said N+1th Dtag when said DVP flag in said read request has said first value and said corresponding Dtag's state value is not invalid.

11. The method of claim 9,

said Etag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Shared Modified (O), Exclusive Clean (E), Shared Clean (S), and Invalid (I);

said Dtag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Shared Modified (O), Shared Clean (S), and Invalid (I); and

wherein said Dtag state stored in said Dtags never indicates said Exclusive Clean (E) state and when each data processor modifies data stored in its cache memory in a cache line whose Etag thereby transitions from said E state to said M state, said data processor does not generate a corresponding transaction request and the corresponding Dtag remains unchanged with a Dtag state equal to said M state.

12. The method of claim 9,

wherein said main memory is a reflective memory;

said Etag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Exclusive Clean (E), Shared Clean (S), and Invalid (I);

said Dtag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Shared Clean (S), and Invalid (I); and

wherein said Dtag state stored in said Dtags never indicates said Exclusive Clean (E) state and when each data processor modifies data stored in its cache memory in a cache line whose Etag thereby transitions from said E state to said M state, said data processor does not generate a corresponding transaction request and the corresponding Dtag remains unchanged with a Dtag state equal to said M state.

13. A method for parallelizing writeback and read transactions in a packet switched cache coherent multiprocessor system having a system controller coupled to a main memory and to a data processor having a cache memory comprising the steps of:

storing master cache tags (Etags) in said data processor, including N Etags, one Etag for each cache line in said cache memory, said Etag for each cache line storing an address index and an Etag state value that indicates whether a data block stored in said cache line includes data modified by said data processor;

storing in a writeback buffer of said data processor a dirty victim data block displaced from said cache memory until it is written back into said main memory;

storing duplicate tags (Dtags) for said cache memory in said system controller;

responding to a cache miss in said cache memory by (A) generating a read request, and (B) when said cache miss requires victimizing a cache line containing a data block that, according to the Etag state value in the corresponding Etag, includes modified data, storing said data block having said modified data in a writeback buffer and generating a writeback request;

processing said writeback requests by writing the data block in said writeback buffer into said main memory and invalidating the state value in the corresponding Dtag; and

processing said read request by:

(A) identifying a victim cache line in said cache memory, if any, and accessing the Dtag corresponding to said victim cache line to determine whether processing said read request will displace from said cache memory a data block that includes modified data;

(B) retrieving a data block from said main memory corresponding to said read request and providing it to said data processor for storage in said data processor's cache memory;

(C) storing a Dtag state value and address tag in the Dtag corresponding to said victim cache line when processing said read request does not displace from said cache memory a modified data block and when said corresponding Dtag's state value is invalid;

(D) storing said Dtag state value and address tag for said retrieved data block in an N+1th Dtag when processing said read request does displace from said cache memory a modified data block and said corresponding Dtag's state value is not invalid; and

(E) transferring said N+1th Dtag into said Dtag corresponding to said victim cache line when said writeback processing step invalidates said Dtag state value in said corresponding Dtag;

wherein memory transaction request logic processes said read request and writeback request such that processing of either of said read request and writeback request may be completed prior to the other in accordance with resource availability for processing said requests.

14. The method of claim 13,

each said read request including a DVP flag that has a first value when said read request corresponds to a cache fill operation that displaces a modified data block from said cache memory, said data block displacement being represented by said writeback request; said DVP flag having a second value, distinct from said first value, when said read request corresponds to a cache fill operation that does not displace a modified data block from said cache memory; and

said read request processing step including storing a Dtag state value and address tag in the Dtag corresponding to said victim cache line when said DVP flag in said read request has said second value and when said corresponding Dtag's state value is invalid, and storing said Dtag state value and address tag for said read request in said N+1th Dtag when said DVP flag in said read request has said first value and said corresponding Dtag's state value is not invalid.

15. The method of claim 13,

said Etag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Shared Modified (O), Exclusive Clean (E), Shared Clean (S), and Invalid (I);

said Dtag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Shared Modified (O), Shared Clean (S), and Invalid (I); and

wherein said Dtag state stored in said Dtags never indicates said Exclusive Clean (E) state and when each data processor modifies data stored in its cache memory in a cache line whose Etag thereby transitions from said E state to said M state, said data processor does not generate a corresponding transaction request and the corresponding Dtag remains unchanged with a Dtag state equal to said M state.

16. The method of claim 13,

wherein said main memory is a reflective memory;

said Etag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Exclusive Clean (E), Shared Clean (S), and Invalid (I);

said Dtag state being selected from the set of states consisting essentially of: Exclusive Modified (M), Shared Clean (S), and Invalid (I); and

wherein said Dtag state stored in said Dtags never indicates said Exclusive Clean (E) state and when each data processor modifies data stored in its cache memory in a cache line whose Etag thereby transitions from said E state to said M state, said data processor does not generate a corresponding transaction request and the corresponding Dtag remains unchanged with a Dtag state equal to said M state.
Description
 


The present invention relates generally to multiprocessor computer systems in which the processors share memory resources, and particularly to a multiprocessor computer system that utilizes an interconnect architecture and cache coherence methodology to minimize memory access latency by parallelizing read and writeback transactions for improved system throughput.

BACKGROUND OF THE INVENTION

The need to maintain "cache coherence" in multiprocessor systems is well known. Maintaining "cache coherence" means, at a minimum, that whenever data is written into a specified location in a shared address space by one processor, the caches for any other processors which store data for the same address location are either invalidated, or updated with the new data.
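
The minimum requirement stated above (invalidate or update other cached copies on a write) can be shown with a toy model. This is a sketch under simplifying assumptions: each cache is a plain dictionary from address to value, and `coherent_write` is a hypothetical helper, not part of any described hardware.

```python
from typing import Dict, List

Cache = Dict[int, int]   # address -> cached value


def coherent_write(caches: List[Cache], writer: int, addr: int, value: int,
                   update: bool = False) -> None:
    """Write through one processor's cache and fix up the other caches."""
    caches[writer][addr] = value
    for cpu, cache in enumerate(caches):
        if cpu == writer or addr not in cache:
            continue
        if update:
            cache[addr] = value    # update the stale copy with the new data
        else:
            del cache[addr]        # invalidate the stale copy
```

With `update=False` the other processors lose their copy of the written address; with `update=True` they keep it and see the new value.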

There are two primary system architectures used for maintaining cache coherence. One, herein called the cache snoop architecture, requires that each data processor's cache include logic for monitoring a shared address bus and various control lines so as to detect when data in shared memory is being overwritten with new data, determining whether its data processor's cache contains an entry for the same memory location, and updating its cache contents and/or the corresponding cache tag when data stored in the cache is invalidated by another processor. Thus, in the cache snoop architecture, every data processor is responsible for maintaining its own cache in a state that is consistent with the state of the other caches.

In a second cache coherence architecture, herein called the memory directory architecture, main memory includes a set of status bits for every block of data that indicate which data processors, if any, have the data block stored in cache. The main memory's status bits may store additional information, such as which processor is considered to be the "owner" of the data block if the cache coherence architecture requires storage of such information.
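
A minimal sketch of such per-block status bits follows. The layout is an assumption for illustration: one bit per processor in an integer bitmask plus an optional owner field, with hypothetical names `DirectoryEntry`, `MemoryDirectory`, `record_read` and `record_write`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DirectoryEntry:
    sharers: int = 0              # bit i set: processor i caches the block
    owner: Optional[int] = None   # optional "owner" of the block


class MemoryDirectory:
    """Hypothetical directory keyed by block address."""

    def __init__(self) -> None:
        self.entries: Dict[int, DirectoryEntry] = {}

    def record_read(self, block: int, cpu: int) -> None:
        entry = self.entries.setdefault(block, DirectoryEntry())
        entry.sharers |= 1 << cpu

    def record_write(self, block: int, cpu: int) -> int:
        """Make cpu the sole holder; return a bitmask of caches to invalidate."""
        entry = self.entries.setdefault(block, DirectoryEntry())
        to_invalidate = entry.sharers & ~(1 << cpu)
        entry.sharers = 1 << cpu
        entry.owner = cpu
        return to_invalidate
```

The returned bitmask tells the memory side which processors hold stale copies, which is the information a snooping cache would otherwise have to discover by watching the bus.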

In these cache coherence architectures, read-writeback transaction pairs arise when a read miss requires victimizing a cache line which has modified data, thereby necessitating a writeback to main memory. In the prior art, these transactions normally are strictly ordered, with the victimizing read transaction executing prior to the writeback transaction in order to allow the requesting processor to receive the data right away. In addition to the strict ordering, cache coherence architectures of the prior art required that these read and writeback transactions be executed sequentially, not allowing any other coherent transactions from the same processor to be executed between the read and the writeback transactions, even when those transactions are directed to a different cache index. Accordingly, an architecture which supported parallelized transactions would provide reduced latency in processing the individual read-writeback transaction pairs along with an improvement in the overall transaction throughput.
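To make the origin of a read-writeback pair concrete, the following sketch models a small direct-mapped cache and returns the transactions generated by a read. The line size, the number of lines, and the function names are assumptions chosen for illustration only.

```python
LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_LINES = 4    # lines in the direct-mapped cache (assumed)


def index_and_tag(address: int):
    """Split an address into a cache index and an address tag."""
    block = address // LINE_SIZE
    return block % NUM_LINES, block // NUM_LINES


def transactions_for_read(cache: list, address: int) -> list:
    """cache holds one (tag, dirty) tuple or None per cache index."""
    index, tag = index_and_tag(address)
    entry = cache[index]
    if entry is not None and entry[0] == tag:
        return []  # cache hit: no memory transaction needed
    txns = [("read", address)]
    if entry is not None and entry[1]:
        # The victim line holds modified data, so it must be written back.
        victim_address = (entry[0] * NUM_LINES + index) * LINE_SIZE
        txns.append(("writeback", victim_address))
    cache[index] = (tag, False)
    return txns
```

A read miss on a clean or empty line yields only a read; a read miss on a modified line yields the read-writeback pair discussed above.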

SUMMARY OF THE INVENTION

In summary, the present invention is a multiprocessor computer system that has a multiplicity of sub-systems and a main memory coupled to a system controller. An interconnect module interconnects the main memory and sub-systems in accordance with interconnect control signals received from the system controller.

All of the sub-systems include a port that transmits and receives data as data packets of a fixed size. At least two of the sub-systems are data processors, each having a respective cache memory that stores multiple blocks of data and a set of master cache tags (Etags), including one cache tag for each data block stored by the cache memory.

Each data processor includes a master interface having master classes for sending memory transaction requests to the system controller and for receiving cache access requests from the system controller corresponding to memory transaction requests by other ones of the data processors. The master classes allow for the simultaneous launching of read and writeback transactions. The system controller includes memory transaction request logic for processing each memory transaction request by a data processor, for determining which one of the cache memories and main memory to couple to the requesting data processor, for sending corresponding interconnect control signals to the interconnect module so as to couple the requesting data processor to the determined one of the cache memories and main memory, and for sending a reply message to the requesting data processor to prompt the requesting data processor to transmit or receive one data packet to or from the determined one of the cache memories and main memory.

The system controller maintains a set of duplicate cache tags (Dtags) for each of the data processors, the set of duplicate cache tags for each data processor having the same number of cache tags as the corresponding set of master cache tags. Each master cache tag denotes a master cache state and an address tag; the duplicate cache tag corresponding to each master cache tag denotes a second cache state and the same address tag as the corresponding master cache tag.
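The relationship between a master cache tag and its duplicate can be sketched as follows. The claims state that a Dtag never indicates the Exclusive Clean (E) state and that an Etag in the E state corresponds to a Dtag in the M state. The remaining state names used here (O, S, I) and the function name are assumptions made only for this illustration.

```python
ETAG_STATES = {"M", "O", "E", "S", "I"}  # assumed five-state set; E and M are from the claims
DTAG_STATES = ETAG_STATES - {"E"}        # a Dtag never indicates E


def dtag_for(etag: tuple) -> tuple:
    """Map an Etag (address_tag, state) to its duplicate tag.

    The address tag is copied unchanged; an E state is recorded as M,
    so a later silent E-to-M transition needs no Dtag update.
    """
    address_tag, state = etag
    if state not in ETAG_STATES:
        raise ValueError(f"unknown Etag state: {state}")
    return address_tag, ("M" if state == "E" else state)
```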

The system controller further includes logic for executing a read-writeback pair of transactions in parallel, including an Nth+1 Dtag and a transient writeback buffer for each data processor. The Nth+1 Dtag for each processor stores the cache state and address tag of the cache line associated with a read transaction which is executed prior to an associated writeback transaction of a read-writeback transaction pair. The system controller contains Dtag update logic for transferring the Dtag value stored in the Nth+1 Dtag entry to its proper Dtag location upon the execution of the associated writeback transaction.
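The role of the Nth+1 Dtag can be sketched as an N-entry Dtag array with one extra transient entry per data processor. The class and method names below are hypothetical, and the handling of a writeback that executes before its read (marking the entry empty) is an assumption of this sketch rather than a statement of the actual update logic.

```python
class DtagArray:
    """Hypothetical Dtag array: N regular entries plus one transient entry."""

    def __init__(self, num_lines: int):
        self.tags = [None] * num_lines  # (address_tag, state) per cache index
        self.extra = None               # "Nth+1" entry: (index, address_tag, state)

    def read_executed(self, index, address_tag, state, writeback_pending):
        if writeback_pending:
            # The victim's Dtag must remain in place until its writeback
            # executes, so the new line's tag is parked in the extra entry.
            self.extra = (index, address_tag, state)
        else:
            self.tags[index] = (address_tag, state)

    def writeback_executed(self, index):
        if self.extra is not None and self.extra[0] == index:
            # The read ran first: move the parked tag to its proper location.
            _, address_tag, state = self.extra
            self.tags[index] = (address_tag, state)
            self.extra = None
        else:
            # The writeback ran first: the line is empty until the read runs.
            self.tags[index] = None
```

Either ordering of the pair leaves the regular Dtag entry describing the newly read line once both transactions have executed.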

The writeback buffer in each data processor stores the data block previously stored in a victimized cache line until the associated writeback transaction is completed. Accordingly, upon a cache miss, the interconnect may execute the read and writeback transactions in parallel, relying on the transient writeback buffer and the Nth+1 Dtag entry to accommodate any ordering of the transactions. As a result, the read request and writeback request of a read-writeback transaction pair are processed such that processing of either request may be completed prior to the other in accordance with resource availability for processing those requests. For instance, if the read and writeback transactions reference two different main memory banks, one of those memory banks may be busy while the other is available for immediate use. Thus, using the present invention, the transaction which references the available memory bank will be processed first, regardless of whether that transaction is the read transaction or the writeback transaction. This is in direct contrast with other systems in which read-writeback pairs are handled in a fixed order, and which thus do not make optimal use of system resources.
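The memory bank example above can be sketched as a small ordering function. The bank mapping (line-interleaved across two banks), the parameter names, and the function name are assumptions for illustration; the sketch only shows that the transaction whose bank is free is issued first, whichever member of the pair it is.

```python
def issue_order(pair, busy_banks, num_banks=2, line_size=64):
    """Order a read-writeback pair so transactions on free banks go first.

    pair is a list of (kind, address) tuples; busy_banks is a set of bank
    numbers that cannot accept a transaction right now.
    """
    def bank(address):
        # Assumed mapping: consecutive lines alternate between banks.
        return (address // line_size) % num_banks

    # sorted() is stable, so the original order is kept when both
    # transactions are equally available (or equally blocked).
    return sorted(pair, key=lambda txn: bank(txn[1]) in busy_banks)
```

For example, with the read directed to bank 0 and the writeback to bank 1, a busy bank 0 causes the writeback to be issued first, while a busy bank 1 leaves the read first.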

BRIEF DESCRIPTION OF THE DRAWINGS

Additional objects and features of the invention will be more readily apparent from the following detailed description and appended claims when taken in conjunction with the drawings, in which:

FIG. 1 is a block diagram of a computer system incorporating the present invention.

FIG. 2 is a block diagram of a computer system showing the data bus and address bus configuration used in one embodiment of the present invention.

FIG. 3 depicts the signal lines associated with a port in a preferred embodiment of the present invention.

FIG. 4 is a block diagram of the interfaces and port ID register found in a port in a preferred embodiment of the present invention.

FIG. 5 is a block diagram of a computer system incorporating the present invention, depicting request and data queues used while performing data transfer transactions.

FIG. 6 is a block diagram of the System Controller Configuration register used in a preferred embodiment of the present invention.

FIG. 7 is a block diagram of a caching UPA master port and the cache controller in the associated UPA module.

FIGS. 8, 8A, 8B, 8C, and 8D show a simplified flow chart of typical read/write data flow transactions in a preferred embodiment of the present invention.

FIG. 9 depicts the writeback buffer and Dtag Transient Buffers used for handling coherent cache writeback operations.

FIGS. 10A-10E show the data packet formats for various transaction request packets.

FIG. 11 is a state transition diagram of the cache tag line states for each cache entry in an Etag array in a preferred embodiment of the present invention.

FIG. 12 is a state transition diagram of the cache tag line states for each cache entry in