WikiPatents - Community Patent Review
Dual-ported memory controller which maintains cache coherency using a memory line status table    
United States Patent 5991819
Link to this page: http://www.wikipatents.com/5991819.html
Inventor(s): Young; Gene F. (Lexington, SC)
Abstract: A symmetric multiprocessor system constructed from industry standard commodity components together with an advanced dual-ported memory controller. The multiprocessor system comprises a processor bus; up to four Intel Pentium® Pro processors connected to the processor bus; an I/O bus; a system memory; and a dual-ported memory controller connected to the system memory, the dual ported memory controller having a first port connected to the processor bus to manage processor to system memory transactions and a second port connected to the I/O bus to manage I/O transactions. Furthermore, two such systems can be connected together through a common I/O bus, thereby creating an eight-processor Pentium® Pro processor SMP system.



Owner/Assignee: Intel Corporation (Santa Clara, CA)
Publication Date: November 23, 1999
Application Number: 08/760,126
Filing Date: December 3, 1996
Examiner: An; Meng-Ai T.
Assistant Examiner: Patel; Gautam R.
Attorney/Law Firm: Blakely, Sokoloff, Taylor & Zafman LLP


Claims
 


What is claimed is:

1. A computer system comprising:

a first processor bus;

a first plurality of processors having cache memories, the first plurality of processors coupled to the first processor bus;

a second processor bus;

a second plurality of processors having cache memories, the second plurality of processors coupled to the second processor bus;

a memory controller coupled to the first processor bus and to the second processor bus, the memory controller to control data flow to and from the first processor bus and the second processor bus; and

a system memory coupled to the memory controller, the system memory to store a set of memory state bits for each line in the system memory, the memory state bits indicating ownership for an associated line in system memory, the memory state bits also indicate sharing information corresponding to a first bus and a second bus for the associated lines in the system memory.

2. The computer system of claim 1 wherein the memory state bits indicate sharing information for an associated line in system memory corresponding to a first bus and a second bus.

3. The computer system of claim 1 wherein the memory state bits comprise three bits for each line in the system memory.

4. A method comprising:

accessing a line in a system memory;

managing coherency of the line in system memory with corresponding lines stored in cache memories coupled to individual processors in a multiprocessor computer system, the managing accomplished with a memory controller; and

determining, based, at least in part, on a set of memory state bits, ownership and sharing information for corresponding lines in the system memory, wherein the ownership information indicates ownership of the corresponding lines in system memory by a processor coupled to one of a first processor bus and a second processor bus, and further wherein the sharing information indicates whether the corresponding lines in system memory are shared between the first processor bus and the second processor bus.

5. The method of claim 4 wherein the memory state bits indicate sharing information for an associated line in system memory corresponding to a first bus and a second bus.

6. The method of claim 4 wherein the memory state bits comprise three bits for each line in the system memory.

7. An apparatus comprising:

means for accessing a line in a system memory;

means for managing coherency of the line in system memory with corresponding lines stored in cache memories coupled to individual processors in a multiprocessor computer system, the managing accomplished with a memory controller; and

means for determining, based, at least in part, on a set of memory state bits, ownership and sharing information for corresponding lines in the system memory, wherein the ownership information indicates ownership of the corresponding lines in system memory by a processor coupled to one of a first processor bus and a second processor bus, and further wherein the sharing information indicates whether the corresponding lines in system memory are shared between the first processor bus and the second processor bus.

8. The apparatus of claim 7 wherein the memory state bits indicate sharing information for an associated line in system memory corresponding to a first bus and a second bus.

9. The apparatus of claim 7 wherein the memory state bits comprise three bits for each line in the system memory.
Description
 


The present invention relates to multiprocessing computer systems and, more particularly, to symmetric multiprocessing (SMP) computer systems including multiple system busses.

BACKGROUND OF THE INVENTION

Currently available Standard High Volume (SHV) commodity computer components, such as processors, single in-line memory modules (SIMMs), peripheral devices, and other specific components, have made it possible to easily and cheaply design and build computer systems including personal computers and multiprocessor servers. While it is possible to create custom personal computer and server designs which may be more clever and efficient than systems fashioned from standard components, the benefits of volume pricing and availability provided by the use of commodity components impart a significant advantage to computer systems constructed from commodity SHV components.

FIG. 1 provides a simple block diagram of a standard high volume (SHV) symmetric multiprocessing (SMP) computer system employing currently available commodity components. The design shown employs Intel Pentium® Pro™ processors and a high-performance bus and chipset, such as an Intel processor bus and 8245GX chipset, respectively, that are intended to be the SHV companions of the Pentium® Pro processor.

The system as shown in FIG. 1 includes up to four processors 101 connected to a high-bandwidth split-transaction bus 103. A system memory 105 is connected to bus 103 through a memory controller chipset 107. Connection to standard PCI devices, not shown, is provided through PCI I/O interfaces 109. As stated above, all of these components are currently available commodity components. The characteristics of the SHV architecture shown in FIG. 1 include:

Support for up to four processors 101 and two PCI interfaces 109 on a single bus 103.

A high-performance bus topology operating at 66 MHz with a 64-bit datapath, and capable of a sustained data transfer rate of 533 MB/s.

A memory controller 107 consisting of two chips 111 and 113 which present two loads to bus 103.

Processor bus-to-PCI I/O bridges (IOBs) 109 that will peak at a data transfer rate of 132 MB/s with a sustained data transfer rate of about 80 MB/s.
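
As a rough sanity check on the figures in the list above, peak bandwidth can be estimated as clock rate times data-path width. The Python sketch below is illustrative only: the function name is hypothetical, the 200/3 MHz value assumes that the nominal "66 MHz" clock is really 66.67 MHz, and the 33 MHz, 32-bit PCI parameters are assumptions based on conventional PCI rather than anything stated in this text.

```python
def peak_mb_per_s(clock_mhz: float, width_bits: int) -> float:
    """Peak transfer rate in MB/s, assuming one full-width transfer per clock."""
    return clock_mhz * width_bits / 8


# Nominal 66 MHz processor bus with a 64-bit data path (clock value assumed).
processor_bus = peak_mb_per_s(200 / 3, 64)

# Conventional 33 MHz, 32-bit PCI (assumed parameters, not given in the text).
pci_bus = peak_mb_per_s(33, 32)

print(round(processor_bus), round(pci_bus))
```

Under these assumptions the results round to 533 and 132, matching the 533 MB/s and 132 MB/s figures quoted above.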

The architecture shown in FIG. 1, constructed of commodity components, simplifies the design of a reasonably powerful multiprocessor system having up to four processors. However, the Pentium® Pro processor/processor bus architecture described above does not permit expansion beyond four processors. Other improvements to the system thus far described are also possible.

OBJECTS OF THE INVENTION

It is therefore an object of the present invention to provide a new and useful symmetric multiprocessor system constructed from industry standard commodity components.

It is another object of the present invention to provide such a system which provides support for up to eight or more processors.

It is yet another object of the present invention to provide a new and useful multiple bus symmetric multiprocessor system constructed from industry standard commodity components offering improved performance over existing single bus systems.

It is still a further object of the present invention to provide such a system which provides a first bus for processor to memory transactions and a separate bus for I/O transactions.

SUMMARY OF THE INVENTION

There is provided, in accordance with the present invention, a symmetric multiprocessor system constructed from industry standard commodity components together with an advanced dual-ported memory controller. The multiprocessor system comprises a snooped processor bus; at least one processor connected to the processor bus; an I/O bus; a system memory; and a dual-ported memory controller connected to the system memory, the dual ported memory controller having a first port connected to the processor bus to manage processor to system memory transactions and a second port connected to the I/O bus to manage I/O transactions.

In the described embodiment, up to four Intel Pentium® Pro processors are connected to the processor bus, and the processor bus and I/O bus are each Intel processor busses. Furthermore, two such systems can be connected together through a common I/O bus, thereby creating an eight-processor Pentium® Pro processor SMP system.

The above and other objects, features, and advantages of the present invention will become apparent from the following description and the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a simple block diagram representation of a four processor standard high volume (SHV) symmetric multiprocessing (SMP) computer system employing currently available commodity components.

FIG. 2 is a simple block diagram representation of a standard high volume (SHV) symmetric multiprocessing (SMP) computer system employing a dual ported advanced memory controller and providing support for more than four processors in accordance with the present invention.

FIG. 3 is a simple block diagram representation of an eight processor standard high volume (SHV) symmetric multiprocessing (SMP) computer in accordance with the present invention.

FIG. 4 is a block diagram illustration of the control logic included within the advanced memory controller shown in FIGS. 2 and 3.

FIG. 5 illustrates the arrangement of the components of FIGS. 5A through 5I.

FIGS. 5A, 5B, 5C, 5D, 5E, 5F, 5G, 5H and 5I illustrate the data path logic of an advanced memory controller according to one embodiment of the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

The advanced multiprocessor architecture described herein uses system techniques pioneered by NCR while also advantageously making use of standard high volume (SHV) components, such as Intel Pentium® Pro processors, PCI I/O chipsets, Pentium® Pro processor chipsets, Pentium® Pro processor bus topology, and standard memory modules (SIMMs and DIMMs). Through careful integration of NCR system techniques with SHV components, NCR is able to deliver world class scalability and feature content while still capitalizing on SHV and without the disadvantage associated with full custom development.

1. One to Four Processor SMP Implementation

FIG. 2 provides a simple block diagram representation of an SHV symmetric multiprocessing computer system employing a dual ported advanced memory controller and providing support for up to four processors in accordance with the present invention. Note the similarities shown in the system of FIG. 2 and the commodity Pentium® Pro processor bus SHV system shown in FIG. 1.

The system as shown in FIG. 2 includes up to four processors 201 connected to a high-bandwidth split-transaction processor bus 203. A system memory 205 is connected to bus 203 through an advanced dual-ported memory controller 207. The processor bus 203 is connected to the first port of memory controller 207. The second memory controller port connects to an I/O bus 215, also referred to herein as an expansion bus, which provides connection for multiple PCI I/O interfaces 209. All of these components, with the exception of advanced memory controller 207, are currently available commodity components.

The advanced memory controller (AMC) 207 manages control and data flow in all directions between the processor bus 203 and I/O bus 215. The I/O bus may contain processor bus to PCI I/O Bridges and another AMC ASIC for connectivity to another processor bus, as will be discussed below. The AMC 207 also controls access to a coherent DRAM memory array. The AMC as presently implemented consists of a control and data slice ASIC pair. A more detailed discussion of the NCR Advanced Memory Controller will be provided below.

The four processors use a bus snooping protocol on the processor bus 203. Bus snooping is a method of keeping track of data movements between processors and memory. There are performance advantages to this system with a small number of tightly-coupled processors. If a processor needs data that is available in the data cache of another processor on the same bus, the data can be shared by both processors. Otherwise, the data must be retrieved from main memory 205, a more time consuming operation which requires system bus traffic. This method enhances system performance by reducing system bus contention.

The characteristics of the NCR architecture shown in FIG. 2 include:

Capitalizes on industry SHV architecture and supporting commodity chips (IOB, etc.).

Dual ported memory controller 207 permits connection and utilization of dual buses, each operating at 66 MHz with a bandwidth of 64 bits and capable of sustained data transfer rates of 533 MB/s.

Dual bus approach provides greater scalability through a reduction of bus loadings and provision of a private processor to memory path that can operate independent of IOB to IOB traffic.

Additional processors and I/O devices can be connected to the expansion bus 215.

The system as described is able to fill High Availability Transaction Processing (HATP) and Scaleable Data Warehouse (SDW) server needs, while capitalizing on the industry SHV motion.

2. Four to Eight Processor SMP Implementation

The advanced SMP architecture shown in FIG. 2, as well as the SHV architecture shown in FIG. 1, show systems containing up to four processors. However, the advanced architecture of the present invention is designed to allow two complexes, each similar to that shown in FIG. 2, to be interconnected to form an eight-processor system.

FIG. 3 illustrates an eight-processor SMP system formed of two four-processor building blocks. Each block, identified by reference numerals A and B, is seen to include all of the structure shown in FIG. 2. System components are identified by reference numerals ending in an A or a B, for complex "A" and "B", respectively. However, the two systems are interconnected through a common expansion bus 315. From this Figure it is easy to see the modularity and versatility of this system. Additional structure, not shown in FIG. 2, includes a cache memory 321A and 321B associated with each processor 301A and 301B, respectively.

In any system employing a data cache memory, and particularly a system employing multiple cache memories, data from a given memory location can reside simultaneously in main memory and in one or more cache memories. However, the data in main memory and in cache memory may not always be the same. This may occur when a microprocessor updates the data contained in its associated cache memory without updating the main memory and other cache memories, or when another bus master changes data in main memory without updating its copy in the microprocessor cache memories.

To track the data moving between the processors, system memories 307A and 307B, and the various cache memories, the system utilizes a hybrid of memory and cache based coherency. Coherency between system memory and caching agents is maintained via a combination centralized/distributed directory-based cache coherency.

A directory-based cache coherency scheme is a method of keeping track of data movements between the processors and memory. With this approach to data coherency, a memory status table identifies which processors have which lines of memory in their associated cache memories. When a processor requests data, the status table identifies the location within main memory or processor cache memory where the most current copy of the data resides. The advantage of this method is that no additional work must be performed until a processor needs data that resides in a cache that cannot be accessed through snooping. Directory-based cache coherency is most effective with a large number of tightly-coupled processors on a system bus.

The centralized/distributed directory-based cache coherency scheme employed in the system shown in FIG. 3 consists of two directory elements. The central element within the directory scheme resides in the system memories and is called the Memory Line Status Table (MLST). Each memory line within system memory includes a corresponding entry in its MLST. This corresponding entry contains information indicating whether or not a line is cached, and if so, whether it is exclusively owned by one processor (or processor bus), or shared across multiple processors (or processor buses). The directory scheme and MLST can be set up to identify memory line ownership by processor bus or by processor. The "bit-per-bus" MLST distinguishes ownership on a bus basis, while the more granular "bit-per-processor" MLST distinguishes ownership on a processor basis. Note that the distinction is specific to a memory design and hence transparent to any other device on the system bus. Distributed directory elements reside locally within each processor's cache directory. The element associated with a particular processor is referred to as its Processor Line Status Table (PLST). Each cache line has a corresponding entry in the PLST. From the local processor's perspective this entry contains information indicating whether or not a line contains a valid copy of a main memory line, and if so, whether or not modifications to that line must be broadcast to the rest of the system. From the system's perspective, each processor's PLST is a slave to special system bus cycles known as Memory Intervention Commands (MICs). These cycles query the PLST as to the local state of a particular line, and/or tell the PLST to change that local state.

The Modified-Exclusive-Shared-Invalid (MESI) cache coherency protocol is a hardware-implemented protocol for maintaining data consistency between main memory and data cache memories. A typical implementation of the MESI hardware cache coherency protocol requires the utilization of cache controllers having the ability to:

1. use the same line size for all caches on the memory bus;

2. observe all activity on the memory bus;

3. maintain state information for every line of cache memory; and

4. take appropriate action to maintain data consistency within the cache memories and main memory.

MESI represents four states which define whether a line is valid, if it is available in other caches, and if it has been modified. Each line of data in a cache includes an associated field which indicates whether the line of data is MODIFIED, EXCLUSIVE, SHARED, or INVALID. Within the Processor Line Status Table each cache line is marked in one of the four possible MESI states:

MODIFIED (PM)--This state indicates a line of data which is exclusively available in only this cache, and is modified. Modified data has been acted upon by a processor. A Modified line can be updated locally in the cache without acquiring the shared memory bus. If some other device in the system requires this line, the owning cache must supply the data.

EXCLUSIVE (PE)--This state indicates a line of data which is exclusively available in only this cache, that this line is not Modified (main memory also has a valid copy), and that the local processor has the freedom to modify this line without informing the system. Exclusive data can not be used by any other processor until it is acted upon in some manner. Writing to an Exclusive line causes it to change to the Modified state and can be done without informing other caches, so no memory bus activity is generated. Note that lines in the (PE) state will be marked (MO) in the MLST, as will be described below.

SHARED (PS)--This state indicates a line of data which is potentially shared with other caches (the same line may exist in one or more caches). Shared data may be shared among multiple processors and stored in multiple caches. A Shared line can be read by the local processor without a main memory access. When a processor writes to a line locally marked shared, it must broadcast the write to the system as well.

INVALID (PI)--This state indicates a line of data is not available in the cache. Invalid data in a particular cache is not to be used for future processing, except diagnostic or similar uses. A read to this line will be a "miss" (not available). A write to this line will cause a write-through cycle to the memory bus. All cache lines are reset to the (PI) state upon system initialization.

In accordance with the MESI protocol, when a processor owns a line of memory, whether modified or exclusive, any writes to the owned line of memory within main memory will result in an immediate update of the same data contained within the processor's data cache memory.
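
The local behavior of the four PLST states described above can be summarized in a small Python sketch. It is a simplified illustration, not the controller's actual logic: the names `LineState`, `read_hits`, `write_needs_bus` and `state_after_local_write` are hypothetical, and the sketch covers only the transitions the text states explicitly (the resulting state after a write to a Shared or Invalid line is not specified above, so it is deliberately left out).

```python
from enum import Enum


class LineState(Enum):
    PM = "MODIFIED"
    PE = "EXCLUSIVE"
    PS = "SHARED"
    PI = "INVALID"


def read_hits(state: LineState) -> bool:
    """A local read hits unless the line is Invalid (a read to PI is a miss)."""
    return state is not LineState.PI


def write_needs_bus(state: LineState) -> bool:
    """Shared writes must be broadcast; Invalid writes go through to the bus.

    Modified and Exclusive lines can be written without bus activity.
    """
    return state in (LineState.PS, LineState.PI)


def state_after_local_write(state: LineState) -> LineState:
    """Modified stays Modified; Exclusive becomes Modified on a write."""
    if state in (LineState.PM, LineState.PE):
        return LineState.PM
    # The text above does not give the resulting state for PS or PI.
    raise ValueError(f"resulting state for {state.name} is not covered by this sketch")


for s in LineState:
    print(s.name, "read hit:", read_hits(s), "write needs bus:", write_needs_bus(s))
```

Raising an error for the Shared and Invalid cases keeps the sketch from guessing at behavior the description leaves open.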

The Memory Line Status Table marks a memory line in one of three possible states: NOT CACHED (MNC), SHARED (MS), and OWNED (MO). The letter M distinguishes these states from PLST states, which are identified by use of the letter P. Additionally there are bus and/or processor state bits indicating sharing or ownership on either a bus or processor basis.

NOT CACHED (MNC): Indicates that no cache has a copy of that line. All memory lines must be reset to the (MNC) state upon system initialization.

SHARED STATE (MS): Indicates that one or more caches potentially have a copy of that line.

OWNED STATE (MO): Indicates that one and only one cache potentially has a copy of that line, and that the data in memory potentially does not match it (Memory data is referred to as stale).

Note the word "potentially" used in the definition of the shared and owned states. There are several situations in which the MLST does not have the most up-to-date information about a particular memory line. For example, the MLST may mark a line as shared by two particular processors since it saw them both read it. However, both processors may have long since discarded that line to make room for new data without informing the MLST (referred to as "silent replacement"). The MLST will naturally "catch up" to the latest state of a particular line whenever an access to that line by some master forces a MIC. In this example, a write by a third processor to this line will initiate a (now superfluous) MIC to invalidate other cached copies, and will bring the MLST up-to-date. Note however that the MLST always holds a conservative view of the state of cache lines. That is, a line that is owned or shared by a processor will always be marked correctly in the MLST. "Stale" information in the MLST takes the form of lines marked owned or shared that are no longer present in any processor's data cache.
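
The "silent replacement" example above can be walked through with a toy model. The Python sketch below is a loose illustration under stated assumptions: `MlstEntry` and its methods are hypothetical names, the entry tracks individual processors in a set rather than real state bits, reads of an owned line are not modeled, and marking the line as owned by the writer after the MIC is an assumption about what "bring the MLST up-to-date" means.

```python
class MlstEntry:
    """Toy model of one Memory Line Status Table entry (hypothetical)."""

    def __init__(self):
        self.state = "MNC"    # NOT CACHED after initialization
        self.holders = set()  # processors the MLST believes may hold the line

    def record_read(self, cpu: str) -> None:
        # Simplification: the case where the line is currently owned is ignored.
        self.state = "MS"
        self.holders.add(cpu)

    def record_write(self, cpu: str) -> list:
        """Return the processors that would receive an invalidating MIC."""
        targets = sorted(self.holders - {cpu})
        # Assumption: after the write, the line is marked owned by the writer.
        self.state = "MO"
        self.holders = {cpu}
        return targets


entry = MlstEntry()
entry.record_read("P0")
entry.record_read("P1")
# P0 and P1 now silently discard the line; the MLST is not informed,
# so its view stays conservative (both are still listed as sharers).
mic_targets = entry.record_write("P2")
print(mic_targets, entry.state, sorted(entry.holders))
```

In this model the write by P2 still sends MICs to P0 and P1 even though neither holds the line any longer, which is the superfluous but harmless case the paragraph describes.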

As stated above, the MLST includes additional bus and/or processor state bits indicating sharing or ownership on either a bus or processor basis.

The Bit-per-Bus Protocol uses three memory state bits per line to indicate the current state of the line. One bit indicates shared or owned, and the other two depict which bus (A or B) or buses (A and B) have the line shared or owned. Bus ownership indicates that one of the processors on that bus owns the line. Note that a line can be owned by only one processor and therefore by only one bus. A shared line can be shared by one or more processors on each bus.

TABLE 1
Memory State Bits for Bit-per-Bus Protocol
______________________________________
OBA   STATE BIT DEFINITIONS     DESCRIPTION
______________________________________
000   MNC - Not Cached          Not owned or shared
001   MS - Shared               Shared on Bus A
010   MS - Shared               Shared on Bus B
011   MS - Shared               Shared on Buses A and B
100   x - (not a valid state)
101   MO - Owned                Owned by Bus A
110   MO - Owned                Owned by Bus B
111   x - (not a valid state)
______________________________________
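
The encoding in Table 1 can be expressed as a small decoder. The Python sketch below is only an illustration: the function name and the returned tuple format are hypothetical, and it assumes the three bits are ordered O (most significant), B, A (least significant), as the "OBA" column header suggests.

```python
def decode_bit_per_bus(oba: int):
    """Decode a 3-bit OBA value into (state, buses) following Table 1."""
    if not 0 <= oba <= 0b111:
        raise ValueError("OBA must be a 3-bit value")
    owned = bool(oba & 0b100)
    buses = [name for name, mask in (("A", 0b001), ("B", 0b010)) if oba & mask]
    if not owned:
        return ("MNC", []) if not buses else ("MS", buses)
    if len(buses) == 1:
        return ("MO", buses)
    # Owned with no bus, or owned by both buses, is not a valid state.
    return ("invalid", buses)


for value in range(8):
    print(format(value, "03b"), decode_bit_per_bus(value))
```

The two invalid encodings fall out of the rule that an owned line belongs to exactly one bus.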

The Bit-per-Processor Protocol has an MLST consisting of n+1 bits per line (n is equal to the number of processors) to indicate the current state of that line. One bit indicates whether the line is shared (MS) or owned (MO), and the other n bits depict which processor or processors have the line cached. A particular processor is numbered Pi, where i=0 to n-1. All Pi, where i is even, are on bus A, and all Pi, where i is odd, are on bus B. Processor ownership indicates which processor (only one) owns the line. A shared line can be shared by one or more processors on either or both buses.

TABLE 2
Memory State Bits for Bit-per-Processor Protocol
______________________________________
O   P0..Pn-1            STATE BIT DEFINITIONS
______________________________________
0   all zeros           MNC - Not Cached
0   one or more set     MS - Shared
1   only one set        MO - Owned
1   more than one set   x - (not a valid state)
1   all zeros           x - (not a valid state)
______________________________________
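
Table 2 and the even/odd bus assignment can be sketched the same way. This Python example is illustrative only: the function names are hypothetical, and the processor bits are passed as a list of booleans indexed P0..Pn-1 rather than as packed hardware bits.

```python
def decode_bit_per_processor(owned: bool, present: list):
    """Decode the O bit plus per-processor bits into (state, holders) per Table 2."""
    holders = [i for i, bit in enumerate(present) if bit]
    if not owned:
        state = "MNC" if not holders else "MS"
    else:
        # An owned line must be held by exactly one processor.
        state = "MO" if len(holders) == 1 else "invalid"
    return state, holders


def bus_of(processor_index: int) -> str:
    """Even-numbered processors sit on bus A, odd-numbered ones on bus B."""
    return "A" if processor_index % 2 == 0 else "B"


state, holders = decode_bit_per_processor(False, [True, False, False, True])
print(state, [(f"P{i}", bus_of(i)) for i in holders])
```

In the usage line, a four-processor entry with P0 and P3 set and the O bit clear decodes as shared, with one sharer on each bus.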

3. NCR Advanced Memory Controller

The NCR Advanced Memory Controller (AMC) manages control and data flow in all directions between a processor bus (also referred to herein as the CPU bus) and an I/O bus (also referred to herein as the expansion bus). The I/O bus may contain processor bus-to-PCI I/O Bridges and another AMC ASIC for connectivity to another processor bus. The AMC also controls access to a system memory.

At its highest level, the AMC is made up of a CPU side interface, an I/O side interface, logic for controlling a Memory Line Status Table (MLST) and generating/controlling Memory Intervention Commands (MICs), and a dual-ported DRAM controller.

Further granularity of these blocks is described in the following subsections. These logic blocks are partitioned into two ASICs--a control logic ASIC (the AMC_DC chip) and a data path logic ASIC (the AMC_DP). FIG. 4 provides a block diagram illustration of the control logic included within the advanced memory controller, and FIG. 5 provides a block diagram illustration of the data path logic included within the advanced memory controller.

3.1 CPU Bus Side

3.1.1 CPU Decoder Logic (PDEC) 401

This logic examines every request on the CPU bus and determines whether the request is targeted at this chip (the request should be loaded into its CPU Outbound Request Queue). The logic determines whether the request should be forwarded to its DRAM controller unit, or forwarded out onto the I/O bus.

The CPU Decoder Logic also works with the MLST/MIC Control Logic. It can cancel MLST lookups that may have started when the decode results show the request is targeting Remote memory or I/O.

3.1.2 CPU Bus Engine (PBE) 403

The CPU Bus Engine contains all the circuitry necessary for efficient communication between the up to four processors on its CPU bus and the AMC. The CPU Bus Engine includes the following logic sub-blocks: Bus Tracking Queues and Control, Data Buffer Allocation/Transfer Logic, Control Signal/Response Generation, and Requesting Agent Logic.

3.1.2.1 Bus Tracking Queues and Control

The Bus Tracking Queue (BSTQ) is responsible for tracking all bus transactions that are initiated on the CPU bus. It is a single physical structure which implements the 4 logical structures of the processor bus: the In-Order queue, which tracks all transactions on the processor bus, the Snoop Queue, the Out-Bound Buffer Allocation Queue and the Internal Tracking Logic Queue. The BSTQ is a moderate sized structure which allows the resource to track the state of the current operation as it relates to this resource (AMC) and other resources processors on the CPU bus. The table below lists the information that is contained within the BSTQ.

TABLE X
Bus State Tracking Queue bit assignments
______________________________________
BSTQ field                          # bits  Function
______________________________________
Transaction Responding Agent (tra)  1       This bit is set to indicate that the request assigned to this element is targeted to this resource.
Snoop Window Complete (swc)                 This bit is set to a `1` when the snoop result phase for this request has completed on the CPU Interface.
Internal Operation Commit (ioc)             This bit is set to a `1` by the target block to indicate that the data operation is complete. The ioc bit may be set to a `1` at the access initiation, or to a `0` and be set to a `1` at a later time. If the ioc bit is set to a `0` at the initiation of the request, the snoop result phase of this request will be extended until ioc is set to a `1`.
Requestor (ro)                              This bit is set to a `1` if this resource initiated the request logged into this BSTQ element.
Data Buffer Pointer (db)                    These bits contain the pointer of the data buffer element (outbound or inbound) assigned to this request.
Transfer Direction (Trndir)                 This bit specifies the direction of the data transfer requested for this access.
Response Type (rt)                          These bits are used to determine the transaction response for a given request.
Data Length (Len)                           This field is used to specify the length of the data transfer of the original request.
______________________________________
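As an illustration only, one BSTQ element might be modeled in C as a bit-field record. The one-bit widths follow the table's wording ("this bit"); the widths chosen for db, rt, and len are assumptions, since the table does not give them, and the type and function names are hypothetical.

```c
#include <stdbool.h>

/* Sketch of one BSTQ element. Field names follow Table X. */
typedef struct {
    unsigned tra    : 1; /* request is targeted to this resource (AMC) */
    unsigned swc    : 1; /* snoop result phase has completed */
    unsigned ioc    : 1; /* target block has committed the data operation */
    unsigned ro     : 1; /* this resource initiated the request */
    unsigned db     : 3; /* data buffer pointer; width is ASSUMED */
    unsigned trndir : 1; /* direction of the requested data transfer */
    unsigned rt     : 3; /* response type; width is ASSUMED */
    unsigned len    : 2; /* data length; width is ASSUMED */
} bstq_entry_t;

/*
 * Per the ioc description: while ioc is `0`, the snoop result phase
 * of the request is extended; it may complete once ioc is `1`.
 */
static bool bstq_snoop_phase_extended(const bstq_entry_t *e)
{
    return e->ioc == 0;
}
```

In this model, a target block that cannot commit the operation at access initiation leaves ioc at `0`, and the snoop result phase is held open until the block later sets ioc to `1`.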

3.1.2.2 Data Buffer Allocation/Transfer Logic

The Data Buffer Allocation logic is used to control the allocation of the data buffers to the requests initiated on the CPU bus. The data buffers are allocated for both read and write operations. Data buffers are allocated for read operations with the anticipation that the read could be converted to an implicit writeback.

The Data Transfer logic is responsible for controlling the signals to initiate the transfer of data on the CPU interface. This logic is responsible for the generation of DBSY#, TRDY# (both by the AMC control logic ASIC), and DRDY# (by the AMC data path logic ASIC).

3.1.2.3 Control Signal/Response Generation

The Response Generation logic generates the response for transactions owned by the AMC on the CPU bus. This will be ALL transactions on the CPU bus except for Branch Trace messages if the Enable Branch Trace Message Response Shadow bit is equal to `0` and the Shadow Bit Enable is equal to `1`. This mode is only used when In-Circuit Emulator test equipment is used on the processor bus for debug purposes.

The Control Signal Generation logic is responsible for the generation of the various signals that are used to control the rate of requests on the CPU bus and to a large extent, the depth of the bus pipeline.

3.1.2.4 Requesting Agent Logic

The Requesting Agent Logic block consists of those elements within the Bus Engine that are required for the AMC to be a bus master on the CPU bus. The AMC will become a CPU bus requesting agent for Deferred Replies, and for MIC invalidates (BRPS, BELS, or BRILS) that are initiated by its local MIC Generation block or by the MIC generation logic of a Remote AMC.

3.1.3 CPU Queue Management Unit (PQMU)

The CPU Queue Management Unit contains the inbound and outbound request queues and data buffers to/from the CPU bus. Inbound and outbound here are defined with respect to the processor bus. This means the Outbound Request Queue contains requests outbound from the processor bus INTO the AMC; the Inbound Request Queue contains requests inbound to the processor bus, coming out from the AMC.

3.1.3.1 Outbound Request Queue (PORQ) 405

The CPU Outbound Request Queues hold pending CPU bus requests targeted at the AMC. Every transaction on the CPU bus is initially loaded into the PORQ; however, the CPU Decoder will allow subsequent requests to overwrite previous requests that were not decoded to be to the AMC. The PORQ is actually implemented as two parallel queues: one targeted for local decode requests (Local PORQ), and one targeted for remote decode requests (Remote PORQ). Both queues are loaded in parallel, but only one (or none) of the two is "pushed" (marked as valid). The PORQ queues contain two logical records for each pending request: the request record, which holds the request address and command, and the request status record, which holds status (error, snoop) information. The address/command field of the Local PORQ also includes an Effective address field that is loaded by the CPU Decoder.
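The load-in-parallel, push-at-most-one behavior described above can be sketched in C as follows. This is a minimal model under stated assumptions: the queue depth, the record layout, and all names (`porq_t`, `porq_load`, `porq_commit`, and so on) are hypothetical, and the status record and Effective address field are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

#define PORQ_DEPTH 8 /* ASSUMED depth; not given in the text */

typedef struct { uint64_t addr; uint8_t cmd; } porq_req_t;

typedef struct {
    porq_req_t slot[PORQ_DEPTH];
    unsigned tail;  /* next slot to load; advances only on a push */
    unsigned count; /* number of pushed (valid) entries */
} porq_t;

/* Hypothetical decoder outcome for the request just loaded. */
typedef enum { DEC_NONE, DEC_LOCAL, DEC_REMOTE } porq_decode_t;

/* Every bus request is written to the tail slot of BOTH queues. */
static void porq_load(porq_t *local, porq_t *remote, porq_req_t r)
{
    local->slot[local->tail] = r;
    remote->slot[remote->tail] = r;
}

/* Mark the tail slot valid by advancing the tail; false if full. */
static bool porq_push_one(porq_t *q)
{
    if (q->count == PORQ_DEPTH)
        return false;
    q->tail = (q->tail + 1) % PORQ_DEPTH;
    q->count++;
    return true;
}

/*
 * Push at most one queue according to the decode result. On DEC_NONE
 * neither tail moves, so the next load overwrites the unpushed request.
 */
static bool porq_commit(porq_t *local, porq_t *remote, porq_decode_t d)
{
    if (d == DEC_LOCAL)
        return porq_push_one(local);
    if (d == DEC_REMOTE)
        return porq_push_one(remote);
    return true;
}
```

Leaving the tail pointer in place for an undecoded request is one simple way to obtain the overwrite behavior; the actual hardware mechanism is not specified in this passage.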

The address/command record is not changed once it is loaded into the PORQ; however, the status record is updated dynamically after the PORQ element is initially loaded. This is done by allowing the CPU Bus Engine or LST logic to "address" PORQ elements via the 3-bit InOrderQ