|
Description  |
|
|
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to the field of computers, and more
particularly, to a system and method of managing read resources in a
multiprocessing environment.
2. Related Art
Modern multiprocessing systems typically have a plurality of processors, a
main memory and other input/output (I/O) devices that are coupled to a
main system bus. To permit maximum utilization of the bus these systems
use split transactions. (For simplification, the processors, main memory
and I/O devices will be referred to as "nodes".) A split transaction is a
read transaction that is split into a read request transaction that
contains the address of the data requested, and a reply transaction that
contains the requested data. Reply transactions are also called read
responses. Each transaction must be tagged so that the requesting node and
the replying node (e.g., main memory or a processor having the requested
data in its cache) can keep track of the status of the read request. When
individual transactions being performed by a node are long in duration, a
split transaction approach allows several transactions to simultaneously
occupy the bus, thus increasing the effective bandwidth of the bus.
In order to support split transactions on a shared bus, both the requesting
node and the replying node must arbitrate for the bus to perform their
respective functions. Thus a split transaction bus has higher bandwidth,
but usually has higher latency than a bus that is held for the entire read
transaction.
A cache coherent system bus can also use split transactions. However, to
maintain cache coherency all nodes on the bus must be aware of all pending
split transactions. Those nodes that do not have caches must also be aware
of all pending split transactions. What is desired is a mechanism to
handle split transactions and maintain cache coherency across a
multiprocessing domain.
SUMMARY OF THE INVENTION
Once a read transaction is split and left pending, the read request and
read response must be protected from a new transaction that could disturb
the state of the data addressed by the pending transaction. Read resources
according to the present invention are a protection for those pending
transactions. When a read request is issued it is allocated a read
resource from a pool of read resources. The read resource protects the
read request for the duration of its pendency. When the transaction ends,
the read resource assigned to that transaction terminates. Thus, this read
resource or protector stays with the transaction until its conclusion, and
at the conclusion of the transaction the read resource is released back to
the free pool.
The present invention is a system and method of implementing read
resources. Read resources allow transactions to be physically split, while
permitting transactions to remain logically unified. As long as coherent
transactions do not conflict with each other (i.e., are not targeting the
same physical address, block of data or cache line), several can occupy
the bus concurrently. Any conflicting coherent transactions must be issued
serially.
In the preferred embodiment of the present invention, read resources are
implemented as a series of tags which track outstanding read requests and
read responses. The system designer chooses the maximum number of
simultaneously pending split transactions by setting the number of read
resource tags. Each node on the bus must then abide by the contents of the
read resource tags, and only issue coherent transactions which do not
conflict with the currently pending read resources. The tags are
maintained in a fully associative manner to minimize read resource tag
contention. New split transactions can occupy any read resource tag
without restriction.
Read resources also provide a method by which read responses can be issued
from the node with the most recent copy of the data. When a read request
appears on the bus, the memory system will generally respond before
processors which may have more recent data available. Once the memory
responds, a processor may indicate that more recent data may be available
by asserting the pending read resource associated with the pending read.
Once that processor has completed its check for more recent data, it may
provide that data or simply de-asserting the read resource.
According to the present invention, non-coherent reads may only occupy the
first seven read resources, while coherent reads may occupy all eight.
This restriction on non-coherent reads exits to prohibit read requests to
I/O nodes, which are non-coherent, from occupying all read resources and
potentially creating a deadlock situation.
BRIEF DESCRIPTION OF THE FIGURES
The invention will be better understood if reference is made to the
accompanying drawings in which:
FIG. 1 shows a high level representative block diagram of a multiprocessing
system incorporating read resources according to the present invention.
FIG. 2 is a state diagram of the main system bus.
FIG. 3 shows a preferred embodiment of a read resource CAM of the present
invention.
FIG. 4 is a flow chart generally describing the read resources allocation
process of the present invention.
FIG. 5 shows a bus timing diagram in which a read request is successfully
placed on the bus according to the present invention.
FIG. 6 shows a modification of the bus timing diagram of FIG. 5.
FIG. 7 shows a further modification of the bus timing diagram of FIG. 5.
FIG. 8 shows a still further modification of the bus timing diagram of FIG.
5.
In the drawings, like reference numbers indicate identical or functionally
similar elements. Additionally, the left-most digit of the reference
number identifies the drawing in which the reference number first appears.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The following two text books provide further discussions of modern system
buses and transactions between nodes on such buses: John L. Hennessy et
al., "Computer Architecture--A Quantitative Approach", (Morgan Kaufmann
Publishers, Inc., San Mateo, Calif., 1990); and Stephen B. Furber, "VLSI
RISC Architecture and Organization", (Marcel Dekker, Inc., New York, N.Y.,
1989), which are incorporated herein by reference.
Read Resources System Overview
Now turning to a detailed description of the present invention, FIG. 1
shows a high level representative block diagram of a multiprocessing
system 100 incorporating read resources according to the present
invention. A plurality of nodes include processor nodes 102 and 104,
memory node 106 and an input/output (I/O) node 108. Each node is coupled
to a main system bus 109 comprising at least an address bus 110 and a data
bus 112. As would be apparent to a person skilled in the relevant art, the
number and type of nodes coupled to buses 110 and 112 may vary greatly.
Reads on the main system bus 109 are split into read requests and read
responses. This allows other transactions, including other read requests,
to occur between a read request and read response pair, because the memory
or cache of another processor, for example, will require several cycles to
provide the data.
After a read request is issued on the bus, the read is said to be pending.
It will remain pending until the read response which satisfies the request
is issued. While the read is pending, no coherent transactions which
target the same data as the pending read may be issued. This restriction
allows reads to behave as if they were not split transactions for
coherency considerations.
To enforce this restriction, each node on the bus must keep track of
pending coherent reads. Because this requires a fully associative
comparison to occur at each node before it can issue a coherent
transaction, the bus protocol limits the number of pending reads. In the
preferred embodiment of the present invention, no more than eight reads
may be pending at any given time.
There is no coherency restriction for pending, non-coherent reads, but
non-coherent reads must follow the pending read limit as well. This
provides flow control for non-coherent reads and simplifies the
implementation of the read response control logic (to be described in
detail below).
FIG. 2 is a state diagram of the main system bus 109. Each bus transaction
consists of five cycles: arbitration, resolution, address, flow control,
and acknowledge, labeled 202, 204, 206, 208 and 210, respectively. When
the bus is idle, it drops into a two-cycle mode, as shown by arrow 212.
This allows new requests to appear on an idle bus as soon as they are
ready, instead of waiting for the arbitration cycle to arrive.
The main system bus 109 also includes a number of other buses/lines which
are not shown in FIG. 1. The main system bus 109 also includes a 256 bit
wide data bus 112, a 40 bit wide address bus, an 8 bit wide command bus.
The functionality of the ADDRESS.sub.-- HERE line will be described below.
A data resource identification number bus (DRSC) comprises 8 lines that are
valid during cycle III. As with many of the buses in the system, the DRSC
bus is considered time multiplexed, since it functions differently during
different bus cycles. Thus, the 8 lines of the DRSC bus are also used as
inhibit lines during cycles I and V.
The 40 bit wide address bus and 8 bit wide command bus are time multiplexed
and subgrouped into 3 16-bit wide buses. During the acknowledge cycle 210,
the upper 16 lines are used by the nodes as address not acknowledged
(NACK) lines (i.e., one line per node, assuming 16 maximum nodes in the
preferred embodiment), which are also called ADDRESS.sub.-- ERR lines. The
middle 16 lines are used by the nodes as data NACK lines, which are called
DATA.sub.-- ERR lines. The lower 16 lines are used by nodes as lines to
specify whether a read response is shared or not. An ADDRESS.sub.-- HERE
line is used indicated whether a read request is valid.
Multi bit parity/parity error lines, and address bus and data bus
arbitration lines are also included. The bus parity scheme is omitted from
this description because it is beyond the scope of the present invention.
An appropriate parity scheme would be apparent to a person skilled in the
relevant art.
Address lines in the address bus 110 are used for arbitration and
acknowledgement as well as for transmitting addresses. During the address
cycle 206, a node granted use of the address bus (e.g., for issuing a read
request) drives the address and command buses. Simultaneously, a node
granted use of the data bus 112 (e.g., for a read response) drives the
read resource number of its read response on the DRSC bus. For read
requests, writes, updates, and invalidates, the data resource number is
not driven.
The DRSC bus also functions as an inhibit bus. The inhibit bus is used to
indicate that a bus node which has a cache has not yet completed its cache
snoop. Inhibits are driven on cycles I and V. Note that cycle I appears on
an idle bus as well as a busy bus, as the inhibit bus must be visible at
all times. Each line of the inhibit bus corresponds to a read resource.
Read Request
The read request transaction does not use the data bus 112. A read request
may be issued any time a read resource is available, assuming the read
address does not conflict with any other pending coherent read addresses.
To issue a read request, a node on the bus asserts its address arbitration
line only. If arbitration is won, the node issues an address and command
during the address cycle. The read is now pending, and every node in the
system must place the read address in the first available read resource
register (to be described below). This placement is deterministic, as the
read resource information must be identical across all nodes.
The implementation of address and data bus arbitration are beyond the scope
of the present invention. Appropriate arbitration protocols would be
apparent to a person skilled in the relevant art.
A read request can be coherent or non-coherent. A coherent read cannot be
issued if it targets the same cache block as another pending coherent
read; non-coherent reads need not follow this restriction. When a
non-coherent read fills a read resource register, it will block coherent
requests targeting the same cache block. Note, however, that this scenario
should preferably not occur; a particular cache block should be considered
either coherent or non-coherent system wide.
Once successfully issued, a read request will remain pending until a read
response indicates it is for a particular pending read request by
asserting the data resource number associated with that pending read
request.
Read Response
The read response transaction does not use the address bus 110. When a node
is ready to provide a read response for a pending read request, it
arbitrates for the data bus 112 only. If arbitration is granted, the node
drives the read resource number it is responding to on the DRSC bus during
the address cycle. The read resource number for a read request is
determined by each node after the read request is issued. Each node will
determine the same read resource number for a specific read request, such
that all nodes can track the progress of that specific read request using
the same read resource number. The specific manner in which a read
resource number is assigned is discussed below. Read response data begins
two cycles later.
When a read request appears, the memory corresponding to the address of the
read request will drive the ADDRESS.sub.-- HERE line. If no memory node
drives the ADDRESS.sub.-- HERE line, all nodes will understand that read
request is an error (i.e., its a read to an address that does not exist).
If a node with a cache has not had enough time to complete a snoop and
determine the state of its local copy, it will drive the inhibit line on
the DRSC bus. This indicates that the requestor must not use the
transferred data until the node completes its snoop and either supplies a
new read response or stops driving the inhibit line. The node driving
inhibit must continue to do so during each acknowledge cycle until it
completes its snoop. If it finds that the data is clean in its cache, it
changes its cache state to shared or invalid, as appropriate, and stops
driving the inhibit line. This allows the requester to use the data
transferred by memory and frees the associated read response. If the cache
line is dirty, the node transfers the dirty data to the requester and
changes its state to shared or invalid, as appropriate. Memory will
respond to any read request as long as a node with a cache has not yet
provided data satisfying the read. For responses to both coherent and
non-coherent read requests, any node may indicate an error during the read
response by driving the data error line.
Memory nodes write coherent block read response data provided by other
devices back into main memory. The memory may NACK a read response
provided by another node if it has insufficient buffer space to store the
data when the read response occurs. The responding node must resend the
read response.
Read Resource CAMs
Read resources are implemented as Content Addressable Memories (CAMs). Read
resource CAMs 114 are distributed throughout the system, with a full CAM
and an associated CAM controller logic (not shown) residing in each node.
Each node must have a consistent copy of the read resources in its CAM to
maintain cache coherency. Pending transactions are entered in the read
resource CAMs 114 throughout the system so that split transactions can be
tracked and protected. Read resources protect split transactions by
preventing the broadcasting (e.g., transmission on the bus) of new
transactions which target the same address as pending transactions. Read
resources are also used to identify read responses, as well as indicating
the termination of a transaction.
Each read resource CAM 114 includes a number of storage registers,
elements, or the like, (302) equal to the number of read resources
desired, and each register is a predetermined number of bits wide. A
preferred embodiment of a CAM of the present invention is shown in FIG. 3.
Each CAM 114 is preferably 8 registers deep. The entries 302 are assigned
read resource numbers 0-7 and are implicit according to their entry
position in the CAM 114. For example, the top entry of CAM 114 can be read
resource 0 and the bottom entry read resource 7. Increasing the number of
registers (read resources) can facilitate a slower memory system, allow
for interventions from processors or allow for longer reads to I/O
devices.
A single register comprises a number of bits of physical address as well as
one valid bit. The total width of the read resource CAM is determined by
the size of the physical address. To avoid address aliasing, the read
resource CAM width is equal to the total width of the physical address
minus the width of data transferred in a single bus transaction (this
latter number is typically the cache line size.) In addition to the
address, each CAM register also contains a single bit which indicated that
the entry is currently valid (in use) and protected.
According to the preferred embodiment, each register 302 is 36 bits wide.
35 bits consists of the full address down to the double word level, and
one valid bit. When the valid bit is turned "on" the resource is
allocated, that address is activated and is currently being protected by a
read resource. When a transaction is terminated, the corresponding valid
bit of the read resource is turned "off". Turning off the valid bit
returns the resource to the free pool.
When any node attempts to initiate a new transaction, the transaction
address is first compared with each entry in the read resource CAM. This
is shown diagrammatically in FIG. 3. If a "match" is found by one of the
comparators 304, the transaction is not allowed to proceed until the
matching entry becomes invalid. By blocking new transactions when a match
with a pending transaction occurs, the read resource CAM 114 protects the
pending transaction from conflicting bus traffic.
When a node successfully places a read request on the bus (i.e., no match
is found in the local CAM), a read resource is allocated and an entry is
made in a register of the read resource CAM of that node. When that node
places the read request on the bus it is said to "broadcast" the
transaction. Thus, every other node will observe this request. Each of the
other nodes will deterministically assign the same read resource number to
that read request, and it will store the address of the read request at
the same register in its CAM and mark that entry valid. The read request
is now considered pending, and occupies exactly one, and the same,
numbered register in each read resource CAM. In this way it is guaranteed
that the distributed system will all be in step (i.e., each CAM will have
the same read resource information).
A major advantage in the above described distribution is that the read
resource information implicitly flows to all nodes without any additional
communication overhead. In a similar fashion, when read responses are
transmitted on the bus, the appropriate read resource number is also
transmitted (on the DRSC bus as discussed above). This tagging of read
responses with read resource numbers is an efficient way to provide
response data without rebroadcasting the read address, and thus conserving
bus bandwidth.
Controlling the Read Resource CAM
It is critical that each node's read resource CAM be identical at all
times. Since the read resources are distributed, each node allocates and
de-allocated read resources according to a precise set of conditions.
These conditions depend on the current state of the read resource CAM, as
well as bus activity. Allocation and de-allocation of read resources is
described in detail, including pseudo code, below.
Read Resource Allocation
Read resources are allocated when a new read request appears on the bus. In
the implementation described below, a read request must appear on the bus
without encountering errors if it is to be allocated a read resource.
There are several errors which can prevent a read resource from being
allocated to a successful read request. An error is indicated by the
ADDRESS.sub.-- ERR line being asserted, or by the ADDRESS.sub.-- HERE line
not being asserted (failure to assert the address the ADDRESS.sub.-- HERE
line indicates that the read targeted an illegal address.) Finally, the
read must not be Not Acknowledged (NACK'ed) for any reason. A NACK'ed read
is not allocated a read resource, and the requesting node must retry
later.
If the read successfully appears on the bus without errors or nacks, it
will be allocated a read resource by each CAM controller. When a read
resource is allocated, its number is assigned through a priority encoder.
Preferably, the priority encoder is implemented using hardware logic
gates. A preferred embodiment of the priority encoder is shown in Table 1,
in the form of pseudo code. Conversion form the pseudo code to hardware
logic gates would be apparent to a parson skilled in the relevant art.
The priority encoder allocates the lowest numbered available resource. A
section of the Read Resource Allocation Module is executed by each node to
priority encode to find the lowest numbered bit read resource that is
clear (i.e., not in use). (See the priority encode section beginning at
code line 19.) Once a resource is allocated, it will remain valid until
the de-allocation logic terminates the transaction. The remainder of Table
1 shows the pseudo code for handling errors during allocation.
FIG. 4 is a flow chart generally describing the read resources allocation
process. A check of the local CAM (the CAM of the node attempting a read
request) is first conducted to determine whether a valid read resource for
the transaction address is pending, as shown at step 402. If a match is
determined as shown at the "yes" line, the node's CAM controller loops
back to check again in the future, as shown at step 404. If no match is
determined as shown at the "no" line, the node's CAM controller places a
read request on the bus, as shown at step 406. If an error or nack is
detected during a step 408 by the node that transmitted the read request,
the node's CAM controller loops back to retransmit the read request again
in the future, as shown at the "yes" line. Otherwise, all the CAMs assign
a read resource, as shown at a step 410.
According to the above description, a node puts a request on the bus, if no
errors or the like are detected, that request is allocated a read
resource. All nodes track the read resource during its pendency; they
cannot act on that transaction or do anything which interacts with that
transaction for the duration of that period for which it is pending.
When, for example, memory responds to a read request, it will place the
data on the bus and place the read resource number on the bus so that all
nodes see that this read response is coming up in response to the read
resource number which relates to the original transaction that created it.
When that response successfully completes, the transaction is terminated
and the read resource returns to the free pool.
If a processor puts a read on the bus and memory was going to respond but
it turns out that another processor or node has the data in its cache,
dirty or otherwise modified such that it is the most recent copy, then the
node with the most recent copy is going to place the read response on the
bus, assert the resource number, and the transaction will terminate. This
is called a three-party transaction, because three nodes are involved.
There is the original node requesting, a second node providing the data,
and memory which plays a passive role by merely snooping the read response
and updating its contents.
Read Resource De-allocation
A transaction is complete when it has been satisfied by a successful read
response, or when it has remained pending too long, which causes a timeout
error that frees the read resource. Table 2 below sh | | |