Claims
What is claimed:
1. A packet-switched bus for transferring data between a plurality of
system buses and a cache controller of a cache for storing data, said
cache controller being coupled to a processor, comprising:
data lines coupled to said system buses and said cache controller for
transmitting data and command information between said system buses and
said cache controller;
control lines coupled to said system buses and said cache controller for
transmitting a plurality of clock signals, said clock signals being
coupled to said system buses and said cache controller;
sending and receiving means for transferring data and command information
between one of said system buses and said data and control lines, said
sending and receiving means being coupled to said system buses and said
data and control lines, said sending and receiving means including:
at least one input FIFO buffer coupled to each of said system buses for
buffering data and command information to be transferred to said data and
control lines;
at least one output FIFO buffer coupled to each of said system buses for
transferring data and command information from said data and control lines
to its system bus; and
storage means for storing and monitoring a disjointed set of cache tags for
said cache for each of said system buses;
priority arbitration means coupled to said sending and receiving means and
said cache controller for receiving and granting requests for control of
said data and control lines, each sending and receiving means for each
system bus transmitting a bus request for control of said data and control
lines to said priority arbitration means a predetermined delay after
receiving said data and command information from its system bus such that
the bus requests are received by said priority arbitration means in the
same order as each input FIFO buffer receives its data and command
information from its system bus, said priority arbitration means receiving
and granting each request through point-to-point request and grant lines
coupling to each sending and receiving means of each system bus, each of
said grant lines being continuously asserted while the data and command
information is transferred onto said data and control lines; and
overflow control means coupled to said priority arbitration means, to said
cache controller and to said sending and receiving means for preventing
overflow, said overflow control means controlling the flow to said
priority arbitration means by causing said system buses not to transfer
data and command information to each of said sending and receiving means
such that no bus request is sent from each sending and receiving means,
said overflow control means controlling the flow into said cache
controller by causing said priority arbitration means not to grant bus
requests to said sending and receiving means, said overflow control means
controlling the flow to said sending and receiving means from said data
and control lines by causing said cache controller not to transfer to said
sending and receiving means,
whereby data is transferred between said cache controller and said system
buses.
2. The bus as defined by claim 1, further comprising means for
automatically granting control of said data and control lines to said
cache controller when said data and control lines is idle but before a
cache miss, such that as soon as control of said data and control lines is
requested by said cache controller during said cache miss, data and
commands will be transferred from said controller to said data and control
lines; said priority arbitration means then granting control of said data
and control lines to said sending and receiving means after the data and
commands have been transferred to said system buses but before a response
from said system buses arrives at said sending and receiving means such
that said response can be transferred from said sending and receiving
means to said cache controller through said data and control lines as soon
as control of said data and control lines is requested by said sending and
receiving means when said response arrives from one of said system buses.
3. The bus as defined by claim 1, wherein said plurality of system buses
are coupled to a plurality of other processors and cache controllers
through their corresponding sending and receiving means, and said data and
control lines further comprises interrupt means for generating and
acknowledging processor interrupts, said interrupt means transferring a
processor interrupt and acknowledgment between a first cache controller
and a second cache controller through their respective sending and
receiving means coupled through said system buses, said second cache
controller generating an acknowledgment to reply said interrupt and
interrupting its processor if it is the processor intended by said
interrupt.
4. The bus as defined by claim 3, wherein said interrupt means comprises:
means for identifying a target to be interrupted by transmitting a command
specifying said target through its sending and receiving means to its
system bus;
means for identifying a source for the target to transmit its reply through
its system bus after said target determines that it is the target to be
interrupted.
5. The bus as defined by claim 1, wherein said priority arbitration means
is disposed in said cache controller for receiving bus requests and
granting bus grants directly from said cache controller such that bus
requests and grants for said cache controller are not transferred through
said data and control lines.
6. In a computer system including a plurality of system buses coupled to a
plurality of cache controllers, each of said cache controllers coupled to
a cache RAM and to a processor, a packet-switched bus for transferring
data between said system buses and each of said cache controllers,
comprising:
data lines coupled to each of said system buses and said cache controllers
for transmitting data and command information between said system buses
and each of said cache controllers;
control lines coupled to said system buses and each of said cache
controllers for transmitting a plurality of clock signals, said clock
signals being coupled to said system buses and each of said cache
controllers;
sending and receiving means for transferring data and command information
between one of said system buses and said data and control lines, said
sending and receiving means being coupled to one of said system buses and
said data and control lines, said sending and receiving means including:
at least one input FIFO buffer coupled to each of said system buses for
buffering data and command information to be transferred to said data and
control lines;
at least one output FIFO buffer coupled to each of said system buses for
transferring data and command information from said data and control
lines; and
storage means for storing and monitoring a disjointed set of cache tags for
said cache for each of said system buses;
priority arbitration means coupled to said sending and receiving means and
each of said cache controllers for receiving and granting requests for
control of said data and control lines, said sending and receiving means
for each system bus transmitting a bus request for control of said data
and control lines to said priority arbitration means a predetermined delay
after receiving said data and command information from its system bus such
that the bus requests are received by said priority arbitration means in
the same order as each sending and receiving means of each system bus
receives its data and command information from its system bus, said
priority arbitration means receiving and granting each request through
point-to-point request and grant lines coupling to each sending and
receiving means of each system bus, each of said grant lines being
continuously asserted while the data and command information are
transferred on said data and control lines; and
overflow control means coupled to said priority arbitration means, to said
cache controller and to said sending and receiving means for preventing
overflow, said overflow control means controlling the flow to said
priority arbitration means by causing each of said system buses not to
transfer data and command information to its sending and receiving means,
said overflow control means controlling the flow into each of said cache
controllers by causing said priority arbitration means not to grant bus
requests to said sending and receiving means, said overflow control means
controlling the flow to said sending and receiving means from said data
and control lines by causing each of said cache controllers not to
transfer to said sending and receiving means,
whereby data is transferred between said cache controllers and said system
buses.
7. The bus as defined by claim 6, wherein said priority arbitration means
coupled to each of said data and control lines includes means for
automatically granting control of said data and control lines to its cache
controller when said data and control lines is idle but before a cache
miss, such that as soon as control of said data and control lines is
requested by said cache controller during said cache miss, data and
commands will be transferred from said controller to said data and control
lines; said priority arbitration means then granting control of said data
and control lines to said sending and receiving means after the data and
commands have been transferred to said system buses but before a response
from said system buses arrives at said sending and receiving means such
that said response can be transferred from said sending and receiving
means to said cache controller through said data and control lines as soon
as control of said data and control lines is requested by said sending and
receiving means when said response arrives from said system buses.
8. The bus as defined in claim 6, further including interrupt means for
generating and acknowledging processor interrupts, said interrupt means
comprises:
means for identifying a target to be interrupted by transmitting a command
specifying said target through its sending and receiving means to its
system bus;
means for identifying a source for the target to return its reply through
its system bus after said target determines that it is the target to be
interrupted.
9. The bus as defined by claim 6, wherein said priority arbitration means
is disposed in each of said cache controllers.
10. A method for transferring data on a packet-switched bus coupled between
a plurality of system buses and a cache controller of a processor,
comprising the steps of:
providing data and command information from either said system buses or
said cache controller to be transferred over said packet switched bus;
arbitrating for control of said packet switched bus using priority
arbitration means based on a predetermined priority hierarchy, including
the steps of:
transmitting a request signal for control of said packet switched bus on
data lines coupled to said system buses and said cache controller; and
receiving a grant signal for control of said packet switched bus on said
data lines coupled to said system buses and said cache controller;
transferring said data and command information on said data lines coupled
to said system buses and said cache controller to either of said system
buses by sending and receiving means coupled to said system buses and said
cache controller;
controlling data flow over said packet switched bus through overflow
control means to prevent said arbitration means and said sending and
receiving means from exceeding the data transfer capability of said packet
switched bus with data and command information, including:
deasserting said grant signal for control of said packet switched bus by
said sending and receiving means when said sending and receiving means
contains requests for control of said packet switched bus beyond a first
predetermined level;
halting flow of data from said system buses into said sending and receiving
means and flow of request signals from said sending and receiving means to
said cache controller when said arbitration means contains request signals
for control of said packet switched bus beyond a second predetermined
level;
halting flow of reply signals from said cache controller to said sending
and receiving means when said arbitration means contains request signals
for control of said packet switched bus beyond a third predetermined
level; and
halting flow of data from said cache controller to said sending and
receiving means when said sending and receiving means contains data to be
sent on said system buses beyond a fourth predetermined level;
whereby data is transferred between said system buses and said cache
controller.
11. A method according to claim 10, further comprising the steps of
minimizing arbitration latency for transferring data on a first bus
coupled to a plurality of system buses and a cache controller, said
latency occurring when control of said first bus is requested for a cache
miss while said first bus is idle, said steps comprising:
granting control of said first bus to said cache controller when said first
bus is idle;
generating a request for control of said first bus by said cache controller
upon a cache miss by a processor coupled to said cache controller;
transferring a request for data and command to one of said system buses
through the receiving and sending means coupled to each of said system
buses, said receiving and sending means being coupled to said first bus
and to said cache controller;
granting control of said first bus to said receiving and sending means of
said system bus;
whereby upon arrival of a reply for said request at said receiving and
sending means, said reply is transferred to said cache controller without
arbitration delay.
12. In a computer system including a plurality of system buses coupled to a
plurality of cache controllers, each of said cache controllers coupled to
a cache RAM and to a processor, a method for minimizing arbitration
latency for transferring data on a first bus coupled to said plurality of
system buses and said cache controller, said latency occurring when
control of said first bus is requested for a cache miss while said first
bus is idle, comprising the steps of:
granting control of said first bus to said cache controller when said first
bus is idle;
generating a request for control of said first bus upon a cache miss by a
processor coupled to said cache controller;
transferring a request for data and command to one of said system buses
through its receiving and sending means coupled to each of said system
buses, said receiving and sending means being coupled to said first bus;
granting control of said first bus to said receiving and sending means of
said system bus;
whereby upon arrival of a reply for said request at said receiving and
sending means, said reply is transferred to said cache controller without
arbitration delay.
13. The method as defined by claim 10, further comprising a method of
interrupting a target processor by a source processor coupled through said
system buses, wherein said method comprises:
transferring an interrupt command from the cache controller of said source
processor to said packet switched bus, said interrupt command specifying
said target processor to be interrupted and said source processor;
transferring said interrupt command to said system buses from said packet
switched bus through its sending and receiving means;
receiving said interrupt command by the sending and receiving means of all
other processors coupled to said system buses;
replying said interrupt command by transmitting an acknowledgement to the
sending and receiving means of said source processor through said system
buses by said all other processors, each of said other processors
determining whether it is the intended target processor by reading said
interrupt command;
interrupting said target processor as specified by said interrupt command.
14. In a computer system including a plurality of system buses coupled to a
plurality of cache controllers, each of said cache controllers coupled to
a cache and to a processor, a method for transferring data on a
packet-switched bus coupled between said system buses and each of said
cache controllers, comprising the steps of:
providing data and command information to be transferred between said
system buses and said cache controller through said packet switched bus;
arbitrating for control of said packet switched bus using priority
arbitration means based on a predetermined priority hierarchy, including
the steps of:
transmitting a request signal for control of said packet switched bus on
data lines coupled to said system buses and said cache controller; and
receiving a grant signal for control of said packet switched bus on said
data lines coupled to said system buses and said cache controller;
transferring said data and command information on said data lines coupled
to said system buses and said cache controller;
controlling data flow over said packet switched bus through overflow
control means to prevent said arbitration means and said sending and
receiving means from exceeding the data transfer capability of said packet
switched bus with data and command information, including:
deasserting said grant signal for control of said packet switched bus by
said sending and receiving means when said sending and receiving means
contains requests for control of said packet switched bus beyond a first
predetermined level;
halting flow of data from said system buses into said sending and receiving
means and flow of request signals from said sending and receiving means to
said cache controller when said arbitration means contains request signals
for control of said packet switched bus beyond a second predetermined
level;
halting flow of reply signals from said cache controller to said sending
and receiving means when said arbitration means contains request signals
for control of said packet switched bus beyond a third predetermined
level; and
halting flow of data from said cache controller to said sending and
receiving means when said sending and receiving means contains data to be
sent on said system buses beyond a fourth predetermined level;
whereby data is transferred between said system buses and said cache
controller.
15. The method as defined by claim 14, further comprising the steps of
interrupting a target processor by a source processor coupled through said
system buses in said computer system, said steps comprising:
transferring an interrupt command from the cache controller of said source
processor to its packet switched bus, said interrupt command specifying
said target processor to be interrupted and said source processor;
transferring said interrupt command to said system buses from said packet
switched bus through its sending and receiving means;
receiving said interrupt command by the sending and receiving means of all
other processors coupled to said system buses;
replying said interrupt command by transmitting an acknowledgement to the
sending and receiving means of said source processor through said system
buses by said all other processors, each of said other processors
determining whether it is the target processor by reading said interrupt
command;
interrupting said target processor as specified by said interrupt command.
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to apparatus and methods for transferring
data between a source and a plurality of data processing devices. More
particularly, the present invention relates to an improved bus apparatus
and method for transferring data between multiple system buses and a data
processing device, such as a cache controller or an I/O bus interface.
2. Art Background
In the computing industry it is quite common to transfer data and commands
between a plurality of data processing devices, such as computers,
printers, memories, and the like, on a system or data bus. A data
processing system typically includes a processor which executes
instructions that are stored in addresses in a memory. The data that is
processed is transferred into and out of the system by way of input/output
(I/O) devices, onto a bus which interconnects the data processing system
with other digital hardware. Common constraints on the speed of data
transfer between data processing devices coupled to a bus are protocol or
"handshake" restrictions which require a pre-determined sequence of events
to occur within specified time periods prior to actual exchange of data
between the devices. It is therefore desirable to have a low latency and
high bandwidth bus which operates quickly to minimize the computing time
required for a particular task. The protocol utilized by the bus should be
designed to be as efficient as possible and minimize the time required for
data transfer.
Another limitation on a computer bus is the size of the bus itself.
Essentially, a bus is a collection of wires connecting the various
components of a computer system. In addition to address lines and data
lines, the bus will typically contain clock signal lines, power lines, and
other control signal lines. As a general rule, the speed of the bus can be
increased simply by adding more lines to the bus. This allows the bus to
carry more data at a given time. However, as the number of lines
increases, so does the cost of the bus. It is therefore desirable to have
a bus which operates as quickly as possible while also maintaining a bus
of economical size. One such bus is disclosed in three U.S. patent
applications, filed Nov. 30, 1990, by Sindhu et al., assigned to the
co-Assignee of the present application, Xerox Corporation, entitled:
CONSISTENT PACKET-SWITCHED MEMORY BUS FOR SHARED MEMORY MULTI-PROCESSORS,
CONSISTENCY PROTOCOLS FOR SHARED MEMORY MULTI-PROCESSORS, and ARBITRATION
OF PACKET-SWITCHED BUSES INCLUDING BUSES FOR SHARED MEMORY
MULTI-PROCESSORS.
As will be described, the present invention provides a high speed,
synchronous, packet-switched bus apparatus and method for transferring
data between multiple system buses and a cache controller of a processor.
In comparison with the prior art circuit-switched buses allowing only one
outstanding operation, the present packet-switched bus allows multiple
outstanding operations. The present invention also has an arbitration
implementation that allows lower latency than other prior art
packet-switched buses. As will be appreciated from the following
description, the present invention permits higher performance processors
and I/O devices to be utilized in a system without requiring the use of
extremely high pincount packages or extremely dense VLSI technologies. In
the cache controller embodiment, the present invention permits a larger
dual-port cache to be built by spreading the tags over multiple chips. A
larger cache results in higher hit rate and therefore better processor
performance. This larger cache also has available to it a higher system
bus bandwidth since it is connected to multiple system buses. Higher
bandwidth also translates directly to improved processor performance. In
the I/O bus interface embodiment, the present invention permits multiple
high bandwidth I/O devices to be connected to multiple system buses in
such a way that each I/O device has uniform access to all system buses.
This provides each I/O device with a large available I/O bandwidth and
therefore allows it to provide a high throughput of I/O operations.
SUMMARY OF THE INVENTION
A high speed, synchronous, packet-switched inter-chip bus apparatus and
method is disclosed. In the present invention, the bus connects a cache
controller client chip within the external cache of a processor to a
plurality of bus watcher client chips, each of which is coupled to a
separate system bus. The bus comprises a plurality of lines including
multiplexed data/address path lines, parity lines, and various other
command and control lines for flow control and arbitration purposes.
Additionally, the bus has a plurality of point-to-point arbitration wires
for each device. A variety of logical entities, referred to as "devices",
can send and receive packets on the bus, each device having a unique
device identification. A "chip" coupled to the bus can have multiple
devices coupled to it and can use any device identification allocated to
it.
The bus operates at three levels: cycles, packets, and transactions. A bus
cycle is one period of the bus clock; it forms the unit of time and
one-way information transfer. A packet is a contiguous sequence of cycles
that constitutes the next higher unit of transfer. The first cycle of a
packet, called a header, carries address and control information, while
subsequent cycles carry data. In the present invention, packets come in
two sizes: two cycles and nine cycles. A transaction is the third level:
it consists of a pair of packets (request, reply) that together perform
some logical function.
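The three levels can be pictured with a minimal Python sketch. The class and field names (Packet, Transaction, header, data) are illustrative assumptions and are not taken from the bus definition; only the two packet sizes, the header-first layout, and the request/reply pairing come from the description above.

```python
from dataclasses import dataclass, field
from typing import List

# Packet lengths in bus cycles, as stated in the text: two or nine.
VALID_PACKET_CYCLES = (2, 9)


@dataclass
class Packet:
    """A contiguous sequence of cycles: one header cycle, then data cycles."""
    header: int                                    # address and control information
    data: List[int] = field(default_factory=list)  # one entry per data cycle

    def __post_init__(self) -> None:
        if self.cycles not in VALID_PACKET_CYCLES:
            raise ValueError(f"packet must be 2 or 9 cycles, got {self.cycles}")

    @property
    def cycles(self) -> int:
        # The header occupies the first cycle of the packet.
        return 1 + len(self.data)


@dataclass
class Transaction:
    """A (request, reply) pair of packets performing one logical function."""
    request: Packet
    reply: Packet

    @property
    def cycles(self) -> int:
        # Bus cycles occupied by both packets; idle time between them is not counted.
        return self.request.cycles + self.reply.cycles
```

Under these assumptions, a hypothetical transaction made of a two-cycle request and a nine-cycle reply occupies eleven bus cycles in total, not counting the time between the two packets.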
The bus allows the cache controller to provide independent processor-side
access to the cache and the bus watchers to handle functions related to
bus snooping. An arbiter is employed to allow the bus to be multiplexed
between the bus watchers and the cache controller. Before a device can
send a packet, it must get bus mastership from the arbiter. Once the
device has control of the bus, it transmits the packet onto the bus one
cycle at a time without interruption. The arbiter is implemented in the
cache controller, and is specialized to provide low latency for cache
misses and to handle flow control for packet-switched system buses. Packet
transmission on the bus is point-to-point in that only the recipient
identified in a packet typically takes action on the packet. The flow
control mechanisms ensure that the queues receiving packets or arbitration
requests over the bus never overflow. A default grantee mechanism is
employed to minimize the arbitration latency due to a request for the
control of the bus when the bus is idle. A mechanism is further employed
to preserve the arrival order of packets on system buses as the packets
arrive on the bus of the present invention.
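The effect of the default grantee mechanism can be illustrated with a toy Python model. This is a sketch under stated assumptions, not the arbiter of the invention: the class name, the method names, the device labels "CC" and "BW0", and the two-cycle cost of a full request/grant exchange are all hypothetical. The model only captures the idea that the idle bus is granted in advance to the device expected to send next, so that device starts its packet without an arbitration delay.

```python
class DefaultGranteeArbiter:
    """Toy model of an arbiter that pre-grants the idle bus to one device."""

    def __init__(self, cache_controller: str = "CC",
                 arbitration_cycles: int = 2) -> None:
        # While the bus is idle, the cache controller holds the grant by default.
        self.default_grantee = cache_controller
        # Assumed cost, in bus cycles, of a full request/grant exchange.
        self.arbitration_cycles = arbitration_cycles

    def wait_for_grant(self, device: str) -> int:
        """Cycles a device waits on an idle bus before sending its header."""
        if device == self.default_grantee:
            return 0  # grant already held: no arbitration latency
        return self.arbitration_cycles

    def packet_sent(self, sender: str, expected_replier: str = None) -> None:
        """After a request packet, pre-grant the bus to the expected replier."""
        if expected_replier is not None:
            self.default_grantee = expected_replier
```

In this model, a cache miss lets "CC" send its request with zero wait; calling packet_sent("CC", expected_replier="BW0") then hands the default grant to the bus watcher, so the reply arriving from the system bus is also forwarded with zero wait.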
Packet headers contain a data command, control signals, a tag command,
source and destination bus identifications, and an address. The data
command indicates the type of data transfer between the bus watchers and
the cache controller, while the tag command is used to keep the bus-side
and the processor-side copies of the cache tags consistent with one
another. The data command (with the exception of the rqst/rply bit) and
address in reply packets are the same as those for the corresponding
request packet. These commands, along with the control signals, provide
sufficient flexibility to accommodate a variety of system buses.
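The relationship between request and reply headers can be sketched in Python as follows. The field names, their types, and the helper make_reply_header are assumptions made for illustration; the actual bit layout of the header is not reproduced here. The sketch encodes only the rule stated above: a reply carries the same data command and address as its request, apart from the rqst/rply bit.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Header:
    data_command: str    # type of transfer, e.g. "GetBlock"
    is_reply: bool       # models the rqst/rply bit of the data command
    control: int         # control signals
    tag_command: int     # keeps bus-side and processor-side tags consistent
    source_id: int       # source identification
    destination_id: int  # destination identification
    address: int


def make_reply_header(request: Header, **other_fields) -> Header:
    """Build a reply header that mirrors the request's command and address."""
    locked = {"data_command", "address", "is_reply"} & other_fields.keys()
    if locked:
        raise ValueError(f"reply must keep request fields: {sorted(locked)}")
    # Only the rqst/rply bit differs; remaining fields are set by the replier.
    return replace(request, is_reply=True, **other_fields)
```

For example, make_reply_header(req, tag_command=3) returns a header with the request's data command and address, the reply bit set, and a new (hypothetical) tag command value.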
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1a is a schematic representation of a processor system employing the
preferred embodiment of the present invention.
FIG. 1b is a schematic representation of an I/O bus interface employing the
present invention.
FIG. 2 diagrammatically illustrates the various sub-bus structures
comprising the bus structure employing the teachings of the present
invention.
FIG. 3 diagrammatically illustrates the structure of queues in the bus
watchers and the cache controller employing the teachings of the present
invention.
FIG. 4 is a timing diagram illustrating the arbitration timing for gaining
access to the bus of the present invention.
FIG. 5 illustrates the bus interface between a bus watcher and the cache
controller for the purpose of computing minimum arbitration latency.
FIG. 6 is a timing diagram illustrating the arbitration sequence when the
cache controller requests a low priority 2 cycle packet and the cache
controller is not the default grantee.
FIG. 7 is a timing diagram illustrating the arbitration sequence when the
cache controller requests a low priority 2 cycle packet and the cache
controller is the default grantee.
FIG. 8 is a timing diagram illustrating the arbitration sequence when the
bus watcher requests a high priority 9 cycle packet and the bus watcher is
not the default grantee.
FIG. 9 diagrammatically illustrates the various components comprising the
header cycle.
FIG. 10 diagrammatically illustrates the various components comprising the
victim cycle for the GetSingle and GetBlock commands.
FIG. 11 diagrammatically illustrates the various components comprising the
second cycle for a DemapRqst command.
FIG. 12 diagrammatically illustrates the various components comprising the
second cycle for an interrupt command.
FIG. 13 diagrammatically illustrates the various components comprising the
tag command within a packet header.
FIG. 14 diagrammatically illustrates the various components comprising the
header cycle and first data cycle for an error reply packet.
FIG. 15 is a schematic representation of the operation of an interrupt
command among multiple processors employing the bus of the presently
claimed invention.
FIG. 16 diagrammatically illustrates the operation of the default grantee
mechanism.
DETAILED DESCRIPTION OF THE INVENTION
An improved high speed, synchronous, packet-switched inter-chip bus
apparatus and method is described having particular application for use
in high bandwidth and low latency connections between the various parts of
a cache. In the following description, for purposes of explanation, specific
memory sizes, bit arrangements, numbers, data transfer rates, etc. are set
forth in order to provide a thorough understanding of the present
invention. It will be apparent to one skilled in the art, however, that
the present invention may be practiced without these specific details. In
other instances, well known circuits and components are shown in block
diagram form in order not to obscure the present invention unnecessarily.
The bus of the presently claimed invention is a high speed, synchronous,
packet-switched inter-chip bus apparatus and method for transferring data
between multiple system buses and a data processing device. The data
processing device may be either a cache controller coupled to a processor,
as shown in FIG. 1a, or an I/O bus interface coupled to an I/O bus, as
shown in FIG. 1b. To simplify the description, terminology associated with
FIG. 1a will be used throughout the present application. It should be borne
in mind, however, that the description for the cache controller embodiment
also applies to the I/O bus interface, except where indicated otherwise.
Referring to FIG. 1a, the bus 100 connects a cache controller client chip
110 and the external cache RAM 150 of a processor 120 to a plurality of
"bus watcher" client chips 130 and 131, each of which is coupled to a
separate system bus 140 and 141. The bus 100 comprises a plurality of
lines including multiplexed data/address path lines, parity lines, and
various other command and control lines for flow control and arbitration
purposes. Additionally, the bus 100 has a plurality of point-to-point
arbitration wires for such devices as the bus watchers 130 and 131 and the
cache controller 110.
The bus 100 allows the cache controller 110 to provide independent
processor-side access to the cache 150 and the bus watchers 130 and 131 to
handle functions related to bus snooping. An arbiter 160 is employed to
allow the bus 100 to be multiplexed between the bus watchers 130 and 131
and the cache controller 110. Before a device can send a packet, it must
get bus mastership from the arbiter 160. Once the device has control of
the bus 100, it transmits the packet onto the bus 100 one cycle at a time
without interruption. The arbiter 160 is implemented in the cache
controller 110, and is specialized to provide low latency for cache misses
and to handle flow control for packet-switched system buses 140 and 141.
BUS SIGNALS
Referring to FIG. 2, the signals on the bus 100 are divided into three
functional groups: control signals, arbitration signals and data signals.
The control signal group contains clock signals 210 and error signals 220;
the arbitration group contains request signals 230, grant signals 240 and
grant type 250 for each device; and the data signal group contains data
signals 270 and parity 280 signal lines. In the present preferred
embodiment, all signals except the clock signals 210 and data signals 270 are
encoded low true. Clock signals 210 provide the timing for all bus
signals. The error signal 220 is used by the cache controller 110
(hereinafter "CC") to indicate an unrecoverable error in CC 110 or the
processor 120 to the bus watcher clients 130 and 131 (hereinafter "BW").
In the present embodiment, the error signal 220 is driven active low.
However, there is no corresponding error signal from the BWs to the CC
because unrecoverable errors in BWs 130 and 131 are reported through the
system bus 140 and 141. Currently, the bus 100 supports up to four BWs
and four corresponding system buses, a limit imposed solely by the
cache controller 110. In the arbitration group, bus request signals 230
(XReqN, N being the index of the requesting BW) are used by a BW to
request the bus 100 and to control the flow of packets being sent by the
CC 110. In the present embodiment, a request to use the bus for sending
data consists of two contiguous cycles, while flow control requests are
one or two cycles. The signals are encoded as follows:
__________________________________________________________________________
First    Second
Cycle    Cycle    Meaning
__________________________________________________________________________
00       --       No Request
01       --       Block CC Request Queue (XOL) for 9 cycles
01       01       Block XOL and CC Reply Queue (XOH) for 9 cycles
10       L0       Request bus at Priority BWLow for 2 cycles if L=0;
                  9 cycles if L=1.
10       L1       Request bus at Priority BWLow for 2 cycles if L=0;
                  9 cycles if L=1; and block XOL and XOH for 9 cycles.
11       L0       Request bus at Priority BWHigh for 2 cycles if L=0;
                  9 cycles if L=1.
11       L1       Request bus at Priority BWHigh for 2 cycles if L=0;
                  9 cycles if L=1; and block XOL and XOH for 9 cycles.
__________________________________________________________________________
Currently, these signals are driven active low.
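By way of illustration only, the request encoding tabulated above may be decoded as in the following Python sketch. The names `Request` and `decode_xreq` are hypothetical and do not appear in the embodiment. The sketch assumes that each cycle carries a two-bit logical value (that is, after the active-low inversion), that L is the high-order bit of the second cycle, and that a flow control request whose second cycle is anything other than 01 blocks XOL only.

```python
from typing import NamedTuple, Optional


class Request(NamedTuple):
    priority: Optional[str]  # "BWLow", "BWHigh", or None if the bus is not requested
    cycles: int              # bus cycles requested: 0, 2, or 9
    block_xol: bool          # block the CC Request Queue (XOL) for 9 cycles
    block_xoh: bool          # block the CC Reply Queue (XOH) for 9 cycles


def decode_xreq(first: int, second: int = 0) -> Request:
    """Decode the two XReqN cycles (logical values; assumed bit layout)."""
    if not (0 <= first <= 3 and 0 <= second <= 3):
        raise ValueError("each cycle carries a 2-bit value")
    if first == 0b00:
        return Request(None, 0, False, False)
    if first == 0b01:
        # Flow control only; a second cycle of 01 extends the block to XOH.
        return Request(None, 0, True, second == 0b01)
    priority = "BWLow" if first == 0b10 else "BWHigh"
    length_bit = (second >> 1) & 1  # L: 0 selects 2 cycles, 1 selects 9 cycles
    block = bool(second & 1)        # low bit set: also block XOL and XOH
    return Request(priority, 9 if length_bit else 2, block, block)
```

For example, `decode_xreq(0b11, 0b10)` describes a nine-cycle request at priority BWHigh with no blocking of the CC queues.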
A grant signal 240 (XGntN) is used by the arbiter 160 to notify a requestor
that it has been granted the bus mastership. This signal is asserted
continuously for the number of cycles granted, and is ordinarily never
asserted unless the specific BW (BW-N) has made a request. The one
exception is the BW Default Grantee mechanism (to be discussed more fully
below): if it is implemented, the grant signal may be asserted without a
request having been made by the BW. In the present embodiment, the duration of the
grant signal 240 is two cycles or nine cycles depending on the length of
the packet requested. This signal is always driven. A grant-type signal
250 (XGTyp) is used to qualify the grant signal 240, and has exactly the
same timing as the grant signal 240. Currently, this signal is driven
active low. Finally, a signal 260 (XCCAF) is used by the CC 110 to notify
the BWs 130 and 131 that the queue the CC uses to hold BWLow arbitration
requests is at its high water mark (see discussion below).
The data group contains data 270 and parity 280 signals. The data signals
270 (XData) on the bus 100 are bi-directional signals that carry the bulk
of the information being transported on the bus 100. During header cycles
they carry address and control information; during other cycles they carry
data. A device drives these signals only after receiving a grant
signal 240 from the arbiter 160. The parity signals 280 (XParity) also
comprise a number of bi-directional signals that carry the parity
information computed over the data signals 270. The parity for a given
value of the data signals appears in the same cycle as the value.
ARBITRATION AND FLOW CONTROL
As will be described, the bus 100 also has an arbiter 160 that allows the
bus 100 to be multiplexed between the BWs 130 and 131 and the CC 110. When
either a BW or the CC has a packet to send, it makes a request to the
arbiter 160 through its dedicated request lines 230 (XReqN), and the
arbiter 160 grants the bus 100 using the corresponding grant line 240
(XGntN) and the bussed grant-type line 250 (XGTyp). In the present
embodiment, the arbiter 160 implements four priority levels for flow
control and deadlock avoidance purposes. Service at a given priority level
is round-robin among contenders, while service between levels is based
strictly on priorities. The arbiter 160 is implemented in the CC 110
because this is simpler and more efficient, as will be appreciated from
the following description of the invention.
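By way of illustration only, the arbitration policy just described (strict priority between levels, round-robin among contenders within a level) may be sketched in Python as follows. The class `PriorityArbiter`, its methods, and the numbering of levels from 0 (lowest) upward are assumptions made for this sketch; the mapping of the named bus priorities onto the four levels is not taken from the embodiment.

```python
from collections import deque
from typing import List, Optional, Tuple


class PriorityArbiter:
    """Strict priority between levels, round-robin within a level (sketch)."""

    def __init__(self, num_levels: int, devices: List[str]) -> None:
        # One rotating device order per level; the head is next in line.
        self._order = [deque(devices) for _ in range(num_levels)]
        self._pending = [set() for _ in range(num_levels)]

    def request(self, device: str, level: int) -> None:
        """Record a pending request from `device` at `level`."""
        self._pending[level].add(device)

    def grant(self) -> Optional[Tuple[str, int]]:
        """Return (device, level) for the next grant, or None if idle."""
        # Higher-numbered levels are assumed to have higher priority.
        for level in range(len(self._pending) - 1, -1, -1):
            if not self._pending[level]:
                continue
            order = self._order[level]
            for _ in range(len(order)):
                device = order[0]
                order.rotate(-1)  # the device just examined goes to the back
                if device in self._pending[level]:
                    self._pending[level].discard(device)
                    return device, level
        return None
```

In this sketch a device that has just been granted the bus at a given level moves to the back of that level's rotation, so a repeated request from the same device cannot starve the other contenders at that level.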
Referring to FIG. 3, the bus 100 imposes the following FIFO queue structure
on the BWs 130 and 131 and CC 110. Each BW has four system bus queues, two
for output, two for input. The output queues, DOL 334 and 335 and DOH 336
and 337, are used to send packets at system bus priorities CacheLow and
CacheHigh, respectively; the input queues, DIL 330 and 331 and DIH 332 and
333, hold packets that will be sent at bus priorities BWLow and BWHigh,
respectively. The queue DIH 332 is used to hold only replies to packets
originally sent by the CC 110. An implementation is also allowed to merge
the queues DIL and DIH in the BWs, and XIL 310 and XIH 311 in the CC 110
if deadlock-free operation is still possible with this arrangement.
Referring to FIG. 3, the CC 110 also has four packet queues, two for input
from the bus 100 and two for output to the bus 100. The input queues, XIL
310 and XIH 311, hold packets from DIL 330 and 331 and DIH 332 and 333,
respectively. The output queue, XOL 312, is used to send out CC requests,
while XOH 313 is used to send out CC replies. Additionally, the CC 110
has two queues, ArbLow 360 and ArbHigh 361, used to hold arbitration
requests from the BWs at the priorities BWLow and BWHigh respectively. If
the delay from the reception of a packet on the system buses 140 and 141
to the arbitration request on bus 100 is fixed, then these queues ensure
that the packets from multiple system buses in each class (DIL or DIH) are
serviced by the bus arbiter 160 in their system bus arrival order.
Referring again to FIG. 3, when packets are transferred from system buses
140 and 141 to CC 110 through bus 100 of the present invention, the
following scheme is used to ensure that packets arriving on bus 100 are in
the same order as they arrive on respective system buses 140 and 141. As
an illustrative example, assume packet A arrives on system bus 140 at
cycle 1 and packet B arrives on system bus 141 at cycle 4. An
order-preserving implementation of the present invention will preserve the
arrival order of packets A and B on system buses 140 and 141 as they are
transferred to bus 100, i.e. packet A arriving on bus 100 before packet B.
Conversely, if packet B arrives on system bus 141 before packet A arrives
on system bus 140, then packet B is transferred onto bus 100 before packet
A. Currently, the arrival order is preserved for packets entering the
queues DIL 330 and 331.
The order-preserving implementation works as follows: when a packet arrives
at the input of a BW, a request for control of bus 100 is sent to bus
arbiter 160 a fixed number of cycles later. Currently, a request is sent
to arbiter 160 two cycles after a packet arrives at the BW. When bus
arbiter 160 receives requests for control of bus 100, it services the
requests on a FIFO (first-in, first-out) basis, thereby preserving the
system bus arrival order of packets.
If packets A and B arrive on their respective system buses 140 and 141
in the same cycle, a fallback implementation is employed so that the
packet from one predetermined system bus is transferred first to bus 100.
In the current embodiment, packets from BW0 are transferred to bus 100
first in the case of simultaneous arrival on
the system buses. However, it will be apparent to those skilled in the art
that other fallback schemes are also available for the case of
simultaneous arrival.
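By way of illustration only, the order-preserving scheme and its fallback may be modelled cycle by cycle in Python as follows. The function `service_order` and the representation of an arrival as a (packet, BW index, arrival cycle) tuple are hypothetical. The two-cycle request delay follows the present embodiment; resolving simultaneous arrivals in favor of the lowest BW index is an assumption that generalizes the fallback described above to more than two BWs.

```python
from collections import deque

REQUEST_DELAY = 2  # cycles from packet arrival at a BW to its bus request


def service_order(arrivals):
    """Return packets in the order the arbiter services them.

    arrivals: list of (packet, bw_index, arrival_cycle) tuples.
    """
    fifo = deque()  # arbitration requests, serviced first-in, first-out
    last_cycle = max((cycle for _, _, cycle in arrivals), default=-1) + REQUEST_DELAY
    for now in range(last_cycle + 1):
        # Requests reaching the arbiter in this cycle: every BW applies the
        # same fixed delay, so request order mirrors arrival order.
        due = [(bw, packet) for packet, bw, cycle in arrivals
               if cycle + REQUEST_DELAY == now]
        # Fallback for simultaneous arrival: lowest BW index goes first.
        for _, packet in sorted(due, key=lambda d: d[0]):
            fifo.append(packet)
    return list(fifo)
```

With packet A arriving on BW 0 at cycle 1 and packet B arriving on BW 1 at cycle 4, `service_order([("A", 0, 1), ("B", 1, 4)])` places A ahead of B, matching the example given earlier.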
The BWs 130 and 131 and CC 110 interact with the arbiter 160 through three
dedicated wires, XReqN and XGntN, and the bussed XGTyp 250. In the
present preferred embodiment, the arbitration wires for the CC 110 are
internal since the arbiter 160 is implemented in the CC 110, while those
for the BWs 130 and 131 appear at the pins of the CC 110. A BW requests
the bus 100 by using its XReqN lines as follows:
______________________________________
First Second
Cycle Cycle Meaning
______________________________________
00 00 No Request
01 00 Block XOL for 9 cycles
01 01 Block XOL and XOH for 9 cycles
10 L0 Request Bus at Priority BWLow
for 2 cycles if L = 0 and 9 cycles if L = 1.
10 L1 Request Bus at Priority BWLow
for 2 cycles if L = 0 and 9 cycles if L = 1 and
block XOL and XOH for 9 cycles.
11 L0 Request Bus at Priority BWHigh
for 2 cycles if L = 0 and 9 cycles if L = 1.
11 L1 Request Bus at Priority BWHigh
for 2 cycles i | | |