Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to a data transfer device and, more particularly, to a peripheral component interconnect (PCI) host bridge device.
2. Description of the Related Art
In 1995, version 2.0 of the peripheral component interconnect (PCI) bus specification was replaced by version 2.1. The new version introduced requirements intended to better control bus latency and performance. For example, it limits the number of cycles within which a PCI target device must transfer data in response to a request from a master device. If the target device cannot respond within the allowable time (e.g., 16 cycles), then the target device must instruct the master device to resend the request at a later time. This 16-cycle constraint is intended to free the PCI bus for other transactions instead of holding the bus in a wait state until the data is fetched. These new requirements, however, engender new technical problems that are yet to be solved.
One solution to these technical problems is to re-design the PCI host bridge (PHB) to accommodate the new requirements. A PHB is a hardware device that interconnects a system bus to a PCI bus to receive/transmit input/output (I/O) data. The PHB accepts I/O commands from the system bus and controls the execution of these commands on the PCI bus. Conversely, the PHB accepts direct memory access (DMA) commands from the PCI bus and controls the execution of the DMA commands on the system bus. The PHB has an internal arbiter for controlling the PCI bus along with interrupt support logic. The PHB contains data buffering for processor-to-I/O commands (i.e., loads, stores) along with data buffering and an internal data cache for DMA accesses to system memory.
SUMMARY OF THE INVENTION
A host bridge having a dataflow controller is provided. In a preferred embodiment, the host bridge contains a read command path which has a mechanism for requesting and receiving data from an upstream device. The host bridge also contains a
write command path that has means for receiving data from a downstream device and for transmitting the received data to an upstream device. A target controller is used to receive the read and write commands from the downstream device and to steer the
read command toward the read command path and the write command toward the write command path. A bus controller is also used to request control of an upstream bus before transmitting the request for data of the read command and transmitting the data of
the write command.
BRIEF DESCRIPTION OF THE DRAWING
FIG. 1 is a block diagram of a computer system using the present invention.
FIG. 2 is a block diagram of a data flow in a PHB of the present invention.
FIG. 3 is a block diagram of a control and address flow in a PHB of the present invention.
FIG. 4 depicts a block diagram of a multi-channel structure of a DMA read data buffer.
FIG. 5 depicts a block diagram of a multi-channel structure of a DMA write data buffer.
FIG. 6 is a flow diagram of a read data catcher.
FIGS. 7A and 7B are a flow diagram of a PCI target dispatching read and write addresses.
FIG. 8 depicts a tree-based LRU assignment algorithm.
FIGS. 9A-9C are a pseudo code implementation of the tree-based LRU assignment algorithm.
FIGS. 10A-10C depict a latched dataflow used in the PHB of the present invention.
FIGS. 11A-11C are a software implementation of the latched dataflow.
FIG. 12 is a timing diagram of an outbound dataflow of a PHB as a master doing a store to a 32-bit PCI target.
FIG. 13 is a timing diagram of an outbound dataflow of a PHB as a master doing a store to a 64-bit PCI target.
FIG. 14 is a timing diagram of an outbound dataflow of a PHB as a target for a DMA read from a 32-bit PCI Master.
FIG. 15 is a timing diagram of an outbound dataflow of a PHB as a target for a DMA read from a 64-bit PCI master.
FIG. 16 is a block diagram of a hot plug connection.
DESCRIPTION OF THE INVENTION
FIG. 1 is a block diagram of a computer system 100 using the present invention. The computer system 100 may contain one to n processor cards 110 (n being a positive integer) connected to a memory controller 120 via a system bus. The memory
controller 120 is further connected to PCI host bridges (PHB) 130, 140, 150 and 160 through a remote system bus. The memory controller is also connected to a system memory 170 through the memory bus.
Connected to PHB 130 via PCI bus 1 are small computer system interfaces (SCSI) 1, 2 and 3 as well as an Ethernet adapter and an industry standard architecture (ISA) bus bridge 170. Three ISA devices may be plugged into the three ISA slots to become part of the computer system 100. An ISA device may be a keyboard, a mouse, etc. Note that bus bridge 170 need not necessarily be an ISA bus bridge; it can be an extended industry standard architecture (EISA) bus bridge or a microchannel bus bridge. Furthermore, the ISA bus bridge 170 may be connected to any one of the PCI buses and not necessarily to PCI bus 1.
PCI devices such as audio adapters, local area network (LAN) adapters, printers, etc. may be connected to the system 100 through the PCI slots of 33 MHz PCI buses 2 and 3. The system 100 may also contain a graphics subsystem (i.e., a graphics adapter) connected to PHB 160 through 50 MHz PCI bus 4.
When a PCI device wants to do a DMA read or write into system memory, it has to first request the use of the local PCI bus to which it is attached from the corresponding PHB. The PHBs arbitrate bus requests among those various devices. The PHBs
also have to request the use of the remote system bus from the memory controller 120.
FIG. 2 is a block diagram of a DMA data flow in any one of the PHBs 130, 140, 150 and 160. When a PCI device is transferring data to the system memory 120, the data goes from tri-state buffer 210 into DMA write buffer 250. The data then continues through multiplexers 255 and 260 until it is put on the remote system bus by tri-state buffer 265. Data from the system memory 120 goes from tri-state buffer 270 to DMA read buffer 245 and through multiplexers 220 and 215 before reaching the local PCI bus via tri-state buffer 205. Data to and from a processor is put into either MMIO (memory mapped I/O) buffer 0 or 1 before being placed on the remote system bus or the local PCI bus.
FIG. 3 is a block diagram of a DMA control and address flow in any one of the PHBs 130, 140, 150 and 160. As can be seen, PCI target 1300 is connected to write and read channel assignments 1305 and 1360. The PCI target 1300 is a logic
circuit that interprets the PCI protocol and decodes addresses and all signals associated with the PCI bus. The decoded address is either transferred to write channel assignment 1305, in the case of DMA write commands, or to the read channel assignment
1360 in the case of DMA read commands.
FIGS. 7A and 7B are a flow diagram of a PCI target 1300 dispatching read and write addresses. The PCI target 1300 receives an address at step 300. At step 302, it determines whether the address matches the address range of the PHB. If no, it
returns to step 300. If yes, the PHB responds as target on the PCI bus. Then, it is determined whether it is a read or write command. If it is a read command, the process goes to step 310. If it is a write command the process continues to step 308.
If it is a read command, the command is sent to the read channels 1355 (step 310). Then it is determined whether any of the read channels responds as having been assigned to process an address that matches this transfer address (step 312). If no, the address is dispatched to the read channel assignment 1360 for assignment and the process returns to step 300 (step 316). If yes, it is determined whether the responding read channel is ready with the data (step 314). If no, the process goes to step 316 and returns to step 300. If yes, the data is placed on the PCI bus (step 315).
If the command was a write command, the command is sent to the write channels 1310 (step 320). Then a determination is made as to whether any of the write channels responded as having been assigned to process an address that matches this
transfer address (step 322). If no, the process continues to step 326 by dispatching the address to the write channel assignment 1305 and returns to step 300. If a channel responded as having been assigned the address, a determination is made as to
whether the responding channel is ready to accept data (step 324). If no, the process continues to step 326 and returns to step 300. If yes, the write channel will begin to receive data from the PCI bus and to place the data in the DMA write buffer
250.
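The dispatch decisions of FIGS. 7A and 7B can be summarized in a short sketch. This is only an illustration of the flow described above, not the logic circuit itself: the `Channel` class, the `dispatch` function, the returned action names and the way a channel is matched to an address (by 4K page number) are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SHIFT = 12  # 4K pages; matching by page number is an assumption


@dataclass
class Channel:
    page: Optional[int] = None  # 4K page this channel is assigned to, if any
    ready: bool = False         # read: data is ready; write: can accept data


def dispatch(address, is_read, phb_range, read_channels, write_channels):
    """Return the action taken by the PCI target for one received address."""
    low, high = phb_range
    if not (low <= address < high):
        return "ignore"  # step 302: address is outside the PHB range
    channels = read_channels if is_read else write_channels
    page = address >> PAGE_SHIFT
    owner = next((c for c in channels if c.page == page), None)
    if owner is None or not owner.ready:
        # steps 316 / 326: hand the address to the channel assignment
        return "dispatch_to_assignment"
    # step 315 for reads; for writes the channel starts loading buffer 250
    return "place_data_on_pci" if is_read else "load_write_buffer"
```

For example, a read whose page is owned by a ready read channel yields `"place_data_on_pci"`, while the same read against a channel that is not yet ready falls back to `"dispatch_to_assignment"`.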
When the write channel assignment 1305 receives the address, it queries the eight write channels 1310 to determine whether the translation control element (TCE) for the new transaction is cached in any one of them. (A TCE translates a 4K PCI I/O page address into a system memory page address.) If none of the channels claims ownership of this 4K page, then the address is assigned to one of the channels according to the present state of each channel and a least recently used (LRU) algorithm explained below.
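The lookup-then-assign behavior can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a plain LRU ordering rather than the tree-based LRU of FIG. 8, it ignores the per-channel state that the channel assignment also consults, and the `ChannelAssigner` class and its method names are hypothetical.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # one TCE covers one 4K PCI I/O page


class ChannelAssigner:
    """Plain-LRU stand-in for the channel assignment (an assumption)."""

    def __init__(self, num_channels=8):
        # Maps 4K page number -> channel index, least recently used first.
        self.pages = OrderedDict()
        self.free = list(range(num_channels))

    def assign(self, address):
        page = address // PAGE_SIZE
        if page in self.pages:
            # A channel already owns this page: reuse it and mark it recent.
            self.pages.move_to_end(page)
            return self.pages[page]
        if self.free:
            channel = self.free.pop(0)
        else:
            # No idle channel: take over the least recently used one.
            _, channel = self.pages.popitem(last=False)
        self.pages[page] = channel
        return channel
```

With two channels, addresses `0x0000` and `0x1000` take channels 0 and 1, address `0x0800` hits channel 0 again because it lies in the same 4K page, and a subsequent address `0x2000` replaces the page held by channel 1.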
The DMA write channels are assigned to fetch and cache the TCEs from the TCE table in system memory. The logic of the DMA write channels is referenced by the write channel assignment 1305 to determine the state of each channel. Each channel
controls one TCE element, thus eight different DMA write streams can be managed at the same time. The channel that is assigned the queried address will pass the address to the DMA write buffer load 1315. Using this address, the DMA write buffer load
1315 will control the loading of the DMA write buffer 250 of FIG. 2. The DMA write buffer load 1315 will prematurely terminate loading of the write buffer 250 if the buffer becomes full before the end of the transaction. The PCI device will have to
re-initiate the write request command. When the re-initiated write command is honored, it will start loading data at the point where the previous command had stopped.
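The early termination and later resumption of a write transfer can be illustrated with a minimal sketch. The function name, the byte-array model of the buffer and the use of an offset to remember where the transfer stopped are assumptions for the example; the 256-byte capacity is the size of the DMA write buffer 250.

```python
def load_write_buffer(data, offset, buffer, capacity=256):
    """Load bytes from data[offset:] until the buffer is full.

    Returns the offset at which a re-initiated write must resume.
    """
    room = capacity - len(buffer)
    chunk = data[offset:offset + room]
    buffer.extend(chunk)
    return offset + len(chunk)
```

If the returned offset is smaller than `len(data)`, the transfer was cut short because the buffer filled up, and the device calls again with that offset once the buffer has been unloaded.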
The DMA write buffer unload 1320 controls the unloading of the buffer 250. The DMA write buffer unload 1320 can be unloading a previously loaded PCI transfer data stream from the DMA write buffer 250 while the DMA write buffer load 1315 is
loading another DMA write transfer data stream into the buffer 250. Indeed, both the DMA write buffer load 1315 and the DMA write buffer unload 1320 are able to reference the same write channel (i.e., TCE) simultaneously. The DMA write buffer 250 can
then be regarded as a write-through first-in first-out (FIFO) buffer that preserves the order of all DMA write data transfers.
The DMA write buffer unload 1320 is connected to both the DMA write bus control 1335 and the DMA read channel arbiter 1330. The DMA read channel arbiter 1330 arbitrates between the eight read channels 1355 and the DMA write buffer load 1315 when it is reading TCE data for the DMA write channels 1310 from the system memory 120. Control for placing the data from the write buffer 250 on the system bus is passed to the DMA write bus control 1335, which makes sure that the bus is granted to the PHB before placing the data on the bus.
Similar to the DMA write channel assignment 1305, when the DMA read channel assignment 1360 receives the address from the PCI target 1300, it queries the eight read channels 1355 to determine which one is currently caching data at that address
from the system memory 120. If no channel responds as having the address for the 4K page, the DMA read channel assignment 1360 assigns the address to one of the eight read channels. This channel then begins to fetch and cache data starting at that
address. The decision regarding which channel to assign the address to is based on the state of each read channel and an LRU algorithm as will be further explained.
Depending on the number of DMA read request streams active at any given time, each one of the eight DMA read channels 1355 may be assigned to fetch and cache data. The data fetched by each of the read channels will come from a specific 4K memory page of the system memory 120. Thus, two channels will never both cache valid data for the same 4K I/O page, as each channel manages only one TCE. In addition, each channel keeps track of a re-assignment state used by the DMA read channel assignment 1360 to determine which channel to assign the address of the next requested data. Using the snoop detection 1350, each channel snoops DMA write transfers and the remote system bus traffic to invalidate its data when the data is modified by another device. Hence, as the channels work independently from each other, the PHB can manage up to eight different PCI DMA read data streams at the same time.
Once a read channel is given permission to access the system bus by the DMA read channel arbiter 1330, the address is passed to the DMA read bus control 1340. The DMA read bus control 1340 issues the address to the bus and assigns a bus tag used with the read address to a DMA read catcher from the six DMA read catchers 1345. By having six DMA read catchers 1345, the PHB may have six DMA read transactions in process (i.e., address issued, data return pending) at any given time. This architecture is therefore designed to pipeline several DMA read transactions to counteract the latency that is usually associated with reading data from the system memory.
FIG. 6 is a flow diagram of the read data catcher 1345. The process starts at step 600. At step 602, one of the DMA read data catchers marks itself as being available for assignment. At step 604, it is determined whether the DMA read catcher is being assigned by the DMA read bus control 1340. If not, the process returns to step 602 and the DMA read catcher remains available for assignment. If yes, the DMA read catcher marks itself as being busy (step 606). At step 608, the DMA read catcher stores the DMA read channel it is servicing along with the system bus read tag issued for the read. At step 610, it is determined whether the data is available on the system bus and whether the data tag matches the tag sent with the read request. If no, the process returns to step 608. If yes, the process continues to step 612 where the data is stored into the DMA read buffer 245. At step 614, the DMA read data catcher notifies the DMA read channel that it is servicing that the data has arrived.
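The steps of FIG. 6 map naturally onto a small state machine. The sketch below is illustrative only: the `ReadCatcher` class, its method names and the dictionary standing in for the DMA read buffer 245 are hypothetical, and the return of the catcher to the available state after step 614 is an assumption that the text does not state explicitly.

```python
class ReadCatcher:
    """One of the six DMA read catchers, modeled as a tiny state machine."""

    def __init__(self):
        self.busy = False    # step 602: available for assignment
        self.channel = None
        self.tag = None

    def assign(self, channel, tag):
        # Steps 604-608: mark busy, remember the channel and the bus tag.
        self.busy = True
        self.channel = channel
        self.tag = tag

    def on_bus_data(self, tag, data, read_buffer):
        # Step 610: only react to data whose tag matches our read request.
        if not self.busy or tag != self.tag:
            return None
        read_buffer[self.channel] = data  # step 612: store the data
        notified = self.channel           # step 614: channel to notify
        # Assumption: the catcher then becomes available again.
        self.busy, self.channel, self.tag = False, None, None
        return notified
```

A pool such as `[ReadCatcher() for _ in range(6)]` then allows up to six reads to be outstanding at once, each matched to its returning data by tag.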
The ordering rules for DMA traffic are as follows: (1) a DMA write request must NOT pass a previous DMA write request, (2) a DMA read request must NOT pass a previous DMA write request, (3) a DMA write request may pass a previous DMA read request and (4) a DMA read request may pass a previous DMA read request. Accordingly, (1) and (2) are the only rules that must stay strictly ordered. By using the write buffer 250 as a write-through FIFO buffer, the DMA write buffer load 1315 and the DMA write buffer unload 1320 ensure that rule 1 is adhered to. Rules 2 and 3 are managed by the DMA read/write fairness control 1325. The multiple read channels 1355 allow the PHB to maximize the performance of rule 4 by allowing the read channels 1355 to run independently of each other.
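The four ordering rules reduce to a single test on the earlier request, which the following sketch makes explicit. The function name and the string encoding of request types are assumptions for illustration.

```python
def may_pass(later, earlier):
    """Return True if request `later` may be issued ahead of `earlier`.

    Rules 1 and 2: neither a write nor a read may pass a previous write.
    Rules 3 and 4: both a write and a read may pass a previous read.
    """
    if later not in ("read", "write") or earlier not in ("read", "write"):
        raise ValueError("request type must be 'read' or 'write'")
    return earlier == "read"
```

In other words, only a previous write acts as an ordering barrier; the type of the later request does not matter.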
Arbitration control for the DMA write and read channels has a special control to maintain proper order of execution while maintaining fairness across the write and read channels. This is handled with a multi-tier arbitration scheme. The
arbitration scheme is as follows: (1) DMA reads vs. DMA writes; (2) read requests from DMA read channels vs. DMA write channels and (3) read priority between the eight read channels.
Thus, between DMA write requests and DMA read requests:
(a) a request from DMA write bus controller 1335 (e.g., a DMA write) has higher priority than a request from DMA read bus controller 1340 (e.g., a DMA read);
(b) when the DMA write bus controller 1335 detects a second DMA write request from the DMA write buffer unload 1320, it does not allow the DMA read bus controller 1340 to issue a DMA read request onto the remote system bus;
(c) if the DMA read bus controller 1340 is being retried, the DMA write bus controller operation may take priority and be issued before the DMA read bus controller operation.
As mentioned before, the rules for maintaining proper order and fairness between read and write channels are implemented within the DMA read/write fairness control 1325. A DMA write channel request to read a TCE has first priority. Requests from the eight DMA read channels to read TCEs or data have second priority. Therefore, when there is a long continuous stream of DMA write transfers onto the remote system bus, DMA read requests may be precluded from using the system bus for a long period of time. To prevent this, the DMA read/write fairness control 1325 only allows eight consecutive write requests to be accepted while a DMA read channel is requesting the use of the system bus. Although in this case eight consecutive write requests are allowed to be serviced, it should be understood that, depending on the design, any other number may be allowed to be serviced consecutively. Once the eight write requests have been honored, the DMA read/write fairness control 1325 informs the DMA write buffer load 1315 to stop accepting any new DMA write transactions so that the read channels can be serviced. When the DMA write buffer 250 has emptied its current DMA write data, the DMA read/write fairness control 1325 then allows the DMA write buffer load 1315 to start accepting new write transactions.
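The write-throttling behavior can be sketched as a counter with a gate. The class and method names are hypothetical, and resetting the count when no read is pending is an assumption; the limit of eight and the re-opening of the gate once the write buffer has drained follow the description above.

```python
class FairnessControl:
    """Limits consecutive DMA writes while a read channel is waiting."""

    def __init__(self, max_consecutive_writes=8):
        self.limit = max_consecutive_writes
        self.count = 0
        self.accepting_writes = True

    def on_write_accepted(self, read_pending):
        if not read_pending:
            self.count = 0  # assumption: no waiting read, nothing to limit
            return
        self.count += 1
        if self.count >= self.limit:
            # Tell the write buffer load to stop taking new writes.
            self.accepting_writes = False

    def on_write_buffer_empty(self):
        # Buffer 250 has drained: new write transactions are allowed again.
        self.count = 0
        self.accepting_writes = True
```

Making the limit a constructor parameter reflects the remark that a design may choose a number other than eight.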
Requests from the eight read channels are accepted in round-robin order. For example, if a read request from DMA read channel 4 (DRC4) had just been accepted by the DMA read bus control 1340, then the DMA read channel arbiter 1330 would assign the following priority to the channels:
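In a conventional round-robin rotation, the channel that was just served drops to the lowest priority and the next channel in sequence becomes the highest. A minimal sketch of that rotation follows, assuming the channels are numbered 0 through 7; the numbering and the function name are illustrative assumptions and are not taken from the figures.

```python
def round_robin_order(last_served, num_channels=8):
    """Return channel numbers from highest to lowest priority."""
    return [(last_served + i) % num_channels
            for i in range(1, num_channels + 1)]
```

Under these assumptions, `round_robin_order(4)` gives `[5, 6, 7, 0, 1, 2, 3, 4]`.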
Returning to FIG. 2, DMA write buffer 250 and DMA read buffer 245 can store 256 bytes and 2 kilobytes of data, respectively. They are further subdivided into a multi-channel structure where each channel can independently process a DMA transfer. FIG. 4 is a block diagram of the multi-channel structure of DMA read data buffer 245. The DMA read buffer 245 can be programmed to run in one of two modes. The first mode is an eight independent channel buffer mode in which each channel buffer is capable of storing 256 bytes of data. Each channel buffer is partitioned into four 64-byte buffer sectors with one TCE register associated with each channel buffer. The second mode is a four independent channel buffer mode in which each channel buffer is capable of storing 512 bytes of data. Each channel buffer is partitioned into four 128-byte buffer sectors with one TCE register associated with each channel. Shown in FIG. 4 is the DMA read buffer 245 being run in the four independent channel buffer mode. This mode is intended to work in a system environment that may perform better with deeper data buffering across fewer channels.
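The two modes divide the same 2-kilobyte read buffer differently, which a short calculation confirms. The function and constant names below are hypothetical; the sizes are those given above.

```python
READ_BUFFER_BYTES = 2 * 1024  # DMA read buffer 245


def read_buffer_layout(num_channels, sectors_per_channel=4):
    """Return (bytes per channel buffer, bytes per buffer sector)."""
    channel_bytes = READ_BUFFER_BYTES // num_channels
    sector_bytes = channel_bytes // sectors_per_channel
    return channel_bytes, sector_bytes
```

Eight channels give 256-byte channel buffers with 64-byte sectors, and four channels give 512-byte channel buffers with 128-byte sectors.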
When a PCI device initiates a PCI read transfer to the system memory 120, the PHB to which it is connected allocates one of the channels to service t