Apparatus and method for a synchronous, high speed, packet-switched bus    
United States Patent 5195089
Inventor(s): Sindhu; Pradeep S. (Mountain View, CA); Liencres; Bjorn (Palo Alto, CA); Cruz-Rios; Jorge (Mountain View, CA); Lee; Douglas B. (San Francisco, CA); Chang; Jung-Herng (Saratoga, CA); Frailong; Jean-Marc (Palo Alto, CA)
Abstract: A high speed, synchronous, packet-switched inter-chip bus apparatus and method for transferring data between multiple system buses and a cache controller. In the preferred embodiment, the bus connects a cache controller client within the external cache of a processor to a plurality of bus watcher clients, each of which is coupled to a separate system bus. The bus allows the cache controller to provide independent processor-side access to the cache and allows the bus watchers to handle functions related to bus-snooping. An arbiter is employed to allow the bus to be multiplexed between the bus watchers and cache controller. Flow control mechanisms are also employed to ensure that queues receiving packets or arbitration requests over the bus never overflow. A default grantee mechanism is employed to minimize the arbitration latency due to a request for the bus when the bus is idle.



Owner/Assignee     Sun Microsystems, Inc. (Mountain View, CA)
Publication Date     March 16, 1993
Application Number     07/636,446
Filing Date     December 31, 1990
US Classification     370/235 370/414 370/461 710/114 710/309
Int'l Classification     H04J 003/02
Examiner     Olms; Douglas W.
Assistant Examiner     Hom; Shick
Attorney/Law Firm     Blakely Sokoloff Taylor & Zafman
USPTO Field of Search     370/85.6 370/85.1 370/85.2 370/85.9 370/85.13 370/94.1 370/94.2 370/60 370/61
References

U.S. References

Reference   Inventor   Class     Date
5081622     Nassehi              Jan. 1992
5058110     Beach      370/464   Oct. 1991
4933846     Humphrey   710/107   Jun. 1990
4809269     Gulick     370/462   Feb. 1989
4787033     Bomba      710/116   Nov. 1988
4677616     Franklin   370/423   Jun. 1987
Claims


What is claimed:

1. A packet-switched bus for transferring data between a plurality of system buses and a cache controller of a cache for storing data, said cache controller being coupled to a processor, comprising:

data lines coupled to said system buses and said cache controller for transmitting data and command information between said system buses and said cache controller;

control lines coupled to said system buses and said cache controller for transmitting a plurality of clock signals, said clock signals being coupled to said system buses and said cache controller;

sending and receiving means for transferring data and command information between one of said system buses and said data and control lines, said sending and receiving means being coupled to said system buses and said data and control lines, said sending and receiving means including:

at least one input FIFO buffer coupled to each of said system buses for buffering data and command information to be transferred to said data and control lines;

at least one output FIFO buffer coupled to each of said system buses for transferring data and command information from said data and control lines to its system bus; and

storage means for storing and monitoring a disjointed set of cache tags for said cache for each of said system buses;

priority arbitration means coupled to said sending and receiving means and said cache controller for receiving and granting requests for control of said data and control lines, each sending and receiving means for each system bus transmitting a bus request for control of said data and control lines to said priority arbitration means a predetermined delay after receiving said data and command information from its system bus such that the bus requests are received by said priority arbitration means in the same order as each input FIFO buffer receives its data and command information from its system bus, said priority arbitration means receiving and granting each request through point-to-point request and grant lines coupling to each sending and receiving means of each system bus, each of said grant lines being continuously asserted while the data and command information is transferred onto said data and control lines; and

overflow control means coupled to said priority arbitration means, to said cache controller and to said sending and receiving means for preventing overflow, said overflow control means controlling the flow to said priority arbitration means by causing said system buses not to transfer data and command information to each of said sending and receiving means such that no bus request is sent from each sending and receiving means, said overflow control means controlling the flow into said cache controller by causing said priority arbitration means not to grant bus requests to said sending and receiving means, said overflow control means controlling the flow to said sending and receiving means from said data and control lines by causing said cache controller not to transfer to said sending and receiving means,

whereby data is transferred between said cache controller and said system buses.

2. The bus as defined by claim 1, further comprising means for automatically granting control of said data and control lines to said cache controller when said data and control lines is idle but before a cache miss, such that as soon as control of said data and control lines is requested by said cache controller during said cache miss, data and commands will be transferred from said controller to said data and control lines; said priority arbitration means then granting control of said data and control lines to said sending and receiving means after the data and commands have been transferred to said system buses but before a response from said system buses arrives at said sending and receiving means such that said response can be transferred from said sending and receiving means to said cache controller through said data and control lines as soon as control of said data and control lines is requested by said sending and receiving means when said response arrives from one of said system buses.

3. The bus as defined by claim 1, wherein said plurality of system buses are coupled to a plurality of other processors and cache controllers through their corresponding sending and receiving means, and said data and control lines further comprises interrupt means for generating and acknowledging processor interrupts, said interrupt means transferring a processor interrupt and acknowledgment between a first cache controller and a second cache controller through their respective sending and receiving means coupled through said system buses, said second cache controller generating an acknowledgment to reply said interrupt and interrupting its processor if it is the processor intended by said interrupt.

4. The bus as defined by claim 3, wherein said interrupt means comprises:

means for identifying a target to be interrupted by transmitting a command specifying said target through its sending and receiving means to its system bus;

means for identifying a source for the target to transmit its reply through its system bus after said target determines that it is the target to be interrupted.

5. The bus as defined by claim 1, wherein said priority arbitration means is disposed in said cache controller for receiving bus requests and granting bus grants directly from said cache controller such that bus requests and grants for said cache controller are not transferred through said data and control lines.

6. In a computer system including a plurality of system buses coupled to a plurality of cache controllers, each of said cache controllers coupled to a cache RAM and to a processor, a packet-switched bus for transferring data between said system buses and each of said cache controllers, comprising:

data lines coupled to each of said system buses and said cache controllers for transmitting data and command information between said system buses and each of said cache controllers;

control lines coupled to said system buses and each of said cache controllers for transmitting a plurality of clock signals, said clock signals being coupled to said system buses and each of said cache controllers;

sending and receiving means for transferring data and command information between one of said system buses and said data and control lines, said sending and receiving means being coupled to one of said system buses and said data and control lines, said sending and receiving means including:

at least one input FIFO buffer coupled to each of said system buses for buffering data and command information to be transferred to said data and control lines;

at least one output FIFO buffer coupled to each of said system buses for transferring data and command information from said data and control lines; and

storage means for storing and monitoring a disjointed set of cache tags for said cache for each of said system buses;

priority arbitration means coupled to said sending and receiving means and each of said cache controllers for receiving and granting requests for control of said data and control lines, said sending and receiving means for each system bus transmitting a bus request for control of said data and control lines to said priority arbitration means a predetermined delay after receiving said data and command information from its system bus such that the bus requests are received by said priority arbitration means in the same order as each sending and receiving means of each system bus receives its data and command information from its system bus, said priority arbitration means receiving and granting each request through point-to-point request and grant lines coupling to each sending and receiving means of each system bus, each of said grant lines being continuously asserted while the data and command information are transferred on said data and control lines; and

overflow control means coupled to said priority arbitration means, to said cache controller and to said sending and receiving means for preventing overflow, said overflow control means controlling the flow to said priority arbitration means by causing each of said system buses not to transfer data and command information to its sending and receiving means, said overflow control means controlling the flow into each of said cache controllers by causing said priority arbitration means not to grant bus requests to said sending and receiving means, said overflow control means controlling the flow to said sending and receiving means from said data and control lines by causing each of said cache controllers not to transfer to said sending and receiving means,

whereby data is transferred between said cache controllers and said system buses.

7. The bus as defined by claim 6, wherein said priority arbitration means coupled to each of said data and control lines includes means for automatically granting control of said data and control lines to its cache controller when said data and control lines is idle but before a cache miss, such that as soon as control of said data and control lines is requested by said cache controller during said cache miss, data and commands will be transferred from said controller to said data and control lines; said priority arbitration means then granting control of said data and control lines to said sending and receiving means after the data and commands have been transferred to said system buses but before a response from said system buses arrives at said sending and receiving means such that said response can be transferred from said sending and receiving means to said cache controller through said data and control lines as soon as control of said data and control lines is requested by said sending and receiving means when said response arrives from said system buses.

8. The bus as defined in claim 6, further including interrupt means for generating and acknowledging processor interrupts, said interrupt means comprises:

means for identifying a target to be interrupted by transmitting a command specifying said target through its sending and receiving means to its system bus;

means for identifying a source for the target to return its reply through its system bus after said target determines that it is the target to be interrupted.

9. The bus as defined by claim 6, wherein said priority arbitration means is disposed in each of said cache controllers.

10. A method for transferring data on a packet-switched bus coupled between a plurality of system buses and a cache controller of a processor, comprising the steps of:

providing data and command information from either said system buses or said cache controller to be transferred over said packet switched bus;

arbitrating for control of said packet switched bus using priority arbitration means based on a predetermined priority hierarchy, including the steps of:

transmitting a request signal for control of said packet switched bus on data lines coupled to said system buses and said cache controller; and

receiving a grant signal for control of said packet switched bus on said data lines coupled to said system buses and said cache controller;

transferring said data and command information on said data lines coupled to said system buses and said cache controller to either of said system buses by sending and receiving means coupled to said system buses and said cache controller;

controlling data flow over said packet switched bus through overflow control means to prevent said arbitration means and said sending and receiving means from exceeding the data transfer capability of said packet switched bus with data and command information, including:

deasserting said grant signal for control of said packet switched bus by said sending and receiving means when said sending and receiving means contains requests for control of said packet switched bus beyond a first predetermined level;

halting flow of data from said system buses into said sending and receiving means and flow of request signals from said sending and receiving means to said cache controller when said arbitration means contains request signals for control of said packet switched bus beyond a second predetermined level;

halting flow of reply signals from said cache controller to said sending and receiving means when said arbitration means contains request signals for control of said packet switched bus beyond a third predetermined level; and

halting flow of data from said cache controller to said sending and receiving means when said sending and receiving means contains data to be sent on said system buses beyond a fourth predetermined level;

whereby data is transferred between said system buses and said cache controller.

11. A method according to claim 10, further comprising the steps of minimizing arbitration latency for transferring data on a first bus coupled to a plurality of system buses and a cache controller, said latency occurring when control of said first bus is requested for a cache miss while said first bus is idle, said steps comprising:

granting control of said first bus to said cache controller when said first bus is idle;

generating a request for control of said first bus by said cache controller upon a cache miss by a processor coupled to said cache controller;

transferring a request for data and command to one of said system buses through the receiving and sending means coupled to each of said system buses, said receiving and sending means being coupled to said first bus and to said cache controller;

granting control of said first bus to said receiving and sending means of said system bus;

whereby upon arrival of a reply for said request at said receiving and sending means, said reply is transferred to said cache controller without arbitration delay.

12. In a computer system including a plurality of system buses coupled to a plurality of cache controllers, each of said cache controllers coupled to a cache RAM and to a processor, a method for minimizing arbitration latency for transferring data on a first bus coupled to said plurality of system buses and said cache controller, said latency occurring when control of said first bus is requested for a cache miss while said first bus is idle, comprising the steps of:

granting control of said first bus to said cache controller when said first bus is idle;

generating a request for control of said first bus upon a cache miss by a processor coupled to said cache controller;

transferring a request for data and command to one of said system buses through its receiving and sending means coupled to each of said system buses, said receiving and sending means being coupled to said first bus;

granting control of said first bus to said receiving and sending means of said system bus;

whereby upon arrival of a reply for said request at said receiving and sending means, said reply is transferred to said cache controller without arbitration delay.

13. The method as defined by claim 10, further comprising a method of interrupting a target processor by a source processor coupled through said system buses, wherein said method comprises:

transferring an interrupt command from the cache controller of said source processor to said packet switched bus, said interrupt command specifying said target processor to be interrupted and said source processor;

transferring said interrupt command to said system buses from said packet switched bus through its sending and receiving means;

receiving said interrupt command by the sending and receiving means of all other processors coupled to said system buses;

replying said interrupt command by transmitting an acknowledgement to the sending and receiving means of said source processor through said system buses by said all other processors, each of said other processors determining whether it is the intended target processor by reading said interrupt command;

interrupting said target processor as specified by said interrupt command.

14. In a computer system including a plurality of system buses coupled to a plurality of cache controllers, each of said cache controllers coupled to a cache and to a processor, a method for transferring data on a packet-switched bus coupled between said system buses and each of said cache controllers, comprising the steps of:

providing data and command information to be transferred between said system buses and said cache controller through said packet switched bus;

arbitrating for control of said packet switched bus using priority arbitration means based on a predetermined priority hierarchy, including the steps of:

transmitting a request signal for control of said packet switched bus on data lines coupled to said system buses and said cache controller; and

receiving a grant signal for control of said packet switched bus on said data lines coupled to said system buses and said cache controller;

transferring said data and command information on said data lines coupled to said system buses and said cache controller;

controlling data flow over said packet switched bus through overflow control means to prevent said arbitration means and said sending and receiving means from exceeding the data transfer capability of said packet switched bus with data and command information, including:

deasserting said grant signal for control of said packet switched bus by said sending and receiving means when said sending and receiving means contains requests for control of said packet switched bus beyond a first predetermined level;

halting flow of data from said system buses into said sending and receiving means and flow of request signals from said sending and receiving means to said cache controller when said arbitration means contains request signals for control of said packet switched bus beyond a second predetermined level;

halting flow of reply signals from said cache controller to said sending and receiving means when said arbitration means contains request signals for control of said packet switched bus beyond a third predetermined level; and

halting flow of data from said cache controller to said sending and receiving means when said sending and receiving means contains data to be sent on said system buses beyond a fourth predetermined level;

whereby data is transferred between said system buses and said cache controller.

15. The method as defined by claim 14, further comprising the steps of interrupting a target processor by a source processor coupled through said system buses in said computer system, said steps comprising:

transferring an interrupt command from the cache controller of said source processor to its packet switched bus, said interrupt command specifying said target processor to be interrupted and said source processor;

transferring said interrupt command to said system buses from said packet switched bus through its sending and receiving means;

receiving said interrupt command by the sending and receiving means of all other processors coupled to said system buses;

replying said interrupt command by transmitting an acknowledgement to the sending and receiving means of said source processor through said system buses by said all other processors, each of said other processors determining whether it is the target processor by reading said interrupt command;

interrupting said target processor as specified by said interrupt command.
Description


BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to apparatus and methods for transferring data between a source and a plurality of data processing devices. More particularly, the present invention relates to an improved bus apparatus and method for transferring data between multiple system buses and a data processing device, such as a cache controller or an I/O bus interface.

2. Art Background

In the computing industry it is quite common to transfer data and commands between a plurality of data processing devices, such as computers, printers, memories, and the like, on a system or data bus. A data processing system typically includes a processor which executes instructions that are stored in addresses in a memory. The data that is processed is transferred into and out of the system by way of input/output (I/O) devices, onto a bus which interconnects the data processing system with other digital hardware. Common constraints on the speed of data transfer between data processing devices coupled to a bus are protocol or "handshake" restrictions, which require a predetermined sequence of events to occur within specified time periods prior to the actual exchange of data between the devices. It is therefore desirable to have a low latency, high bandwidth bus which operates quickly to minimize the computing time required for a particular task. The protocol utilized by the bus should be designed to be as efficient as possible and minimize the time required for data transfer.

Another limitation on a computer bus is the size of the bus itself. Essentially, a bus is a collection of wires connecting the various components of a computer system. In addition to address lines and data lines, the bus will typically contain clock signal lines, power lines, and other control signal lines. As a general rule, the speed of the bus can be increased simply by adding more lines to the bus. This allows the bus to carry more data at a given time. However, as the number of lines increases, so does the cost of the bus. It is therefore desirable to have a bus which operates as quickly as possible while also maintaining a bus of economical size. One such bus is disclosed in three U.S. patent applications, filed Nov. 30, 1990, by Sindhu et al, assigned to the co-Assignee of the present application, Xerox Corporation, entitled: CONSISTENT PACKET-SWITCHED MEMORY BUS FOR SHARED MEMORY MULTI-PROCESSORS, CONSISTENCY PROTOCOLS FOR SHARED MEMORY MULTI-PROCESSORS, and ARBITRATION OF PACKET-SWITCHED BUSES INCLUDING BUSES FOR SHARED MEMORY MULTI-PROCESSORS.

As will be described, the present invention provides a high speed, synchronous, packet-switched bus apparatus and method for transferring data between multiple system buses and a cache controller of a processor. In comparison with the prior art circuit-switched buses allowing only one outstanding operation, the present packet-switched bus allows multiple outstanding operations. The present invention also has an arbitration implementation that allows lower latency than other prior art packet-switched buses. As will be appreciated from the following description, the present invention permits higher performance processors and I/O devices to be utilized in a system without requiring the use of extremely high pincount packages or extremely dense VLSI technologies. In the cache controller embodiment, the present invention permits a larger dual-port cache to be built by spreading the tags over multiple chips. A larger cache results in higher hit rate and therefore better processor performance. This larger cache also has available to it a higher system bus bandwidth since it is connected to multiple system buses. Higher bandwidth also translates directly to improved processor performance. In the I/O bus interface embodiment, the present invention permits multiple high bandwidth I/O devices to be connected to multiple system buses in such a way that each I/O device has uniform access to all system buses. This provides each I/O device with a large available I/O bandwidth and therefore allows it to provide a high throughput of I/O operations.

SUMMARY OF THE INVENTION

A high speed, synchronous, packet-switched inter-chip bus apparatus and method is disclosed. In the present invention, the bus connects a cache controller client chip within the external cache of a processor to a plurality of bus watcher client chips, each of which is coupled to a separate system bus. The bus comprises a plurality of lines including multiplexed data/address path lines, parity lines, and various other command and control lines for flow control and arbitration purposes. Additionally, the bus has a plurality of point-to-point arbitration wires for each device. A variety of logical entities, referred to as "devices", can send and receive packets on the bus, each device having a unique device identification. A "chip" coupled to the bus can have multiple devices coupled to it and can use any device identification allocated to it.

The bus operates at three levels: cycles, packets, and transactions. A bus cycle is one period of the bus clock; it forms the unit of time and one-way information transfer. A packet is a contiguous sequence of cycles that constitutes the next higher unit of transfer. The first cycle of a packet, called a header, carries address and control information, while subsequent cycles carry data. In the present invention, packets come in two sizes: two cycles and nine cycles. A transaction is the third level: it consists of a pair of packets (request, reply) that together perform some logical function.
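The three levels can be illustrated with a minimal Python sketch. The class names, the integer representation of a cycle, and the pairing of a two-cycle request with a nine-cycle reply are illustrative assumptions; only the header-first layout, the two permitted packet sizes, and the request/reply pairing come from the text above.

```python
from dataclasses import dataclass
from typing import List

# Packet sizes stated in the text: two cycles and nine cycles.
SHORT_PACKET_CYCLES = 2
LONG_PACKET_CYCLES = 9


@dataclass
class Packet:
    """A contiguous sequence of bus cycles (hypothetical model)."""
    header: int            # first cycle: address and control information
    following: List[int]   # subsequent cycles

    def __post_init__(self) -> None:
        # Reject any length other than the two sizes the text allows.
        if self.cycles not in (SHORT_PACKET_CYCLES, LONG_PACKET_CYCLES):
            raise ValueError(f"packet must be 2 or 9 cycles, got {self.cycles}")

    @property
    def cycles(self) -> int:
        # One header cycle plus every cycle that follows it.
        return 1 + len(self.following)


@dataclass
class Transaction:
    """A (request, reply) pair of packets performing one logical function."""
    request: Packet
    reply: Packet

    @property
    def cycles(self) -> int:
        # Total bus cycles occupied by both packets of the transaction.
        return self.request.cycles + self.reply.cycles


# Assumed example: a short request answered by a long reply.
example = Transaction(
    request=Packet(header=0x10, following=[0]),
    reply=Packet(header=0x10, following=[0] * 8),
)
```

Under these assumptions the example transaction occupies the bus for the sum of its two packet lengths, although the request and reply need not be adjacent in time on a packet-switched bus.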

The bus allows the cache controller to provide independent processor-side access to the cache and the bus watchers to handle functions related to bus snooping. An arbiter is employed to allow the bus to be multiplexed between the bus watchers and the cache controller. Before a device can send a packet, it must get bus mastership from the arbiter. Once the device has control of the bus, it transmits the packet onto the bus one cycle at a time without interruption. The arbiter is implemented in the cache controller, and is specialized to provide low latency for cache misses and to handle flow control for packet-switched system buses. Packet transmission on the bus is point-to-point in that only the recipient identified in a packet typically takes action on the packet. Flow control mechanisms ensure that the queues receiving packets or arbitration requests over the bus never overflow. A default grantee mechanism is employed to minimize the arbitration latency due to a request for control of the bus when the bus is idle. A mechanism is further employed to preserve the arrival order of packets on system buses as the packets arrive on the bus of the present invention.
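The default grantee idea can be sketched in Python as follows. This is a simplified model, not the patented arbiter: the class and method names, the two-cycle request/grant cost, and the rule that the grant returns to the cache controller after a reply are assumptions made for illustration. The hand-off of the default grant to a bus watcher after a miss request follows the behavior recited in claims 2 and 11.

```python
CACHE_CONTROLLER = "cache_controller"
BUS_WATCHER = "bus_watcher"


class DefaultGranteeArbiter:
    """Toy model of arbitration latency on an idle bus (hypothetical)."""

    def __init__(self, request_grant_cycles: int = 2) -> None:
        # Assumed cost, in bus cycles, of an explicit request/grant exchange.
        self.request_grant_cycles = request_grant_cycles
        # While the bus is idle, the cache controller holds the grant by default.
        self.default_grantee = CACHE_CONTROLLER

    def idle_latency(self, requester: str) -> int:
        """Cycles the requester waits for mastership when the bus is idle."""
        if requester == self.default_grantee:
            return 0  # already granted, so no arbitration delay
        return self.request_grant_cycles

    def miss_request_sent(self, watcher: str = BUS_WATCHER) -> None:
        """After the miss request goes out, pre-grant the bus to the watcher."""
        self.default_grantee = watcher

    def reply_delivered(self) -> None:
        """Assumption: once the reply is delivered, the grant reverts."""
        self.default_grantee = CACHE_CONTROLLER
```

In this model a cache miss incurs no arbitration delay in either direction: the cache controller already holds the grant when it issues the request, and the bus watcher already holds it when the reply arrives from the system bus.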

Packet headers contain a data command, control signals, a tag command, source and destination bus identifications, and an address. The data command indicates the type of data transfer between the bus watchers and the cache controller, while the tag command is used to keep the bus-side and the processor-side copies of the cache tags consistent with one another. The data command (with the exception of the rqst/rply bit) and address in reply packets are the same as those for the corresponding request packet. These commands, along with the control signals, provide sufficient flexibility to accommodate a variety of system buses.
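The relationship between request and reply headers can be sketched in Python. The field names and types below are hypothetical, and the field widths and encodings are not given in this section. Swapping the source and destination identifications in the reply is also an assumption; the text only states that the data command (apart from the rqst/rply bit) and the address are carried over unchanged.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PacketHeader:
    """Fields named in the text; names, types, and widths are assumed."""
    data_command: int   # type of data transfer
    is_reply: bool      # the rqst/rply bit of the data command
    control: int        # control signals
    tag_command: int    # keeps bus-side and processor-side tags consistent
    source_id: int      # source identification
    dest_id: int        # destination identification
    address: int


def make_reply_header(request: PacketHeader) -> PacketHeader:
    """Build a reply header from a request header.

    The data command and address are copied and the rqst/rply bit is set.
    Swapping source and destination is an illustrative assumption.
    """
    return replace(
        request,
        is_reply=True,
        source_id=request.dest_id,
        dest_id=request.source_id,
    )


# Assumed example values, chosen only to exercise the function.
request_header = PacketHeader(
    data_command=0x3, is_reply=False, control=0, tag_command=0,
    source_id=1, dest_id=2, address=0x1000,
)
reply_header = make_reply_header(request_header)
```

Because the reply repeats the request's data command and address, a receiver in this sketch could match a reply to its outstanding request by comparing those two fields.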

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1a is a schematic representation of a processor system employing the preferred embodiment of the present invention.

FIG. 1b is a schematic representation of an I/O bus interface employing the present invention.

FIG. 2 diagrammatically illustrates the various sub-bus structures comprising the bus structure employing the teachings of the present invention.

FIG. 3 diagrammatically illustrates the structure of queues in the bus watchers and the cache controller employing the teachings of the present invention.

FIG. 4 is a timing diagram illustrating the arbitration timing for gaining access to the bus of the present invention.

FIG. 5 illustrates the bus interface between a bus watcher and the cache controller for the purpose of computing minimum arbitration latency.

FIG. 6 is a timing diagram illustrating the arbitration sequence when the cache controller requests a low priority 2 cycle packet and the cache controller is not the default grantee.

FIG. 7 is a timing diagram illustrating the arbitration sequence when the cache controller requests a low priority 2 cycle packet and the cache controller is the default grantee.

FIG. 8 is a timing diagram illustrating the arbitration sequence when the bus watcher requests a high priority 9 cycle packet and the bus watcher is not the default grantee.

FIG. 9 diagrammatically illustrates the various components comprising the header cycle.

FIG. 10 diagrammatically illustrates the various components comprising the victim cycle for the GetSingle and GetBlock commands.

FIG. 11 diagrammatically illustrates the various components comprising the second cycle for a DemapRqst command.

FIG. 12 diagrammatically illustrates the various components comprising the second cycle for an interrupt command.

FIG. 13 diagrammatically illustrates the various components comprising the tag command within a packet header.

FIG. 14 diagrammatically illustrates the various components comprising the header cycle and first data cycle for an error reply packet.

FIG. 15 is a schematic representation of the operation of an interrupt command among multiple processors employing the bus of the presently claimed invention.

FIG. 16 diagrammatically illustrates the operation of the default grantee mechanism.

DETAILED DESCRIPTION OF THE INVENTION

An improved high speed, synchronous, packet-switched inter-chip bus apparatus and method is described having particular application for use in high bandwidth and low latency connections between the various parts of a cache. In the following description, for purposes of explanation, specific memory sizes, bit arrangements, numbers, data transfer rates, etc. are set forth in order to provide a thorough understanding of the present invention. It will be apparent to one skilled in the art, however, that the present invention may be practiced without these specific details. In other instances, well known circuits and components are shown in block diagram form in order not to obscure the present invention unnecessarily.

The bus of the presently claimed invention is a high speed, synchronous, packet-switched inter-chip bus apparatus and method for transferring data between multiple system buses and a data processing device. The data processing device may be either a cache controller coupled to a processor, as shown in FIG. 1a, or an I/O bus interface coupled to an I/O bus, as shown in FIG. 1b. To simplify the description, terminology associated with FIG. 1a will be used throughout the present application. It should be borne in mind, however, that the description for the cache controller embodiment also applies to the I/O bus interface, except where indicated otherwise.

Referring to FIG. 1a, the bus 100 connects a cache controller client chip 110 and the external cache RAM 150 of a processor 120 to a plurality of "bus watcher" client chips 130 and 131, each of which is coupled to a separate system bus 140 and 141. The bus 100 comprises a plurality of lines including multiplexed data/address path lines, parity lines, and various other command and control lines for flow control and arbitration purposes. Additionally, the bus 100 has a plurality of point-to-point arbitration wires for such devices as the bus watchers 130 and 131 and the cache controller 110.

The bus 100 allows the cache controller 110 to provide independent processor-side access to the cache 150 and the bus watchers 130 and 131 to handle functions related to bus snooping. An arbiter 160 is employed to allow the bus 100 to be multiplexed between the bus watchers 130 and 131 and the cache controller 110. Before a device can send a packet, it must get bus mastership from the arbiter 160. Once the device has control of the bus 100, it transmits the packet onto the bus 100 one cycle at a time without interruption. The arbiter 160 is implemented in the cache controller 110, and is specialized to provide low latency for cache misses and to handle flow control for packet-switched system buses 140 and 141.

BUS SIGNALS

Referring to FIG. 2, the signals on the bus 100 are divided into three functional groups: control signals, arbitration signals and data signals. The control signal group contains clock signals 210 and error signals 220; the arbitration group contains request signals 230, grant signals 240 and grant type 250 for each device; and the data signal group contains data signals 270 and parity signals 280. In the present preferred embodiment, all signals except the clock signals 210 and data signals 270 are encoded low true. Clock signals 210 provide the timing for all bus signals.

The error signal 220 is used by the cache controller 110 (hereinafter "CC") to indicate to the bus watcher clients 130 and 131 (hereinafter "BW") an unrecoverable error in the CC 110 or the processor 120. In the present embodiment, the error signal 220 is driven active low. There is no corresponding error signal from the BWs to the CC, however, because unrecoverable errors in BWs 130 and 131 are reported through the system buses 140 and 141. Currently, the bus 100 supports up to four BWs and four corresponding system buses, solely because of a limitation of the cache controller 110.

In the arbitration group, bus request signals 230 (XReqN, N being the index of the requesting BW) are used by a BW to request the bus 100 and to control the flow of packets being sent by the CC 110. In the present embodiment, a request to use the bus for sending data consists of two contiguous cycles, while flow control requests are one or two cycles. The signals are encoded as follows:

__________________________________________________________________________
First Cycle   Second Cycle   Meaning
__________________________________________________________________________
00            --             No Request
01            --             Block CC Request Queue (XOL) for 9 cycles
01            01             Block XOL and CC Reply Queue (XOH) for 9 cycles
10            L0             Request bus at Priority BWLow for 2 cycles if L=0;
                             9 cycles if L=1
10            L1             Request bus at Priority BWLow for 2 cycles if L=0;
                             9 cycles if L=1; and block XOL and XOH for 9 cycles
11            L0             Request bus at Priority BWHigh for 2 cycles if L=0;
                             9 cycles if L=1
11            L1             Request bus at Priority BWHigh for 2 cycles if L=0;
                             9 cycles if L=1; and block XOL and XOH for 9 cycles
__________________________________________________________________________

Currently, these signals are driven active low.
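The request encoding above can be illustrated with a small decoder. This is a sketch under stated assumptions, not the disclosed circuit: the names `BusRequest` and `decode_xreq` are hypothetical, the inputs are taken to be the logical two-bit values from the table (i.e. after undoing the active-low drive), and second-cycle values the table does not list for a flow control request are rejected rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BusRequest:
    priority: Optional[str]  # "BWLow", "BWHigh", or None if the bus is not requested
    packet_cycles: int       # 2 or 9 when the bus is requested, otherwise 0
    block_xol: bool          # block the CC request queue for 9 cycles
    block_xoh: bool          # block the CC reply queue for 9 cycles


def decode_xreq(first: int, second: int = 0) -> BusRequest:
    """Decode the logical XReqN values of a first and second request cycle."""
    if not (0 <= first <= 3 and 0 <= second <= 3):
        raise ValueError("XReqN values are two bits wide")
    if first == 0b00:
        return BusRequest(None, 0, False, False)
    if first == 0b01:
        # Flow control only: one cycle blocks XOL, a second 01 also blocks XOH.
        if second == 0b01:
            return BusRequest(None, 0, True, True)
        if second == 0b00:
            return BusRequest(None, 0, True, False)
        raise ValueError("second cycle not defined by the table for first=01")
    # first is 10 (BWLow) or 11 (BWHigh); the second cycle carries L and a block bit.
    priority = "BWHigh" if first == 0b11 else "BWLow"
    length_bit = (second >> 1) & 1
    block = bool(second & 1)
    return BusRequest(priority, 9 if length_bit else 2, block, block)
```

For example, `decode_xreq(0b10, 0b10)` describes a nine-cycle request at priority BWLow with no blocking, while `decode_xreq(0b11, 0b01)` describes a two-cycle request at priority BWHigh that also blocks XOL and XOH.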

A grant signal 240 (XGntN) is used by the arbiter 160 to notify a requestor that it has been granted bus mastership. This signal is asserted continuously for the number of cycles granted, and is normally never asserted unless the specific BW (BW-N) has made a request. The exception is the BW Default Grantee mechanism (to be discussed more fully below): if it is implemented, the grant signal may be asserted without a request having been made by the BW. In the present embodiment, the duration of the grant signal 240 is two cycles or nine cycles depending on the length of the packet requested. This signal is always driven. A grant-type signal 250 (XGTyp) is used to qualify the grant signal 240, and has exactly the same timing as the grant signal 240. Currently, this signal is driven active low. Finally, a signal 260 (XCCAF) is used by the CC 110 to notify the BWs 130 and 131 that the queue the CC uses to hold BWLow arbitration requests is at its high water mark (see discussion below).

The data group contains data 270 and parity 280 signals. The data signals 270 (XData) on the bus 100 are bi-directional signals that carry the bulk of the information being transported on the bus 100. During header cycles they carry address and control information; during other cycles they carry data. A device drives these signals only after receiving a grant signal 240 from the arbiter 160. The parity signals 280 (XParity) also comprise a number of bi-directional signals that carry the parity information computed over the data signals 270. The parity for a given value of the data signals appears in the same cycle as the value.

ARBITRATION AND FLOW CONTROL

As will be described, the bus 100 also has an arbiter 160 that allows the bus 100 to be multiplexed between the BWs 130 and 131 and the CC 110. When either a BW or the CC has a packet to send, it makes a request to the arbiter 160 through its dedicated request lines 230 (XReqN), and the arbiter 160 grants the bus 100 using the corresponding grant line 240 (XGntN) and the bussed grant-type line 250 (XGTyp). In the present embodiment, the arbiter 160 implements four priority levels for flow control and deadlock avoidance purposes. Service at a given priority level is round-robin among contenders, while service between levels is based strictly on priorities. The arbiter 160 is implemented in the CC 110 because this is simpler and more efficient, as will be appreciated from the following description of the invention.
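The arbitration policy just described (strict priority between levels, round-robin among contenders within a level) can be sketched as follows. The class name `Arbiter`, the numbering of levels from 0 (lowest) to 3 (highest), and the use of small integers as device indices are assumptions made for illustration; the passage does not name the four levels or describe the arbiter's internal structure.

```python
from typing import List, Optional, Set, Tuple


class Arbiter:
    """Hypothetical model: strict priority across levels, round-robin within one."""

    def __init__(self, num_devices: int, num_levels: int = 4):
        self.num_devices = num_devices
        self.pending: List[Set[int]] = [set() for _ in range(num_levels)]
        # Per-level pointer to the device served most recently at that level.
        self.last_granted: List[int] = [num_devices - 1] * num_levels

    def request(self, device: int, level: int) -> None:
        if not (0 <= device < self.num_devices):
            raise ValueError("unknown device")
        if not (0 <= level < len(self.pending)):
            raise ValueError("unknown priority level")
        self.pending[level].add(device)

    def grant(self) -> Optional[Tuple[int, int]]:
        """Return (device, level) for the next grant, or None if nothing is pending."""
        # Strict priority: always look at the highest non-empty level first.
        for level in range(len(self.pending) - 1, -1, -1):
            waiting = self.pending[level]
            if not waiting:
                continue
            # Round-robin: start just after the device last served at this level.
            start = self.last_granted[level] + 1
            for offset in range(self.num_devices):
                device = (start + offset) % self.num_devices
                if device in waiting:
                    waiting.remove(device)
                    self.last_granted[level] = device
                    return device, level
        return None
```

In this sketch a request at a higher level is always granted before any request at a lower level, and two devices that repeatedly contend at the same level are granted alternately.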

Referring to FIG. 3, the bus 100 imposes the following FIFO queue structure on the BWs 130 and 131 and CC 110. Each BW has four system bus queues, two for output, two for input. The output queues, DOL 334 and 335 and DOH 336 and 337, are used to send packets at system bus priorities CacheLow and CacheHigh, respectively; the input queues, DIL 330 and 331 and DIH 332 and 333, hold packets that will be sent at bus priorities BWLow and BWHigh, respectively. The queue DIH 332 is used to hold only replies to packets originally sent by the CC 110. An implementation is also allowed to merge the queues DIL and DIH in the BWs, and XIL 310 and XIH 311 in the CC 110 if deadlock-free operation is still possible with this arrangement.

Referring to FIG. 3, the CC 110 also has four packet queues, two for input from the bus 100 and two for output to the bus 100. The input queues, XIL 310 and XIH 311, hold packets from DIL 330 and 331 and DIH 332 and 333, respectively. The output queue, XOL 312, is used to send out CC requests, while XOH 313 is used to send out CC replies. Additionally, each CC 110 has two queues, ArbLow 360 and ArbHigh 361, used to hold arbitration requests from the BWs at the priorities BWLow and BWHigh, respectively. If the delay from the reception of a packet on the system bus 140 or 141 to the arbitration request on bus 100 is fixed, then these queues ensure that the packets from multiple system buses in each class (DIL or DIH) are serviced by the bus arbiter 160 in their system bus arrival order.

Referring again to FIG. 3, when packets are transferred from system buses 140 and 141 to CC 110 through bus 100 of the present invention, the following scheme is used to ensure that packets arriving on bus 100 are in the same order as they arrive on respective system buses 140 and 141. As an illustrative example, assume packet A arrives on system bus 140 at cycle 1 and packet B arrives on system bus 141 at cycle 4. An order-preserving implementation of the present invention will preserve the arrival order of packets A and B on system buses 140 and 141 as they are transferred to bus 100, i.e. packet A arriving on bus 100 before packet B. Conversely, if packet B arrives on system bus 141 before packet A arrives on system bus 140, then packet B is transferred onto bus 100 before packet A. Currently, the arrival order is preserved for packets entering the queues DIL 330 and 331.

The order-preserving implementation works as follows: when a packet arrives at the input of a BW, a request for control of bus 100 is sent to bus arbiter 160 a fixed number of cycles later. Currently, a request is sent to arbiter 160 two cycles after a packet arrives at the BW. When bus arbiter 160 receives requests for control of bus 100, it services the requests on a FIFO (First-in, first-out) basis, therefore preserving the system bus arrival order of packets.

Where packets A and B arrive on their respective system buses 140 and 141 in the same cycle, a fallback implementation is employed so that the packet from one pre-determined system bus is transferred first to bus 100. In the current embodiment, packets from BW0 are transferred to bus 100 first in the case of simultaneous arrival on the system buses. However, it will be apparent to those skilled in the art that other fallback schemes are also available for the case of simultaneous arrival.
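The order-preserving scheme of the preceding paragraphs can be sketched as a short simulation. The function name `service_order` and the tuple layout are hypothetical. The fixed two-cycle delay is taken from the text; resolving simultaneous arrivals in ascending BW index order is an assumption that extends the described fallback (one pre-determined system bus goes first) to any number of bus watchers.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple

REQUEST_DELAY = 2  # cycles between a packet's arrival at a BW and its bus request


def service_order(arrivals: Iterable[Tuple[int, int, str]],
                  request_delay: int = REQUEST_DELAY) -> List[str]:
    """Return packet labels in the order the arbiter would service them.

    Each arrival is (system_bus_cycle, bw_index, label).
    """
    # Every BW waits the same fixed delay, so request order mirrors arrival order.
    by_request_cycle: Dict[int, List[Tuple[int, str]]] = {}
    for cycle, bw_index, label in arrivals:
        by_request_cycle.setdefault(cycle + request_delay, []).append((bw_index, label))

    fifo = deque()
    for cycle in sorted(by_request_cycle):
        # Fallback for simultaneous arrival: lowest BW index is enqueued first.
        for _bw_index, label in sorted(by_request_cycle[cycle]):
            fifo.append(label)

    # The arbiter drains its request queue first-in, first-out.
    serviced = []
    while fifo:
        serviced.append(fifo.popleft())
    return serviced
```

With packet A arriving on the first system bus at cycle 1 and packet B on the second at cycle 4, as in the example above, the sketch services A before B; if both arrive in the same cycle, the packet from the lower-indexed BW is serviced first.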

The BWs 130 and 131 and CC 110 interact with the arbiter 160 through three dedicated wires--XReqN, XGntN, and the bussed XGTyp 250. In the present preferred embodiment, the arbitration wires for the CC 110 are internal since the arbiter 160 is implemented in the CC 110, while those for the BWs 130 and 131 appear at the pins of the CC 110. A BW requests the bus 100 by using its XReqN lines as follows:

______________________________________ First Second Cycle Cycle Meaning ______________________________________ 00 00 No Request 01 00 Block XOL for 9 cycles 01 01 Block XOL and XOH for 9 cycles 10 L0 Request Bus at Priority BWLow for 2 cycles if L = 0 and 9 cycles if L = 1. 10 L1 Request Bus at Priority BWLow for 2 cycles if L = 0 and 9 cycles if L = 1 and block XOL and XOH for 9 cycles. 11 L0 Request Bus at Priority BWHigh for 2 cycles if L = 0 and 9 cycles if L = 1. 11 L1 Request Bus at Priority BWHigh for 2 cycles i