WikiPatents - Community Patent Review
Integrated circuit with unified memory system and dual bus architecture    
United States Patent 6247084
Link to this page: http://www.wikipatents.com/6247084.html
Inventor(s): Apostol, Jr.; George (Santa Clara, CA); Baran; Peter R. (Fremont, CA); McInnis; Roderick J. (Milpitas, CA)
Abstract: A unified memory system includes a processor, a memory controller, a plurality of bus transactor circuits and a shared memory port. A processor bus is coupled between the processor and the memory controller. A first multiple-bit, bidirectional system bus is coupled between the shared memory port, the memory controller and the plurality of bus transactor circuits. A second multiple-bit, bidirectional system bus is coupled between the memory controller and the plurality of bus transactor circuits.



Inventor     Apostol, Jr.; George (Santa Clara, CA); Baran; Peter R. (Fremont, CA); McInnis; Roderick J. (Milpitas, CA)
Owner/Assignee     LSI Logic Corporation (Milpitas, CA)
Publication Date     June 12, 2001
Application Number     09/166,262
Filing Date     October 5, 1998
US Classification     710/108 710/100 710/240 711/147 711/149 711/150
Int'l Classification     G06F 013/00
Examiner     Dharia; Rupal D.
Parent Case     CROSS-REFERENCE TO RELATED APPLICATION This application claims the benefit of U.S. Provisional Application Serial No. 60/061,489, filed Oct. 8, 1997, which is hereby incorporated by reference.
USPTO Field of Search     711/149 711/147 711/150 710/240 710/100 710/108

U.S. References
Patent No.     Inventor       Classification     Date
5854638        Tung                              Dec. 1998
5822768        Shakkarwar     711/149            Oct. 1998
5805905        Biswas         710/244            Sep. 1998
5561777        Kao            711/5              Oct. 1996
Claims
 


What is claimed is:

1. A unified memory system comprising:

a processor;

a memory controller;

a plurality of bus transactor circuits;

a shared memory port, including a memory address interface, a memory control interface and a memory data interface, which are coupled to the memory controller;

a processor bus which is coupled between the processor and the memory controller;

a first multiple-bit, bidirectional system data bus which is coupled between the memory data interface of the shared memory port, the memory controller and the plurality of bus transactor circuits and which carries memory data between the memory data interface and the plurality of bus transactor circuits; and

a second multiple-bit, bidirectional system command bus which is coupled between the memory controller and the plurality of bus transactor circuits and which carries non-memory data, including requests for access to the memory data interface over the data bus and memory addresses related to the memory data, between the memory controller and the plurality of bus transactor circuits.

2. The unified memory system of claim 1 wherein the plurality of bus transactor circuits comprises:

a display controller which comprises a first bus interface unit coupled to the data bus and the command bus;

a parallel input-output controller which comprises a second bus interface unit coupled to the data bus and the command bus; and

a serial input-output controller which comprises a third bus interface unit coupled to the data bus and the command bus.

3. The unified memory system of claim 2 wherein the first and second system buses, the processor bus, the shared memory port, the processor, the memory controller, the display controller, the parallel input-output controller and the serial input-output controller are fabricated on a single semiconductor integrated circuit.

4. The unified memory system of claim 1 wherein:

one of the plurality of bus transactor circuits comprises a display controller which has a display queue for queueing an amount of display data received from the shared memory port over the data bus and has a watermark output which is coupled to the memory controller, wherein the watermark output indicates whether the amount of display data queued in the display queue is more than or less than a predetermined amount; and

the memory controller preempts memory data transfers over the data bus by the other of the plurality of bus transactor circuits and the processor when the watermark output indicates the amount of display data queued in the display queue is less than the predetermined amount.

5. The unified memory system of claim 1 wherein:

one of the plurality of bus transactor circuits comprises a display controller which has a display queue for queueing an amount of display data received from the shared memory port over the data bus and has a watermark output which is coupled to the memory controller, wherein the watermark output indicates whether the amount of display data queued in the display queue is more than or less than a predetermined amount; and

the memory controller controls access to the command bus by the processor, the display controller and the other bus transactor circuits according to the following priority:

the display controller has a first, highest priority when the watermark output indicates the amount of display data queued in the display queue is less than the predetermined amount;

the processor has a second priority which is less than the first priority;

the other bus transactor circuits have a third priority which is less than the second priority; and

the display controller has a fourth priority which is less than the third priority when the watermark output indicates the amount of display data queued in the display queue is more than the predetermined amount.
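The four-level priority recited in claim 5 can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not the patented circuit: the names `arbitrate`, `DISPLAY` and `PROCESSOR` are hypothetical, and the alphabetical tie-break among the other bus transactor circuits is an assumption, since the claim does not say how those circuits are ordered among themselves.

```python
from typing import Optional, Set

DISPLAY = "display"      # hypothetical label for the display controller
PROCESSOR = "processor"  # hypothetical label for the processor


def arbitrate(requesters: Set[str], display_below_watermark: bool) -> Optional[str]:
    """Choose the next command bus owner using the priority order of claim 5."""
    if not requesters:
        return None
    # First priority: display controller whose queue is below the watermark.
    if DISPLAY in requesters and display_below_watermark:
        return DISPLAY
    # Second priority: the processor.
    if PROCESSOR in requesters:
        return PROCESSOR
    # Third priority: any other bus transactor circuit.
    # Alphabetical tie-break is an assumption made for this sketch.
    others = sorted(r for r in requesters if r not in (DISPLAY, PROCESSOR))
    if others:
        return others[0]
    # Fourth priority: display controller whose queue is above the watermark.
    return DISPLAY
```

With `{"display", "processor", "serial"}` requesting, the sketch selects the display controller when its queue is below the watermark and the processor otherwise.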

6. The unified memory system of claim 1 wherein each bus transactor circuit comprises:

a dual port random access memory (DPRAM) having first and second ports, wherein the first port is operably coupled to the data bus and the command bus; and

a subsystem which is operably coupled to the second port of the DPRAM.

7. The unified memory system of claim 6 wherein each bus transactor circuit further comprises:

a bus interface circuit which is coupled between the first port and the data bus and between the first port and the command bus; and

a subsystem interface circuit which is coupled between the second port and the subsystem.

8. The unified memory system of claim 7 wherein:

the bus interface circuits of at least two of the plurality of bus transactor circuits are logically and physically identical to one another; and

the subsystem interface circuits of the at least two bus transactor circuits are logically and physically unique to the subsystems of the respective bus transactor circuits.

9. The unified memory system of claim 1 wherein the memory controller comprises means for transferring the memory data between the memory data interface of the shared memory port and the plurality of bus transactor circuits over the data bus and for transferring the non-memory data between the plurality of bus transactor circuits over the command bus.

10. The unified memory system of claim 1 wherein the memory controller comprises means for controlling access by the plurality of bus transactor circuits to the data bus independently of access to the command bus.

11. The unified memory system of claim 1 wherein the memory controller comprises a command queue for storing memory access commands transferred over the command bus by the plurality of bus transactor circuits and wherein the memory controller controls access to the data bus based on the memory access commands stored in the command queue.

12. The unified memory system of claim 1 wherein the memory controller comprises means for enabling a data transaction by one of the plurality of bus transactor circuits over the data bus and for simultaneously enabling a command transaction by another of the plurality of bus transactor circuits over the command bus.

13. The unified memory system of claim 1 wherein:

the memory controller further comprises a plurality of load data bus control outputs and a plurality of data bus grant control outputs; and

each bus transactor circuit comprises a load data bus control input which is coupled to a corresponding one of the load data bus control outputs and a data bus grant control input which is coupled to a corresponding one of the data bus grant control outputs.

14. The unified memory system of claim 1 wherein:

the memory controller further comprises a plurality of load command bus control outputs, a plurality of command bus grant control outputs, and a plurality of command bus request inputs; and

each bus transactor circuit comprises a load command bus control input which is coupled to a corresponding one of the load command bus control outputs, a command bus grant control input which is coupled to a corresponding one of the command bus grant control outputs, and a command bus request output which is coupled to a corresponding one of the command bus request inputs.

15. The unified memory system of claim 1 wherein the memory controller comprises means for receiving memory data from the shared memory port over the data bus and passing the memory data received from the shared memory port to the processor over the processor bus and comprises means for receiving memory data from the processor over the processor bus and passing the memory data received from the processor to the shared memory port over the data bus.

16. A method of passing data between a shared memory port, a memory controller and a plurality of bus transactor circuits, the method comprising:

passing memory data between the shared memory port, the memory controller and the plurality of bus transactor circuits over a multiple-bit, bidirectional data bus;

passing non-memory data including requests for access to the shared memory port over the data bus and memory addresses related to the memory data, between the memory controller and the plurality of bus transactor circuits over a multiple-bit, bidirectional command bus;

controlling access by the plurality of bus transactor circuits to the data bus with the memory controller based on the requests for access to the shared memory port; and

controlling access by the plurality of bus transactor circuits to the command bus with the memory controller independently of access to the data bus.

17. The method of claim 16 wherein controlling access to the data bus comprises:

passing a data bus request command from a first of the bus transactor circuits to the memory controller over the command bus;

passing a data bus grant signal from the memory controller to the first bus transactor circuit in response to the data bus request command; and

performing the step of passing memory data between the shared memory port and the first bus transactor circuit over the data bus in response to the data bus grant signal.

18. The method of claim 17 wherein passing a data bus request command comprises:

passing a command bus request signal from the first bus transactor circuit to the memory controller;

passing a command bus grant signal from the memory controller to the first bus transactor circuit in response to the command bus request signal; and

passing the data bus request command from the first bus transactor circuit to the memory controller over the command bus in response to the command bus grant signal.
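The ordering of signals in claims 17 and 18 can be laid out as a short Python sketch. It only lists the steps in the order the claims recite them; the function name `data_bus_handshake` and the tuple layout (sender, receiver, message) are hypothetical, and no timing is implied.

```python
from typing import List, Tuple


def data_bus_handshake(transactor: str) -> List[Tuple[str, str, str]]:
    """Return the (sender, receiver, message) steps of claims 17 and 18 in order."""
    controller = "memory controller"
    return [
        # Claim 18: first win the command bus.
        (transactor, controller, "command bus request signal"),
        (controller, transactor, "command bus grant signal"),
        # Claim 18: then send the data bus request command over the command bus.
        (transactor, controller, "data bus request command"),
        # Claim 17: the controller answers with a data bus grant.
        (controller, transactor, "data bus grant signal"),
        # Claim 17: memory data then moves over the data bus (either direction).
        ("shared memory port", transactor, "memory data over data bus"),
    ]
```

Calling `data_bus_handshake("serial I/O")` yields five steps, of which the first three involve the command bus and the last two the data bus.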

19. A single integrated circuit comprising:

a processor;

a memory controller;

a plurality of bus transactor circuits;

a shared memory port, including a memory address interface, a memory control interface and a memory data interface, which are coupled to the memory controller;

a processor bus which is coupled between the processor and the memory controller;

a data bus which is coupled to the memory data interface of the shared memory port, the memory controller and the plurality of bus transactor circuits for passing memory data between the memory data interface and the plurality of bus transactor circuits; and

a command bus which is coupled to the memory controller and the plurality of bus transactor circuits for passing non-memory data, including requests for access to the memory data interface over the data bus and memory addresses related to the memory data, between the memory controller and the plurality of bus transactor circuits.
Description
 


BACKGROUND OF THE INVENTION

The present invention relates to integrated circuits and, in particular, to an integrated circuit having a unified memory architecture.

Unified memory architectures have been used for various computer applications, such as network computers, Internet appliances and mission specific terminal applications. In a typical unified memory architecture, all devices requiring access to memory are coupled to a common system bus. These devices can include a processor, an input-output device or a graphics device, for example. A memory controller arbitrates access to memory between the various devices.

Memory latency is a common difficulty in unified memory architectures since each device must arbitrate for access to memory over the system bus. Latency can be reduced by requesting bursts of data from memory. For example, graphics devices may request bursts of display data from a frame buffer. Since graphics devices continually supply data to a screen display, these devices have a high bandwidth requirement and cannot easily accommodate long memory latencies. On the other hand, processors typically request specific data from memory or another device and then wait for the data without giving up access to the system bus. Also, processors require a relatively high priority. This often results in contention for the system bus between the processor and devices having high bandwidth requirements.

A conventional system with multiple bus masters uses an address bus and a data bus to control the memory system. Typically, both of these busses are arbitrated for and granted to one master at a time. Many cycles of bus time are lost due to dead time between masters, and time required for each master to communicate its data request to the memory controller. In addition, the processor uses the same bus for doing "program Input/Output" functions, which are very inefficient in terms of bus utilization.

A typical system that includes a raster scan display output for graphics uses a second memory system for this time-critical function. Not only does this extra memory system increase cost, but it also reduces overall system performance because data must be copied from processor memory space into the display memory space.

SUMMARY OF THE INVENTION

The unified memory system of the present invention provides a high enough bandwidth to enable a graphics and display subsystem to use the same memory as a processor and other bus transactor circuits. The unified memory system includes a processor, a memory controller, a plurality of bus transactor circuits and a shared memory port. A processor bus is coupled between the processor and the memory controller. A first multiple-bit, bidirectional system bus is coupled between the shared memory port, the memory controller and the plurality of bus transactor circuits. A second multiple-bit, bidirectional system bus is coupled between the memory controller and the plurality of bus transactor circuits.

Another aspect of the present invention relates to a method of passing data between a shared memory port, a memory controller and a plurality of bus transactor circuits. The method includes: passing memory data between the shared memory port, the memory controller and the plurality of bus transactor circuits over a multiple-bit, bidirectional data bus; passing non-memory data between the memory controller and the plurality of bus transactor circuits over a multiple-bit, bidirectional command bus; controlling access by the plurality of bus transactor circuits to the data bus with the memory controller; and controlling access by the plurality of bus transactor circuits to the command bus with the memory controller independently of access to the data bus.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an integrated circuit according to one embodiment of the present invention.

FIG. 2 is a block diagram showing the integrated circuit coupled to a variety of external devices.

FIG. 3 is a memory map of the integrated circuit.

FIG. 4 is a more detailed block diagram of the integrated circuit, according to one embodiment of the present invention.

FIG. 5 is a diagram illustrating inputs and outputs of a system bus interface unit in a bus transactor circuit within the integrated circuit.

FIG. 6 is a diagram illustrating an acknowledge message format.

FIG. 7 is a diagram illustrating logical separation of a dual port RAM in the system bus interface unit shown in FIG. 5.

FIG. 8 is a diagram illustrating a command bus message header format.

FIG. 9 is a diagram illustrating a command bus message header format for a screen block transfer.

FIG. 10 is a table illustrating available transaction types of a command field in the header formats of FIGS. 8 and 9.

FIG. 11 is a waveform diagram illustrating data bus timing within the integrated circuit.

FIG. 12 is a waveform diagram illustrating command bus timing within the integrated circuit.

FIG. 13 is a block diagram illustrating an example of a subsystem interface to the DPRAM shown in FIG. 5.

FIG. 14 is a waveform diagram illustrating waveforms in the subsystem interface shown in FIG. 13 during a PIO read.

FIG. 15 is a waveform diagram illustrating waveforms in the subsystem interface shown in FIG. 13 during a PIO write.

FIG. 16 is a waveform diagram illustrating waveforms during outbound data transfers.

FIG. 17 is a block diagram of a processor in the integrated circuit according to one embodiment of the present invention.

FIG. 18 is a simplified block diagram illustrating connection of a memory controller to the system blocks of integrated circuit 10.

FIG. 19 is a diagram illustrating inputs and outputs of the memory controller shown in FIG. 18.

FIG. 20 is a block diagram of an interface between the memory controller and external memory.

FIGS. 21A-21C together form a table of memory controller registers.

FIG. 22 is a table which defines each bit of a reset and status register.

FIG. 23 is a table which defines each bit of a system configuration register.

FIG. 24 is a table which defines each bit of a memory configuration register.

FIG. 25 is a table which defines each bit of a memory initialization and refresh register.

FIG. 26 is a table which defines each bit of a frame configuration register.

FIG. 27 is a table which defines each bit of frame starting tile address and tile configuration registers.

FIG. 28 is a table which lists common frame resolution numbers.

FIG. 29 is a table which defines each bit of a display DMA control register.

FIG. 30 is a table which defines each bit of a display DMA ID register.

FIG. 31 is a table which defines each bit of a display starting offset register.

FIG. 32 is a table which defines each bit of a display screen size register.

FIG. 33 is a table which defines each bit of a dither LUT register.

FIG. 34 is a diagram illustrating how pixel data is cached in a window cache.

FIG. 35 is a table which defines each bit of a window starting address register.

FIG. 36 is a table which defines each bit of a window size register.

FIG. 37 is a table which defines each bit of a load window cache register.

FIG. 38 is a table which defines each bit of a flush window cache register.

FIG. 39 is a table which defines each bit of a window cache status register.

FIG. 40 is a table which defines a packer data register.

FIG. 41 is a table which defines each bit of a packer starting address register.

FIG. 42 is a table which defines each bit of a packer data size register.

FIG. 43 is a table which defines each bit of display current address registers.

FIG. 44 is a table which defines each bit of display remain size registers.

FIG. 45 is a table which defines each bit of a window current address register.

FIG. 46 is a table which defines each bit of window remain registers.

FIG. 47 is a waveform diagram illustrating PIO read response timing.

FIG. 48 is a waveform diagram illustrating cache line fill response timing.

FIG. 49 is a waveform diagram illustrating PIO write timing.

FIG. 50 is a waveform diagram illustrating PIO read timing.

FIG. 51 is a waveform diagram illustrating DMA request timing.

FIG. 52 is a diagram illustrating interface signals to and from a graphics and display subsystem within the integrated circuit.

FIG. 53 is a table indicating a DISP_LD[1:0] signal format.

FIG. 54 is a diagram of a DMA command header for Screen relative addressing direct memory accesses (DMAs).

FIG. 55 is a block diagram of the graphics and display subsystem.

FIG. 56 is a diagram illustrating partitioning of a DPRAM in the graphics and display subsystem.

FIG. 57 is a simplified block diagram of a data path of a bus interface unit of the graphics and display subsystem.

FIG. 58 is a simplified block diagram of a subsystem interface unit of the graphics and display subsystem.

FIG. 59 is a block diagram of a pixel pipe section of the graphics and display subsystem.

FIG. 60 is a block diagram of a graphics BitBLT data flow through the graphics and display subsystem.

FIG. 61 is a block diagram of a serial subsystem in the integrated circuit.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The integrated circuit of the present invention has a unified memory and dual bus architecture which maximizes bandwidth to and from an external memory device while minimizing latency for individual subsystems that compete for access to the memory device.

FIG. 1 is a block diagram of the integrated circuit of the present invention. Integrated circuit 10 includes processor 12, memory controller 14, plurality of bus transactor circuits 15A-15C, shared memory port 20 and dual system buses 22 and 24. Processor 12 is coupled to memory controller 14 over a bidirectional processor bus 26 which includes processor address lines 28, processor control lines 30 and processor data lines 32 which allow processor 12 to communicate with memory controller 14.

Memory controller 14 is coupled to shared memory port 20 and system buses 22 and 24. Shared memory port 20 includes a memory address interface 40, a memory control interface 42 and a memory data interface 44. Memory data interface 44 is coupled to system bus 22. Shared memory port 20 is coupled to an external memory device 46, which can include a synchronous dynamic random access memory (SDRAM), for example.

Bus transactor circuits 15A-15C are coupled to memory controller 14, shared memory port 20 and to one another through multiple-bit, bidirectional system bus 22. Bus transactor circuits 15A-15C are also coupled to one another and to memory controller 14 through multiple-bit, bidirectional system bus 24. System bus 22 is a data bus which carries memory data being transmitted to and from external memory 46 by bus transactor circuits 15A-15C and processor 12 (through memory controller 14). System bus 24 is a command bus which carries command data and programmed input-output (PIO) data being transmitted between bus transactor circuits 15A-15C and processor 12 (through memory controller 14).

Data bus 22 is used exclusively for transferring memory data between memory 46 and one of the bus masters. Command bus 24 is used for transferring "requests" for memory data transfers by bus transactor circuits 15A-15C and for PIO operations. Memory controller 14 includes a command queue for storing the requests so that the next memory access can be started at the earliest possible time without relying on performance or latency of command bus 24. Access to data bus 22 results from memory controller 14 executing one of the commands that is stored in the command queue. If the next command in the queue is for access to memory 46, data bus 22 is automatically granted to the requesting device. A bus transactor circuit requesting read access to memory 46 is always ready to receive the corresponding data, and a bus transactor circuit requesting write access is always ready to send the data.
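The command queue behaviour described above can be sketched as a small Python model. This is an illustrative sketch, not the controller's actual logic: the class name `MemoryControllerModel`, its method names, and the strict first-in first-out ordering are assumptions made here, and a plain dictionary stands in for external memory 46.

```python
from collections import deque


class MemoryControllerModel:
    """Toy model: requests arrive over the command bus, data moves on the data bus."""

    def __init__(self):
        self.command_queue = deque()  # stored requests (FIFO order is an assumption)
        self.memory = {}              # stand-in for external memory 46

    def post_request(self, requester, op, address, data=None):
        """Command bus side: store a memory access request for later execution."""
        self.command_queue.append((requester, op, address, data))

    def execute_next(self):
        """Data bus side: run the next queued command, granting the data bus to
        the device that posted it. Returns None when the queue is empty."""
        if not self.command_queue:
            return None
        requester, op, address, data = self.command_queue.popleft()
        if op == "write":
            self.memory[address] = data
            return (requester, "write", address, data)
        return (requester, "read", address, self.memory.get(address))
```

In this model, posting a request never waits on the data bus, and executing a request never waits on the command bus, which mirrors the decoupling the paragraph above describes.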

Each bus transactor circuit 15A-15C can include a variety of devices requiring access to external memory 46 such as another processor, a serial input-output (I/O) subsystem, a parallel I/O subsystem and a graphics and display subsystem.

With two system buses, including data bus 22 and command bus 24, bus transactor circuits 15A-15C can request access to external memory 46 and pass memory controller 14 the address of the next block of data over command bus 24 while data is being transferred simultaneously to another one of the bus transactor circuits or the processor over data bus 22. Bus transactor circuits 15A-15C do not have to wait until the end of the data transfer to pass the address of the next block of data to be transferred. This reduces memory latency. Also, PIO data transfers are passed over command bus 24, which leaves data bus 22 free for higher bandwidth data transfers and therefore reduces contention on data bus 22.
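The latency benefit of overlapping the next request with the current data transfer can be estimated with a deliberately simple cycle-count model in Python. The function names and the cycle figures are hypothetical, and the model ignores arbitration dead time, refresh and PIO traffic; it only contrasts a bus on which request and data phases alternate with a two-bus arrangement in which they overlap.

```python
def single_bus_cycles(n_transfers: int, cmd_cycles: int, data_cycles: int) -> int:
    """Request and data phases share one bus, so they run back to back."""
    return n_transfers * (cmd_cycles + data_cycles)


def dual_bus_cycles(n_transfers: int, cmd_cycles: int, data_cycles: int) -> int:
    """The request for the next transfer is sent while the current data moves."""
    if n_transfers == 0:
        return 0
    # Only the first request is exposed; afterwards each step costs the slower
    # of the two phases, and the last data phase completes the sequence.
    return cmd_cycles + (n_transfers - 1) * max(cmd_cycles, data_cycles) + data_cycles
```

For example, with four transfers, a 2-cycle request phase and an 8-cycle data burst (all assumed values), the single-bus model takes 40 cycles and the dual-bus model takes 34.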

The dual bus architecture of the present invention allows the system to utilize a much greater amount of the theoretical memory performance. This enables a graphics and display subsystem to use the same memory as the processor and other bus transactor circuits in the unified memory system. A second memory system for display data is not required as in conventional computer systems. This results in a significant cost savings and performance improvement.

In one embodiment of the present invention, integrated circuit 10 is implemented as an Application-Specific Standard Product (ASSP) for use in Network Computer, Internet Appliance and mission specific terminal applications. In this embodiment, integrated circuit 10 integrates many of the common functions associated with attaching to the Internet such that all of the functions needed for an Internet browser box can be implemented with only the addition of memory, such as external memory 46.

For example, FIG. 2 is a block diagram showing integrated circuit 10 coupled to a variety of external devices, including a peripheral component interconnect (PCI) interface 60, an Ethernet local area network (LAN) 62, an Integrated Services Digital Network (ISDN) network 64, a keyboard 66, a mouse 68, a monitor or LCD panel 70, an audio digital-to-analog (D/A) converter 72, an audio analog-to-digital (A/D) converter 74, SDRAM 46, a read-only memory 76, a serial electrically-erasable programmable read-only memory (EEPROM) 78, an ISO7816 compliant SmartCard interface 80, a printer 82 and a scanner 84.

1. Physical Address Map for Integrated Circuit 10

Integrated circuit 10 has a 32-bit physical address, which allows integrated circuit 10 to address four gigabytes of contiguous physical memory. All internal resources, as well as system resources, are mapped within this address space.
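The four-gigabyte figure follows directly from the address width, as this short Python check shows. The helper names are hypothetical, and "gigabyte" is taken here in the binary sense of 2^30 bytes.

```python
def address_space_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses reachable with the given address width."""
    return 1 << address_bits


def highest_address(address_bits: int) -> int:
    """Largest address value that fits in the given address width."""
    return (1 << address_bits) - 1


GIB = 1 << 30  # one gigabyte in the binary sense (2**30 bytes)
```

Here `address_space_bytes(32) // GIB` evaluates to 4, and `highest_address(32)` is 0xFFFFFFFF.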

FIG. 3 is a memory map illustrating the division of system resources. The starting address of each block of memory is indicated at 90, where "0x" represents a hexadecimal number. The system resource associated with each block of memory is indicated at 92. The quantity of memory contained in each block of memory is indicated at 94, where "M" represents megabytes and "G" represents gigabytes.

2. Overall System Architecture

FIG. 4 is a block diagram of integrated circuit 10 according to the above example. The same reference numerals are used in FIG. 4 as were used in FIG. 1 for the same or similar elements. Integrated circuit 10 includes a plurality of external pins, including serial I/O pins 100, PCI and parallel I/O pins 102, display pins 104, and SDRAM pins which include memory data pins 106 and memory address and control pins 108. Pins 106 and 108 form shared memory port 20.

Integrated circuit 10 further includes processor 12, memory controller 14, bus transactor circuits 15A-15C, data bus 22 and command bus 24. In one embodiment, processor 12 includes a CW4011 Microprocessor Core available from LSI Logic Corporation, a Multiply/Shift Unit, an MMU/TLB, 16K instruction cache, 8K data cache, and a Cache Controller/Bus Interface Unit. The CW4011 core is a MIPS® architecture processor that implements the R4000, MIPS® II compliant 32-bit instruction set. Other types of processors can also be used.

Processor 12 is coupled to memory controller 14 and interrupt controller 110 through processor bus interface unit 112. As memory and interrupt functions are closely tied to processor 12, interrupt controller 110 is coupled to processor 12 to take advantage of an arbitration scheme geared towards maintaining processor performance. System interrupts are funneled through interrupt controller 110 to the processor 12. Interrupt controller 110 supports programmable priority assignments which provide flexibility to the system design of integrated circuit 10. Processor 12 can read from or write to any one of the bus transactor circuits 15A-15C directly over command bus 24 via programmed I/O cycles. In most cases, data to and from external memory 46 is transferred over data bus 22 via one of many on-chip direct memory access (DMA) engines located in bus transactor circuits 15A-15C and memory controller 14, as described in more detail below. The DMA capabilities serve to off-load data transfer duties from processor 12 as well as to ensure that data bus 22 is used most effectively by using burst transfers whenever possible.

Memory controller 14 passes memory data between shared memory port 20, processor 12 and bus transactor circuits 15A-15C over data bus 22. Memory controller 14 passes non-memory data between processor 12 and bus transactor circuits 15A-15C over command bus 24. For example, memory controller 14 passes header data (data transfer requests) between memory controller 14 and bus transactor circuits 15A-15C and passes programmed input-output (PIO) data between processor 12 and bus transactor circuits 15A-15C over command bus 24.

Bus transactor circuits 15A-15C include bus interface units (BIUs) 120A-120C, dual port RAMs (DPRAMs) 122A-122C, subsystem interface units (SIUs) 124A-124C and subsystems 126A-126C, respectively. Subsystems 126A-126C are also referred to as "peripheral blocks".

Subsystem 126A is a serial I/O subsystem which implements a fast Ethernet 10 Mbit/100 Mbit per second peripheral device, a four port universal serial bus host controller, an audio-97 AC-link audio peripheral and a set of generic programmed I/O pins. Subsystem 126B is a PCI and parallel I/O subsystem which includes a high performance PCI interface, an IEEE 1284 compliant parallel port, an IDE/ATA-PI disk interface, provisions for flash ROM and PCMCIA adapters, PS2 compatible keyboard and mouse inputs, I²C interfaces and a SmartCard interface.

Subsystem 126C is a graphics and display subsystem which supports direct attachment to a CRT monitor or an LCD panel, such as monitor 70, shown in FIG. 2, through red-green-blue (RGB) and digital outputs formed by display pins 104. External memory 46, shown in FIG. 1, is coupled to SDRAM pins 106 and 108 and is used to hold a video frame buffer for graphics and display subsystem 126C.

Each subsystem 126A-126C uses a message passing, split transaction protocol to transfer data and control information over data bus 22 and command bus 24. Buses 22 and 24 are 64-bit, bidirectional, tri-state buses. Each bus transactor circuit 15A-15C has an input and output queue within DPRAMs 122A-122C for storing messages being passed to and from its subsystem and the other bus transactor circuits 15A-15C and processor 12. Since processor 12 requires a low latency, high speed access to memory, it has a private port to memory controller 14 through processor bus 26 (shown in FIG. 1).

2.1 Data and Command Bus Interfaces

Bus interface units (BIUs) 120A-120C direct traffic over buses 22 and 24 to and from respective subsystems 126A-126C. Messages are passed between buses 22 and 24 and subsystem interface units (SIUs) 124A-124C through bus interface units 120A-120C and DPRAMs 122A-122C, respectively.

Typically, the operating frequency of each subsystem differs from that of system buses 22 and 24. DPRAMs 122A-122C are the logical boundaries for the different clock domains. In one embodiment, BIUs 120A-120C and DPRAMs 122A-122C are logically and physically identical. Although some portions of SIUs 124A-124C are similar, subsystem specific logic is typically required in each implementation. Thus, each SIU 124A-124C is logically and physically unique to the corresponding subsystem.

2.1.1 Bus Interface Unit Signals

FIG. 5 is a diagram illustrating the inputs and outputs of one of the system bus interfaces for subsystems 126A-126C. The system bus interface includes BIU 120, DPRAM 122 and SIU 124. DPRAM 122 is divided into a plurality of queues and forms a clock boundary 130 between BIU 120 and SIU 124. BIU 120 has the following input and output signals:

BCLK (input) is a System Bus Clock to which all bus signals are referenced.

RESET_N (input) is a System Reset signal.

DATA[63:0] (tri-state, bidirectional) is the 64-bit bi-directional data bus 22 (shown in FIGS. 1 and 4) for transferring data to and from external memory 46.

DATA_ERR (input) is asserted by memory controller 14 when the subsystem attempts a transaction to an invalid memory address.

DATA_LD (input) is a signal which loads the contents of data bus 22 into DPRAM 122. This signal will be asserted by memory controller 14 when data is to be transferred from external memory 46 to DPRAM 122. Data will be valid on Data Bus 22 on the following clock. This signal is used for direct memory access (DMA) data transfers from external memory to the corresponding subsystem.

DATA_GNT (input) is a DATA_GRANT signal which is asserted by memory controller 14 to the subsystem, indicating that BIU 120 should drive data onto data bus 22 on the following clock. This signal is used for DMA data transfers to external memory.

DATA_EOT (input) is a Data bus End Of Transfer signal which is asserted by memory controller 14 on the clock cycle that precedes the last cycle of a data transfer.

CMD[63:0] (tri-state, bidirectional) is the 64-bit bi-directional Command Bus 24 for communicating command headers and CPU data transfers (PIO) between memory controller 14 and each subsystem.

CMD_LD (input) is a Command Load signal which is asserted by memory controller 14 when the CPU (processor 12) is requesting a PIO transfer to the corresponding subsystem indicating that a valid command header will be present on command bus 24 on the following clock.

CMD_GNT (input) is a COMMAND GRANT signal which is asserted by memory controller 14 to indicate that BIU 120 is granted command bus 24.

CMD_PWA (output) is a PIO Write Acknowledge signal which is asserted by BIU 120 to indicate to memory controller 14 that a PIO write has been completed.

CMD_REQ[1:0] (output) is a Command Request signal which is asserted by BIU 120 to memory controller 14 to request that Command Bus data be transferred. The Command Request signal is coded per the following Table:

TABLE 1

CMD_REQ[1:0]   Meaning
00             IDLE
01             Memory Request
10             CPU Read Reply
11             Interrupt Request
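By way of illustration only, the CMD_REQ[1:0] coding of Table 1 might be modeled in software as a small enumeration. This is a minimal sketch; the names CmdReq and decode_cmd_req are hypothetical and are not part of the described embodiment.

```python
from enum import IntEnum


class CmdReq(IntEnum):
    """CMD_REQ[1:0] codes as listed in Table 1."""
    IDLE = 0b00
    MEMORY_REQUEST = 0b01
    CPU_READ_REPLY = 0b10
    INTERRUPT_REQUEST = 0b11


def decode_cmd_req(bits: int) -> CmdReq:
    """Decode the two CMD_REQ lines; bits above [1:0] are masked off."""
    return CmdReq(bits & 0b11)
```

A model of memory controller 14 could call decode_cmd_req on the sampled request lines to decide which kind of command bus transfer the BIU is asking for.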

2.1.2 Subsystem Interface Unit (SIU) Signals

Subsystem Interface Unit (SIU) 124 provides a synchronous interface between DPRAM 122 and the subsystem hardware logic. SIU 124 has the following input and output signals:

SCLK (input) is a Sub-system clock signal to which all SIU signals are referenced.

SRESET_N (output) is an SIU Reset signal which provides a synchronized system reset to the subsystem.

Dout[63:0] (output) is a 64-bit Data Out signal from SIU 124 to the subsystem.

Din[63:0] (input) is a 64-bit Data In signal from the subsystem to SIU 124.

ADDin (input) is an Address in signal from the subsystem to SIU 124.

WCE (input) is a Write Clock Enable which is asserted by the subsystem during the clock period when valid address and data are presented to SIU 124. Data will be written in DPRAM 122 on the rising edge of the clock when WCE is asserted.

VALID_PIO (output) is a Valid Program I/O in queue signal which, when asserted, indicates that PIO information is still being held in an Input Command Queue in DPRAM 122. The assertion of VP_ACK will pop an entry off the Input Command Queue. The signal VALID_PIO may remain asserted if additional PIO requests have been loaded into the queue.

VP_ACK is a Valid PIO Acknowledge input which is asserted by the subsystem to indicate that the top entry in the Input Command Queue has been used and can be discarded. This signal will be used by Input Command Queue pointers to advance to the next entry as well as to decrement the VALID_PIO counter.

WRITE is an SIU write input which is asserted by the subsystem to indicate that the PIO data has been decoded to be a write.

ACK_VLD is an ACK bus Valid output which is asserted by SIU 124 to indicate that ACK_BUS[7:0] contains a valid acknowledge message. This signal will be asserted when a data transfer begins.

AB_ACK is an ACK bus Acknowledge input which is asserted by the subsystem to indicate that the current acknowledge message has been read and is no longer needed.

PNTR_VLD is a Pointer Valid output which is asserted by SIU 124 to indicate that ACK_BUS[7:0] contains an updated queue pointer. This signal will be asserted when a data transfer completes.

ACK[7:0] is the Acknowledge Bus output which includes an Acknowledge message sent from BIU 120 to SIU 124 to inform the subsystem when memory requests have been completed and to provide the updated DPRAM address for buffer queue management (may be used by the subsystems for FIFO control). The Acknowledge message format is illustrated in FIG. 6, where "CMD" is a command field which indicates a memory write, a memory read or an error condition, "SSID" is a subsystem identification field, and "NEWRAMADR" is a new address for DPRAM 122.

HEADER_ADD is a Header Queue Addition input which is asserted for one clock when the subsystem has placed a header into a Request queue in DPRAM 122.

HQ_FULL is a Header Queue Full output which is asserted by SIU 124 when the Request queue is full.
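The handshake between VALID_PIO and VP_ACK described above can be pictured with a small behavioral model. The following is a sketch under stated assumptions, not the circuit itself: the class InputCommandQueue and its method load are hypothetical, queue depth is left unbounded because no depth is given here, and the VALID_PIO counter is modeled simply as the number of held entries.

```python
from collections import deque


class InputCommandQueue:
    """Behavioral sketch of the Input Command Queue seen from the subsystem side."""

    def __init__(self):
        self._entries = deque()

    def load(self, header):
        """Hypothetical helper: a PIO request arrives from the bus side."""
        self._entries.append(header)

    @property
    def valid_pio(self) -> bool:
        """VALID_PIO stays asserted while any PIO entry is still held."""
        return len(self._entries) > 0

    def top(self):
        """Entry currently presented to the subsystem."""
        return self._entries[0]

    def vp_ack(self):
        """VP_ACK: discard the top entry and advance to the next one."""
        if not self._entries:
            raise ValueError("VP_ACK asserted with no PIO entry pending")
        self._entries.popleft()
```

With two requests loaded, one VP_ACK leaves VALID_PIO asserted and a second VP_ACK deasserts it, which matches the statement that VALID_PIO may remain asserted when additional PIO requests have been queued.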

2.1.3 Global Signals

The following signals are global signals within integrated circuit 10, which are not specifically shown in FIG. 5.

BIG is a Big Endian Mode signal. When asserted, BIG indicates that system buses 22 and 24 are operating in big-endian mode (i.e., byte address 0 is carried on bits 63:56).

CONFIG_ENABLE is a Configuration Mode Enable. When asserted, this signal indicates that integrated circuit 10 is in a configuration mode and that the power-on defaults are being shifted in through a CONFIG_DIN port.

CONFIG_CLK is a signal on which configuration data is based.

CONFIG_DINx is a serial Configuration Data signal stream which is used to establish reset defaults. Each hierarchical block will take Din, direct it to all necessary register elements, then provide Dout.

CONFIG_Doutx is a serial Configuration Data output.
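As an illustration of the BIG signal, the mapping from a byte address to the bus bits that carry it might be sketched as follows. The big-endian case follows the description above (byte address 0 is bits 63:56). The little-endian case (byte address 0 on bits 7:0) is an assumption based on the usual convention and is not stated here. The function name byte_lane_bits is hypothetical.

```python
def byte_lane_bits(byte_addr: int, big: bool) -> tuple:
    """Return (msb, lsb) of the 64-bit bus bits carrying byte_addr (0..7).

    big=True follows the described big-endian mode; big=False is an
    assumed conventional little-endian mapping.
    """
    if not 0 <= byte_addr <= 7:
        raise ValueError("byte_addr must be in the range 0..7")
    lane = 7 - byte_addr if big else byte_addr
    lsb = 8 * lane
    return (lsb + 7, lsb)
```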

2.2 System Bus Transactions

To facilitate communications with system buses 22 and 24, DPRAM 122 is logically separated as illustrated in FIG. 7. DPRAM 122 has a Data Queue 150A, a reserved section 150B, a Read Response Queue 150C, an Input Command Queue 150D and a Request Queue 150E. The individual locations in DPRAM 122 are shown at 152, and their corresponding hexadecimal addresses are shown at 154.

The first 256 locations in DPRAM 122 define Data Queue 150A and are used to store DMA data for the subsystem. Read Response Queue 150C is used to store PIO and Cache Line Fill data from the subsystem when processor 12 is reading data from the subsystem (a CPU read cycle). Input Command Queue 150D stores incoming PIO requests from processor 12 to the subsystem. Request Queue 150E is used for storing subsystem messages being sent to system command bus 24.

2.2.1 Header Format

All command bus messages which are passed through Input Command Queue 150D or Request Queue 150E commence with a header 160 which is formatted as shown in FIG. 8. Each field of header 160 is defined below:

ERROR (Transaction Error) is a read reply error flag. In the event that a PIO read request cannot be completed, the subsystem will return a header with this bit set.

CMD (Command) contains the three bit Transaction type (see FIG. 10).

BCNT(7:0)/Mask (Byte Count/Write Mask). For all read operations and burst write transfers, this field contains the number of bytes to be transferred. For write single commands, this field indicates the byte lanes to be written. Bit 7 corresponds to bits 63-56 of the 64-bit word, and bit 0 corresponds to bits 7-0 of the 64-bit word.

SSID (Subsystem ID) is used for message tracking to identify the particular subsystem associated with the message. These bits are set by the subsystem when a memory data transfer is requested. They are undefined for PIO headers.

RAMADR[7:0] (Ram Address) is the address offset into Data Queue 150A which contains the data to be used for the data transfer. The most significant bit (MSB) of the DPRAM 122 address is implied by the type of transfer (i.e. DMA data versus command data).

WRAP (Address Wrap Select) is the bit on which to wrap the RAM pointer. A value of zero wraps on bit 0, resulting in a two word buffer. A value of 1 wraps on bit 1, providing a two bit address, resulting in a four word buffer. A value of 7 wraps the address on bit 7, which yields a 256 word buffer in Data Queue 150A.

DEC (decrementing burst direction), when set, instructs memory controller 14 that the memory addresses for a burst transfer should decrement.

ADDRESS [31-0] (System Address) is the physical address in the external memory where the data will be transferred. This is a byte address, and bits 1-0 are significant.
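Two of the header fields above lend themselves to a short worked example: the WRAP field, which sets the circular buffer size in Data Queue 150A, and the write mask carried in BCNT(7:0) for write single commands. The sketch below is illustrative only. The function names are hypothetical, and the behavior of next_ramadr (address bits above the wrap bit are held constant while the lower bits roll over) is an assumption consistent with, but not spelled out in, the description of WRAP.

```python
def wrap_buffer_words(wrap: int) -> int:
    """Buffer size in 64-bit words: WRAP=0 -> 2, WRAP=1 -> 4, WRAP=7 -> 256."""
    if not 0 <= wrap <= 7:
        raise ValueError("WRAP must be in the range 0..7")
    return 1 << (wrap + 1)


def next_ramadr(ramadr: int, wrap: int) -> int:
    """Advance an 8-bit RAMADR by one word, wrapping on bit `wrap`.

    Assumption: bits above the wrap bit stay fixed (the buffer base).
    """
    mask = wrap_buffer_words(wrap) - 1
    base = ramadr & 0xFF & ~mask
    return base | ((ramadr + 1) & mask)


def write_mask_lanes(mask: int) -> list:
    """List the (msb, lsb) bit ranges written for a write single command.

    Mask bit 7 selects bits 63-56 and mask bit 0 selects bits 7-0.
    """
    return [(8 * i + 7, 8 * i) for i in range(7, -1, -1) if mask & (1 << i)]
```

For example, with WRAP equal to 1 the pointer sequence starting at RAMADR 0x10 is 0x10, 0x11, 0x12, 0x13 and then 0x10 again, giving the four word buffer described above.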

2.2.2 Screen Block Header Format

For graphics accesses, such as for graphics and display subsystem 126C, a special command header is used to allow tile based DMA operations using screen relative addressing. This header is used when memory controller 14 must perform address translation from a screen coordinate to a physical memory location in the external memory. The header format for a special command header 170 is shown in FIG. 9. The fields in header 170 are defined as follows:

offset (bits [7-0]) define an X offset within a tile for a starting pixel.

offset (bits [15-8]) define a Y offset within a tile for the starting pixel.

TileID (bits [23-16]) define a tile number with respect to a particular frame buffer for the starting pixel.

Width (bits [31-24]) define a number of bytes per line.

Height (bits [36-32]) define the number of lines (5 bits).

Direction (bit [37]) 1=read; 0=write.

FrameID (bits [39-38]) is a frame buffer ID (e.g. front/back buffer or overlay plane).

RAMADR (bits [47-40]) define the starting DPRAM address (8 bits) for subsystem use.

Bits [49-48] are reserved.

BSize (bits [55-50]) define the burst size.

BSteer (bits [59-56]) are used by the subsystem on a read for byte steering.

CMD (bits [62-60]) are set to "000" for this special header type.

ERROR (bit [63]) is always `0` for compatibility with other command headers.
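The bit positions listed above fully determine a 64-bit word, so the screen block header can be illustrated by a simple pack and unpack routine. This is a minimal sketch rather than a definitive implementation. The field keys (xoffset, yoffset, tile_id and so on) and the function names are chosen here for illustration. The reserved bits [49-48], the CMD bits [62-60] and the ERROR bit [63] are left at zero, as described.

```python
# (key, least significant bit, width in bits), taken from the field list above.
SCREEN_HEADER_FIELDS = [
    ("xoffset", 0, 8),
    ("yoffset", 8, 8),
    ("tile_id", 16, 8),
    ("width", 24, 8),
    ("height", 32, 5),
    ("direction", 37, 1),   # 1 = read, 0 = write
    ("frame_id", 38, 2),
    ("ramadr", 40, 8),
    ("bsize", 50, 6),
    ("bsteer", 56, 4),
]


def pack_screen_header(**values) -> int:
    """Build the 64-bit header word; unspecified fields default to zero."""
    known = {key for key, _, _ in SCREEN_HEADER_FIELDS}
    unknown = set(values) - known
    if unknown:
        raise ValueError(f"unknown field(s): {sorted(unknown)}")
    word = 0  # reserved, CMD and ERROR bits remain zero
    for key, lsb, width in SCREEN_HEADER_FIELDS:
        value = values.get(key, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{key}={value} does not fit in {width} bits")
        word |= value << lsb
    return word


def unpack_screen_header(word: int) -> dict:
    """Split a 64-bit header word back into its named fields."""
    return {key: (word >> lsb) & ((1 << width) - 1)
            for key, lsb, width in SCREEN_HEADER_FIELDS}
```

A read of four lines of sixteen bytes from tile 5 could then be requested with pack_screen_header(tile_id=5, width=16, height=4, direction=1), and the resulting word always has bits [63-60] equal to zero, which is what distinguishes this special header type.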

2.2.3 Transaction Types

FIG. 10 is a table which shows the transaction types supported by command bus 24. The transaction types are defined by the CMD field in headers 160 and 170.

The dual system bus architecture of integrated circuit 10 allows for concurrent transfers on data bus 22 and command bus 24. There are some limitations and rules that should be adhered to, however. Concurrent transfers on data bus 22 and command bus 24 to the same bus transactor circuit are not supported. Memory controller 14 has the responsibility to ensure this does not occur. One clock of bus free time is required between data transfers into DPRAM 122 (DATA_LD or CMD_LD asserted) and data transfers out of DPRAM 122 (assertion of DATA_GNT or CMD_GNT). BIU 120 guarantees part of this requirement by not asserting CMD_REQ[1:0] during an ongoing data phase, assuring that CMD_GNT will not be issued. Memory controller 14 assures that a data phase (DATA_GNT) is not started until one clock after a CMD_LD has been issued.
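The first rule above, that data bus 22 and command bus 24 may not both be transferring to the same bus transactor circuit at once, can be illustrated by a simple schedule checker. This is only a sketch: the tuple layout (bus, transactor, first clock, last clock) and the function name find_conflicts are hypothetical, and the one-clock bus free time rule is deliberately not modeled because its exact cycle timing is not restated here.

```python
def find_conflicts(transfers):
    """Return pairs of transfers that break the same-transactor rule.

    Each transfer is (bus, transactor, first_clock, last_clock) with
    inclusive clock numbers; bus is "data" or "cmd".
    """
    items = list(transfers)
    conflicts = []
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            different_bus = a[0] != b[0]
            same_transactor = a[1] == b[1]
            overlap = a[2] <= b[3] and b[2] <= a[3]
            if different_bus and same_transactor and overlap:
                conflicts.append((a, b))
    return conflicts
```

Concurrent traffic to different transactors, such as a data burst to 15A during a command transfer to 15B, produces no conflict, while a command transfer to 15A during a data burst to 15A is reported.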

2.3 Bus Interface Unit (BIU)

BIU 120 controls transfers on system buses 22 and 24 by managing the input and output message queues in DPRAM 122. BIU 120 is the transport mechanism by which the subsystem communicates to memory controller 14 and processor 12. BIU 120 contains no subsystem specific data. All DMA and PIO functions, such as buffer allocation, address generation, and register processing, are maintained by the corresponding subsystem.

BIU 120 reacts to messages sent by the subsystem or memory controller 14/processor 12 and manages flow control on buses 22 and 24.

2.3.1 Data Bus Timing

Data bus 22 is used exclusively for passing data between external memory 46, through shared memory port 20, and a bus master (either processor 12 or one of the bus transactor circuits 15A-15C). FIG. 11 is a waveform diagram illustrating the timing for a four cycle data burst on data bu