An improvement in a microprocessor having a cache memory providing strong and weak write ordering modes. The microprocessor includes a terminal for receiving a signal indicating whether an external write buffer is empty and an internal signal indicating whether an internal write buffer is empty. In the strong ordering mode, if the write buffers are not empty and a hit condition occurs during a write cycle, operation of the microprocessor is halted until the buffers are empty.
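The halt condition in the abstract above can be written as a small predicate. This is only a sketch: the function and parameter names are hypothetical, and the behavior in the weak ordering mode (no halt) is an assumption, since the abstract describes only the strong ordering case.

```python
def must_stall(strong_ordering: bool,
               external_buffer_empty: bool,
               internal_buffer_empty: bool,
               write_hit: bool) -> bool:
    """Return True when the core should halt for this write cycle."""
    if not strong_ordering:
        # Assumption: in weak ordering mode the write proceeds
        # while buffered writes are still draining.
        return False
    buffers_empty = external_buffer_empty and internal_buffer_empty
    # Strong ordering: a write hit must wait until both buffers are empty.
    return write_hit and not buffers_empty
```

The halt would then be held for as long as `must_stall` keeps returning True, that is, until both empty signals are asserted.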
A method and system of implementing a cache coherency mechanism for supporting a non-inclusive cache memory hierarchy within a data processing system is disclosed. In accordance with the method and system of the invention, the memory hierarchy includes a primary cache memory, a secondary cache memory, and a main memory. The primary cache memory and the secondary cache memory are non-inclusive. Further, a first state bit and a second state bit are provided within the primary cache, in association with each cache line of the primary cache. As a preferred embodiment, the first state bit is set only if a corresponding cache line in the primary cache memory has been modified under a write-through mode, while the second state bit is set only if a corresponding cache line also exists in the secondary cache memory. As such, the cache coherency between the primary cache memory and the secondary cache memory can be maintained by utilizing the first state bit and the second state bit in the primary cache memory.
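One way to picture the two per-line state bits described above is the following sketch. All names are hypothetical, and the events that update the bits (fill, store, secondary-cache eviction) are assumptions chosen for illustration; the abstract states only the conditions under which each bit is set.

```python
from dataclasses import dataclass


@dataclass
class L1Line:
    tag: int
    modified_wt: bool = False  # first state bit: modified under write-through mode
    in_l2: bool = False        # second state bit: line also exists in the secondary cache


def on_l1_fill(tag: int, l2_tags: set) -> L1Line:
    """Allocate a primary-cache line and record whether the secondary cache holds it."""
    return L1Line(tag=tag, in_l2=tag in l2_tags)


def on_l1_store(line: L1Line, write_through: bool) -> None:
    """Set the first bit only for stores performed in write-through mode."""
    if write_through:
        line.modified_wt = True


def on_l2_evict(line: L1Line) -> None:
    """Non-inclusive hierarchy: the secondary cache may drop a line the primary keeps."""
    line.in_l2 = False
```

Because the hierarchy is non-inclusive, the second bit lets the primary cache know whether the secondary cache holds a copy without querying it.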
A method and system in a data processing system for transferring data from a first device to a second device within the data processing system. The data processing system includes a data bus, an address bus, a first address space associated with a memory and a second address space associated with an input/output device. Initially, a transfer signal is transmitted in the data processing system. The transfer signal identifies the transfer as a transfer concerning an address in the second address space associated with the input/output device. A first address package is then transmitted to the second device from the first device on the address bus. The first address package includes a transfer identifier, a first identifier associated with the first device and a second identifier associated with the second device. A second address package, comprising a byte count and an address, is transmitted to the second device from the first device on the address bus. If data is to be transferred, the data is then transferred on the data bus. Finally, a reply signal may be transmitted between the first and second devices, acknowledging the success or failure of the data transfer.
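The two address packages described above could be encoded as follows. This is a minimal sketch: the field widths (8-bit identifiers, 8-bit byte count, 32-bit address), the field order, and all function names are assumptions, since the abstract names the fields but not their layout.

```python
def _check(value: int, bits: int) -> int:
    """Reject values that do not fit in the assumed field width."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value


def pack_first(transfer_id: int, source_id: int, dest_id: int) -> int:
    """First address package: transfer identifier plus the two device identifiers."""
    return (_check(transfer_id, 8) << 16) | (_check(source_id, 8) << 8) | _check(dest_id, 8)


def unpack_first(word: int) -> tuple:
    return (word >> 16) & 0xFF, (word >> 8) & 0xFF, word & 0xFF


def pack_second(byte_count: int, address: int) -> int:
    """Second address package: byte count plus the I/O-space address."""
    return (_check(byte_count, 8) << 32) | _check(address, 32)


def unpack_second(word: int) -> tuple:
    return (word >> 32) & 0xFF, word & 0xFFFFFFFF
```

A receiving device would decode the first package to confirm that it is the target, then use the second package to learn how many bytes to expect and at which address.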
After a portion of a cache line has been zone written from a processor core (102) to a cache array (105), a read access received from the processor core (102) for one or more bytes within the cache line corresponding to the zone written data can be satisfied before a cache fill operation initiated by the zone written operation is completed. If the read access is for one or more bytes of the cache line which was not previously zone written, then the requested data is passed directly from the filling bus (113) to the processor core (102) as soon as it becomes valid on the filling bus (113). If the read access is for one or more bytes of the zone written data, then those one or more bytes are read from the cache array (105) to the processor core (102) regardless of the progress of the cache fill. All read accesses to filling cache lines are serviced in the minimum amount of time by satisfying the access immediately upon availability of only the exact portion requested.
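The two read cases described above can be sketched as a function that picks the data source. The names, the per-byte boolean masks, and the return labels are hypothetical. A read that spans both zone-written and not-yet-filled bytes is not covered by the abstract, so this sketch conservatively reports "wait" for it.

```python
def read_source(offset: int, length: int, zone_written: list, on_fill_bus: list) -> str:
    """Decide where a read to a filling cache line can be served from.

    zone_written[i] is True if byte i was zone written into the cache array.
    on_fill_bus[i] is True if byte i is currently valid on the filling bus.
    """
    wanted = range(offset, offset + length)
    if all(zone_written[i] for i in wanted):
        # Zone-written bytes come from the array regardless of fill progress.
        return "array"
    if not any(zone_written[i] for i in wanted) and all(on_fill_bus[i] for i in wanted):
        # Bytes that were not zone written are forwarded straight from the bus.
        return "fill_bus"
    # Requested bytes not yet valid, or a mixed read (not modeled here).
    return "wait"
```

The read never waits for the whole line: it completes as soon as the exact bytes it requested are available from one of the two sources.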
A data cache and a plurality of companion fill buffers having corresponding tag matching circuitry are provided to a computer system. Each fill buffer independently stores and tracks a replacement cache line being filled with data returning from main memory in response to a cache miss. When the cache fill is completed, the replacement cache line is output for the cache tag and data arrays of the data cache if the memory locations are cacheable and the cache line has not been snoop hit while the cache fill was in progress. Additionally, the fill buffers are organized and provided with sufficient address and data ports as well as selectors to allow the fill buffers to respond to subsequent processor loads and stores, and external snoops that hit their cache lines while the cache fills are in progress. As a result, the cache tag and data arrays of the data cache can continue to serve subsequent processor loads and stores, and external snoops, while one or more cache fills are in progress, without ever having to stall the processor.
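The interaction between the cache arrays and the fill buffers described above might be modeled as below. This is a simplified sketch with hypothetical class and method names; it tracks whole lines by address and ignores ports, selectors, and partial data arrival.

```python
class FillBuffer:
    def __init__(self, line_addr: int, cacheable: bool = True):
        self.line_addr = line_addr
        self.cacheable = cacheable
        self.snoop_hit = False   # set if an external snoop hits during the fill
        self.data = {}           # offset -> value, data returned so far


class DataCache:
    def __init__(self):
        self.lines = {}          # stands in for the cache tag and data arrays
        self.fill_buffers = []

    def start_fill(self, line_addr: int, cacheable: bool = True) -> FillBuffer:
        fb = FillBuffer(line_addr, cacheable)
        self.fill_buffers.append(fb)
        return fb

    def lookup(self, line_addr: int) -> str:
        """Tag-match against the cache arrays and every in-flight fill buffer."""
        if line_addr in self.lines:
            return "cache"
        if any(fb.line_addr == line_addr for fb in self.fill_buffers):
            return "fill_buffer"
        return "miss"

    def snoop(self, line_addr: int) -> None:
        for fb in self.fill_buffers:
            if fb.line_addr == line_addr:
                fb.snoop_hit = True

    def complete_fill(self, fb: FillBuffer) -> bool:
        """Move the line into the cache only if cacheable and not snoop hit."""
        self.fill_buffers.remove(fb)
        if fb.cacheable and not fb.snoop_hit:
            self.lines[fb.line_addr] = fb.data
            return True
        return False
```

Since `lookup` consults the fill buffers as well as the arrays, loads and stores to a line that is still filling can be routed to its buffer instead of stalling.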
Following a cache miss by an operation, the address for the operation is transmitted on the bus coupling the cache to lower levels of the storage hierarchy. A portion of the address including the index field is transmitted during a first bus cycle, and may be employed to begin directory lookups in lower level storage devices before the address tag is received. The remainder of the address is transmitted during subsequent bus cycles, which should be in time for address tag comparisons with the congruence class elements. To allow multiple directory lookups to occur concurrently in a pipelined directory, a portion of multiple addresses for several data access operations, each portion including the index field for the respective address, may be transmitted during the first bus cycle or staged in consecutive bus cycles, with the remainder of each address, including the cache tags, transmitted during the subsequent bus cycles. This allows directory lookups utilizing the index fields to be processed concurrently within a lower level storage device for multiple operations, with the address tags being provided later, but still timely for tag comparisons at the end of the directory lookup. Where the lower level storage device operates at a higher frequency than the bus, overall latency is reduced and directory bandwidth is more efficiently utilized.
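The index-first transmission described above can be illustrated with a short sketch. The bit widths (64-byte lines, 1024 congruence classes), the dictionary-based directory, and the function names are all assumptions for illustration; the byte offset within the line is dropped for brevity.

```python
OFFSET_BITS = 6    # assumed 64-byte cache lines
INDEX_BITS = 10    # assumed 1024 congruence classes


def split_address(addr: int) -> list:
    """Return the bus beats for one address: index field first, tag afterwards."""
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return [("index", index), ("tag", tag)]


def directory_lookup(directory: dict, addr: int) -> bool:
    """Start the lookup on the index beat and finish it when the tag beat arrives."""
    beats = split_address(addr)
    # First bus cycle: the index selects the congruence class.
    ways = directory.get(beats[0][1], [])
    # Later bus cycle: the tag is compared against the class elements.
    return beats[1][1] in ways
```

For several outstanding operations, the index beats of all of them would be sent before any tag beat, so a pipelined directory can read several congruence classes while the tags are still in transit.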