A directory system directs cache line access requests from processors in a multi-processor system with a shared memory system through a cache line states directory. The cache line states directory stores a state value that identifies a cache line shared states word. The cache line shared states word identifies the processor that owns the cache line and the state of access of each processor that shares access to the cache line. A state value encoder encodes a cache line shared state word into a state value and loads the state value into the cache line states directory. A state value decoder decodes the state value into a cache line shared state word for use by the cache line directory system in retrieving the cache line. A plurality of cache line tables are used with each cache line assigned to one of the tables. The cache line table stores a state value for each cache line shared states word stored in the table. The encoder and decoder perform a table look-up to convert between a cache line shared state word and a state value. Each of said cache line tables stores an ordered list of cache line shared state words and their corresponding state values. The ordered list is a list of cache line shared state words that have the most significance to the multi-processor system.
A shared-memory system includes processing modules communicating with each other through a network. Each of the processing modules includes a processor, a cache, and a memory unit that is locally accessible by the processor and remotely accessible via the network by all other processors. A home directory records states and locations of data blocks in the memory unit. A prediction facility that contains reference history information of the data blocks predicts a next requester of a number of the data blocks that have been referenced recently. The next requester is informed by the prediction facility of the current owner of the data block. As a result, the next requester can issue a request to the current owner directly without an additional hop through the home directory.
A cache coherence system and method for use in a multiprocessor computer system having a plurality of processor nodes, a memory and an interconnect network connecting the plurality of processor nodes to the memory. Each processor node includes one or more processors. The memory includes a plurality of lines and a cache coherence directory structure having a plurality of directory structure entries. Each of the directory structure entries is associated with one of the plurality of lines and each directory structure entry includes processor pointer information, expressed as a set of bit vectors, indicating the processors that have cached copies of lines in memory. Processor pointer information may be a function of a processor number assigned to each processor; the processor number may be expressed as a function of a first set of bits and a second set of bits which are respectively mapped into first and second bit vectors of the n bit vectors.
A system and method of maintaining consistent cached copies of memory in a multiprocessor system having a main memory, includes a memory directory having entries mapping the main memory, an access history information in the memory directory entries, and a directory cache having records corresponding to a subset of the memory directory entries. The memory directory may be a full map directory having entries mapping all of the main memory or a sparse directory having entries mapping to a subset of the main memory. The method includes the steps of receiving a signal indicating a processor cache miss, retrieving a memory directory entry from the memory directory, updating the access history of the memory directory entry, selecting a directory cache line based on its access history and allocating the directory cache line for replacement, and writing the memory directory entry into the directory cache.
An improved directory-based cache coherency memory system for a multiprocessor computer system. The memory system includes a dual ported system memory shared by the multiple processors within the computer system; a plurality of data cache memories, at least one data cache memory associated with each processor; and first and second memory busses, the first memory bus connecting a first subset of processors and associated data cache memories to a first port (PORT A) of the system memory, and the second memory bus connecting a second subset of processors and associated data cache memories to a second port (PORT B) of the system memory. Cache coherency is maintained through the use of memory line state information saved with each line of memory within the system memory and data cache memories. The system memory contains a system memory line state for each line of memory saved within the system memory, the system memory line state being any one of the group: SHARED PORT A, SHARED BOTH, OWNED PORT A and OWNED PORT B. Each one of these states is represented by a different two bit code saved with each line of memory in system memory. Additionally, each data cache memory contains a data cache memory line state for each line of memory saved within the data cache memory, the data cache memory line state being any one of the group: MODIFIED, EXCLUSIVE, SHARED, or INVALID.
A method and apparatus for a coherence mechanism that supports a distributed memory programming model in which processors each maintain their own memory area, and communicate data between them. A hierarchical programming model is supported, which uses distributed memory semantics on top of shared memory nodes. Coherence is maintained globally, but caching is restricted to a local region of the machine (a "node" or "caching domain"). A directory cache is held in an on-chip cache and is multi-banked, allowing very high transaction throughput. Directory associativity allows the directory cache to map contents of all caches concurrently. References off node are converted to non-allocating references, allowing the same access mechanism (a regular load or store) to be used for both for intra-node and extra-node references. Stores (Puts) to remote caches automatically update the caches instead of invalidating the caches, allowing producer/consumer data sharing to occur through cache instead of through main memory.