WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Method and apparatus for incorporating cache line replacement and cache write policy information into tag directories in a cache system    
United States Patent5325504   
Link to this pagehttp://www.wikipatents.com/5325504.html
Inventor(s)Tipley; Roger E. (Houston, TX); Kelly; Philip C. (Houston, TX)
AbstractA method and apparatus for incorporating cache line replacement and cache write policy information into the tag directories in a cache system. In a 2 way set-associative cache, one bit in each way's tag RAM is reserved for LRU information, and the bits are manipulated such that the Exclusive-OR of each way's bits points to the actual LRU cache way. Since all of these bits must be read when the cache controller determines whether a hit or miss has occurred, the bits are available when a cache miss occurs and a cache line replacement is required. The method can be generalized to caches which include a number of ways greater than two by using a pseudo-LRU algorithm and utilizing group select bits in each of the ways to distinguish between least recently used groups. Cache write policy information is stored in the tag RAM's to designate various memory areas as write-back or write-through. In this manner, system memory situated on an I/O bus which does not recognize inhibit cycles can have its data cached.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 5325504
Method and apparatus for incorporating cache line replacement and cache

     write policy information into tag directories in a cache system - US Patent 5325504 Drawing
Method and apparatus for incorporating cache line replacement and cache write policy information into tag directories in a cache system
Inventor     Tipley; Roger E. (Houston, TX); Kelly; Philip C. (Houston, TX)
Owner/Assignee     Compaq Computer Corporation (Houston, TX)
Patent assignment
All assignments
Publication Date     June 28, 1994
Application Number     07/752,761
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     August 30, 1991
US Classification     711/128
Int'l Classification     G06F 012/02
Examiner     Dixon; Joseph L.
Assistant Examiner     Nguyen; Hiep T.
Attorney/Law Firm     Pravel, Hewitt, Kimball & Krieger
Address
Parent Case    
Priority Data    
USPTO Field of Search     364/200 MS File 364/900 MS File 395/400 395/425
Patent Tags     incorporating cache line replacement cache write policy information into tag directories cache
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
5155824
Edenfield
711/143
Oct,1992

[0 after 0 votes]
5140690
Hata
714/5
Aug,1992

[0 after 0 votes]
5125085
Phillips
711/122
Jun,1992

[0 after 0 votes]
5119485
Ledbetter, Jr.
711/146
Jun,1992

[0 after 0 votes]
5091851
Shelton
711/128
Feb,1992

[0 after 0 votes]
5060136
Furney

Oct,1991

[0 after 0 votes]
5043885
Robinson
711/133
Aug,1991

[0 after 0 votes]
4736293
Patrick
711/128
Apr,1988

[0 after 0 votes]
4493026
Olnowich
711/128
Jan,1985

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


We claim:

1. A method for incorporating cache line replacement information into tag memory in a cache subsystem organized as a multiple way cache which includes first and second ways, the cache subsystem having a cache controller and tag memory associated with each of the cache ways, the tag memory including a plurality of tag entries, wherein tag entries in tag memory associated with the first way correspond to tag entries in tag memory associated with the second way and each of the tag entries includes cache line replacement information comprising a replacement bit having either of two opposite value, the method comprising:

changing the replacement bit in a tag entry associated with the first way to the opposite value of the replacement bit from a corresponding tag entry associated with the second way when cache line replacement information is being updated in said tag entry associated with the first way;

changing the replacement bit in a tag entry associated with the second way to the value of the replacement bit from a corresponding tag entry associated with the first way when cache line replacement information is being updated in said tag entry associated with the second way; and

calculating an Exclusive-OR value from corresponding first and second way tag entry replacement bits when a cache line replacement is required to determine a way into which a line will be placed, wherein said Exclusive-OR value points to the way where said cache line is to be placed.

2. The method of claim 1, wherein the multiple way cache is a 2 way set-associative cache.

3. The method of claim 1, further comprising:

setting the replacement bits in each of the tag entries to an initial value when the cache subsystem is flushed.

4. The method of claim 1, wherein the cache may include a number of ways greater than two, the ways are organized into first and second groups, and each of the tag entries include a group select bit, the method further comprising:

changing the group select bit in a tag entry in the first group such that the Exclusive-OR value of the group select bits in tag entries in the first group is opposite of the Exclusive-OR value of group select bits in corresponding tab entries in the second group when cache line replacement information is being updated in said tag entry in the first group;

changing the group select bit in a tag entry in the second group such that the Exclusive-OR value of the group select bits in tag entries in the second group is the Exclusive-OR value of group select bits in corresponding tag entries in the first group when cache line replacement information is being updated in said tag entry in the second group; and

calculating the Exclusive-OR value from the Exclusive-OR of the group select bits for corresponding tag entries in each group when a cache line replacement is required to determine the group into which the line will be placed, wherein said Exclusive-OR value points to the group where the cache line is to be placed.

5. A cache subsystem which incorporates cache line replacement information into tag memory, the cache subsystem organized as a multiple way cache which includes first and second ways, comprising:

a cache controller;

tag memory coupled to said cache controller and associated with each of the cache which includes a plurality of tag entries, wherein tag entries in tag memory associated with the first way correspond to tag entries in tag memory associated with the second way and each of said tag entries includes cache line replacement information comprising a replacement bit having one of two opposite values;

means coupled to said tag memory for changing the replacement bit in a tag entry associated with the first way to the opposite value of the replacement bit from a corresponding tag entry associated with the second way when cache line replacement information is being updated in said tag entry associated with the first way;

means coupled to said tag memory for changing the replacement bit in a tag entry associated with the second way to the value of the replacement bit from a corresponding tag entry associated with the first way when cache line replacement information is being updated in said tag entry associated with the second way; and

means coupled to said tag memory for calculating the Exclusive-OR value from corresponding first and second way tag entry replacement bits when a cache lie replacement is required to determine the way into which the line will be placed, wherein said Exclusive-OR value points to the way where said cache line is to be placed.

6. The cache subsystem of claim 5, wherein said cache memory is 2 way set-associative cache memory.

7. The cache subsystem of claim 5, wherein said cache controller further comprises:

means coupled to said tag memory for setting the replacement bits in each of said tag entries to an initial value when the cache subsystem is flushed.

8. A method for incorporating cache write policy information into tag memory in a cache subsystem having a cache controller, cache memory which holds data entries, and tag memory which includes a plurality of tag entries that correspond to data entries stored in the cache memory, the cache subsystem being comprised in a computer system having memory and a means for generating cache write policy information to the cache subsystem when the cache subsystem receives data from the computer system memory, the method comprising:

the cache controller receiving cache write policy information from the generating means when the cache memory receives a data entry from the computer system memory;

the cache controller setting a bit in the tag entry corresponding to said data entry indicating the cache write policy of said data entry;

the cache controller reading said bit in said corresponding tag entry to determine the cache write policy for said data entry when a cache write hit occurs to said data entry; and

the cache controller employing the cache write policy as determined by said bit.

9. The method of claim 8, wherein the computer system includes multiple processors, data entries stored in the cache memory have an associated state, wherein the associated state determines whether a write-back or write-through policy is employed, and each of the tag entries includes a plurality of state bits which represent the state of corresponding data entries, the method further comprising:

the cache controller setting the state bits in a tag entry to a state which employs a write-through policy when it receives cache write policy information from the generating means designating a write-through policy for a data entry corresponding to said tag entry; and

the cache controller setting the state bits in a tag entry to a state which employs a write-back policy when it receives cache write policy information from the generating means designating a write-back policy for a data entry corresponding to said tag entry.

10. The method of claim 9, the computer system having a first bus, a second bus, memory coupled to the first bus, and memory coupled to the second bus, wherein the first bus may have cycles aborted and the second bus cannot have cycles aborted, the method further comprising:

the generating means generating cache write policy information indicating a write-back policy for data entries stored in the memory coupled to the first bus; and

the generating means generating cache write policy information indicating a write-through policy for data entries stored in the memory coupled to the second bus.

11. The method of claim 9, the computer system having a first bus, a second bus, memory coupled to the first bus memory coupled to the second bus, wherein the first bus recognizes write-back inhibit cycles and the second bus does not recognize write-back inhibit cycles, the method further comprising:

the generating means generating cache write policy information indicating a write-back policy for data entries stored in the memory coupled to the first bus; and

the generating means generating cache write policy information indicating a write-through policy for data entries stored in the memory coupled to the second bus.

12. A computer system having a cache subsystem which incorporates cache write policy information into tag memory, the computer system having a first bus, memory coupled to the firs bus, and a means coupled to the first bus for generating cache write policy information to the cache subsystem when the cache subsystem receives data, the cache subsystem being coupled to the first bus and comprising:

cache memory coupled to the first bus;

a cache controller coupled to the first bus and said cache memory;

tag memory coupled to the first bus and said cache controller which includes a plurality of tag entries that correspond to data entries stored in said cache memory;

means coupled to the first bus for receiving cache write policy information from the generating means when the cache memory receives a data entry from the computer system memory;

means coupled to said tag memory and said receiving means for setting a bit in the tag entry corresponding to said data entry indicating the cache write policy of said data entry;

means coupled to said tag memory for reading said bit in said corresponding tag entry to determine the cache write policy for said data entry when a cache write hit occurs to said data entry; and

means coupled to said bit reading means for employing the cache write policy as determined by said bit.

13. The computer system of claim 12, the computer system further including multiple processors, wherein data entries stored in said cache memory have an associated state which determines whether a write-back or write-through policy is employed for the associated data entry, and wherein each of said tag entries includes a plurality of state bits which represent the state of corresponding data entries, the cache controller further comprising:

means coupled to said tag memory and said receiving means for setting the state bits in a tag entry to a state which employs a write-through policy when said receiving means receives cache write policy information from the generating means designating a write-through policy for a data entry corresponding to said tag entry; and

means coupled to said tag memory and said receiving means for setting the state bits in a tag entry to a state which employs a write-back policy when said receiving means receives cache write policy information from the generating means designating a write-back policy for a data entry corresponding to said tag entry.

14. The computer system of claim 12, further comprising:

a second bus; and

memory coupled to said second bus;

wherein the first bus can have cycles aborted and the second bus cannot have cycles aborted, wherein the generating means includes:

means for generating cache write policy information indicating a write-back policy for data entries stored in the memory coupled to the first bus; and

means for generating cache write policy information indicating a write-through policy for data entries stored in the memory coupled to the second bus.

15. The computer system of claim 12, further comprising:

a second bus; and

memory coupled to said second bus;

wherein the first bus recognizes write-back inhibit cycles and the second bus does not recognize write-back inhibit cycles, wherein the generating means includes:

means for generating cache write policy information indicating a write-back policy for data entries stored in the memory coupled to the first bus; and

means for generating cache write policy information indicating a write-through policy for data entries stored in the memory coupled to the second bus.
 Description Submit all comments and votes
 


BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to microprocessor cache subsystems in computer systems, and more specifically to a method for incorporating least recently used and cache write policy information into the tag memories of a cache system.

2. Description of the Prior Art

The personal computer industry is a vibrant and growing field that continues to evolve as new innovations occur. The driving force behind this innovation has been the increasing demand for faster and more powerful computers. A major bottleneck in personal computer speed has historically been the speed with which data can be accessed from memory, referred to as the memory access time. The microprocessor, with its relatively fast processor cycle times, has generally been delayed by the use of wait states during memory accesses to account for the relatively slow memory access times. Therefore, improvement in memory access times has been one of the major areas of research in enhancing computer performance.

In order to bridge the gap between fast processor cycle times and slow memory access times, cache memory was developed. A cache is a small amount of very fast, and expensive, zero wait state memory that is used to store a copy of frequently accessed code and data from main memory. The microprocessor can operate out of this very fast memory and thereby reduce the number of wait states that must be interposed during memory accesses. When the processor requests data from memory and the data resides in the cache, then a cache read hit takes place, and the data from the memory access can be returned to the processor from the cache without incurring wait states. If the data is not in the cache, then a cache read miss takes place, and the memory request is forwarded to the system and the data is retrieved from main memory, as would normally be done if the cache did not exist. On a cache miss, the data that is retrieved from memory is provided to the processor and is also written into the cache due to the statistical likelihood that this data will be requested again by the processor.

An efficient cache yields a high "hit rate", which is the percentage of cache hits that occur during all memory accesses. When a cache has a high hit rate, the majority of memory accesses are serviced with zero wait states. The net effect of a high cache hit rate is that the wait states incurred on a relatively infrequent miss are averaged over a large number of zero wait state cache hit accesses, resulting in an average of nearly zero wait states per access. Also, since a cache is usually located on the local bus of the microprocessor, cache hits are serviced locally without requiring use of the processor or host bus. Therefore, a processor operating out of its local cache has a much lower "bus utilization." This reduces host bus bandwidth used by the processor, making more bandwidth available for other processors or bus masters.

An important consideration in cache performance is the organization of the cache and the cache management policies that are employed in the cache. A cache can generally be organized into either a direct-mapped or set-associative configuration. In a direct-mapped organization, the physical address space of the computer is conceptually divided up into a number of equal pages, with the page size equaling the size of the cache. The cache is divided up into a number of sets, with each set having a certain number of lines. Each of the pages in main memory has a number of lines equivalent to the number of lines in the cache, and each line from a respective page in main memory corresponds to a similarly located line in the cache. An important characteristic of a direct-mapped cache is that each memory line from a page in main memory, referred to as a page offset, can only reside in the equivalently located line or page offset in the cache. Due to this restriction, the cache only need refer to a certain number of the upper address bits of a memory address, referred to as a tag, to determine if a copy of the data from the respective memory address resides in the cache because the lower order address bits are pre-determined by the page offset of the memory address.

Whereas a direct-mapped cache is organized as one bank of memory that is equivalent in size to a conceptual page in main memory, a set-associative cache includes a number of banks, or ways, of memory that are each equivalent in size to a conceptual page in main memory. Accordingly, a page offset in main memory can be mapped to a number of locations in the cache equal to the number of ways in the cache. For example, in a 4-way set associative cache, a line or page offset from main memory can reside in the equivalent page offset location in any of the four ways of the cache.

A set-associative cache generally includes a replacement algorithm that determines which bank, or way, in which to fill data when a read miss occurs. Many set-associative caches use some form of a least recently used (LRU) algorithm that places new data in the way that was least recently accessed. This is because, statistically, the way most recently used or accessed to provide data to the processor is the one most likely to be needed again in the future. Therefore, the LRU algorithm generally ensures that the block which is replaced is the least likely to have data requested by the cache. An LRU algorithm is generally maintained by keeping LRU bit information associated with each set that points to the way least recently used.

Another replacement algorithm that can be used is referred to as a least recently modified (LRM) algorithm. In an LRM scheme, the LRU information is not updated on every cache read and write hit, but is updated when there is a cache line replacement or write hit. Other replacement techniques referred to as pseudo-LRU techniques have also been developed. One example of a pseudo-LRU technique is the internal cache in the Intel Corporation i486 microprocessor, which uses a 4-way set associative cache architecture. In this method, the four ways are grouped into two sets of two ways. Three bits are provided to determine first, which of the two groups was least recently used and then second, which of the two ways in the least recently used group was least recently used. This is a pseudo-LRU technique because it does not account for properly reshuffling the replacement order based on read hits to a particular way.

Cache management is generally performed by a device referred to as a cache controller. The cache controller includes a directory that holds an associated entry for each set in the cache. This entry generally has two components: a tag and a number of tag state bits. The tag acts as a main memory page number, and it holds the upper address bits of the particular page in main memory from which the copy of data residing in the respective set of the cache originated. The tag state bits determine the status of data in the respective set in the cache. In single processor schemes, the tag state bits generally comprise one or more valid bits which indicate whether the data associated with the respective tag entry is valid. In multiprocessor systems, the tag state bits generally indicate the status of data with respect to the other caches, as is explained further below.

The directories for each of the sets in the cache are generally stored in random access memory referred to as tag RAM's. Most conventional cache controllers also include a separate RAM for the LRU information to keep track of which way has been least recently used for each of the respective sets. However, the use of a separate RAM for LRU information is generally wasteful and unnecessary if extra bits can be made available in the tag RAM's. Therefore, a method is desired to allow LRU information to be stored in the tag RAM's to obviate the necessity of having an additional RAM exclusively for LRU information.

Multiprocessing is a major area of research in computer system architecture. Multiprocessing involves a computer system which includes multiple processors that work at the same time on different problems. In most multiple processor systems, each processor includes its own local cache. In this manner, each processor can operate out of its cache when it does not have control of the bus, thereby increasing system efficiency. However, one difficulty that has been encountered in multiprocessor architectures is the maintenance of cache coherency when each processor includes its own local cache.

As previously mentioned, cache management is performed by a device referred to as a cache controller. A principal cache management responsibility in multiprocessor systems is the preservation of cache coherency. The type of cache management policy used to maintain cache coherency in a multiprocessing system generally depends on the architecture used. One type of architecture commonly used in multiprocessing systems is referred to as a bus-based scheme. In a bus-based scheme, system communication takes place through a shared bus, and this allows each cache to monitor other requests by watching or snooping the bus. Each processor has a cache which monitors activity on the bus and in its own processor and decides which blocks of data to keep and which to discard in order to reduce bus traffic. Requests by a processor to modify a memory location that is stored in more than one cache requires bus communication in order for each copy of the corresponding line to be marked invalid or updated to reflect the new value.

Various types of cache coherency protocols can be employed to maintain cache coherency in a multiprocessor system. One type of cache coherency protocol that is commonly used is referred to as a write-through scheme. In a write-through scheme, all cache writes or updates are simultaneously written into the cache and to main memory. Other caches on the bus must monitor bus transactions and invalidate any matching entries when the memory block is written through to main memory. In a write-back scheme, a cache location is updated with the new data on a processor write hit, and main memory is generally only updated when the updated data block must be exchanged with a new data block or the updated block is requested by another processor.

Multiprocessor cache systems which employ a write-back scheme generally utilize some type of ownership protocol to maintain cache coherency. In this scheme, any copy of data in a cache must be identical to the owner of that location's data. The owner of a location's data is generally defined as the respective location having the most recent version of the data residing in the respective memory location. Ownership is generally acquired through special read and write operations defined in the ownership protocol.

A cache that owns a data entry assumes responsibility for the data's validity in the entire system. The cache that owns a particular data entry is responsible for ensuring that main memory is properly updated, if necessary, and that ownership is passed to another cache when appropriate. The owning cache is also responsible for providing the correct copy of data to other processors or devices which request the data during a read cycle. When a cache owns a data entry and snoops a read cycle generated by another device or processor requesting the data, the cache must inhibit the system memory from providing the data and provide the correct copy of data to the requesting device.

As previously mentioned, the cache controller includes a directory that holds an associated tag entry for each data entry or set in the cache. Each tag entry is comprised of a tag and a number of tag state bits. The tag state bits determine the status of the data in the respective set of the cache, i.e. whether the data is invalid, owned, or shared with another cache, etc. The cache controller also includes a snooping mechanism which monitors or snoops the host bus when other processors or devices are using the host bus. The cache controller maintains cache coherency by updating the tag state bits for a data entry whenever the cache obtains or relinquishes ownership and in certain instances when the snooping mechanism detects a read or write cycle for data stored in the cache.

Most cache systems can cache the majority of computer system addresses. However, a portion of the address space is generally deemed non-cacheable for various reasons. For example, the input/output (I/O) address space is generally designated as non-cacheable to prevent I/O addresses from being placed in the cache. This is because data stored in I/O addresses is subject to frequent change or may control physical devices. In addition, in most computer systems the I/O bus is separate from the host or processor bus, and therefore data stored in I/O addresses are subject to change without any activity occurring on the host bus. Thus, data stored in these I/O addresses can change without being detected by the cache controllers situated on the host bus. In addition, video memory located on the I/O bus is generally designated as non-cacheable due to its dual ported nature.

In many computer systems, there is generally some system memory situated on an I/O bus which can be designated as cacheable. For example, if the system read only memory (ROM) is situated on the I/O bus, then this memory can be designated as cacheable. However, it should be noted that if system ROM is designated as cacheable, this memory must also be write-protected in the cache to prevent the processor from inadvertently changing ROM data stored in the cache. In addition, if the I/O bus includes slots for expansion memory, then memory placed in these slots can be designated as cacheable. However, a problem arises in a multiprocessor write-back cache architecture where system memory situated on the I/O bus is designated as cacheable if the I/O bus cannot recognize inhibit cycles. As previously mentioned, in a write-back cache architecture if a cache controller snoops a read hit to an owned or modified location, the cache must inhibit the current memory cycle by the memory controller or memory device and provide the data to the requesting device. However, if the I/O bus does not recognize inhibit cycles, the cache which owns the data will not be able to inhibit the system memory on the I/O bus from returning incorrect or "dirty" data to the requesting device, resulting in cache coherency problems. In addition, the system memory on the I/O bus and the cache controller may both attempt to return data to the requesting device, resulting in bus contention problems.

Therefore, in multiprocessor, write-back cache architectures which include an I/O bus that does not recognize inhibit cycles, it is generally not possible to cache the system memory situated on the I/O bus. However, if the system memory situated on the I/O bus could be designated as write-through in each of the caches, then no problems would result since there would never be a need for an inhibit cycle for this memory. No inhibit cycles would be needed if a write-through protocol was used since the system memory, not the cache, would always be the owner of data it contains. Therefore, a method is needed which enables a cache controller to designate memory address spaces as write-through or write-backable in a computer system. In addition, this method must be easily accessed by the cache controller to prevent a cache write policy determination from degrading system performance.

Background on the Extended Industry Standard Architecture (EISA) is deemed appropriate. EISA is a superset of the Industry Standard Architecture (ISA), a bus architecture introduced in the International Business Machines (IBM) PC AT personal computer. The EISA bus is built around an EISA chip set which includes an EISA bus controller (EBC) chip, among others. The EBC acts as an interface between a host or processor bus and the EISA (I/O) bus. Preferably the 82358 from Intel Corporation is used as the EBC. The 82358 does not include an inhibit input, and therefore the EISA bus does not recognize inhibit cycles.

SUMMARY OF THE INVENTION

The present invention includes a method and apparatus for incorporating LRU and cache write policy information into the tag RAM's in a cache controller. The incorporation of LRU information into the tag RAM's obviates the necessity of having an extra RAM exclusively for LRU information. The incorporation of cache write policy information into the tag RAM's enables caching of system memory situated on an I/O bus which does not recognize inhibit cycles. In addition, the incorporation of cache write policy information into the tag RAM's allows a ready determination of the cache write policy used whenever the associated tag entry is read. Since the tag entry must be read on each cache access to determine if the tag matches the address provided to the cache, the cache write policy information can be determined without degrading system performance.

The method according to the present invention for incorporating LRU information into the tag RAM's is implemented in a 2-way set-associative cache in the preferred embodiment. The cache may generally use any type of cache replacement algorithm according to the present invention. In the preferred embodiment, the cache uses a modified LRU replacement algorithm referred to as least recently modified (LRM). The LRM algorithm used in the preferred embodiment updates the LRU bits on write hits and on cache line replacements caused by read misses. The method according to the present invention can be generalized to caches which include a greater number of ways than two by using an LRU, LRM, pseudo-LRU or pseudo-LRM replacement algorithm.

According to the present invention, one bit in the entries of each way's tag RAM, referred to as a partial LRU bit, is reserved for LRU information. The LRU bits are manipulated in such a way that the Exclusive-OR of both way's partial LRU bits forms the actual LRU information used by the cache controller. Both LRU bits are available during a cache read because the cache controller must read both way's tags to determine whether a hit or miss has occurred. If a match occurs in one of the respective tags, then the corresponding cache way is accessed by the processor. By definition, the cache way that is currently being accessed is the most recently used. Therefore, logic is used to manipulate the partial LRU bits so that an Exclusive-OR of these bits points to the way opposite of the way being used on each tag write or line replacement.

The manipulation of the partial LRU bits during an LRU update is as follows. Since only one way's tag can be written at a time, the partial LRU bit for the other way remains unchanged. The new partial LRU bit value for the way being changed is as follows. The new value for the partial LRU bit of way 0 becomes the opposite or inverted value of the old partial LRU bit of way 1. The new value for the partial LRU bit of way 1 becomes the old partial LRU bit for way 0. When way 0 is written with the inverted value of LRU1, the two partial LRU bits are always different and thus the Exclusive-OR of these bits point to way 1 as the least recently used. When way 1 is written with LRU0, the two partial LRU bits are always the same and thus the Exclusive-OR of these bits point to way 0 as the least recently used.

In the preferred embodiment, the computer system includes a host or processor bus that is separate from an I/O bus. The host bus includes random access memory referred to as main memory and a memory controller. Expansion memory and system read only memory are preferably situated on the I/O bus. The I/O bus is preferably of a type that does not recognize inhibit or abort cycles. The present invention is preferably implemented in a multiprocessor, multiple cache computer system wherein each tag entry in the tag RAM's includes a number of tag state bits. The tag state bits define a number of states which includes a state referred to as shared unmodified. In the preferred embodiment, the shared unmodified state indicates that there may be more than one cache with an unmodified copy of a particular location's data and that processor write hits must be broadcast to main memory. In addition, one bit in each of the tag entries, referred to as the CW bit, is reserved for cache write policy information according to the present invention.

The memory controller includes associated Data Destination Facility (DDF) logic which generates a signal referred to as H.sub.-- CW. The H.sub.-- CW signal conveys cache write policy information to each of the caches when data from computer system memory is provided to the caches. In the preferred embodiment, the H.sub.-- CW line is generated by the DDF logic on each memory read or write cycle, regardless of whether the data is located in main memory or in memory situated on the I/O bus. In the preferred embodiment, the H.sub.-- CW line is negated low when system memory on the I/O bus provides data to a cache, indicating that the cache write policy for this data should be designated as write-through. The H.sub.-- CW signal is asserted high when main memory on the host bus provides data to a cache, indicating that the cache write policy for this data should be designated as write-back.

Each of the cache controllers in the system receives the H.sub.-- CW signal to determine whether the cache write protocol for the data it receives should be designated as write-through or write-back. When the H.sub.-- CW signal is asserted high and a write-back policy is designated, the CW bit in the tag entry is set to a logic high value, and the tag state bits are manipulated as would normally occur in a write-back cache environment. When the H.sub.-- CW signal is negated low and a write-through policy is designated, the CW bit in the tag entry is set to a logic low value, and the tag state bits for the respective data entry stored in the cache are set to the shared unmodified state. The CW state bit keeps track of the cache write protocol associated with each data entry. When the CW bit is negated low, indicating a write-through policy, the negated CW bit maintains the tag state as shared unmodified and prevents the tag state from being inadvertently changed by the cache controller. In this manner, the respective data entry in the cache is essentially designated as write-through since a write hit to a shared unmodified data entry always results in the new value being broadcast to its system memory location. In addition, a bit in each tag entry referred to as the write protect (WP) bit includes write policy information that determines whether a write protect policy is employed for the respective cache location. The WP bit is asserted for a respective cache location when the corresponding data location is read only. When the WP bit is asserted in a tag entry, the corresponding cache location cannot be modified, and the tag state bits are maintained in the shared state, thus maintaining a write-through policy for this data.

Therefore, a method for incorporating LRU and cache write policy information into the tag RAM's is disclosed. The incorporation of LRU information into the tag RAM's removes the necessity of having a separate RAM for LRU information. In addition, the incorporation of cache write policy information into the tag RAM's allows the cache controller to maintain write-back and write-through policies for different data entries, thereby allowing system memory located on an I/O bus which does not recognize inhibit cycles to be cached.

BRIEF DESCRIPTION OF THE DRAWINGS

A better understanding of the invention can be obtained when the following detailed description of the preferred embodiment is considered in conjunction with the following drawings, in which:

FIG. 1 is a block diagram of a multiprocessor computer system including microprocessor cache systems according to the present invention;

FIG. 2 is a block diagram of one of the cache controllers of FIG. 1;

FIG. 3 depicts a tag entry in the tag memory area of FIG. 2;

FIG. 4 is a schematic diagram of LRU bit generation and way select logic in the tag control area of FIG. 2 according to the present invention; and

FIG. 5 is an illustration of how the incorporation of LRU information into the tag RAM's can be generalized to multiple way caches with a number of ways greater than two using a pseudo-LRU algorithm.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

The following related disclosures are herein incorporated by reference:

U.S. Application Ser. No. 753,420, entitled "Multiprocessor Cache Snoop Access Protocol" by Mike T. Jackson, Jeffrey C. Stevens, and Roger E. Tipley, filed on Aug. 30, 1991; and

U.S. Application Ser. No. 753,199, entitled "Multiprocessor Cache Arbitration" by Jeffrey C. Stevens, Mike T. Jackson, Roger E. Tipley, Jens K. Ramsey, Sompong Olarig, and Philip C. Kelly filed on Aug. 30, 1991; both of which are assigned to the assignee of this invention.

Referring now to FIG. 1, a computer system S is generally shown. Many of the details of a computer system that are not relevant to the present invention have been omitted for the purpose of clarity. In the description that follows, signal names followed by an asterisk are asserted when they have a logic low value. The computer system S preferably includes multiple microprocessors. However, it is understood that the incorporation of the present invention into single processor computer systems can be readily accomplished. The system S includes two microprocessors 20 and 30 coupled to a host bus 22. Each of the microprocessors 20 and 30 have associated cache subsystems 23 and 25, respectively, comprising caches 24 and 34 and cache controllers 26 and 36, respectively, coupled between the respective microprocessors 20 and 30 and the host bus 22. The caches 24 and 34 are preferably organized as 2 way set-associative caches according to the present embodiment. However, the LRU method according to the present invention can be generalized to work with caches which