|
Claims  |
|
|
What is claimed is:
1. A cache memory system comprising:
at least one main memory, said main memory comprising lines of data located
at specified addresses, said main memory maintaining a prefetch pointer
field associated with each line of data;
at least one cache coupled to main memory, said cache storing a plurality
of lines of data and associated prefetch pointer fields;
a memory access control means coupled to receive a request for data from a
requesting device located at a predetermined address and further coupled
to the main memory and cache, said means comprising;
means for accessing the cache to determine if the requested data is located
in a line of data in the cache, and if the data is located in the cache,
causing the data requested to be returned to the requesting device, and
means for accessing main memory to cause the requested data to be returned
to the requesting device if the data is not located in the cache, said
means further causing the line in main memory containing the requested
data and the associated prefetch pointer field to be written to the cache;
and
cache prefetch means comprising;
means for reading a prefetch pointer from the prefetch pointer field of the
line in the cache containing the data requested, and
means for accessing the line of data in main memory pointed to by the
prefetch pointer which is read and storing the line of data and the
associated prefetch pointer in the cache;
wherein a line of data is prefetched and placed in the cache in accordance
with the prefetch pointer associated with the line of data requested,
thereby decreasing the probability of a cache miss occurring when a
subsequent request is issued.
2. The cache memory system as set forth in claim 1, wherein the prefetch
pointer comprises the address of a line of data in main memory which is
likely to be requested for access subsequent to the line of data
associated with the prefetch pointer.
3. The cache memory system as set forth in claim 1, wherein the prefetch
pointer comprises a relative address of a line of data in main memory
which is likely to be requested for access subsequent to the line of data
associated with the prefetch pointer, said relative address relative to
the address of the line of data associated with the prefetch pointer.
4. The cache memory system as set forth in claim 1, wherein if said
prefetch pointer comprises a Null value, said means for accessing the line
of data in main memory accesses the next sequential line of memory.
5. The cache memory system as set forth in claim 1, wherein a prefetch
pointer comprises a Null value when the line of data associated with the
prefetch pointer is initially stored in main memory.
6. The cache memory system as set forth in claim 5, wherein a prefetch
pointer associated with a line in main memory is updated in the cache from
a Null value to a pointer to a line of data subsequently accessed.
7. The cache memory system as set forth in claim 1, wherein a prefetch
pointer associated with a line in main memory is unmodified when the line
of data is written to main memory.
8. The cache memory system as set forth in claim 7, further comprising a
bit mask to mask the prefetch pointer field in main memory such that when
a line of data is written to main memory the associated prefetch pointer
field is unmodified.
9. The cache memory system as set forth in claim 7, wherein the associated
prefetch pointer field is unmodified by performing a read-modify-write
sequence to perform the write operation to main memory.
10. The cache memory system as set forth in claim 1, wherein an associated
prefetch pointer comprises a Null value when a line of data is stored in
main memory by an input/output device and the associated prefetch pointer
is unmodified in main memory when a line of data is written by a processor
device.
11. The cache memory system as set forth in claim 1, wherein the cache
memory system comprises a first level cache coupled to the requesting
device and main memory and a prefetching second level cache coupled to the
requesting device and main memory, said second level cache operating in
parallel with the first level cache between the requesting device and main
memory.
12. The cache memory system as set forth in claim 1, wherein the cache
memory system comprises a first level cache coupled to the requesting
device and main memory and a prefetching second level cache coupled to the
first level cache and main memory, said second level cache operating in
series between the first level cache and main memory.
13. The cache memory system as set forth in claim 1, wherein the prefetch
pointer is stored in parity fields located in the line of data.
14. The cache memory system as set forth in claim 13, wherein error
detection and correction (EDC) information is further stored in the parity
fields located in the line of data.
15. The cache memory system as set forth in claim 1, wherein the prefetch
pointer is stored in a region of main memory separate from the line of
data with which the prefetch pointer is associated.
16. The cache memory system as set forth in claim 1, comprising an
instruction cache for storing lines of data comprising instructions and a
data cache for storing lines of data comprising other information
requested, each cache maintaining independent prefetch pointers.
17. The cache memory system as set forth in claim 1, wherein the cache is a
direct mapped cache.
18. The cache memory system as set forth in claim 1, wherein the cache is
an associative cache.
19. The cache memory system as set forth in claim 1, wherein the prefetch
pointer field comprises a plurality of pointers.
20. The cache memory system as set forth in claim 19, wherein the prefetch
pointer field comprises a first prefetch pointer pointing to a line of
memory to prefetch and a second prefetch pointer comprising a bit, if when
set, indicates that the next sequential line is to be prefetched.
21. The cache memory system as set forth in claim 20, wherein if the second
prefetch pointer of a line is set, said means for accessing main memory
causes the next sequential line in main memory and the associated prefetch
pointer field to be written to the cache.
22. The cache memory system as set forth in claim 1, further comprising
means for detecting patterns of accesses to main memory, said means for
prefetching prefetching a line of data from main memory in accordance with
the patterns detected.
23. The cache memory system as set forth in claim 1, comprising a plurality
of independent caches, each cache containing, at any point in time, lines
from a single block of memory.
24. The cache memory system as set forth in claim 23, wherein a memory
block comprises a memory page in a virtual memory system.
25. The cache memory system as set forth in claim 1, wherein the system
comprises a plurality of devices issuing requests for access to a line of
the memory to the memory access control means, said memory comprising a
shared memory, and parity fields located in each line of data are used to
store a directory information identifying devices of the plurality of
device that have a copy of the line stored in caches associated with the
identified devices.
26. In a cache memory system comprising at least one main memory, said main
memory comprising lines of data located at specified addresses, and at
least one cache coupled to main memory, said cache storing a plurality of
lines of data, wherein requests for access to data identified by an
address are issued, a method for prefetching lines of data from main
memory for placement in the cache, said method comprising the steps of:
associating a prefetch pointer field comprising at least one prefetch
pointer with each line of data, each of said prefetch pointers pointing to
a line in main memory which may be requested for access subsequent to the
line associated with the prefetch pointer;
prefetching a line of data from main memory and placing the line of data
and the associated prefetch pointer in the cache, said line of data
prefetch being the line pointed to by the prefetch pointer associated with
line of data requested for access;
wherein the frequency of cache misses are decreased.
27. The method as set forth in claim 26, wherein the step of associating a
prefetch pointer field comprises the step of utilizing at least a portion
of parity fields of the line of data to store the prefetch pointer field.
28. The method as set forth in claim 27, wherein the step of associating a
prefetch pointer field comprises the step of utilizing at least a portion
of parity fields of the line of data to further store EDC information.
29. The method as set forth in claim 26, wherein the step of associating a
prefetch pointer field comprises the step of storing the prefetch pointer
in a region of main memory separate from the line of data with which the
prefetch pointer is associated.
30. The method as set forth in claim 26, wherein said step of associating a
prefetch pointer field comprises storing a plurality of prefetch pointers.
31. The method as set forth in claim 30, wherein the plurality of prefetch
pointers comprises a first prefetch pointer pointing to a line of memory
to prefetch and a second prefetch pointer comprising a bit, if when set,
indicates that the next sequential line is to be prefetched.
32. The method as set forth in claim 31, wherein said step of prefetching a
first line of data from main memory further comprises the step of
prefetching the next sequential line and placing that line in the cache if
the second prefetch pointer bit is set in the first line.
33. The method as set forth in claim 26, further comprising the steps of:
detecting patterns of accesses to main memory; and
prefetching a line of data from main memory in accordance with the patterns
detected.
34. The method as set forth in claim 26, further comprising the step of
statically generating an initial set of prefetch pointers prior to said
step of prefetching.
35. The method as set forth in claim 34, wherein virtual memory is
utilized, said method further comprising the step of associating an
independent cache with each currently accessed page in memory; said step
of associating a prefetch pointer field comprising prefetch pointers to
lines of data within the same page.
36. The method as set forth in claim 34, wherein virtual memory is
utilized, and said method further comprises the step of adjusting the
prefetch pointers when a page of memory is moved within physical memory
such that corresponding prefetch pointers continue to point to a line of
data after the page of memory is moved.
37. The method as set forth in claim 26, wherein the step of associating a
prefetch pointer field comprises the steps of:
initializing the prefetch pointer fields of lines in main memory to a Null
value;
when a cache miss occurs when a line of data is not located in the cache
when requested, updating the prefetch pointer field of a recently accessed
line in the main memory having a prefetch pointer field containing a Null
value with the pointer to the line requested.
38. The method as set forth in claim 37, wherein if the prefetch pointer
fields of recently accessed lines of data in the main memory do not
contain a Null value, updating the prefetch pointer field of a prior
requested line.
39. The method as set forth in claim 26, further comprising the step of
periodically setting prefetch pointer fields to Null values in order to
eliminate stale pointers.
40. The method as set forth in claim 26, wherein the step of prefetching
further comprises the step of prefetching the next sequential line of
memory if the prefetch pointer of the line requested for access comprises
a Null value.
41. The method as set forth in claim 26, further comprising the steps of:
writing data to a line in main memory; and
setting the prefetch pointer field of the line written to a Null value.
42. The method as set forth in claim 26, further comprising the step of
writing data to a line in main memory such that the existing prefetch
pointer value of the line is maintained.
43. The method as set forth in claim 42, wherein the step of writing data
further comprises the step of masking the prefetch pointer field such that
the existing prefetch pointer value is maintained.
44. The method as set forth in claim 42, wherein the step of writing data
comprises the step of performing a read-modify-write operation such that
the existing prefetch pointer value is maintained.
45. The method as set forth in claim 26, further comprising the steps of:
writing data to a line in main memory and setting the prefetch pointer
field of the line written to a Null value if the data is written from an
input/output device; and
writing data to a line in main memory such that the existing prefetch
pointer value of the line is maintained if data is written from a
processor.
46. The method as set forth in claim 26, wherein the step of associating a
prefetch pointer field further comprises the steps of:
providing a Null pointer which points to the last recently accessed line in
the main memory requested that contains a Null pointer value;
providing a Last pointer which points to the last line in the main memory
requested;
if a cache hit occurs and the requested line is located in the cache,
updating the Last pointer to point to the requested line and if the
prefetch pointer value of the requested line contains a Null value,
updating the Null pointer to point to the requested line;
if a cache miss occurs and a prefetch pointer field of a line in the main
memory as pointed to by the Null pointer exists, updating the prefetch
pointer value of the line to point to the requested line, and if a Null
prefetch pointer value does not exist, updating the prefetch pointer value
of the last line pointed to by the Last pointer to point to the requested
line.
47. The method as set forth in claim 26, further comprising the step of
storing a directory information in parity fields of each line to assist in
maintaining consistency is a multiple processor system.
48. A processing system comprising:
at least one main memory, said main memory comprising lines of data located
at specified addresses, said main memory maintaining a prefetch pointer
associated with each line of data;
at least one cache coupled to main memory, said cache storing a plurality
of lines of data and associated prefetch pointer fields;
a processor coupled to the cache and the main memory, said processor
issuing a request for access to data at a specified address;
a cache controller coupled to receive the request for data and further
coupled to the main memory and cache, said cache controller determining if
the requested data is located in a line of data in the cache, and if the
data is located in the cache, causing the data requested to be returned to
the processor, and if the data is not located in the cache, said cache
controller communicating a cache miss signal to the main memory causing
the line in main memory containing the requested data and the associated
prefetch pointer field to be written to the cache and the requested data
returned to the processor, said cache controller further reading the
prefetch pointer of the line containing the data requested, and copying
the line of data in main memory pointed to by the prefetch pointer field
read and storing the line of data and the associated prefetch pointer in
the cache;
wherein a line of data is prefetched and placed in the cache in accordance
with the prefetch pointer associated with the line of data requested,
thereby decreasing the probability of a cache miss occurring when a
subsequent request is issued.
49. The processing system as set forth in claim 48, wherein the prefetch
pointer comprises the address of a line of data in main memory which is
likely to be requested for access subsequent to the line of data
associated with the prefetch pointer.
50. The processing system as set forth in claim 48, wherein the prefetch
pointer is stored in parity fields located in the line of data.
51. The processing system as set forth in claim 50, wherein error detection
and correction (EDC) information is further stored in the parity fields
located in the line of data.
52. The processing system as set forth in claim 48, wherein the prefetch
pointer is stored in a region of main memory separate from the line of
data with which the prefetch pointer is associated.
53. The processing system as set forth in claim 48, wherein a prefetch
pointer comprises a Null value when the line of data associated with the
prefetch pointer is initially stored in main memory.
54. The processing system as set forth in claim 53, wherein a prefetch
pointer associated with a line in main memory is updated from a Null value
to a pointer to a line of data subsequently accessed.
55. The processing system as set forth in claim 48, wherein a prefetch
pointer associated with a line in main memory is unmodified when a line of
data is written to main memory.
56. The processing system as set forth in claim 55, further comprising a
bit mask to mask the prefetch pointer field in main memory such that when
a line of data is written to main memory the associated prefetch pointer
field is unmodified.
57. The processing system as set forth in claim 55, wherein the associated
prefetch pointer field is unmodified by performing a read-modify-write
sequence to perform the write operation to main memory.
58. The processing system as set forth in claim 48, wherein an associated
prefetch pointer comprises a Null value when a line of data is stored in
main memory by an input/output device and the associated prefetch pointer
is unmodified in main memory when a line of data is written by the
processor.
59. The processing system as set forth in claim 48, wherein if the prefetch
pointer comprises a Null value, said cache controller causing the next
sequential line in memory to be prefetched.
60. The processing system as set forth in claim 48, wherein the cache
memory system comprises a first level cache coupled to the requesting
device and main memory and a prefetching second level cache coupled to the
requesting device and main memory, said second level cache operating in
parallel with the first level cache between the processor and main memory.
61. The processing system as set forth in claim 48, wherein the cache
memory system comprises a first level cache coupled to the requesting
device and main memory and a prefetching second level cache coupled to the
first level cache and main memory, said second level cache operating in
series between the first level cache and main memory.
62. The processing system as set forth in claim 48, comprising an
instruction cache for storing lines of data comprising instructions and a
data cache for storing lines of data comprising other information
requested, each cache maintaining independent prefetch pointers.
63. The processing system as set forth in claim 48, wherein the cache is a
direct mapped cache.
64. The processing system as set forth in claim 48, wherein the cache is an
associative cache.
65. The processing system as set forth in claim 48, wherein the prefetch
pointer field comprises a plurality of pointers.
66. The processing system as set forth in claim 65, wherein the prefetch
pointer field comprises a first prefetch pointer pointing to a line of
memory to prefetch and a second prefetch pointer comprising a bit, if when
set, indicates that the next sequential line is to be prefetched.
67. The processing system as set forth in claim 65, wherein if the second
prefetch pointer of a line is set, said means for accessing main memory
causing the next sequential line in main memory and the associated
prefetch pointer field to be written to the cache.
68. The processing system as set forth in claim 48, further comprising
means for detecting patterns of accesses to main memory, said means for
prefetching prefetching a line of data from main memory in accordance with
the pattern detected.
69. The cache memory system as set forth in claim 48, comprising a
plurality of independent caches, each cache containing, at any point in
time, lines from a single block of memory.
70. The processing system as set forth in claim 69, wherein a memory block
comprises a memory page in a virtual memory system.
71. The processing system as set forth in claim 48, wherein the system
comprises a plurality of devices issuing requests and parity fields
located in each line of data are used to store directory information.
72. A cache memory system comprising:
at least one main memory, said main memory comprising lines of data located
at specified addresses, each line of data further comprising an associated
prefetch pointer field;
at least one cache coupled to main memory, said cache storing a plurality
of lines of data and maintaining a prefetch pointer field associated with
each line of data;
a memory access controller coupled to receive a request for data at a
predetermined address from a requesting device and further coupled to the
main memory and cache, said means comprising;
means for accessing the cache to determine if the requested data is located
in a line of data in the cache, and if the data is located in the cache,
causing the data requested to be returned to the requesting device, and
means for accessing main memory to cause the requested data to be returned
to the requesting device if the data is not located in the cache, said
means further causing the line in main memory containing the requested
data and the associated prefetch pointer field to be written to the cache;
if the data is not located in the cache, means for providing a prefetch
pointer to the line of memory accessed from main memory and storing in the
prefetch pointer field in a previous accessed line of data in the cache;
and
cache prefetch means comprising;
means for reading a prefetch pointer from the prefetch pointer field of the
line in the cache containing the data requested, and
means for accessing the line of data in main memory pointed to by the
prefetch pointer read and storing the line of data and the associated
prefetch pointer in the cache if the line of data pointed to by the
prefetch pointer read is not located in the cache;
wherein a line of data is prefetched and placed in the cache in accordance
with the prefetch pointer associated with the line of data requested,
thereby decreasing the probability of a cache miss occurring when a
subsequent request is issued.
73. The cache memory system as set forth in claim 72, wherein if said
prefetch pointer comprises a Null value, said means for accessing the line
of data in main memory access the next sequential line of memory.
74. The cache memory system as set forth in claim 72, wherein the prefetch
pointer is stored in parity fields located in the line of data.
75. The cache memory system as set forth in claim 72, wherein the prefetch
pointer field comprises a plurality of pointers.
76. The cache memory system as set forth in claim 75, wherein the prefetch
pointer field comprises a first prefetch pointer identifying an address of
a line of memory to prefetch and a second prefetch pointer comprising a
bit, if when set, indicates that a line of data at a next sequential
address is to be prefetched.
77. In a cache memory system comprising at least one main memory, said main
memory comprising lines of data located at specified addresses, and at
least one cache coupled to main memory, said cache storing a plurality of
lines of data, wherein requests for access to data identified by an
address are issued, a method for prefetching lines of data from main
memory for placement in the cache, said method comprising the steps of:
associating a prefetch pointer field comprising at least one prefetch
pointer with each line of data in the cache, each of said prefetch
pointers pointing to a line in main memory which may be requested for
access subsequent to the line in the cache associated with the prefetch
pointer;
associating a prefetch pointer field comprising at least one prefetch
pointer with each line of data in main memory, such that main memory
maintains prefetch pointer information when a line of data is no longer in
the cache;
prefetching a line of data and associated prefetch pointer field from main
memory and placing the line of data in the cache if the line of data
pointed to by the prefetch pointer read is not located in the cache, said
line of data prefetch being the line pointed to by the prefetch pointer
associated with line of data requested for access;
wherein the frequency of cache misses are decreased.
78. The method as set forth in claim 77, wherein the steps of associating a
prefetch pointer field each comprises the step of utilizing at least a
portion of parity fields of the line of data to store the prefetch pointer
field.
79. The method as set forth in claim 77, wherein said steps of associating
a prefetch pointer field each comprising storing a plurality of prefetch
pointers.
80. The method as set forth in claim 79, wherein the plurality of prefetch
pointers comprises a first prefetch pointer pointing to a line of main
memory to prefetch and a second prefetch pointer comprising a bit, if when
set, indicates that the next sequential line of memory is to be
prefetched. |
|
|
|
|
Claims  |
|
|
Description  |
|
|
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to the cache memory systems. More
particularly, the present invention relates to a cache memory system which
performs prefetching of data from memory based upon pointers in the cache.
2. Art Background
Dynamic Random Access Memory (DRAM) components provide an inexpensive form
of solid-state storage for computer systems. However, the speed of DRAM
components is typically slower than the processors which access the DRAM.
A common technique for lessening the impact of slow DRAM access time (the
DRAM latency) on the processor's performance is to employ a cache memory.
A cache memory is typically much smaller than the main DRAM memory, but
much faster. It's speed is well matched to the processor speed. A cache is
typically implemented with Static Random Access Memory (SRAM) components,
which store blocks of instructions and data (referred to as lines of
data), that are copies of selected main memory locations. Because a cache
memory is smaller than the main memory from which it copies data, the
cache is not fully addressable and stores an address tag field for each
data field. This tag field identifies the main memory address
corresponding to a particular data line.
FIG. 1 shows a processor with a level one (L1) cache located between the
DRAM and main memory. When a read request, R1, cannot be satisfied by the
cache, a cache miss occurs and the cache must make a read request, R2, to
the main memory. Likewise, for example in a write allocate policy, when a
write request W1 can not be satisfied by the cache, a miss occurs and the
cache must make a read request R2 to the main memory.
A read request affects a processor's performance more directly than a write
request. This is because a processor must usually stall (wait) until the
read data it has requested is returned before continuing execution. When a
processor makes a write request, the address and data can typically be
written into temporary buffers while the processor continues execution.
Write requests can be serviced using techniques such as write through and
write back. Using the write through technique, the data line in main
memory is always updated with the write data, and the copy in the cache is
updated only if it is present in the cache. Using the write back
technique, the copy in the cache is updated only if the line of data is
present in the cache. If the line of data is not present in the cache,
then the data must first be read in, and then updated. Using this
technique, some lines in main memory will be incorrect. To track the lines
in main memory which hold incorrect data because the line in main memory
has not been updated, dirty bits associated with each line are used.
Modern processor components typically include an L1 cache. However,
referring to FIG. 2, the resulting processor-L1 structure can still
benefit from a second level (L2) cache which satisfies some portion of the
read and write requests (R2 and W2) directed to main memory. FIG. 3 is an
exemplary illustration of the structure of an L2 cache. The illustration
utilizes the following variables:
L--The number of address bits needed to access a byte in a cache line
S=2.sup.L --The number of bytes in a cache line
A--Main memory contains 2.sup.A+L-L lines (address bits A:L access a line).
C--The number of address bits needed to access a cache line in the cache
2.sup.C --The number of cache lines in the cache
Valid[2.sup.C --1:0]--A value of TRUE means the cache line holds a correct
copy of the main memory line. A value of FALSE means it does not.
Tag[2.sup.C --1:0][A:L+C]--The value of the Address[A:L+C] bits of the
memory line if the cache line is valid.
Data[2.sup.C --1:0][S--1:0][8:0]--The value of the cache line.
ReqAddr[A:L]--The address of a memory line requested by the processor
ReqData[S--1:0][8:0]--The memory line that is read or written.
Memory[2.sup.A+1--L --1:0][S--1:0][8:0]--Main memory organized by [memory
lines][bytes][bits].
FIGS. 4a and 4b are exemplary pseudo code which show how the above cache
and memory structures are manipulated during an R2 read request or a W2
write request. Referring to FIG. 4a , an index into the cache (IndexA) is
generated by taking the unsigned integer value of a C-bit field in the
ReqAddr field using the UnsignedValue( ) routine. Similarly, a pointer
into memory (PointerA) is generated. The rest of the ReqAddr field is then
compared to the Tag field located at IndexA of the cache. If there is a
match, and the Valid field at IndexA is set to TRUE, then a cache hit has
occurred and the ReqData parameter is set to the Data field located at
IndexA. If there is a cache miss, then a memory read (R3) is performed to
get the requested data from main memory. The ReqData parameter is set to
this data, as is the Data field at IndexA of the cache. The Tag and Valid
fields are also updated.
Referring to FIG. 4b, the write routine, the IndexA and PointerA unsigned
integers are generated as in the read routine from the ReqAddr parameter.
The ReqData parameter is written (W3) into main memory at the PointerA
line address. If this line is present in the cache at IndexA, then the
Data field is also set to ReqData.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide a cache
memory system in which lines of data are prefetched from memory prior to
requests for access wherein the line prefetched is selected based upon the
prior execution history of the processor.
It is an object of the present invention to provide a cache system and
method for prefetching data from memory which results in a high cache hit
rate and low latency.
The present invention provides an innovative cache structure and method for
utilizing the same which increases the cache hit rate by prefetching lines
from main memory based on the prior history of lines accessed. In
particular, each cache line is associated with a pointer field which
points to the line to be prefetched. Space for this pointer field can be
established in a number of ways. In one embodiment, the pointer field is
held in a region of main memory not used for the normal code and data
structures for executing processes.
In the preferred embodiment, space in main memory is not impacted while
maintaining pointer information for the prefetching in the cache. In
particular, the parity bits associated with each byte in the line of
memory are in part used to store pointer information. Thus, when
subsequent accesses are performed to main memory, pointer information
identifying the line of memory to prefetch is provided to enhance the
efficiency of the cache.
The pointer information is updated whenever the cache misses, so the
probability of a hit on a subsequent request to the same line is
increased.
BRIEF DESCRIPTION OF THE DRAWINGS
The objects, features and advantages of the present invention will be
apparent to one skilled in the art from the following detailed description
in which:
FIG. 1 is a prior art block diagram of a cache memory system.
FIG. 2 is a prior art block diagram of a two level cache memory system.
FIG. 3 is a prior art illustration of an exemplary structure of a cache.
FIGS. 4a and 4b provide exemplary pseudo code for prior art read and write
procedures using a cache.
FIGS. 5a and 5b are exemplary block diagrams of the cache memory system of
the present invention.
FIG. 6 is a exemplary diagram of a cache structure utilized in one
embodiment of the present invention.
FIG. 7 is an exemplary diagram of the memory structure utilized in one
embodiment of the present invention.
FIG. 8 is a flowchart diagram illustrating one embodiment of the method for
prefetching in the cache system of the present invention.
FIG. 9 is a flow chart diagram illustrating one embodiment for performing a
write operation.
FIGS. 10a, 10b and 10c set forth pseudo code illustrating one embodiment of
the read present the cache system of the present invention.
FIG. 11 is exemplary pseudo code illustrating one embodiment of the write
process in the cache system of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
In the following description, for purposes of explanation, numerous details
are set forth, such as byte and word sizes, in order to provide a thorough
understanding of the present invention. However, it will be apparent to
one skilled in the art that these specific details are not required in
order to practice the invention. In other instances, well known electrical
structures and circuits are shown in block diagram form in order not to
obscure the present invention unnecessarily.
The structure and methods of the present invention are described in the
form of a direct mapped, write through cache. This invention will be
described with reference to the write through policy because of its
simplicity. However, the write policy chosen is independent of the
innovation that is disclosed in this document and it is readily apparent
that extension to caches using write back policies is straightforward to
someone skilled in the art.
Likewise, there are different methods of organizing caches: Direct-mapped,
two (or more) -way set associative, and fully associative. These cache
organizations affect the number of places in the cache where a particular
line from memory may be placed. The present invention will be described in
terms of the direct-mapped policy. However, the associativity policy
chosen is independent of the innovation that disclosed herein and
extensions to caches with set-associative or fully-associative policies is
straightforward to someone skilled in the art.
Caches are effective because the pattern of address requests for an
application shows a spatial and temporal locality. Spatial locality
implies that if an address A is requested, there is a high probability
that address (A+a) will also be requested. Temporal locality implies that
if an address A is requested at time T, there is a high probability that
address A will also be requested at a time (T+t). The probability
increases as "a" and "t" are made smaller. A cache takes advantage of this
locality by fetching a line of information whenever any portion of that
line is requested by the processor (a line is typically four to eight
times bigger than the data "word" which the processor operates on). This
means that the information at nearby addresses will be "prefetched", that
is, the information will already be present in the cache when requested.
Exemplary block diagrams of a cache system are shown in FIGS. 5a and 5b.
Referring to FIG. 5a, a typical system will include a processor 355 which
issues requests to access data located at a specified address in memory.
This request is received by the cache 360 which determines, by comparing
the requested address with the tags located in the tag RAM 362, whether
the data is located in the cache RAM 364. The cache controller 365 and
address comparator 367 provide the logic to perform the comparison and
provide the data back to the processor 355 if a cache hit occurs and the
data is located in the cache. The cache controller 365 also provides the
logic necessary to perform the prefetching in accordance with the
teachings of the present invention. Main memory 370 provides the data to
the processor 355 if a cache miss occurs. The line of data requested is
also written to the cache 360 so that the data is readily available if
data within the line of memory is subsequently requested. This structure
is exemplary; it is apparent that other cache structures can be
implemented which incorporate the teachings of the present invention.
FIG. 5b illustrates the use of a two level cache to speed up the time
required by the processor to access data. Preferably, the first level
cache functions as in the prior art and the second level cache functions
in accordance with the teachings of the present invention; however other
embodiments are contemplated. The cache structure and method for accessing
will be discussed with reference to the second level cache L2; however, it
is readily apparent that the structures and methods described herein can
be applied to different level caches including a single level cache.
As noted earlier, blindly prefetching information ahead of every requested
address uses up main memory bandwidth with only a limited return in
performance. It is better to take advantage of the previous execution
history of the processor and the locality of reference exhibited by the
requested addresses to create and maintain a pointer array within main
memory and to perform prefetching of data based upon the addresses
identified by the pointers.
In the present example consisting of a 32 Mbyte main memory using 32 byte
lines, there is room for 3 MBytes of pointers. If this information
resource is used effectively, it can provide the performance levels of a
very large cache implemented with discrete SRAM devices without
significant degradation of error control. The system of the present
invention is able to do this with a very small cache because it is able to
reduce the compulsory and capacity miss rate of an application. A
traditional cache suffers a compulsory miss the first time a memory line
is requested. A cache also suffers a capacity miss if that line is
discarded by the cache because too many other requests have occurred
before the line is reused.
The pointer array, once established, gives the cache the ability to
prefetch a memory line before it is requested, and avoid some compulsory
misses altogether, something that a traditional cache can not. The pointer
array also permits many capacity misses to be avoided without increasing
the cache to a large size. This is possible because it does not require
that a memory line reside in the cache between the occurrences or requests
for that line. Instead, the line is discarded, and is prefetched again
when the pattern of request addresses include a memory line with the
appropriate pointer to the line.
This prefetching scheme does require extra bandwidth between the cache and
main memory, compared to a traditional cache. The extra transfers between
the cache and main memory tend to occur during the time the cache is
transferring a requested line to the processor, so a significant
performance loss is not realized. In any case, it is relatively easy to
increase the bandwidth of main memory, for example, by increasing the
width of the buses to the main memory components. The benefit is a
reduction in the size and cost of the cache.
In the present invention, the second level cache, L2, is configured with
2.sup.L =S bytes per cache line and 2.sup.C cache lines per cache. Each
cache line includes one additional field:
Pointer[2.sup.C --1:0][A:L]--A pointer to a memory line which is likely to
be used soon after this line.
Furthermore, three registers are located in the cache:
Count[C--1:0]--a register which keeps track of how many cache accesses have
occurred since an access to a line with a Null pointer field.
Null[A:L]--a register which points to the last line accessed with a null
pointer field.
Last[A:L]--a register which points to the last line accessed in the cache.
An exemplary structure of a cache is shown in FIG. 6. Preferably, the
pointer field in the cache line originates from a pointer array that is
maintained in main memory. There is a pointer field associated with each
memory line, which points to another memory line which is likely to be
used shortly after the current line is used. Alternately, the pointer may
be maintained solely in the cache such that when a new line of memory is
transferred to the cache, the pointer field will be initialized to a null
value and subsequently updated with pointer identifying the next line
accessed.
When the processor requests a memory line at address X, the cache or main
memory supplies the data at memory location X. The cache also prefetches
the memory line at address Y, wherein Y is the address identified by the
pointer field of the line of memory at location X, and places it in the
cache. If the processor later requests the memory line at address Y, it is
likely to be in the cache thereby increasing the cache hit rate.
In the preferred embodiment, the pointer array is maintained in accordance
with the access performed. Thus, if a cache miss occurs, the pointer field
in a memory line which was previously accessed is changed so that a future
miss at that address can be avoided. It is not necessary, however, that
the pointer Y for a line of memory at address Y be in the line that was
last accessed. As long as the pointer Y is placed in the pointer field of
a line memory at address X which was accessed within the last 2.sup.C--1
requests, then it is probable that the line will still be in the cache
when subsequently requested.
In order to maintain the pointers outside the cache to further increase the
cache hit rate, the pointer information is also maintained in main memory.
One way to provide sufficient memory space for the pointer information is
to restrict a portion of physical memory (e.g., at the top addresses) from
operating system usage, and use it to store the pointer array. For
example, using a 16 bit relative pointer (the pointer may only specify
addresses within 2.sup.15 lines of the memory line it is stored in) and
with 32 byte lines (L=5), the pointer array will use 1/16th of the memory.
Besides reducing the effective size of main memory, this technique has the
disadvantage that two areas of memory must be accessed for every memory
request, thereby increasing the latency for a | | |