BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to semiconductor memory devices and
particularly to a construction and an operation method of a semiconductor
memory device containing a cache memory of a simple structure in which a
cache hit rate is improved without increasing the number of pin terminals.
2. Description of the Prior Art
A computer system generally comprises a central processing unit (CPU) for
execution of instructions received and the like, and a main memory for
storage of data, programs and the like necessary for the CPU. It is
desirable to operate the CPU at high speed with no wait from the viewpoint
of improvement of the performance of the system. For this purpose, it is
necessary to reduce time of access to the main memory to a value as short
as possible so that it can correspond to an operation speed of the CPU.
These days, CPU clock frequencies tend to be as high as 16 MHz or 20 MHz,
which unavoidably requires a reduction of the access time of the main
memory. However, this requirement comes to
surpass the limits of the performance of a DRAM (Dynamic Random Access
Memory) used in the main memory. To cope with this, a high-speed memory is
required; however, it is expensive and not desirable from a viewpoint of
cost performance. One of the methods for solving this difficulty is a
cache memory system in which memories are arranged in a hierarchical manner.
In this system, a DRAM (or DRAMs), which has a large storage capacity but a
low operating speed and is therefore inexpensive, is used as a main
memory, and a small-capacity but high-speed buffer memory is provided
between the CPU and the low-speed main memory. Frequently used data in the
main memory is stored in the high-speed buffer memory in accordance with a
request from the CPU. In response to accessing from the CPU, the requested
data is read from and written into the high-speed buffer memory in place
of the main memory. This high-speed buffer memory is called a cache
memory. A state in which data of an address to be accessed by the CPU
exists in the cache memory is called a "hit", and in this case, the CPU
accesses the high-speed cache memory. On the other hand, a state in which
data of an address to be accessed by the CPU does not exist in the cache
memory is called a "miss". In this case, the CPU accesses the low-speed main
memory and also
transfers a block to which the requested data belongs from the main memory
to the cache memory. The cache memory stores the transferred block of data
to be ready for the subsequent accessing from the CPU.
As described above, the cache memory does not store data in a fixed manner.
The area of the main memory that is stored in the cache memory changes
depending on requests from the CPU. However, the area of the main memory
accessed by the CPU exhibits locality in data processing. Accordingly, data
fetched from the main memory to be stored in the cache memory is likely to
be accessed for a while thereafter. Consequently, once data of the main
memory is stored in the cache memory, the function of the high-speed
memory is fulfilled most effectively and there is no wait in memory
accessing by the CPU. In other words, processing operation of the CPU is
never delayed due to a memory access time.
Thus, the high-speed cache memory is provided as a buffer between the
low-speed and large-capacity main memory and the high-speed CPU and
accordingly it is made possible to improve the system performance and the
cost performance. However, the above described cache memory system
requires a high-speed memory which is of small capacity but is expensive.
For this reason, the cache memory system cannot be applied to a
small-sized system, which is desired to have a low cost.
Therefore, in a conventional small-sized system, a simplified cache system
is formed by utilizing a page mode and a static column mode which are
high-speed access modes of a general-purpose DRAM.
Referring first to FIG. 1, a construction of a DRAM having a high-speed
access mode will be described. The DRAM comprises a memory cell array 5
where memory cells for storing information are arranged in a matrix of
rows and columns. The rows of the memory cell array 5 are defined by word
lines WL while the columns of the array 5 are defined by bit lines BL.
FIG. 1 typically shows one word line WL, one bit line BL and a memory cell
MC located at an intersection of those lines. A sense amplifier 6 is
provided corresponding to columns of the memory cell array 5, to detect,
amplify and latch a signal potential appearing on a bit line concerned
when a word line is selected.
A row address buffer 1, a row decoder 3 and a word driver 4 are provided to
select memory cells of one row of the memory cell array 5. The row address
buffer 1 accepts an externally applied row address in response to a
control signal RAS and generates an internal row address RA. The row
decoder 3 decodes the internal row address RA from the row address buffer
1 and designates one word line. The word driver 4 responds to a row
address decode signal from the row decoder 3 and activates the word line
designated by the decode signal.
A column address buffer 2, a column decoder 8 and an input/output (I/O)
switch 7 are provided to select memory cells of one column out of the
memory cell array 5. The column address buffer 2 accepts an externally
applied column address in response to a control signal CAS to generate an
internal column address CA. The column decoder 8 decodes the internal
column address CA from the column address buffer 2 and generates a signal
for selecting a column designated by the column address. The I/O switch 7
responds to a column address decode signal from the column decoder 8 to
connect the column (a bit line) designated by the decode signal to an I/O
bus 13.
To input and output data, there are provided an input buffer 14 for
generating internal data upon receipt of external input data D.sub.IN and
supplying the internal data to the I/O bus 13, and an output buffer 15 for
generating external data D.sub.OUT upon receipt of memory cell information
selected by row and column addresses through the I/O bus 13.
In order to control data input/output operation of the memory, there is
provided a read/write (R/W) control 16 for controlling the input buffer 14
and the output buffer 15 in response to a write enable signal
WE and the signal CAS.
A row address and a column address as external addresses are multiplexed
and supplied through the same pins to the DRAM. The control signal RAS
provides operation timing to circuits related with the row address. When
the signal RAS is activated, a memory cycle is started. The signal CAS
provides operation timing to circuits related with selection of a column.
This signal also provides timing for writing of data dependent on an
operation mode selected. Referring now to FIGS. 2 to 4 which are waveform
diagrams showing operation of the DRAM, the operation of the DRAM will be
described.
Referring first to FIG. 2, a normal operation cycle of the DRAM will be
described. When the signal RAS falls to "L" (low) level, a memory cycle is
started. At a falling edge of the signal RAS, an externally applied row
address is accepted in the DRAM chip and an internal row address RA is
generated from the row address buffer 1 and is supplied to the row decoder
3. The row address is decoded in the row decoder 3, whereby one word line
is selected to be activated through the word driver 4. As a result,
information stored in one row of memory cells connected to the selected
word line is transmitted onto the respective bit lines (columns). The
information on the respective bit lines is detected, amplified and latched
by the sense amplifiers 6. On the other hand, when the signal CAS falls,
an externally applied column address is accepted by the column address
buffer 2 and an internal column address CA is generated. The column
decoder 8 decodes the internal column address CA and selects a column
designated by the column address. The I/O switch 7 connects the column
(the bit line) selected by the column decode signal from the column
decoder 8 to the I/O bus 13. As a result, information stored in the
selected memory cell and amplified and latched by the sense amplifier 6 is
outputted through the output buffer 15 as external data D.sub.OUT.
Thus, in the normal operation cycle, a row address is accepted in the chip
at a falling edge of the signal RAS and then a column address is accepted
in the chip at a falling edge of the signal CAS. After that, data of the
memory cell selected by the row address RA and the column address CA is
outputted. Accordingly, RAS access time T.sub.RAC shown in FIG. 2 is
required as the access time (namely, a period from the fall of the signal
RAS to the output of valid data). A cycle time Tc is a sum of a period in
which the DRAM is active (namely, a period of "L" level of the signal RAS)
and a RAS precharge period (namely, a period of "H" (high) level of the
signal RAS, in which the device is in a standby state). An average value of the
cycle time Tc is about 200 ns in the DRAM with T.sub.RAC =100 ns.
Referring now to FIG. 3, page mode operation will be described. First, a
row address and a column address are provided in the same manner as in the
normal operation cycle, whereby information of a selected memory cell is
read out through the output buffer 15. Then, the signal CAS is raised to
"H" level with the signal RAS being maintained at "L" level. As a result,
circuits related with column selection, such as the column address buffer
2 and the column decoder 8, are reset. On the other hand, the sense
amplifiers 6 keep latching information of the memory cells of the one row
selected by the row address RA because the signal RAS is at "L" level. Then,
when a new column address is provided and the signal CAS falls to "L" level,
a column
(a bit line) corresponding to the newly supplied column address is
selected and information on the column selected by the column decoder 8
and the I/O switch 7 is read out through the I/O bus 13 and the output
buffer 15. The operation of accepting a new column address on each toggle of
the signal CAS may be repeated any number of times within the
period in which the signal RAS is allowed to be maintained at "L" level.
In short, the page mode operation is operation for accessing memory cells
connected in the same row by changing only the column address. Since only
the column address is changed, it is not necessary to accept a row address
for each accessing and thus accessing operation can be performed at a
higher speed than that in the normal operation cycle.
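The page mode behaviour described above can be illustrated with a small behavioural sketch. It is not a circuit-accurate model: the class name `PageModeDram`, its method names and the list-based storage are hypothetical, and only read accesses of a single x1 chip are modelled.

```python
class PageModeDram:
    """Toy behavioural model of one x1 DRAM chip (hypothetical names)."""

    def __init__(self, rows=1024, cols=1024):
        self.cells = [[0] * cols for _ in range(rows)]
        # Contents of the sense amplifiers; None while RAS is at "H" level.
        self.latched_row = None

    def ras_fall(self, row_address):
        # Falling edge of RAS: the row address is accepted and the whole
        # row is detected, amplified and latched by the sense amplifiers.
        self.latched_row = list(self.cells[row_address])

    def cas_fall(self, column_address):
        # Falling edge of CAS: a column address is accepted and one latched
        # bit is routed to the output. No new row address is needed.
        if self.latched_row is None:
            raise RuntimeError("CAS fell while RAS was inactive")
        return self.latched_row[column_address]

    def ras_rise(self):
        # RAS precharge: the row is closed and the latched data is dropped.
        self.latched_row = None


dram = PageModeDram(rows=4, cols=8)
dram.cells[3][5] = 1
dram.ras_fall(3)             # one row address ...
first = dram.cas_fall(5)     # ... followed by several column accesses
second = dram.cas_fall(6)
dram.ras_rise()
```

Here `first` and `second` are read from the same latched row with only the column address changing, which is the property that the page mode exploits.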
Referring to FIG. 4, a static column mode will be described. In the static
column mode, the first accessing is performed in the same manner as in the
normal operation cycle. Thus, a row address and a column address are
accepted in the chip in response to the signals RAS and CAS, respectively,
and information of a memory cell selected is read out. Then, valid data is
read out and after an elapse of a predetermined period, the column
address is changed with the signals RAS and CAS being maintained at "L"
level. As a result, information of a memory cell corresponding to a new
column address out of the memory cells of the same row is read out.
In this operation mode as well, the signal RAS is maintained at "L" level
and information of the memory cells of one row designated by the initially
supplied row address is latched by the sense amplifiers. Thus, the static
column mode is also a mode in which the memory cells connected in the same
row are accessed by changing only the column address, as in the page mode.
However, in the same manner as in the case of a static RAM, the signal CAS
is maintained at "L" level (corresponding to a signal CS in a static RAM)
and access is made only by changing a column address. Accordingly, it is
not necessary to toggle the signal CAS and thus access can be made
generally at a higher speed than that in the page mode.
The access time T.sub.CAC in the page mode (namely, the period from a fall of
the signal CAS to the output of valid data) and the access time T.sub.AA
in the static column mode (namely, the period from a change of the column
address to the output of valid data, that is, the address access time) are
both about half of the RAS access time T.sub.RAC in the normal operation
mode. For example, if T.sub.RAC =100 ns, both T.sub.CAC and T.sub.AA are
about 50 ns. The cycle time is also shortened: in the page mode, the cycle
time is about 50 ns, as in the static column mode, although it depends on the
value of the CAS precharge time Tcp.
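As a rough worked example of these figures, the following sketch estimates the time until the n-th valid datum when n bits of the same row are read. It uses only the approximate values quoted in the text (T.sub.RAC = 100 ns, Tc = 200 ns, fast cycle of about 50 ns); the function names and the simple additive timing model are assumptions made for illustration, not figures taken from a data sheet.

```python
T_RAC_NS = 100   # RAS access time quoted in the text
T_C_NS = 200     # normal cycle time quoted in the text
T_FAST_NS = 50   # approximate page / static column cycle time quoted in the text


def normal_mode_time_ns(n):
    """Time to the n-th valid datum when every access is a full normal cycle."""
    # (n - 1) complete cycles, then one RAS access time for the last datum.
    return (n - 1) * T_C_NS + T_RAC_NS


def fast_mode_time_ns(n):
    """Time to the n-th valid datum when only the first access is normal."""
    # The first access still costs T_RAC; each later one costs a fast cycle.
    return T_RAC_NS + (n - 1) * T_FAST_NS


normal_8 = normal_mode_time_ns(8)   # 7 * 200 + 100 = 1500 ns
fast_8 = fast_mode_time_ns(8)       # 100 + 7 * 50 = 450 ns
```

Under these assumptions, eight reads from one row take roughly 1500 ns in normal cycles and roughly 450 ns in a fast access mode.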
Now, high-speed accessing operation of the DRAM will be briefly described
with reference to FIG. 1.
As shown in FIG. 1, multiplexed row and column addresses are supplied to
the row address buffer 1 and the column address buffer 2, respectively.
When the signal RAS falls to "L" level, an internal row address RA is
supplied from the row address buffer 1 to the row decoder 3 in response to
the falling edge thereof, so that the internal address RA is decoded. The
word driver 4 is driven by the decoded row address from the row decoder 3,
thereby activating one word line in the memory cell array 5 selected by
the internal row address RA. As a result, data of the respective memory
cells connected to the selected (activated) word line appear on the
related bit lines to be transmitted to the sense amplifiers 6. The sense
amplifiers 6 detect, amplify and latch the data supplied thereto. Thus, at
this time, data on one row corresponding to the designated row address are
latched by the sense amplifiers 6. Thereafter, if data in a memory cell on
the same row (that is, at the same row address) is accessed, the above
described page mode and static column mode can be utilized.
More specifically, in the page mode, the column address buffer 2 transmits
the column address supplied thereto to the column decoder 8 in response to
a falling edge of the signal CAS. As a result, one of the data bits latched
by the sense amplifiers 6 (in the case of a .times.1-bit structure) is
selected by the decoded address and provided as output data D.sub.OUT
through the output buffer 15.
In the static column mode, a trigger of column (bit line) selection is
given by a change in the multiplexed address MXA, namely, a transition in
the column address supplied to the column address buffer 2. Other
operation is the same as in the page mode.
A description of static column mode operation and a description of a cache
system using DRAMs operable in the static column mode are given in "The
Use of Static Column RAM as a Memory Hierarchy" by J. G. Goodman et al,
IEEE 11th Annual Symposium on Computer Architecture, 1984 pp. 167-174.
Page mode operation and ripple mode/static column mode operation as well as
a cache system using DRAMs operable in those modes are described in an
Application Note on 256K CMOS DRAM of Intel Corp., pp. 1-276 to 1-279.
Referring to FIG. 5, description is now made of construction and operation
of a simple cache memory system using a fast access mode such as the above
described page mode or static column mode.
A main memory system shown in FIG. 5 comprises eight DRAMs 22-1 to 22-8
each capable of performing fast serial access operation. Each of the DRAMs
22-1 to 22-8 has a 1M.times.1b structure. More specifically, each of the
DRAMs 22-1 to 22-8 has a capacity of 1 megabit (2.sup.20 bits) and data
is inputted and outputted on a 1-bit basis. Consequently, the main memory
system has a 1M byte structure. An identical address is multiplexed and
supplied to the respective DRAMs 22-1 to 22-8. Accordingly, an address of
10 bits is supplied to each DRAM.
In order to control access to the main memory, there are provided an
address generator 17, a latch (TAG) 18, a comparator 19, a state machine
20 and an address multiplexer 21.
The address generator 17 generates an address of data required by the CPU,
in response to address information from the CPU (not shown). If the main
memory system is of the 1M byte structure, addresses of 20 bits (namely, a
row address of 10 bits and a column address of 10 bits) are simultaneously
transmitted onto a 20-bit address bus 40.
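The split of the 20-bit address into a 10-bit row address and a 10-bit column address can be sketched as follows. The text does not state which bits carry the row address, so placing the row address in the upper 10 bits is an assumption, and the function name `split_address` is hypothetical.

```python
ROW_BITS = 10
COL_BITS = 10


def split_address(address):
    """Split a 20-bit address into (row, column) 10-bit fields.

    Assumption: the upper 10 bits are the row address and the lower
    10 bits are the column address.
    """
    if not 0 <= address < (1 << (ROW_BITS + COL_BITS)):
        raise ValueError("address must fit in 20 bits")
    row = address >> COL_BITS
    column = address & ((1 << COL_BITS) - 1)
    return row, column


row, column = split_address((5 << 10) | 7)   # row 5, column 7
```

With this layout, consecutive addresses stay in the same row until the column field wraps around, so sequential accesses tend to share a row address.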
The latch (TAG) 18 receives the addresses from the generator 17 and stores
the row address selected in the preceding cycle. The row address stored by
the latch (TAG) 18 is not updated at the time of hit in the cache memory
(hereinafter referred to as "cache hit"). It is updated by a row address
newly generated by the address generator 17 at the time of miss in the
cache memory (hereinafter referred to as "cache miss").
The comparator 19 compares the row address from the address generator 17
with the row address stored in the latch (TAG) 18 and generates a signal
CH (cache hit) indicating the result of comparison. The signal CH is
supplied to the latch (TAG) 18. Thus, updating of a content stored in the
latch (TAG) 18 is controlled. The signal CH is also supplied to the state
machine 20.
The state machine 20 generates control signals RAS, CAS and WE in response
to the signal CH and supplies those signals to the respective DRAMs 22-1
to 22-8. The signal WE is a signal for designating input and output of
data to and from the main memory system. Data is read out at "H" level of
the signal WE and data is written at "L" level thereof. The signal WE is
supplied to the data input buffer and the data output buffer of each DRAM.
Data is written in response to the later falling of the signals CAS and
WE. When the signal CH from the comparator 19 indicates a mismatch (a
cache miss), the state machine 20 temporarily raises the signals RAS and
CAS to "H" level and then lowers those signals sequentially, whereby each
DRAM executes the normal operation cycle. At the same time, the state machine
20 supplies a signal WAIT to the CPU to bring the CPU into a wait state.
When the signal CH indicates a match (a cache hit), the state machine 20
maintains the signal RAS at "L" level and toggles the signal CAS, so that
each DRAM performs page mode operation.
The address multiplexer 21 multiplexes the addresses from the address
generator 17 and transmits the multiplexed addresses onto the 10-bit
address bus 41 to supply the same to the respective DRAMs 22-1 to 22-8
under control of the state machine 20. When the signal CH indicates a
mismatch, the address multiplexer 21 multiplexes the address of 20 bits
supplied from the address generator 17 and generates a row address of 10
bits and a column address of 10 bits successively under control of the
state machine 20. When the signal CH indicates a match, only a column
address of 10 bits out of the addresses supplied is generated under
control of the state machine 20.
Referring now to FIG. 6 indicating an operation waveform diagram, operation
of the cache memory system shown in FIG. 5 will be described. The system
clock shown in FIG. 6 is a clock for applying operation timing to the
memory system and the CPU, and one machine cycle is defined by one clock.
According to procedures of a program, the CPU generates address information
of necessary data. The address generator 17 generates, in response
thereto, an address showing a location of storage of the data required by
the CPU, at a rise of the system clock and supplies the address to the
20-bit address bus 40. The comparator 19 compares the 10-bit row address
(RA2) out of the generated addresses with the row address (RA1) stored by
the latch (TAG) 18. When those addresses match (RA1=RA2), which means that
the same row as that related with the memory cells accessed in the
preceding cycle has been accessed, the comparator 19 generates the signal
CH of "H" level, for example, indicating a cache hit. The state machine 20
toggles the signal CAS in response to the signal CH of "H" level from the
comparator 19 with the signal RAS being maintained at "L" level (till
then, the signal RAS is at "L" level and each DRAM is enabled). On the
other hand, the address multiplexer 21 transmits the 10-bit column address
to the 10-bit address bus 41 under control of the state machine 20 when
the signal CH is generated. As a result, the respective DRAMs 22-1 to 22-8
perform page mode operation and provide data at high speed to the CPU in
the access time T.sub.CAC. (Input and output of the data are instructed by
the signal WE, whose instruction is given by the CPU and provided through
the state machine 20.)
On the other hand, when the row address (RA1) stored by the latch (TAG) 18 does
not match with the row address (RA2) generated by the address generator
17, the comparator 19 does not generate the signal CH (or keeps the signal
CH at "L" level). In this case, since the memory cells of a row different
from that accessed in the preceding cycle are accessed, it is necessary to
newly supply a row address to the respective DRAMs 22-1 to 22-8. When the
signal CH is not generated, the state machine 20 brings the signals RAS
and CAS temporarily into an inactive state at "H" level, so that the
respective DRAMs 22-1 to 22-8 can execute the normal operation cycle. The
address multiplexer 21 multiplexes the 20-bit address from the address
generator 17 and transmits the row address and the column address
successively by 10 bits to the address bus 41 under control of the state
machine 20. The respective DRAMs 22-1 to 22-8 accept the row address at a
fall of the signal RAS to select one word line and accept the column
address at a fall of the signal CAS to select one column, whereby
information of the selected memory cell is outputted.
Thus, in the case of cache miss, the normal operation cycle beginning with
RAS precharging is executed. The minimum value of the RAS precharge period
is predetermined and the succeeding operation cycle can not be started
before the elapse of the RAS precharge period. In addition, the access
time until valid data is outputted is the slow RAS access time T.sub.RAC. Since this
time T.sub.RAC is longer than one operation cycle time of the CPU, the
state machine 20 supplies a signal WAIT to the CPU to bring it into a wait
state. In the case of a cache miss, the latch (TAG) 18 stores a new row
address on the address bus 40 and holds it. Control as to whether the
stored content in the latch (TAG) 18 is to be changed or not is made by
the signal CH.
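The cooperation of the latch (TAG) 18, the comparator 19 and the state machine 20 described above can be summarised in a short sketch. The class name, the dictionary keys and the representation of strobes as tuples are hypothetical; the model only records which addresses are strobed and whether the CPU is made to wait, not the actual timing.

```python
class SimpleCacheController:
    """Hypothetical model of latch (TAG) 18, comparator 19 and state machine 20."""

    def __init__(self):
        # Row address held by the latch; None before the first access.
        self.tag = None

    def access(self, row, col):
        cache_hit = (row == self.tag)   # comparator 19 produces the signal CH
        if cache_hit:
            # Hit: RAS stays at "L" level and only CAS toggles (page mode),
            # so only the column address is sent to the DRAMs.
            return {"CH": True, "WAIT": False, "strobes": [("CAS", col)]}
        # Miss: the latch is updated with the new row address, RAS and CAS
        # are raised and then lowered in turn, and the CPU is made to wait.
        self.tag = row
        return {"CH": False, "WAIT": True,
                "strobes": [("RAS", row), ("CAS", col)]}


controller = SimpleCacheController()
first = controller.access(3, 5)    # miss: row 3 is strobed, CPU waits
second = controller.access(3, 9)   # hit: same row, only the column changes
third = controller.access(4, 9)    # miss: a different row is requested
```

The latch content changes only on a miss, which mirrors the statement that updating of the latch (TAG) 18 is controlled by the signal CH.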
In the above described construction, the latch (TAG) 18 stores a row
address, and a match or a mismatch between the stored row address and a
row address to be newly accessed is determined. In other words, in the
conventional simple cache memory system, data for one row of a DRAM (1024
bits in the case of a 1M device) is formed as one block and a cache hit or
a cache miss with respect to this data block is determined.
However, there is not a high probability that all the data of one block
(1024 bits of one row for each DRAM in the above described prior art) are
continuously accessed by the CPU. Therefore, the block size (namely, 1024
bits/DRAM) is unnecessarily large.
In addition, in the construction utilizing the page mode or the static
column mode as in the above described prior art, the latch (TAG) 18 holds
only one block (entry) and the capacity can not be further increased.
Consequently, the cache hit rate can not be sufficiently increased. In
other words, a cache hit occurs only in the case where the same row
address is continuously accessed. Accordingly, if a program routine
related with two consecutive row addresses is repeatedly executed, a cache
miss always occurs and thus the function of the cache memory can not be
satisfactorily performed.
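The effect of holding only one entry can be illustrated by counting hits for a sequence of row addresses. This is a minimal sketch: the function name `hit_rate` is hypothetical, and the second access pattern is a deliberately chosen worst case in which consecutive accesses alternate between two rows.

```python
def hit_rate(row_sequence):
    """Fraction of accesses that hit when only one row address is held."""
    tag = None   # the single entry; empty before the first access
    hits = 0
    for row in row_sequence:
        if row == tag:
            hits += 1
        else:
            tag = row   # a miss replaces the only entry
    return hits / len(row_sequence) if row_sequence else 0.0


same_row = hit_rate([7] * 8)           # 7 hits out of 8 accesses
alternating = hit_rate([7, 8] * 4)     # every access replaces the entry
```

With these inputs, `same_row` is 0.875 and `alternating` is 0.0, since each access in the alternating pattern evicts the row that the next access needs.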
A dynamic semiconductor memory device comprising a serial shift register
having a number of stages equal to the number of columns in the memory
cell array and connected to the columns through transfer gates is
disclosed in U.S. Pat. No. 4,330,852 entitled "Semiconductor Read/Write
Memory Array Having Serial Access", issued to D. J. Redwine et al, filed
Nov. 23, 1973. In this device, data of cells of one row are transmitted in
parallel between the shift register and an addressed row of memory cells.
Data in the shift register are serially shifted from the register to
external for a read operation. The device of the prior art comprises a
data register which is serially accessed, and thus the device can not be
employed as a cache memory which requires random access to the column on
an addressed row.
The same device as discussed above is also described in a publication
entitled "A High Speed Dual Port Memory with Simultaneous Serial and
Random Mode Access for Video Application" by R. Pinkham et al, IEEE
Journal of Solid-State Circuits Vol. SC-19, No. 6, December 1984, pp.
999-1007.
A memory device with on-chip cache is disclosed by Matick et al. in U.S.
Pat. No. 4,577,273 entitled "Distributed On-Chip Cache", filed Jan. 1,
1984. This prior art on-chip cache comprises a cell array and a
master-slave register. The cell array is accessed through a first port,
while the slave register is accessed through a second port. The
master-slave register is employed as a cache. However, in this prior art,
the master register receives data from the columns connected to an
addressed row of the cell array. Therefore, this prior art also has
disadvantages such as too large a data block size and too small a number of
entries in the latch (TAG).
SUMMARY OF THE INVENTION
One object of the present invention is to provide a dynamic random access
memory (DRAM) device containing a cache memory with an adequate data block
size.
Another object of the present invention is to provide a DRAM device in
which an increased number of entries can be stored in a tag in a simple
cache system.
A further object of the present invention is to provide a simple cache
system with an adequate data block size and an increased number of entries
to be stored, thereby to improve the hit rate and cost performance of the
system.
A still further object of the present invention is to provide a
semiconductor memory device containing a cache memory with an improved
cache hit rate and an adequate data block size without increasing the
number of external pin terminals.
A still further object of the present invention is to provide an operating
method for any of the above described devices.
A DRAM device of the present invention includes a memory cell array divided
into a plurality of cell blocks and a plurality of data storage blocks
provided corresponding to the respective cell blocks. Each data storage
block is operable to receive data on the columns in the corresponding cell
block in response to an inactive row address strobe signal.
Each data storage block is also operable to output data therefrom
corresponding to data on a selected column in response to an active row
address strobe signal.
In the above described structure, the respective data storage blocks can
store data of different rows on a basis of plural data bits, and thus the
number of entries in a simple cache system is increased according to the
number of the data storage blocks. In addition, the data block size of the
simple cache system is reduced to an adequate size depending on the size
of the data storage blocks of the main memories.
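The benefit of several data storage blocks can be sketched by giving each block its own tag. This is only an illustration under stated assumptions: the summary does not say how columns are assigned to cell blocks or how the tags are kept, so the contiguous column-to-block mapping, the function name `hit_rate_blocked` and the access pattern are all hypothetical.

```python
def hit_rate_blocked(accesses, num_blocks, cols=1024):
    """Hit rate when each of num_blocks data storage blocks holds its own row.

    Assumptions: columns are divided into contiguous, equally sized cell
    blocks, and one row address (tag) is remembered per block.
    """
    block_width = cols // num_blocks
    tags = [None] * num_blocks   # one entry per data storage block
    hits = 0
    for row, col in accesses:
        block = col // block_width
        if tags[block] == row:
            hits += 1
        else:
            tags[block] = row    # only this block's entry is replaced
    return hits / len(accesses) if accesses else 0.0


# Two rows accessed alternately, in different column ranges.
pattern = [(7, 10), (8, 600)] * 4
single_entry = hit_rate_blocked(pattern, num_blocks=1)
four_entries = hit_rate_blocked(pattern, num_blocks=4)
```

For this pattern, `single_entry` is 0.0 while `four_entries` is 0.75, because the two rows fall into different blocks and no longer evict each other. Two rows accessed within the same column range would still conflict under this assumed mapping.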
Furthermore, the row address strobe signal RAS is used as a signal for
designating an access operation mode of the DRAM depending on a cache hit
or a miss and accordingly operation of the DRAM can be controlled without
any additional external terminals.
These objects and other objects, features, aspects and advantages of the
present invention will become more apparent from the following detailed
description of the present invention when taken in conjunction with the
accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic view showing an overall construction of a
conventional DRAM.
FIGS. 2(a)-2(d) are typical waveform diagrams in a read operation in a
normal mode of the conventional DRAM.
FIGS. 3(a)-3(d) are typical waveform diagrams in a read operation in a page
mode of the conventional DRAM.
FIGS. 4(a)-4(d) are typical waveform diagrams in a read operation in a
static column mode of the conventional DRAM.
FIG. 5 represents a schematic structural diagram of a conventional simple
cache system using DRAMs operable in a fast access mode.
FIGS. 6(a)-6(g) are operation waveform diagrams for the simple cache system
as shown in FIG. 5.
FIG. 7 represents a schematic structural diagram of a DRAM according to an
embodiment of the present invention.
FIG. 8 represents a structure of the main part of the DRAM as shown in FIG.
7.
FIG. 9 is a schematic view showing a structure of a simple cache memory
system utilizing DRAMs of the present invention.
FIGS. 10(a)-10(f) are waveform diagrams showing operation of the simple
cache memory system according to the above mentioned embodiment.
FIG. 11 is a schematic diagram showing a structure of a simple cache memory
system according to another embodiment of the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
An external control signal RAS (row address strobe) of a DRAM applies start
timing for reading and writing of data in the normal operation mode.
However, in the page mode cycle and the static column cycle which are fast
access modes, timing for writing and reading of data is provided by a
signal CAS or a signal WE. Consequently, the signal RAS for providing a
timing of start of a memory cycle and a row address strobe timing does not
play any role in reading and writing of data in a fast access mode. For
this reason, the signal RAS does not need to be maintained in the active
state, i.e., at "L" level in the page mode cycle or the static column
cycle.
Therefore, in the present invention, the