WikiPatents - Community Patent Review
Enhanced DRAM with all reads from on-chip cache and all writers to memory array    
United States Patent 5699317
Link to this page: http://www.wikipatents.com/5699317.html
Inventor(s): Sartore; Ronald H. (San Diego, CA); Mobley; Kenneth J. (Colorado Springs, CO); Carrigan; Donald G. (Monument, CO); Jones; Oscar Frederick (Colorado Springs, CO)
Abstract: An enhanced dynamic random access memory (DRAM) contains embedded row registers in the form of latches. The row registers are adjacent to the DRAM array, and when the DRAM comprises a group of subarrays, the row registers are located between DRAM subarrays. When used as on-chip cache, these registers hold frequently accessed data. This data corresponds to data stored in the DRAM at a particular address. When an address is supplied to the DRAM, it is compared to the address of the data stored in the cache. If the addresses are the same, then the cache data is read at static random access memory (SRAM) speeds. The DRAM is decoupled from this read. The DRAM also remains idle during this cache read unless the system opts to precharge or refresh the DRAM. Refresh or precharge occur concurrently with the cache read. If the addresses are not the same, then the DRAM is accessed and the embedded register is reloaded with the data at that new DRAM address. Asynchronous operation of the DRAM is achieved by decoupling the row registers from the DRAM array, thus allowing the DRAM cells to be precharged or refreshed during a read of the row register.

Title Information
 
Inventor     Sartore; Ronald H. (San Diego, CA); Mobley; Kenneth J. (Colorado Springs, CO); Carrigan; Donald G. (Monument, CO); Jones; Oscar Frederick (Colorado Springs, CO)
Owner/Assignee     Ramtron International Corporation (Colorado Springs, CO)
Publication Date     December 16, 1997
Application Number     08/319,289
Filing Date     October 6, 1994
US Classification     365/230.06 365/49 365/230.05
Int'l Classification     G11C 011/409
Examiner     Lane; Jack A.
Assistant Examiner    
Attorney/Law Firm     Bachand, Esq.; Richard A., Kubida, Esq.; William J., Meza, Esq.; Peter J.
Address
Parent Case     This is a continuation-in-part of application Ser. No. 07/824,211, filed Jan. 22, 1992, now abandoned.
Priority Data    
USPTO Field of Search     365/49 365/189.04 365/189.05 365/203 365/222 365/230.03 365/230.08 365/230.05 365/149 365/189.07 365/230.06 395/432 395/433
Patent Tags     enhanced dram all reads on-chip cache all writers to memory array
   

References

U.S. References

Patent No.   Inventor      U.S. Class    Date
5226009      Arimoto       365/189.04    Jul. 1993
5226147      Fujishima     711/118       Jul. 1993
5226139      Fujishima     711/3         Jul. 1993
5214610      Houston       365/233.5     May 1993
5184325      Lipovski      365/189.07    Feb. 1993
5184320      Dye           365/49        Feb. 1993
5179687      Hidaka        711/118       Jan. 1993
5148346      Naab          361/251       Sep. 1992
5111386      Fujishima     711/118       May 1992
5025421      Cho           365/230.05    Jun. 1991
4943944      Sakui         365/189.05    Jul. 1990
4926385      Fujishima     365/230.03    May 1990
4894770      Ward          711/128       Jan. 1990
4794559      Greenberger   365/49        Dec. 1988
4608666      Uchida        365/222       Aug. 1986
4577293      Matick        365/189.04    Mar. 1986
 Foreign References
 Other References
Claims
 


What is claimed as the invention is:

1. An integrated circuit memory device comprising:

an array of DRAM memory cells, the array including a plurality of bit lines;

a row decoder coupled to said array;

a plurality of sense amplifiers coupled to said array;

a set of registers separate from said sense amplifiers usable as cache memory;

a coupling circuit selectively coupling said registers to said array;

a read port coupled to said set of registers and configured so that all stored data to be read from the integrated circuit memory device is read from said cache memory via said read port;

a write port, distinct from said read port, coupled to said array so that all externally-provided data to be stored in said device is routed to said array of DRAM memory cells without passing through said read port;

and a single column decoder coupled to and shared by said read and write ports.

2. The memory device of claim 1 wherein said coupling circuit includes coupling transistors connected between said bit lines and said registers, wherein each of said coupling transistors has a respective control electrode.

3. The memory device of claim 2 wherein each said control electrode is coupled for receiving a corresponding control signal to be commonly applied to at least two of said control electrodes.

4. The memory device of claim 3 wherein said control electrodes of said coupling transistors are coupled for receiving at least two control signals, wherein some of said control electrodes are coupled to receive a first one of said control signals (LOAD 1) and some other ones of said control electrodes are coupled to receive a second one of said control signals (LOAD 2).

5. The memory device of claim 4 wherein:

two of said coupling transistor control electrodes are coupled to receive said first control signal (LOAD 1); and

two other ones of said coupling transistor control electrodes are coupled for receiving said second control signal (LOAD 2).

6. The memory device of claim 2 wherein each said coupling transistor further includes first and second electrodes;

wherein said first electrodes are coupled to said bit lines;

wherein said read port includes read port transistors; and

wherein some of said second electrodes of said coupling transistors are coupled to a corresponding read port transistor.

7. The memory device of claim 6 wherein said coupling transistor second electrodes that are coupled to said read port transistors are coupled to gate electrodes of said read port transistors, thereby providing a high impedance read port.

8. The memory device of claim 6 wherein said some second electrodes of said coupling transistors are also coupled to a first terminal of a corresponding one of said registers.

9. The memory device of claim 8 wherein other ones of said second electrodes of said coupling transistors bit lines and said registers are connected to a second terminal of the corresponding one of said registers.

10. The memory device of claim 6 wherein said second electrodes of said coupling transistors are coupled to first and second terminals of said registers.

11. The memory device of claim 2 wherein said write port comprises a plurality of input transistors, each having first, second, and control electrodes;

wherein each said input transistor first electrode is coupled to a respective bit line.

12. The memory of claim 11 wherein each said input transistor second electrode is coupled to receive a data input bit or a complement thereof; and each said control electrode of said input transistors is coupled to receive a column decode signal.

13. The memory device of claim 1 wherein said coupling circuit, read port, write port, and registers comprise:

a plurality of cross-coupled inverters (142, 144) forming said registers, each said register having a first terminal and a second terminal;

a plurality of coupling transistors (212-218) each having first, second, and control electrodes, said second electrodes being connected to selected ones of said first and second terminals of said registers;

a plurality of read port transistors, each having a respective control electrode;

wherein each said second electrode of the coupling transistors is coupled also to a corresponding one of said read port transistors;

wherein each said coupling transistor control electrode is connected to at least one other coupling transistor control electrode and coupled to receive a corresponding one of a plurality of control signals (LOAD 1, LOAD 2);

a plurality of write port transistors (203-209) connected to respective ones of said bit lines (45) and coupled to receive write data, said write port transistors being located so that said write data can be applied to the bit lines regardless of the state of the coupling transistors.

14. The memory device of claim 13 wherein said coupling transistor second electrodes are coupled to said control electrodes of said read port transistors, thereby providing a high impedance read port.

15. The memory device of claim 1 wherein said read port is unidirectional and said write port is unidirectional.

16. The memory device of claim 1 further comprising a column address decoder coupled to said read ports and said write ports.

17. The memory device of claim 1 wherein said array is arranged as a plurality of DRAM subarrays each having a respective plurality of bit lines;

wherein said set of row registers is arranged as a plurality of sets of row registers corresponding in number to said plurality of DRAM subarrays; and

wherein each said subarray is coupled to store read data in only one respective set of row registers, and each set of row registers is coupled to receive and store read data from only one corresponding DRAM subarray.

18. The memory device of claim 17 wherein each said DRAM subarray is positioned between its respective set of row registers and the sense amplifiers corresponding to said subarray.
Description
 


FIELD OF THE INVENTION

The present invention relates to a dynamic random access memory ("DRAM") and more particularly to an Enhanced DRAM (which we call an "EDRAM") with embedded registers to allow fast random access to the DRAM while decoupling the DRAM from data processing operations. The parent application, U.S. Ser. No. 07/824,211 filed Jan. 22, 1992, is incorporated herein by reference.

BACKGROUND OF THE INVENTION

As the computer industry evolves, demands for memory have outpaced the technology of available memory devices. One of these demands is high speed memory compatibility. Thus, in a computer system, such as a personal computer or other computing system, memory subsystems have become an influential component toward the overall performance of the system. Emphasis is now on refining and improving memory devices that provide affordable, zero-wait-state operations.

Generally, volatile memories are either DRAM or static RAM ("SRAM"). Each SRAM cell includes plural transistors. Typically the data stored in a SRAM cell is stored by the state of a flip-flop formed by some of the transistors. As long as power is supplied, the flip-flop keeps its data: it does not need refreshing. In a DRAM cell, on the other hand, there typically is one transistor, and data is stored in the form of charge on a capacitor that the transistor accesses. The capacitor dissipates its charge and needs to be refreshed.

These two types of volatile memories have respective advantages and disadvantages. With respect to memory speed, the SRAM is faster than the DRAM due, partially at least, to the nature of the cells. The disadvantage, however, is that because there are more transistors, the SRAM memory is less dense than a DRAM of the same physical size. For instance, static RAMs traditionally have a maximum of one-fourth the number of cells of a DRAM which uses the same technology.

While the DRAM has the advantage of smaller cells and thus higher cell density (and lower cost per bit), one disadvantage is that the DRAM must refresh its memory cells whereas the SRAM does not. While the DRAM refreshes and precharges, access to the memory cells is prohibited. This creates an increase in access time, which drawback the static RAM does not suffer.

However, the speed and functionality of current DRAMs are often emphasized less than memory size (storage capacity) and cost. This is evidenced by the fact that DRAM storage capacity density has increased at a rate an order of magnitude greater than its speed. While there has been some improvement in access time, systems using DRAMs generally have had to achieve their speed elsewhere.

In order to increase system speed, cache memory techniques have recently been applied to DRAM main memory. These approaches have generally been implemented on a circuit board level. That is, a cache memory is frequently a high-speed buffer interposed on the circuit board between the processor chip and the main memory chip. While some efforts have been made by others to integrate a cache with DRAM, we first address the board level approach.

FIG. 1 indicates a prior art configuration (board-level) wherein a processor chip 10 is configured with a cache controller 12 and a cache memory 14. The main purpose of the cache memory is to maintain frequently accessed data for high speed system access. Cache memory 14 (sometimes called "secondary cache static RAM") is loaded via a multiplexer 16 from DRAMs 20, 22, 24 and 26. Subsequently, data is accessed at high speeds if stored in cache memory 14. If not, DRAMs 20, 22, 24 and/or 26 load the sought data into cache memory 14. As seen in FIG. 1, cache memory 14 may comprise a SRAM, which is generally faster than DRAMs 20-26.

Various approaches have been proposed for cache memory implementation. These approaches include controlling external cache memory by a controller, such as cache memory 14 and cache controller 12 in FIG. 1, or discrete proprietary logic. Notwithstanding their benefits, cache memory techniques complicate another major problem that exists in system design. Memory components and microprocessors are typically manufactured by different companies. This requires the system designer to effectively bridge these elements, using such devices as the cache controller 12 and the multiplexer 16 of FIG. 1. These bridge components are usually produced by other companies. The different pin configurations and timing requirements of these components make interfacing them with other devices difficult. Adding a cache memory that is manufactured by yet another company creates further design problems, especially since there is no standard for cache implementation.

Exacerbating the system design problems is the disadvantage that the use of external cache memory (such as cache memory 14) compromises the main storage access speed. There are mainly two reasons for this compromise. First, and most significant, the main storage access is withheld until a "cache miss" is realized. The penalty associated with this miss can represent up to two wait states for a 50 MHz system. This is in addition to the time required for a main memory access. Second, the prioritized treatment of physical routing and buffers afforded the external cache is usually at the expense of the main memory data and address access path. As illustrated in FIG. 1, data from DRAMs 20, 22, 24 and 26 can be accessed only through cache memory 14. The actual delay may be small, but adds up quickly.
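
The wait-state figure above can be turned into time with simple arithmetic. The following is a minimal sketch; the function name is hypothetical, and the only inputs taken from the text are the 50 MHz clock and the two wait states.

```python
def wait_state_penalty_ns(clock_mhz, wait_states):
    """Return the time lost to wait states, in nanoseconds."""
    cycle_ns = 1000.0 / clock_mhz  # one clock period in nanoseconds
    return wait_states * cycle_ns


# Two wait states on a 50 MHz system (20 ns per cycle).
print(wait_state_penalty_ns(50.0, 2))  # 40.0
```

Under these figures, a cache miss costs up to 40 ns before the main memory access itself even begins.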

A third problem associated with separate cache and main memory is that the time for loading the cache memory from the main memory ("cache fill") is dependent on the number of inputs to the cache memory from the main memory. Since the number of inputs to the cache memory from the main memory is usually substantially less than the number of bits that the cache memory contains, the cache fill requires many clock cycles. This compromises the speed of the system.
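
The dependence of cache fill time on the number of inputs can be illustrated as follows. This is a sketch only: the 2048-bit cache size and the 8-bit input width are assumed values chosen for illustration and do not come from the patent, and the function name is hypothetical.

```python
def fill_cycles(cache_bits, bits_per_transfer):
    """Clock cycles needed to fill a cache, one transfer per cycle."""
    # Ceiling division: a partial final transfer still costs a full cycle.
    return -(-cache_bits // bits_per_transfer)


# Assumed figures: a 2048-bit cache loaded through 8 inputs...
print(fill_cycles(2048, 8))     # 256
# ...versus a fully parallel load of all 2048 bits at once.
print(fill_cycles(2048, 2048))  # 1
```

The narrower the path from main memory to cache, the more cycles each fill consumes.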

A memory architecture that has been used or suggested for video RAMs ("VRAMs") is to integrate serial registers with a main memory. VRAMs are specific to video graphics applications. A VRAM may comprise a DRAM with high speed serial registers allowing an additional access port for a line of digital video data. The extra memory used here is known as a SAM (serially addressed memory), which is loaded using transfer cycles. The SAM's data is output by using a serial clock. Hence, access to the registers is serial, not random. Also, there is continuous access to the DRAM so refresh is not an issue as it is in other DRAM applications.

Another implementation of on-chip cache memory, expected to come to market in 1992, will use a separate cache and cache controller sub-system on the chip. It uses full cache controllers and cache memory implemented in the same way as it would be if external to the chip, i.e. a system approach. This approach is rather complicated and requires a substantial increase in die size. Further, the loading time of the cache memory from the main memory is constrained by the use of input/output cache access ports that are substantially fewer in number than the number of cache memory cells. A cache fill in such a manner takes many clock cycles, whereby system access speed suffers. Such an approach is, in the inventors' views, somewhat cumbersome and less efficient than the present invention.

Still another problem in system design arises when the system has both (a) interleaved memory devices together with (b) external cache memory. Interleaving assigns successive memory locations to physically different memory devices, thereby increasing data access speed. Such interleaving is done for high-speed system access such as burst modes. The added circuitry for cache control and main memory multiplexing usually required by external cache memory creates design problems for effective interleaved memory devices.
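
Interleaving as described above can be sketched in a few lines. The text says only that successive locations go to physically different devices; the low-order (modulo) mapping used here is one common way to do that and is an assumption, as are the function name and the choice of four devices.

```python
def interleave(address, num_devices):
    """Map a linear address to (device index, offset within device).

    Assumes low-order interleaving: consecutive addresses rotate
    through the devices in turn.
    """
    return address % num_devices, address // num_devices


# Six consecutive addresses spread over four assumed devices.
for addr in range(6):
    print(addr, interleave(addr, 4))
```

With this mapping a burst of consecutive addresses touches each device in turn, so one device can be preparing its next access while another is delivering data.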

Another problem with the prior art arises when memory capacity is to increase. Adding more memory would involve adding more external SRAM cache memory and more cache control logic. For example, doubling the memory size in FIG. 1 requires not only more DRAM devices but also another multiplexer and possibly another cache controller. This would obviously add to system power consumption, detract from system reliability, decrease system density, add manufacturing costs and complicate system design.

Another problem concerns the cost of manufacturing a system with an acceptable cache hit probability. When using external cache memory, manufacturers allocate a certain amount of board area for the main memory. A smaller area is allocated for the external cache. Usually, it is difficult to increase the main memory and the external cache memory while maintaining an acceptable cache hit probability. This limitation arises from the dedication of more board area for the main memory than for external cache.

A further problem with system speed is the need for circuitry external to the main memory to write "post" data. Post data refers to data latched in a device until it is needed. This is done because the timing requirement of the component needing the data does not synchronize with the component or system latching the data. This circuitry usually causes timing delays for the component or system latching the data.

As stated supra, access to the DRAM memory cells during a precharge and refresh cycle was prohibited in the prior art. Some prior art approaches have tried to hide the refresh in order to allow access to DRAM data. One DRAM arrangement maintained the data output during a refresh cycle. The drawback of this arrangement was that only the last read data was available during the refresh. No new data read cycle could be executed during the refresh cycle.

A pseudo-static RAM is another arrangement that attempted to hide the refresh cycle. The device was capable of executing internal refresh cycles. However, any attempted data access during the refresh cycle would extend the data access time, in a worst case scenario, by a cycle time (refresh cycle time plus read access time). This arrangement did not allow true simultaneous access and refresh, but used a time division multiplexing scheme to hide the refresh cycle.

Another way to hide the refresh cycle is to interleave the RAM memory on the chip. When a RAM memory block with even addresses is accessed, the odd memory block is refreshed and vice-versa. This type of implementation requires more timing control restraints which translate to a penalty in access time.

Another type of problem arises when considering the type of access modes to the main memory. One type of access is called page mode, in which several column addresses are synchronously applied to an array after a row address has been received by the memory. The output data access time will be measured from the timing clock edge (where the column address is valid) to the appearance of the data at the output.

Another type of access mode is called static column mode wherein the column addresses are input asynchronously. Access can occur in these modes only when RAS is active (low), and a prolonged time may be required in the prior art.

When manufacturing chips that support these access types, only one of these access types can be implemented into the device. Usually, one of the last steps in the making of the memory chip will determine if it will support either type of access. Thus, memory chips made this way do not offer both access modes. This induces an added expense in that the manufacturer must use two different processes to manufacture the two types of chips.

To overcome these problems, small modifications added to a component, such as a DRAM, may yield an increase in system performance and eliminate the need for any bridging components. To successfully integrate the modification with the component, however, its benefit must be relatively great or require a small amount of die space. For example, DRAM yields must be kept above 50% to be considered producible. Yields can be directly correlated to die size. Therefore, any modifications to a DRAM must take into account any die size changes.

In overcoming these problems, new DRAM designs have become significant. The greatest disadvantage to caching within DRAMs has been that DRAMs are too slow. The present invention in one of its aspects seeks to change the architecture of the DRAM to take full advantage of high caching speed that may now be obtainable.

One way to meet this challenge is to integrate the functions of the main storage and cache. Embedding the cache memory within localized groups of DRAM cells would take advantage of the chip's layout. This placement reduces the amount of wire (conductive leads) used in the chip which in turn shortens data access times and reduces die size.

U.S. Pat. No. 5,025,421 to Cho is entitled "Single Port Dual RAM." It discloses a cache with typical DRAM bit lines connected to typical SRAM bit lines through pass gates. Reading and writing the SRAM and DRAM arrays occurs via a single port, which requires that input/output busses communicate with the DRAM bit lines by transmitting data through the SRAM bit lines. Using SRAM bit lines to access the DRAM array precludes any access other than refresh to the DRAM array while the SRAM array is being accessed, and conversely precludes access to the SRAM array while the DRAM array is being accessed, unless the data in the SRAM is the same data as in the currently accessed DRAM row. This is a functional constraint that is disadvantageous.

Moreover, the SRAM cells of Cho FIG. 1 are full SRAM cells, although his FIG. 4 may disclose using only a single latch (FF11) rather than an entire SRAM cell. However, the use of a single port with a simple latch raises a severe problem. Such an architecture lacks the ability to write data into the DRAM without corrupting the data in the SRAM latch. Hence, the FIG. 4 configuration is clearly inferior to Cho's FIG. 1 configuration.

Another effort is revealed by U.S. Pat. No. 4,926,385 to Fujishima, Hidaka, et al., assigned to Mitsubishi, entitled, "Semiconductor Memory Device With Cache Memory Addressable By Block Within Each Column." There are other patents along these lines by Fujishima and/or Hidaka. This one uses a row register like Cho FIG. 4. Two ports are used, but two decoders are called for. While this overcomes several of the problems of Cho, it requires a good deal more space consumed by the second column decoder and a second set of input/output switch circuitry. (Subsequent Fujishima/Hidaka patents have eliminated the second access port and second decoder and have reverted to the Cho FIG. 1 approach, despite its disadvantages.) Nevertheless, in this patent, the "tag" and data coherency control circuitry for the cache is external to the chip and is to be implemented by the customer as part of the system design. The "tag" refers to information about what is in the cache at any given moment. A "hit" or "miss" indication is required to be generated in the system, external to the integrated circuit memory, and supplied to the chip. This leads to a complicated and slower system.

Other Fujishima, Hidaka, et al. U.S. patents include U.S. Pat. Nos. 5,111,386; 5,179,687; and 5,226,139.

Arimoto U.S. Pat. No. 5,226,009 is entitled, "Semiconductor memory device supporting cache and method of driving the same." This detects whether a hit or miss occurs by using a CAM cell array. The basic arrangement is like the approach of Cho FIG. 1 but modified to collect DRAM data from an "interface driver," which is a secondary DRAM sense amplifier, rather than from the primary DRAM sense amplifiers. This architecture still accesses the DRAM bit lines via the SRAM bit lines and is plagued with the single port problem. Circuitry is provided to preserve coherency between the DRAM and the SRAM. A set of tag registers is discussed with respect to a system-level (off-chip) implementation in a prior art drawing. Arimoto implements his on-chip cache tag circuitry using a content addressable memory array. That approach allows N-way mapping, which means that a group of memory devices in the cache can be assigned to any row in any of N subarrays. For example, if an architecture is "4-way associative," this means that there are four SRAM blocks, any of which can be written to by a DRAM. This method results in a large, expensive, and slow implementation of mapping circuitry. Using a CAM array for tag control has an advantage of allowing N-way association. However, the advantage of N-way association seems not to outweigh the disadvantage of the large and slow CAM array to support the N-way SRAM array.

Dye U.S. Pat. No. 5,184,320 is for a "Cached random access memory device and system" and includes on-chip cache control. The details of the actual circuitry are not disclosed, however. This patent also is directed to N-way association and considerable complication is added to support this.

Another piece of background art is Matick et al. U.S. Pat. No. 4,577,293 for a "Distributed on-chip cache." It has 2-way associative cache implemented using a distributed (on-pitch) set of master-slave row register pairs. Full flexibility of access is provided by dual ports that are not only to the array but also to the chip itself. The two ports are totally independent, each having pins for full address input as well as data input/output. The cache control is on-chip.

Thus it should be appreciated that the art has heretofore often directed efforts in achieving N-way association. While this has led to complications, the art has thought that N-way association is the approach to follow.

The present invention, according to one of its aspects, rejects this current thinking and instead provides a streamlined architecture that not only includes on-chip cache control, but also operates so fast that the loss of N-way association is not a concern.

Therefore, it is a general object of this invention to overcome the above-listed problems.

Another object of the present invention is to isolate the cache memory data access operation from undesirable DRAM timing overhead operations, such as refresh and precharge.

A further object of the present invention is to eliminate the need for an external static RAM cache memory in high speed systems.

Still another object of the present invention is to ensure cache/main memory data coherency.

Another object of this invention is to ensure such data coherency in a fashion which minimizes overhead, so as to reduce any negative impact such circuitry might have on the random data access rate.

SUMMARY OF THE PRESENT INVENTION

The present invention provides a high-speed memory device that is hybrid in its construction and is well-suited for use in high-speed processor-based systems. A preferred embodiment of the present invention embeds a set of tightly coupled row registers, usable for a static RAM function, in a high density DRAM, preferably on the very same chip as the DRAM array (or subarrays). Preferably, the row registers are located within or alongside the DRAM array, and if the DRAM is configured with subarrays, then multiple sets of row registers are provided for the multiple subarrays, preferably one set of row registers for each subarray. Preferably the row registers are oriented parallel to DRAM rows (word lines), orthogonal to DRAM columns (bit lines). The row registers operate at high speed relative to the DRAM. Preferably the number of registers is smaller than the number of bit lines in the corresponding array or subarray. In the preferred embodiment, one row register corresponds to two DRAM bit line pairs, but in other applications, one register could be made to correspond to another number of DRAM bit line pairs. Preferably selection circuitry is included to select which of the several bit line pairs will be coupled (or decoupled) from the corresponding row register.
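
The relationship between row registers and bit line pairs in the preferred embodiment (one register per two bit line pairs, with selection circuitry choosing which pair is coupled) can be sketched as below. Which physical pairs share a register is not stated in this passage; the assumption here is that adjacent pairs share one, and the function and parameter names are hypothetical.

```python
def load_registers(bit_line_pairs, select):
    """Load row registers from a sensed row.

    bit_line_pairs: sensed data values, one per bit line pair.
    select: 0 or 1, choosing which pair of each adjacent group of
            two is coupled to the shared register (assumed grouping).
    """
    if len(bit_line_pairs) % 2 != 0:
        raise ValueError("expected an even number of bit line pairs")
    if select not in (0, 1):
        raise ValueError("select must be 0 or 1")
    # One register per two pairs, so half as many registers as pairs.
    return [bit_line_pairs[i + select]
            for i in range(0, len(bit_line_pairs), 2)]


row = [1, 0, 0, 1, 1, 1]
print(load_registers(row, 0))  # [1, 0, 1]
print(load_registers(row, 1))  # [0, 1, 1]
```

Six bit line pairs feed three registers, and the select value decides which half of the row the registers capture.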

Preferably the row registers are directly mapped, i.e., a one-way associative approach is preferred. Preferably the configuration permits extremely fast loading of the row registers by connecting DRAM bit lines to the registers via pass gates which selectively couple and decouple bit lines (bit line pairs) to the corresponding row registers. Thus, once the bit line pairs to be given access to the row registers have been selected, the sense amplifiers, for example, drive the bit lines to the voltages corresponding to the data states stored in a decoded row of DRAM cells, and this data is loaded quickly into the row registers. Thus, a feature of the present invention is a very quick cache fill.

The fast fill from the DRAM to the row registers provides a very substantial advantage. In the case of a read miss, mentioned below, a parallel load to the row registers is executed. Thereafter, each read from the same row is a read hit, which is executed at SRAM speeds rather than DRAM speeds.
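
The effect of the parallel load can be sketched in software as follows. This is a toy model under stated assumptions, not the disclosed circuit: the class name RowCache, its attributes, and the use of Python lists for the DRAM array and row registers are hypothetical. It shows one fill on the first read of a row and no further fills while reads stay in that row.

```python
class RowCache:
    """Toy model of one DRAM array with a single set of row registers."""

    def __init__(self, rows):
        self.dram = [list(r) for r in rows]  # DRAM array contents
        self.registers = None                # row register contents
        self.loaded_row = None               # DRAM row currently held
        self.fills = 0                       # parallel loads performed so far

    def read(self, row, col):
        if row != self.loaded_row:
            # Read miss: the whole row is copied in one step, standing in
            # for the parallel load through the pass gates.
            self.registers = list(self.dram[row])
            self.loaded_row = row
            self.fills += 1
        # Hit or miss, the data is taken from the row registers.
        return self.registers[col]


cache = RowCache([[1, 2, 3, 4], [5, 6, 7, 8]])
burst = [cache.read(0, c) for c in range(4)]  # one miss, then three hits
print(burst, cache.fills)
```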

Preferably the row registers are connected to a unidirectional output (read) port, and preferably this is a high impedance arrangement. That is, in the preferred embodiment, the registers are not connected to the source-drain path of the read port transistors, but instead they are connected to gate electrodes thereof. This leads to improvements in size and power.

The DRAM bit lines are preferably connected to a unidirectional input (write) port. In a circuit according to some aspects of the invention, the row registers can be decoupled from the DRAM bit lines and data can still be input to the DRAM bit lines via the write port. Moreover, even when the row registers are decoupled from the DRAM bit lines, data can be read from the row registers.
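
A minimal software sketch of the separate read and write ports is given below. It is an assumption-laden illustration only: the class DecoupledPorts, the coupled flag standing in for the pass gates, and the choice to let a write reach the registers while coupled are all hypothetical and are not stated in this passage.

```python
class DecoupledPorts:
    """Toy model of a write port on the bit lines and a read port on the registers."""

    def __init__(self, width):
        self.bit_lines = [0] * width  # state driven onto the DRAM bit lines
        self.registers = [0] * width  # row register contents
        self.coupled = False          # pass gates between bit lines and registers

    def write(self, col, bit):
        """Write port: always drives the DRAM bit line."""
        self.bit_lines[col] = bit
        if self.coupled:
            # Assumption: while coupled, the register follows the bit line.
            self.registers[col] = bit

    def read(self, col):
        """Read port: always reads the row register, never the bit line."""
        return self.registers[col]


ports = DecoupledPorts(4)
ports.registers = [1, 1, 0, 0]  # pretend a row was loaded earlier
ports.write(2, 1)               # registers decoupled: only the bit line changes
print(ports.bit_lines, ports.read(2))
```

With the pass gates open, the write lands on the bit line while the read port continues to return the register contents, which is the independence the paragraph above describes.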

Preferably both the read and write ports operate off one decoder.

The configuration of an integrated circuit memory according to a related aspect of the invention will not require an input/output data bus connected to the sense amplifiers, since each DRAM subarray will be located between its corresponding set of row registers and the DRAM subarray's corresponding set of sense amplifiers, and since the data input and output functions are executed on the row register side.

In addition to including row registers, preferably in a directly mapped configuration, a circuit using the present invention preferably integrates simple, fast control circuitry for the cache (registers). Hence the integrated circuit memory device preferably contains on-chip address compare circuitry, including at least one "last read row" address latch and an address comparator. Where multiple subarrays are used, multiple sets of row registers are used, each having a respective "last read row" and thus a respective "last read row" register. Address and data latches, a refresh counter, and various logic for controlling the integrated circuit memory device also are preferably included on the chip.
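
The address compare function with one "last read row" latch per subarray can be sketched as follows. The class AddressCompare and its method names are hypothetical, and how an incoming address is divided into subarray and row fields is not specified here, so the sketch takes those two values as already separated.

```python
class AddressCompare:
    """Toy model: one "last read row" latch per subarray plus a comparator."""

    def __init__(self, num_subarrays):
        # None means no row has yet been read into that subarray's registers.
        self.last_read_row = [None] * num_subarrays

    def is_hit(self, subarray, row):
        """Comparator: does the row match the latch for this subarray?"""
        return self.last_read_row[subarray] == row

    def latch(self, subarray, row):
        """Record the row just loaded into this subarray's row registers."""
        self.last_read_row[subarray] = row


compare = AddressCompare(num_subarrays=4)
first = compare.is_hit(0, 17)   # nothing latched yet
compare.latch(0, 17)
second = compare.is_hit(0, 17)  # same subarray, same row
other = compare.is_hit(1, 17)   # same row number, different subarray
print(first, second, other)
```

Because each subarray keeps its own latch, the same row number in a different subarray does not count as a match.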

Memory reads preferably always occur from the row registers. When an address is received by the memory device, the address comparator determines whether that address corresponds to an address of the row that was last read into the associated row register. When the address comparator detects a match ("hit"), only the row register is accessed, and the data stored there is available from the addressed column at SRAM speeds. Subsequent reads within the row (burst reads, local instructions or data) will continue at that same high speed.

When a read "miss" is detected, the DRAM main memory is addressed and the addressed data is written into the row register. In the event of such a "miss," the first bit of data is available at the output at a slightly slower speed than for a hit. Subsequent bits read from the row register will have the same extremely fast access as for a hit.
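
The difference between hit and miss access can be illustrated with a simple cost model. The constants T_HIT and T_MISS below are arbitrary placeholder units chosen for illustration, not timings from the specification, and the function burst_time is hypothetical. The sketch assumes a single set of row registers and charges the miss cost only when the requested row differs from the row last loaded.

```python
T_HIT = 1   # placeholder cost of a read served by the row register
T_MISS = 3  # placeholder cost of a read that first reloads the row register


def burst_time(row_addresses):
    """Total cost, in placeholder units, of reading the given rows in order."""
    loaded_row = None
    total = 0
    for row in row_addresses:
        if row == loaded_row:
            total += T_HIT       # hit: register already holds this row
        else:
            total += T_MISS      # miss: DRAM accessed, register reloaded
            loaded_row = row
    return total


same_row = burst_time([5, 5, 5, 5])     # one miss followed by three hits
alternating = burst_time([5, 6, 5, 6])  # every access reloads the register
print(same_row, alternating)
```

Under these placeholder costs, a burst confined to one row pays the miss penalty once, while a pattern that alternates between rows pays it on every access.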

Since the data corresponding to the received address is read from the row register in both cases, and since according to another aspect of the