In a computer RAM memory system, the memory is subjected to a self test operation during which data is written to and read out from each address location of the memory. The data read out is compared with the written data to detect errors, and the number of errors at each bit position is counted. When the number of errors in a bit position exceeds a selected threshold, the corresponding DRAM is replaced by a spare DRAM. When the self test detects two or more errors in the same double word, the DRAM corresponding to the bit position having the highest error count is replaced with a spare DRAM. The memory is periodically scrubbed, and errors detected during the scrubbing operation are counted for each bit position. At the end of the scrubbing of a chip row, the DRAMs corresponding to bit positions at which the error counts exceed a selected threshold are replaced with spare DRAMs. When a multiple bit error in a double word is detected during scrubbing, the corresponding double word is tagged.
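The per-bit error counting and sparing decision described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the 64-bit word width, and the reading that "highest error count" means the highest count among the failing bit positions of the affected double word are all assumptions.

```python
from collections import Counter


def count_bit_errors(written, read_back, width=64):
    """Compare written and read-back words; count errors per bit position.

    Returns the per-bit counts and, for every word with two or more bad
    bits, the list of bit positions that failed in that word.
    The width of 64 bits is an assumption made for this sketch.
    """
    mask = (1 << width) - 1
    counts = Counter()
    multi_bit_words = []
    for w, r in zip(written, read_back):
        diff = (w ^ r) & mask  # set bits mark mismatching positions
        bad_bits = [b for b in range(width) if (diff >> b) & 1]
        counts.update(bad_bits)
        if len(bad_bits) >= 2:
            multi_bit_words.append(bad_bits)
    return counts, multi_bit_words


def positions_to_spare(counts, multi_bit_words, threshold):
    """Pick bit positions whose DRAM would be replaced by a spare."""
    # Rule 1: the error count at a bit position exceeds the threshold.
    spare = {bit for bit, n in counts.items() if n > threshold}
    # Rule 2 (assumed interpretation): for each multi-bit word, spare the
    # failing position with the highest overall error count.
    for bad_bits in multi_bit_words:
        spare.add(max(bad_bits, key=lambda b: counts[b]))
    return sorted(spare)


written = [0b0000] * 5
read_back = [0b0001, 0b0001, 0b0001, 0b0100, 0b0110]
counts, multi = count_bit_errors(written, read_back)
spares = positions_to_spare(counts, multi, threshold=2)
```

Here bit 0 is selected because its count exceeds the threshold, and bit 2 is selected because it has the higher count of the two bits that failed together in the last word.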
FIELD OF THE INVENTION
This invention relates to computer RAM memory systems and, more particularly, to improved sparing and scrubbing operations in a computer RAM memory system.
A method for testing the memory in a system with two or more processing units is provided that generally involves the following acts. The memory is divided into two or more sections, one for each of the processing units, so that each processing unit has an associated memory section. The memory is then checked, with each memory section being checked by its associated processing unit. The act of checking the memory includes causing the address of the first encountered faulty location to be stored and causing a flag to be set in response to encountering a second faulty location. Finally, after the memory has been checked, it is determined whether the flag has been set. If so, a walk-through routine is then performed.
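The acts above can be sketched in a small simulation. Everything here is hypothetical scaffolding: threads stand in for processing units, `SimulatedMemory` models faulty cells that always read back zero, and `walk_through` is assumed to be a sequential pass that lists every faulty address, since the abstract does not define the routine.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SimulatedMemory:
    """Toy memory: cells listed in `faulty` always read back 0."""

    def __init__(self, size, faulty=()):
        self.cells = [0] * size
        self.faulty = set(faulty)

    def write(self, address, value):
        self.cells[address] = value

    def read(self, address):
        return 0 if address in self.faulty else self.cells[address]


class FaultLog:
    """Stores the first faulty address; sets a flag on any later fault."""

    def __init__(self):
        self._lock = threading.Lock()
        self.first_faulty_address = None
        self.multiple_faults = False

    def report(self, address):
        with self._lock:
            if self.first_faulty_address is None:
                self.first_faulty_address = address
            else:
                self.multiple_faults = True


def check_section(memory, start, end, log, pattern=0xA5):
    """One processing unit checks its own section [start, end)."""
    for address in range(start, end):
        memory.write(address, pattern)
        if memory.read(address) != pattern:
            log.report(address)


def run_memory_check(memory, size, units):
    """Divide memory into one section per unit and check them in parallel."""
    log = FaultLog()
    step = -(-size // units)  # ceiling division
    with ThreadPoolExecutor(max_workers=units) as pool:
        futures = [
            pool.submit(check_section, memory, s, min(s + step, size), log)
            for s in range(0, size, step)
        ]
        for future in futures:
            future.result()  # re-raise any worker exception
    return log


def walk_through(memory, size, pattern=0x5A):
    """Assumed walk-through: sequential pass returning all faulty addresses."""
    bad = []
    for address in range(size):
        memory.write(address, pattern)
        if memory.read(address) != pattern:
            bad.append(address)
    return bad


memory = SimulatedMemory(16, faulty={3, 12})
log = run_memory_check(memory, size=16, units=2)
faulty_addresses = walk_through(memory, 16) if log.multiple_faults else []
```

The fast parallel pass only has to remember one address and one flag, and the slower walk-through runs only when more than one fault was seen.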
Memory is scrubbed by an improved non-linear method that gives scrubbing preference to a central storage region having the characteristics of high-risk read-only memory, such as the CPA region, to prevent the accumulation of temporary data errors. The chip row on which the CPA resides is scrubbed each time the scrubbing of a non-CPA chip row in a PMA completes successfully. After the CPA chip row has been scrubbed, the least recently scrubbed non-CPA chip row is selected for scrubbing next. In a first case, this provides non-linear selection methods for scrubbing the central storage of computer systems that more frequently select ("select" herein encompasses the meaning of "favor") scrub regions having the characteristics of predominantly read-only memory, since those regions are at a higher risk of failure than regions that are frequently written. In a second case, scrub regions having the characteristics of predominantly read-only memory are selected by a second preferred embodiment of the selection method, which uses the detection of faulty data during normal system accesses to central storage to identify other high-risk regions and scrub them before lower-risk regions. In addition, the severity of the detected data error can be used to determine the rate at which scrub commands are sent to the selected region: the higher the severity, the higher the scrub rate.
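The first selection method, in which the CPA chip row is scrubbed after every non-CPA chip row, can be illustrated with a short sketch. The function name `scrub_order` and the use of integer row identifiers are assumptions made for illustration; the queue models "least recently scrubbed" as a simple rotation.

```python
from collections import deque


def scrub_order(chip_rows, cpa_row, steps):
    """Return the first `steps` chip rows scrubbed under the non-linear scheme.

    Non-CPA rows are taken in least-recently-scrubbed order, and the CPA
    row is scrubbed after each of them.
    """
    queue = deque(row for row in chip_rows if row != cpa_row)
    if not queue:
        return [cpa_row] * steps  # only the CPA row exists
    order = []
    while len(order) < steps:
        row = queue.popleft()  # least recently scrubbed non-CPA row
        order.append(row)
        queue.append(row)  # it is now the most recently scrubbed
        if len(order) < steps:
            order.append(cpa_row)  # favor the high-risk row
    return order


order = scrub_order([0, 1, 2, 3], cpa_row=1, steps=8)
# order == [0, 1, 2, 1, 3, 1, 0, 1]
```

With four chip rows, the CPA row receives half of all scrub passes instead of one quarter, which is the preference the method aims for.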
An example memory scrubbing logic is provided. The logic may be operably connectable to a main memory and a processor. The memory access logic may include a memory for mirroring a main memory location and a logic for scrubbing the main memory location.
Disclosed is a semiconductor memory device capable of arbitrarily setting an upper limit of the number of error corrections during a test operation. The semiconductor memory device has a counter, a register, and a comparison circuit. The counter counts the number of error corrections. The register, when an upper limit setting signal is externally inputted to change the upper limit of the number of error corrections, changes the upper limit. The comparison circuit compares the number of error corrections with the changed upper limit.
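A software model of the counter, register, and comparison circuit might look like the sketch below. The class and method names are hypothetical, and treating the limit as reached when the count is greater than or equal to the upper limit is an assumption, since the abstract states only that the two values are compared.

```python
class CorrectionLimitMonitor:
    """Models a correction counter, a settable limit register, and a comparator."""

    def __init__(self, upper_limit):
        self.upper_limit = upper_limit  # register holding the upper limit
        self.count = 0  # counter of error corrections

    def set_upper_limit(self, new_limit):
        """Models the externally input upper limit setting signal."""
        if new_limit < 0:
            raise ValueError("upper limit must be non-negative")
        self.upper_limit = new_limit

    def record_correction(self):
        """Called once per error correction performed."""
        self.count += 1

    def limit_reached(self):
        """Comparator output (assumed: count >= upper limit)."""
        return self.count >= self.upper_limit


monitor = CorrectionLimitMonitor(upper_limit=4)
for _ in range(3):
    monitor.record_correction()
before = monitor.limit_reached()  # False: 3 corrections, limit 4
monitor.set_upper_limit(2)  # tighten the limit during the test operation
after = monitor.limit_reached()  # True: 3 corrections, limit 2
```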
Systems and methods for improving scrubbing techniques are provided. In one aspect, the error correction code for a memory line is strengthened by reorganizing the memory line into distinct portions and providing an error code set that includes a distinct error code for each portion of the memory line. In another aspect of the invention, the scan rate is effectively increased by moving memory scrubbing functionality into the memory system and distributing it among a number of subcomponents that can operate scrubbing functions in parallel. The effective scan rate increase reduces the probability of failure for any given ECC strength.
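The first aspect, a distinct error code for each portion of a memory line, can be illustrated with a simplified sketch. Note the substitution: a CRC-32 checksum from Python's `zlib` module stands in for the error code, so this example only detects and localizes errors and does not correct them as a real ECC would. The portion count of four and all function names are assumptions.

```python
import zlib


def split_line(line, portions):
    """Split a memory line (bytes) into equal-sized portions."""
    if len(line) % portions:
        raise ValueError("line length must be a multiple of the portion count")
    size = len(line) // portions
    return [line[i:i + size] for i in range(0, len(line), size)]


def encode_line(line, portions=4):
    """Build the code set: one check code per portion (CRC-32 stand-in)."""
    return [zlib.crc32(part) for part in split_line(line, portions)]


def faulty_portions(line, codes):
    """Return the indices of portions whose stored code no longer matches."""
    parts = split_line(line, len(codes))
    return [
        index
        for index, (part, code) in enumerate(zip(parts, codes))
        if zlib.crc32(part) != code
    ]


line = bytes(range(64))
codes = encode_line(line)

corrupted = bytearray(line)
corrupted[20] ^= 0x01  # flip one bit in the second 16-byte portion
damaged = faulty_portions(bytes(corrupted), codes)
```

Because each portion carries its own code, errors in different portions are handled independently, so a line can tolerate more total errors than it could with a single code of the same per-code strength.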