WikiPatents - Community Patent Review
Data processing system with master and slave devices and asymmetric signal swing bus    
United States Patent 6272577
Link to this page: http://www.wikipatents.com/6272577.html
Inventor(s): Leung; Wingyu (Cupertino, CA), Lee; Winston (South San Francisco, CA), Hsu; Fu-Chieh (Saratoga, CA)
Abstract: A memory device which utilizes a plurality of memory modules coupled in parallel to a master I/O module through a single directional asymmetrical signal swing (DASS) bus. This structure provides an I/O scheme having symmetrical swing around half the supply voltage, high throughput, high data bandwidth, short access time, low latency and high noise immunity. The device utilizes improved column access circuitry including an improved address sequencing circuit and a data amplifier within each memory module. A resynchronization circuit allows the device to operate either synchronously or asynchronously using the same pins. Each memory module has independent address and command decoders to enable independent operation so that each memory module is activated by commands on the DASS bus only when a memory access operation is performed within the particular memory module. Redundant memory modules are included to replace defective memory modules, and replacement can be carried out through commands on the DASS bus. The memory device can be configured to simultaneously write a single input data stream to multiple memory modules or to perform high-speed interleaved read and write operations. In one embodiment, multiple memory devices are coupled to a common, high-speed I/O bus without requiring large bus drivers and complex bus receivers in the memory modules.



 
Inventor     Leung; Wingyu (Cupertino, CA), Lee; Winston (South San Francisco, CA), Hsu; Fu-Chieh (Saratoga, CA)
Owner/Assignee     Monolithic System Technology, Inc. (Sunnyvale, CA)
Publication Date     August 7, 2001
Application Number     08/960,951
Filing Date     October 30, 1997
US Classification     710/110 326/121 326/21 326/30 326/80 326/81 326/83 326/86 327/309 710/100 710/107
Examiner     Sheikh; Ayaz
Assistant Examiner     Jean; Frantz B.
Attorney/Law Firm     Klivans; Norman R. Skjerven Morrill MacPherson LLP Holmbeck; Signe M.
Parent Case     This application is a divisional application of U.S. patent application Ser. No. 08/549,610, filed Oct. 27, 1995, now U.S. Pat. No. 5,729,152 issued Mar. 17, 1998, which is a divisional application of U.S. patent application Ser. No. 08/270,856, filed Jul. 5, 1994; now U.S. Pat. No. 5,655,113.
USPTO Field of Search     710/100 710/107 710/110 710/128 326/80 326/81 326/83 326/86 326/121 326/21 326/30 327/309
Patent Tags     data processing master slave devices asymmetric signal swing bus
   
Claims
 


What is claimed is:

1. A data processing system comprising:

a first supply voltage;

a second supply voltage;

a bus;

a plurality of slave devices coupled in parallel to said bus, each of said slave devices having a slave bus transceiver for transmitting and receiving signals on said bus; and

a master device coupled in parallel to said bus, said master device having a master bus transceiver for transmitting and receiving signals on said bus, wherein signals transmitted from said slave bus transceiver to said master bus transceiver vary over a first voltage range which is less than the difference between said first supply voltage and said second supply voltage, and signals transmitted from said master bus transceiver to said slave bus transceiver vary over a second voltage range which is approximately equal to the difference between said first supply voltage and said second supply voltage.

2. The data processing system of claim 1, wherein said bus, said slave devices and said master device are all fabricated on one chip.

3. The data processing system of claim 1, wherein said master bus transceiver further comprises:

a clamping circuit coupled to said bus, wherein said clamping circuit limits the signals on said bus within said first voltage range when said clamping circuit is enabled, and wherein the signals on said bus are limited to said second voltage range when said clamping circuit is disabled;

a bus receiver circuit coupled to said bus;

a bus driver circuit coupled to said bus; and

means for enabling said clamping circuit when said bus receiver circuit is receiving signals from said bus and disabling said clamping circuit when said bus driver circuit is transmitting signals to said bus.

4. The data processing system of claim 3, wherein said clamping circuit, when enabled, limits the signals on said bus to a voltage range of approximately one volt about a voltage equal to one half of the first supply voltage.

5. The data processing system of claim 3, wherein said clamping circuit, when enabled, provides voltages at an output of said bus receiver which directly drive CMOS circuitry in said data processing system.

6. The data processing system of claim 3, wherein said bus receiving circuit comprises an inverter having an input coupled to said bus and an output coupled to an I/O node, and wherein said clamping circuit further comprises:

a first transistor of a first conductivity type, said first transistor having a source coupled to the first supply voltage, a drain coupled to said bus and a gate coupled to a first node;

a second transistor of said first conductivity type, said second transistor having a source coupled to the first supply voltage, a drain coupled to said first node and a gate coupled to said I/O node;

a third transistor of a second conductivity type opposite said first conductivity type, said third transistor having a source coupled to said first node, a drain coupled to said bus and a gate coupled to said I/O node;

a fourth transistor of said second conductivity type, said fourth transistor having a source coupled to said bus, a drain coupled to the second supply voltage and a gate coupled to a second node;

a fifth transistor of said first conductivity type, said fifth transistor having a source coupled to said bus, a drain coupled to said second node and a gate coupled to said I/O node; and

a sixth transistor of said second conductivity type, said sixth transistor having a source coupled to said second node, a drain coupled to said second supply voltage and a gate coupled to said I/O node.

7. The data processing system of claim 6, wherein said bus driver comprises an inverter having an input coupled to said I/O node and an output coupled to said bus.

8. The data processing system of claim 6, wherein said means for enabling and disabling said clamping circuit comprises:

a seventh transistor of said first conductivity type, said seventh transistor having a source coupled to the first supply voltage, a drain coupled to the source of said first transistor and a gate coupled to a control bus;

an inverter having an input and an output, wherein said input is coupled to said control bus;

an eighth transistor of said second conductivity type, said eighth transistor having a source coupled to the drain of said fourth transistor, a drain coupled to the second voltage supply and a gate coupled to the output of said inverter.

9. A data processing system comprising:

a bus, wherein said bus comprises a plurality of bus lines for carrying bi-directional multiplexed address, data and control information;

a plurality of slave devices coupled in parallel to said bus, each of said slave devices having a slave bus transceiver for transmitting and receiving signals on said bus; and

a master device coupled in parallel to said bus, said master device having a master bus transceiver for transmitting and receiving signals on said bus, wherein signals transmitted from said slave bus transceivers to said master bus transceiver vary over a smaller voltage range than signals transmitted from said master bus transceiver to said slave bus transceivers.

10. The data processing system of claim 9, wherein at least one of said bus lines carries a clock signal for synchronization of signal transfer on the bus.

11. The data processing system of claim 10, wherein said bus has at least 16 bus lines for carrying multiplexed address, data and control information.

12. The data processing system of claim 11, wherein said bus also has at least 4 parallel bus lines for carrying control information.

13. The data processing system of claim 10, wherein said address information includes device select information used to select said slave devices, whereby said bus does not require separate device-select lines connected directly to individual slave devices.

14. The data processing system of claim 13, wherein each of said slave devices has at least one modifiable identification register which contains a communication address which identifies each of said slave devices.

15. The data processing system of claim 14, wherein at least one of said slave devices is a memory device having at least one memory array.

16. The data processing system of claim 15, wherein said bus further comprises two or more parallel bus lines for carrying masking information to inhibit writing to certain locations in said memory array during a write operation to said memory device.

17. The data processing system of claim 16, wherein the masking information is transported at both edges of said clock signal.

18. The data processing system of claim 15, wherein the address information comprises a base address of the memory device to be accessed, an array address of a memory array within the memory device to be accessed, and addresses of rows and columns within the memory array to be accessed.

19. The data processing system of claim 18, wherein said identification register of said memory device contains the base address of said memory device, thereby setting the communication address of said memory device equal to the base address of said memory device and allowing said memory device to be accessed using said base address.

20. The data processing system of claim 19, wherein said memory device spans a contiguous memory address space under the base address assigned by its communication address.

21. The data processing system of claim 20, wherein the communication addresses of said memory devices are selected such that said memory devices form a contiguous memory system.

22. The data processing system of claim 10, wherein said plurality of bus lines transport said address, data and control information at both edges of said clock signal.

23. The data processing system of claim 9, wherein one of said bus lines carries a destination clock signal for the synchronization of information transfer from one of said slave devices to said master device and another of said bus lines carries a source clock signal for the synchronization of information transfer from said master device to said slave devices.

24. The data processing system of claim 23, wherein said destination clock signal is driven by said one of said slave devices and said source clock signal is driven by said master device.

25. The data processing system of claim 24, wherein said destination clock signal is driven from the source clock signal through a path substantially matched to a corresponding data signal path in said slave device.

26. The data processing system of claim 9, wherein said master device is an I/O device and said data processing system further comprises an I/O bus connected to said I/O device.

27. The data processing system of claim 26, further comprising a plurality of said data processing systems connected in parallel to said I/O bus.

28. The data processing system of claim 27, wherein said I/O bus comprises a first set of bus lines carrying control information, and a second set of bus lines carrying multiplexed data, address and control information.

29. The data processing system of claim 28, wherein the bus lines carrying multiplexed data, address and control information on said I/O bus correspond with the bus lines carrying multiplexed data, address and control information on said bus.

30. The data processing system of claim 28, wherein said I/O bus further comprises a third set of bus lines carrying a system clock signal and power.

31. The data processing system of claim 30, wherein said second set of bus lines transports information at both edges of said system clock signal.

32. The data processing system of claim 30, wherein said I/O bus and said system clock signal are operated at a reduced CMOS swing voltage.

33. The data processing system of claim 28, wherein said I/O bus further comprises two or more parallel bus lines for carrying masking information to inhibit writing to certain bit locations in said slave devices during a memory write operation.

34. The data processing system of claim 33, wherein said masking information is transported at both edges of said system clock signal.

35. The system of claim 27, further comprising:

a system master device; and

chip select lines connecting said system master device to each of said data processing systems, wherein said chip select lines are used to initialize base addresses of slave devices in said data processing systems.

36. The system of claim 35, wherein said base addresses are selected so that said slave devices form a contiguous memory.

37. The system of claim 35, wherein said base addresses are selected so that said slave devices span at least two non-contiguous areas in an address space.

38. The system of claim 35, wherein said base addresses are modified dynamically during operation of said data processing systems.

39. The system of claim 35, wherein said system master device includes means to modify the base address of at least one of said slave devices in one of said plurality of data processing systems.

40. The system of claim 35, wherein said system master device includes means to modify the control registers of at least one of said slave devices in one of said plurality of data processing systems.

41. The system of claim 35, wherein said system master device includes means to test at least one memory location of one of said slave devices in one of said plurality of data processing systems.

42. The system of claim 35, wherein said system master device includes means to test the memory locations in said slave devices, and to disable at least one of said slave devices which has one or more memory bits that fails the test.

43. The system of claim 42 wherein said system master device further comprises means to set the base addresses of said slave devices which pass the test such that these slave devices form a contiguous memory system.

44. The system of claim 27, wherein said slave devices each comprise a disable register which is modifiable through said I/O bus.

45. The system of claim 26, wherein each of said slave devices has an identification register which can be programmed through bus commands on said I/O bus.

46. A method of processing data in a system comprising a bus, a plurality of slave devices and a master device, said method comprising the steps of:

transmitting signals from said slave devices to said master device on said bus, wherein the voltage on said bus varies within a first range as said signals are transmitted from said slave devices to said master device;

transmitting signals from said master device to said slave devices on said bus, wherein the voltage on said bus varies within a second range as said signals are transmitted from said master device to said slave devices, wherein said second range is larger than said first range; and

providing said data processing system with a first supply voltage and a second supply voltage, wherein said first range is less than the difference between said first and second supply voltages and said second range is approximately equal to the difference between said first and second supply voltages.

47. The method of claim 46, further comprising the step of setting said first range approximately equal to one volt.

48. The method of claim 47, wherein said first range is centered about one half of the first supply voltage.

49. The method of claim 46, further comprising the step of directly controlling CMOS circuitry with said voltage on said bus as said voltage varies within said first range.
Description
 


BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a data processing system having a few bus masters and many bus slaves connected in parallel to a common bus. In particular, this invention relates to low-latency, high-bandwidth, low-power, high-yield, large-capacity memory devices suitable for data processing and video systems. This invention is particularly suitable for systems organized into multiple identical modules in a very-large-scale or wafer-scale integration environment.

2. Description of the Prior Art

When transmitting signals on traditional bus systems, problems typically arise when either of the following conditions exists: (i) the rise or fall time of the transmitted signal is a significant fraction of the bus clock period or (ii) there are reflections on the bus of the signal which interfere with the rising or falling transitions of the signal. The data transfer rate is limited in part by whether signal integrity is compromised as a result of the above conditions. Therefore, to increase data bandwidth, it is desirable to avoid the above-listed conditions.

High frequency data transmission through a bus requires a high rate of electrical charge (Q) transfer on and off the bus to achieve adequate rise and fall times. To avoid condition (i) above, large transistors in the bus drivers are needed to source and sink the large amounts of current required to switch the signal levels. Equation (1) sets forth the relationship between the required current drive capability (I) of the bus drivers, the number of devices (n) attached to the bus, the output capacitance (C) of the bus driver, the signal swing (V) needed to distinguish between logical 1 and 0, and the maximum operating frequency (f) of the bus.
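The body of Equation (1) does not appear in this text. A first-order form consistent with the quantities just defined, assuming the total capacitance to be charged is n times C and the bus must be swung through V once per period 1/f, would be:

```latex
I = n \cdot C \cdot V \cdot f \tag{1}
```

This is offered as a reconstruction under those assumptions, not as a quotation of the patent's equation.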

Thus, one way to obtain a higher operating frequency is to increase the drive capability of the bus driver. However, higher drive usually requires a driver with larger size, which in turn translates to increased silicon area, bus capacitance, power consumption and power supply noise. Furthermore, when the output capacitance of the bus driver becomes a substantial part of the bus capacitance, increasing the size of the bus driver does not result in a higher operating frequency.

Another way to increase the operating frequency is to reduce the signal swing on the bus. Signal swing is defined as the difference between the maximum voltage and the minimum voltage of the signals transmitted on the bus. Many traditional bus systems, including the TTL standard, use reduced-swing signal transmission (i.e., signal swing smaller than the supply voltage), to enable high speed operations. A reduced signal swing reduces the required charge transfer, thereby reducing power consumption, noise and required silicon area. Because reduced signal swing substantially reduces the current required from the bus driver, parallel termination of bus lines is facilitated. Parallel termination is an effective way to suppress ringing in the bus. However, the use of small swing signals requires the use of sophisticated amplifiers to receive the signals. As the signal swing decreases, the required gain of the amplifier increases, thereby increasing the required silicon area and operating power. It would therefore be desirable to have a bus system which utilizes small swing signals, but does not require the use of sophisticated amplifiers.
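As a rough illustration of why a reduced swing lowers the drive requirement, the sketch below assumes the first-order relation I = n * C * V * f among the quantities listed for Equation (1). That relation is an assumption here, and the function name and all numeric values are hypothetical.

```python
def required_drive_current(n_devices, c_out_farads, swing_volts, freq_hz):
    """First-order estimate I = n * C * V * f (assumed form, not quoted from the patent)."""
    return n_devices * c_out_farads * swing_volts * freq_hz


# Hypothetical example values: 16 devices, 5 pF per driver output, 100 MHz bus.
full_swing = required_drive_current(16, 5e-12, 5.0, 100e6)   # full 5 V swing
small_swing = required_drive_current(16, 5e-12, 1.0, 100e6)  # reduced 1 V swing

print(f"full swing:    {full_swing * 1e3:.1f} mA")
print(f"reduced swing: {small_swing * 1e3:.1f} mA")
```

Under this assumed relation the required current scales linearly with the swing, so cutting the swing by a factor of five cuts the drive current by the same factor.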

Prior art small swing (less than 1.5 V peak-to-peak) I/O (input/output) schemes generally have a logic threshold voltage different from V_dd/2 (i.e., one-half of the supply voltage), the logic threshold of a conventional CMOS logic circuit. The logic threshold, or trip point, of a bus signal is the voltage level which delineates a logical 1 from a logical 0. An example of such a scheme is GTL, where a logic threshold of 0.8 volt is used. (R. Foss et al, IEEE Spectrum October 1992, p.54-57, "Fast interfaces for DRAMs"). Other small swing I/O schemes, such as center-tap terminated (CTT) Interface (JEDEC Standard, JESD8-4, November 1993), have a fixed threshold (e.g., 1.5 volts) which does not track with the supply voltage. To use a bus signal having a logic threshold other than the CMOS logic threshold in a CMOS integrated circuit, a translator circuit must be used to translate the I/O logic threshold to the conventional CMOS logic threshold. These translators consume circuit real estate and power, introduce additional circuit delay and increase circuit complexity.
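The mismatch between a fixed I/O threshold and the CMOS threshold can be tabulated with a short sketch. The 0.8 V and 1.5 V thresholds are the values quoted above. The supply voltages in the loop and the function names are hypothetical choices made for illustration.

```python
def cmos_threshold(vdd):
    """Nominal CMOS logic threshold: one half of the supply voltage."""
    return vdd / 2.0


# Fixed thresholds quoted in the text for the two prior-art schemes.
FIXED_THRESHOLDS = {"GTL": 0.8, "CTT": 1.5}


def threshold_offsets(vdd):
    """Offset in volts between each fixed I/O threshold and Vdd/2."""
    return {name: vth - cmos_threshold(vdd) for name, vth in FIXED_THRESHOLDS.items()}


for vdd in (3.0, 3.3, 5.0):  # hypothetical supply voltages
    print(vdd, threshold_offsets(vdd))
```

A nonzero offset at a given supply is the gap a translator circuit would have to bridge. A threshold that is defined as half the supply has no such gap at any supply voltage.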

CMOS circuitry uses a logic threshold of V_dd/2 to permit the CMOS circuitry to operate with symmetrical noise margins with respect to the power and ground supply voltages. This logic threshold also results in symmetrical inverter output rise and fall times as the pull-up and pull-down drive capabilities are set to be approximately equal.

Traditional DRAM devices (IC's) are organized into arrays having relatively small capacities. For example, most commercial 1M bit and 4M bit DRAM devices have an array size of 256K bit. This organization is dictated by the bit-line sense voltage and word line (RAS) access time. However, all arrays inside a DRAM device share a common address decoding circuit. The arrays in DRAM devices are not organized as memory modules connected in parallel to a common bus. Furthermore, each memory access requires the activation of a substantial number (e.g., one quarter to one half) of the total number of arrays, even though most of the activated arrays are not accessed. As a result, power is wasted and the soft-error rate due to supply noise is increased.

Prior art DRAM schemes, such as Synchronous DRAM (JEDEC Standard, Configurations For Solid State Memories, No. 21-C, Release 4, November 1993) and Rambus DRAM (See, PCT Patent document PCT/US91/02590) have attempted to organize the memory devices into banks. In the synchronous DRAM scheme, the JEDEC Standard allows only one bit for each bank address, thereby implying that only two banks are allowed per memory device. If traditional DRAM constraints on the design are assumed, the banks are formed by multiple memory arrays. The Rambus DRAM scheme has a two bank organization in which each bank is formed by multiple memory arrays. In both schemes, due to the large size of the banks, bank-level redundancy is not possible. Furthermore, power dissipation in devices built with either scheme is at best equal to traditional DRAM devices. Additionally, because of the previously defined limitations, neither the Synchronous DRAM scheme nor the Rambus DRAM scheme uses a modular bank architecture in which the banks are connected in parallel to a common internal bus.

Many prior art memory systems use circuit-module architecture in which the memory arrays are organized into modules and the modules are connected together with either serial buses or dedicated lines. (See, PCT patent document PCT/GB86/00401, M. Brent, "Control System For Chained Circuit Modules" [serial buses]; and K. Yamashita, S. Ikehara, M. Nagashima, and T. Tatematsu, "Evaluation of Defect-Tolerance Scheme in a 600M-bit Wafer-Scale Memory", Proceedings on International Conference on Wafer Scale Integration, January 1991, pp. 12-18 [dedicated lines].) In neither case are the circuit modules connected in parallel to a common bus.

Prior art memory devices having a high I/O data bandwidth typically use several memory arrays simultaneously to handle the high bandwidth requirement. This is because the individual memory arrays in these devices have a much lower bandwidth capability than the I/O requirement. Examples of such prior art schemes include those described by K. Dosaka et al, "A 100-MHz 4-Mb Cache DRAM with Fast Copy-Back Scheme", IEEE Journal of Solid-State Circuits, Vol. 27, No. 11, November 1992, pp. 1534-1539; and M. Farmwald et al, PCT Patent document PCT/US91/02590.

Traditional memory devices can operate either synchronously or asynchronously, but not both. Synchronous memories are usually used in systems requiring a high data rate. To meet the high data rate requirement, synchronous memory devices are usually heavily pipelined. (See, e.g., the scheme described in "250 Mbyte/s Synchronous DRAM Using a 3-Stage-Pipelined Architecture", Y. Takai et al, IEEE JSSC, vol. 29, no. 4, April, 1994, pp. 426-431.) The pipelined architecture disclosed in Y. Takai et al causes the access latency to be fixed at 3 clock cycles at all clock frequencies, thereby making this synchronous memory device unsuitable for systems using lower clock frequencies. For example, when operating at 50 MHz the device has an access latency of 60 ns (compared to an access latency of 24 ns when operating at 125 MHz).
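The latency figures follow directly from a fixed pipeline depth of 3 clock cycles: latency equals the number of cycles divided by the clock frequency. A minimal sketch of that arithmetic (the function and constant names are illustrative only):

```python
def access_latency_ns(pipeline_cycles, clock_hz):
    """Access latency in nanoseconds for a fixed pipeline depth."""
    return pipeline_cycles / clock_hz * 1e9


PIPELINE_CYCLES = 3  # fixed latency of the 3-stage pipeline described in the text

for clock_mhz in (50, 125):
    latency = access_latency_ns(PIPELINE_CYCLES, clock_mhz * 1e6)
    print(f"{clock_mhz} MHz: {latency:.0f} ns")
# prints "50 MHz: 60 ns" and "125 MHz: 24 ns"
```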

Conventional asynchronous memory devices, due to the lack of a pipeline register, maintain a fixed access latency at all operating frequencies. However, the access cycle time can seldom be substantially smaller than the access latency. Consequently, asynchronous devices are unsuitable for high data rate applications.

Thus, it would be desirable to have a memory device which provides a high-throughput, low-latency, high-noise-immunity I/O scheme having a symmetrical swing around one half of the supply voltage.

It would also be desirable to have a memory device which can be accessed both synchronously and asynchronously using the same set of connection pins.

Moreover, it would be desirable to have a memory device which provides a high data bandwidth and a short access time.

It would also be desirable to have a memory device which is organized into small memory arrays, wherein only one array is activated for each normal memory access, whereby the memory device has low power dissipation.

Additionally, it would be desirable to have a memory device having small, functionally independent modules, in which a defective module can be disabled and replaced by another module, resulting in a memory device having a high defect tolerance.

It would also be desirable to have a memory device in which a single input data stream can be simultaneously written to multiple memory arrays and in which data streams from multiple memory arrays can be multiplexed to form a single output data stream.

Furthermore, it would be desirable to have a memory device in which many memory modules are attached to a high-speed common bus without the necessity of large bus drivers and complex bus receivers in the modules.

SUMMARY OF THE INVENTION

The present invention implements a compact, high speed reduced CMOS swing I/O scheme which uses V_dd/2 as the logic threshold. This scheme has the following advantages: (i) The logic threshold tracks with supply voltages, thereby maintaining balance of pull-up and pull-down. (ii) The bus driver and receiver circuits work at a very wide range of supply voltages without sacrificing noise immunity, since the thresholds of the bus driver and receiver circuits track with each other automatically. (iii) The logic threshold is implicit in the logic circuit and does not require an explicit reference generator circuit. (iv) Logic threshold translation is not necessary since the I/O logic threshold is identical to that of the other logic circuitry on-chip.

The present invention groups at least two memory arrays or banks into a memory module and connects all the memory modules in parallel to a common high-speed, directional asymmetrical signal swing (DASS) bus, thereby forming a memory device. The memory modules transmit signals having a reduced swing to a master module coupled to the DASS bus. In one embodiment, this reduced swing is equal to approximately one volt about a center voltage of V_dd/2, where V_dd/2 is the threshold voltage of CMOS circuitry. The signal transmitted from the master device to the memory modules has a full V_dd swing.
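The two signalling directions can be summarized numerically. The sketch below assumes that the lower supply rail is 0 V and that "approximately one volt about a center voltage" means a one-volt peak-to-peak range centered on half the supply. Both are interpretations made for illustration, and the function name is hypothetical.

```python
def dass_voltage_ranges(vdd, reduced_swing=1.0):
    """Return (low, high) bus voltage ranges for each transfer direction.

    Assumes a 0 V lower rail and a reduced swing centered on Vdd/2.
    """
    center = vdd / 2.0
    half = reduced_swing / 2.0
    return {
        "slave_to_master": (center - half, center + half),  # reduced swing
        "master_to_slave": (0.0, vdd),                       # full supply swing
    }


ranges = dass_voltage_ranges(5.0)  # hypothetical 5 V supply
print(ranges["slave_to_master"])
print(ranges["master_to_slave"])
```

Because both ranges are centered on half the supply, a receiver with the ordinary CMOS threshold can resolve either direction without a separate reference voltage.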

The memory modules are equipped with independent address and command decoders so that they function as independent units, each with its own base address. This circuit-module architecture has several advantages: (i) It allows each memory module to replace any other memory module, thereby increasing the defect tolerance of the memory device. (ii) It significantly reduces power consumption of the memory device when compared to traditional memory devices, because each memory access is handled completely by a single memory module with only one of its arrays activated. (iii) Since each memory module is a complete functional unit, the memory module architecture allows parallel accesses and multiple memory module operations to be performed within different memory modules, thereby increasing the performance of the memory device. (iv) The memory module architecture allows the memory device to handle multiple memory accesses at the same time.

The circuit-module architecture of the present invention further allows easy system expansion by connecting multiple memory devices in parallel through a common I/O bus which is an extension of the on chip bus. In addition, by incorporating redundant memory modules on each memory device and allowing each memory module to have a programmable communication address on the I/O bus system, the resulting memory system has defect tolerance capability which is better than each individual memory device.

In one embodiment of the present invention, the memory arrays include redundant rows and columns. Circuitry is provided within the memory modules to support the testing of these redundant rows and columns. Circuitry is also provided to replace defective rows and columns with the redundant rows and columns during operation of the memory device.

The memory devices in accordance with the present invention are able to span address spaces which are not contiguous by controlling the communication addresses of the memory modules. Furthermore, the address space spanned by the memory devices can be dynamically modified both in location and size. This is made possible by the incorporation, in each memory module, of a programmable identification (ID) register which contains the base address of the memory module and a mechanism which decommissions the module from acting on certain memory access commands from the bus. The present invention therefore provides for a memory device with dynamically reconfigurable address space. Dynamically reconfigurable address space is especially useful in virtual memory systems in which a very large logical address space is provided to user programs and the logical addresses occupied by the programs are dynamically mapped to a much smaller physical memory space during program execution.

Each memory array in the present design is equipped with its own row and column address decoders and a special address sequencer which automatically increments address of the column to be accessed. Each memory array has data amplifiers which amplify the signals read from the memory array before the signals are transmitted to the lines of the DASS bus. Both the address sequencer and data amplifiers increase the signal bandwidth of the memory array. Consequently, each memory array is capable of handling the I/O data bandwidth requirement by itself. This capability makes multiple bank operations such as broadcast-write and interleaved-access possible. For example, a memory device in accordance with the present invention is able to handle a broadcast-write bandwidth of over 36 gigabytes per second and 36 memory operations simultaneously.

Memory devices in accordance with the present invention can be accessed both synchronously and asynchronously using the same set of connection pins. This is achieved using the following techniques: (i) using a self-timed control in connection with the previously described circuit-module architecture; (ii) connecting memory modules in parallel to an on-chip bus which uses source synchronous clocking; (iii) using a half clock-cycle (single clock-transition) command protocol; and (iv) using an on-chip resynchronization technique. This results in memory devices that have short access latency (about 10 ns) and high data bandwidth (1 gigabyte/sec).

Another embodiment of the present invention provides for the termination of bus lines. In one embodiment, a passive clamp for a bus line is created by connecting a first resistor between the bus line and a first supply voltage and connecting a second resistor between the bus line and a second supply voltage. In one embodiment, the first supply voltage is V.sub.dd, the second supply voltage is ground, and the first and second resistor have the same resistance.
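
The resting level set by such a passive clamp follows from the ordinary voltage-divider relation. The following Python sketch is illustrative only: the function name, the 3.0 volt supply and the 100 ohm resistor values are assumptions chosen for the example and do not appear in this description.

```python
def passive_clamp(vdd: float, r_to_vdd: float, r_to_gnd: float):
    """Return (open-circuit bus voltage, Thevenin resistance) of a two-resistor clamp.

    r_to_vdd is the first resistor (bus line to the first supply voltage),
    r_to_gnd is the second resistor (bus line to ground).
    """
    total = r_to_vdd + r_to_gnd
    v_open = vdd * r_to_gnd / total        # voltage-divider relation
    r_thevenin = r_to_vdd * r_to_gnd / total  # the two resistors in parallel
    return v_open, r_thevenin

# Equal resistors place the undriven bus line at half the supply voltage.
v_open, r_thevenin = passive_clamp(vdd=3.0, r_to_vdd=100.0, r_to_gnd=100.0)
print(v_open, r_thevenin)
```

With equal resistors the undriven line rests at V.sub.dd /2, which matches the logic threshold used by the reduced CMOS swing scheme.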

In an alternate embodiment, an active clamp for a bus line is created by connecting a p-channel transistor between the bus line and a first supply voltage and connecting an n-channel transistor between the bus line and a second supply voltage. The gates of the p-channel and n-channel transistors are driven in response to the bus line.

The present invention will be more fully understood in view of the following drawings taken together with the detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a memory device with a circuit-module architecture organized around a DASS bus;

FIG. 2a is a waveform diagram illustrating timing waveforms for asynchronous operations;

FIG. 2b is a waveform diagram illustrating timing waveforms for synchronous operations;

FIG. 3a is a schematic diagram of DASS bus transceivers;

FIG. 3b is a schematic diagram illustrating details of one of the bus transceivers shown in FIG. 3a;

FIG. 4 is a block diagram of a memory module in accordance with the present invention;

FIG. 5a is a block diagram of a memory array containing redundant rows and columns;

FIG. 5b is a schematic diagram of a circuit facilitating in-system testing and repair using redundant rows and columns;

FIG. 6 is a block diagram illustrating a data path in a column area of a conventional DRAM device;

FIG. 7 is a block diagram illustrating routing of column address and data lines in a conventional 4 M-bit DRAM device;

FIG. 8 is a block diagram illustrating column circuitry in accordance with one embodiment of the present invention;

FIG. 9 is a schematic diagram of column circuitry in accordance with one embodiment of the present invention;

FIG. 10 is a block diagram of a conventional address sequencing scheme;

FIG. 11a is a block diagram of an address sequencing scheme in accordance with the present invention;

FIG. 11b is a block diagram of one embodiment of the barrel shifter of FIG. 11a;

FIG. 11c is a schematic diagram of one of the flip-flops of the barrel shifter of FIG. 11b;

FIG. 12 is a block diagram of a resynchronization circuit in accordance with the present invention;

FIG. 13 is a schematic diagram of one embodiment of the FIFO of FIG. 12;

FIG. 14a is a schematic diagram of one embodiment of the latency counter of FIG. 12;

FIG. 14b is a schematic diagram of a latch used in the latency counter of FIG. 14a;

FIG. 15 is a waveform diagram illustrating timing waveforms of the resynchronization circuit of FIG. 12 when the device is operating synchronously;

FIG. 16 is a waveform diagram illustrating timing waveforms of the resynchronization circuit of FIG. 12 when the device is operating asynchronously;

FIG. 17 is a block diagram of a memory device configured for broadcast-write operation;

FIG. 18 is a waveform diagram illustrating sequencing of an interleaved access operation;

FIG. 19 is a block diagram of a memory system which includes a memory controller and multiple circuit-module memory devices connected in parallel through an I/O bus;

FIG. 20a is a schematic diagram of a reduced CMOS swing bus transceiver with active termination; and

FIG. 20b is a schematic diagram of a reduced CMOS swing bus transceiver with resistive termination.

DETAILED DESCRIPTION OF THE INVENTION

Conventional bus systems make no distinction in signal amplitude (swing) with respect to the direction of signal transfer across the bus. The signal swing transmitted from one end of the bus is identical to that of a signal sent from the other direction. In a bus system where there are substantially more slaves than masters, bus capacitance is dominated by the bus drivers of communicating devices. This is especially true in a semiconductor (integrated circuit) environment where the bus and the communicating devices are on the same chip.

Communication from masters to slaves is predominantly one-to-many (broadcast), and communication from slaves to masters is one-to-one (dedicated). Using a small bus swing when slaves communicate to the masters allows the bus driver of the slave device to be small. Reducing the slave bus driver size effectively reduces the bus capacitance, thereby facilitating low power, high speed operation. The cost of incorporating amplifiers in the bus receivers of the masters is relatively small because the number of masters is small. Using a large signal swing when masters communicate to the slaves avoids the high cost of amplifier circuits in the receivers of the slaves. Since the number of masters is small, using relatively large bus drivers in the masters does not increase the bus capacitance substantially and thus has little effect on the bus operating frequency.

DASS Bus Structure and Protocol

FIG. 1 is a block diagram of a memory device 100 which utilizes a directional asymmetrical signal swing (DASS) bus 102 to couple master I/O module 104 and slave memory modules 111-128 in parallel. Although the present invention is described in connection with an embodiment having eighteen slave memory modules, it is understood that other numbers of modules can be used. Master I/O module 104 has one side connected to DASS bus 102 and another side connected to I/O bus 106. Slave memory modules 111-128 contain arrays of dynamic random access memory (DRAM).

In one embodiment, DASS bus 102 has 16 bi-directional lines ADQ[15:0] for multiplexed address, data and control information, 4 lines C[3:0] for control information, 2 lines Dm[1:0] for write-mask information, 1 line for source clock (Sck) information and 1 line for destination clock (Dck) information. When referring to memory modules 111-128, the signals on lines C[3:0], Dm[1:0], and Sck are inputs and the signal on line Dck is an output. No explicit memory module select signal is used. Memory module select information is implicit in the memory address used to access memory modules 111-128.
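
The line inventory above can be tabulated in a short Python sketch. The dictionary name and layout are hypothetical and serve only to summarize the counts and the directions as seen from a memory module.

```python
# Hypothetical tabulation of the DASS bus lines in the embodiment described above.
# Each entry maps a line group to (number of lines, direction seen from a memory module).
DASS_LINES = {
    "ADQ": (16, "bidirectional"),  # multiplexed address, data and control
    "C": (4, "input"),             # control information
    "Dm": (2, "input"),            # write-mask information
    "Sck": (1, "input"),           # source clock
    "Dck": (1, "output"),          # destination clock
}

total_lines = sum(count for count, _ in DASS_LINES.values())
module_inputs = sorted(name for name, (_, d) in DASS_LINES.items() if d == "input")

print(total_lines)    # 24
print(module_inputs)  # ['C', 'Dm', 'Sck']
```

No entry for a module select line appears because, as stated above, module selection is implicit in the memory address.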

All memory transactions are initiated by either I/O module 104 or by devices connected to I/O bus 106. In the former case, I/O module 104 contains a memory controller. In the latter case, I/O module 104 acts as a repeater between I/O bus 106 and DASS bus 102. A memory transaction is initiated with a command. A typical command requires 20 bits of information carried on C[3:0] and ADQ[15:0]. Four bits are used to encode the operation to be performed, and depending on the contents of the four command bits, the remaining sixteen bits can be a combination of the following: base (memory module) address, bank address, row address, column address, command-code extension or control register data. Each command issued is referenced to a particular transition of the clock, in this case, a low-to-high transition. Data is grouped as half-words of 16 bits each. The DASS bus is capable of transferring one half-word at each clock transition (high-to-low or low-to-high), facilitating dual-edge transfer. Essentially, this allows a 32-bit word to be transferred in one clock cycle using a 16-bit data bus.
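
The 20-bit command and the dual-edge data transfer can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the four operation bits are assumed to travel on C[3:0] with the remaining sixteen bits on ADQ[15:0], and the low half-word is assumed to be sent first. The description above does not fix either of those choices.

```python
from typing import Tuple

def pack_command(opcode: int, payload: int) -> Tuple[int, int]:
    """Return the (C[3:0], ADQ[15:0]) values presented in one half clock cycle.

    Assumption: the 4 operation bits ride on C and the other 16 bits on ADQ.
    """
    if not 0 <= opcode < (1 << 4):
        raise ValueError("opcode must fit in 4 bits")
    if not 0 <= payload < (1 << 16):
        raise ValueError("payload must fit in 16 bits")
    return opcode, payload

def split_word(word: int) -> Tuple[int, int]:
    """Split a 32-bit word into two 16-bit half-words, one per clock transition.

    Assumption: the low half-word goes out on the first transition.
    """
    if not 0 <= word < (1 << 32):
        raise ValueError("word must fit in 32 bits")
    return word & 0xFFFF, (word >> 16) & 0xFFFF

def join_halfwords(first: int, second: int) -> int:
    """Rebuild the 32-bit word from the two half-words produced by split_word."""
    return (second << 16) | first

first, second = split_word(0xDEADBEEF)
print(hex(first), hex(second))  # 0xbeef 0xdead
```

Two half-words per clock cycle on sixteen lines give the one 32-bit word per cycle stated above.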

The command protocol accommodates both synchronous and asynchronous bus operations and minimizes both the transfer overhead and the memory access latency. This is accomplished by sending the full operation code and address in half of a clock cycle (minimum time unit on the bus). This minimizes the overhead of command transfer and allows the access latency to be very close to the inherent latency of the memory. If the command takes multiple half clock-cycles, the overhead also translates into access latency as most of the command information has to be received before one of memory modules 111-128 can start the operation. For asynchronous operations, the clock signal functions as a command and data strobe. FIGS. 2a and 2b illustrate the timing of asynchronous and synchronous read operations, respectively. In either case, the command signal is strobed and evaluated on every rising edge of the clk/strobe signal.

During an asynchronous operation (FIG. 2a), the falling edge of the clk/strobe signal does not occur until the access latency of the memory has expired. When the falling edge of the clk/strobe signal occurs, the first half-word is read. After the latency associated with accessing the second half-word has expired, the clk/strobe signal transitions from low to high, thereby reading the second half-word. The latency for the second half-word is shorter than the latency for the first half-word because the address of the second half-word is generated internal to the chip. In the foregoing manner, the memory device is operated in a dual-edge transfer mode.

During synchronous operation (FIG. 2b), the first half-word signal is read during the second falling edge of the clk/strobe signal after the command signal is detected. The memory device is again operated in a dual-edge transfer mode, with the second half-word output occurring during the subsequent rising edge of the clk/strobe signal. Again, the latency for the second half-word is shorter than the latency for the first half-word. More details on the memory operations are discussed below.

Limiting bus commands to one half clock cycle seems to limit the memory address range to 64K. However, by taking advantage of the inherent characteristics of DRAM access, and separating the access into two micro-operations, the whole address does not need to be presented at the same time. The memory access operation will be discussed in detail in the memory-operation section.

DASS Bus Drivers and Receivers

FIG. 3a is a schematic diagram illustrating bus transceiver 302 of slave memory module 111 and bus transceiver 310 of master I/O module 104. FIG. 3b is a schematic diagram of bus transceiver 302 of memory module 111. Bus transceiver 302 includes a bus driver 304 and a bus receiver 306. Bus driver 304 is a conventional CMOS inverter with a PMOS transistor P10 for pull-up and an NMOS transistor N10 for pull-down. Similarly, bus receiver 306 is a conventional CMOS inverter with a PMOS transistor P11 for pull-up and an NMOS transistor N11 for pull-down.

Bus line 308 of DASS bus 102 connects bus transceiver 302 with bus transceiver 310 in I/O module 104. Transceiver 310 includes bus receiver 312, bus driver 314, and clamping circuit 316. Clamping circuit 316 limits the signal swing on bus line 308. Bus receiver 312 includes CMOS inverter 318 and bus driver 314 includes CMOS inverter 314. Clamping circuit 316 includes n-channel field effect transistors N1-N4, p-channel field effect transistors P1-P4 and inverter 321.

Inverter 318 together with clamping circuit 316 form a single stage feedback amplifier which amplifies the signal on bus line 308. The output of inverter 318 has a swing of approximately 0.5 to V.sub.dd -0.5 volt and is used to drive other on-chip CMOS logic.

The operation of DASS bus 102 is dependent upon the bus transceivers 302 and 310. Bus transceivers 302 and 310 dictate operating speed, power dissipation and, to a large extent, the total die area. In accordance with one embodiment of the present invention, I/O module 104 drives DASS bus 102 with a full V.sub.dd (supply voltage) swing. Memory modules 111-128 drive DASS bus 102 with a reduced CMOS swing of approximately 1 Volt centered around V.sub.dd /2.

Bus receiver 312 operates in the following manner. When I/O module 104 is receiving and memory module 111 is driving, a logic low signal is provided to clamp circuit 316 on lead 320. As a result, transistors P4 and N4 are turned on and clamp circuit 316 is enabled. When the Read_data voltage at the input of inverter 304 is at ground, the output of inverter 318 is at a voltage close to ground, transistor P3 is on, transistor N3 is off, transistor P2 is on, transistor N2 is off, transistor N1 is on, and transistor P1 is off. Transistors N1 and N4 provide a conducting path from bus line 308 to ground, thereby preventing the signal on bus line 308 from going to V.sub.dd and clamping the voltage on bus line 308 at a voltage of approximately V.sub.dd /2+0.5 Volt.

When the Read_data voltage at the input of inverter 304 switches from ground to V.sub.dd, transistor P10 (FIG. 3b) turns off and transistor N10 turns on, thereby pulling bus line 308 towards ground. Transistor N1, still being on, accelerates the pull down on bus line 308 until the logic threshold of inverter 318 is reached. At this time, the output of inverter 318 switches to high, turning transistors N2 and N3 on. In turn, transistor N2 turns off transistor N1 and transistor N3 turns on transistor P1. Transistors P1 and P4 provide a conducting path between bus line 308 and V.sub.dd, thereby clamping the signal on bus line 308 at approximately V.sub.dd /2-0.5 volt.

As the voltage on bus line 308 swings from one logic level to another, clamping does not switch direction until the output of amplifier 318 finishes the logic transition. Clamping circuit 316, before it switches, accelerates the switching of inverter 318. The voltage swing on bus line 308 can be adjusted by changing the size of clamping transistors N1, P1, N4 and P4 or the driver transistors N10 and P10.
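
The nominal levels produced by the clamp can be summarized in a small behavioral sketch in Python. It is not a circuit simulation: the function names are hypothetical, the 3.0 volt supply is an assumed value chosen for illustration, and the receiver is idealized as a comparator at V.sub.dd /2.

```python
import math

VDD = 3.0    # volts; illustrative value only, not specified in this description
SWING = 1.0  # volts; approximate reduced swing on a line driven by a memory module

def bus_level(read_data_high: bool, vdd: float = VDD, swing: float = SWING) -> float:
    """Nominal clamped voltage on bus line 308 while a memory module drives it.

    Driver 304 is an inverter, so Read_data at ground gives the upper level.
    """
    offset = swing / 2
    return vdd / 2 - offset if read_data_high else vdd / 2 + offset

def receiver_logic(bus_voltage: float, vdd: float = VDD) -> bool:
    """Idealized inverter 318: its output is high when the bus is below Vdd/2."""
    return bus_voltage < vdd / 2

upper = bus_level(read_data_high=False)  # Vdd/2 + 0.5 volt
lower = bus_level(read_data_high=True)   # Vdd/2 - 0.5 volt
print(upper, lower, receiver_logic(lower), receiver_logic(upper))
```

Because both levels are defined relative to V.sub.dd /2, the sketch keeps the same one volt separation for any supply value passed in.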

When I/O module 104 is driving and memory module 111 is receiving, a logic high signal is applied to lead 320. Consequently, transistors P4 and N4 are turned off and clamp circuit 316 is disabled. Transistors P4 and N4 have channel widths (sizes) two times larger than the channel widths of transistors P1 and N1, respectively. When the signal on line 320 is de-asserted, DC current in clamp circuit 316 and inverter 318 is eliminated. As a result, signals transmitted from bus driver 314 to bus receiver 306 on bus line 308 have a full V.sub.dd swing.

Memory Module Organization

The organization of memory module 111 in accordance with one embodiment of the present invention is illustrated in FIG. 4. In this embodiment, memory modules 112-128 are identical to memory module 111. Memory module 111 contains two memory arrays 402a and 402b, each having 256K bits organized as 256 rows and 1024 columns. Memory array 402a includes word line driver and decoder 404a, column decoder 406a, sense amplifier circuitry 408a, and column select and data amplifier circuitry 410a. Similarly, memory array 402b includes word line driver and decoder 404b, column decoder 406b, sense amplifier circuitry 408b, and column select and data amplifier circuitry 410b.

Memory arrays 402a and 402b share a common DASS memory bus interface 412 which connects memory module 111 to DASS bus 102. Bus interface 412 contains command decoding logic, timing control circuitry, address advancing circuitry, and bus drivers and receivers. Bus interface 412 also contains two programmable registers, an identification (ID) register 414 which stores the communication address of memory module 111, and an access-control register 416. ID register 414 includes a module disable bit 420 which can be programmed by a command from DASS bus 102. As described later, module disable bit 420 is dedicated for addressing redundant modules inside the memory device.

Address Mapping

Each memory module 111-128 incorporates a programmable ID register (e.g., ID register 414) which contains the communication address of the respective module. A pre-programmed communication address is assigned to each of memory modules 111-128. The communication address of each memory module 111-128 can be changed during system operation by a command from DASS bus 102. Specifically, an ID write command is transmitted on DASS bus 102 to write the new communication address to the desired ID register.

The complete address to any memory location in any of memory modules 111-128 contains 4 fields. A first field contains a base address which identifies the memory module by communication address. A second field contains an address which identifies the memory array within the memory module. Third and fourth fields contain the addresses which identify the desired row and column, respectively. The outputs of memory modules 111-128 are organized in 32-bit words.
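
The four-field address can be illustrated with a short Python sketch. The field widths and the most-significant-first ordering are assumptions made for the example: seven base bits, one array bit for two arrays, eight row bits for 256 rows, and five column bits for the thirty-two 32-bit words in a 1024-column row. The description above names the fields but does not give this packing.

```python
from typing import NamedTuple

class Address(NamedTuple):
    base: int    # communication address of the memory module
    array: int   # memory array within the module
    row: int     # row within the array
    column: int  # 32-bit word within the row

# Hypothetical field widths in bits, listed in the same order as the fields.
WIDTHS = Address(base=7, array=1, row=8, column=5)

def encode(addr: Address) -> int:
    """Pack the four fields into one integer, base field most significant."""
    value = 0
    for field, width in zip(addr, WIDTHS):
        if not 0 <= field < (1 << width):
            raise ValueError("field does not fit in its width")
        value = (value << width) | field
    return value

def decode(value: int) -> Address:
    """Recover the four fields from an integer produced by encode."""
    fields = []
    for width in reversed(WIDTHS):
        fields.append(value & ((1 << width) - 1))
        value >>= width
    return Address(*reversed(fields))

example = Address(base=3, array=1, row=200, column=17)
print(encode(example), decode(encode(example)))
```

A module would compare only the base field against its ID register, leaving the other three fields for its own array, row and column decoders.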

The programmable base address provides memory modules 111-128 with dynamic address mapping capability by allowing the communication addresses of memory modules 111-128 to be modified during operation of the memory device.

In a system that contains 128 modules of 8K words, if the communication addresses of the memory modules are consecutively assigned, a 4M byte contiguous memory is formed in which seven address bits can be used to address the modules. In another application, a digital system may have distinct address spaces for a CPU (central processing unit) and for a display processor. The two processors can reside on the same bus using the same memory subsystem with some of the memory modules mapped to the CPU address space and the others mapped to the display processor address space.
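
The capacity figure in the first example can be checked with a few lines of Python. The variable names are illustrative; the word size of 32 bits is taken from the address mapping description above.

```python
MODULES = 128
WORDS_PER_MODULE = 8 * 1024  # 8K words per module
BYTES_PER_WORD = 4           # 32-bit words

total_bytes = MODULES * WORDS_PER_MODULE * BYTES_PER_WORD
module_address_bits = (MODULES - 1).bit_length()  # bits needed to number 128 modules

print(total_bytes // (1024 * 1024), "M bytes")  # 4 M bytes
print(module_address_bits, "address bits")       # 7 address bits
```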

Redundancy

In accordance with one embodiment of the present invention, two levels of redundancy are employed in a memory device using the circuit-module architecture described above. The first level of redundancy is memory module redundancy. Thus, in one embodiment, memory module 111 may be used as a redundant memory module. In other embodiments, an additional memory module, identical to memory modules 111-128, is coupled to DASS bus 102 and used as a redundant memory module. The redundant memory module is included to allow replacement of any defective regular module.

In an embodiment which uses memory module 111 as a redundant module, module disable bit 420 (FIG. 4) of module 111 is pre-programmed such that during normal operation of memory device 100, module 111 is disabled from participating in any memory accesses. However, ID register 414 is still accessible through the bus interface 412. The module disable bits of modules 112-128 are programmed such that these modules are enabled.

If one of the memory modules 112-128 fails during operation of memory device 100, the defective module is decommissioned by programming the disable bit of its ID register. The redundant module 111 is activated by reprogramming module disable bit 420 and writing the communication address of the defective module to ID register 414.
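
The replacement sequence can be modeled with a minimal Python sketch. The class, the function names and the numeric communication addresses are hypothetical, and the sketch updates the registers directly rather than modeling the bus commands that carry out the programming.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Module:
    name: str
    comm_address: int       # contents of the ID register
    disabled: bool = False  # module disable bit

    def responds_to(self, base_address: int) -> bool:
        """A module acts on an access only if enabled and the base address matches."""
        return not self.disabled and self.comm_address == base_address

def replace_module(defective: Module, spare: Module) -> None:
    """Decommission the defective module and let the spare take over its address."""
    defective.disabled = True
    spare.comm_address = defective.comm_address
    spare.disabled = False

def responders(modules: List[Module], base_address: int) -> List[str]:
    """Names of the modules that would act on an access to base_address."""
    return [m.name for m in modules if m.responds_to(base_address)]

# Module 111 starts as the disabled spare; modules 112-128 get addresses 0-16.
spare = Module("111", comm_address=0, disabled=True)
regular = [Module(str(112 + i), comm_address=i) for i in range(17)]
modules = [spare] + regular

before = responders(modules, 5)  # ['117']
replace_module(regular[5], spare)
after = responders(modules, 5)   # ['111']
print(before, after)
```

At every point in the sketch exactly one module answers to a given base address, which is the property the disable bit is there to preserve.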

The second level of redundancy is row and column redundancy. Redundant rows and columns are added to each of the memory arrays in memory modules 111-128 for replacement of defective rows and columns in those memory arrays.

FIG. 5a is a block diagram of a memory module 500 having redundant memory sub-arrays 505, 506, 515 and 516. Memory module 500 includes bus interface 520, ID register 521, access control register 503, repair row address registers 550 and 560, repair column address registers 551 and 561, and memory arrays 508 and 518. Memory array 508 includes redundant row sub-array 505, redundant column sub-array 506 and re