Claims
What is claimed is:
1. A high speed distributed control memory system, coupled to a host system
processor via a data bus, comprising:
(a) an array of modular memory elements, connected in cascade with adjacent
memory elements and further connected to said data bus, each being
independently operative to store and process data, communicate with said
host processor over said data bus, and directly communicate with other
memory elements of said array;
(b) position reference means, coupled to the first and last memory elements
of said array, for indicating to said first and last elements their
physical position along said data bus in said array;
(c) host interface means, coupled between said host processor and said
array of elements, to separate memory element to memory element
communications over said data bus, from memory element to host processor
communications over said data bus; and
(d) a system clock, coupled to said host processor and each of said memory
array elements, for synchronizing the operations of said host processor
and said memory array elements.
2. A memory system as set forth in claim 1 wherein said memory element to
memory element communication may be performed independent of the
supervision and control of said host processor.
3. A memory system as set forth in claim 1 further comprising a direct,
data bus independent, communication path between adjacent memory elements
for carrying memory element to memory element communication signals
between said adjacent memory elements.
4. A memory system as set forth in claim 1 wherein said data bus is capable
of being driven by the memory elements of said array at the rate of once
per clock cycle.
5. A memory system as set forth in claim 1 wherein said host interface
means further comprises a transceiver.
6. A memory system as set forth in claim 1 wherein said position reference
means further comprises a power supply.
7. A memory system as set forth in claim 1 wherein each of said modular
memory elements further comprises:
(a) storage means;
(b) distributed control means, capable of data processing, coupled to said
storage means, for controlling the storing and retrieving of data into and
from said storage means; and
(c) interface means, coupled between said data bus and both said
distributed control means and said storage means, for selectively
channelling command and data signals from said data bus to said
distributed control means and said storage means.
8. A memory system as set forth in claim 7 wherein said interface means is
operative to channel data signals, from said storage means and said
distributed control means, to said data bus.
9. A memory system as set forth in claim 8 wherein each of said memory
elements further includes an internal data bus for carrying said data
signals within each memory element.
10. A memory system as set forth in claim 9 wherein each of said memory
elements further includes an internal command bus for carrying said
command signals.
11. A memory system as set forth in claim 8 which further includes a
communication path, coupled between said host processor and each of said
interface means, for channelling control signals from said host processor
to selected memory elements and for channelling status signals from
selected memory elements to said host processor.
12. A memory system as set forth in claim 11 wherein said communication
path between said host processor and each of said interface means may be
used to input any of a set of preselected host interface signals from said
host processor to the distributed control means of a selected memory
element.
13. A memory system as set forth in claim 12 wherein said communication
path between said host processor and each of said interface means may be
used to output any of a set of preselected host interface signals from a
given memory element to said host processor.
14. A memory system as set forth in claim 12 wherein said set of host
interface input signals includes signals for enabling the reading and
writing of data, from and into respectively, memory elements in said array
selected by said host processor and wherein said selected memory elements
are operative in response to said enabling signals to enable the reading
and writing of data.
15. A memory system as set forth in claim 12 wherein said set of host
interface input signals includes a reset signal which when communicated to
said array causes each memory element to identify itself in terms of its
physical location in the array.
16. A memory system as set forth in claim 13 wherein said communication
path further comprises a set of individual control links, each designated
to carry one of said preselected set of host interface signals.
17. A memory system as set forth in claim 8 further comprising a first and
a second direct link between the interface means of adjacent memory
elements, for carrying direct communication signals between said adjacent
memory elements.
18. A memory system as set forth in claim 17 wherein a first adjacent
memory element generates a transmit signal, which is communicated to a
second adjacent memory element over said direct link, to signal that data
will be transmitted by said first element to said second element, over
said data bus, during an upcoming, single clock interval.
19. A memory system as set forth in claim 18 wherein said first adjacent
memory element is operative to write said data onto said data bus during
said upcoming single clock interval.
20. A memory system as set forth in claim 19 wherein said second adjacent
memory element is operative in response to said transmit signal to read
the data bus during said upcoming single clock interval.
21. A memory system as set forth in claim 20 wherein said second adjacent
memory element is operative during said clock interval to acknowledge a
completed read of said data bus.
22. A memory system as set forth in claim 21 wherein said signal
acknowledging the completion of a read is communicated to said first
adjacent memory element via said second direct link between said adjacent
memory elements.
23. A memory system as set forth in claim 22 further comprising means for
directly broadcasting a control signal generated by any one of said memory
elements to all the other memory elements of said array.
24. A memory system as set forth in claim 23 wherein said distributed
control processor further comprises:
(a) a micro-control processor unit; and
(b) an execution control unit.
25. A memory system as set forth in claim 24 wherein said broadcasted
control signal may be used to conditionally and unconditionally force the
micro-control processor unit of a given memory element to execute a
preselected microcode instruction string.
26. A memory system as set forth in claim 25 wherein each of said memory
elements further comprises means to facilitate interconnecting banks of
cascaded memory elements.
27. A memory system as set forth in claim 26 wherein said interface means
further comprises latch means for buffering commands and data between said
data bus and both the internal data and command buses in each of said
memory elements.
28. A memory system as set forth in claim 27 wherein each of said modular memory
array elements is an integrated circuit device.
29. A method for distributing the control of a memory system, coupled to a
host system processor via a data bus, and operating said memory system at
high speed, comprising the steps of:
(a) connecting adjacent ones of an array of modular memory elements in
cascade and further connecting each memory element to said data bus, each
being independently operative to store and process data, communicate with
said host processor over said data bus, and directly communicate with
other memory elements of said array;
(b) indicating to the first and last of said memory elements their physical
position along said data bus in said array by utilizing position
referencing means coupled to said first and last elements;
(c) separating memory element to memory element communications over said
data bus, from memory element to host processor communications over said
data bus by coupling host interface means between said host processor and
said array of elements; and
(d) synchronizing the operations of said host processor and said array of
memory elements via use of a system clock.
30. A method as set forth in claim 29 further comprising the step of
performing memory element to memory element communication independent of
the supervision and control of said host processor.
31. A method as set forth in claim 29 further comprising the step of
directly communicating between adjacent memory elements by transmitting
communication signals over a direct, data bus independent, communication
path between said adjacent memory elements.
32. A method as set forth in claim 29 further comprising the step of
driving said data bus, via said memory elements, at the rate of once per
clock cycle.
33. A method as set forth in claim 29 wherein the step of separating is
implemented by utilizing a transceiver.
34. A method as set forth in claim 29 wherein said step of indicating is
implemented by tying a power supply input to said first and last element
of said array.
35. A method as set forth in claim 29 wherein the operation of each said
modular memory elements further comprises the steps of:
(a) storing and retrieving data from a storage means;
(b) controlling the storing and retrieving of data, into and from said
storage means, via distributed control means capable of data processing;
and
(c) channelling, selectively, command and data signals from said data bus
to said distributed control means and said storage means via interface
means, coupled between said data bus and both said distributed control
means and said storage means.
36. A method as set forth in claim 35 further comprising the step of
channelling data signals, from said storage means and said distributed
control means, to said data bus via said interface means.
37. A method as set forth in claim 36 further comprising the step of
carrying said data signals within each memory element on an internal data
bus.
38. A method as set forth in claim 37 further comprising the step of
carrying said command signals within each memory element on an internal
command bus.
39. A method as set forth in claim 38 which further includes the steps of
channelling control signals from said host processor to selected memory
elements and channelling status signals from selected memory elements to
said host processor, both via a communication path coupled between said
host processor and each of said interface means.
40. A method as set forth in claim 39 further including the step of
inputting any of a set of preselected host interface signals from said
host processor to the distributed control means of a selected memory
element via said communication path between said host processor and each
of said interface means.
41. A method as set forth in claim 40 further including the step of
outputting any of a set of preselected host interface signals from a given
memory element to said host processor via said communication path between
said host processor and each of said interface means.
42. A method as set forth in claim 40 wherein said set of host interface
input signals includes signals for enabling the reading and writing of
data, from and into respectively, memory elements in said array selected
by said host processor.
43. A method as set forth in claim 40 further including the step of
identifying each memory element in terms of its physical location in said
array in response to a host interface input reset signal.
44. A method as set forth in claim 41 further including the step of
designating an individual control link to carry one of said preselected
set of host interface signals when said communication path is comprised of
a set of individual control links.
45. A method as set forth in claim 36 further comprising the step of
carrying direct communication signals between adjacent memory elements
over a first and a second direct link between the interface means of said
adjacent memory elements.
46. A method as set forth in claim 45 further comprising the steps of
generating a transmit signal, via a first adjacent memory element, and
communicating said transmit signal to a second adjacent memory element,
over said direct link, to signal that data will be transmitted by said
first element to said second element, over said data bus, during an
upcoming, single clock interval.
47. A method as set forth in claim 46 further comprising the step of
writing said data from said first adjacent memory element onto said data
bus during said upcoming single clock interval.
48. A method as set forth in claim 47 further comprising the step of
reading data from said data bus during said upcoming single clock interval
in response to said transmit signal.
49. A method as set forth in claim 48 further comprising the step of
acknowledging a completed read of said data bus by said second adjacent
memory element during said clock interval.
50. A method as set forth in claim 49 further comprising the step of
communicating said signal acknowledging the completion of a read to said
first adjacent memory element via said second direct link between said
adjacent memory elements.
51. A method as set forth in claim 50 further comprising the step of
broadcasting, directly, a control signal generated by any one of said
memory elements to all the other memory elements of said array.
52. A method as set forth in claim 51 further comprising the step of
performing said distributed control processor function via the combination
of a micro-control processor unit and a cooperating execution control
unit.
53. A method as set forth in claim 52 further comprising the step of
forcing, conditionally and unconditionally, the micro-control processor
unit of a given memory element to execute a preselected micro-code
instruction string in response to said broadcasted control signal.
54. A method as set forth in claim 53 further comprising the step of
interconnecting banks of cascaded memory elements.
55. A method as set forth in claim 54 further comprising the steps of
latching and buffering commands and data, between said data bus and both
the internal data bus and command bus in each of said memory elements, via
said interface means.
56. A method as set forth in claim 55 further comprising the step of
fabricating each of said modular memory array elements in the form of an
integrated circuit device.
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates generally to memory systems used in digital computing
systems and more particularly relates to a memory system which is
comprised of a plurality of intelligent memory elements each capable of
rapid, direct communications with one another, without host system
intervention or supervision.
2. Description of the Prior Art
Digital computers are well known which are comprised of a combination of
one or more central processing units (CPU), memory and input/output
devices. The CPU is the "intelligence" of the host computer and typically
uses memory, both internal and external to the host system, as storage
and/or scratch pad space for performing arithmetic and logic operations.
The input/output devices typically provide man/machine interfaces and are
means to communicate with external systems, such as other computers,
external storage devices, etc.
Memory systems are well known which themselves are "intelligent", i.e., can
perform data processing and control functions in parallel with the host
CPU. To accomplish this, known systems have one or more control units,
usually in the form of microprocessors, each dedicated to servicing
predetermined portions of memory, where the dedicated units act independently
of the host CPU, but subject to its control. Such systems are known as
distributed control memory systems.
Communications between the host system and the processor units in a
distributed control memory system typically involve using an
address/read/write scheme in combination with a communications bus.
Communications are slow since the independent processing resources compete
for time on the bus and all communication, even when effectively between
memory elements only, is, in known systems, routed via the host CPU.
Using the known schemes for communicating with distributed control memory,
certain operations require tying up the host CPU and the system bus for
considerable amounts of time, particularly when a search or sorting
operation is in progress. These types of operations, heretofore software
oriented, require extensive passing of control between the host system and
the distributed control. Even more time is expended resolving the
aforementioned contention problem when the hardware architecture of the
overall system provides for only a single path communication bus between
the host CPU and distributed control memory.
As a result of the aforementioned problems it would appear to be desirable
to minimize or eliminate the time consuming software bottlenecks that work
to slow computer systems by off-loading, from software to hardware,
tedious and frequent tasks, such as those associated with sorting and
searching. Off-loading these tasks would reduce software errors and speed
up many applications. Specifically, applications like the creation of
constant ordered lists by an operating system would be aided. The creation
of these lists is slow and imposes significant overhead on
operating-system software. Also, improved speed and reliability in
performing the ultra-fast sorting required by specialized applications,
such as graphics and artificial intelligence, would be achieved.
It also appears desirable to permit direct communication between the
control portions of memory elements in a memory element array, without
host system intervention. In addition to facilitating performance of the
aforementioned "software" tasks, the host CPU and memory element
controllers would then truly operate independently and more efficiently.
Particularly in the case where a single shared bus is involved, contention
problems would be held to a minimum by taking advantage of the speed with
which the memory elements could directly get on to and off of the bus,
thereby enabling preselected tasks to be optimally performed by hardware
directly.
SUMMARY OF THE INVENTION
The invention comprises a high-speed, intelligent, distributed control
memory system which, according to the preferred embodiment of the
invention, is further comprised of an array of modular, cascadable,
integrated circuit devices, referred to herein as "memory
elements." Each memory element includes storage means, programmable
on-board processing means ("distributed control") and means for interfacing
with both the host system and other memory elements in the array utilizing
a single shared bus.
Each memory element of the array is capable of transferring (reading or
writing) data between adjacent memory elements once per clock cycle. In
addition, each memory element is capable of broadcasting data to all
memory elements of the array once per clock cycle. This ability to
transfer data between the memory elements at the clock rate, using the
distributed control, facilitates unburdening host system hardware and
software from tasks more efficiently performed by the distributed control.
As a result, the memory elements themselves can, for example, perform such
tasks as sorting and searching, even across memory element boundaries, in a
manner which conserves host system resources and which is faster and more
efficient than using those resources.
The key to achieving these results is a memory architecture that permits
direct communications between distributed control elements without host
system supervision or intervention and which is capable of driving the
host system communication bus at a rate of once per clock cycle.
It is an object of the invention to optimize the performance of computing
systems by off-loading selected tasks from the host system CPU to
distributed control memory.
It is a further object of the invention to permit direct communications
between intelligent memory elements in an array of intelligent memory
elements, without host system intervention or supervision.
It is still a further object of the invention to optimize the use of the
bus structure in a computer system, particularly where the structure is
that of a single bus shared by the host system CPU and the distributed
control in memory elements.
It is yet another object of the invention to provide a memory element array
structure capable of optimally driving a shared bus system, with the
preferred embodiment of the invention being capable of driving a single
shared bus once per clock cycle.
Other objects, features and advantages of the present invention will become
apparent upon consideration of the following detailed description and the
accompanying Drawing, in which like reference designations represent like
features throughout the figures.
BRIEF DESCRIPTION OF THE DRAWING
FIG. 1 depicts a prior art computer system that includes distributed
control memory.
FIG. 2 depicts an array of memory elements each coupled to a shared data
bus and other memory elements in the array in accordance with the
teachings of the preferred embodiment of the invention.
FIG. 3 depicts, in block diagram form, the details of one of the memory
elements in the array depicted in FIG. 2.
FIG. 4 depicts a pin diagram for a memory chip used in accordance with the
preferred embodiment of the invention.
FIG. 5 depicts the latches and driver circuits used in the preferred
embodiment of the interface circuitry shown in FIG. 3.
FIG. 6 is a timing diagram illustrating the sequence of events in a "pop"
operation performed directly by the distributed control memory.
FIGS. 7a-7d depict how the handshake/acknowledge signalling, implemented in
the preferred embodiment of the invention, is used to accomplish safe,
once per clock cycle, data transfers across memory element boundaries.
DETAILED DESCRIPTION
FIG. 1 depicts a prior art computing system 100 which includes CPU 101,
memory 102 and associated input/output device(s) 103. System 100 is also
designated in FIG. 1 as the "host computer." Link 150 is shown as a path
for communication between CPU 101 and the world outside the host computer.
Link 151 is a bus connecting CPU 101 and memory 102.
Also depicted in FIG. 1, as part of memory 102, is microprocessor 199. This
device is shown coupled via bus 151 to CPU 101.
As indicated hereinbefore, distributed control memory systems of the type
depicted in FIG. 1 are known and permit local processing by microprocessor
199 over the memory space. Such processing is typically controlled by CPU
101 via link 151.
FIG. 2 depicts the preferred embodiment of the novel architecture
contemplated by the invention. FIG. 2 shows a distributed control memory,
202, comprised of an array of modular, cascadable, memory elements. These
are shown as elements 202-1 thru 202-n. Each memory element is capable of
communicating with the host CPU via bus 251 (similar to the communications
link 151 shown in FIG. 1) via interface circuitry 252. Circuitry 252 would
typically comprise a transceiver to separate memory element to memory
element communications from memory element to host CPU communications.
Each memory element in FIG. 2 is capable of direct communication with
adjacent memory elements via interconnection link pairs 253-1 thru 253-x,
where x=n-1 and n is the number of memory elements in the array. Each
memory element is also capable of memory element to memory element
communications (adjacent or not), via bus 251 and other paths shown in
FIG. 2, e.g., global link 274. Both types of communication (adjacent and
nonadjacent) can be performed at high speed, without supervision or
intervention by the host CPU, in the manner to be described in detail
hereinafter.
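The relationship x=n-1 between the number of memory elements and the number of adjacent link pairs can be illustrated with a short sketch. The function name `adjacent_link_pairs` and the mapping of link pair 253-k to elements 202-k and 202-(k+1) are assumptions made for illustration only; the text above states only that x=n-1 link pairs join adjacent elements.

```python
def adjacent_link_pairs(n):
    """Return (link, upper, lower) labels for an n-element cascade.

    Assumption for illustration: link pair 253-k joins elements
    202-k and 202-(k+1). The description only states that there are
    x = n - 1 link pairs between adjacent elements.
    """
    if n < 1:
        raise ValueError("an array needs at least one memory element")
    return [(f"253-{k}", f"202-{k}", f"202-{k + 1}") for k in range(1, n)]


links = adjacent_link_pairs(4)
print(len(links))  # number of link pairs, x = n - 1
for link in links:
    print(link)
```

A single-element array yields no link pairs, which is consistent with x=n-1.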
The purpose and function for each of the interconnections depicted in FIG.
2, along with a description of the application and benefits of the
depicted architecture, will also be set forth in detail hereinafter.
First, however, a detailed description of each of the memory array
components will be set forth, together with a description of inputs and
outputs (control, data and command I/O) to and from each memory element.
FIG. 3 depicts, in block diagram form, one of the modular elements of the
array of FIG. 2. As shown by the example of FIG. 2, these elements are
cascadable.
Each memory element is comprised of storage means 301, distributed control
(shown further comprised of the combination of micro-control unit 302 and
execution unit 303), and means for interfacing with both the host system
and other memory elements in an array such as the one shown in FIG. 2. The
means for interfacing is depicted in FIG. 3 as unit 304 and interfaces
with bus 251 of FIG. 2 via the link marked "8 bit data bus", 310. The
choice of a bus 8 bits wide is arbitrary and was chosen for the sake of
illustration only.
Reference should now be made to FIG. 4 which depicts how the memory element
depicted in FIG. 3 may be packaged using a standard 28 pin integrated
circuit device. The pin diagram shown in FIG. 4 is intended for use with a
chip that functions in accordance with the teachings of the invention by
incorporating interface circuitry 304, microcontrol unit (MCU) 302,
execution unit 303, storage means 301 (shown as random access memory,
"RAM"), and providing for the input and output of data, control, command
and timing signals shown in FIG. 3. The purpose and function of each pin
depicted in FIG. 4 will be explained with reference again to FIG. 3.
In addition to the functional blocks previously described, FIG. 3 depicts a
set of "host interface" and "chip to chip signals" which are inputs to and
outputs from a given memory element separate and apart from normal data or
command I/O communicated over link 310 and data bus 251. Also shown in
FIG. 3 is clock input 311 for synchronizing the operation of memory
elements and other overall system components.
The set of host interface signals shown includes RST 315; CS 316; RE 317;
WE 318; C/D 319; STAT 320; and DONE 321. DONE 321 is also depicted as a
chip to chip signal. The purpose and function of these host interface
signals in the preferred embodiment of the invention will be described
immediately hereinafter, followed by a detailed description of the chip to
chip communication signals.
Equivalents of the host interface signals are well known in the prior art.
The host/memory interface signals facilitate distributed control
processing by, among other things, putting the memory space that is within
the range of a particular processor in a read or write mode. A particular
chip, or memory space, can be selected by the host system pulling the CS
316 (chip select) line low. The negative logic convention used in
conjunction with the preferred embodiment of the invention was chosen for
the sake of illustration only. One of ordinary skill in the art will
readily appreciate that positive logic would work just as well. When CS
316 is high, all the read/write inputs are ignored.
RE 317 and WE 318, read and write enable respectively, are also active when
low. RE 317 is used to read data from a chip, while WE 318 is used to
write commands or data into a chip.
C/D 319, the command/data input signal, when low allows data to be read or
written (from or to) a chip; when high commands may be written into a
given chip.
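The behavior of the CS, RE, WE and C/D inputs described above may be summarized by the following sketch, which classifies a single host access from the four pin levels. The function name, the argument names and the result strings (including "idle") are hypothetical labels chosen for illustration. Combinations not described above, such as a read with C/D high, are reported as "unspecified" rather than guessed.

```python
def decode_host_cycle(cs_n, re_n, we_n, c_d):
    """Classify one host access from pin levels (0 = low, 1 = high).

    CS, RE and WE are active low; C/D low selects data and C/D high
    selects commands. Combinations the description does not cover are
    reported as "unspecified".
    """
    if cs_n == 1:
        return "ignored"        # chip not selected: read/write inputs ignored
    if re_n == 0 and we_n == 0:
        return "unspecified"    # simultaneous read and write not described
    if re_n == 0:
        return "read data" if c_d == 0 else "unspecified"
    if we_n == 0:
        return "write data" if c_d == 0 else "write command"
    return "idle"               # selected, but neither enable is asserted


print(decode_host_cycle(0, 1, 0, 1))  # write command
print(decode_host_cycle(1, 0, 0, 0))  # ignored
```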
RST 315 low signals a chip reset operation. Any command under execution is
terminated. DONE 321 goes high. Upon RST 315 going from low to high, the
chip in the array depicted in FIG. 2, with its RUP line (the function of
which will be described hereinafter) tied to +5V, assumes a chip address
of 0, the next chip in the array assumes an address of 1 and so on until
all devices number themselves. The reset operation, also referred to as
the chip enumeration operation, will be described in detail hereinafter.
It should be noted that the reset operation can be triggered by RST 315
being pulled low, or via a host issued RST (reset) command. The RST
command will be described hereinafter in conjunction with the description
of the command set utilized in the preferred embodiment of the invention.
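The outcome of the chip enumeration operation may be illustrated by a minimal sketch. The class `Chip`, the function `enumerate_chips` and the way the next address is handed from one chip to its neighbor are assumptions made for illustration. The sketch models only the result stated above, namely that the chip with its RUP line tied to +5V assumes address 0 and each subsequent chip assumes the next address.

```python
class Chip:
    """Hypothetical model of one memory element for enumeration only."""

    def __init__(self, rup_tied_high=False):
        self.rup_tied_high = rup_tied_high  # True if RUP is tied to +5V
        self.address = None                 # unknown until enumeration


def enumerate_chips(chips):
    """Assign addresses down the cascade, first chip to last.

    Assumed hand-off: each chip learns its address from the chip
    above it. The signalling used to do this is not modeled here.
    """
    if not chips or not chips[0].rup_tied_high:
        raise ValueError("the first chip must have RUP tied to +5V")
    next_address = 0
    for chip in chips:
        chip.address = next_address
        next_address += 1
    return [chip.address for chip in chips]


array = [Chip(rup_tied_high=True), Chip(), Chip(), Chip()]
print(enumerate_chips(array))  # [0, 1, 2, 3]
```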
The wire-or'ed DONE 321 lines shown in FIG. 2 signal completion of the
reset cycle by going low. In general the DONE 321 output (active low)
indicates the termination of an operation. This signal goes high at the
beginning of new commands, data writes, or data reads, and then goes low
when done with the current operation. This host interface signal is, as
indicated before, used in chip to chip communications as well.
As a chip to chip signal, DONE 321 is effectively bidirectional, i.e., one
chip can signal the completion of an operation (output) and the wire-or'ed
lines see an input signal. This is one way in which the novel architecture
is used to broadcast control information from chip to chip.
Finally, the STAT 320 output being low, according to the preferred
embodiment of the invention, signals an exception condition following the
execution of an instruction. This output goes high at the beginning of a
new command, or when a write or read is initiated.
The chip to chip communication signals for a given chip are shown as TUP
370, RUP 371, RDWN 372, TDWN 373, GLB 374, DIRG 375, T/R 376 and DIRD 377.
TUP 370 (transmit upward), RUP 371 (receive from the up direction), RDWN
372 (receive from the down direction) and TDWN 373 (transmit downward) are
all active when high in the preferred embodiment of the invention (low
would work just as well). RUP 371 and RDWN 372 are inputs to a given chip,
TUP 370 and TDWN 373 are outputs. There also exists a test mode in the
preferred embodiment where TUP and TDWN are bidirectional.
The purpose and function of these signals in the context of the invention
will be described hereinafter with reference to how a chip array operates
to perform specific operations. These operations were chosen to illustrate
that the novel architecture can be used to realize the previously set
forth objects of the invention. In particular, chip enumeration, passing
tokens from chip to chip, performing hardware binary searches, pushing and
popping data into and from the memory array, broadcasting data from a
given chip to others in the array, performing jamming operations and
resolving contention problems among chips will all be described
hereinafter in detail.
The explanation of how these operations are performed will not only
illustrate the purpose and function of the chip to chip communication
signals in the context of the novel architecture, but will also
demonstrate how the hardware actually operates and how it is able to
off-load traditionally software oriented tasks, e.g., binary search, etc.
Before detailing these operations and their implementation, the remaining
chip to chip signals need to be briefly characterized.
GLB 374, a bidirectional signal, will be seen as useful in implementing the
broadcast, jam and contention resolving operations. An example of its use
will be set forth hereinafter.
The output signals DIRG (direction of GLB) 375, T/R (transmit/receive) 376
and the DIRD (direction of done) 377, are used in the preferred embodiment
of the invention to facilitate interfacing banks of modular chip devices
to one another. With one chip bank, these lines are not used, as shown in
FIG. 3. If the number of devices in a bank (according to the preferred
embodiment) exceeds 16, the designer can either reduce the clock frequency
to the parts, due to increased capacitance, or insert a buffer circuit
between banks of devices to allow operation at optimum speed.
In the first case (reducing clock frequency) DIRG 375, T/R 376 and DIRD 377
are not used. In case two, DIRG 375, T/R 376 and DIRD 377 may be used to
control (enable and tristate) the buffer circuitry which facilitates the
communication with other banks of chips.
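By way of illustration only, the two alternatives described above may be
sketched as follows. The sketch is written in Python, and the names
MAX_DEVICES_PER_BANK, plan_banks and use_buffers are hypothetical; only the
limit of 16 devices per bank and the two design choices are taken from the
description. The sketch further assumes that a buffered array is split into
the smallest number of banks of at most 16 devices each, which the
description does not require.

```python
import math

MAX_DEVICES_PER_BANK = 16  # limit stated for the preferred embodiment


def plan_banks(num_devices, use_buffers):
    """Return (bank_count, full_speed, bank_signals_used) for an array.

    use_buffers=True models inserting buffer circuits between banks.
    use_buffers=False models a single unbuffered bank whose clock
    frequency must be reduced once it holds more than 16 devices.
    """
    if num_devices < 1:
        raise ValueError("an array needs at least one device")
    if num_devices <= MAX_DEVICES_PER_BANK:
        # One bank: DIRG, T/R and DIRD are not used.
        return 1, True, False
    if use_buffers:
        # Case two: DIRG, T/R and DIRD control the inter-bank buffers.
        banks = math.ceil(num_devices / MAX_DEVICES_PER_BANK)
        return banks, True, True
    # Case one: reduced clock frequency, bank signals unused.
    return 1, False, False
```

For example, plan_banks(40, True) models a 40 device array split into three
buffered banks, whereas plan_banks(40, False) models the same array run as
one bank at a reduced clock frequency.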
The pin diagram of the 28 pin package set forth in FIG. 4 can now be better
appreciated as providing for all the inputs and outputs described above
with reference to FIG. 3. The remaining pins shown in FIG. 4, and not
described above, are the 8 pins (D0-D7) associated with bus link 310 of
FIG. 3 and the 2 V.sub.dd and 2 V.sub.ss pins which are coupled, in the
preferred embodiment, to a +5 volt power supply (V.sub.dd) and ground
(V.sub.ss), respectively. Pins D0-D7 carry data and commands between a
given chip and the system bus.
FIG. 3 goes on to show, in broad block diagram form, the flow of "control"
and "data" between the major circuits on a given chip.
MCU 302 is the distributed "intelligence" (with respect to the host)
embedded in each chip. A micro-sequencer such as the AMD 2911A,
off-the-shelf memory, an EPROM and simple MSI TTL glue logic, combined as
taught by Dietmeyer in "Logic Design of Digital Systems", published by
Allyn and Bacon of Boston, Massachusetts (copyright 1971), chapter 6.2,
would be sufficient to realize a micro-control unit, like MCU 302, which
is capable of executing sequences of hardwired micro-code to perform
preselected operations.
MCU 302 functions in response to a preselected command set specified by
the unit designer. Each command of such a set initiates a predetermined
micro-code sequence, to selectively process and pass data and control
signals between the host system, other memory elements in the array, and
internally among the on-chip circuits and memory.
The preferred embodiment of the invention utilizes a set of 16 commands to
set various pointers, masks, etc. These will be described in detail
hereinafter. For now, however, as an example of one such command, the
preferred embodiment calls for a GSF command (Get Status Full) to cause
MCU 302 to check status flags to see if a given memory array is full. The
particular command set desired is application dependent and well within
the ability of those skilled in the art to design for any given
application.
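The relationship between a command and the micro-code sequence it initiates
may be pictured as a lookup from command mnemonic to a micro-code entry
address. The following Python sketch is illustrative only. The mnemonics
RST, KPL and GSF are those of the preferred command set; the entry
addresses and the names MICROCODE_ENTRY and dispatch are hypothetical and
are not taken from the preferred embodiment.

```python
# Hypothetical mapping from command mnemonic to the micro-code address at
# which that command's sequence begins. The addresses are invented for
# illustration and do not come from the description.
MICROCODE_ENTRY = {
    "RST": 0x00000,
    "KPL": 0x00010,
    "GSF": 0x00020,
}


def dispatch(mnemonic):
    """Return the entry address of the micro-code sequence for a command."""
    try:
        return MICROCODE_ENTRY[mnemonic]
    except KeyError:
        raise ValueError(f"unknown command: {mnemonic}") from None
```

A table of this kind reflects the application dependent nature of the
command set: a different application would populate it with different
commands and sequences.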
In the preferred embodiment of the invention MCU 302 outputs a 47 bit
micro-code word. Of these, 17 bits specify the address of
the next micro-code instruction. The remaining 30 bits of the code word
are actually a set of control signals which are communicated to the other
onboard chip units, i.e., memory 301, execution unit 303 and circuitry
304.
The number of bits chosen for the MCU 302 output micro-word in the
preferred embodiment of the invention is application dependent and not
limiting insofar as describing the invention per se.
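The division of the 47 bit micro-code word into a 17 bit next address field
and a 30 bit control field may be sketched as follows. The field widths are
those of the preferred embodiment; the placement of the address field in
the high order bits, and the names pack_microword and unpack_microword, are
assumptions made purely for illustration.

```python
NEXT_ADDR_BITS = 17  # address of the next micro-code instruction
CONTROL_BITS = 30    # control signals for the other onboard units
WORD_BITS = NEXT_ADDR_BITS + CONTROL_BITS  # 47 bits in total


def pack_microword(next_addr, control):
    """Combine the two fields into one 47 bit integer.

    Assumed layout: next address in the high bits, control in the low bits.
    """
    if not 0 <= next_addr < (1 << NEXT_ADDR_BITS):
        raise ValueError("next address does not fit in 17 bits")
    if not 0 <= control < (1 << CONTROL_BITS):
        raise ValueError("control field does not fit in 30 bits")
    return (next_addr << CONTROL_BITS) | control


def unpack_microword(word):
    """Split a 47 bit micro-code word into (next_addr, control)."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("micro-code word does not fit in 47 bits")
    return word >> CONTROL_BITS, word & ((1 << CONTROL_BITS) - 1)
```

A different word width would change only the two constants, consistent with
the width being application dependent.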
Execution unit 303 (part of the distributed control) is comprised of
arithmetic logic units (ALUs), incrementers, comparators, and other
registers, operated under the control of MCU 302. Devices, such as unit
303, are well known for performing address calculations, data manipulation
and conditional branch calculations for a MCU control unit such as MCU
302. Status information is also monitored by unit 303. In the preferred
embodiment of the invention unit 303 is a 10 bit unit comprised of AMD
2900 series bit slice ALUs, standard TTL latches, incrementers and glue
logic, well within the skill of the art to construct, given the
application. Execution unit 303 also generates, under the control of MCU
302, the address pointer into storage device 301.
Storage device 301 is, in the preferred embodiment of the invention, a
novel 1 K-byte RAM. The novel RAM is described in detail in copending U.S.
patent application Ser. No. 838,993 filed Mar. 12, 1986 by the assignee of
this application. Application Ser. No. 838,993 is hereby incorporated by
reference. This particular RAM is organized and operable in such a fashion
as to support implementing sort operations such as a "sort by insertion",
which is one of the operations performed by the hardware described herein
and explained in detail hereinafter. However, the instant invention does
not require, and is not limited to use with, the novel RAM. Any RAM
cooperating with the other on chip units to store and retrieve data would
be sufficient for the purpose of the current invention.
FIG. 3 goes on to depict interface circuitry 304 coupled to "control" and
"data" paths within the chip, and coupled to the depicted pin outs shown
in both FIGS. 3 and 4. The "control" path includes a command bus which is
used to route any one of the predetermined commands being sent by the host
system over bus 251 and link 310, to MCU 302 for execution. The "data"
path includes a data bus (again, internal to the chip) which carries data
between link 310, storage 301 and selected registers in execution unit
303.
Interface circuitry 304 may, according to the preferred embodiment of the
invention, be viewed as having two separate cooperating parts.
The first part comprises standard, off-the-shelf logic which is used to
pass buffered control and status signals to MCU 302 wherever an input is
presented to circuitry 304, other than from link 310, in FIG. 3. Standard,
off-the-shelf logic is also used to drive output signals where an output
or bidirectional link is indicated with respect to circuitry 304 in FIG.
3. In both of these cases the buffering is performed under the control of
MCU 302.
The second part of circuitry 304 comprises the logic to field data and
commands taken off the system data bus via link 310. This includes logic
to distinguish commands from data and to route commands and data onto
appropriate internal buses. Commands are routed to MCU 302 on an internal
command bus. Data is routed to execution unit 303 and memory 301
on an internal data bus.
Commercially available logic for accomplishing these specific functions is
well known to those skilled in the art. For the purposes of the invention,
however, the key concept to be reiterated is that circuitry 304 functions
to route chip to chip signals, host interface signals, and command and
data as well, to and from on chip units as described hereinbefore, and
that implementation of a suitable interface circuit can be achieved by
using standard, off-the-shelf logic and well known combinatorial logic
techniques.
The above referenced second portion of interface circuitry 304 is operative,
in the preferred embodiment of the invention, to pass commands and data
between link 310 and the internal buses in a manner depicted in FIG. 5.
FIG. 5 shows the combination of data input latch 501 and data input driver
502 coupled between the external data bus link 310, and the aforementioned
internal data bus 550. FIG. 5 also shows data output latch 503, and data
output driver 504 coupled between internal data bus 550 and link 310, for
outputting data onto the system bus. Finally, FIG. 5 also depicts a
command latch 505, for receiving commands off link 310 and for routing
these commands to the aforementioned internal command bus 555.
Each of the depicted latches is enabled or disabled, and the drivers are
either driving or tristating the respective buses shown, depending on the
presence or absence of the various signals described hereinbefore. For
example, the command latch is enabled when the C/D 319 input is high,
otherwise it is not enabled, etc. Here is where the collection of signals
from the host system, interchip signals and micro control unit generated
signals are all utilized to actually synchronize the placing of data and
the taking of data (and commands) off the system bus.
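The role of the C/D 319 input in steering a byte taken off link 310 may be
sketched as follows. The sketch is in Python and models only the one
condition stated above, namely that the command latch is enabled when C/D
is high. It assumes, for illustration, that the data input latch accepts
the byte when C/D is low, and it ignores the read enable, write enable and
interchip signals that also take part in the actual control. The name
route_bus_byte is hypothetical.

```python
def route_bus_byte(byte, c_d_high):
    """Return (command_bus_value, data_bus_value) for one byte on link 310.

    A value of None means the corresponding latch was not enabled.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("link 310 carries 8 bit values (D0-D7)")
    if c_d_high:
        # C/D high: the command latch captures the byte for the
        # internal command bus.
        return byte, None
    # C/D low (assumed): the byte is treated as data for the
    # internal data bus.
    return None, byte
```

Thus route_bus_byte(0x3C, True) places the byte on the command path only,
and route_bus_byte(0x3C, False) places it on the data path only.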
Prior to proceeding with the description of the actual command set
implemented in the preferred embodiment of the invention, and a
description of the various operations implemented to achieve the objects
of the invention, the various inputs and outputs to and from MCU 302 in
the preferred embodiment of the invention will be recapped and summarized.
First, MCU 302 receives as input:
(a) commands via the internal command bus;
(b) buffered control signals from the host system via circuitry 304; (note,
these same signals, e.g., read enable and write enable, are used in the
portion of circuitry 304 depicted in FIG. 5 to control the loading,
enabling, driving and tristating of the depicted latches and drivers); and
(c) buffered control signals from adjacent chips, e.g., the signals on RUP
371 and RDWN 372.
It should be noted that the preferred embodiment of the invention provides
for a direct input path for interchip signals input on links RUP 371 and
RDWN 372 to the control for the latch and driver circuits shown in FIG. 5.
This design criterion eliminates having to involve MCU 302 in certain
interchip operations, in particular in implementing the "wait-box" scheme
to be described hereinafter in conjunction with certain of the hardware
operations. The high speed, once per clock cycle operation of the invention
is in part achieved by the direct input of these control signals to where
they are needed, when they are needed, to control the latches and drivers.
The principal output of MCU 302 is, in the preferred embodiment, the 47 bit
micro-code word described hereinbefore.
Finally, with respect to FIG. 3, clock signals from the host system are
shown input to MCU 302 via link 311. In fact, the clock input is shared by
all the array elements with each element taking the clock input and
generating two non-overlapping onboard clock signals. These signals are
used for building synchronization logic and inputs for each of the onboard
units described hereinbefore. According to the preferred embodiment of the
invention the clock signal should be between 1 megahertz and 16 megahertz,
although this is not a factor limiting the concept of the novel
architecture.
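The description does not specify how each element derives its two
non-overlapping clock signals. One conventional approach, offered here only
as an assumed sketch in Python, gates the input clock with a delayed copy
of itself: the first phase is high only while both the clock and its
delayed copy are high, and the second phase is high only while both are
low. The name two_phase_clocks and the sample based model are hypothetical.

```python
def two_phase_clocks(clock_samples, delay=1):
    """Derive two non-overlapping phases from a sampled clock waveform.

    clock_samples is a sequence of 0/1 values; delay is the width, in
    samples, of the dead time inserted at each clock edge.
    """
    if delay < 1:
        raise ValueError("delay must be at least one sample")
    phi1, phi2 = [], []
    for i, clk in enumerate(clock_samples):
        # Delayed copy of the input clock; until enough history exists,
        # the first sample is reused.
        delayed = clock_samples[i - delay] if i >= delay else clock_samples[0]
        phi1.append(1 if (clk and delayed) else 0)
        phi2.append(1 if (not clk and not delayed) else 0)
    return phi1, phi2
```

Because the first phase requires the clock to be high and the second
requires it to be low, the two phases can never be high in the same sample,
and each clock edge is followed by an interval in which both are low.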
For the sake of illustration and completeness, the actual command set
implemented in the preferred embodiment of the invention will now be set
forth in detail.
The preferred command set is divided into 5 groups: (1) the set-up control
group; (2) the status group; (3) the address specification group; (4) the
addressing mode control group, and (5) the command group.
The set-up control group contains the RST (reset) and the KPL (load K, P,
L) instructions. RST is equivalent to a hardware reset and is followed by
the enumeration process which was mentioned hereinbefore and which will be
described in detail hereinafter. The KPL instruction is the means by which
software controls the variable width record organization of the novel RAM
structure described in the copending patent application incorporated
hereinbefore by reference. This instruction defines the number of bytes in
the key (K), the number of bytes in the pointer (P), and the last address
(L) of each chip in the array, which specifies the number of logical
records each memory on a given chip contains. Each record thus
contains K+P bytes, of which K bytes are the key and P bytes are either
the remaining bytes in the record or a pointer to the physical record in
the main memory.
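The record organization established by the KPL instruction may be
illustrated by the following Python sketch. Only the record size of K+P
bytes, with K key bytes and P pointer bytes, is taken from the description.
The sketch assumes that the key precedes the pointer within a record and
that records are packed contiguously from address zero; it does not model
the last address (L). The names split_record and record_offset are
hypothetical.

```python
def split_record(record, k, p):
    """Split one K+P byte record into its key and pointer fields.

    Assumes the K key bytes come first, followed by the P pointer bytes.
    """
    if k < 1 or p < 0:
        raise ValueError("K must be positive and P must not be negative")
    if len(record) != k + p:
        raise ValueError("record length must equal K+P bytes")
    return record[:k], record[k:]


def record_offset(index, k, p):
    """Byte offset of record `index`, assuming contiguous packing."""
    if index < 0:
        raise ValueError("record index must not be negative")
    return index * (k + p)
```

For example, with K equal to 4 and P equal to 2, each record occupies 6
bytes and, under the packing assumed here, the record with index 3 begins
at byte offset 18.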
The status group contains just one instruction, the GSF (get status full).
The chips respond to this instruction via the STAT pin to inform the host
if the array is full.
The address specification group contains six instructions which control the
two basic active pointers: the record pointer and the byte pointer. At
any point in time these pointers are active only in one of the chips in
the array, the one that was associated with the last record accessed.
After reset, both pointers point to the top of the array. RRB (restore
record boundary) restores the active byte pointer to point to the current
record boundary. NXT (next) sets the byte and record pointers to the next
record boundary. PRE (previous) sets both pointers to the previous record.
DEC (decrement) decrements the byte pointer to set it pointing to the
previous byte. LAL (load address long) loads the byte pointers with an
18-bit value in all the chips of the array to allow for random access
a | | |