|
|
|
| United States Patent | 4903196 |
| Link to this page | http://www.wikipatents.com/4903196.html |
| Inventor(s) | Pomerene; James H. (Chappaqua, NY);
Puzak; Thomas R. (Cary, NC);
Rechtschaffen; Rudolph N. (Scarsdale, NY);
Sparacio; Frank J. (North Bergen, NJ) |
| Abstract | A method and apparatus for controlling access to its general purpose
registers (GPRs) by a high end machine configuration including a plurality
of execution units within a single CPU. The invention allows up to "N"
execution units to be concurrently executing up to "N" instructions using
the GPR sequentially or different GPR's concurrently as either SINK or
SOURCE while at the same time preserving the logical integrity of the data
supplied to the execution units. The use of the invention allows a higher
degree of parallelism in the execution of the instructions than would
otherwise be possible if only sequential operations were performed.
A series of special purpose tags are associated with each GPR and execution
unit. These tags are used together with control circuitry both within the
GPR's, within the individual execution units and within the instruction
decode unit, which permit the multiple use of the registers to be
accomplished while maintaining the requisite logical integrity. |
|
|
|
Title Information  |
|
|
|
|
|
Drawing from US Patent 4903196 |
|
|
Method and apparatus for guaranteeing the logical integrity of data in
the general purpose registers of a complex multi-execution unit
uniprocessor |
|
|
|
|
|
| Publication Date |
February 20, 1990 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Title Information  |
|
|
References  |
|
|
| *references marked with an asterisk below are user-added references |
|
U.S. References |
|
|
| Add a new US reference: |
| | Reference | Relevancy | Comments | Reference | Relevancy | Comments | 3346851
|      Your vote accepted [0 after 0 votes] | | 4809159 Sowa 712/25 Feb,1989 |      Your vote accepted [0 after 0 votes] | | 4807115 Torng 712/215 Feb,1989 |      Your vote accepted [0 after 0 votes] | | 4807113 Matsumoto 712/210 Feb,1989 |      Your vote accepted [0 after 0 votes] | | 4794518 Mizushima 712/216 Dec,1988 |      Your vote accepted [0 after 0 votes] | | 4604694 Hough 710/200 Aug,1986 |      Your vote accepted [0 after 0 votes] | | 4574350 Starr 710/200 Mar,1986 |      Your vote accepted [0 after 0 votes] | | 4574349 Rechtschaffen 711/154 Mar,1986 |      Your vote accepted [0 after 0 votes] | | 4384324 Kim 712/242 May,1983 |      Your vote accepted [0 after 0 votes] | | 4249241 Aberle 710/200 Feb,1981 |      Your vote accepted [0 after 0 votes] | | 4224664 Trinchieri 714/25 Sep,1980 |      Your vote accepted [0 after 0 votes] | | 3573736 Schlaeppi 385/115 Apr,1971 |      Your vote accepted [0 after 0 votes] | | |
|
|
|
|
U.S. References |
|
|
Foreign References |
|
|
|
|
|
|
Foreign References |
|
|
Other References |
|
|
|
|
|
|
Other References |
|
|
|
|
|
References  |
|
|
|
|
|
| Market Size |
|
Estimate the gross annual revenues of the relevant market
sector:
|
| | |
| |
|
|
| Market Share |
|
Estimate the percentage of the relevant market sector this invention will capture:
|
| | |
| |
|
|
| Reasonable Royalty |
|
What percentage of gross sales should the inventor or assignee be paid?
|
| | |
| |
|
|
|
Public's "Guesstimation" of Royalty Value
|
| Market Size | N/A | [No votes] | | x | Market Share | N/A | [No votes] | | x | Reasonable Royalty | N/A | [No votes] |
| | N/A | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
Market Review  |
|
|
Technical Review  |
|
|
Claims  |
|
|
Having thus described our invention, what we claim as new, and desire to
secure by Letters Patent is:
1. In a complex high speed-multi-execution unit uniprocessor system having
N-execution units and a plurality of General Purpose Registers (GPRs) each
of which is accessible to each of the execution units, said system
including an instruction decoder for decoding a sequence of program
instructions to be executed by said system during an instruction decoding
cycle, said N-execution units concurrently executing up to N decoded
instructions during subsequent execution cycles, said execution units
using the same GPR sequentially or different GPR's concurrently, the
improvement comprising;
means for preserving the logical integrity of the data within the
individual GPRs during simultaneous execution of sequential instructions,
including supplementary storage means associated with each GPR for storing
the unique IDs of execution units which are assigned usage of the specific
GPR,
means operative during the decoding cycle for assigning specific GPR
Registers to be used as sources and sinks for each instruction,
means for assigning an execution unit to each instruction by means of an
identifying tag (ID) unique to such execution unit, stored in the storage
means associated with respective GPRs,
means operative during the execution cycle of said system for allowing an
instruction to be processed by an execution unit when it is determined
that the instruction is independent of the execution of one or more
preceding instructions,
said means for determining if an instruction is independent comprising
means for determining if an identified GPR register to be used as a Source
or Sink for a particular instruction being executed has an execution unit
ID value stored in said supplementary storage means as the result of the
execution of a preceeding instruction, that matches an execution unit ID
value stored in the current execution unit and wherein,
said means for decoding operates substantially independently of the
execution of dependent instructions by the execution units.
2. A multi-execution unit uniprocessor system as set forth in claim 1
wherein said control mechanism includes means for determining if a request
for a GPR contained in an instruction being decoded is for immediate use
by the decoder, for a future assignment of use by a particular execution
unit or for immediate use by an execution unit, and wherein said
determinations are made during a decoding cycle of the control mechanism.
3. A multi-execution unit uniprocessor system as set forth in claim 2
wherein said control mechanism's means for loading said special purpose
tag fields in said GPRs includes means responsive to said determining
means for storing the identification tag of an execution unit currently
being assigned a future SINK operation in the particular GPR in that GPR's
SINK FORWARD TAG field.
4. A multi-execution unit uniprocessor system as set forth in claim 3
wherein said control mechanism for loading said special purpose tag fields
in said GPRs includes means for retaining the identification tag of the
execution unit which last requested a SINK operation in a GPR, in that
particular GPR's SINK FORWARD TAG field when a request for a future SOURCE
operation is detected during the decoding cycle.
5. A multi-execution unit uniprocessor system as set forth in claim 4
wherein said control mechanism's means for loading said special purpose
tag fields in said GPRs includes further means operative when a request
for a SOURCE operation is received, for determining if the current
requesting instruction being decoded is requesting the same SINK value
requested by an immediately preceding SOURCE request from another
execution unit and, if so, incrementing a SOURCE CTR associated with a
SOURCE TAG containing that SINK value, by one and, if not, storing the
current value in the SINK FORWARD TAG field of the GPR in that GPR's
SOURCE TAG field and incrementing its associated SOURCE CTR by one.
6. A multi-execution unit uniprocessor system as set forth in claim 5
wherein said control mechanism's means for leading said special purpose
tag fields in said execution units includes means for loading a SINK V
SOURCE TAG field in a particular execution unit with the value in the SINK
FORWARD TAG field of the particular GPR whose use the execution unit is
currently requesting.
7. A multi-execution unit uniprocessor system as set forth in claim 6
wherein there are as many SINK V SOURCE TAG fields in each execution unit
as there are GPRs in the system and said control mechanism includes means
for storing a SINK V SOURCE TAG value in as many SINK V SOURCE TAG fields
of the execution unit as there are GPRs whose future use is being
requested for a particular operation by the execution unit.
8. A multi-execution unit uniprocessor system as set forth in claim 7
wherein there are a plurality of SOURCE TAG/SOURCE CTR pairs in each GPR
and said control mechanism includes means for storing the current SINK
FORWARD TAG in the next available SOURCE TAG field if the SINK FORWARD TAG
doesn't match the last assigned SOURCE TAG field whenever the GPR is
assigned to a future SOURCE operation.
9. A multi-execution unit uniprocessor system as set forth in claim 8
wherein said control mechanism further includes means operative during the
future use assignment operation when a request is for future SOURCE usage
of a particular GPR to determine if there is a SOURCE TAG available in
that GPR's controls set to "home" status and if so, granting the request
and storing the current value of the SINK FWD TAG field in the SOURCE TAG
field and leaving the SINK FORWARD TAG field in the GPR unchanged.
10. A multi-execution unit uniprocessor system as set forth in claim 6
wherein said control mechanism includes means operable, during the
execution phase of operation, if a GPR request is for immediate use by the
decoder, to allow such use only if there is no pending prior SINK or
SOURCE request by an execution unit.
11. A multi-execution unit uniprocessor system as set forth in claim 9
wherein said control mechanism includes means operative, during the
execution phase of operation, if the request is for immediate use by an
execution unit, to compare a requesting execution unit's SINK V SOURCE TAG
with the GPR's SINK EXECUTE TAG and, if the comparison is affirmative, to
allow the request.
12. A multi-execution unit uniprocessor system as set forth in claim 11
wherein said control mechanism further includes means operative during the
execution phase of operation, when an execute unit's request is for a
SOURCE operation in a GPR, to decrement the SOURCE CTR associated with the
GPR's SOURCE TAG, determining if the associated SOURCE CTR is at a
predetermined value and, if so, resetting both the SOURCE TAG field and
its associated SOURCE CTR to "home" status, and leaving the GPR's SINK
EXECUTE TAG unchanged.
13. A multi-execution unit uniprocessor system as set forth in claim 12
wherein said control mechanism for loading said special purpose tag fields
in said GPRs during the execution phase of the system operation further
includes means operative, if the request is for a SINK operation, to
replace the GPR's current SINK EXECUTE TAG with the requesting execution
unit's ID TAG after granting the SINK request.
14. A multi-execution unit uniprocessor system as set forth in claim 12
wherein said control mechanism includes means operative during the
execution phase of operation, when a request for immediate SOURCE usage
for a GPR is received and wherein the GPR has a plurality of SOURCE
TAG/SOURCE CTR pairs, first determining if the SK V SRC TAG from the
execute unit matches the GPR's SINK EXEC TAG and, if so, determining which
SOURCE TAG in the requested GPR's SOURCE TAG fields matches the SINK V
SOURCE TAG field from the execute unit, allowing the SOURCE request and
decrementing the SOURCE CTR associated with the matching SOURCE TAG by
one, and determining if that counter has been decremented to a
predetermined value and, if so, setting the particular SOURCE TAG and
associated SOURCE CTR to "home" (no operation pending) status.
15. A multi-execution unit uniprocessor system as set forth in claim 14
wherein said control mechanism includes means operable during the
execution phase of operation for requesting SINK V SOURCE access to all of
the GPRs referenced in the various SINK V SOURCE TAG field in the execute
unit which are required for a particular operation.
16. In a complex high speed multi-execution unit uniprocessor system having
N-execution units and a plurality of General Purpose Registers (GPRs) each
of which is accessible to each of the execution units, said system
including an instruction decoder for decoding a sequence of program
instructions to be executed by said system during an instruction decoding
cycle, said "N" execution units concurrently executing up to "N" decoded
instructions during subsequent execution cycles, said execution units
using the same GPR sequentially or different GPR's concurrently, the
improvement comprising a means for preserving the logical integrity of the
data within the individual GPRs during simultaneous execution of
sequential instructions, said means comprising:
at least four special tag fields associated with each GPR, a SINK FORWARD
TAG, a SINK EXECUTE TAG, and at least one SOURCE TAG/SOURCE CRT pair, said
tag fields controlling access by an execution unit to a GPR during said
execution cycle,
each execution unit having associated therewith instruction register
storage means having, in addition to fields for storing an instruction to
be executed by the execution unit, at least two special purpose tag fields
for storing an identification (ID) tag unique to that execution unit and
at least one SINK V SOURCE TAG, and
a control mechanism including:
assigning means for assigning instruction to execution units, loading means
for loading said special purpose tag fields in said execution units and
said GPRs during said instruction decoding cycle of the system operation,
said control mechanism thereby preparing the execution units to execute
instructions, said loading means loading a particular GPR's SINK FORWARD
TAG field with the ID TAG of a first execution unit assigned an
instruction which uses that GPR as a SINK and further loading the first
execution unit's SINK V SOURCE TAG field with the ID TAG of a second
execution unit which uses the particular GPR as a SINK immediately prior
to the first execution units use of the GPR, as a source,
means operable, as each of the execution units complete execution of an
assigned instruction, to load a particular GPR's SINK EXECUTE TAG with an
ID TAG of the execution unit which has just finished using it, thereby
allowing the execution unit whose SINK V SOURCE TAG matches the particular
GPR's current SINK EXECUTE TAG to access the particular GPR and wherein,
said means for assigning operating substantially independently of the
execution of dependent instructions by the execution units.
17. A multi-execution unit uniprocessor system as set forth in claim 16
wherein said means for loading said special purpose tag fields in said
GPRs includes:
means for retaining the ID tag of the execution unit which last requested a
SINK operation in the GPR in the particular GPR's SINK FORWARD TAG field
when a subsequent request for a future SOURCE operation is received,
means further operative when a request for a future SOURCE operation is
received for determining if the current requesting execution unit is
requesting the same SINK value requested by an immediately preceding
SOURCE request from another execution unit and, if so, incrementing a
SOURCE CTR associated with a SOURCE TAG containing that SINK value by one
and, if not, storing the current value in the SINK FORWARD TAG field of
the GPR in an unused GPR SOURCE TAG field and incrementing its associated
SOURCE CTR by one.
18. A multi-execution unit uniprocessor system as set forth in claim 16
wherein said control mechanism includes: means operative, if a GPR request
is for immediate use by the decoder, to allow such use only if there is no
pending prior SINK or SOURCE request by an execution unit,
means operative, if the request is for immediate use by an execution unit,
to compare a requesting execution unit's SINK V SOURCE TAG with the GPR's
SINK EXECUTE TAG and, if the comparison is affirmative, to allow the
request, if for SOURCE usage or if for SINK usage and there is no SOURCE
request pending
means operative, if a request is for a SINK operation by an execution unit,
to replace the GPR's current SINK EXECUTE TAG with the requesting
execution unit's ID TAG after granting the SINK request,
means operative when an execute unit's request is for a SOURCE operation in
a GPR, to decrement the SOURCE CTR associated with the GPR's SOURCE TAG,
determining if the associated SOURCE CTR is at a predetermined value and,
if so, resetting both the SOURCE TAG field and its associated SOURCE CTR
to "home" status, and leaving the GPR's SINK EXECUTE TAG unchanged.
19. A multi-execution unit uniprocessor system as set forth in claim 16
wherein there are as many SINK V SOURCE TAG fields in each execution unit
as there are GPRs in the system and said control mechanism includes means
for storing a SINK V SOURCE TAG value in as many SINK V SOURCE TAG fields
of the execution unit as there are GPRs whose future use is being
requested for a particular operation by the execution unit whereby an
instruction may specify the use the use of a plurality of GPRs for an
instruction.
20. A multi-execution unit uniprocessor system as set forth in claim 19
wherein there are a plurality of SOURCE TAG/SOURCE CTR pairs in each GPR
and said control mechanism includes:
means operative when a request is for future SOURCE usage, to first
determine if there is a previous assignment of that SINK FORWARD TAG in a
SOURCE TAG field and, if so, incrementing the associated SOURCE CTR and if
not, to determine if there is a SOURCE TAG available in the GPR controls
and, if so, granting the request and storing the current value of the SINK
FWD TAG field in the SOURCE TAG field and leaving the SINK FORWARD TAG
field in the GPR unchanged, and
means operative when an execution unit is attempting to execute an
instruction, when a request for immediate SOURCE usage for a GPR is
received, for first determining if the SINK V SOURCE TAG of the execute
unit matches the GPR's SINK EXECUTE TAG and, if so, determining which
SOURCE TAG in the requested GPR's SOURCE TAG fields matches the SINK V
SOURCE TAG field from the execution unit, allowing the SOURCE request and
decrementing the SOURCE CTR associated with the matching SOURCE TAG by
one, and determining if that counter has been decremented to a
predetermined value and, if so, setting the particular SOURCE TAG and
associated SOURCE CTR to "home" (no operation pending) status,
said means being further operable for requesting SINK V SOURCE access to
all of the GPRs referenced in the various SINK V SOURCE TAG fields in the
execution unit which are required for a particular operation. |
|
|
|
|
Claims  |
|
|
Description  |
|
|
FIELD OF THE INVENTION
The present invention has particular utility in the field of complex, high
speed, electronic computer systems having the capability of executing
millions of instructions per second. More particularly, it relates to such
systems having a plurality of difference execution units capable of
working on a single instruction stream simultaneously or in an overlapped
execution configuration. Still, more particularly, it relates to a control
mechanism and architecture for the management of data in the general
purpose registers of such a system which guarantees the logical integrity
of data placed in and extracted form such registers by individual
execution units capable of operating asynchronously.
BACKGROUND OF THE INVENTION
The architectural definition of todays general purpose computers have
implicit in its definitions the sequential nature of the instructions
execution. In those areas where parallel operations are architected--for
example, the operation of the channel in parallel with the CPU's program
execution--special provisions are provided in the architecture to allow
the programmer to synchronize his CPU program with the asynchronous I/O
operation. The programmer correctly assumes that all the sequentially
architected instructions are started and finished sequentially.
The designers of high performance sequential machines recognized many years
ago that parallel execution of instructions would be useful to improve a
machine's performance. Their first approach in this direction was to
provide separate hardware facilities for the different stages in the
execution of an instruction. Thus, if the execution of an instruction took
"N" cycles; and if each cycle of the execution utilized a different
hardware facility, then the design could theoretically be executed "N"
instructions each in a different stage of completion. This concept is
commonly called pipelining.
The designers of pipelined machines would use few or many hardware
facilities to provide different degrees of pipelining, and hence
performance. It should be noted that logical interlocks, (for example, the
N'th instruction's input information would come from the N'th-l
instruction's output), and queuing on hardware facilities, has caused
pipelined machines to perform at far less performance than their
theoretical limit of one instruction per cycle. For a pipelined 370
machine to obtain an execution rate (assuming all data and instructions
are resident in the cache) of one instruction for every three machine
cycles one normal commercial code is considered good. The 3081 and 3090
are examples of IBM system 370 pipelined high performance systems. These
machines always finish instructions in sequential order.
To further increase performance designers can take the approach of
duplicating hardware facilities where queuing becomes a problem--usually
on the execute units, caches, and data busses--, and allowing out of
sequence execution completion. These extensions of pipelining, alleviate
the queuing problem on hardware facilities, and reduce the effects of the
logical interlocks. (It allows, for example, greater than "N" instructions
to be initiated before the first is completed.)
The price that is payed for the increased performance is additional
hardware, and increased control complexity to maintain the logical
sequentiality the architecture demands. The IBM 91/95/195 processors
utilized multiple execution units and out of sequence execution completion
for the floating point operations of systems 360 architecture. The CDC
6600 and 7600 systems also utilized both of the above extensions of
pipelining.
The present invention provides a unique method that allows, through a
tagging and counting scheme, each general purpose register (GPR) of the
system 370 architecture to be assigned a string of operations that must be
executed sequentially upon said GPR. Further, these assignments can be
intermixed with similar strings of operations for other GPR's. This
technique will provide a framework to allow multiple independent streams
of operations to each proceed sequentially, but concurrent to each other.
It provides the proper logical interlocks, both within each stream, and
between concurrent streams, which preserves the logical integrity of the
utilization of the GPR's.
DESCRIPTION OF THE PRIOR ART
U.S. patent application Ser. No. 591,705 filed Mar. 21, 1984, now U.S. Pat.
No. 4,574,349 of R. N. Rechtschaffen describes a method of renaming the 16
GPR's of system 370 into 256 hardware registers, these contain the
contents of memory locations that were addressed on prior load
instructions of the GPR's. If subsequent load instructions go to the same
memory locations as indicated by X2, B2, and D2, this is determined
through a fully associative scan of a pointer stack (Renaming Pointer
Storage). The contents of a part of this stack contains the addresses of a
register in an array of registers (Extension Registers) that contains the
memory data of a previous load. This data is used immediately while the
fetching of the actual data from memory is being accomplished and then
compared to data from the Extension Register. Recovery steps are taken
when a mismatch of data occurs.
As indicated above, both the intent and implementation technique of this
application is quite different than the subject invention.
U.S. Pat. No. 3,346,851 (J. E. Thorton el al) discloses using a tagging
scheme to provide sequencing between an instruction that uses a particular
register as a SINK/SOURCE and a future instruction that uses this same
register as a SOURCE/SINK. Its register management approach and
implementation are completely different from the present invention.
Briefly, the implementation involves two 4 bit fields called "Queue
Designations" that are associated with each execution unit. They are
loaded with the "tag" or identifier of a first execution unit; whenever a
SOURCE operand of a second execution unit will utilize the SINK operand of
the first, and the first is executing an operation that is prior to the
operation in the second.
If, in the above example, the first execution unit was executing an
operation that is subsequent to that in the second execution unit, then
the first unit will be prevented from writing its SINK operand into the
SOURCE operand register of the second execution unit until the second unit
has utilized this SOURCE.
This is accomplished through interrogation of a second set of 3 bit fields
associated with each execution unit that are called F Designators. The F
Designator field holds the basic register name; i.e., 1, 2, 3, 4, etc. .
.; that are to be used for both SOURCE and SINK operands. Thus, if the
interrogation encounters SOURCE F Designators for the SINK operand address
of execute unit one, it will prevent execute unit one from updating of its
SINK operand register.
In addition to being implemented differently than the present invention,
this patent has the following functional differences;
1. An instruction using the same SINK register as a previous, as yet
uncompleted instruction, cannot be issued from the decoder.
2. The execution units are particular as to type--add, multiply, divide,
shift, etc. . .--and if a particular unit is busy when an instruction is
decoded, the instruction cannot be issued from the decoder.
U.S. Pat. No. 4,313,161 discloses a system including a plurality of
processors connected to a shared storage. A flag field is utilized in the
storage address to "lock out" other processors. It is a fairly
conventional configuration.
U.S. Pat. No. 3,718,912 relates to the assignment of register storage space
to successive instructions. Each of the general purpose registers has tags
associated therewith indicating the nature of data stored therein. This
patent does not relate to a multi-execution unit system or to means for
synchronously register usage between same. The particular flags defined
herein and their usage are quite different from any suggestions in this
patent.
Patents numbered 3,778,776; 4,334,269; are both concerned with multiple
virtual memories. Neither of the patents has anything to do with the
logical ordering, for out of sequence execution, of concurrent streams of
operations, upon different general purpose registers, in an architecture
that is basically sequential.
Pat.Nos. 4,334,269 deals with the definition of an efficient architecture
of a pop-up stack CPU that has multiple general purpose registers. It is
stated that with the disclosed architecture a simple compiler that is
generally associated with a pop-up stack CPU will accomplish a data
processing application with as few instruction as a complicated compiler
that is generally associated with a multiple GPR architecture with no
pop-up stack. The patent has nothing to do with the logical ordering, for
out of sequence execution, of concurrent streams of operations, upon
different general purpose registers, in an architecture that is basically
sequential.
A number of additional references were found as a result of a prior art
search which are not overly relevant to the present invention, other than
that they deal with register assignment and management schemes. These
include: U.S. Pat. No. 3,611,306, inventors Riegel et al, "Mechanism to
Control the Sequencing of Partially Ordered Instruction in a Parallel Data
Processing System"; U.S. Pat. No. 4,373,180, inventor, J. P. Linde,
"Microprogrammed Control System Capable of Pipelining Even when Executing
a Conditional Branch Instruction"; U.S. Pat. No. 4,363,096, inventors, J.
A. Comfort, et al, "Arbitration Controller Providing for Access of a
Common Resource by a Duplex Plurality of Central Processing Units"; U.S.
Pat. No. 4,058,711, inventors, Robert M. Ondercin, et al, "Asynchronous
Dual Function Multiprocessor Machine Control"; and an article in Vol. II,
No. 11, Apr. 1969, IBM Technical Disclosure Bulletin by L. W. Smith,
"Instruction Sequencing System", pgs. 1496-1497.
SUMMARY OF THE INVENTION
It is a primary object of the present invention to provide a method and,
apparatus for guaranteeing the logical integrity of data stored in and
accessed from reaching the general purpose register of a multiple
asynchronous execution unit uniprocessor.
It is a further object of the invention to provide such a method and
apparatus which allows the out of sequence use of said registers with
respect to the instruction stream.
It is a further object of the invention to provide such a method and
apparatus which uses special synchronizing tags and counters in
conjunction with each general purpose register and execution unit, in a
unique way, to preserve said logical integrity of the data in said
registers.
It is another object of the invention to provide such a method and
apparatus which utilizes said tags and counters so that a given general
purpose register may be simultaneously used for a SOURCE and a SINK for a
given execution unit as specified by a single instruction.
Other objects features and advantages of the present invention will be
apparent from the subsequent description at the preferred embodiment of
the invention including the drawings and the claims appended hereto.
The objects of the present invention are accomplished in general by an
apparatus for controlling the access to the general purpose registers of a
high speed, complex machine architecture including a plurality of
execution units within a single CPU. By use of the invention, "N"
execution units may be concurrently executing up to "N" instruction using
different general purpose registers as either a SINK or SOURCE while, at
the same time, preserving the logical integrity of the data within the
GPR's. The invention allows a significantly higher degree of parallelism
in the execution of the instructions than would otherwise be possible if
only sequential operations were performed.
A group of at least four special purpose tag fields are associated with the
access control circuitry of each of the general purpose registers.
Additionally, two tag types are utilized with each of the "N" execute
units. Tags associated with each of the general purpose registers and each
execute unit are used together with control circuitry associated with the
instruction unit decoder, the GPR's and the individual execute units which
permit a multiple use of the registers while maintaining the requisite
logical integrity. Briefly, this is accomplished by storing an execute
unit/GPR sequence trail between the individual GPR's and execute units so
that before a given execute unit is permitted to utilize a particular GPR
as a SINK, the immediately preceding SINK operation by a different execute
unit must have been completed. Similarly, it is assured that all SOURCE
accesses to a given GPR by one or more execute units are completed before
a subsequent SINK operation is allowed to occur.
BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 comprises a functional block diagram of a typical complex high speed
uniprocessor having plural execution units.
FIG. 2 comprises a more detailed functional block diagram of the system of
FIG. 1 illustrating some of the physical details of the general purpose
registers and the multiple execution units and clearly showing data flow
within those portions of the system relevant to the present invention.
FIG. 3 comprises an illustrative graphical example of how the present
system utilizes the various TAG and COUNTER fields defined by the present
invention to set the required values in said fields for a particular
sequence of instructions all of which require the use of a particular GPR.
FIG. 4 comprises a flow diagram for the present system specifying the
operations which must occur in the GPR controls when a request for a SINK
GPR assignment is received by the control mechanism from the instruction
decoder.
FIG. 5 comprises a flow diagram for the present system specifying the
operations which must occur in the GPR when a request for a SOURCE GPR
assignment is received by the control mechanism from the instruction
decoder.
FIG. 6 comprises a flow diagram for the present system specifying the
operations which must occur in the GPR controls when a request for the
immediate use of a GPR as a SINK is received by the control mechanism from
an execute unit.
FIG. 7 comprises a flow diagram for the present system specifying the
operations which must occur in the GPR controls when a request for the
immediate use of a GPR as a SOURCE is received by the control mechanism
from an execution unit.
FIG. 8 comprises a flow diagram for the present system specifying the
operations which must occur in the execute unit controls to request a
particular GPR as either a SINK or a SOURCE.
FIGS. 9A and 9B comprise a flow diagram for the present system specifying
those operations that must occur in the decoder controls to effect the GPR
register assignments to various execution units while maintaining the
logical integrity of the data therein.
DESCRIPTION OF THE PREFERRED EMBODIMENT
The basic concept of the present invention may best be appreciated by
referring to the high-level functional block diagram of FIG. 1 which in
illustrates the architectural environment of the present invention but, in
a larger sense comprises a common architectural environment for any high
speed complex multi-execution unit uniprocessor. The hardware disclosed in
FIG. 1 is thus capable of executing the three essential phases of
operation of such a computing system, namely fetching instructions,
decoding the instructions and finally executing same. Instruction unit 10
accomplishes the function of accessing instructions from memory, including
the various address generation phases involved, and placing them in the
local instruction register. The Decoder and Control Mechanism 12 decodes
the individual instructions, assigns same to the various execution units
and assigns to the execution units the particular general purpose
registers which are to be used for a given instruction as SINK or SOURCE.
More significantly, for the purpose of the present invention, the Control
Mechanism sets up the requisite operations whereby the various
synchronizing tag fields are set during the register assignment phase of
operation. These tag fields are checked during the execution phase of the
operation before access is allowed as will be explained more fully
subsequently. Memory 14 is conventional in nature and is addressed by the
instruction unit to access instructions or by the execution units to
access or store data.
The general purpose registers (GPR's) are conventional in nature but, in so
far as their operation is concerned are unique in the present invention in
that they are provided with a plurality of special synchronizing tag
fields which will be explained more fully with reference to FIG. 2.
Similarly the N-execution units 18 are conventional in nature and could
either be identical units or different units having different functional
capabilities. The capabilities and functions would of course have to be
known by the decoder. The unique feature of the execute units necessitated
by the present invention is the provision of tag fields which are
functionally associated with each of the execute units. These will be
described more fully with respect to FIG. 2.
The essential or underlying feature of the present invention is the
provision and management of the special synchronizing tags in both the
general purpose registers and the execution units which are managed in a
unique fashion. The required logical integrity of the general purpose
registers is maintained, and operations using logically independent
registers can be executed out of their original sequence in the
instruction stream.
These tag fields and their functional location are clearly shown in FIG. 2.
It is noted that the numerals 12, 16, and 18 in FIG. 2 refer to the same
functional blocks as in FIG. 1 namely the decoder 10, distributed controls
12, the general purpose registers 16, and the "N-execute" units 18.
Before proceeding with a specific definition and explanation of each of the
tags, it should be clearly understood that the basic underlying philosophy
of the invention is to keep track of the sequence of the assignments of
the various general purpose registers to execute units. The various tag
fields located in or associated with each of the execute units and general
purpose registers are utilized to perform this function.
More specifically each execute unit is, in effect, told which execute unit
was assigned use of the register for SINK operations immediately preceding
its own request to prevent its using the register prematurely for a SINK
operation. Similarly the execute unit is advised as to which execute unit
must use the register as a SINK immediately before it can use the register
as a SOURCE. This, as will be appreciated, assures that the proper SOURCE
value or operand is given to the requesting execute unit.
Referring to FIG. 2, it will be seen in examining block 16 that a plurality
of tag fields is associated with each GPR. These are defined as a SINK FWD
TAG, a SINK EXEC TAG, SOURCE TAGS (0-N) and SRC CTR's (0-N). Referring to
the execute unit block 18, it will be noted that each execute unit has
associated with it, e.g., in its individual instruction register, and EXEC
unit ID tag, and a plurality of SINK V SRC TAG's (1-M) as well as fields
to hold the operation, and the memory operand where M could be up to 16.
It should be noted in passing that the reason for the plurality of SINK, V
SRC TAG's (1-M) tags is to allow for operations where it might be required
to access more than one GPR for a given operation.
As an aid to understanding the subsequent description of the operation of
the present system the following definition of each of the tags is set
forth to explain precisely what is stored in each field and its general
function in the overall operating scheme.
SINK FORWARD TAG
A Tag (or identifier) Field associated with each GPR that specified what
execution unit was the last to be given authority to use the GPR for a
SINK and has not as yet exercised this authority. If no execution unit has
a request unexercised, then that tag will be on "home" status.
SINK EXECUTE TAG
A TAG field associated with each GPR that specifies what execution unit can
next utilize | | |