RELATED PATENT APPLICATION
The patent application of James W. Keeley and Thomas F. Joyce entitled,
"Multiprocessor Shared Pipeline Cache Memory", issued as U.S. Pat. No.
4,695,943 on Sept. 22, 1987, which is assigned to the same assignee as
this patent application.
BACKGROUND OF THE INVENTION
1. Field of Use
This invention relates to address transfer apparatus and more particularly
to methods and apparatus for verifying that address information is being
transferred without error.
2. Prior Art
In general, many data processing systems do not include apparatus which
check address transfers, particularly when the address being transferred
is used to access a memory device. To ensure that memory addressing
proceeded properly in such instances, one prior art approach was to
combine the parity bits of the address applied to the memory device with
the address of the data and store the resulting information in the
addressed location.
During a subsequent cycle, the stored resulting bit was used to signal the
presence of an error or fault condition associated with the location being
accessed. An example of such an arrangement is described in U.S. Pat. No.
3,789,204 titled, "Self-Checking Digital Storage System", invented by
George J. Barlow.
While the above arrangement was effective in detecting memory faults or
errors, it detected errors occurring during the transfer of the address
only indirectly. The verification of such transfers becomes particularly
important where the address being transferred passes through an
incrementing circuit. In this type of arrangement, it becomes difficult to ensure that
the resulting address is valid without adding a substantial amount of
circuit redundancy. That is, a common approach has been to provide two
address incrementing circuits and a comparator. By comparing the
incremented addresses generated by both incrementing circuits, the
comparator is able to verify that the incrementing operation took place
without error. Thereafter, new parity can be generated for the verified
incremented address.
In addition to the added duplication, the above approach substantially
increases the amount of time required for verifying that the address
transfer proceeded without error. In today's high speed data processing
systems, the introduction of this type of address verification can
substantially reduce system performance. This problem is further
compounded where the addresses being transferred have undergone a virtual
to physical address translation operation which involved the generation of
parity bits, further delaying the transfer of the address to the memory
device such as a cache memory. In such arrangements, disparities in time
between the availability of the generated parity bits associated with the
physical address and the normal availability of the physical address
further adversely affect system performance, resulting in more stringent
requirements being placed on the virtual memory management unit which
performs such address translations.
Accordingly, it is a primary object of the present invention to provide an
improved method and apparatus for transferring addresses and their
associated integrity bits through an address path which includes an
incrementing circuit.
It is a further, more specific object of the present invention to provide
an improved method and apparatus which verifies if the transfer of
addresses proceeded without error.
SUMMARY OF THE INVENTION
The above and other objects of the present invention are achieved in a
preferred embodiment. The method and apparatus of the present invention
find particular utilization within a pipeline cache memory system such as
that disclosed in the related copending patent application of James W.
Keeley, et al. In such a system, addresses which have been translated by a
processing unit's virtual memory management unit (VMMU) or received from a
system bus are presented as part of the requests for accessing cache data.
In order to maintain high performance by the cache pipelined stages, the
requests must be received within a certain time interval or valuable cache
cycles will be lost. Another important consideration is that there is a
need to increment the addresses provided to the cache memory system. Since
it is important that such systems have high reliability, integrity or
parity bits are included as parts of such addresses.
The present invention provides a method and apparatus for generating
integrity bits for addresses transferred through an address path which
includes an incrementing circuit and for verifying if the transfer
occurred without error. This is accomplished by separating the integrity
bits from each address and generating a corresponding number of transform
bits which indicate a predetermined characteristic of the predicted change
in state of the number of bits within the address. The transform bits are
then used to transform the original integrity bits into integrity bits for
the incremented address.
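The relationship exploited here can be sketched in a few lines of Python.
This is an illustrative model only, not the circuitry of the preferred
embodiment: the function names are hypothetical, the integrity bit is
assumed to be the exclusive OR of the address bits it protects (the text
does not fix a parity convention), and the transform bit is computed by
comparing the address with its incremented value rather than by prediction
logic.

```python
def parity(value: int) -> int:
    """Return 1 when value has an odd number of ONE bits (assumed convention)."""
    return bin(value).count("1") & 1


def transform_bit(address: int, width: int) -> int:
    """Return 1 when an odd number of the low `width` bits change on +1."""
    mask = (1 << width) - 1
    changed = (address ^ (address + 1)) & mask  # bits that flip when incrementing
    return parity(changed)


def incremented_integrity_bit(address: int, width: int, integrity_bit: int) -> int:
    """Transform the integrity bit of `address` into that of `address + 1`."""
    return integrity_bit ^ transform_bit(address, width)
```

For example, incrementing the 8-bit value 0b00000011 to 0b00000100 changes
three bits, so the transform bit is a ONE and the integrity bit is
complemented without the incremented address ever being examined.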
By separately transforming the integrity bits of the address into integrity
bits for the incremented address, both operations can be accomplished
within a minimum of time. Moreover, the present invention allows for
disparities in time of arrival between the address and its integrity bits.
This reduces the time constraints which are imposed on the address sources
such as a VMMU. Also, it maintains a high performance level within the
address receiving unit such as the cache memory of the preferred
embodiment.
The present invention facilitates reliability by providing a method and
apparatus for verifying that address incrementing and/or transfer was
performed without errors. This is done by logically combining the
incremented address, the transform bits and integrity bits of the
unincremented address. When an error is indicated, the logical result is
then used to override the cache directory cycle and force a cache miss
condition. To allow for further disparities in the arrival times of the
addresses and their integrity bits, the integrity bits of the
unincremented address which are last to arrive are combined with the
result of combining the incremented address and transform bits.
In the preferred embodiment, the operation of generating the transform bits
is carried out by a programmable logic device (PLD). According to the
invention, the PLD generates the transform bit by determining from the
received address, the predetermined characteristic which, in the preferred
embodiment, is whether the number of bits predicted to change by
incrementing the address is an odd number.
In those systems, in which the address and its integrity bits are known to
arrive at the same time, the PLD can be used in the same manner to
transform the integrity bits of the received address into the integrity
bits of an incremented address. In this case, the PLD carries out the
operations of generating the transform bits and complementing the
integrity bits of the received address according to the states of the
transform bits.
The novel features which are believed to be characteristic of the invention
both as to its organization and method of operation, together with further
objects and advantages will be better understood from the following
description when considered in connection with the accompanying drawings.
It is to be expressly understood, however, that each of the drawings is
given for the purpose of illustration and description only and is not
intended as a definition of the limits of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a cache subsystem which incorporates the
method and apparatus of the present invention.
FIGS. 2a and 2b show in greater detail portions of the cache subsystem of
FIG. 1.
FIGS. 3 and 4 are flow and timing diagrams respectively used to explain the
operation of the method and apparatus of the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENT
FIG. 1 shows in block diagram form, the organization of a cache subsystem
14-6 which incorporates the method and apparatus of the present invention.
As shown, the cache subsystem 14-6 receives memory requests from a
plurality of sources 14-1 through 14-5. These sources include a pair of
central processing unit (CPU) subsystems 14-2 and 14-4, a system bus
source 14-1 and a replacement address register (RAR) source 14-5.
Each of the CPU subsystems 14-2 and 14-4 includes a virtual memory
management unit (VMMU) for translating CPU virtual addresses into physical
addresses for presentation to cache subsystem 14-6 as part of the memory
requests. The system bus source 14-1 includes a FIFO subsystem which
couples to a system bus and to the replacement address register (RAR)
source 14-5. The FIFO subsystem receives all of the information
transferred between any units connected to the bus in addition to any new
data resulting from a memory request being forwarded to the system bus by
cache subsystem 14-6.
Cache subsystem 14-6 is organized into a source address generation section
and two separate pipeline stages, each with its own decode and control
circuits. The source address generation section includes blocks 14-62
through 14-65 which perform the functions of source address selection and
generation. The first pipeline stage is an address stage and includes the
directory and associated memory circuits of blocks 14-66 through 14-76,
arranged as shown. This stage performs the functions of latching the
generated source address, directory searching and hit comparing. The first
pipeline stage provides as an output information in the form of a level
number and a column address. The operations of the first pipeline stage
are clocked by timing signals generated by the timing and control circuits
within the subsystem 14-6.
The information from the first stage is immediately passed onto the second
pipeline stage leaving the first stage available for the next source
request. The second pipeline stage is a data stage and includes the data
buffer and associated memory circuits of blocks 14-80 through 14-87,
arranged as shown. This stage performs the functions of accessing the
requested data from the buffer memories 14-88 and 14-90, or
replacing/storing data with data received from the system bus 14-1. The
second pipeline stage provides a 36-bit data word for transfer to one of
the CPU subsystems 14-2 and 14-4. Again, the operations of the second
pipeline stage are clocked by timing signals generated by cache subsystem
timing and control circuits.
The basic timing for each of the subsystem sources of FIG. 1 is established
by the cache subsystem timing and control circuits. Such control permits
the conflict-free sharing of cache subsystem 14-6 by CPU subsystems 14-2
and 14-4 and bus 14-1 including RAR source 14-5. These circuits are
described in greater detail in the related patent application. Briefly,
these circuits include address select logic circuits which generate
control signals for conditioning address selector 14-62 to select one of
the subsystems 14-2, 14-4 or 14-1/14-5 as a request address source.
Also, the timing circuits include pipeline clock circuits which define the
different types of cache memory cycles which can initiate the start of the
pipeline. This results in the generation of a predetermined sequence of
signals in response to each request which includes signals WRTPLS,
PIPE0A+0A and PIPE0B+0A. That is, first and second signals, respectively,
indicate a cache request for service by CPU0 subsystem 14-2 and CPU1
subsystem 14-4 while other signals indicate cache requests for service by
system bus 14-1.
The different blocks of the first and second pipeline stages are
constructed from standard integrated circuits, such as those described in
"The TTL Data Book, Volume 3", Copyright 1984, by Texas Instruments
Inc. and in the "Advanced Micro Devices Programmable Array Logic
Handbook", Copyright 1983, by Advanced Micro Devices, Inc. For example,
the address selector circuit of block 14-62 is constructed from two sets
of six 74AS857 multiplexer chips cascaded to select one of four addresses.
The latches of blocks 14-68 and 14-72 are constructed from 74AS843 latch
chips.
The directory memories 14-74 and 14-76 are constructed from 8-bit slice
cache address comparator circuits having part number TMS2150JL,
manufactured by Texas Instruments Incorporated. The address registers
14-80 and 14-84 are constructed from 9-bit interface flip-flops having
part number SN74AS823, manufactured by Texas Instruments, Inc. The address
increment circuits of block 14-64 are constructed from standard ALU chips
designated by part number 74AS181A.
As seen from FIG. 1, cache subsystem 14-6 is organized into even and odd
sections which permit two data words to be accessed simultaneously in
response to either an odd or even memory address. The arrangement of the
present invention enables the transfer of parity bits included within the
even and odd memory addresses presented by the address sources through the
cache pipeline stages. That is, in parallel with the required incrementing
being performed by incrementing circuit 14-64, apparatus in the form of
parity transform circuit 14-65 generates a plurality of transform bits
(FLPA08, FLPA16) which are stored in even address latches 14-72 in
response to a load address signal ADLOAD in place of the parity bits of
the address bits received from selector circuit 14-62 which are required
to be incremented. An AND gate 14-63 generates the signal ADLOAD by
combining signals WRTPLS and PIPE0A+0A.
The increment circuit 14-64 includes a lookahead circuit shown in greater
detail in FIG. 2a which generates as an output, an increment carry signal
INCRY0. This signal is applied as an input to transform circuit 14-65,
also shown in greater detail in FIG. 2a. The transform circuit 14-65 is
constructed from a programmable array logic (PAL) element having part
number AmPAL16L8B, manufactured by Advanced Micro Devices, Inc. As
explained herein in greater detail, the PAL circuit 14-65 is specially
programmed or burned according to the present invention to generate the
required transform bits.
In the preferred embodiment, only a portion (i.e., 10 bits) of the entire
physical address is incremented while the remaining address bits are
transferred through the cache subsystem pipeline stages unchanged. Thus,
the ten address bits (CMAD 13-22) which correspond to the low order byte
and a portion of the next low order byte of the address received from
selector circuit 14-62 are applied as inputs to transform circuit 14-65.
Also, the parity bits (CMAPEX, CMAP00, CMAP08 and CMAP16) for the selected
address are separated from the source address and loaded into the parity
address latches 14-66, in response to timing signal PIPE0A+0A.
Additionally, cache subsystem 14-6 includes odd and even parity check
circuits 14-69 and 14-70, a pair of OR gates 14-71 and 14-73 and a pair of
pipeline storage flip-flops 14-86 and 14-87 arranged as shown. According
to the present invention, these circuits verify that the address transfer
or address incrementing operation performed by circuit 14-64 proceeded
without error. The check circuits 14-69 and 14-70 generate the required
error signals for all four address bytes which are grouped within OR gate
circuits 14-71 and 14-73. The odd and even parity error signals ODAPER and
EVAPER from OR gates 14-71 and 14-73 are stored in error flip-flops 14-86
and 14-87 in response to timing signal PIPE0B+0A.
The parity check circuits 14-69 and 14-70 are constructed from standard
parity generator circuits designated by part number 74AS280 while error
flip-flops are constructed from standard clocked flip-flops designated by
part number 74AS1823. For ease of explanation, the gates 14-71 and 14-73
are shown as single OR gates which may be constructed using standard NAND
gates designated by part number 74S20 which operate as negative input OR
gates.
FIG. 2b illustrates in greater detail, a portion of even parity check
circuits 14-70. This portion includes parity generator circuit 14-700,
which combines the next low order byte address bits EVAD08-15 stored in
even latches 14-72 with the corresponding transform bit FLPA08 to produce
output signal EVAPE1. Signal EVAPE1 is then combined with address parity
bit signal CMAP08 within an exclusive OR circuit 14-702 to produce output
parity error signal EVAPE1A. This signal is applied to OR circuit 14-73
along with the three other signals generated by the remaining circuits of
parity check circuit 14-70.
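The two-step check of FIG. 2b can be modelled roughly as follows. The
sketch is an illustration under stated assumptions: the function names are
hypothetical, the parity bit is assumed to equal the exclusive OR of the
eight address bits it covers, and the actual signal polarities of the
74AS280 circuit are not modelled.

```python
def parity(value: int) -> int:
    """Return 1 when value has an odd number of ONE bits (assumed convention)."""
    return bin(value).count("1") & 1


def check_byte(evad08_15: int, flpa08: int, cmap08: int) -> int:
    """Return 1 (error) when the latched byte disagrees with its parity bit."""
    # First stage (parity generator 14-700): latched address byte combined
    # with the transform bit.
    evape1 = parity(evad08_15 & 0xFF) ^ flpa08
    # Second stage (exclusive OR 14-702): combine with the parity bit of the
    # unincremented address, which may arrive later.
    return evape1 ^ cmap08
```

As a usage example, suppose the unincremented byte was 0b00000111 (parity
bit 1) and a carry into it flips its three low order bits, giving
0b00000000 with a transform bit of 1. Then check_byte(0b00000000, 1, 1)
reports no error, whereas a latched byte with one wrong bit, such as
0b00000100, is reported as an error.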
DESCRIPTION OF OPERATION
With reference to FIGS. 1 through 2b and the timing and flow diagrams of
FIGS. 3 and 4, the operation of the cache subsystem 14-6 incorporating the
method and apparatus of the present invention will now be described. As
previously mentioned, the present invention enables cache subsystem 14-6
to maintain complete integrity within its address paths which include
incrementing circuits. The cache subsystem 14-6 receives from address
selector circuit 14-62 addresses from any one of the sources 14-1 through
14-5 which contain parity check bits or integrity bits. In order to
minimize the time constraints imposed upon the sources, in particular, the
CPU VMMU's, the arrangement of the present invention permits the arrival
times of the address and integrity bits to be skewed as indicated in FIG.
4. That is, the integrity bits generated by the VMMU are permitted to be
delayed up to half way through the directory cycle. At that time, they are
latched on the negative going or trailing edge of timing signal PIPE0A+0A.
The address bits are latched earlier in time, such as one quarter the way
through the directory cycle, in response to load signal ADLOAD. Along with
the address bits, the two transform bits are also latched. Subsequently,
the incremented address bits, the odd address bits and error signal if
detected are latched in response to timing signal PIPE0B+0A.
The transform bits are generated in parallel during the incrementing
operation by PAL circuit 14-65. As seen from FIG. 2a, the circuit receives as
inputs, cache memory address signals CMAD13-CMAD22 which correspond to
address bits 13-22, in addition to increment carry signal INCRY0 which is
low or a binary ZERO if address bits CMAD17-22 are high or binary ONES.
The PAL circuit 14-65 generates as outputs, signals FLPA08 and FLPA16
which correspond to flip address parity bit 08 and 16, respectively.
The states of signals FLPA08 and FLPA16 are generated according to the
following tables:
______________________________________________________________
FLPA08
______________________________________________________________
(n)   CMAD13  CMAD14  CMAD15  CMAD16  INCRY0  FLPA08
(0)   X       X       X       X       H       L
(0)   X       X       X       L       L       L
(1)   X       X       L       H       L       H
(2)   X       L       H       H       L       L
(3)   L       H       H       H       L       H
(3)   H       H       H       H       L       H
______________________________________________________________
FLPA16
______________________________________________________________
(n)   CMAD16  CMAD17  CMAD18  CMAD19  CMAD20  CMAD21  CMAD22  FLPA16
(0)   X       X       X       X       X       X       L       L
(2)   X       X       X       X       X       L       H       L
(3)   X       X       X       X       L       H       H       H
(4)   X       X       X       L       H       H       H       L
(5)   X       X       L       H       H       H       H       H
(6)   X       L       H       H       H       H       H       L
(7)   L       H       H       H       H       H       H       H
(7)   H       H       H       H       H       H       H       H
______________________________________________________________
As indicated in the table for signal FLPA08, when the carry-in signal
INCRY0 is high, no incrementing is to take place. Conversely, when signal
INCRY0 is low, incrementing will take place. The states of signals INCRY0
and CMAD16 define whether there was a carry from the low order byte of the
address. When an odd number of the address bits CMAD13-15 are predicted to
change state as a result of the carry, the transform bit signal FLPA08 is
set to a ONE. In the table for signal FLPA16, when address bit 22 (CMAD22)
is low, no incrementing is to take place. Conversely, when address bit 22
is high, incrementing will take place. The numbers in parentheses to the
left side of each table indicate the number of bits predicted to change
state.
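The prediction described by the tables can be sketched in Python as
follows. This is a behavioural model only, not the PAL programming: the
function names and the packing of CMAD13-22 into an integer (CMAD22 as the
least significant bit) are assumptions made for illustration, and the
number of changed bits is counted the way the tables count it, that is, as
if one were added at bit 22 whenever bit 22 is a ONE, with any carry out of
bit 13 discarded.

```python
def trailing_ones(value: int, width: int) -> int:
    """Count consecutive ONE bits starting at the least significant bit."""
    count = 0
    while count < width and (value >> count) & 1:
        count += 1
    return count


def transform_bits(field: int) -> tuple[int, int]:
    """Return (FLPA08, FLPA16) for the 10-bit field CMAD13-22.

    Assumed packing: CMAD22 is bit 0 of `field`, CMAD13 is bit 9.
    """
    low7 = field & 0x7F          # CMAD16-22
    high3 = (field >> 7) & 0x7   # CMAD13-15 (CMAD15 is bit 0 of high3)
    cmad22 = low7 & 1
    cmad16 = (low7 >> 6) & 1
    # INCRY0 is low (0) only when CMAD17-22 are all ONES.
    incry0 = 0 if (low7 & 0x3F) == 0x3F else 1

    # FLPA16 table: bits of CMAD16-22 predicted to change (none if CMAD22 low).
    changed16 = min(trailing_ones(low7, 7) + 1, 7) if cmad22 else 0
    # FLPA08 table: a carry reaches CMAD15 only when INCRY0 is low and
    # CMAD16 is high; then count the bits of CMAD13-15 predicted to change.
    carry = incry0 == 0 and cmad16 == 1
    changed08 = min(trailing_ones(high3, 3) + 1, 3) if carry else 0

    # Each transform bit is ONE when the predicted count is odd.
    return changed08 & 1, changed16 & 1
```

The model looks only at the unincremented address bits, so it can be
evaluated in parallel with the incrementing operation, which is the
property the transform circuit relies on.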
From the above tables, the Boolean or logical equations for signals FLPA08
and FLPA16 are as follows:
##EQU1##
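As a hedged reconstruction, reading the rows of the above tables in which
the transform bit is high gives the following sum-of-products forms. These
are one logically equivalent formulation derived here from the tables, not
necessarily the form programmed into the PAL circuit, and the output
polarity of the device is not considered. An overbar denotes a low (binary
ZERO) signal.

```latex
\begin{aligned}
\mathrm{FLPA08} &= \overline{\mathrm{INCRY0}}\cdot\mathrm{CMAD16}\cdot\overline{\mathrm{CMAD15}}
  \;+\; \overline{\mathrm{INCRY0}}\cdot\mathrm{CMAD16}\cdot\mathrm{CMAD15}\cdot\mathrm{CMAD14} \\
\mathrm{FLPA16} &= \mathrm{CMAD22}\cdot\mathrm{CMAD21}\cdot\overline{\mathrm{CMAD20}} \\
  &\quad+\; \mathrm{CMAD22}\cdot\mathrm{CMAD21}\cdot\mathrm{CMAD20}\cdot\mathrm{CMAD19}\cdot\overline{\mathrm{CMAD18}} \\
  &\quad+\; \mathrm{CMAD22}\cdot\mathrm{CMAD21}\cdot\mathrm{CMAD20}\cdot\mathrm{CMAD19}\cdot\mathrm{CMAD18}\cdot\mathrm{CMAD17}
\end{aligned}
```

Each product term corresponds to one group of table rows with an odd
number of bits predicted to change: one or three bits for FLPA08, and
three, five or seven bits for FLPA16.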
In the instance where there is only a small differential between the
arrival times of the address and integrity bits, PAL circuit 14-65 can
also be used directly to transform the parity or integrity bits of the
address. In this case, integrity or parity bit signals CMAP08 and CMAP16
are also applied as inputs to PAL circuit 14-65. The states of the
transformed integrity bits CMAP08E and CMAP16E are generated according to
the following tables:
__________________________________________________________________________
CMAP08E
__________________________________________________________________________
(n)   CMAD13  CMAD14  CMAD15  CMAD16  INCRY0  CMAP08  CMAP08E
(0)   X       X       X       X       H       L       L
(0)   X       X       X       X       H       H       H
(0)   X       X       X       L       L       L       L
(0)   X       X       X       L       L       H       H
(1)   X       X       L       H       L       L       H
(1)   X       X       L       H       L       H       L
(2)   X       L       H       H       L       L       L
(2)   X       L       H       H       L       H       H
(3)   L       H       H       H       L       L       H
(3)   L       H       H       H       L       H       L
(3)   H       H       H       H       L       L       H
(3)   H       H       H       H       L       H       L
__________________________________________________________________________
CMAP16E
__________________________________________________________________________
(n)   CMAD16  CMAD17  CMAD18  CMAD19  CMAD20  CMAD21  CMAD22  CMAP16  CMAP16E
(0)   X       X       X       X       X       X       L       L       L
(0)   X       X       X       X       X       X       L       H       H
(2)   X       X       X       X       X       L       H       L       L
(2)   X       X       X       X       X       L       H       H       H
(3)   X       X       X       X       L       H       H       L       H
(3)   X       X       X       X       L       H       H       H       L
(4)   X       X       X       L       H       H       H       L       L
(4)   X       X       X       L       H       H       H       H       H
(5)   X       X       L       H       H       H       H       L       H
(5)   X       X       L       H       H       H       H       H       L
(6)   X       L       H       H       H       H       H       L       L
(6)   X       L       H       H       H       H       H       H       H
(7)   L       H       H       H       H       H       H       H       L
(7)   H       H       H       H       H       H       H       L       H
__________________________________________________________________________
It will be noted that signals CMAP08 and CMAP08E are both a function of
address bits 8-15 while signals CMAP16 and CMAP16E are both a function of
address bits 16-22.
From the above, the Boolean or logical equations for signals CMAP08E and
CMAP16E are as follows:
##EQU2##
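Reading the rows of the above tables, each transformed integrity bit equals
the received integrity bit when the number of bits predicted to change is
even, and its complement when that number is odd. Under that reading, and
writing FLPA08 and FLPA16 for the transform bits described earlier, the
tables are consistent with the following compact form. It is a summary
derived here from the tables, not necessarily the expanded form programmed
into the PAL circuit.

```latex
\begin{aligned}
\mathrm{CMAP08E} &= \mathrm{CMAP08} \oplus \mathrm{FLPA08} \\
\mathrm{CMAP16E} &= \mathrm{CMAP16} \oplus \mathrm{FLPA16}
\end{aligned}
```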
Now referring to FIG. 3, it will be assumed that address selector circuit
14-62 has selected CPU 0 VMMU 14-2 as the address source. At the beginning
of a cache cycle as established by the cache timing circuits, portions of
the selected 36-bit address are presented as inputs to the odd address
latches 14-68, the even address latches 14-72, increment circuit 14-64 and
parity transform circuit 14-65. In the preferred embodiment, the arrival
of the 4 integrity bits CMAPEX, CMAP00 through CMAP16 can be delayed.
Therefore, the 32 source address bits are latched into odd address latches
14-68. That is, address bit 22 (CMAD22) is the odd/even starting address
bit. If it is a binary ZERO, this specifies that the selected source
address is already even so that no incrementing need take place. If
address bit 22 is a binary ONE, it specifies that incrementing takes place
and that the selected source address is odd.
From the above, as seen from FIG. 3, the selected source address bits
(CMAD16-21) of the low order byte, which are incremented as a function of
the state of address bit 22, are transferred to the even address latches
14-72 without change when bit 22 is a binary ZERO. When bit 22 is a binary
ONE, the low order byte address bits CMAD16-21 are incremented by one by
circuit 14-64.
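The odd/even selection just described can be sketched as below. The
function name and the integer packing (CMAD22 as the least significant bit
of the ten-bit field CMAD13-22) are assumptions for illustration, as is the
treatment of the operation as adding one at bit 22 with any carry out of
bit 13 discarded.

```python
FIELD_MASK = 0x3FF  # the ten incremented address bits, CMAD13-22


def even_latch_field(field: int) -> int:
    """Return the CMAD13-22 value presented to the even address latches."""
    if field & 1 == 0:
        # CMAD22 is a binary ZERO: the source address is already even.
        return field
    # CMAD22 is a binary ONE: the source address is odd, so increment it.
    return (field + 1) & FIELD_MASK
```

For instance, an even field such as 0b0000000010 passes through unchanged,
while the odd field 0b0000000011 becomes 0b0000000100.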
While incrementing is taking place, PAL transform circuit 14-65 uses the
states of the low order byte address bits CMAD16-21 to set the low order
transform bit FLPA16 to a state which indicates whether the number of low
order byte address bits predicted to change state because of incrementing
is odd. If the number is odd, bit FLPA16 is set to a binary ONE;
conversely, it is set to a binary ZERO when the number of bits predicted
to change is even.
As seen from FIG. 3, PAL transform circuit 14-65 sets the next low order
transform bit FLPA08 to a state which indicates whether the number of
address bits (CMAD13-15) of a portion of the next low order address byte
predicted to change state because of incrementing is odd. Incrementing is
established by the state of the increment carry signal INCRY0 from the
NAND gate 14-640 of FIG. 2a. When signal INCRY0 is a binary ONE, this
indicates that no incrementing is to take place. Conversely, when signal
INCRY0 is a binary ZERO indicating that the address signals CMAD17-22 are
all ONES, this indicates that incrementing is to take place.
As seen from FIG. 3, if the number of next low order byte address bits
CMAD13-15 predicted to change state is odd, then transform bit FLPA08 is
set to a binary ONE. Conversely, if the number predicted to change state
is even, then transform bit FLPA08 is set to a binary ZERO.
The address bits including 10 incremented address bits and 2 transform bits
are latched into even address latches 14-72 in response to address load
signal ADLOAD. At the same time, the unincremented 32 address bits are
latched into odd address latches 14-68. As seen from FIG. 3, the latched
transform bits are used to complement or invert the states of the later
arriving byte integrity bits CMAP08 and CMAP16 which are latched into
parity address latches 14-66 in response to timing signal PIPE0A+0A.
Thereafter, the parity check circuits 14-69 and 14-70 are used to verify
that the source address was transferred and/or incremented without error.
The arrangement of the present invention maximizes the delay in time for
arrival or, stated differently, provides as much time as possible for the
late arriving integrity bits by first combining the incremented address
bits with the transform bits as illustrated by FIG. 2b. The intermediate
result is then combined with the late arriving integrity bit such as by
the exclusive OR circuit 14-702 which performs the required complementing
or inverting of the integrity bit as a function of the state of the
corresponding transform bit. Since both complementing and verifying are
exclusive OR operations, they can be performed in any sequence with the
same results.
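The order independence noted above can be checked exhaustively for a single
byte. The sketch below is illustrative only; the function names are
hypothetical and the integrity bit is assumed to be the exclusive OR of the
byte's bits.

```python
from itertools import product


def parity(value: int) -> int:
    """Return 1 when value has an odd number of ONE bits (assumed convention)."""
    return bin(value).count("1") & 1


def complement_then_verify(byte: int, flip: int, integrity: int) -> int:
    """Complement the integrity bit first, then compare it with the byte."""
    transformed = integrity ^ flip
    return parity(byte) ^ transformed


def combine_then_verify(byte: int, flip: int, integrity: int) -> int:
    """Combine byte and transform bit first; apply the late integrity bit last."""
    intermediate = parity(byte) ^ flip
    return intermediate ^ integrity


def orderings_agree() -> bool:
    """Compare both orderings over every byte, transform bit and integrity bit."""
    return all(
        complement_then_verify(b, f, i) == combine_then_verify(b, f, i)
        for b, f, i in product(range(256), (0, 1), (0, 1))
    )
```

Because the two orderings always agree, the integrity bit can be the last
input applied, which is what gives it the longest possible time to arrive.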
As seen from FIG. 3, the results of the verification or checking operation
are stored in pipeline flip-flops 14-86 and 14-87. That is, if any one of
the 4 address bytes stored in the odd and even address latches
14-68 and 14-72 produces an error signal, this causes the corresponding one
of the OR gates 14-71 and 14-73 to force its output to a binary ONE. This,
in turn, forces one of the error flip-flops 14-86 and 14-87 to switch to a
binary ONE state in response to timing signal PIPE0B+0A. The error signals
generated by OR gates 14-71 and 14-73 are used to force the cache hit
circuits to signal a miss condition preventing the cache subsystem 14-6
from reading out the incorrect data from its buffer memories 14-88 and
14-90. Thus, the error detected during the performance of an integrity
cycle, overrides the directory cycle creating the cache miss condition.
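The override described above amounts to gating the directory result with
the grouped error signals. A minimal sketch follows, with hypothetical
function names; the text does not state whether each error signal gates
only its own (odd or even) hit circuit, so the sketch simply forces a miss
when either signal is set.

```python
def grouped_error(byte_errors: list[int]) -> int:
    """OR together the per-byte error signals, as OR gates 14-71 and 14-73 do."""
    return int(any(byte_errors))


def cache_hit(directory_hit: bool, odaper: int, evaper: int) -> bool:
    """Honour a directory hit only when neither address path reports an error."""
    return directory_hit and not (odaper or evaper)
```

With this gating, a request whose address failed the integrity check is
treated as a miss, so no data is read out of the buffer memories for it.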
From the above, it is seen how the method and apparatus of the present
invention provide very efficient high speed generation of integrity bits
for an address which is required to be transferred through an increment
path. This generation can tolerate differences in arrival times between
the address and its integrity bits. Additionally, for purposes of
reliability, the present invention permits verification of the address
transfer and/or incrementing operations.
It will be obvious to those skilled in the art that many changes may be
made to the preferred embodiment of the present invention. For example,
the invention may be used to generate integrity bits for any number of
bytes for other types of sources for use by various types of devices.
Also, other types of programmable logic devices may be employed by the
present invention.
While the predetermined characteristic was defined in terms of whether the
number of bits predicted to change state is odd, the characteristic could
be modified. Also,
while the increment operation involved adding of a constant equal to one,
other types of increment operations can also be performed by the invention
in a similar fashion.
While in accordance with the provisions and statutes there has been
illustrated and described the best form of the invention, certain changes
may be made without departing from the spirit of the invention as set
forth in the appended claims and that in some cases, certain features of
the invention may be used to advantage without a corresponding use of
other features.
* * * * *