Claims
I claim:
1. In a fault-tolerant multiprocessor arrangement comprising at least a
first and a second processor, a first data memory for storing memory
words, each of which being stored under an address different from the
address of another of said memory words, a second data memory for storing
a pair of memory words under each of said addresses, and a control logic
with at least a check bit memory having for each pair of memory words
stored under the same address in said second data memory at least two
check bits for completing the address of each pair of memory words and a
one-bit storage cell containing a control bit, an operating method
comprising the steps of:
said first processor during execution of a program containing recovery
points writes a first of said memory words under a first of said addresses
into said first data memory, transfers said memory word including said
first address from said first to said second data memory, and writes said
transferred first memory word and its respective address into said second
data memory under control of said control logic,
said first processor during execution of said program further transfers a
recovery point signal indicating in each case the arrival of said program
at a recovery point, thereby causing the control logic to invert said
control bit of said one-bit storage cell at each recovery point in
response to said transferred recovery point signal,
said control logic during the process of writing of said first memory word
into said second data memory reads out of said check bit memory a first
and a second check bit discriminated by said first address, forms a first
pointer bit from said first, said second and said control bit in
accordance with the rule:
OLD(1) := BIT1(1) \cdot \overline{A} + BIT2(1) \cdot A
and a second pointer bit having a binary state which is the inverse of
said first pointer bit, feeds said second pointer bit to an address input
of said second data memory to complete the pair address for said first
address of said first memory word of said second data memory,
newly determines the binary states of said first and second check bits in
accordance with the rules:
BIT1(1) := NEW(1) \cdot A + OLD(1) \cdot \overline{A}, and BIT2(1) := NEW(1) \cdot \overline{A} + OLD(1) \cdot A,
writes said newly determined first and second check bits into said check bit
memory,
said control logic in said interval between said two current recovery
points further reads out of said check bit memory a third and a fourth
check bit discriminated by a second address of a pair of secondary memory
words into which no data have yet been entered within said interval,
forms a third and a fourth pointer bit from said third and said fourth
check bit and said control bit in accordance with the rules:
OLD(2):=BIT1(2).A+BIT2(2).A, and NEW(2):=BIT1(2).A+BIT2(2).A
newly determines the binary states of said third and said fourth check bit
from said third and said fourth pointer bit and said control bit in accordance
with the rules:
BIT1(2):=NEW(2).A+OLD(2).A, and BIT2(2):=NEW(2).A+OLD(2).A,
and writes said newly determined third and fourth check bit into said
check bit memory,
said second processor in response to a fault signal of said first processor
takes over and continues the execution of said program at a recovery point
last reached by said first processor during execution of said program by
use of the data states at this recovery point transferred into said second
data memory,
said control logic during the process of each reading of said second
processor from said second data memory forms in the case of said first or
said second address said first or said third pointer bit, and
feeds said formed first or third pointer bit to said address input of said
second data memory for completing the pair addresses for said first or
second address of said first or second memory word of said second data
memory, wherein
A means said control bit,
BIT1(1), BIT2(1), BIT1(2) and BIT2(2) mean said first, second, third and
fourth check bit,
OLD(1), NEW(1), OLD(2) and NEW(2) mean said first, second, third and fourth
pointer bit,
the dot means a logical AND operation,
the plus sign means a logical OR operation, and
the bar above the symbols means negation.
2. A method according to claim 1 wherein said control logic contains an
address generator, said method further comprising the steps of:
said control logic executes forming of pair addresses in a loop through all
addresses, and
said address generator successively generates all addresses.
3. A method according to claim 1 wherein said control logic contains an
address generator, said method further comprising the steps of:
said control logic executes forming of pair addresses in a loop through
groups of addresses having in each case a common group address, and
said address generator successively generates said group addresses.
4. A method according to claim 2 or 3 wherein said check bit memory has for
each first resp. second address in addition to said first and second resp.
third and fourth check bit a fifth resp. a sixth check bit, said method
further comprising the steps of:
said control logic reads said fifth resp. sixth check bit in each case
together with said first and second resp. third and fourth check bit from
said check bit memory, after processing and if necessary changing, writes
back said fifth resp. sixth check bit together with said first and second
resp. third and fourth check bit into said check bit memory,
forms a first resp. a second initialization bit from said fifth resp. said
sixth check bit and said control bit in accordance with the rules:
INI(1):=BIT3(1).A+BIT3(1).A, and INI(2):=BIT3(2).A+BIT3(2).A,
sets said first initialization bit with each execution of writing into
said second data memory to a first binary state, for example to "1",
newly determines the binary state of said fifth check bit in accordance
with the rule
BIT3(1):=INI(1).A+INI(1).A+INI(1).A,
writes said newly determined fifth check bit into said check bit memory,
uses said first initialization bit with each execution of reading from said
second data memory in a loop for recognizing addresses of pairs of memory
values into which data have already been written in the respective current
interval, sets said second initialization bit with each forming of address
pairs in a loop to a first binary state,
newly determines the binary state of said sixth check bit as described, and
writes the newly determined sixth check bit into the check bit memory,
wherein
BIT3(1) resp. BIT3(2) means said fifth resp. sixth check bit, and
INI(1) and INI(2) mean said first and said second initialization bit.
5. A method according to claim 1, in which:
the control logic executes the forming of pair addresses in parallel for
all addresses immediately after arrival at each recovery point, and
only afterwards writes into said second data memory.
6. A method as claimed in claim 1, wherein said second data memory has two
memory areas having in each case pairs of memory words, in which one of
the memory areas is used as a save memory area for saving the data state
of said first data memory at the recovery points and the other memory area
is used as main memory area by said second processor, said method further
comprising the steps of:
said control unit in each case supplies a pointer bit, which does not change
in time, to said address input of said second data memory for
completing the pair address for the memory word address during all
memory accesses to the main memory area.
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to a fault-tolerant multiprocessor
arrangement and more particularly to a fault-tolerant multiprocessor with
a memory for storing the previous data state.
2. Discussion of Background
Transient or permanent hardware and software faults in a fault-tolerant
multiprocessor arrangement must be detected and eliminated as quickly as
possible. For the purpose of fault detection, two processors forming one
data processor are frequently operated in parallel with the same program
and their results are monitored for correspondence. As soon as a fault
occurs, program execution is interrupted (fail-stop function). To ensure
that the program is executed even after a fault, particularly in the case
of process computer applications, a back-up task is activated on a
hot-standby processor which is provided with the same I/O channels.
Since it is a very complex task to guarantee the integrity of each
individual (atomic) operation during the activation of the standby
processor and since more time is frequently required for fault detection
than for the execution of an individual operation, execution of the
program is usually resumed by the standby processor at one of several
earlier points specially provided for in the program (recovery point).
This point has been reached without faults by the processor originally
executing the program (rollback technique). By resuming at this point,
some of the operations will be repeated.
Execution of the program is preferably resumed at the recovery point last
reached without faults. However, this is only possible if the time of
occurrence of the fault and the time of its detection lie within the same
interval between the same recovery points, for example recovery points
RP.sub.i and RP.sub.i+1. Otherwise, execution of the program must be
resumed at the recovery point RP.sub.i-1 or an even earlier recovery point
(multiple-step rollback technique).
To resume program execution at a recovery point, the standby processor
needs, in addition to the program, a copy of the data state which existed
at this recovery point in the main memory of the processor originally
executing the program. The program, which does not change with time, can be
made available to the standby processor at program start-up. In contrast,
copies of the continuously changing data state at each recovery point must
be created and stored in a manner accessible to the standby processor, at
least until the respectively next recovery point is reached.
The speed with which program execution is resumed by the standby processor
after an error and with which the state of program execution already
achieved when the error occurred is achieved again obviously depends on
the mechanism of creation of these copies, the mechanism by which these
copies are accessed by the standby processor and the amount of code
between the individual recovery points in the program.
The data state copies are created in a so-called state save unit (SSU).
This unit is arranged to be electrically isolated and spatially separated
from the processor executing the program so that it will not also be
affected by an error in this processor. As has already been mentioned, the
standby processor should be able to rapidly access the data state save
copies, which is why the state save unit is usually arranged close to the
standby processor.
The simplest way of creating the data state copies at the recovery points
consists of the processor executing the program transferring the total
contents of the memory allocated to it into the state save unit after each
recovery point has been reached. However, program execution must be
interrupted during the time required for this operation.
To save time, the amount of data to be transferred into the state save unit
can be reduced by transferring only the changes D.sub.i, i+1, produced by
the write accesses in the interval which has just elapsed, for example
between recovery points RP.sub.i and RP.sub.i+1, to the state save unit.
This is possible because the data state S.sub.i+1 at the recovery point
RP.sub.i+1 differs from the preceding data state S.sub.i at recovery point
RP.sub.i only by the changes D.sub.i, i+1 :
S.sub.i+1 =S.sub.i +D.sub.i , i+1.
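The relation above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a data state is modelled as a dictionary from addresses to memory word contents and that the changes D.sub.i, i+1 are the words written during the interval. The name apply_changes and the sample values are hypothetical and do not appear in the original.

```python
def apply_changes(state, changes):
    """Overlay the words written in one interval on an earlier data state."""
    updated = dict(state)      # leave the earlier state untouched
    updated.update(changes)    # each changed word replaces the old contents
    return updated


# Hypothetical data state S_i and changes D_(i, i+1).
s_i = {0x10: 7, 0x11: 3, 0x12: 9}
d_i_next = {0x11: 4}

s_next = apply_changes(s_i, d_i_next)  # S_(i+1) = S_i + D_(i, i+1)
print(s_next)
```

Only the single changed word has to be transferred in this example; the other two words of the state are carried over unchanged.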
If an error occurs during the process of transferring the data or changes
D.sub.i, i+1 into the state save unit using this procedure, the problem
arises that the no longer current "old" data state S.sub.i is already
partially overwritten by the data or changes of the more current "new"
data state S.sub.i+1 but the "new" data state S.sub.i+1 has not yet been
completely recorded. There is then no valid data state available in the
state save unit.
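The problem can be made concrete with a small Python sketch. It assumes that the state save unit is a single dictionary that is overwritten word by word and that a fault cuts the transfer short after a given number of words. The function transfer_in_place and its fail_after parameter are hypothetical names introduced only for this illustration.

```python
def transfer_in_place(save_unit, changes, fail_after=None):
    """Write the changes word by word; stop early to imitate a fault.

    Returns True if all changes were recorded and False if the transfer
    was cut short after `fail_after` words.
    """
    for count, (adr, value) in enumerate(changes.items()):
        if fail_after is not None and count >= fail_after:
            return False
        save_unit[adr] = value
    return True


s_i = {0: 1, 1: 2, 2: 3}          # hypothetical "old" data state S_i
changes = {0: 10, 1: 20}          # hypothetical changes D_(i, i+1)
s_next = {**s_i, **changes}       # the "new" data state S_(i+1)

save_unit = dict(s_i)
completed = transfer_in_place(save_unit, changes, fail_after=1)
# save_unit now holds a mixture that is neither S_i nor S_(i+1).
print(completed, save_unit)
```

After the interrupted transfer the save unit contains one word of the "new" state and two words of the "old" state, so neither state can be restored from it.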
One possibility of avoiding this problem consists in duplicating the state
save unit. Such a duplication is already known from Ferridun, A. M.;
Shien, K. G., A fault tolerant multiprocessor with rollback recovery
capability, Proc. 2nd Intern. Conf. on Distr. Comp. Systems, pages
283-289, 4/81.
While one half of the duplicated state save unit is in each case available
for receiving the new data state, for example S.sub.i+1, the "old" data
state S.sub.i is stored in the other half. The function of the two halves of
the duplicated state save unit (updating, storing) alternates at each
recovery point.
Since the saved data state, for example S.sub.i, in one half of the
duplicated state save unit is not influenced by filing the new data state
S.sub.i+1 in its other half, filing of the new data state S.sub.i+1 can
take place during program execution in the interval between recovery
points RP.sub.i and RP.sub.i+1 so that, as a rule, no further interruption
of program execution is required.
To file the data state S.sub.i+1, it is possible to copy the data state
S.sub.i in the state save unit into its respective other half in order to
be updated and at the same time to record the current changes D.sub.i, i+1
in this half. On the other hand, it is also possible to transfer, instead
of the entire data state S.sub.i, only the changes D.sub.i-1, i carried
out in the preceding interval between recovery points RP.sub.i-1 and
RP.sub.i in the half containing data state S.sub.i in the current interval
into the half to be updated in the state save unit because it holds true
that:
S.sub.i+1 =S.sub.i-1 +D.sub.i-1, i +D.sub.i, i+1.
For this purpose, however, the state save unit memory words or addresses
which have been modified in the preceding interval must be flagged. This
can be done by allocating a separate bit to each memory word of the state
save unit. Another bit can be used in a known manner for identifying the
memory words of the state save unit half to be updated in which current
changes D.sub.i, i+1 have already been made in the interval current in
each case. This allows the transfer of changes D.sub.i-1, i in the state
save unit and the recording of current changes D.sub.i, i+1 to be nested
together since overwriting of the current changes D.sub.i, i+1 with
changes D.sub.i-1, i from the preceding interval can be avoided by
checking the other bit. The two bits mentioned change meaning at every
recovery point.
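The nesting of the two kinds of update can be sketched in Python as follows. This is only a toy model of the known duplicated state save unit under stated assumptions: the two flag bits per memory word are represented by two sets of addresses, the halves are dictionaries, and the class DuplicatedSaveUnit with its methods is a hypothetical construction that does not appear in the original.

```python
class DuplicatedSaveUnit:
    """Toy model of a state save unit with two halves (illustrative only)."""

    def __init__(self, initial_state):
        self.halves = [dict(initial_state), dict(initial_state)]
        self.saved = 0              # index of the half holding the saved state
        self.changed_prev = set()   # words flagged as modified in the preceding interval
        self.changed_now = set()    # words already modified in the current interval

    def record(self, adr, value):
        """Record a current change D_(i, i+1) in the half being updated."""
        self.halves[1 - self.saved][adr] = value
        self.changed_now.add(adr)

    def transfer_one(self):
        """Carry over one flagged word from the preceding interval, if any remain."""
        if not self.changed_prev:
            return False
        adr = self.changed_prev.pop()
        if adr not in self.changed_now:   # never overwrite a current change
            self.halves[1 - self.saved][adr] = self.halves[self.saved][adr]
        return True

    def recovery_point(self):
        """Finish pending transfers, then swap the roles of the two halves."""
        while self.transfer_one():
            pass
        self.saved = 1 - self.saved
        self.changed_prev, self.changed_now = self.changed_now, set()

    def saved_state(self):
        return dict(self.halves[self.saved])


unit = DuplicatedSaveUnit({0: "a", 1: "b", 2: "c"})
unit.record(0, "A")        # interval between RP_0 and RP_1
unit.recovery_point()
state_1 = unit.saved_state()
unit.record(0, "AA")       # a current change arrives before the transfer
unit.transfer_one()        # must not overwrite "AA" with the older "A"
unit.record(1, "B")
unit.recovery_point()
state_2 = unit.saved_state()
```

In this sketch recovery_point finishes all pending transfers before swapping the halves, which mirrors the condition that the transfer of the changes from the preceding interval must be concluded before the next recovery point.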
The method described can be used for creating the save copies in the state
save unit without any effect on the running of the program if the transfer
of all changes D.sub.i-1, i from the preceding interval in the state save
unit can be concluded before the next recovery point in each case is
reached in program execution. However, this means that the minimum
distance between two recovery points is determined by the transfer time.
However, the recovery points cannot easily be provided at arbitrary points
in the program and at arbitrary distances from one another. Problems which
would arise, for example, during resumption of program execution by the
standby processor due to the repetition of output operations to the
peripherals or the repetition of inter-process communication can be
avoided only by providing recovery points in each case immediately
following such operations.
However, this requirement establishes an upper limit for the mutual
distance between recovery points. The mutual distance between recovery
points is therefore primarily determined by the intensity of the I/O
operations required. In particular applications, it can be shorter than the
time required for data transfer in the state save unit.
SUMMARY OF THE INVENTION
The primary object of this invention is therefore to reduce the time
requirements within the state save unit.
According to this invention, data do not need to be transferred within the
state save unit which is essentially formed by the second data memory and
its associated control logic. Instead, pointer bits are used which specify
where the data belonging to the "old" and "new" state are stored in the
state save unit. Modification of the pointer bits, which is all that is
required, is much more efficient than the known data transfer within the
state save unit and, in addition, is independent of the memory word width.
The pointer bits are formed from check bits which can be stored in a
separate high speed memory. Only the simplest logic operations are
required for forming the pointer bits and for modifying them. The bit
operations can be carried out completely successively, partially parallel
or completely in parallel. Partial or complete parallel execution of the
bit operations advantageously entails further time saving. The high speed
memory preferably used for the check bits and the logic required for the
bit operations can be integrated in a single VLSI chip. This additionally
increases the processing speed of the bits. In addition, this solution
leads to a simple and elegant circuit configuration. The data link is
preferably an optical data channel. This optimally results in electric
isolation between the state save unit and the first processor executing
the program and the first data memory allocated to it as main memory. As a
result, an error in the first processor or in the first data memory cannot
influence the data saved in the state save unit. By using a buffer memory
on the transmit side of the data link, the data (and addresses) to be
transferred via the data link can be transferred with relatively uniform
distribution in time. As a result, the band width required for the data
link can be advantageously reduced.
So that the saved data state is available to the second processor, used as
standby processor, as directly as possible after a fault of the first
processor, the control logic can be designed in such a manner that it
allows the second processor to have direct access to the second data
memory. In this case, the second data memory is used directly as main
memory by the second processor.
The overall multiprocessor arrangement can be of symmetrical configuration
so that the processors save one another's data. Three or even more
processors can also be connected together to a network saving one
another's data.
BRIEF DESCRIPTION OF THE DRAWINGS
Further developments and advantages of the invention can be seen in the
subsequent explanation of illustrative embodiments, reference being made
to the accompanying drawings in which:
FIG. 1 is a block diagram showing the fault-tolerant multiprocessor
arrangement according to the invention, comprising a first embodiment of
the control logic in which the bit operations are executed successively,
FIG. 2 is a diagrammatic representation for explaining the bit operations
of FIG. 1,
FIG. 3 is a diagrammatic representation, corresponding to FIG. 2, for
explaining further bit operations,
FIG. 4 is a block diagram of a second embodiment of the control logic in
which the bit operations are executed partially in parallel,
FIG. 5 is a block diagram of a third embodiment of the control logic in
which the bit operations are executed completely in parallel, and
FIG. 6 is a block diagram of a fault-tolerant multiprocessor arrangement
according to the invention, comprising two systems of processor, data
memory and control logic which save one another's data.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
A more complete appreciation of the invention and many of the attendant
advantages thereof will be readily obtained as the same becomes better
understood by reference to the following detailed description when
considered in connection with the accompanying drawings, wherein
P.sub.1 designates a first processor in FIG. 1. This processor is connected
via a data bus DB.sub.1 and an address bus AB.sub.1 to a first data memory
M.sub.1. A control line RW.sub.1 is used for controlling the read/write
memory operations. The data bus DB.sub.1 and the address bus AB.sub.1 are
connected to a transmit unit S.sub.1. During all write operations of the
processor P.sub.1 to the first data memory M.sub.1, this unit transmits
the addresses adr and data appearing on the multi-core buses DB.sub.1 and
AB.sub.1 over a data link DL.sub.1, if necessary after temporary storage
in a buffer memory B.sub.1. The write operations are recognized by the
transmit unit S.sub.1 by the logic state on control line RW.sub.1 which is
also supplied to it. In addition to processor P.sub.1, another processor
P.sub.1 ' can be provided which, together with the processor P.sub.1, forms
a data processor which is designated as first processor in the following.
The two processors P.sub.1, P.sub.1 ' of this data processor are capable of
checking each other and of generating a fault message and stopping
operation when a fault occurs (fail-stop function).
The data and addresses adr transferred by the data link DL.sub.1 are
received in a receive unit R.sub.2 and, if necessary, are output again to
a data bus DB.sub.2.1 and an address bus AB.sub.2.1 after temporary
storage. The buses DB.sub.2.1 and AB.sub.2.1 are connected to a second
data memory M.sub.2. Among other purposes, this memory is used as a save
memory for the data written by the first processor into the first data
memory M.sub.1 and transferred to it via the data link DL.sub.1. Other
tasks of the second data memory M.sub.2 are explained below. To exercise
its save function, the second data memory M.sub.2 has a pair of memory
words at the corresponding address adr in each case at least for each
memory word of the first data memory M.sub.1 receiving write inputs from
the first processor during program execution. One of the two memory words
of each pair is in each case used for storing the contents of the memory
word, corresponding to the respective pair with respect to address, in the
first data memory M.sub.1 at the last recovery point. The other memory
word of each pair is used for storing the current contents, possibly
changed after the last recovery point, of this memory word.
The addresses adr on address bus AB.sub.2.1 only address pairs of memory
words in the second data memory M.sub.2 and not individual memory words.
Addresses adr are therefore incomplete for the second data memory M.sub.2.
To complete them and to form the memory word address of the second data
memory M.sub.2, an additional address bit Z.sub.2 is needed which is
supplied by a control logic CL.sub.2.
The control logic CL.sub.2 contains a check bit memory M.sub.C, the
address inputs of which are connected to the address bus AB.sub.2.1. The
word width of the check bit memory M.sub.C is three bits. The
corresponding three data outputs of the check bit memory M.sub.C are
connected to a check bit bus CB.sub.1 which leads to first logic L.sub.1.
The input of the first logic L.sub.1 is also connected to two control
lines A and B whose meaning will be explained below. At the output, the
first logic L.sub.1 is connected to another check bit bus CB.sub.2, which leads
back to the check bit memory M.sub.C, and to two signal lines OLD and NEW. The
latter lead to a multiplexor M.sub.X which, depending on the logic state
of a control line C, switches one of them through to its output. This
output supplies the previously mentioned additional address bit Z.sub.2
for the second data memory M.sub.2.
The control logic CL.sub.2 also contains second logic L.sub.2. This is
connected to two signal lines SREQ.sub.2 and RPCLK.sub.2 which are
connected to outputs of the receive unit R.sub.2. At the output, the
second logic L.sub.2 is connected to the previously mentioned control
lines A, B and C and to further control lines D, E.sub.2, F.sub.2 and
RW.sub.2.1.
The signal on control line D is applied to a tristate driver T.sub.2.3 by
way of which an address generator G is connected to the address bus
AB.sub.2.1. The control lines E.sub.2, F.sub.2 and RW.sub.2.1 emerge from
the control logic CL.sub.2. Control line RW.sub.2.1 is used for
controlling the read/write memory operations in the second data memory
M.sub.2 and is therefore connected to this memory. The signal on control
line E.sub.2 is applied to tristate drivers T.sub.2.1 and T.sub.2.2 in
buses AB.sub.2.1 and DB.sub.2.1.
The fault-tolerant multiprocessor arrangement of FIG. 1 is completed by a
second processor which, like the first processor, is provided with two
processors P.sub.2 and P.sub.2 ', which check each other, and by address and
data buses AB.sub.2 and DB.sub.2 associated with these processors. The
latter buses are connected via tristate drivers T.sub.2.4 and T.sub.2.5 to
buses AB.sub.2.1 and DB.sub.2.1. The tristate driver T.sub.2.5 is designed
to be bidirectional. The signal on control line F.sub.2 is applied to
tristate drivers T.sub.2.4 and T.sub.2.5. The address bus AB.sub.2 is
connected to a decoder D.sub.2 which generates from the addresses adr
appearing on this bus a memory request signal MR.sub.2 which is supplied
to an input of the second logic L.sub.2 in the control logic CL.sub.2.
Finally, the signal of another control line RW.sub.2 which is directly
connected to the second processor consisting of processors P.sub.2 and
P.sub.2 ', and which is a read/write control line, is also applied to the
second logic L.sub.2.
The fault-tolerant multiprocessor arrangement, the configuration of which
is described above, operates in the following manner. The first processor
processes a program, which program is provided with recovery points. At
each of these recovery points, the data state of the first data memory
M.sub.1 should be stored as save copy in the second data memory M.sub.2 so
that, in a case of a failure of the first processor, the second processor
can take over and continue execution of the program at the last recovery
point reached without faults by the first processor. In addition to the
data state, the second processor naturally also needs the program itself
for this purpose. However, this should be available to it right from the
start.
The text which follows first explains how the save copies are created in
the second data memory M.sub.2. Following this, the manner in which the
second processor accesses the save copies in the event of a failure of the
first processor is explained.
To create the save copies, all data which are written by the first processor
into the first data memory M.sub.1, and by means of which its data state is
changed, are acquired by the transmit unit S.sub.1 and are transmitted, together
with their addresses adr, via the data link DL.sub.1 to the receive unit
R.sub.2. From this unit, the second data memory M.sub.2 receives the data
via the data bus DB.sub.2.1 and the addresses adr via the address bus
AB.sub.2.1. As already stated, the transferred addresses adr are
incomplete for the second data memory M.sub.2 since there are in each case
two memory words in the second data memory M.sub.2 which have the address
adr. To complete the addresses adr, the control logic CL.sub.2 in each
case generates the additional bit Z.sub.2. During generation of this bit,
the control logic must ensure that no data belonging to the "old" state,
that is to say to the state at the recovery point last reached without
faults by the first processor, are overwritten in the second data memory
M.sub.2. For this purpose, the control logic CL.sub.2 must know in which
of the two memory words of the pairs having the common address adr the
data item belonging to the "old" state is stored. In addition, the control
logic must remember in each case in which of the memory words under its
control a data item belonging to the "new" state, that is to say a data
item written into the first data memory by the first processor after the
recovery point last reached without faults, has been stored.
Two check bits BIT1(adr) and BIT2(adr) are provided for each address adr in
the check bit memory M.sub.C for storing this information. In addition to
the check bits BIT1(adr) and BIT2(adr), the check bit memory M.sub.C
contains for each address adr another check bit BIT3(adr), the meaning of
which will be explained below.
At the same time that the memory word pairs are addressed in the second
data memory M.sub.2, the respective associated check bits, forming a
triplet, in the check bit memory M.sub.C are addressed since this is also
connected to the address bus AB.sub.2.1. The check bits BIT1(adr),
BIT2(adr) and BIT3(adr) addressed in each case appear at the three data
outputs of the check bit memory M.sub.C and reach the first logic L.sub.1
via the check bit bus CB.sub.1. In this logic, a pointer bit OLD(adr) is
formed from the check bits BIT1(adr) and BIT2(adr) and from a control bit A
by means of simple logic gates in accordance with the rule
OLD(adr) := BIT1(adr) \cdot \overline{A} + BIT2(adr) \cdot A
In addition, a pointer bit NEW(adr) is formed in accordance with the rule
NEW(adr) := \overline{OLD(adr)}
that is to say by inverting the pointer bit OLD(adr). Above, as in the text
which follows, the dot means a logical AND operation, the plus sign means
a logical OR operation and the bar above the symbols means negation.
As can be easily reconstructed with the aid of the rule for forming the
OLD(adr) pointer bit and by means of the first truth table below, the
OLD(adr) pointer bit alternately corresponds to the BIT1(adr) check bit or
to the BIT2(adr) check bit, depending on the binary state of control bit
A.
______________________________________
First truth table
A BIT1(adr) BIT2(adr) OLD(adr)
______________________________________
0 0 0 0
0 0 1 0
0 1 0 1
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 0
1 1 1 1
______________________________________
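The first truth table can be reproduced with a short Python sketch. It relies only on what the table itself shows, namely that OLD(adr) follows BIT1(adr) when control bit A is 0 and BIT2(adr) when A is 1, and that NEW(adr) is the inverse of OLD(adr). The function names old_pointer and new_pointer are hypothetical and are used only for this illustration.

```python
def old_pointer(a, bit1, bit2):
    """OLD(adr): BIT1 when control bit A is 0, BIT2 when A is 1."""
    return (bit1 & (1 - a)) | (bit2 & a)


def new_pointer(a, bit1, bit2):
    """NEW(adr): the inverse of OLD(adr)."""
    return 1 - old_pointer(a, bit1, bit2)


# Rows in the same order as the first truth table: A, BIT1, BIT2, OLD.
first_truth_table = [
    (a, bit1, bit2, old_pointer(a, bit1, bit2))
    for a in (0, 1)
    for bit1 in (0, 1)
    for bit2 in (0, 1)
]

for row in first_truth_table:
    print(*row)
```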
Control bit A is generated by the second logic L.sub.2 and passes to the
first logic L.sub.1 via control line A of the same name. It changes its
binary state at each recovery point. It is preferably derived from the
output state of a flip flop FF contained in the second logic L.sub.2.
That a recovery point has been reached can be indicated, for example, by
means of a message transferred via data bus DB.sub.1 to transmit unit
S.sub.1 by the first processor. Following this message, transmit unit
S.sub.1 in each case transmits a corresponding message to receive unit
R.sub.2. When it receives this message, this unit generates a recovery
point clock signal which passes via signal line RPCLK.sub.2 to the second
logic L.sub.2 which subsequently inverts the binary state of control bit A
and of the flip flop FF.
The OLD(adr) and NEW(adr) pointer bits are output by the first logic
L.sub.1 on control lines OLD and NEW and pass via these lines to
multiplexor M.sub.X. They are used alternately, depending on the logic
state of control line C at multiplexor M.sub.X, as additional address bit
Z.sub.2 for the second data memory M.sub.2. The OLD(adr) pointer bit
addresses in each case the memory word of the pairs in the second data
memory M.sub.2 which contains the data item belonging to the "old" state
while the NEW(adr) pointer bit in each case addresses the memory word into
which the new data item (present on data bus DB.sub.2.1) is to be written
(or has been written). When the save data transferred via data link DL.sub.1
are written into the second data memory M.sub.2, the NEW(adr) pointer bit
must naturally be selected by the
multiplexor M.sub.X. The OLD(adr) pointer bit is in each case used for the
read access, to be discussed below, to the "old" data state by the second
processor. The fact that the NEW(adr) pointer bit is in each case formed
with the inverse binary state from the OLD(adr) pointer bit in the first
logic L.sub.1 ensures in a simple manner that data belonging to the "old"
data state are not overwritten in the second data memory M.sub.2.
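The address completion described above can be sketched in Python. Two points are assumptions made only for this illustration and are not stated in the original: that a logic 1 on control line C selects the NEW pointer bit, and that the additional address bit Z.sub.2 is appended as the least significant bit of the memory word address. The names select_z2, word_index and PairedMemory are hypothetical.

```python
def select_z2(old_bit, new_bit, c):
    """Multiplexer: choose the pointer bit that completes the address.

    Assumption: c == 1 selects NEW (save write), c == 0 selects OLD (read).
    """
    return new_bit if c else old_bit


def word_index(adr, z2):
    """Assumption: Z2 is appended as the least significant address bit."""
    return (adr << 1) | z2


class PairedMemory:
    """Memory holding a pair of words under each address adr."""

    def __init__(self, pair_count):
        self.words = [None] * (2 * pair_count)

    def write(self, adr, z2, value):
        self.words[word_index(adr, z2)] = value

    def read(self, adr, z2):
        return self.words[word_index(adr, z2)]


memory = PairedMemory(pair_count=4)
old_bit = 0
new_bit = 1 - old_bit   # NEW is always the inverse of OLD

# Word belonging to the "old" state, already present in the pair.
memory.write(2, old_bit, "saved at last recovery point")
# Save write: the multiplexer switches NEW through, so the old word survives.
memory.write(2, select_z2(old_bit, new_bit, c=1), "written after recovery point")
old_word = memory.read(2, select_z2(old_bit, new_bit, c=0))
print(old_word)
```

Because the two pointer bits always differ, the write and the read in this sketch can never land on the same memory word of a pair.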
The binary state of the control line C determining the selection of the two
pointer bits OLD(adr) and NEW(adr) is controlled by the second logic L.sub.2.
Every time data and addresses are ready in the receive unit R.sub.2, this
logic receives a save request signal on signal line SREQ.sub.2 from the
latter. Subsequent to this signal, the binary state of control line C is
set by the second logic L.sub.2 in such a manner that the NEW(adr) pointer
bit is switched through at the multiplexor M.sub.X. The second logic
L.sub.2 informs the second data memory M.sub.2 via the read/write control
line RW.sub.2.1 that the data present on data bus DB.sub.2.1 are to be
stored.
As has already been mentioned, the control logic CL.sub.2 must remember the
write destination for the new data. For this purpose, check bits BIT1(adr)
and BIT2(adr) are newly determined from the pointer bit OLD(adr) and
NEW(adr) and the control bit A in the logic L.sub.1 in accordance with the
rules
$BIT1(adr) := NEW(adr) \cdot A + OLD(adr) \cdot \overline{A}$ and
$BIT2(adr) := NEW(adr) \cdot \overline{A} + OLD(adr) \cdot A$
and written via check bit bus CB.sub.2 into check bit memory M.sub.C,
overwriting the original values of these check bits. As can again be
reconstructed by means of the specified rules of formation for the
BIT1(adr) and BIT2(adr) check bits and by means of the second truth table
below, however, only the check bit which does not currently correspond to
the pointer bit OLD(adr), as determined by the logic state of control bit A,
is effectively newly determined. The check bit corresponding to pointer
bit OLD(adr) remains unchanged in order to retain the information in which
memory word of the pair in the second data memory M.sub.2 the data
item belonging to the "old" state is stored.
______________________________________
Second truth table
     BIT1(adr)  BIT2(adr)  OLD(adr)  NEW(adr)  BIT1(adr)  BIT2(adr)
A    (old)      (old)                          (new)      (new)
______________________________________
0    0          0          0         1         0          1
0    0          1          0         1         0          1
0    1          0          1         0         1          0
0    1          1          1         0         1          0
1    0          0          0         1         1          0
1    0          1          1         0         0          1
1    1          0          0         1         1          0
1    1          1          1         0         0          1
______________________________________
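The rules of formation and the second truth table can be cross-checked
with a short Python sketch. The function names are hypothetical, and the
rule for OLD(adr) (BIT1 while A = 0, BIT2 while A = 1) is the one implied
by the OLD column of the table; this is a check of the table, not a
description of the hardware.

```python
from itertools import product


def update_check_bits(old: int, new: int, a: int) -> tuple[int, int]:
    """Newly determine BIT1(adr) and BIT2(adr) for a save operation."""
    not_a = 1 - a
    bit1 = (new & a) | (old & not_a)   # BIT1 := NEW.A + OLD.(not A)
    bit2 = (new & not_a) | (old & a)   # BIT2 := NEW.(not A) + OLD.A
    return bit1, bit2


def truth_table() -> list[tuple[int, ...]]:
    """Rows: (A, BIT1 old, BIT2 old, OLD, NEW, BIT1 new, BIT2 new)."""
    rows = []
    for a, bit1, bit2 in product((0, 1), repeat=3):
        old = (bit1 & (1 - a)) | (bit2 & a)
        new = 1 - old
        rows.append((a, bit1, bit2, old, new, *update_check_bits(old, new, a)))
    return rows
```

In every row the check bit that currently supplies OLD(adr) keeps its
value, and the other check bit takes the value of NEW(adr).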
The recovery points, whose arrival is signalled to the control logic
CL.sub.2 and the second logic L.sub.2 contained in it by the recovery point
clock signal on signal line RPCLK.sub.2, represent points in time which are
of importance to the storage process described. At these points, all data
previously belonging
to the "new" state become data belonging to the "old" state in the second
data memory M.sub.2 since the old state refers in each case to the
recovery point last reached. By inverting control bit A at each recovery
point, however, this is taken into account in a simple manner. This is
because, after inversion of the control bit A, the OLD(adr) pointer bit is
derived by the first logic L.sub.1 from the other of the two check bits
BIT1(adr) and BIT2(adr) than before. If, for example, the OLD(adr) pointer bit
was derived from the BIT1(adr) check bit in the interval elapsed between
the previously reached recovery point RP.sub.i-1 and the current recovery
point RP.sub.i, it is now, in the current interval, derived from the
BIT2(adr) check bit until the next recovery point RP.sub.i+1 is reached.
But this check bit contains precisely the information into which memory
word of the pair in the second data memory M.sub.2 the data item was
written in the interval just elapsed, that is to say the data item which
then belonged to the "new" state and now belongs to the "old" state. The
data belonging to the "old" data state are
thus directly available after the inversion of the control bit A which can
be executed very rapidly by simply switching over the flip flop FF in the
logic L.sub.2.
However, the above discussion left out of consideration the fact that, as a
rule, the write accesses take place only to some of the addresses adr in
the intervals between the recovery points. For the addresses adr to
which nothing was written in the interval elapsed, the pointer bit
OLD(adr) correctly addresses the data item belonging to the "old" state in
the current interval after inversion of control bit A only if the two check
bits BIT1(adr) and BIT2(adr) agree at the latest by the end of the interval
elapsed.
This problem is illustrated in FIG. 2. At (a) to (e) in FIG. 2 a pair of
memory words of the second data memory M.sub.2 is diagrammatically shown
which is in each case intended to be the same pair of memory words. On
both sides of the pair of memory words, the logic states of the check bits
BIT1(adr) and BIT2(adr) belonging to this pair are shown in each case as
arrows pointing to one or the other memory word. FIG. 2(a) shows, for
example, the data state of the memory words at a recovery point
RP.sub.i-2. The upper one of the two memory words contains data item I
while the lower memory word contains data item II. Both check bits
BIT1(adr) and BIT2(adr) "point" to the upper memory word having data item
I which is intended to correspond to the data item contained in the
corresponding memory word of the first data memory M.sub.1 at recovery
point RP.sub.i-2. Data item II is intended to be an older data item which
is no longer current. Up to recovery point RP.sub.i-1, the pointer bit
OLD(adr) is to be derived from check bit BIT1(adr). Before this recovery
point RP.sub.i-1 is reached, a new data item III is then to be stored in
the pair of memory words shown. To retain the more current "old" data item
I, the new data item III must be written into the lower memory word. As has already been
explained, this is automatically achieved due to the fact that the pointer
bit NEW(adr), which has a logic state which is the inverse of the pointer
bit OLD(adr) and thus, in this case, of check bit BIT1(adr), is selected
as the additional address bit Z.sub.2 for the second data memory M.sub.2
at the multiplexor M.sub.X. When the BIT1(adr) and BIT2(adr) check bits
are newly determined, from the pointer bits OLD(adr) and NEW(adr) and from
the control bit A after the new data item III has been written into this
lower memory word of the pair of memory words shown, the BIT2(adr) check
bit is inverted. It now "points" to the lower memory word containing the
new data item III after having been written back into the check bit memory
M.sub.C. Check bit BIT1(adr) remains unchanged. This is shown in FIG.
2(b). At recovery point RP.sub.i-1, check bit BIT2(adr) is given the
meaning of the OLD(adr) pointer bit as shown in FIG. 2(c). The new "old"
data item is now data item III in the lower memory word. In the interval
between recovery points RP.sub.i-1 and RP.sub.i, no further new data item
is to be written into the pair of memory words shown. However, after check
bit BIT1(adr) has regained its meaning as OLD(adr) pointer bit at recovery
point RP.sub.i, this bit would now point to the wrong "old" data item I in
the upper memory word. This is because the more current, new "old"
data item III is now contained in the lower memory word. This is shown in
FIG. 2(e). However, this problem can be solved in a simple manner by
setting the check bit which happens not to correspond to the OLD(adr)
pointer bit, in this case the BIT1(adr) check bit, to be equal to the
check bit corresponding to the OLD(adr) pointer bit, in this case to the
BIT2(adr) check bit at some time in the interval between recovery points
RP.sub.i-1 and RP.sub.i as shown in dashes in FIG. 2(d). Naturally, this
setting to equality must take place for all addresses adr which were not
the subject of a write access in the preceding interval. In addition, the
setting to equality must be terminated for all addresses before the next
recovery point is reached.
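The sequence of FIG. 2 can be traced with a small Python model of a single
pair of memory words. This is a sketch under stated assumptions, not the
circuit itself: the state layout and function names are hypothetical, index
0 stands for the upper and index 1 for the lower memory word, and the check
bit update is written in its simplified form (only the check bit that does
not supply OLD(adr) is changed), which is what the rules of formation
reduce to.

```python
def old_ptr(state):
    """OLD(adr): BIT1 (bits[0]) while A = 0, BIT2 (bits[1]) while A = 1."""
    return state["bits"][state["a"]]


def write(state, item):
    """Save operation: store item in the word NOT addressed by OLD."""
    new = 1 - old_ptr(state)
    state["words"][new] = item
    state["bits"][1 - state["a"]] = new  # only the non-OLD check bit changes


def equate(state):
    """Equating operation: NEW := OLD, so both check bits become equal."""
    state["bits"][1 - state["a"]] = old_ptr(state)


def recovery_point(state):
    """Invert control bit A."""
    state["a"] ^= 1


def run(with_equating: bool) -> str:
    # FIG. 2(a): upper word holds item I, lower word holds item II;
    # BIT1 and BIT2 both point to the upper word.
    state = {"words": ["I", "II"], "bits": [0, 0], "a": 0}
    write(state, "III")        # FIG. 2(b): III goes to the lower word
    recovery_point(state)      # RP_i-1: BIT2 now acts as OLD, FIG. 2(c)
    if with_equating:
        equate(state)          # FIG. 2(d): BIT1 set equal to BIT2
    recovery_point(state)      # RP_i: BIT1 acts as OLD again
    return state["words"][old_ptr(state)]
```

Without the equating step the model returns the outdated item I at
recovery point RP.sub.i; with it, the current "old" item III is returned.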
This equating operation is carried out in the control logic CL.sub.2 by
having the address generator G successively generate all addresses adr,
which are supplied to the check bit memory M.sub.C via the address bus
AB.sub.2.1. As in the case of the addressing by addresses adr which are
output to the address bus AB.sub.2.1 by the receive unit R.sub.2, the
triplet of check bits addressed in each case is read out of the check bit
memory M.sub.C and supplied to the first logic L.sub.1 via the check bit
bus CB.sub.1. This logic forms the pointer bit OLD(adr) from the check
bits BIT1(adr) and BIT2(adr) in the manner already described. However, the
NEW(adr) pointer bit is now formed in accordance with the rule
NEW(adr):=OLD(adr),
that is to say with the same logic state as the OLD(adr) pointer bit. The
BIT1(adr) and BIT2(adr) check bits (or one of the two) are again newly
determined in the manner already specified and written back into the check
bit memory M.sub.C. There is no operation of writing into the second data
memory M.sub.2 during this process. This is therefore an operation which
runs purely within the control logic CL.sub.2. According to the above
discussion, the first logic L.sub.1 must be capable of executing two
different operations with respect to forming the pointer bit NEW(adr).
Control line B is used for informing the first logic L.sub.1 which of the
two operations it is to execute. Whenever the receive unit R.sub.2
contains no data for saving in the second data memory M.sub.2, which is
recognized by the second logic L.sub.2, for example, by the signal level
on signal line SREQ.sub.2, this logic adjusts the logic state of control
line B in such a manner that the equating operation last described is
executed by the first logic L.sub.1. At the same time, the second logic
L.sub.2 enables, via control lines D, the address generation at address
generator G and the connection of the tristate driver T.sub.2.3 to the
address bus AB.sub.2.1. Simultaneously, it decouples the receive unit
R.sub.2 by means of the two tristate drivers T.sub.2.1 and T.sub.2.2 from
address bus AB.sub.2.1 and from data bus DB.sub.2.1 via control lines
E.sub.2. As soon as new data are available again in the receive unit
R.sub.2 for saving in the second data memory M.sub.2, execution of the
equating operations is interrupted by the second logic L.sub.2 and the new
data are written into the second data memory M.sub.2.
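The alternation between save operations and equating operations can be
sketched in Python as a simple priority scheme. The class, its attributes
and the one-operation-per-step granularity are assumptions made for
illustration; a non-empty queue stands in for an active save request on
signal line SREQ.sub.2, and a wrapping counter stands in for the address
generator G. The sketch deliberately omits any safeguard against equating
an address that was already written in the current interval, which is the
consistency problem the text turns to next.

```python
from collections import deque


class ControlLogic:
    """Hypothetical model of control logic CL_2 for n address pairs."""

    def __init__(self, n_addresses: int):
        self.a = 0                                        # control bit A
        self.bits = [[0, 0] for _ in range(n_addresses)]  # [BIT1, BIT2]
        self.mem = [[None, None] for _ in range(n_addresses)]
        self.gen = 0                                      # address generator G

    def _old(self, adr: int) -> int:
        # OLD(adr) is BIT1 while A = 0 and BIT2 while A = 1.
        return self.bits[adr][self.a]

    def step(self, pending: deque) -> str:
        """Perform one operation; pending saves take priority."""
        if pending:
            adr, item = pending.popleft()
            new = 1 - self._old(adr)           # NEW := inverse of OLD
            self.mem[adr][new] = item
            self.bits[adr][1 - self.a] = new   # update the non-OLD check bit
            return "save"
        adr = self.gen
        self.gen = (self.gen + 1) % len(self.bits)
        self.bits[adr][1 - self.a] = self._old(adr)  # NEW := OLD
        return "equate"
```

For example, with one pending save for address 1, the first call to
`step` performs the save and the second call equates address 0.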
The two different operations can therefore alternately occur within the
intervals between the recovery points. However, this results in a data
consistency problem which will be explained with the aid of FIG. 3. The
representation in FIG. 3 corresponds to FIG. 2. In particular, FIG. 3 (a)
corresponds to FIG. 2(b). In FIG. 3(c) a write process is additionally
assumed between recovery points RP.sub.i-1 and RP.sub.i, the data item IV
being written as "new" data item into the upper memory word. During this
process, as in FIG. 2 (b), the check bit which does not happen to have the
meaning of the OLD(adr) pointer bit, that is to say now the check bit
BIT1(adr) is inverted so that it "points" to the memory word with the new
data item IV. If then the equating operation, shown in dashes in FIG. 2(d),
were to be subsequently carried out for the pair of memory words
shown, the check bit BIT1(adr), which again takes over the meaning of the
OLD(adr) pointer bit, would "point" to the memory word with the old "old"
data item III and not to the memory word with the now more current new
"old" data item IV at the subsequent recovery point RP.sub.i. It is
important therefore to prevent the equating operations from being carried
out for addresses of pairs of memory words into which a new data item has
already been written in the current interval.
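The consistency problem can be reproduced with a few lines of Python. The
sketch below is illustrative only: the variable names are hypothetical,
index 0 stands for the upper and index 1 for the lower memory word, and the
starting state is the one described for FIG. 2(b), with BIT1 pointing to
item I and BIT2 pointing to item III.

```python
def run(equate_after_write: bool) -> str:
    """Return the item addressed by OLD(adr) at recovery point RP_i."""
    words = ["I", "III"]   # upper word, lower word
    bits = [0, 1]          # [BIT1, BIT2]
    a = 0                  # control bit A: BIT1 acts as OLD

    a ^= 1                 # RP_i-1: BIT2 now acts as OLD (item III)

    # Write of data item IV between RP_i-1 and RP_i.
    old = bits[a]
    new = 1 - old          # NEW := inverse of OLD, i.e. the upper word
    words[new] = "IV"
    bits[1 - a] = new      # BIT1 points to the word holding item IV

    if equate_after_write:
        bits[1 - a] = bits[a]   # equating: BIT1 := BIT2 (lower word)

    a ^= 1                 # RP_i: BIT1 acts as OLD again
    return words[bits[a]]
```

If the equating operation runs after the write, the model returns the
outdated item III at recovery point RP.sub.i instead of item IV, which is
why equating has to be suppressed for addresses already written in the
current interval.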
This can be achieved in a simple manner by causing the control logic