|
Claims  |
|
|
What is claimed is:
1. A multiprocessor system that executes programs written in a
predetermined programming language expressed by an inference tree
constituted by nodes connected from top to bottom by branches comprising
OR processes, said multiprocessor system comprising:
a plurality of processors numbered serially with identifier numbers; and
a network system for linking said processors, each of said processors
comprising
(a) first means for storing initial system information representing
characteristics of said multiprocessor system, which information includes
(1) first information representative of a value of the identifier number of
said processor and
(2) second information representative of the number of said processors
linked by said network system;
(b) second means including
(1) searching means for searching branches at a node of the program,
(2) storing means for storing third information representative of the
number of said branches at each node, which number is obtained by
searching the branches at each node, and
(3) selecting means for automatically and dynamically selecting a specified
branch at each node of said OR process by executing a predetermined
calculation by use of said initial system information and the third
information without transferring information between said processors,
whereby an initial load balancing is obtained in said multiprocessor
system.
2. A logic programming language multiprocessor system according to claim 1
wherein each of said processor further comprises:
(a) fourth means for storing said first information in a first register and
also storing said second information in a second register as an initial
condition;
(b) fifth means for carrying out a calculation involving dividing the
values of said first and second registers by said third information;
(c) sixth means for updating the values of said first and second registers
with the quotient and remainder of the calculated result produced by said
fifth means; and
(d) seventh means for selecting a specified branch by use of the updated
values of said first and second registers updated by said sixth means
until the updated value of said second register is decreased to only one,
whereby said specified branch is selected in each processor.
3. A method of load balancing in a multiprocessor system that executes
programs written in a predetermined programming language expressed by an
inference tree constituted by nodes connected from top to bottom by
branches comprising OR processes, said multiprocessor system having a
plurality of processors with separate memories and being numbered serially
with identifier numbers, each of said processors including searching means
for searching branches at the node, and selecting means for selecting a
desireable branch along a given OR process, said multiprocessor system
further having a network system linking said processors
said method of load balancing in each of said processors comprising the
steps of:
(a) storing initial system information representative of the
characteristics of said multiprocessor system, which information includes
(1) first information representative of the value of the identifier number
of said processor and
(2) second information representative of the number of said processors
linked by said network system;
(b) searching branches at each node of the program;
(c) storing third information representative of the number of branches at
each node which is obtained by the operation of searching said branches at
each node; and
(d) automatically selecting a specified branch at each node of said OR
process by executing a predetermined calculation by use of said initial
system information and said third information without transferring
information between said processor; and
(e) processing said branch in each processor independently, whereby an
initial load balancing is obtained in said multiprocessor system.
4. A method of load balancing according to claim 3 further comprising steps
of:
(a) storing said first information in a first register and also storing
said second information in a second register as an initial condition;
(b) carrying out a calculation involving dividing the values of said first
and second registers by said third information;
(c) updating the values of said first and second registers by the quotient
and remainder of the calculation result achieved by said carrying step;
and
(d) selecting a specified branch by use of updated values of said first and
second registers updated in said updating step until the updated value of
said second register is decreased to only one,
whereby said specified branch is selected in each processor.
5. A method of load balancing in a multiprocessor system that executes
programs written in a predetermined programming language expressed by an
inference tree constituted by nodes connected from top to bottom by
branches comprising OR processes, said multiprocessor system having a
plurality of processors with separate memories and working environments
and being numbered serially with identifier numbers, each of said
processors including searching means for searching branches at the node
and selecting means for selecting a desirable branch along a given OR
process, and a network system linking said processors, said method of load
balancing comprising the steps of:
storing initial system information in the memory representative of
characteristics of said multiprocessor system,
automatically and dynamically selecting a specific branch for each
processor by use of said initial system information without transferring
information between said processors,
processing said specific branch in each processor independently, whereby an
initial load balancing is obtained in said multiprocessor system,
generating history information in a first processor, the amount of which is
smaller than that of information representative of the whole working
environment of said first processor while said first processor processes a
branch given thereto,
transferring said history information from said first processor to the
memory of a second processor while said first processor is operating on
its branch,
dynamically reproducing the working environment of said first processor in
said second processor by use of the history information transferred from
said first processor, and
processing a specific branch in said second processor by use of the
reproduced working environment of said first processor, whereby a load
balancing is obtained between said first and second processors.
6. A method according to claim 5, wherein said system information includes
(a) first information representative of an identifier of each processor and
(b) second information representative of a number of assignable processors
which are subjected to said initial load balancing;
and said history information includes
(a) said first information representative of said identifier of each
processor,
(b) third information representative of an inference depth of said first
processor, and
(c) fourth information representative of an identifier of a branch from
said first processor which is dynamically assigned to said second
processor, a pair of said third information and said fourth information
being generated every time said branch of said first processor is shared
with said second processor.
7. A method according to claim 6, wherein said each processor comprises
(a) a first register for storing said first information,
(b) a second register for storing said second information,
(c) a third register for storing said third information,
(d) a fourth register for storing said fourth information,
(e) memory means including the memory of the processor for storing said
third and fourth information, and
(f) operation means for performing a predetermined operation by use of said
first information and said fourth information. |
|
|
|
|
Claims  |
|
|
Description  |
|
|
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to a multiprocessor system and a
method of load balancing thereof, and more particularly to a
multiprocessor system which has a plurality of processors and a network
system and a method of load balancing processing in the multiprocessor
system in which a given load is divided into a plurality of load segments
and each of the load segments are dynamically assigned to a predetermined
processor while operating the multiprocessor system.
2. Prior Art
The conventional multiprocessor system has a plurality of processors and a
network system. In the case where a given computational task or load
written in a logic programming Language (e.g., Prolog) is executed in
parallel in the conventional multiprocessor system, the given load (or an
initial goal) is divided into plural initial load segments which are
assigned to all of the processors at an initial load balancing stage. More
specifically, a first initial load segment is given to a first processor
wherein data representative of the processing result of the first initial
load segment is obtained, and such data must be transferred to a second
processor which starts to process a second initial load segment thereof by
use of such data. Thus, data representative of the processing result in
the presently operating first processor must be transferred to the next
processor which is idle during the operation of the first processor, but
will start to process its initial load segment by use of the data from the
first processor. As described heretofore, the initial load segments are
sequentially assigned to the processors in turn. Hence, the conventional
multiprocessor system requires a long processing time before the given
load is executed in parallel.
At a time when the initial goal is given to one processor, all of the other
idle processors within the multiprocessor system do not operate. Hence,
one processor must divide the given initial goal into plural initial load
segments which must be assigned to the other processors. In addition, the
conventional multiprocessor system must provide the network system for
transferring an information concerning the given initial goal which must
be divided. For this reason, the conventional multiprocessor system cannot
perform an initial load balancing on the initial goal with a high speed.
Originally, it is possible to obtain a performance improvement due to a
parallel effect for shortening processing times (hereinafter, simply
referred to as the parallel effect) when the given load is executed in
parallel in the conventional multiprocessor system. However, the
conventional multiprocessor system suffers from a problem in that it is
not actually possible to obtain such parallel effect because of the reason
described above.
Further, the above mentioned one processor supplied with the initial goal
must transfer a certain part of the information thereof to all of other
processors so that the amount of information to be transferred is
increased. Hence, the conventional multiprocessor system suffers another
problem in that it must have the ability to transfer data at high speed
and to transfer a large quantity of data for the network system.
Next, a description will be given with respect to the above-mentioned
problems in detail by considering that the logic programming language
(i.e., Prolog) is executed in the conventional multiprocessor system.
In a process for sequentially executing Prolog (shown in FIG. 2), a
predetermined priority (i.e., a depth-first-search) is given such that
branches are searched from an upper side to a lower side and from a left
side to a right side within an inference tree (or a proof tree) of Prolog.
When the system fails to find the correct branch (or the desirable branch)
while searching, the system backtracks to the preceding node and search
all branches connected thereto so as to find the correct branch.
On the contrary, in another process for executing Prolog in parallel,
plural processors simultaneously search a certain section or all sections
of the inference tree so as to find the correct branches in accordance
with a predetermined breadth-first-search. Such process is called an OR
parallel execution in which all branches within the inference tree are
divided into plural sections (hereinafter, referred to as OR processes)
each having a certain number of the branches and all of the OR processes
are respectively assigned to the idle processors when the initial goal is
given to the system. In this case, information required to execute each OR
process must be transferred to the corresponding idle processor.
As described before, one processor supplied with the initial goal must
divide the given initial goal into plural initial load segments which must
be assigned to the other idle processors at the initial load balancing
stage. Hence, the conventional system can not perform the initial load
balancing with high speed.
Meanwhile, after the load balancing is performed between the first and
second processors within the multiprocessor system, it is desirable that
the first and second processors be able to independently proceed with
their respective processes without transferring data representative of the
working environment of the first processor from that processor to the
second processor.
In order to realize the above-mentioned load balancing within the
conventional multiprocessor system, a predetermined working environment
required for the second processor must be extracted (or selected) from the
working environments which are obtained by performing predetermined
processes within the first processor, before performing the load balancing
in the first processor, and such predetermined working environment must be
transferred to the second processor.
In other words, the above predetermined working environment is identical to
the information which is obtained by performing predetermined processes
other than the load balancing process within the first processor. Such
predetermined working environment is necessary for the second processor in
the case where a certain part of the load to be executed in the first
processor is shared with and executed by the second processor. In
addition, the amount of information representative of the working
environments increases as the system proceeds to balance the load.
Therefore, quite a large amount of information must be transferred to the
other processors when the load balancing is performed after a long process
is performed in each processor.
As described heretofore, the first processor must stop performing its
original process and extract the predetermined working environment
required for the load balancing from its working environments (at a load
generation stage), and then such predetermined working environment, which
has a large amount of information, must be transferred to the second
processor. Thereafter, the second processor must store the transferred
information (at a load storing stage) so it can proceed with its original
process. In detail, a data conversion is required in order to transfer
such information by use of the network system. In the present
specification, the meaning of the data conversion will be considered to be
included in the meanings of the above load generation and load storing.
As shown in FIG. 1, overhead time must inevitably be provided for with the
above-mentioned load balancing in the conventional multiprocessor system.
In FIG. 1, the first processor cannot prevent a first overhead time from
occurring, and the second processor also cannot prevent a second overhead
time from occurring.
Due to the overhead time which accompanies the load balancing (or due to
the stopping of the process in the first processor in particular), each
processor can not demonstrate its processing ability every time unit. In
addition, the load balancing is required to be performed between
processors at an arbitrary and asynchronous time. Hence, the conventional
multiprocessor system suffers the problem in that it is not possible to
demonstrate the parallel effect as described before. This parallel effect
can be evaluated by the total ability which can be obtained from the
following formula: (Total Ability)=(Processing Ability of each
processor).times.(Number of processors which are operable in parallel in
order to process the given load). Hence, the conventional system needs a
network system having a high price to transfer the large amount of
information with arbitrary and asynchronous timing. In order to transfer
the large amount of information, the network system must be occupied for a
long time, hence, it becomes impossible to perform the load balancing
between the processors properly. Therefore, the conventional system
suffers a problem in that a load un-balancing must occur.
Compared to an improvement in the processing speed of the processor, an
improvement in the transfer speed of the network system within the
multiprocessor system has relatively little effect. This causes a tendency
to increase the communication time of the network system more than that of
the processors. In this case, the above-mentioned problem becomes serious.
As the number of processors within the multiprocessor system increases,
which tendency is remarkable.
Next, a description will be given with respect to the above-mentioned
problem in a concrete manner in conjunction with FIG. 2 when Prolog is
executed in parallel in the multiprocessor system.
In the case where the first processor performs the load balancing on the
second processor in the OR parallel execution described before, the first
processor divides an OR process from all branches of the inference tree,
and the divided OR process is assigned to the second processor.
In this case, transfer data (to be transferred from the first processor to
the second processor) can be classified as first and second transfer data.
The first transfer data represent the information of the divided OR
process. The second transfer data represent the information of the divided
OR process and other information which is required to execute the divided
OR process.
The first processor must transfer the above second transfer data to the
second processor while the first and second processors independently
proceed with their respective processes after the load balancing is
performed. This happens because, the second processor must refer to the
working environment of the first processor when the first processor
transfers the first transfer data to the second processor, instead of the
second transfer data.
However, the second transfer data must include data representative of the
large amount of information of the working environment of the first
processor which is necessary for executing the divided OR process. This
working environment in the Prolog execution includes "bind information"
representative of a connection relation between variables and values and
"control information" for controlling backtracking of the Prolog, for
example.
The above-mentioned working environment is produced by the first processor
before performing the load balancing. The second processor requires such
working environment to execute the divided OR process after the load
balancing is performed. Because, when the second processor independently
obtains a solution (or a processing result) of the initial goal by
performing the divided OR process, the second processor may need all of
the bind information which is produced by the first processor between a
time when the initial goal is given and a later time when the first
processor starts to perform the load balancing. In addition, the amount of
such bind information must be increased nearly in proportion to the
processing time. Therefore, the first processor must transfer quite a
large amount of information representative of its working environment to
the second processor when the first processor performs the load balancing
on the second processor after a long processing time has been passed.
Since the first processor must divide the OR process and transfer the large
amount of information representative of its working environment every time
the first processor performs the load balancing on the second processor,
the original process of the first processor must be stopped so it performs
intermittently. On the other hand, since the second processor receives the
working environment of the first processor every time the load balancing
is performed, the original process of the second processor must be stopped
in order to receive the large amount of information representative of the
working environment of the first processor and to store such transferred
information.
Therefore, each processor can not demonstrate its full processing ability.
In addition, the load balancing is required between the processors at the
arbitrary and asynchronous time. Hence, the multiprocessor system suffers
from the problem in that it is impossible to obtain the parallel effect as
described before.
Further, the conventional system requires an expensive network system to
transfer large amounts of information at arbitrary and asynchronous times.
Since the network system in this case is occupied for a long time in order
to transfer the large amount of information, it becomes almost impossible
to perform the load balancing between the processors. Therefore, the
conventional multiprocessor system suffers from the above described
problem in that the load becomes unbalanced.
The above-mentioned problem becomes serious in a recently developed
sequential inference machine (or a Prolog machine) which machine can
sequentially perform the inference by itself with high speed. Because,
when the multiprocessor system controls one thousand or more of such
machines (i.e., the processors) in parallel, the conventional system has
the tendency to cause the improvement of the data transfer speed of the
network system to become smaller than that of the processing speed of each
machine, as described before. As the number of the processors within the
multiprocessor system increases, the above-mentioned tendency becomes even
greater.
A sequential inference machine of 1 MLIPS (i.e., one Mega Logical Inference
Per Second) produces a working environment having about 5 MW (i.e., five
Mega Word) (in case of 40 Bit/W) of information. For example, a serial
link of 10 MBPS (i.e., ten Mega Bit Per Second) is actually required
between two mutually adjacent processors as the network system which
connects all one thousand of the sequential inference machines provided
within the multiprocessor system. In this case, it is possible to transfer
data of 0.25 MW per second (which is obtained by dividing 10 MBPS by 40
Bit/W) representative of the working environment between two mutually
adjacent processors.
In this case, the processing time for performing the inference divided by
the communication time of the network system becomes equal to 1/20. The
value 20 which appears in the denominator is obtained by dividing 5 MW by
0.25 MW. Due to the load balancing (or due to the transfer of the large
amount of information in particular), the sequential inference machine
(i.e., the processor) must stop performing the original inference process
for a long time. Hence, the apparent processing ability of the sequential
inference machine must be lowered.
Since the operating processors and the network system are occupied in order
to transfer the information representative of the working environments for
a long time, it becomes impossible to perform the required load balancing
so that the availability of the processor must be lowered. Thus, the
parallel effect applied to the multiprocessor system must be lowered as
described before.
SUMMARY OF THE INVENTION
It is therefore a primary object of the present invention to provide a
method of load balancing processing in a multiprocessor system which can
obtain a high parallel effect by performing the initial load balancing of
the initial goal with a high speed when the logic programming language is
executed in parallel in the multiprocessor system.
It is another object of the present invention to provide a method of load
balancing processing in a multiprocessor system which remarkably reduces
the amount of the information transferred between the processors so as to
perform the load balancing at high speed and without intermittently
stopping the execution of the original process of each processor so that a
high parallel effect can be obtained even when a network system having a
reasonable price is used in the multiprocessor system.
In a first aspect of the invention, there is provided a multiprocessor
system for processing a given load written by a predetermined programming
language comprising:
a plurality of processors and a network system linking the processors,
each of the processors comprising
(a) first means for storing system information representing characteristics
of the multiprocessor system, and
(b) second means for automatically and dynamically selecting a specific
initial load segment from the given load by use of the system information
without transferring information between the processors, whereby initial
load balancing is obtained in the multiprocessor system.
In a second aspect of the invention, there is provided a multiprocessor
system for processing a given load written by a predetermined programming
language comprising:
a plurality of processors and a network system linking the processors,
each of the processors comprising
(a) first means for generating history information, the amount of which is
smaller than that of information representative of the whole working
environment of a source processor, while processing a specific load
segment given to each processor,
(b) second means for transferring the history information to a destination
processor while operating the multiprocessor system,
(c) third means for reproducing the working environment of the source
processor by use of the history information transferred from a source
processor, and
(d) fourth means for processing the specific load segment by use of the
reproduced working environment of the source processor, whereby load
balancing is obtained in the multiprocessor system.
In a third aspect of the invention, there is provided a method of load
balancing processing in a multiprocessor system having a plurality of
processors and a network system linking the processors, comprising the
steps of:
providing a computational tack or load written in a predetermined
programming language,
automatically and dynamically selecting a specific initial load segment for
each processor from the given load by the use of the system information
without transferring information between the processors, and
performing the specific initial load segment processing in each processor
independently, whereby initial load balancing is obtained in the
multiprocessor system.
In a fourth aspect of the invention, there is provided a method of load
balancing processing in a multiprocessor system having a plurality of
processors and a network system linking the processors, comprising the
steps of:
providing a load written in a predetermined programming language,
generating history information in a first processor, the amount of which is
smaller than that of information representative of the whole working
environment of the first processor while the first processor, processes a
specific load segment given thereto,
transferring the history information from the first processor to a second
processor while operating the multiprocessor system,
dynamically reproducing the working environment of the first processor in
the second processor by use of the history information transferred from
the first processor, and
processing a specific load segment in the second processor by use of the
reproduced working environment of the first processor, whereby load
balancing is obtained between the first and second processors and a
partial load segment of the first processor is shared with the second
processor.
In a fifth aspect of the invention, there is provided a method of load
balancing processing in a multiprocessor system having a plurality of
processors and a network system linking the processors, comprising the
steps of:
providing a load written in a predetermined programming language,
storing system information representative of characteristics of the
multiprocessor system,
automatically and dynamically selecting a specific initial load segment for
each processor from the given load by use of the system information,
without transferring information between the processors,
processing the specific initial load segment in each processor
independently, whereby an initial load balancing is obtained in the
multiprocessor system,
generating history information in a first processor, the amount of which is
smaller than that of information representative of the whole working
environment of the first processor, while the first processor processes a
specific load segment given thereto,
transferring the history information from the first processor to a second
processor while operating the multiprocessor system,
dynamically reproducing the working environment of the first processor in
the second processor by use of the history information transferred from
the first processor, and
processing a specific load segment in the second processor by the use of
the reproduced working environment of the first processor, whereby load
balancing is obtained between the first and second processors and a
partial load segment of the first processor is shared with the second
processor.
BRIEF DESCRIPTION OF THE DRAWINGS
Further objects and advantages of the present invention will be apparent
from the following description, reference being had to the accompanying
drawings wherein preferred embodiments of the present invention are
clearly shown.
In the drawings:
FIG. 1 shows time charts for explaining the overhead that inevitably
accompanies load balancing in conventional multiprocessor systems;
FIG. 2 shows an example of an inference tree representing a solution of the
Prolog program;
FIG. 3 is a block diagram showing an embodiment of a multiprocessor system
according to the present invention;
FIG. 4 is a block diagram showing a first embodiment of the processor which
constitutes the multiprocessor system according to the present invention;
FIGS. 5 and 6 show inference trees for explaining the initial load
balancing of the initial goal performed in the multiprocessor system
according to the present invention;
FIGS. 7 and 8 are block diagrams both showing a second embodiment of the
processor within the multiprocessor system according to the present
invention;
FIG. 9 shows a partial inference tree of the logic programming language for
explaining a process for producing in the first processor history
information (represented by data of three words or five words) to be
transferred;
FIG. 10 shows a partial inference tree for explaining a process for
reproducing a required working environment by the use of the transferred
history information (represented by data of three words) in the second
processor;
FIG. 11 shows a partial inference tree for explaining a process for
reproducing a required working environment by use in the second processor
of the transferred history information (represented by data of five
words); and
FIG. 12 shows a partial inference tree for explaining a process of the
second processor for reproducing a required working environment by use of
the transferred history information represented by data of (1+2n) words at
time periods "n" for performing the load balancing in the first processor.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
Referring now to the drawings, wherein like reference characters designate
like or corresponding parts throughout the several views, FIG. 3 is a
block diagram showing all the elements of a multiprocessor system 1
according to the present invention. This multiprocessor system 1
(surrounded by a dashed line) provides a network system (surrounded by a
dotted line) and a plurality of processors PR0 to PRn (where n denotes an
integral number). These processors PR0 to PRn are connected to each other
by the network system. The present invention is directed to the processors
rather than the network system; hence, a description of the network system
will be omitted from this specification.
[A] FIRST EMBODIMENT
Next, a description will be given with respect to the structure of the
first embodiment of a processor PRi (where i denotes an integral number
lying between 1 to n) in conjunction with FIG. 4. In FIG. 4, the processor
PRi comprises a register Rpi for storing data representative of a
processor identifier pi, a register Rpn for storing data representative of
a number pn of processors (hereinafter, referred to as assignable
processors) which are subjected to the initial load balancing of the
initial goal, a counter Cpc for counting a number pc of the branches at
each node (i.e., at each process for searching the desirable branch in the
initial load balancing), a flag section F for storing a flag representing
whether the initial load balancing has been performed or not, an operation
section 11 for performing operations and big-or-small judgment which will
be described later, a program memory 12 for storing programs written in
the logic programming language to be executed, and a working memory 13 for
storing data representative of the working environments which are used for
executing the programs.
Next, a description will be given with respect to a "strategic procedure"
for performing the OR process in each processor PRi in conjunction with
FIG. 5. FIG. 5 shows a strategic procedure for performing the initial load
balancing in the case where the number of initially assignable processors
is set to twelve. In FIG. 5, each of characters p0 to p11 designates each
of the identifiers pi of the twelve processors.
In the following description, a number id representing an inference depth
of the logic programming language varies "0", "1", "2", . . . as the
inference depth becomes deeper. In addition, the number id equals "0" at
the initial goal stage. Further, a number m (where m denotes an integral
number) of branches bm connected to a common node varies "0", "1", . . .
from the left most branch in turn.
(I) First, the multiprocessor system initializes the registers Rpi and Rpn,
the counter Cpc and the flag section F.
(II) Secondly, the assignable processors are equally assigned to the
branches of the inference tree which branches from an initial goal point.
For example, five processors having the identifiers p0, p1, p2, p3 and p4
are respectively and dynamically assigned to five branches b0, b1, b2, b3
and b4 connected to the common node (i.e., the initial goal point) from
the left most branch. Similarly, seven other processors having the
identifiers p5 to p11 are assigned to the five branches b0 to b4. Thus,
three processors having the identifiers p0, p5 and p10 are assigned to the
branch b0. In addition, the three processors having the identifiers p1, p6
and p11 are assigned to the branch b1. Similarly, the two processors
having the identifiers p4 and p9 are assigned to the branch b4.
As described above, plural processors are assigned to each of the branches
b0 to b4. The processors assigned to one branch designate the assignable
processors in a next stage.
(III) Next, the present system performs the initial load balancing in the
direction of the inference depth. Similar to the above-mentioned procedure
(II), the processors assigned to one branch are assigned to next branches
connected to that one branch. When one processor is assigned to each of
the next branches, the initial load balancing is completed. In this case,
when the number pn of the assignable processors is smaller than the number
of assignable branches, a branch next to the branch assigned by the last
processor is saved for the last processor (i.e., the right branch is saved
for the last processor), but branches next to the branches each assigned
by the other processors are not saved (i.e., the right branches are left
for the other processors).
For example, the processors having the identifiers p0, p5 and p10 are
respectively assigned to the branches b0, b1 and b2 in inference depth
id=1, and the branch b3 is saved for the processor having the identifier
p10. Similarly, other branches are assigned by the corresponding
processors as shown in FIG. 5.
The above is a diagrammatical explanation of the initial load balancing of
the initial computational task or goal. Next, a generalized description
will be given with respect to the initial load balancing.
(1) First, the present system initializes the registers Rpi and Rpn, the
counter Cpc and the flag section F as described below.
(a) The identifier pi of each processor is stored in the register Rpi.
(b) The number pn of the assignable processors is stored in the register
Rpn.
(c) The flag "on" indicating that the processor is initialized is set in
the flag section F.
(2) The inference depth id equals "k", the value of which is sequentially
varied as 0, 1, . . .
(a) In the case where the number pn of the assignable processors is larger
than one, the init | | |