WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Multiprocessor system and a method of load balancing thereof    
United States Patent5053950   
Link to this pagehttp://www.wikipatents.com/5053950.html
Inventor(s)Naganuma; Jiro (Zama, JP); Ogura; Takeshi (Chigasaki, JP)
AbstractA multiprocessor system has a plurality of processors and a network system linking said processors to thereby process a given computational load written in a logic programming language. According to an initial load balancing algorithm, each processor independently and dynamically selects an initial load segment thereof from the given load by use of system information representative of characteristics of the multiprocessor system without transferring information between the processors, whereby an initial load balancing is obtained in the multiprocessor system. According to a load balancing algorithm for reproducing working environments, which algorithm is performed after performing the initial load balancing algorithm, a partial load segment of a first processor is shared with a second processor. In this case, the first processor generates history information representative of the working environment thereof, but the amount of the history information is smaller than that of the whole working environment of the first processor. This history information is supplied to the second processor wherein the working environment of the first processor is reproduced by use of the history information. Thereafter, the second processor processes the partial load segment of the first processor by use of the reproduced working environment of the first processor, while the first processor processes the load segment thereof. Thus, the present multiprocessor system can greatly reduce the amount of the information over that of a conventional multiprocessor system, whereby the load balancing can be performed at high speed.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 5053950
Multiprocessor system and a method of load balancing thereof - US Patent 5053950 Drawing
Multiprocessor system and a method of load balancing thereof
Inventor     Naganuma; Jiro (Zama, JP); Ogura; Takeshi (Chigasaki, JP)
Owner/Assignee     Nippon Telegraph and Telephone Corporation (Tokyo, JP)
Patent assignment
All assignments
Publication Date     October 1, 1991
Application Number     07/522,504
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     May 11, 1990
US Classification     718/105 718/106
Int'l Classification     G06F 015/16 G06F 015/00
Examiner     Chan; Eddie P.
Assistant Examiner    
Attorney/Law Firm     Darby & Darby
Address
Parent Case     This is a continuation of application Ser. No. 132,788, filed Dec. 14, 1987, now abandoned.
Priority Data     Dec 19, 1986[JP]61-303412 May 25, 1987[JP]62-127338
USPTO Field of Search     364/200 MS File 364/900 MS File
Patent Tags     multiprocessor load balancing
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
3496551



[0 after 0 votes]
3348210



[0 after 0 votes]
4748558
Hirosawa
718/105
May,1988

[0 after 0 votes]
4633387
Hartung
718/105
Dec,1986

[0 after 0 votes]
4403286
Fry
718/105
Sep,1983

[0 after 0 votes]
4318173
Freedman
718/103
Mar,1982

[0 after 0 votes]
4130865
Heart
709/201
Dec,1978

[0 after 0 votes]
4099235
Hoschler
718/105
Jul,1978

[0 after 0 votes]
4073005
Parkin
718/104
Feb,1978

[0 after 0 votes]
3593300
Driscoll, Jr.
540/67
Jul,1971

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


What is claimed is:

1. A multiprocessor system that executes programs written in a predetermined programming language expressed by an inference tree constituted by nodes connected from top to bottom by branches comprising OR processes, said multiprocessor system comprising:

a plurality of processors numbered serially with identifier numbers; and

a network system for linking said processors, each of said processors comprising

(a) first means for storing initial system information representing characteristics of said multiprocessor system, which information includes

(1) first information representative of a value of the identifier number of said processor and

(2) second information representative of the number of said processors linked by said network system;

(b) second means including

(1) searching means for searching branches at a node of the program,

(2) storing means for storing third information representative of the number of said branches at each node, which number is obtained by searching the branches at each node, and

(3) selecting means for automatically and dynamically selecting a specified branch at each node of said OR process by executing a predetermined calculation by use of said initial system information and the third information without transferring information between said processors, whereby an initial load balancing is obtained in said multiprocessor system.

2. A logic programming language multiprocessor system according to claim 1 wherein each of said processor further comprises:

(a) fourth means for storing said first information in a first register and also storing said second information in a second register as an initial condition;

(b) fifth means for carrying out a calculation involving dividing the values of said first and second registers by said third information;

(c) sixth means for updating the values of said first and second registers with the quotient and remainder of the calculated result produced by said fifth means; and

(d) seventh means for selecting a specified branch by use of the updated values of said first and second registers updated by said sixth means until the updated value of said second register is decreased to only one,

whereby said specified branch is selected in each processor.

3. A method of load balancing in a multiprocessor system that executes programs written in a predetermined programming language expressed by an inference tree constituted by nodes connected from top to bottom by branches comprising OR processes, said multiprocessor system having a plurality of processors with separate memories and being numbered serially with identifier numbers, each of said processors including searching means for searching branches at the node, and selecting means for selecting a desireable branch along a given OR process, said multiprocessor system further having a network system linking said processors

said method of load balancing in each of said processors comprising the steps of:

(a) storing initial system information representative of the characteristics of said multiprocessor system, which information includes

(1) first information representative of the value of the identifier number of said processor and

(2) second information representative of the number of said processors linked by said network system;

(b) searching branches at each node of the program;

(c) storing third information representative of the number of branches at each node which is obtained by the operation of searching said branches at each node; and

(d) automatically selecting a specified branch at each node of said OR process by executing a predetermined calculation by use of said initial system information and said third information without transferring information between said processor; and

(e) processing said branch in each processor independently, whereby an initial load balancing is obtained in said multiprocessor system.

4. A method of load balancing according to claim 3 further comprising steps of:

(a) storing said first information in a first register and also storing said second information in a second register as an initial condition;

(b) carrying out a calculation involving dividing the values of said first and second registers by said third information;

(c) updating the values of said first and second registers by the quotient and remainder of the calculation result achieved by said carrying step; and

(d) selecting a specified branch by use of updated values of said first and second registers updated in said updating step until the updated value of said second register is decreased to only one,

whereby said specified branch is selected in each processor.

5. A method of load balancing in a multiprocessor system that executes programs written in a predetermined programming language expressed by an inference tree constituted by nodes connected from top to bottom by branches comprising OR processes, said multiprocessor system having a plurality of processors with separate memories and working environments and being numbered serially with identifier numbers, each of said processors including searching means for searching branches at the node and selecting means for selecting a desirable branch along a given OR process, and a network system linking said processors, said method of load balancing comprising the steps of:

storing initial system information in the memory representative of characteristics of said multiprocessor system,

automatically and dynamically selecting a specific branch for each processor by use of said initial system information without transferring information between said processors,

processing said specific branch in each processor independently, whereby an initial load balancing is obtained in said multiprocessor system,

generating history information in a first processor, the amount of which is smaller than that of information representative of the whole working environment of said first processor while said first processor processes a branch given thereto,

transferring said history information from said first processor to the memory of a second processor while said first processor is operating on its branch,

dynamically reproducing the working environment of said first processor in said second processor by use of the history information transferred from said first processor, and

processing a specific branch in said second processor by use of the reproduced working environment of said first processor, whereby a load balancing is obtained between said first and second processors.

6. A method according to claim 5, wherein said system information includes

(a) first information representative of an identifier of each processor and

(b) second information representative of a number of assignable processors which are subjected to said initial load balancing;

and said history information includes

(a) said first information representative of said identifier of each processor,

(b) third information representative of an inference depth of said first processor, and

(c) fourth information representative of an identifier of a branch from said first processor which is dynamically assigned to said second processor, a pair of said third information and said fourth information being generated every time said branch of said first processor is shared with said second processor.

7. A method according to claim 6, wherein said each processor comprises

(a) a first register for storing said first information,

(b) a second register for storing said second information,

(c) a third register for storing said third information,

(d) a fourth register for storing said fourth information,

(e) memory means including the memory of the processor for storing said third and fourth information, and

(f) operation means for performing a predetermined operation by use of said first information and said fourth information.
 Description Submit all comments and votes
 


BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention generally relates to a multiprocessor system and a method of load balancing thereof, and more particularly to a multiprocessor system which has a plurality of processors and a network system and a method of load balancing processing in the multiprocessor system in which a given load is divided into a plurality of load segments and each of the load segments are dynamically assigned to a predetermined processor while operating the multiprocessor system.

2. Prior Art

The conventional multiprocessor system has a plurality of processors and a network system. In the case where a given computational task or load written in a logic programming Language (e.g., Prolog) is executed in parallel in the conventional multiprocessor system, the given load (or an initial goal) is divided into plural initial load segments which are assigned to all of the processors at an initial load balancing stage. More specifically, a first initial load segment is given to a first processor wherein data representative of the processing result of the first initial load segment is obtained, and such data must be transferred to a second processor which starts to process a second initial load segment thereof by use of such data. Thus, data representative of the processing result in the presently operating first processor must be transferred to the next processor which is idle during the operation of the first processor, but will start to process its initial load segment by use of the data from the first processor. As described heretofore, the initial load segments are sequentially assigned to the processors in turn. Hence, the conventional multiprocessor system requires a long processing time before the given load is executed in parallel.

At a time when the initial goal is given to one processor, all of the other idle processors within the multiprocessor system do not operate. Hence, one processor must divide the given initial goal into plural initial load segments which must be assigned to the other processors. In addition, the conventional multiprocessor system must provide the network system for transferring an information concerning the given initial goal which must be divided. For this reason, the conventional multiprocessor system cannot perform an initial load balancing on the initial goal with a high speed. Originally, it is possible to obtain a performance improvement due to a parallel effect for shortening processing times (hereinafter, simply referred to as the parallel effect) when the given load is executed in parallel in the conventional multiprocessor system. However, the conventional multiprocessor system suffers from a problem in that it is not actually possible to obtain such parallel effect because of the reason described above.

Further, the above mentioned one processor supplied with the initial goal must transfer a certain part of the information thereof to all of other processors so that the amount of information to be transferred is increased. Hence, the conventional multiprocessor system suffers another problem in that it must have the ability to transfer data at high speed and to transfer a large quantity of data for the network system.

Next, a description will be given with respect to the above-mentioned problems in detail by considering that the logic programming language (i.e., Prolog) is executed in the conventional multiprocessor system.

In a process for sequentially executing Prolog (shown in FIG. 2), a predetermined priority (i.e., a depth-first-search) is given such that branches are searched from an upper side to a lower side and from a left side to a right side within an inference tree (or a proof tree) of Prolog. When the system fails to find the correct branch (or the desirable branch) while searching, the system backtracks to the preceding node and search all branches connected thereto so as to find the correct branch.

On the contrary, in another process for executing Prolog in parallel, plural processors simultaneously search a certain section or all sections of the inference tree so as to find the correct branches in accordance with a predetermined breadth-first-search. Such process is called an OR parallel execution in which all branches within the inference tree are divided into plural sections (hereinafter, referred to as OR processes) each having a certain number of the branches and all of the OR processes are respectively assigned to the idle processors when the initial goal is given to the system. In this case, information required to execute each OR process must be transferred to the corresponding idle processor.

As described before, one processor supplied with the initial goal must divide the given initial goal into plural initial load segments which must be assigned to the other idle processors at the initial load balancing stage. Hence, the conventional system can not perform the initial load balancing with high speed.

Meanwhile, after the load balancing is performed between the first and second processors within the multiprocessor system, it is desirable that the first and second processors be able to independently proceed with their respective processes without transferring data representative of the working environment of the first processor from that processor to the second processor.

In order to realize the above-mentioned load balancing within the conventional multiprocessor system, a predetermined working environment required for the second processor must be extracted (or selected) from the working environments which are obtained by performing predetermined processes within the first processor, before performing the load balancing in the first processor, and such predetermined working environment must be transferred to the second processor.

In other words, the above predetermined working environment is identical to the information which is obtained by performing predetermined processes other than the load balancing process within the first processor. Such predetermined working environment is necessary for the second processor in the case where a certain part of the load to be executed in the first processor is shared with and executed by the second processor. In addition, the amount of information representative of the working environments increases as the system proceeds to balance the load. Therefore, quite a large amount of information must be transferred to the other processors when the load balancing is performed after a long process is performed in each processor.

As described heretofore, the first processor must stop performing its original process and extract the predetermined working environment required for the load balancing from its working environments (at a load generation stage), and then such predetermined working environment, which has a large amount of information, must be transferred to the second processor. Thereafter, the second processor must store the transferred information (at a load storing stage) so it can proceed with its original process. In detail, a data conversion is required in order to transfer such information by use of the network system. In the present specification, the meaning of the data conversion will be considered to be included in the meanings of the above load generation and load storing.

As shown in FIG. 1, overhead time must inevitably be provided for with the above-mentioned load balancing in the conventional multiprocessor system. In FIG. 1, the first processor cannot prevent a first overhead time from occurring, and the second processor also cannot prevent a second overhead time from occurring.

Due to the overhead time which accompanies the load balancing (or due to the stopping of the process in the first processor in particular), each processor can not demonstrate its processing ability every time unit. In addition, the load balancing is required to be performed between processors at an arbitrary and asynchronous time. Hence, the conventional multiprocessor system suffers the problem in that it is not possible to demonstrate the parallel effect as described before. This parallel effect can be evaluated by the total ability which can be obtained from the following formula: (Total Ability)=(Processing Ability of each processor).times.(Number of processors which are operable in parallel in order to process the given load). Hence, the conventional system needs a network system having a high price to transfer the large amount of information with arbitrary and asynchronous timing. In order to transfer the large amount of information, the network system must be occupied for a long time, hence, it becomes impossible to perform the load balancing between the processors properly. Therefore, the conventional system suffers a problem in that a load un-balancing must occur.

Compared to an improvement in the processing speed of the processor, an improvement in the transfer speed of the network system within the multiprocessor system has relatively little effect. This causes a tendency to increase the communication time of the network system more than that of the processors. In this case, the above-mentioned problem becomes serious. As the number of processors within the multiprocessor system increases, which tendency is remarkable.

Next, a description will be given with respect to the above-mentioned problem in a concrete manner in conjunction with FIG. 2 when Prolog is executed in parallel in the multiprocessor system.

In the case where the first processor performs the load balancing on the second processor in the OR parallel execution described before, the first processor divides an OR process from all branches of the inference tree, and the divided OR process is assigned to the second processor.

In this case, transfer data (to be transferred from the first processor to the second processor) can be classified as first and second transfer data. The first transfer data represent the information of the divided OR process. The second transfer data represent the information of the divided OR process and other information which is required to execute the divided OR process.

The first processor must transfer the above second transfer data to the second processor while the first and second processors independently proceed with their respective processes after the load balancing is performed. This happens because, the second processor must refer to the working environment of the first processor when the first processor transfers the first transfer data to the second processor, instead of the second transfer data.

However, the second transfer data must include data representative of the large amount of information of the working environment of the first processor which is necessary for executing the divided OR process. This working environment in the Prolog execution includes "bind information" representative of a connection relation between variables and values and "control information" for controlling backtracking of the Prolog, for example.

The above-mentioned working environment is produced by the first processor before performing the load balancing. The second processor requires such working environment to execute the divided OR process after the load balancing is performed. Because, when the second processor independently obtains a solution (or a processing result) of the initial goal by performing the divided OR process, the second processor may need all of the bind information which is produced by the first processor between a time when the initial goal is given and a later time when the first processor starts to perform the load balancing. In addition, the amount of such bind information must be increased nearly in proportion to the processing time. Therefore, the first processor must transfer quite a large amount of information representative of its working environment to the second processor when the first processor performs the load balancing on the second processor after a long processing time has been passed.

Since the first processor must divide the OR process and transfer the large amount of information representative of its working environment every time the first processor performs the load balancing on the second processor, the original process of the first processor must be stopped so it performs intermittently. On the other hand, since the second processor receives the working environment of the first processor every time the load balancing is performed, the original process of the second processor must be stopped in order to receive the large amount of information representative of the working environment of the first processor and to store such transferred information.

Therefore, each processor can not demonstrate its full processing ability. In addition, the load balancing is required between the processors at the arbitrary and asynchronous time. Hence, the multiprocessor system suffers from the problem in that it is impossible to obtain the parallel effect as described before.

Further, the conventional system requires an expensive network system to transfer large amounts of information at arbitrary and asynchronous times. Since the network system in this case is occupied for a long time in order to transfer the large amount of information, it becomes almost impossible to perform the load balancing between the processors. Therefore, the conventional multiprocessor system suffers from the above described problem in that the load becomes unbalanced.

The above-mentioned problem becomes serious in a recently developed sequential inference machine (or a Prolog machine) which machine can sequentially perform the inference by itself with high speed. Because, when the multiprocessor system controls one thousand or more of such machines (i.e., the processors) in parallel, the conventional system has the tendency to cause the improvement of the data transfer speed of the network system to become smaller than that of the processing speed of each machine, as described before. As the number of the processors within the multiprocessor system increases, the above-mentioned tendency becomes even greater.

A sequential inference machine of 1 MLIPS (i.e., one Mega Logical Inference Per Second) produces a working environment having about 5 MW (i.e., five Mega Word) (in case of 40 Bit/W) of information. For example, a serial link of 10 MBPS (i.e., ten Mega Bit Per Second) is actually required between two mutually adjacent processors as the network system which connects all one thousand of the sequential inference machines provided within the multiprocessor system. In this case, it is possible to transfer data of 0.25 MW per second (which is obtained by dividing 10 MBPS by 40 Bit/W) representative of the working environment between two mutually adjacent processors.

In this case, the processing time for performing the inference divided by the communication time of the network system becomes equal to 1/20. The value 20 which appears in the denominator is obtained by dividing 5 MW by 0.25 MW. Due to the load balancing (or due to the transfer of the large amount of information in particular), the sequential inference machine (i.e., the processor) must stop performing the original inference process for a long time. Hence, the apparent processing ability of the sequential inference machine must be lowered.

Since the operating processors and the network system are occupied in order to transfer the information representative of the working environments for a long time, it becomes impossible to perform the required load balancing so that the availability of the processor must be lowered. Thus, the parallel effect applied to the multiprocessor system must be lowered as described before.

SUMMARY OF THE INVENTION

It is therefore a primary object of the present invention to provide a method of load balancing processing in a multiprocessor system which can obtain a high parallel effect by performing the initial load balancing of the initial goal with a high speed when the logic programming language is executed in parallel in the multiprocessor system.

It is another object of the present invention to provide a method of load balancing processing in a multiprocessor system which remarkably reduces the amount of the information transferred between the processors so as to perform the load balancing at high speed and without intermittently stopping the execution of the original process of each processor so that a high parallel effect can be obtained even when a network system having a reasonable price is used in the multiprocessor system.

In a first aspect of the invention, there is provided a multiprocessor system for processing a given load written by a predetermined programming language comprising:

a plurality of processors and a network system linking the processors,

each of the processors comprising

(a) first means for storing system information representing characteristics of the multiprocessor system, and

(b) second means for automatically and dynamically selecting a specific initial load segment from the given load by use of the system information without transferring information between the processors, whereby initial load balancing is obtained in the multiprocessor system.

In a second aspect of the invention, there is provided a multiprocessor system for processing a given load written by a predetermined programming language comprising:

a plurality of processors and a network system linking the processors,

each of the processors comprising

(a) first means for generating history information, the amount of which is smaller than that of information representative of the whole working environment of a source processor, while processing a specific load segment given to each processor,

(b) second means for transferring the history information to a destination processor while operating the multiprocessor system,

(c) third means for reproducing the working environment of the source processor by use of the history information transferred from a source processor, and

(d) fourth means for processing the specific load segment by use of the reproduced working environment of the source processor, whereby load balancing is obtained in the multiprocessor system.

In a third aspect of the invention, there is provided a method of load balancing processing in a multiprocessor system having a plurality of processors and a network system linking the processors, comprising the steps of:

providing a computational tack or load written in a predetermined programming language,

automatically and dynamically selecting a specific initial load segment for each processor from the given load by the use of the system information without transferring information between the processors, and

performing the specific initial load segment processing in each processor independently, whereby initial load balancing is obtained in the multiprocessor system.

In a fourth aspect of the invention, there is provided a method of load balancing processing in a multiprocessor system having a plurality of processors and a network system linking the processors, comprising the steps of:

providing a load written in a predetermined programming language,

generating history information in a first processor, the amount of which is smaller than that of information representative of the whole working environment of the first processor while the first processor, processes a specific load segment given thereto,

transferring the history information from the first processor to a second processor while operating the multiprocessor system,

dynamically reproducing the working environment of the first processor in the second processor by use of the history information transferred from the first processor, and

processing a specific load segment in the second processor by use of the reproduced working environment of the first processor, whereby load balancing is obtained between the first and second processors and a partial load segment of the first processor is shared with the second processor.

In a fifth aspect of the invention, there is provided a method of load balancing processing in a multiprocessor system having a plurality of processors and a network system linking the processors, comprising the steps of:

providing a load written in a predetermined programming language,

storing system information representative of characteristics of the multiprocessor system,

automatically and dynamically selecting a specific initial load segment for each processor from the given load by use of the system information, without transferring information between the processors,

processing the specific initial load segment in each processor independently, whereby an initial load balancing is obtained in the multiprocessor system,

generating history information in a first processor, the amount of which is smaller than that of information representative of the whole working environment of the first processor, while the first processor processes a specific load segment given thereto,

transferring the history information from the first processor to a second processor while operating the multiprocessor system,

dynamically reproducing the working environment of the first processor in the second processor by use of the history information transferred from the first processor, and

processing a specific load segment in the second processor by the use of the reproduced working environment of the first processor, whereby load balancing is obtained between the first and second processors and a partial load segment of the first processor is shared with the second processor.

BRIEF DESCRIPTION OF THE DRAWINGS

Further objects and advantages of the present invention will be apparent from the following description, reference being had to the accompanying drawings wherein preferred embodiments of the present invention are clearly shown.

In the drawings:

FIG. 1 shows time charts for explaining the overhead that inevitably accompanies load balancing in conventional multiprocessor systems;

FIG. 2 shows an example of an inference tree representing a solution of the Prolog program;

FIG. 3 is a block diagram showing an embodiment of a multiprocessor system according to the present invention;

FIG. 4 is a block diagram showing a first embodiment of the processor which constitutes the multiprocessor system according to the present invention;

FIGS. 5 and 6 show inference trees for explaining the initial load balancing of the initial goal performed in the multiprocessor system according to the present invention;

FIGS. 7 and 8 are block diagrams both showing a second embodiment of the processor within the multiprocessor system according to the present invention;

FIG. 9 shows a partial inference tree of the logic programming language for explaining a process for producing in the first processor history information (represented by data of three words or five words) to be transferred;

FIG. 10 shows a partial inference tree for explaining a process for reproducing a required working environment by the use of the transferred history information (represented by data of three words) in the second processor;

FIG. 11 shows a partial inference tree for explaining a process for reproducing a required working environment by use in the second processor of the transferred history information (represented by data of five words); and

FIG. 12 shows a partial inference tree for explaining a process of the second processor for reproducing a required working environment by use of the transferred history information represented by data of (1+2n) words at time periods "n" for performing the load balancing in the first processor.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring now to the drawings, wherein like reference characters designate like or corresponding parts throughout the several views, FIG. 3 is a block diagram showing all the elements of a multiprocessor system 1 according to the present invention. This multiprocessor system 1 (surrounded by a dashed line) provides a network system (surrounded by a dotted line) and a plurality of processors PR0 to PRn (where n denotes an integral number). These processors PR0 to PRn are connected to each other by the network system. The present invention is directed to the processors rather than the network system; hence, a description of the network system will be omitted from this specification.

[A] FIRST EMBODIMENT

Next, a description will be given with respect to the structure of the first embodiment of a processor PRi (where i denotes an integral number lying between 1 to n) in conjunction with FIG. 4. In FIG. 4, the processor PRi comprises a register Rpi for storing data representative of a processor identifier pi, a register Rpn for storing data representative of a number pn of processors (hereinafter, referred to as assignable processors) which are subjected to the initial load balancing of the initial goal, a counter Cpc for counting a number pc of the branches at each node (i.e., at each process for searching the desirable branch in the initial load balancing), a flag section F for storing a flag representing whether the initial load balancing has been performed or not, an operation section 11 for performing operations and big-or-small judgment which will be described later, a program memory 12 for storing programs written in the logic programming language to be executed, and a working memory 13 for storing data representative of the working environments which are used for executing the programs.

Next, a description will be given with respect to a "strategic procedure" for performing the OR process in each processor PRi in conjunction with FIG. 5. FIG. 5 shows a strategic procedure for performing the initial load balancing in the case where the number of initially assignable processors is set to twelve. In FIG. 5, each of characters p0 to p11 designates each of the identifiers pi of the twelve processors.

In the following description, a number id representing an inference depth of the logic programming language varies "0", "1", "2", . . . as the inference depth becomes deeper. In addition, the number id equals "0" at the initial goal stage. Further, a number m (where m denotes an integral number) of branches bm connected to a common node varies "0", "1", . . . from the left most branch in turn.

(I) First, the multiprocessor system initializes the registers Rpi and Rpn, the counter Cpc and the flag section F.

(II) Secondly, the assignable processors are equally assigned to the branches of the inference tree which branches from an initial goal point. For example, five processors having the identifiers p0, p1, p2, p3 and p4 are respectively and dynamically assigned to five branches b0, b1, b2, b3 and b4 connected to the common node (i.e., the initial goal point) from the left most branch. Similarly, seven other processors having the identifiers p5 to p11 are assigned to the five branches b0 to b4. Thus, three processors having the identifiers p0, p5 and p10 are assigned to the branch b0. In addition, the three processors having the identifiers p1, p6 and p11 are assigned to the branch b1. Similarly, the two processors having the identifiers p4 and p9 are assigned to the branch b4.

As described above, plural processors are assigned to each of the branches b0 to b4. The processors assigned to one branch designate the assignable processors in a next stage.

(III) Next, the present system performs the initial load balancing in the direction of the inference depth. Similar to the above-mentioned procedure (II), the processors assigned to one branch are assigned to next branches connected to that one branch. When one processor is assigned to each of the next branches, the initial load balancing is completed. In this case, when the number pn of the assignable processors is smaller than the number of assignable branches, a branch next to the branch assigned by the last processor is saved for the last processor (i.e., the right branch is saved for the last processor), but branches next to the branches each assigned by the other processors are not saved (i.e., the right branches are left for the other processors).

For example, the processors having the identifiers p0, p5 and p10 are respectively assigned to the branches b0, b1 and b2 in inference depth id=1, and the branch b3 is saved for the processor having the identifier p10. Similarly, other branches are assigned by the corresponding processors as shown in FIG. 5.

The above is a diagrammatical explanation of the initial load balancing of the initial computational task or goal. Next, a generalized description will be given with respect to the initial load balancing.

(1) First, the present system initializes the registers Rpi and Rpn, the counter Cpc and the flag section F as described below.

(a) The identifier pi of each processor is stored in the register Rpi.

(b) The number pn of the assignable processors is stored in the register Rpn.

(c) The flag "on" indicating that the processor is initialized is set in the flag section F.

(2) The inference depth id equals "k", the value of which is sequentially varied as 0, 1, . . .

(a) In the case where the number pn of the assignable processors is larger than one, the init