|
Claims  |
|
|
What is claimed is:
1. The method of processing message packets at different processors in a
plurality of processors interconnected by a network including the steps
of:
coupling message packets from the processors concurrently into the network;
determining within the network at least one of the coupled message packets
which has a priority dependent on data content of the coupled message
packets which priority is not exceeded by any other message packet; and
presenting a highest priority message packet to all processors
simultaneously for processing thereat the highest priority message packet
containing the data of the at least one message packet that is determined
to have priority.
2. The method set forth in claim 1 above, including in addition the step of
determining at the processors from the received highest priority message
packet which one or more processors is to act upon the packet.
3. The method set forth in claim 1 above, wherein the step of determining
priority comprises the steps of concurrently transferring contending
message packets into the network and merging the message packets in
accordance with priority until said highest priority packet has been
selected.
4. The method set forth in claim 3 above, wherein the steps of transferring
and merging comprise the successive steps of performing packet pair
comparisons at each of a plurality of levels with a higher priority packet
from each compared pair of packets being passed to a next level until only
said highest priority packet survives.
5. The method set forth in claim 4 above, including the step of passing the
packets in one direction through the network as the packet pair
comparisons are made and further comprising the step of transferring the
highest priority packet back through the network in a direction opposite
to the one direction.
6. The method set forth in claim 5 above, wherein the steps of transferring
and merging through the network include the step of passing the packets
level-by-level in successive synchronous steps.
7. The method set forth in claim 6 above, wherein each message packet
comprises a sequence of data bytes and the step of performing packet pair
comparisons comprises comparing in sequence corresponding bytes of each
contending message packet in each packet pair.
8. The method set forth in claim 7 above, wherein the step of performing
packet pair comparisons further includes the step of passing message
packet bytes to a next level in the network and wherein each said step is
executed with less than a predetermined maximum time delay.
9. The method set forth in claim 8 above, wherein the message packets are
of variable length and further including the steps of sequentially
presenting to all processors highest priority bytes of coupled message
packets without requiring a prior determination of said highest priority
packet within the network, such that a priority determination may be made
at some time after a beginning byte of a message packet begins to be
received at the processors.
10. The method set forth in claim 1 above, including the step of indicating
loss of contention to all processors originating packets that did not gain
priority.
11. The method set forth in claim 10 above, further including the step of
terminating transmission during the same contention interval from
processors whose packets did not gain priority.
12. The method set forth in claim 11 above, further including the step of
coupling another set of message packets concurrently into the network from
the plurality of processors after the completed transmission of the last
highest priority previous single or common priority packet.
13. The method set forth in claim 1 above, wherein the processors provide
messages, comprising at least primary data and response messages, having
data contents varying in accordance with a coherent priority scheme.
14. The method set forth in claim 13 above, wherein the messages further
comprise status and control messages whose data contents also vary in
accordance with the coherent priority scheme.
15. The method set forth in claim 14 above, including the steps of coupling
responses from all active processors to the network upon the completion of
a primary data message transmission and merging the responses in the
network.
16. The method set forth in claim 15 above, wherein the priority scheme
grants priority to messages having the lowest data content.
17. The method set forth in claim 1 above, including the step of
transferring packets from each processor, while determining priority, with
a like predetermined delay for each packet regardless of the decision as
to priority.
18. The method set forth in claim 1 above, including the steps of providing
reference data in each of the message packets and determining at each
processors from the reference data whether the highest priority packet is
to be accepted by the processor.
19. The method set forth in claim 18 above, including the steps of storing
reference data at the processors, and comparing the reference data in the
at least one packet to that stored at the processor to determine if the at
least one packet is to be accepted.
20. The method set forth in claim 19 above, wherein the stored reference
data are hash values.
21. The method set forth in claim 1 above, wherein the processors are
arranged in a data base system and including the steps of distributing the
data base in disjoint subsets throughout the processors and determining at
each processor whether the data the highest priority message packet is
related to that in the disjoint subset.
22. The method set forth in claim 21 above, wherein the disjoint subsets
include primary and redundant subsets at each processor, the redundant
subsets being backup for primary subsets from a number of other
processors.
23. The method set forth in claim 1 above, further including a host system
providing a system task and the step of communicating message sockets
representing subtasks and processed subtasks between the processors and
the host system via the network.
24. The method set forth in claim 23 above, further including the step of
coordinating said subtasks in the system using processors on the network.
25. The method set forth in claim 23 above, further including the step of
collecting processed subtasks at a processor via the network.
26. The method set forth in claim 1 above, including the steps of
concurrently coupling a set of competing message packets from the
processors to the network, transferring the highest priority message
packet to all processors with a predetermined fixed delay, and again
concurrently coupling another set of competing message packets to the
network, including previously losing message packets, for another
determination of priority.
27. The method set forth in claim 26 above, including the steps of
assembling sorted message packet lists at each processor, and coupling the
priority message packets from each processor to the network in the
sequence of highest order priority first.
28. The method set forth in claim 1 above, further including the steps of
redundantly coupling packets, determining highest priority and
simultaneously presenting the highest priority packet to all processors. |
|
|
|
|
Claims  |
|
|
Description  |
|
|
BACKGROUND OF THE INVENTION
Since the advent of the electronic computer in reliable form, workers in
the art have given much consideration to systems employing a number of
computers functioning together in interrelated fashion to accomplish a
given overall task. In some of these multiprocessor systems a large
computer utilizes its superior speed and capacity to perform the complex
portions of a program, but assigns smaller and slower satellite processors
the less complicated and less urgent tasks in order to reduce the load and
demands upon the large computer. The large computer is required to
undertake the responsibilities of assigning subtasks, making sure that the
smaller processors are kept occupied, ascertaining the availability and
performance of the smaller processors, and providing a unified result.
Other multiprocessor systems utilize a different approach, employing
multiple processors and a common bus system, with the processors having
essential equality of function. In this type of system, separate control
computers or control systems are often used to monitor the availability
and capability of an individual processor for a given subtask, and to
control the routing of tasks and information between processors. The
processors may be arranged and operated so that they themselves monitor
the status and availability of the other processors and determine the
routing of messages and programs. The common and substantial drawback of
these systems is that the software and operating time required for
overhead and maintenance functions interfere with the performance of the
principal objectives. Problems of routing and monitoring may increase
quadratically in relation to the number of processors involved, so that
ultimately a disproportionate amount of effort is spent in overhead
functions.
The following are some patents that are illustrative of the state of the
art:
U.S. Pat. Nos. 3,962,685 to Belle Isle; 3,962,706 to Dennis et al;
4,096,566 to Borie et al; 4,096,567 to Millard et al; 4,130,865 to Heart
et al; 4,136,386 to Annunziata et al; 4,145,739 to Dunning et al;
4,151,592 to Suzuki et al.
Since the days of the early "Binac" (two parallel processors) and
comparable systems it has been recognized that a multiprocessor provides a
redundant capability that can substantially improve the overall
reliability of an operating system. Actual installations of multiprocessor
systems have until recently been quite limited, largely due to the
extensive software problems involved. Nonetheless, the advantages of
multiprocessor operation for real time applications and other situations
in which system down time cannot be tolerated have led to the development
of systems which are successful in operation but which nevertheless
involve significant commitments to overhead software and operating time.
Illustrative of these are U.S. Pat. Nos. 3,445,822, 3,566,363 and
3,593,300, all relating to a system in which multiple computers access a
single shared main memory, and in which capabilities and requirements are
compared in order to assign tasks optimally to individual processors.
Another example of the prior art is U.S. Pat. No. 4,099,233, in which a
number of processors share a single bus and a control unit incorporating a
buffer register is used in the transfer of data blocks between a
transmitting miniprocessor and a receiving miniprocessor. This concept has
been employed in a distributed mail sorting system in Europe.
U.S. Pat. No. 4,228,496 pertains to a commercially successful
multiprocessor system in which buses between processors are coupled to bus
controllers which monitor transmissions and determine the priority of data
transfers between processors, each of which can be coupled in to control a
certain part of a number of peripheral devices.
The "Ethernet" system (U.S. Pat. Nos. 4,063,220 and 4,099,024) being
jointly promoted by Xerox, Hewlett-Packard and Intel evidences another
approach to the problem of intercommunicating between different processors
and peripherals. All units are coupled to a common multiple access network
and compete for priority. Collision detection is based upon time priority,
which in turn means that global capabilities cannot readily be controlled,
coordinated or given specificity.
Details of these complex systems can only be fully appreciated by close
analysis of the patents and any related publications. However, review will
show in each instance that the prioritizing of data transfer and the
selection of processors requires extensive intercommunication and
supervisory control if tasks are to be shared. Expansion of the systems to
include additional processors does not present identical problems with
these different systems, but in each instance substantially complicates
system software, applications programming, hardware, or all three.
Analysis will show that inherent constraints on multiprocessor system size
and capability are imposed by the usage of one or two logically passive
ohmic busses. While different techniques can be employed to facilitate
intercommunication, such as the grouping of subsystems into global
resources evidenced in recent U.S. Pat. No. 4,240,143, the amount of
useful traffic must reach a limit and variable delays impose insuperable
problems when large numbers of processors are used. Situations can arise
in which one or more processors become locked out or deadlocked, and these
circumstances in turn require added circuitry and software to resolve the
problems. The impracticality of substantially extending the number of
processors, say to 1024, thus becomes evident.
It is desirable for many applications to depart from the constraints of
these existing approaches and to utilize modern technology to best
advantage. The lowest cost technology available today is based upon mass
produced microprocessors, and high capacity rotating disk memories, such
as Winchester technology devices using small head to disk spacings in a
sealed enviroment. It is desirable to be able to expand a multiprocessor
system without disproportionate or even concomitant software complexity.
It is desirable further to be able to handle computer problems that may be
characterized as having a distributed structure, in which an overall
function can be dynamically subdivided into limited or iterative
processing tasks. Virtually all data base machines fall into this
category, which also includes such other typical examples as sorting,
pattern recognition and correlation, digital filtering, large matrix
computations, simulation of physical systems and the like. In all of these
situations there is a requirement for widely dispersed, relatively
straight-forward individual processing tasks with a high instantaneous
task load. This situation unduly burdens prior art multiprocessor systems
because it tends to increase the time and software involved in overhead,
and because practical difficulties arise in implementation of the systems.
Using a shared passive bus, for example, propagation rates and data
transfer times introduce an absolute barrier as to the rate at which
transactions can be processed.
Data base machines thus provide a good example of the need for improved
multiprocessor systems. Three basic approaches, namely the hierarchical,
network, and relational, have been proposed for the implementation of
large scale data base machines. The relational data base machine, which
permits easier user access to given data in a complex system by using
tables of relationships, has been recognized as having powerful potential.
Typical publications, such as an article entitled "Relational Data Base
Machines", published by D. C. P. Smith and J. M. Smith, in the March 1979
issue of IEEE Computer magazine, p. 28, U.S. Pat. No. 4,221,003 and
articles cited therein illustrate the state of the art.
Sorting machines also provide an example of the need for improved computing
architecture. A review of sorting machine theory can be found in Searching
and Sorting by D. E. Knuth, pp. 220-246, published (1973) by
Addison-Wesley Publishing Co., Reading, Mass. A number of networks and
algorithms are disclosed that must be studied in detail to appreciate
their limitations, but it is generally true that they are typically
complex schemes having only specific sorting purposes. Another example is
provided by L. A. Mollaar in an article entitled "A Design for a List
Merging Network", in the IEEE Transactions on Computers, Vol. C-28 No. 6,
June 1979 at pp. 406-413. The network proposed utilizes external control
of network merge elements and requires programming to perform specific
functions.
Various workers in the art have considered and are considering specialized
memory and system approaches that are intended to improve access to and
maintenance of information in a relational data base. These approaches
evidence the general recognition of the desirability of the relational
data base machine. In their present forms, however, they violate the
principle of utilizing the most advantageous cost per bit technology that
is presently available, because they inherently require development of
futuristic systems of ultimately unknown performance and economic
viability. Furthermore, these proposals are so preliminary in nature that
they cannot for some time confront the practical difficulties involved
with a working data base machine, in which data must not only be accessed,
but must further be updated, corrected as necessary, sorted, merged,
rolled back, recovered, and otherwise manipulated to meet the user's
requirements. The incorporation of other features, such as a capability
for expansion of the system, would tend to further delay practical usage
of such system.
Significant recent work on relational data base machines has been concerned
with responding interactively to ever more complex queries. However, the
ability to answer high level and sophisticated queries and the resultant
ease of use and user productivity should not impose penalties on the user
in terms of throughput and response time. It is also evident that, where a
large data base has been accumulated in an organization, the needs of
different activities seeking information from the data base can vary
widely, and thus to meet all the needs satisfactorily requires extensive
knowledge of the system. Although some systems have been devised that
perform all of the needed functions, they do so only for small data bases
and at great expense.
It is highly desirable for many organizations to be able to utilize a given
large main frame system, while obtaining the further cost and reliability
advantages of a multiprocessor. If this can be done, all of the
organization's existing software and hardware can continue to be used and
the effort required to convert to a relational data base system will be
minimized and continuity of day-to-day operations will be assured.
SUMMARY OF THE INVENTION
Systems and methods in accordance with the invention utilize a novel
architecture and organization in which multiple processors are
intercoupled by an active bidirectional network. The bidirectional network
is arranged in a hierarchy of precedence determining nodes, each of which
can concurrently resolve contentions for priority between competing pairs
of messages. It also broadcasts to all processors, from an apex node at
the highest tier in the hierarchy, that message packet having priority.
Tasks to be performed by the individual processors are accepted and
responsive message packets are returned, again via the bidirectional
network.
The network serves in one direction as a high speed decision making tree
whose active circuit nodes function in the time and space domains to make
a prioritized sort. Priority between contending message packets is
determined in accordance with predetermined rules and based upon the data
content in the message packets themselves. Messages of lower priority that
lose in contention within the network are again retried when the prior
transmission is completed.
The priority scheme pertains as well to acknowledgment messages, status and
control messages and special communications. Employing coherent priority
relationships, and timing the application of messages to the network so
that they are entered concurrently, the system eliminates the need for
extensive prefatory and confirmatory exchanges. A message gaining priority
on the network is delivered concurrently to all processors, and the
messages that lose in contention may substantially immediately vie again
for transmission.
The delay introduced by the network is balanced, in the sense that it is
the same for all processors, and is dependent only on the number of node
levels in the hierarchical network. The delay therefore increases only by
one increment for each doubling of the number of processors. In
consequence of such factors, the minimization of support functions, and
the fact that prioritizing is done without interruption of message flow,
transfers on the network contain a very high proportion of data messages.
Systems and methods in accordance with the invention can be advantageously
configured to stand alone or to interface to the I/O subsystems of
existing computers, such as a large or small applications processing
machine (referred to herein as the "host" or "main frame" computer). They
also permit existing operating systems software and applications software
on the "host" which do not use the invention to be used without
modification.
The multiprocessor system utilizes highly cost effective microprocessors.
For a data base system, some microprocessors may be characterized as
interface processors and others of which may be characterized as access
module processors. Both processor types are coupled to the base tier of
the bidirectional network. The access module processors individually
control different secondary storages, such as large capacity disk
memories, each of which contains a portion of the relational data base
arranged in scatter storage fashion. Each secondary storage has both
primary and backup storage portions that are unique and nonredundant
portions of the data base.
When a host computer generates a request, it communicates it via its I/O
channel to an interface processor. The interface processor may determine
that information stored by the access module processors must be retrieved
or otherwise manipulated to satisfy the request.
In all applications of the invention, requests for processing are
communicated by a processor to other processors via packets on the active
logic network. The network delivers such requests on a prioritized basis
and is capable of directing the request to either the specific
processor(s) or to the class of all processors which have the information
or capabilities needed to process the packet. Those processor(s) then
perform the ind | | |