|
Claims  |
|
|
What is claimed is:
1. A mesh connected local area network for interconnecting a multiplicity
of hosts, said network comprising:
a multiplicity of switch means for simultaneously routing a multiplicity of
data packets between hosts in the network; said hosts and switch means
comprising network members;
a multiplicity of point to point link means for interconnecting said switch
means and the hosts in said network, each point to point link means
providing a communication channel between two of said network members;
said multiplicity of point to point link means including spanning tree
links and a multiplicity of additional links;
said multiplicity of switch means and said spanning tree links jointly
forming a spanning tree in which one of said switch means is designated
the root of said spanning tree;
each switch means including reconfiguration means for determining the
position of said switch means in said spanning tree, said reconfiguration
means including:
position denoting means for denoting a tree position within said spanning
tree;
stability denoting means for denoting a stability value which indicates
whether said tree position is stable;
message sending means coupled to said position denoting means for sending a
configuration message to each switch means neighboring said switch means;
said configuration message including said tree position and said stability
value;
message receiving means for receiving configuration messages from
neighboring switch means, for generating a derived tree position which is
a function of the tree position in each received configuration message,
and for replacing the tree position denoted by said position denoting
means with said derived tree position when said derived tree position is
better than the tree position denoted by said position denoting means;
said message sending means including means for sending a configuration
message to said neighboring switch means when said tree position denoted
by said position denoting means is replaced by said derived tree position;
stability evaluating means, coupled to said message receiving means and
said stability denoting means, for setting said stability value to denote
that said tree position is stable when said switch means has sent at least
one configuration message to and has received at least one configuration
from each neighboring switch means, and (1) the last configuration message
received from each of said neighboring switch means includes a better tree
position than said tree position denoted by said position denoting means,
or (2) the last configuration message received from at least one of said
neighboring switch means includes a worse tree position than said tree
position denoted by said position denoting means and the switch's
stability value in the last configuration message received from each such
neighboring switch means with a worse tree position denotes a stable tree
position for said neighboring switch means;
said reconfiguration means including completion detecting means for
determining when said tree position denotes that said switch means is the
root of said spanning tree and said stability denoting means denotes that
said tree position is stable, and for sending a reconfiguration completion
message to said neighboring switch means indicating that all said switch
means are stable;
whereby the switch means in said network determine their relative tree
positions in said spanning tree and when the process of determining those
tree positions has been completed.
2. The mesh connected local area network of claim 1,
each switch means including a multiplicity of port means for coupling the
switch means to other switch means and hosts;
said position denoting means including means for designating as an uplink
port one port means of said switch means which couples said switch means
to another switch means that is closer to said root of said spanning tree,
in accordance with predefined criteria, and for denoting as downlink ports
each port means which is coupled to the uplink port of other ones of said
switch means;
said switch means which is closer to said root of said spanning tree
comprising the parent of said switch means; other switch means coupled to
said port means by said downlink ports comprising the children of said
switch means;
said reconfiguration means including
netlist means for storing a netlist which represents a least a portion of
said network members and the interconnections therebetween;
netlist transmission means for transmitting said netlist to said parent of
said switch means when said stability denoting means denotes that said
tree position is stable; and
netlist receiving means for receiving a netlist transmitted by a
neighboring switch means and for merging said received netlist with said
netlist stored in said netlist means;
said reconfiguration completion message sent by said completion detecting
means including said netlist stored in said netlist means;
said netlist receiving means further including means for replacing said
netlist with said received netlist when a reconfiguration completion
message is received;
said reconfiguration means including means for retransmitting said received
reconfiguration completion message to said children of said switch means;
whereby a complete netlist is stored in each switch means.
3. The mesh connected local area network of claim 2,
said reconfiguration means including
configuring means for denoting as an uplink port each port means of said
switch means which couples said switch means to another switch means that
is closer to said root of said spanning tree, in accordance with
predefined criteria, and for denoting as downlink ports all the other port
means of said switch means; and
route generating means, coupled to said netlist means, for generating a
routing table with an entry for each possible combination of a host
denoted in said netlist and a port means from which a data packet can be
received; each entry in said routing table defining a subset of said port
means of said switch means through which a received data packet can be
retransmitted, said subset of port means being a function of the port
means and the host corresponding to said entry; wherein said subset of
port means includes only selected ones of said port means denoted by said
configuring means as downlink ports when said corresponding port means is
denoted as a downlink port;
whereby said routing means provides deadlock free routing of data packets
through said mesh connected local area network.
4. A mesh connected local area network for interconnecting a multiplicity
of hosts, said network comprising:
a multiplicity of switch means for simultaneously routing a multiplicity of
data packets between hosts in the network; said hosts and switch means
comprising network members;
a multiplicity of point to point link means for interconnecting said switch
means and the hosts in said network, each point to point link means
providing a communication channel between two of said network members;
said multiplicity of point to point link means including spanning tree
links and a multiplicity of additional links;
said multiplicity of switch means and said spanning tree links jointly
forming a spanning tree in which one of said switch means is designated
the root of said spanning tree;
each switch means including reconfiguration means for determining the
position of said switch means in said spanning tree, said reconfiguration
means including:
position denoting means for denoting a tree position within said spanning
tree;
stability denoting means for denoting a stability value which indicates
whether said tree position is stable;
configuration change detection means for detecting a change in the
configuration of said network, including means for detecting the existence
of a new connection between said switch means and another one of said
network members and for detecting the breaking of a connection between
said switch means and another one of said network members;
reconfiguration initiation means, coupled to said configuration change
detection means, for sending a reconfiguration initiation message to
switch means neighboring said switch means when a change in the
configuration of said network is detected by said configuration change
detection means;
reconfiguration computation means, coupled to said position denoting means,
for redetermining the tree position of said switch means in said spanning
tree, said reconfiguration computation means including means for (A)
sending and receiving configuration messages to and from switch means
neighboring said switch means and (B) storing in said position denoting
means a derived tree position derived from said received configuration
messages; each said configuration message including said tree position and
said stability value;
said reconfiguration computation means including stability evaluating means
for setting said stability value to denote that said tree position is
stable when said switch means has sent at least one configuration message
to and has received at least one configuration from each neighboring
switch means, and (1) the last configuration message received from each of
said neighboring switch means includes a better tree position than said
tree position denoted by said position denoting means, or (2) the last
configuration message received from at least one of said neighboring
switch means includes a worse tree position than said tree position
denoted by said position denoting means and the switch's stability value
in the last configuration message received from each such neighboring
switch means with a worse tree position denotes a stable tree position for
said neighboring switch means;
said reconfiguration means including completion detecting means for
determining when said tree position denotes that said switch means is the
root of said spanning tree and said stability denoting means denotes that
said tree position is stable, and for sending a reconfiguration completion
message to said neighboring switch means indicating that all said switch
means are stable;
said multiplicity of switch means including means for transmitting to all
of said switch means network configuration data indicating the tree
positions of all said switch means in said spanning tree; and
each switch means including route selection means for routing received data
packets to specified network members in accordance with said network
configuration data and predefined route selection criteria;
whereby changes in the configuration of the network automatically cause the
switch means in said network to redetermine their tree positions in said
spanning tree. |
|
|
|
|
Claims  |
|
|
Description  |
|
|
This patent application is related to patent application Ser. No.
07/370,248, filed simultaneously herewith, entitled ROUTING APPARATUS AND
METHOD FOR HIGH-SPEED MESH CONNECTED LOCAL AREA NETWORK, which is hereby
incorporated by reference.
The present invention relates generally to computer communications networks
for interconnecting computers and particularly to a mesh connected local
area network for routing information packets between computers.
BACKGROUND OF THE INVENTION
Local area networks (LANs) are commonly used to transmit messages between
relatively closely located computers. Referring to FIGS. 1A, 1B and 2,
there are at least three basic types of organizational architectures for
LANs: linear (shown in FIG. 1A), ring (shown in FIG. 1B), and mesh (shown
in FIG. 2). Ethernet, for example, is a widely used linear LAN for
interconnecting computer workstations, mainframes, and minicomputers.
For the purposes of this discussion linear LANs are defined to be single
channel LANs in which message packets are broadcast so as be heard by all
hosts (H) on the network, although usually only the host that is addressed
by a packet will choose to listen to it.
The present invention solves the primary problems which have heretofore
prevented mesh connected LANs from providing reliable high speed
communications among a large number of interconnected host computers. For
the purposes of this discussion, "a mesh connected network" means a
network of switches connected in an arbitrary topology.
Before explaining the significance of the problems solved by the present
invention, we will briefly consider the differences between mesh connected
local area networks and linear and ring networks, and the motivations for
building mesh connected networks even though such networks are generally
more expensive and complicated than linear and ring LANs.
Linear and ring LANs have the advantage of architectural simplicity and
well known solutions to most of the problems required for successful
commercial application--and have well established records of reliability.
However, linear and ring LANs have at least two major technological
limitations--both the number of hosts (i.e., workstations and other
computers) and the quantity of data that can be transmitted through such
LANs are limited by the availability of only a single data transmission
path. As more and more hosts are added to a linear or ring LAN, the amount
of traffic on the single data path will increase and the average amount of
time that each host must wait to send a message will also increase.
Eventually, if enough hosts share a single LAN the delays will become
unacceptable.
It can be shown that simply increasing the rate of data transmission on
linear and ring LANs does not completely solve the problem of network
congestion because some of the delays in such networks are related to the
length of time that it takes for a message to traverse the length of the
network--i.e., some delays are proportional to the physical length of the
network, regardless of the rate of data transmission.
For instance, it has been shown that the maximum usable data transmission
rate in linear LANs is inversely proportional to the physical length of
the network's channel. As a result, it would appear that useful linear
LANs cannot use data transmission rates much higher the 10 Megabaud rate
currently used by Ethernet--because the use of substantially higher data
rates will restrict the length of the network. In addition, linear LANs
have the problem that, since only one data packet can be sent at a time,
there must be a mechanism for deciding who (i.e., which host on the LAN)
will have control of the LAN at any one time. A simple consideration of
signal speed limitations imposed by the speed of light indicates that the
length of linear LANs must be fairly limited (e.g., to several
kilometers), and that network performance will degrade as more hosts are
added to the LAN because of contention for control of the LAN.
While ring LANs can run at arbitrarily high data rates, rings LANs suffer
from high latency--the delay between transmission and receipt of a
message, which is proportional to the length of the network and the number
of nodes which must be traversed. Ring LANs are also not very fault
tolerant, and are very limited in terms of their configuration.
While the above noted problems with linear and ring LANs have not overly
hampered their usefulness so far, the growing need for LANs with hundreds
of hosts and for data transmission rates in the range of 100 Megabits per
second exceeds the capability of the presently existing linear and ring
LANs.
The primary advantage of using a mesh connected LAN is the availability of
many parallel communications paths. This allows the simultaneous
transmission of messages between different pairs of network hosts. Thus a
mesh connected network can achieve much higher bandwidth than a comparable
linear or ring network--because the throughput of the network is not
limited by the throughput limitations of the network's links.
Another advantage of mesh connected networks over ring LANs is that mesh
networks can have relatively low latency. Latency is generally
proportional to the number of nodes that must receive and retransmit a
message packet. A well designed mesh LAN can have a relatively small
number of nodes between any selected pair of hosts in comparison to a ring
LAN with a similar number of hosts.
Another advantage of mesh connected networks is that a well designed mesh
connected network will provide several potential communication paths
between any selected pair of hosts, thereby reducing the amount of time
that hosts must wait, on average, before transmitting a message. In other
words, contention for use of the network can be greatly reduced because
many hosts can use the network simultaneously.
Traditionally, while mesh networks have been discussed in computer science
literature and a few patents, mesh networks have never achieved commercial
success due to several well known and relatively intractable problems. In
particular, the most difficult problems have been (1) deadlock, (2)
handling broadcast messages, (3) how to reconfigure the network when a
network component fails, and (4) how to organize the routing of messages
through the network so that the network throughput exceeds the throughput
of a single link. These problems, and their solutions by the present
invention ar described below.
SUMMARY OF THE INVENTION
In summary, the present invention is a high-speed mesh connected network
with high host-to-host bandwidth, low host-to-host latency, and high
aggregate bandwidth. The mesh connected network consists of a number of
interconnected switches which are coupled, in turn, to the hosts that are
members of the local network. The switches are cut-through, nonblocking
switches that are coupled to each other and to the hosts by a multiplicity
of point to point links.
The switches are organized as a spanning tree with one switch being denoted
the root node of the tree. Using a node ranking rule which will be
described below, every switch is ranked in terms of how "close" it is to
the root node.
Every link in the network is denoted as an "up" link in one direction and
as a "down" link in the other direction. The up direction is the one for
which the switch at one end of the link is closer to the root than the
switch at the other end of the link.
In addition, each switch has a routing mechanism for automatically routing
a received message packet toward its target host. In particular, the
routing mechanism of the present invention allows numerous packets to be
routed simultaneously through the network, and prevents deadlock by
ensuring that all message packets follow a sequence of one or more up
links, followed by one or more down links. No up links are traversed after
the message packet has been routed down even a single down link.
High aggregate bandwidth is achieved by simultaneously routing many data
packets through the network. Low latency is achieved, in part, by
providing switches which start retransmitting (i.e., forwarding) packets
well before receiving the ends of those packets. This is known as
cut-through switching.
A packet buffering scheme prevents node starvation and enables the routing
of broadcast messages. In addition, the flow control and data buffering of
the present invention compensates for any mismatches between the clock
rates of neighboring switches.
The present invention includes a number of self-management features that
overcome problems which have previously prevented commercial application
of mesh connected networks. The switches in the network automatically
detect any changes in the configuration of the network, such as the
addition of switches and links as well as the removal or failure of
network components. Upon detecting a change in the network configuration,
all of the switches participate in a distributed reconfiguration process
which automatically and quickly reconfigures the network by recomputing
all the legal paths for routing message packets through the network. The
reconfiguration process is sufficiently fast that it has minimal impact on
the performance and operation of the network.
Important aspects of the reconfiguration process include automatic
identification of the root of the spanning tree and automatic detection of
the completion of the distributed reconfiguration process.
BRIEF DESCRIPTION OF THE DRAWINGS
Additional objects and features of the invention will be more readily
apparent from the following detailed description and appended claims when
taken in conjunction with the drawings, in which:
FIG. 1A is a block diagram of a linear local area network, and FIG. 1B is a
block diagram of a ring local area network.
FIG. 2 is a block diagram of a small mesh connected local area network in
accordance with the present invention.
FIG. 3 is a more detailed diagram of a section of a local area network in
accordance with the present invention.
FIG. 4 depicts an example of deadlock in a mesh connected LAN.
FIG. 5 is a conceptual diagram of the concept of up and down links in a
mesh connected LAN.
FIG. 6 is a timing diagram depicting the transmission of a data packet and
the corresponding flow control signals.
FIG. 7 is a block diagram of a network controller for one host computer.
FIG. 8 is a block diagram of the switch used in the preferred embodiment.
FIG. 9 is a block diagram of the crossbar switch used in the preferred
embodiment.
FIG. 10 is a block diagram of the data flow control circuitry for a chain
of connected network members.
FIG. 11 is a block diagram of two connected link units in a switch.
FIG. 12 is a detailed block diagram of a link unit.
FIG. 13 is a block diagram of the router used in the switch of FIG. 8.
FIG. 14 schematically depicts the process of selecting a link vector from a
routing table using the network address as part of the lookup address.
FIG. 15 is a block diagram of the route selection mechanism of the router
in FIG. 13.
FIG. 16 is a timing diagram for the router of FIG. 13.
FIG. 17 depicts a mesh network as a spanning tree.
FIG. 18 is a flow chart of the first phase of the network reconfiguration
process.
FIG. 19 depicts the primary data structures used during the second and
third phases of the network reconfiguration process.
FIG. 20 is a flow chart of the second and third phases of the network
reconfiguration process.
DESCRIPTION OF THE PREFERRED EMBODIMENT
FIG. 2 shows a conceptual representation of a mesh connected local area
network 100 in accordance with the present invention, although many
important features of the present invention are not shown in this Figure.
Unlike the prior art mesh networks, there is no particular hierarchy of
nodes and no requirements as to how the nodes of the network are
interconnected. The nodes of the network could be randomly interconnected
and the network would still function properly, although a well thought out
set of interconnections will provide somewhat better performance.
In FIG. 2 the host computers which use the network are labelled H, and the
nodes which comprise the local area network (LAN) are called switches and
are labelled S. In this conceptual diagram sixteen switches are used to
interconnect about eighty hosts. It should be noted that the switches S
are multiported, cut-through nonblocking switches which can simultaneously
couple a multiplicity of incoming links to various selected outgoing
links. These switches enable numerous data packets to be simultaneously
routed through the network.
GLOSSARY
To clarify the following discussion, the following definitions are
provided.
"Channel" is the term used to refer to one half of a link, as defined
below. In general, each channel is a single direction communication
channel for transmitting data packets between two members of a network. In
some contexts a channel is called an "up link" or "down link" to identify
the direction of data flow in the channel.
A "host" is any computer or workstation that is connected to the network
and which can be used to send and receive messages. Each letter "H" in
FIG. 2 represents one host.
A "member of the network" or "network member" is either a host or a switch.
A "mesh connected network" is a network of switches connected in an
arbitrary topology.
A "message" is any set of information or data which is transmitted from one
member of the network to another. As will be explained in detail below,
most messages are sent from one host to another, but occasionally, network
control messages are sent from one switch to another.
"Packet", "data packet" and "message packet" all mean the basic unit of
information which is transmitted through the network. Basically, any set
of information that is sent through the network is first packaged into one
or more packets. Each packet includes a header that specifies the
destination of the packet, and a tail which declares the end of the
packet. Thus a short message (e.g., less than 10,000 bytes) will be
typically transmitted as a single packet, whereas a long message (e.g., a
long document or data file) will be broken into a stream of consecutively
transmitted packets.
"Retransmitting" a packet means forwarding a packet that has been received
or partially received by a switch.
A "port" is the circuit in a switch (or host) which couples the switch (or
host) to a link.
A "switch" is a physical device that is used to receive and route packets
through the network. In the preferred embodiment switches can be connected
to at least a dozen hosts and/or other switches. Each circle in FIG. 2
labelled "S" represents one switch.
A "link" is the apparatus which physically connects any two members of the
network. In FIG. 2 each straight line between a host H and a switch S, or
between two switches, represents a link. In the context of the present
invention, each link between two network members is a full duplex, two way
channel, which allows simultaneous communications in both directions. Both
ends of each link are terminated by a "link circuit" which is also called
a "port".
A "network address" is a value, assigned to each network member, used to
index into a "routing table". The entry in the routing table specified by
the network address provides information corresponding to legal routes
through the network to the network member.
"Reconfiguration" is the process of determining all the legal data
transmission paths for data packets being transmitted by the network.
Every time that a new switch of link is added to the network, and every
time that a switch or link is removed from the network or fails to work
properly, a network reconfiguration takes place. An important feature of
the present invention is that not all of the physical multi-link paths
between two hosts are legal transmission paths.
"Spanning tree," as used herein, means a representation of the
interconnections between the switches in a mesh connected network.
Technically, a spanning tree is a non-cyclic connected subgraph which
represents a portion of the network, excluding the host computers and
certain links between switches. The excluded links make the network an
acyclic graph rather than a tree because the nodes of the spanning tree
can have interconnections within each level of the graph.
A "Netlist" is a representation of the switches and links between switches
in a network.
The "root", "root node" or "root switch" of a network is a switch S which
is designated as the root of the spanning tree representation of the
network. The root node has several special responsibilities during
reconfiguration of the network, and also for retransmitting broadcast
messages, i.e., messages that are sent to all of the hosts in the network.
NETWORK CONNECTIONS AND ROUTING
Referring to FIG. 3, there is shown one section of a mesh connected network
in accordance with the present invention. In the preferred embodiment,
each host 120 in the network has a network controller 122 which couples
the host 120 to two distinct switches (e.g., switches 124 and 126 in the
case of host 120). The two links 128 and 130 which couple the host 120 to
switches 124 and 126 are identical, except that only one of the two links
is active at any one time. For this reason link 130 is shown as a dashed
line to indicate that it is inactive.
Whenever the active link between a host computer and a switch fails, the
host's network controller 122 automatically activates the other link
130--thereby reconnecting the host to the network. In addition, it is
strongly preferred that the two links 128 and 130 for each host be coupled
to two different switches so that if an entire switch fails all the hosts
coupled to that switch will have alternate paths to the network.
Generally, the provision of two alternate paths or channels from each host
to the network provides sufficient redundancy that no single hardware
failure can isolate a host from the network.
It is noted here that each "link" between network members is actually two
communications channels which simultaneously carry data in opposite
directions. In the preferred embodiment, each link 128 in the network can
be up to 100 meters in length when coaxial cable is used, and up to 2
kilometers miles in length when fiber optic cabling is used.
When using coaxial cable, the amount of wiring needed by the network can be
reduced by using a single line of cable to simultaneously transmit signals
in both directions over the link. At each end of the cable there is a
transmitter and a detector. The detector regenerates the signals sent by
the transmitter at the other end of the cable by subtracting the output of
the transmitter at the same end of the cable from the signal received by
the detector at its end of the cable. Such full duplex, single wire
communication channels are well known, and are not essential to
implementing the present invention.
Numerous data packets can be simultaneously transmitted through the
network. For example consider the example of a first packet being sent
from host 132 to host 134 while a second packet is sent from host 136 to
host 138. FIG. 3 shows a route P1, comprising three links coupled by two
switches which can be used for sending the first packet from host 132 to
host 134. Route P2 shown in FIG. 3 can simultaneously be used to send the
second packet from host 136 to host 138. In this example, both data
packets are simultaneously routed through switch 140. This is possible
because the switches used in the present invention are multiported
nonblocking switches. Each switch contains a crossbar circuit which can
simultaneously couple a multiplicity of incoming links to distinct
outgoing links.
While packets are generally sent from one host H in the network to another
host H, it is noted that during reconfiguration of the network data
packets are sent to computers in the switches themselves. This aspect of
data packet routing will be discussed below in the sections entitled
Routing Tables and Reconfiguration Process.
Deadlock
One of the features of the present invention is that it provides a mesh
connected network that cannot suffer from "deadlock". Deadlock in a
network can be thought of as the electronic analog of gridlock at a
traffic intersection. FIG. 4 shows four host computers A, B, C and D and
four switches S. Each host is trying to send a data packet 148 to another
host that is separated from the transmitting host by two switches. The
destination of each packet is denoted by the label in the box 148 which
symbolizes the packet. For instance, the packet 148 being sent by host A
has a destination of host C.
For the purposes of this example it is assumed that the data packets being
sent are larger than the data buffers in the switches, and that therefore
the data packets will occupy a chain of two or more links during the
transmission of the packet. As shown in FIG. 4, the progress of each data
packet is blocked because the link needed for the next step of the
transmission is blocked by another one of the packets.
As will be appreciated by those skilled in the art, deadlock can also occur
with small data packets. In particular, the data buffer in a switch can
become filled with two or more data packets, thereby preventing any more
data from being sent through the link that is connected to the filled data
buffer. Thus in FIG. 4 each blocked packet 148 can be replaced by a
sequence of two or more packets, the first of which is being blocked
because the link needed for the next step in its route is blocked by
another one of the packets.
Clearly this deadlocked condition will not happen very often because it
requires four hosts to initiate the sending of new data packets virtually
simultaneously. However, it is unacceptable for deadlock to ever occur
because it will cause the network to "crash" and messages to be lost.
Up/Down Routing
The present invention completely prevents deadlock by using a new type of
routing procedure which automatically routes messages so that they will
not deadlock one another. Referring again to FIG. 4, the implication of
the data paths shown is that one could, at least theoretically, have a
"cycle" in which a data packet is sent into an endless loop through the
four switches. While cycles are not, by themselves, usually a problem, the
availability of data paths which form a cycle is a symptom of mesh
networks which can suffer deadlock.
Referring to FIG. 5, there is shown a somewhat complicated example of a ten
node network of switches S1 to S10. All lines between the switches
represent bidirectional links.
For reasons which will soon be explained, every link between the switches
has been assigned a direction, as indicated by the arrows on the links.
The arrows on the links are said to point "up" toward the root node of the
network. More specifically, when a data packet is transmitted through a
link in the same direction as the arrow on that link, the data packet is
said to be going on an "up link" or, more correctly, an "up channel". When
a data packet is transmitted through a link in the opposite direction as
the arrow on that link the data packet is said to being going on a "down
link" or "down channel".
The basic routing rule used in the present invention is that all legal
routes for a data packet comprise zero or more up channels, followed by
zero or down channels. Once a data packet has been transmitted through a
down channel it cannot be transmitted through an up channel.
The basic routing rule as just stated defines the legal routes for a packet
from a "global perspective"--that is from the viewpoint of someone looking
at the network as a whole. From the perspective a single switch, when a
packet travels on an "up link" to the switch, that packet is received on a
down link. Thus, from the "local switch perspective", packets received on
"down links" can be forwarded on either an up or down link; packets
received on "up links" can be forwarded only on down links.
In addition, it should be understood that for all links between hosts and
switches, the up direction is toward the switch. The channel from a host
computer to a switch is always an up link or channel, and the channel from
a switch to a host is always a down link or channel. Thus when a host
computer transmits a data packet, the first channel that the data packet
goes over is always an up channel. Similarly, the last channel that a data
packet goes over as it is received by a host computer is always a down
channel.
The lower left portion of FIG. 5 comprising switches S1, S3, S5 and S10
will now be used to show why deadlock is impossible using the up/down
routing mechanism. If one tries to impose the data paths from FIG. 4 onto
these switches in FIG. 5, one will see that all of the data paths in FIG.
4 are legal except one: the data path from host B to host D through
switches S3, S5 and then S10 is not legal. This is because the path from
S3 to S5 is a down channel while the path from S5 to S10 is an up channel.
This contradicts the rule that up channels cannot be used after down
channels. The solution is that message from host B to host D must first go
from S3 to S1 (which is an up channel) and then from S1 to S10 (which is a
down channel).
The directionality of each link between switches in the network is
determined as follows. Every switch (and host computer) is permanently
assigned a unique 48-bit identifier, called the UID. Such UIDs are used in
Ethernet networks to uniquely identify every member of the network. As
will be discussed later, every switch in the network is assigned a 7-bit
SHORT ID, and every host computer is assigned an 11-bit network address.
The first rule is that the switch with the lowest UID in the entire network
is called the root node and is assigned a network level of zero. A
corollary of the first rule is that all links to the root node are
directed upwards toward the root.
In FIG. 5 it is assumed that each switch is assigned a UID equal to its
reference numeral: S1 is assigned a UID of 1, then S2 is assigned a UID of
2, and so on. Thus switch S1 is the root and the links to S1 from switches
S2, S3, S9 and S10 are directed upwards toward switch S1.
The second rule is that switches are assigned network levels based on the
minimum number of links between the switch and the root, and that links
between switches at different network levels are directed upward toward
the lower network level. For instance, switch S3 is at network level 1 and
switch S8 is at network level 2, and thus the link from S8 to S3 is upward
toward S3.
The third and final rule for assigning directionality to links is that
links between switches at the same network level are upward toward the
switch with the lower UID. Thus, since switches S2 and S3 are both at
network level 1, the link between them is upward toward S2.
Another example of a legal route through the network is as follows: to send
a packet from host C to host E, the packet could go via path P3 or path
P4. Path P3, which goes through switches S5, S3, S8 and S9, is legal
because it follows up channels and then down channels. Path P4, which goes
through switches S5, S7 and then S8, is legal because it follows two down
channels.
While path P4 is shorter than path P3, path P3 might be preferred if the
| | |