Description
BACKGROUND OF THE INVENTION
The present invention relates to network systems having redundant routers
for receiving packets from a host on a LAN. More specifically, the
invention provides a "standby group" of routers including an "active"
router which handles packets from the host and a "standby" router which
backs up the active router should it fail.
Local area networks (LANs) are commonly connected with one another through
one or more routers so that a host (a PC or other arbitrary LAN entity) on
one LAN can communicate with other hosts on different LANs. Typically, the
host recognizes only those addresses for the entities on its LAN. When it
receives a request to send a data packet to an address that it does not
recognize, it communicates through a router which determines how to direct
the packet between the host and the address. Unfortunately, a router may,
for a variety of reasons, become inoperative (e.g., a power failure,
rebooting, scheduled maintenance, etc.). When this happens, the host
communicating through the inoperative router may still remain connected to
other LANs if it can send packets to another router connected to its LAN.
Various protocols have been devised to allow a host to choose among routers
in a network. Two of these, Routing Information Protocol (RIP) and ICMP
Router Discovery Protocol, are examples of protocols that involve dynamic
participation by the host. The host in a RIP system receives the periodic
routing protocol packets broadcast by the various routers on the system
and thereby keeps track of available routers. If a router stops sending
protocol packets, the host assumes that the router is no longer operative
and stops sending data through that router. Unfortunately, routing
protocol packets contain relatively large amounts of data including all
the specific routes known by the routers. Because the host periodically
receives these rather large packets, the system bandwidth is reduced.
In ICMP Router Discovery, the host keeps track of operative routers by
listening for router reachability messages. These messages contain a list
of IP addresses of usable routers together with preference values for
those routers. Because these messages are relatively small (in comparison
to routing protocol packets received by the host in RIP) and are not
coupled time-wise with any routing protocol, the bandwidth utilization is
improved in comparison with RIP. Nevertheless, both RIP and Router
Discovery require that the host be dynamically involved in the router
selection, thus reducing performance and requiring special host
modifications and management.
In a widely used and somewhat simpler approach, the host recognizes only a
single "default" router. In this approach, the host is configured to send
data packets to the default router when it needs to send packets to
addresses outside its own LAN. It does not keep track of available routers
or make decisions to switch to different routers. This requires very
little effort on the host's part, but has a serious danger. If the default
router fails, the host cannot send packets outside of its LAN. This will
be true even though there may be a redundant router able to take over
because the host does not know about the backup. Unfortunately, such
systems are now widely used in mission critical applications such as stock
trading.
Other systems in which the host becomes overly dependent upon a single
router have similar problems. For example, in a "proxy ARP" protocol, a
router may give a host its address in response to the host's request for
an address outside of its local LAN. Thereafter, the host directs its
traffic through that router. If the host does not often update its ARP
table entry (which lists physical addresses of available routers), it may
continue to assume that it should send all data packets through the same
router, even after that router fails. Unfortunately, when this happens, the
host can no longer communicate outside its own LAN.
In view of the above, it would be desirable to have a network system in
which the hosts are not dynamically involved in router selection, and yet
are able to handle failures by an assigned router.
SUMMARY OF THE INVENTION
The present invention provides a system and protocol for routing data
packets from a host on a LAN through a virtual router. The host is
configured so that the packets it sends to destinations outside of its LAN
are always addressed to the virtual router. The virtual router may be any
physical router elected from among a "standby group" of routers connected
to the LAN. The router from the standby group that is currently emulating
the virtual router is referred to as the "active" router. Thus, packets
addressed to the virtual router are handled by the active router. A
"standby" router, also from the group of routers, backs up the active
router so that if the active router becomes inoperative, the standby
router automatically begins emulating the virtual router. This allows the
host to always direct data packets to an operational router without
monitoring the routers of the network.
In one aspect, the present invention provides a router for use in the
described standby group. Such a router includes (1) a primary router
address; (2) a group virtual address which is adopted by the router when
it becomes the active router of the network segment; (3) means for
assuming the group's virtual address; (4) means for issuing a coup message
to notify a current active router that the router will attempt to replace
the active router; and (5) means for disabling the means for issuing a
coup message. In preferred embodiments, each router of this invention has
the capability of adopting both the standby and active statuses depending
upon the current circumstances in the network.
The coup message provides a router with the ability to take over the role
of active router should it determine that it has a priority higher than
that of the active router. Each router is configured with a priority.
Generally, the router with the highest priority is the active router.
However, if a new router (i.e., a router that has neither the active nor
standby status) having the highest priority enters the network group, it
becomes active router only through a defined protocol. This involves
sending a coup message containing the new router's priority. When the
active router determines that the router sending the coup message has a
higher priority, the active router sends a resign message and removes the
group's virtual address. When this occurs, the active router ceases to
emulate the virtual router--a role now taken by the new router. In some
cases, it will be desirable for a router to be configured so that it will
not send coup messages in this situation. Thus, the routers of this
invention preferably include means for disabling the means for sending
coup messages in a manner such that a new higher priority router entering
the group does not automatically preempt an active router.
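The coup rule described above can be sketched as follows. This is an
illustrative sketch only, not the implementation of the invention: the
Router class, its field names, and the attempt_coup function are
hypothetical, and equal priorities are simply treated as "no coup".

```python
from dataclasses import dataclass


@dataclass
class Router:
    name: str
    priority: int            # configured by a user of the network
    preempt: bool = True     # False disables the sending of coup messages
    is_active: bool = False


def attempt_coup(new: Router, active: Router) -> bool:
    """Return True if `new` takes over from `active` as the active router."""
    if not new.preempt:
        return False         # coup capability is switched off
    if new.priority <= active.priority:
        return False         # nothing to gain; no coup message is sent
    # The coup message carries the new router's priority. The active
    # router sees that it is higher, resigns, and removes the virtual
    # address, which the new router then adopts.
    active.is_active = False
    new.is_active = True
    return True
```

With the preempt flag cleared, a higher priority router entering the group
leaves the current active router in place, which is the behavior the
disabling means is intended to provide.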
A router can determine when the active or standby router is no longer
operating by listening for "hello" messages from these routers. Thus, the
routers of this invention preferably include means for sending and
receiving hello messages. The hello message preferably includes a router
priority, a router status (e.g., active, standby, or new), and the group
virtual address. Thus, the listening routers can determine a speaking
router's status and priority. If a new router determines that the priority
of the active router is lower than its own, it may send a coup message. If
a new router no longer hears hello messages issuing from the active or
standby router, it can assume that router is no longer operational.
Thereafter, the new router together with other new routers can elect and
install a replacement standby and/or replacement active router. This
election preferably is performed automatically, without requiring that a
user intervene to specify the replacement router.
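As a rough illustration of hello-based failure detection, the sketch below
records when a hello was last heard for each status and reports a status
as silent once a timeout has elapsed. The Hello and HelloMonitor names, the
hold_time parameter, and the sample address are assumptions made for the
example; the text above specifies only the three fields carried by a hello
message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hello:
    priority: int          # sender's configured priority
    status: str            # "active", "standby", or "new"
    virtual_address: str   # the group's virtual address


class HelloMonitor:
    """Remembers when a hello was last heard for each router status."""

    def __init__(self, hold_time: float) -> None:
        self.hold_time = hold_time   # assumed timeout, in seconds
        self.last_heard = {}         # status -> time of the last hello

    def on_hello(self, hello: Hello, now: float) -> None:
        self.last_heard[hello.status] = now

    def is_silent(self, status: str, now: float) -> bool:
        """True if no hello with this status arrived within hold_time."""
        last = self.last_heard.get(status)
        return last is None or now - last > self.hold_time
```

A listening router could call is_silent("active", now) periodically and
treat a True result as the cue to take part in installing a replacement.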
In another aspect, the present invention provides a method for backing up
an active router. The method includes the following steps: (1) specifying
an active router for routing data packets from the host; (2) specifying a
standby router which backs up the active router; (3) causing the active
router to emulate a virtual router; (4) causing the host to address data
packets to the virtual router; and (5) automatically selecting a new
active router based upon a comparison of the priorities of the multiple
routers in the network. In the case of a coup, the step of selecting a new
active router includes the following steps: (a) detecting a coup message
from a new router indicating that it wishes to take over as the active
router; and (b) selecting the new router as the active router if its
priority is higher than that of the active router. In the case where the
active router simply leaves the system (due to a bad connection for
example), the step of selecting a new active router includes the following
steps: (a) determining when an active router has left the network (by no
longer hearing hello messages from the active router, for example); and
(b) if the active router has in fact left the network, selecting the
standby router as the active router. Note that in this case, the standby
router should automatically take over for the active router, and the other
routers in the system must then decide among themselves which one will
become the new standby router.
These and other features and advantages of the present invention will be
presented in more detail in the following specification of the invention
and the figures.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a router that may be used in this invention;
FIG. 2a is a block diagram of a network segment in accordance with this
invention having a standby group of routers and a virtual router for the
standby group;
FIG. 2b is a block diagram of a network segment having two standby groups
of routers, each having a router which emulates a group virtual router;
FIG. 3 is a process flow diagram showing generally the steps involved in
replacing a departing active router (which emulates the group virtual
router) with a standby router;
FIG. 4 is a process flow diagram showing the steps involved in replacing a
departing standby router with a new router from a group of routers;
FIG. 5 is a process flow diagram presenting the processes by which a new
router entering a network segment can become an active router in
accordance with this invention;
FIG. 6 is a state diagram of a router in a preferred embodiment of this
invention; and
FIG. 7 is a chart showing the events which cause a router of FIG. 6 to
change states.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
1. Definitions
The following terms are used in the instant specification. Their
definitions are provided to assist in understanding the preferred
embodiments described herein.
A "host" is a PC or other arbitrary network entity residing on a LAN and
communicating with network entities outside of its own LAN through a
router or bridge.
A "router" is a piece of hardware which operates at the network layer to
direct packets between various LANs of the network. The network layer
generally allows pairs of entities in a network to communicate with each
other by finding a path through a series of connected nodes.
An "IP (internet protocol) address" is a network layer address for a device
operating in the IP suite of protocols. The IP address is typically a 32
bit field, at least a portion of which contains information corresponding
to its particular network segment. Thus, the IP address of a router may
change depending upon its location in a network.
A "MAC address" is an address of a device at the sublayer of the data link
layer, defined by the IEEE 802 committee that deals with issues specific
to a particular type of LAN. The types of LAN for which MAC addresses are
available include token ring, FDDI, and ethernet. A MAC address is
generally intended to apply to a specific physical device no matter where
it is plugged into the network. Thus, a MAC address is generally hardcoded
into the device--on a router's ROM, for example. This should be
distinguished from the case of a network layer address, described above,
which changes depending upon where it is plugged into the network. MAC is
an acronym for Media Access Control.
A "virtual address" is an address shared by a group of real network
entities and corresponding to a virtual entity. In the context of this
invention, one router from among a group of routers emulates a virtual
router by adopting one or more virtual addresses, and another entity
(usually a host) is configured to send data packets to such virtual
address(es), regardless of which router is currently emulating the virtual
router. In preferred embodiments, the virtual addresses encompass both
MAC layer and network layer addresses. Usually various members of the
group each have the capability of adopting the virtual address (although
not at the same time) to emulate a virtual entity.
A "packet" is a collection of data and control information including source
and destination node addresses, formatted for transmission from one node
to another. In the context of this invention, it is important to note that
hosts on one LAN send packets to hosts on another LAN through a router or
bridge connecting the LANs.
2. Overview
The invention employs various process steps involving data manipulation.
These steps require physical manipulation of physical quantities. Usually,
though not necessarily, these quantities take the form of electrical or
magnetic signals capable of being stored, transferred, combined, compared,
and otherwise manipulated. It is sometimes convenient, principally for
reasons of common usage, to refer to these signals as bits, values,
variables, characters, data packets, or the like. It should be remembered,
however, that all of these and similar terms are to be associated with the
appropriate physical quantities and are merely convenient labels applied
to these quantities.
Further, the manipulations performed are often referred to in terms, such
as estimating, running, selecting, specifying, determining, or comparing.
In any of the operations described herein that form part of the present
invention, these operations are machine operations. Useful machines for
performing the operations of the present invention include general purpose
and specially designed routers or other similar devices. In all cases,
there should be borne in mind the distinction between the method of
operations in operating a router or computer and the method of computation
itself. The present invention relates to method steps for operating a
router in processing electrical or other physical signals to generate
other desired physical signals.
The present invention also relates to an apparatus for performing these
operations. This apparatus may be specially constructed for the required
purposes, or it may be a general purpose programmable machine selectively
activated or reconfigured by a computer program stored in memory. The
processes presented herein are not inherently related to any particular
router or other apparatus. In particular, various general purpose machines
may be used with programs written in accordance with the teachings herein,
or it may be more convenient to construct a more specialized apparatus to
perform the required method steps. For example, the routers of this
invention are preferably specially configured models AGS, MGS, CGS, TGS,
IGS, 2000, 3000, 4000, and 7000 available from Cisco Systems, Inc. of
Menlo Park, Calif. The general structure for a variety of these machines
will appear from the description given below.
Referring now to FIG. 1, a router 10 of the present invention includes a
master central processing unit (CPU) 62, low and medium speed interfaces
68, and high speed interfaces 12. In preferred embodiments, the CPU 62 is
responsible for such router tasks as routing table computations and
network management. It may include one or more microprocessor chips 63
selected from complex instruction set computer (CISC) chips (such as the
Motorola 68040 microprocessor), reduced instruction set computer (RISC)
chips, or other available chips. In a preferred embodiment, a memory 61
(such as non-volatile RAM and/or ROM) also forms part of CPU 62. However,
there are many different ways in which memory could be coupled to the
system.
The interfaces 12 and 68 are typically provided as interface cards.
Generally, they control the sending and receipt of data packets over the
network and sometimes support other peripherals used with the router 10.
The low and medium speed interfaces 68 include a multiport communications
interface 52, a serial communications interface 54, and a token ring
interface 56. The high speed interfaces 12 include an FDDI interface 24
and a multiport ethernet interface 26. Preferably, each of these
interfaces (low/medium and high speed) includes (1) a plurality of ports
appropriate for communication with the appropriate media, and (2) an
independent processor such as the 2901 bit slice processor (available from
Advanced Micro Devices corporation of Santa Clara, Calif.), and in some
instances (3) volatile RAM. The independent processors control such
communications intensive tasks as packet switching and filtering, and
media control and management. By providing separate processors for the
communications intensive tasks, this architecture permits the master
CPU 62 to efficiently perform routing computations, network
diagnostics, security functions, etc.
The low and medium speed interfaces are coupled to the master CPU 62
through a data, control, and address bus 65. High speed interfaces 12 are
connected to the bus 65 through a fast data, control, and address bus 15
which is in turn connected to a bus controller 22. The bus controller
functions are provided by a processor such as a 2901 bit slice processor.
Although the system shown in FIG. 1 is a preferred router of the present
invention, it is by no means the only router architecture on which the
present invention can be implemented. For example, an architecture having
a single processor that handles communications as well as routing
computations, etc. would also be acceptable. Further, other types of
interfaces and media could also be used with the router.
FIGS. 2a and 2b show network segments including routers R interconnecting a
host H on a LAN L with one or more other LANs in a network N. For the
purposes of this invention, any LAN supporting broadcast and link layer
addressing independent of exact physical location is acceptable. It is to
be understood that the LAN L includes other network entities in addition
to the host H, but in the interest of simplifying the figures, these
entities are not shown. Further, it should be understood that the routers
in FIGS. 2a and 2b are connected to at least one other LAN or WAN in
addition to LAN L shown in the figures. Still further, for this invention,
any data processing device in a LAN may be considered a host. For example,
the host H may be a terminal, personal computer, workstation,
minicomputer, mainframe, etc. It should be understood that the hosts may
be manufactured by different vendors and may also use different operating
systems such as MS-DOS, UNIX, OS/2, MAC OS and others.
Referring to FIG. 2a, a network segment 118 includes host H on LAN L, a
group of routers (including routers R1, R2, and R3) on cable 120, and
virtual router R4. Host H is connected to routers R1, R2, and R3 via cable
120 and bidirectional line 74. Bidirectional line 74 and cable 120 may be
any suitable media such as coaxial cable, shielded and unshielded twisted
pair wiring, fiber optic line, radio channels, and the like. The LAN in
which the host resides may assume a variety of topologies, including ring,
bus, star, etc. Further, these LANs may have different physical
configurations such as token ring (IEEE 802.5), ethernet (IEEE 802.3), and
fiber distributed data interface or "FDDI" (ANSI X3T9.5).
At any one time, one of the routers R1, R2, or R3 assumes the state of
active router, a condition requiring that it emulate the virtual router
R4. The host H is configured to point to virtual router R4, regardless of
which real router (R1, R2, or R3) is currently emulating it. Thus, when
the host H needs to send data packets outside of LAN L, it directs them to
virtual router R4. A virtual router in this invention is defined by
virtual MAC layer and network layer (e.g., IP) addresses which are shared
by a group of routers running the protocol of this invention. The router
selected by the protocol to be the active router (R1, R2, or R3 in this
case) adopts these virtual MAC and network layer addresses--possibly in
addition to its own addresses--and thus receives and routes packets
destined for the group's virtual router. In the router group shown in FIG.
2a there will be 4 pairs of addresses (each pair includes a MAC and a
network layer address): one for router R1, one for router R2, one for
router R3, and one for the group or virtual router R4.
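The four address pairs of the FIG. 2a group can be pictured with a small
lookup table. The sketch below is illustrative only: every MAC and IP
value is invented for the example, and the function names are
hypothetical. It shows that a frame sent to the virtual MAC address is
received by whichever real router is currently active.

```python
from typing import Optional, Set

# Hypothetical (MAC, IP) pairs; none of these values come from the text.
ADDRESSES = {
    "R1": ("02:00:00:00:00:01", "10.0.0.1"),
    "R2": ("02:00:00:00:00:02", "10.0.0.2"),
    "R3": ("02:00:00:00:00:03", "10.0.0.3"),
    "R4": ("02:00:00:00:00:04", "10.0.0.4"),  # the virtual router
}
REAL_ROUTERS = ("R1", "R2", "R3")
VIRTUAL_ROUTER = "R4"


def accepted_macs(router: str, active: str) -> Set[str]:
    """MAC addresses a real router answers to, given who is active."""
    macs = {ADDRESSES[router][0]}
    if router == active:
        # The active router adopts the virtual MAC in addition to its own.
        macs.add(ADDRESSES[VIRTUAL_ROUTER][0])
    return macs


def receiver(dest_mac: str, active: str) -> Optional[str]:
    """Name of the real router that receives a frame sent to dest_mac."""
    for router in REAL_ROUTERS:
        if dest_mac in accepted_macs(router, active):
            return router
    return None
```

The host's configuration never changes in this picture; only the value of
the active argument does.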
One of the routers in the group (R1, R2, or R3) assumes the state of
standby (or backup) router. When the standby router detects that the
active router has failed, it takes over as the active router by adopting
the group's MAC and IP addresses. A new standby router is automatically
selected from among the other routers in the group--assuming there are
more than two routers in the group. In the simple example provided in FIG.
2a, if the router R1 is initially the active router, the host will send
packets through R1 because R1 has adopted the MAC and network layer
addresses of R4. Further, if router R2 is the standby router, a failure by
R1 will cause R2 to become the active router. After such failure, the host
will continue sending data to the MAC and IP addresses of R4 even though
that data is now transferred through a different router. It is important to
recognize that any router in a standby group can assume the roles of
standby or active router.
Further, a new router within the group may attempt a coup of the active
router if it believes that it meets the conditions necessary to perform as
active router. In this case, the new router (e.g., R3) first determines
whether it has "priority" over the current active router (explained
below). If so, it issues a coup message and the current active router
resigns, whereupon the new router takes over the status of active router.
The procedure of selecting active routers based upon priority has some
elements in common with the procedures employed in the routing protocols
OSPF and IS-IS. However, unlike these conventional routing protocols, the
goal of the present invention is to provide an active router which
emulates a virtual router for a host's benefit. Further, the present
invention provides a mechanism by which the preempt capability (ability to
coup) can be switched off so that the new router does not automatically
take over as active router when it enters the network group. This new
feature is desirable because network operation may be delayed for a short
period while the coup takes place. Thus, the ability to switch off the
preempt capability may prevent unnecessary system delays.
Referring now to FIG. 2b, a network segment having two groups of routers is
shown. Such network segments are appropriate when different hosts on a LAN
have their own standby groups of routers. In other words, each host has
its own active router, standby router, and new routers (if there are more
than two routers in the standby group). Each such standby group connected
to the LAN follows the protocol for emulating a virtual router and
selecting active and standby routers as described above. Employing
multiple standby groups might be beneficial in a situation where different
groups of users within an organization (a marketing group and a finance
group, for example) share the same LAN. The marketing group may use one
standby group of routers to send packets outside of the LAN while the
finance group uses another standby group to send its data packets outside
of the LAN. In FIG. 2b, host H1 has a standby group of routers 124. This
group includes real routers R1 and R2 as well as virtual router R3.
Similarly, host H2 in LAN L sends its data packets through standby group
126 on network N. Standby group 126 includes real routers R5, R6, and R7
together with virtual router R4. All routers connected with LAN L are
connected through cable 130. Each of standby groups 124 and 126 has its
own network layer and MAC addresses which are adopted by the active router
emulating the virtual router. In some situations, a given router may exist
in two different groups. For example, in FIG. 2b, router R5 might exist in
group 124 as well as group 126. To do so, it would have to be configured
to adopt the MAC addresses for virtual routers R3 and R4 as well as its own
physical MAC address. In theory, such a router could be a member of as
many groups as the number of additional MAC addresses it could adopt.
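Membership in several standby groups can be sketched as a router holding
one extra MAC address per group, bounded by the number of additional
addresses it is able to adopt. The class name, the max_extra_macs limit,
and the addresses below are hypothetical and serve only to make the
constraint concrete.

```python
class MultiGroupRouter:
    """A router configured for the virtual MAC of each group it joins."""

    def __init__(self, physical_mac, max_extra_macs):
        self.physical_mac = physical_mac
        self.max_extra_macs = max_extra_macs  # assumed hardware limit
        self.virtual_macs = {}                # group id -> virtual MAC

    def join(self, group, virtual_mac):
        is_new_group = group not in self.virtual_macs
        if is_new_group and len(self.virtual_macs) >= self.max_extra_macs:
            raise ValueError("no free MAC address slot for another group")
        self.virtual_macs[group] = virtual_mac

    def configured_macs(self):
        """The router's own MAC plus one virtual MAC per joined group."""
        return {self.physical_mac, *self.virtual_macs.values()}
```

For example, a router standing in for R5 with max_extra_macs=2 could join
groups 124 and 126, and a third join would be rejected.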
In each of the above examples, the standby group included at least two
routers. In preferred embodiments, standby groups include at least three
real routers. However, some redundancy can also be obtained with a single
router and two interfaces using "dial backup." In this embodiment, one
interface is designated a primary interface and the other a backup. When
the primary interface fails, the backup interface begins to be used.
The standby protocol of this invention can be run on any of a number of
transport protocols including TCP ("Transmission Control Protocol"), UDP
("User Datagram Protocol"), CLNP, and XNS ("Xerox Network System").
Preferably, UDP is used as the transport protocol of this invention.
As noted above, the routers of this invention preferably run on an IP
network layer. However, their application is not limited to any specific
network layer protocol. For example, the standby protocol of this
invention could also run on IPX which is a network layer protocol used
underneath "Netware".TM. available from Novell, Inc. of Provo, Utah. When
the standby protocol of this invention is implemented for IPX, an active
router must emulate a virtual IPX router rather than a virtual IP router.
Such an active router will be the only router in the group to respond to
GNS ("Get Nearest Server") requests issued by hosts.
It should also be recognized that the protocol of this invention can in
some circumstances be used to emulate virtual bridges (as opposed to
virtual routers). For example, SRB ("source routing bridging") is a
protocol allowing for multiple bridges operating in parallel. In
implementing this invention in SRB, one bridge from a group would have to
emulate a virtual bridge. For example, a virtual bridge number could be
employed in much the same manner as the virtual IP addresses used for
router standby groups.
3. A Router Enters or Leaves the Network Group
In a preferred embodiment, routers enter and leave the network according to
a procedure which efficiently determines whether an active router must be
replaced, and if so, determines how that router is to be replaced. A
router may leave a network segment in one of two ways: (1) it can simply
go down without first notifying the other routers, or (2) it can
officially resign by broadcasting its departure. Examples of the first
case include a router abruptly losing power, crashing, system reloading,
etc. Examples of the second case include scheduled maintenance, etc.
Generally, the broadcast resignation is preferable because it allows other
routers in the network to take immediate steps and thereby smooth the
transition. A router which leaves the group can subsequently reenter, but
cannot immediately assume the role of active or standby router. The
reentering router will have to await appropriate circumstances before
assuming such a role.
To negotiate with one another for the statuses of active and standby
routers, the routers of this invention can send three types of
relevant messages: hello messages, coup messages, and resign messages.
Hello messages notify other routers in the network that a particular
router is operational in the system. The format of such hello messages is
generally similar to that of the hello messages used in protocols such as
OSPF. Coup messages from local routers tell active routers that a local
router wishes to take over as the active router. Resign messages tell the
other routers that an active router wishes to leave its post.
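The three message types can be represented compactly in code. The sketch
below is an assumption-laden illustration: the numeric codes and the
three-byte layout (type, priority, group number) are invented for the
example and are not the message format defined by this specification.

```python
import struct
from enum import IntEnum


class MsgType(IntEnum):
    # Numeric codes are assumed for illustration only.
    HELLO = 0
    COUP = 1
    RESIGN = 2


# Hypothetical layout: message type, sender priority, and group number,
# one unsigned byte each, in network byte order.
_LAYOUT = "!BBB"


def pack(msg_type: MsgType, priority: int, group: int) -> bytes:
    """Encode a message into the hypothetical three-byte layout."""
    return struct.pack(_LAYOUT, int(msg_type), priority, group)


def unpack(data: bytes):
    """Decode bytes produced by pack() into (type, priority, group)."""
    code, priority, group = struct.unpack(_LAYOUT, data)
    return MsgType(code), priority, group
```

A receiving router would branch on the decoded MsgType to decide whether
it has heard a hello, a coup, or a resignation.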
Depending upon the current router's state and the information contained in
each of these messages, a given router may or may not change its state.
Most generally, the routers of this invention can assume one of three
states: new, standby, and active. As will be explained below, a new router
actually resides in one of four substates. Active routers have adopted
their group's virtual IP and MAC addresses and therefore handle packets
from the group's host that are directed outside of its LAN. The standby
router is available to immediately take over as active router if the
current active router should fail or resign. Both active and standby
routers issue periodic hello messages to let the other routers on the
network know their statuses. New routers may listen for these hello
messages and may under some circumstances issue their own hello messages
or attempt a coup of the active router.
If an active or standby router fails or otherwise leaves a standby group,
it will simply stop sending hello messages. At the end of a defined length
of time during which no hello messages are received from the active
router, the standby router will take over. The new routers in the segment
will then conduct an election to install a new standby router in place of
the one that took over as active router. If neither the active router nor
the standby router is functioning, the new routers will conduct an
election to fill both the active and standby slots. In this case, the new
router with the highest priority assumes the role of active router and the
new router with the second highest priority assumes the role of standby
router.
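The election among new routers can be sketched as a ranking by priority.
The function below is a simplified illustration with hypothetical names;
it assumes distinct priorities and reports only which new routers fill
the vacant slots. When the active router is lost but the standby router
survives, the standby router itself is promoted, so the new routers fill
only the standby slot.

```python
from typing import Dict, Optional, Tuple


def fill_vacancies(
    new_priorities: Dict[str, int],
    active_alive: bool,
    standby_alive: bool,
) -> Tuple[Optional[str], Optional[str]]:
    """Return the new routers elected as (active, standby).

    None means the slot is not filled from the new routers.
    """
    # Highest priority first; distinct priorities are assumed here.
    ranked = sorted(new_priorities, key=new_priorities.get, reverse=True)

    def pick(i: int) -> Optional[str]:
        return ranked[i] if i < len(ranked) else None

    if active_alive and standby_alive:
        return None, None        # nothing to replace
    if active_alive or standby_alive:
        return None, pick(0)     # only the standby slot is open
    return pick(0), pick(1)      # both slots are open
```

For example, with new routers at priorities 90, 120, and 110 and neither
the active nor the standby router functioning, the routers at 120 and 110
take the active and standby roles respectively.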
When a standby router receives an active router's resign message (when, for
example, it is being taken down for scheduled maintenance), the standby
router automatically assumes the role of active router. At the same time,
the new routers (having also received the resign message) anticipate that
there will not be a standby router and conduct their own election. As a
result of the election, a new standby router is installed from among the
group of new routers.
As suggested, each router has a specified priority which is used in
elections and coups of the active router. A priority is configured for
each router by a user of the network. The priority of each router is
preferably an integer between 0 and 255 (i.e., an 8 bit word), with 100
being the default. Generally, the router having the highest priority
should be the active router and the router having the second highest
priority should be the standby router. When routers enter or leave the
network group, the priority-based elections and coups of this invention
smooth the transition so that the group routers can quickly and with
minimal disruption assume their correct status in the system. In the event
that two routers having the same priority are seeking the same status, the
primary IP addresses of these routers are compared and the router having
the higher IP address is given priority.
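The priority comparison with its IP address tie-break can be expressed as
a sort key. This is a minimal sketch with hypothetical function names; it
uses Python's standard ipaddress module so that addresses are compared
numerically rather than as strings.

```python
import ipaddress

DEFAULT_PRIORITY = 100  # default stated in the text above


def rank_key(priority: int, primary_ip: str):
    """Key under which the larger value wins the election or coup."""
    if not 0 <= priority <= 255:
        raise ValueError("priority must fit in 8 bits (0 to 255)")
    return (priority, ipaddress.IPv4Address(primary_ip))


def outranks(a, b) -> bool:
    """True if router a = (priority, ip) takes precedence over router b."""
    return rank_key(*a) > rank_key(*b)
```

Because tuples compare element by element, the IP address is consulted
only when the two priorities are equal.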
Two important events in this invention are detailed in the flow charts of
FIGS. 3 and 4. The first of these involves a standby router taking over
for an active router which has left its standby group for some reason. The
second of these involves a new router taking over for a standby router
which has assumed the role of active router. It should be understood
that these flow diagrams as well as the others presented herein are
provided as convenient representations to aid in understanding the state
transitions of the routers used in this invention. Some of the flow diagrams
are organized in a manner that could imply that the system checks for
certain actions by event loops or polling. No such limitation is intended.
Thus, the process flow charts presented herein should not be read to imply
that the system necessarily checks for events in the order listed.
FIG. 3 presents a process flow diagram showing the conditions under which a
standby router takes over when an active router leaves its standby group.
It should be understood that a standby router can become active under
other circumstances (e.g., receipt of a lower priority hello from the
current active router when the standby router is configured to preempt).
For purposes of FIG. 3, however, it is assumed that the active router has
left without provocation from another router. The other cases will be
addressed in the discussion of FIG. 5 and elsewhere. The process of FIG. 3
begins at 134 and in a step 138, the router under consideration enters the
standby state. Next, the standby router determines whether the current
active router has issued a resign message in a decision step 140. If not,
the standby router determines whether the active router has stopped
sending hello messages in a step 144. As long as decision steps 140 and
144 are answered in the negative, the standby router continues to await an
event in which one of these decisions can be answered in the affirmative.
When that happens, the standby router assumes the role of active router in
a step 146. Thereafter, the process is concluded at 148.
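The two decisions of FIG. 3 can be sketched as a single function that is
evaluated whenever an event arrives or a timer fires. The function name,
the event strings, and the hold_time parameter are assumptions made for
the example; as noted above, no particular polling order is implied.

```python
def standby_step(event, now, last_active_hello, hold_time):
    """Evaluate decision steps 140 and 144 once for a standby router.

    Returns the router's resulting state: "active" or "standby".
    """
    if event == "resign":                      # step 140: resign received
        return "active"                        # step 146: take over
    if now - last_active_hello > hold_time:    # step 144: hellos stopped
        return "active"                        # step 146: take over
    return "standby"                           # keep waiting
```

Any other event, such as a fresh hello from the active router, leaves the
router in the standby state.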
FIG. 4 shows how a router in the new state can take over for a standby
which has left its post in the standby group. The standby router could be
asked to relinquish its post