Description
FIELD OF THE INVENTION
This invention relates to data processing and communications systems in
general and specifically to network control stations and systems in which
problem condition alert signals and messages are defined and sent from
operating entities in the network to the network system operator console
at the network management control program host.
PRIOR ART
Communication and data processing network systems in which alerts are
sent to a central operator's console at a controlling CPU station are
known. Currently, each alerting product must create and
arrange for the storage of product unique screens, identifying an alert
condition to an operator, at the problem management console control point.
These screens are then invoked when a given alert is received to inform
the operator as to what problem or condition is being reported.
Substantial effort is involved in developing the product unique screens
and in implementing them in a coordinated fashion so that alert screens
for each new product attached to a network are available at the control
point for display. Furthermore, the amount of storage required to maintain
a record of the screens at the control points and the amount of
synchronization imposed on the shipment of products by the manufacturers
in the creation and distribution of the product unique alert screens for
the host system consoles have made this approach highly unacceptable.
In the past, using the so-called stored screen alerts discussed briefly
above, an identifying index is specified for each unique alert. Sets of
previously agreed-upon display screens were encoded and stored at the
operator control console and a unique alert identification was sent with
each alert to the operator's console. This enabled the processor at the
operator console to identify which screen was being asked for by the alert
sender. An alert from an IBM 3274 would, for example, carry a number such
as X'08' (hexadecimal). It would also carry an indication that the alert
is from a 3274 controller. Based upon this information, the processor at
the control console would retrieve and display a set of information
display screens for a 3274 and would select from those screens screen
number 8 for immediate display. The IBM System/38 implemented such an
alert structure as described in IBM Technical Disclosure Bulletin Vol. 26,
No. 12 "IBM SYSTEM/38 ALERTS" May 1984.
OBJECTS OF THE INVENTION
In light of the foregoing known problems and difficulties with the prior
art, it is an object of this invention to provide an improved generic
alert code which reduces the alert screen storage and distribution
requirements at the network controlling CPU.
A further object of the invention is to minimize the need for changes to
support addition of new products to the network. Yet a further object of
the invention is to provide data to be displayed in the language used at
the receiving system regardless of the language used at the sending
product.
SUMMARY
Code points, which are strings of bits, are generated in response to an
event in a device attached to a network. The code points are used to index
predefined tables that contain relatively short units of text messages to
be used in building an operator's information display. A product attached
to a network, an alert sender, generates a series of code points
representative of desired display messages for an operator. The messages
are independent of the specific alert sending device insofar as an alert
receiver is concerned.
The alert receiver is selected to handle alerts for a number of components
on the network. The alert receiver has access to a storage area where
display text corresponding to individual code points is stored. When the
alert receiver receives an alert, the code points are used as an index by
the alert receiver to retrieve messages and build a screen display of data
for the operator to review and take appropriate actions.
Code points are assigned in a hierarchical manner so that additional code
points can be defined and sent by an alert sending product without the
need for changing the code points supported by an alert receiver at the
same time. A general or generic error, such as "output device error", is
first defined and identified by a code point. More specific errors are
then defined, such as "printer error" or "printer cassette error". These
detailed errors are given code points which have hierarchically selected
different bits. When an alert receiver does not yet have a copy in
associated storage of the specific error, the message for the general
error is displayed. In the prior art, unless the specific screen was
stored at the alert receiver, no such messages were displayed.
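The hierarchical fallback described above can be sketched as follows. This is a minimal illustration rather than the encoding defined in the SNA formats: it assumes a 16-bit code point whose high-order bits name the generic condition and whose low-order bits refine it. The table contents, the code point values and the name `lookup_text` are hypothetical.

```python
# Hypothetical receiver-side text table, keyed by 16-bit code point.
# Assumption: high-order bits = generic category, low-order bits = refinement.
TEXT_TABLE = {
    0x5000: "OUTPUT DEVICE ERROR",  # generic error
    0x5010: "PRINTER ERROR",        # more specific error
}


def lookup_text(code_point: int, table=TEXT_TABLE) -> str:
    """Return the most specific text the receiver has for a code point."""
    # Try the exact code point first, then progressively clear low-order
    # bits so that an unknown specific error falls back to its generic parent.
    for mask in (0xFFFF, 0xFFF0, 0xFF00):
        text = table.get(code_point & mask)
        if text is not None:
            return text
    return f"UNDEFINED CODE POINT X'{code_point:04X}'"
```

Under these assumptions, a sender could begin using a new code point such as X'5011' for "printer cassette error" before the receiver's table is updated, and the operator would still see "PRINTER ERROR".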
One advantage of using code points to build displays of data is that the
display can be tailored specifically to describe the event giving rise to
the alert. This can be done without storing a large number of displays at
the alert receiver. Great flexibility is provided in that the alert sender
chooses which short messages to send. The code points are short, so that
they do not significantly interfere with transmission of other data on the
network. Thus, greater flexibility and granularity are provided without the
expense of added storage requirements or transmission bandwidth.
The use of code points also aids in the provision of information to an
operator in the language of the operator. The messages may be stored by a
receiver in any language desired, as the index will simply retrieve the
message at the address indicated by a table of code points and addresses.
Thus, alerts for new products do not involve the translation of multiple
screens of alerts into many languages. The alerts for new products instead
may involve only the translation of a few, if any, unique short messages
corresponding to new code points.
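The language independence described above follows from the sender transmitting only code points while the receiver chooses the text table. A minimal sketch, with hypothetical table contents, language tags and function name:

```python
# Hypothetical per-language text tables that share the same code points.
TABLES = {
    "en": {0x5000: "OUTPUT DEVICE ERROR"},
    "fr": {0x5000: "ERREUR DU PERIPHERIQUE DE SORTIE"},
}


def render(code_points, language):
    """Build display lines from code points in the receiver's language."""
    table = TABLES[language]
    # A code point with no stored text is shown in hexadecimal form.
    return [table.get(cp, f"X'{cp:04X}'") for cp in code_points]
```

The same alert therefore produces an English display at one receiver and a French display at another without any change at the sending product.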
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing and still other unenumerated objects of the invention are met
in a preferred embodiment thereof as depicted in the drawings in which:
FIG. 1A illustrates schematically an architectural arrangement of the
communication and data processing system in an IBM SNA architecturally
defined environment.
FIG. 1B schematically illustrates a preferred embodiment of the invention
environment for an IBM System 370 host operating as the network management
control point for communication to an SNA-based communication network.
FIG. 2 illustrates the format for the architecturally defined Network
Management Vector Transport request unit employed in the preferred
embodiment for communication of the alert messages.
FIG. 3A illustrates, in order, the selection of data elements from an alert
message to be inputted into a buffer prior to entry of the buffer contents
into the IEEE 802 standard CRC algorithm calculation device.
FIG. 3B illustrates schematically a program for generating a unique alert
identification number.
FIG. 3C illustrates schematically the basic process flow for generating a
unique alert identification number.
FIG. 4 illustrates schematically the buffer content for a specific example
of an alert message.
FIG. 5 illustrates schematically a portion of a typical communication/data
processing network configuration in which a communication controller
attached to a token ring network operates as the alert sender.
FIGS. 6A,6B,6C,6D illustrate in complete detail a specific example of a
total generic alert message sent to report a wire fault in the system
depicted in FIG. 5.
FIGS. 7,7A,7B,7C,7D illustrate the major vector format to be employed in
the standard NMVT messages.
FIGS. 8,8A,8B,8C,8D illustrate one of the subvector formats to be employed
in the standard NMVT messages.
FIG. 9 illustrates a flow diagram for the display of generic alert
messages.
FIG. 10 illustrates a generic alert list display.
FIG. 11 illustrates a generic alert recommended actions for selected alert
display.
FIG. 12 illustrates a generic alert detail display.
DESCRIPTION OF THE PREFERRED EMBODIMENT(S)
As alluded to briefly above, the invention finds its application in the
present-day complex communication and data processing networks in which a
variety of devices or products suffering from a similar variety of
inherent possible problems must be managed from central control points by
system control operators. In a typical IBM SNA architected system, the
network control functions are provided by a variety of management tools
and processes. Among these offered in an SNA system are automatic
detection, isolation and notification to system operators of existing
resource problems. For an overview of such systems, reference may be had
to a paper entitled "Problem Detection, Isolation and Notification in
Systems Network Architecture" appearing in the Conference Proceedings,
IEEE INFOCOM 86, Apr. 19, 1986.
As discussed at greater length in the referenced paper, the strategic
vehicle for accomplishing the automatic detection, isolation and
notification to the system operator in an SNA network is the Network
Management Vector Transport alert. This alert is an architecturally
defined and published data communication format with specifically defined
contents. Each individual product throughout an SNA network is responsible
for detecting its own problem, performing analysis for isolating the
problem and for reporting the results of the analysis in alert messages
sent to the system control operator. In some cases, a problem may be
isolated to a single failing component in the network and the failing
component will be identified in the alert message. If the failure can be
further isolated, for example, to a specific element within a failing
component, then the element may also be identified in the alert message.
In other cases where it is not possible for the detecting product to
isolate the failure to a network component, the problem detecting product
will send information that will assist the network operator at the system
control console, or alert receiver, to complete isolation of the failure
to a single component. Examples of problems that can be detected by
components in an SNA network are given in the aforementioned paper. The
data that flows in the alert messages reporting the problems is also
specifically described. The Network Problem Determination Application
(NPDA), an IBM program product that presents alert data to a network
operator, is also discussed in brief.
As briefly alluded to, in an SNA network the alert message is the vehicle
for notifying the network operator that a problem exists within the
network. Products throughout the SNA network are responsible for detecting
problems and reporting them via alert messages so that operators at the
central control terminal, usually located at the host system site, can be
aware of problems in all parts of the network. However, the alert message
typically performs more functions than the simple enunciation of the
existence of problems. It also transports data that assists the network
operators in isolating and in ultimately resolving the identified
problems. The alerting task is applicable to all of the resources in the
network. Thus, it makes it possible for an operator at the central control
facility to manage not only the communications resources of the network
such as the controllers, communication links, modems and the like, but
also to manage such system resources as tape drive units, Direct Access
Storage Devices (DASD) and printers, for example. Typically, such system
resource hardware components do not send their own alert messages since
they are not provided with the sophisticated problem detection and
isolation mechanisms together with processing capability to construct and
send the alert messages. Such system resources usually have alerts sent on
their behalf by the network component to which they are attached, for
example, by the controller to which a printer, DASD unit, or the like is
attached.
As discussed in the aforementioned paper, the alert message is encoded and
formatted in an architecturally defined and published manner and is known
as the Network Management Vector Transport (NMVT) message when it flows
through such a network. As such, the alert message consists of a Major
Vector (MV) with an identification that identifies the message as an alert
and a number of included Subvectors (SV) that transport the various types
of alert data to the control point. The major vector/subvector encoding
scheme has several advantages. First, since the format for the message
length is variable rather than fixed, an alert with less data than another
need not carry 0's or padding characters in unused data fields. If the
data to be transported by a given subvector is not present in an alert
from a given product, that subvector is simply omitted altogether.
Secondly, since products that receive alerts, such as IBM's NPDA product
mentioned above, may parse or analyze a major vector and its subvectors,
migration to newer versions of the management program products is
simplified whenever additional data is added to the alert messages. The
new data is simply encoded in a new subvector and the only change
necessary to the management program is the addition of recognition support
for the new subvector.
In the context of such alert message management systems, an important
feature alluded to previously is the filtering of alerts. Filtering is
defined as a procedure in which certain message units or specific alerts
are selected for exclusion or for different treatment at the alert
receiving station, i.e., at the network control console operator's
display. Differences in treatment for specific alert messages may be as
follows:
The specific alert message may be excluded from an alert log and/or from
the alert display at the operator's station. Ordinarily, each alert is
logged and presented to the operator as it arrives. Filters may be set,
however, to specify that a particular alert should be logged only for
later retrieval but not displayed for the operator immediately or perhaps
not even logged. The filtering operation for particular alerts allows
enablement or inhibition of the functions of logging an individual alert,
displaying the alert to a specified operator, forwarding the alert to
another control point for handling, or of the use of the alert as a
trigger mechanism for the displaying of special display screens in place
of those normally used at the control console station. Alert messages that
a given user deems useless for a particular network can be discarded
altogether while others can be routed first to the appropriate node or
station within the network and then to the appropriate operator at that
node for handling.
For certain network configurations or user installations, a particular
alert message may never be useful. In such cases, a filter can be
permanently set at the alert receiver console to discard every instance of
that alert message without logging or displaying it.
Additionally, there may be certain exceptional circumstances, typically
such as scheduled maintenance intervals, in which the alert that is
generated is ordinarily useful and meaningful but is temporarily of no
value. In this case, the filter may be temporarily set to discard any
instances of the alert that are received during the maintenance period.
The filtering capability is especially important because, for certain
types of maintenance procedures, numerous instances of the same alert can
be generated in a very short period of time.
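The filtering behavior described above can be sketched as a table of per-alert actions consulted by the receiver. This is an illustrative sketch only. The class and field names are hypothetical, and the alert identifier is treated as an opaque key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterAction:
    log: bool = True        # record the alert in the alert log
    display: bool = True    # present the alert to the operator
    forward: bool = False   # pass the alert on to another control point


DEFAULT = FilterAction()                          # log and display
DISCARD = FilterAction(log=False, display=False)  # drop silently


class AlertFilter:
    def __init__(self):
        self._rules = {}  # alert identifier -> FilterAction

    def set_rule(self, alert_id, action):
        self._rules[alert_id] = action

    def clear_rule(self, alert_id):
        self._rules.pop(alert_id, None)

    def action_for(self, alert_id):
        # Alerts without an explicit rule receive the default treatment.
        return self._rules.get(alert_id, DEFAULT)
```

A temporary filter for a maintenance interval then amounts to calling `set_rule(alert_id, DISCARD)` at the start of the interval and `clear_rule(alert_id)` at the end.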
As alluded to above, the current implementation of alert messages is based
upon product unique screens which are stored at the control point
operator's station which is typically connected to a host or in a network
control console processor. However, considerable effort is involved in
developing the unique screens and in synchronizing their usage with the
implementation of given products in a network composed of numerous
products from numerous suppliers. Generic alerts, using code points to
index short units of text, provide a more flexible approach to the
transport and display of information in message alerts to the control
point or system control operator's station. In generic alerts, the data
can be transported in a coded form within an alert message and the network
control point product, such as IBM's NPDA, can use the coded data in at
least two ways. First, the coded data is used as an index into predefined
tables containing short units of text to be used in building the display
for the operator. Secondly, the textual data to be displayed can be
defined by the alert data itself. In each case, however, the data
displayed is wholly independent of the product associated with the cause
of the specific alert insofar as the processing of the received message is
concerned. The indexing of text strings by the specifically defined and
encoded code points contained within the string and the displaying of
textual data messages sent in such an alert are done in exactly the same
manner regardless of which product caused the sending of the alert.
As stated earlier, generic alerts in the present invention are encoded in
the architecturally defined and published major vector/subvector/subfield
format. This format is schematically illustrated in FIG. 2 and is defined
in the IBM publication GA27-3136, first published in 1977. The latest
versions of this publication, which are available in the patent application
file, contain completely detailed lists of currently defined code points
for each specific type of error for each specific type of product in a
communication and data processing network. Such detailed lists are not
required for an understanding of the operation and best mode of the
invention. Instead, the location of the code points and a few examples are
provided so that one skilled in the art after reading this
description would be able to practice it without undue experimentation.
The use of the architecturally defined format, unlike fixed format
schemes, makes possible the inclusion in a particular alert message of
only those elements that are necessary. Subvectors and subfields of data
that are not required are simply not included. The encoding scheme as
published and defined is currently in use for most SNA management services
records in the IBM systems.
FIG. 1A illustrates a typical architectural environment for an SNA data and
communication network. Typically, the operator's display console indicated
as box 1 in FIG. 1A is connected to a host CPU 2 which operates a control
point management service program illustrated as CPMS 3 which communicates
with session control program 4 internally in the host CPU 2. The session
control program 4 operates using the Network Management Vector Transport
response unit format over the communications link 5 to establish the
SSCP-PU (System Services Control Point-Physical Unit) SNA session. The
physical unit (PU) may typically be a terminal controller or a terminal
itself if the terminal is provided with sufficient processing capacity.
The terminal controller or terminal will contain the SNA session control
program portion 4 necessary to establish the partner SNA half session as
illustrated in FIG. 1A. The terminal controller or terminal itself 6, as
shown in FIG. 1A, will also contain a processor (not shown) operating a
management services program for the physical unit itself. This is
illustrated as the physical unit management services program block 7 which
communicates with local management services program 8 to manage a given
terminal or controller. For the architected system of FIG. 1A, the typical
physical example is given by FIG. 1B. The operator's console 1, which may
be a typical 3270 display station and keyboard, is connected to a
System/370 host CPU 2 containing the appropriate control point management
services program 3 in the form of IBM's network management control program
offering NPDA or other similar versions of network management control
programs. The SNA session control is managed by a virtual
telecommunications access method such as IBM's VTAM program also operating
within the System/370 host. The communications link 5 links the host to a
plurality of elements in the communication network. Only one element, a
typical IBM 3174 terminal controller is illustrated as the physical unit 6
which contains the necessary programming to support the SNA session,
(illustrated as the half session control program portion 4 in FIG. 1A),
the physical unit management services program 7 and the local management
services program 8 for operating the attached terminals 9 and for
reporting problem alert conditions relative either to the terminal
controller 6 or to the terminals 9.
The communications link 5 typically links the controller 6 to the host 2
and, of course, numerous such controllers and terminals may exist within a
typical complex network.
An architecturally defined and published format for the communication is
the Network Management Vector Transport (NMVT) request unit format shown
in some detail in FIG. 2. This format is used for the communications of
alert messages.
Briefly, the NMVT request unit format comprises a header portion of
information 10 followed by the management services major vector portion
11. The total NMVT request unit may contain up to 511 bytes of information
and so has a highly variable length and data content. As schematically
shown, the NMVT header 10 contains a plurality of subfields of information
with bytes 0 through 2 comprising a portion identified as the NS header.
Bytes 3 and 4 comprise a field of information that has been retired from
use identified as field 16. Field 17, comprising byte 5, is reserved or
retired and field 18 is a procedure related identifier. Bytes 7 and 8
represent data fields 19 and 20, with field 19 being for indicator flags,
a sequence field, and SNA address list indicators as shown in the drawing.
Field 20 is a reserved field.
The management services major vector portion 11 may be further broken down
into fields 12 through 14 as schematically depicted in FIG. 2. A length
indicator comprising bytes 9 and 10 contains a pointer pointing to the end
point of field 14. A key indicator comprising bytes 11 and 12 specifies
the particular type of major vector as will be further described. The
management services subvector field 14 may contain a plurality of bytes of
data specifically selected to represent the problem conditions to be
reported. The specific selection is in accordance with the defined
specification previously noted in the IBM SNA reference manual.
The management services subvector field 14 may be further broken down into
specific subvectors, each of which may be identified by fields 21 and 22
as having a specific length and a specific type with the data field 23
containing specific subfields of data. The data subfield 23 may be further
broken down into subfields within the data each having a length field 24,
an identification key field 25 and subsequent data fields 26.
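The nesting of length, key and data fields described above lends itself to a simple scanning parser. The following is a minimal sketch under stated assumptions: the major vector carries a two-byte length and a two-byte key, each subvector carries a one-byte length and a one-byte key, every length counts its own header bytes, and multi-byte values are big-endian. The function name is hypothetical.

```python
import struct


def parse_major_vector(buf: bytes):
    """Return (key, [(subvector_key, data), ...]) for one major vector."""
    length, key = struct.unpack_from(">HH", buf, 0)
    if length < 4 or length > len(buf):
        raise ValueError("bad major vector length")
    subvectors = []
    pos = 4  # first subvector follows the 4-byte major vector header
    while pos < length:
        if pos + 2 > length:
            raise ValueError("truncated subvector header")
        sv_len, sv_key = buf[pos], buf[pos + 1]
        if sv_len < 2 or pos + sv_len > length:
            raise ValueError("bad subvector length")
        subvectors.append((sv_key, buf[pos + 2 : pos + sv_len]))
        pos += sv_len  # step to the next subvector
    return key, subvectors
```

Because each subvector announces its own length, a receiver written this way can step over a subvector whose key it does not recognize, and an omitted subvector simply does not appear in the returned list.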
As may be readily appreciated, a high degree of flexibility of encoding
data points to construct an alert message is made possible in this system.
However, it will be noted that the alert messages constructed in this
format contain no unique fixed length identifier to describe to the
receiving management program or operator console which specific alert has
been encoded.
The specific solution to this problem is more fully described in U.S. Pat.
No. 4,823,345 for Method and Apparatus for Communication Network Alert
Record Identification, filed on the same date herewith, and having
attorney docket number RA987-001. The above solution is depicted
schematically in FIG. 3C as a two-stage process for generating a unique
alert identification number. As depicted in FIG. 3C from a generated alert
message record, certain fields of data are extracted as an input to the
CRC algorithm. The alert record 28 is inputted to the extraction means 29
which is a selector routine that selects from the NMVT formatted message
certain prescribed bytes from identified subvectors as will be described
in greater detail later. This creates input to the CRC algorithm for
calculation in box 30. The IEEE 802 standard CRC algorithm is well known
but is set out later herein for convenience. The result of calculating
this algorithm utilizing the data input from box 29 is a 32-bit number to
which is appended in box 31 a unique product identification code which
results in an output of an alert message identifier.
FIG. 3C shows the format of an outputted alert identifier unique to a
specific product and alert message.
FIG. 3A describes in tabular form the necessary fields to be extracted from
the NMVT formatted message. The elements to be extracted constitute those
fields representing the alert type 301 from the hex 92 subvector in the
NMVT, the alert description code 302 from the hex 92 subvector and all
probable cause code points 303 in their order of appearance from the hex
93 subvector. This is to be followed in order by a delimiter 304 as
specified in FIG. 3A, all the user cause code points 305 in their order of
appearance from the hex 94 subvector (this subvector is optional and may
be omitted), a further delimiter 306 as shown in FIG. 3A and any install
cause code points 307 in their order of appearance, if any, from the hex
95 subvector. This is also followed by a further delimiter 308 as shown in
FIG. 3A and finally, by all the failure cause code points 309 as defined
in order, if any, from the hex 96 subvector. This subvector is also
optional as is the hex 95 subvector as noted in FIG. 3A. All of these code
points for subvectors 92 through 96 are completely architected and
described in the aforementioned IBM SNA reference manual.
The procedure as depicted schematically in the flow chart in FIG. 3B
operates as follows:
First, the elements of the alert record to be used in filtering are
extracted from the subvectors at 301 and 311 and at 312 and 313 as
specified in FIG. 3A and placed into a variable length buffer in the
specified order depicted in FIG. 3A. Delimiters at 314 are inserted to
distinguish successive groups of elements from each other (the delimiters
are as shown in FIG. 3A). This process is done for major vectors at 301
through 314 and for subvectors at 315 through 322. Note that for
subvectors a variable "SV KEY" is used for the scan, and incremented at
322. The result of this step is a mapping of alert elements into the
buffer entries (such as in FIG. 4) in such a way that two independent
alerts from different sources will constitute an identical buffer entry
if, and only if, they should be treated as indistinguishable for filtering
purposes. Next, turning to FIG. 3C, the buffer entry is run as a data
input into a specified IEEE 802 standard CRC algorithm calculation device.
The device may be either a commercially available CRC algorithm integrated
circuit chip which calculates the result or it may be an appropriately
programmed data processor. The output which results from the CRC algorithm
calculation is a 32-bit binary number that is associated with the buffer
entry. This number is inserted in the alert itself, so that it will be
available to the alert receiver.
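The two steps just described, building the buffer entry and running it through the CRC calculation, can be sketched as follows. This is an illustration under stated assumptions. The delimiter value is a placeholder, since the actual values are those specified in FIG. 3A. The function names are hypothetical. Python's `zlib.crc32` is used as a stand-in for the IEEE 802 CRC because it uses the same 32-bit generator polynomial, although the bit-ordering conventions of a given product implementation may differ.

```python
import zlib

# Cause subvector keys whose code points feed the identifier, in buffer order.
CAUSE_KEYS = (0x93, 0x94, 0x95, 0x96)
# Placeholder only: the real delimiter values are those specified in FIG. 3A.
HYPOTHETICAL_DELIMITER = b"\xff\xff"


def build_buffer(alert_type: bytes, description: bytes, causes: dict) -> bytes:
    """Concatenate the identifying elements in the prescribed order."""
    # Alert type and description come from the X'92' subvector and are
    # followed directly by the probable cause code points from X'93'.
    parts = [alert_type + description + causes.get(0x93, b"")]
    for key in CAUSE_KEYS[1:]:
        # Optional subvectors that are absent contribute no code points,
        # but their delimiter is still emitted by the join below.
        parts.append(causes.get(key, b""))
    return HYPOTHETICAL_DELIMITER.join(parts)


def alert_id_number(buffer: bytes) -> int:
    """Return the 32-bit CRC of the buffer entry."""
    return zlib.crc32(buffer) & 0xFFFFFFFF
```

Since elements such as time stamps and serial numbers never enter the buffer, two alerts reporting the same condition yield the same 32-bit number regardless of when or where they were generated.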
There are actually two different methods by which the first two steps
indicated above can be implemented. An alert sending product may actually
implement the CRC algorithm in its own processor or in its own code and
generate the alert identification number for each alert on-line in real
time as it is prepared for transmission. Alternatively, the alert sending
product may be pre-coded with predefined alert ID numbers with the code
points having been run through the algorithm generation process once in
the course of product development. The resulting ID numbers can then be
stored in a table within the product so that only a table look-up is
necessary at the time a specific alert is to be sent.
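The second alternative, in which the ID numbers are computed once and stored, can be sketched as below. The condition names and buffer contents are hypothetical, and `zlib.crc32` again stands in for the IEEE 802 CRC calculation.

```python
import zlib

# Hypothetical buffer entries for two alert conditions of one product.
BUFFER_ENTRIES = {
    "wire_fault": b"\x01\x32\x11\x37\x04",
    "cover_open": b"\x01\x50\x10\x60\x01",
}

# Built once, for example during product development, and stored with the
# product so that no CRC calculation is needed when an alert is sent.
PRECODED_IDS = {
    name: zlib.crc32(buf) & 0xFFFFFFFF for name, buf in BUFFER_ENTRIES.items()
}


def id_for(condition: str) -> int:
    """Look up the predefined alert ID number for a condition."""
    return PRECODED_IDS[condition]
```

The trade-off is between code size and flexibility: the table avoids carrying the CRC routine in the product, while on-line calculation accommodates alerts whose code points are chosen at run time.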
When it receives an alert, an alert receiver extracts two pieces of
information from it: the identifier indicating the identity of the network
product which sent the alert, and the 32-bit number resulting from step 2
above. The identifier, identifying the sending product, appears in the
architecturally defined portion reserved for this purpose. These two are
concatenated together to form the unique alert identifier depicted in FIG.
3C. The purpose of this step in the process is to reduce the probability
of duplication of the unique identifiers from the mapping that is done in
step 2. Since the buffer entries for alerts are always at least 5 bytes in
length and typically may range from 15 to 25 bytes and perhaps may be as
large as 80 bytes or more, the mapping of the entries into a 32-bit number
is obviously not a perfect one-for-one mapping. By concatenation of the
resulting 32-bit number with the identity of the sending product, the
probability of duplication is enormously reduced since the set of all
alerts flowing in a given network which may easily run into thousands of
alerts will be partitioned into the sets associated with alert sending
products in the network which typically are many fewer and may range
between 10 and a few hundred. Therefore, the likelihood of duplication of
the same alert message occurring from the same type of product at the same
time for application to the network is very small.
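The concatenation step can be sketched as follows. The byte layout is an assumption: the 32-bit number is placed first in big-endian form and the product identifier is appended after it, following the wording used for box 31 of FIG. 3C. The function name and the sample product identifiers are hypothetical.

```python
import struct


def unique_alert_identifier(id_number: int, product_id: bytes) -> bytes:
    """Append the sender's product identifier to the 32-bit ID number."""
    if not 0 <= id_number <= 0xFFFFFFFF:
        raise ValueError("id_number must fit in 32 bits")
    return struct.pack(">I", id_number) + product_id
```

Two different products whose buffer entries happen to map to the same 32-bit number still receive distinct identifiers, which is the partitioning effect described above.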
The buffer entries are always ordered in accordance with the hex subvectors
92 through 96 keys as depicted in FIG. 3A in accordance with this
invention. The specific example for a specific type of product under
specific assumed conditions is depicted in FIG. 4 where the buffer entries
are shown in the order of their presentation at 401, 402, 403, 404, 405,
406, 407, 408 and 409. As the example indicates, the code entries that are
placed in the buffer comprise only a small portion of the complete alert
record given in FIG. 6 for the sample assumed condition. Only the code
points that are characteristic of a particular alert condition have been
selected in accordance with FIG. 3A. Other elements of the alert record,
such as the time stamp, the sender's serial number, the SNA name or
address, etc., that may differ for the same alert condition in the network
are not included in the alert ID number calculation process.
Turning to FIG.