WikiPatents - Community Patent Review
System for uninterruptively displaying only relevant and non-redundant alert message of the highest severity for specific condition associated with group of computers being managed    
United States Patent 5619656
Inventor(s): Graf; Lars O. (Rensselaer, NY)
Abstract: The system and method of this invention automatically manages a group of computers by automatically gathering data, storing the data, analyzing the stored data to identify specified conditions, and initiating automated actions to respond to the detected conditions. The invention, hereafter "SYSTEMWatch AI-L", comprises a SYSTEMWatch AI-L client which turns a computer into a managed computer, a SYSTEMWatch AI-L console, which turns a computer into a monitoring computer, a SYSTEMWatch AI-L send facility, which allows a system administrator to send commands to various SYSTEMWatch AI-L clients through the SYSTEMWatch AI-L console, and a SYSTEMWatch AI-L report facility which allows a system administrator to query information collected and processed by the SYSTEMWatch AI-L clients and SYSTEMWatch AI-L consoles.
   














Inventor     Graf; Lars O. (Rensselaer, NY)
Owner/Assignee     OPENService, Inc. (Albany, NY)
Publication Date     April 8, 1997
Application Number     08/238,476
Filing Date     May 5, 1994
US Classification     709/224 700/12 700/32 709/207 710/19 714/48 714/57
Int'l Classification     G06F 011/00 G06F 011/30 G06F 013/10
Examiner     Lee; Thomas C.
Assistant Examiner     Kim; Ki S.
USPTO Field of Search     395/650 395/575 395/700 395/200.11 395/185.01 395/185.1 395/835 395/839 364/141 364/144 364/146 364/152 364/264.5 364/284.3 364/280 364/286.3 364/974.5

References

U.S. References

Patent No.   Inventor       US Class   Date
5528759      Moore          709/224    Jun 1996
5432932      Chen                      Jul 1995
5418966      Madduri        710/200    May 1995
5230048      Moy            707/1      Jul 1993
5226150      Callander      714/57     Jul 1993
5163151      Bronikowski    714/57     Nov 1992
5155842      Rubin          714/22     Oct 1992
5146587      Francisco      714/57     Sep 1992
5111384      Aslanian       714/26     May 1992
5109486      Seymour        709/224    Apr 1992
5047977      Hill           714/57     Sep 1991
4888690      Huber          707/4      Dec 1989
4866712      Chao           714/704    Sep 1989
4815030      Cross          707/10     Mar 1989
4589068      Heinen, Jr.    717/127    May 1986
Claims
 


I claim:

1. A method of automatically managing a group of at least one managed computer comprising the steps of:

gathering data;

analyzing the data to identify a specific computer condition;

constructing an alert message identifying said specific computer condition;

performing a set of validation tests on said alert message, said set of validating tests comprising;

querying for a duplicate alert message existing in a database in which previously posted alert messages are stored;

querying said database for an existing alert message associated with a computer condition related to, and having a higher severity than, said specific computer condition prompting said alert message;

querying said database for an existing alert message associated with said specific computer condition which is being ignored; and

querying said database for a previously cleared alert message associated with said specific computer condition within a predetermined time period;

rejecting said alert message if an existing alert was found during any one test from said set of validation tests; and

displaying said alert message, when said alert message was not rejected during said validation tests, at the managed computer without inhibiting the managed computer from continuing its application processes;

whereby a user is only presented with relevant alerts and no network traffic is used to retransmit irrelevant or redundant alerts.

2. A method according to claim 1, further comprising a step of clearing other existing alert messages related to said alert message upon creation of said alert message whereby less severe alerts can be superseded by more severe alerts.

3. A method according to claim 1, wherein said identified computer condition is one from a set of conditions consisting of;

state of file system disk space;

state of file system inode usage;

state of process CPU usage;

state of process memory usage;

state of swap space usage;

state of daemon processes;

state of operating system;

state of hardware;

state of application;

state of networks;

state of other peripherals connected to said group of at least one managed computer.

4. A method according to claim 1, further comprising a step of initiating a predetermined action at said managed computer against said alert message, whereby once initiated, the action retrieves all necessary arguments directly from said alert message and does not require the user to specify any additional arguments.

5. A method according to claim 1, further comprising a step of executing an action upon creation of said alert message whereby the problem can be fixed without human intervention.

6. A method according to claim 1, further comprising a step of clearing said alert message whereby less relevant alerts do not clutter up the alerts display screen.

7. A method according to claim 6, wherein said step of clearing said alert message is performed automatically after a predetermined period of time whereby the need to manually clear alerts is alleviated.

8. A method according to claim 1, further comprising a step of assigning said alert message with a priority indicating the severity of said computer condition whereby it is made easy to classify problems by severity.

9. A method according to claim 8, further comprising a step of changing said priority over time whereby an alert can be escalated in priority over time.

10. A method according to claim 9, further comprising a step of executing an action upon changing of said priority whereby different actions can be executed depending on escalation status of said alert message.

11. A method according to claim 1, further comprising a step of associating an owner with said alert message whereby it is possible to select only those alerts that require resolution by a specific user for viewing by the user.

12. A system of automatically managing a group of at least one managed computer comprising:

means for gathering data;

means for analyzing the data to identify a specific computer condition;

means for constructing an alert message identifying said specific computer condition;

means for performing a set of validation tests on said alert message, said set of validating tests comprising;

querying for a duplicate alert message existing in a database in which previously posted alert messages are stored;

querying said database for an existing alert message associated with a computer condition related to, and having a higher severity than, said specific computer condition prompting said alert message;

querying said database for an existing alert message associated with said specific computer condition which is being ignored; and

querying said database for a previously cleared alert message associated with said specific computer condition within a predetermined time period;

means for rejecting said alert message if an existing alert was found during any one of said validation tests; and

means for displaying said alert message, when said alert message was not rejected during said validation tests, at the managed computer without inhibiting the managed computer from continuing its application processes;

whereby a user is only presented with relevant alerts and no network traffic is used to retransmit irrelevant or redundant alerts.

13. A system according to claim 12, further comprising a step of clearing other existing alert messages related to said alert message upon creation of said alert message whereby less severe alerts can be superseded by more severe alerts.

14. A system according to claim 12, wherein said identified computer condition is one from a set of conditions consisting of;

state of file system disk space;

state of file system inode usage;

state of process CPU usage;

state of process memory usage;

state of swap space usage;

state of daemon processes;

state of operating system;

state of hardware;

state of application;

state of networks;

state of other peripherals connected to said group of at least one managed computer.

15. A system according to claim 12, further comprising a step of initiating a predetermined action at said managed computer against said alert message, whereby once initiated, the action retrieves all necessary arguments directly from said alert message and does not require the user to specify any additional arguments.

16. A system according to claim 12, wherein the creation of said alert message causes an action to be executed whereby the problem can be fixed without human intervention.

17. A system according to claim 12, wherein said alert message is cleared whereby less relevant alerts do not clutter up the alerts display screen.

18. A system according to claim 17, wherein said alert message is cleared automatically after a predetermined period of time whereby the need to manually clear alerts is alleviated.

19. A system according to claim 12, wherein said alert message has a priority indicating the severity of said computer condition whereby it is made easy to classify problems by severity.

20. A system according to claim 19, wherein said priority can change over time whereby an alert can be escalated in priority over time.

21. A system according to claim 20, wherein a change of said priority causes an action to be executed whereby different actions can be executed depending on escalation status of said alert message.

22. A system according to claim 12, wherein said alert message can be associated with an owner whereby it is possible to select only those alerts that require resolution by a specific user for viewing by the user.
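
The four validation tests recited in claims 1 and 12 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the patented implementation: the `Alert` fields, the state names, the rule that two alerts are "related" when they concern the same resource on the same host, and the default `quiet_period` are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    host: str
    resource: str            # hypothetical: the thing being watched, e.g. "/usr"
    condition: str           # hypothetical label, e.g. "disk_90_percent_full"
    severity: int            # assumption: a larger number means more severe
    state: str = "active"    # assumed states: "active", "ignored", "cleared"
    cleared_at: float = 0.0  # seconds since the epoch; used when state is "cleared"


def passes_validation(new, stored, now, quiet_period=3600.0):
    """Return True only if none of the four validation tests finds a match."""
    for old in stored:
        # Assumption: alerts are "related" when they concern the same
        # resource on the same host.
        if (old.host, old.resource) != (new.host, new.resource):
            continue
        same_condition = old.condition == new.condition
        if old.state == "active":
            if same_condition:
                return False  # test 1: duplicate of an already posted alert
            if old.severity > new.severity:
                return False  # test 2: related alert of higher severity exists
        elif same_condition and old.state == "ignored":
            return False      # test 3: this condition is being ignored
        elif (same_condition and old.state == "cleared"
              and now - old.cleared_at < quiet_period):
            return False      # test 4: cleared within the predetermined period
    return True
```

An alert that fails any one test is rejected before it is displayed or transmitted, which is how the claims tie the tests to reduced network traffic.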
Description
 


FIELD OF THE INVENTION

This invention relates in general to system administration and in particular to automated management of a group of computers and its associated hardware and software.

BACKGROUND ART

The following documents are hereby incorporated by reference in their entirety:

1. Object Oriented Programming, Coad P., and Nicola J., YourDon Press Computing Series, 1993., ISBN 0-13-032616-X.

2. The C Programming Language, Kernighan B., and Ritchie D., 1st Edition, Prentice-Hall Inc., ISBN 0-13-110163-3

3. The Unix Programming Environment, Kernighan and Pike, Prentice-Hall Inc., ISBN 013-937699-2

4. Unix Network Programming, Stevens, Prentice Hall Software Series, 1990, ISBN 0-13-949876-1.

5. Internetworking with TCP/IP, Volume I, Principles, Protocols, and Architecture, 2d Ed, Prentice Hall, 1991, ISBN 0-13-468505-9

6. Solaris 1.1, SMCC VersionA, AnswerBook for SunOS 4.1.3 and Open Windows Version 3, Sun Microsystems Computer Corporation, Part Number 704-3183-10, Revision A.

7. Artificial Intelligence, Rich E., McGraw-Hill, 1983, ISBN 0-07-052261-8.

8. Artificial Intelligence, Winston P., 2d Edition, 1984, ISBN 0-201-08259-4.

9. Documentation for the SunOS 4.1.3 operating system from Sun Microsystems, Inc.

10. SunOS 4.1.3 manual pages ("man pages") from Sun Microsystems, Inc.

As used within this document and its accompanying drawings and figures, the following terms are to be construed in this manner:

1. "CPU" shall refer to the central processing unit of a computer if that computer has a single processing unit. If the computer has multiple processors, the term CPU shall refer to all the processing units of such a system.

2. "Managing a computer" shall refer to the steps necessary to manage a computer, for example, gathering and storing information, analyzing information to detect conditions, and acting upon detected conditions.

The problem of system administration for a computer with a complex operating system such as the UNIX operating system is a complex one. For example, in the UNIX workstation market, it is common for an organization to hire one system administrator for every 20-50 workstations installed, with each such administrator costing a company (including salary and overhead) between $60,000 and $100,000. Indeed, some corporations have discovered that despite freezing or cutting back hardware and software purchases, the rising cost of retaining system administrators has nevertheless continued to escalate the cost of maintaining an Information Services organization at a substantial rate.

In a typical system administration environment, the work cycle consists of the following. A problem occurs on the computer which prevents the end user from carrying out some task. The end user detects that problem some time after it has occurred, and calls the complaint desk. The complaint desk dispatches a system administrator to diagnose and remedy the problem. This has three important consequences: First, problems are detected after they have blocked a user's work. This can be of substantial impact in organizations which use their computers to run their businesses. Second, problems which do not necessarily block a user's work, but which may nonetheless have important consequences, are difficult to detect. For example, one vendor supplies an electronic mail package which is dependent upon a functional mail daemon process. This mail daemon process has a tendency to die on an irregular, but frequent basis. In such situations, the end user typically does not realize that electronic mail is not being received until after missing a meeting scheduled by electronic mail. Third, because problems are not detected until after they block a user's work, a problem which at an earlier stage might have been easier to fix cannot be fixed until it has escalated into something more serious, and more difficult to correct.

Currently, system administrators manage a group of computers by performing most actions manually. Typically, the system administrator periodically issues a variety of commands to gather information regarding the state of the various computers in the group. Based upon the information gathered, and based upon a variety of non-computer information, the system administrator detects problems and formulates action plans to deal with the detected problems.

Automation of a system administrator's tasks is difficult for several reasons:

1. Data regarding the state of the computer is difficult to obtain. Typically, the system administrator must issue a variety of commands and consider several pieces of information from each command in order to diagnose a problem. If the system administrator is responsible for several machines, these commands must be repeated on each machine.

2. When the system administrator detects a problem, the appropriate action plan may vary depending on a variety of external factors. For example, suppose a particular computer becomes slow and unresponsive when the system load on that computer crosses a certain threshold. If this problem occurs during normal business hours under ordinary circumstances, it will probably be a problem which must be resolved in a timely manner. On the other hand, suppose this problem occurs in the middle of the night. While this situation might still be a problem, the resolution need not be as timely since the organization's work will not be impacted, unless the problem still exists by the start of the business day. Now suppose the accounting department, at the end of each month, runs a processor intensive task to do the end-of-month accounting, which normally forces the load average above that threshold. If the system load crosses that same threshold during the time when the accounting department runs its end-of-month program, that is not a problem. Building a tool to handle situations like these with current tools would require writing a large series of inter-related complex boolean expressions. Unfortunately, writing and testing such a series of complex boolean expressions is difficult.

3. Current system administration tools view the universe of computer problems as a static universe. Computer problems, however, evolve over time as hardware and software are added, removed, and replaced in a computer.

4. Furthermore, an automated tool should also flexibly alter its behavior based on the nature of the commands a system administrator issues to it in guiding it to resolve problems. Thus, if the system administrator routinely ignores a particular problem, the automated tool should warn the system administrator less frequently if the routinely ignored problem reoccurs.

What is needed is a tool which will automatically gather the necessary computer information to manage a group of computers, detect problems based upon the gathered information, inform the system administrator of detected problems, and automatically perform corrective actions to resolve detected problems.

SUMMARY OF THE INVENTION

The shortcomings of the prior art are overcome and additional advantages are provided in accordance with the principles of the present invention through the provision of SYSTEMWatch AI-L, which automatically manages at least one computer by automatically gathering computer information, storing the gathered information, analyzing the stored information to identify specific computer conditions, and performing automatic actions based on the identified computer conditions.

BRIEF DESCRIPTION OF DRAWINGS

The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention will be apparent from the following detailed description taken in conjunction with the accompanying drawings in which:

FIG. 1 illustrates an embodiment of the present invention which comprises two groups of computers, a group of managed computers and a group of monitoring computers;

FIG. 2 illustrates one example of the structure of a managed computer, comprising a processing unit, memory, disk, network interface, peripherals, and a SYSTEMWatch AI-L client;

FIG. 3 illustrates one embodiment of the structure of a monitoring & command computer, comprising a processing unit, disk, network interface, peripherals, and a SYSTEMWatch AI-L console;

FIG. 4 illustrates one embodiment of the structure of a computer which is both a managed computer and a monitoring computer, comprising a processing unit, disk, network interface, peripherals, a SYSTEMWatch AI-L console, and a SYSTEMWatch AI-L client;

FIG. 5 illustrates one embodiment of the SYSTEMWatch AI-L client and the SYSTEMWatch AI-L console, comprising a core layer plus an application layer;

FIG. 6 illustrates one embodiment of the logical structure of the core layer in accordance with the principles of the present invention;

FIG. 7 illustrates one example of an embodiment of data within the database of the core layer in accordance with the principles of the present invention;

FIGS. 8a-8b illustrate one embodiment of the operation of the expert system found in the core layer of SYSTEMWatch AI-L;

FIG. 9 illustrates one embodiment of the SYSTEMWatch AI-L client's "client loop";

FIG. 10 illustrates one embodiment of the SYSTEMWatch AI-L console's "console loop";

FIG. 11 illustrates one embodiment of the SYSTEMWatch AI-L request facility; and

FIG. 12 illustrates one embodiment of the SYSTEMWatch AI-L report facility.

DESCRIPTION OF THE PREFERRED EMBODIMENT

One preferred embodiment of the technique of the present invention of managing a group of computers is targeted at groups of workstations running the UNIX operating system. Alternative embodiments of the present invention can consist of groups of computers running other operating systems, such as, Microsoft's Windows NT and IBM's OS/2. As viewed in FIG. 1, the invention comprises, for instance, 2 groups of computers:

a. A group of managed computers, 1, which includes computers, 2-5, comprising, for example, (see FIG. 2) a CPU, 9, memory, 10, disks, 14, communications interface, 16, other peripherals, 15, and a SYSTEMWatch AI-L client, 13. The size of the managed group of computers can range from 1 to several thousand. Data which is gathered from a managed computer is stored on the managed computer. From time to time, a managed computer may send data to a monitoring computer (see below).

b. A group of monitoring computers, 6, which includes computers comprising, for example, (see FIG. 3) a CPU, 17, memory, 18, disks, 22, communications interface, 24, other peripherals, 23, and a SYSTEMWatch AI-L console, 21. The size of the monitoring group of computers can range from 0 to several hundred. Although data gathered from a managed computer is stored on the managed computer, from time to time a managed computer may send data to a monitoring computer. A monitoring computer can also explicitly request data from a managed computer. Data which is received by the monitoring computer from a managed computer is stored on the monitoring computer. Furthermore, since a monitoring computer can receive data from several managed computers, a monitoring computer may perform post-processing on data received from several managed computers, and/or perform additional data gathering itself, in which case that data is stored on the monitoring computer.

In another embodiment the two groups of computers may be the same group (all managed computers are also monitoring computers), two distinct groups (no managed computers are monitoring computers), or overlap (some managed computers are monitoring computers). The computers which form the groups of computers may be heterogeneous or homogeneous. The only requirement is that each managed computer have the capability to communicate with at least one monitoring computer. One preferred embodiment of this invention is to have all the computers on a computer network, but any other means of communication, e.g., over a modem using a telecommunications network, is adequate. What differentiates managed computers from monitoring computers is the SYSTEMWatch AI-L client and the SYSTEMWatch AI-L console, which are described below:

a. As shown in FIG. 2, a computer is a managed computer if the computer is running the SYSTEMWatch AI-L client, which provides a means for the computer to automatically detect and respond to problems. Additionally, the SYSTEMWatch AI-L client also accepts and responds to commands issued by a SYSTEMWatch AI-L console described below.

b. As shown in FIG. 3, a computer is a monitoring computer if the computer is running the SYSTEMWatch AI-L console, which provides a means for the computer to receive and display notifications of detected problems, and to display the corrective actions taken. Additionally, the SYSTEMWatch AI-L console is also able to issue commands to any group of managed computers.

c. As shown in FIG. 4, a computer is both a managed computer and a monitoring computer if it contains both SYSTEMWatch AI-L client, 13, and SYSTEMWatch AI-L console, 21.

An Overview of the SYSTEMWatch AI-L Client

The task of the SYSTEMWatch AI-L client is to manage a computer and to provide notification of management actions to the SYSTEMWatch AI-L console. Before explaining how the SYSTEMWatch AI-L client operates, however, it is necessary to consider how the SYSTEMWatch AI-L client is organized. As previously mentioned, the SYSTEMWatch AI-L client is bifurcated into a core layer, 33, which provides the SYSTEMWatch AI-L client with the underlying mechanism for detecting and responding to problems, and an application layer, 34, which configures the SYSTEMWatch AI-L client to operate in a useful manner. The SYSTEMWatch AI-L client was designed this way because the nature of a particular computer's problem is not static. For example, problems may evolve as changes are made to the hardware and software of the computer, and if the computer is a multi-user computer, as users are added and removed from the system. As computer problems change, only the SYSTEMWatch AI-L client's application layer need be modified. As shown in FIG. 6, the core layer is composed of four elements: a database, 41, an expert system, 40, a language interpreter, 39, and a communications mechanism, 42. One example of a preferred embodiment of the application layer, 34, is a series of programs written in a language which can be interpreted by the language interpreter of the core layer.

Core Layer Description--Database

The first element of the core layer is the SYSTEMWatch AI-L database, 41. The database is used for storing gathered data, intermediate results, and other information. Referring to FIG. 7, in the context of the database, SYSTEMWatch AI-L uses two concepts: ENTITYs, 43, 53, and PROPERTYs, 44, 47, 49, 54, 56. These two concepts are now described in greater detail:

1. PROPERTY

Conceptually, PROPERTYs are similar to field descriptions. In one embodiment, a PROPERTY has the following features:

TABLE 1

FEATURE       DESCRIPTION
NAME          A property must have a name.
TYPE          A property must have a type, which corresponds to the type of the data to be stored in the field.
FORMAT        A property may optionally have a string which describes how the data in the field should be formatted. The format string is similar to the formatting control of the C language's printf().
HEADER        A property may optionally contain a string which will be displayed as the column header when a report featuring records containing the property is displayed.
DISPLAYUNIT   A string used by the reporting facility which is appended to the data in the field during a report. Thus, if the PROPERTY is a description of memory utilization in kilobytes, an appropriate DISPLAYUNIT might be "kb".
DISPLAYTYPE   Some display formats are commonly used throughout SYSTEMWatch AI-L. DISPLAYTYPEs are keywords which correspond to a particular FORMAT. Examples of DISPLAYTYPEs include STRING20, for a string limited to 20 characters in width; DATESMALL, for displaying a date in mm/dd format; and PERCENT, for automatically displaying numbers between 0.0 and 1.0 as percentages (e.g., 0.52 is displayed as 52%).
SHORTDESC     A PROPERTY may optionally contain an abbreviated description of the PROPERTY.
LONGDESC      A PROPERTY may optionally contain a long description of the PROPERTY.

2. ENTITY

Conceptually, ENTITYs are similar to database tables. In SYSTEMWatch AI-L, ENTITYs are used to group related PROPERTYs.
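
To make the PROPERTY display features of Table 1 concrete, the following Python sketch shows how a reporting routine might apply a DISPLAYTYPE and a DISPLAYUNIT to a stored value. The function name `format_value`, the use of UTC for dates, and the fallback to plain `str()` are assumptions made for illustration; the source describes only the keywords and their intended effect.

```python
import time


def format_value(value, displaytype=None, displayunit=""):
    """Render a stored value for a report (illustrative only)."""
    if displaytype == "STRING20":
        # A string limited to 20 characters in width.
        text = str(value)[:20]
    elif displaytype == "DATESMALL":
        # Assumption: the value is seconds since the epoch, shown in UTC.
        text = time.strftime("%m/%d", time.gmtime(value))
    elif displaytype == "PERCENT":
        # Numbers between 0.0 and 1.0 are shown as whole percentages.
        text = f"{value * 100:.0f}%"
    else:
        text = str(value)
    # DISPLAYUNIT is appended to the data in the field.
    return text + displayunit
```

For example, `format_value(0.52, "PERCENT")` yields `52%`, matching the example given in Table 1, and `format_value(512, displayunit="kb")` yields `512kb`.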

FIG. 7 illustrates the concept that each piece of data in the database is associated with a given PROPERTY and a given ENTITY. In this document, it will be necessary to refer to certain combinations of ENTITYs and PROPERTYs. The construction <entity name>_<property name> (e.g., IGNORE_IGNORETIME) will refer to a database entry with an entity equal to <entity name> and a property equal to <property name>.

In addition to ENTITYs and PROPERTYs, the database, 41, in SYSTEMWatch AI-L also has these additional features:

1. Host Information

Each piece of data in database, 41, automatically has host information associated with it. Thus, as data is stored in the database, the database automatically records the host from which the data originated. This is because in SYSTEMWatch AI-L, data is "owned" by the host from which the data originated. Other hosts may request a copy of the data since SYSTEMWatch AI-L has communications capabilities. Some data may be stored in a central location (e.g., a SYSTEMWatch AI-L console) if it is relevant to multiple computers. Because each piece of data has host information associated with it, a SYSTEMWatch AI-L console can consolidate data from multiple hosts.

2. Time Information

Each piece of data in database, 41, has a time field associated with it. By default the time field holds the last time the data was updated, but SYSTEMWatch AI-L provides a mechanism for changing the time field so that it is possible to store some other time in the field.

3. Name

Each piece of data in database, 41, has a key field which is called the name field. A name field must be unique for a given ENTITY, PROPERTY, and host (the name of a computer). Thus, within an ENTITY and PROPERTY used for tracking computer processes, the name field might be the process id since process ids are unique on each computer, so by specifying the ENTITY name, PROPERTY name, and host name, the name field forms a unique key to locate the data.

4. Value

Of course, a database stores data. In SYSTEMWatch AI-L, the term value refers to the data stored in the database.
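The four features above can be pictured with a minimal in-memory sketch. This is only an illustration under stated assumptions, not the patented implementation: the class names `Record` and `Store`, the method names, and the sample ENTITY and PROPERTY names (`PROCESS`, `CPU`) are hypothetical and do not come from the patent text.

```python
import time
from dataclasses import dataclass


@dataclass
class Record:
    """One piece of data: ENTITY, PROPERTY, host, name, value, and time."""
    entity: str
    prop: str
    host: str        # host that "owns" the data
    name: str        # key field, unique per (entity, prop, host)
    value: object
    timestamp: float  # seconds since the beginning of UNIX time


class Store:
    """Hypothetical keyed store; (entity, prop, host, name) is the unique key."""

    def __init__(self):
        self._rows = {}

    def put(self, entity, prop, host, name, value, timestamp=None):
        # By default the time field holds the update time, but the caller
        # may supply some other time instead.
        ts = time.time() if timestamp is None else timestamp
        rec = Record(entity, prop, host, name, value, ts)
        self._rows[(entity, prop, host, name)] = rec
        return rec

    def get(self, entity, prop, host, name):
        return self._rows.get((entity, prop, host, name))

    def count(self):
        return len(self._rows)
```

Because the host is part of the key, the same process id reported by two different hosts yields two distinct records, which is what allows a console to consolidate data from multiple hosts without collisions.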

In one example, database, 41, is currently implemented as a relational database: One table is used for describing ENTITYs. This table is used to associate ENTITYs with PROPERTYs. Another table is used for describing PROPERTYs. Finally, another table holds the information, which can be located by providing an ENTITY name, PROPERTY name, and the name field of the data. This table also contains the associated host and time information.
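The three-table arrangement described above might be sketched as follows using SQLite. The table names, column names, and helper functions are assumptions chosen for illustration; the patent does not specify a schema or a particular database product.

```python
import sqlite3

# Hypothetical schema: one table associating ENTITYs with PROPERTYs, one
# describing PROPERTYs, and one holding the data with host and time.
SCHEMA = """
CREATE TABLE entity (
    entity_name   TEXT NOT NULL,
    property_name TEXT NOT NULL,
    PRIMARY KEY (entity_name, property_name)
);
CREATE TABLE property (
    property_name TEXT PRIMARY KEY,
    description   TEXT
);
CREATE TABLE data (
    entity_name   TEXT NOT NULL,
    property_name TEXT NOT NULL,
    host          TEXT NOT NULL,
    name          TEXT NOT NULL,
    value         TEXT,
    time_secs     INTEGER NOT NULL,
    PRIMARY KEY (entity_name, property_name, host, name)
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def store(conn, entity, prop, host, name, value, when):
    """Insert a data item, or replace the item with the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO data VALUES (?, ?, ?, ?, ?, ?)",
        (entity, prop, host, name, value, when),
    )


def lookup(conn, entity, prop, name):
    """Locate data by ENTITY name, PROPERTY name, and name field."""
    cur = conn.execute(
        "SELECT host, value, time_secs FROM data "
        "WHERE entity_name = ? AND property_name = ? AND name = ? "
        "ORDER BY host",
        (entity, prop, name),
    )
    return cur.fetchall()
```

A lookup by ENTITY name, PROPERTY name, and name field returns one row per host holding such an item, each carrying its host and time information.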

In another embodiment, database, 41, can also be implemented with a database which is object oriented, i.e., a database which supports the ability to inherit data and methods from super and sub classes.

An additional requirement of database, 41, used in the core is that the database must support certain query operations and certain set operations. Specifically, the query operations supported by the database include:

1. regular expression matching in queries.

2. creation time or update time query, i.e., searching for a data item based upon the time the data was stored in the database or based on the time the data was last updated in the database.

3. host of origin in queries, i.e., searching for a data item based on the host which created the data.

4. time comparison query, i.e., searching for data based upon a time comparison. Note: SYSTEMWatch AI-L stores its time in a manner similar to the UNIX operating system. That is to say, all time is converted to seconds elapsed since the beginning of UNIX time. The advantages of using this method are that time comparisons are easily made, and that an interval can be added to a time to obtain a future time.
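The four query operations listed above can be sketched as simple filters over records whose times are plain integers counting seconds since the beginning of UNIX time. The record layout, the sample data, and every function name here are hypothetical and serve only to illustrate the idea.

```python
import re

# Hypothetical sample records; times are seconds since the UNIX epoch.
RECORDS = [
    {"host": "alpha", "name": "/usr", "value": 91, "created": 1000, "updated": 5000},
    {"host": "alpha", "name": "/var", "value": 40, "created": 1200, "updated": 1200},
    {"host": "beta", "name": "/usr/local", "value": 75, "created": 3000, "updated": 4000},
]


def by_name_regex(records, pattern):
    """1. Regular expression matching on the name field."""
    rx = re.compile(pattern)
    return [r for r in records if rx.search(r["name"])]


def by_time_field(records, field, when):
    """2. Creation time or update time query (field is 'created' or 'updated')."""
    return [r for r in records if r[field] == when]


def by_host(records, host):
    """3. Host of origin query."""
    return [r for r in records if r["host"] == host]


def updated_since(records, when):
    """4. Time comparison query: items updated at or after `when`."""
    return [r for r in records if r["updated"] >= when]


def future_time(base, interval):
    """With times stored as seconds, a future time is a simple addition."""
    return base + interval
```

Storing time as a single count of seconds reduces a time comparison to an integer comparison and reduces computing a future time to an addition.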

The set operations which database, 41, supports include:

1. set intersections (ANDs)--given 2 or more sets of data, return the elements present in all sets.

2. set union (ORs)--given 2 or more sets of data, return the elements present in any of the sets.

3. set exclusion (NOTs)--given a first set and a second set, return elements in the first set which are not elements of the second set.
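The three set operations map directly onto the built-in set type in Python, as the following sketch suggests. The function names and the sample sets of process identifiers are illustrative assumptions and are not taken from the patent.

```python
def intersect(first, *others):
    """Set intersection (AND): elements present in all of the sets."""
    return set(first).intersection(*others)


def union(first, *others):
    """Set union (OR): elements present in any of the sets."""
    return set(first).union(*others)


def exclude(first, second):
    """Set exclusion (NOT): elements of the first set not in the second."""
    return set(first) - set(second)


# Hypothetical query results, each a set of "host:process id" keys.
high_cpu = {"hostA:101", "hostA:202", "hostB:303"}
long_running = {"hostA:202", "hostB:303", "hostB:404"}
ignored = {"hostB:303"}

# Processes that are both CPU-heavy and long-running, minus ignored ones.
suspects = exclude(intersect(high_cpu, long_running), ignored)
```

Composing the operations in this way lets several simple queries be combined into one more selective result.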

Core Layer Description--The Expert System

The second element of the core layer is an expert system, 40, which is used for problem detection and action initiation. The expert system, 40, is a forward chaining rule based expert system using a rule specificity algorithm. When SYSTEMWatch AI-L client, 13, is started, the expert system contains no rules. Rules are declared and incorporated into the core layer. Both IF-THEN rules and IF-THEN-ELSE rules are supported. The rules used in SYSTEMWatch AI-L permit assignments and function calls within the condition of the rule. Additionally, SYSTEMWatch AI-L expert system, 40, has the following features:

a. Rules can declare variables. All variables declared within a rule are static variables.

b. Rules can have an initialization section. The initialization section contains actions which must be performed only once, and before the rule is ever tested. It can, for example, contain a state declaration and an interval declaration (states and intervals are described below). It may contain variable declarations for variables used by the rules, and it may contain code to do a variety of actions.

c. Rules can have, for instance, an INTERVAL and a LASTCHECK time. In accordance with the principles of the present invention, in order for a rule to be eligible for testing by the expert system, at the time of testing the clock time must be equal to or greater than the LASTCHECK time plus the INTERVAL time. The LASTCHECK time for each rule is set to the clock time whenever a rule is actually tested. This way, the INTERVAL specifies the minimum amount of time which must elapse since the last time a rule was checked before the rule becomes eligible for