WikiPatents - Community Patent Review
Method and apparatus for target application program supervision    

United States Patent 6457142
Link to this page: http://www.wikipatents.com/6457142.html
Inventor(s): Klemm; Reinhard P. (North Plainfield, NJ); Singh; Navjot (Morristown, NJ)
Abstract: A fault monitoring, performance monitoring and fault tolerance apparatus and method for target application programs is realized in an application supervisor by employing a supervisor agent, modified application programming interfaces (APIs), a generic application wrapper and a shell script that operate interactively to detect and automatically resolve reliability and performance problems occurring in executing the target application program. This is realized, in accordance with the invention, without the need to access, modify or have knowledge of the source code of the target application program to be supervised. In a specific embodiment of the invention, Java™ programming language target application programs are supervised. This is realized by employing the supervisor agent that attaches to a Java virtual machine through two virtual machine native interfaces. One interface is the Java Virtual Machine Profiler Interface (JVMPI) and the other is the Java Native Interface (JNI).
   














Inventor     Klemm; Reinhard P. (North Plainfield, NJ); Singh; Navjot (Morristown, NJ)
Owner/Assignee     Lucent Technologies Inc. (Murray Hill, NJ)
Publication Date     September 24, 2002
Application Number     09/430,161
Filing Date     October 29, 1999
US Classification     714/38 714/47
Int'l Classification     H02H 003/05
Examiner     Beausoleil; Robert
Assistant Examiner     Duncan; Marc
Attorney/Law Firm     Stafford; Thomas
USPTO Field of Search     714/38 714/26 714/27 714/39 714/47 702/188
Patent Tags     target application program supervision
   
 U.S. References
 
6026236    Fortin    Feb. 2000
5528753    Fortin    714/35    Jun. 1996
Claims
 


What is claimed is:

1. Apparatus including an application supervisor for supervising a target application program comprising:

a supervisor agent;

a modified application programming interface;

a generic application wrapper; and

a shell script,

wherein said supervisor agent, said modified application programming interface, said generic application wrapper and said shell script operate interactively to detect and automatically resolve reliability and/or performance problems occurring in executing said target application program, whereby this is realized without having a need to either access, modify or have knowledge of the source code of the target application program.

2. The invention as defined in claim 1 further including a virtual machine for executing at least said target application program and at least two virtual machine native interfaces to access said virtual machine for providing notification to said application supervisor of prescribed events occurring during execution of said target application program, and wherein said target application program is accessed through said modified application programming interface.

3. The invention as defined in claim 2 wherein said target application program is in Java programming language, said virtual machine is a Java virtual machine and said native interfaces include a Java virtual machine profiler interface (JVMPI) and a Java native interface (JNI).

4. The invention as defined in claim 3 wherein said supervisor agent in conjunction with said JVMPI, JNI and modified application programming interface detects and/or resolves prescribed events in an executing target application program.

5. The invention as defined in claim 4 wherein said detected prescribe events include at least one or more events from a list including, target application program does not terminate within expected time, thread does not terminate within expected time, garbage collector runs too often, garbage collector runs for too long, number of threads exceeds threshold, Java virtual machine shutdown, number of threads is illegal, e.g., spawning the same thread more than once can indicate a programming bug, hung target application program, i.e., target application program does not terminate within expected time, hung thread, i.e., thread does not terminate within expected time, thread terminates, but is not supposed to terminate, i.e., this is a thread that runs forever, target application program exits with System.exit, thread terminates due to an uncaught exception raised either by the virtual machine or explicitly by the target application program, exception thrown, and caught by the target application program.

6. The invention as defined in claim 4 wherein said detected prescribed events are resolved in at least one or more ways from a list including ignore event, notify said manager, suspend additionally spawned threads if the detected problem is a dangerously high number of threads in the target application program, restart target application program, quit target application program.

7. The invention as defined in claim 4 wherein said shell script at least starts execution of said Java virtual machine, said application supervisor and said generic application wrapper.

8. The invention as defined in claim 7 wherein said shell script further controls shutting down the application supervisor, and supplies appropriate parameters to the application supervisor and said Java virtual machine.

9. The invention as defined in claim 8 wherein said generic application wrapper controls starting up and shutting down said the target application program.

10. The invention as defined in claim 9 further including a configuration file and wherein said generic application wrapper processes said configuration file.

11. The invention as defined in claim 10 wherein said application supervisor is configured by parameterizing a set of policy templates in said configuration file.

12. The invention as defined in claim 11 wherein said application wrapper passes each policy template from said configuration file through calls to native methods to said supervisor agent.

13. The invention as defined in claim 11 wherein each of said policy templates is associated with a specific aspect of reliability or performance of said target application program.

14. The invention as defined in claim 7 further including a prescribed manager and a prescribed transport mechanism for connecting said supervisor agent to said prescribed manager.

15. The invention as defined in claim 14 wherein said prescribed manager visually displays events and/or actions of said executing target application program being supervised.

16. The invention as defined in claim 15 wherein said prescribed manager requests status information on said executing target application program and initiating actions in said application supervisor in response to detected events.

17. The invention as defined in claim 14 further including an application supervisor configuration including thread policy specifications, system policy specifications and manager specifications.

18. The invention as defined in claim 7 further including a configuration manager for creating a default configuration file from prescribed class files comprising the target application program.

19. Apparatus including an application supervisor for supervising a target application program comprising:

supervisor agent means for detecting and/or responding to prescribed events occurring during execution of said target application program;

modified application programming interface means for accessing said target application program;

generic application wrapper means for starting up said target application program;

and

shell script means for enabling said target application program to execute,

wherein said supervisor agent means, said modified application programming interface means, said generic application wrapper means and said shell script means operate interactively to detect and automatically resolve reliability and/or performance problems occurring in executing said target application program, whereby this is realized without having a need to either access, modify or have knowledge of the source code of the target application program.

20. The invention as defined in claim 19 further including a virtual machine means for executing at least said target application program and at least two virtual machine native interface means for accessing said virtual machine means for providing notification to said application supervisor of prescribed events occurring during execution of said target application program.

21. The invention as defined in claim 20 wherein said target application program is in Java programming language, said virtual machine means is a Java virtual machine and said at least two native interface means includes a Java virtual machine profiler interface (JVMPI) and a Java native interface (JNI).

22. The invention as defined in claim 21 wherein said supervisor agent means in conjunction with said JVMPI and JNI detects and/or resolves prescribed events in an executing target application program.

23. The invention as defined in claim 22 wherein said shell script means includes means for at least starting execution of said Java virtual machine, said application supervisor means and said generic application wrapper means.

24. The invention as defined in claim 23 wherein said shell script means further means for controlling shutting down said application supervisor, and for supplying appropriate parameters to the application supervisor and said Java virtual machine.

25. The invention as defined in claim 24 wherein said generic application wrapper means includes means for controlling starting up and shutting down said the target application program.

26. The invention as defined in claim 25 further including configuration file means for storing prescribed parameters and wherein said generic application wrapper means includes mean for processing said configuration file.

27. The invention as defined in claim 26 wherein said application supervisor is configured by parameterizing a set of policy templates in said configuration file.

28. The invention as defined in claim 27 wherein said application wrapper means further includes means for passing each of said stored policy templates from said configuration file means through calls to native methods to said supervisor agent.

29. The invention as defined in claim 28 wherein each of said policy templates is associated with a specific aspect of reliability or performance of said target application program.

30. The invention as defined in claim 23 further including prescribed manager means for displaying at least status indications of said executing target application program and prescribed transport mechanism means for connecting said supervisor agent to said prescribed manager means.

31. The invention as defined in claim 30 wherein said prescribed manager means includes means for visually displaying events and/or actions of said executing target application program being supervised.

32. The invention as defined in claim 31 wherein said prescribed manager means includes means for requesting status information on said executing target application program.

33. The invention as defined in claim 30 further including an application supervisor means configuration including thread policy specifications, system policy specifications and manager specifications.

34. The invention as defined in claim 23 further including configuration manager means for creating a default configuration file from prescribed class files comprising the target application program.

35. A method for employing an application supervisor for supervising a target application program comprising the steps of:

detecting of and/or responding to, through a supervisor agent, prescribed events occurring during execution of said target application program;

accessing through a modified application programming interface said target application program;

starting up said target application program through a generic application wrapper; and

enabling said target application program to execute through a shell script,

wherein said supervisor agent, said modified application programming interface, said generic application wrapper and said shell script operate interactively to detect and automatically resolve reliability and/or performance problems occurring in executing said target application program, whereby this is realized without having a need to either access, modify or have knowledge of the source code of the target application program.

36. The method as defined in claim 35 further including a step executing at least said target application program through a virtual machine and accessing said virtual machine through at least two virtual machine native interfaces for providing notification to said application supervisor of prescribed events occurring during execution of said target application program.

37. The method as defined in claim 36 wherein said target application program is in Java programming language, said virtual machine is a Java virtual machine and said at least two native interfaces include a Java virtual machine profiler interface (JVMPI) and a Java native interface (JNI).

38. The method as defined in claim 37 wherein said supervisor agent in conjunction with said JVMPI and JNI detects and/or resolves prescribed events in an executing target application program.

39. The invention as defined in claim 38 wherein said detected prescribe events include at least one or more events from a list including, target application program does not terminate within expected time, thread does not terminate within expected time, garbage collector runs too often, garbage collector runs for too long, number of threads exceeds threshold, Java virtual machine shutdown, number of threads is illegal, e.g., spawning the same thread more than once can indicate a programming bug, hung target application program, i.e., target application program does not terminate within expected time, hung thread, i.e., thread does not terminate within expected time, thread terminates, but is not supposed to terminate, i.e., this is a thread that runs forever, target application program exits with System.exit, thread terminates due to an uncaught exception raised either by the virtual machine or explicitly by the target application program, exception thrown, and caught by the target application program.

40. The invention as defined in claim 38 wherein said detected prescribed events are resolved in at least one or more ways from a list including ignore event, notify said manager, suspend additionally spawned threads if the detected problem is a dangerously high number of threads in the target application program, restart target application program, quit target application program.

41. The method as defined in claim 38 further including a step of said shell script causing at least starting execution of said Java virtual machine, said application supervisor and said generic application wrapper.

42. The method as defined in claim 41 further including a step of said shell script controlling starting up and shutting down the application supervisor and the target application program, and supplying appropriate parameters to the application supervisor and said Java virtual machine.

43. The method as defined in claim 41 further including a step of storing prescribed parameters in a configuration file and wherein said generic application wrapper processes said configuration file.

44. The method as defined in claim 43 further including a step of configuring said application supervisor by parameterizing a set of policy templates stored in said configuration file.

45. The method as defined in claim 44 further including a step of passing each of said stored policy templates under control of said generic application wrapper from said configuration file through calls to native methods to said supervisor agent.

46. The method as defined in claim 45 wherein each of said policy templates is associated with a specific aspect of reliability or performance of said target application program.

47. The method as defined in claim 41 further including steps of a prescribed manager displaying at least status indications of said executing target application program and a prescribed transport mechanism connecting said supervisor agent to said prescribed manager means.

48. The method as defined in claim 47 further including a step of said prescribed manager visually displaying events and/or actions of said executing target application program being supervised.

49. The method as defined in claim 48 further including a step of said prescribed manager requesting status information on said executing target application program.

50. The method as defined in claim 41 further including an application supervisor configuration including thread policy specifications, system policy specifications and manager specifications.

51. The invention as defined in claim 41 further including a step of creating a default configuration file from prescribed class files comprising the target application program.
Description
 


TECHNICAL FIELD

This invention relates to program reliability, performance monitoring and problem resolution and, more particularly, to target application program supervision.

BACKGROUND OF THE INVENTION

A number of prior software application supervision, i.e., program monitoring, apparatus and techniques are known in the art. However, these prior apparatus and techniques were limited to detecting and recovering from so-called process hangs and crashes. There also are prior known arrangements that support implementation of internal and external application program supervisors. Prior systems that support implementation of an internal application supervisor require that the target application program be modified, either by modifying the source code or by modifying the executable code. Similarly, prior systems that support implementation of an external application supervisor require extensive modifications to and recompilation of the source code. This is not only time consuming and difficult, but often impossible to implement because the source code is typically not available to a customer.

SUMMARY OF THE INVENTION

Problems and limitations of prior known fault monitoring, performance monitoring and fault tolerance apparatus and method for target application programs are overcome in an application supervisor by employing a supervisor agent, modified application programming interfaces (APIs), a generic application wrapper and a shell script that operate interactively to detect and automatically resolve reliability and performance problems occurring in executing the target application program. This is realized, in accordance with the invention, without the need to access, modify or have knowledge of the source code of the target application program to be supervised.

In a specific embodiment of the invention, Java™ programming language applications are supervised. This is realized by employing the supervisor agent that attaches to a Java virtual machine through two virtual machine native interfaces. One interface is the Java Virtual Machine Profiler Interface (JVMPI) and the other is the Java Native Interface (JNI). In conjunction with the JVMPI and JNI, the supervisor agent can detect and respond to, i.e., resolve, prescribed events in an executing target application program. Other events that cannot be monitored through the JVMPI and JNI are propagated to the supervisor agent through a set of modified Java API classes. That is, the target application program is accessed through the modified API classes. In this manner, the Java Application Supervisor (JAS) is able to monitor events during the execution of the target application program. To start up the supervised target application program and process a configuration file, the application supervisor of this invention employs a generic application wrapper and a shell script.

It is important that JAS be easily configured, and still be flexible and powerful in order to supervise specific target application programs. This is realized by configuring JAS by parameterizing a set of policy templates in a configuration file. Each policy template is associated with a specific aspect of reliability or performance of either the entire target application program or a subclass of its threads or objects. Once a policy template has been parameterized, it becomes a policy that specifies what behavior the application supervisor should treat as an anomaly and how it should react to it once it is detected.

In another embodiment of the invention, events and actions of the executing target application program can be visualized in a remote manager that is connected to the supervisor agent via a customized protocol that uses a TCP/IP (transmission control protocol/internet protocol) transport mechanism. The remote manager may also request status information on the supervised target application program, as well as initiate target application program supervision actions on its own.

A technical advantage of the invention is that it can detect and resolve an extensive range of reliability as well as performance problems, including a complete target program process crash.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 shows, in simplified block diagram form, details of a personal computer or work station on which the invention may be practiced;

FIG. 2 shows, in simplified form, details of the Java application supervisor including an embodiment of the invention;

FIG. 3 pictorially depicts a display illustrating the default remote manager including an event log;

FIG. 4 is a flowchart illustrating the steps in an overall JAS execution process;

FIG. 5 is a flowchart illustrating the steps performed in the thread start process;

FIG. 6 is a flow chart illustrating the steps in the process for normal thread termination;

FIG. 7 is a flow chart illustrating steps in the process for abnormal thread termination;

FIG. 8 is a flow chart illustrating the steps in a periodic thread check process; and

FIG. 9 is a flow chart illustrating the steps in an example flow for a performance problem process.

DETAILED DESCRIPTION

The Java™ programming language is increasingly being used in the implementation of programs that used to be the domain of more traditional programming languages such as C and C++. Many of these programs, in particular servers, have stringent availability, reliability, and performance requirements. The Java Application Supervisor (JAS) is a generic software system that attaches to any given Java virtual machine and supervises the execution of the running program, i.e., JAS automatically detects and resolves many reliability and performance problems according to the user's specifications. JAS reduces and in many cases eliminates the need for the program specific, time intensive and effort intensive implementation of mechanisms that monitor and handle reliability and performance problems. JAS does not require source code modifications or recompilation of the supervised Java program. Thus, JAS can be used to enhance the availability of Java server programs and decrease the risk of performance degradation with very little effort on the developer's part. JAS is lightweight in the sense that it imposes very little execution time and memory overheads on the supervised program.

Introduction

Quite often and for a variety of reasons, performance and reliability problems materialize only after a program has been deployed to the user. Thus, a program with stringent performance and reliability requirements ideally comes with built in mechanisms that detect performance and reliability problems at runtime ("on line"). The mechanisms may alert a programmed or human supervisor, or they may automatically attempt to resolve the problems. Such mechanisms can also be used to record reliability and performance trouble spots in the program that can then serve during a software maintenance phase as the basis for program improvements. Let us call a collection of such mechanisms an application supervisor. Let us also call the process of detecting and resolving performance and reliability problems during the execution of a target application program, i.e., the supervised program, target application program supervision. If an application supervisor is an integral part of a target application program, we call it an internal application supervisor. If the supervisor is somehow attached to the supervised program in such a way that the latter can execute without the supervisor, i.e., if the supervisor is not an integral part of the supervised program, we call it an external application supervisor.

Target application program supervision is intended to detect performance and reliability problems that were overlooked or not anticipated during software testing and maintenance phases and that show up during the operational phase of the program. Usually, target application program supervision has to be lightweight in the sense that the execution time and memory overheads imposed on the target application program are small, i.e., the overhead of target application program supervision should not become an additional performance problem.

An application supervisor does not contribute to the functional purpose of the program. Therefore, many software projects avoid the time, expense, and required expertise associated with building an internal or external supervisor for the target application program in question. For this reason, we designed and built JAS (Java Application Supervisor). It is an external application supervisor that is reusable across all Java target application programs. JAS can be parameterized by the user for the specified target application and does not require changes to the target source code or bytecode or recompilation of the target, i.e., JAS is nonintrusive. While JAS cannot completely eliminate the need for target specific target application program supervision, it can very often reduce the time intensive and effort intensive implementation of mechanisms that monitor and handle reliability and performance problems. Since JAS is not part of a target application program, changes to the target application program supervision are easily made by reconfiguring JAS, and modifying the target application program does not necessitate any code changes in the application supervisor. Moreover, JAS can very quickly detect problems and, if necessary, restart the target application program. If the target application program is a server, this amounts to a brief interruption of the services rendered after which the detected problem usually disappears for a certain time. Hence, JAS can significantly improve the availability and long-term performance of the target application program.

It is important that configuring JAS is simple, yet flexible and powerful, or else JAS might not be more convenient to use than it would be to program a customized application supervisor for a specific target. JAS is configured by parameterizing a set of policy templates in a configuration file. Each policy template is associated with a specific aspect of reliability or performance of the entire target application program or a subclass of its threads or objects. Once a policy template has been parameterized, it becomes a policy that specifies what behavior JAS should treat as an anomaly and how to react to it once it has been detected. To understand the following examples, notice that a remote manager can be attached to JAS that can visualize some aspects of the state of the supervised target application program and the detected problems. An example of a reliability-related policy is:

Notify a remote manager 205 (FIG. 2) whenever a thread of class ClientRequestHandler dies abnormally. Restart target application program if more than 5 threads of class ClientRequestHandler die abnormally within 90000 ms.

The corresponding JAS policy would be

ClientRequestHandler abnormalThreadDeath 5 90000 notify restart
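
As an illustration only, the policy line above can be read as a target class name, a policy template name, a run of numeric parameters, and a run of action names. The following Java sketch parses a line under that assumed grammar. The class PolicyLine, its field names and the grammar itself are hypothetical and are inferred from the example line, not taken from the JAS implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory form of one policy line; not taken from the patent. */
final class PolicyLine {
    final String targetClass;                         // e.g. "ClientRequestHandler"
    final String template;                            // e.g. "abnormalThreadDeath"
    final List<Long> parameters = new ArrayList<>();  // numeric parameters, in order
    final List<String> actions = new ArrayList<>();   // e.g. "notify", "restart"

    PolicyLine(String targetClass, String template) {
        this.targetClass = targetClass;
        this.template = template;
    }

    /**
     * Parses "<class> <template> <number>... <action>...".
     * This grammar is an assumption inferred from the example line.
     */
    static PolicyLine parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length < 2) {
            throw new IllegalArgumentException("expected a class and a template: " + line);
        }
        PolicyLine policy = new PolicyLine(tokens[0], tokens[1]);
        int i = 2;
        // Leading numeric tokens are treated as template parameters.
        while (i < tokens.length && tokens[i].matches("\\d+")) {
            policy.parameters.add(Long.parseLong(tokens[i]));
            i++;
        }
        // Everything after the numbers is treated as an action name.
        for (; i < tokens.length; i++) {
            policy.actions.add(tokens[i]);
        }
        return policy;
    }
}

class PolicyLineDemo {
    public static void main(String[] args) {
        PolicyLine p = PolicyLine.parse(
                "ClientRequestHandler abnormalThreadDeath 5 90000 notify restart");
        // prints: ClientRequestHandler abnormalThreadDeath [5, 90000] [notify, restart]
        System.out.println(p.targetClass + " " + p.template + " "
                + p.parameters + " " + p.actions);
    }
}
```

Keeping the numeric parameters as an ordered list leaves their interpretation to the individual policy template, since different templates may take a different number of parameters.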

Abnormal thread death is defined as a thread stop due to an uncaught exception such as a NullPointerException. An example of a performance related policy is:

If more than 250 threads of class PrefetchURL are executing concurrently, suspend all newly spawned threads. Notify remote manager if this happens more than 10 times within 3600000 ms.

The corresponding JAS policy would be

PrefetchURL threadlimit 250 10 3600000 suspend notify
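
Both example policies escalate when an event occurs more than a given number of times within a given time window. A minimal Java sketch of such a sliding-window counter follows. The class EventWindow and its method names are hypothetical, and treating an event that is exactly one window length old as expired is an assumption; the patent text does not say how JAS counts events internally.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Counts events inside a sliding time window (illustrative; not JAS source code). */
final class EventWindow {
    private final int maxEvents;      // escalate when MORE than this many events ...
    private final long windowMillis;  // ... fall within this many milliseconds
    private final Deque<Long> timestamps = new ArrayDeque<>();

    EventWindow(int maxEvents, long windowMillis) {
        this.maxEvents = maxEvents;
        this.windowMillis = windowMillis;
    }

    /**
     * Records an event at time nowMillis and reports whether the threshold
     * is now exceeded. Timestamps must be passed in non-decreasing order.
     */
    synchronized boolean recordAndCheck(long nowMillis) {
        timestamps.addLast(nowMillis);
        // Drop events that have aged out of the window (assumed boundary rule).
        while (!timestamps.isEmpty()
                && nowMillis - timestamps.peekFirst() >= windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size() > maxEvents;
    }
}

class EventWindowDemo {
    public static void main(String[] args) {
        // "more than 5 ... within 90000 ms", as in the abnormalThreadDeath example
        EventWindow deaths = new EventWindow(5, 90_000);
        for (int i = 1; i <= 6; i++) {
            boolean escalate = deaths.recordAndCheck(i * 1_000L);
            // Only the sixth event within the window adds the restart action.
            System.out.println("death " + i + ": notify" + (escalate ? ", restart" : ""));
        }
    }
}
```

Passing the timestamp in as a parameter, rather than reading the clock inside the method, keeps the counter easy to exercise with fixed times.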

JAS can be of service even if the user does not have knowledge of the internal structure of the target application program. For example, the user can almost always assume that an uncaught exception in a thread, leading to the immediate death of the thread, constitutes a software failure and warrants some action such as restarting the target application program. Similarly, if the user knows the maximum execution time of the target application program and the target exceeds this time, it is safe to assume a software failure that should result in an action that JAS can take. However, the more the user is familiar with the internals of the target application program the more the user can tailor the JAS configuration and, thus, the more precise the target application program supervision can be.
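
The two generic checks just described, an uncaught exception that kills a thread and a run that exceeds its maximum execution time, can be illustrated in plain Java. The sketch below is not how JAS works: JAS observes the target from outside through JVMPI and JNI, whereas this example substitutes the standard Thread.setUncaughtExceptionHandler and Thread.join(long) calls and starts the target itself. The class TargetWatchdog and the Outcome names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative watchdog using standard thread APIs; not the JAS mechanism. */
final class TargetWatchdog {
    enum Outcome { COMPLETED, UNCAUGHT_EXCEPTION, TIMED_OUT }

    /** Runs the target in its own thread and classifies how it ended. */
    static Outcome supervise(Runnable target, long maxMillis) {
        if (maxMillis <= 0) {
            // join(0) would wait forever, so insist on a real limit.
            throw new IllegalArgumentException("maxMillis must be positive");
        }
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(target, "supervised-target");
        worker.setDaemon(true); // a hung target must not keep the JVM alive
        // Invoked by the JVM when the thread dies from an exception nobody caught.
        worker.setUncaughtExceptionHandler((thread, error) -> failure.set(error));
        worker.start();
        try {
            worker.join(maxMillis); // wait at most maxMillis for termination
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while supervising", e);
        }
        if (worker.isAlive()) {
            return Outcome.TIMED_OUT;
        }
        return failure.get() != null ? Outcome.UNCAUGHT_EXCEPTION : Outcome.COMPLETED;
    }
}

class TargetWatchdogDemo {
    public static void main(String[] args) {
        System.out.println(TargetWatchdog.supervise(() -> { }, 1_000));
        System.out.println(TargetWatchdog.supervise(
                () -> { throw new NullPointerException("demo"); }, 1_000));
        System.out.println(TargetWatchdog.supervise(() -> {
            try {
                Thread.sleep(5_000); // simulates a target that runs too long
            } catch (InterruptedException ignored) {
                // nothing to clean up in this demo
            }
        }, 100));
    }
}
```

A supervisor would map each outcome to a configured action such as notifying a manager or restarting the target; that mapping is left out here.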

JAS Features

JAS is a so-called lightweight external supervisor for Java target application programs. Its problem detection and resolution capabilities cover performance and reliability aspects of the target application program and are completely transparent to the target application program. The current JAS implementation can supervise any Java application program that fulfills all of the following conditions:

Since JAS adds a thread to the supervised target application program, the target application program may not make its functionality dependent on the total number of threads in its address space.

The target application program does not change its functionality depending on the size or number of Java API (application programming interface) class files.

It makes no assumptions about the order in which threads are scheduled or about absolute times for the execution of code (well-engineered Java programs should not do this anyway).

It does not change a set of Java API classes. Few Java programs change API classes.

For JAS to detect a Java exception, the target application program has to either throw the exception explicitly via a throw exception statement or it has to catch the exception and invoke a method on the exception object such as toString().

The current implementation of JAS can detect at least the following performance-related events, among others, in the target application program:

target application program does not terminate within expected time;

thread does not terminate within expected time;

garbage collector runs too often;

garbage collector runs for too long;

number of threads exceeds threshold.

At least the following reliability-related events, among others, can be detected by the current JAS implementation:

virtual machine shutdown;

number of threads is illegal, e.g., spawning the same thread more than once can indicate a programming bug;

hung target application program, i.e., target application program does not terminate within expected time;

hung thread, i.e., thread does not terminate within expected time;

thread terminates, but is not supposed to terminate, i.e., this is a thread that runs forever;

target application program exits with System.exit;

thread terminates due to an uncaught exception raised either by the virtual machine or explicitly by the target application program;

exception thrown, and caught by the target application program.

The Java equivalent of a C/C++ application crash is usually a thread or target application program termination due to an uncaught exception. Some Java programs also catch a variety of exceptions but do not deal with them other than printing or logging the exception, thus leaving the target application program in an illegal state. JAS can detect exceptions whether they are handled by the target application program or not. In the former case, JAS allows supplementing the exception handler in the target application program with additional functionality such as notifying a remote manager and logging of the exception by the manager. In particular, JAS can detect the following exceptions indicating a fatal situation encountered by the virtual machine due to an internal error or due to resource limitations:

OutOfMemoryError;

StackOverflowError;

InternalError;

VirtualMachineError;

UnknownError.

Exceptions can also result from various bugs in the target application program and many bugs will result in Java exceptions. Examples of such exceptions are:

NullPointerException;

ArithmeticException;

IllegalArgumentException;

NumberFormatException;

ArrayIndexOutOfBoundsException;

SecurityException.

Another class of reliability problems that JAS can detect is the erroneous input of classes to the Java virtual machine resulting in an exception. The following are examples of such exceptions:

ClassFormatError;

LinkageError;

NoSuchMethodError.
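The three groups of exceptions listed above line up with the standard Java class hierarchy: the fatal virtual machine errors are all subclasses of java.lang.VirtualMachineError, the class-input errors are subclasses of java.lang.LinkageError, and the listed program-bug exceptions are subclasses of java.lang.RuntimeException. The following Java sketch shows one way a supervisor could sort a detected Throwable into these groups. The enum FailureCategory and the class ThrowableClassifier are hypothetical names introduced here, and nothing in the text states that JAS classifies exceptions this way.

```java
/** Illustrative categories mirroring the three groups of exceptions in the text. */
enum FailureCategory { VIRTUAL_MACHINE, CLASS_INPUT, PROGRAM_BUG, OTHER }

/** Hypothetical helper; not part of JAS. */
final class ThrowableClassifier {
    private ThrowableClassifier() { }

    static FailureCategory classify(Throwable t) {
        // OutOfMemoryError, StackOverflowError, InternalError and UnknownError
        // all extend VirtualMachineError.
        if (t instanceof VirtualMachineError) {
            return FailureCategory.VIRTUAL_MACHINE;
        }
        // ClassFormatError and NoSuchMethodError both extend LinkageError.
        if (t instanceof LinkageError) {
            return FailureCategory.CLASS_INPUT;
        }
        // NullPointerException, ArithmeticException, IllegalArgumentException,
        // NumberFormatException, ArrayIndexOutOfBoundsException and
        // SecurityException all extend RuntimeException.
        if (t instanceof RuntimeException) {
            return FailureCategory.PROGRAM_BUG;
        }
        return FailureCategory.OTHER;
    }
}
```

Treating every RuntimeException as a program bug is a simplification chosen for this sketch; a real policy could distinguish individual exception classes.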

Currently, JAS can respond to detected problems in at least the following ways, among others:

ignore event;

notify remote manager;

suspend additionally spawned threads if the detected problem is a dangerously high number of threads in the target application program;

restart target application program;

quit target application program.

In addition to more fine-grained problem detection, JAS can also apply more fine-grained problem solution strategies, for example, as follows:

make a complex decision about whether to take any action and what action to take based on the exact nature of the problem and based on an optional user-supplied policy;

execute actions in addition to program-specified exception handlers when exceptions are thrown;

suspend additional thread creation if the number of threads has reached a user-specified threshold;

reset variable values if a problem has been detected and variable value changes might lead to a partial or complete solution of the detected problem.

The remote manager 205 (FIG. 2) notification can be combined with every other action. For each event, JAS can trigger one of two different actions depending on whether the event has happened less or more often than a certain number of times during a user specified time window. For example, if there has been a problem with the time consumed by the garbage collector at most 5 times during the last 10 minutes, the manager could be notified. If this happened more than 5 times during the last 10 minutes, the target application program could be restarted. JAS will also periodically notify the manager of the absence of any problems and will convey some statistical information such as the current number of running threads, the current memory consumption, etc.
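The choice between the two actions can be pictured as a counter over a sliding time window. The Java sketch below is one possible reading under stated assumptions, not the JAS implementation: the class name ProbationCounter and its method names are hypothetical, timestamps are supplied by the caller in milliseconds, and an event is assumed to leave the window once its age reaches the window length.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sliding-window counter that selects one of two actions per event. */
final class ProbationCounter {
    private final int maximum;           // events tolerated within the window
    private final long windowMillis;     // length of the time window
    private final String beforeAction;   // used while count <= maximum
    private final String afterAction;    // used once count > maximum
    private final Deque<Long> timestamps = new ArrayDeque<>();

    ProbationCounter(int maximum, long windowMillis,
                     String beforeAction, String afterAction) {
        this.maximum = maximum;
        this.windowMillis = windowMillis;
        this.beforeAction = beforeAction;
        this.afterAction = afterAction;
    }

    /** Records an event at the given time and returns the action to trigger. */
    String onEvent(long nowMillis) {
        // Discard events that have aged out of the window (boundary is an assumption).
        while (!timestamps.isEmpty()
                && nowMillis - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        timestamps.addLast(nowMillis);
        return timestamps.size() > maximum ? afterAction : beforeAction;
    }
}
```

With a maximum of 5 and a window of 600000 ms (10 minutes), as in the garbage collector example above, the first five events inside the window select the first action and a sixth selects the second.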

No source code or bytecode changes or recompilation of the target application program is necessary. Depending on the knowledge about target application program internals, ranging from no knowledge to complete knowledge, a JAS user can tailor the JAS configuration to varying degrees. A tool that is part of the JAS distribution generates a default configuration for the target application program that can be modified by the JAS user. A JAS configuration consists mainly of a sequence of policies that specify which actions to take upon which events. To keep JAS and JAS configurations simple and to reduce the execution time overhead that JAS imposes on the target application program, there is only a fixed set of events and actions that the user can choose from when specifying policies. Policies are static, i.e., cannot be changed at run time, and policies cannot be based on other policies. Policies in a JAS configuration are described in more detail below.

JAS communicates with a remote manager 205 via a customized UTF-8-based protocol on top of TCP/IP 206 in order to visualize events and actions and to receive instructions for actions that JAS ought to carry out. JAS and the remote manager 205 will attempt to reestablish the communication link between JAS and the remote manager 205 if it happens to get interrupted due to a failure of the communication subsystem, the remote manager, or the target application program. The standard JAS distribution contains a default graphical remote manager 205 that visualizes events and actions in JAS and logs every event, see for example FIG. 3 that pictorially depicts a display illustrating the default remote manager 205. The event log allows a user to pinpoint the nature of the detected problem and the time in milliseconds and location in the application of the problem occurrence. An excerpt from an event log is presented below.

Indeed, if more flexibility in specifying policies is needed than JAS configurations allow, a user will be able to program a customized manager that receives event notifications from JAS and instructs JAS to respond to events with actions. Relating events to actions can thus be done with an arbitrary level of complexity and is not subject to most of the JAS restrictions on policies.

Configuring JAS For a Target Application Program

Before using JAS for supervising a given target application program, a JAS user has to generate a JAS configuration for the target application program. A JAS configuration is an ASCII file containing a sequence of policy specifications and other information for JAS. A policy describes what action(s) to take if a specified event occurs. The user may generate a default configuration by applying a tool (configuration manager) contained in the JAS distribution to the set (or any subset) of class files comprising the target application program. To get the maximum benefit from JAS, the user should modify the default configuration to reflect the specifics of the target application program. Changing the default configuration is a very simple process as shown below.

A JAS configuration consists of three parts:

1. thread policy specifications;

2. system policy specifications;

3. manager specifications.

A sample configuration for JAS is shown below. It is for the target application program WebCompanion, a prefetching and caching Web proxy. This target application program also generated the event log shown below.

An example excerpt from a JAS event log showing the times when events occurred, the events, and actions taken is as follows:

0 supervisorRunning WebCompanion

411 applicationRunning notify

511 GCmaxTimeExceeded notify

1072 objectAllocated java.lang.ArrayIndexOutOfBoundsException notify

1633 naturalThreadDeath main WebCompanion notify

18757 GCmaxTimeExceeded notify

20610 GCmaxTimeExceeded notify

23174 GCmaxTimeExceeded notify

35231 GCmaxTimeExceeded restart

0 supervisorRunning WebCompanion

331 applicationRunning

882 objectAllocated java.lang.ArrayIndexOutOfBoundsException notify

1152 naturalThreadDeath main WebCompanion notify

39307 abnormalThreadDeath Thread-4 FetchThread

java.lang.NullPointerException

at FetchThread.accessURL(Compiled Code)

at FetchThread.run(Compiled Code)

notify

49391 abnormalThreadDeath Thread-5 FetchThread

java.lang.NullPointerException

at HTMLdocs.loadNewDocument(Compiled Code)

at FetchThread.complete(FetchThread.java:204)

at FetchThread.accessURL(Compiled Code)

at FetchThread.run(Compiled Code)

notify

102378 abnormalThreadDeath Thread-6 FetchThread

java.lang.NullPointerException

at FetchThread.accessURL(Compiled Code)

at FetchThread.run(Compiled Code)

restart
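Each entry in the excerpt above begins with a time in milliseconds followed by an event name. The Java sketch below reads only that leading part of an entry and keeps the remainder as free text. It is a hedged illustration: the class name LogEntry and its fields are hypothetical, and the sketch makes no attempt to interpret the stack trace lines or the action that follows some entries, because the excerpt does not define that layout precisely.

```java
/** Hypothetical holder for the first line of an event log entry. */
final class LogEntry {
    final long timeMillis;  // leading number of the entry
    final String event;     // e.g. GCmaxTimeExceeded
    final String detail;    // anything after the event name, possibly empty

    private LogEntry(long timeMillis, String event, String detail) {
        this.timeMillis = timeMillis;
        this.event = event;
        this.detail = detail;
    }

    /** Parses a line of the form "<time> <event> [<detail>]". */
    static LogEntry parseHeader(String line) {
        String[] parts = line.trim().split("\\s+", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Not an entry header: " + line);
        }
        long time = Long.parseLong(parts[0]);
        String detail = parts.length == 3 ? parts[2] : "";
        return new LogEntry(time, parts[1], detail);
    }
}
```

Lines that do not start with a number, such as the stack trace lines, cause Long.parseLong to throw a NumberFormatException, so a caller would need to treat them as continuation lines.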

Comments can be freely interspersed in the configuration file. In the sample configuration shown below, the comments give the format of each policy line.

Thread Policy Specifications

Every thread in Java is generated from an object that is of class java.lang.Thread or a subclass thereof. In other words, every thread in Java can be naturally associated with a class that defines the behavior of the thread. For each such class, the JAS user may, but does not have to, add a set of policies to the JAS configuration. These policies determine what performance and reliability related events originating at a thread of the specified class JAS ought to consider a problem and how to respond to them. Currently, JAS allows the specification of at least five policies for each thread class. In the example shown below, there are policies for thread classes FetchThread, PrefetchThread, WebCompanion, and Watchdog. The five policies for each thread class concern the following events:

1. abnormal thread termination (caused by an uncaught exception);

2. normal thread termination (run method comes to a natural end);

3. expected completion time for thread has been exceeded (thread hung);

4. soft limit for number of threads has been exceeded (see below);

5. hard limit for number of threads has been exceeded (see below).

Each policy describes what action to trigger when JAS detects the specified event. There are two types of actions. The first type gets triggered as long as the total number of events of this type does not exceed a certain maximum within a specified time window (called probation). The second type of action gets triggered if the specified maximum number of events during the probation has been exceeded. The policy abnormalThreadDeath 2 300000 notify restart, for example, means "if more than 2 threads of the given class terminate abnormally within 300000 milliseconds, restart the entire target application program; every abnormal thread termination before that will result in the notification of the manager". The policy naturalThreadDeath 0 INFINITE quit quit means "if a thread of the given class terminates normally from the virtual machine point of view, quit the target application program" implying that either this thread is supposed to run forever but a bug might lead to thread termination, or the death of this thread also means the end of the target application program execution. Reaching a soft or hard limit on the number of threads can imply resource or performance penalties that the user would like to avoid. It can also mean that there is a bug in the program that causes more than an allowed number of threads to be spawned. In the former case, the specified action could be, for example, suspend which means that each thread that exceeds the given threshold is suspended until the total number of active threads has fallen below the threshold. In the latter case, the specified action could be, for example, to quit the target application program.
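The suspend action described above holds back threads beyond a threshold until the number of active threads falls below it again. JAS does this transparently from outside the target application program. The Java sketch below is only an application-level analogue of the same idea using a counting semaphore, in which threads must cooperate by calling the gate themselves. The class ThreadLimitGate and its methods are hypothetical and are not part of JAS.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical cooperative gate that caps the number of concurrently active threads. */
final class ThreadLimitGate {
    private final Semaphore permits;

    ThreadLimitGate(int limit) {
        // Fair semaphore: waiting threads are admitted in arrival order.
        this.permits = new Semaphore(limit, true);
    }

    /** Blocks the calling thread until the number of active threads is below the limit. */
    void enter() throws InterruptedException {
        permits.acquire();
    }

    /** Non-blocking variant: returns false if the limit has been reached. */
    boolean tryEnter() {
        return permits.tryAcquire();
    }

    /** Called by a thread when it finishes, freeing a slot for a waiting thread. */
    void exit() {
        permits.release();
    }
}
```

A worker thread would call enter() at the start of its run method and exit() in a finally block at the end, so that a slot is freed even if the thread dies from an exception.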

An example of a JAS configuration for WebCompanion is as follows:

FetchThread

// abnormalThreadDeath <maximum> <probation> <beforeaction> <afteraction>

abnormalThreadDeath 2 300000 notify restart

// naturalThreadDeath <maximum> <probation> <beforeaction> <afteraction>

naturalThreadDeath 0 INFINITE restart restart

// expectedCompletionTime <time> <maximum> <probation> <beforeaction> <afteraction>

expectedCompletionTime INFINITE INFINITE INFINITE none none

// softlimit <number> <maximum> <probation> <beforeaction> <afteraction>

softlimit INFINITE INFINITE INFINITE none none

// hardlimit <number> <maximum> <probation> <beforeaction> <afteraction>

hardlimit INFINITE INFINITE INFINITE none none

PrefetchThread

// abnormalThreadDeath <maximum> <probation> <beforeaction> <afteraction>

abnormalThreadDeath 3 INFINITE notify restart

// naturalThreadDeath <maximum> <probation> <beforeaction> <afteraction>

naturalThreadDeath 0 INFINITE restart restart

// expectedCompletionTime <time> <maximum> <probation> <beforeaction> <afteraction>

expectedCompletionTime INFINITE INFINITE INFINITE none none

// softlimit <number> <maximum> <probation> <beforeaction> <afteraction>

softlimit INFINITE INFINITE INFINITE none none

// hardlimit <number> <maximum> <probation> <beforeaction> <afteraction>

hardlimit INFINITE INFINITE INFINITE none none

WebCompanion

// abnormalThreadDeath <maximum> <probation> <beforeaction> <afteraction>

abnormalThreadDeath 0 INFINITE notify restart

// naturalThreadDeath <maximum> <probation> <beforeaction> <afteraction>

naturalThreadDeath 1 INFINITE none none

// expectedCompletionTime <time> <maximum> <probation> <beforeaction> <afteraction>

expectedCompletionTime 30000 1 restart restart

// softlimit <number> <maximum> <probation> <beforeaction> <afteraction>

softlimit 11 INFINITE none quit

// hardlimit <number> <maximum> <probation> <beforeaction> <afteraction>

hardlimit 21 INFINITE quit quit

Watchdog

// abnormalThreadDeath <maximum><probation><be