| United States Patent | 5974462 |
| Link to this page | http://www.wikipatents.com/5974462.html |
| Inventor(s) | Aman; Jeffrey D. (Poughkeepsie, NY), Arwe; John E. (Poughkeepsie, NY), Booz; David A. (Kingston, NY), Bostjancic; David V. (Poughkeepsie, NY), Dritschler; Gregory M. (Poughkeepsie, NY), Eilert; Catherine K. (Wappingers Falls, NY), Yocom; Peter B. (Wappingers Falls, NY) |
| Abstract | A method and apparatus for controlling the number of servers in an
information handling system in which incoming work requests belonging to a
first service class are placed in a queue for processing by one or more
servers. The system also has units of work assigned to a second service
class that acts as a donor of system resources. In accordance with the
invention, a performance measure is defined for the first service class as
well as for the second service class. Before adding servers to the first
service class, there is determined not only the positive effect on the
performance measure for the first service class, but also the negative
effect on the performance measure for the second service class. Servers
are added to the first service class only if the positive effect on the
performance measure for the first service class outweighs the negative
effect on the performance measure for the second service class. |
Method and apparatus for controlling the number of servers in a
client/server system |
| Publication Date |
October 26, 1999 |
| Filing Date |
March 28, 1997 |
| Parent Case |
CROSS-REFERENCE TO RELATED APPLICATION(S)
This application is related to the following commonly owned, concurrently
filed application(s), incorporated herein by reference:
D. F. Ault et al., "Method and Apparatus for Transferring File Descriptors
in a Multiprocess, Multithreaded Client/Server System", Ser. No.
08/825,302.
D. F. Ault et al., "Method and Apparatus for Controlling the Assignment of
Units of Work to a Workload Enclave in a Client/Server System", Ser. No.
08/825,304. |
References  |
U.S. References |
| Patent No. | Inventor | Class | Date |
| 5675739 | Eilert | 709/226 | Oct 1997 |
| 5655120 | Witte | 718/105 | Aug 1997 |
| 5603029 | Aman | 718/105 | Feb 1997 |
| 5539883 | Allon | | Jul 1996 |
| 5537542 | Eilert | 709/201 | Jul 1996 |
| 5504894 | Ferguson | 707/2 | Apr 1996 |
| 5473773 | Aman | 718/104 | Dec 1995 |
| 5459864 | Brent | 718/105 | Oct 1995 |
| 5437032 | Wolf | 718/103 | Jul 1995 |
| 5283897 | Georgiadis | 718/105 | Feb 1994 |
| 5276897 | Stalmarck | 718/105 | Jan 1994 |
| 5249290 | Heizer | 718/105 | Sep 1993 |
| 5212793 | Donica | 718/105 | May 1993 |
| 5155858 | DeBruler | 718/105 | Oct 1992 |
| 5031089 | Liu | 709/226 | Jul 1991 |
| 3702006 | Josiah B. Page (Salt Point, NY) | 718/105 | Oct 1972 |
Foreign References |
Other References |
MVS Planning: Workload Management, IBM Publication GC28-1761-00, 1996.
MVS Programming: Workload Management Services, IBM Publication GC28-1773-00, 1996.
"Optimal Control Of A Removable . . . With Finite Capacity", by Wang et al., Microelectron. Reliab. (UK) vol. 35, No. 7, Jul. 1995, P1023-30.
"Providing Distributed Computing Environment Servers On Client Demand", IBM TDB, vol. 38, No. 3, Mar. 1995, P231-233.
"Queue-Dependent Servers", by V.P. Singh, IBM TR 221301, Jun. 30, 1971.
"Queue Dependent Servers Queueing System", by Garg et al., Microelectron. Reliab. (UK) vol. 33, No. 15, Dec. 1993, P2289-95.
Description  |
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to a method and apparatus for controlling the number of servers in an information handling system in which incoming work requests belonging to a first service class are placed in a queue for processing by one or more
servers.
2. Description of the Related Art
Systems in which incoming work requests are placed in a queue for assignment to an available server are well known in the art. Since the frequency at which the incoming requests arrive may not be readily controlled, the principal means of
controlling system performance (measured by queue delay or the like) in such a queued system is to control the number of servers. Thus, it is known in the art to start an additional server when the length of the queue being served reaches a certain high
threshold or to stop a server when the length of the queue being served reaches a certain low threshold. While such an expedient may achieve its design objectives, it is unsatisfactory in a system in which other units of work besides the queued work
requests are contending for system resources. Thus, even though providing an additional server for a queue may enhance the performance of the work requests in that queue, providing such a server may so degrade the performance of other units of work
being handled by the system that the performance of the system as a whole deteriorates.
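The threshold scheme described above can be sketched as follows. This is only an illustration of the prior-art approach; the function name, the threshold values, and the minimum server count are assumptions made for the example and do not come from the cited art.

```python
def threshold_adjust(queue_length: int, servers: int,
                     high: int = 10, low: int = 2,
                     min_servers: int = 1) -> int:
    """Return the new server count under simple queue-length thresholds.

    The thresholds and the minimum are hypothetical values chosen
    for illustration only.
    """
    if queue_length >= high:
        return servers + 1   # queue is long: start one more server
    if queue_length <= low and servers > min_servers:
        return servers - 1   # queue is short: stop one server
    return servers           # between thresholds: leave the count alone
```

Note that the decision depends only on the length of the one queue being served, which is why such a scheme cannot account for other work contending for the same resources.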
Current operating system software cannot take over responsibility for managing the number of servers according to the end-user-oriented goals specified for the work requests while also taking into account other work with independent goals running in
the same computer system.
SUMMARY OF THE INVENTION
The present invention relates to a method and apparatus for controlling the number of servers in an information handling system in which incoming work requests belonging to a first service class are placed in a queue for processing by one or more
servers. The system also has units of work assigned to a second service class that acts as a donor of system resources. In accordance with the invention, a performance measure is defined for the first service class as well as for the second service
class. Before adding servers to the first service class, there is determined not only the positive effect on the performance measure for the first service class, but also the negative effect on the performance measure for the second service class.
Servers are added to the first service class only if the positive effect on the performance measure for the first service class outweighs the negative effect on the performance measure for the second service class.
The present invention allows system management of the number of servers for each of a plurality of user performance goal classes based on the performance goals of each goal class. Tradeoffs are made that consider the impact of addition or
removal of servers on competing goal classes.
BRIEF DESCRIPTION OF THE DRAWINGS
The detailed description explains the preferred embodiments of the present invention, together with advantages and features, by way of example with reference to the following drawings.
FIG. 1 is a system structure diagram showing particularly a computer system having a controlling operating system and system resource manager component adapted as described for the present invention.
FIG. 1A shows the flow of a client work request from the network to a server address space managed by the workload manager of the present invention.
FIG. 2 illustrates the state data used to select resource bottlenecks.
FIG. 3 is a flowchart showing logic flow for the find-bottleneck function.
FIG. 4 is a flowchart of the steps to assess improving performance by increasing the number of servers.
FIG. 5 is a sample graph of server ready user average.
FIG. 6 is a sample graph of queue delay.
DETAILED DESCRIPTION OF THE INVENTION
As a preliminary to discussing a system incorporating the present invention, some prefatory remarks about the concept of workload management (upon which the present invention builds) are in order.
Workload management is a concept whereby units of work (processes, threads, etc.) that are managed by an operating system are organized into classes (referred to as service classes or goal classes) that are provided system resources in accordance
with how well they are meeting predefined goals. Resources are reassigned from a donor class to a receiver class if the improvement in performance of the receiver class resulting from such reassignment exceeds the degradation in performance of the donor
class, i.e., there is a net positive effect in performance as determined by predefined performance criteria. Workload management of this type differs from the run-of-the-mill resource management performed by most operating systems in that the assignment
of resources is determined not only by its effect on the work units to which the resources are reassigned, but also by its effect on the work units from which they are taken.
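The donor/receiver test described above can be sketched in a few lines. This is a minimal illustration, assuming that the improvement for the receiver and the degradation for the donor have already been projected onto one common performance scale; the function names are hypothetical.

```python
def net_value(receiver_gain: float, donor_loss: float) -> float:
    """Net effect of a reassignment on a common (assumed) performance scale."""
    return receiver_gain - donor_loss


def should_reassign(receiver_gain: float, donor_loss: float) -> bool:
    """Reassign only when the receiver gains more than the donor loses."""
    return net_value(receiver_gain, donor_loss) > 0.0
```

A reassignment whose projected gain merely equals the projected loss is rejected in this sketch, since it yields no net positive effect.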
Workload managers of this general type are disclosed in the following commonly owned patents, pending patent applications and non-patent publications, incorporated herein by reference:
U.S. Pat. No. 5,504,894 to D. F. Ferguson et al., entitled "Workload Manager for Achieving Transaction Class Response Time Goals in a Multiprocessing System";
U.S. Pat. No. 5,473,773 to J. D. Aman et al., entitled "Apparatus and Method for Managing a Data Processing System Workload According to Two or More Distinct Processing Goals";
U.S. Pat. No. 5,537,542 to C. K. Eilert et al., entitled "Apparatus and Method for Managing a Server Workload According to Client Performance Goals in a Client/Server Data Processing System";
U.S. Pat. No. 5,603,029, to J. D. Aman et al., entitled "System of Assigning Work Requests Based on Classifying into an Eligible Class Where the Criteria Is Goal Oriented and Capacity Information is Available";
U.S. application Ser. No. 08/383,168, filed Feb. 3, 1995, of C. K. Eilert et al., now U.S. Pat. No. 5,675,739, entitled "Apparatus and Method for Managing a Distributed Data Processing System Workload According to a Plurality of Distinct
Processing Goal Types";
U.S. application Ser. No. 08/383,042, filed Feb. 3, 1995, of C. K. Eilert et al., now abandoned in favor of Ser. No. 08/848,763, filed May 1, 1997, entitled "Multi-System Resource Capping";
U.S. application Ser. No. 08/488,374, filed Jun. 7, 1995, of J. D. Aman et al., entitled "Apparatus and Accompanying Method for Assigning Session Requests in a Multi-Server Sysplex Environment";
MVS Planning: Workload Management, IBM publication GC28-1761-00, 1996;
MVS Programming: Workload Management Services, IBM publication GC28-1773-00, 1996.
Of the patents and applications, U.S. Pat. Nos. 5,504,894 and 5,473,773 disclose basic workload management systems; U.S. Pat. No. 5,537,542 discloses a particular application of the workload management system of U.S. Pat. No. 5,473,773 to
client/server systems; applications 08/383,168 and 08/383,042 disclose particular applications of the workload management system of U.S. Pat. No. 5,473,773 to multiple interconnected systems; U.S. Pat. No. 5,603,029 relates to the assignment of work
requests in a multi-system complex ("sysplex"); and application Ser. No. 08/488,374 relates to the assignment of session requests in such a complex. The two non-patent publications describe an implementation of workload management in the IBM®
OS/390™ (formerly MVS®) operating system.
FIG. 1 illustrates the environment and the key features of the present invention for an exemplary embodiment. The environment of this invention is that of a queue of work requests and a pool of servers which service the work requests. This
invention allows management of the number of servers based on the performance goal classes of the queued work and the performance goal classes of competing work in the computer system. Those skilled in the art will recognize that any number of such
queues and groups of servers within the computer system may be used without departing from the spirit or scope of this invention. The computer system 100 is executing a workload and is controlled by its own copy of an operating system 101 such as the
IBM OS/390 operating system. The operating system 101 executes the steps described in this specification.
Except for the enhancements relating to the present invention, system 100 is the one disclosed in copending application Ser. No. 08/383,168. Although not shown in FIG. 1, system 100 may be one of a plurality of interconnected systems that are
similarly managed and make up a sysplex. As taught in copending application Ser. No. 08/383,168, the performance of various service classes into which units of work may be classified may be tracked not only for a particular system, but for the sysplex
as a whole. To this end, and as will be apparent from the description below, means are provided for communicating performance results between system 100 and other systems in the sysplex. However, since this sysplex-wide mode of operation is not an
essential part of the present invention, it is discussed only in passing in this specification. In general, the reader may refer to the copending application Ser. No. 08/383,168 for this and other details of operation of the system 100 not directly
related to the present invention.
Dispatcher 102 is a component of the operating system 101 that selects the unit of work to be executed next by the computer. The units of work 150 are the application programs that do the useful work that is the purpose of the computer system
100. The units of work that are ready to be executed are represented by a chain of control blocks in the operating system memory called the address space control block (ASCB) queue.
Work manager 160 is a component outside of the operating system 101 which uses operating system services to define one or more queues 161 to the workload manager 105 and to insert work requests 162 onto these queues. The workload manager 105
maintains the inserted requests 162 in first-in first-out order for selection by servers 163 of the work manager 160.
Servers 163 are components of the work manager 160 which are capable of servicing queued work requests 162. When the workload manager 105 starts a server 163 to service requests 162 for a work manager 160's queue 161, the workload manager uses
the server definitions 141 stored on a shared data facility 140 to start an address space (i.e., process) 164. The address space 164 started by the workload manager 105 contains one or more servers (i.e., dispatchable units or tasks) 163 which service
requests 162 on the particular queue 161 that the address space should service, as designated by the workload manager.
In a sysplex comprising a plurality of systems 100, any suitable means may be used to route incoming work requests 162 to a particular system based on the capacity of the system to handle new requests, such as that shown in U.S. Pat. No.
5,603,029 or copending application Ser. No. 08/488,374.
FIG. 1A shows the flow of a client work request 162 from a network (not shown) to which system 100 is connected to a server address space 164 managed by the workload manager 105. A work request 162 is routed to a particular system 100 in the
sysplex and received by a work manager 160. Upon receiving the work request 162, the work manager 160 classifies it to a WLM service class and calls the workload manager 105 to insert the work request into a WLM work queue 161. The work request 162
waits in the work queue 161 until there is a server 163 ready to run it.
A task 163 in a server address space 164 that is ready to run a new work request 162 (either the space has just been started or the task finished running a previous request) calls the workload manager 105 for a new work request. If there is a
request 162 on the work queue 161 the address space 164 is serving, the workload manager 105 passes the request to the server 163. Otherwise, the workload manager 105 suspends the server 163 until a request 162 is available.
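The hand-off between a work queue and its servers can be sketched as follows. This is an illustrative model only, assuming requests and servers are identified by plain strings; the class and method names are hypothetical and are not the WLM service names.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class WorkQueue:
    """First-in first-out queue of requests plus a list of suspended servers."""

    def __init__(self) -> None:
        self.requests: Deque[str] = deque()
        self.waiting_servers: Deque[str] = deque()

    def insert(self, request: str) -> Optional[Tuple[str, str]]:
        """Queue a request, or hand it to a suspended server if one is waiting.

        Returns (server, request) when a suspended server is resumed,
        otherwise None.
        """
        if self.waiting_servers:
            return (self.waiting_servers.popleft(), request)
        self.requests.append(request)
        return None

    def get_work(self, server: str) -> Optional[str]:
        """Called by a ready server: return the oldest request, or suspend."""
        if self.requests:
            return self.requests.popleft()
        self.waiting_servers.append(server)  # nothing queued: server waits
        return None
```

In this sketch a server is recorded as waiting only while the request queue is empty, so the two deques are never both non-empty at once.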
When a work request 162 is passed to the workload manager 105, it is put on a work queue 161 to wait for a server 163 to be available to run the request. There is one work queue 161 for each unique combination of work manager 160, application
environment name, and WLM service class of the work request 162. (An application environment is the environment that a set of similar client work requests 162 needs to execute. In OS/390 terms this maps to the job control language (JCL) procedure that
is used to start the server address space to run the work requests.) The queuing structures are built dynamically when the first work request 162 for a specific work queue 161 arrives. The structures are deleted when there has been no activity for a
work queue 161 for a predetermined period of time (e.g., an hour). If an action is taken that can change the WLM service class of the queued work requests 162, like activating a new WLM policy, the work queues 161 are dynamically rebuilt to reflect the
new WLM service class of each work request 162.
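The keying of work queues described above can be sketched with a dictionary indexed by the three-part combination. This is a rough illustration; the class name, the use of caller-supplied timestamps, and the choice to purge only empty queues are assumptions of the example, and the one-hour limit simply mirrors the "e.g., an hour" in the text.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

# (work manager, application environment name, WLM service class)
QueueKey = Tuple[str, str, str]
IDLE_LIMIT_SECONDS = 3600.0  # "e.g., an hour"


class QueueRegistry:
    def __init__(self) -> None:
        self.queues: Dict[QueueKey, Deque[str]] = {}
        self.last_activity: Dict[QueueKey, float] = {}

    def insert(self, key: QueueKey, request: str, now: float) -> None:
        """Build the queue on first use, then append the request."""
        self.queues.setdefault(key, deque()).append(request)
        self.last_activity[key] = now

    def purge_idle(self, now: float) -> List[QueueKey]:
        """Delete queues with no activity for the idle limit.

        Assumption of this sketch: only empty queues are deleted.
        """
        idle = [k for k, t in self.last_activity.items()
                if now - t >= IDLE_LIMIT_SECONDS and not self.queues[k]]
        for k in idle:
            del self.queues[k]
            del self.last_activity[k]
        return idle
```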
One server address space 164 is started when the first work request 162 arrives for a work queue 161. Subsequent spaces 164 are started when required to support the workload (see policy adjustment discussion below). Preferably, the mechanism to
start spaces 164 has several features to avoid common problems in other implementations that automatically start spaces. Thus, starting of spaces 164 is preferably paced so that only one start is in progress at a time. This pacing avoids flooding the
system 100 with address spaces 164 being started.
Also, special logic is preferably provided to prevent creation of additional address spaces 164 for a given application environment if a predetermined number of consecutive start failures (e.g., 3 failures) are encountered for which the likely
cause is a JCL error in the JCL proc for the application environment. This avoids getting into a loop trying to start an address space that will not successfully start until the JCL error is corrected.
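The pacing of starts and the guard against repeated start failures can be sketched together. This is a simplified model under stated assumptions: the class and method names are hypothetical, every start failure is counted (the text counts only those likely caused by a JCL error), and the limit of three mirrors the "e.g., 3 failures" above.

```python
from typing import Dict


class StartController:
    """Paces address-space starts and blocks environments that keep failing."""

    def __init__(self, max_consecutive_failures: int = 3) -> None:
        self.max_failures = max_consecutive_failures
        self.start_in_progress = False
        self.consecutive_failures: Dict[str, int] = {}

    def may_start(self, app_env: str) -> bool:
        if self.start_in_progress:
            return False  # pacing: only one start at a time
        return self.consecutive_failures.get(app_env, 0) < self.max_failures

    def begin_start(self, app_env: str) -> bool:
        """Mark a start as in progress if one is currently allowed."""
        if not self.may_start(app_env):
            return False
        self.start_in_progress = True
        return True

    def finish_start(self, app_env: str, succeeded: bool) -> None:
        """Record the outcome; a success resets the failure count."""
        self.start_in_progress = False
        if succeeded:
            self.consecutive_failures[app_env] = 0
        else:
            count = self.consecutive_failures.get(app_env, 0)
            self.consecutive_failures[app_env] = count + 1
```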
Additionally, if a server address space 164 fails while running a work request 162, the workload manager 105 preferably starts a new address space to replace it. Repeated failures will cause the workload manager to stop accepting work requests for the
application environment until informed by an operator command that the problem has been solved.
A given server address space 164 is physically capable of serving any work request 162 for its application environment even though it will normally only serve a single work queue 161. Preferably, when a server address space 164 is no longer
needed to support its work queue 161, it is not terminated immediately. Instead, the server address space 164 waits for a period of time as a "free agent" to see if it can be used to support another work queue 161 with the same application environment.
If the server address space 164 can be shifted to a new work queue 161, the overhead of starting a new server address space for that work queue is avoided. If the server address space 164 is not needed by another work queue 161 within a predetermined
period (e.g., 5 minutes), it is terminated.
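The "free agent" behavior can be sketched as a small decision function. This is an illustration only; the function name and return convention are hypothetical, the candidate queues are assumed to share the server's application environment already, and the 300-second default mirrors the "e.g., 5 minutes" above.

```python
from typing import Optional, Sequence, Tuple


def free_agent_decision(idle_seconds: float,
                        queues_needing_server: Sequence[str],
                        limit: float = 300.0) -> Tuple[str, Optional[str]]:
    """Decide what an unneeded server address space should do next.

    queues_needing_server lists work queues with the same application
    environment that could use another server (an assumed input).
    """
    if queues_needing_server:
        return ("reassign", queues_needing_server[0])  # avoid a fresh start
    if idle_seconds >= limit:
        return ("terminate", None)                     # nobody wanted it
    return ("wait", None)                              # keep waiting
```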
The present invention takes as input the performance goals and server definitions 141 established by a system administrator and stored on a data storage facility 140. The data storage facility 140 is accessible by each system 100 being managed.
The performance goals illustrated here are of two types: response time (in seconds) and execution velocity (in percent). Those skilled in the art will recognize that other goals, or additional goals, may be chosen without departing from the spirit or
scope of this invention. Included with the performance goals is the specification of the relative importance of each goal. The goals 141 are read into each system 100 by a workload manager (WLM) component 105 of the operating system 101 on each of the
systems being managed. Each of the goals, which were established and specified by the system administrator, causes the workload manager 105 on each system 100 to establish a performance class to which individual work units will be assigned. Each
performance class is represented in the memory of the operating systems 101 by a class table entry 106. The specified goals (in an internal representation) and other information relating to the performance class are recorded in the class table entry.
Other information stored in a class table entry includes the number of servers 163 (107) (a controlled variable), the relative importance of the goal class (108) (an input value), the multi-system performance index 151, the local performance index 152
(computed values), the response time goal 110 (an input value), the execution velocity goal 111 (an input value), sample data 125 (measured data), the remote response time history (157) (measured data), the remote velocity history 158 (measured data),
the sample data history 125 (measured data), and the response time history 126 (measured data).
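The contents of a class table entry can be pictured as a simple record. The sketch below is one possible layout, not the internal representation used by the operating system; all field names, types, and default values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClassTableEntry:
    # Input values set from the administrator's goals
    importance: int
    response_time_goal: Optional[float] = None       # seconds
    execution_velocity_goal: Optional[float] = None  # percent
    # Controlled variable
    num_servers: int = 0
    # Computed values
    multi_system_pi: float = 1.0
    local_pi: float = 1.0
    # Measured data (element types are hypothetical)
    sample_data: List[int] = field(default_factory=list)
    sample_data_history: List[int] = field(default_factory=list)
    response_time_history: List[float] = field(default_factory=list)
    remote_response_time_history: List[float] = field(default_factory=list)
    remote_velocity_history: List[float] = field(default_factory=list)
```

Grouping the fields by role (input, controlled, computed, measured) follows the labels given in the text above.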
Operating system 101 includes a system resource manager (SRM) 112, which in turn includes a multi-system goal-driven performance controller (MGDPC) 114. These components operate generally as described in U.S. Pat. No. 5,473,773 to J. D. Aman
et al. and copending application Ser. No. 08/383,168. However, MGDPC 114 is modified according to the present invention to manage the number of servers 163. MGDPC 114 performs the functions of measuring the achievement of goals, selecting the user
performance goal classes that need their performance improved, and improving the performance of the user performance goal classes selected by modifying the controlled variables of the associated work units, as described later. The MGDPC function is
performed periodically based on a periodic timer expiration approximately every ten seconds in the preferred embodiment.
The general manner of operation of MGDPC 114, as described in copending application Ser. No. 08/383,168, is as follows. At 115, a multi-system performance index 151 and a local performance index 152 are calculated for each user performance goal
class 106 using the specified goal 110 or 111. The multi-system performance index 151 represents the performance of work units associated with the goal class across all the systems being managed. The local performance index 152 represents the
performance of work units associated with the goal class on the local system 100. The resulting performance indexes 151, 152 are recorded in the corresponding class table entry 106. The concept of a performance index as a method of measuring user
performance goal achievement is well known. For example, in the above-cited U.S. Pat. No. 5,504,894 to Ferguson et al., the performance index is described as the actual response time divided by the goal response time.
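For a response time goal, the performance index described above reduces to a single division. The sketch below illustrates only that definition; the function name and the rejection of non-positive goals are assumptions of the example.

```python
def response_time_performance_index(actual_seconds: float,
                                    goal_seconds: float) -> float:
    """Actual response time divided by goal response time."""
    if goal_seconds <= 0:
        raise ValueError("goal response time must be positive")
    return actual_seconds / goal_seconds
```

Under this definition, an index above 1.0 means the actual response time is longer than the goal, and an index below 1.0 means the work is responding faster than its goal requires.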
At 116, a user performance goal class is selected to receive a performance improvement in the order of the relative goal importance 108 and the current value of the performance indexes 151, 152. The selected user performance goal class is
referred to as the receiver. MGDPC 11 | | |