Description
BACKGROUND OF THE INVENTION
1. Technical Field
The invention disclosed broadly relates to data processing and more
particularly relates to providing security auditing features for a data
processing system.
2. Background Art
Many data processing applications involve highly confidential information
such as in financial applications, national security applications, and the
like, where many user terminals are connected through terminal controllers
to one of a plurality of data processors interconnected in a distributed
processing network. Data files can be stored on storage devices which are
commonly accessible by a plurality of data processors and terminals
connected in the network. The diversity of nodes at which access can be
had to the various data files stored throughout the network presents a
significant security problem, where highly confidential messages and files
are transmitted and stored in the system. The prior art has not provided
an effective mechanism to prevent unauthorized persons or programs
from reading confidential data being transmitted over the distributed
processing network and stored in the commonly accessible storage devices.
In prior art data processing systems, communications paths and data
accessing nodes have been penetrated by unauthorized persons or programs
which divert, replicate or otherwise subvert the security of the
confidential information being transmitted and stored in the network.
For national security applications, the U.S. Government has established a
standard by which the security of data processing systems can be
evaluated, that standard having been published in "Trusted Computer System
Evaluation Criteria," U.S. Department of Defense, December 1985, DoD
publication number 5200.28-STD (referred to herein as DoD Standard). The
DoD Standard defines a trusted computer system as a system that employs
sufficient hardware and software integrity measures to allow its use for
processing simultaneously a range of sensitive or classified information.
A trusted computing base (TCB) is defined as the totality of protection
mechanisms within a computer system, including hardware, firmware and
software, the combination of which is responsible for enforcing a security
policy. A TCB consists of one or more components that together enforce a
unified security policy over a product or system. The ability of a TCB to
correctly enforce a security policy depends solely on the mechanisms
within the TCB and on the correct input by system administrative personnel
of parameters such as a user's clearance, related to the security policy.
A trusted path is defined by the DoD Standard as a mechanism by which a
person at a terminal can communicate directly with the trusted computing
base. The trusted path mechanism can only be activated by the person or
the trusted computing base and cannot be imitated by untrusted software.
Trusted software is defined as the software portion of a trusted computing
base.
As is set forth in the DoD Standard, a secure computer system will control
access to information such that only properly authorized individuals or
processes will have access to read, write, create or delete information.
The DoD Standard sets forth six fundamental requirements to control access
to information and to deal with how one can obtain credible assurances
that this has been accomplished in a trusted computer system. The first
requirement for a secure computer system is that the system must enforce a
mandatory security policy that can effectively implement access rules for
handling sensitive information. Those rules would include the requirement
that no person lacking proper personnel security clearance can obtain
access to classified information and also that only selected users or
groups of users may obtain access to data based for example on a need to
know. A second requirement for a secure computer system is that access
control labels must be associated with information which is to be
maintained secure. A third requirement for a secure computer system is
that each access to information must be authorized based upon who is
accessing the information and what class of information they are
authorized to deal with. A fourth requirement for a secure computer system
is that audit information must be selectively kept and protected so that
actions which affect security can be traced to the responsible user. A
trusted system must be able to record the occurrences of events which are
relevant to security, in an audit log. The capability to select the audit
events to be recorded is necessary in order to minimize the expense of
auditing and to allow efficient analysis. Audit data must be protected
from modification and unauthorized destruction so as to permit detection
and later investigation of security violations. A fifth requirement for a
secure computer system is that a system must contain hardware and software
mechanisms that can be independently evaluated to provide sufficient
assurance that the system enforces the first four requirements. A sixth
requirement of a secure computer system is that trusted mechanisms that
enforce these basic requirements must be continuously protected against
tampering and/or unauthorized changes.
The problem of maintaining a secure computer system as defined in the DoD
Standard is compounded for those systems which accommodate multiple users.
Some examples of prior art multi-user operating systems which have not
provided an effective mechanism for establishing a secure computer system
as defined in the DoD Standard, include UNIX (UNIX is a trademark of AT&T
Bell Laboratories), XENIX (XENIX is a trademark of Microsoft Corporation)
and AIX (AIX is a trademark of the IBM Corporation). UNIX was developed
and is licensed by AT&T as an operating system for a wide range of
minicomputers and microcomputers. For more information on the UNIX
Operating System, the reader is referred to "UNIX (TM) System, Users
Manual, System V," published by Western Electric Company, January 1983. A
good overview of the UNIX Operating System is provided by Brian W.
Kernighan and Rob Pike in their book entitled "The UNIX Programming
Environment," published by Prentice-Hall (1984). A more detailed
description of the design of the UNIX Operating System is to be found in a
book by Maurice J. Bach, "Design of the UNIX Operating System," published
by Prentice-Hall (1986).
AT&T Bell Labs has licensed a number of parties to use the UNIX Operating
System, and there are now several versions available. The most current
version from AT&T is Version 5.2. Another version known as the Berkeley
version of the UNIX Operating System was developed by the University of
California at Berkeley. Microsoft Corporation has a version known under
their trademark as XENIX.
With the announcement of the IBM RT PC (RT and RT PC are trademarks of IBM
Corporation), (RISC (reduced instruction set computer) technology personal
computer) in 1985, IBM Corporation released a new operating system called
AIX which is compatible at the application interface level with AT&T's
UNIX Operating System, Version 5.2, and includes extensions to the UNIX
Operating System, Version 5.2. For a further description of the AIX
Operating System, the reader is referred to "AIX Operating System
Technical Reference," published by IBM Corporation, 2nd Edition (September
1986).
The invention disclosed and claimed herein specifically concerns providing
a mechanism for auditing information which must be selectively kept and
protected in a secure, distributed data processing system so that actions
affecting that security can be traced to the responsible user. This
mechanism is to be a part of a multi-user operating system such as UNIX,
XENIX or AIX, so that a secure computer system can be established. The
specific embodiment of the invention disclosed herein is applied to the
AIX Operating System. The reader is directed to the description provided
in the copending U.S. Pat. No. 4,918,653 by Abhai Johri, et al. entitled
"A Trusted Path Mechanism for an Operating System," assigned to the IBM
Corporation and which is incorporated herein by reference. The description
in the Johri, et al. copending patent application includes a discussion
of the operating principles of the AIX Operating System, which will assist the
reader in understanding the invention disclosed and claimed herein. For
further information on the AIX Operating System, the reader is further
referred to the above cited IBM publication "AIX Operating System
Technical Reference."
Since the AIX Operating System and other UNIX-like operating systems make
use of a specialized set of terms, the following definitions are offered
for some of those terms.
Process: A sequence of actions required to produce a desired result, such
as an activity within the system begun by entering a command, running a
shell program, or being started by another process.
Password: A string of characters that, when entered along with a user
identification, allows an operator to sign on to the system.
Operating System: Software that controls the running of programs. In
addition, an operating system may provide services such as resource
allocation, scheduling, input/output control, and data management.
Kernel: In UNIX-like operating systems, the kernel implements the system
call interface.
Init: After the kernel completes the basic process of initialization, it
starts a process that is the ancestor of all other processes in the
system, called the init process. The init process is a program that
controls the state in which the system is running, normally either
maintenance mode or multi-user mode.
Getty: The init process runs the getty command for each port to the system.
Its primary function is to set the characteristics of the port specified.
Login: The login program logs the user onto the system, validates the
user's password, makes the appropriate log entries, sets up the processing
environment, and runs the command interpreter that is specified in the
password file, usually the shell (SH) program.
Shell (SH): The shell command is a system command interpreter and
programming language. It is an ordinary user program that reads commands
entered at the keyboard and arranges for their execution.
Fork: The fork system call creates a new process called a child process,
which is an exact copy of the calling process (the parent process). The
created child process inherits most of the attributes of the parent
process.
Exec: The exec system call executes a new program in the calling process.
Exec does not create a new process, but it overlays the current program
with a new one, which is called the new process image. The new process
image file can be an executable binary file, an executable text file that
contains a shell procedure, or a file which names an executable binary
file or a shell procedure which is to be run.
Signal: Signals provide communication to an active process, forcing a
single set of events where the current process environment is saved and a
new one is generated. A signal is an event which interrupts the normal
execution of a process and can specify a signal handler subroutine which
can be called when a signal occurs.
Superuser (su): The user who can operate without the restrictions designed
to prevent data loss or damage to the system (user ID 0).
Root: Another name sometimes used for superuser.
Root Directory: The top level of a tree-structured directory system.
Daemon Process: A process begun by the kernel or the root shell that can be
stopped only by the superuser. Daemon processes generally provide services
that must be available at all times such as sending data to a printer.
Mount: To make a file system accessible.
Terminal: An input/output device containing a keyboard and either a display
device or a printer. Terminals usually are connected to a computer and
allow a person to interact with the computer.
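The Fork and Exec definitions above can be illustrated with a minimal sketch in Python, assuming a POSIX system. The helper name run_in_child is hypothetical and is not part of AIX or of the invention; it only shows a parent process creating a child with fork, the child overlaying its program image with exec, and the parent collecting the child's exit status.

```python
import os
import sys


def run_in_child(argv):
    """Fork a child, exec argv in it, and return the child's exit status."""
    pid = os.fork()  # the child starts as a copy of the calling (parent) process
    if pid == 0:
        try:
            # exec overlays the child's program image; on success it never returns
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if exec failed (assumed convention)
    _, status = os.waitpid(pid, 0)  # the parent waits for the child to finish
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    # Hypothetical usage: the Python interpreter itself is the new program.
    print(run_in_child([sys.executable, "-c", "raise SystemExit(3)"]))
```

Note that the process identifier of the child does not change across the exec call; only the program being run changes.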
An example of a distributed network within which the invention can find
application is described in the copending U.S. patent application by G. H.
Neuman, et al., Ser. No. 14,897, filed Feb. 13, 1987, entitled "A System
and Method for Accessing Remote Files in a Distributed Networking
Environment," now U.S. Pat. No. 4,887,204 which is assigned to the IBM
Corporation and which is incorporated herein by reference.
As described in the copending Neuman, et al. application, in a distributed
environment, several data processing systems are interconnected across a
network system. A distributed services program installed on the systems in
the network allows the processors to access data files distributed across
the various nodes of the network without regard to the location of the
data file in the network.
To reduce the network traffic overhead when files at other nodes are
accessed, and to preserve the file system semantics, i.e. the file
integrity, Neuman, et al. disclose that the accessing of the various files
is managed by file synchronization modes. A file is given a first
synchronization mode if a file is open at only one node for either read or
write access. A file is given a second synchronization mode if a file is
opened for read only access at any node. A file is given a third
synchronization mode if the file is open for read access in more than one
node, and at least one node has the file open for write access.
If a file is in either the first or second synchronization mode, Neuman, et
al. disclose that the client node, which is the node accessing the file,
uses a client cache within its operating system to store the file. All
reads and writes are then sent to this cache.
If a file is in the third mode, Neuman, et al. disclose that all read and
write requests must go to the server node where the file resides. The node
accessing the file does not use the cache in its operating system to
access the file data during this third mode.
Neuman, et al. disclose that the client cache is managed such that all read
and write requests access the client cache in the first and second
synchronization modes. In the third synchronization mode, the client cache
is not used. In this way, overall system performance is improved without
sacrificing file integrity.
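The three synchronization modes summarized above can be sketched as a small classification function. This is only an illustration under stated assumptions: the names SyncMode, sync_mode and uses_client_cache, and the representation of open files as a mapping from node name to "r" or "w", are hypothetical and do not come from the Neuman, et al. application. The sketch also assumes that a file open at a single node always falls in the first mode, and that any multi-node case with a writer falls in the third mode.

```python
from enum import Enum


class SyncMode(Enum):
    FIRST = 1   # open at only one node, for read or write
    SECOND = 2  # open for read only access at every node that has it open
    THIRD = 3   # open at more than one node, at least one for write


def sync_mode(opens):
    """opens maps node name -> 'r' or 'w' (hypothetical representation)."""
    if not opens:
        raise ValueError("file is not open at any node")
    if len(opens) == 1:
        return SyncMode.FIRST
    if all(mode == "r" for mode in opens.values()):
        return SyncMode.SECOND
    return SyncMode.THIRD


def uses_client_cache(mode):
    # The client cache serves reads and writes only in the first two modes;
    # in the third mode every request goes to the server node.
    return mode in (SyncMode.FIRST, SyncMode.SECOND)
```

For example, sync_mode({"A": "r", "B": "w"}) falls in the third mode, so the client cache would be bypassed.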
OBJECTS OF THE INVENTION
It is therefore an object of the invention to provide an improved secure
computer system.
It is another object of the invention to provide an improved secure
computer system which complies with the DoD Standard.
It is yet a further object of the invention to provide an improved secure
distributed data processing system in which audit information can be
selectively kept and protected so that actions affecting the security of
the system can be traced to the responsible user.
It is yet a further object of the invention to provide an improved secure,
distributed data processing system using a UNIX-type operating system, in
which audit information can be selectively kept and protected so that
actions affecting security of the system can be traced to the responsible
user.
SUMMARY OF THE INVENTION
These and other objects, features and advantages of the invention are
accomplished by the distributed auditing subsystem disclosed herein. The
distributed auditing subsystem invention runs in a UNIX-like operating
system environment with a hierarchical file system. The invention includes
an audit daemon which provides an audit trail of accesses to the objects
it protects, and which maintains and protects that audit trail from modification
or unauthorized access or destruction. The audit data generated by the
invention is protected so that read access to it is limited to those who
are authorized for audit data. The invention enables the recording of
events which are relevant to the maintenance of the security of the
system, such as the use of identification and authentication mechanisms,
the introduction of objects into a user's address space, the deletion of
such objects, actions taken by computer operators and system
administrators and/or system security officers, and other security
relevant events. The invention generates an audit record for each recorded
event which includes the date and time of the event, the user, the type of
event, and the success or failure of the event. The invention performs an
on-line compression of the audit trail log file using a UNIX-type daemon
process. The audit daemon process has a restartable feature that enables
it to recover after node failures. The invention finds particular
application in a distributed processing system in which files may be
variously stored at diverse storage locations in the network. In such a
distributed system, the audit process of the invention can be carried out
on a network-wide, distributed basis so that audit files located at
diverse storage locations can be concentrated into a single audit trail
log file.
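The contents of an audit record as listed above (date and time, user, type of event, and success or failure) can be pictured with a minimal sketch. The class name AuditRecord, the field names, and the one-line text layout are assumptions made for illustration only; they are not the record structure of the real audit trail file.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    timestamp: datetime  # date and time of the event
    user: str            # user responsible for the event
    event: str           # type of event (hypothetical name, e.g. "login_fail")
    success: bool        # whether the event succeeded or failed


def format_record(rec):
    """Render one record as a single text line (hypothetical layout)."""
    outcome = "ok" if rec.success else "fail"
    return f"{rec.timestamp.isoformat()} {rec.user} {rec.event} {outcome}"


if __name__ == "__main__":
    rec = AuditRecord(datetime.now(timezone.utc), "alice", "login_fail", False)
    print(format_record(rec))
```

The record is declared frozen so that, once created, its fields cannot be changed by the code that handles it, which mirrors in a small way the requirement that audit data be protected from modification.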
In this manner, a secure computer system which conforms to the DoD Standard
is achieved, which can generate, manipulate and compress audit
information concerning actions affecting the security of the distributed
data processing system.
BRIEF DESCRIPTION OF THE DRAWINGS
These and other objects, features and advantages of the invention will be
more fully appreciated with reference to the accompanying figures.
FIG. 1 is a diagram of a network which includes two hierarchical
file systems.
FIG. 2 is a diagram similar to FIG. 1, showing how directories on systems B
and C can be remotely mounted on system A.
FIG. 3 is a diagram similar to FIG. 1, showing how a client system can
access a file on a server system.
FIG. 4 depicts how audit trail information is generated and compressed from
a plurality of nodes in the distributed processing system.
FIG. 5 shows the structure of the real audit trail file.
FIG. 6 is a flow diagram of the audit daemon process.
FIG. 7 is an architectural diagram of the invention.
DESCRIPTION OF THE BEST MODE FOR CARRYING OUT THE INVENTION
An auditing subsystem for a unitary data processor which includes the
feature of compressing the audit trail file has been previously disclosed
in a paper by J. Picciotto entitled "The Design of an Effective Auditing
Subsystem," Proceedings of the 1987 IEEE Symposium on Security and
Privacy, Oakland, CA, pp. 13-22 (April 1987). Picciotto talks about how to
design an auditing subsystem which contains compression. However,
Picciotto fails to deal with how to get an auditing subsystem to operate
in a distributed processing network where there are distributed services.
The concept of distributed services is described in the copending G. H.
Neuman, et al. application referenced above, for example, as a collection
of UNIX machines (nodes). Each node has a hierarchical file system that
can be drawn as a tree, as shown in FIG. 1. The root of the tree is called
slash and under each slash is a directory and in each directory we can
have either other directories or files. We can think of a UNIX directory
as being like a file drawer. A UNIX file is like a file in that file drawer. In
UNIX, we can have subdirectories of a directory; that is, a parent directory
can have child directories. FIG. 1 shows a tree which has its root at the
top and branches going down representing hierarchical name space where we
have the root at the top represented by slash and under that we have some
subdirectories. In UNIX some of the typical subdirectories are /bin, /etc,
/tmp and /usr and sometimes /u and then under a directory such as /etc,
we have files, for example, /etc/rc or we could have a directory.
In accordance with the invention, we have introduced a new directory
/etc/security. Under that, we have some tables. One of the tables under
/etc/security is a file named /etc/security/s_cmd, which is the
name of the file that contains the command table for the trusted shell, as
described in the copending A. Johri, et al. application referenced above.
Also under the directory name /etc/security, we have introduced a
directory named /etc/security/audit and under this directory we have a
file named a_event. In accordance with the invention, the event
table lists the known events in the system for this particular auditing
subsystem. Table 1 gives an example of an event table. There are two types
of events: base events and administrative events. Base
events are events that happen in the kernel or in commands. An
example of a base event in the kernel is the event
named exec or the event named fork. An example of base events in a
command is the pair of events in the command named login. One
is login_fail and the other is login_ok. If we wanted to
define an administrative event in the system that was either login_fail
or login_ok, and we wanted to name this administrative event
login, then we would go into the event table,
/etc/security/audit/a_event, and edit the table to add a line that
says login:login_fail,login_ok. Then if we are an auditor
and we want to turn on the audit event named "login," it is already
defined in the table. Administrative events are convenient macros that an
auditor can use, and an auditor can edit or customize this file so that it
contains new administrative events. The invention is especially designed
for operation over distributed services (DS), in a distributed processing
network.
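The way an administrative event stands for a group of base events can be sketched as follows. The only format detail taken from the text is the line login:login_fail,login_ok; everything else is an assumption made for illustration, including the function names parse_event_table and expand_event, the skipping of malformed lines, and the possibility that one administrative event names another.

```python
def parse_event_table(text):
    """Parse 'name:member1,member2' lines into a dict of administrative events."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines (assumed policy)
        name, members = line.split(":", 1)
        table[name.strip()] = [m.strip() for m in members.split(",") if m.strip()]
    return table


def expand_event(name, table, _seen=None):
    """Return the set of base events that an event name stands for."""
    seen = set() if _seen is None else _seen
    if name in seen:
        return set()  # already expanded; also guards against cyclic definitions
    seen.add(name)
    if name not in table:
        return {name}  # a base event stands for itself
    base = set()
    for member in table[name]:
        base |= expand_event(member, table, seen)
    return base


if __name__ == "__main__":
    table = parse_event_table("login:login_fail,login_ok\n")
    print(sorted(expand_event("login", table)))
```

With such a table, turning on the administrative event named login is equivalent to turning on the two base events login_fail and login_ok.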
In order to understand what DS does, as described in the G. H. Neuman, et
al. copending application referenced above, some discussion is given here
of hierarchical UNIX file systems. Suppose we have a network of UNIX
systems and on each UNIX system, we have a hierarchical name space with
directories and files; that is, it is a tree with a root at the top
represented by "/" and underneath the "/" we have directories and files,
as shown in FIG. 1. On the UNIX system we have some well-known directories
under "/" called bin, etc, tmp, usr and some others and then under those
we have some other directories or some other files. Furthermore, it is the
case that on a traditional UNIX system when we look at the name space, all
of those directories and all of those files are local to that unitary data
processor. None of the files are remote.
Let us assume that we have two UNIX systems. We can call the first one A
and the second one B. Each UNIX system has a hierarchical name space
having a tree with the "/" and the various files under it. What we would
like to do is be on one machine A and to access files and directories on
the other machine B. One way to do that, in fact the old way, which
can still be used, is to first log into A and then, from A, log into B
with a command called "telnet" or a command named "rlogin" (r for
remote). After the telnet or rlogin to B, we are in B's environment (B's hierarchical name
space) and we can perform operations on B. But the one thing we cannot do
is move files back and forth between A and B. That is, we are either on A
or on B, but we cannot be on both.
What we would like to do is be on one machine and while remaining in the
first machine, get access to the other machine's hierarchical name space,
either to its directories or its files. In fact, what we would like to do
on A is have any command that works on A's local files, also work on any
remote files on B. To do this, we need a way of naming a remote file.
There are some other things we can do with existing code. One of the
things we can do is if we are on A and we want to copy a file from A to B
or from B to A, then there is a command for doing this and the command is
called FTP for file transfer program. On AIX, it is called XFTP. The way
that works is that we are on one machine and we say XFTP and it is
basically like a login. We log into the other machine and now we can copy
files back and forth and we can change the directory to a different
directory. We can do a list (LST) to see what is in that directory and
this was acceptable for a while, but it is rather cumbersome.
What we would really like to do if we are on A, is to copy a file from B.
There is an existing command named CP for copy and what we do with copy is
we say "CP X Y" where X and Y are the names of files. X is the name of an
existing file and Y is the name of a brand new to-be-created file. What we
have done effectively is, we made a copy of X and we have called it Y.
Traditionally, X and Y are both local. They are files on the same machine,
but what we would like is for the software to be oblivious to whether
X and Y are local or remote: both may be local, both may be remote, or
one may be local and the other remote. There are more than four combinations,
because if both files are remote, they do not have to be both on machine
B. X can be on machine B and Y can be on machine C. When we make a copy of
the file, with CP, we are using the same command that we previously used
on the local machine to copy files that are non-local or remote. The
question is, is there a simple way of doing this, of logging into A and
working in A's hierarchical name space and getting access to the
hierarchical name space of other machines on the network like B and C,
such that when we run commands that used to work only locally on A like
copy CP, the system is now oblivious to whether or not the files they deal
with are local or remote on B or C. In particular, we do not have to use
XFTP or FTP to move files back and forth explicitly, we do not have to use
telnet or rlogin to actually log into a different system to move files and
do things. Sitting on the local machine, we have LAN transparent access to
remote files. LAN is a local area network. Transparent means that we are
oblivious or the code is oblivious to whether or not the files are local
or remote. Access means we can read and write these files.
This is in effect what DS offers, as described in the Neuman, et al.
application. On a local UNIX system, what happens is that we have a hard
disk and we define in UNIX what is called file systems. A file system
corresponds to the space of an entire hard disk, or it corresponds to part
of the space on a hard disk. For example, if we have a 170 megabyte hard
disk, as in an RT, we can define one UNIX file system for it, or we
could chop it into two UNIX file systems, or three or four. There are
some limits to the number of file systems we can have on a hard disk. Let
us assume for the time being that we have two file systems on a 170
megabyte hard disk. What happens in UNIX is that "root" is not only a
directory, it turns out also to be a file system. There is a certain
amount of space on disk and it also contains the subdirectories, the
directories and files that are in that file system. If we have two file
systems in UNIX, we have one hierarchical name space. What we do is that
we want to represent all of the file systems in one hierarchical name
space and typically we define a directory in the name space and that will
be a "mount point" a. We are going to "mount" another file system on top
of that mount point a. In this particular example, mount point a happens
to be a subdirectory of the root. In FIG. 1 the new file system on B is
shown as a triangle with a dot b at the top. DS enables the mount point a
on system A and the point b on the new file system of system B to logically
coincide. There are actually two directories under A. There is the
directory in the root file system "/" of A and there is the subdirectory a
at the mount point. The directory at b in system B is the root of the
mounted file system, that is, the file system we are going to mount on top of the
mount point a. Once we have done the mount operation, that is, we have
mounted a file system onto an existing directory, we have basically grown
the hierarchical name space in system A to include more files and more
directories. When we progress down the path in system A from "/" to mount
point "a" and into the mounted directory at "b", by doing a change
directory command CD, we enter into an expanded area.
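The effect of a mount on name resolution can be pictured with a small sketch, assuming a simple table of mount points. The function name resolve, the table layout, the label "LOCAL", and the path "/b" used for the root of the mounted file system on system B are all hypothetical; the sketch only shows how a path that passes through mount point a ends up naming a file in the file system rooted at b.

```python
import posixpath


def resolve(path, mounts):
    """Map an absolute local path to (system, path on that system).

    mounts maps a local mount point to (system name, root directory of the
    mounted file system); "LOCAL" is returned when no mount point applies.
    """
    path = posixpath.normpath(path)
    # Prefer the longest mount point that is a prefix of the path.
    for point in sorted(mounts, key=len, reverse=True):
        if path == point or path.startswith(point.rstrip("/") + "/"):
            system, root = mounts[point]
            rest = path[len(point):].lstrip("/")
            return system, posixpath.join(root, rest) if rest else root
    return "LOCAL", path


if __name__ == "__main__":
    mounts = {"/a": ("B", "/b")}  # mount point a on system A, root b on system B
    print(resolve("/a/x/y", mounts))
    print(resolve("/etc/rc", mounts))
```

A command running on system A keeps using ordinary path names; only the resolution step needs to know that part of the tree lives elsewhere.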
The reason why we bother chopping things into file systems on UNIX, is file
rolling. We do not want to put all data on one disk in one file system. We
want to chop things up into smaller file systems such that if we have a
lot of activity in one, but do not have a lot of activity in another, then
the one for which we do not have a lot of activity, we do not have to back
it up very often. The one that we have a lot of activity in, we want to
back up quite a bit. Rather than always backing up everything, we
partition the disk space in the file system such that it is easier to make
backups of some files, or if we have a failure in part of the disk, we do
not lose everything, we just lose that file system. When we back up
storage on UNIX, we back it up on a per file system basis. What we have
here is one UNIX machine, we have one or more disks on it, each disk
contains one or more UNIX file systems. If the UNIX system has more than
one disk, then each of the disks has one or more file systems and what we
do is we have a distinguished file system called the root file system and
the other file systems are sub-trees that we mount onto various
directories of this root file system. This is a way of extending a
hierarchical name space for a local machine.
Much of UNIX depends on having this single hierarchical name space and for
most of the commands used when we edit a file or when we copy a file, we
specify a path name and the path name is either an absolute path name, or
a relative path name. An absolute path name begins with "/" and it starts
at the root of this hierarchical tree. A relative path name is interpreted
relative to the current directory. There is a
command for changing a directory, like opening a file drawer. When we do a
CD to change directory to go into a different place, then we can specify
path names relative to that directory. This completes the background
discussion of hierarchical UNIX file systems necessary to introduce the
principles of distributed services.
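The distinction between absolute and relative path names can be shown with a short sketch using Python's standard posixpath module. The function name full_path and the sample directories are chosen here for illustration only.

```python
import posixpath


def full_path(current_dir, name):
    """Turn a path name into an absolute path name."""
    if posixpath.isabs(name):
        # An absolute path name begins with "/" and starts at the root.
        return posixpath.normpath(name)
    # A relative path name is interpreted from the current directory.
    return posixpath.normpath(posixpath.join(current_dir, name))


if __name__ == "__main__":
    print(full_path("/usr", "/etc/rc"))  # absolute: current directory is ignored
    print(full_path("/etc", "rc"))       # relative: resolved under /etc
```

Changing the current directory with CD changes the result for relative path names only; absolute path names always resolve the same way.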
In FIG. 1,