Claims
What is claimed is:
1. A computer system having a plurality of computers, each computer having a processor, a memory, and a microkernel operating system, said computer system comprising:
a. an extensible file system wherein a new file system can be added to the system without modification to the microkernel operating system;
b. a virtual memory manager (VMM) in a computer which can cache data from said new file system; and
c. a caching file server (CFS) resident on the computer containing the VMM, said CFS configured to cache file system attributes from said new file system which attributes said VMM cannot cache, said CFS coordinating the caching operations for
clients between said CFS and said VMM so that data caching is not duplicated by said CFS and said VMM.
2. The computer system of claim 1 wherein said CFS can also cache data related to bind operations which said VMM cannot cache.
3. A data processing system having one or more file systems and connected at a local computer to at least one remote computer by a communications link, and said local computer having a memory and a virtual memory manager ("VMM"), said data
processing system comprising:
a. at least one file in the one or more file systems;
b. a microkernel operating system in said local computer thereby permitting a user to add a new file system without modification to the microkernel operating system;
c. said VMM in said local computer comprising mechanisms for caching data from said at least one file system required by clients in said local computer, said mechanisms comprising memory in said local computer and devices for managing said caches
and said memory and supplying said clients with data from said at least one file system; and
d. a caching file server ("CFS") in said local computer outside of said microkernel operating system which provides caching services for the at least one file, said caching services comprising mechanisms for caching specific data within said CFS
if said specific data is not cached by said VMM and mechanisms for using existing caches in said VMM for caching other data where said VMM is able to cache said other data thereby preventing any duplication of data caching in said CFS and said VMM, said
CFS comprising a first file program in the CFS which recognizes requests from a first client program, to read first designated data which originates in the at least one file and which is passed to said first client through said cache in said VMM and to
write second designated data which originates in said first client program and is ultimately written to said at least one file having passed through said cache in said VMM, said CFS coordinating the caching of said first and second designated data to the
at least one file with the VMM in the local computer when said VMM can cache said first and second designated data, and wherein said CFS recognizes requests to query/set attributes of the at least one file and caches said file attributes in the CFS when
said attributes are not cached by said VMM, said attributes of said at least one file comprising data which resides in said at least one file and which is accessible by said first client program,
whereby the caching services provided by the CFS and the VMM for the first client program need not be duplicated and can minimize data transfer overhead by virtue of being cached at the local computer.
4. The data processing system as recited in claim 3 wherein said first file program in the CFS recognizes requests from a second client program, to read/write data to the at least one file and recognizes requests from the second client program
to query/set attributes of the at least one file, and wherein said CFS coordinates the caching of read/write data to the at least one file with the VMM in the local computer to insure that only one cache of the read/write data is maintained for both of
the first and second client programs when addressing a file, and wherein said CFS caches said file attributes in the CFS, whereby the caching services provided by the CFS and the VMM for said first and second client programs need not be duplicated and
can minimize data transfer overhead by virtue of being cached at the local computer.
5. The data processing system as recited in claim 4 wherein the CFS services both the first client program and the second client program by use of a common CFS cache.
6. The data processing system as recited in claim 5 wherein bind requests are sent by a client program to the CFS, and wherein the CFS uses the common CFS cache to cache results of bind operations so that each new bind request is checked to
determine if a binding already exists before transmitting the bind request to a remote computer.
7. The data processing system as recited in claim 6 wherein the CFS uses the common CFS cache to cache attributes of the at least one file.
8. The data processing system as recited in claim 7 further comprising a file program in a file server located on the at least one remote computer and a communications link between said file program and the CFS for supplying attribute data of
the at least one file to the CFS.
9. The data processing system as recited in claim 8 wherein the file program in the file server located on the at least one remote computer supplies new attribute data from the at least one file to the common CFS cache to maintain coherent file
attribute data, thereby permitting the CFS to control "set length" and "write" operations to said at least one file.
10. The data processing system as recited in claim 8 further comprising a cache program in the file server and a communications link between said cache program and a pager program in the VMM for supplying file data to and from the at least one
file to the VMM, in response to page-in/page-out from the pager program in said VMM, whereby said VMM services requests from CFS programs for data from the at least one file.
11. The data processing system as recited in claim 10 wherein said programs are object oriented programs.
12. A data processing system having one or more file systems, one or more computers, each computer having a processor, a memory, program instructions in said memory, and a virtual memory manager ("VMM"), said data processing system comprising:
a. at least one file in the one or more file systems;
b. a microkernel operating system in a local one of said one or more computers, said microkernel operating system using an extensible file system wherein new file systems can be added by a user;
c. said VMM comprising mechanisms for caching data from said at least one file required by a client in a local computer, said mechanisms comprising memory in said local computer and devices for managing caches and said memory and supplying said
client with data from said at least one file; and
d. a caching file server ("CFS") in said local computer which provides caching services for the at least one file, said caching services comprising mechanisms for caching specific data within said CFS if said specific data is not cached by said
VMM and mechanisms for using existing cache in said VMM for caching other data where said VMM is able to cache said other data thereby preventing any duplication of data caching in said CFS and said VMM, said CFS comprising a first file program in the
CFS which recognizes requests from a first client program, to read first designated data which originates in the at least one file and which is passed to said first client through said cache in said VMM and to write second designated data which
originates in said first client program and is ultimately written to said at least one file having passed through said cache in said VMM, said CFS coordinating the caching of said first and second designated data to the at least one file with the VMM
when said VMM can cache said first and second designated data, and wherein said CFS recognizes requests to query/set attributes of the at least one file and caches said file attributes in the CFS when said attributes are not cached by said VMM, said
attributes of said at least one file comprising data which resides in said at least one file and which is accessible by said first client program,
whereby the caching services provided by the CFS and the VMM for the first client program need not be duplicated and disk input/output operations can be reduced.
13. The data processing system as recited in claim 12 wherein the file program in the CFS recognizes requests from a second client program to read/write data to the at least one file and coordinates the caching of read/write data to the at
least one file with the VMM, and recognizes requests from the second client program to query/set attributes of the at least one file and caches said file attributes in the CFS, whereby the caching services provided by the CFS and the VMM for said first
and second client programs need not be duplicated, and related disk input/output operations can be reduced.
14. The data processing system as recited in claim 13 wherein the CFS services both the first client program and the second client program by use of a common CFS cache.
15. The data processing system as recited in claim 14 wherein the CFS uses the common CFS cache to cache results of bind operations.
16. The data processing system as recited in claim 15 wherein the CFS uses the common CFS cache to cache attributes of the at least one file.
17. The data processing system as recited in claim 16 further comprising a file program in a file server containing the at least one file and a communications link between said file program and the common CFS cache for supplying attribute data
of the at least one file to the CFS.
18. The data processing system as recited in claim 17 wherein the file program in the file server supplies new attribute data from the at least one file to the common CFS cache to maintain coherent file attribute data.
19. The data processing system as recited in claim 18 further comprising a cache program in the file server and a communications link between said cache program and a pager program in the VMM, for supplying file data to and from the at least one
file, in response to page-in/page-out from the VMM, whereby said VMM services requests for data from the at least one file.
20. The data processing system as recited in claim 19 wherein said programs are object oriented programs.
21. A method, performed by a computer having a processor, a memory, a computer program residing in said memory, of accessing a remote file from a local computer in a data processing system having one or more file systems and connected at said
local computer to at least one remote computer by a communications link, said method comprising:
a. using a microkernel operating system in said local computer;
b. using a virtual memory manager ("VMM") in said local computer outside of said microkernel operating system, comprising mechanisms for caching data from said one or more file systems required by a client in said local computer, said mechanisms
comprising memory in said local computer and devices for managing said cache and said memory and supplying said client with data from said one or more file systems;
c. establishing a caching file server ("CFS") in said local computer which provides caching services for at least one file, said caching services comprising mechanisms for caching specific data within said CFS if said specific data is not cached
by said VMM, and mechanisms for using existing caches in said VMM for caching other data where said VMM is able to cache said other data thereby preventing any duplication of data caching in said CFS and said VMM; and
d. establishing a file program in the CFS which recognizes requests from a first client program to read first designated data which originates in the at least one file and which is passed to said first client through said cache in said VMM and to
write second designated data which originates in said first client program and is ultimately written to said at least one file having passed through said cache in said VMM, said CFS coordinating the caching of said first and second designated data to the
at least one file with the VMM when said VMM can cache said first and second designated data, and wherein said CFS recognizes requests to query/set attributes of the at least one file and caches said file attributes in the CFS when said attributes are
not cached by said VMM, said attributes of said at least one file comprising data which resides in said at least one file and which is accessible by said first client program,
whereby the caching services provided by the CFS and the VMM for the first client program need not be duplicated and related network communications traffic between the local and remote computers can be minimized.
22. The method as recited in claim 21 further comprising the additional steps of recognizing requests from a second client program to read/write data to the at least one file and coordinating the caching of read/write data to the at least one
file with the VMM in the local computer, and recognizing requests from the second client program to query/set attributes of the at least one file and caching said file attributes in the CFS, whereby the caching services provided by the CFS and the VMM
for said first and second client programs need not be duplicated and related network communications traffic between the local and remote computers can be minimized.
23. The method as recited in claim 22 wherein the CFS services both the first client program and the second client program by use of a common CFS cache.
24. The method as recited in claim 23 wherein the CFS uses the common CFS cache to cache results of bind operations.
25. The method as recited in claim 24 wherein the CFS uses the common CFS cache to cache attributes of the at least one file.
26. The method as recited in claim 25 further comprising the steps of establishing a communications link between a file program in a file server and the common CFS cache for supplying attribute data of the at least one file to the common CFS
cache.
27. The method as recited in claim 26 wherein the file program in the file server supplies new attribute data from the at least one file to the common CFS cache to maintain coherent file attribute data.
28. The method as recited in claim 27 further comprising a step of establishing a cache program in the file server and a communications link between the cache program and a pager program in the VMM, said VMM being located on the local computer,
for supplying file data to and from the at least one file, in response to page-in/page-out operations from the pager program in said VMM, whereby said VMM services requests from CFS programs for data from the at least one file.
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates in general to the fields of Computer Operating Systems, Multi-processor Hardware Systems, Object Oriented Programming, and Virtual Memory Systems. In particular, the invention relates to improved techniques for
establishing and efficiently handling relationships between a client program, a file system server and a virtual memory manager ("VMM").
2. Background
Object oriented operating systems, with microkernels which permit client level implementation of file systems, create complexities in memory management which clients have not had to deal with in the past. Moreover, on widely distributed computer
networks, having files resident on different computers, clients are not prepared to handle the fact that file accesses may produce unnecessary network communications traffic.
This disclosure describes some of these inefficiencies that exist in the prior art and provides a method and apparatus for significantly reducing them.
The role of the operating system in a computer has traditionally been to efficiently manage the hardware resources (the central processing unit ("CPU"), memory and input/output devices). This management function has included the role of managing
the file system, which comprises data and programs stored generally on a disk drive or magnetic tape system. In modern systems, this management function has included the use of a virtual memory subsystem. More specifically, the operating system has
traditionally been responsible for: creating and deleting files and directories; providing support for primitive program routines for manipulating files and directories; mapping files onto disk storage; and general protection of the files by limiting
access of programs, processes and users to the files.
Distributed computer systems, some with shared memory and some with remotely accessible file systems, have led to the creation of distributed file systems ("DFS") to support the sharing of files by multiple users when the files are physically
dispersed among the various computers of a distributed system. A DFS is a file system whose clients, servers and storage devices are dispersed among the machines of a distributed system. The location and multiplicity of the servers and storage devices
is transparent to the client.
In a DFS the time to satisfy a client's request is a function of: disk access time and a small amount of associated CPU time; the time needed to deliver the request to a server; the time for getting the response back across the network to the
client; the actual data transfer time; and the related CPU overhead for running the communications protocol software. Access times for DFSs have been shortened by the use of caching techniques on local machines to minimize the overall remote file
accessing times. Nevertheless, minimization of the access times for retrieval of data from remote files remains a major goal of computer hardware and software designers. For additional information on operating systems, file systems and related
problems, see the text "Operating System Concepts", 3rd edition, by A. Silberschatz, J. Peterson and P. Galvin, 1991, Addison-Wesley Publishing Inc.
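By way of illustration only, the components of the request time listed above may be summed as in the following Python sketch. The class name, the function name and all numeric values are hypothetical and are not taken from any measured system; the sketch merely shows how local caching lowers the average request time by avoiding the remote components.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteRequestCost:
    """Cost components of one remote request, in milliseconds (illustrative)."""
    disk_access: float        # disk access plus its small associated CPU time
    request_delivery: float   # time to deliver the request to the server
    response_delivery: float  # time to return the response to the client
    data_transfer: float      # actual data transfer time
    protocol_cpu: float       # CPU overhead of the communications protocol

    def total(self) -> float:
        """Sum of all components for a request served by the remote machine."""
        return (self.disk_access + self.request_delivery
                + self.response_delivery + self.data_transfer
                + self.protocol_cpu)


def expected_request_time(cost: RemoteRequestCost, hit_ratio: float,
                          local_hit_time: float) -> float:
    """Average time per request when a fraction hit_ratio is served locally."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * local_hit_time + (1.0 - hit_ratio) * cost.total()
```

Under these assumed figures, a request that always goes remote costs the full sum, while every request satisfied from a local cache replaces that sum with the (much smaller) local hit time.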
With the advent of microkernel operating systems, file systems are being implemented outside of the kernel in user level servers. These new file systems must solve a new set of problems to provide efficient performance. For example, the
following describes a number of areas where caching of file data and file attributes would provide significant efficiency improvements in either memory usage or in network accesses in such user level file system implementations.
The MACH operating system, developed by Carnegie Mellon University, is an object oriented operating system with a minimum sized extensible kernel, based upon communications facilities. All requests to the kernel, and all data movement among
processes are handled through one communications mechanism. In MACH, many traditionally kernel-based functions, such as file services, can be implemented as user-level servers. Other modern operating systems like MACH are being developed.
MACH presently implements virtual memory techniques in an object oriented system. In MACH, the use of a memory object both to encapsulate the mapping and attributes of a memory space and to control the communications to a memory cache object and
the related paging operations can result in inefficient use of the physical memory space used by the cache, as well as unnecessary paging operations, where two or more programs, tasks, processes or threads (hereinafter "programs"), each
having access rights, are using the same memory space. These inefficiencies result from the fact that MACH creates a separate related memory cache object each time a new memory object with different access rights is mapped, without regard for the
possibility that the same data is already being paged by another memory object-memory cache object pair. For example, if a client wishes to access a file, the client sets up a file system object and memory maps it; MACH then creates a memory object and
related cache and communications port to accept messages for the memory object to get data from and write data to the file. If a second client with a different access mode wishes to access the same file, MACH sets up a second memory object, related
data cache and communications port. This is a redundant and inefficient use of scarce memory resources as well as a duplication of the system overhead when the file is located on a remote machine. For more detailed information on MACH, see
"Exporting a User Interface to Memory Management from a Communications-Oriented Operating System" by Michael Wayne Young, Doctoral Thesis for Carnegie Mellon University, November 1989, CMU-CS-89-202.
In the prior art, an additional problem exists with file objects besides the fact that a duplicate MACH memory object with a duplicate cache is set up if a second user with a different access mode memory-maps the same file object. This is the
problem created by the fact that all requests on the file object go to the same location; that is, to the location containing the implementor of the file object. Going to the same location for all requests is inefficient when the file data resides on a
remote host. This is especially true when two client programs wish to access the same file and at least one of them wishes to access the file attributes. There are two possible ways to approach this problem, but each has its own performance
problems:
Case 1) Implementor of the file is on a remote machine. Referring to FIG. 1, a first client 12 and a second client 14 are on local node 10 as is a VMM 16. Remote node 24 contains a file server 26 with a file object 28 and a connected file
storage system 25. The first client 12 has mapped the file object 28 using the VMM 16 to access file data through the memory object port 18 to cache object 30 connection. The second client 14 wishes to access the same file but wants to "query/set
attributes". Since the VMM 16 cannot cache this data, the second client 14 must access the remote file object 28 directly without the benefit of caching. In this case all requests to the file are remote. Whereas this may not be a problem for
page-in/page-out requests from the VMM 16 for the first client 12 (because the VMM can cache the data locally), all read/write requests as well as attribute query/set requests by the second client 14 must also go to the remote implementor of the file
with no possibility of caching the data or attributes. Note that the file data requested by the second client's direct request may even be already located in the cache controlled by the local VMM 16 but there is no way to know this or to share this
information.
Obviously it would be more efficient if both clients could access a common cache locally, both in terms of memory usage and in reduced network accesses. Moreover, if the attributes could also be cached locally, even more network accesses would
be saved.
Case 2) Implementor of the file is on the local machine, but the data is on a remote machine. Referring now to FIG. 2, again local node 10 contains the first client 12, and the second client 14, and the VMM 16. But in this case the local node
also contains a file server 40 which contains the file object 44 for file 46 located on the remote node 24. The first client 12 invokes its data read/write requests to the file object 44 on the same node 10. The second client 14 invokes commands on the
file object 44 as well. The file object 44, the implementor of the file, has the ability to cache file data and attributes so it could satisfy any requests from its local cache. However, all page-in/page-out operations by the VMM 16 bear an added cost
of going indirectly through the local implementor (the file object 44) to the remote node 24 that holds the data, instead of going directly to the remote node 24 for the data.
Access time could be saved if the VMM could access the remote file directly, and could cache the data and attributes for all local clients that wanted access to the file.
Other kinds of file operations that would benefit from caching include:
Mapping. Each time that a client domain maps a file into its address space, a bind call must go to the file server. If the file is remote, this requires a network access on each map call. If the result of binds could be cached locally, then
many of these network accesses could be eliminated.
Getting file attributes. If the file is remote, several opportunities for saving network accesses are available by caching the file length. In addition, all of a file's attributes are returned via the stat call. In current UNIX® systems,
stat calls are very frequent. This is yet another area where caching can be effective. (UNIX is a registered trademark of UNIX Systems Laboratories Inc.).
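The saving available from caching bind results and file attributes locally may be illustrated by the following Python sketch. It is a toy model under stated assumptions: RemoteFileServer and LocalCache are hypothetical names introduced here, the "pager:" strings merely stand in for whatever a real bind call returns, and no invalidation is modeled.

```python
from typing import Dict


class RemoteFileServer:
    """Hypothetical stand-in for a file server reached over the network."""

    def __init__(self, attributes: Dict[str, Dict[str, int]]) -> None:
        self._attributes = {name: dict(a) for name, a in attributes.items()}
        self.network_accesses = 0  # every call below counts as one access

    def bind(self, name: str) -> str:
        self.network_accesses += 1
        return f"pager:{name}"  # placeholder for the result of a bind

    def stat(self, name: str) -> Dict[str, int]:
        self.network_accesses += 1
        return dict(self._attributes[name])


class LocalCache:
    """Remembers bind results and attributes so repeat calls stay local."""

    def __init__(self, server: RemoteFileServer) -> None:
        self._server = server
        self._binds: Dict[str, str] = {}
        self._attrs: Dict[str, Dict[str, int]] = {}

    def bind(self, name: str) -> str:
        if name not in self._binds:          # first map call goes remote
            self._binds[name] = self._server.bind(name)
        return self._binds[name]

    def stat(self, name: str) -> Dict[str, int]:
        if name not in self._attrs:          # first stat call goes remote
            self._attrs[name] = self._server.stat(name)
        return dict(self._attrs[name])       # copy so callers cannot mutate


server = RemoteFileServer({"a.txt": {"length": 42}})
cache = LocalCache(server)
for _ in range(3):
    cache.bind("a.txt")
    cache.stat("a.txt")
```

In this model, three map calls and three stat calls on the same file cost two network accesses in total rather than six.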
Obviously, if the implementor, client file server and the file system are all on the same machine, there is no network overhead involved, although the inefficiencies of possibly duplicated data storage remain. However, cases 1 and 2 are the more
common in a DFS. These problems create unnecessary network traffic and redundant use of memory for multiple caching of the same data.
SUMMARY OF THE INVENTION
The present invention provides a method and apparatus for solving these problems. Network communications overhead is reduced and redundant data caching for files is eliminated. This is done by establishing a Caching File Server ("CFS") on the
same machine as the end user client that wants to access the file. The CFS becomes the implementor of the file object and establishes a cache for the file attributes as well as a private communications channel to the remote file server to retrieve or
set attributes whenever required. The CFS also establishes a memory mapping for the file data which makes use of the VMM on that machine and its caching abilities for the data. The VMM establishes its own memory object link to a pager object on the
remote node containing the file server for doing its page-in/page-out operations. This is done by restructuring the file object into a CFS file object and a cache_object-pager_object pair, and implementing the CFS file object locally in
such a way that all file data and attributes are cached locally. This also improves the efficiency of paged data space usage and correspondingly reduces the processing time needed to support paged data operations by allowing the VMM and end user clients
to share the same pages of data via different CFS file objects, even though each CFS file object may have different access rights to the data. Thus the present invention elegantly minimizes the overall network overhead for file operations in a DFS by
allowing some operations to go remote (the normal VMM controlled page-in/page-outs) and some operations to go local (the get attributes and bind operations).
According to the invention, this is accomplished by establishing in a local computer node a caching file server ("CFS") which provides caching services for remote files, and by creating in this CFS a CFS file program which can recognize requests
from a local client to read/write data to a remote file and can coordinate the caching of the remote file data with a virtual memory manager ("VMM") in the local computer, and which can recognize requests to query/set attributes of the remote file and
can cache these file attributes in the CFS, whereby the caching services provided by the CFS and the VMM for the remote file are not duplicated and related network communications traffic between the local and remote computers is minimized. In addition
the CFS has the ability to accommodate a second local client which wishes to access the same remote file, providing a CFS file program to service this second client which can share the read/write data cache in the VMM and share the file attribute data in
a common CFS cache, so that again there is no redundant data caching and network traffic is minimized. Moreover, the CFS has the ability to cache the result of "bind" operations which establish the VMM-Pager linkage, in order to further reduce network
accesses when a subsequent request for an already established linkage appears. The invention also provides mechanisms for maintaining coherent data in the file attribute cache while minimizing the related network traffic.
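The division of labor summarized above may be sketched, for purposes of explanation only, in Python. The sketch is not the SPRING implementation: PageCache is a hypothetical stand-in for the VMM, the callables page_in and fetch_attrs stand in for the remote pager and the remote file server, access rights are recorded but not enforced, and writes are omitted for brevity.

```python
from typing import Callable, Dict, Tuple


class PageCache:
    """Hypothetical stand-in for the VMM: one cache of file pages per node."""

    def __init__(self, page_in: Callable[[str, int], bytes]) -> None:
        self._page_in = page_in
        self._pages: Dict[Tuple[str, int], bytes] = {}
        self.page_ins = 0

    def read(self, file_id: str, page_no: int) -> bytes:
        key = (file_id, page_no)
        if key not in self._pages:           # page-in goes to the remote pager
            self.page_ins += 1
            self._pages[key] = self._page_in(file_id, page_no)
        return self._pages[key]


class CachingFileServer:
    """Caches attributes itself and leaves file data to the VMM."""

    def __init__(self, vmm: PageCache,
                 fetch_attrs: Callable[[str], Dict[str, int]]) -> None:
        self._vmm = vmm
        self._fetch_attrs = fetch_attrs
        self._attrs: Dict[str, Dict[str, int]] = {}
        self.attr_fetches = 0

    def open(self, file_id: str, access: str) -> "CFSFile":
        return CFSFile(self, file_id, access)

    def read(self, file_id: str, page_no: int) -> bytes:
        return self._vmm.read(file_id, page_no)   # data is never cached here

    def get_attributes(self, file_id: str) -> Dict[str, int]:
        if file_id not in self._attrs:            # common attribute cache
            self.attr_fetches += 1
            self._attrs[file_id] = dict(self._fetch_attrs(file_id))
        return dict(self._attrs[file_id])

    def attributes_changed(self, file_id: str, new: Dict[str, int]) -> None:
        """Invoked on behalf of the remote server to keep attributes coherent."""
        self._attrs[file_id] = dict(new)


class CFSFile:
    """Per-client file object; several may share the same cached pages."""

    def __init__(self, cfs: CachingFileServer, file_id: str, access: str) -> None:
        self._cfs, self._file_id, self.access = cfs, file_id, access

    def read(self, page_no: int) -> bytes:
        return self._cfs.read(self._file_id, page_no)

    def get_attributes(self) -> Dict[str, int]:
        return self._cfs.get_attributes(self._file_id)
```

In this arrangement two clients holding CFS file objects with different access rights on the same file cause a single page-in and a single attribute fetch, and a later attribute update pushed from the remote side is visible to both without a further fetch.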
DESCRIPTION OF THE DRAWINGS
The objects, features and advantages of the system of the present invention will be apparent from the following description in which:
FIG. 1 illustrates a Case 1 prior art caching configuration.
FIG. 2 illustrates a Case 2 prior art caching configuration.
FIG. 3 illustrates the major system components on a SPRING Operating System node.
FIG. 4 illustrates the general relationship of the Caching File Server ("CFS") to other major system components.
FIG. 5 illustrates the major components of the CFS and their relationship to other major components.
FIG. 6 illustrates the relationship of multiple client domains to the CFS environment.
NOTATIONS AND NOMENCLATURE
The detailed descriptions which follow may be presented in terms of program procedures executed on a computer or network of computers. These procedural descriptions and representations are the means used by those skilled in the art to most
effectively convey the substance of their work to others skilled in the art.
A procedure is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. These steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these
quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It proves convenient at times, principally for reasons of common usage, to refer to these signals as bits,
values, elements, symbols, characters, terms, numbers, or the like. It should be noted, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these
quantities.
Further, the manipulations performed are often referred to in terms, such as adding or comparing, which are commonly associated with mental operations performed by a human operator. No such capability of a human operator is necessary, or
desirable in most cases, in any of the operations described herein which form part of the present invention; the operations are machine operations. Useful machines for performing the operations of the present invention include general purpose digital
computers or similar devices.
The present invention also relates to apparatus for performing these operations. This apparatus may be specially constructed for the required purposes or it may comprise a general purpose computer as selectively activated or reconfigured by a
computer program stored in the computer. The procedures presented herein are not inherently related to a particular computer or other apparatus. Various general purpose machines may be used with programs written in accordance with the teachings herein,
or it may prove more convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these machines will appear from the description given.
DESCRIPTION OF THE PREFERRED EMBODIMENT
The following disclosure describes solutions to the problems which are encountered by client programs which reference file systems in an extensible microkernel operating system. A method and an apparatus are disclosed for a Caching File Server
("CFS"), which allow client programs to have the ability to access file systems in a distributed computer system, using caching techniques, and using virtual memory management techniques in an efficient way and in a trusted environment, which is
especially adapted for systems using object oriented programs. In the following description, for purposes of explanation, specific data and configurations are set forth in order to provide a thorough understanding of the present invention. The
preferred embodiment described herein is implemented as a portion of the SPRING Object-Oriented Operating System created by Sun Microsystems®, Inc. (Sun Microsystems is a registered trademark of Sun Microsystems, Inc.). However, it will be apparent
to one skilled in the art that the present invention may be practiced without these specific details and may be implemented in various computer systems, in various configurations, makes, or models of tightly-coupled processors, or in various
configurations of loosely-coupled multiprocessor systems.
The SPRING Operating System
The present invention is currently embodied in the Sun Microsystems, Inc. SPRING operating system ("SPRING"), a distributed operating system designed around a microkernel architecture, and a cache-coherent, network virtual memory system. SPRING
has no user-visible kernel calls. The interfaces for services traditionally implemented as kernel calls are specified in an object-oriented interface definition language that supports multiple inheritance, similar to IDL as defined by the Object
Management Group (OMG). A client application program can request the creation of an object, invoke methods from the interface of an object, and pass an object as a parameter to other invocations, without regard to the location of the object's
implementation. Invocations of methods on an object are made via client-side stubs that perform remote procedure calls, if necessary, to the server-side implementation. The stubs are compiled independently of any application and can be dynamically
linked to any client that needs them. Accordingly, in this environment, a file system may be physically located on any one of a number of distributed machines and a given client program would have no prior knowledge of the location of the file or
efficient ways to perform data reads/writes or to query/set attributes.
A SPRING object is an abstraction that contains state and provides a set of methods to manipulate that state. The description of the object and its methods is an interface that is specified in the interface definition language. The interface is
a strongly-typed contract between the implementor (server) and the client of the object.
A SPRING domain is an address space with a collection of threads. A given domain may act as the server of some objects and the client of other objects. The implementor and the client can be in the same domain or in different domains.
Since SPRING is object-oriented, it supports the notion of interface inheritance, both single and multiple. An interface that accepts an object of type "foo" will also accept an instance of a
subclass of "foo". For example, the address_space object has a method that takes a memory_object and maps it in the address space. The same method will also accept file and frame_buffer objects as long as they inherit from the
memory_object interface.
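The inheritance rule described above can be illustrated with a minimal sketch. It is written in Python rather than the SPRING interface definition language, and the class names and the return value of map() are illustrative assumptions, not the SPRING interfaces themselves.

```python
class MemoryObject:
    """Hypothetical stand-in for the memory_object interface."""
    def __init__(self, name):
        self.name = name


class File(MemoryObject):
    """A file inherits from memory_object, so it can be mapped."""


class FrameBuffer(MemoryObject):
    """A frame buffer also inherits from memory_object."""


class AddressSpace:
    """Hypothetical stand-in for the address_space object."""
    def __init__(self):
        self.mappings = []

    def map(self, obj):
        # Accept any object whose type inherits from the memory_object interface.
        if not isinstance(obj, MemoryObject):
            raise TypeError("map() requires a memory_object")
        self.mappings.append(obj)
        return len(self.mappings) - 1  # illustrative mapping index


space = AddressSpace()
space.map(File("report.txt"))
space.map(FrameBuffer("fb0"))
```

The same map method serves both subclasses without knowing their concrete types, which is the property the specification relies on.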
The SPRING kernel supports basic cross domain invocations and threads, low-level machine-dependent handling, as well as basic virtual memory support for memory mapping and physical memory management. A SPRING kernel does not know about other
SPRING kernels; all remote invocations are handled by a network proxy server. In addition, the virtual memory system depends on external pagers to handle storage and network coherency.
Referring to FIG. 3, a typical SPRING node runs several servers in addition to the kernel 50. These include the domain manager 52; the virtual memory manager ("VMM") 54; a name server 56; the CFS file server 58 described in this disclosure; a
local file server 60; a linker domain 62 that is responsible for managing and caching dynamically linked libraries; a network proxy 64 that handles remote invocations; and a tty server 66 that provides basic terminal handling as well as frame-buffer and
mouse support. Other major SPRING system components which might be present are a UNIX process server 68 and any number of SPRING applications 70.
THE CACHING FILE SERVER ("CFS")
The CFS is a per-machine caching file server that uses the virtual memory system to provide caching of data for read and write operations on files, and that has its own private protocol with the remote file servers to cache file attributes. When
the CFS starts up on a machine it registers itself with the naming server on that machine.
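As a rough illustration of this start-up registration, the following Python sketch models a per-machine name server as a simple table. The class names, the bind/resolve method names, and the name "cfs" are assumptions made for the example and are not taken from the SPRING naming interfaces.

```python
class NameServer:
    """Hypothetical per-machine name server mapping names to objects."""
    def __init__(self):
        self._bindings = {}

    def bind(self, name, obj):
        self._bindings[name] = obj

    def resolve(self, name):
        return self._bindings[name]


class CachingFileServer:
    """Hypothetical CFS that registers itself when it starts up."""
    def __init__(self, name_server, cacher_name):
        # Register at start-up so local clients can find this cacher by name.
        name_server.bind(cacher_name, self)


names = NameServer()
cfs = CachingFileServer(names, "cfs")  # "cfs" is an illustrative name
found = names.resolve("cfs")           # a local client looks the cacher up by name
```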
Referring to FIG. 4, the environment of the preferred embodiment is depicted. Illustrated are a local computer node 222 and a remote computer node 204. The local node 222 contains at least a client domain 220, a naming service 224, a virtual
memory manager ("VMM") 230 and a caching file server ("CFS") 218. The remote node 204 is shown with a file server 206 and an attached disk unit 202 containing at least one file 200.
In the local node 222, the client domain 220 is shown containing a file object 211 which was created by the file server 206 for operations on file 200. The CFS 218 is shown containing a cachable_file object 229 used to forward methods
invoked on the file object 211 to the file server 206. The CFS 218 contains caches 270 for attributes and bind results, and a file_cacher object 241/fs_cache object 251 connection 244 for keeping the caches 270 coherent. Also shown is the
direct pager object 234 to fs_cache object 251 connection 253 whereby the VMM 230 in the local node 222 maintains its file data cache for page-in/page-out operations on the file 200.
The CFS permits end user client programs on a local node of a distributed computer system to issue requests to read/write data to a remote file and to query/set attributes of the remote file, and to have these requests serviced by the CFS in a
way which minimizes both the caching resources used and the related network communications. The CFS establishes CFS file objects to interface with the client programs, and sets up a common CFS cache for the file attributes. This cache is
kept current via a communications link to a file program in a file server at the remote node containing the file, so that the file program in the file server at the remote node and the file attributes in the common CFS cache are maintained coherent. In
addition, the CFS coordinates all client program requests for read/write data with a virtual memory manager ("VMM") on the local node, servicing all client programs from a single data cache in the VMM, if possible. A cache of the results of previous bind()
operations is maintained by the CFS to facilitate this use of a single data cache for multiple clients. In this manner, network communications and related processing overhead as well as memory resources are minimized by use of the VMM for caching
file data and the common CFS cache for caching file attributes.
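The attribute-caching behavior summarized above can be sketched as follows. This is a simplified Python model under stated assumptions: the remote file server is an in-process object that counts requests, the callback is a plain function, and all class and method names are hypothetical rather than the SPRING interfaces.

```python
class RemoteFileServer:
    """Hypothetical remote file server; counts the requests it receives."""
    def __init__(self):
        self.attrs = {"size": 0}
        self.requests = 0
        self.callbacks = []

    def get_attributes(self, callback):
        self.requests += 1
        if callback not in self.callbacks:
            self.callbacks.append(callback)  # remember whom to notify on change
        return dict(self.attrs)

    def set_size(self, size):
        self.attrs["size"] = size
        for notify in self.callbacks:
            notify()  # tell each caching client its attributes are stale


class AttributeCache:
    """Hypothetical common attribute cache kept by the CFS for one file."""
    def __init__(self, server):
        self.server = server
        self.cached = None

    def invalidate(self):
        self.cached = None

    def get_attributes(self):
        if self.cached is None:  # miss: one network round trip
            self.cached = self.server.get_attributes(self.invalidate)
        return self.cached       # hit: served locally


server = RemoteFileServer()
cache = AttributeCache(server)
cache.get_attributes()
cache.get_attributes()           # second query is served from the cache
requests_before_change = server.requests
server.set_size(4096)            # server calls back, cache is invalidated
size_after_change = cache.get_attributes()["size"]
```

Repeated queries reach the server only once until the server reports a change, which is how the cache stays coherent without a round trip per request.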
Referring to FIG. 5, the operation of the CFS is described. According to the preferred embodiment of the present invention, when a client program 220 running on machine 222 wishes to access a file 200, it proceeds as follows:
1) the client program 220 obtains a file object 211 which represents the file 200.
2) the file object 211 has associated with it an extended cachable handle 210, which contains a back handle 214 which points to the file object manager 245 on the remote node 204, and a cacher name 216 for a cacher that knows how to cache data
for this kind of file object 211.
3) the client program 220 calls upon the local name service 224 to look up the cacher name 216 portion of the extended cachable handle 210 and is given a CFS object 219 on the local machine 222 which points to the CFS 218 which can cache the data
and attributes for the target file 200.
4) the client program 220 then requests that the local CFS 218 set up the necessary caching procedure for the file object 211, by calling the CFS object 219 and giving it a copy of file object 211 (which contains the extended cachable handle
210). The CFS object 219 returns a CFS file object 236 which will service requests for file data and attributes from the file 200. The CFS file object 236 is placed in the front handle 212 of the file object's extended cachable handle 210. (The
details and operation of this extended cachable handle are described in more detail in co-pending application Ser. No. 07/858,788 filed by Graham Hamilton and Michael N. Nelson for A Method and Apparatus for Portable Object Handles that Use Local
Caches, which is hereby incorporated herein by reference.)
5) when the client program 220 tries to read the file 200, using the file object 211, the CFS file object 236 in the front handle 212 calls the CFS file object manager 221 which invokes the map method on its address_space object 225. The
address_space object 225 is implemented by the address_space object manager 227 in the local VMM 230. The VMM 230 invokes a bind() method on the file object 211. Since calls on file object 211 are directed to the CFS file object 236 by
the front handle in the file object 211, the bind() call goes back to the CFS file object manager 221. The CFS file object manager 221 checks its cache of bind results 237 to see if a cache for this file is already in existence. If so, a pointer to
the cache to use is returned to the VMM 230. If no cache is already available, one of two actions is taken:
1) If the file_cacher-fs_cache_object connection 244 is already established, then the bind is forwarded to the file server 206 by invoking the cached_bind method on the file_cacher object 241. The file server
returns a pager object 234; or
2) If the file_cacher-fs_cache_object connection 244 is not established, then the bind is forwarded to the file server 206 by invoking the cached_bind_init method on the cachable_file object 229, passing
in an fs_cache_object 232 implemented by the CFS 218 that the file server 206 can use to tell the CFS 218 when the VM cache object 223 is no longer valid. The file server returns a file_cacher object 241 and a pager object 234.
In either case, the CFS 218 then uses the pager object 234 to create a cache object 223 at the VMM 230. A pointer to the cache to use is then returned to the VMM 230 by the CFS 218. (The exact details of how this mapping and bind operation are performed are
described in the co-pending application Ser. No. 07/904,226 filed by Yousef A. Khalidi and Michael N. Nelson for A Method and Apparatus for a Secure Protocol for Virtual Memory Managers that use Memory Objects, which is hereby incorporated herein by
reference.)
This mapping operation establishes:
a) a cachable_file object 229 in the CFS file object manager 221. The cachable_file object 229 is implemented by the cachable file object manager 245 in the file server 206 for use in obtaining file attribute data. A cache 239 for
storing this attribute data is also set up in the CFS domain 218;
b) a pager_object 234 to fs_cache_object 232 pipeline. This pipeline is set up between the pager object manager 247 in the file server 206 and the VMM 230 to automatically handle page in/page out of file data without going
through the CFS; and
c) a cache 237 of the bind results (cache_objects) which are already established, for use in determining whether the CFS needs to set up a new fs_cache object/pager object connection.
6) the client program 220 may then invoke read/write requests or query/set attribute requests on the file object 211. These requests go via the front handle 212 which points to the CFS file object 236 which services the read/write requests via
the data cache 231 in the VMM 230. However, the query/set attribute requests are serviced by the CFS file object manager 221 from its own cache of these attributes 239. If the CFS file object manager 221 has cached the result of the query/set attribute
operation, then the request can be accommodated locally with no network accesses required. Otherwise, the CFS file object manager 221 must contact the remote file server 206 to get the attributes. This requires that the CFS take one of two actions:
1) If the file_cacher-fs_cache_object connection 244 is already established, then the get_attributes is forwarded to the file server 206 by invoking the cached_stat method on the file_cacher object 241.
The file server returns the attributes; or
2) If the file_cacher-fs_cache_object connection 244 is not established, then the get_attributes is forwarded to the file server 206 by invoking the cached_stat_init method on the cachable_file
object 229, passing in an fs_cache_object 232 implemented by the CFS 218 that the file server 206 can use to tell the CFS 218 when any file attributes have changed since the last transmission to the CFS 218. The file server returns a
file_cacher object 241 and the attributes.
In the preferred embodiment, when the client 220 issues a "get attributes" request, this also is directed to the CFS 218 which uses the file_cacher_object 241 to forward the request, and
the parameter list of the "get attributes" request contains a "call back" object (another fs_cache object 232) which the remote file server 206 can use to tell the CFS 218 when any file attributes have changed since the last transmission to the CFS
218.
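The choice between the two forwarding paths in steps 5 and 6 can be sketched in Python as follows. This is a simplified model, assuming one file per state object and an in-process file server that records the calls it receives. The method names cached_bind, cached_bind_init, cached_stat and cached_stat_init follow the text, while every class name, the create_cache method, and the data layout are illustrative assumptions.

```python
class Pager:
    """Hypothetical pager object returned by the file server."""


class FileCacher:
    """Hypothetical file_cacher object; it exists once the connection is set up."""
    def __init__(self, server):
        self.server = server

    def cached_bind(self):
        self.server.calls.append("cached_bind")
        return Pager()

    def cached_stat(self):
        self.server.calls.append("cached_stat")
        return dict(self.server.attrs)


class FileServer:
    """Hypothetical remote file server for one file; records calls received."""
    def __init__(self):
        self.attrs = {"size": 1024}
        self.calls = []

    def cached_bind_init(self, fs_cache):
        self.calls.append("cached_bind_init")
        return FileCacher(self), Pager()

    def cached_stat_init(self, fs_cache):
        self.calls.append("cached_stat_init")
        return FileCacher(self), dict(self.attrs)


class VMM:
    """Hypothetical VMM that creates one data cache per pager."""
    def __init__(self):
        self.caches = []

    def create_cache(self, pager):
        cache = {"pager": pager, "pages": {}}
        self.caches.append(cache)
        return cache


class CFSFileState:
    """Per-file state kept by the CFS in this sketch."""
    def __init__(self, server, vmm):
        self.server, self.vmm = server, vmm
        self.file_cacher = None   # set once the connection is established
        self.bind_result = None   # cached result of a previous bind
        self.attrs = None         # cached attributes

    def bind(self):
        if self.bind_result is not None:
            return self.bind_result                   # bind cache hit
        if self.file_cacher is not None:
            pager = self.file_cacher.cached_bind()    # connection exists
        else:
            self.file_cacher, pager = self.server.cached_bind_init(fs_cache=self)
        self.bind_result = self.vmm.create_cache(pager)
        return self.bind_result

    def get_attributes(self):
        if self.attrs is None:
            if self.file_cacher is not None:
                self.attrs = self.file_cacher.cached_stat()
            else:
                self.file_cacher, self.attrs = self.server.cached_stat_init(fs_cache=self)
        return self.attrs


server, vmm = FileServer(), VMM()
state = CFSFileState(server, vmm)
first, second = state.bind(), state.bind()  # second bind is a cache hit
state.get_attributes()                      # connection exists: cached_stat
state.get_attributes()                      # served from the attribute cache
```

In this run the server sees one cached_bind_init and one cached_stat, and the VMM holds a single data cache. If the attributes were queried before any bind, the order would instead be cached_stat_init followed by cached_bind.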
In the present invention, the CFS can service several client programs which request information from the same file without creating duplicate caches or incurring redundant network overhead. For example, two client programs wishing to access the same
remote file can either use the same CFS file object or they can use different CFS file objects. If two client programs independently ask the CFS to cache a remote file, then the two client programs will have different CFS file objects implemented by the
CFS. However, if one client program sets up the cached file with the CFS and then passes the CFS file object to another client, the two client programs will use the same CFS file object. In either case, the two client programs will share the same
remote file object because the CFS will make sure the clients' CFS file objects point to the same underlying cached state.
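A minimal Python sketch of this sharing arrangement is given below. It assumes the CFS can identify a remote file by some key, here a hypothetical string identifier, and keeps one shared state record per file. The class names and the cache_file method are illustrative and are not taken from the specification.

```python
class SharedFileState:
    """Hypothetical per-file state: attribute cache and bind results."""
    def __init__(self, file_id):
        self.file_id = file_id
        self.attributes = {}


class CFSFileObject:
    """Hypothetical CFS file object handed to one client."""
    def __init__(self, state):
        self.state = state


class CachingFileServer:
    def __init__(self):
        self._states = {}  # one shared state record per remote file

    def cache_file(self, file_id):
        state = self._states.get(file_id)
        if state is None:
            state = SharedFileState(file_id)
            self._states[file_id] = state
        # Each independent request gets its own CFS file object,
        # but all objects for the same file point at the same state.
        return CFSFileObject(state)


cfs = CachingFileServer()
first_client_obj = cfs.cache_file("file-200")
second_client_obj = cfs.cache_file("file-200")
other_file_obj = cfs.cache_file("file-201")
```

Two independent requests yield distinct CFS file objects that refer to one underlying state, so neither the attribute cache nor the data cache is duplicated.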
This servicing of multiple client programs is now illustrated by means of FIG. 6. A first client program 102 asks the CFS 218 to cache remote file 200. The CFS 218 provides a first CFS file object 108 for the first client 102 to use in
processing requests on the file. The CFS 218 establishes a communications link 244 to the file server 206 located on the remote node 204 to obtain the file attributes which the CFS 218 caches in the CFS attribute cache 239. The CFS 218 also establishes
the communications between the CFS file object manager 221 and the VMM 230 which caches the file data in the data cache 231 by means of the pager_object 234/fs_cache_object 232 pipeline. This pipeline results from the map and bind
operations previously described. Pager object 234 is connected to fs_cache object 232 in the pager object manager 247, and also connected via the link between the fs_cache object 232 and the cache object 223 in the fs_cache object
manager 243 in the CFS 218.
When a second client program 118 asks the CFS 218 to cache the same remote file 200, the CFS 218 implements a second CFS file object 122. However, since the CFS 218 has already set up a CFS attribute cache 239 to handle file attributes for the
remote file 200, the second CFS file object 1