|
Description  |
|
|
This application is related to copending U.S. patent application UNIFORM
MEMORY SYSTEM FOR SYMBOLIC COMPUTING, filed concurrently herewith,
application Ser. No. 159,467, titled Uniform Memory System for Symbolic
Computing filed Feb. 19, 1986, assigned to the assignee hereof, the whole
of which is hereby incorporated by reference herein, and to U.S.
application titled Recoverable Virtual Memory Having Resilient Objects,
filed concurrently herewith, application Ser. No. 774,827, titled
Recoverable Virtual Memory, filed on Sept. 11, 1985, assigned to the
assignee hereof, the whole of which is also incorporated by reference
herein.
BACKGROUND OF THE INVENTION
This invention relates to digital computer systems, and specifically to a
virtual memory having recovery capabilities.
In the future, as users of state-of-the-art symbolic computing machines
develop large-scale, knowledge-based applications, they are expected to
encounter major problems arising out of storage management problems in
supporting large and complex knowledge/data bases. The word storage is
used in a broad sense to encompass virtual memory, file systems and
databases. The problems can be primarily attributed to the dichotomy by
which today's computers, including state-of-the-art symbolic computers
such as the Texas Instruments Explorer and the Symbolics 3670, manage
storage along two entirely different organizations. These organizations
can be referred to as the computational storage and the long-term storage.
In symbolic/artificial intelligence (AI) processing, a representation of
knowledge is a combination of data structures and interpretive procedures
that, if used in the right way in a program, will lead to "knowledgeable"
behavior. The goals of AI systems can be described in terms of cognitive
tasks like recognizing objects, answering questions, and manipulating
robotic devices. The most important consideration in formulating a
knowledge representation scheme is the eventual use of the knowledge. The
actual use of the knowlege in symbolic/AI programs involves three stages:
(1) acquiring more knowledge, (2) retrieving facts from the knowledge base
relevant to the problem at hand, and (3) reasoning about these facts in
search of solutions. A number of different knowledge representation
schemes, such as state-space representation, logic, procedural
representation, semantic nets, production systems, and frames, have been
developed by the knowledge representation community. The choice of the
knowledge representation scheme very much depends on the application
requiements.
No matter which knowledge representation scheme is used, at some
sufficiently low level of representation the knowledge is represented by
memory objects interconnected by pointers. These objects exhibit a
structure, which is defined by the interconnection graph of pointers
connecting the objects. The structure of objects created and manipulated
by symbolic/AI applications is usually very rich and complex. Moreover,
both the information in objects, as well as the structure of objects, can
undergo rapid changes.
In symbolic computing, objects representing a knowledge base are created
and manipulated in the computational storage. As its name implies, the
computational storage contains objects to be manipulated by the processor
of a computer system. These objects can be numbers, strings, vectors,
arrays, records, linked lists, instructions, procedures, etc. These
objects, both small and large, are usually identified by names. The names
of objects serve as convenient handles or pointers that can be passed as
procedure parameters, returned as procedure results, and stored in other
objects as components. The names of objects are typically implemented as
their virtual addresses. Programmers create and manipulate objects by
using programming languages, such as LISP and Prolog.
Typically, the computational storage is implemented as virtual memory,
which consists of a hierarchy of memories: a fast, small semiconductor
main memory, backed up by a slow, large disk to support paging. Objects in
the computational storage are accessed very rapidly as the processor can
directly access them by specifying their addresses (real or virtual),
often at a speed that matches the basic processor cycle time. The
information stored in these objects is also processed and manipulated very
efficiently as it is stored in a format defined by the processor
architecture, and can therefore be directly interpreted by the processor
hardware or microcode.
Often, the information stored in the computational storage has a very rich
structure; i.e., objects in the computational storage are interconnected
by a rich and complex structure of pointers to match the requirements of
applications at hand. The structure of these objects is often dynamic.
However, objects in the computational storage do not exist beyond the life
times of programs that create them. When a program terminates or a system
shutdown, or crash occurs, these objects cease to exist. Therefore, they
are called short-lived or transient objects. To make these objects survive
beyond the life times of programs that created them, i.e., to make them
long-lived or persistent, they must be moved to the other storage
organization, i.e., the long-term storage.
As its name implies, the long-term storage is used to keep information for
long periods of time. It is typically implemented on a disk-resident file
system. The disk file system is logically different from the paging disk
of the computational storage, even though the physical disk media may be
shared by both. Examples of information stored in the long-term storage
are files, directories, libraries, and databases. The long-term storage
retains information in a reliable fashion for long periods of time. In
order to store information beyond the life time of a program that creates
it in the computational storage, the information needs to be first mapped
into a representation expected by the long-term storage and then
transferred to it for long-term retention using a file input/output (I/O)
operation or a database operation.
The types of objects supported by the long-term storage are very respective
(essentially files, directories, relations, etc.), and may match with the
data structure requirements of many applications. The representation of
information in the long-term storage is quite "flat." For example, a file
may consist of a sequential stream of bits or bytes, such as ASCII
characters. Files or relations usually can neither hold procedural objects
nor pointers to other objects in the long-term storage. Information in
these objects can neither be directly addressed nor directly processed by
the processor, because its representation is not compatible with the
processor architecture. The information can be processed only after it is
mapped into a representation expected by the computational storage and
then transferred to it for processing. The translation overhead in mapping
these objects to/from a collection of files is quite substantial, too.
In addition to the time overhead for translation and mapping of objects
between the computational and long-term storages, there is additional
space overhead, as the information is essentially duplicated in virtual
memory and the file system. There is an apparent paradox in that the
computional storage, usually implemented as a virtual memory, hides the
existence of the paging disk store; on the other hand, the long-term
storage makes the existence of the disk explicit to the programmer. Thus,
the programmer is faced with a nonuniform storage model, where differences
in addressing, function, and retention characteristics between the
computational and long-term storage are visible above the processor
architecture level.
Programming languages, such as FORTRAN, Pascal, LISP, and Prolog, strongly
reflect the dichotomy in storage organization. The specification of these
languages almost invariably assumes long-term storage objects (files) to
have entirely different characteristics from computational objects. As a
result, these programming languages cannot directly process information in
the long-term storage the way they can process information in the
computational storage. This dichotomy propagates throughout the whole
system and cannot be hidden fromthe user. It shows up in differences
between names used for programming language objects and names used for
files and databases.
The dichotomy also shows up in a different set of languages that has
evolved to process information in the long-term storage. These languages
include various so-called command languages, such as the UNIX shell
language and the IBM TSO Command Language, that are responsible, among
other things, for performing operations on files. The other class of
languages which operate on persistent objects are various database
languages, such as Square, Sequel, and Quel. These languages can define
database objects, and perform queries and updates on them. Typically, such
languages are often interpreted, and are restrictive and arcane in nature
compared to the more familiar programming languages, which also enjoy the
efficiency of compiled execution over interpreted execution.
As a consequence, the programmer must be aware of the nonuniform storage
model, and must explicitly move information among storage media, based on
the addressing mechanisms, functions and retention characteristics
desired. Another consequence is that the nonuniform storage model is an
obstacle to programming generality and modularity as it increases
potential types of interfaces among programs. The hodgepodge of
mode-dependent programming languages, such as command languages,
programming languages, debugging languages, and editing languages, makes
fast and efficient interaction with the system difficult.
The mapping between transient and persistent objects is usually done in
part by the file system or the data base management system (DBMS) and in
part by explicit user translation code which has to be written and
included in each program. This task imposes both space and time penalties,
and degrades system performance. Frequently the programmer is distracted
from his task by the difficulties of understanding the mapping and
managing the additional burden of coping with two disparate worlds: the
programming language world and the DBMS world.
In large data-intensive progams there is usually a considerable amount of
code, which has been estimated to be as high as 30% of the total,
concerned with transferring data between files or a database, and the
computational storage. Much space and time is wasted by code to perform
translations between the transient and persistent object worlds, which has
adverse performance impact. This is unsatisfactory because the effort and
time required to develop and execute the translation code can be
considerable, and also because the quality and reliability of the
application programs may be impaired by the mapping. The storage dichotomy
also gives rise to much duplication of effort in the operating system
design and DBMS design.
These problems, created by the storage dichotomy, are considerably further
complicated for symbolic/AI computing. Processes on current symbolic
machines share a single address space; i.e., there is no per-process
address space. Moreover, the address space is not segmented, but is a
single, linear address space. Such a model of the computational storage
allows easy, efficient and flexible sharing of objects among multiple
processes. Any object can point to any other object by simply holding a
pointer to that object (usually implemented as a virtual address of the
object being pointed to). Arbitrarily complex structures of objects
interconnected by pointers can be created and manipulated. Such powerful
structuring of objects is very important for the development of the highly
integrated and powerful software development environments available on
these symbolic computers.
Unfortunately, current symbolic computers make a distinction between the
computational and long-term storages, similar to today's conventional
computers. In symbolic computers, making a single object persistent by
moving it to a file system is not very meaningful; all objects that can be
reached from an obejct by following all out-going pointers also need to be
made persistent as a single entity, and all in-coming pointers pointing to
the entity must be "properly taken care of." Such an entity, however, can
be very large and moving it to a file system would be a complicated and
expensive operation. Conversely, the reverse move from a file system to
the computational storage would be equally as complicated and expensive.
Many current advanced programming techniques, especially as practiced in th
symbolic/AI community, do not distinguish between procedures and data;
procedures are just data, which are themselves active. As the body of
information being dealt with grows and becomes more active, it becomes
critical that the program environment, consisting of complex objects
interconnected with rich pointer structures, survives for long periods of
time. Mapping and moving of such rich environments into today's file
system or database for long-term retention would involve substantial
translation overhead, both in space and time.
Thus, there is a substantial difference between the representations of
objects in the computational and long-term storages for symbolic/AI
applications. The richer the structure of computational objects, the
greater the difference and the bigger the effort needed to perform
translation between these two representations. Emerging symbolic and AI
applications will employ increasingly sophisticated and complex structures
on a large number of objects on which retrievals, queries, inferences,
reasoning, deductions, and computation will be performed. As can be
anticipated, the overhead to map long-term objects into computational
objects and vice-versa for large knowledge-intensive applications could be
substantial.
The current approach taken by many researchers to facilitate
knowledge-based applications is based on connecting a symbolic computer to
a database machine. This approach is not based on persistent memory, as it
neither addresses the storage dichotomy issues nor deals with the lifetime
or interchangeability of procedure and data issues. There will be a
mismatch between the data model requirements of symbolic/AI applications
and the rigid data models supported by database machines. Therefore, such
approach appears to be inadequate for expert database systems. These
reservations are shared by other researchers in the field.
The persistent memory approach is based on a fundamentally different
foundation. The literature on persistent memory dates back to 1962, when
Kilburn proposed single-level storage, in which all programs and data are
named in a single context. (T. Kilburn, "One Level Storage System", IRE
Trans. Electronic Comput., vol. EC-11, no. 2, April, 1962) Saltzer
proposed a direct-access storage architecture, where there is only a
single context to bind and interpret all objects. (J. H. Salzer, "Naming
and Binding of Objects", in R. Bayer et al, editors, Operating Systems, An
Advanced Course, p. 99, Springer-Verlag, New York, NY, 1978.
Traiger proposed mapping databases into virtual address space. (I. L.
Traiger, "Virtual Memory Management for Database Systems", ACM Operating
Systems Review, pp. 26-48, October, 1982.) It seems that the simple data
modeling requirements of the FORTRAN and COBOL worlds discouraged
productization of these proposals because they are much more difficult to
implement than the conventional virtual memory and database systems.
The MIT MULTICS system and the IBM System/38 have attempted to reduce the
storage dichotomy. However, both have major shortcomings for symbolic
computing; unlike LISP machines, each process has its own address space.
All persistent information is in files. A file mapped into the address
space of a process cannot hold a machine pointer to a file mapped in the
address space of a different process. Thus, sharing of information among
different processes is more difficult than with LISP machines.
Furthermore, there is no automatic garbage collection, which is essential
for supporting symbolic languages.
Recently, many researchers have proposed implementing persistent objects on
top of a file system provided by the host operating system. Though
persistent and transient objects still reside in two separate storage
organizations, persistent objects can be of any general type, such as
number, vector, array, record, or list, and can be manipulated with a
common programming language such as ALGOL or LISP. However, there is a
large overhead to access persistent objects because their pointers must be
dereferenced by software, taking several machine cycles.
Systems having separate computational and long-term storage can easily
recover from a power failure, hardware failure, software error, or the
like, which can be considered as a group as "system crashes". After a
system crash, any hardware problems are repaired. All Data and procedures
which were in the virtual memory at the time of the crash are assumed to
be lost, and the system is restarted by reloading the software from
long-term storage and those items that have been stored in files or a DBMS
are considered to be valid.
A system which implements a larger uniform memory is especialy vulnerable
to system crashes. Because persistent objects are stored in the virtual
memory, they can be corrupted by the crash. The recent changes in a
particular persistent object may or may not be stored on the paging disk.
The current value of large objects may be partially on disk, and partially
in main memory. Thus, the values stored on disk cannot be used to merely
reload and restart the system after a crash because the disk alone may
contain a valid consistent state. Thus, if it is desired to restore a
virtual memory after a crash, prior art file and DBMS systems cannot be
used. It is necessary to devise some mechanism for preserving the state of
the virtual memory.
SUMMARY OF THE INVENTION
Therefor, it is an object of the present invention to provide a virtual
memory which can recover from system crashes caused by power failure,
hardware failures and software errors. It is a further object to provide a
virtual memory which can be restored to an earlier, valid state to
minimize loss of work. It is another object to provide a means for taking
regular checkpoints of the virtual memory to preserve valid states which
can be restored. It is yet another object to provide a recoverable virtual
memory which can be used to implement a uniform memory system suitable for
symbolic computing.
Therefore, in order to accomplish these and other objectives, a virtual
memory system has a random access memory backed up by a disk, the
combination giving a very large virtual address space. In order to provide
for system recovery in case of a power failure, hardware failure or
software error, checkpoints are periodically taken of the state of the
system. These checkpoints are marked and stored on disk. Changes made
between a checkpoint and the next checkpoint are also stored and marked,
but are discarded in the event of a system crash. When there is a system
crash, the system is rolled back to the checkpoint state, and processing
resumes in a normal manner.
This recovery scheme is especially suitable for use with a uniform memory,
in which all memory objects are located in the same address space. A
special object, the persistent root, indicates which objects are to be
retained beyond the lifetime of the program which creates them. Deleted
objects are marked as tombstoned, but are not entirely deleted until it is
ascertained that no references to those objects are outstanding. Such
memory system is especially suitable for use in symbolic computers.
The novel features which characterize the present invention are defined by
the appended claims. The foregoing and other objects and advantages of the
invention will hereinafter appear, and for purposes of illustration, but
not limitation, a preferred embodiment is shown in the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a prior art computer memory system
architecture;
FIG. 2 is a block diagram of a computer memory architecture according to
the present invention;
FIG. 3 is a block diagram of a uniform, persistent memory according to the
present invention;
FIG. 4 is a representation of sibling virtual pages as stored on a paging
disk;
FIG. 5 illustrates how some sibling pages are updated;
FIG. 6 illustrates how other sibling pages are updated;
FIG. 7 illustrates how singleton pages are updated;
FIG. 8 illustrates the process of recovering from a system crash; and
FIG. 9 illustrates a hierarchy of memory abstractions.
DESCRIPTION OF THE PREFERRED EMBODIMENT
The present recovery system for a virtual memory will be described in the
context of its implementation with a uniform memory, which will be
described first. It will be apparent to those skilled in the art that the
present recoverable virtual memory is especially msutable for implementing
the uniform memory abstraction, which supports persistent objects.
However, it can also be used to enhance conventional memory systems
utilizing a virtual memory.
FIG. 1 shows a prior art computer architecture having separate
computational and long-term storages. A central processor 10 has access to
computational storage 12 and long-term storage 14. Long-term storage 14 is
for retaining files, databases, etc., and is usually implemented as one or
more disks backed up by magnetic tape. Computational storage 12 is a
virtual memory, usually implemented as a fast semiconductor RAM memory 16
and a paging disk 18. The computational storage 12 appears to the central
processor 10 as a very large RAM. Virtual memory addesses which are not
actually in the semiconductor memory 16 are located on the paging disk 18
and loaded into the semiconductor memory 16 when they are referenced by
the central processor 10.
FIG. 2 shows a computer system having an architecture according to the
present invention. A central processor 20 (CPU) has access only to a
single, uniform memory 22. The memory 22 preferably consists of a very
large virtual memory, having semiconductor RAM 24 backed up by a paging
disk 26. The CPU 20 may be an existing system, such as an EXPLORER
symbolic processor from Texas Instruments. The virtual memory 22 appears
to the CPU 20 as a uniform, or single-level, memory store with a linear
address space.
The uniform memory abstraction defines the storage system architecture
necessary to implement a persistent memory according to the present
invention. The persistent memory system is based on the uniform memory
abstraction, in which a processor views memory as a set of variable-sized
blocks, or objects, of memory interconnected by pointers. The memory
system has a very large address space to support large knowledge-based
applications. The persistent memory system is expected to store persistent
objects, including "files," which could be very large in number and size.
Therefore, the size of the underlying address space should be sufficiently
large to support a practical system. However, the concept of persistent
memory does not depend on the actual size of the address space.
As previously explained, all processes within a symbolic computer share the
same single, linear address space. This allows a rich, complex structure
of objects interrelated by pointers, to be created and manipulated. The
structure of memory objects interconnected by pointers forms a graph.
Pointers interconnecting memory objects are implemented as virtual
addresses of the target objects.
As shown in FIG. 3, there is a distinguished object in the uniform memory
abstraction, called the persistent root 110, which defines persistent
objects.
The persistent root 110 is a distinguished object located at a fixed
virtual address and disk location. All objects that are in the transitive
closure of the persistent root, i.e., reachable from the persistent root
by following pointers, are persistent. The persistent root survives system
shutdowns or crashes. Typically, the persistent root may contain a pointer
to a table that points to other tables or structures of persistent objects
and so on. Thus, the persistent root anchors all persistent objects.
The persistence attribute of an object depends solely on whether that
object can be prevented from being garbage collected even after the
program that created it has terminated; this can be easily arranged by
making that object a member of the set of objects in the transitive
closure of the persistent root. Persistence based solely on the persistent
root rather than the properties of the storage medium allows a complete
separation of the persistence attribute of an object from its type or
relationship with other objects. Numbers, characters, lists, procedures,
environments, etc., can be persistent objects while they exist in virtual
memory.
Therefore, an invocation of a procedure as a persistent object is as easy
and efficient as its invocation as a transient object. In fact, from the
machine point of view, transient and persistent objects are
indistinguishable. From the upper point of view, there is no need to treat
transient and persistent objects differently; all the user needs to know
is that to make an object persistent, it has to be in the transitive
closure of the persistent root.
The processor contains a number of "registers." (101-108 are shown) The
processor can access a memory object, i.e., read and write its individual
words, if any of its registers holds a pointer to the object. The word
register in this context is used in a generic sense; it may be a hardware
register or a scratch-pad memory in the processor. These registers define
the transient root 109 of the memory system. They do not survive a system
shutdown or crash. All objects that are in the transitive closure of the
transient root, but not in the transitive closure of the persistent root,
are called transient. All the remaining objects are garbage and are
reclaimed by a garbage collector.
FIG. 3 shows an example snapshot of the memory system and categorizes
objects within it. The arrows between the objects and from CPU registers
to objects represent pointers. Pointers always refer to the beginning of
the object pointed to. Thus, the four pointers pointing into object 124,
for example, all have the same value and point to the beginning of block
124. By determining the transient closure of the persistent root 110, and
the transient root 109, it is seen that objects 126, 129 and 130 are
transient; objects 120, 121, 122, 123, 124, 125, and 127 are persistent;
and objects 128 and 131 are garbage.
Each memory object consists of one or more memory words, or cells, which
are stored in the consecutive virtual addresses. The processor 20 can
access a memory object, i.e., read and write its individual words, if any
of its registers holds a pointer to the object. For example, one method of
accessing individual cells is as follows. If register 101 contains a
pointer to a memory object 123, then the processor 20 can read the third
word of the memory object 123 by executing a READ(1, 3) instruction, where
"1" specifies the processor register 101, and "3" specifies the third word
of the memory object 123, pointed to by register 101. The contents of
register 101 are added to "3" to develop the virtual address of the word
to be read. Similarly, the processor 20 can write data in the fourth word
of the memory object 123 by executing a WRITE(1, 4), data instruction. The
processor 20 can access memory objects only via logical addresses; a
logical address consists of a pair (i, j), where "i" is the identification
number of a processor register, and "j" indicates the j-th word of an
object being pointed at by processor register "i."
The notion of memory objects in the uniform memory abstraction corresponds
to objects used in high-level programming languages, such as numbers,
booleans, characters, strings, LISP CONS cells, arrays, records,
procedures, or environments. These language-level objects can be
implemented using one or more memory objects interconnected by pointers.
Application-level objects are constructed by combining language-level
objects.
The persistence propety of objects is based solely on whether or not an
object is within the transitive closure of the persistent root 110. The
persistence attribute of an object is a fundamental notion. It should
depend only on whether the object can survive beyond the life time of a
program that creates it. It should neither depend on the type of the
object nor on the properties of the storage medium on which the object
resides. Since there will usually be several sets of unrelated groups of
objects which are persistant, the persistant root 110 will usually first
point to an object which contains nothing more than pointers to persistent
objects. Any object, transient or persistent, can point to any other
object to facilitate the unrestricted sharing desired in many symbolic/AI
computations.
In contrast to this mechanism of achieving persistence of objects based
solely on the persistent root, in today's machines, both conventional and
symbolic, an object becomes persistent only when it is stored in the
long-term storage, i.e., disk store. Even in MULTICS or IBM-System/38,
only certain types of objects, i.e., files, can become persistent, while
other types of objects, such as procedures, cannot.
With the persistent root 110, the persistence attributes of an object
solely depends on whether that object can be prevented from being garbage
collected even when the program that created it has terminated; that can
be easily arranged by making that object a member of the set of objects in
the transitive closure of the persistent root 110.
The transience/persistence attribute of objects is not necessarily a
permanent attribute. An object may be created as a transient object, then
it can become a persistent object solely on the basis of being in the
transitive closure of the persistent root 110, and then can revert back to
the transient state by getting out of the transitive closure of persistent
root 110, and so on.
Each pointer, implemented as a virtual address, is tagged as a pointer
within memory. This tagging mechanism is used to ensure that the processor
cannot specify, fabricate, or forge a pointer. The processor is allowed to
access memory only be reference to logical memory blocks. There may be
additional tagging information associated with each object to indicate its
type, such as integer, floating point number, string, array, list, or
closure. This tagging information is used to ensure that attempts to
perform operations that are undefined or illegal on a particular object
type cause traps to appropriate exception handling routines; for example,
an attempt to add an integer to a string object would cause an exception.
Each memory reference can be checked for bounds, i.e., "j" in a logical
address (i, j) should not exceed the size of the object pointed to by
processor register "i."
The nature of the memory system requires that it be garbage collected and
be free from the so-called dangling reference problem. Garbage collection
is essential to be able to make computational progress in a finite amount
of memory space. Without the reclamation and resuse of memory space
occupied by an object proven to be garbage (i.e., no outstanding pointers
to the object from non-garbage objects), the system would eventually come
to a grinding halt as it ran out of memory. Garbage collection is
preferably done automatically in real time, and preferably as a process
executing concurrently with user processes. This is not necessary to the
invention, however, and garbage collection can occur during periods of
machine non-use, such as overnight.
The dangling reference problem arises if the memory space for an explicitly
deleted object is reclaimed without proving that there are no outstanding
pointers to that object. If the space occupied by a deleted object is
reclaimed prior to such a proof, then the outstanding pointers to the
object may point to empty space, i.e., unallocated memory, or to some
undesired object if the reclaimed space has been later allocated to the
new object. In either case, the memory system integrity would be violated.
The proof that there are no outstanding pointers to a deleted object is
embedded within the garbage collector. A deleted object is specially
marked as tombstoned | | |