|
Claims  |
|
|
We claim:
1. A program tool, comprising
a. a workstation having an operator interface, a mass memory, a CPU, and
main memory;
b. an object oriented programming language system including,
(1) an object oriented programming language, and
(2) object oriented language compiler means for translating source code
written in the object oriented programming language into objects and
interpreter code;
c. a logic programming language system, having components representing
terms, clauses, predicates, atoms, and variables, including,
(1) a logic programming language, and
(2) logic compiler means for translating source code written in the logic
programming language into objects;
d. a database residing in said mass memory, for storing objects and
components of a logic programming language as objects in a common data
structure format, applications data, and applications stored as compiled
interpreter code;
e. object database management for representing objects and components of a
logic program in said common data structure format as objects, and
responsive to calls for retrieving and storing such objects in said
database, and for automatically deleting objects from said data base when
they become obsolete;
f. interpreter means for executing said interpreter code and generating
calls to said database management means; and
g. logic subsystem means for solving logic queries, said logic subsystem
means treating any object as a term in the logic programming language.
2. The programming tool claimed in claim 1, wherein said object oriented
programming language system includes means for calling subroutines written
in another language and treating the call as an object.
3. The programming tool claimed in claim 1 or 2, wherein said logic
programming language system includes means for processing attribute labels
in a manner such that the attribute labels are taken as identical to
object attribute names in the object programming language.
4. The programming tool claimed in claim 3, wherein said object oriented
language compiler means comprises:
a. first phase means for performing compilation including parsing,
optimization, and interpreter code generation;
b. second phase means in communication with said first phase means for
resolving global symbols and loading the database with objects and
interpreter code; and
c. an assembler-like intermediate language for communication between said
first and second phase means.
5. The programming tool claimed in claim 3, wherein said object-oriented
programming language system includes a language that is a dialect of
Smalltalk (Alltalk), and means for treating primitive invocations as
objects; and wherein said logic programming language system includes a
language that is an extension of Prolog (ALF), and means for treating
attribute labels as identical to instance variable names, and means for
typing logic variables using objects.
6. The programming tool claimed in claim 5, wherein said interpreter code
comprises a plurality of types of bytecodes, and said interpreter includes
a plurality of bytecode handler means, one such means for processing each
type of bytecode.
7. The programming tool claimed in claim 6, wherein said bytecode types
comprise:
execute a primitive,
send a message,
define a block,
evaluate a block,
return from a block or method,
branch, and
assign from one variable to another.
8. The programming tool claimed in claim 7, wherein said interpreter means
includes means for maintaining blocks as C data structures, and for making
blocks into objects when blocks are assigned to instance variables, or
returned as the result of a message.
9. The programming tool claimed in claim 8, wherein said interpreter means
includes means for maintaining contexts as C data structures, and for
making blocks into objects if and when an associated block is made into an
object.
10. The programming tool claimed in claim 9, wherein the bytecode handler
means for the "define a block" type bytecode generates block stubs, and
wherein said interpreter means creates active context(s) for a block stub,
stored separately from said block stub, and wherein said active contexts
associated with block stubs obey a stack discipline.
11. The programming tool claimed in claim 10, wherein said interpreter
means maintains running data structures for object-oriented processes in
an array, each element in the array representing one process, each element
containing a stack of active contexts, a pointer to the current context in
the stack, an array of block stubs, and a pointer to the next available
block stub.
12. The programming tool claimed in claim 11, wherein said interpreter
means manages processes by creating processes, switching processes,
destroying processes, and performing optimizations on processes.
13. The programming tool claimed in claim 12, wherein said interpreter
means performs optimization on processes by message flattening.
14. The programming tool claimed in claim 12, wherein said interpreter
means performs optimizations on processes by treating each primitive as
its own bytecode.
15. The programming tool claimed in claim 5, wherein said logic programming
language includes a set of built in predicates SEND N and including means
for sendng messages between the logic programming language system and the
object-oriented programming system by employing said predicates SEND N.
16. The programming tool claimed in claim 15, wherein said set of built-in
predicates take arguments "receiver", "answer", "selector", and n
additional arguments; wherein "receiver" is the receiver of the message to
be sent, "answer" is the object returned from the message, and "selector"
is that of the message send, and n remaining arguments are arguments to
the message send itself.
17. The programming tool claimed in claim 5, wherein all clauses in the
logic programming language are represented as instances of class "Clause",
and are rules, facts and queries, and wherein included in the instance
variables of class "Clause" are "head" and "tail"; if "head" is nil, the
clause is a query, if "tail" is nil, the clause is a fact; "head" is of
class "Predicate", or a sub-class thereof, "tail" is of class "LinkedList"
whose links are of class "Predicate", or a sub-class thereof; and wherein
values of the instance variables of the "head" and "tail links" can be
arbitrary objects.
18. The programming tool claimed in claim 3, wherein said database
management means includes:
a. object manager means employed by the object oriented language compiler,
the interpreter means, primitives, and utilities for providing access to
objects in the object database and for mainatining the orginization of
objects in the database;
b. method fetcher means for calling the object manager means to fetch
methods for the interpreter;
c. access manager means for managing access to the database, and being
called by,
(1) a buffer manager for retrieving objects from the database,
(2) a transaction manager for adding/updating objects in the object
database at commit points, and
(3) the object manager for providing higher level interface of the
database;
d. buffer manager means for,
(1) generating calls to the access manager means when called by the object
manager means, and
(2) keeping an in-memory copy of objects when called by the pool manager
means;
e. pool manager means for maintaining memory for buffers; and
f. garbage collector means integrated with said object manager means and
said interpreter means for identifying objects in main memory that are no
longer reachable.
19. The programming tool claimed in claim 18, wherein said garbage
collector means includes means for defining numbered regions for garbage
collection, such that when a context is created, it is assigned a region
number, each object created or accessed being assigned the region number
of the context that created or accessed it, unless it was previously
associated with a lower number; and when an object is returned from a
called method to the calling method, the object being moved to the region
of the calling method, and when a reference is made from a first object to
a second object in another region, the second object being moved to the
region of the first object, and when returning from a method, if the
context to which it is returning belongs to a region whose number is at
least two lower than that of the current region, then the said garbage
collector means collects garbage in the regions with the higher numbers
than that of the context to which return is being made.
20. The programming tool claimed in claim 19, wherein said garbage
collector means includes region cleaning means for detecting when a region
has accumulated an excessive number of objects and cleanng the region thus
detected.
21. The programming tool claimed in claim 19, wherein said garbage
collector means includes means for detecting when objects are shared
across processes and for insuring that no object is discarded that is in
use by another process.
22. The programming tool claimed in claim 19, wherein said garbage
collector means includes an off-line mark/sweep collector means for
periodically removing objects from the object database that have become
unreachable by any other object in the database, by first marking all
objects in the database that can be reached, and then sweeping the
database to remove unmarked objects.
23. The programming tool claimed in claim 22, wherein said object database
contains constants that are permanently marked such that they cannot be
removed by said off-line mark/sweep collector means.
24. The programming tool claimed in claim 3, wherein said logic subsystem
means performs unification of logic variables to answer logic queries, and
in doing so, takes into account the typing of the logic variables to
enable constraint of permissible values of logic variables.
25. The programming tool claimed in claim 3, further comprising debugger
means for providing debugging functions comprising setting break points,
stepping through program execution, tracing information (e.g. messages,
blocks, bytecodes, processes), and displaying values of data structures,
said debugger means being integrated with said interpreter means and
including a set of C routines for performing tasks associated with the
debugger commands, code within the interpreter, and a set of global
variables and constants used to communicate between the C routines and the
code in the interpreter.
26. The programming tool claimed in claim 3, wherein said object database
comprises a key file and a prime file, the prime file having records of
variable length containing objects, and the key file having records of
fixed length containing the address and record length of objects in the
prime file.
27. The programming tool claimed in claim 26, wherein objects in the prime
file can be one of 6 types, including:
normal objects,
a symbol cross reference record that contains a string for a symbol and
associated object identification of a symbol object,
a dictionary cross reference,
a control record,
a checkpoint integrity record, and
logically deleted objects.
28. In a heap based programming language system, having garbage collector
means for removing objects from memory that are no longer reachable by the
system, and improved garbage collector means, wherein the improvement
comprises: means for defining numbered regions for garbage collection,
such that when a context (representing the state of a method which is
executing in the system) is created, it is assigned a region number, when
an object is created or accessed by a method it is assigned the region
number of the on context of the method that created or accessed it, unless
the object was previously assigned a lower number; means for moving an
object to the region of a calling method when an object is returned from a
called method to the calling method, means for moving a second object to
the region of a first object when reference is made from the first object
to the second object assigned to another region and wherein said garbage
collector means collects garbage when returning from a method, if the
context to which it is returning belongs to a number at least two lower
than the current region number before returning; the regions with the
higher number than that of the context to which it is returning being
collected (i.e. the objects in the regions are discarded).
29. The improvement claimed in claim 28 wherein; said garbage collector
means includes region cleaning means for detecting when a region has
accumulated an excessive number of objects, and cleaning the regions thus
detected.
30. The improvement claimed in claim 28 wherein said garbage collector
means includes means for detecting when objects are shared across
processes for ensuring that no object is collected that is in use by
another process.
31. The improvement claimed in claim 28, wherein said system further
comprises an object database and wherein said garbage collector means
includes off-line mark/sweep collector means for periodically removing
objects from the database that have become unreachable by any other object
in the database, by first marking all objects in the database that can be
reached, and then sweeping the database to remove unmarked objects.
32. The improvement claimed in claim 31 wherein said object database
contains constants that are permanently marked, and said off-line
mark/sweep collector means includes means for recognizing said marks and
preventing removal of said constants by said collector means from said
database.
33. The improvement claimed in claims 28, 29, 30, 31, or 32, wherein said
system further comprises an in-use table containing a list of objects that
must be kept in-memory, said table including a field designating each
object's region. |
|
|
|
|
Claims  |
|
|
Description  |
|
|
TECHNICAL FIELD OF THE INVENTION
The invention relates to a programming tool that allows application
programming in both logic and object-oriented style, and which provides
integrated database support.
BACKGROUND OF THE INVENTION
Object-oriented programming, logic programming, and database facilities
have all been shown to have significant power in the writing of
applications to run on a computer. No single programming tool has
successfully integrated all three facilities in such a way as to eliminate
an explicit interface between them. Normally, one must convert between
object data to logic data to use the logic programming system, and then
convert the logic data back again in order to use the object-oriented
system. Furthermore, one must normally make explicit calls to a database
manager in order to retrieve and store application data.
There have been some attempts to provide combined logic and object-oriented
programming tools. For example, the Smalltalk/V (Smalltalk Tutorial and
Programming Handbook, Digitalk, Inc., 1987) allows the user to invoke a
logic programming tool (Prolog) from an object-oriented on (Smalltalk).
However, the only kind of data (terms) that Prolog understands are
strings, symbols, numbers, structures, and lists of any of the above.
Furthermore, the Prolog structures are constrained to be a type of list
from the object-oriented programming tool. Additionally, Smalltalk/V does
not have database storage for the objects.
There have also been attempts to provide database support for
object-oriented tools. For example, the Gemstone system, a product of
Servio- Logic, Inc., while supporting a database server that can be
programmed in Smalltalk, does not allow the application to be written in
Smalltalk in such a way that the database server is transparent: i.e. the
application must make speciific calls to the database server (`Integrating
an Object Server with Other Worlds`, by Alan Purdy et al, ACM Transactions
on Office Information, Vol. 5, Number 1, Jan. 1987). Gemstone does not
contain any logic programming tools.
Some so-called "expert system shells"(e.g., Nexpert Object from Neuron
Data, Inc.) allow for objects, rules and database features to be combined,
but these tools are for the construction of a certain class of application
("expert systems"), and do not provide a general-purpose programming tool.
It is the object of the present invention to solve the problem of providing
a general purpose programming tool that smoothly integrates
object-oriented and logic programming, and provides the user with database
facilities that are transparent to the user.
SUMMARY OF THE INVENTION
The present invention solves the problem by providing a single programming
tool (referred to herein as Alltalk) which allows the programmer to write
applications in an object-oriented language (a dialect of Smalltalk, also
referred to herein as Alltalk), a logic programming language, (which is an
extension of Prolog, herein called ALF) or a combination of the object and
logic programming languages which allows the logic programming language
system to consider any object from the object-oriented programming
language system as a term in the logic programming language, and which
supplies database management on behalf of the programmer, without the need
for any specific database management control statements to be supplied by
the programmer.
The main components of the Alltalk tool include a work station having an
operator interface, a mass memory, and a CPU. An object-oriented
programming language system running on the work station includes an
object-oriented programming language and an object-oriented language
compiler for translating source code written in the object-oriented
programming language into objects and interpreter code. Also running on
the work station is a logic programming system including a logic
programming language having components of terms, clauses, predicates,
atoms, and logic variables, and a logic language compiler for translating
source code written in the logic programming language into objects. A
database residing in the mass memory stores objects and components of
logic programs as objects in a common data structure format, applications
data, and application stored as compiled interpreter code. The database is
managed by an database manager that represents objects and components of
the logic programming language in the common data structure format as
objects and is responsive to calls for retrieving and storing objects in
the database and for automatically deleting objects from the database when
they have become obsolete. An interpreter executes the interpreter code
and generates calls to the database manager. A logic subsystem solves
logic queries and treats objects as components of a logic program.
According to a further aspect of the present invention, an improved
database format is provided for an object-oriented programming language
system. The database has a key file and a prime file. The prime file
contains records of variable length for storing objects, and the key file
contains records of fixed length for storing the address, record length,
and type of object in the prime file. An improved database manager for
managing this database includes an object manager employed by the
compiler, interpreter, primitives and utilities for providing access to
objects in the database, and for maintaining organization of objects in
the database. An access manager is called by a buffer manager for
retrieving objects from the database, a transaction manager for updating
the database with new or changed objects at commit points, and for undoing
changes to objects upon aborts, and the object manager for providing high
level interface to the database. A buffer manager is called by the object
manager for generating calls to the access manager, and by a pool manager
for keeping an in-memory copy of objects. The pool manager maintains
memory for buffers.
According to another aspect of the present invention, an improved garbage
collector is provided for a heap based programming language system. The
garbage collector employs the concept of regions for garbage collection.
When a context (representing the state of a method which is executing in
the system) is created, it is assigned a region number. When an object is
created or accessed by a method, it is assigned the region number of the
context of the method that created or accessed it, unless the object was
previously assigned a lower number. When an object is returned from a
called method to the calling method, the object is moved to the region of
the calling method. When reference is made from a first object to a second
object assigned to another reigon, the second object is moved to the
region of the first object. When returning from a method, if the context
to which it is returning belongs to a number at least two lower than the
current region number before returning, the regions with the higher number
than that of the context to which it is returning are collected (i.e., the
objects in these regions are discarded).
According to a still further aspect of the present invention, the runtime
performance of a Smalltalk programming language system is improved by
implementing a technique called message flattening. The compiler flags any
method which consists of a single return statement which returns either an
instance variable, or the result of a primitive, for which the first
argument is self, and the other arguments correspond to arguments to the
method. The interpreter detects these flags at runtime and flattens any
message that would normally invoke these methods, by replacing this
message send in the first instance with an assign, and in the second
instance with a primitive invocation.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic diagram showing an overview of the invention;
FIG. 2 is a schematic diagram of the compiler;
FIG. 3 is a schematic diagram of the runtime environment;
FIG. 4 is a schematic diagram showing initialized stacks for contexts and
block stubs;
FIG. 5 is a schematic diagram showing the creation of a new context;
FIG. 6 is a schematic diagram showing creation of a block;
FIG. 7 is a schematic diagram showing creation of a second block;
FIG. 8 is a schematic diagram showing the creation of a new context;
FIG. 9 is a schematic diagram showing a block evaluation;
FIGS. 10-13 illustrate the modes of block execution;
FIG. 14 shows the creation of a process;
FIGS. 15-16 illustrate process management;
FIGS. 17-18 show the relationships between the context stack, processes,
regions, and objects;
FIG. 19 is a schematic block diagram illustrating the functions of the
garbage collector;
FIG. 20 shows the in-use table's structure and internal relationships;
FIG. 21 shows the in-use table's relationships with the object table, the
buffers, and the database;
FIG. 22 shows how garbage is collected upon a method and return; and
FIG. 23 is a schematic block diagram illustrating the functions of the ALF
compiler.
DESCRIPTION OF THE INVENTION
A portion of the disclosure of this patent document contains material to
which a claim of copyright protection is made. The copyright owner has no
objection to the copying of the patent document or the patent disclosure,
but reserves all other rights.
1. Introduction
The Alltalk tool runs on workstation type hardware, such as a Sun 4/360 by
Sun Microsystems, Inc., executing the UNIX operating system (UNIX is a
trademark of AT&T). Referring to the Drawings, FIG. 1, the hardware
includes an operator interface including a visual display (CRT) 10, a
keyboard 12, and a pointing device 14, such as a 3 button mouse. The
hardware also includes mass memory, such as a disk 16 on which the Alltalk
database resides, as well as a CPU and main memory 18. The Alltalk
software which is executed by the CPU and main memory 18 consists of an
Alltalk compiler 20 for a dialect of the Smalltalk language (also called
Alltalk) and an Alltalk runtime environment 22. The hardware components of
the workstation are connected by a bus 24.
2. Overview
The Alltalk compiler 20 is a program for translating Alltalk language
source statements into interpreter code. The compiler is generated by the
YACC and LEX utilities in the UNIX operating system, and contains
subroutines written in the C programming language.
Referring to the Drawings, FIG. 2, the compiler operates in 2 phases: the
first phase 26 parses the source code written in the Alltalk language 28
and constructs an intermediate code 30. The second phase 32 takes the
intermediate code and generates class objects, constant objects, and
method objects and places these in a database 40. These objects are
subsequently retrieved by the runtime environment 22 (see FIG. 1).
The runtime environment 22 is written in the C programming language, and in
Alltalk. Referring to the Drawings, FIG. 3, the logic language compiler 36
and the logic subsystem 38 are both written in Alltalk. These are compiled
through the previously mentioned Alltalk compiler 20, and the output
placed in the database 40 and hence available to the runtime environment
22. Other applications 42 written in Alltalk are similarly available to
the runtime environment after compilation. Application programs 42 (called
methods) are processed by an interpreter 44, which calls other components
of the runtime environment, which includes: a transaction manager 46 which
can commit and abort transactions, an object manager 48 which is called to
create and retrieve objects, a method fetcher 50 which determines the
correct method to execute next, and a garbage collector 52 which detects
and removes unneeded objects from main memory. The object manager 48 calls
upon a buffer manager 54 to determine if a requested object is in memory
or needs to be fetched from the database. If the object is to be retrieved
from the database, a pool manager 56 is called to find space in an
appropriate buffer, after which an access manager 58 is called. It is the
access manager 58 that accesses the disk 16 containing the database 40.
3. Compiler
The Alltalk compiler 20 translates class descriptions written in a dialect
of the Smalltalk language herein referred to as Alltalk into database
objects for use by the Alltalk interpreter 44 during execution.
3.1. Synopsis
The Alltalk compiler 20 takes a file containing one or more complete
Alltalk class descriptions, and for each class generates:
1. A class object, containing a dictionary of the methods in the class and
specification of the instance and class variables,
2. Compiled methods, each consisting of "bytecodes", which drive the
runtime interpreter, and
3. Objects representing constants encountered during compilation, (numeric
values, strings, etc.)
which are placed in the database 40 for use by the interpreter 44 during
execution.
The Alltalk compiler 20 consists of two phases. The first phase 26 (see
FIG. 2) does the compilation work, (parse, optimization, and code
generation), while the second phase 32 resolves global symbols and loads
the results into the database. The two phases communicate via intermediate
code 30 (written in an assembler-like intermediate language) which can be
examined and altered by the user, if desired. The following is a
description of the organization of the internals of the Alltalk compiler
20, including code generation strategies and optimization techniques.
3.2. Phase 1 (kcom)
3.2.1. Parsing
The first phase 26 of the Alltalk compiler 20 consists of two distinct
processing stages:
1. Parse tree construction, and
2. Code generation, (including optimization).
The parsing phase is implemented in a fairly straightforward manner using
the UNIX yacc/lex parser generator/lexical analyzer tools. The primary
goal of the parsing stage is to create an internal parse tree
representation of the class description and its methods which can be
analyzed using a relatively simple set of mutually recursive tree-walking
routines. In addition, the grammer of the input file is checked and errors
are reported to the user.
The grammer specification of the object-oriented language is virtually
identical to that specified in the syntax diagrams of the standard
Smalltalk language reference, Smalltalk-80 The Language and Its
Implementation, by Goldberg and Robson. The most notable variation in the
Alltalk grammar is that of allowing a primitive invocation to be used as a
primary expression and to have primary expressions as arguments, (this is
adopted from Little Smalltalk, by Timothy Budd). This allows Alltalk
primitives to be intermixed freely with the Alltalk language as if they
were function calls which return a value, (which is essentially what the
primitives really are), instead of as wholesale replacements for methods,
as in standard Smalltalk.
Additional productions have been included to allow for reading an entire
class description from a file, (in a form roughly similar to Smalltalk
"fileIn/fileOut" format). These additional productions include "header"
information such as superclass specification, instance/class variable
declarations, and instance/class method classification statements.
While it is possible to build the entire analysis and generation mechanisms
directly into the action portions of the yacc productions, the conciseness
of the analysis and generation stage would be lost in that it becomes
difficult to piece together how the parser actions interact to accomplish
that stage when the controlling function is the yacc parser. Clarity is
enhanced by having the analysis and generation functions make explicit
their own walking of the parsed information, since it may vary from that
of the parser at various points in the compilation. For example, more
complex/global optimization techniques, such as inter-statement
optimizations, may need to determine their own scope of applicability
across several statements worth of parsed information. Such techniques are
harder to embody as a single understandable function when mixed with the
simple actions of parsing.
The basic parse node is a simple binary node, (left and right child
pointers), with placeholders for the node type constant, a source code
line number, and a string pointer.
Parse nodes are created via a function called makenode(), which allocates
storage storage for the node, inserts the current source line number, and
sets the other elements as specified by the user. The storage allocated
for these nodes, (as well as for the class and method structures and
copies of strings), is not tracked in the Alltalk compiler 20 since the
compiler is expected to be run only for the duration of the compilation of
a file.
A sample parse tree for an Alltalk method is given in Table 3.1.
Syntactical shorthand and default meanings, such as the return value of a
method having no statements being "self", or a block having no statements
being "nil", are fleshed out during the parse phase in order to limit the
amount of special case logic in the analysis and generation phase.
TABLE 3.1
__________________________________________________________________________
Parse Tree Example
__________________________________________________________________________
Method Code
! SequenceableCollection methodsFor: 'enumerating'
do: aBlock
.vertline. index length .vertline.
index <- 0.
length <- self size.
[(index <- index + 1) <= length]
whileTrue: [aBlock value: (self at: index)]
__________________________________________________________________________
Parse Tree
statement statement
assign assign
"index" "length"
number unary.sub.-- expr
"0" "size"
identifier
"self"
statement
keyword.sub.---- expr
keyword.sub.-- arg
"whileTrue:"
block block
statement statement
binary.sub.-- expr
keyword.sub.-- expr
keyword.sub.-- arg
"<=" "value:"
assign identifier
identifier
keyword.sub.-- expr
keyword.sub.-- arg
"index" "length"
"aBlock"
"at:"
binary.sub.-- expr identifier
identifier
"+" "self" "index"
identifier
number
"index" "1"
__________________________________________________________________________
A successful parse generates a parse tree for the statements of each method
in the class. These parse trees are anchored in a method structure for
each method, which are all, in turn, linked to a single class structure.
When the parsing of a class is complete, the class structure is handed to
analysis and code generation routines.
3.2.2. Code Generation
The major components of this stage of the compiler are:
1. Symbol table (symbol.c)
2. Code generation (compile.c)
3. Code management (code.c)
4. Optimization (optimize.c)
Generally, the processing steps involved in this stage, (implemented by a
function called compileClass()), proceed as follows:
1. The symbol table is initialized with the instance and class variables
available via the superclass chain for the class. These symbols are
retrieved from a symbol file in the local directory. It is considered a
fatal error if the superclass cannot be found in the symbol file, (i.e.,
the superclass must be compiled first).
2. The instance and class variables for the class are added to the symbol
table. Name clashes involving superclass variables are also considered
fatal.
3. Each method is compiled. During method compilation, bytecodes are
collected into segments corresponding to groups of statements in the
method: one for the method itself, and one for each block within the
method. When method compilation is complete, the method code segment is
emitted first, followed by the segments for each block.
4. After the methods have been successfully compiled, a record of the
class' instance and class variables are written to the symbol file in the
local directory. This makes the class available for use as a superclass in
subsequent compilations.
The method compilation step is the heart of the compilation task. Before
describing this step in detail, a description of the compiler's view of
symbolic references and the symbol table is given.
3.2.3. Symbols
Throughout the Alltalk compiler 20, references to named symbols in the
program being compiled, as well as references to unnamed runtime storage
are represented in a uniform manner. This uniform representation allows
the code generation stage to freely create and pass references between the
recursive routines which implement this stage without regard for their
type until a leaf routine which needs detailed type information is
executed. The conciseness of the code generation routines is greatly
enhanced with this representation scheme.
There are nine reference types, as follows:
Named References
Instance Variable
Class Variable
Method Parameter
Formal Method Temporary
Block Parameter
Global Symbol ("true", "false", "nil", class name, etc.)
Unnamed References
Constant ("10", "3.14", `a string`, #symbol, etc.)
Compiler Temporary (used in evaluating intermediate
expressions) Block Stub/Closure (reference to storage holding the runtime
id of the closure)
The symbol table supports a subset of the named references, separating them
into the three categories of: (1) class, (2) instance, and (3) temporary
symbols. Temporary symbols encompass the method parameter, formal method
temporary, and block parameter references. Global symbol references are
never actually placed in the symbol table, but are materialized whenever
the search for a name fails. These symbols are resolved by the second
phase 32 of the Alltalk compiler 20, since the cross reference values for
these names are actually present in the runtime system dictionary
contained in the database 40.
The symbol table interface routines contain the usual routines for the
addition of symbols, (addSymbol()), and name-based search for symbols,
(findSymbol()). An initialization routine, (initSymbols()), purges the
table and then uses the globally specified superclass name to populate the
table with "ref" structures for the instance and class variable symbols
available via the superclass chain, as recorded in the symbol file in the
local directory. A routine for writing the instance and class variable
symbols, (writeSymbols()), to the local symbol file for the globally
specified class, (i.e., the one being compiled), is also provided.
Finally, a pair of general routines, (markSymbols() and releaseSymbols()),
are available for get/set of placeholders in the symbol table. These are
primarily used to record the starting position of method and/or block
temporary symbols, so that they can be removed at the end of the
compilation of the method and/or block statements.
3.2.4. Method Compilation
In the runtime environment 22 (see FIG. 3), a method is executed with an
associated "context" containing local storage organized as an array of
temporary slots, analogous to a "stack frame" in a conventional language.
This local storage is divided into the following five sections from the
compiler's point of view:
1. The object id of the receiver, known as "self".
2. Method parameters.
3. Formal method temporaries, (named temporaries).
4. Compiler scratch area for intermediate expression evaluation.
5. Block stub/closure id storage for blocks in the method.
A general mechanism for tracking the use of the temporary slots is
implemented in the compiler using a set of macro routines. This set
includes routines for: allocating a number of temporaries, (allocTemp()),
which returns the starting slot for the requested count; freeing a number
of temporaries, (freeTemp()); get/set of temporary usage information,
(getTempUse(); and setTempUse()); clearing usage information,
(clearTempUse()); and requesting the high water mark for temporary usage,
(maxTempUse()). Temporary usage is tracked with these routines for the
first four kinds of temporaries listed above. Storage for block ids is
tallied separately during method code generation since it is not known
what the required number of compiler scratch temporaries will be until the
method compilation has finished.
Before the method statements are examined, several initialization steps are
performed:
1. The symbol table is populated with the entries for "self", "super", the
parameters, and the formal temporaries. The slot index for each entry is
determined by allocating a temporary as each symbol is added to the table.
2. A code segment is allocated for the method statements and made to be the
"active" segment. During compilation, generated code is placed in the
"active" code segment, which is switched when compilation of a new list of
statements, (e.g., a block), is started or completed.
3. Label generation is reset, (used for branch targets and block entry
points).
4. The block count is reset.
Also prior to commencing code generation, the parse tree of the method is
examined to see if it can be tagged for "flattening" at runtime. "Method
flattening" is a technique for determining whether a runtime message send
can be avoided because the method is "trivial". A "trivial" method is one
which contains a single statement returning either:
1. An instance variable, (can replace send with an assign), or
2. The result of a primitive for which first argument is self and the
remaining arguments to the primitive invocation line up exactly with
arguments to the method, (can replace send with primitive invocation).
At this point, compilation of the method statements is initiated by calling
compileStatementList() with a pointer to the first statement parse node of
the method. This routine invokes compileExpr() to compile the expression
associated with each statement in the list. CompileStatementList() is used
to compile lists of statements for blocks as well as methods, generating
appropriate return bytecodes when explicit return statements are
encountered and after the last statement in the method or block.
CompileStatementList() distinguishes between method and block statement
lists by the value of active code segment id, which is -1 for a method or
>=0 for a block. Provision is also made for the case of "inline" block
code generation, which is used in optimization of certain messages
involving blocks, (such as messages to booleans), described later.
3.2.5. Expressions
Expression compilation is the center of most activity during the
compilation of a method. CompileExpr() defines the compilation actions for
all parse nodes other than statements in a concise manner. This routine is
invoked with the node to be compiled and a destination specification for
the result of compiling the expression indicated by the node in the form
of a "ref" structure. The destination specification allows the calling
routine | | |