Claims
Having thus described our invention, what we claim as new, and desire to
secure by Letters Patent is:
1. An apparatus for document processing for use in a computer system having
a processor, a storage and a display under control of the processor, the
apparatus comprising:
(a) a document framework stored in the storage, the document framework
defining a plurality of model classes, each one of the plurality of model
classes defining means for referencing data stored in the storage, means
for creating a container object to hold a plurality of objects
instantiated from one or more of the plurality of model classes and
program logic means for processing the data and objects held in the
container object;
(b) means for instantiating a root model object from one of the plurality
of model classes, the root model object containing a reference to data of
a first type;
(c) means for instantiating a plurality of additional model objects from
the plurality of model classes, each one of the plurality of additional
model objects containing a reference to data of a type different from the
first type;
(d) means for creating a compound document from the root model object by
adding at least one additional model object instantiated from the
plurality of additional model objects to a container in the root model
object, wherein the root model object and each one of the at least one
additional model objects provide a hierarchy of model objects which
represent a containership hierarchy of the compound document; and
(e) means for processing the compound document by processing the root model
object, which applies the processing to the at least one additional model
object in the container in the root model object.
2. The apparatus of claim 1, wherein the document framework stored in the
storage includes means for streaming, wherein in response to a first model
being streamed, the means for streaming streams the first model and each
of a plurality of embedded models contained within the first model.
3. The apparatus of claim 2, further comprising means for copying, wherein
in response to a first one of an entire model or a selected data from a
model being copied, the means for copying copies each of a plurality of
embedded models contained within the first one of the entire model or the
selected data from a model.
4. The apparatus of claim, further comprising means for filing a model,
wherein in response to a first model being filed, the means for filing files
a plurality of embedded models contained within the first model and
wherein the means for filing files each of the plurality of embedded models
independently from the first model in which the plurality of embedded
models are contained.
5. The apparatus of claim 4 wherein the document framework comprises:
means for specifying a plurality of anchors;
means for linking a first anchor;
means for transferring information across links in first and second
opposite directions wherein the means for transferring information across
links includes:
means for initiating a transfer of information from either end of a link;
means for transferring document information across links.
6. The apparatus of claim 1, wherein the document framework includes means
for stacking one or more commands; and means for undoing the commands.
7. The apparatus of claim 6, wherein the means for undoing commands
includes means for undoing commands in an order opposite from which the
commands were stacked.
8. The apparatus of claim 6, wherein the means for undoing commands
includes means for selectively undoing commands in an order unrelated to
an order in which the commands were stacked.
9. The apparatus of claim 1, wherein the document framework includes means
for managing embedded models.
10. The apparatus of claim 1, wherein the document framework includes means
for providing notification.
11. The apparatus of claim 1, wherein the document framework includes means
for creating complex command groups.
12. The apparatus of claim 1, wherein the document framework includes
hierarchical document support means for embedded data.
13. The apparatus of claim 12, wherein the hierarchical document support
means includes means for embedding data in a model.
14. The apparatus of claim 13, wherein the means for embedding includes
means for overriding protocols associated with the embedded data.
15. The apparatus of claim 1, wherein the document framework includes
annotation means for providing additional information with respect to
document data.
16. The apparatus of claim 1, wherein the document framework includes
retrieval framework means for providing indexing and query processing.
17. The apparatus of claim 16, wherein the retrieval framework means
includes background indexing.
18. The apparatus of claim 1, wherein the document framework includes
object surrogate means for providing address space independent references
to real objects.
19. The apparatus of claim 1, wherein the document framework includes means
for seamlessly integrating audio and visual information in a compound
document.
20. A method for document processing for use in a computer system having a
processor, storage and a display under control of the processor, the
method comprising the steps of:
(a) storing a plurality of model classes in the storage, each one of the
plurality of model classes defining referencing data stored in the
storage;
(b) creating a container object to hold a plurality of objects instantiated
from one or more of the plurality of model classes;
(c) creating logic for processing the data and objects held in the
container object;
(d) instantiating a root model object from one of the plurality of model
classes, the root model object containing a reference to data of a first
type;
(e) instantiating a plurality of additional model objects from the
plurality of model classes, each one of the plurality of additional model
objects containing a reference to data of a type different from the first
type;
(f) adding at least one of the additional model objects to a container in
the root model object to provide a compound document from the root model
object; and
(g) processing the compound document by processing the root model object,
which applies the processing to the at least one additional model object
in the container in the root model object.
21. The method of claim 20, including the step of streaming a first model
and each of a plurality of embedded models contained within the first
model.
22. The method of claim 21, including the step of copying a model and each
of a plurality of embedded models contained within the model.
23. The method of claim 21, including the step of filing a first model,
wherein in response to the first model being filed, the step of filing a
first model includes the steps of:
filing a plurality of embedded models contained within the first model; and
filing each of the plurality of embedded models independently from the
first model in which the plurality of embedded models are contained.
24. The method of claim 21, including the steps of:
establishing at least one link between a first anchor and a second anchor;
initiating a transfer of information from the at least one link;
transferring information across the at least one link in a first direction;
initiating a transfer of information from a second different end of at
least one link; and
transferring information across the at least one link in a second opposite
direction.
25. The method of claim 20, including the step of containing document data.
26. The method of claim 25, including the step of modifying the data.
27. The method of claim 20, including the step of stacking one or more
commands; and undoing the commands.
28. The method of claim 27, including the step of undoing commands in an
order opposite from which the commands were requested.
29. The method of claim 20, including the step of selectively undoing
commands in an order other than the order in which the commands were
requested.
30. The method of claim 20, including the step of supporting hierarchical
document data including embedded data.
31. The method of claim 30, including the step of embedding data in a
model.
32. The method of claim 31, including the step of overriding protocols
associated with the embedded data.
33. The method of claim 20, including the step of annotating for providing
additional information with respect to document data.
34. The method of claim 20, including the step of indexing and query
processing using a retrieval framework.
35. The method of claim 34, including the step of background indexing.
36. The method of claim 20, including the step of providing address space
independent references to real objects using object surrogate means.
37. The method of claim 20, including the step of seamlessly integrating
audio or visual data into a compound document.
38. The method of claim 20, including the step of sharing access to a
single document by two or more users utilizing
command-based-collaboration.
39. The method of claim 38, including the step of sharing access to a
single document via a communication link utilizing remote command
execution.
Description
COPYRIGHT NOTIFICATION
Portions of this patent application contain materials that are subject to
copyright protection. The copyright owner has no objection to the
facsimile reproduction by anyone of the patent document or the patent
disclosure, as it appears in the Patent and Trademark Office patent file
or records, but otherwise reserves all copyright rights whatsoever.
FIELD OF THE INVENTION
The present invention generally relates to computer systems, and more
particularly to a method and system for object-oriented compound document
processing.
BACKGROUND OF THE INVENTION
Document processing has virtually revolutionized the way society generates
paper. Typical prior art document processing systems run on top of
operating systems, such as DOS or OS/2. More recently, these document
processing systems have been designed to run in a Windows environment.
Many of these document processing systems are commercially available.
While these document processing systems have vastly improved the ability to
process documents and text, there is great inconsistency among document
processors with respect to their particular processing methodologies.
These inconsistencies create problems for both application developers and
users of the applications.
Application developers must continuously "reinvent the wheel" when creating
a new document processor. While operating systems and interface programs
provide some tools which may be used, the great majority of the design
process for a particular document processor is directed toward creating a
group of processing modules which cooperate to allow the user to process
documents. Application developers often design processing modules which
have already been developed by another company. This requires great
duplication of effort, and requires each developer to deal with the
details of how to implement various desired functions.
Application users run into other problems. While particular functions may
be present in one application, they may be lacking in another. Or a
function available in one application may be slightly varied in another,
either in use or in performance. For example, a function in application A
may require certain user interaction and input to activate the function,
while a similar function in application B may require a slightly varied,
or totally different, user interaction and input.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a document processing
system in which object-oriented frameworks are utilized to implement
particular document processing techniques, including an object-oriented
compound document system. These and other objects of the present invention
are realized by a document framework which supports at the system level a
variety of compound document processing functions. The framework provides
system level support of collaboration, linking, eternal undo, and content
based retrieval. These and other objects are carried out by system level
support of document changes, annotation through model and linking,
anchors, model hierarchies, enhanced copy and pasting, command objects,
and a generic retrieval framework.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a personal computer system in accordance with
a preferred embodiment of the invention;
FIG. 2 is a block diagram of a link, anchors, and a model;
FIG. 3 is a block diagram representing the functions associated with undo;
FIG. 4 is a diagram demonstrating the system level indexing and query
processing features of the present invention;
FIG. 5 is a block diagram of class representation in accordance with the
present invention;
FIG. 6 is a diagram of a typical document which may be made using the
present invention;
FIG. 7 is a diagram showing the hierarchical structure of the document
shown in FIG. 6;
FIG. 8 is a diagram of the general characteristics of TModel;
FIG. 9 is a diagram of a notification framework which could be used with
the document framework system;
FIG. 10 is a diagram of the relationship defining specification classes;
FIG. 11 is a diagram of the relationships associated with TModel Command
Group;
FIG. 12 is a diagram demonstrating the processing flow for DoBegin();
FIG. 13 is a diagram demonstrating the processing flow for DoRepeat();
FIG. 14 is a diagram showing the relationships established for
TModelAnchor;
FIG. 15 shows the relationships established for TModelLink;
FIG. 16 is a diagram depicting the processing of links;
FIG. 17 shows the processing of Complete Link;
FIG. 18 demonstrates the use of annotations and scripts being linked to a
document;
FIG. 19 shows some of the link properties which are available in the
system;
FIG. 20 shows the document framework client/server with respect to external
documents; and
FIG. 21 is a block diagram showing the method and system of the present
invention.
DETAILED DESCRIPTION OF THE INVENTION
The detailed embodiments of the present invention are disclosed herein. It
should be understood, however, that the disclosed embodiments are merely
exemplary of the invention, which may be embodied in various forms.
Therefore, the details disclosed herein are not to be interpreted as
limiting, but merely as the basis for the claims and as a basis for
teaching one skilled in the art how to make and/or use the invention.
The history of object-oriented programming and the development of
frameworks is well established in the literature. C++ and Smalltalk have
been well-documented and will not be detailed here. Similarly,
characteristics of objects, such as encapsulation, polymorphism and
inheritance have been discussed at length in the literature and patents.
For an excellent survey of object oriented systems, the reader is referred
to "Object Oriented Design With Applications" by Grady Booch, ISBN
0-8053-0091-0 (1991).
While many object oriented systems are designed to operate on top of a
basic operating system performing rudimentary input and output, the
present system is used to provide system level support for particular
features.
The invention is preferably practiced in the context of an operating system
resident on a personal computer such as the IBM® PS/2® or
Apple® Macintosh® computer. A representative hardware environment
is depicted in FIG. 1, which illustrates a typical hardware configuration
of a computer in accordance with the subject invention having a central
processing unit 10, such as a conventional microprocessor, and a number of
other units interconnected via a system bus 12. The computer shown in FIG.
1 includes a Read Only Memory (ROM) 16, a Random Access Memory (RAM) 14,
an I/O adapter 18 for connecting peripheral devices such as disk units 20
and other I/O peripherals represented by 21 to the system bus 12, a user
interface adapter 22 for connecting a keyboard 24, a mouse 32, a speaker
28, a microphone 26, and/or other user interface devices such as a touch
screen device (not shown) to the bus 12, a communication adapter 34 for
connecting the workstation to a data processing network represented by 23,
and a display adapter 36 for connecting the bus to a display device 38. The
workstation has resident thereon an operating system such as the Apple
System 7® operating system.
The main goal of the document framework disclosed herein is to raise the
base level of applications by enabling several new features at the system
level. At present, the lack of system support for these features limits
their implementation. For example, there are applications that allow users
to annotate static representations (pictures) of any document, but not the
"live" document itself. Content-based retrieval applications have
trouble accessing the contents of a document because each application has a
custom file format. Also, once such an application finds a document, it is
difficult to do anything with it; there is no system-level support for
opening the document, for example. The document framework also includes a
number of higher-level capabilities, such as annotation support, which are
built upon these basic services. To the extent possible, the document
framework does not specify any policy or user interface decisions. These
details will be provided by the particular applications using the document
framework.
Collaboration
Screen sharing is one popular type of collaboration on the Macintosh,
because it is relatively easy to implement and can be put to many uses.
Its main disadvantages are that some applications draw directly to the
screen (complicating the implementation) and the large bandwidth required
to transmit all drawing operations from one machine to another. Also, it
is very restrictive, since it is based on all collaborators viewing the
document in exactly the same way.
Screen sharing is one kind of simultaneous, real-time collaboration. The
document framework provides support for a different kind of simultaneous,
real-time collaboration. This operates at the level of changes to the
document, rather than changes to the screen, which is more efficient
because the amount of data needed to specify a document change is usually
less than the amount needed to update the screen.
It is also useful to have asynchronous (i.e., non real-time) collaboration.
One form of this is the ability to annotate a document. The document
framework provides low-level support for annotations through its model and
linking mechanisms (described below).
Hypermedia Linking
FIG. 2 shows an illustration of the present invention. The blocks represent
both the apparatus and the methods involved in the system. As shown in
FIG. 2, in the document framework, a link 204 is a bi-directional
connection between anchors 202 and 206. The meaning of an anchor is
application-specific, but in most cases an anchor identifies a sticky
selection. An anchor is sticky in that it always refers to the same data
regardless of the user's editing changes. For example, if the anchor
refers to a word within a text block, it always refers to that word, even
if the text around it changes. Anchors are associated with a particular
encapsulated block of data (called a model) 200. Each kind of data in the
system is represented as a specific subclass of TModel. The abstract base
class, TModel, defines generic protocol to enable other models to embed,
display, and manipulate this data as a "black box." For example, an
application can ask the model to create a presentation (view) of the data.
These presentations range from a small thumbnail to a fully editable
presentation. There is also protocol for accessing the anchors associated
with the model's data and for accessing other models embedded within it. A
document's data is represented by a hierarchy of models, with a single
model at the root called the "root model."
Once the user creates a link, the user can operate upon it. First, the user
can navigate from one end of the link to the other. In general, this
involves opening the document containing the target anchor, scrolling the
anchor into view, and highlighting the corresponding selection.
Applications can change this behavior; for example, navigating to a sound
document may simply play the sound without opening the document. It is
also possible to transfer data across the link in either direction. The
semantics of transferring data are (roughly) equivalent to copying the
source data, navigating to the destination anchor, and replacing the
existing data with the transferred data. It makes no difference to the
document framework whether the data is pushed or pulled through the link
(i.e., whether the source or destination initiates the transfer).
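The push and pull semantics described above can be illustrated with a
minimal C++ sketch. The names Anchor, Link, otherEnd, push and pull are
hypothetical and chosen only for this illustration; they are not the
framework's TModelAnchor or TModelLink interfaces, and an anchor is reduced
here to a plain string of data.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an anchor: a reference to some piece of data.
struct Anchor {
    std::string data;
};

// Hypothetical bi-directional link between two anchors.
class Link {
public:
    Link(std::shared_ptr<Anchor> a, std::shared_ptr<Anchor> b)
        : a_(std::move(a)), b_(std::move(b)) {}

    // Navigation works from either end: returns the anchor opposite `from`.
    std::shared_ptr<Anchor> otherEnd(const std::shared_ptr<Anchor>& from) const {
        return from == a_ ? b_ : a_;
    }

    // Push: the source end initiates; its data replaces the far end's data.
    void push(const std::shared_ptr<Anchor>& source) const {
        otherEnd(source)->data = source->data;
    }

    // Pull: the destination end initiates; the effect equals a push from
    // the far end.
    void pull(const std::shared_ptr<Anchor>& destination) const {
        destination->data = otherEnd(destination)->data;
    }

private:
    std::shared_ptr<Anchor> a_;
    std::shared_ptr<Anchor> b_;
};
```

In this sketch, calling push on one anchor and calling pull on the other
leave the two ends in the same state, which mirrors the statement that the
initiating end makes no difference to the framework.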
The document framework also allows one application to send an arbitrary
command across a link. This will allow cooperating applications to
implement custom features using the same basic linking mechanism. The
straightforward use for the document framework's low-level linking
mechanism is to allow users to create links between documents, navigate
those links, transfer data across them, etc. This isn't the only use for
links, however. Links will also be used to implement other application
features. In these features, the fact that links are created and
manipulated will be transparent to the user.
For example, the system-wide annotation facility uses links to associate an
annotation with the part of the base document to which it refers. The
system can position a posted note icon (representing the annotation) near
the part of the document to which it refers. In addition, if the
annotation contains a suggested change to the document, it is possible for
the author to "accept" the suggestion and have the system automatically
transfer data across the link from the annotation to the document.
Another user-transparent use of linking would be to implement a function
performed by the Edition Manager of System 7. The user could publish part
of a drawing and subscribe to that data in a word processing document.
Internally, the system would create a link between the documents, and
perhaps attach an attribute to the link that indicates the nature of the
link (e.g., which end is the source of the data). The existence of the
link may be transparent to the user.
Changes to the drawing are sent across the link to the word processing
document. The document framework does not restrict which end of the link
initiates this transfer. The user can navigate from the destination to the
source of the data (from the word processing document to the drawing
document). The document framework also supports navigating in the opposite
direction because all links are bi-directional.
The document framework's models also improve the way simple copy and paste
works. On the Macintosh, an application handles a paste command in one of
three ways, depending on the type of the data in the Clipboard.
(1) The receiving document can fully understand the pasted data, and the data
becomes a first-class part of the document. (2) The receiving document can
display the incoming data but not manipulate it. A typical example is a
picture pasted into a text document. In this case, the incoming data can
be displayed but not edited. (3) The incoming data type isn't understood
by the receiving document. Here, the paste command can't be completed, and
should be disabled.
If the data can't be absorbed, then it can be embedded in the receiving
document in the form of a model. Because models support generic protocol
for creating editable presentations of the data, embedded data is not
"dead" as is true on the Macintosh. Instead, the user can open up an
embedded model and view or edit the data it contains. This capability is
similar to that provided by HP's New Wave system or Microsoft's OLE
specification. An important difference is that the object-oriented
document framework makes it easier for a developer to implement a new data
type. Finally, if an application supports embedded models, then it can
paste every type of data. The paste command would never be disabled as
long as the Clipboard wasn't empty.
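The paste decision described above might be sketched as follows. This is an
illustration under stated assumptions rather than the framework's own
logic: ReceivingDocument, PasteOutcome and classifyPaste are hypothetical
names, data types are represented as strings, and an empty string stands
for an empty Clipboard.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class PasteOutcome { Absorbed, Embedded, Disabled };

// Hypothetical description of the document receiving the paste.
struct ReceivingDocument {
    std::set<std::string> nativeTypes;  // types absorbed as first-class data
    bool supportsEmbeddedModels;        // whether foreign data can be embedded
};

// Decides how a paste is handled. An empty clipboardType is assumed to
// mean that the Clipboard holds nothing.
PasteOutcome classifyPaste(const ReceivingDocument& doc,
                           const std::string& clipboardType) {
    if (clipboardType.empty()) return PasteOutcome::Disabled;
    if (doc.nativeTypes.count(clipboardType) > 0) return PasteOutcome::Absorbed;
    if (doc.supportsEmbeddedModels) return PasteOutcome::Embedded;
    return PasteOutcome::Disabled;  // type not understood, embedding unsupported
}
```

With supportsEmbeddedModels set, the only path to Disabled is the empty
Clipboard, which matches the observation that such an application can paste
every type of data.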
Eternal Undo
In most Macintosh applications, the undo command is precious, since only
the last change can be undone. The document framework uses the same kind
of command objects as MacApp, but saves as many command objects as
possible.
This decision has several benefits. First, it isn't as important to be
choosy about what commands are undoable. For example, in existing
Macintosh applications, changes to the selection are not undoable, even
though some selections can be difficult to create. In a drawing program,
the user can spend much time selecting the right combination of shapes and
lose everything with an extra click. If the system supports only one level
of undo, then it is unwise to save a selection change if it means
forgetting about the last Cut command, for example. With multiple levels
of undo, it is feasible to save selection changes.
Another benefit of using command objects is increased reliability. If every
command is saved, then it is possible to replay those commands in the
event of an application or system crash. The document framework uses
concurrency control and recovery classes to save command objects in a
robust manner. In the event of a crash, the user should not lose more than
the last couple of commands. With multi-level undo comes the added burden
of providing a good way to visualize and navigate the list of commands.
This is especially true if selection changes are included, because it will
be easy for the user to create hundreds of commands.
FIG. 3 shows the document framework's linear list 300 of command objects 302,
which can be likened to a stack. This means that undo 306 returns the
document to a previous state. There are many other ways in which the undo
processing could be carried out. It is also contemplated that the user can
selectively undo commands 304 (i.e., undo a Cut command but keep all
subsequent commands intact). It should be remembered that commands are
dependent on one another. A command that copies a shape is dependent on
the command that first created the shape. These dependencies would
complicate the user interface to undo, as well as the underlying
implementation.
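A stack of command objects with multi-level undo can be sketched as
follows. The class names Command, AppendCommand and CommandStack are
hypothetical, and the "document" is assumed to be a plain string for
brevity. Selective undo is deliberately not attempted, since it would
require tracking the dependencies between commands noted above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical command object: knows how to apply and reverse one change.
class Command {
public:
    virtual ~Command() = default;
    virtual void doIt(std::string& doc) = 0;
    virtual void undoIt(std::string& doc) = 0;
};

// Example command: appends text to the end of the document.
class AppendCommand : public Command {
public:
    explicit AppendCommand(std::string text) : text_(std::move(text)) {}
    void doIt(std::string& doc) override { doc += text_; }
    // Valid only when undone in stack order, i.e. when text_ is still the tail.
    void undoIt(std::string& doc) override {
        doc.erase(doc.size() - text_.size());
    }

private:
    std::string text_;
};

// Linear list of executed commands, used as a stack.
class CommandStack {
public:
    void execute(std::unique_ptr<Command> cmd, std::string& doc) {
        cmd->doIt(doc);
        done_.push_back(std::move(cmd));
    }

    // Undoes the most recent command; returns false if nothing is left.
    bool undo(std::string& doc) {
        if (done_.empty()) return false;
        done_.back()->undoIt(doc);
        done_.pop_back();
        return true;
    }

    std::size_t depth() const { return done_.size(); }

private:
    std::vector<std::unique_ptr<Command>> done_;
};
```

Because every executed command is retained, undo can be repeated until the
stack is empty, in the reverse of the order in which commands were stacked.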
A good solution is to integrate the undo and scripting mechanisms, for
example to automatically create a script of everything that is done. The
user can then edit the script to remove arbitrary commands, rearrange
commands, etc. and execute the script. This gives users the maximum
flexibility and control.
Content-based Retrieval
Users have more and more information available on their
computers. Local hard disks are getting larger, and there are many CD-ROMs
available that contain hundreds of megabytes of data. It is impossible to
browse through this data without some assistance from the system. In the
Macintosh, the standard tool used to be Find File, which located documents
based on their names. System 7 provides a Find command that is integrated
within the Finder. Third party developers also provide tools that go
beyond Find File and search for documents based on their content, but
which aren't well-integrated with the system.
It is important that these retrieval tools be integrated into the system.
There's little point in locating documents if the user can't do anything
with them. The third party content-based retrieval tools on the Macintosh
get no system support in examining the contents of the document, or even
opening the document with the appropriate application.
FIG. 4 shows the generic retrieval framework of the document framework from
a source of information 400. The framework handles both indexing 402 and
queries 404. Although many future operating systems will ship with a
default indexing and query package, users will still want the ability to
plug new search packages into the basic framework. The point of
designing a framework is so that the background indexing mechanism and
query user interface are the same regardless of the underlying retrieval
technology.
The framework will provide for automatic, background indexing of documents
when they are added to a volume or changed. A retrieval system is
worthless if only some of the documents are indexed, and it is
unreasonable to place this burden on the user. A generic query front-end
will allow a user to install a new retrieval engine and not have to learn
a new front-end.
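The idea of a fixed front-end with a replaceable retrieval engine can be
sketched as follows. RetrievalEngine, SubstringEngine and
RetrievalFramework are hypothetical names introduced for this illustration,
the engine shown is a trivial substring matcher, and indexing runs
synchronously here rather than in the background.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical plug-in interface that any retrieval technology implements.
class RetrievalEngine {
public:
    virtual ~RetrievalEngine() = default;
    virtual void index(const std::string& name, const std::string& content) = 0;
    virtual std::vector<std::string> query(const std::string& term) const = 0;
};

// Trivial example engine: keeps full contents and matches by substring.
class SubstringEngine : public RetrievalEngine {
public:
    void index(const std::string& name, const std::string& content) override {
        contents_[name] = content;
    }
    std::vector<std::string> query(const std::string& term) const override {
        std::vector<std::string> hits;
        for (const auto& entry : contents_) {
            if (entry.second.find(term) != std::string::npos) {
                hits.push_back(entry.first);
            }
        }
        return hits;  // document names in alphabetical order
    }

private:
    std::map<std::string, std::string> contents_;
};

// Front-end whose interface stays the same whichever engine is installed.
class RetrievalFramework {
public:
    explicit RetrievalFramework(std::unique_ptr<RetrievalEngine> engine)
        : engine_(std::move(engine)) {}

    // Called when a document is added or changed; a real system would
    // presumably queue this work for background indexing.
    void documentChanged(const std::string& name, const std::string& content) {
        engine_->index(name, content);
    }

    std::vector<std::string> find(const std::string& term) const {
        return engine_->query(term);
    }

private:
    std::unique_ptr<RetrievalEngine> engine_;
};
```

Installing a different engine only changes the object passed to the
RetrievalFramework constructor; callers of documentChanged and find are
unaffected.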
Classes, The Developer's View
This section contains class and member function descriptions. Certain
conventions are followed. All classes have virtual destructors. If the
destructor does anything more than release storage, it is discussed;
otherwise, nothing is said. Methods inherited from MCollectible (FIG. 5,
element 500), such as streaming operators, are not discussed here. Many of
the classes have getters and setters which simply do field accesses.
Object Surrogates
Several objects in the document framework provide object surrogates, which
act as compact, address space independent references to the real object.
This use of surrogates can be described as "Explicit master and
surrogates". In a few cases, surrogates and real objects can be used
interchangeably, but in most cases the surrogate serves as a "key" for
finding the real object. In all cases the relationship between the
surrogate and real object is defined by a common abstract base class. The
abstract base class provides the address space independent identification
for the object and implements the protocol for comparison between objects
(IsEqual). This allows a surrogate to be compared directly against a real
object and used as a key into a set of real objects for lookup. Surrogate
objects provide a constructor which will create a surrogate from a real
object allowing easy creation of surrogates.
For example, it is necessary for TModelSelections to identify TModel
objects in an address space independent manner to allow commands to access
data in multiple collaborators' address spaces. TModel and TModelSurrogate
both derive from the common base class TAbstractModel. TAbstractModel
provides the address space independent identification of the object and
implements the protocol for comparison between objects. TModel provides
static member functions for looking up the real model from a surrogate.
When a selection is streamed, it streams only a surrogate for the model to
which it refers. When the command attempts to access the data through the
selection, the real model is looked up using the surrogate to provide the
command access to the real model data.
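The relationship between a real object, its surrogate and their common base
class might be sketched as follows. AbstractModel, Model and ModelSurrogate
loosely mirror TAbstractModel, TModel and TModelSurrogate, but every member
signature, the integer identifier and the registry are assumptions made for
this illustration and are not taken from the framework.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

using ModelId = unsigned long;  // assumed address space independent identity

// Common base class: holds the identity and implements comparison.
class AbstractModel {
public:
    explicit AbstractModel(ModelId id) : id_(id) {}
    virtual ~AbstractModel() = default;
    ModelId id() const { return id_; }
    bool isEqual(const AbstractModel& other) const { return id_ == other.id_; }

private:
    ModelId id_;
};

// The real object: owns the data and registers itself for lookup.
class Model : public AbstractModel {
public:
    Model(ModelId id, std::string data)
        : AbstractModel(id), data_(std::move(data)) {
        registry()[id] = this;
    }
    ~Model() override { registry().erase(id()); }
    Model(const Model&) = delete;
    Model& operator=(const Model&) = delete;

    const std::string& data() const { return data_; }

    // Static lookup of the real model from any key, such as a surrogate.
    static Model* lookUp(const AbstractModel& key) {
        auto it = registry().find(key.id());
        return it == registry().end() ? nullptr : it->second;
    }

private:
    static std::map<ModelId, Model*>& registry() {
        static std::map<ModelId, Model*> models;
        return models;
    }
    std::string data_;
};

// The surrogate: carries only the identity, never the data.
class ModelSurrogate : public AbstractModel {
public:
    explicit ModelSurrogate(const Model& real) : AbstractModel(real.id()) {}
};
```

A selection in this sketch would store and stream only a ModelSurrogate,
and a command would call Model::lookUp on it when it needs the data.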
Data Representation
FIG. 5 shows Representation classes. The abstract base class that
encapsulates the data for a particular data type is TModel 506. Derived
classes of TModel 506 are the containers for "document" data. They provide
type specific protocol for accessing and modifying the data contained in
the model. They must also override generic protocol that supports
embedding other models and being embedded in other models. This mostly
involves overriding protocol for creating selections and user interfaces
on the data.
TModel objects 506 also provide notification of changes to the contained
data to interested objects (typically presentations). Notification could
be provided using a standard notification facility of the underlying
system, if the system has such notification available. See the "Data
Presentation" section for details on model notification.
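Change notification from a model to its presentations can be sketched with
a simple listener list. NotifyingModel, Listener, addListener and setData
are hypothetical names; the notification framework actually used with the
document framework is not reproduced here.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model that informs interested objects when its data changes.
class NotifyingModel {
public:
    using Listener = std::function<void(const std::string&)>;

    // Registers an interested object, typically a presentation.
    void addListener(Listener listener) {
        listeners_.push_back(std::move(listener));
    }

    // Modifies the contained data, then notifies every listener in
    // registration order.
    void setData(std::string data) {
        data_ = std::move(data);
        for (const auto& listener : listeners_) {
            listener(data_);
        }
    }

    const std::string& data() const { return data_; }

private:
    std::string data_;
    std::vector<Listener> listeners_;
};
```

A presentation would register a listener that redraws itself, so that every
view of the same model stays consistent after an edit.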
The class TModelSurrogate 504 provides a lightweight stand-in or "name" for
an actual TModel 506. It does not actually contain the data (only the real
model does that), but it does provide protocol for a subset of the behavior
supported by TModel: specifically, the behavior necessary to appear and
operate within the "Workspace". The behavior which TModelSurrogate 504 and
TModel 506 share is defined by their common base class TAbstractModel 502.
Compound Document Structure
A document, such as shown by element 600 in FIG. 6, can contain many models
which can be of many different classes. The basic structure of the
document is a hierarchy of models which reflects the containership
hierarchy of the document. A single model exists at the root of the
hierarchy and is referred to as the "root model." Each model in the
hierarchy may be a container for other models embedded within. A container
model refers to its embedded models with model surrogates. C++ pointers
must not be used because the embedded models may be filed using separate
contexts from the container and are not necessarily in memory with the
container. A model accessed through a surrogate is automatically filed in
if it is not already in memory.
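The containership hierarchy and on-demand filing can be sketched as
follows. DocModel, ModelStore and collectKinds are hypothetical names, a
surrogate is reduced to an integer identifier, and "filing in" is simulated
by copying from one in-process map to another rather than by reading from
storage.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using DocModelId = int;  // stands in for a model surrogate

// Hypothetical model: a kind of data plus surrogates for embedded models.
struct DocModel {
    std::string kind;
    std::vector<DocModelId> embedded;  // identifiers, not C++ pointers
};

class ModelStore {
public:
    // Files a model out under the given identifier.
    void file(DocModelId id, DocModel model) { filed_[id] = std::move(model); }

    // Resolves a surrogate, filing the model in on first access.
    // Throws std::out_of_range if no model was filed under `id`.
    DocModel& resolve(DocModelId id) {
        auto it = inMemory_.find(id);
        if (it == inMemory_.end()) {
            std::unique_ptr<DocModel> loaded(new DocModel(filed_.at(id)));
            it = inMemory_.emplace(id, std::move(loaded)).first;
        }
        return *it->second;
    }

    std::size_t loadedCount() const { return inMemory_.size(); }

private:
    std::map<DocModelId, DocModel> filed_;
    std::map<DocModelId, std::unique_ptr<DocModel>> inMemory_;
};

// Processing the root model applies the processing to every embedded
// model, depth first, in containership order.
void collectKinds(ModelStore& store, DocModelId id,
                  std::vector<std::string>& out) {
    DocModel& model = store.resolve(id);
    out.push_back(model.kind);
    for (DocModelId child : model.embedded) {
        collectKinds(store, child, out);
    }
}
```

Starting collectKinds at the root identifier visits the whole document, and
each embedded model is brought into memory only when the traversal first
reaches it.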