Description
BACKGROUND OF THE INVENTION
The present invention relates to methods and systems for processing
information in a data processing system. In particular, the invention
relates to methods and apparatuses for searching for information stored in
information storage devices coupled to at least one data processing
system.
The process of searching through a large volume of documents which contain
text in order to find a particular document or documents is often a very
useful technique for obtaining information. Typically, the text of these
documents is stored in electronic media in an information storage device
(for example, magnetic media in a device such as a hard disk or an optical
medium) which is coupled to a data processing system, such as a digital
computer. It is often the case that an enormous volume of text is stored
in electronic form in such a storage device. For example, a large number
of U.S. patents are maintained in electronic form by various entities.
Similarly, the full text of numerous periodicals, including newspapers, is
often stored in information storage devices in the form of a database or
other file, and users often want to search these databases or files to
find articles, documents, etc. that are of interest to the user.
At times, the information being searched may reside locally on the computer
system which is being used by the user; for example, text in electronic
form from numerous sources such as articles from newspapers may be stored
on a hard disk of the user's computer system and may be searched by
commercially available full text searching software such as Gofer (TM),
Sonar (TM), and ZYINDEX (TM). Unfortunately, the source of information may
be so large that it cannot fit within a typical hard disk or other storage
device of a typical personal computer. In this case, it is often
necessary, due to the economics of computing resources, to spread the cost
of large information storage devices among numerous users who are linked
together by a computer network, such as a local area network. A well known
example of a computerized network which includes information storage
devices capable of storing large quantities of information is the
Lexis/Nexis (TM) system run by Mead Data. In this case, it will be
appreciated that this "network" is considerably larger than a normal local
area network.
In prior art systems for searching for text information in a data
processing system, the user may enter a single search request and then
request either the local processor (e.g. the client workstation) or a
remote processor (e.g. a server workstation) to execute the search request
by performing a search through the information stored in an information
storage device for documents which match the search request. While the
search is being executed, it is not possible for the user to concurrently
enter a further search request or to cause that further search request to
be executed concurrently with the first search. Consequently, the user
must wait after requesting execution of the first search request before
entering a further search request and causing that further search request
to be executed. While this is often acceptable in environments where all
of the processing occurs on a local workstation (e.g. a personal
computer), this situation is particularly inefficient in a network
environment. In this environment, servers may be called upon by a number
of different users from different client workstations to execute different
tasks or perform other tasks such as preparation of new documents (e.g.
indexing existing documents), and thus a server would not be available to
process a search request. Consequently a client user would be prohibited
from even entering a second search request until the server has had an
opportunity to execute the first search request after handling other prior
tasks from other users in a network. Wide area networks (with
interconnected local area networks) pose an even greater problem in the
sense that the gateways and routers interconnecting local area networks
may be busy with other transactions, and thus a user and his/her machine
may be prevented from any other searching activities while a first search
request is being processed.
In many information sources, such as databases, there is often a need to
add new documents which have come into existence after the creation of the
database, or add modified documents which have been modified since the
creation of the database or information source. For example, a textual
database containing articles from newspapers will need to be periodically
updated with subsequently released newspapers in order to keep the
database current with the current contents of the newspaper. Similarly, if
the information source is a collection of U.S. patents, then the
information source will need to be updated with U.S. patents which issued
subsequent to the last date on which the database was modified to include
newly issued U.S. patents. In prior art systems, a user would normally
define a search request at one point in time and then have to repeat that
search request at a later time by manually entering the search again in
order to see whether any new documents have been added to the database
since the last search. In such prior art systems the manual entry of a
subsequent search request (or retrieval of a saved search request to be
executed again) will result in the generation of a report which is a
listing of documents found in the search, where the format of the report
is identical to the format used in responding to a normal search request.
Even systems which execute automatic future searches (e.g. the "Eclipse"
feature in Lexis) do not generate specially formatted reports. That is,
the response of the data processing system to a subsequent search request
will be identical in format to the response from the search request when
previously executed. No special effort is taken to display the information
to the user in a manner which is helpful in evaluating updated information
available from the information source since a prior search. Indeed, in
many systems, the report of a subsequent search will include the
results of a prior search and thus there can be considerable
duplication between an original report from a first search and a
subsequent report in a subsequent search.
In these prior art systems which utilize information sources which change
over time, it is often necessary to perform "maintenance" on the
information source. This maintenance typically includes adding additional
documents or removing documents as well as indexing new documents or
compressing/compacting indexes which have been changed due to the removal
of documents from the database. This "maintenance" is typically performed
in a network of computer systems where one computer system, referred to as
a server workstation or computer system, typically controls access to
information sources by other computer systems in the network such as those
systems referred to as "client" computer systems. In these environments, a
user at a client workstation is often presented with a list of available
information sources even though a particular information source is
unavailable or is undergoing maintenance by the operator of the server
workstation. In this circumstance, the results of any search performed may
be erroneous; for example, the user of the client system may believe that
in fact he/she is searching an information source when in fact it is not
available.
From the foregoing discussion, it can be seen that it is desirable to allow
a user of a computer system to be able to execute further searches after
requesting a first search, particularly in a network environment. It is
also desirable to present to the user data, in a summary format, showing a
report of a scheduled search, particularly one which has been scheduled to
occur automatically by the user, in order to improve the user's efficiency
in evaluating the search results from a scheduled search. It is also
desirable to allow a user on a client computer system in a network to
obtain accurate information about the availability of information sources
while also allowing the operator of a server computer system to maintain
the information sources and also provide accurate information to users of
client systems about the availability of information sources.
SUMMARY OF THE INVENTION
A method and apparatus for providing maintenance to information stored in a
network of data processing systems, where the information is searched by
various data processing systems in the network, is described. The apparatus
includes a first processor having a first display device; the first
processor is coupled to an information storage device having information
stored in at least one information source. The first processor is coupled
to the network typically through a network interface. An input device,
such as a keyboard, mouse, trackball, touchpad, stylus, or other well
known computer input device is also coupled to the first processor. The
input device is used to select at least one information source to provide
a selected information source which is to be unavailable for searching by
other processors in the network. A second processor having a second
display is coupled to the network to communicate with the first processor.
The second display is for displaying an indicia of at least one
information source, and typically, this indicia of the at least one
information source is displayed in an information source window when the
information source is not selected for hiding by the input device which is
coupled to the first processor. When the input device selects the at least
one information source to make it unavailable for searching, then at some
time after this information source has been selected, the indicia of that
information source will not be displayed on the second display device. In a
typical embodiment, a timer which is coupled to the second processor
periodically causes the second processor to determine whether any
information source has been selected by the user of the first processor to
make the information source unavailable for searching; for any such
information source which has been selected to be unavailable for
searching, the indicia for such information source will be removed from
the display on the second display device. In an alternative embodiment,
when an information source is hidden by the user of the first processor
(e.g. server), the first processor will, as soon as possible, notify the
second processor that the source has been hidden and the second processor
will thereafter not display the source as being available until after the
source is again made available.
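By way of illustration only, the timer-driven embodiment described above may
be sketched as follows. The names Server, Client, hide, and on_timer_tick are
hypothetical and do not appear in the present description, and the network
is simulated by a direct method call rather than by any particular protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stand-in for the first processor (e.g. a server)."""
    sources: set[str]
    hidden: set[str] = field(default_factory=set)

    def hide(self, name: str) -> None:
        # The operator selects a source to make it unavailable for searching.
        if name in self.sources:
            self.hidden.add(name)

    def unhide(self, name: str) -> None:
        # The source is made available again after maintenance.
        self.hidden.discard(name)

    def available_sources(self) -> list[str]:
        return sorted(self.sources - self.hidden)


@dataclass
class Client:
    """Hypothetical stand-in for the second processor (e.g. a client)."""
    server: Server
    displayed: list[str] = field(default_factory=list)

    def on_timer_tick(self) -> None:
        # Called periodically by a timer; refreshes the displayed indicia.
        self.displayed = self.server.available_sources()


server = Server(sources={"Info. Source 1", "Info. Source 2"})
client = Client(server)
client.on_timer_tick()

server.hide("Info. Source 2")
shown_before_tick = list(client.displayed)  # still lists both sources
client.on_timer_tick()
shown_after_tick = list(client.displayed)   # hidden source is gone
```

In this sketch the hidden source disappears from the client's list only at
the next timer tick, which corresponds to the "at some time after"
behavior of the timer-driven embodiment.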
The method of the present invention for processing information in a network
of data processing systems includes displaying, on a first display device,
a first indicia corresponding to at least one information source stored on
an information storage device. The first display device and the
information storage device are coupled to a first processor which is
coupled to the network. On a second display device, a second indicia,
which corresponds to the at least one information source is displayed. The
second display device is coupled to the second processor and the second
processor is coupled to the network to communicate with the first
processor. The method further includes the step of selecting from the
first display device an information source to be made unavailable for
searching and at some time after the information source has been selected
the second indicia is no longer displayed on the second display device.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 depicts an embodiment of data processing systems according to the
present invention in a network of data processing systems.
FIG. 2 shows in block diagram form information which is stored within the
memory of a server computer system.
FIGS. 3a and 3b depict an example embodiment of a method of the present
invention which involves concurrent searches initiated from the same
processor.
FIG. 4a shows a typical search request window which may be used for
defining search requests and specifying other search options.
FIG. 4b shows a typical example of a document display window showing the
text of a document found in a search of an information source.
FIG. 5 shows two search request windows where concurrent search processes
are being performed at the request of a user on one processor.
FIG. 6 is a flow chart describing a typical embodiment of the method for
scheduling searches and generating reports of those scheduled searches in
a summary format.
FIG. 7a shows a typical embodiment of a report in summary format generated
from a scheduled search.
FIG. 7b shows an embodiment of a search scheduling window for entering data
specifying the time and date of a scheduled search.
FIG. 8 shows a further embodiment of the summary format of a report
generated from a scheduled search, which summary format includes a table
of contents listing.
FIG. 9 shows an embodiment of an information sources window which may be
displayed on a server computer system in a network according to the
present invention.
FIG. 10 shows a typical search status window which may be displayed on a
display device of a server computer system in the network of the present
invention.
FIG. 11 shows a typical method of the present invention for performing
maintenance on information sources and for controlling the display of
available information sources on client computer systems in the network
according to the present invention.
FIG. 12 shows a typical information sources window which may be displayed
on a client computer system in the network of the present invention.
DETAILED DESCRIPTION OF THE PRESENT INVENTION
In the following description for purposes of explanation, specific systems,
interconnections, and processing steps are set forth in order to provide a
thorough understanding of the present invention. However, it will be
appreciated by one skilled in the art that the present invention may be
practiced without the specific details disclosed herein. In other
instances, well known systems are shown in diagrammatic or block diagram
form in order not to obscure the present invention unnecessarily.
The present description includes material protected by copyrights, such as
illustrations of graphical user interface images which the assignee of the
present invention owns. The assignee hereby reserves its rights, including
copyright, in these materials, and each such material should be regarded
as bearing the following notice: Copyright Apple Computer, Inc., 1993. The
copyright owner has no objection to the facsimile reproduction by anyone
of the patent document or the patent disclosure, as it appears in the
Patent and Trademark Office file or records, but otherwise reserves all
copyrights whatsoever.
Referring now to FIG. 1, various aspects of embodiments of the present
invention may be performed in a network of data processing systems, and
FIG. 1 shows an example of such a network. In particular, FIG. 1 shows two
local area networks interconnected by a gateway 61 which is typically a
computer system configured to operate as an interface between local area
networks and often includes a modem for communicating over telephone
systems. One local area network (LAN) includes a server computer system 9
and two client computer systems 33 and 57 as well as an information server
55. The network coupler 31 is typically a network bus comprised of wires
or fiber optic cable or may be a wireless network which uses radio
communication transmitters and receivers. The network may operate pursuant
to any number of networking standards including well known standards such
as Local Talk (TM), EtherNet, Token Ring, AppleTalk Remote Access (TM)
(ARA) as well as other well known networking standards. A second local
area network (LAN2) 65 typically comprises a plurality of computer systems
(which may be referred to as workstations) where some computer systems may
be server computer systems and other computer systems may be client
computer systems. It will be appreciated that computer systems in the
second local area network 65 may communicate via the gateway 61 with other
computer systems in the first local area network, such as the server
computer system 9. It will also be appreciated that the gateway 61 may be
eliminated, and the server 9 may perform the functions of a gateway system
between the two local area networks. Also shown is an Internet server 63
which may store information (e.g. on computer "bulletin boards") which may
be retrieved by a computer system in either of the local area networks. It
will be appreciated that Internet server 63 is part of the well known
system of interconnected computers known as "Internet."
The server computer system 9 includes a processor 10 and a memory 11 which
are interconnected by a system bus 12. A display controller 14 and a
display device 15 are coupled to the processor 10 through the system bus
12. A mass memory 17, which may be a local hard disk which stores
information in magnetic media or optical media, is coupled to processor 10
and the memory 11 through the system bus 12. Typically, a computer system
includes input and output devices in addition to a display device. For
example, an output device may be a hard copy printer. Numerous input
devices are also well known such as keyboards, mice, trackballs,
touchpads, and styluses (pens) and these input devices communicate with
processor 10 and memory 11 via a controller such as the I/O controller 21.
The server computer system 9 is linked to the other computer systems in
the network by a network interface 25 which is coupled to the system bus
12 by a local bus 19 and the server 9 is linked to other remote servers
(e.g. Internet server 63) in conventional ways (e.g. through gateway 61)
or, as noted, the server 9 may itself perform the functions of a gateway
system.
Client computer system 33 contains similar components which are
interconnected in a similar manner. For example, client computer system 33
includes the processor 37 and a memory 39 which are interconnected by a
system bus 41. Client computer system 33 also includes a network interface
35 which couples the processor and other components within the computer
system 33 to other computer systems in the network. Client computer system
33 also includes a display device 47, which may be a CRT or a liquid
crystal display or a plasma display or other well known display devices
used in computer systems. As with server system 9, the client system 33
includes input devices such as input device 51 which may include at least
one of a keyboard, mouse, trackball, touchpad, and a pen input device as
well as other well known computer input devices. As with server system 9,
the display controller 45 couples the display 47 to other components
within the computer 33, and the I/O controller 49 couples the input/output
device 51 to other components, such as the processor 37 of client computer
system 33. It will be appreciated that the client computer system 57 is
typically similar to the client computer 33 and that server computer
system 55 is typically similar to the server computer system 9. It will be
appreciated that other network resources which are coupled to either local
area network of FIG. 1 may include printers, modems, memory, disk devices,
etc. as is well known in the art. See, for example, U.S. Pat. No.
5,150,464.
Prior to describing various aspects of the present invention in detail, a
general overview of various aspects will be provided with reference to
FIG. 1. In a typical embodiment, text documents (which may include other
information such as graphics) are stored on the mass memory device 17 of
the server computer system 9. Users of client computer systems, such as
computer system 33 may search through those text documents, such as
newspaper articles or articles from scientific engineering periodicals.
The searching process typically involves the user of a client computer
system specifying certain words which the user believes should be in
documents which the user desires to see. The user of the computer will
often type into a keyboard these words which are used to define a search
request. Then the user requests that the search be performed by typically
selecting an option representing a start search command which is displayed
on the display device 47. At this point, the processor 37 sends this first
search request over the network through network interface 35 and network
interface 25 to processor 10 which executes the search request by
performing a first search through the documents stored in mass memory 17.
It will also be appreciated that the Internet server 63 is similar to the
server system 9 and that data from the server 63 (e.g. data stored in
information storage devices coupled to the server 63) may be obtained by
the server system 9 using known networking techniques. Thus, data stored
on storage devices coupled to the server 63 may be searched through by
searching software on server 63 which is similar to the software on server
9 and which receives search requests from server 9 and executes the search
requests by searching the data and responds to the server 9 with the
results of the search requests. The server 9 combines the results of such
remote searches with the results of the search the server 9 performs on
data stored in local storage devices (e.g. mass memory 17) coupled to the
server 9. The combined search results are displayed to a user of a client
system (e.g. system 33) within one window. While the first search is
performed through the information sources stored in mass memory 17 or
elsewhere such as the information sources associated with the Internet
server 63, the processor 37 in the client system 33 may receive further
search requests from the user of the client system 33 such as a second
search request which seeks different types of documents.
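By way of illustration only, the combining of local and remote search
results described above may be sketched as follows. The function names
search and combined_search are hypothetical, the matching is a simple
case-insensitive substring test chosen for brevity, and the remote server is
simulated in-process rather than contacted over a network.

```python
def search(documents: dict[str, str], words: list[str]) -> list[str]:
    """Return titles of documents whose text contains every requested word."""
    wanted = [w.lower() for w in words]
    return [
        title
        for title, text in documents.items()
        if all(w in text.lower() for w in wanted)
    ]


def combined_search(local_docs, remote_docs, words):
    # Search the locally stored sources (cf. mass memory 17).
    local_hits = [("local", title) for title in search(local_docs, words)]
    # Simulated remote search (cf. Internet server 63); a real system would
    # forward the request over the network and await the reply.
    remote_hits = [("remote", title) for title in search(remote_docs, words)]
    # One merged list, suitable for display within a single window.
    return local_hits + remote_hits


local_docs = {
    "Acme annual report": "Acme balance sheet and cash flow",
    "Weather": "Rain expected",
}
remote_docs = {"Acme bulletin": "acme cash flow forecast posted"}

results = combined_search(local_docs, remote_docs, ["acme", "cash flow"])
```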
For example, the first request by the user of client system 33 may be
directed to data concerning financial information about a particular
company which data the user believes will be located in the information
sources stored in one of the memory devices coupled to the network. For
this search, the user may use words such as the company's name, "balance
sheet" and "cash flow" together with well known Boolean logic operators in
order to specify search parameters which specify the desired type of
information concerning financial information about the company. While the
processor 10 in the server system 9 is executing the first search for this
financial information, the user of this client system 33 may define a
second search request or perform some other search related operation such
as selecting other information sources to search through or scheduling a
search to be performed in the future at some scheduled time. The second
search request may typically comprise a different collection of words
which represent parameters which specify a second type of desired
information such as policy statements of a particular politician
concerning a particular economic/business issue. After defining this
second search request, the user of the client system 33 may instruct the
data processing system to perform this search, which will be a second
search while the first search is still being performed.
It will be appreciated that the present invention has particular utility in
a network environment where certain searchable information sources may be
temporarily inaccessible through the network due to use by other computer
users. The unavailability of these information sources would "tie-up" the
client system 33 and prevent it from performing any other search related
operation if the results of the first search needed to be completed before
other searching operations could be performed from system 33. Thus, for
example, if information server 55 or some other network resource was
temporarily occupied with performing another operation or servicing some
requests from users other than the user of client system 33, any searches of
searchable information sources on information server 55 would have to wait
until information server 55 was again available for searching. According
to the present invention, rather than waiting for the search result to be
completed from this first search which is stalled due to the
unavailability of the information server 55, the present invention allows
the user of client computer system 33 to perform a further search which
may involve different information sources which are not stored on the
information server 55 and thus obtain results from that second search
request while the first search request may still be pending. In this
manner, the present invention improves the efficiency of data processing
systems which allow for searching of information, particularly in a
network environment.
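By way of illustration only, the behavior described above, in which a second
search completes while a first search is stalled on an unavailable
information server, may be sketched with two worker threads. The names
first_search, second_search, and server_55_free are hypothetical, and the
busy server is simulated by an event that is set only when the server
becomes free.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Set when the (simulated) busy information server becomes available.
server_55_free = threading.Event()


def first_search() -> str:
    # Stalls until the busy information server is free.
    server_55_free.wait()
    return "results of first search"


def second_search() -> str:
    # Uses a different information source, so it is not blocked.
    return "results of second search"


with ThreadPoolExecutor(max_workers=2) as pool:
    first = pool.submit(first_search)
    second = pool.submit(second_search)

    # The second search returns while the first is still pending.
    second_result = second.result(timeout=5)
    first_was_pending = not first.done()

    # The busy server becomes available and the first search completes.
    server_55_free.set()
    first_result = first.result(timeout=5)
```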
According to another aspect of the present invention, the user of client
system 33 may define a first search request and the client 33 instructs
the server 9 to perform that search request in the future every time after
new or modified documents are added to the information sources which are
available for searching. In an alternative embodiment, the user of the
client system 33 may define the first search request and schedule the
performance of that search request in a first search at some first
scheduled time in the future. In either case, the search occurs at a
deferred time. Typically, the user of client system 33 will type into an
input device such as a keyboard, the words defining a future search, and
the processor 37 will communicate this search request along with the
scheduled time to the processor 10 which will store the search request as
well as the scheduled time. Typically, the scheduled time is a plurality
of scheduled times causing the processor 10 to periodically report the
results of searches for new and modified documents which have been made
available in the various information storage devices, such as the mass
memory 17 in the server 9. That is, this scheduled search is designed to
find only new documents or modified documents which have been added to the
information sources stored in the information storage devices of the
network since the search was defined (in the case of a first scheduled
search) or since the last scheduled search was performed. In this manner,
the user will be able to display on the display device 47 any new
documents since a last search in order to keep up on current developments on
those issues of concern to the user. It is noted that a "new" document
includes a document which previously existed in the information sources
prior to a scheduled search but which has been modified to contain new
content or modified content. The user is presented with a report showing
the results of each scheduled search, where this report is in a different
format than the format generated from a search which is not a scheduled
search; that is, the format from a scheduled search is a summary format
providing summary information which may include a table of contents as
well as other items, and this format differs from the normal reports
prepared for searches which are executed on a non-scheduled basis (e.g.
immediately after defining a search request).
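By way of illustration only, a scheduled search which reports only documents
that are new or modified since the last run, and which returns its results in
a summary format, may be sketched as follows. The names Document,
ScheduledSearch, and run, the integer timestamps, and the fields of the
summary are all hypothetical; the present description does not prescribe
any particular data layout.

```python
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    text: str
    modified: int  # hypothetical timestamp of creation or last modification


@dataclass
class ScheduledSearch:
    words: list[str]
    last_run: int  # time the search was defined, or time of the last run

    def run(self, documents: list[Document], now: int) -> dict:
        # Keep only documents added or modified since the last run.
        hits = [
            d for d in documents
            if d.modified > self.last_run
            and all(w.lower() in d.text.lower() for w in self.words)
        ]
        self.last_run = now
        # Summary format: a count plus a table-of-contents style listing.
        return {
            "run_at": now,
            "count": len(hits),
            "contents": [d.title for d in hits],
        }


docs = [
    Document("Old", "acme cash flow", modified=5),
    Document("New", "acme cash flow update", modified=15),
]
scheduled = ScheduledSearch(["acme"], last_run=10)
report1 = scheduled.run(docs, now=20)

docs.append(Document("Newer", "Acme files report", modified=25))
docs[0].modified = 30  # a modified document is treated as "new"
report2 = scheduled.run(docs, now=40)
```

In this sketch the second report omits the document already reported in the
first run, which avoids the duplication between successive reports noted in
the background discussion.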
According to another aspect of the present invention, the operator of the
server computer system 9 must perform maintenance on the information which
is searchable in the information sources stored in the information storage
devices coupled to the network, such as mass memory 17. This maintenance
may include adding new documents not previously stored in the information
sources as well as removing documents which are desired to be removed and
other well known maintenance operations. The user of the server system 9
may select certain information sources for maintenance and by doing so
cause, at some point in time after selection for maintenance, the
information source to no longer be displayed at a display device of a
client system, such as client system 33.
FIG. 2 illustrates data structures and computer programs which are used
with the various searching operations performed according to the present
invention. Memory 11, as shown in FIG. 2, includes data 201 which
specifies a first search request including search parameters and scheduled
search times. Another search data 203 may specify a second search request
which includes parameters for searching pursuant to the second search
request. Memory 11 will typically also include at least pointers 205 to
searchable information such as information sources. These pointers are
typically addresses to the mass memory 17 or other addresses for other
information storage devices coupled to the network, where these pointers
and addresses are provided in the well known manner of the prior art.
Memory 11 further includes software, such as software 207 for searching
through textual documents. This software will typically be capable of
searching through the full text of a textual document and may, according
to one embodiment, include indexing software for indexing the words in a
document to create an indexed list of words in the well known manner of
the prior art. It will also be appreciated that the operation of the
search engine in conjunction with performing a single search from a single
search request is well known in the prior art. The memory 11 further
includes request and reply control software and windowing interface
software which may be implemented using conventional techniques. This
software, as well as the search and indexing software 207, is typically
executed by processor 10. It will be also appreciated that processor 37 as
well as other processors in other client systems will contain similar
request and reply control software for allowing users to define search
requests and for providing reports of the results of the searches back to
the user at a client system by displaying the report on a display device
or allowing the user to print reports and/or individual documents listed
in the reports. In addition, memory 11 will typically include updated
search reports 211 which are generated as a result of performing scheduled
searches at scheduled times according to one embodiment of the present
invention. These updated search reports will be accessed by client systems
through the network in order to display in summary format the results of a
scheduled search.
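By way of illustration only, the data held in memory 11 as described above
may be modeled as follows. The class and field names are hypothetical, the
dictionary keys merely echo the reference numerals of FIG. 2, and the
address string is a made-up placeholder rather than an actual address format.

```python
from dataclasses import dataclass, field


@dataclass
class SearchRequest:
    parameters: list[str]                                     # search words
    scheduled_times: list[str] = field(default_factory=list)  # empty if unscheduled


@dataclass
class ServerMemory:
    # Search request data (cf. data 201 and 203).
    requests: dict[int, SearchRequest] = field(default_factory=dict)
    # Pointers to searchable information sources (cf. pointers 205).
    source_pointers: dict[str, str] = field(default_factory=dict)
    # Reports generated by scheduled searches (cf. reports 211).
    updated_reports: dict[int, list[dict]] = field(default_factory=dict)


memory_11 = ServerMemory()
memory_11.requests[201] = SearchRequest(
    ["Acme", "balance sheet"], ["1993-06-01T08:00"]
)
memory_11.requests[203] = SearchRequest(["policy", "economy"])
memory_11.source_pointers["Info. Source 1"] = "mass-memory-17:/sources/1"
memory_11.updated_reports.setdefault(201, []).append(
    {"count": 0, "contents": []}
)
```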
One aspect of the present invention will now be described with reference to
FIGS. 3a, 3b, 4a, 4b and FIG. 5.
FIGS. 3a and 3b illustrate a typical process according to the present
invention while FIGS. 4a, 4b, and 5 illustrate typical user interface
screens which result at various points from the searching process. The
process begins in step 301 in which the user of the server computer 9
starts the server searching software to run on processor 10 and the user
of a client computer system, such as client system 33 starts the client's
searching software to run on processor 37. In step 303, the user of the
client system selects a particular server which should be running the
server search software on the server processor. In the example of FIG. 1,
the user of the client system 33 would select the server 9 which includes
the processor 10. In step 305 the user on the client system 33 creates a
new search "agent" within a window 401 shown in FIG. 4a. The presentation
of the user interface of window 401 may be implemented by well known
programming techniques. Window 401 may be referred to as a search request
window for a first search request specified in box 403 by parameters 403a.
It will be appreciated that these parameters typically include words which
the user expects to find in documents which the user desires to retrieve.
These parameters may also include Boolean operators and other special
purpose characters (e.g. "Wildcard" characters) to interconnect the
various words; it will be appreciated that this process of defining search
requests is well known in the art. In step 307, the user may create a
second search agent (search request window) such as window 501 shown in
FIG. 5. Both search window 1 and search window 2 may exist concurrently
and may concurrently report the results of, or the status of, searching
operations for two different search requests.
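By way of illustration only, a search request combining words with Boolean
operators and wildcard characters may be evaluated against a document as
sketched below. The function names and the all_of, any_of, and none_of
parameters (standing for AND, OR, and NOT) are hypothetical, and the wildcard
syntax is that of the Python fnmatch module, which is assumed here for
convenience and is not taken from the present description.

```python
import re
from fnmatch import fnmatchcase


def tokens(text: str) -> list[str]:
    # Split text into lower-case words.
    return re.findall(r"[a-z0-9]+", text.lower())


def has_term(words: list[str], pattern: str) -> bool:
    # "*" and "?" act as wildcard characters, e.g. "financ*".
    return any(fnmatchcase(w, pattern.lower()) for w in words)


def matches(text, all_of=(), any_of=(), none_of=()):
    words = tokens(text)
    return (
        all(has_term(words, p) for p in all_of)                      # AND
        and (not any_of or any(has_term(words, p) for p in any_of))  # OR
        and not any(has_term(words, p) for p in none_of)             # NOT
    )


hit = matches(
    "Acme financial results: cash flow improved",
    all_of=["acme", "financ*"],
    any_of=["cash", "balance"],
    none_of=["weather"],
)
miss = matches("Acme weather report", all_of=["acme"], none_of=["weather"])
```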
In step 309, the user of the client system 33 may select at least one
information source for the first search request defined by parameters
403a. In one embodiment, the user may position a cursor 409 over the icon
431 to cause the client system 33 to communicate with the server system 9
in order to determine the available information sources in the network,
which are available for searching currently. In another embodiment, the user
may select the icon 431 to select available information sources and the
client system 33 retrieves the list of such sources from a local storage
device (e.g. memory 39 or memory 43) which contains a list of the
available information sources in the network (which may include externally
remote sources such as those stored on Internet server 63). In this
embodiment, this locally stored list is "cached" on a local storage device
of client system 33 in order to avoid retrieval, through the network, of a
list from the server 9 each time icon 431 is selected; the cached list
may be created upon initialization of the search software on client system
33 and may be regularly updated. In response to selecting the icon 431,
which represents a command for retrieving an information sources window,
the client system 33 displays an information sources window to the client
user, such as the window shown in FIG. 12. Within the window, the user may
select various information sources which may be categorized according to
subject matter/topic in order to speed searching. This is shown in FIG.
12. After selecting the appropriate information sources in step 309 the
selected information sources for the first search request initiated by
search window 401 will be available for searching and after searching they
will be displayed within the lower portion of the window 401. For example,
an icon 421 for an information source having a name "Info. Source 1" is
displayed within the region along with status information 423 showing the
number of matches after a search. For example, this status information 423
may show the number of documents which match the request in the
information source. In addition, the search window 401 displays an icon
representing each document, such as icons 424 and 425a. Associated with
each document icon is a relevance ranking such as relevance ranking 425b,
a date 425c and the title of the document 425d, which is typically the
file name of the docume