Description
BACKGROUND OF THE INVENTION
The present invention relates to a document storage and retrieval system
for filing documents as an image, and is particularly concerned with a
document storage and retrieval system capable of full text searching.
Typical information retrieval systems have hitherto retrieved data chiefly
by keyword and classification code. Bibliographic information and patent
information have been processed into data bases by means of such systems.
These data bases mainly cover bibliographic information, including
abstracts, and therefore meet only part of the true need of information
retrieval. That is, even if a conceivably relevant document or patent is
found, the text itself must still be sought among many bookshelves.
Meanwhile, optical disks capable of storing mass data have now become
available for loading the text into the data base, providing a so-called
original document information service and thus meeting a social need. A
paperless documentation system at the Patent Office is planned accordingly.
In these systems, volumes of documents are stored on optical disks in the
form of image data, and a conventional information retrieval technique
based mainly on keyword search is applied.
However, the conventional information retrieval technique is only effective
in narrowing the candidates to the order of tens to hundreds of documents,
and hence a further technique for reducing the relevant documents to about
1/10 of that number is desired. One method is to call an original document
(text) stored as image data onto a terminal, where it is read visually by a
searcher. The method is reliable in principle; however, documents numbering
up to hundreds are too many to read out in the form of image data, and
reading them one by one is not efficient in practice.
On the other hand, the conventional method based on keywords and
classification codes has an intrinsic problem: the classification system
itself changes as time passes and must be updated continually. For example,
the large volume of documents already classified cannot practically be
reclassified when the classification system is modified later. Documents
and patents recording the progress of science and technology are novel in
content, and hence of value, because they provide new concepts which often
are not included in the conventional classification system. It is therefore
impossible to define beforehand the keywords and the classification system
representing such a concept, which is an essential problem for the
information retrieval system.
For the reasons mentioned above, it is desirable to provide a method which
retrieves contents with direct reference to the text of a document. With
such a method, retrieval can be carried out using vocabulary for a concept
which was not deemed important when the document was registered in the data
base but is regarded as new at the time of retrieval. Moreover, an
important document can be found directly, without passing through the
"filter" of an indexer (a specialist who assigns indexes) at the time of
registration.
To satisfy such a requirement, it is necessary that character patterns be
extracted from the document stored as image data and that the text be
replaced by character codes, and a character recognition technique may be
applied for that purpose. However, a document to be filed, a printed
document for example, is not ideal for character recognition because print
quality and fonts vary widely. In a conventional optical character reader,
imperfect recognitions such as errors, rejections and the like are checked
and corrected by operators (for example, "Introduction to Character
Recognition" by Hashimoto, Ohm-Sha 1982, pp. 153-154). Accordingly, even if
the recognition precision is extremely high, visually checking the result
of recognizing the text is not realistic where the amount of documents is
very large, and hence a document filing system which has images as its main
constituents and is available for text retrieval has not been realized
until now.
SUMMARY OF THE INVENTION
An object of the invention is to provide a document storage and retrieval
system having a full text retrieval function with reference directly to
the text of a document by solving the problems referred to above.
In order to attain the above-mentioned object, the invention is
characterized in that a document is stored as image data, that the text or
a part of the document is stored as a character code string, that the
character code string is permitted to contain character recognition results
in which ambiguity remains, and that the text can be retrieved by matching
the character string.
That is, the document storage and retrieval system according to the
invention overcomes the disadvantages of handling a document as an image
without impairing the advantages of doing so. Hitherto, a filing system
handling documents in the form of images has chiefly retrieved according to
keywords and bibliographical items given separately. According to the
present invention, however, retrieval can be realized by referring to the
sentences written in the document.
For example, when "full text search" is inputted at a retrieving terminal, a
document in the document group under retrieval whose text contains, for
example, ". . . full text search using character recognition . . . " will
be identified and extracted, and the document can then be displayed on the
terminal as an image.
Displaying the document as an image prevents information from being lost
through character recognition. Generally in character recognition,
secondary information such as the position, size, font or the like of each
character is discarded in the process of normalization. Accordingly, the
typeface (Gothic or italic, for example) and the size of a recognized
character are not known after recognition, and thus printing in Gothic type
or in a large font to indicate importance becomes meaningless. This
corresponds to the case of speech, where the identity of the speaker and
the feeling expressed are not known after speech recognition. In the case
of documents, the secondary information is also important to a human
reader, and thus character recognition alone is inadvisable.
A first principle of the system according to the invention comprises, as
mentioned above, storing documents in the form of an image and also
storing the character portions redundantly as character codes. That is, the
character portions are stored both in the form of an image and as character
codes. The latter are used for retrieving, while the former is used for
outputting.
Now, to extract the character portions from the image and replace them with
character codes, it is necessary to segment and recognize characters. The
prior art can be applied thereto. However, perfect recognition cannot be
expected.
Then, a second principle of the system of the invention comprises leaving
any character which cannot be decided by character recognition as it stands
in the character string, handled as an aggregation of the top-ranked
candidate character categories.
For example, when ". . . full text retrieval using character recognition .
. . " is recognized,
". . . fu[1l][l1].DELTA.text.DELTA.retr[i1]eva[l1].DELTA. . . . " will be
obtained as a recognition result in the system. (Here, .DELTA. indicates a
blank.) The characters given in [ ] are the recognition results for a
single character pattern, and "[1l]" indicates "1" or "l".
Hitherto, an undecided character was necessarily replaced with the correct
character code by an operator to obtain a character recognition result (the
output of OCR). Here, "[" and "]" are special symbols, each to be assigned
a predetermined special character code which does not generally appear in
the text. The symbols "[" and "]" are used here simply for ease of
understanding.
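The bracket notation described above can be turned into a machine-readable form quite directly. The following is a minimal sketch in Python, not the implementation of the invention: the function name `parse_ambiguous` is hypothetical, the ordinary characters "[" and "]" stand in for the reserved special character codes, and a plain space stands in for the blank written as .DELTA. in the text.

```python
from typing import FrozenSet, List

OPEN, CLOSE = "[", "]"  # stand-ins for the two reserved special character codes


def parse_ambiguous(s: str) -> List[FrozenSet[str]]:
    """Convert e.g. 'fu[1l][l1]' into one candidate set per character position."""
    positions: List[FrozenSet[str]] = []
    i = 0
    while i < len(s):
        if s[i] == OPEN:
            j = s.index(CLOSE, i + 1)          # ValueError if a bracket is unclosed
            positions.append(frozenset(s[i + 1:j]))
            i = j + 1
        else:
            positions.append(frozenset(s[i]))  # decided character: a one-element set
            i += 1
    return positions


positions = parse_ambiguous("fu[1l][l1] text")
```

Each position is a set, so a decided character and an undecided one are handled uniformly by whatever matching step follows.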
As shown in FIG. 1, a document 10 is transformed into a notational
expression as indicated by 20 in the system according to the present
invention. The symbol string used is of the kind provided in languages such
as LISP and follows a notation called the S-expression. The process in
which the document (image) is transformed into the notational expression 20
is called document understanding or document recognition. The notational
expression signifies roughly the following: the document is numbered 99,
the class is "Technical Paper", VOL=5, NO=7, the author is named
"Peter S[mw][il]th", the title is
"Fu[1l][l1].DELTA.Text.DELTA.[RB]etr[il]e . . . ", the text is
". . . Fu[l1][l1].DELTA.Text.DELTA.se[ao]rch . . . " and so forth. Here,
.DELTA. indicates a blank (space).
In character recognition, an ambiguous result arises in most cases from a
character pattern which is difficult to handle by normal processing.
For retrieval, meanwhile, a user inputs "FULL.DELTA.TEXT.DELTA.RETRIEVAL"
from a keyboard. Natural languages generally allow the same meaning to be
expressed in different words, and in this case
"FULL.DELTA.TEXT.DELTA.SEARCH" has the same meaning. The system handles
such ambiguity automatically and is capable of searching for documents
containing any of these character strings.
Accordingly, the plurality of partial character strings to be found in the
sentence to be retrieved are expressed by a finite state automaton as shown
in FIG. 2. The title character string, one of the sentences to be retrieved
exemplified in FIG. 1, can be expressed similarly by the automaton of
FIG. 3. In this case, however, no distinction is made between capital and
small letters. The invention provides a text search (character string
retrieval) function for the case where ambiguity (a plurality of
possibilities, or a state in which some elements cannot be decided
uniquely) is present in both the searching key (partial character string)
and the sentence to be retrieved, which is a third principle.
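Matching with ambiguity on both sides can be illustrated with a short Python sketch. It is a naive position-by-position comparison, not the automaton of FIG. 2 and FIG. 3, and the helper names `candidates` and `find_keys` are hypothetical. The synonym list plays the role of the ambiguous searching key, the bracketed string plays the role of the ambiguous recognized text, and both are lower-cased so that capital and small letters are not distinguished.

```python
import re
from typing import FrozenSet, List, Tuple


def candidates(s: str) -> List[FrozenSet[str]]:
    """'fu[1l]' -> [{'f'}, {'u'}, {'1', 'l'}]; brackets mark undecided characters."""
    pairs = re.findall(r"\[([^\]]*)\]|(.)", s, re.S)
    return [frozenset((group or single).lower()) for group, single in pairs]


def find_keys(text: List[FrozenSet[str]], keys: List[str]) -> List[Tuple[int, str]]:
    """Return (start position, key) for every key that fits the candidate sets."""
    hits = []
    for start in range(len(text)):
        for key in keys:
            k = key.lower()
            if start + len(k) <= len(text) and all(
                ch in text[start + i] for i, ch in enumerate(k)
            ):
                hits.append((start, key))
    return hits


# Recognition result quoted in the summary, with a space for each blank.
recognized = candidates("fu[1l][l1] text retr[i1]eva[l1]")
synonyms = ["FULL TEXT RETRIEVAL", "FULL TEXT SEARCH"]
hits = find_keys(recognized, synonyms)
```

The comparison succeeds for a key when every key character is a member of the candidate set at the corresponding position, which is the essential condition any faster automaton-based matcher also has to enforce.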
A method given in a report [A. V. Aho, et al., "Efficient String Matching:
An Aid to Bibliographic Search," Communications of the ACM, Vol. 18, No.
6, 1975] is well known for searching for a plurality of partial character
strings in an unambiguous text by means of a finite state automaton.
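For comparison, a multi-keyword automaton search over unambiguous text can be sketched as follows. This is a compact textbook-style rendering of the goto, failure and output functions associated with that approach, written from general knowledge rather than copied from the cited report; the function names `build_automaton` and `search` are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple

Automaton = Tuple[List[Dict[str, int]], List[int], List[List[str]]]


def build_automaton(keys: List[str]) -> Automaton:
    goto: List[Dict[str, int]] = [{}]   # goto[state][char] -> next state
    fail: List[int] = [0]               # failure links
    out: List[List[str]] = [[]]         # keys recognized on entering each state
    for key in keys:                    # phase 1: build the keyword trie
        s = 0
        for ch in key:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(key)
    queue = deque(goto[0].values())     # phase 2: breadth-first failure links
    while queue:
        r = queue.popleft()
        for ch, s in goto[r].items():
            queue.append(s)
            f = fail[r]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[s] = goto[f].get(ch, 0)
            out[s] = out[s] + out[fail[s]]
    return goto, fail, out


def search(text: str, automaton: Automaton) -> List[Tuple[int, str]]:
    goto, fail, out = automaton
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        hits.extend((i - len(key) + 1, key) for key in out[s])
    return hits


automaton = build_automaton(["full text retrieval", "full text search"])
hits = search("a full text search using character recognition", automaton)
```

The text is scanned once regardless of how many keys are loaded, which is why this style of automaton suits a search over synonym sets.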
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a drawing showing a document image and a result of document
understanding.
FIG. 2 is a state transition diagram of a synonymic character string
generated from a partial character string.
FIG. 3 is a state transition diagram of a character string as a result of
character recognition which includes ambiguity.
FIG. 4 is a system configuration drawing of a first embodiment.
FIG. 5 to FIG. 9 are drawings for illustrating a method for storing and
managing documents, images and texts.
FIG. 10 is a block diagram of a document recognizer.
FIG. 11 is an explanatory drawing of a rectangular area surrounding a
character pattern.
FIG. 12 is a drawing illustrating a contour expression method for
describing a pattern.
FIG. 13 is a drawing illustrating a relation between pattern components and
character pattern.
FIG. 14 and FIG. 15 are drawings showing a result of segmenting rows and
columns respectively by means of a bottom-up segmenter.
FIG. 16 is an explanatory drawing of an algorithm for obtaining a state
transition list from a character string aggregation.
FIG. 17 is a block diagram of a flexible string matching circuit.
FIG. 18 is an extended finite state automaton permitting an ambiguous
character string.
FIG. 19 is a state transition table of the extended finite state automaton.
FIG. 20 is a drawing illustrating a program of FSM circuit.
FIG. 21 is a configuration drawing of a flexible string matching circuit in
a second embodiment.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
The invention will now be described with reference to illustrative examples.
FIG. 4 is a configuration drawing of a document storage and retrieval
system forming one embodiment of the invention. The system comprises a
control subsystem 100 providing a general control and a data base
function, an input subsystem 200 for inputting a document and others and
registering in a file, a document recognizer 300 for recognizing
documents, a text search subsystem 400 for carrying out a high-speed text
search, and a terminal subsystem 800 for carrying out a retrieval.
A configuration and a flow of operation of each subsystem will be described
in detail below.
The input subsystem 200 has a central processing unit (CPU) 201 for
controlling the subsystem, a main memory 202, a system file 251 and a
terminal 203 as its basic part. The subsystem is controlled by operation
from the terminal 203. An image of each page of a document 220 is read
optically by a scanner 221, and the digitized image data is stored first in
a video memory 224 by way of a bus 210. The image data is then subjected to
redundancy compression in an image processor (IP) 223, transformed into
MH (Modified Huffman) code or MR (Modified READ) code, and then returned
to another area of the video memory 224.
The inputted document image is displayed on the terminal 203 for
confirmation, and the operator can input bibliographical items such as the
title, author's name, creation date and others while observing the image
displayed thereon. As will be described later, bibliographical items of a
formatted document can be read automatically through document
understanding; however, bibliographical items of an unformatted document
and items of information which do not appear on the paper must be inputted
manually. For example, a classification code of document contents defined
by users and a keyword which is not present on the paper must naturally be
inputted by the operator. Likewise, the value and positioning of each
document must be determined independently by a user of the document, and
can be inputted from the terminal 203. The bibliographical items and other
data inputted as above are correlated with the image data (compressed data)
in the video memory 224 and are then loaded in the main memory 202.
Here, each document is given a proper number (document ID) and stored in
the memory so that the image data and bibliographical items can be
retrieved using the proper number of the document as a key. The document
proper number can be expressed, for example, by coupling an identifier of
the subsystem ("INSYS01" and the like) to a character string indicating the
date and time. For example, INSYS01.850501.132437 indicates a document
inputted from an input subsystem INSYS01 at 13h 24m 37s on May 1, 1985.
Since the input time may be important depending on the application of the
system, the number also functions as a time stamp.
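The composition of the document proper number can be sketched in a few lines of Python. The function names `make_document_id` and `parse_document_id` are hypothetical, and the dot-separated layout (subsystem identifier, two-digit year with month and day, then hour, minute and second) is inferred from the single example INSYS01.850501.132437 given above.

```python
from datetime import datetime
from typing import Tuple


def make_document_id(subsystem_id: str, when: datetime) -> str:
    """Couple the subsystem identifier to a yymmdd.hhmmss time stamp."""
    return f"{subsystem_id}.{when:%y%m%d}.{when:%H%M%S}"


def parse_document_id(doc_id: str) -> Tuple[str, datetime]:
    """Recover the subsystem identifier and input time from a proper number."""
    subsystem_id, date_part, time_part = doc_id.split(".")
    return subsystem_id, datetime.strptime(date_part + time_part, "%y%m%d%H%M%S")


doc_id = make_document_id("INSYS01", datetime(1985, 5, 1, 13, 24, 37))
subsystem, stamped = parse_document_id(doc_id)
```

A one-second resolution means two documents entered on the same subsystem within the same second would collide; a real system would presumably need a tie-breaker, which the text does not describe.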
Now, whenever a predetermined quantity of documents is accumulated in the
subsystem 200 or a predetermined command arrives from the terminal 203, an
interrupt signal is sent to a bus adapter 171.
A control subsystem 100, sensing the interrupt signal, reads a
predetermined address in the memory 202 of the input subsystem 200. The
contents of the request from the input subsystem can thus be determined.
When registration of the inputted documents in a data base is requested,
operation proceeds as described below.
The central processing unit (CPU) 101, according to a predetermined program
in a main memory 102, obtains the proper numbers of the documents stored
temporarily in the input subsystem 200 and the memory addresses of the
bibliographical data (bibliographical items) and image data relating
thereto.
The control subsystem 100 has a data base file 151 for storing and managing
symbolic data such as bibliographical data and the like, and an image file
152 for storing and managing the image data.
The bibliographical data read out of the input subsystem 200 is written as
a new record in a data base (loaded in the file 151) which takes the form
of the table in FIG. 5. The table of FIG. 5 is named MAIN-DIR (main
directory) and has the following data columns:
______________________________________
DOC#          A serial number of the document registered in the system.
ID            A document proper number given by the input subsystem.
NP            The number of pages constituting the document.
TITLE         A title (character string).
AUTHOR        An author's name (permitting iteration of plural data).
CLASS         A symbol indicating classification, kind and the like of
              documents.
PUBL#         A number of the publication registered in the system (details
              being managed in the table shown in FIG. 7).
VOL, NO, PP   Volume, number, page.
KWD           A plurality of keywords.
ABS           A text proper number of the abstract expressed as a character
              code string (text data).
TXT           A text proper number of the text as a character code string.
IMG           A proper number of image data. Since the image data is managed
              page by page, a plurality of image proper numbers are recorded.
______________________________________
In registration of the bibliographical data, only such data of the above
columns as will relate partly to the bibliographical data is written
newly.
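As an illustration only, a MAIN-DIR record and the partial write at registration time might be modeled as below. The Python field names, the `register` helper and the in-memory list are all hypothetical; the columns mirror the table above, and the ABS, TXT and IMG columns are left empty because they are filled by later steps.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MainDirRecord:
    doc_no: int                      # DOC#: serial number in the system
    id: str                          # ID: proper number from the input subsystem
    np: int = 0                      # NP
    title: str = ""                  # TITLE
    authors: List[str] = field(default_factory=list)   # AUTHOR (repeatable)
    doc_class: str = ""              # CLASS
    publ_no: Optional[int] = None    # PUBL#
    vol: Optional[int] = None        # VOL
    no: Optional[int] = None         # NO
    pp: str = ""                     # PP
    kwd: List[str] = field(default_factory=list)       # KWD
    abs_id: Optional[int] = None     # ABS: filled after document recognition
    txt_id: Optional[int] = None     # TXT: filled after document recognition
    img_ids: List[int] = field(default_factory=list)   # IMG: one per page


main_dir: List[MainDirRecord] = []


def register(doc_id: str, **biblio) -> MainDirRecord:
    """Write only the bibliographic columns; the rest keep their empty defaults."""
    record = MainDirRecord(doc_no=len(main_dir) + 1, id=doc_id, **biblio)
    main_dir.append(record)
    return record


rec = register("INSYS01.850501.132437", doc_class="Technical Paper", vol=5, no=7)
```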
Next, the image of each page constituting a document is read into the
control subsystem 100 from a predetermined storage area of the input
subsystem and is then stored sequentially in an empty area of the image
file 152. Each image (page unit) is concurrently given an image proper
number (IMGID). Then, the volume serial number (VOLSER) of the file in
which the image data is loaded, a file unit number (UNIT), a loading
physical address (PHYSA) in the file, a record length (SLENG) in the file
and others are written in the tables shown in FIG. 6(b) and FIG. 8. The
newly given image proper number IMGID is also recorded in the IMG column of
the table MAIN-DIR (FIG. 5).
Here, the table IMG-LOC shown in FIG. 6(b), which manages the location of
each image, is particularly effective when the image file 152 is
constituted of a plurality of drive units or a plurality of volumes. As a
matter of course, it is updated every time an operator demounts or mounts a
volume.
Then, FIG. 8 shows a directory provided at each volume of the image file
152, and the following columns are provided therein.
______________________________________
IMGID   An image proper number.
PN      A serial page number (1 to n) in a document.
PHYSA   A physical address in a volume.
SLENG   A record length (sector number, for example).
CODE    An image compression code name.
SIZE    An image size (pixel number).
DOC#    A document serial number.
______________________________________
Then in the drawing, data in the column PHYSA of a record 157 indicates a
leading address of image data 158 in an image data area 156 in the image
file.
Now, when the above operations are complete, the system is ready for
retrieval by bibliographical items and keywords from the terminal group
800.
A retrieval condition inputted from the retrieving terminal is transmitted
to the CPU 101 of the control subsystem 100 by way of a gateway 175. A
retrieval of a table MAIN-DIR 153 (FIG. 5) in the data base file 151 is
carried out according to a predetermined retrieving program of the memory
102. It goes without saying that indexing (for high-speed retrieval such
as hashing, inverted file and the like) is applied to main columns of the
table 153.
As a result of retrieval, a list of DOC# values from the table 153 (FIG. 5)
and a list of image proper numbers IMGID are made and stored in a
predetermined area of the memory 102. Upon request for display from the
retrieving terminal, the position in the image file is identified by means
of the table IMG-LOC 154 (FIG. 6(b)) and the table IMG-DIR 155 (FIG. 8),
and the image data is read successively into the memory 102. The image data
thus read out is transmitted to the retrieving terminal in turn and then
displayed on a screen according to an instruction from the terminal.
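The two-step lookup (IMG-LOC to find the volume, then IMG-DIR to find the record inside it) can be sketched as follows. This is an illustrative Python model with hypothetical names and made-up sample values; volumes are modeled as byte strings and PHYSA and SLENG are counted in bytes, whereas the text gives a sector count as one possible unit.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ImgDirEntry:          # one row of IMG-DIR (FIG. 8), reduced to what is used here
    pn: int                 # serial page number in the document
    physa: int              # start address in the volume (bytes, by assumption)
    sleng: int              # record length (bytes, by assumption)


# IMG-LOC: IMGID -> (VOLSER, UNIT); sample values are invented for illustration.
img_loc: Dict[int, Tuple[str, int]] = {501: ("VOL001", 0), 502: ("VOL001", 0)}
# One IMG-DIR per volume: VOLSER -> {IMGID -> entry}.
img_dir: Dict[str, Dict[int, ImgDirEntry]] = {
    "VOL001": {501: ImgDirEntry(pn=1, physa=0, sleng=4),
               502: ImgDirEntry(pn=2, physa=4, sleng=6)},
}
# Mounted volumes, standing in for the image file 152.
volumes: Dict[str, bytes] = {"VOL001": b"AAAA" + b"BBBBBB"}


def read_image(imgid: int) -> bytes:
    volser, _unit = img_loc[imgid]            # which volume holds the image
    entry = img_dir[volser][imgid]            # where inside that volume
    return volumes[volser][entry.physa:entry.physa + entry.sleng]


def read_document(img_ids: List[int]) -> List[bytes]:
    """Read every page listed in the IMG column of a MAIN-DIR record, in order."""
    return [read_image(i) for i in img_ids]


pages = read_document([501, 502])
```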
A managing method for the text used for full text retrieval will be
described next.
As described for the main directory MAIN-DIR (FIG. 5), each document is
stored and managed not only as image data but also as text expressed in
a character code string. In the example, the abstract and the body text are
each stored and managed as a text in text files 451, 452, 453. Each text
(character string) is given a text proper number, which is recorded in the
columns ABS and TXT of the table 153 (FIG. 5), the column TXTID of the
table TXT-LOC shown in FIG. 6(a), and the column TXTID of the table
TEXT-DIR shown in FIG. 9.
FIG. 9 indicates a method for storing and managing texts in the text files
451, 452, 453. In the drawing, the text bodies are stored one-dimensionally
in a file storage area 466. Each text (one character string) is given a
proper number TXTID and managed in a directory table TEXT-DIR 465 having
the following columns:
______________________________________
TXTID    A text proper number.
NCH      A total number of characters constituting the text.
PHYSA    A physical address in which the text is recorded.
SLENG    A record length on a storage medium of the text.
CCLASS   A class of characters expressing the text (Chinese character-mixed
         Japanese statement, English statement, Roman character, kana
         character and others).
______________________________________
A record 467 of the table 465 indicates that the text expressed by the
record is a portion 468 in the storage area in the file.
On the other hand, as shown in FIG. 4, texts can be recorded in a plurality
of volumes, and the text directory manages the texts in each volume. When
plural volumes are mounted, it is necessary to know which volume holds a
given text, and the table TXT-LOC shown in FIG. 6(a) manages the location
of each text. It records the volume serial number VOLSER of the volume in
which the text having the text proper number TXTID is recorded, and the
file unit number UNIT on which that volume is mounted. As a matter of
course, TXT-LOC is updated automatically when a physical volume is
demounted or newly mounted by operators.
Then, when the input of document images, the input of bibliographic items
and the registration of documents are over as the main flow of operation,
text recognition (document understanding) of the registered document is
carried out by the document recognition apparatus 300. The input of the
recognition apparatus is the document image 10 shown in FIG. 1, held in the
image file 152, and the recognition result output is the notational
expression 20 shown in the same drawing. The text portions of the abstract
and the body text in the notational expression 20 are newly stored in and
managed by the text files 451 to 453 as described hereinabove.
The document recognition will be described with reference to a detailed
block diagram of the document recognition apparatus shown in FIG. 10.
The recognition apparatus 300 is connected to a bus 110 of the control
subsystem 100 through a bus adapter 371 and controlled by CPU 301. A
memory 302 stores data of a program and a parameter for controlling
operation of the apparatus.
The image data to be recognized is transmitted from the image file 152 to a
memory 321. The image data, which has been coded by compression, is decoded
to a bit expression image by an image processing circuit IP 322 and is
again stored in the memory 321. Subsequently, contour extraction of the
pattern is carried out by the IP 322 on the image decoded to a bit
expression, and the result of extraction is again loaded in the memory 321.
The extracted contour data is expressed as follows:
##EQU1##
where i represents a contour proper number (1, 2, 3, . . . ), and Ci
represents the class of the contour. Ci=0 represents an outer contour (a
full line 1001 in FIG. 11), and Ci=1 represents an inner contour (a broken
line 1002 in FIG. 11). The values x.sub.max, x.sub.min, y.sub.max,
y.sub.min represent the coordinates of the vertices of the rectangle
circumscribing the contour, as shown in FIG. 11. Further, (x.sub.s,
y.sub.s) is the coordinate of one point Ps on the contour (for example, the
point found first in contour tracing). With the point Ps as an origin, as
shown in FIG. 12, the contour data itself is expressed by a sequence of
pairs of a quantized direction code .theta. and the number of pixels L over
which the same direction continues.
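The run-length contour expression can be decoded back into pixel coordinates with a short routine. The Python sketch below rests on assumptions the text does not state: eight quantized directions, code 0 pointing along +x, and codes increasing counterclockwise. The names `trace_contour` and `bounding_box` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]

# ASSUMPTION: 8 quantized directions, code 0 = +x, counted counterclockwise.
STEPS: List[Point] = [(1, 0), (1, 1), (0, 1), (-1, 1),
                      (-1, 0), (-1, -1), (0, -1), (1, -1)]


def trace_contour(start: Point, runs: List[Tuple[int, int]]) -> List[Point]:
    """Expand (direction code, run length) pairs into the visited pixel positions."""
    x, y = start
    points = [start]
    for theta, length in runs:
        dx, dy = STEPS[theta]
        for _ in range(length):
            x, y = x + dx, y + dy
            points.append((x, y))
    return points


def bounding_box(points: List[Point]) -> Tuple[int, int, int, int]:
    """Return (x_max, x_min, y_max, y_min), the order used in the text."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return max(xs), min(xs), max(ys), min(ys)


# A closed rectangular contour starting at Ps = (2, 3).
square = trace_contour((2, 3), [(0, 4), (2, 3), (4, 4), (6, 3)])
box = bounding_box(square)
```

Because a closed contour returns to Ps, the last decoded point equals the first, which gives a cheap consistency check on stored contour data.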
Next, an inclination correction circuit 323 detects the tilt angle arising
at the time of document input from the contour data given by the expression
(1), corrects the contour data accordingly and then rewrites it to the
memory 321. For example, a system disclosed by the inventor in Japanese
Patent Application No. 152210/1985 may be employed for the inclination
correction algorithm.
From the portion (x.sub.max, x.sub.min, y.sub.max, y.sub.min) of the contour
data corrected for inclination, a row segmentation and a column
segmentation are carried out by a bottom-up segmenter (BSG) 324.
The bottom-up segmenter BSG inputs the data expressed in the form of
expression (1), generates a pattern list given by the expression (2) and
loads it in the memory 321.
(j x.sub.max,j x.sub.min,j y.sub.max,j y.sub.min,j) (2)
Here, j represents a pattern proper number, the patterns are defined as
mutually non-overlapping rectangular areas, and the expression (2) defines
the vertex coordinates of each rectangular area. For example, the
rectangular areas 1008, 1009 indicated by broken lines in FIG. 13 are
inputs of the BSG, whereas the rectangle 1010 is obtained as its output.
The rectangles 1008, 1009 each consist of one contour and are elements, and
the rectangle 1010 is a pattern forming one character. The elements
constituting the pattern j are obtainable by searching the contour data of
the expression (1) for the rectangles included in the rectangular area
defined by the expression (2). They can be obtained separately and loaded
as data. A result of row segmentation and a result of column segmentation
are shown diagrammatically in FIG. 14 and FIG. 15 respectively.
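The merging of element rectangles such as 1008 and 1009 into a single pattern rectangle such as 1010 can be illustrated with a simple sweep. The rule used here is an assumption made for the sketch, not the segmenter described in the text: within one row, two elements are taken to belong to the same pattern when their horizontal extents overlap. The name `merge_row` and the sample coordinates are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]   # (x_max, x_min, y_max, y_min), as in the text


def merge_row(elements: List[Rect]) -> List[Rect]:
    """Merge element rectangles of one row whose x-extents overlap (assumed rule)."""
    patterns: List[Rect] = []
    for x_max, x_min, y_max, y_min in sorted(elements, key=lambda r: r[1]):
        if patterns and x_min <= patterns[-1][0]:
            p_x_max, p_x_min, p_y_max, p_y_min = patterns[-1]
            patterns[-1] = (max(p_x_max, x_max), p_x_min,
                            max(p_y_max, y_max), min(p_y_min, y_min))
        else:
            patterns.append((x_max, x_min, y_max, y_min))
    return patterns


# Stem and dot of an "i", followed by a separate character (invented coordinates).
elements = [(13, 10, 12, 0), (13, 10, 18, 15), (25, 16, 12, 0)]
patterns = merge_row(elements)
# Pattern list in the shape of expression (2): (j, x_max, x_min, y_max, y_min).
pattern_list = [(j, *rect) for j, rect in enumerate(patterns, start=1)]
```

The resulting rectangles do not overlap horizontally, which matches the property required of patterns in the text, although touching or kerned characters would defeat this simple rule.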
A character segmentation division (CSG) 325 extracts the patterns
constituting characters from the above pattern list with reference to
document knowledge, which codifies rules such as the document format. As
shown in FIG. 10, the document knowledge is loaded in a document knowledge
file (DKF) 327.
Structural rules for the layout of items such as the title, author's name,
author's affiliation, abstract, text and the like are stored for each kind
of document in the document knowledge file, together with parametric
knowledge such as the font size. The knowledge is described in a format
description language. The language disclosed in Japanese Patent
Application No. 122424/1985 may be used as the format description
language.
The character segmentation division CSG operates to integrate a pattern
constituting one character which has been divided into two or more
patterns or, conversely, to forcibly separate two or more characters which
have been fused through contact into one pattern.
As the result of processing, the character segmentation division CSG
outputs the numbers of the patterns constituting each character in a list
for each item such as the title, abstract or text. For example:
(ABSTRACT "j.sub.1 j.sub.2 j.sub.3 . . . [j.sub.n j.sub.n+1 j.sub.n+2 ] . . . j.sub.N ") (3)
represents that the abstract is constituted of a string of characters
expressed by pattern numbers j.sub.k. Here, [j.sub.n j.sub.n+1 j.sub.n+2
] represents that the character can be any of the three patterns j.sub.n,
j.sub.n+1, j.sub.n+2.
A character recognition division (CRG) 331 extracts the contour data
constituting each character pattern, as described hereinabove, from the
above-mentioned pattern list (expression (3), for example) and the contour
data (given by expression (1)) on the memory 321, and transforms it into a
data structure ready for feature extraction.
Since a known art may be employed as the character recognition technique, a
detailed description is omitted here. In outline, after features are
extracted from the contour data, each character can be recognized through
pattern matching against the standard patterns in a standard pattern file
333. In FIG. 10, a memory STPM 334 stores standard patterns with high
reference frequency, for high-speed processing.
The result of the character recognition is output, as described
hereinabove, by the notational expression 20 shown in FIG. 1. In the
process of final decision in the character recognition, when a similarity
obtained as a result of pattern matching satisfies an expression (4), a
character category (character code) .omega..sub.k for giving the
similarity is output.
##EQU2##
where .rho..sub.k is the similarity to the character category k, K is the
total number of categories, and .epsilon. is a relative threshold.
If the expression (4) is not satisfied, then the aggregation of character
categories {.omega..sub.k .vertline. k=k.sub.1, k.sub.2, . . . } satisfying
an expression (5) is output enclosed between two special character codes.
For example, a character (code) string .omega..sub.s .omega..sub.k1
.omega..sub.k2 . . . .omega..sub.e is output. Here, .omega..sub.s
represents "[", and .omega..sub.e represents "]".
##EQU3##
When similar characters are present and the expression (4) is therefore not
satisfied in the above processing, a recognition result such as
"FU[L1][L1].DELTA.TEXT.DELTA.SEA[RB]CH" is obtained, for example, in
response to the input pattern "FULL.DELTA.TEXT.DELTA.SEARCH". The
recognition result is buffered in the memory 321 and then transmitted
collectively to the memory 102 (FIG. 4).
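The final decision step can be sketched in Python, with an important caveat: expressions (4) and (5) are not reproduced in this text, so the two threshold tests below are assumed stand-ins chosen only to make the example run, not the conditions of the invention. What the sketch does take from the text is the output format, in which a decided character is emitted alone and undecided candidates are enclosed between the two special codes. The name `decide` is hypothetical.

```python
from typing import Dict, List

OPEN, CLOSE = "[", "]"   # stand-ins for the special character codes


def decide(similarities: Dict[str, float], eps: float) -> str:
    """Emit one category, or a bracketed aggregation when the top choice is unclear."""
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    best_cat, best = ranked[0]
    second = ranked[1][1] if len(ranked) > 1 else float("-inf")
    # ASSUMED stand-in for expression (4): the best similarity clearly leads.
    if best - second >= eps * best:
        return best_cat
    # ASSUMED stand-in for expression (5): keep every category close to the best.
    close = [cat for cat, rho in ranked if rho >= (1.0 - eps) * best]
    return OPEN + "".join(close) + CLOSE


per_pattern: List[Dict[str, float]] = [
    {"F": 0.95, "E": 0.70},              # clear winner
    {"L": 0.90, "1": 0.88, "I": 0.60},   # "L" and "1" are too close to call
]
recognized = "".join(decide(s, eps=0.05) for s in per_pattern)
```

A larger threshold leaves more characters undecided and lengthens the bracketed sets, trading storage and matching cost against the risk of committing to a wrong character.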
In the control subsystem 100, the maximum text proper number is detected
with reference to the table TXT-LOC (FIG. 6(a)), and the character code
string (text) of the recognition result is registered with that value plus
1 as its new text proper number. The registration is carried out with
respect to the main directory 153, the table TXT-LOC and the table 465
(FIG. 9), and the text data itself is loaded in any of the text files 451
to 453.
Now, the document to which text data has been given as above is ready for
retrieval using the text search subsystem 400.
Next, the text search subsystem 400 for retrieving text contents and its
operation will be described in detail.
A request for text content retrieval, ABS="TEXT.DELTA.RETRIEVAL" for
example, made from the terminal 800 is transmitted first to the control
subsystem 100. In the subsystem 100, where the documents to be retrieved
have already been narrowed down through keyword retrieval or other means,
the proper numbers of the texts belonging to those documents are selected
from the main directory 153, and an expression (6) listing the proper
numbers of the texts to be retrieved is made for each text file with
further reference to the table TXT-LOC.
(u.sub.i v.sub.i (t.sub.i1 t.sub.i2 . . . t.sub.in)) (6)
i=1, 2, . . . , M
where u.sub.i is the i-th file unit number, v.sub.i is a volume serial
number, and t.sub.ik is the k-th text proper number to be retrieved on the
volume. M is the number of text file units.
On the other hand, when the document to be retrieved has not been narrowed,
a special symbol (expression (7), for example) is sent to the whole text
file.
(u.sub.i v.sub.i *) i=1, 2, . . . , M (7)
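The construction of the per-unit lists of expressions (6) and (7) can be sketched as a grouping over TXT-LOC. The Python below is illustrative only: the function name `search_lists`, the dictionary layout of TXT-LOC and the sample identifiers are hypothetical, and the string "*" stands for the special symbol meaning "search the whole file".

```python
from typing import Dict, List, Optional, Set, Tuple, Union

# TXT-LOC modeled as TXTID -> (VOLSER, UNIT); sample values are invented.
TxtLoc = Dict[int, Tuple[str, int]]
SearchList = Tuple[int, str, Union[List[int], str]]


def search_lists(txt_loc: TxtLoc, selected: Optional[Set[int]] = None) -> List[SearchList]:
    """Build (u_i, v_i, (t_i1 ...)) per file unit, or (u_i, v_i, '*') if not narrowed."""
    per_unit: Dict[Tuple[int, str], List[int]] = {}
    for txtid, (volser, unit) in sorted(txt_loc.items()):
        ids = per_unit.setdefault((unit, volser), [])
        if selected is not None and txtid in selected:
            ids.append(txtid)
    return [(unit, volser, "*" if selected is None else ids)
            for (unit, volser), ids in sorted(per_unit.items())]


txt_loc: TxtLoc = {11: ("TV01", 1), 12: ("TV02", 2), 13: ("TV01", 1), 14: ("TV03", 3)}
narrowed = search_lists(txt_loc, selected={11, 13, 14})   # shape of expression (6)
everything = search_lists(txt_loc)                        # shape of expression (7)
```

Grouping by unit lets each text file receive only the identifiers it actually holds, so units with an empty list could be skipped altogether.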