| United States Patent | 4906940 |
| Link to this page | http://www.wikipatents.com/4906940.html |
| Inventor(s) | Greene; Robert R. (Tucson, AZ);
Weyker; Robert R. (Tucson, AZ);
West; Karen F. (Tucson, AZ) |
| Abstract | A pattern recognition process and apparatus automatically extracts features
in displays, images, and complex signals. Complex signals are processed to
two- or higher-dimensional displays or other imagery. The displays or
other imagery are then processed to produce one or more visual fields in
which regions with certain properties are enhanced. The enhanced regions
are induced to produce attractive forces. Flexible templates placed in the
visual fields are acted upon by the attractive forces, causing the
templates to deform in such a way as to match features which are similar,
but not identical to, the template. The deformed templates are then
evaluated in order to identify or interpret the feature to which the
template was attracted. Apparatus utilizing the process generates a
display of the features extracted from the input signal. Desired
information can be obtained from such a display, such as trajectories, the
location of ridges, buildings, edges, or other boundaries. The extracted
features can be used within a control system to automatically guide an
object, such as a vehicle or airplane, along a desired course; or within a
signal processing system to provide a display of the features in a way
that aids in the interpretation of such features. |
Title Information  |
| Publication Date |
March 6, 1990 |
| Filing Date |
February 13, 1989 |
| Parent Case |
This application is a continuation of application Ser. No. 088,951, filed
8/24/87, now abandoned. |
Claims  |
What is claimed is:
1. A method of extracting features from source signals, such as image
signals, display signals, and similar complex signals, comprising the
steps of: (a) producing a display field of said source signal having two
or more dimensions;
(b) generating a force field around areas of said display field having
selected properties, such as those areas having a prescribed intensity;
(c) placing at least one movable and deformable template in the display
field that is acted upon by said force field; and
(d) evaluating at least one characteristic of said template after said
template has been acted upon by said force field, said force field causing
said template to move and/or deform in response to the forces present
within said force field, said at least one characteristic providing an
indication of a feature present within said source signal.
2. The method of extracting features from source signals of claim 1 wherein
the display-field production of step (a) comprises producing a visual
display and enhancing selected features of the display.
3. The method of extracting features from source signals of claim 2 wherein
the step of producing a visual display comprises generating an array of
pixels, each pixel being assigned a brightness level as a function of the
source signal being displayed.
4. The method of extracting features from source signals of claim 3 wherein
the step of enhancing features of the display comprises enhancing edges
appearing within the display to make them appear as highlighted linear
tracks.
5. The method of extracting features from source signals of claim 4 wherein
the step of enhancing features of the display further comprises enhancing
boundaries of regions of homogeneous texture within said display.
6. The method of extracting features from source signals of claim 1 wherein
the force-field generation of step (b) comprises generating an attractive
force field around the selected features of the display field, whereby a
movable object, such as a template, placed within said display field is
attracted towards the selected features in accordance with the governing
principles of the force field.
7. The method of extracting features from source signals of claim 6 wherein
the step of generating an attractive force field comprises treating the
display field as a field of compressible fluid or gas and assigning each
selected feature within the display field as a low pressure region,
whereby a movable object, such as a template, placed within the display
field flows towards the selected feature according to known principles of
fluid flow dynamics.
8. The method of extracting features from source signals of claim 6 wherein
the step of generating an attractive force field comprises treating the
display field as a potential field and assigning each selected feature
within the display field a potential value, whereby movable objects, such
as a template, placed within the display field are attracted to the
selected feature according to known principles of potential fields.
9. The method of extracting features from source signals of claim 8 wherein
the step of treating the display field as a potential field comprises
treating the display field as a distribution of mass field wherein each
selected feature within the display field is assigned a mass value,
whereby a movable object having an assigned mass value, placed in the
display field, such as a template, is attracted towards the selected
features in accordance with known principles of physical dynamics.
10. The method of extracting features from source signals of claim 8
wherein the step of treating the display field as a potential field
comprises treating the display field as an electric field and assigning
each selected feature within the display field an electric charge value of
one polarity, whereby an object having an electric charge value of an
opposite polarity, such as a template, placed within the display field is
attracted towards the selected feature according to known principles of
electric dynamics.
11. The method of extracting features from source signals of claim 1 wherein
step (c) of placing movable templates within said display field comprises:
defining a template having desired characteristics, including the ability
to bend and deform to a desired degree;
placing at least one such defined template in the display field provided in
step (a) so that it can be acted upon by at least one of the force fields
generated in step (b), and
allowing the force field to act upon the placed template until the template
is within a specified closeness of a match with the selected features of
the source signal.
12. The method of extracting features from source signals of claim 11
wherein the step of allowing the force field to act upon the placed
template comprises allowing the template to converge to an asymptotic
state, said asymptotic state comprising a state wherein said template has
been finally acted upon by said force field, said asymptotic state
providing a hypothetical location, orientation, and shape of the feature
in the display field towards which the template was attracted.
13. The method of extracting features from source signals of claim 12
wherein the step of defining the template to have desired characteristics
includes assigning the template to have a desired dimensionality, such as
a one dimensional line, a two dimensional rectangle, or a three
dimensional sphere.
14. The method of extracting features from source signals of claim 12
wherein the step of defining the template to have desired characteristics
includes assigning the template to have a desired topology, including the
shape of the template, the number of holes in the template, and the number
of separate pieces in the template.
15. The method of extracting features from source signals of claim 12
wherein the step of defining the template to have desired characteristics
includes assigning the template to have a desired number of degrees of
freedom.
16. The method of extracting features from source signals of claim 12
wherein the step of defining the template to have desired characteristics
includes assigning the template to have desired dynamics, including the
manner and degree to which the template can bend, deform, flex, and
otherwise respond to forces applied thereto.
17. The method of extracting features from source signals of claim 1
wherein the evaluation of the template carried out in step (d) comprises:
considering the asymptotic state of each template as a hypothetical
location, orientation, and shape for a feature within the display field,
and
deciding whether to accept or reject said hypothetical location,
orientation, and shape (the hypothesis) as the extracted feature of the
source signal.
18. The method of extracting features from source signals of claim 17
wherein the step of accepting/rejecting the hypothesis comprises testing
the parameters characterizing said template and rejecting the hypothesis
if these parameters lie outside certain prescribed bounds.
19. The method of extracting features from source signals of claim 17
wherein the step of accepting/rejecting the hypothesis comprises testing
the properties of the display field near at least one portion of the
template and accepting the hypothesis if these properties lie within
certain prescribed bounds.
20. The method of extracting features from source signals of claim 1
wherein step (c) comprises placing a prescribed number of templates in the
force field and wherein step (d) comprises determining whether a
prescribed number of said templates have clustered around a given point in
the display field, and if so, accepting the presence of a feature at said
point.
21. The method of extracting features from source signals of claim 1
wherein step (c) includes assigning a potential energy value to a template
as it is placed in the display field at its initial position; and step (d)
comprises measuring the decrease in the potential energy after the
template has moved within the display as a result of being acted upon by
the force field, and accepting a hypothesis concerning the location,
orientation and shape of a feature in the display field if the potential
energy has fallen by more than a specified amount.
22. A method for classifying features from a display having two or more
dimensions comprising the steps of:
(a) generating a force field around areas within said display field having
selected properties;
(b) defining a movable and deformable template having desired initial
characteristics;
(c) placing said template within said display field;
(d) allowing said template to move and deform within said display in
response to being acted upon by said force field; and
(e) evaluating at least one final characteristic of said template after
said template has moved to a final state and assumed a final shape as a
result of being acted upon by said force field; and
(f) classifying a feature present in the display field as a function of the
evaluated final characteristic of said template.
23. A method of identifying features in a display field, said display field
comprising a two or more dimensional array of a complex signal, said
method comprising the steps of:
(a) generating a force field around areas within said display field having
selected characteristics;
(b) placing a movable and flexible template within said display field that
moves and flexes in response to said force field; and
(c) evaluating at least one characteristic of said template after it has
moved and flexed as a result of being acted upon by said force field, said
evaluated characteristic providing an indication of the identity of
selected features within said display field.
24. A method of interpreting a complex signal comprising the steps of:
(a) generating at least one display field of two or more dimensions that
displays said complex signal;
(b) enhancing selected portions of said display field;
(c) generating a force field around said selected enhanced portions;
(d) defining at least one movable template having desired characteristics,
such as a flexible stick, and placing said template within said display
field so that it is acted upon by said force field for a prescribed time
period; and
(e) evaluating at least one characteristic of said template at the
conclusion of said time period, said evaluated characteristic providing
information relative to the interpretation of said complex signal.
25. The interpretation method of claim 24 wherein the prescribed time
period of step (d) is determined by waiting until after the template has
settled to a final state as a result of being acted upon by said force
field.
26. A control system comprising:
an element to be controlled that is responsive to a control signal;
receiving means for receiving at least one input signal;
feature-extraction means for extracting at least one specified feature from
said input signal, said feature extraction means including
display-field generating means for generating at least one display field of
at least two dimensions of said input signal,
force-field generating means for generating a force field surrounding
selected portions of said display field,
template means for placing at least one movable and deformable template in
said display field that is acted upon by said force field, and
evaluating means for evaluating said at least one movable and deformable
template after it has been acted upon by said force field, the location,
orientation and shape of said template providing an indication that a
feature is present within said display field having a similar location,
orientation and shape, said identified feature being extracted from said
display field; and
control means responsive to the feature extracted by said feature
extraction means for generating said control signal;
whereby the element of said control system that is controlled in response
to said control signal is controlled as a function of the extracted
feature from said input signal.
27. The control system of claim 26 wherein said control system comprises a
wheeled vehicle, said receiving means includes a video camera attached to
said vehicle that generates a video signal as a result of an optical image
presented thereto, said feature-extraction means comprises a computer
on-board said vehicle that extracts the edges of a road from the video
signal generated by said video camera, and said control means includes
means for moving and steering said vehicle so that it follows said road.
28. The control system of claim 27 wherein said control means includes:
means for calculating the center of the road as half-way between the edges
of the road;
means for moving the vehicle forward along the center of the road; and
means for adjusting the video camera so that it is pointed at the center of
the road in front of the vehicle.
29. The control system of claim 27 wherein the display-field generating
means of said feature-extraction means includes:
means for processing the video signal using a Sobel edge detector;
means for normalizing the Sobel-processed image; and
means for calculating a visual field display from the normalized Sobel
image.
30. The control system of claim 29 wherein said visual field display
comprises a matrix of pixels, each pixel having an intensity level
associated therewith that varies as a function of the received video
signal; and further wherein the force-field generating means of said
feature-extraction means includes means for treating said matrix of pixels
as a fluid flow field wherein pixels having a prescribed intensity within
said display field are assigned a low pressure value; and still further
wherein said template means includes means for allowing a template placed
in said fluid-flow field to move within said fluid-flow field in response
to forces created by said low pressure values.
31. The control system of claim 30 wherein said template placed in said
fluid-flow field by said template means comprises a non-rigid template
that can flex and deform in response to the flow forces created within
said fluid-flow field.
32. The control system of claim 31 wherein said template comprises a pair
of flexible rods.
33. The control system of claim 32 wherein each of the flexible rods of
said pair of flexible rods includes repeller means for repelling each of
said rods from the other of said rods as said rods are moved by said force
field within said visual field display, thereby preventing said rods from
converging to the same location within said visual display field.
34. The control system of claim 26 wherein said control system comprises an
aircraft; said receiving means includes sensing means mounted on said
aircraft for receiving an input signal from the area in front of and below
said aircraft and for generating a sensor signal in response thereto; said
feature-extraction means includes signal processing means on-board said
aircraft for extracting linear features, such as roads and rivers, from
said sensor signal; and said control means includes means for guiding said
aircraft so that it follows said linear features.
35. The control system of claim 34 wherein said receiving means further
includes means for photographing and recording optical images observed
from said aircraft; and wherein said signal processing means further
includes means for extracting rectangular features, such as buildings,
from the optical images photographed and recorded by said receiving means.
36. A signal processing system for interpreting an input signal comprising:
receiving means for receiving at least one input signal;
feature-extraction means for extracting desired features from said input
signal, said feature extraction means including:
display-field generating means responsive to said input signal for
generating at least one display field of said input signal having at least
two dimensions,
force-field generating means for generating a force field surrounding
selected properties of said display field,
template means for placing at least one movable template in said display
field and for allowing said template to move within said display field in
response to said force field, and
evaluating means for identifying those features within said input signal
that are to be extracted, said evaluating means including means for
determining at least the position of said movable template after said
template has been acted upon by said force field, said determined position
providing an indication of those features within said display field that
are to be extracted; and
display means for extracting the identified features from the display field
and for displaying said extracted features, said display of extracted
features providing an interpretation of said signal.
37. The signal processing system of claim 36 wherein said receiving means
includes a plurality of sensors for receiving input signals from a moving
noise source, said feature-extraction means comprises processing means
that includes said display-field generating means, force-field generating
means, template means, and evaluating means; and wherein said display
means includes a detection display whereon a trajectory of the moving
noise source is displayed; said signal processing system thereby
comprising a multichannel warped signal correlator system.
38. The signal processing system of claim 37 wherein said display-field
generating means includes: (1) means for dividing the input signals from
each sensor into n sub-series, (2) means for generating a preliminary
visual field by calculating the cross correlation of corresponding pairs
of sub-series from the divided signals from each sensor, (3) means for
normalizing the preliminary visual field thus formed, and (4) means for
calculating the display field from said normalized preliminary visual
field.
39. The signal processing system of claim 38 wherein said display field
comprises a matrix of pixels, each pixel having an intensity level
associated therewith that varies as a function of the received input
signal; and further wherein the force-field generating means of said
feature-extraction means includes means for treating said matrix of pixels
as a fluid-flow field wherein pixels having a prescribed intensity within
said display field are assigned a low pressure value; and still further
wherein said template means includes means for allowing a template placed
in said fluid-flow field to move within said fluid-flow field in response
to forces created by said assigned low pressure values.
40. The signal processing system of claim 39 wherein said template placed
in said fluid-flow field comprises a flexible rod having prescribed
characteristics.
41. The signal processing system of claim 40 wherein said template
comprises a pair of flexible rods, each of said flexible rods having
repeller means for repelling each flexible rod from the other flexible
rod, thereby preventing said rods from converging to the same location
within said display field.
42. The signal processing system of claim 36 wherein said receiving means
comprises means for providing a digital imagery signal, said input signal
comprising an optical signal from which said digital imagery signal is
derived, and said feature extraction means comprises digital processing
means for extracting rectangles from said digital imagery signal; said
display means thereby displaying the extracted rectangles.
43. The signal processing system of claim 42 wherein the display-field
generating means of said feature extraction means comprises means for
producing first and second visual fields from the initial digital imagery
signal, said first visual field being produced so as to enhance regions of
uniform intensity, and said second visual field being produced from said
first visual field so as to enhance the edges around the regions of
uniform intensity.
44. The signal processing system of claim 43 wherein the force-field
generating means of said feature-extraction means includes means for
calculating an attractive force field within each of said first and second
visual fields, said calculation being based on the solution of the
equations for a compressible fluid flow.
45. The signal processing system of claim 44 wherein the calculating means
carries out the solution of the fluid flow equations using a two-step
finite difference solution technique.
46. The signal processing system of claim 44 wherein the template means of
said feature-extraction means comprises means for placing a plurality of
rectangular templates within said first and second visual fields and
allowing said templates to change shape, orientation, and size within said
first and second visual fields as said templates are acted upon by the
forces of said attractive force field.
47. The signal processing system of claim 46 further including repeller
means for repelling each of said plurality of rectangular templates from
the others of said rectangular templates as said rectangular templates are
acted upon by the forces of said attractive force field, thereby
preventing said templates from converging to the same location within said
display field.
48. The signal processing system of claim 46 wherein the evaluating means
of said feature extraction means includes means for testing the regions
within said first visual field that are surrounded by said templates,
after said templates have reached an asymptotic state, to determine if
said regions are homogeneous, said asymptotic state comprising that state
wherein said templates have finished moving in response to said attractive
force field; and by further testing the pixels within the second visual
field that are close to the edges of the rectangular templates that have
also reached an asymptotic state to determine if a prescribed percentage
of said pixels are edge-enhanced pixels.
49. The signal processing system of claim 36 wherein said receiving means
receives an input signal comprising seismic data and includes means for
forming a common depth point display therefrom; and said
feature-extraction means extracts features from said seismic data signal
representative of the shape of the curves formed by the locus of
reflections in the common depth point display, said signal processing
system thereby serving as a common depth point interpretation station.
50. The signal processing system of claim 36 wherein said receiving means
receives an input signal comprising a zero offset signal obtained from
seismic data, and said feature extraction means extracts features from
said zero offset signal indicative of the locus of reflections from a
given reflecting interface, said signal processing system thereby
functioning as a seismic trace interpretation station.
51. The signal processing system of claim 36 wherein said receiving means
receives a reflected input signal from a moving target, such as occurs in
a radar system, and said feature-extraction means extracts the trajectory
of the reflected signals over time based on a collection of input signals,
and further wherein said display means displays said trajectory in a
multi-dimensional display, said signal processing system thereby
functioning as a multi-screen track detection system.
52. The signal processing system of claim 36 wherein said receiving means
receives a voice signal from a person to be identified, said
feature-extraction means includes means for extracting features, if any,
from said voice signal that are unique to a particular individual, and
said display means includes means for signaling whether any unique
features for said particular individual were extracted from said voice
signal.
53. A system for interpreting a complex signal comprising:
means for receiving said complex signal;
means for displaying said complex signal in a display field having at least
two dimensions;
means for enhancing areas of said display field having prescribed
properties;
means for generating a force field around at least one of said enhanced
areas;
means for placing a template having desired characteristics within said
display field so that it is acted upon by said force field until a
prescribed event occurs;
means for determining the occurrence of said prescribed event;
means for evaluating said template to determine its location, orientation,
and shape within said display field after the occurrence of said prescribed
event, which information provides an indication of the location,
orientation and shape of a feature within said display field, and hence
within said complex signal;
the presence of said feature within said complex signal providing an aid to
the interpretation of said complex signal.
54. The complex signal interpreting system of claim 53 wherein said
prescribed event comprises the convergence of said template to a final
position within said display field as a result of being acted upon by said
force field.
55. The complex signal interpreting system of claim 53 wherein said
prescribed event comprises the elapse of a prescribed time period.
56. The method of claim 1 wherein the step of generating a force field
comprises generating a second order force field around areas of the
display field having selected properties, said second order force field
containing forces that are defined by a second-order differential
equation.
57. The method of claim 6 wherein the force-field generation of step (b)
comprises generating a second order attractive force field around selected
features of the display field, whereby a movable object, such as a
template, placed within said display field is attracted towards the
selected features in accordance with the governing second-order principles
of the force field.
58. The method of extracting features from source signals of claim 57
wherein the step of generating a second order attractive force field
comprises treating the display field as a field of compressible fluid or
gas and assigning each selected feature within the display field as a low
pressure region, whereby a movable object, such as a template, placed
within the display field flows towards the selected feature according to
known second order principles of fluid flow dynamics.
59. The method of extracting features from source signals of claim 57
wherein the step of generating a second order attractive force field
comprises treating the display field as a potential field described by a
second order differential equation and assigning each selected feature
within the display field a potential value, whereby movable objects, such
as a template, placed within the display field are attracted to the
selected feature according to known second order principles of potential
fields.
60. The method of extracting features from source signals of claim 58
wherein the step of treating the display field as a potential field
comprises treating the display field as a distribution of mass field
wherein each selected feature within the display field is assigned a mass
value, whereby a movable object having an assigned mass value, placed in
the display field, such as a template, is attracted towards the selected
features in accordance with known second-order principles of physical
dynamics.
61. The method of extracting features from source signals of claim 8
wherein the step of treating the display field as a potential field
comprises treating the display field as an electric field and assigning
each selected feature within the display field an electric charge value of
one polarity, whereby an object having an electric charge value of an
opposite polarity, such as a template, placed within the display field is
attracted toward the selected feature according to known second order
principles of electric dynamics.
62. The method for classifying features of claim 22 wherein step (a)
comprises generating a second order force field around areas within said
display field having selected properties, said second-order force field
having forces defined by a second order differential equation.
63. The method of identifying features of claim 23 wherein step (a)
comprises generating a second order force field around areas within said
display field having selected characteristics, said second order force
field having forces that are defined by a second order differential
equation.
64. The interpretation method of claim 24 wherein step (c) comprises
generating a second order force field around said selected enhanced
portions, said second order force field having forces therein that are
defined by at least a second order differential equation.
65. The control system of claim 26 wherein said force-field generating
means of said feature-extraction means comprises means for generating a
second order force field surrounding selected portions of said display
field, the forces generated by said second order force field being defined
by at least one second order differential equation.
66. The signal processing system of claim 36 wherein the force-field
generating means of said feature-extraction means comprises means for
generating a second order force field that surrounds selected properties
of the display field, said second order force field having forces
associated therewith that are defined by at least one second order
differential equation.
67. The complex signal interpreting system of claim 53 wherein said means
for generating a force field around at least one of said enhanced areas
comprises means for generating a second order force field that generates
forces as defined by at least a second order differential equation. |
Description  |
|
|
BACKGROUND OF THE INVENTION
The present invention relates to the automatic detection and interpretation
of features in images, displays, and complex signals, and more
particularly to methods for automatically detecting and interpreting
features in images using the simulation of physical forces that force
templates to move towards similar features and to deform to match such
features. The present invention further relates to apparatus using the
feature-extraction method for the purpose of providing automatic control
or signal detection and interpretation.
The interpretation of images and displays is a function currently carried
out largely in a manual fashion by skilled human interpreters. The
interpretive function involves finding and identifying features and
collections of features in imagery, such as a photograph, or a display,
such as a radar screen. In the past, a large number of aids have been
developed which assist or enhance the ability of human interpreters to carry
out the interpretive function. These aids may restore the general picture
clarity which, for instance, may have been reduced by shortcomings of the
imaging process. This type of image processing is discussed in Andrews, H.
C. and B. R. Hunt, Digital Image Restoration, Prentice-Hall, 1977, pp.
113-124 (hereafter "Andrews and Hunt"). Another kind of aid enhances the
brightness of certain kinds of features in an image, such as edges, to
make them more readily apparent to the eye. These aids are described
extensively in Pratt, W. K., Digital Image Processing, John Wiley & Sons,
1978, pp. 471-550 (hereafter "Pratt").
Techniques which attempt to automate the image interpretation task with the
object of replacing the human interpreter are very limited in capability
at the present time. The approach that has been used most successfully is
based on a paradigm of building up large structures from smaller
structures, occasionally reversing the procedure to correct for mistakes.
One example, which is called edge detection, consists of combining an edge
enhancement process with a thresholding process. In the combined
procedure, the image is processed in such a way that pixels at edges tend
to become brighter than other pixels in the image. Then pixels above a
certain brightness level are labeled as hypothetical edge points.
Hypothetical edge points which form a sequence based on adjacency are then
assembled into hypothetical continuous line segments. Isolated edge points
are dropped. Then, based on tests of certain numerical statistics such as
similarity in intensity or color, or colinearity, disconnected line
segments are associated to form longer line segments. At each point in this
process, statistical decision theory, as described for example in
Fukunaga, K., Introduction to Statistical Pattern Recognition, Academic
Press, 1972, pp. 1-121 (hereafter "Fukunaga"), or Duda, R. O. and P. E.
Hart, Pattern Classification and Scene Analysis, John Wiley & Sons, 1973,
pp. 1-39 (hereafter "Duda and Hart"), may be applied to accept or reject
certain hypothetical structures.
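The first stages of the combined procedure described above (edge
enhancement, thresholding, and the dropping of isolated edge points) can be
sketched as follows. This is only a minimal illustration and is not taken
from any of the cited references: the central-difference gradient operator,
the threshold value, and the function names enhance_edges, edge_points, and
drop_isolated are all assumptions chosen for the example.

```python
def enhance_edges(image):
    """Edge enhancement: gradient magnitude by central differences (assumed operator)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = image[r][c + 1] - image[r][c - 1]
            gy = image[r + 1][c] - image[r - 1][c]
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


def edge_points(image, threshold):
    """Thresholding: label pixels brighter than `threshold` as hypothetical edge points."""
    enhanced = enhance_edges(image)
    return {(r, c)
            for r, row in enumerate(enhanced)
            for c, value in enumerate(row)
            if value > threshold}


def drop_isolated(points):
    """Drop hypothetical edge points that have no 8-connected neighbor."""
    def has_neighbor(p):
        r, c = p
        return any((r + dr, c + dc) in points
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0))
    return {p for p in points if has_neighbor(p)}


# A 6x6 image with a vertical step edge between columns 2 and 3.
image = [[0, 0, 0, 10, 10, 10] for _ in range(6)]
edges = drop_isolated(edge_points(image, threshold=5.0))
```

The later stages, in which adjacent points are assembled into line segments
and disconnected segments are associated, are omitted from the sketch; it is
in those stages that the number of combinations to be examined grows.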
Pattern recognition techniques which build large structures from smaller
structures have several disadvantages. In general, there is a large
number of small structures to identify, and an extremely large number of
combinations to analyze. If there is no simple way to reduce the number of
combinations that have to be examined, then the process suffers an
exponential growth in the number of operations to be performed. The result
is that for even moderately sized problems, the number of computations
involved is beyond the capability of any computer. Furthermore, small
features in an image are easily obscured by noise; thus any technique
exploiting small features is stopped at the start. Conversely, spurious
features may also be present; for instance, edge enhancement procedures
will spuriously enhance many points which do not lie on an edge. Another
problem is that techniques for associating disconnected line segments, for
instance the two visible parts of a line passing under an obstruction, are
not very well defined and their performance is difficult to evaluate.
Finally, algorithms in which operations depend on tests are difficult to
implement on parallel computer architectures.
Recent work in Artificial Intelligence (AI) has aimed at reducing the
computational size of vision problems. See, e.g., Winston, P. H.,
Artificial Intelligence, 2d Ed., Addison Wesley, 1984, pp. 159-169
(hereafter "Winston"). This is accomplished by a process identified as
goal reduction: building larger features from smaller features. In this
process, a sequence of several intermediate representations of features
is constructed. Each of the representations is of higher complexity than
the earlier ones. Advantageously, AI approaches are usually implemented
using a rule-based problem solving paradigm. In this paradigm, a
collection of rules is specified, each of which causes a certain function
to be performed if certain conditions are satisfied. The advantage of the
rule-based approach over statistical pattern recognition techniques is
that non-numeric information can be exploited. This information includes
knowledge of the physical and cultural context of the image as well as
natural constraints related to the fundamental topology of shapes. Winston
formalizes the feature recognition process as a two-step procedure called
Generate-and-Test. The implementation of this process involves a generator
module and a tester module. At each level of representation in the feature
extraction process, hypothetical features are generated and then tested
against criteria contained in the rules. One of the major goals of AI
research in vision has been to exploit contextual and constraint
information to limit the number of hypothetical features that must be
generated in order to generate an acceptable one. However, the rule-based
paradigm has been more successful at the testing function, which is
similar to the earlier successes of rule-based systems in medical
diagnosis.
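The Generate-and-Test procedure described above, with its generator module
and tester module, might be sketched as shown below. The sketch is
illustrative only: the choice of line segments as hypothetical features, the
two example rules, and all function names are assumptions, and they are not
drawn from Winston.

```python
def generate_and_test(generator, rules):
    """Return the first generated hypothesis that satisfies every rule, else None."""
    for hypothesis in generator:
        if all(rule(hypothesis) for rule in rules):
            return hypothesis
    return None


def segment_generator(points):
    """Generator module: propose every pair of points as a hypothetical line segment."""
    for i, p in enumerate(points):
        for q in points[i + 1:]:
            yield (p, q)


def length_sq(segment):
    """Squared length of a segment given as ((x0, y0), (x1, y1))."""
    (x0, y0), (x1, y1) = segment
    return (x1 - x0) ** 2 + (y1 - y0) ** 2


# Tester module: illustrative rules; real rules would encode context and constraints.
rules = [
    lambda s: length_sq(s) >= 9,    # long enough to be a plausible feature
    lambda s: s[0][1] == s[1][1],   # horizontal (an assumed contextual constraint)
]

points = [(0, 0), (1, 2), (4, 0), (5, 5)]
accepted = generate_and_test(segment_generator(points), rules)
```

In this sketch the generator proposes hypotheses blindly, so the number of
hypotheses tested grows with the square of the number of points. Limiting
that number is the role the text assigns to contextual and constraint
information.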
Another technique known in the art for image interpretation attempts to
recognize large scale features in their entirety. The central tool in this
approach is correlation or template matching, as described in Levine, M.
D., Vision in Man and Machine, McGraw-Hill, 1985, pp. 46-52 (hereafter
"Levine"). Template matching is basically a numerical measure of
similarity between a portion of the image and an idealization or model of
the feature one is looking for, called a template. This approach seems to
avoid the combinatorial growth problems, is well-defined in execution, and
is easily implemented on parallel computer architectures. When the
template is an exact duplicate of the feature in the image, and the
template can be compared with the image at the exact position and
orientation of the feature, then the similarity measure between the
template and image will be very high at that position and orientation. The
procedure is robust, even in the presence of noise in the image.
Disadvantageously, in the real world, imagery features are seldom
identical to the templates due to changes in apparent size and
perspective, distortion in the imaging system, and the natural variability
between different objects. Unfortunately, even slight distortions degrade
the performance of the correlation matcher to such an extent that it is
obscured by the fluctuations due to commonly observed levels of noise in
the image. The only remedy for this degradation is to manually compare the
template to the image in all positions, orientations, sizes, perspectives,
known distortions, etc. This process is generally prohibitively expensive.
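A correlation matcher of the kind described above can be sketched as
follows. The normalized correlation coefficient is used here as the
similarity measure. That choice, together with the function names and the
small test image, is an assumption made for illustration and is not
necessarily the measure used in Levine.

```python
import math


def normalized_correlation(patch, template):
    """Similarity in [-1, 1] between two equal-length lists of pixel values."""
    n = len(template)
    mean_p = sum(patch) / n
    mean_t = sum(template) / n
    dp = [p - mean_p for p in patch]
    dt = [t - mean_t for t in template]
    denom = math.sqrt(sum(x * x for x in dp) * sum(x * x for x in dt))
    if denom == 0.0:
        return 0.0  # a flat patch or template carries no shape information
    return sum(a * b for a, b in zip(dp, dt)) / denom


def best_match(image, template):
    """Slide the template over the image; return (score, row, col) of the best position."""
    th, tw = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    best = (-2.0, -1, -1)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [image[r + i][c + j] for i in range(th) for j in range(tw)]
            score = normalized_correlation(patch, flat_t)
            if score > best[0]:
                best = (score, r, c)
    return best


template = [[0, 9, 0],
            [9, 9, 9],
            [0, 9, 0]]
image = [[0] * 6 for _ in range(6)]
for i in range(3):          # paste an exact copy of the template at row 2, column 1
    for j in range(3):
        image[2 + i][1 + j] = template[i][j]

score, row, col = best_match(image, template)
```

The sketch searches over position only. Extending the search to
orientation, size, and perspective multiplies the number of comparisons by
the number of candidate values of each additional parameter, which is the
source of the expense noted above.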
Artificial Neural Systems (ANS) technology is a parallel technology to the
present invention. The basic objective of ANS is to design large systems
which can automatically learn to recognize categories of features, based
on experience. The approach is based on the simulation of biological
systems of nerve cells. Each nerve cell is called a neuron; systems of
neurons are called neural systems or neural networks. The various software
and hardware simulations are called artificial neural systems or networks.
Each neuron responds to inputs from up to 10,000 other neurons. The power
of the technology is in the massive interconnectivity between the neurons.
Neural networks are often simulated using large systems of ordinary
differential equations, where the response of a single neuron to inputs is
governed by a single differential equation. The differential equations may
be solved digitally using finite difference methods or using analog
electronic circuits. Large scale analog implementations seem to be beyond
the current state of the art. Other implementations based on large-scale
switching circuits have also been proposed.
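A digital solution by finite difference methods, with one ordinary
differential equation per neuron, might look like the sketch below. The
particular equation (an additive model with a decay term, a weighted sum of
the other neurons' outputs, and an external input), the tanh output
function, and all parameter values are assumptions chosen for illustration.
The text does not specify them.

```python
import math


def euler_step(u, weights, inputs, tau, dt):
    """Advance every neuron state by one forward-Euler (finite difference) step."""
    outputs = [math.tanh(x) for x in u]          # each neuron's output signal
    new_u = []
    for i, u_i in enumerate(u):
        drive = sum(w * v for w, v in zip(weights[i], outputs))
        du_dt = -u_i / tau + drive + inputs[i]   # one ODE per neuron
        new_u.append(u_i + dt * du_dt)
    return new_u


def simulate(u0, weights, inputs, tau=1.0, dt=0.01, steps=2000):
    """Integrate the network from the initial state u0 for `steps` Euler steps."""
    u = list(u0)
    for _ in range(steps):
        u = euler_step(u, weights, inputs, tau, dt)
    return u


# Two neurons that inhibit each other; neuron 0 receives the larger input.
weights = [[0.0, -1.0],
           [-1.0, 0.0]]
final = simulate([0.0, 0.0], weights, inputs=[0.5, 0.1])
```

Each neuron's update uses only its own state and the outputs of the neurons
connected to it, so the inner loop could in principle be distributed across
parallel hardware.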
There are currently two major thrusts in ANS research and development. One
thrust, exemplified by Grossberg, S. and E. Mingolla, "Neural Dynamics of
Form Perception: Boundary Completion, Illusory Figures, and Neon Color
Spreading," Psychological Review, 1985, Vol. 92, No. 2, pp. 173-211
(hereafter "Grossberg"), attempts to use the neural network simulations to
recreate the functions of the brain. The other thrust, represented by
researchers Tank and Hopfield, aims at demonstrating that many types of
currently difficult problems can be solved efficiently on ANS hardware
using the ordinary differential equation which also models neurons. See,
Tank, D. W. and J. J. Hopfield, "Simple 'Neural' Optimization Networks: An
A/D Converter, Signal Decision Circuit, and a Linear Programming Circuit,"
IEEE Transactions on Circuits and Systems, Vol. CAS-33, No. 5, pp. 533-541
(May 1986) (herein "Tank and Hopfield").
One of the more common models for pattern recognition known in the art is
the classification model, described by Duda and Hart as follows:
"This model contains three parts: a transducer, a feature extractor, and a
classifier. The transducer senses the input and converts it into a form
suitable for machine processing. The feature extractor . . . extracts
presumably relevant information from the input data. The classifier uses
this information to assign the input data to one of a finite number of
categories." Duda and Hart, p. 4.
With respect to the division between the functions of the feature extractor
and the classifier, Duda and Hart go on to say:
"An ideal feature extractor would make the job of the classifier trivial,
and an omnipotent classifier would not need the help of a feature
extractor." Duda and Hart, p. 4.
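The three-part classification model quoted above can be illustrated with a
toy pipeline. Everything concrete in the sketch (the text input format, the
two features computed, the threshold, and the category names) is a
hypothetical choice made for the example and does not come from Duda and
Hart.

```python
def transducer(raw):
    """Sense the input and convert it to a machine-readable form (a list of floats)."""
    return [float(token) for token in raw.split()]


def feature_extractor(samples):
    """Extract presumably relevant information: the mean and the peak-to-peak range."""
    mean = sum(samples) / len(samples)
    return mean, max(samples) - min(samples)


def classifier(features, range_threshold=1.0):
    """Assign the input to one of a finite number of categories."""
    _, spread = features
    return "varying" if spread > range_threshold else "steady"


label = classifier(feature_extractor(transducer("0.1 0.2 3.5 0.1")))
```

The more of the discriminating work the feature extractor does, the simpler
the classifier can be. Here the classifier is reduced to a single
comparison.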
SUMMARY OF THE INVENTION
The present invention provides a process for automating many of the pattern
recognition functions currently carried out by human beings. This process
advantageously combines the best features of prior art systems so that,
for example, a minimum number of computations are involved, and those that
are involved may be carried out on parallel processors, if needed.
Further, the present invention carries out most of the pattern recognition
functions at the level of a feature extractor, thereby greatly simplifying
the task of classifying.
More particularly, the present invention comprises a process or method for
extracting features from images, displays, and other complex signals. This
process, like the known correlation matching process, advantageously
recognizes large-scale features in their entirety. However, unlike such
known processes, the present invention avoids the performance degradation
inherent in the correlation process due to the natural variability in the
appearance of objects in images. This avoidance of performance degradation
is accomplished through the use of flexible templates which are caused to
deform in such a way as to match features which are similar but not
identical to the template.
The template deformation process used by the present invention balances two
procedures, one in which highlighted features in an image or display are
induced to be attractive, the other involving templates which are deformed
by the attracting forces to assume the shape of the highlighted features
while resisting deformation beyond allowed norms. The overall effect is
that features are detected without knowing their precise shape in advance.
In the case of signal detection, for example, the gain of a matched filter
is attained without knowing the precise nature of the signal in advance.
This technique can best be described as a form of constrained
optimization, where global constraints are enforced through local
computation. Advantageously, because all computations are local, massively
parallel computers of simple design can be used to attain real time
performance.
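The balance between attraction and resistance to deformation described
above can be illustrated with a small numerical sketch. It is not the
implementation of the invention. It assumes a template made of a chain of
points, an attractive force that pulls each point toward its nearest
highlighted feature point, elastic links that resist departure from the
undeformed shape, and a simple first-order relaxation in place of whatever
dynamics the invention actually uses. All names and constants are
hypothetical.

```python
import math


def nearest(point, features):
    """Return the highlighted feature point closest to `point`."""
    return min(features,
               key=lambda f: math.hypot(f[0] - point[0], f[1] - point[1]))


def relax(template, features, k_attract=1.0, k_elastic=0.5,
          step=0.1, iterations=300):
    """Deform an open chain of template points until the forces roughly balance."""
    rest = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(template, template[1:])]
    pts = [list(p) for p in template]
    for _ in range(iterations):
        forces = []
        # Attractive force: each point is pulled toward its nearest feature point.
        for p in pts:
            f = nearest(p, features)
            forces.append([k_attract * (f[0] - p[0]), k_attract * (f[1] - p[1])])
        # Elastic force: each link resists departing from its undeformed offset.
        for i, (rx, ry) in enumerate(rest):
            ex = k_elastic * ((pts[i + 1][0] - pts[i][0]) - rx)
            ey = k_elastic * ((pts[i + 1][1] - pts[i][1]) - ry)
            forces[i][0] += ex
            forces[i][1] += ey
            forces[i + 1][0] -= ex
            forces[i + 1][1] -= ey
        # Local update: every point moves along its own net force.
        for p, f in zip(pts, forces):
            p[0] += step * f[0]
            p[1] += step * f[1]
    return [tuple(p) for p in pts]


def distance_to_feature(point, features):
    """Distance from `point` to its nearest highlighted feature point."""
    f = nearest(point, features)
    return math.hypot(f[0] - point[0], f[1] - point[1])


# Highlighted feature: a gentle arc. Template: a straight chain placed below it.
features = [(float(x), 2.0 + 0.05 * (x - 2) ** 2) for x in range(5)]
template = [(float(x), 0.0) for x in range(5)]
deformed = relax(template, features)
```

In this example the straight template rises toward the arc and bends to
follow it, stopping slightly short of the arc where the elastic forces
balance the attraction. Each point's update uses only its own position, the
local force, and its immediate neighbors in the chain.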
The method of extracting features from complex signals of the present
invention may thus be summarized as a four step process: (1) producing, in
response to a complex signal (such as an image signal), at least one
display field of two or more dimensions; (2) generating a force field
around selected features in this display field; (3) placing, through
simulation or otherwise, at least one deformable template within the
display field so that it can be acted upon by the forces of the force
field; and (4) evaluating at least one characteristic of the template
after it has converged to an asymptotic state as a result of being acted
upon by the force field in order to detect and classify features within
the comple | | |