Description
BACKGROUND OF THE INVENTION
FIELD OF THE INVENTION
This invention relates generally to the field of computer-human user
interface technology and more particularly to a method, apparatus, system
and computer program product for allowing a computer to automatically
determine what aspect of the computer's operation has the user's interest
and to optimize that aspect.
BACKGROUND
Human/Computer Interaction
An important characteristic of modern computing systems is the interface
between the human user and the computer. Early interactive interfaces were
text based wherein a user communicated with the computer by typing a
sequence of characters on a keyboard and the computer communicated with
the user by displaying characters on an output device--commonly a display
screen. These input characters specified a command to the computer's
operating system or to an application program executing on the computer.
This command invoked program logic to perform a given operation. Modern
computer systems use a graphical user interface (GUI) to simplify the
interaction between a user and a computer. A GUI equipped computer
communicates with a user by displaying graphics, including text and icons,
on a display screen and the user communicates with the machine both by
typing in textual information in response to dialogs and by manipulating
the displayed icons with a pointing device, such as a mouse.
Many modern GUIs provide a window environment. In a typical window
environment the graphical display portrayed on the display screen is
arranged to resemble the surface of an electronic "desktop" and each
application program running on the computer is represented as one or more
electronic "paper sheets" displayed as rectangular regions on the display
screen. These rectangular regions are called "windows". Each window may
include a multitude of panes, each pane being an area for a particular
type of information (textual, still image, moving image, etc.).
Each window displays information generated by an associated application or
system program. Further, there may be several windows simultaneously
present on the desktop with each containing information generated by a
program. A program presents information to the user through each window by
drawing or "painting" images, graphics or text within the window. The user
can also move a window to a different location on the display screen and
change its size and appearance to arrange the desktop in a convenient
manner. The user communicates with the program by "pointing at" objects
displayed in the window with a cursor controlled by a pointing device and
manipulating the objects as desired. In some cases the program requests
additional information from the user in response to a manipulation. This
request is presented as a "dialog" that allows the user to provide the
requested information to the dialog from the keyboard.
Each window typically includes a number of standard graphical objects such
as sizing boxes, buttons and scroll bars. These features represent user
interface controls that the user can manipulate with the pointing device.
When the controls are selected or manipulated, the GUI invokes program
logic in the underlying program to effect a corresponding command.
One characteristic of a GUI is that the GUI is only responsive to a user's
explicit manipulation of the pointing device or keyboard. In the case of a
mouse, the user physically moves the mouse device and a cursor on the
display moves accordingly. Some pointing devices actually track the user's
gaze and move the cursor to where the user "looks" on the display screen.
However, even with the gaze tracking (eye tracking) devices, the GUI only
responds to the user's explicit commands whether that command be a button
press, a blink, or a shift of view. The computer remains a tool that the
user operates by issuing explicit commands.
In contrast, humans have the ability to make inferences by looking at
another human's eyes. Pupils dilate when people see something attractive.
People look at what they are interested in and stare at things they find
interesting. Also, human eye movements reflect thought processes. Thus,
humans observe what other persons do with their eyes and make inferences
as to what that other person is interested in and/or thinking.
The prior art in computer-human interfaces does not determine the user's
immediate interest. Prior art computer-human interfaces simply respond to
a user's command, whether input by typing the command at a keyboard, by
manipulating a mouse to move a cursor, or by using a gaze tracking device
to move a cursor. Thus, the computer is unable to detect or anticipate
what characteristic of the computer's operation is of most interest to the
user at any given time.
Gaze Tracking Devices
Most gaze tracking devices operate based upon the principle that the
direction of a person's gaze is directly related to the relative positions
of the pupil and the reflection of an object off the cornea (gaze tracking
is often termed eye tracking). These devices often include image
processing capabilities that operate on a video image of an eye to
determine the gaze direction of the eye. These image processing
capabilities are enhanced by using the bright eye effect.
The bright eye effect is a result of the highly reflective nature of the
retina. This characteristic of the retina means that a significant amount
of the light that enters an eye is reflected back through the pupil. Thus,
when light shines into an eye along the axis of a camera lens, the retina
reflects a significant portion of the light back to the camera. Hence, the
pupil appears as a bright disk to the camera. This effect allows the pupil
to be more readily imaged from a video of an eye.
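The relationship described above can be illustrated with a minimal Python sketch. It is not the image processing used by any particular gaze tracking device. It assumes a grayscale video frame held as a list of rows, and it assumes that the bright pupil and the brighter corneal reflection can each be isolated by a simple brightness threshold. The function names and the threshold values are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def bright_region_centroid(image: List[List[int]], threshold: int) -> Optional[Point]:
    """Return the (x, y) centroid of all pixels brighter than threshold, or None."""
    sum_x = 0
    sum_y = 0
    count = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None  # nothing in the frame exceeded the threshold
    return (sum_x / count, sum_y / count)


def pupil_glint_vector(pupil: Point, glint: Point) -> Point:
    """Offset of the pupil center from the corneal reflection, in pixels."""
    return (pupil[0] - glint[0], pupil[1] - glint[1])


# Hypothetical 5x5 frame: a bright pupil disk (values near 200) that
# contains one brighter corneal reflection (value 255).
frame = [
    [0, 0, 0, 0, 0],
    [0, 200, 200, 200, 0],
    [0, 255, 200, 200, 0],
    [0, 200, 200, 200, 0],
    [0, 0, 0, 0, 0],
]
pupil = bright_region_centroid(frame, threshold=150)
glint = bright_region_centroid(frame, threshold=250)
if pupil is not None and glint is not None:
    print(pupil_glint_vector(pupil, glint))
```

A real device would then map this pixel offset to a gaze direction through a per-user calibration, which this sketch omits.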
Other methods exist for gaze tracking. Some incorporate having two video
cameras, one for tracking head movement and the other for measuring a
reflection off of the eyes. Other mechanisms involve measuring electric
potential differences between locations on different sides of an eye. High
accuracy devices are very intrusive on the user and require that the
user's head be held in a fixed position or that the user wear special
equipment to track the eye.
Recently, an eyegaze eyetracking system has been developed as described in
The Eyegaze Eyetracking System--Unique Example of a Multiple-Use
Technology, 4th Annual 1994 IEEE Dual-Use Technologies and Applications
Conference, May, 1994. This system comprises a video camera located below
a computer display that monitors one of the user's eyes. The device also
contains an infrared light emitting diode (LED) located at the center of
the camera's lens to maximize the bright-eye effect. Image processing
software on the computer computes the user's gazepoint on the display
sixty times a second with an accuracy of about a quarter inch.
Gaze tracking devices have been used for weapon control, operator training,
usability analysis, market research, and as an enablement for the
disabled. However, gaze tracking devices have not been used to determine
what characteristic of a computer's operation interests the computer user
at a particular time or to allow the computer to adapt to a user's
interest as demonstrated by where on the display screen the user is
looking.
Text to Speech
Many modern computers now provide text-to-speech capability. This
capability processes text strings and produces understandable audio speech
from the computer's audio output device (headphones or speaker). This
capability allows a computer to present an audio version of a text string
to a computer user.
Problems with Downloading Information
The background of the World Wide Web (WWW) and WWW browsers is well
described by reference to the first chapter of Instant HTML Web Pages, by
Wayne Ause, Ziff-Davis Press, ISBN 1-56276-363-6, Copyright 1995, pages
1-15, hereby incorporated by reference as illustrative of the prior art.
Using the Internet, a computer user has access to an immense amount of
information. However, retrieving this information over the Internet often
takes significant time because of the limited bandwidth of the
communication channel. The bandwidth is limited by many factors. Some of
these factors are the bandwidth of the communication link from the user's
computer to the Internet, the bandwidth of the communication link from the
information provider's computer to the Internet, the existence of other
communication traffic on these links, and the bandwidth of the Internet
itself. Often, the primary bandwidth limitation is at the user's computer.
This bandwidth limitation at the user's computer is exacerbated because
multiple data streams often flow across this limited communication link.
If the user is interested in a particular data transfer, these additional
data streams utilize bandwidth that would otherwise be available to the
data stream-of-interest to the user. This results in a decreased data
transfer rate of the data stream-of-interest.
Prior art WWW browsers, for example, generally attempt to equally allocate
bandwidth to all the data transfers directed towards visible views in a
window. Although this approach is clearly better than simply sequentially
retrieving data for each view, this approach delays retrieving data that
is of the most interest to the user because the available channel
bandwidth is divided between the data streams supplying data to the views.
Thus, the user must wait an additional time because of uninteresting
information using bandwidth that could have been applied to the
information of interest.
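The cost of equal allocation can be made concrete with a short Python sketch. It compares an equal split of the channel with a split that favors a single data stream-of-interest. The function name, the stream identifiers and the 75 percent share are assumptions chosen for illustration only, and the sketch assumes that the stream identifiers are unique. It is not presented as the allocation method of the invention.

```python
from typing import Dict, Optional, Sequence


def allocate_bandwidth(
    total_bps: float,
    stream_ids: Sequence[str],
    focus: Optional[str] = None,
    focus_share: float = 0.75,
) -> Dict[str, float]:
    """Divide total_bps among streams, optionally favoring one stream.

    With no focus stream, every stream receives an equal share.
    With a focus stream, it receives focus_share of the channel and the
    remaining streams divide what is left equally.
    """
    if not stream_ids:
        return {}
    if focus is None or focus not in stream_ids or len(stream_ids) == 1:
        share = total_bps / len(stream_ids)
        return {stream: share for stream in stream_ids}
    others = [stream for stream in stream_ids if stream != focus]
    focus_bps = total_bps * focus_share
    other_bps = (total_bps - focus_bps) / len(others)
    allocation = {stream: other_bps for stream in others}
    allocation[focus] = focus_bps
    return allocation


# Hypothetical page with four views sharing a 60,000 bit-per-second link.
views = ["headline", "photo", "advert", "sidebar"]
print(allocate_bandwidth(60000, views))
print(allocate_bandwidth(60000, views, focus="photo"))
```

Under the equal split, each of the four views receives one quarter of the channel. Under the weighted split, the view of interest receives three quarters of the channel, so its data arrives roughly three times sooner.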
During the transmission of large amounts of data, a program generally
provides some indication of the progress of the transmission. This
indication is provided by indicators such as bar indicators, numerical
percentage indicators, or in the case of images often just the amount of
detail available in the displayed image. While waiting for the transfer to
complete, the user often watches the progress of the indicator or of the
partially-filled image.
As mentioned above, one problem with the prior art is that a user has
little control over the bandwidth allocated to the data stream used to
download information. Further, even if an application should provide this
control to the user, the user still must explicitly command the
application to set the allocated bandwidth.
The invention addresses these problems.
Problems with Additional Data Associated with Images
In print and computer hypertext documents, images, such as pictures and
illustrations, are often provided with additional information, such as
captions explaining or enhancing the image. Those who view the image
cannot look at the image and read an associated caption at the same time.
Thus, the viewer's attention is diverted from the image while searching
for, and reading, the associated caption. Contrast this situation with a
directive time-dependent medium, such as film or video, where a viewer is
simultaneously presented with both visual and audio information. Audio
captioning presents additional information through an audio speaker,
allowing the user to receive additional information aurally without
distracting the viewer's gaze from the image of interest. Systems that
allow a user to select which image to view, from a plurality of images,
require the user to explicitly trigger the vocal caption. Thus, the user
is again distracted from looking at the image by the need to seek out and
activate the caption.
The invention addresses these problems.
Problems with Small Text Displayed to a User
People often have difficulty reading text on a computer display screen.
Often this is due to vision difficulties. Thus, the type used in WYSIWYG
(what you see is what you get) applications is often too small for
comfortable reading at the display distance. Further, publishers use
different type sizes as a layout tool that indicates importance. Thus,
there is a large variation in text size and screen space used between the
largest headline text and the text of an article. To address this problem,
some applications allow the WYSIWYG text to be magnified. Examples of word
processing programs that provide this capability are Microsoft's Word.RTM.
and Adobe's FrameMaker.RTM. programs. However, these programs require the
user to explicitly specify, either directly or indirectly, the desired
magnification factor. Further, the magnification process reduces the
amount of the page that can be displayed on the computer display at the
same time because the percentage of the page that is displayed to the user
is reduced when the page is magnified. This problem is exacerbated with
applications that display WYSIWYG versions of newspapers and magazines
because these applications generally attempt to maintain the WYSIWYG page
layout and the displayed page is uniformly magnified. To see the entire
page, the article text is generally reduced to unreadable size.
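The trade-off between magnification and visible page area can be quantified with a brief Python sketch. It assumes a fixed rectangular viewport and a page that is magnified uniformly in both dimensions. The function name and the example dimensions are hypothetical.

```python
def visible_fraction(
    page_width: float,
    page_height: float,
    view_width: float,
    view_height: float,
    magnification: float,
) -> float:
    """Fraction of the page area that fits in the viewport at one time."""
    if magnification <= 0:
        raise ValueError("magnification must be positive")
    scaled_width = page_width * magnification
    scaled_height = page_height * magnification
    # The viewport shows at most its own size, or the whole page if smaller.
    shown_width = min(scaled_width, view_width)
    shown_height = min(scaled_height, view_height)
    return (shown_width * shown_height) / (scaled_width * scaled_height)


# A page that exactly fills an 800 by 1000 pixel viewport at normal size.
for factor in (1.0, 2.0, 4.0):
    print(factor, visible_fraction(800, 1000, 800, 1000, factor))
```

When the page exactly fills the viewport at normal size, the visible fraction falls with the square of the magnification factor, so doubling the text size leaves only one quarter of the page in view.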
The page layout of newspapers and magazines is important. To attract the
interest of a large number of readers, the publishers of newspapers
present a large number of articles on the first few pages. One way to
increase the number of articles on a page is to decrease the amount of
space used for the article. In a traditional newspaper, this is
accomplished by moving subsequent parts of the article to different pages.
This allows the reader to quickly scan articles that the editor believes
to be most relevant and to read in depth those articles that the reader
finds interesting. Further, where articles are placed on a page influences
the order that articles are viewed. Electronic newspapers have these same
characteristics.
Additionally, electronic newspapers, like traditional newspapers, use
different type styles and sizes to indicate the relative importance of
headlines and subheaders. Thus, there is a wide disparity between the
largest and smallest text displayed to the reader. Moreover, even large
computer displays have a smaller display area than is available to a
traditional newspaper thus reducing the area available to the publisher
for articles.
Nevertheless, a computer display must often carry the same amount of
information as a newspaper. Thus, mapping the content of a newspaper onto
a display screen reduces the size of the type used for the articles to the
point where the text of the article is extremely difficult to read.
Further, the magnification method used by word processing programs for
globally expanding the displayed text does not work well when presenting
many articles on a page because magnifying the entire page, and providing
a limited view into the page, distances the structure of the page from the
viewer. Thus, globally expanding the text is incompatible with presenting
as many articles as is desired on the page. Further, globally expanding
the page also expands the larger title and headline text more than is
needed to make this text readable and at a cost of consuming undue display
space that could otherwise be used to present additional information.
Thus, there is a need for a mechanism that optimizes the text size for a
reader while still preserving the structural indications provided by the
page layout.
The invention addresses these problems.
Problems with Selecting Relevant Information for a User
Another aspect of electronic newspapers, briefly mentioned above, is that
of selecting information content for the newspaper. Information content
includes both articles about particular items of interest and advertising
information. Information content is a major reason why people select
different paper-based magazines and newspapers. These traditional
information providers present the information that they believe interests
their readers. Traditional newspapers and magazines are static once
printed. Thus, each edition is the same for all those who read it and each
copy of a particular edition distributed to a particular region has the
same articles, the same layout and the same advertising as all the other
copies distributed to the region. This advertising and information content
can be customized to the particular region. However, this regionalization
can only be carried so far as it is extremely expensive to customize a
newspaper or magazine to the particular individual interests of each
reader. Thus, some of the information selected for a region will not
interest some readers.
Intangible electronic newspapers need not be constrained by the above
mentioned limitations inherent in using a tangible paper medium. However,
electronic newspapers still target most advertising and information
content to a particular market and not to the particular interests of an
individual reader. Even where the reader of an electronic publication is
provided with a means to customize the content of the electronic paper, the
user must explicitly specify the content. Further, by explicitly
specifying the content, the user may not be presented with other
related information that falls just outside of the specification but that
could be of interest to the reader.
The invention addresses these problems.
SUMMARY OF THE INVENTION
The present invention provides an economical apparatus, method, system and
computer program product for providing enhanced facilities to computer
users. This invention provides a way for a computer to monitor the user to
determine what aspect of the computer operation the user is interested in
and to respond accordingly.
One aspect of the invention is a computer controlled method for presenting
information to a user of a computer. The method makes use of a display
device and a gaze-tracking device. The gaze-tracking device determines the
user's gaze position on the display device. The first step of the method
displays a plurality of categorized information on the display device. It
then monitors the gaze position to determine the user's level of interest
in the plurality of categorized information displayed on the display
device. The invention also retrieves one or more topics that classify the
plurality of categorized information. Using these topics and the level of
interest the invention determines a correlation that is used to select a
new plurality of categorized information.
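The steps of this method can be sketched in Python under stated assumptions. The sketch assumes that the level of interest in each displayed item is approximated by the time the gaze position dwells on that item, that the correlation is a normalized per-topic total of that dwell time, and that new items are ranked by the sum of the weights of their topics. These choices, together with all function names and sample data, are illustrative assumptions and are not taken from the detailed description.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def topic_interest(
    dwell_seconds: Dict[str, float],
    topics_by_item: Dict[str, Sequence[str]],
) -> Dict[str, float]:
    """Correlate topics with interest: share of total dwell time per topic."""
    totals: Dict[str, float] = defaultdict(float)
    for item, seconds in dwell_seconds.items():
        for topic in topics_by_item.get(item, ()):
            totals[topic] += seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # the user looked at nothing that carries a topic
    return {topic: seconds / grand_total for topic, seconds in totals.items()}


def select_new_items(
    candidates: Dict[str, Sequence[str]],
    weights: Dict[str, float],
    count: int,
) -> List[str]:
    """Pick the candidates whose topics best match the measured interest."""
    def score(item: str) -> float:
        return sum(weights.get(topic, 0.0) for topic in candidates[item])

    # Sort by descending score; break ties by item name for repeatability.
    ranked = sorted(candidates, key=lambda item: (-score(item), item))
    return ranked[:count]


# Hypothetical first page: seconds of gaze dwell and topics per item.
dwell = {"article1": 12.0, "article2": 3.0, "advert1": 0.0}
topics = {"article1": ["sports"], "article2": ["finance"], "advert1": ["cars"]}
weights = topic_interest(dwell, topics)

# Hypothetical pool of items available for the next page.
pool = {"b1": ["sports"], "b2": ["cars"], "b3": ["finance", "sports"]}
print(select_new_items(pool, weights, count=2))
```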
Another aspect of the invention discloses an information presentation
apparatus that presents information on a display device to a user. This
apparatus includes a central processor unit, a memory and a gaze-tracking
device. The gaze-tracking device determines a gaze position on the display
device. The apparatus also includes a display mechanism that displays a
plurality of categorized information on the display device. Additionally
the apparatus uses a monitoring mechanism to monitor the gaze position to
determine the user's level of interest in the plurality of categorized
information presented on the display device. The apparatus also includes a
retrieval mechanism that retrieves topics that classify the categorized
information displayed on the display device. These topics and the level of
interest of the user are provided to a correlation mechanism that
determines a correlation between the topics and level of interest. Lastly,
the apparatus includes a selection mechanism that selects a new plurality
of categorized information based on the correlation.
Yet another aspect of the invention is an information presentation system
that presents information on a display device to a user. This system
includes a gaze-tracking device. The gaze-tracking device determines a
gaze position on the display device. The system also includes a display
mechanism that displays a plurality of categorized information on the
display device. Additionally the system uses a monitoring mechanism to
monitor the gaze position to determine the user's level of interest in the
plurality of categorized information presented on the display device. The
system also includes a retrieval mechanism that retrieves topics that
classify the categorized information displayed on the display device.
These topics and the level of interest of said user are provided to a
correlation mechanism that determines a correlation between the topics and
level of interest. Lastly, the system includes a selection mechanism that
selects a new plurality of categorized information based on the
correlation.
A final aspect of the invention discloses a computer program product having
computer readable code embodied in a computer usable storage medium. When
executed on a computer, the computer readable code causes a computer to
effect a display mechanism to display a plurality of categorized
information on a display device. Further the invention contains code that
effects a monitoring mechanism, a retrieval mechanism, a correlation
mechanism and a selection mechanism having the same functions as the
system described above.
The foregoing and many other objects and advantages of the present
invention will no doubt become obvious to those of ordinary skill in the
art after having read the following detailed description of the preferred
embodiments which are illustrated in the various drawing figures.
DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates a portion of a computer system, including a CPU and a
conventional memory in which the present invention may be embodied;
FIG. 2 illustrates a display device fitted with gaze tracking equipment;
FIG. 3 illustrates aspects of a gaze position in accordance with a
preferred embodiment;
FIG. 4 illustrates the process for determining a gaze position as used in
accordance with a preferred embodiment;
FIG. 5 illustrates the operation of the invention to allocate bandwidth to
an area of interest in accordance with a preferred embodiment;
FIG. 6 illustrates the process used to change the bandwidth of a data
stream based upon a gaze position in accordance with a preferred
embodiment;
FIGS. 7A & 7B illustrate audio captioning in accordance with a preferred
embodiment;
FIG. 8 illustrates extraction of an image caption from a page of text in
accordance with a preferred embodiment;
FIG. 9 illustrates the process used to implement captioning in accordance
with a preferred embodiment;
FIG. 10 illustrates the form of an electronic newspaper;
FIG. 11 illustrates text magnification and page layout in accordance with a
preferred embodiment;
FIG. 12 illustrates text magnification and page layout in accordance with a
second preferred embodiment;
FIG. 13 illustrates the process of expanding text in response to the user's
interest in the text in accordance with a preferred embodiment;
FIG. 14 illustrates the process for adjusting the layout of a display as a
result of expanded text in accordance with a preferred embodiment;
FIG. 15 illustrates a possible first page of an electronic newspaper
showing articles and an advertisement in accordance with a preferred
embodiment;
FIG. 16 illustrates a possible second page of an electronic newspaper
showing information determined to be of interest to the reader in
accordance with a preferred embodiment; and
FIG. 17 illustrates the process used to evaluate the information of
interest to a reader and to select new information matching the reader's
interest in accordance with a preferred embodiment.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
Notations and Nomenclature
The following "notations and nomenclature" are provided to assist in the
understanding of the present invention and the preferred embodiments
thereof.
Advertisement--Information provided about a commercial product or service
with the purpose of informing the viewer about the product or service so
as to lead to a commercial transaction. A type of article.
Article--A complete piece of writing often identified with a title.
Bandwidth--The amount of information that can be passed across a
communication channel in a given period of time (usually designated in
reference to a second).
Dialog--A specialized window that is used to obtain additional information
from the user. A dialog is often used to obtain options and parameters
that are computer dependent. A good example is a print dialog that is
evoked by a print menu command. The print dialog allows the user to
specify what printer options are to be used for a particular print job.
Generally the dialog allows the user to specify specific parameters and
then to either affirm or cancel the command that evoked the dialog. If the
user cancels the command, the dialog window is removed and the command
that evoked the dialog is aborted. If the user confirms the command the
user provided information acquired by the dialog is used in the execution
of the command that evoked the dialog.
E-mail system--Electronic mail system. A system of computers generally
connected by a network that allows a sender (being a user of a first
computer) to compose and send data making up a message to a recipient
(being a user of either the first computer or of a second computer).
Graphical User Interface (GUI)--A user interface that allows a user to
interact with a computer display by pointing at selectable control areas
on the display and activating a command or computer operation associated
with the selectable control area. GUIs are well known in the art.
Gaze position--An area of interest on the screen providing a boundary of
the user's gaze over a limited period of time.
Gaze coordinates--The coordinates that represent the intersection of the
user's gaze with the display screen over a limited period of time.
Gaze coordinates (raw)--The coordinates that represent the instantaneous
intersection of the user's gaze with the display screen.
Image--Any information displayed on a display screen such as, but not
limited to, pictures, drawings, illustrations, text, and video. An image is
generally displayed in a view contained in a window. A still image is a
picture. A moving image is comprised of a number of frames of still images
that are played in sequence similar to a video.
Pointing device--A device responsive to a computer user's input that moves
an indicator on a computer display screen. Such an indicator has an active
point such that if the pointing device is activated (e.g., by a button
push for a mouse device) a command associated with the selectable control
area covered by the active point is invoked. Pointing devices are
generally used with graphical user interfaces.
Selectable control area--An area on a computer display that is sensitive to
activation of a pointing device. On activation of the pointing device over
the selectable control area, a command or computer operation associated
with the selectable control area is invoked. Most computer systems that
provide a Graphical User Interface (GUI) also provide other methods for
invoking these commands or computer operations such as keyboard function
keys or command lines.
URL--A Uniform Resource Locator. URLs are used to access information on the
World Wide Web.
View--An area in a window where information is provided.
Window--An area, usually rectangular, on a computer display screen
controlled by an application.
Procedure--A self-consistent sequence of steps leading to a desired result.
These steps are those requiring physical manipulation of physical
quantities. Usually these quantities take the form of electrical or
magnetic signals capable of being stored, transferred, combined, compared,
and otherwise manipulated. These signals are referred to as bits, values,
elements, symbols, characters, terms, numbers, or the like. It w