| United States Patent | 5754939 |
| Link to this page | http://www.wikipatents.com/5754939.html |
| Inventor(s) | Herz; Frederick S. M. (Davis, WV); Eisner; Jason M. (Philadelphia, PA); Ungar; Lyle H. (Philadelphia, PA); Marcus; Mitchell P. (Philadelphia, PA) |
| Abstract | This invention relates to customized electronic identification of desirable
objects, such as news articles, in an electronic media environment, and in
particular to a system that automatically constructs both a "target
profile" for each target object in the electronic media based, for
example, on the frequency with which each word appears in an article
relative to its overall frequency of use in all articles, as well as a
"target profile interest summary" for each user, which target profile
interest summary describes the user's interest level in various types of
target objects. The system then evaluates the target profiles against the
users' target profile interest summaries to generate a user-customized
rank ordered listing of target objects most likely to be of interest to
each user so that the user can select from among these potentially
relevant target objects, which were automatically selected by this system
from the plethora of target objects that are profiled on the electronic
media. Users' target profile interest summaries can be used to efficiently
organize the distribution of information in a large scale system
consisting of many users interconnected by means of a communication
network. Additionally, a cryptographically-based pseudonym proxy server is
provided to ensure the privacy of a user's target profile interest
summary, by giving the user control over the ability of third parties to
access this summary and to identify or contact the user. |
Title Information  |
System for generation of user profiles for a system for customized
electronic identification of desirable objects |
| Publication Date |
May 19, 1998 |
| Filing Date |
October 31, 1995 |
| Parent Case |
CROSS-REFERENCE TO RELATED APPLICATIONS
This patent application is a continuation-in-part of U.S. patent
application Ser. No. 08/346,425, filed Nov. 28, 1994 and titled "SYSTEM
AND METHOD FOR SCHEDULING BROADCAST OF AND ACCESS TO VIDEO PROGRAMS AND
OTHER DATA USING CUSTOMER PROFILES", which application is assigned to the
same assignee as the present application. |
References  |
U.S. References |
| Reference | Inventor | Class | Date |
| 5600364 | Hendricks | 725/9 | Feb, 1997 |
| 5541638 | Story | 725/116 | Jul, 1996 |
| 5534911 | Levitan | 725/46 | Jul, 1996 |
| 5410344 | Graves | 725/46 | Apr, 1995 |
| 5373558 | Chaum | 713/180 | Dec, 1994 |
| 5331556 | Black, Jr. | 704/9 | Jul, 1994 |
| 5331554 | Graham | 707/5 | Jul, 1994 |
| 5321833 | Chang | 707/5 | Jun, 1994 |
| 5301109 | Landauer | 704/9 | Apr, 1994 |
| 5251324 | McMullan, Jr. | 725/14 | Oct, 1993 |
| 5245656 | Loeb | 713/154 | Sep, 1993 |
| 5136501 | Silverman | 705/37 | Aug, 1992 |
| 5131039 | Chaum | 705/69 | Jul, 1992 |
| 4987593 | Chaum | 705/69 | Jan, 1991 |
| 4947430 | Chaum | 713/180 | Aug, 1990 |
| 4926480 | Chaum | 705/69 | May, 1990 |
| 4914698 | Chaum | 380/30 | Apr, 1990 |
| 4759063 | Chaum | 380/30 | Jul, 1988 |
| 4706080 | Sincoskie | 340/825.02 | Nov, 1987 |
| 4529870 | Chaum | 235/380 | Jul, 1985 |
| 5483278 | Strubbe | 725/61 | Dec, 1969 |
| 5276736 | Chaum | 705/69 | Dec, 1969 |
| 5469206 | Strubbe | 725/60 | Dec, 1969 |
Claims  |
We claim:
1. A method for providing a user with access to selected ones of a plurality of target objects and sets of target object characteristics that are accessible via an electronic storage
media, where said users are connected via user terminals and data communication connections to a target server system which accesses said electronic storage media, said method comprising the steps of:
automatically generating at least one user target profile interest summary for a user at a user terminal, each of said user target profile interest summary being indicative of ones of said target objects and sets of target object characteristics
accessed by said user; and
storing said at least one user target profile interest summary in a memory.
2. The method of claim 1 further comprising the step of:
enabling said user to access said plurality of target objects and sets of target object characteristics stored on said electronic storage media via said user target profile interest summaries.
3. The method of claim 2 wherein said step of enabling access comprises:
correlating said user target profile interest summaries, generated for said user, with target profiles generated for said plurality of target objects and sets of target object characteristics to identify ones of said plurality of target objects
and sets of target object characteristics stored on said electronic storage media that are likely to be of interest to said user.
4. The method of claim 3 wherein said step of enabling access further comprises: transmitting a list, that identifies at least one of said identified ones of said plurality of target objects and sets of target object characteristics, to said
user; and
providing access to a selected one of said plurality of target objects and sets of target object characteristics stored on said electronic storage media in response to said user selecting an item from said list.
5. The method of claim 4 wherein said step of providing access comprises:
transmitting data, in response to said user activating said user terminal to identify said selected item on said list, indicative of said user's selection of said selected item from said user terminal to said target server via a one of said data
communication connections.
6. The method of claim 5 wherein said step of providing access further comprises:
retrieving, in response to receipt of said data from said user terminal, a one of a target object and set of target object characteristics identified by said selected item from said electronic storage media; and
transmitting said retrieved one of said target object and set of target object characteristics to said user terminal for display thereon to said user.
7. The method of claim 1 wherein said step of automatically generating comprises:
automatically updating said user target profile interest summary for said user as a function of said target objects and sets of target object characteristics retrieved by said user.
8. The method of claim 1 wherein said target object is a document having at least one page, said step of automatically generating comprises:
automatically updating said user target profile interest summary for said user as a function of the number of pages of said retrieved documents accessed by said user.
9. The method of claim 1 wherein said step of automatically generating comprises:
automatically updating said user target profile interest summary for said user as a function of a length of time said user accessed said retrieved target objects and sets of target object characteristics.
10. The method of claim 1 wherein said user target profile interest summaries are stored in an autonomous server interposed between said user terminal and said target server.
11. The method of claim 10 further comprising the step of:
transmitting data from said user terminal, indicative of target objects and sets of target object characteristics retrieved by said user from said electronic storage media, to said autonomous server to enable said autonomous server to update said
user target profile interest summaries stored therein.
12. Apparatus for providing a user with access to selected ones of a plurality of target objects and sets of target object characteristics that are accessible via an electronic storage media, where said users are connected via user terminals and
data communication connections to a target server system which accesses said electronic storage media, comprising:
means for automatically generating at least one user target profile interest summary for a user at a user terminal, each of said user target profile interest summaries being indicative of ones of said target objects and sets of target object
characteristics accessed by said user; and
means for storing said at least one user target profile interest summary in a memory.
13. The apparatus of claim 12 further comprising:
means for enabling said user to access said plurality of target objects and sets of target object characteristics stored on said electronic storage media via said user target profile interest summaries.
14. The apparatus of claim 13 wherein said means for enabling access comprises:
means for correlating said user target profile interest summaries, generated for said user, with target profiles generated for said plurality of target objects and sets of target object characteristics to identify ones of said plurality of target
objects and sets of target object characteristics stored on said electronic storage media that are likely to be of interest to said user.
15. The apparatus of claim 14 wherein said means for enabling access further comprises:
means for transmitting a list, that identifies at least one of said identified ones of said plurality of target objects and sets of target object characteristics, to said user; and
means for providing access to a selected one of said plurality of target objects and sets of target object characteristics stored on said electronic storage media in response to said user selecting an item from said list.
16. The apparatus of claim 15 wherein said means for providing access comprises:
means for transmitting data, in response to said user activating said user terminal to identify said selected item on said list, indicative of said user's selection of said selected item from said user terminal to said target server via a one of
said data communication connections.
17. The apparatus of claim 16 wherein said means for providing access further comprises:
means for retrieving, in response to receipt of said data from said user terminal, a target object identified by said selected item from said electronic storage media; and
means for transmitting said retrieved target object to said user terminal for display thereon to said user.
18. The apparatus of claim 12 wherein said means for automatically generating comprises:
means for automatically updating said user target profile interest summary for said user as a function of said target objects and sets of target object characteristics retrieved by said user.
19. The apparatus of claim 12 wherein said target object is a document having at least one page, said means for automatically generating comprises:
means for automatically updating said user target profile interest summary for said user as a function of the number of pages of said retrieved documents accessed by said user.
20. The apparatus of claim 12 wherein said means for automatically generating comprises:
means for automatically updating said user target profile interest summary for said user as a function of a length of time said user accessed said retrieved target objects and sets of target object characteristics.
21. The apparatus of claim 12 further comprising:
autonomous server means interposed between said user terminal and said target server for storing said user target profile sets.
22. The apparatus of claim 21 further comprising:
means for transmitting data from said user terminal, indicative of target objects and sets of target object characteristics retrieved by said user from said electronic storage media, to said autonomous server to enable said autonomous server to
update said user target profile interest summaries stored therein.
Description  |
FIELD OF INVENTION
This invention relates to customized electronic identification of desirable objects, such as news articles, in an electronic media environment, and in particular to a system that automatically constructs both a "target profile" for each target
object in the electronic media based, for example, on the frequency with which each word appears in an article relative to its overall frequency of use in all articles, as well as a "target profile interest summary" for each user, which target profile
interest summary describes the user's interest level in various types of target objects. The system then evaluates the target profiles against the users' target profile interest summaries to generate a user-customized rank ordered listing of target
objects most likely to be of interest to each user so that the user can select from among these potentially relevant target objects, which were automatically selected by this system from the plethora of target objects that are profiled on the electronic
media. Users' target profile interest summaries can be used to efficiently organize the distribution of information in a large scale system consisting of many users interconnected by means of a communication network. Additionally, a cryptographically
based proxy server is provided to ensure the privacy of a user's target profile interest summary, by giving the user control over the ability of third parties to access this summary and to identify or contact the user.
PROBLEM
It is a problem in the field of electronic media to enable a user to access information of relevance and interest to the user without requiring the user to expend an excessive amount of time and energy searching for the information. Electronic
media, such as on-line information sources, provide a vast amount of information to users, typically in the form of "articles," each of which comprises a publication item or document that relates to a specific topic. The difficulty with electronic media
is that the amount of information available to the user is overwhelming and the article repository systems that are connected on-line are not organized in a manner that sufficiently simplifies access to only the articles of interest to the user.
Presently, a user either fails to access relevant articles because they are not easily identified or expends a significant amount of time and energy to conduct an exhaustive search of all articles to identify those most likely to be of interest to the
user. Furthermore, even if the user conducts an exhaustive search, present information searching techniques do not necessarily accurately extract only the most relevant articles, but also present articles of marginal relevance due to the functional
limitations of the information searching techniques. There is also no existing system which automatically estimates the inherent quality of an article or other target object to distinguish among a number of articles or target objects identified as of
possible interest to a user.
Therefore, in the field of information retrieval, there is a long-standing need for a system which enables users to navigate through the plethora of information. With the commercialization of communication networks, such as the Internet, the growth
of available information has accelerated. Customization of the information delivery process to the user's unique tastes and interests is the ultimate solution to this problem. However, the techniques which have been proposed to date either only address
the user's interests on a superficial level or provide greater depth and intelligence at the cost of unwanted demands on the user's time and energy. While many researchers have agreed that traditional methods have been lacking in this regard, no one to
date has successfully addressed these problems in a holistic manner and provided a system that can fully learn and reflect the user's tastes and interests. This is particularly true in a practical commercial context, such as on-line services available
on the Internet. There is a need for an information retrieval system that is largely or entirely passive, unobtrusive, undemanding of the user, and yet both precise and comprehensive in its ability to learn and truly represent the user's tastes and
interests. Present information retrieval systems require the user to specify the desired information retrieval behavior through cumbersome interfaces.
Users may receive information on a computer network either by actively retrieving the information or by passively receiving information that is sent to them. Just as users of information retrieval systems face the problem of too much
information, so do users who are targeted with electronic junk mail by individuals and organizations. An ideal system would protect the user from unsolicited advertising, both by automatically extracting only the most relevant messages received by
electronic mail, and by preserving the confidentiality of the user's preferences, which should not be freely available to others on the network.
Researchers in the field of published article information retrieval have devoted considerable effort to finding efficient and accurate methods of allowing users to select articles of interest from a large set of articles. The most widely used
methods of information retrieval are based on keyword matching: the user specifies a set of keywords which the user thinks are exclusively found in the desired articles and the information retrieval computer retrieves all articles which contain those
keywords. Such methods are fast, but are notoriously unreliable, as users may not think of the right keywords, or the keywords may be used in unwanted articles in an irrelevant or unexpected context. As a result, the information retrieval computers
retrieve many articles which are unwanted by the user. The logical combination of keywords and the use of wild-card search parameters help improve the accuracy of keyword searching but do not completely solve the problem of inaccurate search results.
Starting in the 1960's, an alternate approach to information retrieval was developed: users were presented with an article and asked if it contained the information they wanted, or to quantify how close the information contained in the article was to
what they wanted. Each article was described by a profile which comprised either a list of the words in the article or, in more advanced systems, a table of word frequencies in the article. Since a measure of similarity between articles is the distance
between their profiles, the measured similarity of article profiles can be used in article retrieval. For example, a user searching for information on a subject can write a short description of the desired information. The information retrieval
computer generates an article profile for the request and then retrieves articles with profiles similar to the profile generated for the request. These requests can then be refined using "relevance feedback", where the user actively or passively rates
the articles retrieved as to how close the information contained therein is to what is desired. The information retrieval computer then uses this relevance feedback information to refine the request profile and the process is repeated until the user
either finds enough articles or tires of the search.
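The profile-matching loop described above can be sketched as follows. This is a minimal illustration, not the method of any cited system: it assumes Euclidean distance between word-frequency tables and a simple linear update for relevance feedback, neither of which is specified in the text, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def profile(text):
    """Return a table of relative word frequencies for one article."""
    words = re.findall(r"[a-z]+", text.lower())
    total = len(words)
    return {w: n / total for w, n in Counter(words).items()} if total else {}


def distance(p, q):
    """Euclidean distance between two profiles (an assumed metric)."""
    keys = set(p) | set(q)
    return math.sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def retrieve(request, articles, k=2):
    """Return the k article ids whose profiles lie closest to the request."""
    return sorted(articles, key=lambda a: distance(request, articles[a]))[:k]


def refine(request, liked, step=0.5):
    """Relevance feedback (assumed form): move the request profile part of
    the way toward the profile of an article the user rated as close."""
    keys = set(request) | set(liked)
    return {k: (1 - step) * request.get(k, 0.0) + step * liked.get(k, 0.0)
            for k in keys}


articles = {
    "a1": profile("stock markets fell as bond yields rose"),
    "a2": profile("the team won the cup final on penalties"),
    "a3": profile("bond markets and stock prices moved together"),
}
request = profile("news about stock and bond markets")
top = retrieve(request, articles)
refined = refine(request, articles["a3"])
```

Repeating `retrieve` with the refined request corresponds to one round of the feedback cycle; a real system would also need a stopping rule and some handling of negative ratings.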
A number of researchers have looked at methods for selecting the articles of most interest to users. An article titled "Social information filtering: algorithms for automating `word of mouth`" was published in the CHI '95 Proceedings by Patti Maes et
al. and describes the Ringo information retrieval system, which recommends musical selections. The Ringo system requires active feedback from the users--users must manually specify how much they like or dislike each musical selection. The Ringo system
maintains a complete list of users' ratings of music selections and makes recommendations by finding which selections were liked by multiple people. However, the Ringo system does not take advantage of any available descriptions of the music, such as
structured descriptions in a data base, or free text, such as that contained in music reviews. An article titled "Evolving agents for personalized information filtering", published at the Proc. 9th IEEE Conf. on AI for Applications by Sheth and Maes,
described the use of agents for information filtering which use genetic algorithms to learn to categorize Usenet news articles. In this system, users must define news categories and the users actively indicate their opinion of the selected articles.
Their system uses a list of keywords to represent sets of articles and the records of users' interests are updated using genetic algorithms.
A number of other research groups have looked at the automatic generation and labeling of clusters of articles for the purpose of browsing through the articles. A group at Xerox PARC published a paper titled "Scatter/gather: a cluster-based
approach to browsing large article collections" at the 15th Ann. Int'l SIGIR '92, ACM 318-329 (Cutting et al. 1992). This group developed a method they call "scatter/gather" for performing information retrieval searches. In this method, a collection of
articles is "scattered" into a small number of clusters; the user then chooses one or more of these clusters based on a short summary of each cluster. The selected clusters are then "gathered" into a subcollection, and the process is repeated. Each
iteration of this process is expected to produce a smaller, more focused collection. The cluster "summaries" are generated by picking those words which appear most frequently in the cluster and the titles of those articles closest to the center of the
cluster. However, no feedback from users is collected or stored, so no performance improvement occurs over time.
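The cluster summaries described above can be sketched roughly as follows. This is an illustrative sketch only, not the scatter/gather implementation: it assumes raw word counts as article profiles, the per-word mean count as the cluster center, and Euclidean distance; `summarize_cluster` is a hypothetical name.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase alphabetic words in one article."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def summarize_cluster(cluster, n_words=3, n_titles=1):
    """cluster is a list of (title, text) pairs.

    Returns the most frequent words in the cluster and the titles of the
    articles closest to the cluster center.
    """
    counts = [word_counts(text) for _, text in cluster]
    total = sum(counts, Counter())
    by_frequency = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    top_words = [w for w, _ in by_frequency[:n_words]]
    # Assumed center: mean count of each word over the cluster's articles.
    center = {w: c / len(cluster) for w, c in total.items()}

    def dist(c):
        return math.sqrt(sum((c.get(w, 0) - center[w]) ** 2 for w in center))

    ranked = sorted(range(len(cluster)), key=lambda i: (dist(counts[i]), i))
    return top_words, [cluster[i][0] for i in ranked[:n_titles]]


cluster = [
    ("Rates rise", "bank raises rates as inflation rises"),
    ("Inflation watch", "inflation and rates worry the bank"),
    ("Bank holiday", "bank closed for the holiday"),
]
words, titles = summarize_cluster(cluster)
```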
Apple's Advanced Technology Group has developed an interface based on the concept of a "pile of articles". This interface is described in an article titled "A `pile` metaphor for supporting casual organization of information", published in Human Factors in
Computer Systems, CHI '92 Conf. Proc. 627-634, by R. Mander, G. Salomon and Y. Wong (1992). Another article titled "Content awareness in a file system interface: implementing the `pile` metaphor for organizing information" was published in
16th Ann. Int'l SIGIR '93, ACM 260-269 by Rose E. D. et al. The Apple interface uses word frequencies to automatically file articles by picking the pile most similar to the article being filed. This system functions to cluster articles into subpiles,
determine key words for indexing by picking the words with the largest TF/IDF (where TF is term (word) frequency and IDF is the inverse document frequency) and label piles by using the determined key words.
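Picking key words by TF/IDF, as described above, might look like the following sketch. It assumes TF is the raw count of a word within the pile and IDF is the logarithm of the total number of articles divided by the number of articles containing the word, which is one common formulation and not necessarily the one used in the Apple interface. The function name is hypothetical, and the pile's articles are assumed to be part of the corpus.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf_keywords(pile, corpus, top_n=2):
    """Return the top_n words of a pile ranked by TF * IDF.

    Assumes every article in `pile` also appears in `corpus`, so each
    pile word has a document frequency of at least one.
    """
    tf = Counter(w for doc in pile for w in tokenize(doc))
    doc_freq = Counter(w for doc in corpus for w in set(tokenize(doc)))
    n_docs = len(corpus)
    scores = {w: tf[w] * math.log(n_docs / doc_freq[w]) for w in tf}
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


pile = [
    "senate budget vote delayed",
    "senate budget talks resume",
    "the senate passes the budget",
]
others = [
    "the cup final ends level",
    "the league cup draw",
    "the storm closes the port",
    "the port reopens",
    "the film wins the award",
]
labels = tf_idf_keywords(pile, pile + others)
```

Here a word such as "the" scores low because it appears in most articles, while "senate" and "budget" are both frequent in the pile and rare elsewhere.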
Numerous patents address information retrieval methods, but none develop records of a user's interest based on passive monitoring of which articles the user accesses. None of the systems described in these patents present computer architectures
to allow fast retrieval of articles distributed across many computers. None of the systems described in these patents address issues of using such article retrieval and matching methods for purposes of commerce or of matching users with common interests
or developing records of users' interests. U.S. Pat. No. 5,321,833 issued to Chang et al. teaches a method in which users choose terms to use in an information retrieval query, and specify the relative weightings of the different terms. The Chang
system then calculates multiple levels of weighting criteria. U.S. Pat. No. 5,301,109 issued to Landauer et al. teaches a method for retrieving articles in a multiplicity of languages by constructing "latent vectors" (SVD or PCA vectors) which
represent correlations between the different words. U.S. Pat. No. 5,331,554 issued to Graham et al. discloses a method for retrieving segments of a manual by comparing a query with nodes in a decision tree. U.S. Pat. No. 5,331,556 addresses
techniques for deriving morphological part-of-speech information and thus making use of the similarities of different forms of the same word (e.g. "article" and "articles").
Therefore, there presently is no information retrieval and delivery system operable in an electronic media environment that enables a user to access information of relevance and interest to the user without requiring the user to expend an
excessive amount of time and energy.
SOLUTION
The above-described problems are solved and a technical advance achieved in the field by the system for customized electronic identification of desirable objects in an electronic media environment, which system enables a user to access target
objects of relevance and interest to the user without requiring the user to expend an excessive amount of time and energy. Profiles of the target objects are stored on electronic media and are accessible via a data communication network. In many
applications, the target objects are informational in nature, and so may themselves be stored on electronic media and be accessible via a data communication network.
Relevant definitions of terms for the purpose of this description include: (a.) an object available for access by the user, which may be either physical or electronic in nature, is termed a "target object", (b.) a digitally represented profile
indicating that target object's attributes is termed a "target profile", (c.) the user looking for the target object is termed a "user", (d.) a profile holding that user's attributes, including age/zip code/etc. is termed a "user profile", (e.) a summary
of digital profiles of target objects that a user likes and/or dislikes, is termed the "target profile interest summary" of that user, (f.) a profile consisting of a collection of attributes, such that a user likes target objects whose profiles are
similar to this collection of attributes, is termed a "search profile" or in some contexts a "query" or "query profile," (g.) a specific embodiment of the target profile interest summary which comprises a set of search profiles is termed the "search
profile set" of a user, (h.) a collection of target objects with similar profiles, is termed a "cluster," (i.) an aggregate profile formed by averaging the attributes of all target objects in a cluster, is termed a "cluster profile," (j.) a real number
determined by calculating the statistical variance of the profiles of all target objects in a cluster, is termed a "cluster variance," (k.) a real number determined by calculating the maximum distance between the profiles of any two target objects in a
cluster, is termed a "cluster diameter."
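The three cluster measures defined above can be illustrated with a short sketch. It assumes each target profile is a fixed-length list of numeric attributes, that distance between profiles is Euclidean, and that the cluster variance is the mean squared distance of the profiles from the cluster profile; the definitions above do not fix these choices, and the function names are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance between two numeric profiles (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cluster_profile(profiles):
    """Average each attribute over all target profiles in the cluster."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def cluster_variance(profiles):
    """Mean squared distance of each profile from the cluster profile."""
    center = cluster_profile(profiles)
    return sum(distance(p, center) ** 2 for p in profiles) / len(profiles)


def cluster_diameter(profiles):
    """Maximum distance between the profiles of any two target objects."""
    return max(distance(p, q) for p in profiles for q in profiles)


cluster = [[0.0, 0.0], [4.0, 0.0], [4.0, 3.0]]
center = cluster_profile(cluster)
variance = cluster_variance(cluster)
diameter = cluster_diameter(cluster)
```

For this three-profile cluster the diameter is the distance between the first and last profiles, and the variance summarizes how tightly the profiles sit around their average.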
The system for electronic identification of desirable objects of the present invention automatically constructs both a target profile for each target object in the electronic media based, for example, on the frequency with which each word appears
in an article relative to its overall frequency of use in all articles, as well as a "target profile interest summary" for each user, which target profile interest summary describes the user's interest level in various types of target objects. The
system then evaluates the target profiles against the users' target profile interest summaries to generate a user-customized rank ordered listing of target objects most likely to be of interest to each user so that the user can select from among these
potentially relevant target objects, which were automatically selected by this system from the plethora of target objects available on the electronic media.
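The evaluation step described above can be sketched as a ranking function. This sketch assumes the target profile interest summary takes the form of a search profile set (item (g.) in the definitions), that profiles are word-weight tables, and that a target object is scored by its Euclidean distance to the nearest search profile. The scoring rule, the names and the sample data are illustrative assumptions rather than the patented method.

```python
import math


def distance(p, q):
    """Euclidean distance between two word-weight profiles (assumed metric)."""
    keys = set(p) | set(q)
    return math.sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def rank_targets(target_profiles, search_profile_set):
    """Return target ids ordered from most to least likely to interest the user.

    A target object's score is its distance to the nearest search profile,
    so an object need match only one of the user's areas of interest.
    """
    def score(target_id):
        profile = target_profiles[target_id]
        return min(distance(profile, s) for s in search_profile_set)

    return sorted(target_profiles, key=lambda t: (score(t), t))


# Hypothetical target profiles and a two-interest search profile set.
targets = {
    "budget-story": {"budget": 0.9, "senate": 0.4},
    "cup-report": {"cup": 0.8, "final": 0.6},
    "weather-note": {"storm": 1.0},
}
search_profile_set = [
    {"budget": 1.0, "senate": 0.5},
    {"cup": 1.0, "final": 0.5},
]
ranking = rank_targets(targets, search_profile_set)
```

Taking the minimum over the search profiles lets one user hold several unrelated interests without one of them diluting the others.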
Because people have multiple interests, a target profile interest summary for a single user must represent multiple areas of interest, for example, by consisting of a set of individual search profiles, each of which identifies one of the user's
areas of interest. Each user is presented with those target objects whose profiles most closely match the user's interests as described by the user's target profile interest summary. Users' target profile interest summaries are automatically updated on
a continuing basis to reflect each user's changing interests. In addition, target objects can be grouped into clusters based on their similarity to each other, for example, based on similarity of their topics in the case where the target objects are
published articles, and menus automatically generated for each cluster of target