|
|
|
| United States Patent | 6092049 |
| Link to this page | http://www.wikipatents.com/6092049.html |
| Inventor(s) | Chislenko; Alexander (Cambridge, MA); Lashkari; Yezdezard (Cambridge, MA); Tiu; David D. (Somerville, MA); Metral; Max E. (Boston, MA); McNulty; John Edward (Burlington, MA) |
| Abstract | A method for recommending items to users using automated collaborative
filtering stores profiles of users relating ratings to items in memory.
Profiles of items may also be stored in memory, the item profiles
associating users with the rating given to the item by that user or
inferred for the user by the system The user profiles include additional
information relating to the user or associated with the rating given to an
item by the user. Item profiles are retrieved to determine which users
have rated a particular item. Profiles of those users are accessed and the
ratings are used to calculate similarity factors with respect to other
users. The similarity factors, sometimes in connection with confidence
factors, are used to select a set of neighboring users. The neighboring
users are weighted based on their respective similarity factors, and a
rating for an item contained in the domain is predicted. In one
embodiment, items in the domain have features. In this embodiment, the
values for features can be clustered, and the similarity factors
incorporate assigned feature weights and feature value cluster weights. In
some embodiments, item concepts are used to enhance recommendation
accuracy. |
|
|
|
Title Information  |
|
|
|
|
|
Drawing from US Patent 6092049 |
|
|
Method and apparatus for efficiently recommending items using automated
collaborative filtering and feature-guided automated collaborative
filtering |
|
|
|
|
|
| Publication Date |
July 18, 2000 |
|
|
|
|
|
| Filing Date |
March 14, 1997 |
|
|
|
|
|
|
|
|
|
|
|
| Parent Case |
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part application of application Ser.
No. 08/597,442 filed Feb. 2, 1996, now abandoned, which itself claims
priority to provisional application Ser. No. 60/000,598, filed Jun. 30,
1995, and provisional application 60/008,458, filed Dec. 11, 1995, both of
which are now expired and are incorporated herein by reference. |
|
|
|
|
|
|
|
|
|
|
|
|
|
Title Information  |
|
|
References  |
|
|
| *references marked with an asterisk below are user-added references |
|
U.S. References |
|
|
| Add a new US reference: |
| | Reference | Relevancy | Comments | Reference | Relevancy | Comments | 5704017 Heckerman 706/12 Dec,1997 |      Your vote accepted [0 after 0 votes] | | 5699507 Goodnow, II 714/38 Dec,1997 |      Your vote accepted [0 after 0 votes] | | 5692107 Simoudis 706/12 Nov,1997 |      Your vote accepted [0 after 0 votes] | | 5583763 Atcheson 707/3 Dec,1996 |      Your vote accepted [0 after 0 votes] | | 5544161 Bigham 370/397 Aug,1996 |      Your vote accepted [0 after 0 votes] | | 5459306 Stein 235/383 Oct,1995 |      Your vote accepted [0 after 0 votes] | | 5446159 Stucky 546/118 Aug,1995 |      Your vote accepted [0 after 0 votes] | | 5446891 Kaplan 707/2 Aug,1995 |      Your vote accepted [0 after 0 votes] | | 5041972 Frost 705/10 Aug,1991 |      Your vote accepted [0 after 0 votes] | | 5034981 Leonard
Jul,1991 |      Your vote accepted [0 after 0 votes] | | 4914694 Leonard 380/204 Apr,1990 |      Your vote accepted [0 after 0 votes] | | 4872113 Dinerstein 705/10 Oct,1989 |      Your vote accepted [0 after 0 votes] | | 4870579 Hey 705/27 Sep,1989 |      Your vote accepted [0 after 0 votes] | | 4781596 Weinblatt 434/236 Nov,1988 |      Your vote accepted [0 after 0 votes] | | 4745549 Hashimoto
May,1988 |      Your vote accepted [0 after 0 votes] | | 4682956 Krane 434/237 Jul,1987 |      Your vote accepted [0 after 0 votes] | | 4647964 Weinblatt 725/32 Mar,1987 |      Your vote accepted [0 after 0 votes] | | 4646145 Percy 725/24 Feb,1987 |      Your vote accepted [0 after 0 votes] | | 4630108 Gomersall 725/34 Dec,1986 |      Your vote accepted [0 after 0 votes] | | 4627818 Von Fellenberg 434/236 Dec,1986 |      Your vote accepted [0 after 0 votes] | | 4602279 Freeman 725/35 Jul,1986 |      Your vote accepted [0 after 0 votes] | | 4566030 Nickerson 379/92.04 Jan,1986 |      Your vote accepted [0 after 0 votes] | | 4546382 McKenna 725/14 Oct,1985 |      Your vote accepted [0 after 0 votes] | | 4348740 White 709/253 Sep,1982 |      Your vote accepted [0 after 0 votes] | | 4331973 Eskin 725/34 May,1982 |      Your vote accepted [0 after 0 votes] | | 4205464 Baggott 434/237 Jun,1980 |      Your vote accepted [0 after 0 votes] | | 4041617 Hollander 434/237 Aug,1977 |      Your vote accepted [0 after 0 votes] | | 3952184 Bassard 704/1 Apr,1976 |      Your vote accepted [0 after 0 votes] | | 4658290 McKenna 725/14 Dec,1969 |      Your vote accepted [0 after 0 votes] | | 4996642 Hey 705/27 Dec,1969 |      Your vote accepted [0 after 0 votes] | | |
|
|
|
|
U.S. References |
|
|
Foreign References |
|
|
|
|
|
|
Foreign References |
|
|
Other References |
|
|
|
|
|
|
Other References |
|
|
|
|
|
References  |
|
|
|
|
|
| Market Size |
|
Estimate the gross annual revenues of the relevant market
sector:
|
| | |
| |
|
|
| Market Share |
|
Estimate the percentage of the relevant market sector this invention will capture:
|
| | |
| |
|
|
| Reasonable Royalty |
|
What percentage of gross sales should the inventor or assignee be paid?
|
| | |
| |
|
|
|
Public's "Guesstimation" of Royalty Value
|
| Market Size | N/A | [No votes] | | x | Market Share | N/A | [No votes] | | x | Reasonable Royalty | N/A | [No votes] |
| | N/A | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
Market Review  |
|
|
Technical Review  |
|
|
Claims  |
|
|
What is claimed is:
1. In a machine which recommends at least one item to a first user based, in part, on similarity factors between the first user and at least a second user, a method for
operating the machine to determine a similarity
factor between the first user and the second user, the method comprising the steps of:
(a) retrieving from memory, using the machine an item profile of each of a plurality of items rated by the first user so as to define a plurality of retrieved item profiles, each of said retrieved item profiles includes a plurality of first
values, each of the plurality of first values representing a rating given to an associated one of the items by each of a plurality of users;
(b) determining from the retrieved item profiles, using the machine, whether the second user has previously rated any of the items;
(c) retrieving from memory, using the machine, a corresponding user profile of the first user so as to define a retrieved first user profile;
(d) retrieving, from memory, a corresponding user profile of the second user provided the second user has previously rated any of the items, so as to define a retrieved second user profile, wherein each of the retrieved first and second user
profiles includes a plurality of second values, each of the plurality of second values representing a rating given to one of a plurality of items by a corresponding one of the users; and
(e) calculating, using the machine, a similarity factor between the first user and the second user responsive to the retrieved profiles of the first and second users, wherein the first and second user profiles are distinct from the plurality of
item profiles.
2. The method of claim 1 wherein the step of calculating a similarity factor includes sub-steps of:
subtracting, using the machine, for each item rated by both users, the rating given to the item by the second user from the rating given to the item by the first user;
squaring, using the machine, each rating difference; and
dividing, using the machine, the sum of the squared differences by the number of items rated by both users.
3. The method of claim 1
wherein, in step (a), each item profile of each item rated by the first user is retrieved from a first memory and wherein, in step (c), the user profile of the second user is retrieved from a second memory.
4. The method of claim 1 further comprising the step of retrieving, using the machine, a user profile of the first user from a second memory before step (a).
5. In computer-implemented apparatus, the apparatus having a processor and a memory, the memory having computer executable instructions stored therein, a method for recommending an item to a particular one of a plurality of users, the item not
yet rated by the user, the method comprising steps, performed through the executable instructions, of:
(a) storing a user profile, in the memory, for each of a plurality of users so as to define a plurality of user profiles, wherein each of the user profiles includes a plurality of first values, each of the plurality of first values representing a
rating given to one of a plurality of items by a corresponding one of the users;
(b) storing an item profile, in the memory, for each of the plurality of items so as to define a plurality of item profiles, wherein each of the item profiles includes a plurality of second values, each of the plurality of second values
representing a rating given to an associated one of the items by each of the users;
(c) retrieving, from the memory, the item profile of each item rated by the particular one user so as to define a plurality of retrieved item profiles;
(d) determining from the retrieved item profiles, a set of users that have previously rated the items;
(e) retrieving, from the memory, the user profile for each of the set of users so as to define a plurality of retrieved user profiles, the plurality of user profiles being distinct from the plurality of item profiles;
(f) calculating, in response to the retrieved user profiles, a similarity factor between the particular one user and each of the set of users;
(g) selecting, in response to the similarity factors, for the particular one user, a plurality of neighboring users from the set of users;
(h) assigning a weight to each of the neighboring users; and
(i) recommending at least one of the plurality of items to the particular one user based on the weights assigned to the neighboring users and ratings given to the item by the neighboring users.
6. The method of claim 5 wherein the user profile storing step comprises the step of storing additional information, in the user profiles, other than ratings for the items.
7. The method of claim 5 wherein the item profile storing step comprises the step of storing additional information, in the item profiles, other than ratings for the items.
8. The method of claim 5 further comprising the step of storing the item profiles in a first memory and storing the user profiles in a second memory.
9. The method of claim 5 wherein the item profile retrieving step comprises the steps of:
receiving a rating from one of the plurality of users for one of the plurality of items so as to define a received rating;
updating the user profile of the one of the plurality of users with the received rating;
updating the item profile of the rated item with the received rating;
calculating, for the particular one user, a plurality of similarity factors, each of the plurality of similarity factors representing similarity between the particular one user and another one of the users.
10. In computer-implemented apparatus, the apparatus having a processor and a memory, the memory having computer executable instructions stored therein, a method for recommending an item to a particular one of a plurality of users, the item not
yet rated by the user, the method comprising steps, performed through the executable instructions, of:
(a) generating, for the particular one user, a concept mask representing areas of interest of the one user;
(b) storing, in the memory, a user profile in a memory for each of a plurality of users, wherein the user profile of the user includes a plurality of values, each of the plurality of values representing a rating given to one of a plurality of
items by the user;
(c) calculating, for each of the plurality of users, a plurality of similarity factor vectors representing the similarity between each user and another one of the plurality of users for at least one concept specified within the concept mask;
(d) selecting, for each of the plurality of users, a plurality of neighboring users based on the similarity factor vectors;
(e) assigning a weight to each of the neighboring users so as to define a plurality of weights assigned to the neighboring users; and
(f) recommending at least one of the plurality of items to the particular one user based on the weights assigned to the neighboring users and the ratings given to the non-rated item by the neighboring users.
11. The method of claim 1 wherein the step of calculating a similarity factor is performed only if the second user has previously rated the items.
12. The method of claim 3 where in the first memory is located on a first server and the second memory is located on a second server.
13. The method of claim 10 wherein the concept mask includes a number of concepts, wherein each of the concepts may be associated with at least one item rating, and wherein each item rating may be associated with at least one concept mask.
14. The method of claim 10 wherein the concept mask includes every concept with which rated items are associated.
15. The method of claim 10 wherein the concept mask includes concepts which have a highest value of all concepts.
16. The method of claim 10 wherein the concept mask includes concepts which have a value which exceeds a predetermined threshold.
17. The method of claim 10 wherein the concept mask includes a predetermined limited number of concepts. |
|
|
|
|
Claims  |
|
|
Description  |
|
|
FIELD OF THE INVENTION
The present invention relates to a method and apparatus for efficiently recommending items and, in particular, to a method and apparatus for efficiently recommending items using automated collaborative filtering and feature-guided automated
collaborative filtering.
BACKGROUND OF THE INVENTION
The amount of information, as well as the number of goods and services, available to individuals is increasing exponentially. This increase in items and information is occurring across all domains, e.g. sound recordings, restaurants, movies,
World Wide Web pages, clothing stores, etc. An individual attempting to find useful information, or to decide between competing goods and services, is often faced with a bewildering selection of sources and choices.
Individual sampling of all items, even in a particular domain, may be impossible. For example, sampling every restaurant of a particular type in New York City would tax even the most avid diner. Such a sampling would most likely be
prohibitively expensive to carry out, and the diner would have to suffer through many unenjoyable restaurants.
In many domains, individuals have simply learned to manage information overload by relying on a form of generic referral system. For example, in the domain of movie and sound recordings, many individuals rely on reviews written by paid
reviewers. These reviews, however, are simply the viewpoint of one or two individuals and may not have a likelihood of correlating with how the individual will actually perceive the movie or sound recording. Many individuals may rely on a review only
to be disappointed when they actually sample the item.
One method of attempting to provide an efficient filtering mechanism is to use content-based filtering. The content-based filter selects items from a domain for the user to sample based upon correlations between the content of the item and the
user's preferences. Content-based filtering schemes suffer from the drawback that the items to be selected must be in some machine-readable form, or attributes describing the content of the item must be entered by hand. This makes content-based
filtering problematic for existing items such as sound recordings, photographs, art, video, and any other physical item that is not inherently machine-readable. While item attributes can be assigned by hand in order to allow a content-based search, for
many domains of items such assignment is not practical. For example, it could take decades to enter even the most rudimentary attributes for all available network television video clips by hand.
Perhaps more importantly, even the best content-based filtering schemes cannot provide an analysis of the quality of a particular item as it would be perceived by a particular user, since quality is inherently subjective. So, while a
content-based filtering scheme may select a number of items based on the content of those items, a content-based filtering scheme generally cannot further refine the list of selected items to recommend items that the individual will enjoy.
SUMMARY OF THE INVENTION
In one aspect, the present invention relates to a method for calculating a similarity factor between a first user and a second user. The method begins by retrieving from memory the profile of items rated by the first user. The retrieved item
profiles indicate whether the second user has previously rated those items. The second user's profile is then retrieved from memory and a similarity factor between the first user and the second user is calculated responsive to the retrieved user
profiles.
In another aspect, the present invention relates to a method for recommending an item to a user which begins by storing a user profile in memory for each user. The user profile includes ratings given to items by the user. An item profile is
also stored in memory which includes ratings given to the item by users. The profile of each item rated by the user is retrieved from memory and used to determine which other users of the system have rated that item. The profile of each of those users
is retrieved from memory and a similarity factor between the initial user each of the users that have rated the item is calculated. The similarity factors are calculated responsive to the retrieved user profiles. A set of neighboring users is selected
responsive to the similarity factors and a weight is assigned to each of the neighboring user. The neighboring users and the weights given to them are used together with the ratings given to items by those neighboring users to recommend at least one
item to the initial user.
In yet another aspect, the present invention relates to a method for recommending an item to a user which begins by generating a concept mask for the user which represents the user's areas of interest. The user's profile is stored in memory and
includes information related to the ratings given to items by the user. A plurality of similarity factor vectors is calculated. Each vector represents the similarity between each user and another user and the individual entries in the vector represent
the similarity between those users on a per-concept basis. The similarity factor vectors are used to select a set of neighboring users. A weight is assigned to the neighboring users and that weight, together with the ratings given to items by the
neighboring users, is used to recommend an item to the initial user.
BRIEF DESCRIPTION OF THE DRAWINGS
This invention is pointed out with particularity in the appended claims. The above and further advantages of this invention may be better understood by referring to the following description taken in conjunction
with the accompanying drawings, in which:
FIG. 1 is a flowchart of one embodiment of the method;
FIG. 2 is a diagrammatic view of a user profile-item profile matrix;
FIG. 3 is a flowchart of another embodiment of the method;
FIG. 4 is a block diagram of an embodiment of the apparatus; and
FIG. 5 is a block diagram of an Internet system on which the method and apparatus may be used.
DETAILED DESCRIPTION OF THE INVENTION
As referred to in this description, items to be recommended can be items of any type that a user may sample in a domain. When reference is made to a "domain," it is intended to refer to any category or subcategory of ratable items, such as sound
recordings, movies, restaurants, vacation destinations, novels, or World Wide Web pages. Referring now to FIG. 1, a method for recommending items begins by storing user and item information in profiles.
A plurality of user profiles is stored in a memory element (step 102). One profile may be created for each user or multiple profiles may be created for a user to represent that user over multiple domains. Alternatively, a user may be
represented in one domain by multiple profiles where each profile represents the proclivities of a user in a given set of circumstances. For example, a user that avoids seafood restaurants on Fridays, but not on other days of the week, could have one
profile representing the user's restaurant preferences from Saturday through Thursday, and a second profile representing the user's restaurant preferences on Fridays. In some embodiments, a user profile represents more than one user. For example, a
profile may be created which represents a woman and her husband for the purpose of selecting movies. Using this profile allows a movie recommendation to be given which takes into account the movie tastes of both individuals. For convenience, the
remainder of this specification will use the term "user" to refer to single users of the system, as well as "composite users." The memory element can be any memory element known in the art that is capable of storing user profile data and allowing the
user profiles to be updated, such as a disc drive or random access memory.
Each user profile associates items with the ratings given to those items by the user. Each user profile may also store information in addition to the user's rating. In one embodiment, the user profile stores information about the user, e.g.
name, address, or age. In another embodiment, the user profile stores information about the rating, such as the time and date the user entered the rating for the item. User profiles can be any data construct that facilitates these associations, such as
an array, although it is preferred to provide user profiles as sparse vectors of n-tuples. Each n-tuple contains at least an identifier representing the rated item and an identifier representing the rating that the user gave to the item, and may include
any number of additional pieces of information regarding the item, the rating, or both. Some of the additional pieces of information stored in a user profile may be calculated based on other information in the profile, for example, an average rating for
a particular selection of items (e.g., heavy metal albums) may be calculated and stored in the user's profile. In some embodiments, the profiles are provided as ordered n-tuples. Alternatively, a user profile may be provided as an array of pointers;
each pointer is associated with an item rated by the user and points to the rating and information associated with the rating.
A profile for a user can be created and stored in a memory element when that user first begins rating items, although in multi-domain applications user profiles may be created for particular domains only when the user begins to explore, and rate
items within, those domains. Alternatively, a user profile may be created for a user before the user rates any items in a domain. For example, a default user profile may be created for a domain which the user has not yet begun to explore based on the
ratings the user has given to items in a domain that the user has already explored.
Whenever a user profile is created, a number of initial ratings for items may be solicited from the user. This can be done by providing the user with a particular set of items to rate corresponding to a particular group of items. Groups are
genres of items and are discussed below in more detail. Other methods of soliciting ratings from the user may include: manual entry of item-rating pairs, in which the user simply submits a list of items and ratings assigned to those items; soliciting
ratings by date of entry into the system, i.e., asking the user to rate the newest items added to the system; soliciting ratings for the items having the most ratings; or by allowing a user to rate items similar to an initial item selected by the user.
In still other embodiments, the system may acquire a number of ratings by monitoring the user's environment. For example, the system may assume that Web sites for | | |