WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Computer-implemented collaborative filtering based method for recommending an item to a user    
United States Patent6049777   
Link to this pagehttp://www.wikipatents.com/6049777.html
Inventor(s)Sheena; Jonathan Ari (Cambridge, MA), McNulty; John Edward (Burlington, MA), Sullivan; James J. (Arlington, MA), Metral; Max E. (Boston, MA)
AbstractAn object for providing isolated, hierarchical data storage can be used in a method for recommending an item to one of a plurality of users. The data object abstracts an associated physical memory element and provides an interface for storing data and retrieving data from the physical memory element. In some embodiments the data object is provided with an indicator for identifying another data object that is used if a memory request is unable to be serviced by the associated physical memory element. In other embodiments this data object can be used to efficiently and transparently store profile data associated with a system for recommending items to users.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 6049777
Computer-implemented collaborative filtering based method for
     recommending an item to a user - US Patent 6049777 Drawing
Computer-implemented collaborative filtering based method for recommending an item to a user
Inventor     Sheena; Jonathan Ari (Cambridge, MA) , McNulty; John Edward (Burlington, MA) , Sullivan; James J. (Arlington, MA) , Metral; Max E. (Boston, MA)
Owner/Assignee     Microsoft Corporation (Redmond, WA)
Patent assignment
All assignments
Publication Date     April 11, 2000
Application Number     08/818,515
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     March 14, 1997
US Classification     705/10 705/26
Int'l Classification    
Examiner     MacDonald; Allen R.
Assistant Examiner     Kazimi; Hani M.
Attorney/Law Firm     Michaelson; Peter L. Michaelson & Wallace
Address
Parent Case     CROSS-REFERENCE TO RELATED APPLICATIONS This application is a continuation-in-part application of application Ser. No. 08/597,442 filed Feb. 2, 1996, now abandoned which itself claims priority to provisional application Ser. No. 60/000,598, filed Jun. 30, 1995, now expired, and provisional application 60/008,458, filed Dec. 11, 1995, now expired, all of which are incorporated herein by reference.
Priority Data    
USPTO Field of Search     705/10 705/14 705/16 705/26 705/27 705/28 702/179 707/1 707/3 707/103
Patent Tags     computer-implemented collaborative filtering based for recommending item user
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
5774868
Cragun et al.

Jun,1998

[0 after 0 votes]
5704017
Heckerman et al.

Dec,1997

[0 after 0 votes]
5699507
Goodnow, II et al.

Dec,1997

[0 after 0 votes]
5692107
Simoudis et al.

Nov,1997

[0 after 0 votes]
5583763
Atcheson et al.

Dec,1996

[0 after 0 votes]
5544161
Bigham et al.

Aug,1996

[0 after 0 votes]
5446159
Stucky et al.

Aug,1995

[0 after 0 votes]
5446891
Kaplan et al.

Aug,1995

[0 after 0 votes]
5124911
Sack

Jun,1992

[0 after 0 votes]
5041972
Frost

Aug,1991

[0 after 0 votes]
5034981
Leonard et al.

Jul,1991

[0 after 0 votes]
4996642
Hey

Feb,1991

[0 after 0 votes]
4914684
Leonard et al.

Apr,1990

[0 after 0 votes]
4908758
Sanders

Mar,1990

[0 after 0 votes]
4870579
Hey

Sep,1989

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


What is claimed is:

1. In a computer system having a processor and a memory, the memory connected to the processor and storing computer executable instructions, a method for recommending an item to one of a plurality of users, wherein the item is not yet rated by said one user, the method comprising the steps, implemented through said instructions, of:

(a) storing a user profile, in the memory, for each of a plurality of users, wherein the user profile comprises a separate rating value, supplied by a particular one of the users, for each corresponding one of a plurality of items, said items including the item non-rated by the user;

(b) storing an item profile, in the memory, for each of the rated items, wherein the item profile comprises a separate rating value, for a particular one of the items, provided by each one of the plurality of the users, wherein the user profile and the item profile are distinct from each other;

(c) calculating, for each one of the plurality of users and in response to the user and item profiles, a plurality of similarity factors, between said each one user and at least one other one of the users, for each of said items including said non-rated item;

(d) selecting, in response to the plurality of similarity factors and for each one of the plurality of users, a plurality of neighboring ones of the users, such that each of the neighboring ones of the users has an associated similarity factor which is greater than a first predefined threshold value or, if a confidence factor is associated with the associated similarity factor, both the associated similarity factor is less than the first predefined threshold and the confidence factor exceeds a second predefined threshold value

(e) assigning a corresponding weight to each of the neighboring users so as to define a plurality of weights; and

(f) recommending at least one of a plurality of the items to said one user in response to the plurality of weights and ratings given to the non-rated item by the neighboring ones of the users.

2. The method of claim 1 wherein the user profile comprises either a set of user n-tuples or a first set of pointers to storage locations in the memory at which user entries are stored, wherein each of the user n-tuples comprises a separate rating value, supplied by a particular one of the users, for each corresponding one of a plurality of items, said items including the item non-rated by the user and each of the user entries stores a separate rating for an associated one of the users for a corresponding one of the items.

3. The method of claim 2 wherein the item profile comprises either a set of item n-tuples or a second set of pointers to storage locations in the memory at which item entries are stored, wherein each of the item n-tuples comprises a separate rating value, for a particular one of the items, provided by each one of the plurality of the users and each of the item entries stores a rating for an associated one of the items by a corresponding one of the users.

4. The method of claim 3 wherein the calculating step further comprises the steps of:

receiving a rating from a given one of the plurality of users for a given one of the plurality of items so as to define a received rating;

updating the user profile associated with the given one user by either writing the received rating into an appropriate one of the user n-tuples or writing the received rating into an appropriate one of the user entries;

updating the item profile associated with the given one item by writing the received rating into an appropriate one of the item n-tuples or writing the received rating into an appropriate one of the item entries; and

calculating, for the given one user, a plurality of said similarity factors, wherein each of the similarity factors for said given one user represents similarity between said given one user and another one of the users with respect to the items.

5. The method of claim 1 wherein the item profile comprises either a set of item n-tuples or a second set of pointers to storage locations in the memory at which item entries are stored, wherein each of the item n-tuples comprises a separate rating value, for a particular one of the items, provided by each one of the plurality of the users and each of the item entries stores a rating for an associated one of the items by a corresponding one of the users.

6. The method of claim 5 wherein the user profile comprises either a set of user n-tuples or a first set of pointers to storage locations in the memory at which user entries are stored, wherein each of the user n-tuples comprises a separate rating value, supplied by a particular one of the users, for each corresponding one of a plurality of items, said items including the item non-rated by the user and each of the user entries stores a separate rating for an associated one of the users for a corresponding one of the items.

7. The method of claim 6 wherein the calculating step further comprises the steps of:

receiving a rating from a given one of the plurality of users for a given one of the plurality of items so as to define a received rating;

updating the user profile associated with the given one user by either writing the received rating into an appropriate one of the user n-tuples or writing the received rating into an appropriate one of the user entries;

updating the item profile associated with the given one item by writing the received rating into an appropriate one of the item n-tuples or writing the received rating into an appropriate one of the item entries; and

calculating, for the given one user, a plurality of said similarity factors, wherein each of the similarity factors for said given one user represents similarity between said given one user and another one of the users with respect to the items.

8. A computer readable medium having computer executable instructions stored therein, which, when executed by a computer, perform a method for recommending an item to one of a plurality of users, wherein the item is not yet rated by said one user, the method comprising the steps, implemented through said instructions, of:

(a) storing a user profile, in the memory, for each of a plurality of users, wherein the user profile comprises a separate rating value, supplied by a particular one of the users, for each corresponding one of a plurality of items, said items including the item non-rated by the user;

(b) storing an item profile, in the memory, for each of the rated items, wherein the item profile comprises a separate rating value, for a particular one of the items, provided by each one of the plurality of the users, wherein the user profile and the item profile are distinct from each other;

(c) calculating, for each one of the plurality of users and in response to the user and item profiles, a plurality of similarity factors, between said each one user and at least one other one of the users, for each of said items, including said non-rated item;

(d) selecting, in response to the plurality of similarity factors and for each one of the plurality of users, a plurality of neighboring ones of the users, such that each of the neighboring ones of the users has an associated similarity factor which is greater than a first predefined threshold value or, if a confidence factor is associated with the associated similarity factor, both the associated similarity factor is less than the first predefined threshold and the confidence factor exceeds a second predefined threshold value;

(e) assigning a corresponding weight to each of the neighboring users so as to define a plurality of weights; and

(f) recommending at least one of a plurality of the items to said one user in response to the plurality of weights and ratings given to the non-rated item by the neighboring ones of the users.
 Description Submit all comments and votes
 


FIELD OF THE INVENTION

The present invention relates to an improved method and apparatus for recommending items and, in particular, to an improved method and apparatus for recommending items using automated collaborative filtering and feature-guided automated collaborative filtering.

BACKGROUND OF THE INVENTION

The amount of information, as well as the number of goods and services, available to individuals is increasing exponentially. This increase in items and information is occurring across all domains, e.g. sound recordings, restaurants, movies, World Wide Web pages, clothing stores, etc. An individual attempting to find useful information, or to decide between competing goods and services, is often faced with a bewildering selection of sources and choices.

Individual sampling of all items, even in a particular domain, may be impossible. For example, sampling every restaurant of a particular type in New York City would tax even the most avid diner. Such a sampling would most likely be prohibitively expensive to carry out, and the diner would have to suffer through many unenjoyable restaurants.

In many domains, individuals have simply learned to manage information overload by relying on a form of generic referral system. For example, in the domain of movie and sound recordings, many individuals rely on reviews written by paid reviewers. These reviews, however, are simply the viewpoint of one or two individuals and may not have a likelihood of correlating with how the individual will actually perceive the movie or sound recording. Many individuals may rely on a review only to be disappointed when they actually sample the item.

One method of attempting to provide an efficient filtering mechanism is to use content-based filtering. The content-based filter selects items from a domain for the user to sample based upon correlations between the content of the item and the user's preferences. Content-based filtering schemes suffer from the drawback that the items to be selected must be in some machine-readable form, or attributes describing the content of the item must be entered by hand. This makes content-based filtering problematic for existing items such as sound recordings, photographs, art, video, and any other physical item that is not inherently machine-readable. While item attributes can be assigned by hand in order to allow a content-based search, for many domains of items such assignment is not practical. For example, it could take decades to enter even the most rudimentary attributes for all available network television video clips by hand.

Perhaps more importantly, even the best content-based filtering schemes cannot provide an analysis of the quality of a particular item as it would be perceived by a particular user, since quality is inherently subjective. So, while a content-based filtering scheme may select a number of items based on the content of those items, a content-based filtering scheme generally cannot further refine the list of selected items to recommend items that the individual will enjoy.

SUMMARY OF THE INVENTION

In one aspect the present invention relates to a method for recommending an item to one of a plurality of users. The method begins by storing a user profile in a memory by writing user profile data to a memory management data object. Item profile data is also written to a memory management data object. Similarity factors are calculated for each of the users and the similarity factors are used to select a neighboring user set for each user of the system. A weight is assigned to each of the neighboring users and the assigned weights, together with the ratings given to items by the user's neighboring users, are utilized to recommend one of the items to the user.

In another aspect the present invention relates to a memory management data object which is associated with a physical memory element. The memory management data object includes a retrieval method for accessing data stored in the associated physical memory element and a storage method for writing data to the associated physical memory element. The object also includes an indicator for identifying another data object. The identified data object is accessed if the memory request cannot be fulfilled by the associated physical memory element. In some embodiments the data object is provided with look-ahead storage and retrieval capabilities.

In yet another aspect the present invention relates to an article of manufacture which has computer-readable program means for recommending an item to one of a plurality of users embodied thereon. The article includes computer-readable program means for storing user profiles and item profiles in memory by writing them to a memory management data object. The article also includes computer-readable program means for calculating similarity factors between each user of the system and another user of the system. The article further includes computer-readable program means for selecting a plurality of neighboring users for each user of the system, and computer-readable program means for assigning a weight to each of those neighboring users. The article also includes computer-readable program means for recommending at least one item to one of the users based on the weights assigned to that user's neighboring users and the ratings those users have given to items in the system.

BRIEF DESCRIPTION OF THE DRAWINGS

This invention is pointed out with particularity in the appended claims. The above and further advantages of this invention may be better understood by referring to the following description taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a flowchart of one embodiment of the method;

FIG. 2 is a diagrammatic view of a user profile-item profile matrix;

FIG. 3 is a flowchart of another embodiment of the method;

FIG. 4 is a block diagram of an embodiment of the apparatus; and

FIG. 5 is a block diagram of an Internet system on which the method and apparatus may be used.

DETAILED DESCRIPTION OF THE INVENTION

As referred to in this description, items to be recommended can be items of any type that a user may sample in a domain. When reference is made to a "domain," it is intended to refer to any category or subcategory of ratable items, such as sound recordings, movies, restaurants, vacation destinations, novels, or World Wide Web pages. Referring now to FIG. 1, a method for recommending items begins by storing user and item information in profiles.

A plurality of user profiles is stored in a memory element (step 102). One profile may be created for each user or multiple profiles may be created for a user to represent that user over multiple domains. Alternatively, a user may be represented in one domain by multiple profiles where each profile represents the proclivities of a user in a given set of circumstances. For example, a user that avoids seafood restaurants on Fridays, but not on other days of the week, could have one profile representing the user's restaurant preferences from Saturday through Thursday, and a second profile representing the user's restaurant preferences on Fridays. In some embodiments, a user profile represents more than one user. For example, a profile may be created which represents a woman and her husband for the purpose of selecting movies. Using this profile allows a movie recommendation to be given which takes into account the movie tastes of both individuals. For convenience, the remainder of this specification will use the term "user" to refer to single users of the system, as well as "composite users." The memory element can be any memory element known in the art that is capable of storing user profile data and allowing the user profiles to be updated, such as disc drive or random access memory.

Each user profile associates items with the ratings given to those items by the user. Each user profile may also store information in addition to the user's rating. In one embodiment, the user profile stores information about the user, e.g. name, address, or age. In another embodiment, the user profile stores information about the rating, such as the time and date the user entered the rating for the item. User profiles can be any data construct that facilitates these associations, such as an array, although it is preferred to provide user profiles as sparse vectors of n-tuples. Each n-tuple contains at least an identifier representing the rated item and an identifier representing the rating that the user gave to the item, and may include any number of additional pieces of information regarding the item, the rating, or both. Some of the additional pieces of information stored in a user profile may be calculated based on other information in the profile, for example, an average rating for a particular selection of items (e.g., heavy metal albums) may be calculated and stored in the user's profile. In some embodiments, the profiles are provided as ordered n-tuples. Alternatively, a user profile may be provided as an array of pointers; each pointer is associated with an item rated by the user and points to the rating and information associated with the rating.

A profile for a user can be created and stored in a memory element when that user first begins rating items, although in multi-domain applications user profiles may be created for particular domains only when the user begins to explore, and rate items within, those domains. Alternatively, a user profile may be created for a user before the user rates any items in a domain. For example, a default user profile may be created for a domain which the user has not yet begun to explore based on the ratings the user has given to items in a domain that the user has already explored.

Whenever a user profile is created, a number of initial ratings for items may be solicited from the user. This can be done by providing the user with a particular set of items to rate corresponding to a particular group of items. Groups are genres of items and are discussed below in more detail. Other methods of soliciting ratings from the user may include: manual entry of item-rating pairs, in which the user simply submits a list of items and ratings assigned to those items; soliciting ratings by date of entry into the system, i.e., asking the user to rate the newest items added to the system; soliciting ratings for the items having the most ratings; or by allowing a user to rate items similar to an initial item selected by the user. In still other embodiments, the system may acquire a number of ratings by monitoring the user's environment. For example, the system may assume that Web sites for which the user has created "bookmarks" are liked by that user and may use those sites as initial entries in the user's profile. One embodiment uses all of the methods described above and allows the user to select the particular method they wish to employ.

Ratings for items which are received from users can be of any form that allows users to record subjective impressions of items based on their experience of the item. For example, items may be rated on an alphabetic scale ("A" to "F") or a numerical scale (1 to 10). In one embodiment, ratings are integers between 1 (lowest) and 7 (highest). Ratings can be received as input to a stand-alone machine, for example, a user may type rating information on a keyboard or a user may enter such information via a touch screen. Ratings may also be received as input to a system via electronic mail, by telephone, or as input to a system via a local area or wide area network. In one embodiment, ratings are received as input to a World Wide Web page. In this embodiment, the user positions a cursor on a World Wide Web page with an input device such as a mouse or trackball. Once the cursor is properly positioned, the user indicates a rating by using a button on the input device to select a rating to enter. Ratings can be received from users singularly or in batches, and may be received from any number of users simultaneously.

Ratings can be inferred by the system from the user's usage pattern. For example, the system may monitor how long the user views a particular Web page and store in that user's profile an indication that the user likes the page, assuming that the longer the user views the page, the more the user likes the page. Alternatively, a system may monitor the user's actions to determine a rating of a particular item for the user. For example, the system may infer that a user likes an item which the user mails to many people and enter in the user's profile an indication that the user likes that item. More than one aspect of user behavior may be monitored in order to infer ratings for that user, and in some embodiments, the system may have a higher confidence factor for a rating which it inferred by monitoring multiple aspects of user behavior. Confidence factors are discussed in more detail below.

Profiles for each item that has been rated by at least one user may also be stored in memory. Each item profile records how particular users have rated this particular item. Any data construct that associates ratings given to the item with the user assigning the rating can be used. It is preferred is to provide item profiles as a sparse vector of n-tuples. Each n-tuple contains at least an identifier representing a particular user and an identifier representing the rating that user gave to the item, and it may contain other information, as described above in connection with user profiles. As with user profiles, item profiles may also be stored as an array of pointers. Item profiles may be created when the first rating is given to an item or when the item is first entered into the system. Alternatively, item profiles may be generated from the user profiles stored in memory, by determining, for each user, if that user has rated the item and, if so, storing the rating and user information in the item's profile. Item profiles may be stored before user profiles are stored, after user profiles are stored, or at the same time as user profiles. For example, referring to FIG. 2, item profile data and user profile data may be stored as a matrix of values which provides user profile data when read "across," i.e. when rows of the matrix are accessed, and provides item profile data when read "down," i.e. when columns of the matrix are accessed. A data construct of this sort could be provided by storing a set of user n-tuples and a set of item n-tuples. In order to read a row of the matrix a specific user n-tuple is accessed and in order to read a column of the matrix a specific item n-tuple is selected.

The additional information associated with each item-rating pair can be used by the system for a variety of purposes, such as assessing the validity of the rating data. For example, if the system records the time and date the rating was entered, or inferred from the user's environment, it can determine the age of a rating for an item. A rating which is very old may indicate that the rating is less valid than a rating entered recently, for example, users' tastes may change or "drift" over time. One of the fields of the n-tuple may represent whether the rating was entered by the user or inferred by the system. Ratings that are inferred by the system may be assumed to be less valid than ratings that are actually entered by the user. Other items of information may be stored, and any combination or subset of additional information may be used to assess rating validity. In some embodiments, this validity metric may be represented as a confidence factor, that is, the combined effect of the selected pieces of information recorded in the n-tuple may be quantified as a number. In some embodiments, that number may be expressed as a percentage representing the probability that the associated rating is incorrect or as an expected deviation of the predicted rating from the "correct" value.

Since the system may be hosted by any one of a number of different types of machines, or by a machine that is reconfigured frequently, it is desirable to provide data storage for profiles in a hierarchical, isolated manner. The term "isolated," for the purposes of this specification, means that the interface to the physical memory elements storing item and user profiles is abstracted, i.e. the system interacts with the physical memory elements through a defined data object. Although the description of such a data object is c