WO2019184480A1 - Item recommendation - Google Patents

Publication number
WO2019184480A1
WO2019184480A1 PCT/CN2018/123411 CN2018123411W WO2019184480A1 WO 2019184480 A1 WO2019184480 A1 WO 2019184480A1 CN 2018123411 W CN2018123411 W CN 2018123411W WO 2019184480 A1 WO2019184480 A1 WO 2019184480A1
Authority
WO
WIPO (PCT)
Prior art keywords
item
user
rating
sample pairs
identifiers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2018/123411
Other languages
French (fr)
Chinese (zh)
Inventor
陈超超
周俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Publication of WO2019184480A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Recommending goods or services
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions

Definitions

  • The embodiments of the present specification relate to the field of data processing, and more particularly to a method and apparatus for predicting a user's rating of an item, and to an item recommendation method and apparatus.
  • The recommendation function is a frequently used feature.
  • Items are generally recommended based on users' existing ratings of items.
  • Besides the rating information, a wide variety of other information is available.
  • In movie recommendation, for example, in addition to the users' ratings of movies, there are many potential context features, such as the time of the rating (whether it falls on a holiday, in the morning, at noon, in the evening, etc.), the age of the user (youth, middle-aged, or old), the type of film (such as love, action, or horror), and so on. Therefore, there is a need for a more effective recommendation scheme that, in addition to using explicit rating information, also uses these context features.
  • The embodiments of the present specification aim to provide a more effective item recommendation scheme that addresses the deficiencies of the prior art.
  • An aspect of the present specification provides a method for predicting a user's rating of an item, comprising: acquiring a plurality of sample pairs, each sample pair comprising any one user identifier selected from a plurality of user identifiers and any one item identifier selected from a plurality of item identifiers; acquiring a plurality of existing scores, the plurality of existing scores corresponding to some of the plurality of sample pairs; acquiring a plurality of sets of context features respectively corresponding to the respective sample pairs,
  • wherein each set of context features includes at least one of the following: a user feature, an item feature, and an interaction feature; and clustering the plurality of sample pairs into a plurality of subclasses based on the plurality of sets of context features, wherein each subclass includes a plurality of first sample pairs from the plurality of sample pairs, each of the first sample pairs including a first user identifier and a first item identifier, wherein the first user identifier is an identifier of a first user, the first item
  • The set of context features includes at least one of the following types of features: a user feature, an item feature, and an interaction feature.
  • The user feature includes a user attribute feature and/or a user rating statistical feature, and the item feature includes an item attribute feature and/or an item rating statistical feature.
  • The clustering algorithm is a k-means algorithm or a GMM algorithm.
  • Clustering the plurality of sample pairs into a plurality of sub-categories based on the plurality of sets of context features comprises: randomly selecting a predetermined number of initial centroids among the plurality of sample pairs; calculating, based on the context features, the distance from each non-centroid sample pair to each centroid; assigning each non-centroid sample pair to its closest centroid according to the distance; calculating the same number of new centroids from the predetermined number of centroids and their corresponding non-centroid sample pairs; determining whether the new centroids meet a predetermined condition; and, if the predetermined condition is met, outputting the clustering result for the plurality of sample pairs.
  • The collaborative filtering algorithm is a matrix decomposition algorithm or a KNN algorithm.
  • Predicting, by a collaborative filtering algorithm, the score of each first user for a first item that the user has not scored includes: for each sub-category, obtaining a user-item scoring matrix based on the plurality of first user identifiers, the plurality of first item identifiers, and the plurality of existing scores of the plurality of first users for the plurality of first items; decomposing the user-item scoring matrix into two low-dimensional matrices such that the product of the two low-dimensional matrices is closest to the user-item scoring matrix; and predicting, based on the matrix obtained by multiplying the two low-dimensional matrices, the score of each first user in the user-item scoring matrix for each first item that the user has not scored.
  • The existing score is a score given directly by the user or derived from a user operation.
  • Another aspect of the present specification provides an item recommendation method, including: acquiring a plurality of second sample pairs, each second sample pair including a second user identifier and a second item identifier, wherein the second user identifier is the user identifier of the user to receive the recommendation, and the second item identifier is any one of a plurality of item identifiers corresponding to a plurality of items to be recommended; determining, among the plurality of sub-categories obtained by the above method of predicting scores, the sub-category in which each second sample pair is located; obtaining, from the scores predicted by the above method of predicting scores, the predicted score corresponding to each second sample pair in its sub-category; sorting the second item identifiers included in the second sample pairs according to the predicted scores; and recommending second items to the second user based on the ranking.
  • Another aspect of the present specification provides an apparatus for predicting a user's rating of an item, comprising: a sample pair obtaining unit configured to acquire a plurality of sample pairs, each sample pair including any one user identifier selected from a plurality of user identifiers and any one item identifier selected from a plurality of item identifiers; a score obtaining unit configured to acquire a plurality of existing scores, the plurality of existing scores corresponding to some of the plurality of sample pairs;
  • a feature acquiring unit configured to acquire a plurality of sets of context features respectively corresponding to the respective sample pairs, wherein each set of context features includes at least one of the following types of features: a user feature, an item feature, and an interaction feature;
  • a clustering unit configured to cluster the plurality of sample pairs into a plurality of sub-categories based on the plurality of sets of context features, wherein each sub-category comprises a plurality of first sample pairs taken from the plurality of sample pairs, each of
  • The clustering unit includes: a selecting unit configured to randomly select a predetermined number of initial centroids among the plurality of sample pairs; a first calculating unit configured to calculate, according to the context features, the distance from each non-centroid sample pair to each centroid; a categorizing unit configured to assign each non-centroid sample pair to its closest centroid according to the distance; a second calculating unit configured to calculate the same number of new centroids from the predetermined number of centroids and their corresponding non-centroid sample pairs; a determining unit configured to determine whether the new centroids satisfy a predetermined condition; and an output unit configured to output the clustering result for the plurality of sample pairs if the predetermined condition is satisfied.
  • The rating prediction unit includes: an obtaining unit configured to acquire, for each sub-category, a user-item scoring matrix based on the plurality of first user identifiers, the plurality of first item identifiers, and the plurality of existing scores of the plurality of first users for the plurality of first items; a decomposing unit configured to decompose the user-item scoring matrix into two low-dimensional matrices such that the product of the two low-dimensional matrices is closest to the user-item scoring matrix; and a prediction unit configured to predict, based on the matrix obtained by multiplying the two low-dimensional matrices, the score of each first user in the user-item scoring matrix for each first item that the user has not scored.
  • The present specification provides an item recommendation apparatus, including: a sample pair acquisition unit configured to acquire a plurality of second sample pairs, each second sample pair including a second user identifier and a second item identifier, wherein the second user identifier is the user identifier of the user to receive the recommendation, and the second item identifier is any one of a plurality of item identifiers corresponding to a plurality of items to be recommended; a determining unit configured to determine, among the plurality of sub-categories obtained by the method of predicting scores, the sub-category in which each of the second sample pairs is located; a prediction score obtaining unit configured to acquire, from the scores predicted by the method of predicting scores, the predicted score corresponding to each of the second sample pairs in its sub-category; a ranking unit configured to sort the second item identifiers included in the second sample pairs according to the predicted scores; and a recommendation unit configured to recommend second items to the second user based on the ranking.
  • The user-item pairs are clustered using their context features, so that the scores within each sub-category are less noisy and more strongly correlated; applying collaborative filtering within each sub-category can therefore give better recommendation performance.
  • FIG. 1 shows a schematic diagram of a system 100 in accordance with an embodiment of the present specification
  • FIG. 2 is a flow chart showing a method of predicting a user's rating of an item in accordance with an embodiment of the present specification
  • Figure 3 schematically illustrates a plurality of sets of context features corresponding to user-items
  • FIG. 4 illustrates a flow chart for clustering by the K-means algorithm in accordance with an embodiment of the present specification
  • FIG. 5 illustrates a flow chart of a method for predicting a score by a collaborative filtering algorithm in accordance with an embodiment of the present specification
  • Figure 6 shows schematically the process of matrix decomposition
  • FIG. 7 is a flow chart showing an item recommendation method according to an embodiment of the present specification.
  • Figure 8 illustrates an apparatus 800 for predicting a user's rating of an item in accordance with an embodiment of the present specification
  • FIG. 9 illustrates an item recommendation device 900 in accordance with an embodiment of the present specification.
  • FIG. 1 shows a schematic diagram of a system 100 in accordance with an embodiment of the present specification.
  • The system 100 includes a clustering module 11, a predictive scoring module 12, and a recommendation module 13.
  • A plurality of user-item pairs and their corresponding sets of context features are input to the clustering module 11.
  • The clustering module 11 clusters the feature vectors formed from the sets of context features, and thereby clusters the user-item pairs; that is, each user-item pair is assigned to a corresponding sub-category.
  • The clustering module 11 sends the plurality of sub-categories obtained by the clustering to the predictive scoring module 12.
  • The existing ratings of the items by the users included in each sub-category are sent to the predictive scoring module 12.
  • The predictive scoring module 12 uses the existing ratings in each sub-category to predict, through a collaborative filtering algorithm, the missing ratings in that sub-category, i.e. the ratings of users for items they have not rated.
  • The recommendation module 13 uses the user identifier and the identifiers of the items to be recommended to determine the sub-category of each pair formed by the user and an item to be recommended, acquires the predicted score of each such pair from the predictive scoring module 12, and recommends items to the user according to the ranking of the predicted scores of the plurality of items to be recommended.
  • In step S21, a plurality of sample pairs are acquired, each sample pair including any one user identifier selected from a plurality of user identifiers and any one item identifier selected from a plurality of item identifiers; in step S22, a plurality of existing scores are obtained, the plurality of existing scores corresponding to some of the plurality of sample pairs; in step S23, a plurality of sets of context features respectively corresponding to the respective sample pairs are acquired, wherein each set of context features comprises at least one of the following types of features: a user feature, an item feature, and an interaction feature; and in step S24, the plurality of sample pairs are clustered into a plurality of subclasses based on the plurality of sets of context features, wherein each subclass includes a plurality of first sample pairs taken from the plurality of sample pairs, each of the first sample pairs including the first user identifier and the
  • A plurality of sample pairs are acquired, each sample pair including any one user identifier selected from the plurality of user identifiers and any one item identifier selected from the plurality of item identifiers.
  • A sample pair is a user-item pair, which can be represented as (user identifier, item identifier).
  • The users may be all users in the recommendation system, for example, all users of the Douban movie app, all users of Taobao, and the like.
  • The plurality of users need not be all users in the recommendation system; they may, for example, be the subset of users associated with one unit of the recommendation system.
  • The items may be all items included in the recommendation system, for example, the movies in Douban Movie, the commodities in Taobao, or the like.
  • The plurality of items need not be all items in the system; they may also be the items within a certain range of the system.
  • A plurality of user-item pairs are obtained by combining each of the plurality of users with each of the plurality of items.
  • A plurality of existing scores are obtained, and the plurality of existing scores correspond to some of the plurality of sample pairs.
  • The existing score may be a score given directly by the user; for example, in Douban Movie, users score movies on a scale of 1 to 5.
  • Alternatively, the existing score is obtained indirectly from the user's operations. For example, in Taobao, the user's score for an item can be calculated from operations such as clicking on or purchasing the item.
  • In a recommendation system, usually only some users score some items. For example, in Douban Movie, some users just browse without scoring any movie, and some movies are so obscure that no user has scored them. Therefore, only some of the sample pairs have a corresponding existing score of the user for the item.
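  • As a minimal sketch of how the sample pairs and the partially available scores relate, the snippet below forms every user-item combination and separates the pairs with an existing score from those without. The identifiers and score values are hypothetical and serve only as an illustration.

```python
from itertools import product

# Hypothetical identifiers; the real system would supply its own.
user_ids = ["u1", "u2", "u3"]
item_ids = ["v1", "v2"]

# Every user combined with every item gives the sample pairs.
sample_pairs = list(product(user_ids, item_ids))

# Existing scores cover only some of the pairs (hypothetical values).
existing_scores = {("u1", "v1"): 5, ("u3", "v2"): 3}

# Pairs without an existing score are the ones whose score must be predicted.
unscored_pairs = [pair for pair in sample_pairs if pair not in existing_scores]
```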
  • In step S23, a plurality of sets of context features respectively corresponding to the respective sample pairs are acquired, wherein each set of context features includes at least one of the following types of features: user features, item features, and interaction features.
  • The context features corresponding to user-item pairs can generally be classified into the following types: user static features, such as the user's age group (youth, middle-aged, old age) and gender; item static features, such as the movie category (love, action, horror, etc.); user rating statistical features, such as the mean and variance of the user's ratings; item rating statistical features, such as the mean and variance of a movie's ratings; and interaction features, such as whether the rating time falls on a holiday, in the morning, at noon, or in the evening.
  • The context features can be obtained from user profiles, item attributes, and user-item interaction information.
  • Figure 3 schematically illustrates a plurality of sets of context features corresponding to user-item pairs.
  • u1, u2, u3 and u4 are user identifiers
  • v1, v2, v3 and v4 are item identifiers
  • The numbers 3, 4, 5, etc. in the figure are the corresponding users' ratings of the items.
  • Behind each user-item cell is a column of squares that schematically represents the set of context features corresponding to that user-item pair.
  • The set of context features includes at least one feature associated with the user, the item, or their interaction in the user-item pair.
  • the plurality of sample pairs are clustered into a plurality of sub-categories based on the plurality of sets of context features, wherein each sub-class includes a plurality of first sample pairs taken from the plurality of sample pairs, Each of the first sample pairs includes a first user identification and a first item identification, wherein the first user identification is an identification of the first user, and the first item identification is an identification of the first item.
  • the set of context features may be represented in the form of a feature vector whose dimensions are the number of features included in a set of context features, and each component of the feature vector represents a feature value in the corresponding feature dimension.
  • For example, a set of context features may include: age, middle-aged; movie type, love.
  • The values in the age dimension may be quantified as, for example, 1 (youth), 2 (middle-aged), 3 (old age), and the values in the movie type dimension as 1 (love), 2 (action), 3 (horror), thereby obtaining the feature vector corresponding to this set of context features: (2, 1), where the first component represents the age feature dimension and the second component represents the movie type feature dimension.
  • The feature vector corresponding to a set of context features can be located as a point in the feature space spanned by the respective feature dimensions.
  • The feature vectors of different user-item pairs may be equal, i.e. they coincide at one point in the feature space, so that the point corresponds to a plurality of user-item pairs.
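  • The encoding described above can be sketched as follows. The integer codes and the context data are illustrative assumptions (the age codes are chosen so that a middle-aged user and a love movie give the vector (2, 1)); a real system would define its own feature dimensions and values.

```python
from collections import defaultdict

# Hypothetical integer codes for two categorical feature dimensions.
AGE_CODES = {"youth": 1, "middle-aged": 2, "old age": 3}
MOVIE_TYPE_CODES = {"love": 1, "action": 2, "horror": 3}

def encode(age_group, movie_type):
    """Map one set of context features to a feature vector (age, movie type)."""
    return (AGE_CODES[age_group], MOVIE_TYPE_CODES[movie_type])

# Hypothetical context features for three user-item pairs.
contexts = {
    ("u1", "v1"): ("middle-aged", "love"),
    ("u2", "v1"): ("middle-aged", "love"),
    ("u3", "v2"): ("youth", "horror"),
}

# Pairs with equal vectors coincide at one point of the feature space.
pairs_at_point = defaultdict(list)
for pair, (age_group, movie_type) in contexts.items():
    pairs_at_point[encode(age_group, movie_type)].append(pair)
```

  • Here the point (2, 1) corresponds to two different user-item pairs, which is the situation where several pairs share one point.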
  • The vector points can be clustered by various clustering algorithms, such as the K-means algorithm, the GMM (Gaussian mixture model) algorithm, the BIRCH algorithm, the OPTICS algorithm and so on.
  • In step S41, a predetermined number of initial centroids are randomly selected among the plurality of feature vector points.
  • The predetermined number is the k that must be fixed in advance in the K-means algorithm.
  • k can be determined from the estimated number of sub-categories.
  • For example, the estimated sub-categories may include: (youth, love), (youth, action), (youth, horror), (middle-aged, love), (middle-aged, action), (middle-aged, horror), (old age, love), (old age, action), (old age, horror); therefore, k can be set to 9. That is, the value of k is related to the number of features and their combinations. Once k is determined, it is preferable to select k well-dispersed points as the initial centroids.
  • Each non-centroid point is assigned to its closest centroid according to the distance, thereby obtaining k clusters.
  • In step S44, the same number of new centroids are calculated from the predetermined number of centroid points and their corresponding non-centroid points, so that the sum of the distances from all points to the centers of the clusters to which they belong is minimized; that is, as shown in formula (1), the new centroid is the mean vector of all vector points in the cluster.
  • In step S45, it is judged whether the new centroids satisfy a predetermined condition; for example, the predetermined condition is that the new centroids have not changed relative to the previous centroids.
  • In step S46, a clustering result is output. The clustering result includes a plurality of clusters and the points included in each cluster, the points corresponding to feature vectors and hence to user-item pairs. In this way the plurality of user-item pairs are clustered into a plurality of sub-categories based on their context features.
  • Each of the sub-categories includes a plurality of first user-item pairs taken from the plurality of user-item pairs, each of the first user-item pairs including a first user identifier and a first item identifier, wherein
  • the first user identifier is an identifier of the first user, and
  • the first item identifier is an identifier of the first item.
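  • The K-means procedure described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: Euclidean distance is assumed as the distance measure, the function name `kmeans` and its parameters are hypothetical, and for simplicity every point (including the ones chosen as initial centroids) is assigned to a cluster.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster feature vectors into k clusters; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Randomly select k of the points as the initial centroids.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assign each point to its closest centroid (Euclidean distance assumed).
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Recompute each centroid as the mean vector of its cluster;
        # an empty cluster keeps its previous centroid.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])
        # Predetermined condition: the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels
```

  • Each returned label is the index of the sub-category to which the corresponding feature vector, and hence the corresponding user-item pair, belongs. Because the initial centroids are random, the result can depend on the seed, which is one reason well-dispersed initial centroids are preferred.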
  • In step S25, for each sub-category, based on the plurality of first user identifiers and the plurality of first item identifiers it includes, and on the plurality of existing scores of the plurality of first users for the plurality of first items, a collaborative filtering algorithm predicts the score of each first user for each first item that the user has not scored.
  • The collaborative filtering algorithm here may be any of various algorithms, such as a KNN algorithm or a matrix decomposition algorithm.
  • the process of predicting a score according to an embodiment of the present specification will be described below by taking a matrix decomposition algorithm as an example.
  • FIG. 5 illustrates a flow chart of a method for predicting scores by a collaborative filtering algorithm in accordance with an embodiment of the present specification.
  • In step S51, for each sub-category, a user-item scoring matrix is obtained based on the plurality of first user identifiers, the plurality of first item identifiers, and the plurality of existing scores of the plurality of first users for the plurality of first items.
  • Figure 6 shows schematically the process of matrix decomposition.
  • The matrix on the left in Figure 6 schematically shows a user-item scoring matrix, where u1, u2, u3 and u4 are user identifiers, v1, v2, v3, v4 and v5 are item identifiers, and the number in the square where u_i and v_j intersect is the score of u_i for v_j, with "?" indicating that u_i has not scored v_j.
  • The user-item scoring matrix is decomposed into two low-dimensional matrices such that the product of the two low-dimensional matrices is closest to the user-item scoring matrix.
  • Bringing the product of the two low-dimensional matrices closest to the user-item scoring matrix means minimizing the difference between that product and the user-item scoring matrix. Therefore, the objective function can be set to the following formula (2):
  • U and V can be calculated iteratively, for example by a gradient descent algorithm, to obtain the two low-dimensional matrices U and V that minimize the objective function.
  • The two matrices being multiplied in FIG. 6 are the two low-dimensional matrices U^T and V obtained by, for example, a gradient descent algorithm.
  • The score of each user in the user-item scoring matrix for each unscored item is predicted based on the matrix obtained by multiplying the two low-dimensional matrices. For example, as shown in FIG. 6, multiplying U^T by V gives the prediction matrix shown on the right side of FIG. 6. Comparing the scoring matrix and the prediction matrix in Figure 6, it can be seen that the scores in the gray squares of the prediction matrix are equal to (or as close as possible to) the existing scores in the scoring matrix, and the scores in the white squares of the prediction matrix are the scores predicted by the matrix decomposition algorithm.
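  • A minimal sketch of this matrix decomposition step is given below. It assumes a plain squared-error objective over the existing scores, with no regularization term, and uses stochastic gradient descent; the exact form of formula (2) may differ. The function names, the rank, the learning rate and the scores are all hypothetical.

```python
import random

def factorize(ratings, n_users, n_items, rank=2, lr=0.01, epochs=2000, seed=0):
    """Fit score(i, j) ~ u_i . v_j on the existing scores by gradient descent.

    ratings maps (user_index, item_index) to an existing score.
    Returns (U, V): one latent vector of length `rank` per user and per item.
    """
    rng = random.Random(seed)
    U = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for (i, j), score in ratings.items():
            err = score - predict(U, V, i, j)
            for f in range(rank):
                u, v = U[i][f], V[j][f]
                # Gradient step on the squared error of this one score.
                U[i][f] += lr * err * v
                V[j][f] += lr * err * u
    return U, V

def predict(U, V, i, j):
    """Entry (i, j) of the product of the two low-dimensional matrices."""
    return sum(a * b for a, b in zip(U[i], V[j]))

def squared_error(ratings, U, V):
    """Sum of squared differences over the existing scores only."""
    return sum((s - predict(U, V, i, j)) ** 2 for (i, j), s in ratings.items())

# Hypothetical existing scores for 4 users and 5 items in one sub-category.
ratings = {
    (0, 0): 5, (0, 1): 3, (0, 3): 1,
    (1, 0): 4, (1, 4): 1,
    (2, 1): 1, (2, 2): 5, (2, 4): 4,
    (3, 2): 4, (3, 3): 2,
}
U, V = factorize(ratings, n_users=4, n_items=5)
predicted = predict(U, V, 0, 2)  # user 0 has not scored item 2
```

  • Only the existing scores enter the error, so the unscored entries are free and are filled in by the product of the two low-dimensional matrices.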
  • FIG. 7 shows a flow chart of an item recommendation method in accordance with an embodiment of the present specification.
  • The method includes: in step S71, acquiring a plurality of sample pairs, each sample pair including a user identifier and an item identifier, wherein the user identifier is the user identifier of the user to receive the recommendation, and the item identifier is any one of a plurality of item identifiers corresponding to a plurality of items to be recommended; in step S72, determining, among the plurality of sub-categories obtained by the above-described method of predictive scoring, the sub-category in which each sample pair is located; in step S73, obtaining, from the scores predicted by the above-described method of predictive scoring, the predicted score corresponding to each of the sample pairs in the sub-category to which it belongs; in step S74, sorting the item identifiers included in the respective sample pairs according to the predicted scores; and in step S75, recommending an item to the user
  • A plurality of sample pairs are acquired, each sample pair including a user identifier and an item identifier, wherein the user identifier is the user identifier of the user to receive the recommendation, and the item identifier is any one of a plurality of item identifiers corresponding to a plurality of items to be recommended.
  • The system will start the item recommendation procedure. At this time, the system recalls a candidate set of items to recommend to the user u1, based on the user identifier u1 and the item identifier v1 of the item the user operated on.
  • Recall here means a coarse screening of recommendable items according to predetermined conditions, for example, generating a candidate set according to the user's initial preferences, or generating a candidate set according to the attributes of the item (for example, when the recommended item is a restaurant, the attribute may be its geographical position).
  • The user identifier u1 is combined with the item identifier v_i of each item in the candidate set, so that a plurality of sample pairs are obtained.
  • In step S72, among the plurality of sub-categories obtained by the above-described method of predictive scoring, the sub-category in which each sample pair is located is determined.
  • From the above prediction scoring method, it is clear that one sample pair corresponds to one feature vector, that is, to one point in the vector space. Therefore, a sample pair can be assigned to only one sub-category.
  • The sample pair can be searched for among the plurality of sub-categories obtained above, thereby determining the sub-category in which the sample pair is located.
  • In this way, the sub-category in which each sample pair is located can be obtained.
  • In step S73, from the scores predicted by the above-described method of predictive scoring, the predicted score corresponding to each of the sample pairs in the sub-category to which it belongs is obtained.
  • In the above method, the scores of the individual users in each sub-category for the items they have not scored are predicted by the collaborative filtering algorithm.
  • The predicted score corresponding to a sample pair can therefore be obtained from the predicted scores associated with its sub-category.
  • In step S74, the item identifiers included in the respective sample pairs are sorted according to the predicted scores. The higher the predicted score, the greater the user's estimated preference for the item. Thus, items with high predicted scores can be placed at the front.
  • An item is recommended to the user based on the ranking.
  • Items can be recommended to the user in a variety of ways.
  • For example, only the top-ranked item may be recommended to the user; the top-ranked item may be recommended preferentially; or items may be recommended in ranked order (chronologically or spatially), and so on.
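  • The lookup, sorting and recommendation steps described above can be sketched as follows. The data structures (`pair_to_subcategory`, `predicted_scores`), the function name `recommend` and the `top_n` cut-off are assumptions made for illustration; in practice they would be filled from the clustering and score prediction results.

```python
def recommend(user_id, candidate_items, pair_to_subcategory, predicted_scores, top_n=3):
    """Rank the candidate items for one user by predicted score.

    pair_to_subcategory maps (user_id, item_id) to a sub-category index.
    predicted_scores maps a sub-category index to {(user_id, item_id): score}.
    """
    scored = []
    for item_id in candidate_items:
        pair = (user_id, item_id)
        # Determine the sub-category in which the sample pair is located.
        subcategory = pair_to_subcategory.get(pair)
        if subcategory is None:
            continue  # pair was never clustered; skipped in this sketch
        # Obtain the predicted score of the pair within its sub-category.
        score = predicted_scores[subcategory].get(pair)
        if score is not None:
            scored.append((item_id, score))
    # Sort by predicted score, highest first, and keep the leading items.
    scored.sort(key=lambda entry: entry[1], reverse=True)
    return [item_id for item_id, _ in scored[:top_n]]

# Hypothetical clustering and prediction results for user u1.
pair_to_subcategory = {("u1", "v1"): 0, ("u1", "v2"): 1, ("u1", "v3"): 0}
predicted_scores = {
    0: {("u1", "v1"): 3.2, ("u1", "v3"): 4.6},
    1: {("u1", "v2"): 4.1},
}
ranking = recommend("u1", ["v1", "v2", "v3"], pair_to_subcategory, predicted_scores)
```

  • Setting `top_n=1` corresponds to recommending only the top-ranked item, while a larger value returns the items in ranked order.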
  • FIG. 8 illustrates an apparatus 800 for predicting a user's rating of an item according to an embodiment of the present specification, including: a sample pair obtaining unit 81 configured to acquire a plurality of sample pairs, the sample pair including being selected from a plurality of Any one of the user identifiers and any one of the plurality of item identifiers; the score obtaining unit 82 is configured to acquire a plurality of existing scores, wherein the plurality of existing scores correspond to the plurality of samples a partial sample pair of the pair; the context feature obtaining unit 83 is configured to acquire a plurality of sets of context features respectively corresponding to the respective sample pairs, wherein the set of context features comprises at least one of the following characteristics: user features, item features, and interactions a clustering unit 84 configured to cluster the plurality of sample pairs into a plurality of subclasses based on the plurality of sets of context features, wherein each of the subclasses comprises a plurality of the plurality of sample pairs a first sample pair, each of the first sample pairs
  • In one embodiment, the clustering unit 84 includes: a selecting unit 841 configured to randomly select a predetermined number of initial centroids among the plurality of sample pairs;
  • a first calculating unit 842 configured to calculate, based on the context features, the distance from each non-centroid sample pair to each centroid;
  • a categorizing unit 843 configured to assign, according to the distances, each non-centroid sample pair to its nearest centroid;
  • a second calculating unit 844 configured to calculate the same number of new centroids from the predetermined number of centroids and their corresponding non-centroid sample pairs;
  • a determining unit 845 configured to determine whether the new centroids satisfy a predetermined condition; and an output unit 846 configured to output the clustering result for the plurality of sample pairs when the predetermined condition is satisfied.
  • In one embodiment, the rating prediction unit 85 includes: an obtaining unit 851 configured to, for each subclass, obtain a user-item rating matrix based on the plurality of first user identifiers, the plurality of first item identifiers, and the plurality of existing ratings of the plurality of first users for the plurality of first items; a decomposing unit 852 configured to decompose the user-item rating matrix into two low-dimensional matrices such that the product of the two low-dimensional matrices is as close as possible to the user-item rating matrix; and a predicting unit 853 configured to predict, based on the matrix obtained by multiplying the two low-dimensional matrices, each first user's rating for the first items in the user-item rating matrix that the user has not rated.
  • FIG. 9 illustrates an item recommendation apparatus 900 according to an embodiment of this specification, including: a sample pair obtaining unit 91 configured to acquire a plurality of second sample pairs, each second sample pair including a second user identifier and a second item identifier, wherein the second user identifier is the user identifier of the user to receive recommendations and the second item identifier is any one of a plurality of item identifiers corresponding to a plurality of items to be recommended; a determining unit 92 configured to determine, among the plurality of subclasses obtained by the above rating prediction method, the subclass to which each second sample pair belongs; a predicted rating obtaining unit 93 configured to obtain, from the ratings predicted by the rating prediction method, the predicted rating of each second sample pair in the subclass to which it belongs; a sorting unit 94 configured to sort, according to the predicted ratings, the second item identifiers included in the second sample pairs; and a recommendation unit configured to recommend the second items to the second user according to the ranking.
  • In the item recommendation method according to the embodiments of this specification, user-item pairs are clustered using their context features, so that the ratings within each subclass are less noisy and more strongly correlated. Applying collaborative filtering within each subclass can therefore yield better recommendation performance.
  • The steps of a method or algorithm described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of the two.
  • The software module can be placed in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method and a device for predicting a user's rating of an item, and a method and a device for item recommendation. The method for predicting a rating comprises: acquiring a plurality of sample pairs, a sample pair comprising any user identifier selected from a plurality of user identifiers and any item identifier selected from a plurality of item identifiers (S21); acquiring a plurality of existing ratings, the plurality of existing ratings corresponding to some of the plurality of sample pairs (S22); acquiring multiple sets of contextual features respectively corresponding to the sample pairs (S23); clustering the plurality of sample pairs into a plurality of subclasses on the basis of the multiple sets of contextual features (S24); and, for each subclass, predicting, by means of a collaborative filtering algorithm, the ratings of the first items not rated by the first users, on the basis of the plurality of first user identifiers, the plurality of first item identifiers, and the plurality of existing ratings of the plurality of first items by the plurality of first users (S25).

Description

Item recommendation

Cross-reference to related applications

This patent application claims priority to Chinese Patent Application No. 201810257617.X, filed on March 27, 2018 and entitled "Item Recommendation Method and Apparatus", the entire contents of which are incorporated herein by reference.

Technical field

The embodiments of this specification relate to the field of data processing, and more particularly to a method and apparatus for predicting a user's rating of an item, and to an item recommendation method and apparatus.

Background

Recommendation is a frequently used function on the Internet. Existing recommendation systems generally make recommendations based on users' existing ratings of items. However, a system contains many kinds of information besides ratings. Taking movie recommendation as an example, in addition to users' ratings of movies there are many latent context features, such as the time of the rating (whether it is a holiday; morning, noon, or evening), the age of the user (adolescent, middle-aged, or elderly), and the genre of the movie (such as romance, action, or horror). A more effective recommendation scheme is therefore needed, one that uses these context features in addition to the explicit rating information.

Summary of the invention

The embodiments of this specification aim to provide a more effective item recommendation scheme that addresses the deficiencies of the prior art.

To achieve the above objective, one aspect of this specification provides a method for predicting a user's rating of an item, including: acquiring a plurality of sample pairs, each sample pair including any one user identifier selected from a plurality of user identifiers and any one item identifier selected from a plurality of item identifiers; acquiring a plurality of existing ratings, the plurality of existing ratings corresponding to some of the plurality of sample pairs; acquiring a plurality of sets of context features respectively corresponding to the sample pairs, wherein a set of context features includes at least one of the following types of features: user features, item features, and interaction features; clustering the plurality of sample pairs into a plurality of subclasses based on the plurality of sets of context features, wherein each subclass includes a plurality of first sample pairs taken from the plurality of sample pairs, each first sample pair includes a first user identifier and a first item identifier, the first user identifier is an identifier of a first user, and the first item identifier is an identifier of a first item; and, for each subclass, predicting, by a collaborative filtering algorithm, each first user's rating for the first items that the user has not rated, based on the plurality of first user identifiers, the plurality of first item identifiers, and a plurality of existing ratings of the plurality of first users for the plurality of first items.

In one embodiment, in the method for predicting a user's rating of an item, a set of context features includes at least one of the following types of features: user features, item features, and interaction features.

In one embodiment, in the method for predicting a user's rating of an item, the user features include user attribute features and/or user rating statistical features, and the item features include item attribute features and/or item rating statistical features.

In one embodiment, in the method for predicting a user's rating of an item, the clustering algorithm is the k-means algorithm or the GMM algorithm.

In one embodiment, in the method for predicting a user's rating of an item, clustering the plurality of sample pairs into a plurality of subclasses based on the plurality of sets of context features includes: randomly selecting a predetermined number of initial centroids among the plurality of sample pairs; calculating, based on the context features, the distance from each non-centroid sample pair to each centroid; assigning, according to the distances, each non-centroid sample pair to its nearest centroid; calculating the same number of new centroids from the predetermined number of centroids and their corresponding non-centroid sample pairs; determining whether the new centroids satisfy a predetermined condition; and, when the predetermined condition is satisfied, outputting the clustering result for the plurality of sample pairs.
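
The clustering steps listed above can be sketched in plain Python as follows. This is a minimal illustration and not the patented implementation: the function name `kmeans`, the use of Euclidean distance, and the stopping rule (the centroids no longer change, or `max_iter` is reached) are assumptions, since the text only refers to a "predetermined condition". For simplicity the sketch also assigns the initial centroid points themselves, each of which lies at distance zero from itself.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster feature vectors (tuples of numbers) into k subclasses.

    Returns (labels, centroids), where labels[n] is the subclass index
    assigned to points[n].
    """
    rng = random.Random(seed)
    # Randomly select k distinct feature vectors as the initial centroids.
    centroids = rng.sample(sorted(set(points)), k)
    labels = []
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Recompute each centroid as the mean of the points assigned to it.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(centroids[c])
        # Assumed stopping condition: the centroids have stopped moving.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids
```

Each entry of `points` would be the feature vector of one user-item pair, so the returned labels give the subclass of each pair.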

In one embodiment, in the method for predicting a user's rating of an item, the collaborative filtering algorithm is a matrix factorization algorithm or the kNN algorithm.

In one embodiment, in the method for predicting a user's rating of an item, predicting, by a collaborative filtering algorithm, each first user's rating for the first items that the user has not rated includes: for each subclass, obtaining a user-item rating matrix based on the plurality of first user identifiers, the plurality of first item identifiers, and the plurality of existing ratings of the plurality of first users for the plurality of first items; decomposing the user-item rating matrix into two low-dimensional matrices such that the product of the two low-dimensional matrices is as close as possible to the user-item rating matrix; and predicting, based on the matrix obtained by multiplying the two low-dimensional matrices, each first user's rating for the first items in the user-item rating matrix that the user has not rated.
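
As one way to picture the decomposition, the sketch below factorizes the existing ratings of a single subclass with stochastic gradient descent. The text does not say how the two low-dimensional matrices are found, so the optimizer, the L2 regularization, the hyperparameters (`rank`, `lr`, `reg`, `epochs`), and the names `factorize` and `predict` are all assumptions made for illustration.

```python
import random


def factorize(ratings, n_users, n_items, rank=2, lr=0.01, reg=0.02,
              epochs=5000, seed=0):
    """Approximate the rating matrix R by P * Q^T using only existing ratings.

    ratings: dict {(user_index, item_index): rating}.
    Returns P (n_users x rank) and Q (n_items x rank) as lists of lists.
    """
    rng = random.Random(seed)
    P = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for (u, i), r in ratings.items():
            # Error between the existing rating and the current estimate.
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(rank):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization (an assumed choice).
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating of item i by user u: row u of P times row i of Q."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))
```

Calling `predict` on a (user, item) index pair that has no entry in `ratings` gives the predicted rating for an item the user has not rated.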

In one embodiment, in the method for predicting a user's rating of an item, the existing ratings are ratings given directly by users or ratings derived from user operations.

Another aspect of this specification provides an item recommendation method, including: acquiring a plurality of second sample pairs, each second sample pair including a second user identifier and a second item identifier, wherein the second user identifier is the user identifier of the user to receive recommendations and the second item identifier is any one of a plurality of item identifiers corresponding to a plurality of items to be recommended; determining, among the plurality of subclasses obtained by the above rating prediction method, the subclass to which each second sample pair belongs; obtaining, from the ratings predicted by the above rating prediction method, the predicted rating of each second sample pair in the subclass to which it belongs; sorting, according to the predicted ratings, the second item identifiers included in the second sample pairs; and recommending the second items to the second user according to the ranking.
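
The lookup-and-sort flow of the recommendation method can be sketched as below. The data structures are hypothetical and chosen only for illustration: `subclass_of` maps a (user, item) pair to its subclass, and `predicted` maps a subclass to the predicted ratings of its pairs. The `top_n` cut-off is likewise an assumption, since the text leaves the exact recommendation policy open.

```python
def recommend(user_id, candidate_items, subclass_of, predicted, top_n=3):
    """Rank candidate items for one user by predicted rating.

    subclass_of: dict {(user_id, item_id): subclass_id}
    predicted:   dict {subclass_id: {(user_id, item_id): predicted_rating}}
    """
    scored = []
    for item_id in candidate_items:
        pair = (user_id, item_id)
        # Find the subclass of the pair, then its predicted rating there.
        subclass = subclass_of[pair]
        scored.append((predicted[subclass][pair], item_id))
    # Higher predicted rating first.
    scored.sort(key=lambda entry: entry[0], reverse=True)
    return [item_id for _, item_id in scored[:top_n]]
```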

Another aspect of this specification provides an apparatus for predicting a user's rating of an item, including: a sample pair obtaining unit configured to acquire a plurality of sample pairs, each sample pair including any one user identifier selected from a plurality of user identifiers and any one item identifier selected from a plurality of item identifiers; a rating obtaining unit configured to acquire a plurality of existing ratings, the plurality of existing ratings corresponding to some of the plurality of sample pairs; a context feature obtaining unit configured to acquire a plurality of sets of context features respectively corresponding to the sample pairs, wherein a set of context features includes at least one of the following types of features: user features, item features, and interaction features; a clustering unit configured to cluster the plurality of sample pairs into a plurality of subclasses based on the plurality of sets of context features, wherein each subclass includes a plurality of first sample pairs taken from the plurality of sample pairs, each first sample pair includes a first user identifier and a first item identifier, the first user identifier is an identifier of a first user, and the first item identifier is an identifier of a first item; and a rating prediction unit configured to, for each subclass, predict, by a collaborative filtering algorithm, each first user's rating for the first items that the user has not rated, based on the plurality of first user identifiers, the plurality of first item identifiers, and a plurality of existing ratings of the plurality of first users for the plurality of first items.

In one embodiment, in the apparatus for predicting a user's rating of an item, the clustering unit includes: a selecting unit configured to randomly select a predetermined number of initial centroids among the plurality of sample pairs; a first calculating unit configured to calculate, based on the context features, the distance from each non-centroid sample pair to each centroid; a categorizing unit configured to assign, according to the distances, each non-centroid sample pair to its nearest centroid; a second calculating unit configured to calculate the same number of new centroids from the predetermined number of centroids and their corresponding non-centroid sample pairs; a determining unit configured to determine whether the new centroids satisfy a predetermined condition; and an output unit configured to output the clustering result for the plurality of sample pairs when the predetermined condition is satisfied.

In one embodiment, in the apparatus for predicting a user's rating of an item, the rating prediction unit includes: an obtaining unit configured to, for each subclass, obtain a user-item rating matrix based on the plurality of first user identifiers, the plurality of first item identifiers, and the plurality of existing ratings of the plurality of first users for the plurality of first items; a decomposing unit configured to decompose the user-item rating matrix into two low-dimensional matrices such that the product of the two low-dimensional matrices is as close as possible to the user-item rating matrix; and a predicting unit configured to predict, based on the matrix obtained by multiplying the two low-dimensional matrices, each first user's rating for the first items in the user-item rating matrix that the user has not rated.

Another aspect of this specification provides an item recommendation apparatus, including: a sample pair obtaining unit configured to acquire a plurality of second sample pairs, each second sample pair including a second user identifier and a second item identifier, wherein the second user identifier is the user identifier of the user to receive recommendations and the second item identifier is any one of a plurality of item identifiers corresponding to a plurality of items to be recommended; a determining unit configured to determine, among the plurality of subclasses obtained by the above rating prediction method, the subclass to which each second sample pair belongs; a predicted rating obtaining unit configured to obtain, from the ratings predicted by the rating prediction method, the predicted rating of each second sample pair in the subclass to which it belongs; a sorting unit configured to sort, according to the predicted ratings, the second item identifiers included in the second sample pairs; and a recommendation unit configured to recommend the second items to the second user according to the ranking.

In the item recommendation method according to the embodiments of this specification, user-item pairs are clustered using their context features, so that the ratings within each subclass are less noisy and more strongly correlated. Applying collaborative filtering within each subclass can therefore yield better recommendation performance.

Brief description of the drawings

The embodiments of this specification can be made clearer by describing them with reference to the accompanying drawings:

FIG. 1 shows a schematic diagram of a system 100 according to an embodiment of this specification;

FIG. 2 schematically shows a flowchart of a method for predicting a user's rating of an item according to an embodiment of this specification;

FIG. 3 schematically shows a plurality of sets of context features corresponding to user-item pairs;

FIG. 4 shows a flowchart of clustering by the K-means algorithm according to an embodiment of this specification;

FIG. 5 shows a flowchart of a method for predicting ratings by a collaborative filtering algorithm according to an embodiment of this specification;

FIG. 6 schematically shows the matrix factorization process;

FIG. 7 shows a flowchart of an item recommendation method according to an embodiment of this specification;

FIG. 8 shows an apparatus 800 for predicting a user's rating of an item according to an embodiment of this specification;

FIG. 9 shows an item recommendation apparatus 900 according to an embodiment of this specification.

Detailed description

The embodiments of this specification are described below with reference to the accompanying drawings.

FIG. 1 shows a schematic diagram of a system 100 according to an embodiment of this specification. As shown in FIG. 1, the system 100 includes a clustering module 11, a rating prediction module 12, and a recommendation module 13. First, a plurality of user-item pairs and their corresponding sets of context features are input to the clustering module 11. The clustering module 11 clusters the feature vectors formed from the sets of context features and thereby clusters the user-item pairs, that is, it assigns every user-item pair to a corresponding subclass. The clustering module 11 then sends the resulting subclasses to the rating prediction module 12, together with the existing user ratings of items in each subclass. Within each subclass, the rating prediction module 12 uses the existing ratings to predict, by a collaborative filtering algorithm, the missing ratings of items by the users in that subclass. When the recommendation module 13 makes a recommendation to a user, it uses the user identifier and the identifiers of the items to be recommended to determine the subclass of each pair of the user and a candidate item, obtains from the rating prediction module 12 the predicted rating of that pair within that subclass, and recommends items to the user according to the ranking of the predicted ratings of the candidate items.

FIG. 2 schematically shows a flowchart of a method for predicting a user's rating of an item according to an embodiment of this specification, including: in step S21, acquiring a plurality of sample pairs, each sample pair including any one user identifier selected from a plurality of user identifiers and any one item identifier selected from a plurality of item identifiers; in step S22, acquiring a plurality of existing ratings, the plurality of existing ratings corresponding to some of the plurality of sample pairs; in step S23, acquiring a plurality of sets of context features respectively corresponding to the sample pairs, wherein a set of context features includes at least one of the following types of features: user features, item features, and interaction features; in step S24, clustering the plurality of sample pairs into a plurality of subclasses based on the plurality of sets of context features, wherein each subclass includes a plurality of first sample pairs taken from the plurality of sample pairs, each first sample pair includes a first user identifier and a first item identifier, the first user identifier is an identifier of a first user, and the first item identifier is an identifier of a first item; and in step S25, for each subclass, predicting, by a collaborative filtering algorithm, each first user's rating for the first items that the user has not rated, based on the plurality of first user identifiers, the plurality of first item identifiers, and a plurality of existing ratings of the plurality of first users for the plurality of first items.

First, in step S21, a plurality of sample pairs are acquired, each sample pair including any one user identifier selected from a plurality of user identifiers and any one item identifier selected from a plurality of item identifiers. The sample pairs are user-item pairs and can be written as (user identifier, item identifier). The users may be all users of the recommendation system, for example all users of the Douban Movie app or all users of Taobao. Of course, the plurality of users need not be all users of the recommendation system; they may, for example, be the subset of users handled by one unit of the recommendation system. The items may be all items in the recommendation system, for example the movies on Douban Movie or the goods on Taobao. Likewise, the plurality of items need not be all items in the system; they may be a subset of items within a certain range. The plurality of user-item pairs is obtained by pairing each of the plurality of users with each of the plurality of items.
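
Pairing every user with every item is a Cartesian product, which can be sketched as follows. The identifiers `u1`, `v1`, and so on, and the function name `build_sample_pairs`, are placeholders for illustration.

```python
from itertools import product


def build_sample_pairs(user_ids, item_ids):
    """Pair every user identifier with every item identifier."""
    return list(product(user_ids, item_ids))


pairs = build_sample_pairs(["u1", "u2"], ["v1", "v2", "v3"])
# pairs holds 2 * 3 = 6 tuples of the form (user identifier, item identifier)
```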

In step S22, a plurality of existing ratings are acquired, the plurality of existing ratings corresponding to some of the plurality of sample pairs. An existing rating may be given directly by a user; on Douban Movie, for example, users rate each movie on a scale of 1 to 5. In another example, the existing rating is obtained indirectly from the user's operations. On Taobao, for example, a user's rating of an item can be calculated from operations such as clicking on or purchasing the item. In a recommendation system, usually only some users have rated some items. On Douban Movie, for example, some users only browse without rating movies, and some movies are so obscure that no user has rated them. Therefore, only some of the sample pairs have a corresponding existing rating of the item by the user.

In step S23, a plurality of sets of context features respectively corresponding to the sample pairs are acquired, wherein a set of context features includes at least one of the following types of features: user features, item features, and interaction features. Different recommendation scenarios have different types of features. On Douban Movie, for example, the context features corresponding to a user-item pair can generally be divided into the following types: static user features, such as the user's age group (adolescent, middle-aged, or elderly) and gender; static item features, such as the movie genre (romance, action, horror, and so on); user rating statistical features, such as the mean and variance of the user's ratings; item rating statistical features, such as the mean and variance of the movie's ratings; and interaction features, such as whether the rating was given on a holiday and whether it was given in the morning, at noon, or in the evening. The context features can be obtained from user profiles, item attributes, and user-item interaction information.
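
The rating statistical features mentioned above (mean and variance per user and per item) can be computed from the existing ratings along the following lines. The dictionary layout and the function name `rating_statistics` are assumptions, and the population variance is used because the text does not specify which variance is meant.

```python
from collections import defaultdict
from statistics import mean, pvariance


def rating_statistics(existing_ratings):
    """Compute (mean, variance) of the existing ratings per user and per item.

    existing_ratings: dict {(user_id, item_id): rating}
    """
    by_user, by_item = defaultdict(list), defaultdict(list)
    for (user_id, item_id), rating in existing_ratings.items():
        by_user[user_id].append(rating)
        by_item[item_id].append(rating)
    user_stats = {u: (mean(r), pvariance(r)) for u, r in by_user.items()}
    item_stats = {v: (mean(r), pvariance(r)) for v, r in by_item.items()}
    return user_stats, item_stats
```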

图3示意示出了与用户-物品对应的多组上下文特征。图中u 1、u 2、u 3和u 4为用户标识,v 1、v 2、v 3和v 4为物品标识,u i与v j相交的方格表示一个用户-物品对,方格中的数字3、4、5等为对应的用户对物品的评分。在每个用户-物品对方格的后方,都包括一列方块,其示意表示对应于该用户-物品对的上下文特征组。该上下文特征组包括与该用户-物品对中包括的用户、物品及其交互相关的至少一个特征。 Figure 3 illustrates schematically a plurality of sets of contextual features corresponding to user-items. In the figure, u 1 , u 2 , u 3 and u 4 are user identifiers, v 1 , v 2 , v 3 and v 4 are item identifiers, and the squares where u i and v j intersect represent a user-item pair, square The numbers 3, 4, 5, etc. in the figure are the corresponding user's ratings of the items. Behind each user-item compartment includes a list of squares that schematically represent the set of contextual characteristics corresponding to the user-item pair. The set of context features includes at least one feature associated with the user, the item, and their interactions included in the user-item pair.

在步骤S24,基于所述多组上下文特征,将所述多个样本对聚类为多个子类,其中每个子类包括取自于所述多个样本对中的多个第一样本对,每个所述第一样本对包括第一用户标识和第一物品标识,其中所述第一用户标识为第一用户的标识,所述第一物品标识为第一物品的标识。At step S24, the plurality of sample pairs are clustered into a plurality of sub-categories based on the plurality of sets of context features, wherein each sub-class includes a plurality of first sample pairs taken from the plurality of sample pairs, Each of the first sample pairs includes a first user identification and a first item identification, wherein the first user identification is an identification of the first user, and the first item identification is an identification of the first item.

A context feature group can be represented as a feature vector whose dimension equals the number of features in the group, with each component giving the feature value in the corresponding feature dimension. For example, a group of context features might be: age, middle-aged; movie genre, romance. Quantizing the age dimension as 1 (teenager), 2 (middle-aged), 3 (elderly) and the movie genre dimension as 1 (romance), 2 (action), 3 (horror) yields the feature vector (2, 1) for this group, where the first component is the age dimension and the second component is the movie genre dimension. The feature vector of each context feature group can thus be located as a point in the feature space spanned by the feature dimensions. Different user-item pairs may have equal feature vectors, in which case they coincide at a single point in the feature space; that is, one point may correspond to multiple user-item pairs.
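The quantization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries `AGE_CODES` and `GENRE_CODES` and the helper names are hypothetical, and the codes are simply the ones used in the example in the text.

```python
# Hypothetical ordinal codes, taken from the example in the text.
AGE_CODES = {"teenager": 1, "middle-aged": 2, "elderly": 3}
GENRE_CODES = {"romance": 1, "action": 2, "horror": 3}


def encode_context(age_group, genre):
    """Map one group of context features to a feature vector (age, genre)."""
    return (AGE_CODES[age_group], GENRE_CODES[genre])


def group_pairs_by_vector(pairs):
    """Collect the user-item pairs that land on the same point in feature space."""
    points = {}
    for user_id, item_id, age_group, genre in pairs:
        vector = encode_context(age_group, genre)
        points.setdefault(vector, []).append((user_id, item_id))
    return points


# Illustrative data: two pairs share the vector (2, 1), so they coincide.
pairs = [
    ("u1", "v1", "middle-aged", "romance"),
    ("u2", "v1", "middle-aged", "romance"),
    ("u3", "v2", "teenager", "horror"),
]
points = group_pairs_by_vector(pairs)
```

Here `points[(2, 1)]` holds both `("u1", "v1")` and `("u2", "v1")`, which is the case of several user-item pairs mapping to one point.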

Once the context feature groups have been represented as points in the feature space in this way, the points can be clustered with any of various clustering algorithms, such as K-means, GMM (Gaussian mixture model), BIRCH, or OPTICS.

The clustering process according to an embodiment of this specification is described below using K-means as an example. FIG. 4 shows a flowchart of clustering with the K-means algorithm according to an embodiment of this specification. In step S41, a predetermined number of initial centroids are randomly selected from the feature vector points. This predetermined number is the k that must be fixed in advance in the K-means algorithm. In embodiments of this specification, k can be determined from the estimated number of subclasses. For Douban Movies, for example, the estimated subclasses may include (teenager, romance), (teenager, action), (teenager, horror), (middle-aged, romance), (middle-aged, action), (middle-aged, horror), (elderly, romance), (elderly, action), and (elderly, horror), so k can be set to 9. In other words, the value of k depends on the number of features and on their combinations. Once k has been determined, the k initial centroids are preferably chosen to be spread out.
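The estimate of k in this example amounts to counting the combinations of feature values. A minimal sketch, assuming exactly the two features and three values each from the example (the list names are hypothetical):

```python
from itertools import product

# Feature values from the Douban Movies example in the text.
AGE_GROUPS = ["teenager", "middle-aged", "elderly"]
GENRES = ["romance", "action", "horror"]

# Every (age group, genre) combination is one estimated subclass.
estimated_subclasses = list(product(AGE_GROUPS, GENRES))
k = len(estimated_subclasses)
```

With three age groups and three genres this gives k = 9, matching the value in the text. With more features, the number of combinations grows multiplicatively, so in practice k would likely be chosen below the full product.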

In step S42, the distance from each non-centroid point to each centroid is computed from the feature vector points. Various distance measures can be used, such as the Euclidean distance, the Minkowski distance, or the Manhattan distance. In step S43, each non-centroid point is assigned to its nearest centroid according to these distances, yielding k clusters.
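The distance measures named above are closely related: the Minkowski distance of order p reduces to the Manhattan distance for p = 1 and to the Euclidean distance for p = 2. The following sketch shows this together with the nearest-centroid assignment of step S43; the function names are illustrative and not taken from the specification.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two feature vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan_distance(a, b):
    """Minkowski distance with p = 1."""
    return minkowski_distance(a, b, 1)


def euclidean_distance(a, b):
    """Minkowski distance with p = 2."""
    return minkowski_distance(a, b, 2)


def nearest_centroid(point, centroids, distance=euclidean_distance):
    """Return the index of the centroid closest to the point (step S43)."""
    return min(range(len(centroids)), key=lambda j: distance(point, centroids[j]))
```

For example, `nearest_centroid((2, 1), [(1, 1), (3, 3)])` picks index 0, because (2, 1) lies at Euclidean distance 1 from the first centroid and further from the second.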

In step S44, the same number of new centroids are computed from the predetermined number of centroids and their corresponding non-centroid points, such that the sum of the distances from all points to the centers of their own clusters is minimized. That is, as shown in formula (1), the new centroid $\mu_j$ is the mean vector of all vector points in cluster $C_j$:

$$\mu_j = \frac{1}{|C_j|} \sum_{x \in C_j} x \tag{1}$$

In step S45, it is determined whether the new centroids satisfy a predetermined condition, for example, that the new centroids are unchanged from the previous centroids.

If the predetermined condition is not satisfied, the flow returns to step S42 and steps S42 to S45 are repeated; if it is satisfied, the flow proceeds to step S46. In step S46, the clustering result is output. The result includes multiple clusters and the points in each cluster, where each point corresponds to a feature vector and hence to user-item pairs. In this way, the user-item pairs are clustered into multiple subclasses based on their context features. Each subclass includes multiple first user-item pairs taken from the user-item pairs, and each first user-item pair includes a first user identifier and a first item identifier, where the first user identifier identifies a first user and the first item identifier identifies a first item.
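Steps S41 to S46 can be put together as the following minimal sketch. It is not the specification's implementation: it assumes squared Euclidean distance, assigns every point (including the points picked as initial centroids) to its nearest centroid, and adds a hypothetical `max_iter` cap and `seed` parameter for reproducibility.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(component) / n for component in zip(*points))


def assign(points, centroids):
    """S42-S43: put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        j = min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        clusters[j].append(p)
    return clusters


def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # S41: randomly pick k distinct points as the initial centroids.
    centroids = rng.sample(sorted(set(points)), k)
    for _ in range(max_iter):
        clusters = assign(points, centroids)
        # S44: each new centroid is the mean of its cluster (an empty cluster keeps its centroid).
        new_centroids = [
            mean_vector(cluster) if cluster else centroids[j]
            for j, cluster in enumerate(clusters)
        ]
        # S45: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # S46: output the clusters for the final centroids.
    return centroids, assign(points, centroids)


# Illustrative feature vectors forming two well-separated groups.
feature_points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
final_centroids, final_clusters = kmeans(feature_points, k=2)
```

Each point in `final_clusters` would then be mapped back to the user-item pairs that share that feature vector, giving the subclasses of user-item pairs.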

Referring again to FIG. 2, in step S25, for each subclass, the rating of each first user for each first item that the user has not rated is predicted by a collaborative filtering algorithm, based on the first user identifiers, the first item identifiers, and the existing ratings of the first users for the first items.

Various collaborative filtering algorithms can be used here, such as the kNN algorithm or a matrix factorization algorithm. The rating prediction process according to an embodiment of this specification is described below using matrix factorization as an example. FIG. 5 shows a flowchart of a method for predicting ratings with a collaborative filtering algorithm according to an embodiment of this specification.

As shown in FIG. 5, in step S51 a user-item rating matrix is first obtained for each subclass from the first user identifiers, the first item identifiers, and the existing ratings of the first users for the first items. FIG. 6 schematically shows the matrix factorization process. The matrix on the left of FIG. 6 schematically shows a user-item rating matrix, where $u_1$, $u_2$, $u_3$ and $u_4$ are user identifiers, $v_1$, $v_2$, $v_3$, $v_4$ and $v_5$ are item identifiers, the number in the cell where $u_i$ and $v_j$ intersect is the rating of $v_j$ by $u_i$, and a "?" indicates that $u_i$ has not rated $v_j$.

In step S52, the user-item rating matrix is factorized into two low-dimensional matrices whose product is as close as possible to the user-item rating matrix. Let the user-item rating matrix be $R$; it can be factorized into the transpose $U^{T}$ of a user matrix and an item matrix $V$, that is, $R = U^{T}V$. Making the product of the two low-dimensional matrices as close as possible to the user-item rating matrix means minimizing the difference between that product and the user-item rating matrix. The objective function can therefore be set as the following formula (2):

$$\min_{U,V}\ \left\lVert R - U^{T}V \right\rVert^{2} \tag{2}$$

$U$ and $V$ can be computed iteratively, for example with a gradient descent algorithm, to obtain the two low-dimensional matrices $U$ and $V$ that minimize the objective function. As shown in FIG. 6, for instance, the two matrices multiplied together in the middle of FIG. 6 are the two low-dimensional matrices $U^{T}$ and $V$ obtained by, for example, gradient descent.
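As a hedged illustration of what one gradient descent iteration might look like, assume the objective is the sum of squared errors over the set $\Omega$ of known ratings, let $u_i$ and $v_j$ denote the columns of $U$ and $V$ for user $i$ and item $j$, and let $\eta$ be a learning rate (with the constant factor 2 from the derivative absorbed into $\eta$). These symbols and the exact form of the objective are assumptions for illustration, not taken from the specification:

$$
\begin{aligned}
e_{ij} &= r_{ij} - u_i^{T} v_j, \qquad (i,j) \in \Omega \\
u_i &\leftarrow u_i + \eta \sum_{j:\,(i,j)\in\Omega} e_{ij}\, v_j \\
v_j &\leftarrow v_j + \eta \sum_{i:\,(i,j)\in\Omega} e_{ij}\, u_i
\end{aligned}
$$

Each update moves a user or item vector in the direction that reduces the prediction errors on the ratings it takes part in; the iteration is repeated until the objective stops decreasing.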

In step S53, the ratings of the users in the user-item rating matrix for the items they have not rated are predicted from the matrix obtained by multiplying the two low-dimensional matrices. As shown in FIG. 6, for example, multiplying $U^{T}$ by $V$ gives the prediction matrix on the right of FIG. 6. Comparing the rating matrix with the prediction matrix in FIG. 6, the ratings in the gray cells of the prediction matrix equal (or are as close as possible to) the existing ratings in the rating matrix, while the ratings in the white cells of the prediction matrix are the ratings predicted by the matrix factorization algorithm.
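Steps S51 to S53 can be sketched for a single subclass as follows. This is a minimal illustration, not the specification's implementation: it assumes a squared-error objective over the known ratings, per-rating stochastic gradient updates, a rank of 2, and made-up ratings for 4 users and 5 items. The lists `U` and `V` store one latent vector per row, so `U` here plays the role of $U^{T}$ in the text.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def init_factors(n_users, n_items, rank, seed=0):
    """Small random starting values for the user and item vectors."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_items)]
    return U, V


def train(ratings, U, V, lr=0.01, epochs=500):
    """S52: fit U and V to the known ratings by stochastic gradient descent."""
    for _ in range(epochs):
        for (i, j), r in ratings.items():
            err = r - dot(U[i], V[j])
            for f in range(len(U[i])):
                u_if, v_jf = U[i][f], V[j][f]
                U[i][f] += lr * err * v_jf
                V[j][f] += lr * err * u_if
    return U, V


def squared_error(ratings, U, V):
    """Sum of squared errors on the known ratings."""
    return sum((r - dot(U[i], V[j])) ** 2 for (i, j), r in ratings.items())


# S51: known ratings of one subclass, keyed by (user index, item index).
# The values are illustrative only; missing keys are the "?" cells.
ratings = {
    (0, 0): 5, (0, 1): 3, (0, 3): 4,
    (1, 0): 4, (1, 2): 3, (1, 4): 5,
    (2, 1): 3, (2, 2): 4, (2, 4): 5,
    (3, 0): 3, (3, 3): 4, (3, 4): 5,
}
U, V = init_factors(n_users=4, n_items=5, rank=2)
error_before = squared_error(ratings, U, V)
train(ratings, U, V)
error_after = squared_error(ratings, U, V)

# S53: predicted ratings for the cells that had no rating.
predicted = {
    (i, j): dot(U[i], V[j])
    for i in range(4) for j in range(5)
    if (i, j) not in ratings
}
```

The dictionary `predicted` corresponds to the white cells of the prediction matrix, while the fit on the known ratings (the gray cells) is measured by `squared_error`.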

FIG. 7 shows a flowchart of an item recommendation method according to an embodiment of this specification. The method includes: in step S71, acquiring multiple sample pairs, each sample pair including a user identifier and an item identifier, where the user identifier is the user identifier of the user to receive recommendations and the item identifier is any one of multiple item identifiers corresponding to multiple candidate items; in step S72, determining the subclass containing each sample pair among the subclasses obtained by the rating prediction method above; in step S73, obtaining, from the ratings predicted by the rating prediction method above, the predicted rating of each sample pair within the subclass to which it belongs; in step S74, ranking the item identifiers included in the sample pairs according to the predicted ratings; and, in step S75, recommending items to the user according to the ranking.

First, in step S71, multiple sample pairs are acquired, each sample pair including a user identifier and an item identifier, where the user identifier is the user identifier of the user to receive recommendations and the item identifier is any one of multiple item identifiers corresponding to multiple candidate items. For example, after user $u_1$ opens the page for movie $v_1$ on Douban Movies, or opens the purchase page for product $v_1$ on Taobao, the system starts the item recommendation flow. The system then recalls a candidate set of items to recommend to user $u_1$, based on the user identifier $u_1$ and the item identifier $v_1$ of the item the user acted on. Recall here means a coarse screening of the items to recommend according to predetermined conditions, for example generating the candidate set from the user's initial preferences, or from item attributes (such as geographic location when the recommended items are restaurants). Combining the user identifier $u_1$ with the item identifier $v_i$ of each item in the candidate set yields multiple sample pairs.
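The recall and pairing of step S71 might look like the sketch below. The recall rule used here (same genre as the current item, excluding items the user has already seen) is purely a hypothetical stand-in for the "predetermined conditions" in the text, and the names `catalog` and `seen_items` are assumptions.

```python
def recall_candidates(current_item, catalog, seen_items):
    """Coarse screening: unseen items that share the current item's genre (hypothetical rule)."""
    genre = catalog[current_item]
    return [
        item_id
        for item_id, item_genre in catalog.items()
        if item_genre == genre and item_id != current_item and item_id not in seen_items
    ]


def build_sample_pairs(user_id, candidates):
    """Combine the user identifier with each candidate item identifier."""
    return [(user_id, item_id) for item_id in candidates]


# Illustrative catalog mapping item identifiers to genres.
catalog = {"v1": "romance", "v2": "romance", "v3": "action", "v4": "romance"}
candidates = recall_candidates("v1", catalog, seen_items={"v4"})
sample_pairs = build_sample_pairs("u1", candidates)
```

In this toy setting the only candidate is `v2`, so the single sample pair is `("u1", "v2")`.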

In step S72, the subclass containing each sample pair is determined among the subclasses obtained by the rating prediction method above. In that method, one sample pair corresponds to one feature vector, that is, to one point in the vector space, so a sample pair can be assigned to only one subclass. The sample pair can therefore be looked up among the subclasses obtained above by its user identifier and item identifier, which determines the subclass containing it. The subclasses of the other sample pairs are obtained in the same way.

In step S73, the predicted rating of each sample pair within the subclass to which it belongs is obtained from the ratings predicted by the rating prediction method above. As described above with reference to FIG. 5, within each subclass the collaborative filtering algorithm predicts the ratings of each user in the subclass for the items in the subclass that the user has not rated. Once the subclass containing a sample pair has been determined, the predicted rating for that sample pair can therefore be retrieved from all the predicted ratings associated with the subclass.

In step S74, the item identifiers included in the sample pairs are ranked according to the predicted ratings. A higher predicted rating indicates a stronger estimated preference of the user for the item, so items with high predicted ratings can be ranked near the top.

In step S75, items are recommended to the user according to the ranking. The ranking can be used in several ways. For example, only the top-ranked items may be recommended to the user, the top-ranked items may be recommended with priority, or items may be recommended in ranked order (in time or in space), and so on.
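Steps S72 to S75 reduce to two lookups, a sort, and a cut-off. The sketch below assumes that the clustering and prediction results are available as two dictionaries, `pair_to_subclass` and `predicted_scores`; these structures, the `top_n` parameter, and the choice to recommend only the top-ranked items are assumptions for illustration.

```python
def recommend(user_id, candidate_items, pair_to_subclass, predicted_scores, top_n=2):
    scored = []
    for item_id in candidate_items:
        pair = (user_id, item_id)
        # S72: find the subclass that contains this sample pair.
        subclass = pair_to_subclass.get(pair)
        if subclass is None:
            continue  # the pair was not part of the clustered data
        # S73: look up the rating predicted within that subclass.
        score = predicted_scores[subclass].get(pair)
        if score is not None:
            scored.append((item_id, score))
    # S74: rank items by predicted rating, highest first.
    scored.sort(key=lambda entry: entry[1], reverse=True)
    # S75: recommend only the top-ranked items.
    return [item_id for item_id, _ in scored[:top_n]]


# Illustrative lookup tables for one user.
pair_to_subclass = {("u1", "v2"): 0, ("u1", "v3"): 1, ("u1", "v5"): 0}
predicted_scores = {
    0: {("u1", "v2"): 4.2, ("u1", "v5"): 3.1},
    1: {("u1", "v3"): 4.8},
}
recommended = recommend("u1", ["v2", "v3", "v5", "v9"], pair_to_subclass, predicted_scores)
```

Item `v9` is skipped because its pair with `u1` does not appear in any subclass, and the remaining items are returned in descending order of predicted rating.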

FIG. 8 shows an apparatus 800 for predicting users' ratings of items according to an embodiment of this specification, including: a sample pair acquisition unit 81, configured to acquire multiple sample pairs, each sample pair including any one user identifier selected from multiple user identifiers and any one item identifier selected from multiple item identifiers; a rating acquisition unit 82, configured to acquire multiple existing ratings, the existing ratings corresponding to some of the sample pairs; a context feature acquisition unit 83, configured to acquire multiple groups of context features respectively corresponding to the sample pairs, where a group of context features includes at least one of the following types of features: user features, item features, and interaction features; a clustering unit 84, configured to cluster the sample pairs into multiple subclasses based on the groups of context features, where each subclass includes multiple first sample pairs taken from the sample pairs, and each first sample pair includes a first user identifier and a first item identifier, the first user identifier identifying a first user and the first item identifier identifying a first item; and a rating prediction unit 85, configured to, for each subclass, predict by a collaborative filtering algorithm the rating of each first user for each first item that the user has not rated, based on the first user identifiers, the first item identifiers, and the existing ratings of the first users for the first items.

In one embodiment, in the apparatus 800 for predicting users' ratings of items, the clustering unit 84 includes: a selection unit 841, configured to randomly select a predetermined number of initial centroids from the sample pairs; a first calculation unit 842, configured to calculate, based on the context features, the distance from each non-centroid sample pair to each centroid; an assignment unit 843, configured to assign each non-centroid sample pair to its nearest centroid according to the distances; a second calculation unit 844, configured to calculate the same number of new centroids from the predetermined number of centroids and their corresponding non-centroid sample pairs; a determination unit 845, configured to determine whether the new centroids satisfy a predetermined condition; and an output unit 846, configured to output the clustering result for the sample pairs when the predetermined condition is satisfied.

In one embodiment, in the apparatus for predicting users' ratings of items, the rating prediction unit 85 includes: an acquisition unit 851, configured to obtain, for each subclass, a user-item rating matrix from the first user identifiers, the first item identifiers, and the existing ratings of the first users for the first items; a factorization unit 852, configured to factorize the user-item rating matrix into two low-dimensional matrices whose product is as close as possible to the user-item rating matrix; and a prediction unit 853, configured to predict, from the matrix obtained by multiplying the two low-dimensional matrices, the rating of each first user in the user-item rating matrix for each first item that the user has not rated.

FIG. 9 shows an item recommendation apparatus 900 according to an embodiment of this specification, including: a sample pair acquisition unit 91, configured to acquire multiple second sample pairs, each second sample pair including a second user identifier and a second item identifier, where the second user identifier is the user identifier of the user to receive recommendations and the second item identifier is any one of multiple item identifiers corresponding to multiple candidate items; a determination unit 92, configured to determine the subclass containing each second sample pair among the subclasses obtained by the rating prediction method above; a predicted rating acquisition unit 93, configured to obtain, from the ratings predicted by the rating prediction method, the predicted rating of each second sample pair within the subclass to which it belongs; a ranking unit 94, configured to rank the second item identifiers included in the second sample pairs according to the predicted ratings; and a recommendation unit 95, configured to recommend the second items to the second user according to the ranking.

In the item recommendation method according to embodiments of this specification, user-item pairs are clustered using their context features, so that the ratings within each subclass are less noisy and more strongly correlated. Applying collaborative filtering within each subclass can therefore achieve better recommendation performance.

Those of ordinary skill in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above generally in terms of function. Whether these functions are executed in hardware or in software depends on the specific application and design constraints of the technical solution. Those of ordinary skill in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit its scope of protection; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (16)

1. A method for predicting users' ratings of items, comprising:

acquiring multiple sample pairs, each sample pair including any one user identifier selected from multiple user identifiers and any one item identifier selected from multiple item identifiers;

acquiring multiple existing ratings, the existing ratings corresponding to some of the sample pairs;

acquiring multiple groups of context features respectively corresponding to the sample pairs, wherein a group of context features includes at least one of the following types of features: user features, item features, and interaction features;

clustering the sample pairs into multiple subclasses based on the groups of context features, wherein each subclass includes multiple first sample pairs taken from the sample pairs, and each first sample pair includes a first user identifier and a first item identifier, the first user identifier identifying a first user and the first item identifier identifying a first item; and

for each subclass, predicting by a collaborative filtering algorithm the rating of each first user for each first item that the user has not rated, based on the first user identifiers, the first item identifiers, and the existing ratings of the first users for the first items.
2. The method for predicting users' ratings of items according to claim 1, wherein the user features include user attribute features and/or user rating statistics, and the item features include item attribute features and/or item rating statistics.

3. The method for predicting users' ratings of items according to claim 1, wherein the clustering algorithm is a k-means algorithm or a GMM algorithm.

4. The method for predicting users' ratings of items according to claim 1, wherein clustering the sample pairs into multiple subclasses based on the groups of context features comprises:

randomly selecting a predetermined number of initial centroids from the sample pairs;

calculating, based on the groups of context features, the distance from each non-centroid sample pair to each centroid;

assigning each non-centroid sample pair to its nearest centroid according to the distances;

calculating, based on the groups of context features, the same number of new centroids from the predetermined number of centroids and their corresponding non-centroid sample pairs;

determining whether the new centroids satisfy a predetermined condition; and

when the predetermined condition is satisfied, outputting the clustering result for the sample pairs.

5. The method for predicting users' ratings of items according to claim 1, wherein the collaborative filtering algorithm is a matrix factorization algorithm or a kNN algorithm.
6. The method for predicting users' ratings of items according to claim 1, wherein predicting by a collaborative filtering algorithm the rating of each first user for each first item that the user has not rated comprises:

for each subclass, obtaining a user-item rating matrix from the first user identifiers, the first item identifiers, and the existing ratings of the first users for the first items;

factorizing the user-item rating matrix into two low-dimensional matrices whose product is as close as possible to the user-item rating matrix; and

predicting, from the matrix obtained by multiplying the two low-dimensional matrices, the rating of each first user in the user-item rating matrix for each first item that the user has not rated.

7. The method for predicting users' ratings of items according to claim 1, wherein the existing ratings are ratings given directly by users or ratings derived from user operations.
8. An item recommendation method, comprising:

acquiring multiple second sample pairs, each second sample pair including a second user identifier and a second item identifier, wherein the second user identifier is the user identifier of the user to receive recommendations and the second item identifier is any one of multiple item identifiers corresponding to multiple candidate items;

determining the subclass containing each second sample pair among the subclasses obtained by the method according to any one of claims 1-7;

obtaining, from the ratings predicted by the method according to any one of claims 1-7, the predicted rating of each second sample pair within the subclass to which it belongs;

ranking the second item identifiers included in the second sample pairs according to the predicted ratings; and

recommending the second items to the second user according to the ranking.
An apparatus for predicting a user's rating of an item, comprising:
a sample pair obtaining unit configured to obtain a plurality of sample pairs, each sample pair comprising any one user identifier selected from a plurality of user identifiers and any one item identifier selected from a plurality of item identifiers;
a rating obtaining unit configured to obtain a plurality of existing ratings, the plurality of existing ratings corresponding to some of the plurality of sample pairs;
a context feature obtaining unit configured to obtain a plurality of sets of context features respectively corresponding to the sample pairs, wherein each set of context features includes at least one of the following types of features: user features, item features, and interaction features;
a clustering unit configured to cluster the plurality of sample pairs into a plurality of subclasses based on the plurality of sets of context features, wherein each subclass includes a plurality of first sample pairs taken from the plurality of sample pairs, each first sample pair includes a first user identifier and a first item identifier, the first user identifier is an identifier of a first user, and the first item identifier is an identifier of a first item; and
a rating prediction unit configured to, for each subclass, predict by a collaborative filtering algorithm each first user's rating of the first items that the first user has not rated, based on the plurality of first user identifiers, the plurality of first item identifiers, and a plurality of existing ratings given by the plurality of first users to the plurality of first items.

The apparatus for predicting a user's rating of an item according to claim 9, wherein the user features include user attribute features and/or user rating statistical features, and the item features include item attribute features and/or item rating statistical features.

The apparatus for predicting a user's rating of an item according to claim 9, wherein the clustering algorithm is a k-means algorithm or a Gaussian mixture model (GMM) algorithm.

The apparatus for predicting a user's rating of an item according to claim 9, wherein the clustering unit comprises:
a selecting unit configured to randomly select a predetermined number of initial centroids from among the plurality of sample pairs;
a first calculating unit configured to calculate, based on the context features, the distance from each non-centroid sample pair to each centroid;
a categorizing unit configured to assign each non-centroid sample pair to its nearest centroid according to the distances;
a second calculating unit configured to calculate the same number of new centroids according to the predetermined number of centroids and their corresponding non-centroid sample pairs;
a determining unit configured to determine whether the new centroids satisfy a predetermined condition; and
an output unit configured to output a clustering result for the plurality of sample pairs when the predetermined condition is satisfied.
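The centroid procedure recited above for the clustering unit matches the usual k-means loop. The following is a minimal Python sketch under several assumptions that the claims do not fix: each sample pair has already been encoded as a numeric context-feature vector, distance is squared Euclidean, and the predetermined condition is that the centroids have stopped moving. The names `kmeans`, `features`, and `tol` are illustrative only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(features, k, max_iter=100, tol=1e-9, seed=0):
    """Cluster context-feature vectors (one per sample pair) into k subclasses."""
    rng = random.Random(seed)
    # Randomly select k sample pairs as the initial centroids.
    centroids = [list(features[i]) for i in rng.sample(range(len(features)), k)]
    labels = [0] * len(features)
    for _ in range(max_iter):
        # Assign every sample pair to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(f, centroids[c]))
            for f in features
        ]
        # Recompute the same number of centroids from the assigned members.
        new_centroids = []
        for c in range(k):
            members = [f for f, label in zip(features, labels) if label == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        # Assumed stopping condition: no centroid moved by more than tol.
        shift = max(squared_distance(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return labels, centroids


# Hypothetical two-dimensional context features for six sample pairs.
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(features, k=2)
print(labels)
```

Each label returned here plays the role of a subclass index; the sample pairs sharing a label would then be handed to the rating prediction unit as one group.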
The apparatus for predicting a user's rating of an item according to claim 9, wherein the collaborative filtering algorithm is a matrix factorization algorithm or a k-nearest-neighbor (kNN) algorithm.

The apparatus for predicting a user's rating of an item according to claim 9, wherein the rating prediction unit comprises:
an obtaining unit configured to, for each subclass, obtain a user-item rating matrix based on the plurality of first user identifiers, the plurality of first item identifiers, and the plurality of existing ratings given by the plurality of first users to the plurality of first items;
a decomposition unit configured to decompose the user-item rating matrix into two low-dimensional matrices such that the product of the two low-dimensional matrices is as close as possible to the user-item rating matrix; and
a prediction unit configured to predict, from the matrix obtained by multiplying the two low-dimensional matrices, each first user's rating in the user-item rating matrix of the first items that the first user has not rated.

The apparatus for predicting a user's rating of an item according to claim 9, wherein each existing rating is a rating given directly by a user or a rating derived from user operations.
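The decomposition and prediction units described above can be illustrated with a small matrix factorization example. This is a sketch only: the claims require that the product of the two low-dimensional matrices approximate the rating matrix, but do not say how the factors are found. Stochastic gradient descent with L2 regularization is assumed here, and the names `factorize` and `predict` and the hyperparameters are hypothetical.

```python
import random


def factorize(ratings, n_users, n_items, rank=2, lr=0.01, reg=0.02,
              epochs=3000, seed=0):
    """Factor observed ratings {(user, item): rating} into user factors P and item factors Q."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_items)]
    observed = list(ratings.items())
    for _ in range(epochs):
        rng.shuffle(observed)
        for (u, i), r in observed:
            # Error between the existing rating and the current reconstruction.
            err = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for f in range(rank):
                p, q = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization (assumed).
                P[u][f] += lr * (err * q - reg * p)
                Q[i][f] += lr * (err * p - reg * q)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating: the (u, i) entry of the product of the two factor matrices."""
    return sum(p * q for p, q in zip(P[u], Q[i]))


# Hypothetical ratings within one subclass; user 1 has not rated item 2.
ratings = {(0, 0): 1.0, (0, 1): 2.0, (0, 2): 2.5, (1, 0): 2.0, (1, 1): 4.0}
P, Q = factorize(ratings, n_users=2, n_items=3, rank=1)
print(round(predict(P, Q, 1, 2), 2))
```

Only observed entries contribute to the updates, so unrated pairs are filled in purely from the learned factors. Running the factorization once per subclass, as the claims describe, keeps each rating matrix small.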
An item recommendation apparatus, comprising:
a sample pair obtaining unit configured to obtain a plurality of second sample pairs, each second sample pair comprising a second user identifier and a second item identifier, wherein the second user identifier is the user identifier of a user to whom items are to be recommended, and the second item identifier is any one of a plurality of item identifiers corresponding to a plurality of items to be recommended;
a determining unit configured to determine, among a plurality of subclasses obtained by the method according to any one of claims 1-7, the subclass to which each second sample pair belongs;
a predicted rating obtaining unit configured to obtain, from the ratings predicted by the method according to any one of claims 1-7, the predicted rating corresponding to each second sample pair in the subclass to which it belongs;
a sorting unit configured to sort the second item identifiers included in the second sample pairs according to the predicted ratings; and
a recommendation unit configured to recommend the second items to the second user according to the sorting.
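The recommendation flow recited above (locate each pair's subclass, look up its predicted rating, sort, recommend) can be sketched in a few lines of Python. The data layout is an assumption for illustration: `subclass_of` maps a (user, item) pair to a subclass index, `predicted` maps a subclass index to the ratings predicted within it, and `top_n` is a hypothetical cutoff that the claims do not specify.

```python
def recommend(user_id, candidate_items, subclass_of, predicted, top_n=3):
    """Rank candidate items for one user by predicted rating, highest first."""
    scored = []
    for item_id in candidate_items:
        pair = (user_id, item_id)
        # Determine the subclass that this second sample pair falls into.
        subclass = subclass_of.get(pair)
        if subclass is None:
            continue  # pair was not clustered; skipped in this sketch
        # Fetch the rating predicted for the pair within its own subclass.
        rating = predicted.get(subclass, {}).get(pair)
        if rating is None:
            continue
        scored.append((rating, item_id))
    # Sort the item identifiers by predicted rating, in descending order.
    scored.sort(key=lambda entry: entry[0], reverse=True)
    return [item_id for _, item_id in scored[:top_n]]


# Hypothetical clustering result and per-subclass predictions.
subclass_of = {("u1", "a"): 0, ("u1", "b"): 1, ("u1", "c"): 0}
predicted = {
    0: {("u1", "a"): 3.5, ("u1", "c"): 4.8},
    1: {("u1", "b"): 4.1},
}
print(recommend("u1", ["a", "b", "c"], subclass_of, predicted, top_n=2))
```

Pairs that cannot be placed in a subclass are dropped here for simplicity; a fuller system would need some fallback for them.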
PCT/CN2018/123411 2018-03-27 2018-12-25 Item recommendation Ceased WO2019184480A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810257617.XA CN108647985B (en) 2018-03-27 2018-03-27 Item recommendation method and device
CN201810257617.X 2018-03-27

Publications (1)

Publication Number Publication Date
WO2019184480A1 (en) 2019-10-03

Family

ID=63744806

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/123411 Ceased WO2019184480A1 (en) 2018-03-27 2018-12-25 Item recommendation

Country Status (3)

Country Link
CN (1) CN108647985B (en)
TW (1) TW201942834A (en)
WO (1) WO2019184480A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220092654A1 (en) * 2020-09-24 2022-03-24 Ncr Corporation Prepackaged basket generator and interface

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108647985B (en) * 2018-03-27 2020-06-09 阿里巴巴集团控股有限公司 Item recommendation method and device
CN109635291B (en) * 2018-12-04 2023-04-25 重庆理工大学 A Recommendation Method Fused with Rating Information and Item Content Based on Collaborative Training

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8655882B2 (en) * 2011-08-31 2014-02-18 Raytheon Company Method and system for ontology candidate selection, comparison, and alignment
CN104966125A (en) * 2015-05-06 2015-10-07 同济大学 Article scoring and recommending method of social network
CN106326483A (en) * 2016-08-31 2017-01-11 华南理工大学 Collaborative recommendation method with user context information aggregation
CN108647985A (en) * 2018-03-27 2018-10-12 阿里巴巴集团控股有限公司 A kind of item recommendation method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853470A (en) * 2010-05-28 2010-10-06 浙江大学 A Collaborative Filtering Method Based on Social Tags
CN102789499B (en) * 2012-07-16 2015-08-12 浙江大学 Based on the collaborative filtering method of implicit relationship situated between article
CN102968506A (en) * 2012-12-14 2013-03-13 北京理工大学 Personalized collaborative filtering recommendation method based on extension characteristic vectors

Also Published As

Publication number Publication date
CN108647985B (en) 2020-06-09
TW201942834A (en) 2019-11-01
CN108647985A (en) 2018-10-12

Similar Documents

Publication Publication Date Title
CN110969516B (en) A product recommendation method and device
Bellogín et al. Using graph partitioning techniques for neighbour selection in user-based collaborative filtering
Desrosiers et al. A comprehensive survey of neighborhood-based recommendation methods
Chen et al. An effective recommendation method for cold start new users using trust and distrust networks
Ning et al. A comprehensive survey of neighborhood-based recommendation methods
Ghazanfar et al. A scalable, accurate hybrid recommender system
US10922725B2 (en) Automatic rule generation for recommendation engine using hybrid machine learning
CN112685635B (en) Item recommendation method, device, server and storage medium based on classification label
Ghazanfar et al. Building switching hybrid recommender system using machine learning classifiers and collaborative filtering
Mohammed et al. Feature reduction based on hybrid efficient weighted gene genetic algorithms with artificial neural network for machine learning problems in the big data
WO2019184480A1 (en) Item recommendation
Ghazanfar et al. Fulfilling the needs of gray-sheep users in recommender systems, a clustering solution
Kumar Bokde et al. An item-based collaborative filtering using dimensionality reduction techniques on mahout framework
Indira et al. Multi cloud based service recommendation system using DBSCAN algorithm
Tso et al. Attribute-aware collaborative filtering
Stankova et al. Classification over bipartite graphs through projection
CN115222177A (en) Service data processing method and device, computer equipment and storage medium
Xu et al. Exploiting interactions of review text, hidden user communities and item groups, and time for collaborative filtering
El Alami et al. Improving Neighborhood-Based Collaborative Filtering by a Heuristic Approach and an Adjusted Similarity Measure.
Cho et al. Book recommendation system
Lampropoulos et al. Evaluation of a cascade hybrid recommendation as a combination of one-class classification and collaborative filtering
Kużelewska Quality of recommendations and cold-start problem in recommender systems based on multi-clusters
Kuzelewska et al. Multi-clustering applied to collaborative recommender systems
Bouza et al. (Partial) user preference similarity as classification-based model similarity
KR20190136941A (en) Rating Prediction Method for Recommendation Algorithm Based on Observed Ratings and Similarity Graphs

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 18912055; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: PCT application non-entry in European phase. Ref document number: 18912055; Country of ref document: EP; Kind code of ref document: A1