Detailed Description
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein merely illustrate the relevant invention and do not limit it. It should also be noted that, for convenience of description, only the portions related to the invention are shown in the drawings.
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
In the following description, numerous specific details are set forth to provide a thorough description of embodiments of the invention. However, it will be understood by those skilled in the art that the embodiments of the present application may be practiced without these specific details.
Referring to fig. 1, an exemplary flowchart of a content recommendation method according to an embodiment of the present application is shown. For ease of understanding, the present embodiment is described in conjunction with a device for presenting recommended content. It will be understood by those skilled in the art that the device for presenting the recommended content may be a large electronic screen, such as an electronic screen installed in an underground passage or in the hall of a shopping mall, or may be a mobile electronic device with a display function, such as a mobile phone or a tablet computer.
As shown in fig. 1, in step 101, user attribute information and/or environment attribute information is acquired.
In this embodiment, the user attribute information may be extracted based on an image captured by a camera. The camera may be mounted on the device for presenting recommended content or may be mounted at one or more locations near the device for presenting recommended content. In some implementations, images collected by multiple cameras may be acquired to analyze user attribute information for all users within a display range.
The user attribute information may include user individual attribute information and group attribute information. The individual attribute information may be information obtained by analyzing individual characteristics of each user, and the group attribute information may be information obtained based on a relationship between a plurality of users.
In some implementations, the individual attribute information may include individual characteristic information of the user, such as appearance characteristic information, i.e., information that can be derived directly from the user's outward appearance, such as gender, age, race, clothing style, makeup style, health status, posture, and the like. The gender, age, race, and makeup style can be obtained by applying a classifier to features of the face region, and the health status can be obtained by classifying, or performing a retrieval over, features of the face region and the posture. The individual characteristic information may also include emotional state information of the user, such as happiness, sadness, anger, and the like, which may likewise be obtained by using a classifier to analyze the user's facial expression and body movements.
Further, the individual attribute information may also include the personality and purchasing power of the user. The personality may include at least one of: sense of responsibility, emotional stability, extroversion, openness to novelty, affinity, popularity, confidence, and autism. These personality traits may be quantified into a number of levels, each level corresponding to a different degree of fit. For example, seven levels from -3 to +3 may be used, where -3 represents the poorest fit and +3 the best fit. Taking commodity recommendation as an example, when the user's openness to novelty is quantified as +3, novel items may be recommended to the user, such as clothes of a style or color different from what the user is wearing, or travel information; in contrast, when the user's openness to novelty is quantified as -3, items consistent with the user's current appearance, such as clothes and accessories of the same style, may be recommended.
The personality of the user can be obtained by analyzing the user's features with a regressor. In some implementations, the personality may be obtained by machine learning. For example, a training set can be built from existing data, and a personality model trained by a machine learning method to obtain a mapping between user features and personality. Optionally, the personality model may be trained using manually quantified personality indexes as training data. For example, images of multiple users may be acquired and user features extracted, and the degree of extroversion of each user may then be quantified, based on psychological analysis, into seven levels (-3, -2, -1, 0, +1, +2, +3), with -3 denoting the lowest extroversion and +3 the highest. For example, the degree of extroversion of a user wearing brightly colored clothing may be quantified as +2 or +3. The user features and the quantified degrees of extroversion are used as training data, and a machine learning method such as a support vector machine (SVM) or a random forest is used to train an extroversion model, yielding the mapping between user features and the degree of extroversion. The extroversion model can then be used to determine the degree of extroversion of a user. In a further implementation, training may be performed on a plurality of user features and a plurality of personality indexes; the resulting model is a comprehensive personality model, from which the determination results for multiple personality indexes of the user can be obtained directly.
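The training step described above can be illustrated with a minimal sketch. To stay dependency-free it swaps in a nearest-centroid rule as a stand-in for the SVM or random forest named in the text; the two-dimensional feature vectors (for example, clothing color saturation and posture openness) and the function names `train_centroids` and `predict_level` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from math import dist

def train_centroids(samples):
    """Average the feature vectors of each manually assigned extroversion level."""
    grouped = defaultdict(list)
    for features, level in samples:
        grouped[level].append(features)
    return {
        level: [sum(column) / len(column) for column in zip(*rows)]
        for level, rows in grouped.items()
    }

def predict_level(centroids, features):
    """Return the level whose centroid is closest to the given features."""
    return min(centroids, key=lambda level: dist(centroids[level], features))

# Hypothetical training data: (feature vector, extroversion level in -3..+3).
samples = [
    ([0.9, 0.8], 3), ([0.8, 0.9], 3),
    ([0.5, 0.5], 0),
    ([0.1, 0.2], -3), ([0.2, 0.1], -3),
]
centroids = train_centroids(samples)
print(predict_level(centroids, [0.85, 0.85]))
```

A production model would replace the centroid rule with whichever learner the implementation actually uses; only the shape of the training data (features paired with a quantified level) follows the description.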
The purchasing power information can be obtained from the prices of the clothes, shoes, and accessories worn by the user. First, features of the user's clothes, shoes, and accessories can be extracted and matched against a database to obtain their brand information and/or price information. Optionally, the position of that price among the prices of all goods of the same kind may then be calculated, thereby determining the purchasing power of the user. For example, features of a watch worn by a user may be extracted, and the brand and price of the watch looked up in a merchandise database. Further, the price range of watches of the same brand can be queried, and the rank (for example, the percentile rank) of the price or of the price range among all watches can be obtained. Alternatively or additionally, the purchasing power may be quantified, for example into a plurality of levels. Specifically, if the price of the watch ranks in the top 10% of all watches, the purchasing power of the user may be assigned the highest level.
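As a rough illustration of the ranking step, the sketch below maps the price of a recognized item to a purchasing power level by its rank among same-kind prices. The ten-level scale, the catalogue values, and the function name `purchasing_power_level` are assumptions made for the example, not details given in the description.

```python
from bisect import bisect_left

def purchasing_power_level(price, same_kind_prices, levels=10):
    """Map a price to a level in 1..levels by its rank among same-kind prices.

    Integer arithmetic is used so that level boundaries are exact.
    """
    ordered = sorted(same_kind_prices)
    cheaper = bisect_left(ordered, price)  # number of items priced strictly lower
    return min(levels, cheaper * levels // len(ordered) + 1)

# Hypothetical watch catalogue prices.
watch_prices = [50, 80, 120, 200, 350, 600, 900, 1500, 4000, 12000]
print(purchasing_power_level(12000, watch_prices))  # top 10% of the catalogue
print(purchasing_power_level(50, watch_prices))     # cheapest item
```

With ten levels, an item priced above 90% of the catalogue lands in the highest level, which matches the top-10% example in the text.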
The group attribute information of the user may include social relationship information of the user, such as a family relationship, a romantic relationship, a friendship, and the like. The social relationship information can be determined from the clothes worn by the users; for example, matching couple outfits or parent-child outfits can indicate the relationship among the users.
In some alternative implementations, the user attribute information may be obtained by: acquiring an image of the area in which the presentation position of the recommended content is located, determining from the image a user who is the service object, and performing individual analysis and group analysis on that user.
In the above implementation, the image of the area in which the presentation position of the recommended content is located may be captured by an image capturing device (e.g., a camera component on a mobile terminal). Optionally, before the image is acquired, a sensor detects whether a user is present in the area in which the presentation position is located and, if so, the position of the user; a computer system then controls the camera to rotate and focus so as to capture an image of the detected user.
Optionally, determining the user who is the service object from the image may include: detecting a pedestrian in the image and the gaze focus position of the pedestrian; judging whether the gaze focus position of the pedestrian falls on the presentation position of the recommended content; and, if so, determining the pedestrian to be the user who is the service object. After the image is captured, pedestrian detection can be performed based on skin color features or human body shape features, or by machine learning methods such as random forests and hidden Markov models, so as to extract the human bodies in the image. Then, based on the color feature (for example, black) and the shape feature (approximately circular) of the pupil, the pupil position can be detected by methods such as edge extraction and the Hough transform, so as to determine a position parameter of the pedestrian's gaze focus; based on this position parameter, it is judged whether the gaze focus position of the pedestrian falls on the presentation position of the recommended content. If so, the detected pedestrian can be identified as the user to be served. It should be noted that the presentation position of the recommended content may be an area; when the gaze focus of the pedestrian lies within that area, the pedestrian is considered to be the user who is the service object.
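The final decision in this step, whether a gaze focus falls inside the presentation area, can be sketched as a simple containment test. The sketch assumes the gaze focus has already been estimated as a point in the same coordinate system as a rectangular presentation area; the `Region` class and `select_service_users` function are hypothetical names, and pupil detection itself is not shown.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle describing the presentation area (assumed shape)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom

def select_service_users(gaze_points, display):
    """Keep the pedestrians whose gaze focus lies inside the presentation area."""
    return [pid for pid, (x, y) in gaze_points.items() if display.contains(x, y)]

display = Region(left=0, top=0, right=100, bottom=50)
gaze_points = {"pedestrian_a": (10, 10), "pedestrian_b": (150, 20)}  # hypothetical
print(select_service_users(gaze_points, display))
```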
In some alternative implementations, the individual analysis of the user may be performed by: each user in the image is divided into a plurality of sub-images according to the human body part, and the sub-images are analyzed by adopting a classifier and/or a regressor to obtain user attribute information. Specifically, the user detected in the image may be image-segmented according to different parts of the human body. For example, the human body image may be segmented into a face image, a limb image, and a body image. Each sub-image may then be analyzed using a classifier and/or a regressor. For example, the facial image may be classified using a makeup style classifier, and the limb image and the body image may be classified using a clothing style classifier, thereby obtaining various attribute information of the user.
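The routing of sub-images to part-specific analyzers can be sketched as follows. The analyzers here are placeholder functions returning fixed values; in practice they would be trained classifiers or regressors, and the part names, attribute names, and `analyze_user` function are illustrative assumptions.

```python
def analyze_user(sub_images, analyzers):
    """Run every analyzer registered for a body part on that part's sub-image."""
    attributes = {}
    for part, image in sub_images.items():
        for attribute_name, analyzer in analyzers.get(part, {}).items():
            attributes[attribute_name] = analyzer(image)
    return attributes

# Placeholder analyzers standing in for trained classifiers/regressors.
analyzers = {
    "face": {"makeup_style": lambda img: "natural", "age": lambda img: 27.0},
    "body": {"clothing_style": lambda img: "casual"},
}
# Strings stand in for the pixel data of each segmented sub-image.
sub_images = {"face": "<face pixels>", "body": "<body pixels>", "limbs": "<limb pixels>"}
result = analyze_user(sub_images, analyzers)
print(result)
```

Parts with no registered analyzer (here, the limbs) are simply skipped, so new analyzers can be added per part without changing the routing code.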
Further reference is made to fig. 2, which shows a schematic diagram of the effect of individual analysis according to an embodiment of the application. As shown in fig. 2, the extracted user image may be divided into a plurality of sub-images, such as a hair image, a face image, a left arm image, a bag image, a left leg image, a skirt image, a left shoe image, a right leg image, a jacket image, a right arm image, and a glasses image. Different user attribute information can be obtained by analyzing each sub-image with a classifier or a regressor. For example, classifying the hair image can yield the user's hairstyle and hair quality; analyzing the face image can yield attribute information such as the user's gender, age, race, expression, skin condition, and facial features; analyzing the left arm, right arm, left leg, and right leg images can yield other characteristics such as the user's physical strength and health; and analyzing the bag, glasses, left shoe, right shoe, skirt, and jacket images can yield attribute information such as the types of clothing and accessories the user prefers, the preferred brands and prices, and matching commodities. Analyzing the glasses image may also yield information on the glasses functions the user prefers. Each of these items of user attribute information may be represented quantitatively, e.g., as levels in the manner described above.
Returning to fig. 1, performing group analysis on the users includes grouping the users. In one optional implementation, a classifier classifies the multiple users in the image according to social relationship, based on the degree of association of their clothing and postures and on their relative positions in the image. For example, whether users are a couple or family members may be determined based on whether the styles of their clothes are the same, and whether users are friends or a couple may be analyzed according to how close they are to one another. For example, when physical contact between two users is detected, it may be preliminarily determined that the two users are friends or family members, and whether they are a couple may then be determined based on whether their clothes match. In another optional implementation, the multiple users in the image are clustered using the user attribute information obtained from the individual analysis. The clustering may quantify the attribute information of each user, calculate the distances between the users' attribute information, and place the attribute information whose distance is smaller than a preset threshold into one group, thereby grouping the corresponding users. Optionally, the clustering may first use the group attribute information of the users (e.g., social relationships) and then use the individual user attribute information.
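The threshold-based clustering described above can be sketched as follows. The sketch assumes each user's attributes have already been quantified into a numeric vector and uses Euclidean distance with a union-find structure, so that any two users closer than the threshold end up in the same group; the distance metric, the sample vectors, and the name `cluster_users` are assumptions.

```python
from math import dist

def cluster_users(vectors, threshold):
    """Group user indices whose attribute vectors are closer than `threshold`."""
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in range(len(vectors)):
        for b in range(a + 1, len(vectors)):
            if dist(vectors[a], vectors[b]) < threshold:
                parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical quantified attribute vectors for five users.
vectors = [(0, 0), (0.5, 0), (5, 5), (5.2, 5.1), (10, 0)]
print(cluster_users(vectors, threshold=1.0))
```

Because groups are merged transitively, a chain of users who are pairwise close will form a single group even if its two ends are farther apart than the threshold.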
Through the above-described acquisition mode of the user attribute information, not only richer superficial characteristics, such as gender, age, race, health status, makeup style, accessory style, and the like, can be acquired, but also deeper characteristics of the user, such as character, purchasing power, and the like, can be acquired, so that content more meeting the user requirements or more interesting to the user can be recommended to the user based on the characteristics, and further the conversion rate of the recommended content can be improved.
In some embodiments, the manner of obtaining the environment attribute information may include, but is not limited to, receiving current time information through a network and/or receiving spatial information corresponding to a presentation location of the recommended content through the network. Wherein the time information may comprise at least one of: current time of day, weather conditions, holiday information, current trending events. Spatial information may include geographic aspects of the presentation location and/or landmarks in nearby areas, such as airports, waiting rooms, business centers, and the like.
In addition to the user attributes, the present application can also obtain and analyze environment attributes and use them in deciding the recommended content, so that recommended content better suited to the current environment can be provided and the timeliness of content recommendation is improved.
In step 102, recommended content is synthesized based on the user attribute information and/or the environment attribute information.
In embodiments of the present application, content is divided into multiple elements, which may be combined in various ways to produce different content. In this way, the content recommendation system can generate rich content by storing various content elements without storing a large amount of fixed content in advance, and thus the changed content can also be called dynamic content.
In some embodiments, step 102 may include step 1021: candidate content elements are determined based on the user attribute information and/or the environment attribute information.
In this embodiment, the candidate content elements may include candidate object elements and candidate scene elements. In some alternative implementations, the object element may be a commodity and the scene element may be an advertising element. Accordingly, a candidate object element may be a candidate commodity and a candidate scene element may be a candidate advertising element. Object elements and scene elements may have a variety of attributes. The attributes of the object elements may include, but are not limited to, at least one of the category, price, color, and brand of the commodity, and the attributes of the scene elements may include, but are not limited to, at least one of a visual style, a story line, a suitable item, a person, a time, a place, and a score.
In some implementations, a recommendation model set may be used to determine the candidate content elements based on the user attribute information and/or the environment attribute information obtained in step 101. The recommendation model set may include at least one of: a first recommendation model set for recommending object elements and/or scene elements according to the user attribute information; a second recommendation model set for jointly recommending object elements and/or scene elements; and a third recommendation model set for recommending object elements and/or scene elements according to the environment attribute information.
In a further implementation, the first recommendation model set may be a set of sub-models of the interest relationships between the user attribute information and the attributes of the object elements and/or the attributes of the scene elements, and may include at least one such sub-model. The second recommendation model set may be a set of sub-models of the interest relationships among the attributes of object elements and/or the attributes of scene elements, and may include at least one such sub-model. The third recommendation model set may be a set of sub-models of the interest relationships between the environment attribute information and the attributes of the object elements and/or the attributes of the scene elements, and may include at least one such sub-model.
Taking the object element as a commodity and the scene element as an advertisement element as an example, each sub-model in the first recommendation model set can represent a mapping relation for recommending each commodity or each advertisement element according to the user attribute; each sub-model in the second set of recommendation models may represent a mapping relationship in which different goods/advertisement elements are jointly recommended, and each sub-model in the third set of recommendation models may represent a mapping relationship in which each goods or each advertisement element is recommended according to an environmental attribute.
In some optional implementation manners of this embodiment, after the recommendation model set is determined, at least one sub-model in the recommendation model set may be used to recommend the content related to the user attribute and the environment attribute.
Referring to fig. 3, an exemplary flow diagram for determining candidate content elements is shown, according to one embodiment of the present application. In an embodiment corresponding to fig. 3, the method for determining candidate content elements by using the recommendation model set may include:
Step 301, the sub-models in the recommendation model set are trained based on interestingness statistics to determine the parameters of the sub-models.
As described above, each recommendation model set may include a set of sub-models, and the sub-models may represent mapping relationships between user attributes and object elements, between user attributes and scene elements, between different object elements, between different scene elements, between environment attributes and object elements, and between environment attributes and scene elements. In this embodiment, the mapping relationships may be derived from interestingness statistics. Specifically, each sub-model may be trained on interestingness statistics to obtain its parameters. The interestingness statistics may include: statistics on the interest that users with given user attribute information show in attributes of object elements and/or attributes of scene elements; statistics on the interest relationships among attributes of different object elements, among attributes of different scene elements, and between attributes of object elements and attributes of scene elements; and statistics on the interest in attributes of object elements and/or attributes of scene elements under given environment attribute information.
Interestingness statistics may be quantified into a number of levels or as normalized values. They may be obtained from the data of an online shopping website. For example, the browsing volume and purchase volume of a commodity on the website can be tallied against the age groups of the users who browsed or purchased it, yielding interestingness statistics of users of different age groups for that commodity. As another example, the interestingness statistics between refrigerators and washing machines, and between a refrigerator brand and a washing machine brand, can be calculated by counting the number of users who purchase several commodities at the same time (for example, the number of users who purchase a given refrigerator brand and a given washing machine brand at the same time). Interestingness statistics may also be obtained through questionnaires. For example, a targeted questionnaire can be designed to measure the interest of users of different ages, personalities, and purchasing power in different brands of commodities. Interestingness statistics may further be set empirically; for example, the normalized interestingness of women in cosmetics can be set to 0.8 and that of men to 0.2.
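One way such statistics might be tallied from shopping-site logs is sketched below. It assumes, purely for illustration, that interestingness is taken to be the fraction of browsing events that ended in a purchase, which is naturally normalized to the range 0 to 1; the description does not fix a particular formula, and the event format and the name `interestingness_by_age` are hypothetical.

```python
from collections import Counter

def interestingness_by_age(events):
    """Return {(age_group, brand): purchases / views} from browsing events.

    Each event is a tuple (age_group, brand, purchased).
    """
    views, purchases = Counter(), Counter()
    for age_group, brand, purchased in events:
        views[(age_group, brand)] += 1
        if purchased:
            purchases[(age_group, brand)] += 1
    return {key: purchases[key] / views[key] for key in views}

# Hypothetical browsing log.
events = [
    ("0-3", "Disney", True), ("0-3", "Disney", True),
    ("0-3", "Disney", False), ("0-3", "Disney", True),
    ("10-20", "Disney", False),
]
stats = interestingness_by_age(events)
print(stats[("0-3", "Disney")])
```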
Table 1 shows, in tabular form, an example of interestingness statistics of age (a user attribute) for brand (an attribute of object elements), with the interestingness normalized to the range 0 to 1.
Table 1: Interestingness statistics of age for brand

| Age | Disney | Gap | Eland | …… |
| --- | --- | --- | --- | --- |
| 0-3 | 0.8 | 0.6 | 0.0 | …… |
| 3-5 | 0.6 | 0.5 | 0.0 | …… |
| 5-10 | 0.8 | 0.6 | 0.3 | …… |
| 10-20 | 0.3 | 0.8 | 0.8 | …… |
| …… | …… | …… | …… | …… |
As can be seen from table 1, the table records the interest of users of each age group in commodity brands. Similar tables can be compiled for the interest of other user attribute information in different commodity attributes or advertisement elements, the interest relationships between different commodities, the interest relationships between different advertisement elements, and the interest of environment attribute information in different commodities or advertisement elements.
Optionally, in order to make the trained sub-model adaptive to the environment change, the corresponding sub-model may be updated based on the new object elements and scene elements. For example, interestingness statistics and sub-models related to a brand may be updated based on the new brand's goods. In addition, the interest degree statistical data can be updated in a certain time period, and the updated interest degree statistical data is adopted to train the corresponding sub-model to obtain the updated sub-model. The sub-models may be updated, for example, quarterly based on feedback information from merchants.
Step 302, a global energy function is established based on the recommendation model set.
After the sub-models in the recommendation model set are obtained through training, object elements and scene elements meeting requirements can be searched from the object element database and the scene element database according to certain rules. The following function may be established based on the first, second, and third sets of recommendation models, and the object element and the scene element may be determined based on equation (1).
$$\mathrm{productSet}^{*} = \underset{\mathrm{productSet}}{\arg\min}\; E(\mathrm{productSet} \mid \mathrm{models}, \mathrm{userSet}, \mathrm{context}) \tag{1}$$
where $\mathrm{productSet}$ represents a set of object elements or scene elements, $\mathrm{productSet}^{*}$ represents the determined set of candidate object elements or candidate scene elements, and $\mathrm{models}$ represents the recommendation model sets, $\mathrm{models} = \{\mathrm{model}_1, \mathrm{model}_2, \mathrm{model}_3\}$, in which $\mathrm{model}_1$ represents the first recommendation model set, $\mathrm{model}_2$ the second recommendation model set, and $\mathrm{model}_3$ the third recommendation model set. $\mathrm{userSet}$ represents the set of user attribute information, $\mathrm{context}$ represents the environment attribute information, and $E(\cdot)$ represents the global energy function.
Determining the recommended content thus amounts to selecting from the database the object elements and/or scene elements that minimize the energy function. The global energy function may be defined as equation (2):
In equation (2), $\mathrm{productSet} = \{\mathrm{product}_j\}$ and $\mathrm{userSet} = \{\mathrm{user}_i\}$, where $i$, $j$, $j_1$, and $j_2$ are positive integers; $\mathrm{product}_j$, $\mathrm{product}_{j_1}$, and $\mathrm{product}_{j_2}$ represent object elements or scene elements; and $\mathrm{user}_i$ represents user attribute information. $\alpha_1$, $\alpha_2$, and $\alpha_3$ denote weight coefficients, which can be set empirically or obtained by training.
As shown in equation (2), the global energy function may include a first energy function $E_1(\cdot)$, a second energy function $E_2(\cdot)$, and a third energy function $E_3(\cdot)$. The first energy function may be an energy function that recommends an object element or a scene element according to the user attribute information, and specifically, the first energy function may be calculated according to equation (3):
where $i$, $j$, $p$, and $q$ are positive integers, $\mathrm{product}_j\,\mathrm{profile}_p$ represents the $p$-th attribute of the $j$-th object element/scene element, $\mathrm{user}_i\,\mathrm{profile}_q$ represents the $q$-th attribute of the $i$-th user, and $\beta_{(p,q)}$ represents a weight coefficient. Evaluating the first energy function may include: using a classifier and/or a regressor, based on the first recommendation model set, to calculate the recommendation probabilities of the attributes of the object elements and/or the attributes of the scene elements corresponding to the user attribute information.
The second energy function may be an energy function in which different object elements or different scene elements are recommended in common, and specifically, the second energy function may be calculated according to equation (4):
where $j_1$, $j_2$, $p$, and $q$ are positive integers, $\mathrm{product}_{j_1}\,\mathrm{profile}_p$ represents the $p$-th attribute of the $j_1$-th object element/scene element, $\mathrm{product}_{j_2}\,\mathrm{profile}_q$ represents the $q$-th attribute of the $j_2$-th object element/scene element, and $\beta_{(p,q)}$ represents a weight coefficient. Evaluating the second energy function may include: using a classifier and/or a regressor, based on the second recommendation model set, to calculate the probability that the attributes of the object elements and/or the attributes of the scene elements are jointly recommended.
The third energy function may be an energy function that recommends an object element or a scene element according to the environment attribute information, and specifically, the third energy function may be calculated according to equation (5):
where $i$, $j$, $p$, and $q$ are positive integers, $\mathrm{product}_j\,\mathrm{profile}_p$ represents the $p$-th attribute of the $j$-th object element/scene element, $\mathrm{context}\,\mathrm{profile}_q$ represents the $q$-th attribute in the environment attribute information, and $\gamma_{(p,q)}$ represents a weight coefficient. Evaluating the third energy function may include: using a classifier and/or a regressor, based on the third recommendation model set, to calculate the recommendation probabilities of the attributes of the object elements and/or the attributes of the scene elements corresponding to the environment attribute information.
Continuing with fig. 3, in step 303, a global optimization solution is performed on the global energy function to obtain candidate content elements that optimize the global energy function.
In this embodiment, the recommended content may be determined based on the energy function described above. Specifically, the global energy function may be globally optimized according to equation (1). Global optimization methods may include genetic algorithms, linear programming, simulated annealing, and the like. After $\mathrm{productSet}^{*}$ in equation (1) is solved for, the candidate object elements and the candidate scene elements are obtained.
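The minimization in equation (1) can be illustrated on a toy problem. The sketch below is built on stated assumptions rather than on the exact energy functions of the description: it takes the global energy to be a weighted sum of three terms, treats each term as the negated sum of looked-up interestingness values (so that higher interest means lower energy), and uses exhaustive search over fixed-size subsets as a stand-in for the genetic, linear programming, or simulated annealing solvers mentioned above. All names and numbers are hypothetical.

```python
from itertools import combinations

def global_energy(product_set, users, context, models, alphas=(1.0, 1.0, 1.0)):
    """Assumed form: weighted sum of user, joint, and context energy terms."""
    user_interest, joint_interest, context_interest = models
    a1, a2, a3 = alphas
    e1 = -sum(user_interest.get((u, p), 0.0) for u in users for p in product_set)
    e2 = -sum(joint_interest.get(frozenset(pair), 0.0)
              for pair in combinations(product_set, 2))
    e3 = -sum(context_interest.get((context, p), 0.0) for p in product_set)
    return a1 * e1 + a2 * e2 + a3 * e3

def best_product_set(candidates, size, users, context, models):
    """Exhaustively search all subsets of the given size for minimum energy."""
    return min(combinations(candidates, size),
               key=lambda s: global_energy(s, users, context, models))

# Hypothetical interestingness tables standing in for the three model sets.
user_interest = {("young_woman", "down_jacket"): 0.6,
                 ("young_woman", "t_shirt"): 0.7,
                 ("young_woman", "scarf"): 0.5}
joint_interest = {frozenset({"down_jacket", "scarf"}): 0.8,
                  frozenset({"t_shirt", "scarf"}): 0.2}
context_interest = {("winter", "down_jacket"): 0.9,
                    ("winter", "scarf"): 0.7,
                    ("winter", "t_shirt"): 0.1}
models = (user_interest, joint_interest, context_interest)

best = best_product_set(["down_jacket", "t_shirt", "scarf"], 2,
                        ["young_woman"], "winter", models)
print(best)
```

Exhaustive search is only practical for small candidate pools; for realistic databases one of the heuristic solvers named in the text would take its place.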
In some embodiments, the candidate content elements may be determined based on one of the first, second, and third energy functions. For example, based on the first energy function, wedding apparel may be recommended to couples, and colorful, eye-catching apparel may be recommended to young, extroverted women; based on the second energy function, a refrigerator and a television, a lipstick and an eyebrow pencil, or a crib and a milk bottle may be jointly recommended; and based on the third energy function, down jackets may be recommended on a snowy winter day, and travel information may be recommended on a billboard at an airport. In some implementations, the recommended content may be determined by combining two or all three of the energy functions. For example, matching couple T-shirts can be recommended to couples in summer, matching couple down jackets can be recommended to couples in winter, couple watches and couple rings can be recommended to couples together, and the like.
In practical applications, when the recommended content is an advertisement, a group of preferred jointly recommended commodity sets and advertisement element sets can be obtained after global optimization solution is performed on a global energy function.
In the method for determining candidate content elements provided by the embodiment, a plurality of recommended object elements and scene elements can be selected according to the interestingness and tendency of the user and/or the environmental information, so that richer recommended content can be provided. For example, when the advertisement is recommended, various advertisement elements and scene elements meeting the requirements and preferences of the user can be obtained, so that the advertisement content is rich, and the utilization rate and the delivery effect of the billboard are improved. In addition, more vivid advertising elements can be provided, and user experience is improved.
Returning to fig. 1, step 102 may further include step 1022 of synthesizing the recommended content based on the candidate object element and the candidate scene element.
In this embodiment, after determining the candidate object element and the candidate scene element in step 1021, the candidate object element may be fused, the candidate scene elements may be combined, and the recommended content may be generated by combining the candidate object element and the candidate scene element. Each candidate object element may first be fused with a respective candidate scene element, followed by combining a plurality of candidate scene elements.
In some implementations, the candidate object element and the candidate scene element may be fused based on a preset rule. With further reference to FIG. 4, an exemplary flow diagram of a method of composing recommended content is shown, according to one embodiment of the present application.
As shown in fig. 4, in step 401, a placement index, a direction index, and a motion trajectory index of a candidate scene element are obtained.
In this embodiment, a scene element generally has a transparent background or specific placement positions for placing object elements. A placement index may be built for these positions to specify the types of object elements that may be placed at each position. For example, a vehicle can be placed on a road, and a watch can be placed on a wrist. Further, a direction index may be established for a specific position to indicate the orientation of the object element. For example, the orientation of a vehicle may be determined according to the direction of the road, and the orientation of a watch may be determined according to the posture of the person's upper arm. Further, when the candidate scene element is a dynamic element, such as a video, a motion trajectory index may also be established to indicate the motion direction and route of the object element. For example, a road direction index may be included in a road scene so that the vehicle travels along the direction of the road.
Before the candidate object element and the candidate scene element are fused, the above index information of the candidate scene element, including the placement index, the direction index, and the motion trajectory index, may first be obtained. The index information may be retrieved directly from a database. Alternatively, image analysis and video analysis may be performed on the scene element: features relevant to placing candidate object elements are extracted and passed to a trained model to determine the placement index of the scene, and position features, direction features, and motion trajectory features are extracted from the scene element to obtain the direction index and the motion trajectory index.
In step 402, candidate object elements and candidate scene elements are fused according to the placement index, the direction index and the motion trail index to generate candidate recommended content.
When synthesizing the candidate content elements, the candidate object elements may be placed at specific positions in the candidate scene elements according to the placement index, then the candidate object elements are rotated according to the direction index, and then the candidate object elements are moved according to the motion trajectory index to synthesize the complete candidate recommended content.
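The place, rotate, and move sequence described above can be sketched as follows. The SceneSlot structure, its field names, and the simplification that the orientation stays constant along the trajectory are hypothetical and serve only to illustrate how the three indexes might be consumed.

```python
from dataclasses import dataclass, field


@dataclass
class SceneSlot:
    """Hypothetical index entry for one placement position in a scene element."""
    allowed_types: set        # placement index: object types that fit here
    position: tuple           # (x, y) anchor of the placement position
    direction_deg: float      # direction index: orientation of the object
    trajectory: list = field(default_factory=list)  # motion trajectory index


def fuse(object_type, slot):
    """Return one (x, y, angle) pose per step, or None if the object does not fit."""
    if object_type not in slot.allowed_types:
        return None
    # Place the object at the slot position, orient it by the direction index,
    # then follow the trajectory waypoints if the scene is dynamic.
    path = [slot.position] + list(slot.trajectory)
    return [(x, y, slot.direction_deg) for x, y in path]


road = SceneSlot({"vehicle"}, (0, 0), 90.0, [(0, 5), (0, 10)])
print(fuse("vehicle", road))
print(fuse("watch", road))
```

A static scene element simply has an empty trajectory, in which case the result contains a single pose.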
In step 403, candidate recommended contents are filtered based on the correlation among the attributes of the candidate scene elements, the correlation among the attributes of the candidate object elements, and the correlation between the attributes of the candidate scene elements and the candidate object elements, and the filtered candidate recommended contents are fused to generate recommended contents.
In this embodiment, a plurality of candidate recommended contents may have a certain relevance, such as a temporal relevance, a spatial relevance, a person relevance, an event relevance, and an attribute relevance. According to the relevance, a plurality of candidate recommended contents with strong relevance can be screened, the candidate recommended contents irrelevant to other candidate recommended contents are filtered, and the screened candidate recommended contents are integrated into smooth and coherent recommended contents.
The association between the candidate recommended contents may be determined based on the association between the attributes of different candidate object elements included in the candidate recommended contents, between the attribute of the candidate object element and the attribute of the candidate scene element, and between the attributes of different candidate scene elements. Therefore, in the present embodiment, candidate recommended contents containing corresponding candidate object elements or candidate scene elements may be screened according to the correlation between the attributes of different candidate object elements, the correlation between the attribute of a candidate object element and the attribute of a candidate scene element, and the correlation between the attributes of different candidate scene elements.
The correlation among the attributes of the different candidate object elements, the correlation between the attributes of the candidate object elements and the attributes of the candidate scene elements, and the correlation among the attributes of the different candidate scene elements can be obtained by a model training method, or can be set manually according to experience. The association may be represented as a quantized numerical value. Alternatively, the attributes of the candidate object elements and the attributes of the candidate scene elements may be vectorized and the distance between the attribute vectors calculated; the smaller the distance, the stronger the relevance. The relevance between the attributes of the object elements and the attributes of the scene elements contained in all the candidate recommended contents is then calculated, and the candidate recommended contents that have the weakest relevance, or no relevance, to the other candidate recommended contents are filtered out.
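The distance-based screening can be sketched as follows. The use of Euclidean distance, the max_distance threshold, and the example attribute vectors are assumptions; this application does not fix a particular metric or threshold.

```python
import math


def filter_weakly_related(contents, max_distance):
    """Keep a candidate only if some other candidate lies within max_distance.

    contents maps a candidate name to its attribute vector (hypothetical
    encoding). A smaller distance is treated as a stronger relevance.
    """
    if len(contents) <= 1:
        # A single candidate is used as the recommended content directly.
        return dict(contents)
    kept = {}
    for name, vec in contents.items():
        nearest = min(math.dist(vec, other)
                      for other_name, other in contents.items()
                      if other_name != name)
        if nearest <= max_distance:
            kept[name] = vec
    return kept


candidates = {
    "car_morning": (1.0, 0.0),
    "watch_noon": (1.0, 0.5),
    "crib_night": (9.0, 9.0),
}
print(sorted(filter_weakly_related(candidates, 1.0)))
```

Here the candidate far from all others is dropped, while the two close candidates are kept for later concatenation.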
After screening, a plurality of candidate recommended contents with strong relevance can be obtained, and the screened candidate recommended contents are connected in series according to a time sequence, a spatial position relation, or an event state to generate the complete recommended content.
It is understood that if only one candidate recommended content is obtained in step 1021, or there is no correlation between multiple candidate recommended contents, one candidate recommended content may be taken as the recommended content.
For example, when an advertisement is recommended, the correlation between commodity elements and advertisement elements may be calculated. If two advertisement elements with similar video styles are highly correlated, the two advertisement elements may be placed in the same advertisement. As another example, if two advertisement elements share the same scene, their time attributes are morning and noon respectively, and the goods they contain are a car and a watch, then the car advertisement whose advertisement element has the time attribute morning and the advertisement whose advertisement element has the time attribute noon can be connected in series to form a time-continuous video advertisement.
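Connecting screened candidates in series by their time attribute can be sketched as a simple sort. The TIME_ORDER table and the clip names below are hypothetical.

```python
# Hypothetical ordering of time attributes within one day.
TIME_ORDER = {"morning": 0, "noon": 1, "afternoon": 2, "evening": 3}


def concatenate_by_time(segments):
    """Order (time_attribute, clip_name) pairs chronologically.

    Returns the clip names in playback order.
    """
    ordered = sorted(segments, key=lambda seg: TIME_ORDER[seg[0]])
    return [clip for _, clip in ordered]


print(concatenate_by_time([("noon", "watch_ad"), ("morning", "car_ad")]))
```

The same pattern applies to a spatial position relation or an event state by replacing the sort key.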
It should be noted that the steps of the exemplary method for synthesizing recommended content described with reference to fig. 4 may also be performed in a different order, with the screening step performed first. In that case, the candidate object elements and the candidate scene elements are first screened based on the correlation among the attributes of the candidate scene elements, the correlation among the attributes of the candidate object elements, and the correlation between the attributes of the candidate scene elements and the candidate object elements. The placement index, the direction index, and the motion trajectory index of the candidate scene elements are then obtained, the candidate object elements and candidate scene elements with high correlation are fused into candidate recommended contents, and the candidate recommended contents are concatenated according to the time attributes of the candidate scene elements to form the complete recommended content.
In the content recommendation method provided by the embodiment of the application, the candidate object element and the candidate scene element are determined based on the user attribute information and the environment attribute information, and the recommended content is then synthesized from the candidate object element and the candidate scene element. More information can thus be provided in the recommended content, which improves the utilization rate of the display position of the recommended content. In addition, more targeted personalized content can be provided, which improves the conversion rate of the recommended content.
The method provided by this embodiment can be used in an intelligent advertisement recommendation system. The system can acquire an image of the area in front of a billboard through a camera arranged on the billboard, perform human body detection on the image to detect a target user A, and then determine through focus detection that the target user A is paying attention to the content on the billboard. The system can then perform individual analysis on the target user A. Suppose the analysis result is that the target user A is male, 40-50 years old, and wearing a high-grade dark blue suit and black leather shoes, and that facial feature analysis indicates a sense of responsibility, confidence, strong emotional stability, introversion, and strong purchasing power. Commodities such as a black business car, a dark high-grade POLO shirt, and a high-grade watch of a certain brand can then be recommended to the target user A, and the recommended advertisement elements can include classical-style music, a high-grade home scene, an office scene in a business office building, urban road conditions, and the like. Finally, the commodities and advertisement elements are fused according to the relevance between them. The generated advertisement may be a video in which a man puts on a blue high-grade POLO shirt and a high-grade watch in the morning and drives a black business car to a business office building. Shots of the man attending a meeting while wearing the watch and driving away at sunset may also be inserted into the video.
The intelligent advertisement recommendation system may also recommend advertisements for user groups that include multiple users. For example, six target users A, B, C, D, E, and F may be detected by analyzing the image, and it may be determined by focus detection that all six target users are paying attention to the content on the billboard. The system may first perform individual analysis on the six target users, and may then perform group analysis based on the results of the individual analysis and on the postures and relative positions of the six target users. The analysis result may be that A, B, and C are most likely a family of three and form user group 1, that D and E are most likely a couple and form user group 2, and that F forms user group 3. The system can then generate three advertisements and recommend them to user groups 1, 2, and 3 respectively.
In some embodiments, the content recommendation method may further include:
Step 103: presenting the recommended content in a time-division presentation manner or a space-division presentation manner.
If there are multiple users or user groups to be served, as acquired in step 101, the generated recommended content needs to be presented to them in an appropriate manner. Taking the display of recommended content on an electronic display screen as an example, in this embodiment the recommended content may be displayed in a time-division presentation manner or a space-division presentation manner. The time-division presentation manner is suitable for electronic screens with a smaller screen area, and the space-division presentation manner is suitable for screens with a larger screen area or for curved screens.
In some implementations, presenting the recommended content in a space-division presentation manner may be performed as follows: the display position of the recommended content is first divided into a number of sub-areas equal to the number of users or user groups, and the recommended content corresponding to each user or user group is then displayed in the sub-area where the gaze focus of that user or user group is located. It should be noted that, when detecting a user's gaze focus, pupil position detection and depth detection can be performed on the face image to determine the position of the area on the screen that the user is looking at. Further, the user's field of view can be determined from the user's pupil position, so that the size of the displayed recommended content can be determined.
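A minimal sketch of the space-division layout follows. It assumes a one-dimensional horizontal split into equal-width sub-areas, a gaze focus already reduced to an x coordinate on the screen, and a first-come rule when two users look at the same sub-area; none of these choices is prescribed by this application, and all function and user names are hypothetical.

```python
def split_screen(width, n):
    """Divide the range [0, width) into n equal sub-areas as (start, end) pairs."""
    step = width / n
    return [(i * step, (i + 1) * step) for i in range(n)]


def subarea_for_gaze(gaze_x, width, n):
    """Index of the sub-area containing gaze_x, clamped to the screen."""
    idx = int(gaze_x * n / width)
    return min(max(idx, 0), n - 1)


def assign_content(gaze_by_user, content_by_user, width):
    """Map each sub-area index to the content of the user looking at it."""
    n = len(gaze_by_user)
    layout = {}
    for user, gaze_x in gaze_by_user.items():
        # Assumption: the first user seen in a sub-area keeps it.
        layout.setdefault(subarea_for_gaze(gaze_x, width, n),
                          content_by_user[user])
    return layout


gaze = {"user_503": 100, "user_504": 500, "user_505": 800}
ads = {"user_503": "jacket_ad", "user_504": "watch_ad", "user_505": "car_ad"}
print(assign_content(gaze, ads, width=900))
```

Re-running assign_content with updated gaze positions gives the adjusted layout when a user's gaze focus moves to a different sub-area.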
In a further implementation, the gaze focus position of the user may also be tracked, and the display position of the recommended content of each user or each group of users may be adjusted according to the change of the gaze focus position of the user. If the user or the user group is in a moving state, the position change of the user or the user group can be tracked through pedestrian detection, or when the user or the user group is in a static state but the attention area changes, the change of the focus position of the sight line of the user can be detected and determined in real time based on the pupil position, so that the change state of the position concerned by the user can be acquired. In this case, the display position of the recommended content may be adjusted so that the content recommended for the user or the user group may be projected within the visual field of the user or the user group all the time.
With further reference to fig. 5a, a schematic diagram illustrating an effect of presenting recommended content in a space-division presentation manner is shown. In the scenario of fig. 5a, a billboard in a mall hall or hotel hall may be used to recommend advertisements to customers. In fig. 5a, the display screen for presenting recommended content is a cylindrical screen 510. The system detects target users 501 and 502 to be served, where the gaze focus of user 501 is located in area 511 and the gaze focus of user 502 is located in area 512. If individual analysis of user 501 and user 502 shows that user 501 has a high degree of interest in and demand for watches and that user 502 has a high demand for cars, a watch advertisement and a car advertisement can be displayed in areas 511 and 512 respectively, so that different personalized advertisements are recommended to different users at different positions on the cylindrical screen.
With further reference to fig. 5b, a schematic diagram of another effect of presenting recommended content in a space-division presentation manner is shown. The presentation position shown in fig. 5b may be a flat display screen in a transfer passage, such as one in a subway station. Such screens may be laid along the walls, and personalized advertisements may be presented to users on the wall screens as they transfer. As shown in fig. 5b, the screen 520 may be divided into a plurality of sub-areas. The gaze focuses of user 503, user 504, and user 505 are in area 521, area 522, and area 523, respectively. The content recommended to each user may be presented in the corresponding area. For example, a jacket and skirt advertisement is recommended to user 503, a watch advertisement to user 504, and a car advertisement to user 505. The gaze focus position of each user can also be tracked in real time as the user moves, and the position of the displayed recommended content adjusted accordingly. For example, when the gaze focus of user 505 moves to area 522, the content presented in area 522 may be switched to the car advertisement. When users overlap, the recommended content corresponding to the user whose view is unobstructed can be displayed on the screen.
The screen used for time-division presentation may be a grating (raster) display screen, on which different recommended contents are switched by fast movement of the grating.
In some implementations, presenting the recommended content in a time-division presentation manner may be performed by switching, at a certain time interval and at the presentation position of the recommended content, among the at least one recommended content corresponding to each user or user group. The time interval may be the persistence-of-vision time of the human eye, so that a plurality of recommended contents can be displayed by exploiting the persistence of vision.
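The switching at a fixed time interval can be sketched as a round-robin schedule. The function name, the content labels, and the 50 ms interval used in the example are illustrative assumptions; this application does not state a numerical value for the interval.

```python
from itertools import cycle, islice


def time_division_schedule(contents, interval_ms, n_slots):
    """Return (start_time_ms, content) pairs that cycle through the contents.

    Each content occupies one slot of interval_ms milliseconds before the
    display switches to the next one.
    """
    slots = islice(cycle(contents), n_slots)
    return [(i * interval_ms, content) for i, content in enumerate(slots)]


# Illustrative interval only; the actual value depends on the display.
for start, content in time_division_schedule(["ad_group_1", "ad_group_2"], 50, 4):
    print(start, content)
```

In a grating-based display, each slot would also correspond to one grating position, so that each user or user group sees only its own content.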
In a further implementation, the gaze focus position of the user may also be tracked, and the presentation angle of the recommended content of each user or user group may be adjusted according to changes in the gaze focus position. Specifically, while the recommended content is being presented, the focus position and depth information of the user are detected in real time through focus detection to determine changes in the user's field of view, and the direction of the grating is then adjusted based on those changes so that the recommended content always remains within the user's field of view.
With further reference to fig. 6, a schematic diagram of the principle of presenting recommended content in a time-division presentation manner is shown. As shown in fig. 6, 602 is a camera for capturing images, and a light-emitting diode (LED) projection array 601 presents images or video to users via a grating display screen 603. When the grating is moved to a certain position, a user whose left and right eyes are located in regions 611 and 612 respectively can see a first type of image containing the content recommended for that user; when the grating is moved to another position, a user whose left and right eyes are located in regions 613 and 614 respectively can see a second type of image containing the content recommended for that user. The camera 602 may detect changes in the user's gaze focus position in real time, and the angle of the grating is adjusted as the user's gaze moves, ensuring that the displayed recommended content is always within the user's field of view.
With the method for displaying recommended content in a space-division or time-division presentation manner, multiple personalized recommended contents can be displayed to multiple users or user groups at the same time, which improves the utilization rate of the display position of the recommended content. The display position can also be adjusted automatically through focus detection, making content recommendation more intelligent.
Further referring to fig. 7, a schematic structural diagram of a content recommendation device according to an embodiment of the present application is shown.
As shown in fig. 7, the content recommendation device 700 may include an acquisition unit 701 and a synthesis unit 702. The acquisition unit 701 may be configured to acquire user attribute information and/or environment attribute information. The synthesis unit 702 may be configured to synthesize recommended content based on the user attribute information and/or the environment attribute information. In some embodiments, the synthesis unit 702 may include a determination subunit 7021 and a synthesis subunit 7022. The determination subunit 7021 is configured to determine candidate content elements based on the user attribute information and/or the environment attribute information acquired by the acquisition unit 701, where the candidate content elements may include candidate object elements and candidate scene elements. The synthesis subunit 7022 may be configured to synthesize the recommended content based on the candidate object elements and the candidate scene elements determined by the determination subunit 7021.
In this embodiment, the acquisition unit 701 may extract user attribute information based on an image captured by a camera. The user attribute information may include individual attribute information and group attribute information. The individual attribute information may be information obtained by analyzing the individual characteristics of each user, and may include information such as the user's gender, age, race, dress style, makeup style, health status, and posture. The group attribute information may be information obtained based on relationships among a plurality of users, and may be social relationship information of the users, including family relationships, couple relationships, friendships, and the like. In some implementations, the user attribute information may be obtained using a variety of classifiers or regressors.
In some implementations, the manner in which the acquisition unit 701 acquires the environment attribute information may include, but is not limited to, receiving current time information through a network and/or receiving, through the network, spatial information corresponding to the presentation location of the recommended content.
In some implementations, the determination subunit 7021 may determine the candidate content elements by constructing a global energy function using a recommendation model set, based on the user attribute information and the environment attribute information acquired by the acquisition unit 701, and then solving the global energy function by optimization. The recommendation model set may include at least one of a first recommendation model set that recommends object elements and/or scene elements according to the user attribute information, a second recommendation model set in which object elements and/or scene elements are jointly recommended, and a third recommendation model set that recommends object elements and/or scene elements according to the environment attribute information.
In a further implementation, the first recommendation model set may be a set of sub-models of interest relationships between the user attribute information and attributes of object elements and/or attributes of scene elements, and may include at least one such sub-model. The second recommendation model set may be a set of sub-models of interest relationships among attributes of object elements and/or attributes of scene elements, and may include at least one such sub-model. The third recommendation model set may be a set of sub-models of interest relationships between the environment attribute information and attributes of object elements and/or attributes of scene elements, and may include at least one such sub-model. The interest relationship may be represented by interestingness statistics. Methods for obtaining the interestingness statistical data include, but are not limited to, empirical setting, questionnaire statistics, and statistics of online shopping website data.
In some implementations, the synthesizing subunit 7022 may obtain a placement index, a direction index, and a motion trajectory index of the candidate scene element, fuse the candidate object element and the candidate scene element according to the indexes, generate the candidate recommended content, then screen the candidate recommended content according to the relevance between the candidate object element and the candidate scene element, the relevance between different candidate scene elements, and the relevance between different candidate object elements, and finally concatenate the screened candidate recommended contents according to a time sequence to form the complete and smooth recommended content.
In some embodiments, the content recommendation device may further include a presentation unit 703. The presentation unit 703 may be configured to present the recommended content synthesized by the synthesis unit 702 in a time-division presentation manner or a space-division presentation manner. In the time-division presentation manner, recommended content can be presented on a display screen provided with a movable grating; by changing the angle of the grating and exploiting the persistence of vision of the human eye, the recommended contents corresponding to a plurality of users or user groups are switched within the persistence-of-vision time. In the space-division presentation manner, recommended content can be presented on a large-area screen or a curved screen; the screen is divided into a plurality of sub-areas, and the corresponding recommended content is presented in the sub-area that each user or user group is looking at. In a further implementation, changes in the gaze focus of the user or the user group can be tracked in real time, and the display position or angle of the recommended content can be adjusted in real time.
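The relationship between the units described above can be sketched as a small pipeline. The class name, field names, and callable signatures are hypothetical and only mirror the division into an acquisition unit, a determination subunit, a synthesis subunit, and an optional presentation unit; they are not an interface defined by this application.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ContentRecommendationDevice:
    """Hypothetical wiring of the units of device 700 as plain callables."""
    acquire: Callable[[], dict]               # acquisition unit 701
    determine: Callable[[dict], tuple]        # determination subunit 7021
    synthesize: Callable[[list, list], str]   # synthesis subunit 7022
    present: Callable[[str], None] = print    # presentation unit 703 (optional)

    def run(self):
        # Acquire attributes, determine candidate elements, synthesize, present.
        attributes = self.acquire()
        objects, scenes = self.determine(attributes)
        content = self.synthesize(objects, scenes)
        self.present(content)
        return content


device = ContentRecommendationDevice(
    acquire=lambda: {"season": "winter"},
    determine=lambda attrs: (["down_jacket"], ["snowy_street"]),
    synthesize=lambda objs, scenes: f"{objs[0]} in {scenes[0]}",
)
device.run()
```

Passing the units in as callables keeps the sketch independent of how each unit is actually implemented, for example with a camera-based classifier or a database lookup.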
The content recommendation device provided by the embodiment of the application can provide more targeted personalized content, provide more information in recommended content, and improve the utilization rate of the recommended content display position and the conversion rate of the recommended content.
It should be understood that the units described in the content recommendation device 700 correspond to the steps of the method described with reference to figs. 1-6. Thus, the operations and features described above for the method also apply to the content recommendation device 700 and the units included therein, and are not described herein again.
With further reference to fig. 8, a schematic structural diagram of a content recommendation system according to an embodiment of the present application is shown.
The content recommendation system 800 may include at least a processor 801 and a display device 802. The processor 801 may comprise the content recommendation device 700 described above in connection with fig. 7. The display device 802 may be configured to display the recommended content generated by the processor 801. It is to be understood that the processor may be a stand-alone processing unit for performing the content recommendation method. In some implementations, the content recommendation system may also include an input device such as a keyboard or a mouse; a memory, such as a hard disk, for storing candidate object elements and candidate scene elements; a communication unit, including a network interface card such as a LAN card, a modem, or the like, which performs communication processing via a network such as the internet; and a removable medium, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, from which a computer program is read out and installed in the memory as needed.
As another aspect, the present application also provides a computer-readable storage medium, which may be the computer-readable storage medium included in the apparatus in the above-described embodiments; or it may be a separate computer-readable storage medium that is not incorporated into the terminal device. The computer readable storage medium stores one or more programs, which may include program code for performing the methods illustrated in the flowcharts.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, apparatuses, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.