CN106294489B

CN106294489B - Content recommendation method, device and system

Info

Publication number: CN106294489B
Application number: CN201510308816.5A
Authority: CN
Inventors: 李志轩; 张文波; 李艳丽; 严超; 熊君君
Original assignee: Samsung Electronics Co Ltd
Current assignee: Samsung Electronics Co Ltd
Priority date: 2015-06-08
Filing date: 2015-06-08
Publication date: 2022-09-30
Anticipated expiration: 2035-06-08
Also published as: KR102519686B1; CN106294489A; KR20160144305A

Abstract

The present application discloses a content recommendation method, device and system. A specific implementation of the method includes: acquiring user attribute information and/or environment attribute information; and synthesizing recommended content based on the user attribute information and/or environment attribute information. This implementation can provide more information in the recommended content and improve the utilization of the recommended content placement. Moreover, more targeted personalized content can be provided, thereby increasing the conversion rate of the recommended content.

Description

Content recommendation method, device and system

Technical Field

The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, and a system for recommending content.

Background

Existing content recommendation systems (e.g., advertisement recommendation systems, information recommendation systems, etc.) recommend relevant content primarily by analyzing a detected user's interests, preferences, and areas of attention. Such a system mainly includes three modules: a user detection and feature analysis module, a recommendation module, and a display module.

In the user detection and feature analysis module, after an image is collected through a camera, a target user in the image is extracted by adopting methods such as pedestrian detection, face detection and the like, and then feature analysis is carried out on the target user. The most common features in the prior art include superficial features such as the user's gender, age, color and style of clothing, and the user's expression.

The recommendation module mainly makes its decision in one of two ways. One way is to recommend according to a preset rule. For example, when the user's gender is detected as female, the system recommends content preset as being of interest to women: it may recommend cosmetics, fashion, and other goods, and may also recommend health and beauty information. The other way is to make recommendations by means of learning. The features of the user and the features of the content to be recommended can be vectorized, the two kinds of features are associated online through machine learning, the user's features are mapped online by a trained model into features of the content to be recommended, and these features are matched in a content database, so that content with a high matching degree is used as the recommendation result. The content database stores preset contents, whose forms are fixed and whose information amount is small.

Prior-art display modules typically have only one display screen and are designed to serve only one group of users; that is, only the content recommended for one group of users is displayed during a given time period. The group may contain a plurality of individuals, in which case common characteristics of those individuals, such as a common age group or social relationship, are extracted and content is recommended based on them.

The above prior art has the following problems:

In the user detection and feature analysis module, only the shallow features of the user are analyzed and extracted. In practical applications, it is difficult to determine from these shallow features whether particular goods or information are suitable for recommendation to the user. For example, when the user is detected to be a young woman, the system may determine that the user is likely to be interested in cosmetics, but cannot determine what brand or type of cosmetics should be recommended.

In the recommendation module, preset content is stored in a content database, and recommending preset content has the following defects. First, the recommendation module may select a plurality of preset contents to recommend; the relevance among them may be poor, so that displaying them together gives the user a disjointed impression when taking in the information. Second, if the matching degree between the user's features and the features of the content to be recommended is low, the recommendation effect is poor. Third, the form of the recommended content and the information it contains are fixed in the prior art, so personalized recommended content cannot be provided to meet the requirements of different users. For example, when a watch is recommended, pairing it with different characters and music changes how users perceive the watch, and the recommendation effect is completely different.

In the display module, the prior art serves only one group of users, which makes it difficult to meet users' personalized requirements, and the amount of information in the recommended content is small. When there are too many users or their features are too complex, the system may be unable to decide on any content to recommend.

Disclosure of Invention

In view of the above-mentioned shortcomings in the prior art, it is desirable to provide a personalized content recommendation method. Further, it is also desirable that the recommended content may contain richer information for a plurality of groups of users having different characteristics. In view of this, the present application provides a content recommendation method, device and system.

In a first aspect, the present application provides a content recommendation method. The method comprises the following steps: acquiring user attribute information and/or environment attribute information; and synthesizing the recommended content based on the user attribute information and/or the environment attribute information.

In some embodiments, synthesizing the recommended content includes: determining candidate content elements based on the user attribute information and/or the environment attribute information, wherein the candidate content elements comprise candidate object elements and candidate scene elements; and synthesizing the recommended content according to the candidate object element and the candidate scene element.

In a second aspect, the present application provides a content recommendation device. The device includes: an acquisition unit configured to acquire user attribute information and/or environment attribute information; and a synthesizing unit configured to synthesize the recommended content based on the user attribute information and/or the environment attribute information.

In some embodiments, the synthesizing unit comprises: a determination subunit configured to determine candidate content elements based on the user attribute information and/or the environment attribute information, wherein the candidate content elements comprise candidate object elements and candidate scene elements; and a synthesizing subunit configured to synthesize the recommended content according to the candidate object element and the candidate scene element.

In a third aspect, the present application provides a content recommendation system. The system includes a processor and a display device; wherein the display device is configured to display the recommended content; the processor comprises a content recommendation device according to the second aspect of the present application.

The content recommendation method, device and system provided by the application synthesize or generate recommended content based on the user attribute and the environment attribute. Personalized content can be automatically recommended, the recommended content contains more information, and the pertinence of the content recommendation system is improved. Meanwhile, the content which can not be directly judged from the shallow feature of the user and meets the requirement and interest of the user can be recommended, and the utilization rate of the content recommendation system is improved.

Drawings

Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, with reference to the accompanying drawings in which:

FIG. 1 shows an exemplary flow diagram of a content recommendation method according to one embodiment of the present application;

FIG. 2 shows a schematic diagram of the effect of an individual analysis according to an embodiment of the application;

FIG. 3 illustrates an exemplary flow diagram for determining candidate content elements according to one embodiment of the present application;

FIG. 4 illustrates an exemplary flow diagram of a method of synthesizing recommended content according to one embodiment of the present application;

FIG. 5a is a schematic diagram illustrating an effect of presenting recommended content in a spatial division presentation;

FIG. 5b is a diagram illustrating another effect of presenting recommended content in a spatial division manner;

FIG. 6 is a schematic diagram illustrating the principle of presenting recommended content in a time-division presentation;

FIG. 7 is a schematic structural diagram of a content recommendation device according to an embodiment of the present application; and

FIG. 8 shows a schematic structural diagram of a content recommendation system according to an embodiment of the present application.

Detailed Description

The present application will be described in further detail with reference to the following drawings and embodiments. It is to be understood that the specific embodiments described herein merely illustrate the invention and do not limit it. It should also be noted that, for convenience of description, only the portions related to the invention are shown in the drawings.

It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

In the following description, numerous specific details are set forth to provide a thorough description of embodiments of the invention. However, it will be understood by those skilled in the art that the embodiments of the present application may be practiced without these specific details.

Referring to fig. 1, an exemplary flowchart of a content recommendation method according to an embodiment of the present application is shown. For ease of understanding, in the present embodiment, the description is given in conjunction with a device for presenting recommended content. It will be understood by those skilled in the art that the device for displaying the recommended content may be an electronic large screen, such as an electronic screen disposed in an underground passage or a hall of a shopping mall, or may be a mobile electronic device with a display function, such as a mobile phone or a tablet computer.

As shown in fig. 1, in step 101, user attribute information and/or environment attribute information is acquired.

In this embodiment, the user attribute information may be extracted based on an image captured by a camera. The camera may be mounted on the device for presenting recommended content or may be mounted at one or more locations near the device for presenting recommended content. In some implementations, images collected by multiple cameras may be acquired to analyze user attribute information for all users within a display range.

The user attribute information may include user individual attribute information and group attribute information. The individual attribute information may be information obtained by analyzing individual characteristics of each user, and the group attribute information may be information obtained based on a relationship between a plurality of users.

In some implementations, the individual attribute information may include individual characteristic information of the user, such as appearance characteristic information of the user, i.e., information that may be directly derived from surface characteristics of the user, such as information of the user's gender, age, race, style of clothing, style of makeup, health status, posture, and the like. The gender, age, race and makeup style can be obtained by classifying the characteristics of the face region through a classifier, and the health status can be obtained by classifying or searching the characteristics of the face region and the posture through the classifier. The individual characteristic information may also include emotional state information of the user, such as happiness, sadness, anger, and the like, which may also be obtained by analyzing the facial expression and limb movement of the user using a classifier.

Further, the individual attribute information may also include the personality and purchasing power of the user. The personality may include at least one of: sense of responsibility, emotional stability, extroversion, openness to new things, affinity, popularity, confidence, and autism. These personality traits may be quantified into a number of levels, each level corresponding to a different degree of fit. For example, there may be 7 levels from -3 to +3, where -3 represents the least fit and +3 the most fit. Taking commodity recommendation as an example, when the user's openness-to-new-things index is quantified as +3, novel commodities may be recommended to the user, such as clothes of a different style or color from those the user is wearing, or travel information; in contrast, when the index is quantified as -3, goods consistent with the user's current state, such as clothes and accessories of the same style, may be recommended.

The personality of the user can be obtained by analyzing the features of the user with a regressor. In some implementations, the personality may be obtained by machine learning. For example, a training set can be established from existing data, and a personality model can be trained by a machine learning method to obtain a mapping between user features and personality. Optionally, the personality model may be trained using manually quantified personality indexes as training data. For example, images of multiple users may be acquired and user features extracted, and the degree of extroversion of each user may then be quantified, based on psychological analysis, into seven levels (-3, -2, -1, 0, 1, 2, 3), with -3 for the lowest extroversion and +3 for the highest. For example, the degree of extroversion of a user wearing brightly colored clothing may be quantified as +2 or +3. The user features and the quantified degree of extroversion are used as training data, and a machine learning method such as a support vector machine (SVM) or a random forest is used to train an extroversion model, yielding the mapping between user features and degree of extroversion. This extroversion model may then be used to determine a user's degree of extroversion. In a further implementation, training may be performed on a plurality of user features and a plurality of personality indexes; the resulting model is a comprehensive personality model, from which determinations of multiple personality indexes of the user can be obtained directly.
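As a minimal sketch of the level scheme described above, the following Python snippet maps a continuous score onto the seven levels from -3 to +3. The function name `quantize_trait` and the assumption that a regressor emits an unbounded real-valued score are illustrative only; the source does not specify how the regressor output is discretized.

```python
def quantize_trait(score, lowest=-3, highest=3):
    """Clamp and round a continuous trait score to an integer level.

    Hypothetical helper: assumes the regressor returns a real number on
    roughly the same scale as the levels. Python's round() rounds exact
    halves to the nearest even integer.
    """
    level = round(score)
    return max(lowest, min(highest, level))


# Example: a strongly extroverted score saturates at the top level.
levels = [quantize_trait(s) for s in (2.6, 0.2, -7.2)]
```

Clamping keeps out-of-range predictions inside the defined scale, so downstream rules that branch on a level never see an undefined value.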

The purchasing power information can be obtained from the price information of clothes, shoes and accessories worn by the user. First, the features of the clothes, shoes and accessories of the user can be extracted and matched in the database to obtain the brand information and/or price information of the clothes, shoes and accessories. Alternatively, the price level of the price information among the price points of all the same kind of goods may be calculated, thereby determining the purchasing power of the user. For example, features of a watch worn by a user may be extracted, and brand information and price information for the watch may be looked up in a merchandise database. Further, price interval information of watches of the same brand can be inquired, and the price information or the ordering (such as the ordering percentage and the like) of the price interval information in all watch commodities can be obtained. Alternatively or additionally, the purchasing power may be quantified, such as in a plurality of levels. Specifically, if the price of the watch is ranked as the top 10% in all types of watches, the purchasing power of the user may be determined as the highest rank.
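The ranking step above can be sketched as follows. This is an illustration under stated assumptions: the function name `purchasing_power_level`, the choice of ten levels, and the sample price list are all hypothetical, and the merchandise database lookup is replaced by a plain list of prices for the same category.

```python
from bisect import bisect_left


def purchasing_power_level(price, category_prices, num_levels=10):
    """Return a level in 1..num_levels from the price's rank in its category.

    rank counts the items priced strictly below `price`; integer arithmetic
    avoids floating-point edge cases at band boundaries.
    """
    ordered = sorted(category_prices)
    rank = bisect_left(ordered, price)
    return min(num_levels, rank * num_levels // len(ordered) + 1)


# Hypothetical watch prices standing in for a merchandise database query.
watch_prices = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
top_level = purchasing_power_level(1000, watch_prices)
```

With ten levels, an item priced in the top 10% of its category lands in the highest level, which matches the watch example in the text.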

The group attribute information of the user may include social relationship information of the user, such as family relationships, lover relationships, and friendships. The social relationship information can be determined from the clothes worn by the users; for example, matching couple outfits or parent-child outfits indicate the relationship between the users.

In some alternative implementations, the user attribute information may be obtained by: acquiring an image of the area where the display position of the recommended content is located, determining from the image a user serving as a service object, and performing individual analysis and group analysis on that user.

In the above implementation, the image of the area where the display position of the recommended content is located may be captured by an image capturing device (e.g., a camera component on a mobile terminal). Optionally, before the image is acquired, a sensor detects whether a user is present in the area where the display position is located and, if so, the user's position; a computer system then controls the camera to rotate and focus so as to acquire an image of the detected user.

Optionally, determining the user as the service object from the image may include: detecting a pedestrian in the image and the line-of-sight focus position of the pedestrian; judging whether the line-of-sight focus position of the pedestrian is located at the display position of the recommended content; and if so, determining the pedestrian to be a user who is a service object. After the image is collected, pedestrian detection can be performed based on skin color features or human body shape features, or by machine learning methods such as random forests and hidden Markov models, so that the human body in the image is extracted. Then, based on the color feature (for example, black) and the shape feature (approximately circular) of the pupil, the pupil position can be detected by methods such as edge extraction and the Hough transform, so as to determine the position parameter of the pedestrian's line-of-sight focus; based on this parameter, it is judged whether the line-of-sight focus position of the pedestrian in the image is located at the display position of the recommended content. If so, the detected pedestrian can be identified as a user to be served. It should be noted that the display position of the recommended content may be an area; when the line-of-sight focus of the pedestrian falls within the area, the pedestrian is considered a user who is a service object.
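The final decision step, checking whether each pedestrian's line-of-sight focus falls inside the display area, can be sketched as below. The `Region` class, the coordinate convention, and the dictionary of gaze points are hypothetical; the pedestrian detection and pupil localization that would produce the gaze points are outside this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned display area (assumed representation)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def select_service_users(gaze_points, display_region):
    """Keep pedestrians whose estimated gaze focus lies in the display area.

    gaze_points maps a pedestrian id to an (x, y) focus estimate, or to
    None when no focus could be estimated for that pedestrian.
    """
    return [
        pid
        for pid, point in gaze_points.items()
        if point is not None and display_region.contains(*point)
    ]


screen = Region(left=0, top=0, right=100, bottom=50)
gaze = {"a": (10, 10), "b": (200, 10), "c": None}
served = select_service_users(gaze, screen)
```

Treating the display position as an area rather than a point follows the note in the text that a pedestrian counts as a service object whenever the focus falls anywhere within that area.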

In some alternative implementations, the individual analysis of the user may be performed by: each user in the image is divided into a plurality of sub-images according to the human body part, and the sub-images are analyzed by adopting a classifier and/or a regressor to obtain user attribute information. Specifically, the user detected in the image may be image-segmented according to different parts of the human body. For example, the human body image may be segmented into a face image, a limb image, and a body image. Each sub-image may then be analyzed using a classifier and/or a regressor. For example, the facial image may be classified using a makeup style classifier, and the limb image and the body image may be classified using a clothing style classifier, thereby obtaining various attribute information of the user.

Further reference is made to fig. 2, which shows a schematic diagram of the effect of an individual analysis according to an embodiment of the application. As shown in fig. 2, the extracted user image may be divided into a plurality of sub-images such as a hair image, a face image, a left arm image, a bag image, a left leg image, a skirt image, a left shoe image, a right leg image, a jacket image, a right arm image, and a glasses image. Different user attribute information can be obtained by analyzing each sub-image with a classifier or a regressor. For example, classifying the hair image can yield the user's hairstyle and hair quality; analyzing the face image can yield attribute information such as the user's gender, age, race, expression, skin condition, and facial features; analyzing the left arm, right arm, left leg, and right leg images can yield other characteristics such as the user's degree of strength and health; and analyzing the bag, glasses, left shoe, right shoe, skirt, and jacket images can yield attribute information such as the types of clothing and accessories the user prefers, the brands and prices the user prefers, and matching commodities. Analysis of the glasses image may also yield information on the glasses functions the user prefers. Each item of user attribute information may be represented quantitatively, e.g., in levels as described above.

Returning to fig. 1, performing group analysis on the users further includes grouping the users. In one optional implementation, a classifier classifies the multiple users in the image according to social relationship, based on the degree of association of their clothing and postures and on their relative position information. For example, whether users are in a lover relationship or a family relationship may be determined based on whether the styles of their clothes are the same, and whether users are in a friend relationship or a lover relationship may be analyzed according to their degree of closeness. For example, when limb contact between two users is detected, it may be preliminarily determined that the two users are in a friend or family relationship, and whether they are in a lover relationship is then determined based on whether their clothes are the same. In another optional implementation, the multiple users in the image are clustered using the user attribute information obtained from the individual analysis. The clustering method may be to quantify the attribute information of each user, calculate the distances between the quantified user attribute information, and place user attribute information whose distance is smaller than a preset threshold into one group, thereby grouping the corresponding users. Optionally, clustering may be performed first using the group attribute information (e.g., social relationships) of the users and then using the individual user attribute information.
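One possible reading of the threshold-based clustering is sketched below: users whose quantified attribute vectors lie closer than a preset threshold are merged into the same group, transitively. The use of Euclidean distance, the union-find merging, and the function name `group_users` are assumptions made for illustration; the source does not fix a distance measure or a merging rule.

```python
import math


def group_users(vectors, threshold):
    """Group user indices whose attribute vectors are within `threshold`.

    Single-linkage style: if A is close to B and B is close to C, all three
    end up in one group. Returns a sorted list of index lists.
    """
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent links to the group representative, compressing paths.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if math.dist(vectors[i], vectors[j]) < threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Four users described by two quantified attributes each (made-up values).
users = [(0, 0), (1, 0), (10, 10), (10, 11)]
result = group_users(users, threshold=2)
```

To give priority to group attribute information, as the text suggests, the same routine could be run first on social-relationship features and then within each resulting group on the individual attributes.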

Through the above-described acquisition mode of the user attribute information, not only richer superficial characteristics, such as gender, age, race, health status, makeup style, accessory style, and the like, can be acquired, but also deeper characteristics of the user, such as character, purchasing power, and the like, can be acquired, so that content more meeting the user requirements or more interesting to the user can be recommended to the user based on the characteristics, and further the conversion rate of the recommended content can be improved.

In some embodiments, the manner of obtaining the environment attribute information may include, but is not limited to, receiving current time information through a network and/or receiving spatial information corresponding to a presentation location of the recommended content through the network. Wherein the time information may comprise at least one of: current time of day, weather conditions, holiday information, current trending events. Spatial information may include geographic aspects of the presentation location and/or landmarks in nearby areas, such as airports, waiting rooms, business centers, and the like.

In the present application, the environment attributes can be analyzed and obtained in addition to the user attributes, and the recommended content is decided using the environment attributes as well. Recommended content that better fits the environmental state can thus be provided, improving the timeliness of content recommendation.

In step 102, recommended content is synthesized based on the user attribute information and/or the environment attribute information.

In embodiments of the present application, content is divided into multiple elements, which may be combined in various ways to produce different content. In this way, the content recommendation system can generate rich content by storing various content elements, without storing a large amount of fixed content in advance; content that varies in this way may also be called dynamic content.

In some embodiments, step 102 may include step 1021: candidate content elements are determined based on the user attribute information and/or the environment attribute information.

In this embodiment, the candidate content elements may include candidate object elements and candidate scene elements. In some alternative implementations, the object element may be a commodity and the scene element may be an advertising element. Accordingly, the candidate object may be a candidate good and the candidate scene element may be a candidate advertisement element. Object elements and scene elements may have a variety of attributes. The attributes of the object elements may include, but are not limited to, at least one of a category, price, color, and brand of the item, and the attributes of the scene elements may include, but are not limited to, at least one of a visual style, a story line, a suitable item, a person, a time, a place, and a score.
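To make the element model concrete, the sketch below represents object elements and scene elements as small records and enumerates their pairings. The class names, the subset of attributes chosen, and the sample values are hypothetical; the `score` field is assumed to mean a musical score.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ObjectElement:
    name: str
    category: str
    price: float
    color: str
    brand: str


@dataclass(frozen=True)
class SceneElement:
    name: str
    visual_style: str
    story_line: str
    score: str  # assumed to be the background music for the scene


def compose_all(objects, scenes):
    """Enumerate every (object element, scene element) pairing."""
    return list(product(objects, scenes))


# Made-up elements: storing 2 + 2 elements yields 4 possible contents.
objects = [
    ObjectElement("watch A", "watch", 300.0, "silver", "BrandX"),
    ObjectElement("watch B", "watch", 900.0, "gold", "BrandY"),
]
scenes = [
    SceneElement("city night", "modern", "commute", "jazz"),
    SceneElement("mountain dawn", "outdoor", "hike", "orchestral"),
]
pairings = compose_all(objects, scenes)
```

The point of the example is the storage argument: m object elements and n scene elements can yield up to m times n distinct contents without any of them being stored as fixed content.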

In some implementations, a recommendation model set may be used to determine the candidate content elements based on the user attribute information and/or the environment attribute information obtained in step 101. The recommendation model set may include at least one of: a first recommendation model set that recommends object elements and/or scene elements according to the user attribute information; a second recommendation model set in which object elements and/or scene elements are jointly recommended; and a third recommendation model set that recommends object elements and/or scene elements according to the environment attribute information.

In a further implementation, the first set of recommendation models may be a set of sub-models of interest relationships between the user attribute information and the attributes of the object elements and/or the attributes of the scene elements, and may include at least one such sub-model. The second set of recommendation models may be a set of sub-models of interest relationships among attributes of object elements and/or attributes of scene elements, and may include at least one such sub-model. The third set of recommendation models may be a set of sub-models of interest relationships between the environment attribute information and the attributes of the object elements and/or the attributes of the scene elements, and may include at least one such sub-model.

Taking the object element as a commodity and the scene element as an advertisement element as an example, each sub-model in the first recommendation model set can represent a mapping relation for recommending each commodity or each advertisement element according to the user attribute; each sub-model in the second set of recommendation models may represent a mapping relationship in which different goods/advertisement elements are jointly recommended, and each sub-model in the third set of recommendation models may represent a mapping relationship in which each goods or each advertisement element is recommended according to an environmental attribute.

In some optional implementation manners of this embodiment, after the recommendation model set is determined, at least one sub-model in the recommendation model set may be used to recommend the content related to the user attribute and the environment attribute.

Referring to fig. 3, an exemplary flow diagram for determining candidate content elements is shown, according to one embodiment of the present application. In an embodiment corresponding to fig. 3, the method for determining candidate content elements by using the recommendation model set may include:

Step 301, training the sub-models in the recommendation model set based on the interestingness statistical data to determine the parameters of the sub-models.

As described above, each recommendation model set may include a set of sub-models, and the sub-models may represent mapping relationships between user attributes and object elements, between user attributes and scene elements, between different object elements, between different scene elements, between environment attributes and object elements, and between environment attributes and scene elements. In this embodiment, the mapping relationships may be derived from interestingness statistics. Specifically, each sub-model may be trained on interestingness statistics to obtain the parameters of the sub-model. The interestingness statistics may include: interestingness statistics of user attribute information for attributes of object elements and/or attributes of scene elements; interestingness statistics between attributes of different object elements; interestingness statistics between attributes of different scene elements; interestingness statistics between attributes of object elements and attributes of scene elements; and interestingness statistics of environment attribute information for attributes of object elements and/or attributes of scene elements.

Interestingness statistics may be quantified as a number of levels or as normalized values. One way to obtain them is through data statistics of an online shopping website. For example, the browsing volume and purchase volume of a commodity on the website, together with the age groups of the users who browsed or purchased it, may be counted to derive interestingness statistics of users of different age groups for that commodity. As another example, interestingness statistics between refrigerators and washing machines, and between a refrigerator brand and a washing machine brand, can be calculated by counting the number of users who purchase several commodities at the same time (for example, the number of users who purchase a given refrigerator brand and a given washing machine brand together). Another way to obtain interestingness statistics is through questionnaires. For example, a targeted questionnaire can be designed to count the interest of users of different ages, personalities, and purchasing power in different brands of commodities. Interestingness statistics may also be set empirically: for example, the normalized interestingness of women in cosmetics may be set to 0.8, and that of men to 0.2.
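The co-purchase counting described above can be sketched as follows. The function name `co_purchase_interest`, the basket data, and in particular the normalization (dividing each pair count by the largest pair count) are assumptions; the source says only that the statistics may be normalized, not how.

```python
from collections import Counter
from itertools import combinations


def co_purchase_interest(baskets):
    """Count how often two commodities are bought together, scaled to 0..1.

    baskets is an iterable of per-user purchase lists. Each unordered pair
    is keyed by its alphabetically sorted names.
    """
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    if not counts:
        return {}
    top = max(counts.values())
    return {pair: count / top for pair, count in counts.items()}


# Made-up purchase records for three users.
baskets = [
    ["refrigerator", "washing machine"],
    ["refrigerator", "washing machine", "tv"],
    ["tv", "refrigerator"],
]
interest = co_purchase_interest(baskets)
```

The same counting works at brand level by replacing commodity names with (category, brand) pairs in each basket.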

Table 1 shows, in tabular form, an example of interestingness statistics of age (a user attribute) for brand (an attribute of an object element). The interestingness is normalized to the range 0 to 1.

Table 1: Interestingness statistics of age for brand

Age	Disney	Gap	Eland	……
0-3	0.8	0.6	0.0	……
3-5	0.6	0.5	0.0	……
5-10	0.8	0.6	0.3	……
10-20	0.3	0.8	0.8	……
……	……	……	……	……

As Table 1 shows, the age-to-brand interestingness table records the interest of users in each age group in each brand. In the same way, statistics can be compiled for the interest of other user attribute information in different commodity attributes or advertisement elements, the interest between different commodities, the interest between different advertisement elements, and the interest of environment attribute information in different commodities or advertisement elements.
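The following sketch shows one way such a table could be stored and queried. The values are copied from Table 1 with the elided rows and columns omitted; treating each age range as a half-open interval (so that age 3 falls in "3-5") and the names `age_group` and `interest` are assumptions made for illustration.

```python
import bisect

# Bucket edges in years and their labels, following Table 1.
AGE_BOUNDS = [0, 3, 5, 10, 20]
AGE_LABELS = ["0-3", "3-5", "5-10", "10-20"]

INTEREST = {
    "0-3": {"Disney": 0.8, "Gap": 0.6, "Eland": 0.0},
    "3-5": {"Disney": 0.6, "Gap": 0.5, "Eland": 0.0},
    "5-10": {"Disney": 0.8, "Gap": 0.6, "Eland": 0.3},
    "10-20": {"Disney": 0.3, "Gap": 0.8, "Eland": 0.8},
}

def age_group(age):
    """Map an age to its bucket, treating buckets as [low, high)."""
    if not AGE_BOUNDS[0] <= age < AGE_BOUNDS[-1]:
        raise ValueError("age outside the tabulated range")
    return AGE_LABELS[bisect.bisect_right(AGE_BOUNDS, age) - 1]

def interest(age, brand):
    """Look up the normalized interest of a user of `age` in `brand`."""
    return INTEREST[age_group(age)][brand]
```

A trained sub-model would replace the raw lookup, but the table form makes the mapping from a user attribute to an object-element attribute concrete.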

Optionally, so that the trained sub-models adapt to changes in the environment, the corresponding sub-models may be updated when new object elements and scene elements appear. For example, the interestingness statistics and sub-models related to a brand may be updated based on that brand's new goods. In addition, the interestingness statistics may be refreshed at regular intervals and the corresponding sub-models retrained on the refreshed statistics. For example, the sub-models may be updated quarterly based on feedback from merchants.

Step 302, a global energy function is established based on the recommendation model set.

After the sub-models in the recommendation model set are obtained through training, object elements and scene elements meeting requirements can be searched from the object element database and the scene element database according to certain rules. The following function may be established based on the first, second, and third sets of recommendation models, and the object element and the scene element may be determined based on equation (1).

$$\mathit{productSet}^{*} = \arg\min_{\mathit{productSet}} E(\mathit{productSet} \mid \mathit{models}, \mathit{userSet}, \mathit{context}) \tag{1}$$

where $\mathit{productSet}$ represents a set of object elements or scene elements, $\mathit{productSet}^{*}$ represents the determined set of candidate object elements or candidate scene elements, and $\mathit{models}$ represents the set of recommendation models, $\mathit{models} = \{\mathit{model}_1, \mathit{model}_2, \mathit{model}_3\}$, in which $\mathit{model}_1$ represents the first recommendation model set, $\mathit{model}_2$ represents the second recommendation model set, and $\mathit{model}_3$ represents the third recommendation model set. $\mathit{userSet}$ represents the set of user attribute information, $\mathit{context}$ represents the environment attribute information, and $E(\cdot)$ represents the global energy function.

Determining the recommended content thus amounts to selecting from the database the object elements and/or scene elements that minimize the energy function. The global energy function may be defined as equation (2):

In equation (2), $\mathit{productSet} = \{\mathit{product}_j\}$ and $\mathit{userSet} = \{\mathit{user}_i\}$, where $i$, $j$, $j_1$ and $j_2$ are positive integers; $\mathit{product}_j$, $\mathit{product}_{j_1}$ and $\mathit{product}_{j_2}$ represent object elements or scene elements; and $\mathit{user}_i$ represents user attribute information. $\alpha_1$, $\alpha_2$ and $\alpha_3$ are weight coefficients, which can be set empirically or obtained by training.

As shown in equation (2), the global energy function may include a first energy function $E_1(\cdot)$, a second energy function $E_2(\cdot)$ and a third energy function $E_3(\cdot)$. The first energy function may be an energy function that recommends an object element or a scene element according to the user attribute information. Specifically, the first energy function may be calculated according to equation (3):

where $i$, $j$, $p$ and $q$ are positive integers, $\mathit{product}_j\,\mathit{profile}_p$ represents the $p$-th attribute of the $j$-th object element/scene element, $\mathit{user}_i\,\mathit{profile}_q$ represents the $q$-th attribute of the $i$-th user, and $\beta_{(p,q)}$ represents the weight coefficients. Computing the first energy function may include using a classifier and/or a regressor based on the first recommendation model set to calculate the recommendation probabilities of the attributes of the object elements and/or the attributes of the scene elements corresponding to the user attribute information.

The second energy function may be an energy function in which different object elements or different scene elements are recommended in common, and specifically, the second energy function may be calculated according to equation (4):

where $j_1$, $j_2$, $p$ and $q$ are positive integers, $\mathit{product}_{j_1}\,\mathit{profile}_p$ represents the $p$-th attribute of the $j_1$-th object element/scene element, $\mathit{product}_{j_2}\,\mathit{profile}_q$ represents the $q$-th attribute of the $j_2$-th object element/scene element, and $\beta_{(p,q)}$ represents the weight coefficients. Computing the second energy function may include using a classifier and/or a regressor based on the second recommendation model set to calculate the probability that the attributes of the object elements and/or the attributes of the scene elements are jointly recommended.

The third energy function may be an energy function that recommends an object element or a scene element according to the environment attribute information, and specifically, the third energy function may be calculated according to equation (5):

where $i$, $j$, $p$ and $q$ are positive integers, $\mathit{product}_j\,\mathit{profile}_p$ represents the $p$-th attribute of the $j$-th object element/scene element, $\mathit{context}\,\mathit{profile}_q$ represents the $q$-th attribute in the environment attribute information, and $\gamma_{(p,q)}$ represents the weight coefficients. Computing the third energy function may include using a classifier and/or a regressor based on the third recommendation model set to calculate the recommendation probability of the attributes of the object elements and/or the attributes of the scene elements corresponding to the environment attribute information.
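To make the structure of the three terms concrete, the sketch below combines them as a weighted sum with coefficients corresponding to α1, α2 and α3. The weighted-sum form, the convention that energy equals one minus interest, and all names and toy values are assumptions made for illustration; they are not taken from equations (2) to (5).

```python
from itertools import combinations

def global_energy(product_set, user_set, context, e1, e2, e3,
                  alphas=(1.0, 1.0, 1.0)):
    """Weighted sum of user, co-recommendation, and environment terms."""
    a1, a2, a3 = alphas
    user_term = sum(e1(p, u) for p in product_set for u in user_set)
    pair_term = sum(e2(p1, p2) for p1, p2 in combinations(product_set, 2))
    context_term = sum(e3(p, context) for p in product_set)
    return a1 * user_term + a2 * pair_term + a3 * context_term

# Toy interest tables; missing entries count as zero interest.
USER_INTEREST = {("adult", "watch"): 0.9, ("adult", "toy"): 0.1}
CO_INTEREST = {frozenset(("watch", "car")): 0.8}
CONTEXT_INTEREST = {("winter", "down jacket"): 0.9}

# Assumed convention: lower energy means a better match.
def e1(product, user):
    return 1.0 - USER_INTEREST.get((user, product), 0.0)

def e2(product_a, product_b):
    return 1.0 - CO_INTEREST.get(frozenset((product_a, product_b)), 0.0)

def e3(product, ctx):
    return 1.0 - CONTEXT_INTEREST.get((ctx, product), 0.0)

good = global_energy(["watch", "car"], ["adult"], "winter", e1, e2, e3)
bad = global_energy(["toy", "car"], ["adult"], "winter", e1, e2, e3)
```

Here the watch-and-car set scores a lower energy than the toy-and-car set, so a minimizer would prefer it for this user.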

Continuing with fig. 3, in step 303, a global optimization solution is performed on the global energy function to obtain candidate content elements that optimize the global energy function.

In this embodiment, the recommended content may be determined based on the energy function described above. Specifically, the global energy function may be solved for global optimization according to equation (1). Methods of global optimization may include optimization algorithms based on genetic algorithms, linear programming, simulated annealing, and the like. Once $\mathit{productSet}^{*}$ in equation (1) is solved, the candidate object elements and candidate scene elements are obtained.
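As one illustration of the optimization step, the following is a minimal simulated-annealing sketch that searches for a fixed-size subset of a catalog with low energy. The fixed subset size `k`, the swap-one-item move, the geometric cooling schedule, and every parameter value are assumptions; the description names simulated annealing only as one possible method.

```python
import math
import random

def anneal(catalog, k, energy, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Search for a k-item subset of `catalog` with low `energy`.

    Returns the best subset seen and its energy. `energy` is any
    callable that maps a list of items to a number.
    """
    rng = random.Random(seed)
    current = rng.sample(catalog, k)
    current_e = energy(current)
    best, best_e = list(current), current_e
    temperature = t0
    for _ in range(steps):
        outside = [item for item in catalog if item not in current]
        if not outside:
            break  # every item is already selected, nothing to swap
        candidate = list(current)
        candidate[rng.randrange(k)] = rng.choice(outside)
        candidate_e = energy(candidate)
        delta = candidate_e - current_e
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = list(current), current_e
        temperature = max(temperature * cooling, 1e-9)
    return best, best_e

# Toy energy: negative total score, so high-scoring items are preferred.
SCORES = {"watch": 0.9, "car": 0.8, "toy": 0.1, "crib": 0.2}

def toy_energy(items):
    return -sum(SCORES[item] for item in items)

best_set, best_energy = anneal(list(SCORES), 2, toy_energy)
```

For a catalog this small an exhaustive search would also work; annealing becomes relevant when the element databases are too large to enumerate.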

In some embodiments, the candidate content elements may be determined based on one of the first, second, and third energy functions. For example, based on the first energy function, wedding apparel may be recommended to couples, and colorful and gorgeous apparel may be recommended to younger, outgoing women. Based on the second energy function, a refrigerator and a television, a lipstick and an eyebrow pencil, or a crib and a milk bottle may each be treated as jointly recommended commodities. Based on the third energy function, down jackets may be recommended on snowy winter days, and travel information may be recommended on airport billboards. In some implementations, the recommended content may be determined by combining two or all three of the energy functions. For example, couples' T-shirts can be recommended to couples in summer and couples' down jackets in winter, and couples' watches and couples' rings can be recommended to couples together.

In practical applications, when the recommended content is an advertisement, a group of preferred jointly recommended commodity sets and advertisement element sets can be obtained after global optimization solution is performed on a global energy function.

In the method for determining candidate content elements provided by the embodiment, a plurality of recommended object elements and scene elements can be selected according to the interestingness and tendency of the user and/or the environmental information, so that richer recommended content can be provided. For example, when the advertisement is recommended, various advertisement elements and scene elements meeting the requirements and preferences of the user can be obtained, so that the advertisement content is rich, and the utilization rate and the delivery effect of the billboard are improved. In addition, more vivid advertising elements can be provided, and user experience is improved.

Returning to fig. 1, step 102 may further include step 1022 of synthesizing the recommended content based on the candidate object element and the candidate scene element.

In this embodiment, after determining the candidate object element and the candidate scene element in step 1021, the candidate object element may be fused, the candidate scene elements may be combined, and the recommended content may be generated by combining the candidate object element and the candidate scene element. Each candidate object element may first be fused with a respective candidate scene element, followed by combining a plurality of candidate scene elements.

In some implementations, the candidate object element and the candidate scene element may be fused based on a preset rule. With further reference to FIG. 4, an exemplary flow diagram of a method of composing recommended content is shown, according to one embodiment of the present application.

As shown in fig. 4, in step 401, a placement index, a direction index, and a motion trajectory index of a candidate scene element are obtained.

In this embodiment, a scene element generally has a transparent background or specific placement positions for placing object elements. A placement index may be built for these positions to specify the types of object elements that each position can hold. For example, a vehicle can be placed on a road, and a watch can be placed on a wrist. Further, a direction index may be established for a specific position to indicate the orientation of the object element. For example, the orientation of a vehicle may be determined according to the direction of the road, and the orientation of a watch according to the posture of the person's upper arm. Further, when the candidate scene element is a dynamic element, such as a video, a motion trajectory index may also be established to indicate the motion direction and route of the object element. For example, a road direction index may be included in a road scene so that the vehicle travels along the direction of the road.

Before fusing the candidate object element and the candidate scene element, the above index information of the candidate scene element, including the placement index, the direction index, and the motion trajectory index, may first be obtained. The index information may be retrieved directly from a database. Alternatively, image analysis and video analysis may be performed on the scene element: features used for placing candidate object elements are extracted and a trained model determines the placement index of the scene, and position, direction, and motion trajectory features are extracted from the scene element to obtain the direction index and the motion trajectory index.

In step 402, the candidate object elements and candidate scene elements are fused according to the placement index, the direction index and the motion trajectory index to generate candidate recommended content.

When synthesizing the candidate content elements, the candidate object elements may be placed at specific positions in the candidate scene elements according to the placement index, then the candidate object elements are rotated according to the direction index, and then the candidate object elements are moved according to the motion trajectory index to synthesize the complete candidate recommended content.
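The place, rotate, and move sequence can be sketched as follows. The `SceneSlot` structure, its field names, and the representation of the motion trajectory as per-frame offsets are hypothetical choices made for illustration; the description does not specify how the indexes are encoded.

```python
from dataclasses import dataclass

@dataclass
class SceneSlot:
    allowed_types: set   # placement index: object types this slot accepts
    position: tuple      # (x, y) anchor in scene coordinates
    angle_deg: float     # direction index: orientation of the object
    trajectory: list     # motion trajectory index: (dx, dy) per frame

def fuse(object_type, slot):
    """Return per-frame (x, y, angle) poses for an object in `slot`."""
    if object_type not in slot.allowed_types:
        raise ValueError(f"{object_type!r} cannot be placed in this slot")
    x, y = slot.position
    poses = [(x, y, slot.angle_deg)]      # place, then orient
    for dx, dy in slot.trajectory:        # then move frame by frame
        x, y = x + dx, y + dy
        poses.append((x, y, slot.angle_deg))
    return poses

# A road that accepts vehicles and carries them to the right.
road = SceneSlot({"car"}, (0, 5), 90.0, [(1, 0), (1, 0)])
car_poses = fuse("car", road)
```

A real compositor would also render the object into the scene image at each pose; this sketch covers only the geometry implied by the three indexes.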

In step 403, candidate recommended contents are filtered based on the correlation among the attributes of the candidate scene elements, the correlation among the attributes of the candidate object elements, and the correlation between the attributes of the candidate scene elements and the candidate object elements, and the filtered candidate recommended contents are fused to generate recommended contents.

In this embodiment, a plurality of candidate recommended contents may have a certain relevance, such as a temporal relevance, a spatial relevance, a person relevance, an event relevance, and an attribute relevance. According to the relevance, a plurality of candidate recommended contents with strong relevance can be screened, the candidate recommended contents irrelevant to other candidate recommended contents are filtered, and the screened candidate recommended contents are integrated into smooth and coherent recommended contents.

The association between the candidate recommended contents may be determined based on the association between the attributes of different candidate object elements included in the candidate recommended contents, between the attribute of the candidate object element and the attribute of the candidate scene element, and between the attributes of different candidate scene elements. Therefore, in the present embodiment, candidate recommended contents containing corresponding candidate object elements or candidate scene elements may be screened according to the correlation between the attributes of different candidate object elements, the correlation between the attribute of a candidate object element and the attribute of a candidate scene element, and the correlation between the attributes of different candidate scene elements.

The correlation among the attributes of the different candidate object elements, the correlation between the attributes of the candidate object elements and the attributes of the candidate scene elements, and the correlation between the attributes of the different candidate scene elements can be obtained by a model training method, or can be manually set according to experience. The representation of the association may be a quantized numerical value. Alternatively, the attributes of the candidate object elements and the attributes of the candidate scene elements may be vectorized, and then the distance between the attributes is calculated, and the smaller the distance is, the stronger the relevance is. And calculating the relevance between the attributes of the object elements and the attributes of the scene elements contained in all the candidate recommended contents, and filtering out the candidate recommended contents which have the weakest relevance or no relevance with other candidate recommended contents according to the relevance.
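The vector-distance variant can be sketched as below. The Euclidean metric, the distance threshold, and the rule of keeping a candidate when at least one other candidate lies within the threshold are assumptions; the description states only that a smaller distance means a stronger relevance.

```python
import math

def euclidean(a, b):
    """Distance between two equal-length attribute vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def filter_unrelated(candidates, max_distance):
    """Keep candidates that have a close neighbour among the others.

    `candidates` maps a candidate name to its attribute vector.
    """
    kept = []
    for name, vector in candidates.items():
        distances = [euclidean(vector, other)
                     for other_name, other in candidates.items()
                     if other_name != name]
        if distances and min(distances) <= max_distance:
            kept.append(name)
    return kept

# Hypothetical two-dimensional attribute vectors.
candidates = {
    "car_morning": (1.0, 0.0),
    "watch_noon": (0.9, 0.1),
    "toy_night": (0.0, 1.0),
}
related = filter_unrelated(candidates, max_distance=0.5)
```

In this toy data the car and watch candidates are close to each other and are kept, while the unrelated third candidate is filtered out.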

After screening, a plurality of strongly related candidate recommended contents are obtained, and the screened candidate recommended contents are concatenated according to time sequence, spatial position relationship or event state to generate the complete recommended content.
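Concatenation by time sequence reduces to ordering the screened candidates by their time attribute, as in this small sketch. The `TIME_ORDER` mapping and the clip names are hypothetical.

```python
# Assumed ordering of time attributes within a day.
TIME_ORDER = {"morning": 0, "noon": 1, "evening": 2}

def concatenate(clips):
    """Order (name, time_attribute) clips chronologically."""
    ordered = sorted(clips, key=lambda clip: TIME_ORDER[clip[1]])
    return [name for name, _ in ordered]

sequence = concatenate([("watch clip", "noon"), ("car clip", "morning")])
```

Ordering by spatial position or event state would follow the same pattern with a different sort key.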

It is understood that if only one candidate recommended content is obtained in step 1021, or there is no correlation between multiple candidate recommended contents, one candidate recommended content may be taken as the recommended content.

For example, when an advertisement is recommended, the relevance between commodity elements and advertisement elements may be calculated. If two advertisement elements with similar video styles are highly relevant, the candidate advertisements containing them may be combined into the same advertisement. For instance, if two advertisement elements share the same scene, their time attributes are morning and noon, and the commodities they contain are a car and a watch, the car advertisement whose advertisement element has the time attribute morning and the advertisement whose advertisement element has the time attribute noon can be concatenated into a time-continuous video advertisement.

It should be noted that in the above exemplary implementation of the method for synthesizing recommended content described with reference to fig. 4, a screening step may be performed first, the candidate object elements and the candidate scene elements are screened based on the correlation among the attributes of the candidate scene elements, the correlation among the attributes of the candidate object elements, and the correlation between the attributes of the candidate scene elements and the candidate object elements, then the placement index, the direction index, and the motion trajectory index of the candidate scene elements are obtained, then the candidate object elements and the candidate scene elements with high correlation are fused into the candidate recommended content, and the candidate recommended content is concatenated according to the time attribute of the candidate scene elements to form the complete recommended content.

In the content recommendation method provided by the embodiment of the application, the candidate object element and the candidate scene element are determined based on the user attribute information and the environment attribute information, and then the recommended content is synthesized according to the candidate object element and the candidate scene element. More information can be provided in the recommended content, and the utilization rate of the display position of the recommended content is improved. And more targeted personalized content can be provided, so that the conversion rate of the recommended content is improved.

The method provided by this embodiment can be used in an intelligent advertisement recommendation system. The system can capture an image of the area in front of a billboard through a camera mounted on the billboard, perform human body detection on the image to detect a target user A, and then determine through focus detection that target user A is paying attention to the content on the billboard. The system can then perform individual analysis on target user A. Suppose the analysis finds that the user is male, 40 to 50 years old, and wearing a high-end dark blue suit and black leather shoes, and that facial feature analysis indicates a responsible, confident, emotionally stable and introverted personality with strong purchasing power. Commodities such as a black business car, a dark high-end polo shirt and a high-end watch of a certain brand can then be recommended to target user A, and the recommended advertisement elements can include classical music, an upscale home scene, an office scene in a business office building, urban road conditions and the like. Finally, the commodities and advertisement elements are fused according to the relevance between them. The generated advertisement may be a video in which a man puts on a blue high-end polo shirt and a high-end watch in the morning and drives a black business car to a business office building. Scenes of the man attending a meeting while wearing the watch and driving away at sunset may also be inserted.

The intelligent advertisement recommendation system may also recommend advertisements to a user group that includes multiple users. For example, six target users A, B, C, D, E and F may be detected by analyzing the image, and focus detection may determine that all six are paying attention to the content on the billboard. The system may first perform individual analysis on the six target users, and then perform group analysis based on the individual results and on the postures and relative positions of the users. The analysis may find that A, B and C are very likely a family of three, forming user group 1; that D and E are very likely a couple, forming user group 2; and that F alone forms user group 3. The system can then generate three advertisements and recommend one to each of user groups 1, 2 and 3.
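One way to turn pairwise relationship estimates into user groups is to merge every pair whose estimated relationship probability exceeds a threshold, as in the union-find sketch below. The pairwise probabilities, the threshold, and the function name `group_users` are assumptions for illustration; the description does not specify how group analysis is computed.

```python
def group_users(users, pair_probability, threshold=0.5):
    """Merge users linked by a relationship probability >= threshold."""
    parent = {user: user for user in users}

    def find(user):
        # Follow parent links to the group root, shortening the path.
        while parent[user] != user:
            parent[user] = parent[parent[user]]
            user = parent[user]
        return user

    for (a, b), probability in pair_probability.items():
        if probability >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for user in users:
        groups.setdefault(find(user), []).append(user)
    return sorted(groups.values())

# Hypothetical output of the posture and relative-position analysis.
pair_probability = {
    ("A", "B"): 0.9, ("B", "C"): 0.8,   # likely family members
    ("D", "E"): 0.85,                   # likely a couple
    ("E", "F"): 0.1,                    # probably unrelated
}
groups = group_users(list("ABCDEF"), pair_probability)
```

With these toy values the result matches the example in the text: a family of three, a couple, and a single user.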

In some embodiments, the content recommendation method may further include:

Step 103, displaying the recommended content in a time-division presentation or space-division presentation mode.

If step 101 identifies multiple users or user groups to be served, the generated recommended content needs to be presented to them in an appropriate manner. Taking display on an electronic display screen as an example, in this embodiment the recommended content may be displayed in a time-division or space-division presentation manner. Time-division presentation suits electronic screens with a smaller area, and space-division presentation suits screens with a larger area or curved screens.

In some implementations, presenting the recommended content in a spatial-division presentation may be performed by: the display positions of the recommended contents are firstly divided into sub-areas with the number equal to that of the users or the user groups, and then the recommended contents corresponding to each user or each group of users are displayed in the sub-areas where the sight focus positions of the users/the group of users are located. It should be noted that pupil position detection and depth detection can be performed on the face image when performing user gaze focus detection, thereby determining the position of the area concerned by the user on the screen. Further, the user visual field range can be determined according to the pupil position of the user, so that the size of the displayed recommended content can be determined.

In a further implementation, the gaze focus position of the user may also be tracked, and the display position of the recommended content of each user or each group of users may be adjusted according to the change of the gaze focus position of the user. If the user or the user group is in a moving state, the position change of the user or the user group can be tracked through pedestrian detection, or when the user or the user group is in a static state but the attention area changes, the change of the focus position of the sight line of the user can be detected and determined in real time based on the pupil position, so that the change state of the position concerned by the user can be acquired. In this case, the display position of the recommended content may be adjusted so that the content recommended for the user or the user group may be projected within the visual field of the user or the user group all the time.
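A minimal sketch of space-division layout follows: the screen is split into as many equal-width sub-areas as there are users, and each user's content is assigned to the sub-area containing that user's gaze position. Equal-width sub-areas, a one-dimensional gaze coordinate, and the first-come rule for two users looking at the same sub-area are simplifying assumptions.

```python
def assign_subareas(screen_width, gaze_x_by_user, content_by_user):
    """Map sub-area index -> content, one equal-width sub-area per user.

    `gaze_x_by_user` holds each user's horizontal gaze position on the
    screen. If two users look at the same sub-area, the first one
    listed keeps it (a simplification made for this sketch).
    """
    n = len(gaze_x_by_user)
    if n == 0:
        return {}
    sub_width = screen_width / n
    layout = {}
    for user, x in gaze_x_by_user.items():
        index = min(max(int(x // sub_width), 0), n - 1)
        layout.setdefault(index, content_by_user[user])
    return layout

content = {503: "jacket and skirt ad", 504: "watch ad", 505: "car ad"}
before = assign_subareas(300.0, {503: 40.0, 504: 150.0, 505: 290.0}, content)
# Re-running with fresh gaze positions moves the content with the users.
after = assign_subareas(300.0, {503: 140.0, 504: 250.0, 505: 20.0}, content)
```

Calling the function again whenever focus detection reports new gaze positions gives the tracking behaviour described above.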

With further reference to fig. 5a, a schematic diagram illustrating an effect of presenting recommended content in a space-division manner is shown. In the scenario of fig. 5a, a billboard in a shopping mall lobby or hotel lobby may be used to recommend advertisements to customers. In fig. 5a, the display screen for presenting recommended content is a cylindrical screen 510. The system detects target service object users 501 and 502, where the gaze focus of user 501 is located in area 511 and the gaze focus of user 502 is located in area 512. Individual analysis of users 501 and 502 shows that user 501 has a high interest in and demand for watches and that user 502 has a high demand for cars. A watch advertisement and a car advertisement can therefore be displayed in areas 511 and 512, respectively, so that different personalized advertisements are recommended to different users at different positions on the cylindrical screen.

With further reference to fig. 5b, a schematic diagram of another effect of presenting recommended content in a space-division manner is shown. The presentation position shown in fig. 5b may be a flat display screen in a transfer passage, such as that of a subway. Such screens may be laid along the walls, so that personalized advertisements can be presented to users on the wall screens as they transfer. As shown in fig. 5b, the screen 520 may be divided into a plurality of sub-areas. The gaze focuses of user 503, user 504 and user 505 are in area 521, area 522 and area 523, respectively. The content recommended to each user may be presented in the corresponding area. For example, a jacket and skirt advertisement is recommended to user 503, a watch advertisement to user 504, and a car advertisement to user 505. The gaze focus position of each user can be tracked in real time as the user moves, and the position of the displayed recommended content adjusted accordingly. For example, when the gaze focus of the user to whom the car advertisement is recommended moves to area 522, the content presented in area 522 may be switched to the car advertisement. When users occlude one another, the recommended content corresponding to the user with the unobstructed view can be displayed on the screen.

The screen used for time-division presentation may be a grating display screen, which switches between different recommended contents by moving the grating quickly.

In some implementations, presenting the recommended content in a time-division manner may be performed by switching, at the presentation position and at a certain time interval, among the recommended contents corresponding to each user or each group of users. The time interval may be the persistence-of-vision duration of the human eye, so that the persistence of vision allows multiple recommended contents to be displayed.
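The switching pattern can be sketched as a round-robin schedule. Dividing one persistence-of-vision interval evenly among the contents, and the 0.1-second figure used in the example, are illustrative assumptions rather than values given in the description.

```python
from itertools import cycle, islice

def frame_schedule(contents, n_frames):
    """Round-robin order in which the display shows each content."""
    return list(islice(cycle(contents), n_frames))

def slot_duration(persistence_s, n_contents):
    """Display time per content so that all fit in one interval."""
    return persistence_s / n_contents

schedule = frame_schedule(["ad for group 1", "ad for group 2"], 5)
# 0.1 s is an assumed persistence-of-vision duration.
per_slot = slot_duration(0.1, 2)
```

On a grating display, each slot would coincide with the grating position that directs the image toward the corresponding user.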

In a further implementation, the gaze focus position of the user may also be tracked, and the presentation angle of the recommended content of each user or each group of users adjusted according to changes in that position. Specifically, while the recommended content is presented, the user's focus position and depth information are detected in real time through focus detection to determine changes in the user's field of view, and the direction of the grating is then adjusted accordingly, so that the recommended content always remains within the user's field of view.

With further reference to fig. 6, a schematic diagram of the principle of presenting recommended content in a time-division manner is shown. As shown in fig. 6, a camera 602 captures images, and a light-emitting diode (LED) projection array 601 presents images or video to users through a grating display screen 603. When the grating is moved to one position, a user whose left and right eyes are located in regions 611 and 612, respectively, sees a first type of image containing the content recommended for that user; when the grating is shifted to another position, a user whose left and right eyes are located in regions 613 and 614, respectively, sees a second type of image containing the content recommended for that user. The camera 602 may detect changes in the user's gaze focus position in real time, and the grating adjusts its angle to follow the movement of the user's gaze, ensuring that the displayed recommended content always remains within the user's field of view.

With the above method of displaying recommended content in a space-division or time-division manner, multiple personalized recommended contents can be displayed simultaneously to multiple users or user groups, which improves the utilization of the display position. In addition, the display position can be adjusted automatically through focus detection, making content recommendation more intelligent.

Further referring to fig. 7, a schematic structural diagram of a content recommendation device according to an embodiment of the present application is shown.

As shown in fig. 7, the content recommendation apparatus 700 may include an acquisition unit 701 and a synthesizing unit 702. The acquisition unit 701 may be configured to acquire user attribute information and/or environment attribute information. The synthesizing unit 702 may be configured to synthesize recommended content based on the user attribute information and/or the environment attribute information. In some embodiments, the synthesizing unit 702 may include a determining subunit 7021 and a synthesizing subunit 7022. The determining subunit 7021 is configured to determine candidate content elements based on the user attribute information and/or the environment attribute information acquired by the acquisition unit 701, where the candidate content elements may include candidate object elements and candidate scene elements. The synthesizing subunit 7022 may be configured to synthesize the recommended content based on the candidate object elements and the candidate scene elements determined by the determining subunit 7021.

In this embodiment, the acquisition unit 701 may extract user attribute information based on an image captured by a camera. The user attribute information may include individual attribute information and group attribute information. The individual attribute information may be obtained by analyzing the individual characteristics of each user, and may include the user's gender, age, race, dress style, makeup style, health status, posture, and the like. The group attribute information may be obtained from the relationships among a plurality of users, and may be social relationship information such as family relationships, couple relationships, friendships, and the like. In some implementations, the user attribute information may be obtained using a variety of classifiers or regressors.

In some implementations, the ways in which the acquisition unit 701 acquires the environment attribute information may include, but are not limited to, receiving current time information through a network and/or receiving, through the network, spatial information corresponding to the presentation location of the recommended content.

In some implementations, the determining subunit 7021 may determine the candidate content elements by constructing a global energy function using the recommendation model set, based on the user attribute information and the environment attribute information acquired by the acquisition unit 701, and solving the global energy function for an optimum. The recommendation model set may include at least one of a first recommendation model set that recommends object elements and/or scene elements according to the user attribute information, a second recommendation model set in which object elements and/or scene elements are jointly recommended, and a third recommendation model set that recommends object elements and/or scene elements according to the environment attribute information.

In a further implementation, the first set of recommendation models may be a set of sub-models of interest relationships between the user attribute information and attributes of the object elements and/or attributes of the scene elements, wherein the sub-models of interest relationships between at least one user attribute information and attributes of the object elements and/or attributes of the scene elements may be included. The second set of recommendation models may be a set of sub-models of interest relationships between attributes of object elements and/or attributes of scene elements, which may include at least one sub-model of interest relationships between attributes of object elements and/or attributes of scene elements. The third set of recommendation models may be a set of sub-models of interest relationships between the environment attribute information and the attributes of the object elements and/or the attributes of the scene elements, wherein at least one sub-model of interest relationships between the environment attribute information and the attributes of the object elements and/or the attributes of the scene elements may be included. The interest relationship may be represented by interestingness statistics. The method for obtaining the interestingness statistical data includes, but is not limited to: experience setting, questionnaire statistics, and online shopping website data statistics.

In some implementations, the synthesizing subunit 7022 may obtain a placement index, a direction index, and a motion trajectory index of the candidate scene element, and fuse the candidate object element and the candidate scene element according to these indexes to generate candidate recommended contents. The synthesizing subunit 7022 may then screen the candidate recommended contents according to the relevance between the candidate object elements and the candidate scene elements, the relevance between different candidate scene elements, and the relevance between different candidate object elements, and finally concatenate the screened candidate recommended contents in time order to form complete and smooth recommended content.
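
The screening and concatenation steps can be sketched as follows. This is a simplified illustration, not the procedure of the application: it uses only the object-to-scene relevance (the scene-to-scene and object-to-object relevance would be handled analogously), and the `Candidate` type, the `RELEVANCE` table, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    obj: str      # candidate object element
    scene: str    # candidate scene element it was fused into
    start: float  # hypothetical position on the timeline, in seconds


# Hypothetical object-to-scene relevance scores in [0, 1].
RELEVANCE = {
    ("handbag", "beach"): 0.9,
    ("razor", "beach"): 0.2,
    ("sunglasses", "beach"): 0.8,
}


def screen_and_concatenate(candidates, threshold=0.5):
    """Drop low-relevance candidates, then order the rest by time."""
    kept = [c for c in candidates if RELEVANCE.get((c.obj, c.scene), 0.0) >= threshold]
    return sorted(kept, key=lambda c: c.start)


clips = [
    Candidate("sunglasses", "beach", 5.0),
    Candidate("razor", "beach", 2.0),
    Candidate("handbag", "beach", 0.0),
]
print([c.obj for c in screen_and_concatenate(clips)])
```

Here the razor clip is screened out because of its low relevance to the beach scene, and the remaining clips are ordered by start time.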

In some embodiments, the content recommendation device may further include a presentation unit 703. The presentation unit 703 may be configured to present the recommended content synthesized by the synthesis unit 702 in a time-division presentation manner or a space-division presentation manner. In the time-division presentation manner, the recommended content may be presented on a display screen provided with a movable grating; by changing the angle of the grating, the recommended content corresponding to different users or user groups is switched within the persistence-of-vision time of the human eye. In the space-division presentation manner, the recommended content may be presented on a large-area screen or a curved screen; the screen is divided into a plurality of sub-areas, and the corresponding recommended content is presented in the sub-area that each user or user group is looking at. In a further implementation, changes in the gaze focus of the user or user group can be tracked, and the display position or angle of the recommended content can be adjusted in real time.
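
For the space-division presentation manner, a minimal sketch of mapping each user's gaze focus to a sub-area is given below. It assumes a one-dimensional split of the screen into equal-width columns, one column per user; the function names and the pixel coordinates are hypothetical and only illustrate the idea.

```python
def sub_area_index(gaze_x, screen_width, n_areas):
    """Return the index of the equal-width sub-area containing gaze_x."""
    if not 0 <= gaze_x <= screen_width:
        raise ValueError("gaze position is off-screen")
    width = screen_width / n_areas
    # Clamp so that a gaze exactly on the right edge falls in the last sub-area.
    return min(int(gaze_x // width), n_areas - 1)


def assign_sub_areas(gaze_positions, screen_width):
    """Map each user to a sub-area, using as many sub-areas as there are users."""
    n_areas = len(gaze_positions)
    return {user: sub_area_index(x, screen_width, n_areas)
            for user, x in gaze_positions.items()}


print(assign_sub_areas({"user_a": 100.0, "user_b": 1500.0}, 1920.0))
```

Calling `assign_sub_areas` again with updated gaze positions gives the real-time adjustment described above. The sketch does not resolve the case where two users look at the same sub-area; a real system would need an additional rule for that.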

The content recommendation device provided by the embodiment of the application can provide more targeted personalized content, provide more information in recommended content, and improve the utilization rate of the recommended content display position and the conversion rate of the recommended content.

It should be understood that the units described in the content recommendation device 700 correspond to the respective steps of the method described with reference to fig. 1-6. Thus, the operations and features described above for the method also apply to the content recommendation device 700 and the units included therein, and are not described herein again.

With further reference to fig. 8, a schematic structural diagram of a content recommendation system according to an embodiment of the present application is shown.

The content recommendation system 800 may include at least a processor 801 and a display device 802. The processor 801 may comprise the content recommendation device 700 described above in connection with fig. 7. The display device 802 may be configured to display the recommended content generated by the processor 801. It is to be understood that the processor may be a stand-alone processing unit for performing the content recommendation method. In some implementations, the content recommendation system may also include: an input device such as a keyboard or a mouse; a memory, such as a hard disk, for storing candidate object elements and candidate scene elements; a communication unit including a network interface card such as a LAN card, a modem, or the like, which performs communication processing via a network such as the internet; and a removable medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, so that a computer program read therefrom is installed in the memory as needed.

As another aspect, the present application also provides a computer-readable storage medium, which may be the computer-readable storage medium included in the apparatus in the above-described embodiments; or it may be a separate computer-readable storage medium that is not incorporated into the terminal device. The computer readable storage medium stores one or more programs, which may include program code for performing the methods illustrated in the flowcharts.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, apparatuses, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims

1. A method for recommending content, the method comprising:

acquiring user attribute information and environment attribute information;

synthesizing recommended content based on the user attribute information and the environment attribute information;

tracking a gaze focus position of a user; and

dynamically displaying the recommended content of the user at a display position corresponding to the gaze focus position of the user, wherein the dynamically displaying comprises: determining sub-areas corresponding to users on a display screen, wherein the sub-areas corresponding to different users are different; and displaying the recommended content of the user in the corresponding sub-area of the display screen, and changing the corresponding sub-area according to the movement of the user.

2. The method of claim 1, wherein the synthesizing recommended content comprises:

determining candidate content elements based on the user attribute information and/or the environment attribute information, the candidate content elements including candidate object elements and/or candidate scene elements; and

synthesizing the recommended content according to the candidate object elements and/or the candidate scene elements.

3. The method of claim 2, wherein the acquiring user attribute information comprises:

acquiring an image of an area where the display position of the recommended content is located;

determining a user as a service object from the image; and

performing individual analysis and group analysis on the user.

4. The method of claim 3, wherein the determining a user as a service object from the image comprises:

detecting a pedestrian in the image and a gaze focus position of the pedestrian;

judging whether the sight focus position of the pedestrian is located at the display position of the recommended content; and

if yes, determining that the pedestrian is the user serving as the service object.

5. The method of any of claims 3-4, wherein the performing an individual analysis of the user comprises:

dividing each user in the image into a plurality of sub-images according to the human body part; and

analyzing the sub-images using a classifier and/or a regressor to obtain the user attribute information.

6. The method of any of claims 3-4, wherein the performing a group analysis on the users comprises grouping the users, comprising:

classifying the plurality of users in the image according to social relations by adopting a classifier based on the clothing, the association degree of the postures and the relative position information of the plurality of users in the image; and/or

clustering the plurality of users in the image by using the user attribute information based on the individual analysis result.

7. The method of claim 1, wherein the acquiring user attribute information comprises:

capturing images of a plurality of users of an electronic device; dividing the plurality of users into a plurality of groups based on distances between the plurality of users; and obtaining group user attribute information corresponding to each of the plurality of groups; and the synthesizing recommended content comprises: providing group recommended content corresponding to each of the plurality of groups based on the group user attribute information and the environment attribute information.

8. The method of claim 1, wherein the environment attribute information comprises:

at least one of location information, time information, weather information, holiday information, and current trending events of the electronic device.

9. The method of claim 2, wherein the determining candidate content elements comprises: determining the candidate content elements using a recommendation model set based on the user attribute information and/or the environment attribute information,

wherein the recommendation model set includes at least one of:

a first recommendation model set that recommends object elements and/or scene elements according to the user attribute information, a second recommendation model set in which object elements and/or scene elements are jointly recommended, and a third recommendation model set that recommends object elements and/or scene elements according to the environment attribute information.

10. The method of claim 9, wherein:

the first recommendation model set comprises at least one sub-model of interest relationship between user attribute information and attributes of object elements and/or attributes of scene elements;

the second recommendation model set comprises at least one sub-model of interest relationship between attributes of object elements and/or attributes of scene elements;

the third recommendation model set comprises at least one sub-model of interest relationship between the environment attribute information and the attributes of the object elements and/or the attributes of the scene elements.

11. The method of claim 10, wherein determining candidate content elements using a set of recommendation models based on the user attribute information and/or the environment attribute information comprises:

training a sub-model in the recommendation model set based on the interestingness statistical data to determine parameters of the sub-model;

establishing a global energy function based on the recommendation model set;

carrying out global optimization solution on the global energy function to obtain candidate content elements enabling the global energy function to be optimal;

wherein the interestingness statistics comprise:

interestingness statistics of the user attribute information with respect to the attributes of the object elements and/or the attributes of the scene elements;

interestingness statistics between attributes of different object elements;

interestingness statistics between attributes of different scene elements;

interestingness statistics between attributes of the object elements and attributes of the scene elements; and

interestingness statistics of the environment attribute information with respect to the attributes of the object elements and/or the attributes of the scene elements.

12. The method of claim 11, wherein the global energy function comprises a first energy function, a second energy function, and a third energy function;

the first energy function comprises: based on the first recommendation model set, adopting a classifier and/or a regressor to calculate recommendation probabilities of the attributes of the object elements and/or the attributes of the scene elements corresponding to the user attribute information;

the second energy function comprises: based on the second recommendation model set, calculating the probability that the attributes of the object elements and/or the attributes of the scene elements are jointly recommended by adopting a classifier and/or a regressor;

the third energy function comprises: based on the third recommendation model set, adopting a classifier and/or a regressor to calculate recommendation probabilities of the attributes of the object elements and/or the attributes of the scene elements corresponding to the environment attribute information.

13. The method of claim 11, wherein the interestingness statistics are obtained by at least one of: experience setting, questionnaire statistics, and data statistics of online shopping websites.

14. The method of claim 2, wherein synthesizing recommended content from the candidate object elements and candidate scene elements comprises:

obtaining a placement index, a direction index and a motion trail index of the candidate scene element;

fusing the candidate object elements and the candidate scene elements according to the placement index, the direction index and the motion trail index to generate candidate recommended contents; and

screening the candidate recommended contents based on the correlation among the attributes of the candidate scene elements, the correlation among the attributes of the candidate object elements, and the correlation between the attributes of the candidate scene elements and the attributes of the candidate object elements, and fusing the screened candidate recommended contents to generate the recommended content.

15. The method of claim 7, further comprising:

presenting the recommended content in a time-division presentation manner or a space-division presentation manner.

16. The method of claim 15, wherein the presenting the recommended content in a space-division presentation manner comprises:

dividing the display position of the recommended content into sub-areas equal in number to the users/user groups; and

displaying the recommended content corresponding to each user/group of users in the sub-area where the gaze focus position of the user/group of users is located.

17. The method of claim 15, wherein the presenting the recommended content in a time-division presentation manner includes:

switching, at certain time intervals, the recommended content corresponding to each user/group of users at the display position of the recommended content.

18. The method of claim 17, wherein the presenting the recommended content in a time-division presentation further comprises:

tracking a gaze focus location of the user; and

adjusting the display angle of the recommended content corresponding to each user/group of users according to the change of the gaze focus position of the user.

19. The method of claim 2, wherein:

the object element comprises a commodity and the scene element comprises an advertisement element; and/or

the attributes of the object element include at least one of: the category, price, color, and brand of the commodity; and/or

the attributes of the scene element include at least one of: visual style, story line, suitable merchandise, people, time, place, and score.

20. The method of claim 7, further comprising:

the gaze focus position of each group of users is determined by: determining a gaze focus position of the group according to an average of gaze focus positions of a plurality of users within the group;

determining a display area of the screen corresponding to the gaze focus position of the group;

and displaying the group recommended content on the display area of the screen.

21. The method of claim 1, wherein the dynamically displaying comprises: adjusting at least one of a color, a position, a shape, and a size of the display area corresponding to the gaze focus position of the user.

22. The method of claim 1, wherein the acquiring environment attribute information comprises:

receiving the environment attribute information from a server or from an electronic device within a preset range of an electronic device recommending content to the user.

23. The method of claim 1, wherein the acquiring user attribute information and environment attribute information comprises:

acquiring the user attribute information based on an image, acquired by an image acquisition device, of each of a plurality of users of an electronic apparatus providing content to the users, and acquiring the environment attribute information based on ambient environment information acquired by one or more of the image acquisition device and a sensor.

24. The method according to claim 2, wherein the user attribute information is derived based on a plurality of sub-images derived by segmenting each user in the image by human body part; wherein the user attribute information includes information corresponding to each of a plurality of body parts of each user.

25. The method according to claim 2, wherein synthesizing recommended content according to the candidate object element and/or candidate scene element comprises:

synthesizing a plurality of pieces of recommended content according to the candidate object elements and/or the candidate scene elements.

26. The method of claim 1, wherein the user attribute information comprises at least one of: biometric information, gender, age, race, expression, skin condition, facial features, preferences, clothing style, accessory style, preferred brand, make-up style, health status, character, and purchasing power.

27. An apparatus for recommending content, the apparatus comprising:

an acquisition unit configured to acquire user attribute information and environment attribute information;

a synthesizing unit configured to synthesize recommended content based on the user attribute information and the environment attribute information; and

a presentation unit configured to track a gaze focus position of a user, and

to dynamically display the recommended content of the user at a display position corresponding to the gaze focus position of the user, wherein the dynamic display comprises: determining sub-areas corresponding to users on a display screen, wherein the sub-areas corresponding to different users are different; and displaying the recommended content of the user in the corresponding sub-area of the display screen, and changing the corresponding sub-area according to the movement of the user.

28. The apparatus of claim 27, wherein the synthesis unit comprises:

a determining subunit configured to determine candidate content elements based on the user attribute information and/or the environment attribute information, the candidate content elements including candidate object elements and/or candidate scene elements; and

a synthesizing subunit configured to synthesize the recommended content according to the candidate object elements and/or the candidate scene elements.

29. The apparatus of claim 28, wherein the manner in which the acquisition unit acquires the user attribute information comprises:

determining a user as a service object from the image; and

performing individual analysis and group analysis on the user.

30. The apparatus according to claim 29, wherein the manner in which the acquisition unit determines the user as the service object from the image includes:

31. The apparatus according to any one of claims 29-30, wherein the manner in which the acquisition unit performs individual analysis on the user comprises:

32. The apparatus according to any of claims 29-30, wherein the manner in which the acquisition unit performs group analysis on the users comprises grouping the users, and the grouping the users comprises:

clustering a plurality of users in the image by using the user attribute information based on the individual analysis result.

33. The apparatus of claim 27, wherein the manner in which the acquisition unit acquires the user attribute information comprises:

capturing images of a plurality of users of an electronic device; dividing the plurality of users into a plurality of groups based on distances between the plurality of users; and acquiring group user attribute information corresponding to each of the plurality of groups; and the synthesizing recommended content comprises: providing group recommended content corresponding to each of the plurality of groups based on the group user attribute information and the environment attribute information.

34. The apparatus of claim 27, wherein the environment attribute information comprises:

35. The apparatus of claim 28, wherein the manner in which the determining subunit determines the candidate content elements comprises: determining the candidate content elements using a recommendation model set based on the user attribute information and/or the environment attribute information,

wherein the recommendation model set includes at least one of:

a first recommendation model set that recommends object elements and/or scene elements according to the user attribute information, a second recommendation model set in which object elements and/or scene elements are jointly recommended, and a third recommendation model set that recommends object elements and/or scene elements according to the environment attribute information.

36. The apparatus of claim 35,

37. The apparatus of claim 36, wherein the manner in which the determining subunit determines the candidate content elements using the recommendation model set based on the user attribute information and/or the environment attribute information comprises:

training the submodels in the recommendation model set based on the interestingness statistical data to determine parameters of the submodels;

establishing a global energy function based on the recommendation model set;

wherein the interestingness statistics comprise:

interestingness statistics between attributes of different object elements;

interestingness statistics between attributes of different scene elements;

and interestingness statistics of the environment attribute information with respect to the attributes of the object elements and/or the attributes of the scene elements.

38. The apparatus of claim 37, wherein the global energy function comprises a first energy function, a second energy function, and a third energy function;

39. The apparatus of claim 37, wherein the interestingness statistic is obtained by at least one of: experience settings, questionnaire statistics, and data statistics of online shopping websites.

40. The apparatus of claim 28, wherein the manner in which the synthesizing subunit synthesizes the recommended content according to the candidate object elements and the candidate scene elements comprises:

41. The apparatus of claim 33, wherein the presentation unit is configured to present the recommended content in a time-division presentation or a space-division presentation.

42. The apparatus of claim 41, wherein the manner in which the presentation unit presents the recommended content in the space-division presentation manner comprises:

displaying the recommended content corresponding to each user/group of users in the sub-area where the gaze focus position of the user/group of users is located.

43. The apparatus of claim 41, wherein the manner in which the presentation unit presents the recommended content in the time-division presentation manner comprises:

44. The apparatus of claim 43, wherein the manner in which the presentation unit presents the recommended content in the time-division presentation manner further comprises:

tracking a gaze focus position of the user; and

45. The apparatus of claim 28,

the attributes of the object element include at least one of: the category, price, color, and brand of the commodity; and/or

the attributes of the scene element include at least one of: visual style, story line, suitable merchandise, people, time, place, and score.

46. The apparatus of claim 33, further comprising:

47. The apparatus of claim 27, wherein the dynamic display comprises: adjusting at least one of a color, a position, a shape, and a size of the display area corresponding to the gaze focus position of the user.

48. The apparatus of claim 27, wherein the acquisition unit is further configured to:

receive the environment attribute information from a server or from an electronic device within a preset range of the electronic device recommending the content to the user.

49. The apparatus of claim 27, wherein the acquisition unit is further configured to:

acquire the user attribute information based on an image, acquired by an image acquisition device, of each of a plurality of users of an electronic apparatus providing content to the users, and acquire the environment attribute information based on ambient environment information acquired by one or more of the image acquisition device and a sensor.

50. The apparatus of claim 28, wherein the user attribute information is derived based on a plurality of sub-images obtained by segmenting each user in the image by human body part; wherein the user attribute information includes information corresponding to each of a plurality of body parts of each user.

51. The apparatus of claim 28, wherein the synthesis subunit is further configured to:

52. The apparatus of claim 27, wherein the user attribute information comprises at least one of: biometric information, gender, age, race, expression, skin condition, facial features, preferences, clothing style, accessory style, preferred brand, make-up style, health status, character, and purchasing power.

53. A content recommendation system, characterized in that the system comprises a processor and a display device;

the display device is configured to display recommended content;

the processor comprises a content recommendation device according to any of claims 27-52.

54. An electronic device comprising a processor and a display device;

the display device is configured to display recommended content;

the processor is configured to perform the method of any of claims 1-26.