Disclosure of Invention
The application provides an image recognition method and device that improve the recognition accuracy for cross-age face images.
In order to achieve the above purpose, the application adopts the following technical scheme:
In a first aspect, an image recognition method is provided. The method may include: acquiring an identity feature of a target face image, where the identity feature is the feature other than the age feature among the identifiable features in a face image; and selecting, according to the identity feature of the target face image and a sample library, a first sample face image in the sample library as a recognition result of the target face image. The sample library includes face features and age features of one or more sample face images; a face feature is an identifiable feature in a face image, and an age feature indicates the photographing age of the person in a face image. The first sample face image is a sample face image in the sample library whose face feature and the first feature of the target face image relative to that sample face image satisfy a first condition. The first feature of the target face image relative to the first sample face image is the sum of the identity feature of the target face image and the age feature of the first sample face image.
According to the image recognition method provided by the application, the identity characteristic of the target face image is added with the age characteristic of the sample face image in the sample library to obtain the first characteristic of the target face image, and the first characteristic of the target face image is compared with the face characteristic in the sample library to obtain the recognition result. Because the first features are aligned with the face features of the sample facial images in the sample library in terms of age, the identification process of the application is equivalent to converting the comparison of the cross ages into the comparison of the same ages, so that the identification process is equivalent to compensating the influence of the age features on the identification precision, and the precision of the cross-age face identification is improved.
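The comparison described above can be illustrated with a minimal sketch. It assumes that features are plain numeric vectors, that the first condition is "Euclidean distance less than or equal to a threshold", and that the sample library is a dictionary keyed by a sample ID; the names `recognize`, `sample_library`, and the toy values are hypothetical and are not taken from the application.

```python
import math

def add(u, v):
    """Element-wise sum of two equal-length feature vectors."""
    return [a + b for a, b in zip(u, v)]

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def recognize(identity_feature, sample_library, threshold):
    """Return the IDs of samples that satisfy the (assumed) first condition.

    For each sample, the first feature of the target is the target's
    identity feature plus that sample's age feature; it is then compared
    with the sample's face feature.
    """
    matches = []
    for sample_id, sample in sample_library.items():
        first_feature = add(identity_feature, sample["age"])
        if euclidean(first_feature, sample["face"]) <= threshold:
            matches.append(sample_id)
    return matches

# Toy 2-D features, made up purely for illustration.
sample_library = {
    "s1": {"face": [1.0, 2.0], "age": [0.5, 0.5]},
    "s2": {"face": [4.0, 0.0], "age": [0.2, 0.1]},
}
target_identity = [0.5, 1.4]
result = recognize(target_identity, sample_library, threshold=0.2)
```

With these toy values only sample `s1` falls within the threshold, because adding its age feature to the target's identity feature lands close to its stored face feature.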
The first condition may be configured according to a requirement of a user, which is not particularly limited in the present application.
With reference to the first aspect, in one possible implementation manner, the identity feature of the target face image may be obtained by inputting the target face image into a first neural network. The first neural network may be a convolutional neural network trained to convergence, a fully-connected neural network trained to convergence, or another neural network.
With reference to the first aspect or any one of the foregoing possible implementation manners, in one possible implementation manner, according to the acquired identity information of the target face image, the feature of each sample face image in the sample library is compared in parallel, that is, the feature of each sample face image is respectively compared, so as to select a first sample face image in the sample library.
With reference to the first aspect or any one of the foregoing possible implementation manners, in one possible implementation manner, according to the acquired identity information of the target face image, the identity information is compared with the features of the sample face images in the sample library according to a preset sequence, so as to realize selecting the first sample face image in the sample library. The preset sequence can be configured according to actual requirements, and the application is not limited.
For example, the preset sequence is a sequence from small to large, or a sequence from present to previous photographing time, or other sequences, and the application is not limited.
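As a rough illustration of comparison in a preset sequence, the sketch below sorts the samples by photographing time, most recent first, and returns the first sample that satisfies the first condition. Stopping at the first match, the Euclidean threshold test, and the field names `photo_year`, `face`, and `age` are assumptions made for this example only.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def first_match_in_order(identity_feature, samples, threshold, sort_key, reverse=False):
    """Compare against the samples in a preset order and return the first match.

    Returns the ID of the first sample that meets the (assumed) first
    condition, or None when no sample meets it.
    """
    for sample in sorted(samples, key=sort_key, reverse=reverse):
        first_feature = [a + b for a, b in zip(identity_feature, sample["age"])]
        if euclidean(first_feature, sample["face"]) <= threshold:
            return sample["id"]
    return None

# Toy samples; both happen to satisfy the threshold for this target.
samples = [
    {"id": "old", "photo_year": 2005, "face": [1.0, 1.0], "age": [0.0, 0.5]},
    {"id": "new", "photo_year": 2020, "face": [1.0, 1.5], "age": [0.0, 1.0]},
]
by_photo_year = lambda s: s["photo_year"]

# Preset sequence: from the most recent photographing time to earlier ones.
hit_recent_first = first_match_in_order([1.0, 0.5], samples, 0.1, by_photo_year, reverse=True)
# Preset sequence: from the earliest photographing time to later ones.
hit_oldest_first = first_match_in_order([1.0, 0.5], samples, 0.1, by_photo_year)
```

Because both toy samples satisfy the threshold, the chosen order alone decides which one is returned, which is why the preset sequence matters when the comparison stops early.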
With reference to the first aspect or any one of the foregoing possible implementation manners, in a possible implementation manner, the method may further include: acquiring a photographing age range of a target face image; the sample library comprises face features and age features of sample face images of which the current ages are in a photographing age range. In the possible implementation manner, the sample library is the face features and the age features of the sample face images of the current age within the photographing age range after screening, so that the selection range is reduced, the processing efficiency of image recognition is improved, and meanwhile, the accuracy of image recognition is improved.
With reference to the first aspect or any one of the foregoing possible implementation manners, in one possible implementation manner, the photographing age range of the target face image may be obtained by inputting the target face image into a fifth neural network. The fifth neural network may be a convolutional neural network trained to convergence, a fully-connected neural network trained to convergence, or another neural network.
With reference to the first aspect or any one of the foregoing possible implementation manners, in one possible implementation manner, the image recognition device subtracts the birth date of the person in each sample face image in the base library from the current date to obtain the current age of each sample face image.
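The subtraction of a birth date from the current date can be sketched as follows. Expressing the result in whole years is an assumption of this example, and the function name `current_age` is hypothetical.

```python
from datetime import date

def current_age(birth_date, current_date):
    """Current age in whole years: current date minus birth date."""
    years = current_date.year - birth_date.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (current_date.month, current_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

age_before_birthday = current_age(date(1990, 6, 15), date(2024, 6, 14))
age_on_birthday = current_age(date(1990, 6, 15), date(2024, 6, 15))
```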
With reference to the first aspect or any one of the foregoing possible implementation manners, in one possible implementation manner, the image recognition device subtracts a photographing time of each sample face image in the base from a current date to obtain a current age of each sample face image.
With reference to the first aspect or any one of the foregoing possible implementation manners, in one possible implementation manner, the current date may be obtained according to attribute information of the target face image.
With reference to the first aspect or any one of the foregoing possible implementation manners, in one possible implementation manner, the first condition may include: the distance is less than or equal to a threshold or the similarity is greater than or equal to a limit or otherwise. The threshold and the limit may be set according to experience of a user, which is not limited in the present application.
With reference to the first aspect or any one of the foregoing possible implementation manners, in one possible implementation manner, the similarity may include cosine similarity.
With reference to the first aspect or any one of the foregoing possible implementation manners, in one possible implementation manner, the distance may include any one of the following: Euclidean distance, Mahalanobis distance.
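For reference, the similarity and distance measures named above can be written out as a short sketch over plain Python lists. The function names are illustrative, and the Mahalanobis variant assumes that the caller supplies an inverse covariance matrix `inv_cov`, since the application does not state how it would be estimated.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (undefined for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line distance between u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def mahalanobis_distance(u, v, inv_cov):
    """sqrt(d^T * inv_cov * d) with d = u - v; inv_cov is a square matrix."""
    diff = [a - b for a, b in zip(u, v)]
    weighted = [sum(row[j] * diff[j] for j in range(len(diff))) for row in inv_cov]
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))

identity = [[1.0, 0.0], [0.0, 1.0]]
d_euclid = euclidean_distance([0.0, 0.0], [3.0, 4.0])
d_mahal_identity = mahalanobis_distance([0.0, 0.0], [3.0, 4.0], identity)
d_mahal_scaled = mahalanobis_distance([2.0, 0.0], [0.0, 0.0], [[0.25, 0.0], [0.0, 1.0]])
sim_orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

With the identity matrix as `inv_cov`, the Mahalanobis distance reduces to the Euclidean distance, which is a convenient sanity check.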
With reference to the first aspect and the foregoing possible implementation manner, in another possible implementation manner, there are multiple candidate sample face images in the sample library, where a candidate sample face image is a sample face image in the sample library whose face feature and the first feature of the target face image relative to that candidate sample face image satisfy the first condition. In this case, selecting a first sample face image in the sample library as a recognition result of the target face image includes: selecting the candidate sample face image that satisfies a second condition as the first sample face image. In this possible implementation manner, the candidate sample face images are further screened, so the resulting first sample face image has a higher similarity to the target face image, which further improves the accuracy of the image recognition method.
The second condition may be configured according to the needs of the user, which is not limited by the present application.
With reference to the first aspect and one of the foregoing possible implementation manners, in another possible implementation manner, the second condition may include a minimum distance.
With reference to the first aspect and one of the foregoing possible implementation manners, in another possible implementation manner, the second condition may include a maximum similarity.
With reference to the first aspect and one of the foregoing possible implementation manners, in another possible implementation manner, the second condition may include that a difference between a current age of the person in the candidate face image and a reference value of a photographed age range of the target face image is minimum. Wherein the reference value may be the middle value of the range or other values.
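Two of the second conditions listed above can be sketched as simple selections over the candidates. The candidate records, their field names, and the use of the midpoint of the range as the reference value are assumptions for illustration.

```python
def select_by_min_distance(candidates):
    """Second condition: the candidate with the minimum distance."""
    return min(candidates, key=lambda c: c["distance"])

def select_by_age_gap(candidates, age_range):
    """Second condition: current age closest to the reference value.

    The reference value is taken here as the middle value of the
    photographing age range of the target face image.
    """
    low, high = age_range
    reference = (low + high) / 2
    return min(candidates, key=lambda c: abs(c["current_age"] - reference))

# Toy candidates that are assumed to have already met the first condition.
candidates = [
    {"id": "s1", "distance": 0.30, "current_age": 41},
    {"id": "s2", "distance": 0.12, "current_age": 52},
    {"id": "s3", "distance": 0.25, "current_age": 46},
]
closest = select_by_min_distance(candidates)["id"]
closest_in_age = select_by_age_gap(candidates, (40, 50))["id"]
```

The two conditions can pick different candidates, as they do here, so the choice of second condition is a real configuration decision rather than a formality.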
In a second aspect, an image recognition apparatus is provided. The apparatus may be a server in an image recognition system, an apparatus or a chip system in a server, or an apparatus that can be used in cooperation with a server. The image recognition apparatus can implement the functions performed in the foregoing aspects or possible designs, and the functions may be implemented by hardware or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions. For example, the image recognition apparatus may include a first acquisition unit and a processing unit.
The first acquisition unit is used for acquiring identity features of the target face image, wherein the identity features are features except for age features in identifiable features in the face image.
And the processing unit is used for selecting a first sample face image in the sample library as a recognition result of the target face image according to the identity characteristics of the target face image and the sample library.
The sample library includes face features and age features of one or more sample face images; a face feature is an identifiable feature in a face image, and an age feature indicates the photographing age of the person in a face image. The first sample face image is a sample face image in the sample library whose face feature and the first feature of the target face image relative to that sample face image satisfy a first condition. The first feature of the target face image relative to the first sample face image is the sum of the identity feature of the target face image and the age feature of the first sample face image.
According to the image recognition device provided by the application, the identity characteristic of the target face image is added with the age characteristic of the sample face image in the sample library to obtain the first characteristic of the target face image, and the first characteristic of the target face image is compared with the face characteristic in the sample library to obtain the recognition result. Because the first features are aligned with the face features of the sample facial images in the sample library in terms of age, the identification process of the application is equivalent to converting the comparison of the cross ages into the comparison of the same ages, so that the identification process is equivalent to compensating the influence of the age features on the identification precision, and the precision of the cross-age face identification is improved.
It should be noted that, the image recognition apparatus provided in the second aspect is configured to perform the image recognition method provided in the first aspect, and the specific implementation may refer to the specific implementation of the first aspect.
In a third aspect, an embodiment of the present application provides an image recognition apparatus, which may include a processor and a memory. The processor is coupled to the memory, and the memory is operable to store computer program code comprising computer instructions which, when executed by the image recognition apparatus, cause the image recognition apparatus to perform the image recognition method described in the first aspect or any one of its possible implementations.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, which may include: computer software instructions; the computer software instructions, when run in a computer, cause the computer to perform the method of image recognition as claimed in the first aspect or any of the possible implementations of the first aspect.
In a fifth aspect, embodiments of the application provide a computer program product which, when run on a computer, causes the computer to perform the method of image recognition as claimed in the first aspect or any of the possible implementations.
In a sixth aspect, an embodiment of the present application provides a chip system, where the chip system is applied to an image recognition apparatus. The chip system includes an interface circuit and a processor, which are interconnected through a circuit. The interface circuit is used for receiving signals from a memory in the image recognition apparatus and sending the signals to the processor, where the signals include computer instructions stored in the memory. When the processor executes the computer instructions, the chip system performs the image recognition method described in the first aspect or any one of its possible implementations.
It should be appreciated that the description of technical features, aspects, benefits or similar language in the present application does not imply that all of the features and advantages may be realized with any single embodiment. Conversely, it should be understood that the description of features or advantages is intended to include, in at least one embodiment, the particular features, aspects, or advantages. Therefore, the description of technical features, technical solutions or advantageous effects in this specification does not necessarily refer to the same embodiment. Furthermore, the technical features, technical solutions and advantageous effects described in the present embodiment may also be combined in any appropriate manner. Those of skill in the art will appreciate that an embodiment may be implemented without one or more particular features, aspects, or benefits of a particular embodiment. In other embodiments, additional features and advantages may be recognized in certain embodiments that may not be present in all embodiments.
Detailed Description
The terms first, second, third and the like in the description and in the claims and in the above-described figures, are used for distinguishing between different objects and not necessarily for limiting a particular order.
In embodiments of the application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "e.g." in an embodiment should not be taken as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion that may be readily understood.
In the description of the present application, unless otherwise indicated, "/" means that the associated objects are in an "or" relationship; for example, A/B may represent A or B. The term "and/or" in the present application merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate three cases: A alone, both A and B, and B alone, where A and B may be singular or plural. Also, in the description of the present application, unless otherwise indicated, "a plurality" means two or more. "At least one of" or a similar expression means any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or plural.
In the embodiment of the present application, at least one may also be described as one or more, and a plurality may be two, three, four or more, and the present application is not limited thereto.
To facilitate understanding, the terms involved in the present application are explained first.
A facial image may refer to an image that contains the face of a person. A facial image may also be referred to as a face image, a human face image, or a face picture.
The target face image may refer to a face image as a recognition target in the image recognition method. For example, the target face image may be acquired in real time for the image acquisition device, or may be a face image input by an administrator.
Features may refer to mathematical representations used to quantify facial images. Illustratively, the feature may be a vector or matrix.
Face features may refer to the identifiable features in a facial image. Specifically, a face feature may be the sum of an age feature and an identity feature. Face features may be extracted through a neural network.
Age features may be features that indicate age. Age features may be extracted through a neural network, or may be obtained through age conversion.
Identity features may refer to the features other than age features among the identifiable features in a facial image. Identity features may be extracted through a neural network.
The base library may be a database comprising one or more sample facial images or features of one or more sample facial images. The features of a sample facial image may include facial features and age features of the sample facial image. The base may further include age information for each of the sample facial images, which may include one or more of the following: age of photograph, time of photograph, date of birth.
A sample library may be used to store one or more sample facial images, or features of one or more sample facial images, for comparison in the image recognition method. The features of a sample facial image may include the face features and age features of the sample facial image.
The photographing age may refer to the age of the person at the time of photographing the face image.
The current age may refer to the age of the person in the facial image at the current date. Specifically, the current age of a sample facial image in the base may be the current date minus the birth date of the person in the sample facial image.
The recognition result of the target face image may refer to a sample face image similar to the target face image.
Currently, there are two main methods for implementing cross-age face recognition.
Method 1: implementing cross-age face recognition by erasing age information from face information.
The implementation process of the method 1 can be as follows: firstly, a general face database and a cross-age face database are acquired, wherein the cross-age face database comprises a plurality of face images classified according to ages; training a neural network model by using the universal face database and the age-span face database classified according to ages, so that the neural network can erase age characteristics in the face image; and then, inputting the face images (target face images and sample face images) to be compared into the neural network model, and determining the result of the cross-age face recognition of the target face images by judging the similarity between the identity features of the neural network output after the age features are removed.
Method 2: implementing cross-age face recognition by means of an age reference dictionary.
The implementation process of the method 2 can be as follows: firstly, constructing a cross-age reference dictionary according to an external face database, wherein the cross-age reference dictionary contains high-level features of different local blocks of different age groups; then, obtaining the high-level characteristics of each local block of the face image to be identified; and coding and pooling the high-level features of each local block of the face images (the target face image and the sample face image) to be compared with the high-level features of different local blocks of different age groups in the cross-age reference dictionary to obtain the face features with blurred age information, and finally determining the cross-age face recognition result of the target face image according to the similarity between the face features with blurred age information.
However, Method 1 erases not only the age information but also other information that is highly coupled with age, such as wrinkles, so information is lost. Method 2 introduces noise along with the age reference dictionary, and because Method 2 depends on how the age reference dictionary is built, differences between dictionaries make the final recognition result uncertain and affect the stability of the system. A face can change greatly as age increases, yet neither method takes the age factor into account when performing cross-age face recognition; the influence of age on the face image, that is, the influence of the age feature on recognition accuracy, is ignored, and the recognition accuracy is therefore not high.
Based on the above, the embodiment of the application provides an image recognition method, which is characterized in that the identity characteristic of a target face image is added with the age characteristic of a sample face image in a sample library to obtain the first characteristic of the target face image, and the first characteristic of the target face image is compared with the face characteristic in the sample library to obtain a recognition result. Because the first features are aligned with the face features of the sample facial images in the sample library in terms of age, the identification process of the application is equivalent to converting the comparison of the cross ages into the comparison of the same ages, so that the identification process is equivalent to compensating the influence of the age features on the identification precision, and the precision of the cross-age face identification is improved.
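The age alignment can be made explicit with a short derivation. It assumes, as in the term definitions above, that a face feature is exactly the sum of an identity feature and an age feature; the symbols $t$ (target face image), $s$ (sample face image), $f_{\text{id}}$, $f_{\text{age}}$, $f_{\text{face}}$, and $f_{\text{first}}$ are notation introduced here for illustration only.

```latex
\begin{aligned}
f_{\text{first}}(t, s) &= f_{\text{id}}(t) + f_{\text{age}}(s), \\
f_{\text{face}}(s)     &= f_{\text{id}}(s) + f_{\text{age}}(s), \\
f_{\text{first}}(t, s) - f_{\text{face}}(s) &= f_{\text{id}}(t) - f_{\text{id}}(s).
\end{aligned}
```

Under this assumption, the difference between the two compared features depends only on the identity features, while both compared features carry the same age component. With features extracted by trained networks, the decomposition is likely to hold only approximately. The cancellation applies directly to difference-based measures such as the Euclidean distance; a cosine similarity is computed on the full vectors and does not reduce in the same way.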
The following describes in detail the implementation of the embodiment of the present application with reference to the drawings.
The image recognition method provided by the embodiment of the application can be applied to the image recognition system shown in fig. 1. As shown in fig. 1, the image recognition system may include a server 101 and an administrator 102.
Wherein an administrator 102 is used to manage the server 101. Alternatively, the administrator 102 may manage the server 101 through a terminal device, or the administrator 102 may directly manage the server 101.
The server 101 is used for image recognition according to the scheme provided by the application. Wherein a sample library is stored in the server 101. The target face image for image recognition by the server 101 may be input by the administrator 102.
Further, the server 101 may also display the result of the image recognition to the administrator 102, or the server 101 may display the message "no recognition result" to the administrator 102.
Alternatively, the server 101 may include a display screen, and the server 101 displays the result of the image recognition to the administrator 102 through the display screen. Or the server 101 may display the result of the image recognition to the administrator 102 through the screen of the terminal device.
The server 101 may be a physical server, or a cloud server, or other device having data processing and storage capabilities, which is not limited by the present application.
Optionally, as shown in fig. 1, an image collector 103 may be further included in the image recognition system, and is configured to collect a facial image, and upload the collected facial image to the server 101. Accordingly, the target face image for image recognition by the server 101 may be uploaded by the image collector 103.
The image collector 103 may be an independent camera, a mobile phone camera, a computer camera, a dome camera, a camera of an augmented reality (AR)/virtual reality (VR) device, or the like, which is not limited in the present application.
By way of example, the image recognition system illustrated in fig. 1 may be used to find lost elderly people and children, or to apprehend long-term fugitives, etc.
The following describes the embodiments of the present application with reference to the drawings.
In one aspect, an embodiment of the present application provides an image recognition apparatus for performing the image recognition method provided by the present application. The image recognition device may be disposed in the server 101 shown in fig. 1, and the image recognition device may be part or all of the server 101. Or the image recognition device may be deployed separately, for example, as an electronic device or a system-on-chip with associated data processing and storage capabilities.
Fig. 2 illustrates an image recognition apparatus 20 according to an embodiment of the present application. As shown in fig. 2, the image recognition device 20 may include a processor 201, a memory 202, and a transceiver 203.
The following describes the respective constituent elements of the image recognition apparatus 20 in detail with reference to fig. 2:
The memory 202 may be a volatile memory, such as a random-access memory (RAM); or a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or a combination of the foregoing types of memory. The memory 202 is used for storing program code, configuration files, data information, image information, or other content that can implement the methods of the present application.
The processor 201 is the control center of the image recognition apparatus 20. For example, the processor 201 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present application, such as one or more digital signal processors (DSPs) or one or more field-programmable gate arrays (FPGAs).
The transceiver 203 is used for information interaction of the image recognition apparatus 20 with other devices. For example, the transceiver 203 is used for information interaction between the image recognition device 20 and the image collector.
Optionally, as shown in fig. 2, the image recognition device 20 may further include an image collector 204. The image collector 204 is used to collect face images of persons.
Specifically, the processor 201 performs the following functions by running or executing software programs and/or modules stored in the memory 202:
Acquiring an identity feature of a target face image, where the identity feature is the feature other than the age feature among the identifiable features in a face image; and selecting, according to the identity feature of the target face image and a sample library, a first sample face image in the sample library as a recognition result of the target face image. The sample library includes face features and age features of one or more sample face images; a face feature is an identifiable feature in a face image, and an age feature indicates the photographing age of the person in a face image. The first sample face image is a sample face image in the sample library whose face feature and the first feature of the target face image relative to that sample face image satisfy a first condition. The first feature of the target face image relative to the first sample face image is the sum of the identity feature of the target face image and the age feature of the first sample face image.
In another aspect, an embodiment of the present application provides an image recognition method, as shown in fig. 3, the method may include:
S301, the image recognition device acquires the identity characteristic of the target face image.
Specifically, S301 may include, but is not limited to, step a and step B.
Step A: the image recognition device acquires a target face image.
In one possible implementation, the image recognition device may use a face image collected and uploaded by the image collector as the target face image.
In another possible implementation, the image recognition apparatus may take the face image input by the administrator as the target face image.
It should be noted that there may be one or more target face images. The embodiments of the present application are described using one target face image as an example; other cases are similar and are not described again.
Step B: the image recognition device acquires the identity characteristics of the target face image.
Specifically, the image recognition device may input the target face image into a first neural network for extracting identity features, and the output of the first neural network is the identity features of the target face image.
The first neural network may be a convolutional neural network or a fully-connected neural network or others. The first neural network is a neural network trained to converge, and the method for training the first neural network is not limited in the present application.
Alternatively, the first neural network may be trained by the image recognition device, or trained by other devices to converge and then be configured in the image recognition device.
By way of example, fig. 4 illustrates a process of training the first neural network with the assistance of an adversarial network. The process may be as follows: a face image is input into the first neural network, and the first neural network outputs face features corresponding to the face image; the face features are input into the adversarial network, and the adversarial network outputs an age range corresponding to the face image; the adversarial network obtains age gradient information according to the age range, where the age gradient information does not belong to the age range corresponding to the face image; the adversarial network transmits the age gradient information back to the first neural network, which interferes with the first neural network generating age-related features from the face image. The first neural network may be trained in the same way with multiple sets of face images until it converges.
S302, the image recognition device selects a first sample facial image in a sample library as a recognition result of the target facial image according to the identity characteristics of the target facial image and the sample library.
The sample library may include facial features and age features of one or more sample facial images, among others.
In particular, the sample library described herein may be determined according to actual requirements, and the manner of determining the sample library is not limited in the present application. Alternatively, the manner in which the sample library is determined may include, but is not limited to, the following two ways.
The first implementation: the base library is converted to a sample library.
In a possible implementation manner, the base library includes a plurality of sample face images. Each sample face image in the base library may be input into a neural network to extract its face features and age features, and the face features and age features of each sample face image in the base library are recorded as the sample library.
In another possible implementation manner, the base library includes a plurality of sample face images and the photographing age of each sample face image. Each sample face image in the base library may be input into a neural network to extract its face features, the photographing age of each sample face image in the base library may be input into a neural network to obtain its age features, and the face features and age features of each sample face image in the base library are recorded as the sample library.
In another possible implementation manner, the base library includes a plurality of sample face images and the birth date and photographing time of each sample face image. The photographing age of each sample face image can be obtained by subtracting the birth date from the photographing time. Each sample face image in the base library is input into a neural network to extract its face features, the photographing age of each sample face image is input into a neural network to obtain its age features, and the face features and age features of each sample face image in the base library are recorded as the sample library.
In yet another possible implementation, the base may also include face features and age features of a plurality of sample facial images, and then in the first implementation, the base may be directly used as the sample base.
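The photographing-age calculation used when the library stores a birth date and a photographing time can be sketched as follows. This is a minimal illustration only: the function name `photographing_age` and the choice to count whole years are assumptions, since the text only says that the birth date is subtracted from the photographing time.

```python
from datetime import date


def photographing_age(birth_date: date, photo_date: date) -> int:
    """Return the age in whole years at the time the photo was taken.

    Hypothetical helper: the source only states that the birth date is
    subtracted from the photographing time.
    """
    if photo_date < birth_date:
        raise ValueError("photographing time precedes birth date")
    years = photo_date.year - birth_date.year
    # Subtract one year if the birthday had not yet occurred in the photo year.
    if (photo_date.month, photo_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


print(photographing_age(date(1998, 6, 15), date(2019, 3, 1)))  # 20
```

The resulting age would then be passed to the network that maps a photographing age to an age feature.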
The second implementation: the sample library is obtained from the base library by age screening.
In the second implementation, according to the photographing age range of the target face image, the face features and the age features of the sample face images whose current age falls within the photographing age range are selected from the base library and used as the sample library. For the specific procedure of the second implementation, refer to S303 and S304 below.
Specifically, in S302 the image recognition device may compare the identity features of the target face image obtained in S301 with each sample face image in the sample library, so as to select the first sample face image in the sample library. Optionally, the comparison may be performed in, but is not limited to, the following two comparison modes:
Comparison mode a: the image recognition device compares the identity features of the target face image acquired in S301 with the features of each sample face image in the sample library in parallel, that is, with the features of every sample face image respectively, so as to select the first sample face image in the sample library in S302.
Comparison mode b: the image recognition device compares the identity features of the target face image obtained in S301 with the features of the sample face images in the sample library one by one in a preset sequence, so as to select the first sample face image in the sample library in S302. The preset sequence can be configured according to actual requirements, which is not limited in the present application. For example, the preset sequence may be a sequence from small to large, a sequence from the most recent photographing time to the earliest, or another sequence.
Here, the process by which the image recognition device compares the identity features of the target face image acquired in S301 with the features of a sample face image is the same for every sample face image in the sample library. Taking the comparison with one sample face image in the sample library (the second sample face image) as an example, the comparison process may include S3021 to S3023.
S3021, the image recognition apparatus acquires a first feature of the target face image with respect to the second sample face image.
Wherein the first characteristic of the target face image relative to the second sample face image is a sum of an identity characteristic of the target face image and an age characteristic of the second sample face image.
It should be noted that the user may, according to experience, adjust the parameters of the neural network for obtaining the identity features, the neural network for obtaining the age features, and the neural network for obtaining the face features, so that the identity features, the age features, and the face features output by these neural networks have the same vector dimension and can be used directly in the related computations.
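The sum in S3021 can be illustrated with a short sketch. Feature vectors are represented here as plain lists of floats, and the function name `first_feature` is hypothetical; the only behavior taken from the text is the element-wise sum and the requirement that both vectors have the same dimension.

```python
from typing import List, Sequence


def first_feature(identity: Sequence[float], sample_age: Sequence[float]) -> List[float]:
    """Element-wise sum of the target's identity feature and a sample's age feature."""
    if len(identity) != len(sample_age):
        # The networks are assumed to be configured to output equal dimensions.
        raise ValueError("identity and age features must have the same dimension")
    return [i + a for i, a in zip(identity, sample_age)]


identity = [0.5, -1.0, 2.0]      # illustrative identity feature of the target
age_feature = [0.25, 0.5, -1.0]  # illustrative age feature of one sample
print(first_feature(identity, age_feature))  # [0.75, -0.5, 1.0]
```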
S3022, the image recognition apparatus determines whether the first condition is satisfied by the first feature of the target face image with respect to the second sample face image and the face feature of the second sample face image.
The first condition can be configured according to actual requirements.
For example, the first condition may include a distance being less than or equal to a threshold, a similarity being greater than or equal to a limit, or another condition. The threshold and the limit may be set according to the experience of a user, which is not limited in the present application.
The distance may be any of the following: Euclidean distance or Mahalanobis distance. The similarity may be cosine similarity.
If it is determined in S3022 that the first feature of the target face image with respect to the second sample face image and the face feature of the second sample face image satisfy the first condition, the second sample face image is taken as the candidate sample face image.
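A minimal sketch of the check in S3022 is given below, covering the Euclidean-distance and cosine-similarity variants of the first condition. The function names and the keyword parameters `max_distance` and `min_similarity` are illustrative assumptions; the Mahalanobis distance is omitted because it would require a covariance matrix that the text does not specify.

```python
import math
from typing import Optional, Sequence


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def meets_first_condition(
    first_feat: Sequence[float],
    face_feat: Sequence[float],
    *,
    max_distance: Optional[float] = None,
    min_similarity: Optional[float] = None,
) -> bool:
    """True if distance <= threshold, or similarity >= limit, whichever is configured."""
    if max_distance is not None:
        return euclidean_distance(first_feat, face_feat) <= max_distance
    if min_similarity is not None:
        return cosine_similarity(first_feat, face_feat) >= min_similarity
    raise ValueError("configure either max_distance or min_similarity")


print(meets_first_condition([0.0, 0.0], [3.0, 4.0], max_distance=5.0))  # True
```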
S3023, the image recognition apparatus determines a first sample face image.
In a possible implementation manner, corresponding to comparison mode a, if only one candidate sample face image exists in the sample library in S3023, that candidate sample face image may be taken as the first sample face image.
In another possible implementation manner, corresponding to comparison mode a, if a plurality of candidate sample face images exist in the sample library in S3023, a candidate sample face image satisfying a second condition may be selected as the first sample face image.
The second condition can be configured according to actual requirements.
For example, the second condition may include the distance being the minimum or the similarity being the maximum; or the second condition may include that the difference between the current age of the person in the candidate sample face image and a reference value of the photographing age range of the target face image is the minimum. The reference value may be the midpoint of the range or another value.
In another possible implementation manner, corresponding to comparison mode b, once a candidate sample face image is obtained in S3023, that candidate sample face image is the first sample face image.
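The difference between the two comparison modes in S3023 can be sketched as follows. This is an illustration under stated assumptions: each library entry is a hypothetical `(id, face feature, age feature)` tuple, the first condition is "Euclidean distance less than or equal to a threshold", and the second condition is "minimum distance". Mode a is written as a loop over all samples; a real implementation could evaluate them concurrently.

```python
import math
from typing import List, Optional, Sequence, Tuple

# Hypothetical record layout: (sample id, face feature, age feature)
Sample = Tuple[str, List[float], List[float]]


def _distance(identity: Sequence[float], sample: Sample) -> float:
    _, face, age = sample
    first = [i + a for i, a in zip(identity, age)]  # first feature (S3021)
    return math.dist(first, face)


def select_mode_a(identity: Sequence[float], library: List[Sample], threshold: float) -> Optional[str]:
    """Compare against every sample, then pick the candidate with the minimum distance."""
    scored = [(_distance(identity, s), s[0]) for s in library]
    candidates = [(d, sid) for d, sid in scored if d <= threshold]
    return min(candidates)[1] if candidates else None


def select_mode_b(identity: Sequence[float], library: List[Sample], threshold: float) -> Optional[str]:
    """Compare in the preset (list) order and stop at the first candidate."""
    for sample in library:
        if _distance(identity, sample) <= threshold:
            return sample[0]
    return None


library = [
    ("A", [1.0, 1.0], [0.0, 0.5]),  # distance 0.5
    ("B", [2.0, 0.0], [1.0, 0.0]),  # distance 0.0
    ("C", [5.0, 5.0], [0.0, 0.0]),  # far away
]
identity = [1.0, 0.0]
print(select_mode_a(identity, library, threshold=1.0))  # B
print(select_mode_b(identity, library, threshold=1.0))  # A
```

With the same threshold the two modes can return different samples: mode a applies the second condition over all candidates, while mode b depends on the preset order.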
According to the image recognition method provided by the embodiment of the application, the identity characteristic of the target face image is added with the age characteristic of the sample face image in the sample library to obtain the first characteristic of the target face image, and the first characteristic of the target face image is compared with the face characteristic in the sample library to obtain a recognition result. Because the first features are aligned with the face features of the sample facial images in the sample library in terms of age, the identification process of the application is equivalent to converting the comparison of the cross ages into the comparison of the same ages, so that the identification process is equivalent to compensating the influence of the age features on the identification precision, and the precision of the cross-age face identification is improved.
The first implementation in S302, in which the base library is converted into the sample library, is described in detail below.
A face image may be input to a second neural network for extracting face features, and the output of the second neural network is the face features of the face image. A face image may be input to a third neural network for extracting age characteristics from the face image, and the output of the third neural network is the age characteristics of the face image. The photographed age of a face image may be input to a fourth neural network for extracting age characteristics according to the age, and the output of the fourth neural network is the age characteristics of the face image.
Wherein the second, third, fourth neural networks may be convolutional neural networks or fully-connected neural networks or others. The second neural network, the third neural network, and the fourth neural network are neural networks trained to converge, and the method for training the second neural network, the third neural network, and the fourth neural network is not limited in the present application.
Alternatively, the second neural network, the third neural network, and the fourth neural network may be trained by the image recognition apparatus, or may be configured in the image recognition apparatus after being trained by other devices to converge.
Illustratively, the training process of the second neural network may be: inputting a plurality of face images (different face images of the same person and face images of different persons) into the second neural network, which outputs the face features of the face images; calculating the similarity between the face features of different face images of the same person and the similarity between the face features of face images of different persons; and then adjusting the weights and biases in the second neural network through backpropagation until the second neural network converges.
Illustratively, the training process of the third neural network may be: inputting a plurality of face images into the third neural network, which outputs the age features of the face images (features corresponding to the age range); and adjusting the weights and biases in the third neural network through backpropagation according to the error between the age features output by the third neural network and the actual age features of the face images (the difference between the face features and the identity features), until the third neural network converges.
Illustratively, the training process of the fourth neural network may be: inputting the photographing ages of a plurality of face images into the fourth neural network, which outputs the age features of the face images; and adjusting the weights and biases in the fourth neural network through backpropagation according to the error between the age features output by the fourth neural network and the actual age features of the face images (the difference between the face features and the identity features), until the fourth neural network converges.
Further, when the sample library in S302 is obtained by the second implementation, as shown in fig. 5, before S302, the image recognition method provided by the embodiment of the present application may further include S303 and S304.
S303, the image recognition device acquires a photographing age range of the target face image.
The target face image may be input into a fifth neural network for extracting a photographing age range, and the output of the fifth neural network is the photographing age range of the target face image.
Wherein the fifth neural network may be a convolutional neural network or a fully-connected neural network or others. The fifth neural network is a neural network trained to converge, and the method for training the fifth neural network is not limited in the present application.
Alternatively, the fifth neural network may be trained by the image recognition device, or trained by other devices to converge and then be configured in the image recognition device.
Illustratively, the training process of the fifth neural network may be: inputting a plurality of face images into the fifth neural network, which outputs a probability value for the age range corresponding to each face image; and adjusting the weights and biases in the fifth neural network through backpropagation according to the error between the probability value output by the fifth neural network for the age range of a face image and the actual probability of that age range (a probability of 1), until the fifth neural network converges.
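One common way to express the error between a predicted age-range probability and the actual probability of 1 is a softmax followed by a cross-entropy loss. The text does not name a loss function, so the sketch below is an assumption for illustration only; the four-bucket age split and the names `softmax` and `cross_entropy` are likewise hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw network outputs into a probability per age range."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: Sequence[float], true_index: int) -> float:
    """Loss is zero when the true age range is predicted with probability 1."""
    return -math.log(probs[true_index])


# Four hypothetical age ranges, e.g. 0-10, 10-20, 20-30, 30+.
probs = softmax([0.0, 0.0, 0.0, 0.0])  # an untrained network: uniform guess
loss = cross_entropy(probs, true_index=2)
print(probs, loss)
```

During training, the gradient of this loss with respect to the weights and biases would be propagated back through the network until the loss stops decreasing.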
S304, the image recognition device screens, from the base library, the face features and age features of the sample face images whose current age is within the photographing age range, to serve as the sample library.
S304 may be implemented as follows: the image recognition device acquires the current age of each sample face image in the base library and, according to that current age, screens from the base library the face features and the age features of the sample face images whose current age is within the photographing age range of the target face image, to serve as the sample library.
In a possible implementation manner, the image recognition device obtains the current age of each sample face image according to the birth date of each sample face image in the base library and the current date. The current date may be obtained according to attribute information of the target face image, or may be input by an administrator or obtained otherwise, which is not limited by the present application.
In another possible implementation manner, the image recognition device obtains the current age of each sample face image according to the photographing age and the photographing time of each sample face image in the base library and the current date. The current date may be obtained according to attribute information of the target face image, or may be input by an administrator or obtained otherwise, which is not limited by the present application.
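The second way of obtaining the current age can be sketched as a one-line calculation. The function name `current_age` and the year-level granularity are assumptions; the text does not state the unit in which the photographing time is stored.

```python
def current_age(photo_age: int, photo_year: int, current_year: int) -> int:
    """Current age = age when photographed + years elapsed since the photo."""
    if current_year < photo_year:
        raise ValueError("current date precedes photographing time")
    return photo_age + (current_year - photo_year)


# A person photographed at age 6 in 2010 is 15 in 2019.
print(current_age(photo_age=6, photo_year=2010, current_year=2019))  # 15
```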
Optionally, different sample face images in the base library and the sample library may be distinguished by information such as an image identification, an ID number, or a person's name.
If the base library contains sample face images rather than features, S304 may be performed after the sample face images are converted into face features and age features according to the first implementation described in S302.
For example, assume that the photographing age range of the target face image acquired by the image recognition apparatus is 20 to 30 years, the current year is 2019, and the base library is as shown in table 1, where each row of table 1 represents the relevant content of one sample face image.
TABLE 1
| Sample face image identification | Face features | Age features | Birth date (year) |
| --- | --- | --- | --- |
| Sample face image 1 | R1 | N1 | 1980 |
| Sample face image 2 | R2 | N2 | 1998 |
| Sample face image 3 | R3 | N3 | 2003 |
| Sample face image 4 | R4 | N4 | 1992 |
| …… | …… | …… | …… |
Table 1 is merely an example, and is not particularly limited.
According to the birth date of each sample face image in table 1, the image recognition device may subtract the birth date from the current date to obtain the current age of each sample face image in the base library; the base library illustrated in table 1 is thereby converted into the base library shown in table 2.
TABLE 2
| Sample face image identification | Face features | Age features | Birth date (year) | Current age (years) |
| --- | --- | --- | --- | --- |
| Sample face image 1 | R1 | N1 | 1980 | 39 |
| Sample face image 2 | R2 | N2 | 1998 | 21 |
| Sample face image 3 | R3 | N3 | 2003 | 15 |
| Sample face image 4 | R4 | N4 | 1992 | 27 |
| …… | …… | …… | …… | …… |
Since the photographing age range of the target face image is 20 to 30 years, the sample face images in the base library shown in table 2 whose current ages fall within the photographing age range are sample face image 2 and sample face image 4. The feature contents of sample face image 2 and sample face image 4 can therefore be selected from the base library as the sample library, as shown in table 3.
TABLE 3
| Sample face image identification | Face features | Age features |
| --- | --- | --- |
| Sample face image 2 | R2 | N2 |
| Sample face image 4 | R4 | N4 |
It should be noted that table 3 is merely an example, and is not particularly limited.
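The screening in S304 for this example can be reproduced with a few lines of code. The dictionary layout and the function name `screen_by_age` are illustrative assumptions; the feature placeholders R1 to R4 and N1 to N4, the birth years, the current year 2019, and the range of 20 to 30 years are taken from the example above.

```python
from typing import Dict, List, Tuple

# Base library of the example; "R*" and "N*" stand in for feature vectors.
base_library = [
    {"id": "Sample face image 1", "face": "R1", "age_feature": "N1", "birth_year": 1980},
    {"id": "Sample face image 2", "face": "R2", "age_feature": "N2", "birth_year": 1998},
    {"id": "Sample face image 3", "face": "R3", "age_feature": "N3", "birth_year": 2003},
    {"id": "Sample face image 4", "face": "R4", "age_feature": "N4", "birth_year": 1992},
]


def screen_by_age(base: List[Dict], current_year: int, age_range: Tuple[int, int]) -> List[Dict]:
    """Keep the features of samples whose current age lies within age_range (inclusive)."""
    low, high = age_range
    return [
        {"id": row["id"], "face": row["face"], "age_feature": row["age_feature"]}
        for row in base
        if low <= current_year - row["birth_year"] <= high
    ]


sample_library = screen_by_age(base_library, current_year=2019, age_range=(20, 30))
print([row["id"] for row in sample_library])
# ['Sample face image 2', 'Sample face image 4']
```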
Further optionally, the first to fifth neural networks for extracting features may be independent neural networks. Alternatively, a multi-feature extraction neural network may be established and trained to convergence, so as to implement the functions of some or all of the first to fifth neural networks and extract some or all of the identity features, the age features, and the face features. For the training process of the multi-feature extraction neural network, refer to the training processes of the first to fifth neural networks, which are not described in detail herein.
The image recognition method provided by the application is described below by taking a scene of searching for lost children as an example.
Specifically, a city public security bureau finds a lost child and needs to confirm the child's identity. The police photograph the lost child to obtain a face image, and input it to the image recognition device as the target face image. Using the image recognition method provided by the application, the image recognition device compares the target face image against the lost-children information base and selects the face image similar to that of the lost child.
As shown in fig. 6, the image recognition process in which the image recognition device compares and selects, in the lost-children information base, a face image similar to the face image of the lost child may include:
The image recognition device acquires the identity features and the photographing age range of the target face image.
The image recognition device acquires the lost-children information base (the base library), which stores the face features, age features, and photographing ages of a plurality of sample face images such as sample face image 1, sample face image 2, and sample face image 3.
The image recognition device screens out the sample face images whose current age is within the photographing age range of the target face image (sample face image 1 and sample face image 3), and records the face features and the age features of sample face image 1 and sample face image 3 as the sample library.
The image recognition device adds the identity features of the target face image to the age features of sample face image 1 and of sample face image 3 respectively, to obtain the first feature of the target face image relative to sample face image 1 and the first feature of the target face image relative to sample face image 3.
The image recognition device calculates the Euclidean distance between the first feature of the target face image relative to sample face image 1 and the face features of sample face image 1, and the Euclidean distance between the first feature of the target face image relative to sample face image 3 and the face features of sample face image 3. The former is smaller than a preset threshold and therefore satisfies the first condition, whereas the latter is larger than the preset threshold and does not satisfy the first condition. The image recognition device therefore outputs sample face image 1 to the user as the recognition result, that is, the lost child is considered to be the person corresponding to sample face image 1.
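The flow of this scenario (screening by age, forming the first feature, and applying a Euclidean-distance threshold) can be put together in a compact sketch. All numeric feature values, the threshold, the age range, and the field names are made up for illustration, and the current age is assumed to have been computed already from the stored photographing age.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def recognize(
    identity: Sequence[float],
    age_range: Tuple[int, int],
    base_library: List[Dict],
    threshold: float,
) -> Optional[str]:
    """Return the id of the closest sample that passes screening and the threshold."""
    low, high = age_range
    matches = []
    for row in base_library:
        if not (low <= row["current_age"] <= high):
            continue  # screened out: not part of the sample library
        first = [i + a for i, a in zip(identity, row["age_feature"])]
        distance = math.dist(first, row["face_feature"])
        if distance <= threshold:  # first condition
            matches.append((distance, row["id"]))
    return min(matches)[1] if matches else None


lost_children = [
    {"id": "sample face image 1", "current_age": 8,
     "face_feature": [1.0, 2.0], "age_feature": [0.0, 1.0]},
    {"id": "sample face image 2", "current_age": 15,
     "face_feature": [4.0, 4.0], "age_feature": [1.0, 1.0]},
    {"id": "sample face image 3", "current_age": 9,
     "face_feature": [5.0, 5.0], "age_feature": [0.5, 0.5]},
]

result = recognize([1.0, 1.0], age_range=(6, 10), base_library=lost_children, threshold=0.5)
print(result)  # sample face image 1
```

Here sample face image 2 is removed by the age screening, sample face image 3 fails the distance threshold, and sample face image 1 is returned.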
The above description mainly describes the scheme provided by the embodiment of the application from the viewpoint of the working principle of the image recognition device. It will be appreciated that the image recognition device, in order to achieve the above-described functions, comprises corresponding hardware structures and/or software modules that perform the respective functions. Those skilled in the art will readily appreciate that the present application can be implemented in hardware or a combination of hardware and computer software in connection with the examples described in connection with the embodiments disclosed herein. Whether a function is implemented as hardware or computer software driven hardware depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The embodiment of the application can divide the functional modules of the image recognition device according to the method example, for example, each functional module can be divided corresponding to each function, and two or more functions can be integrated in one processing module. The integrated modules may be implemented in hardware or in software functional modules. It should be noted that, in the embodiment of the present application, the division of the modules is schematic, which is merely a logic function division, and other division manners may be implemented in actual implementation.
In the case of dividing the respective functional modules by the respective functions, fig. 7 shows an image recognition apparatus 70 according to an embodiment of the present application, which is used to implement the functions of the image recognition apparatus in the above-mentioned method. The image recognition device 70 may be a server, a device in the server, or a device that can be used in cooperation with the server. The image recognition device 70 may be a chip system. In the embodiment of the application, the chip system can be formed by a chip, and can also comprise the chip and other discrete devices. As shown in fig. 7, the image recognition apparatus 70 may include: a first acquisition unit 701 and a processing unit 702. Wherein the first obtaining unit 701 is configured to perform S301 in fig. 3 or fig. 5; the processing unit 702 is configured to execute S302 in fig. 3 or fig. 5. All relevant contents of each step related to the above method embodiment may be cited to the functional description of the corresponding functional module, which is not described herein.
Further, as shown in fig. 7, the image recognition apparatus 70 may further include a second acquisition unit 703. Wherein the second acquisition unit 703 is configured to perform S303 in fig. 5.
As shown in fig. 8, an image recognition apparatus 80 is provided for implementing the functions of the image recognition apparatus in the above method according to an embodiment of the present application. The image recognition device 80 may be a server, a device in the server, or a device that can be used in cooperation with the server. The image recognition device 80 may be a chip system. The image recognition device 80 includes at least one processing module 801 for implementing the functions of the image recognition device in the method provided by the embodiment of the present application. Illustratively, the processing module 801 may be used to perform the processes S301, S302 in fig. 3 or the processes S301, S302, S303, S304 in fig. 5. Reference is made specifically to the detailed description in the method examples, and details are not described here.
The image recognition device 80 may also include at least one memory module 802 for storing program instructions and/or data. The memory module 802 is coupled to the processing module 801. The coupling in the embodiments of the present application is an indirect coupling or communication connection between devices, units, or modules, which may be in electrical, mechanical, or other forms for information interaction between the devices, units, or modules. The processing module 801 may cooperate with the storage module 802. The processing module 801 may execute program instructions stored in the storage module 802. At least one of the at least one memory module may be included in the processing module.
The image recognition apparatus 80 may further comprise a communication module 803 for communicating with other devices via a transmission medium, so that the image recognition apparatus 80 may communicate with other devices.
When the processing module 801 is a processor, the storage module 802 is a memory, and the communication module 803 is a transceiver, the image recognition device 80 shown in fig. 8 of the embodiment of the present application may be the image recognition device shown in fig. 2.
As described above, the image recognition device 70 or the image recognition device 80 according to the embodiments of the present application may be used to implement the functions of the image recognition device in the method of the embodiments of the present application. For convenience of explanation, only the portions related to the embodiments of the present application are shown; for specific technical details that are not disclosed, refer to the method embodiments of the present application.
Still further embodiments of the present application provide a computer readable storage medium that may include computer software instructions that, when executed on a computer, cause the computer to perform the steps performed by the image recognition device in the embodiments described above with reference to fig. 3 or 5.
Still further embodiments of the present application provide a computer program product for causing a computer to perform the steps performed by the image recognition apparatus in the embodiments shown in fig. 3 or fig. 5 described above when the computer program product is run on the computer.
Still further embodiments of the present application provide a chip system. The chip system comprises an interface circuit and a processor; the interface circuit and the processor are interconnected through a circuit; the interface circuit is used for receiving signals from the memory of the image recognition device and sending signals to the processor, wherein the signals comprise computer instructions stored in the memory; when the processor executes the computer instructions, the system-on-chip performs the various steps performed by the image recognition device in the embodiments shown in fig. 3 or fig. 5 described above.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of functional modules is illustrated, and in practical application, the above-described functional allocation may be implemented by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to implement all or part of the functions described above.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another apparatus, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and the parts displayed as units may be one physical unit or a plurality of physical units, may be located in one place, or may be distributed in a plurality of different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a readable storage medium. Based on such understanding, the technical solution of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or part of the steps of the method described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The foregoing is merely illustrative of specific embodiments of the present application, and the scope of the present application is not limited thereto, but any changes or substitutions within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.