CN103093212B - Method and device for capturing face images based on face detection and tracking

Info

Publication number: CN103093212B
Application number: CN201310032050.3A
Authority: CN
Inventors: 曹林; 朱希安; 周汐
Original assignee: Beijing Information Science and Technology University
Current assignee: Beijing Information Science and Technology University
Priority date: 2013-01-28
Filing date: 2013-01-28
Publication date: 2015-11-18
Anticipated expiration: 2033-01-28
Also published as: CN103093212A

Abstract

The invention discloses a method and a device for capturing a face image based on face detection and tracking, and belongs to the technical field of face tracking. The method includes: performing face detection on the image to be detected using a cascade classifier; when a face target is detected, performing face tracking on the face target using a mean value tracking algorithm; when the face target leaves the detection area, judging on each frame, according to the position of the target, whether face detection and face tracking in the same frame correspond to the same face target, and selecting the frames in which they do; in each selected frame, calculating the coincidence degree between the face detection window and the face tracking window in the same frame, and taking the face image obtained by face detection on the frame with the largest coincidence degree as the captured face image. The device includes a detection module, a tracking module, a judging module and a capture module. The invention captures a relatively clear face image and improves the accuracy and effect of face tracking.

Description

Method and device for capturing face images based on face detection and tracking

Technical field

The present invention relates to the technical field of face tracking, and in particular to a method and device for capturing face images based on face detection and tracking.

Background

With rising security requirements, the commercial value of pedestrian flow counting, person attribute recognition and face recognition has begun to emerge, and these technologies are gradually being put into use. Face detection and face tracking are important links in these tasks. In recent years, researchers have invested a great deal of time and effort in this field, working to develop fast and accurate face detection and tracking methods.

Face detection is the process of determining the position and size of faces in a given picture. At present, a commonly used face detection method is based on Haar features and a boosted cascade. The core idea of the algorithm is to iteratively select multiple weak classifiers with different classification capabilities and combine them into a strong classifier, and then to combine multiple strong classifiers in sequence into a cascade classifier that serves as the final face detector.

Face tracking is the process of determining the motion trajectory and size change of a particular face in an input image sequence. Face tracking has important potential application value. As a key technology in fields such as automatic face recognition, video retrieval and video surveillance, it has received wide attention from researchers. At present, commonly used face tracking methods include the mean-shift algorithm, the cam-shift algorithm, particle filters, and so on.

However, while a tracked face moves, conditions such as the target's size, shape and illumination usually change. With current face tracking techniques, the tracking error grows as the number of tracked frames increases, which degrades the tracking effect and lowers the accuracy of the tracking result. Meanwhile, finding a relatively clear frame among those in which a face appears in the video, and saving it for later use in a database, is also an important requirement and function of security systems.

Summary of the invention

In order to improve the accuracy of face tracking results, the present invention provides a method and device for capturing face images based on face detection and tracking. The technical solution is as follows:

In one aspect, the present invention provides a method for capturing a face image based on face detection and tracking, the method comprising:

performing face detection on the image to be detected using a cascade classifier;

when a face target is detected, converting the current frame to be tracked into an HSV image and obtaining its brightness, hue and saturation information; randomly selecting positions around the position of the face target in the previous frame as center points, and extracting image region information of the size of the target area; then, according to the position information of the face target in the previous frame and the initial position information of the face target, estimating the positions at which the face target appears in the current frame to be tracked using a predetermined state equation; calculating color information in HSV space and calculating the weight of each estimated position according to color similarity, assigning high weights to values of high similarity and low weights to values of low similarity; obtaining the weight average of all calculated weights, obtaining the corresponding position according to the weight average, and performing face tracking on the face target according to the obtained position;

when the face target leaves the detection area, on each frame of face detection and face tracking, calculating the distance between the top-left vertex of the face detection target position and the top-left vertex of the face tracking target position in that frame, and calculating the ratio of the side length of the face detection window to the side length of the face tracking window in that frame; when the distance is less than or equal to a preset distance threshold and the ratio is less than or equal to a preset ratio threshold, determining that face detection and face tracking in that frame correspond to the same face target; and selecting the frames in which face detection and face tracking correspond to the same face target;

in each selected frame, calculating the coincidence degree between the face detection window and the face tracking window in the same frame, comparing all calculated coincidence degrees, and taking the face image obtained by face detection on the frame with the largest coincidence degree as the captured face image.

The method further comprises:

when it is determined from the position of the target that face detection and face tracking in a frame correspond to the same face target, replacing the face image obtained by face tracking in that frame with the face image obtained by face detection, and continuing face tracking with the replaced face image.

The method further comprises:

when it is determined from the position of the target that face detection and face tracking in a frame correspond to different face targets, taking the face image detected in that frame as a new face target, and starting face tracking for the new face target.

In another aspect, the present invention further provides a device for capturing a face image based on face detection and tracking, the device comprising:

a detection module, configured to perform face detection on the image to be detected using a cascade classifier;

a tracking module, configured to: when a face target is detected, convert the current frame to be tracked into an HSV image and obtain its brightness, hue and saturation information; randomly select positions around the position of the face target in the previous frame as center points, and extract image region information of the size of the target area; then, according to the position information of the face target in the previous frame and the initial position information of the face target, estimate the positions at which the face target appears in the current frame to be tracked using a predetermined state equation; calculate color information in HSV space and calculate the weight of each estimated position according to color similarity, assigning high weights to values of high similarity and low weights to values of low similarity; obtain the weight average of all calculated weights, obtain the corresponding position according to the weight average, and perform face tracking on the face target according to the obtained position;

a judging module, configured to: when the face target leaves the detection area, on each frame of face detection and face tracking, calculate the distance between the top-left vertex of the face detection target position and the top-left vertex of the face tracking target position in that frame, and calculate the ratio of the side length of the face detection window to the side length of the face tracking window in that frame; when the distance is less than or equal to a preset distance threshold and the ratio is less than or equal to a preset ratio threshold, determine that face detection and face tracking in that frame correspond to the same face target; and select the frames in which face detection and face tracking correspond to the same face target;

a capture module, configured to calculate, in each selected frame, the coincidence degree between the face detection window and the face tracking window in the same frame, compare all calculated coincidence degrees, and take the face image obtained by face detection on the frame with the largest coincidence degree as the captured face image.

The tracking module is further configured to:

when the judging module determines from the position of the target that face detection and face tracking in a frame correspond to the same face target, replace the face image obtained by face tracking in that frame with the face image obtained by face detection, and continue face tracking with the replaced face image;

when the judging module determines from the position of the target that face detection and face tracking in a frame correspond to different face targets, take the face image detected in that frame as a new face target, and start face tracking for the new face target.

The technical solution provided by the present invention has the following beneficial effects: a face target detected by the cascade classifier is tracked using the mean value tracking algorithm; when the face target leaves the detection area, the frames in which face detection and face tracking correspond to the same face target are selected from the frames of face detection and face tracking; in each selected frame, the coincidence degree between the face detection window and the face tracking window in the same frame is calculated, and the face image obtained by face detection on the frame with the largest coincidence degree is taken as the captured face image. By making full use of both detection and tracking resources, a relatively clear face image is captured, the accuracy of face tracking is improved, the tracking effect is enhanced, and data support is provided for capturing clear face images.

Brief description of the drawings

In order to describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of a method for capturing a face image based on face detection and tracking provided by an embodiment of the present invention;

Fig. 2 is a schematic diagram of detection by the cascade classifier provided by an embodiment of the present invention;

Fig. 3 is a flowchart of a method for capturing a face image based on face detection and tracking provided by another embodiment of the present invention;

Fig. 4 is a schematic diagram of calculating the coincidence degree provided by an embodiment of the present invention;

Fig. 5 is a structural diagram of a device for capturing a face image based on face detection and tracking provided by an embodiment of the present invention;

Fig. 6 is a structural diagram of a device for capturing a face image based on face detection and tracking provided by another embodiment of the present invention.

Detailed description

In order to make the objectives, technical solutions and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

The embodiments of the present invention involve a cascade classifier. The cascade classifier consists of multiple strong classifiers connected in series, and the number of strong classifiers is also called the number of stages of the cascade classifier. For example, a 10-stage cascade classifier consists of 10 strong classifiers. A cascade classifier is normally trained before it is used for face detection, and training can use positive and negative samples prepared in advance. The present invention does not specifically limit the size and number of the positive and negative samples. For example, the positive samples may be 10,000 face images of 20*20 pixels, and the negative samples may be 20,000 arbitrary 20*20 pixel images from natural scenes that contain no face. Preferably, the sample images should be as diverse as possible and are best drawn from the various environments of people's daily lives. After training, the classifier can be adjusted according to whether the training result meets predetermined indicators such as the detection rate and the false alarm rate, for example by increasing the number of stages of the cascade classifier. The present invention does not limit the values of the predetermined detection rate and false alarm rate either; they can be set as needed, for example a face detection rate of 99% and a non-face false alarm rate of 30% for each stage. For each stage, negative samples that have been detected during training, that is, recognized as non-faces, must be replaced before the next stage: they are replaced with the same number of other negative samples, which are then fed to the next-stage classifier together with the negative samples that were not detected as non-faces, and training continues.
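The per-stage targets quoted above compound across stages. As a rough illustration only, and assuming the stages behave independently (an idealization that the text does not state), the following sketch computes the overall rates of a 10-stage cascade with a 99% per-stage detection rate and a 30% per-stage false alarm rate. The function name `cascade_rates` is hypothetical.

```python
def cascade_rates(stages: int, stage_detection: float, stage_false_alarm: float):
    """Overall rates of a cascade, ASSUMING independent stages (idealized)."""
    # A face must pass every stage, so the per-stage detection rates multiply.
    overall_detection = stage_detection ** stages
    # A non-face is only reported if every stage wrongly accepts it.
    overall_false_alarm = stage_false_alarm ** stages
    return overall_detection, overall_false_alarm


detection, false_alarm = cascade_rates(10, 0.99, 0.30)
print(f"overall detection rate:   {detection:.4f}")
print(f"overall false alarm rate: {false_alarm:.2e}")
```

Under this assumption the cascade keeps roughly 90% of faces while letting through only about 6 non-face windows per million, which is why a loose per-stage false alarm rate can still give a strict overall detector.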

The cascade classifier performs face detection stage by stage. Starting from the first-stage classifier, the result of face detection by each stage is passed to the next stage for further face detection, until the last stage finishes and outputs the face detection result.

Referring to Fig. 1, an embodiment of the present invention provides a method for capturing a face image based on face detection and tracking, comprising:

101: Perform face detection on the image to be detected using a cascade classifier.

102: When a face target is detected, perform face tracking on the face target using a mean value tracking algorithm.

103: When the face target leaves the detection area, on each frame of face detection and face tracking, judge according to the position of the target whether face detection and face tracking in the same frame correspond to the same face target, and select the frames in which face detection and face tracking correspond to the same face target.

104: In each selected frame, calculate the coincidence degree between the face detection window and the face tracking window in the same frame, compare all calculated coincidence degrees, and take the face image obtained by face detection on the frame with the largest coincidence degree as the captured face image.

The image to be detected is usually a video image, that is, a continuous video, but it may also be a set of still images; the present invention does not specifically limit this. When the cascade classifier examines the image to be detected, it can perform face detection using a preset detection window, checking whether a face is present in each detection window. The present invention does not specifically limit the size of the detection window, which can be set as needed, for example 20*20 pixels, 25*25 pixels or 30*30 pixels. The detection window may be moved across the image from top to bottom and from left to right; the present invention does not specifically limit this either.
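The window scan described above can be sketched as a generator of window positions. This is a minimal illustration, not the patented implementation: the `step` parameter (how far the window moves each time) is an assumption, since the text specifies only the window size and the scan order.

```python
from typing import Iterator, Tuple


def sliding_windows(width: int, height: int, size: int,
                    step: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (x, y, size) for square windows, rows top to bottom,
    each row scanned left to right. `step` is a hypothetical stride."""
    for y in range(0, height - size + 1, step):      # top to bottom
        for x in range(0, width - size + 1, step):   # left to right
            yield x, y, size


# Example: 20*20 pixel windows over a 60*40 pixel image, moving 10 pixels at a time.
windows = list(sliding_windows(width=60, height=40, size=20, step=10))
print(len(windows), windows[0], windows[-1])
```

Each yielded window would then be passed to the cascade classifier, which decides whether it contains a face.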

Fig. 2 is a schematic diagram of face detection by the cascade classifier provided in this embodiment. The cascade classifier has N stages, that is, N classifiers in total. Starting from classifier 1, face detection is performed on the image to be detected stage by stage. A detected face passes through the classifier as its output and enters the next-stage classifier for further detection, while a detected non-face is output as a rejection to the non-face pool. The final output after the last classifier, classifier N, finishes detection is the detected face target.
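The stage-by-stage flow of Fig. 2 amounts to an early-rejection loop. The sketch below illustrates only that control flow. The two toy stages are hypothetical stand-ins for trained strong classifiers, and the "windows" are short lists of numbers rather than real image patches.

```python
from typing import Callable, List, Sequence

# A stage takes a candidate window and returns True (face) or False (non-face).
Stage = Callable[[Sequence[float]], bool]


def run_cascade(stages: List[Stage], window: Sequence[float]) -> bool:
    """Return True only if every stage accepts the window."""
    for stage in stages:
        if not stage(window):
            return False  # rejected here; later stages are never evaluated
    return True


# Toy stages with made-up thresholds (NOT trained classifiers).
stages: List[Stage] = [
    lambda w: sum(w) / len(w) > 0.2,   # stage 1: mean intensity check
    lambda w: max(w) - min(w) > 0.1,   # stage 2: contrast check
]

faces, non_faces = [], []
for candidate in ([0.5, 0.7, 0.4], [0.05, 0.1, 0.08], [0.5, 0.5, 0.5]):
    (faces if run_cascade(stages, candidate) else non_faces).append(candidate)
print(len(faces), "accepted;", len(non_faces), "sent to the non-face pool")
```

Because most windows in an image are not faces, rejecting them at an early stage keeps the average cost per window low.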

The mean value tracking algorithm takes the mean of the weights of the estimated positions of the face target and tracks according to the position corresponding to that mean. Specifically, when a face target is detected, performing face tracking on the face target using the mean value tracking algorithm may include:

when a face target is detected, estimating the positions at which the face target appears in the next frame;

calculating the weight of each estimated position, and obtaining the weight average of all calculated weights;

obtaining the corresponding position according to the weight average, and performing face tracking on the face target according to the obtained position.
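One way to read the three steps above is that the tracked position is the weighted mean of the candidate positions. The sketch below follows that reading, which is an interpretation rather than a formula given in the text. The function name and the sample values are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def weighted_mean_position(positions: Sequence[Point],
                           weights: Sequence[float]) -> Point:
    """Weighted average of candidate (x, y) positions."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    x = sum(w * px for (px, _), w in zip(positions, weights)) / total
    y = sum(w * py for (_, py), w in zip(positions, weights)) / total
    return x, y


# Three candidate positions; the first is judged most likely.
candidates = [(100.0, 50.0), (104.0, 52.0), (96.0, 48.0)]
weights = [0.5, 0.25, 0.25]
center = weighted_mean_position(candidates, weights)
print(center)
```

Candidates with larger weights pull the result toward themselves, so an unlikely candidate has little influence on the tracked position.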

In this embodiment, judging on each frame of face detection and face tracking, according to the position of the target, whether face detection and face tracking in the same frame correspond to the same face target includes:

on each frame of face detection and face tracking, calculating the distance between the top-left vertex of the face detection target position and the top-left vertex of the face tracking target position in that frame, and calculating the ratio of the side length of the face detection window to the side length of the face tracking window in that frame; when the distance is less than or equal to a preset distance threshold and the ratio is less than or equal to a preset ratio threshold, determining that face detection and face tracking in that frame correspond to the same face target.

In this embodiment, the method further comprises:

In this embodiment, face detection is performed in real time. Detection may be performed on every frame or once every few frames, for example once every two frames; the present invention does not specifically limit this. The step size of face tracking may be the same as or different from that of face detection. Preferably, in this embodiment face detection and face tracking use the same step size, for example both are performed on every frame, or both once every two frames.

In the above method provided by this embodiment, a face target detected by the cascade classifier is tracked using the mean value tracking algorithm. When the face target leaves the detection area, the frames in which face detection and face tracking correspond to the same face target are selected from the frames of face detection and face tracking. In each selected frame, the coincidence degree between the face detection window and the face tracking window in the same frame is calculated, and the face image obtained by face detection on the frame with the largest coincidence degree is taken as the captured face image. By making full use of both detection and tracking resources, a relatively clear face image is captured, the accuracy of face tracking is improved, the tracking effect is enhanced, and data support is provided for capturing clear face images.

Referring to Fig. 3, another embodiment of the present invention provides a method for capturing a face image based on face detection and tracking, comprising:

301: Perform face detection on the image to be detected using a cascade classifier.

The image to be detected is usually a video image, that is, a continuous video, but it may also be a set of still images; the present invention does not specifically limit this.

When the cascade classifier examines the image to be detected, it can perform face detection using a preset detection window, checking whether a face is present in each detection window. The present invention does not specifically limit the size of the detection window, which can be set as needed, for example 20*20 pixels, 25*25 pixels or 30*30 pixels. The detection window may be moved across the image from top to bottom and from left to right; the present invention does not specifically limit this either.

302: When a face target is detected, perform face tracking on the face target using a mean value tracking algorithm.

In this embodiment, face tracking starts as soon as a face target is detected. The tracked position is estimated by the algorithm. Usually multiple positions are estimated, representing where the face target may appear in the next frame, and a suitable position for tracking can be calculated from them according to the weight of each position.

This step may specifically include the following steps:

A predetermined state equation, such as a second-order Markov chain autoregressive equation, can be used to estimate the positions at which the face target appears in the next frame. For the weights, color information can be calculated in HSV space, and the weight of each estimated position calculated according to color similarity. Specifically, the current frame to be tracked is converted into an HSV image to obtain its brightness, hue and saturation information. Positions around the face target's position in the previous frame are then randomly selected as center points, and image region information of the size of the target area is extracted. Next, according to the position of the face target in the previous frame and its initial position, a suitable state equation such as a second-order Markov chain autoregressive equation is used to estimate the possible position of the face target in the current frame. Each selected center point is different, so the estimated position in the current frame also differs. For example, 300 different center points are selected, giving 300 estimated positions. The HSV-space feature information of each position is calculated and compared with the initial image using histograms to obtain similarity data. Values with high similarity are given high weights and values with low similarity are given low weights, and the final position center is calculated from the weighted mean of all estimated positions.
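The candidate generation and weighting described above can be sketched in plain Python. Several choices here are assumptions made only for illustration and do not come from the text: the constant-velocity coefficients (2 and -1) in the second-order autoregressive step, the Gaussian noise, the use of the hue channel alone, the 8-bin histogram, and the Bhattacharyya coefficient as the similarity measure. All function names are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def hue_histogram(hues: Sequence[float], bins: int = 8) -> List[float]:
    """Normalized histogram of hue values given in degrees [0, 360)."""
    counts = [0] * bins
    for h in hues:
        counts[min(bins - 1, int((h % 360) / 360 * bins))] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Bhattacharyya coefficient: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def propagate(prev: Tuple[float, float], prev2: Tuple[float, float],
              spread: float, rng: random.Random) -> Tuple[float, ...]:
    """Assumed second-order autoregressive step: constant velocity plus noise."""
    return tuple(2 * a - b + rng.gauss(0.0, spread) for a, b in zip(prev, prev2))


def candidate_weights(reference: Sequence[float],
                      candidates: Sequence[Sequence[float]]) -> List[float]:
    """Weights proportional to similarity with the reference, summing to 1."""
    scores = [similarity(reference, h) for h in candidates]
    total = sum(scores)
    if total <= 0:
        return [1 / len(scores)] * len(scores)
    return [s / total for s in scores]


rng = random.Random(0)
# 300 candidate positions, as in the example in the text.
candidates = [propagate((100.0, 50.0), (98.0, 50.0), spread=3.0, rng=rng)
              for _ in range(300)]

reference = hue_histogram([10, 12, 15, 20, 200])   # hues of the initial face
close = hue_histogram([11, 14, 18, 22, 210])       # region with similar colors
far = hue_histogram([120, 130, 140, 150, 160])     # region with different colors
weights = candidate_weights(reference, [close, far])
print(weights)
```

A real tracker would extract the hue values from the image region around each candidate position. The final position would then be the weighted mean of the candidates under these weights.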

In this embodiment, one position may be estimated for the face target in the next frame, but usually there are several. Each estimated position has a corresponding weight, which represents the likelihood of that position: the larger the weight, the more likely the face target is to appear there, and the smaller the weight, the less likely. When multiple positions are estimated, the weight of each position is obtained, and the weights are averaged to obtain the weight average.

303: When the face target leaves the detection area, for each frame on which face detection and face tracking were performed, calculate the distance between the top-left vertex of the face detection target position and the top-left vertex of the face tracking target position in that frame, and calculate the ratio of the side length of the face detection window to the side length of the face tracking window in that frame. When the distance is less than or equal to a preset distance threshold and the ratio is less than or equal to a preset ratio threshold, it is determined that face detection and face tracking in that frame correspond to the same face target.

The distance threshold and the ratio threshold may be preset as needed, and the present invention does not limit their specific values. For example, the distance threshold may be set to 5% of the side length of the face detection window, and the ratio threshold may be set to 5%, 10%, and so on.

In this embodiment, when the distance between the top-left vertex of the face detection target position and the top-left vertex of the face tracking target position is less than or equal to the distance threshold, and the ratio of the side length of the face detection window to the side length of the face tracking window is less than or equal to the ratio threshold, the two positions given by face detection and face tracking are considered very close and can be regarded as positions of the same face target. It can therefore be determined that the detected face target is the face target currently being tracked.
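The same-target test of step 303 can be sketched in Python as follows. The distance threshold follows the 5% example given in the text. Because the example ratio thresholds are small percentages (5%, 10%), the sketch reads the "ratio" as the relative difference between the two side lengths; that reading is an assumption, as are the names `Window` and `is_same_target` and the default parameter values.

```python
import math
from typing import NamedTuple


class Window(NamedTuple):
    x: float  # top-left x
    y: float  # top-left y
    d: float  # side length of the square window (> 0)


def is_same_target(detect: Window, track: Window,
                   dist_frac: float = 0.05,
                   ratio_thresh: float = 0.10) -> bool:
    """Decide whether detection and tracking point at the same face."""
    # Distance between the two top-left vertices.
    distance = math.hypot(detect.x - track.x, detect.y - track.y)
    # Threshold expressed as a fraction of the detection side length,
    # following the 5% example in the text.
    distance_threshold = dist_frac * detect.d
    # ASSUMPTION: the "ratio" is read as the relative side-length
    # difference, so that thresholds such as 5% or 10% are meaningful.
    side_ratio = abs(detect.d - track.d) / track.d
    return distance <= distance_threshold and side_ratio <= ratio_thresh
```

For example, `is_same_target(Window(100, 100, 80), Window(102, 101, 84))` accepts the pair, while shifting the tracking window by 20 pixels makes the distance test fail.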

304: Select the frames in which face detection and face tracking correspond to the same face target.

305: In each selected frame, calculate the coincidence degree between the face detection window and the face tracking window in that frame, compare all the calculated coincidence degrees, and take the face image obtained by face detection on the frame with the largest coincidence degree as the captured face image.

Referring to Fig. 4, in this embodiment the coincidence degree between the face detection window and the face tracking window in any frame can be calculated as follows:

Here, the top-left vertex of the face detection window has coordinates (x₁, y₁) and side length d₁, and the top-left vertex of the face tracking window has coordinates (x₂, y₂) and side length d₂. Both windows are assumed to be square.

First, calculate the width of the overlap window: compare x₁ and x₂, and take x_maxleft = max(x₁, x₂) as the abscissa of the top-left vertex of the overlap window; then compute x₁+d₁ and x₂+d₂, and take x_minright = min(x₁+d₁, x₂+d₂) as the abscissa of the top-right vertex of the overlap window. The width of the overlap window is then w = x_minright - x_maxleft.

Next, calculate the height of the overlap window: compare y₁ and y₂, and take y_maxtop = max(y₁, y₂) as the ordinate of the top-left vertex of the overlap window; then compute y₁+d₁ and y₂+d₂, and take y_minbottom = min(y₁+d₁, y₂+d₂) as the ordinate of the bottom-left vertex of the overlap window. The height of the overlap window is then h = y_minbottom - y_maxtop.

Finally, from the width and height of the overlap window, the area of the overlap window is S = w × h. The area of the face detection window, S_detect, is then calculated (d₁ × d₁ for a square window of side d₁). From the area of the overlap window and the area of the face detection window, the coincidence degree of the face detection window and the face tracking window is calculated as destination = min(S, S_detect)/S × 100%.
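The overlap computation above can be sketched in Python. The width, height and area follow the text directly. Two points are assumptions of this sketch and not statements of the source: windows that do not overlap are given an area of zero, and the coincidence degree is normalized as S / S_detect × 100, whereas the source writes the expression as min(S, S_detect)/S × 100%. The function names are hypothetical.

```python
def overlap_area(x1: float, y1: float, d1: float,
                 x2: float, y2: float, d2: float) -> float:
    """Area S of the overlap between two square windows."""
    x_maxleft = max(x1, x2)              # left edge of the overlap
    x_minright = min(x1 + d1, x2 + d2)   # right edge of the overlap
    y_maxtop = max(y1, y2)               # top edge of the overlap
    y_minbottom = min(y1 + d1, y2 + d2)  # bottom edge of the overlap
    w = x_minright - x_maxleft
    h = y_minbottom - y_maxtop
    # ASSUMPTION: disjoint windows (w <= 0 or h <= 0) count as zero
    # overlap; without this guard two negative values would multiply
    # into a positive "area".
    if w <= 0 or h <= 0:
        return 0.0
    return float(w * h)


def coincidence_degree(x1: float, y1: float, d1: float,
                       x2: float, y2: float, d2: float) -> float:
    """Overlap as a percentage of the detection window area (d1 > 0).

    ASSUMPTION: normalizing by S_detect is this sketch's choice; the
    source gives the expression min(S, S_detect) / S * 100%.
    """
    s = overlap_area(x1, y1, d1, x2, y2, d2)
    s_detect = d1 * d1
    return s / s_detect * 100.0
```

Under this normalization, two identical windows score 100 and windows that only share a quarter of the detection window score 25, so a larger value means the tracker and the detector agree more closely.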

In this embodiment, the largest of the calculated coincidence degrees is selected, and the face image obtained by face detection in the corresponding frame is taken as the final captured face image, which gives the clearest result. If several frames share the largest coincidence degree, the first of those frames to appear may be selected, and the face image obtained by face detection in that frame is taken as the final captured face image.
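The selection rule, including the tie-break on the first frame to appear, can be sketched as below. The function name `pick_best_frame` and the representation of each frame as a `(frame_index, coincidence_degree)` pair are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def pick_best_frame(scores: List[Tuple[int, float]]) -> Optional[int]:
    """Return the index of the frame with the largest coincidence degree.

    `scores` holds (frame_index, coincidence_degree) pairs in order of
    appearance. Returns None when no frame was selected.
    """
    best_index: Optional[int] = None
    best_score = float("-inf")
    for frame_index, score in scores:
        # The strict ">" keeps the earliest frame when several frames
        # share the maximum coincidence degree.
        if score > best_score:
            best_index, best_score = frame_index, score
    return best_index
```

The face image cropped by the detector on the returned frame would then be the one kept as the captured face image.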

Tracking of the face target has already been started in the steps above. If the cascade classifier detects multiple face targets, tracking is started for each of them, that is, each detected face target is tracked separately; this is not described one by one here.

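Further, the above method may also include the following step: when it is determined from the position of the target that face detection and face tracking in a frame correspond to the same face target, the face image obtained by face tracking in that frame is replaced with the face image obtained by face detection, and face tracking continues with the replaced face image.

And/or, the above method may also include the following step: when it is determined from the position of the target that face detection and face tracking in a frame correspond to different face targets, the face image obtained by face detection in that frame is taken as a new face target, and face tracking is started for the new face target.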

In this embodiment, the position obtained by tracking the face target in the current frame may be called the tracking position (TrackLocation). Because face detection runs at the same time, the position of the detected face target is also obtained in the current frame; this position may be called the detection position (DetectLocation). TrackLocation and DetectLocation are two results obtained by face tracking and face detection for the same face target, so they are necessarily related. In practice the two positions are usually extremely close, but DetectLocation is generally the more accurate estimate of the target position, while the error of TrackLocation gradually grows as the number of tracked frames increases. This is because, as the target moves, its size, shape, illumination and other conditions change, the gap from the initial information widens, and the tracking quality degrades. Therefore, this embodiment replaces the position of the face image obtained by tracking with the position of the face image obtained by face detection, setting TrackLocation = DetectLocation, and continues tracking from the replaced position, which can improve tracking accuracy and reduce tracking error. After the tracked face target leaves the tracking area, the face tracking results are compared with the face detection results to find the frame in which the two positions were closest during tracking. The face in that frame can be considered highly trackable, subject to little interference, and sharp, so the face image obtained by face detection on that frame is selected as the final screenshot, which gives relatively high accuracy.
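The correction TrackLocation = DetectLocation, together with the handling of a detection that belongs to a different face, can be sketched as follows. The `Track` class, the `update_track` function and the `history` list are hypothetical names introduced for illustration; the `same_target` flag is assumed to come from the position comparison of step 303.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[float, float, float]  # (top-left x, top-left y, side length)


@dataclass
class Track:
    location: Box  # current TrackLocation
    # (DetectLocation, TrackLocation) pairs kept for the later
    # coincidence-degree comparison.
    history: List[Tuple[Box, Box]] = field(default_factory=list)


def update_track(track: Track,
                 detect_location: Optional[Box],
                 same_target: bool) -> Optional[Track]:
    """Apply one frame of detection feedback to a track.

    Returns a new Track when the detection is a different face,
    otherwise None.
    """
    if detect_location is None:
        return None  # nothing detected in this frame
    if same_target:
        # Record both results, then resynchronize the tracker:
        # TrackLocation = DetectLocation.
        track.history.append((detect_location, track.location))
        track.location = detect_location
        return None
    # Different face: start tracking it as a new target.
    return Track(location=detect_location)
```

Recording the pair before overwriting the tracked location keeps the data needed to compare the two windows once the target has left the tracking area.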

In this embodiment, when the tracked face target moves out of the tracking area, tracking may be stopped and all tracking resources occupied by that face target released, which saves tracking resources and avoids waste.

In the method provided by this embodiment, the mean value tracking algorithm is used to track the face target detected by the cascade classifier. When the face target leaves the detection area, the frames in which face detection and face tracking correspond to the same face target are selected from the frames on which face detection and face tracking were performed. In each selected frame, the coincidence degree between the face detection window and the face tracking window is calculated, and the face image obtained by face detection on the frame with the largest coincidence degree is taken as the captured face image. By making full use of detection and tracking resources, the method captures a relatively clear face image, improves the accuracy and effect of face tracking, and provides data support for capturing a clear face image. This embodiment tracks with the mean value tracking algorithm and averages the weights of the estimated positions. Because the estimated positions reflect the various possibilities of where the face target may appear in the next frame, the position corresponding to the weight average represents that position more comprehensively and accurately, and tracking according to it is more accurate. Comparing the tracking results with the detection results, so that each verifies the other, gives the process of capturing a clear face image a definite data basis. When the tracking result and the detection result are very close in position, the face at that moment can be considered highly trackable, subject to little interference, and relatively sharp, so that frame is selected as the final screenshot. This is also the biggest difference between the present invention and products currently on the market.

Referring to Fig. 5, an embodiment of the present invention provides a device for capturing a face image based on face detection and tracking, including:

a detection module 501, configured to perform face detection on the image to be detected using a cascade classifier;

a tracking module 502, configured to perform face tracking on the face target using the mean value tracking algorithm when the detection module 501 detects a face target;

a judging module 503, configured to, when the face target leaves the detection area, judge from the position of the target, for each frame on which face detection and face tracking were performed, whether face detection and face tracking in that frame correspond to the same face target, and to select the frames in which they do;

an intercepting module 504, configured to calculate, in each selected frame, the coincidence degree between the face detection window and the face tracking window in that frame, compare all the calculated coincidence degrees, and take the face image obtained by face detection on the frame with the largest coincidence degree as the captured face image.

Referring to Fig. 6, the tracking module 502 includes:

an estimating unit 502a, configured to estimate the position at which the face target appears in the next frame when the detection module detects the face target;

a calculation unit 502b, configured to calculate the weight of each estimated position and to average all the calculated weights to obtain a weight average;

a tracking unit 502c, configured to obtain the corresponding position according to the weight average and to perform face tracking on the face target according to the obtained position.

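The judging module 503 is configured to: when the face target leaves the detection area, for each frame on which face detection and face tracking were performed, calculate the distance between the top-left vertex of the face detection target position and the top-left vertex of the face tracking target position in that frame, and calculate the ratio of the side length of the face detection window to the side length of the face tracking window in that frame; and, when the distance is less than or equal to a preset distance threshold and the ratio is less than or equal to a preset ratio threshold, determine that face detection and face tracking in that frame correspond to the same face target.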

The tracking module 502 is further configured to:

when the judging module 503 determines from the position of the target that face detection and face tracking in a frame correspond to the same face target, replace the face image obtained by face tracking in that frame with the face image obtained by face detection, and continue face tracking with the replaced face image;

when the judging module 503 determines from the position of the target that face detection and face tracking in a frame correspond to different face targets, take the face image obtained by face detection in that frame as a new face target, and start face tracking for the new face target.

The device provided by this embodiment can execute the method provided in any of the method embodiments above; for the detailed process, see the description of the method embodiments, which is not repeated here. The device may be applied in electronic equipment such as computers, which the present invention does not specifically limit.

The device provided by this embodiment uses the mean value tracking algorithm to track the face target detected by the cascade classifier. When the face target leaves the detection area, the frames in which face detection and face tracking correspond to the same face target are selected from the frames on which face detection and face tracking were performed. In each selected frame, the coincidence degree between the face detection window and the face tracking window is calculated, and the face image obtained by face detection on the frame with the largest coincidence degree is taken as the captured face image. By making full use of detection and tracking resources, the device captures a relatively clear face image, improves the accuracy and effect of face tracking, and provides data support for capturing a clear face image. The mean value tracking algorithm averages the weights of the estimated positions. Because the estimated positions reflect the various possibilities of where the face target may appear in the next frame, the position corresponding to the weight average represents that position more comprehensively and accurately, and tracking according to it is more accurate. The accuracy of the face detection result is usually higher than that of the face tracking result. Therefore, replacing the position of the face image obtained by tracking with the position of the face image obtained by face detection, and continuing tracking from the replaced position, can improve tracking accuracy, reduce tracking error, and provide a data basis for capturing a clear face image.

Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.

The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for capturing a face image based on face detection and tracking, characterized in that the method comprises:

performing face detection on the image to be detected using a cascade classifier;

when a face target is detected, converting the current frame to be tracked into an HSV image and obtaining its brightness, hue and saturation information; randomly selecting positions around the position of the face target in the previous frame as center points, and extracting image region information of the size of the target region; then, according to the position information of the face target in the previous frame and the initial position information of the face target, using a predetermined state equation to estimate the positions at which the face target appears in the current frame to be tracked; calculating color information in the HSV space and calculating the weight of each estimated position according to color similarity, assigning a high weight to a position with high similarity and a low weight to a position with low similarity; averaging all the calculated weights to obtain a weight average, obtaining the corresponding position according to the weight average, and performing face tracking on the face target according to the obtained position;

when the face target leaves the detection area, for each frame on which face detection and face tracking were performed, calculating the distance between the top-left vertex of the face detection target position and the top-left vertex of the face tracking target position in that frame, and calculating the ratio of the side length of the face detection window to the side length of the face tracking window in that frame; when the distance is less than or equal to a preset distance threshold and the ratio is less than or equal to a preset ratio threshold, determining that face detection and face tracking in that frame correspond to the same face target; and selecting the frames in which face detection and face tracking correspond to the same face target;

in each selected frame, calculating the coincidence degree between the face detection window and the face tracking window in that frame, comparing all the calculated coincidence degrees, and taking the face image obtained by face detection on the frame with the largest coincidence degree as the captured face image.

2. The method according to claim 1, characterized in that the method further comprises:

when it is determined from the position of the target that face detection and face tracking in a frame correspond to the same face target, replacing the face image obtained by face tracking in that frame with the face image obtained by face detection, and continuing face tracking with the replaced face image.

3. The method according to claim 1, characterized in that the method further comprises:

when it is determined from the position of the target that face detection and face tracking in a frame correspond to different face targets, taking the face image obtained by face detection in that frame as a new face target, and starting face tracking for the new face target.

4. A device for capturing a face image based on face detection and tracking, characterized in that the device comprises:

a detection module, configured to perform face detection on the image to be detected using a cascade classifier;

a tracking module, configured to, when a face target is detected, convert the current frame to be tracked into an HSV image and obtain its brightness, hue and saturation information; randomly select positions around the position of the face target in the previous frame as center points, and extract image region information of the size of the target region; then, according to the position information of the face target in the previous frame and the initial position information of the face target, use a predetermined state equation to estimate the positions at which the face target appears in the current frame to be tracked; calculate color information in the HSV space and calculate the weight of each estimated position according to color similarity, assigning a high weight to a position with high similarity and a low weight to a position with low similarity; average all the calculated weights to obtain a weight average, obtain the corresponding position according to the weight average, and perform face tracking on the face target according to the obtained position;

a judging module, configured to, when the face target leaves the detection area, for each frame on which face detection and face tracking were performed, calculate the distance between the top-left vertex of the face detection target position and the top-left vertex of the face tracking target position in that frame, and calculate the ratio of the side length of the face detection window to the side length of the face tracking window in that frame; when the distance is less than or equal to a preset distance threshold and the ratio is less than or equal to a preset ratio threshold, determine that face detection and face tracking in that frame correspond to the same face target; and select the frames in which face detection and face tracking correspond to the same face target;

an intercepting module, configured to calculate, in each selected frame, the coincidence degree between the face detection window and the face tracking window in that frame, compare all the calculated coincidence degrees, and take the face image obtained by face detection on the frame with the largest coincidence degree as the captured face image.

5. The device according to claim 4, characterized in that the tracking module is further configured to:

when the judging module determines from the position of the target that face detection and face tracking in a frame correspond to the same face target, replace the face image obtained by face tracking in that frame with the face image obtained by face detection, and continue face tracking with the replaced face image.

6. The device according to claim 4, characterized in that the tracking module is further configured to:

when the judging module determines from the position of the target that face detection and face tracking in a frame correspond to different face targets, take the face image obtained by face detection in that frame as a new face target, and start face tracking for the new face target.