CN114708543A - Examination student positioning method in examination room monitoring video image

Info

Publication number: CN114708543A
Application number: CN202210629393.7A
Authority: CN
Inventors: 刘说; 潘帆; 李翔; 赵启军; 黄珂; 杨玲; 杨智鹏
Original assignee: Chengdu University of Information Technology
Current assignee: Chengdu University of Information Technology
Priority date: 2022-06-06
Filing date: 2022-06-06
Publication date: 2022-07-05
Anticipated expiration: 2042-06-06
Also published as: CN114708543B

Abstract

The invention relates to the field of image processing, and in particular to a method for locating examinees in examination room surveillance video images. First, a large amount of examination room surveillance video image data covering different examination scenes and different examinees is annotated with bounding boxes around the hair area on top of each examinee's head, with each box chosen according to how many of the examinee's ears are visible, and an examinee head-top hair area data set is established. On this basis, a preliminary screening is carried out using target detection with a high false alarm rate. Finally, an SSD-based deep learning target detection model is established to locate the head-top hair areas, which in turn locates the examinees.

Description

A method for locating examinees in examination room surveillance video images

Technical Field

The invention belongs to the field of image processing, and in particular relates to a method for locating examinees in examination room surveillance video images.

Background Art

Around the world, examinations are widely used as an important means of assessment and selection, because they can ensure fairness and impartiality to a certain extent. However, many cheating methods exist, and examination surveillance systems are therefore widely deployed in all kinds of examinations to protect that fairness. Having a video surveillance system in the examination room does not, however, mean that the cheating problem is solved.

Although video surveillance records what happens in the examination room fairly completely, deciding whether cheating occurred still requires the relevant departments to invest a great deal of manpower in processing and reviewing the video afterwards. A large proportion of the videos contain no cheating at all, yet every video must be carefully reviewed by staff, which creates a heavy workload. This has produced a need to automatically recognize examinee behavior in examination room surveillance video, and locating the examinees in that video has become a key problem that must be solved first.

Existing detection and localization methods for examination room surveillance video can be roughly divided into methods based on background subtraction, methods based on template matching, and methods based on image features. These methods suffer from a limited detection range and a strong dependence on the layout of the examination room.

Summary of the Invention

To address the above deficiencies of the prior art, the present invention proposes a method for locating examinees in examination room surveillance video images, comprising the following steps:

Step 1: Annotate a large amount of examination room surveillance video image data, covering different examination scenes and different examinees, with bounding boxes around the hair area on top of each examinee's head, and then establish an examinee head-top hair area data set from the annotated image data;

Step 2: Establish a target detection deep learning model for locating the head-top hair area of examinees in examination room surveillance video image data. First screen the pixels in the image data that may be hair to obtain preprocessed image data, and then perform SSD-based deep learning target detection on the preprocessed image data;

Step 3: Divide the established examinee head-top hair area data set in proportion to generate a training data set and a test data set, and train and test the target detection deep learning model established for locating the head-top hair area of examinees, obtaining the final target detection model;

Step 4: Input the initial image data of the examination room surveillance video into the final target detection model to obtain the examinee localization result for the surveillance video image data.
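The proportional split in Step 3 can be sketched as follows. This is only an illustration: the 80/20 ratio, the fixed seed, and the names `split_dataset` and `frames` are assumptions, since the text says only that the data set is divided "in proportion".

```python
import random


def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle annotated samples and split them into train and test subsets."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for annotated surveillance frames.
frames = [f"frame_{i:04d}.jpg" for i in range(10)]
train_set, test_set = split_dataset(frames, train_ratio=0.8)
```

Splitting by frame is the simplest choice; splitting by examination room or by video would avoid near-duplicate frames landing in both subsets, but the text does not say which was used.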

Further, in Step 1, the bounding-box annotation of the head-top hair area is carried out as follows: the head-top hair area of each examinee in the examination room surveillance video image data is marked with a box based on how many of the examinee's ears are exposed and based on the approximate image.

Further, for box marking based on ear exposure, the ear exposure condition is divided into three cases: both ears exposed, one ear exposed, and no ear exposed.

Further, for box marking based on the approximate image, the horizontal and vertical sides of the generated box are parallel to the edges of the image data.

Further, if both ears of the examinee are exposed in the video image data, the box is formed as follows: the lowest point of the hair-forehead boundary obtained by edge detection is the bottom of the box; the box height is the distance from the bottom of the box to the topmost point of the hair obtained by edge detection, multiplied by a weighting coefficient; and the box width is the longest distance between the hair-background boundaries on the left and right sides of the hair, multiplied by a second weighting coefficient.

Further, if one ear of the examinee is exposed in the video image data, the box is formed as follows: the midpoint between the highest and lowest points of the hair-forehead boundary is the bottom of the box; the box height is the distance from the bottom of the box to the topmost point of the hair, multiplied by a weighting coefficient; and the box width is the longest distance between the ear-hair boundary of the exposed ear and the hair-background boundary on the other side, multiplied by a second weighting coefficient.

Further, if no ear of the examinee is exposed in the video image data, the box is formed as follows: the highest point of the hair-forehead boundary is the bottom of the box, the distance from the bottom of the box to the topmost point of the hair is the box height, and the horizontal width of the forehead as displayed in the image data is the box width.
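The three annotation rules above can be summarized in a short sketch. Several things here are assumptions made for illustration: the measurements are taken as already available (for example from edge detection), image y coordinates grow downward, the box is centered horizontally on a supplied `center_x` (the text does not say how the box is placed horizontally), and the weighting coefficients `k_h` and `k_w` default to 1.0 because their values are not given in this text.

```python
from dataclasses import dataclass


@dataclass
class HeadMeasurements:
    hair_top_y: float       # topmost hair row (image y grows downward)
    hairline_low_y: float   # lowest point of the hair-forehead boundary
    hairline_high_y: float  # highest point of the hair-forehead boundary
    hair_span: float        # longest left-right distance used for the width
    forehead_width: float   # horizontal forehead width shown in the image
    center_x: float         # assumed horizontal center of the box


def label_box(m, ears_visible, k_h=1.0, k_w=1.0):
    """Return an axis-aligned (x_min, y_min, x_max, y_max) hair-area box."""
    if ears_visible == 2:
        # Bottom at the lowest hairline point; both sides are weighted.
        bottom = m.hairline_low_y
        height = k_h * (bottom - m.hair_top_y)
        width = k_w * m.hair_span
    elif ears_visible == 1:
        # Bottom at the midpoint of the hairline's highest and lowest points.
        bottom = (m.hairline_low_y + m.hairline_high_y) / 2.0
        height = k_h * (bottom - m.hair_top_y)
        width = k_w * m.hair_span
    elif ears_visible == 0:
        # Bottom at the highest hairline point; no weighting is applied.
        bottom = m.hairline_high_y
        height = bottom - m.hair_top_y
        width = m.forehead_width
    else:
        raise ValueError("ears_visible must be 0, 1 or 2")
    half = width / 2.0
    return (m.center_x - half, bottom - height, m.center_x + half, bottom)


# Hypothetical measurements in pixels.
m = HeadMeasurements(hair_top_y=100, hairline_low_y=160, hairline_high_y=140,
                     hair_span=80, forehead_width=60, center_x=200)
box_two_ears = label_box(m, 2)
box_one_ear = label_box(m, 1)
box_no_ears = label_box(m, 0)
```

Note that `hair_span` means slightly different things in the two weighted cases: hair edge to hair edge when both ears show, and ear-hair edge to the opposite hair-background edge when one ear shows.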

Further, in Step 2, the pixels in the examination room surveillance video image data that may be hair are screened to obtain preprocessed image data as follows. First, the image data is converted to grayscale to obtain grayscale image data. Then the value of each pixel in the grayscale image data is inverted to obtain grayscale-inverted image data; the inversion relates the gray value of the pixel at a given abscissa and ordinate in the grayscale image data to the gray value of the pixel at the same coordinates in the grayscale-inverted image data. CFAR target detection with a high false alarm rate is then performed on the grayscale-inverted image data to obtain screened image data. Finally, a threshold is set and the screened image data is binarized to obtain the preprocessed image data.
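The screening pipeline can be sketched in plain Python on a tiny image. This is a sketch under stated assumptions, not the patented implementation: the inversion is taken to be `255 - value` for 8-bit data (the formula itself is missing from this text), the CFAR variant is a simple two-dimensional cell-averaging CFAR, and the guard size, training size, scale factor, and binarization threshold are hypothetical values. A scale factor close to 1 corresponds to a permissive detector, that is, a high false alarm rate.

```python
def to_gray(rgb):
    """Luma-weighted grayscale of a nested list of (r, g, b) pixels."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb]


def invert(gray, max_value=255):
    """Invert gray values so dark hair becomes bright (assumed 8-bit range)."""
    return [[max_value - v for v in row] for row in gray]


def ca_cfar(img, guard=1, train=2, scale=1.1):
    """Cell-averaging CFAR: keep a pixel if it exceeds scale * local mean."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    reach = guard + train
    for y in range(h):
        for x in range(w):
            total, count = 0, 0
            for yy in range(max(0, y - reach), min(h, y + reach + 1)):
                for xx in range(max(0, x - reach), min(w, x + reach + 1)):
                    if abs(yy - y) <= guard and abs(xx - x) <= guard:
                        continue  # skip the cell under test and guard cells
                    total += img[yy][xx]
                    count += 1
            background = total / count if count else 0.0
            if img[y][x] > scale * background:
                out[y][x] = img[y][x]  # keep the value of screened-in pixels
    return out


def binarize(img, threshold):
    """Map values above the threshold to 1 and everything else to 0."""
    return [[1 if v > threshold else 0 for v in row] for row in img]


def preprocess(rgb, scale=1.1, threshold=128):
    """Grayscale, invert, screen with CFAR, then binarize."""
    screened = ca_cfar(invert(to_gray(rgb)), scale=scale)
    return binarize(screened, threshold)


# A 9x9 light frame with a 2x2 dark patch standing in for hair.
image = [[(200, 200, 200)] * 9 for _ in range(9)]
for y in (4, 5):
    for x in (4, 5):
        image[y][x] = (10, 10, 10)
mask = preprocess(image)
```

The nested loops are quadratic in the window size and are only suitable for illustration; a real frame would call for an integral image or a vectorized array library.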

Further, in Step 2, SSD-based deep learning target detection is performed on the preprocessed image data as follows: the binary detection result image data is used as an index image; the coordinates of the pixels with non-zero values in the index image are mapped onto the corresponding examination room surveillance video image data; the mapped pixels are used as anchor box center points; and, based on the SSD target detection framework, a target detection model for locating hair areas in examination room surveillance video image data is established.
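The anchor selection described above can be illustrated with a small sketch. It assumes the index image has the same resolution as the video frame, so coordinates map one-to-one, and it uses hypothetical anchor sizes; how these anchors are wired into an SSD network is not shown, because the text does not describe it.

```python
def anchor_centers(index_image):
    """Return (x, y) for every non-zero pixel of the binary index image."""
    return [(x, y)
            for y, row in enumerate(index_image)
            for x, value in enumerate(row)
            if value != 0]


def anchors_at(centers, sizes, image_w, image_h):
    """Build (x_min, y_min, x_max, y_max) anchors clipped to the frame."""
    boxes = []
    for cx, cy in centers:
        for w, h in sizes:
            boxes.append((max(0.0, cx - w / 2.0),
                          max(0.0, cy - h / 2.0),
                          min(float(image_w), cx + w / 2.0),
                          min(float(image_h), cy + h / 2.0)))
    return boxes


# A 4x3 index image with two screened-in pixels.
index_image = [[0, 0, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 1]]
centers = anchor_centers(index_image)
boxes = anchors_at(centers, sizes=[(2, 2)], image_w=4, image_h=3)
```

Restricting anchors to screened-in pixels is what reduces the number of candidate boxes compared with placing anchors on every grid cell.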

Further, in Step 4, the examinee localization result is obtained as follows: the initial image data of the examination room surveillance video is input into the final target detection model to obtain the hair area boxes; each box is then extended downward by a set multiple of its own extent to obtain an updated box; and the updated boxes are taken as the examinee localization result.
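The downward extension in Step 4 amounts to moving the bottom edge of each detected hair box. In the sketch below, the multiple `factor=2.0` is a hypothetical value (the actual multiple is missing from this text), "its own extent" is read as the box height, and clipping to the frame height is an added assumption.

```python
def extend_down(box, factor, image_h):
    """Extend a (x_min, y_min, x_max, y_max) box downward by factor * height."""
    x_min, y_min, x_max, y_max = box
    height = y_max - y_min
    # Only the bottom edge moves; it is clipped to the bottom of the frame.
    new_bottom = min(float(image_h), y_max + factor * height)
    return (x_min, y_min, x_max, new_bottom)


hair_box = (160, 100, 240, 160)
examinee_box = extend_down(hair_box, factor=2.0, image_h=1080)
clipped_box = extend_down(hair_box, factor=20.0, image_h=1080)
```

The top and side edges stay where the detector placed them, so the result covers the head and the area below it.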

The present invention solves the following technical problems:

1. It proposes a bounding-box annotation method for examination room surveillance video image data that is based on the examinee's hair area and chosen according to how many of the examinee's ears are visible, which improves the accuracy and reliability of the examinee hair area data set.

2. By preliminarily screening the pixels that may be hair using target detection with a high false alarm rate, it effectively improves the accuracy of examinee hair area detection.

3. By using the binary detection result of the surveillance video image data as an index image and selecting anchor boxes on the basis of that index image, it improves the accuracy of examinee hair area detection while reducing the complexity of the target detection model.

Description of Drawings

Figure 1 is a flow chart of a method for locating examinees in examination room surveillance video images.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The flow chart of the method is shown in Figure 1.

A method for locating examinees in examination room surveillance video images comprises Steps 1 to 4 and the further refinements set out in the Summary of the Invention; the embodiment repeats those steps.

Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of the embodiments of the present invention, without creative effort, fall within the protection scope of the present invention.

Claims

1. A method for positioning examinees in an examination room monitoring video image is characterized by comprising the following steps:

step 1: performing frame selection marking based on the head hair area of the examinee on a large amount of examination room monitoring video image data containing different examination scenes and different examinees, and then establishing an examinee head hair area data set of the examination room monitoring video image data;

step 2: establishing a target detection deep learning model for positioning a hair area at the top of an examinee in examination room monitoring video image data, firstly screening pixel points which are possibly hairs in the examination room monitoring video image data to obtain preprocessed image data, and then carrying out deep learning target detection based on SSD on the preprocessed image data;

step 3: dividing the established examinee head-top hair area data set of the examination room monitoring video image data in proportion to generate a training data set and a test data set respectively, and training and testing the established target detection deep learning model for positioning the head-top hair area of the examinee in the examination room monitoring video image data to obtain a final target detection model;

step 4: inputting initial image data of the examination room monitoring video into the final target detection model, and then obtaining the examinee positioning result in the examination room monitoring video image data.

2. The method for positioning examinees in an examination room monitoring video image according to claim 1, characterized in that in step 1, performing frame selection marking based on the head-top hair area of the examinee on a large amount of examination room monitoring video image data containing different examination scenes and different examinees specifically comprises: performing box marking, based on the ear exposure condition of the examinee and based on the approximate image, on the head-top hair area of the examinee in the examination room monitoring video image data.

3. The method according to claim 2, wherein box marking based on the ear exposure condition of the examinee is performed on the head-top hair area of the examinee in the examination room monitoring video image data, and the ear exposure condition is divided into: two ears exposed, one ear exposed, and no ear exposed.

4. The method according to claim 2, wherein box marking based on the approximate image is performed on the head-top hair area of the examinee in the examination room monitoring video image data, and the horizontal and vertical edges of the generated frame are parallel to the edges of the image data.

5. The method according to claim 3, wherein if two ears of the examinee are exposed in the video image data, the frame selection area is formed by: using the lowest point of the boundary between the hair and the forehead obtained by edge detection as the bottom of the frame selection area; using the distance from the bottom of the frame selection area to the top of the hair obtained by edge detection, multiplied by a weighting coefficient, as the frame selection height; and using the longest distance between the boundaries of the left and right sides of the hair with the background, multiplied by a second weighting coefficient, as the frame selection width.

6. The method according to claim 3, wherein if one ear of the examinee is exposed in the video image data, the frame selection area is formed by: using the middle point between the highest point and the lowest point of the boundary between the hair and the forehead as the bottom of the frame selection area; using the distance from the bottom of the frame selection area to the top of the hair, multiplied by a weighting coefficient, as the frame selection height; and using the longest distance between the edge where the exposed ear meets the hair and the edge where the hair on the other side meets the background, multiplied by a second weighting coefficient, as the frame selection width.

7. The method according to claim 3, wherein if no ear of the examinee is exposed in the video image data, the highest point of the boundary between the hair and the forehead is the bottom of the frame selection area, the distance from the bottom of the frame selection area to the top of the hair is the frame selection height, and the horizontal width of the forehead displayed in the image data is the frame selection width, thereby forming the frame selection area.

8. The method for positioning examinees in an examination room monitoring video image according to claim 1, characterized in that in step 2, screening pixel points which are possibly hair in the examination room monitoring video image data to obtain preprocessed image data specifically comprises: first, graying the image data to obtain grayscale image data; then inverting the pixel value of each pixel point in the grayscale image data to obtain grayscale inverted image data, the inversion relating the gray value of the pixel point at a given abscissa and ordinate in the grayscale image data to the gray value of the pixel point at the same coordinates in the grayscale inverted image data; performing CFAR target detection with a high false alarm rate on the grayscale inverted image data to obtain screened image data; and setting a threshold and carrying out binarization processing on the screened image data to obtain the preprocessed image data.

9. The method for positioning examinees in an examination room monitoring video image according to claim 1, characterized in that in step 2, performing SSD-based deep learning target detection on the preprocessed image data specifically comprises: using the binary detection result image data as an index image, mapping the coordinates of the pixel points with non-0 pixel values in the index image to the corresponding examination room monitoring video image data, using the mapped pixel points as the center points of anchor frames, and establishing, based on an SSD target detection framework, a target detection model for positioning the hair area in the examination room monitoring video image data.

10. The method for positioning examinees in an examination room monitoring video image according to claim 1, characterized in that in step 4, inputting initial image data of the examination room monitoring video into the final target detection model and then obtaining the examinee positioning result in the examination room monitoring video image data specifically comprises: inputting the initial image data of the examination room monitoring video into the final target detection model to obtain a hair area frame selection result; expanding each frame selection area downwards by a set multiple of its own range to obtain an updated area frame selection result; and determining the updated area frame selection result as the examinee positioning result.