CN114494359A - Small sample moving object detection method based on abnormal optical flow - Google Patents
Small sample moving object detection method based on abnormal optical flow
- Publication number
- CN114494359A (application CN202210096206.3A)
- Authority
- CN
- China
- Prior art keywords
- optical flow
- detection
- network
- channel
- abnormal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/269—Analysis of motion using gradient-based methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a small sample moving object detection method based on abnormal optical flow, the key idea of which is to sidestep direct detection of the moving object in the image. Exploiting the dynamic character of video, the strong estimation capability of an optical flow network is used to compute inter-frame motion information, and the motion-anomaly information, namely the optical flow abnormal region, is extracted to detect the moving object. Even against a moving background, the moving object can still be detected well. Experiments verify that a detection network trained on an optical flow map dataset containing only a single object category can still detect optical flow abnormal regions of other categories; that is, the method breaks through the limitation that moving object detection is restricted by target category.
Description
Technical Field
The invention belongs to the field of deep learning, and relates to a small sample moving object detection method based on abnormal optical flow.
Background
The detection of moving objects against a dynamic background is an important component of imaging detection systems such as visible-light and infrared imaging detection. Object detection is the localization of an object of interest in a single frame or an image sequence. In many cases, object detection must contend with extremely complex image sequences, which poses a significant challenge.
As military and civil applications such as swarm unmanned combat, security, and agriculture in complex environments continue to expand in scope and depth, the demand for moving object detection grows more urgent and more stringent. The main reasons include: 1) long-range detection requires ever greater detection distances, so the physical size of the target projected on the image grows ever smaller; 2) the trend toward miniaturized targets such as small unmanned aerial vehicles is strengthening; 3) detector imaging resolution is limited; for example, improvements in infrared imaging resolution remain relatively modest; 4) group targets with many, densely distributed members (such as drone swarms) place higher demands on refined information processing during detection and tracking; 5) the diversification of sensor platforms increases algorithmic difficulty: for a mobile platform (such as an airborne platform), both background and target move in the collected image sequence, and the background cannot be removed by simple static clutter filtering.
However, in object detection, and in deep learning generally, a model's detection capability relies on large amounts of data, yet obtaining large amounts of labeled data consumes substantial labor and time. In object detection it is therefore increasingly desirable to learn task-relevant information from less training data, so that a strong network model can be trained with only a small amount of labeled target data. In imagery collected by a mobile imaging sensor, exploiting the motion information of target and background can reduce the demand for samples and improve the performance of the detection algorithm under the small-sample constraint.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a small sample moving object detection method based on abnormal optical flow that overcomes the limitations of existing moving object detection.
Optical flow estimation computes the dynamic information between two adjacent frames and provides a motion vector for each pixel: the velocity along the x-axis and the velocity along the y-axis. The matrix of motion vectors represents the motion field of the whole image. The method uses an optical flow estimation network to perform the estimation and obtain an optical flow map.
If a moving object exists in the image, that is, if some region (the moving object) has a velocity whose magnitude and direction do not match the background, then in the corresponding optical flow map the flow estimated for that region differs greatly from the flow estimated for its surroundings. A flow region that differs strongly from its surroundings is called an optical flow abnormal region. The position of the optical flow abnormal region is the position of the moving object, so detecting the optical flow abnormal region amounts to detecting the moving object.
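The notion of an optical flow abnormal region can be illustrated with a minimal sketch (this is not the patented detection network): pixels whose flow deviates strongly from the dominant background flow, estimated here by the per-channel median, are flagged as abnormal. The function name and threshold are illustrative assumptions.

```python
import numpy as np

def flow_anomaly_mask(flow, thresh=3.0):
    """Flag pixels whose flow deviates strongly from the dominant
    (background) flow, estimated by the per-channel median.

    flow: array of shape (2, H, W) holding per-pixel (vx, vy).
    Returns a boolean (H, W) mask of optical-flow-abnormal pixels."""
    background = np.median(flow, axis=(1, 2), keepdims=True)  # (2, 1, 1)
    deviation = np.linalg.norm(flow - background, axis=0)     # (H, W)
    return deviation > thresh

# Synthetic flow map: uniform background motion (1, 0) plus a small
# block moving the opposite way, i.e. an optical flow abnormal region.
flow = np.zeros((2, 32, 32))
flow[0] = 1.0                 # background moves right at 1 px/frame
flow[0, 10:14, 10:14] = -4.0  # object block moves left at 4 px/frame
mask = flow_anomaly_mask(flow)
print(mask.sum())             # 16 anomalous pixels
```

In the invention this thresholding role is played by a trained detection network rather than a fixed rule, which is what lets it cope with non-uniform background flow.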
In general, the background moves coherently, its motion vectors are consistent, and it appears as a uniform region in the optical flow map; the moving object moves differently from the background and therefore appears very different from the background region. Compared with the original image, with its complex background, the separability of moving object and background in the optical flow map is greatly enhanced, the difficulty of detection is greatly reduced, and a detection network trained on a small sample attains stronger generalization.
The method comprises the following specific steps:
and (1) training the optical flow estimation network by using a public data set.
Optical flow estimation with a neural network is faster than with conventional methods, so a neural network is used. An existing large-scale public dataset is downloaded to train the optical flow estimation network; the trained network then performs optical flow estimation on pairs of frames to produce optical flow maps.
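Supervised training and evaluation of an optical flow network on public datasets such as FlyingChairs is conventionally driven by the endpoint error between predicted and ground-truth flow; the patent does not spell out the loss, so treating endpoint error as the relevant quantity is an assumption, sketched here:

```python
import numpy as np

def endpoint_error(pred, gt):
    """Mean endpoint error (EPE) between predicted and ground-truth
    flow, both of shape (2, H, W): the per-pixel Euclidean distance
    between flow vectors, averaged over the image."""
    return float(np.linalg.norm(pred - gt, axis=0).mean())

gt = np.zeros((2, 4, 4)); gt[0] = 3.0; gt[1] = 4.0  # uniform (3, 4) flow
pred = np.zeros((2, 4, 4))                          # a zero prediction
print(endpoint_error(pred, gt))  # 5.0
```

RAFT's published training objective is a weighted sum of such errors over its iterative refinements, but any flow network exposing the same frame-pair-to-flow-map interface fits step (1).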
and (2) training the detection network by using the optical flow graph.
The optical flow estimation network generates an optical flow map, a two-channel matrix, which is normalized. Because the moving object and the background differ in speed and direction of motion, the separability of object and background in the flow map increases, while the contour and texture of the moving object are blurred, so different moving objects look similar in the flow map. The region of the moving object in the flow map is called the optical flow abnormal region. The moving target is annotated on the original dataset, and the generated annotation files together with the normalized flow maps are used to train the detection network. Unlike an ordinary training set, whose samples are images (three-channel matrices), the training set here consists of optical flow maps: two-channel matrices that encode the motion information of the image. The detection network uses this motion information to detect the optical flow abnormal region, that is, to detect the target on a velocity vector field. Moreover, two-channel data reduce hardware overhead and speed up network inference.
And (3) the two networks are connected in series for operation.
The two trained networks are connected in series, and the detection result of the detection network is mapped back onto the original image to detect the moving object. On a dataset whose moving targets belong to the same category as the training set, the optical flow abnormal region, and thus the moving target, is detected; on a dataset whose moving targets belong to a different category, it is still detected; and against a dynamic background, it is still detected.
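The serial operation of step (3) can be sketched with stub functions standing in for the trained models; the patent uses RAFT and CenterNet, and `estimate_flow` and `detect` below are hypothetical placeholders that only show the data handoff. Because the flow map has the same resolution as the input frames, mapping detections back onto the original image is the identity mapping.

```python
import numpy as np

def estimate_flow(frame_a, frame_b):
    """Stand-in for a trained flow network (RAFT in the patent).
    Returns a (2, H, W) flow map; here just a dummy constant field
    with one block moving against the background."""
    h, w = frame_a.shape[:2]
    flow = np.zeros((2, h, w))
    flow[0] = 1.0
    flow[0, 4:8, 4:8] = -3.0   # pretend an object was seen moving left
    return flow

def normalize_flow(flow):
    """Per-channel min-max normalization (step 2)."""
    lo = flow.min(axis=(1, 2), keepdims=True)
    hi = flow.max(axis=(1, 2), keepdims=True)
    return (flow - lo) / np.maximum(hi - lo, 1e-8)

def detect(flow_map):
    """Stand-in for the trained detector (CenterNet in the patent).
    Returns boxes (x0, y0, x1, y1) around flow-anomalous regions."""
    med = np.median(flow_map, axis=(1, 2), keepdims=True)
    dev = np.linalg.norm(flow_map - med, axis=0)
    ys, xs = np.nonzero(dev > 0.5)
    if len(xs) == 0:
        return []
    return [(int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))]

# Serial operation: the detector's boxes are already in original image
# coordinates because the flow map matches the frame resolution.
frame_a = np.zeros((16, 16)); frame_b = np.zeros((16, 16))
boxes = detect(normalize_flow(estimate_flow(frame_a, frame_b)))
print(boxes)   # [(4, 4, 7, 7)]
```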
Preferably, the public data set is FlyingChairs, Sintel, or MPI-Sintel.
Preferably, the optical flow estimation network is RAFT.
Preferably, the detection network is CenterNet.
Preferably, the normalization uses the maximum and minimum of each channel of a single optical flow map to normalize the matrix of the corresponding channel.
Preferably, the two-channel optical flow map in step (2) is replaced by a three-channel optical flow map, obtained by adding a third channel to the normalized two-channel map; the third channel holds the magnitude of the velocity vector at each pixel of the optical flow map.
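The preferred per-channel min-max normalization and the optional third (magnitude) channel can be sketched as follows. The text does not specify whether the magnitude is computed before or after normalization, so computing it from the raw flow is an assumption, marked in the comments; function names are illustrative.

```python
import numpy as np

def normalize_per_channel(flow):
    """Normalize each channel of a single (2, H, W) flow map with its
    own min and max, as in the preferred normalization."""
    lo = flow.min(axis=(1, 2), keepdims=True)
    hi = flow.max(axis=(1, 2), keepdims=True)
    return (flow - lo) / np.maximum(hi - lo, 1e-8)

def to_three_channels(flow):
    """Append the velocity-vector magnitude as a third channel so the
    input matches the three-channel interface of mainstream detectors.
    Assumption: magnitude is taken from the raw (un-normalized) flow."""
    magnitude = np.linalg.norm(flow, axis=0, keepdims=True)  # (1, H, W)
    return np.concatenate([normalize_per_channel(flow), magnitude], axis=0)

flow = np.random.randn(2, 8, 8)
x = to_three_channels(flow)
print(x.shape)   # (3, 8, 8)
```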
The invention has the following beneficial effects. The key idea is to sidestep direct detection of the moving target in the image. Exploiting the dynamic character of video, the strong estimation capability of an optical flow network is used to compute inter-frame motion information, and the motion-anomaly information, namely the optical flow abnormal region, is extracted to detect the moving target. Even against a moving background, the moving target can still be detected well. Experiments verify that a detection network trained on an optical flow map dataset containing only a single object category can still detect optical flow abnormal regions of other categories; that is, the method breaks through the limitation that moving target detection is restricted by target category.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a graph showing the results of the experiment;
FIG. 3 is a graph showing the results of the experiment.
Detailed Description
The invention is further described below with reference to specific implementations.
In the experiments, large-scale public datasets such as FlyingChairs, Sintel, and MPI-Sintel are used to train the optical flow estimation network RAFT; then an unmanned aerial vehicle collects several video datasets containing pedestrian targets, which are divided into a training set and a test set, and CenterNet is trained on them as the detection network. The specific steps are as follows:
and (1) training the optical flow estimation network by using a public data set.
The optical flow estimation network RAFT is trained on an existing, publicly available large-scale dataset; common choices include FlyingChairs, Sintel, and MPI-Sintel.
And (2) training the detection network by using the optical flow graph.
RAFT generates an optical flow map, a two-channel matrix, which is normalized; a common method is to normalize each channel's matrix with that channel's maximum and minimum. Because the moving object and the background differ in speed and direction of motion, the separability of object and background in the flow map increases, while the contour and texture of the moving object are blurred, so different moving objects look similar in the flow map. The region of the moving object in the flow map is called the optical flow abnormal region. The moving target is annotated on the original dataset, and the generated annotation files together with the normalized flow maps are used to train the detection network CenterNet. Unlike an ordinary training set, whose samples are images (three-channel matrices), the training set here consists of optical flow maps: two-channel matrices that encode the motion information of the image. The detection network uses this motion information to detect the optical flow abnormal region, that is, to detect the target on a velocity vector field. Moreover, two-channel data reduce hardware overhead and speed up network inference.
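The claimed hardware saving of two-channel input can be made concrete by counting the parameters of a detector's first convolution layer, the only layer whose size depends on the input channel count; the 7x7 kernel and 64 output channels below are illustrative assumptions, not taken from CenterNet.

```python
def conv_params(in_ch, out_ch, k=7, bias=True):
    """Parameter count of a single 2-D convolution layer:
    out_ch filters, each spanning in_ch channels with a k x k kernel,
    plus one bias per output channel."""
    return out_ch * (in_ch * k * k + (1 if bias else 0))

# First layer of a detector with 64 output channels and 7x7 kernels:
rgb   = conv_params(3, 64)   # three-channel image input
flow2 = conv_params(2, 64)   # two-channel optical flow input
print(rgb, flow2)            # 9472 6336
```

The deeper layers are unchanged, so the parameter saving itself is modest; the larger effect is that every input tensor carries two channels instead of three, cutting input memory and bandwidth by a third.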
To match the interface of mainstream neural networks, the magnitude of the velocity vector in the optical flow map can additionally be computed and used as a third channel, expanding the flow map to three-channel data; this increases the semantic information of the dataset and improves recognition performance.
In practice, in scenarios where speed matters, the normalized two-channel optical flow map is used for training and detection; in scenarios where recognition rate matters more, the expanded three-channel optical flow map can be used instead.
Step (3): operate the two networks in series.
The two trained networks, RAFT and CenterNet, are connected in series, and the detection result of CenterNet is mapped onto the original image to detect the moving target. Detection performance is evaluated on the test set; the results show that targets are detected effectively against both static and dynamic backgrounds, and that moving targets of categories different from those in the training set are still detected. The method thus has a degree of generalization and breaks through the limitation of category-specific moving target detection.
Fig. 2 and 3 are graphs showing the experimental results of the present invention.
The above embodiments do not limit the present invention; any embodiment that satisfies the requirements of the present invention falls within its scope.
Claims (6)
1. A small sample moving object detection method based on abnormal optical flow is characterized by comprising the following specific steps:
step (1), training an optical flow estimation network;
downloading an existing, public data set to train an optical flow estimation network, and then carrying out optical flow estimation on two frames of images with the trained optical flow estimation network to obtain an optical flow graph;
Step (2), training a detection network by using an optical flow diagram;
the optical flow diagram is a two-channel matrix, and the two-channel matrix is subjected to normalization processing;
annotating a moving target on an original data set, and using the generated annotation file and the normalized optical flow graph for training a detection network;
step (3), two networks are connected in series for operation
And connecting the two trained networks in series to realize the detection of the moving target.
2. The method of claim 1, wherein the public data set is FlyingChairs, Sintel, or MPI-Sintel.
3. The method of claim 1, wherein the optical flow estimation network is RAFT.
4. The method of claim 1, wherein the detection network is CenterNet.
5. The method of claim 1, wherein the normalization uses the maximum and minimum of each channel of a single optical flow graph to normalize the matrix of the corresponding channel.
6. The method of claim 1, wherein the two-channel optical flow graph in step (2) is replaced with a three-channel optical flow graph obtained by adding a third channel to the normalized two-channel optical flow graph, the third channel being the magnitude of the velocity vector in the optical flow graph.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210096206.3A CN114494359A (en) | 2022-01-26 | 2022-01-26 | Small sample moving object detection method based on abnormal optical flow |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210096206.3A CN114494359A (en) | 2022-01-26 | 2022-01-26 | Small sample moving object detection method based on abnormal optical flow |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN114494359A true CN114494359A (en) | 2022-05-13 |
Family
ID=81476744
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210096206.3A Pending CN114494359A (en) | 2022-01-26 | 2022-01-26 | Small sample moving object detection method based on abnormal optical flow |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN114494359A (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114972517A (en) * | 2022-06-10 | 2022-08-30 | 上海人工智能创新中心 | RAFT-based self-supervision depth estimation method |
| CN115170826A (en) * | 2022-07-08 | 2022-10-11 | 杭州电子科技大学 | Local search-based fast optical flow estimation method for small moving target and storage medium |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108830185A (en) * | 2018-05-28 | 2018-11-16 | 四川瞳知科技有限公司 | Activity recognition and localization method based on multitask combination learning |
| KR20200010971A (en) * | 2018-06-25 | 2020-01-31 | 한국전자통신연구원 | Apparatus and method for detecting moving object using optical flow prediction |
| CN112116633A (en) * | 2020-09-25 | 2020-12-22 | 深圳爱莫科技有限公司 | Mine drilling counting method |
- 2022-01-26: application CN202210096206.3A filed in China; publication CN114494359A, status Pending
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108830185A (en) * | 2018-05-28 | 2018-11-16 | 四川瞳知科技有限公司 | Activity recognition and localization method based on multitask combination learning |
| KR20200010971A (en) * | 2018-06-25 | 2020-01-31 | 한국전자통신연구원 | Apparatus and method for detecting moving object using optical flow prediction |
| CN112116633A (en) * | 2020-09-25 | 2020-12-22 | 深圳爱莫科技有限公司 | Mine drilling counting method |
Non-Patent Citations (2)
| Title |
|---|
| YANQIN WAN 等: "Action Recognition Based on Two-Stream Convolutional Networks With Long-Short-Term Spatiotemporal Features", IEEE, 8 March 2020 (2020-03-08) * |
| ZACHARY TEED 等: "RAFT: recurrent all-pairs field transforms for optical flow", ARXIV:2003.12039V3, 25 August 2020 (2020-08-25) * |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114972517A (en) * | 2022-06-10 | 2022-08-30 | 上海人工智能创新中心 | RAFT-based self-supervision depth estimation method |
| CN114972517B (en) * | 2022-06-10 | 2024-05-31 | 上海人工智能创新中心 | Self-supervision depth estimation method based on RAFT |
| CN115170826A (en) * | 2022-07-08 | 2022-10-11 | 杭州电子科技大学 | Local search-based fast optical flow estimation method for small moving target and storage medium |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN109829398B (en) | A method for object detection in video based on 3D convolutional network | |
| CN103824070B (en) | A kind of rapid pedestrian detection method based on computer vision | |
| CN110222604B (en) | Target identification method and device based on shared convolutional neural network | |
| CN112308883A (en) | A multi-vessel fusion tracking method based on visible light and infrared images | |
| CN104378582A (en) | Intelligent video analysis system and method based on PTZ video camera cruising | |
| CN110567324B (en) | Multi-target group threat degree prediction device and method based on DS evidence theory | |
| Molloy et al. | Detection of aircraft below the horizon for vision‐based detect and avoid in unmanned aircraft systems | |
| Jain et al. | Performance analysis of object detection and tracking algorithms for traffic surveillance applications using neural networks | |
| CN114494359A (en) | Small sample moving object detection method based on abnormal optical flow | |
| CN119314120A (en) | Vehicle tracking and re-identification method, device, electronic device and storage medium | |
| CN113837154A (en) | Open set filtering system and method based on multitask assistance | |
| Tulpan et al. | Experimental evaluation of four feature detection methods for close range and distant airborne targets for Unmanned Aircraft Systems applications | |
| CN117557910A (en) | SAR image small target detection method based on YOLOv5 | |
| Fu et al. | LD‐Net: A novel one‐stage knowledge distillation algorithm for lightning detection network | |
| CN119515730A (en) | A SLAM dynamic disturbance suppression method based on fuzzy processing and target detection | |
| CN115049706B (en) | A long-term target tracking method and system based on improved Staple | |
| CN118154951A (en) | A method for infrared dim small target detection based on joint Swin and spatiotemporal interactive neural network | |
| Hartung et al. | Improvement of persistent tracking in wide area motion imagery by CNN-based motion detections | |
| CN117456438A (en) | Dense crowd detection method based on improved YOLOv7 and storage medium | |
| CN116862832A (en) | Three-dimensional live-action model-based operator positioning method | |
| CN117788936A (en) | Infrared weak and small target classification and identification method based on neural network | |
| Cao et al. | Yolov7-based autonomous driving object detection algorithm | |
| CN108389219A (en) | A kind of Dim target tracking loss weight detecting method based on multi-peak judgement | |
| Leipnitz et al. | The effect of image resolution in the human presence detection: A case study on real-world image data | |
| CN120564232B (en) | Camouflage animal detection method based on frequency domain decoupling and edge information fusion |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination |