CN111933275B - A Depression Assessment System Based on Eye Movement and Facial Expression


Info

Publication number
CN111933275B
Authority
CN
China
Prior art keywords
eye movement
expression
module
features
depression
Legal status
Active
Application number
CN202010692613.1A
Other languages
Chinese (zh)
Other versions
CN111933275A (en
Inventor
胡斌
杨民强
Current Assignee
Lanzhou University
Original Assignee
Lanzhou University

Classifications

    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; for computer-aided diagnosis, e.g. based on medical expert systems
    • G06F18/2135 Pattern recognition; Feature extraction, e.g. by transforming the feature space; based on approximation criteria, e.g. principal component analysis
    • G06F18/24323 Pattern recognition; Classification techniques; Tree-organised classifiers
    • G06F18/253 Pattern recognition; Fusion techniques of extracted features
    • G06N3/045 Neural networks; Architecture, e.g. interconnection topology; Combinations of networks
    • G06N3/08 Neural networks; Learning methods
    • G06V10/44 Extraction of image or video features; Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/48 Extraction of image or video features by mapping characteristic values of the pattern into a parameter space, e.g. Hough transformation
    • G06V40/168 Human faces, e.g. facial parts, sketches or expressions; Feature extraction; Face representation
    • G06V40/174 Human faces, e.g. facial parts, sketches or expressions; Facial expression recognition
    • G06V40/193 Eye characteristics, e.g. of the iris; Preprocessing; Feature extraction

Abstract

The invention provides a depression assessment system based on eye movement and facial expression, which extracts and analyzes effective eye movement and expression features to provide an objective, quantified and non-invasive means of depression assessment. The system comprises an emotion stimulation module, an expression acquisition module, an eye movement acquisition module, an eye movement feature extraction module, an expression feature extraction module, a machine learning classification module and an automatic evaluation module. The expression acquisition module acquires the subject's expression information while the subject views the different emotional stimulus pictures output by the emotion stimulation module; the eye movement acquisition module acquires the subject's eye movement information while the subject views the same pictures. The eye movement feature extraction module extracts eye movement features from the acquired eye movement images, and the expression feature extraction module extracts expression features from the acquired expression images. The machine learning classification module performs feature fusion and machine learning classification, and the automatic evaluation module assesses the subject's degree of depression according to the classification result.

Description

Depression evaluation system based on eye movement and facial expression
Technical Field
The invention relates to the technical field of computer-aided early detection of depression, and in particular to a depression evaluation system based on eye movement and facial expression, which analyzes eye movement and facial expression features with a machine learning method in order to evaluate depression at an early stage.
Background
Depression is a common mental disorder affecting about 350 million people worldwide, and the World Health Organization (WHO) predicted that by 2020 depression would become the second leading cause of disease burden worldwide, after heart disease. However, the diagnosis of depression and the assessment of treatment efficacy depend mainly on subjective methods such as family history, patient self-report and clinical scales. Objective measurement methods and tools are lacking, so affective disorders are difficult to identify early and patients often miss the best opportunity for treatment.
Biomedical research has found that a person's spontaneous eye movements and expressions are closely related to their psychological state, especially to depressive symptoms. Studies have shown that patients with depressive disorders respond to emotional stimuli differently from non-depressed individuals, and that these responses, such as eye movements and expressions, are often subconscious. Compared with traditional evaluation methods, such physiological indicators are more objective, and the data can be collected conveniently with non-invasive equipment that is easy to operate.
Disclosure of Invention
In view of the problems in the prior art, the invention provides a depression evaluation system based on eye movement and facial expression. By extracting the effective features of eye movement and expression and analyzing them jointly through fusion, the system establishes the association between eye movement, expression and depressive disorder, and thereby provides an objective, quantified and non-invasive means of depression evaluation.
The technical scheme of the invention is as follows:
1. A depression evaluation system based on eye movement and facial expression, characterized by comprising an emotion stimulation module, an expression acquisition module, an eye movement acquisition module, an eye movement feature extraction module, an expression feature extraction module, a machine learning classification module and an automatic evaluation module. The expression acquisition module acquires expression information while the subject views the different emotional stimulus pictures output by the emotion stimulation module; the eye movement acquisition module acquires eye movement information while the subject views the same pictures; the eye movement feature extraction module extracts eye movement features from the acquired eye movement images, and the expression feature extraction module extracts expression features from the acquired expression images; the machine learning classification module performs feature fusion and machine learning classification; the automatic evaluation module evaluates the subject's degree of depression according to the machine learning classification result.
2. The eye movement acquisition module comprises a foreground camera and eye movement cameras. The foreground camera is mounted at the middle of the subject's forehead and captures the subject's field of view, with a resolution of 1080p and a sampling rate of 30 fps. The eye movement cameras are mounted to the left and right of the subject's cheeks and capture pupil images of the left and right eyes; this requires a higher frame rate, so the resolution is 120x120 and the sampling rate is 200 fps.
3. The eye movement acquisition module comprises a glasses frame that holds the foreground camera and the eye movement cameras. The frame is 3D printed from polyurethane and comprises a foreground camera bracket, a lengthened nose pad and eye movement camera brackets. The foreground camera bracket sits above the eyebrows, with the foreground camera fixed at its center, and rests on the nose through the lengthened nose pad. The eye movement camera brackets are connected to the left and right sides of the foreground camera bracket, and each joint has an arc-shaped structure through which the temples of ordinary glasses can pass. An eye movement camera is fixed at the end of each eye movement camera bracket, and each bracket is rotated outwards by a certain angle so that the eye movement camera does not occlude the cheek and captures the pupil obliquely from below. The brackets can be extended and rotated to fit subjects with different face shapes.
4. The expression acquisition module comprises an expression acquisition camera placed at a suitable position in front of the subject so as to capture the subject's complete face. A Logitech C1000e is used, with a resolution of 4096x2160 and a sampling rate of 60 fps.
5. The emotion stimulation module comprises picture material that gives the subject positive, neutral and negative emotional stimuli, and audio material with the same emotion as each picture stimulus.
6. The eye movement feature extraction module extracts eye movement features from the acquired pupil images. The Canny edge detection operator and the Hough circle detection algorithm are used to extract the pupil radius and pupil center coordinates in each image, from which the eye movement trajectory and pupil size changes are calculated. The eye movement data are first denoised with Gaussian filtering, then the Canny edge detection operator and Hough circle detection are used to find the center and size of the pupil, and the subject's gaze area and gaze duration features are calculated as well.
7. The expression feature extraction module extracts TOFS features from the acquired expression images. Building on the MDMO (main directional mean optical flow) feature of the optical flow field, a sliding window is added to cut the video stream of the whole sequence into multiple video segments, from which expression features are extracted to obtain a 41-dimensional feature vector. A convolutional neural network (CNN) is first used to locate the face region, then 66 facial feature points and 36 regions of interest (ROIs) are computed, and finally the TOFS features are calculated.
8. The machine learning classification module includes a feature fusion step: the two extracted feature vectors, eye movement features and expression features, are fused by multimodal parallel feature fusion. The two vectors are combined into a single complex vector, so that the fused features lie in a complex vector space, and the dimension of this space is then reduced by principal component analysis.
9. The machine learning classification module includes a classifier training step: the collected eye movement data and expression data are labeled according to whether the subject is a depressed patient; the data and their labels are used together as training data; and a decision tree is used for classification to build and train a classifier model.
10. The automatic evaluation module includes an automatic evaluation step: eye movement data and expression data are collected from a subject whose depression status is unknown, features are extracted and fused, and the fused features are input into the trained classifier model. The model automatically evaluates from the input features whether the subject has a tendency toward depression and outputs a depression classification, whose result is either normal or depressed.
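The parallel fusion described in item 8 can be illustrated with a minimal sketch. It assumes that the eye movement vector becomes the real part and the expression vector the imaginary part, and that the shorter vector is zero-padded; the function name `fuse_parallel` and the sample values are hypothetical, and the subsequent principal component analysis step is not shown.

```python
def fuse_parallel(eye_features, expr_features):
    """Combine two real feature vectors into one complex vector.

    Assumption: eye movement features form the real part and expression
    features the imaginary part; the shorter vector is zero-padded.
    """
    n = max(len(eye_features), len(expr_features))
    eye = list(eye_features) + [0.0] * (n - len(eye_features))
    expr = list(expr_features) + [0.0] * (n - len(expr_features))
    return [complex(a, b) for a, b in zip(eye, expr)]


# Hypothetical feature values, for illustration only.
fused = fuse_parallel([0.2, 1.5, 3.0], [0.7, 0.1])
print(fused)
```

Each fused element carries one value from each modality, which is what places the combined features in a complex vector space before dimension reduction.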
The technical effects of the invention are as follows:
The depression evaluation system based on eye movement and facial expression provided by the invention extracts the effective features of eye movement and expression and analyzes them jointly through fusion. It thereby establishes the association between eye movement, expression and depressive disorder, and provides an objective, quantified and non-invasive means of depression evaluation.
On the hardware side, the invention improves the structure of the glasses frame so that important areas are not occluded while eye movement and expression data are collected, and so that the many people who wear glasses can still be tested. On the algorithm side, the processing of the eye movement and expression data, the calculation and extraction of their features, and the multimodal analysis are combined, and a classification model is obtained with a machine learning classification algorithm. The trained classification model is applied in an automatic depression evaluation system. The detection process is visualized and convenient to operate, and the subject's degree of depression is evaluated with a wearable, non-invasive method, which enables early evaluation.
Drawings
Fig. 1 is a schematic diagram of the overall framework of the system of the present invention.
Fig. 2a is a perspective view of the 3D-printed glasses frame according to an embodiment of the present invention.
Fig. 2b is a front view of the 3D-printed glasses frame according to an embodiment of the present invention.
Fig. 2c is a side view of the 3D-printed glasses frame according to an embodiment of the present invention.
Fig. 2d is a top view of the 3D-printed glasses frame according to an embodiment of the present invention.
Fig. 3a is a front view showing the 3D-printed glasses frame of an embodiment of the present invention as worn.
Fig. 3b is a side view showing the 3D-printed glasses frame of an embodiment of the present invention as worn.
Fig. 4 is a schematic diagram of the system workflow of the present invention.
Fig. 5 is a schematic flow chart of extracting TOFS features.
Reference numerals: 1 - foreground camera bracket, 2 - lengthened nose pad, 3 - foreground camera, 4 - eye movement camera, 5 - arc-shaped structure, 6 - eye movement camera bracket.
Detailed Description
Embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of the overall framework of the system of the present invention. A depression evaluation system based on eye movement and facial expression comprises an emotion stimulation module, an expression acquisition module, an eye movement acquisition module, an eye movement feature extraction module, an expression feature extraction module, a machine learning classification module and an automatic evaluation module. The expression acquisition module acquires expression information while the subject views the different emotional stimulus pictures output by the emotion stimulation module; the eye movement acquisition module acquires eye movement information while the subject views the same pictures; the eye movement feature extraction module extracts eye movement features from the acquired eye movement images, and the expression feature extraction module extracts expression features from the acquired expression images; the machine learning classification module performs feature fusion and machine learning classification; the automatic evaluation module evaluates the subject's degree of depression according to the machine learning classification result.
The eye movement acquisition module comprises a foreground camera and eye movement cameras. The foreground camera is mounted at the middle of the subject's forehead and captures the subject's field of view, with a resolution of 1080p and a sampling rate of 30 fps. The eye movement cameras are mounted to the left and right of the subject's cheeks and capture pupil images of the left and right eyes; this requires a higher frame rate, so the resolution is 120x120 and the sampling rate is 200 fps. The expression acquisition module comprises an expression acquisition camera placed at a suitable position in front of the subject so as to capture the subject's complete face. A Logitech C1000e is used, with a resolution of 4096x2160 and a sampling rate of 60 fps.
In the embodiment of the invention, a 3D-printed glasses frame holds the foreground camera and the eye movement cameras. Figs. 2a, 2b, 2c and 2d show a perspective view, a front view, a side view and a top view of the 3D-printed glasses frame, respectively. The frame is 3D printed from polyurethane and comprises a foreground camera bracket 1, a lengthened nose pad 2 and eye movement camera brackets 6. The foreground camera bracket 1 sits above the eyebrows, with the foreground camera 3 fixed at its center, and rests on the nose through the lengthened nose pad 2. The eye movement camera brackets 6 are connected to the left and right sides of the foreground camera bracket 1, and each joint has an arc-shaped structure 5 through which the temples of ordinary glasses can pass. An eye movement camera 4 is fixed at the end of each eye movement camera bracket 6, and each bracket is rotated outwards by a certain angle so that the eye movement camera does not occlude the cheek and captures the pupil obliquely from below. The brackets can be extended and rotated to fit subjects with different face shapes.
Figs. 3a and 3b show a front view and a side view, respectively, of the 3D-printed glasses frame of the embodiment as worn. The design of the frame is improved in two main respects: it does not occlude the facial regions of interest for expression analysis, and it accommodates subjects who wear glasses for myopia. The main body of the frame is made of polyurethane, so that it is not too thick and has good toughness. The frame was redesigned for experimental data acquisition. First, because the acquired data include both eye movements and expressions, the frame must occlude the face as little as possible. Experiments show that occluding the forehead and nose has little influence on expression acquisition, whereas occlusion of key parts such as the eyebrows, eyes and mouth should be avoided as far as possible. The frame is therefore shifted upwards as a whole, so that its front part sits above the eyebrows, and the nose pad is lengthened and concentrated in front of the nose. A tough, compact material is used to reduce occlusion elsewhere. Second, because a growing number of people wear glasses for myopia, the frame is adapted to them by raising its side arms appropriately.
In addition, an arc-shaped notch is added to the camera brackets on both sides near the subject's temples, so that the temples of a pair of myopia glasses can pass through it, which is convenient for subjects who wear glasses. The eye movement camera brackets are also rotated outwards by 15 degrees so that the cameras capturing the pupils do not occlude the cheeks, and the cameras are oriented to capture the pupils obliquely from below, so that the face is not occluded while pupil image acquisition is still ensured.
Fig. 4 is a schematic diagram of the system workflow of the present invention. Before the experiment starts, the experimental environment is made comfortable and relatively quiet, so that interference from the external environment is reduced and noise sources are eliminated. Once the experimental scene meets these requirements, synchronized eye movement and expression acquisition begins. During acquisition the subject is shown picture stimuli of different emotions, each accompanied by audio stimulation with the same emotion as the picture. The stimulus pictures are divided into positive, neutral and negative. After the raw eye movement and expression data are obtained, they are preprocessed, and the eye movement features and expression features are extracted. The features are fused at the feature level to obtain effective features, which are classified with a decision tree, a machine learning classification algorithm. A classifier model is built and trained until its predictions fit the true labels well enough to perform well on the actual evaluation task, which yields the classifier. The automatic evaluation system based on the trained classifier then evaluates a subject's degree of depression automatically: a subject whose depression status is unknown undergoes the same data acquisition and feature extraction, the calculated features are input into the automatic evaluation system, and the classifier outputs a depression classification, whose result is either normal or depressed.
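As an illustration of the training and evaluation stages, the following sketch trains a depth-one decision tree (a decision stump) by minimizing weighted Gini impurity. This is a deliberately simplified stand-in rather than the classifier used in the invention: the depth, the Gini criterion, the function names and the toy feature values are all assumptions. Labels are encoded as 0 for normal and 1 for depressed.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def train_stump(X, y):
    """Return (feature index, threshold, left label, right label)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                # Each side predicts its majority label.
                left_label = int(2 * sum(left) >= len(left))
                right_label = int(2 * sum(right) >= len(right))
                best = (score, j, t, left_label, right_label)
    return best[1:]


def predict(stump, row):
    j, t, left_label, right_label = stump
    return left_label if row[j] <= t else right_label


# Hypothetical fused feature rows and labels (0 = normal, 1 = depressed).
X = [[0.1, 5.0], [0.2, 4.0], [0.8, 5.5], [0.9, 4.5]]
y = [0, 0, 1, 1]
stump = train_stump(X, y)
print(stump, predict(stump, [0.7, 4.2]))
```

A full decision tree applies the same split search recursively to each side; a library implementation would normally be preferred in practice.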
The emotion stimulation module is designed as follows:
1) Nine-point calibration: a yellow mark appears in turn at nine points of the screen (middle, upper, lower, left, right, upper left, lower right, lower left and upper right), and the subject looks at each mark so that the pupil position can be calibrated.
2) Continuous picture stimulation: the picture material comes from the International Affective Picture System (IAPS) and comprises pictures that give the subject positive, neutral and negative emotional stimuli, together with audio material with the same emotion as each picture stimulus. The experimental paradigm presents groups of neutral, positive, neutral and negative pictures in that order, and the sequence is repeated twice. Each group contains 5 pictures of the same emotional category, each picture is shown for 5 s, and consecutive groups are separated by a 5 s interval.
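The timing of the picture paradigm can be worked out with a short sketch. It assumes that "repeated twice" means the four-group sequence is shown two times in total, and that the 5 s interval falls only between groups; the function name and parameters are hypothetical, and the nine-point calibration phase is not included.

```python
def build_schedule(order=("neutral", "positive", "neutral", "negative"),
                   repeats=2, pictures_per_group=5,
                   picture_seconds=5, gap_seconds=5):
    """Return ([(start_second, emotion, picture_index), ...], total_seconds)."""
    schedule = []
    t = 0
    groups = list(order) * repeats
    for g, emotion in enumerate(groups):
        for k in range(pictures_per_group):
            schedule.append((t, emotion, k))
            t += picture_seconds
        if g < len(groups) - 1:
            t += gap_seconds  # interval between consecutive groups
    return schedule, t


schedule, total_seconds = build_schedule()
print(len(schedule), total_seconds)
```

Under these assumptions the paradigm shows 40 pictures in 8 groups: 8 x 25 s of pictures plus 7 x 5 s of intervals.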
Once data collection is ready, the screen starts to play the video stimulation paradigm. At the same time the eye movement acquisition module and the expression acquisition module start working, and they collect the subject's eye movement images and expression images while the subject views the different emotional stimulus pictures. Both modules record image data: they record the subject's eye movement data and expression data and save them in a video format when the stimulation paradigm ends. The eye movement and expression data are labeled according to whether the subject is a depressed patient, and the data and their labels are used together as training data. Whether a subject is a depressed patient is determined by a hospital doctor using conventional diagnostic methods such as a clinical interview.
The eye movement feature extraction module extracts eye movement features from the acquired pupil images. The Canny edge detection operator and the Hough circle detection algorithm are used to extract the pupil radius and pupil center coordinates in each image, from which the eye movement trajectory and pupil size changes are calculated. The eye movement data are first denoised with Gaussian filtering, then the Canny edge detection operator and Hough circle detection are used to find the center and size of the pupil, and the subject's gaze area and gaze duration features are calculated as well. The expression feature extraction module extracts TOFS features from the acquired expression images: based on the optical flow field features, a sliding window is added to cut the video stream of the whole sequence into multiple video segments, from which expression features are extracted to obtain a 41-dimensional feature vector. A convolutional neural network (CNN) is first used to locate the face region, then 66 facial feature points and 36 regions of interest (ROIs) are computed, and finally the TOFS features are calculated.
Eye movement feature extraction operates mainly on the captured pupil images. The module uses the Canny edge detection operator and Hough circle detection to extract the pupil radius and pupil center coordinates from each image, from which the eye movement trajectory and pupil size changes are calculated. The detailed steps are as follows:
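Once a pupil center and radius are available for every frame, the trajectory, pupil size change and dwell time follow from simple arithmetic. The sketch below is a minimal illustration under stated assumptions: detections are given as `(cx, cy, r)` tuples in pixels, the frame rate is the 200 fps of the eye movement cameras, and the gaze region is expressed directly in pupil image coordinates, whereas a real system would first map pupil positions to screen coordinates through calibration. All names and sample values are hypothetical.

```python
import math


def eye_movement_features(detections, fps=200.0):
    """Summarize per-frame pupil detections given as (cx, cy, r) in pixels."""
    steps = []
    radius_changes = []
    for (x0, y0, r0), (x1, y1, r1) in zip(detections, detections[1:]):
        steps.append(math.hypot(x1 - x0, y1 - y0))  # frame-to-frame movement
        radius_changes.append(r1 - r0)              # frame-to-frame size change
    mean_speed = (sum(steps) / len(steps)) * fps if steps else 0.0
    return {
        "path_length_px": sum(steps),
        "mean_speed_px_per_s": mean_speed,
        "radius_changes_px": radius_changes,
    }


def dwell_time(detections, region, fps=200.0):
    """Seconds the pupil center lies inside region = (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = region
    inside = sum(1 for x, y, _ in detections
                 if x_min <= x <= x_max and y_min <= y <= y_max)
    return inside / fps


# Hypothetical detections for four consecutive frames.
frames = [(60, 60, 20), (63, 64, 21), (63, 64, 19), (70, 64, 19)]
features = eye_movement_features(frames)
print(features, dwell_time(frames, (55, 55, 65, 65)))
```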
Step 1: Canny edge detection operator
(1) Gaussian filtering
The Canny edge detection algorithm is sensitive to noise, so the image is first smoothed by convolving it with a Gaussian filter, which reduces the influence of noise on the edge detection result. The Gaussian filter kernel of size (2k+1)x(2k+1) is generated by the following equation, where σ is the standard deviation of the Gaussian:
$$H_{ij} = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{(i-(k+1))^2 + (j-(k+1))^2}{2\sigma^2}\right), \quad 1 \le i, j \le 2k+1$$
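A Gaussian kernel of size (2k+1)x(2k+1) can be generated as in the following sketch. Indices are zero-based here, so the kernel center is at offset k, and the kernel is normalized to sum to 1 so that smoothing preserves overall brightness; the normalization and the function name are assumptions made for illustration.

```python
import math


def gaussian_kernel(k, sigma):
    """Return a normalized (2k+1) x (2k+1) Gaussian kernel as nested lists."""
    size = 2 * k + 1
    kernel = [[math.exp(-((i - k) ** 2 + (j - k) ** 2) / (2.0 * sigma ** 2))
               / (2.0 * math.pi * sigma ** 2)
               for j in range(size)]
              for i in range(size)]
    total = sum(sum(row) for row in kernel)
    # Normalize so that the weights sum to 1.
    return [[v / total for v in row] for row in kernel]


kernel = gaussian_kernel(k=1, sigma=1.0)
for row in kernel:
    print(["%.4f" % v for v in row])
```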
(2) Calculating gradient intensity and direction
The defining feature of an edge is a sharp change in gray value, and this change is described by the gradient. Each pixel has 8 neighbors, so the gradient can lie along four orientations: horizontal, vertical and the two diagonals. The Canny algorithm therefore uses four operators to detect horizontal, vertical and diagonal edges in the image. The operators compute the gradient by image convolution: two templates, one for each axis, are convolved with the original image to obtain the difference maps G_x and G_y along the x and y axes, and the gradient magnitude G and direction θ at each point are then calculated as follows:
$$G = \sqrt{G_x^2 + G_y^2}$$
$$\theta = \arctan(G_y / G_x)$$
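As an illustration of the gradient step, the sketch below computes the magnitude and direction with NumPy. The two templates are not reproduced in the text, so the Sobel templates used here are an assumption, as are the helper names `filter3x3` and `gradient`. The code uses `arctan2` instead of `arctan` so that G_x = 0 does not cause a division by zero.

```python
import numpy as np

# Assumption: Sobel templates stand in for the two templates of the text.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)

def filter3x3(image, kernel):
    """Correlate a 2-D image with a 3x3 kernel, replicating border pixels."""
    padded = np.pad(image.astype(float), 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w))
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * padded[di:di + h, dj:dj + w]
    return out

def gradient(image):
    """Return gradient magnitude G and direction theta (radians)."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return np.hypot(gx, gy), np.arctan2(gy, gx)

# Toy image with a vertical step edge between columns 2 and 3.
step = np.zeros((5, 6))
step[:, 3:] = 1.0
magnitude, direction = gradient(step)
```

Correlation is used instead of a flipped-kernel convolution; for these templates this only changes the sign of G_x and G_y, not the magnitude.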
(3) Non-maximum suppression
After the gradient has been calculated in the previous step, the edges extracted from the gradient values are still blurred. An edge-thinning algorithm, non-maximum suppression, is therefore required. It compares the gradient intensity of the current pixel with that of the two pixels along the positive and negative gradient directions. If the gradient intensity of the current pixel is the largest of the three, the pixel is kept as an edge point; otherwise it is suppressed. This identifies the actual edges of the image more accurately.
(4) Dual threshold detection
Although non-maximum suppression detects actual edges more accurately, noise and color variation still produce spurious responses. To deal with them, edge pixels with weak gradient values are filtered out and edge pixels with high gradient values are preserved using two thresholds: a pixel whose gradient value is above the high threshold is marked as a strong edge pixel; a pixel whose gradient value lies between the low and high thresholds is marked as a weak edge pixel, to be examined in the next step; and a pixel whose gradient value is below the low threshold is suppressed.
(5) Suppressing isolated low threshold points
The strong edge pixels detected in the previous step are confirmed as edges. A weak edge pixel, however, may belong to an actual edge or may be an error caused by noise or color variation. To filter out noise while preserving actual edges, each weak edge pixel is examined together with its 8 neighboring pixels: if at least one neighbor is a strong edge pixel, the weak edge pixel is kept as a true edge; otherwise it is suppressed.
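The thresholding and the 8-neighborhood check of sub-steps (4) and (5) can be sketched as follows. This is a minimal single-pass sketch, assuming NumPy and a low and a high threshold; the function name `hysteresis` and the threshold values are hypothetical. Implementations that trace chains of connected weak pixels may keep more pixels than this one-pass version.

```python
import numpy as np

def hysteresis(magnitude, low, high):
    """Keep strong pixels and weak pixels that touch a strong pixel."""
    strong = magnitude >= high
    weak = (magnitude >= low) & ~strong
    # Pad with False so border pixels have a full 8-neighborhood.
    padded = np.pad(strong, 1, mode="constant", constant_values=False)
    h, w = magnitude.shape
    near_strong = np.zeros((h, w), dtype=bool)
    for di in range(3):
        for dj in range(3):
            if di == 1 and dj == 1:
                continue  # skip the pixel itself
            near_strong |= padded[di:di + h, dj:dj + w]
    return strong | (weak & near_strong)

# Toy gradient magnitudes: one strong pixel, one adjacent weak pixel,
# and one isolated weak pixel that should be suppressed.
toy_magnitude = np.array([[0.9, 0.5, 0.1],
                          [0.1, 0.1, 0.1],
                          [0.1, 0.1, 0.5]])
edges = hysteresis(toy_magnitude, low=0.3, high=0.8)
```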
Step 2: Hough circle detection
The Hough transform is a method of detecting a curve by exploiting the duality between the points on the curve and the parameters of the curve. It is widely used to detect certain analytical curves in grayscale images, in particular straight lines, circles and parabolas.
When an image contains a circle, the edge of the circle must belong to the edges of the image. In the x-y coordinate system, the general equation of a circle is:
$$(x-a)^2 + (y-b)^2 = r^2$$
Converting from the x-y coordinate system to the a-b coordinate system, the equation is written as $(a-x)^2 + (b-y)^2 = r^2$. A point on the circle boundary in the x-y coordinate system then corresponds to a circle in the a-b coordinate system. The circle boundary contains countless points, which correspond to countless circles in the a-b coordinate system. Because all the boundary points lie at the same distance r from the center (a, b), these circles all pass through one common point in the a-b coordinate system, and that intersection point is the center (a, b) of the circle. Counting the number of circles passing through each candidate intersection point and taking the local maxima gives the center coordinates; the radius of the intersecting circles then gives the value of r.
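The voting procedure described above can be illustrated with a small accumulator-based sketch. It assumes NumPy, a known range of candidate radii, and edge points given as integer (x, y) pixel coordinates; the function name `hough_circle` and the synthetic test circle are hypothetical, and a real pupil detector would take its edge points from the Canny step.

```python
import numpy as np

def hough_circle(edge_points, shape, radii, n_angles=360):
    """Return the (a, b, r) triple that collects the most votes."""
    height, width = shape
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    best_votes, best_circle = 0, None
    for r in radii:
        acc = np.zeros((height, width), dtype=int)
        for x, y in edge_points:
            # Candidate centers lie on a circle of radius r around (x, y).
            a = np.rint(x - r * np.cos(thetas)).astype(int)
            b = np.rint(y - r * np.sin(thetas)).astype(int)
            inside = (a >= 0) & (a < width) & (b >= 0) & (b < height)
            # Each edge point votes at most once per accumulator cell.
            cells = np.unique(np.stack([b[inside], a[inside]], axis=1), axis=0)
            acc[cells[:, 0], cells[:, 1]] += 1
        if acc.max() > best_votes:
            b0, a0 = np.unravel_index(acc.argmax(), acc.shape)
            best_votes = int(acc.max())
            best_circle = (int(a0), int(b0), int(r))
    return best_circle

# Synthetic boundary: center (20, 15), radius 8, in a 40 x 40 image.
angles = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
points = {(int(round(20 + 8 * np.cos(t))), int(round(15 + 8 * np.sin(t))))
          for t in angles}
circle = hough_circle(points, shape=(40, 40), radii=range(5, 12))
```

Because the boundary points are rounded to the pixel grid, the recovered center and radius should be expected to match the true values only to within about one pixel.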
The gaze point, the first target point of interest and the gaze time are computed from the change of the pupil center point and used as eye movement feature values; the pupil radius serves as a direct indicator of pupil constriction. The eye movement trajectory and pupil change described above are taken as the eye movement features.
A sliding window is run over the whole video sequence, and the changes in pupil radius and pupil center coordinates between each frame and the next are compared:
$$dx_i = |x_i - x_{i+1}|, \quad i = 0, 1, 2, \ldots$$
$$dy_i = |y_i - y_{i+1}|, \quad i = 0, 1, 2, \ldots$$
$$dr_i = |r_i - r_{i+1}|, \quad i = 0, 1, 2, \ldots$$
If $dr_i$ is smaller than the set threshold, the eye movement between the two frames is classified as a fixation. If the following $dr_{i+1}$ is still smaller than the threshold, the fixation is considered to continue, until some $dr_n$ exceeds the threshold, at which point the duration of the current fixation is recorded. Finally, the first fixation time $t_f$ and the average fixation time are calculated.
Averaging dx, dy and dr in turn finally yields a five-dimensional feature vector.
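The frame-to-frame differences and the fixation rule can be sketched in plain Python. Several points are assumptions made for illustration only: the five dimensions are taken to be the three averaged differences plus the first and average fixation times, a fixation's duration is taken as the number of consecutive below-threshold frame pairs times the frame interval, and all names and numeric values are hypothetical.

```python
def mean_abs_diff(values):
    """Average absolute difference between consecutive samples."""
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)

def fixation_durations(radii, threshold, frame_interval):
    """Durations of runs in which dr stays below the threshold."""
    durations = []
    run = 0  # consecutive frame pairs with dr below the threshold
    for r_prev, r_next in zip(radii, radii[1:]):
        if abs(r_prev - r_next) < threshold:
            run += 1
        elif run:
            durations.append(run * frame_interval)
            run = 0
    if run:
        durations.append(run * frame_interval)
    return durations

def eye_movement_features(xs, ys, radii, threshold, frame_interval):
    """Assumed layout: [mean dx, mean dy, mean dr, t_first, t_mean]."""
    durations = fixation_durations(radii, threshold, frame_interval)
    t_first = durations[0] if durations else 0.0
    t_mean = sum(durations) / len(durations) if durations else 0.0
    return [mean_abs_diff(xs), mean_abs_diff(ys), mean_abs_diff(radii),
            t_first, t_mean]

# Toy pupil track of 7 frames (made-up values, 0.5 s between frames).
features = eye_movement_features(
    xs=[0, 1, 1, 3, 3, 3, 3],
    ys=[0, 0, 3, 3, 3, 3, 3],
    radii=[5, 5, 5, 8, 8, 8, 8],
    threshold=0.5,
    frame_interval=0.5,
)
```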
Expression feature extraction extracts TOFS features from the acquired expression images. Based on optical flow field features and a sliding window, expression features are extracted over the whole sequence to obtain a 41-dimensional feature vector. The detailed steps are as follows:
Fig. 5 is a schematic flow chart of TOFS feature extraction. The expression feature extraction module uses a CNN to identify the face region and then computes 68 facial key points, 36 regions of interest and the TOFS features. The TOFS feature calculation is based on the MDMO (Main Directional Mean Optical-flow) feature combined with a sliding-window feature selection algorithm. Optical flow features give good expression recognition accuracy on short video sequences, but the video sequences in this experiment are long. To ensure the accuracy and robustness of the recognition algorithm, feature vectors are therefore extracted per unit time with a sliding window on top of this feature extraction, yielding the 41-dimensional TOFS feature.
The MDMO feature is based on optical flow, and the data set may consist of video segments or pictures. For an image sequence $(f_1, f_2, \ldots, f_m)$, the feature follows the Facial Action Coding System: 68 facial key points are used to divide the facial region of each frame into 36 regions of interest (ROIs), and the optical flow between frames is calculated. For each frame $f_i$, $i > 1$, the optical flow vectors in each ROI $k$, $k = 1, 2, \ldots, 36$, are divided into 8 direction bins; the bin containing the most optical flow vectors is selected, and the main direction of the optical flow is the average of all optical flow vectors in that bin. The optical flow vectors are represented in polar coordinates $(\rho_i, \theta_i)$, where $\rho_i$ and $\theta_i$ are the magnitude and direction of the optical flow. To eliminate the influence of the different numbers of frames in different video segments, we normalize the features to obtain the final feature:
where:
We represent the 72-dimensional feature as:
where α is an adjustable parameter; we set its value to 0.9 according to the experimental results reported in the paper on the optical flow feature.
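The main-direction step for a single ROI can be sketched as follows. This is an illustrative sketch only, assuming NumPy, flow vectors given as (u, v) components, and 8 equal bins of 45 degrees starting at angle 0; the bin layout and the function name `main_direction` are assumptions, and the normalization and the α weighting are not covered.

```python
import numpy as np

def main_direction(flow_vectors, n_bins=8):
    """Mean flow vector of the most populated direction bin, as (rho, theta)."""
    flow = np.asarray(flow_vectors, dtype=float)
    # Direction of every vector, mapped to [0, 2*pi).
    angles = np.arctan2(flow[:, 1], flow[:, 0]) % (2.0 * np.pi)
    bins = np.floor(angles / (2.0 * np.pi / n_bins)).astype(int) % n_bins
    counts = np.bincount(bins, minlength=n_bins)
    dominant = counts.argmax()
    # Average all vectors that fall into the dominant bin.
    mean_vec = flow[bins == dominant].mean(axis=0)
    rho = float(np.hypot(mean_vec[0], mean_vec[1]))
    theta = float(np.arctan2(mean_vec[1], mean_vec[0]))
    return rho, theta

# Toy flow field for one ROI: three vectors point roughly right,
# one points left and one points up.
roi_flow = [(1.0, 0.1), (2.0, 0.1), (3.0, 0.1), (-1.0, 0.0), (0.0, 1.0)]
rho, theta = main_direction(roi_flow)
```

Repeating this for all 36 ROIs gives one (ρ, θ) pair per region for each frame.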
Time-frequency domain statistical features of the sliding window: each video segment collected in the experiment lasts about 2 minutes and records the emotional changes of the subject during the test. We found experimentally that computing the optical flow feature over the entire video weakens the visible emotional change to some extent. Moreover, the facial expression of the subject does not change noticeably during part of the recording. We therefore propose a sliding window algorithm to locate the key video segments so that the optical flow features are extracted more effectively. For each picture sequence $(f_1, f_2, \ldots, f_m)$, let the number of frames in the sliding window be n, let the picture sequence subset obtained by the sliding window be γ, and denote the number of frames of γ by $frame_\gamma$. The sliding window can be described as:
where the values of i and n are related as follows:
The sliding window algorithm yields the optical flow change features of the subject in each short time period. To make better use of this information, the time-frequency domain statistical features of the sliding window are extracted: mean μ, variance s, standard deviation σ, skewness $\gamma_1$ and kurtosis $K_r$. For the sliding-window optical flow feature sequence of each subject, the time-frequency domain statistical features are as follows:
Skewness measures the direction and degree of asymmetry of a statistical data distribution. We use it to measure the symmetry of the optical flow features over the whole process:
$$\gamma_1 = \frac{\kappa_3}{\kappa_2^{3/2}} = \frac{E[(x-\mu)^3]}{\left(E[(x-\mu)^2]\right)^{3/2}}$$
where $\kappa_2$ and $\kappa_3$ denote the second- and third-order central moments of the feature values x, μ is their mean, and E is the averaging operation.
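Kurtosis is the normalized fourth-order central moment. The kurtosis of the sliding window data is extracted to measure the distribution of the optical flow:
$$K_r = \frac{E[(x-\mu)^4]}{\left(E[(x-\mu)^2]\right)^2}$$
where μ is the mean of the sliding-window feature values x and E denotes averaging.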
We combine these statistics with the optical flow features to construct the final 41-dimensional TOFS feature:
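The sliding-window statistics can be sketched as below. The exact window formula and the way the statistics are attached to the optical flow feature are not recoverable from the text, so this sketch makes explicit assumptions: each window is summarized by one scalar, population (not sample) moments are used, and the 41 dimensions are taken to be 36 optical flow values followed by the five statistics. All function names are hypothetical.

```python
import numpy as np

def sliding_windows(n_frames, window, step):
    """(start, end) frame indices of every full window."""
    return [(s, s + window) for s in range(0, n_frames - window + 1, step)]

def window_statistics(values):
    """Mean, variance, standard deviation, skewness and kurtosis."""
    x = np.asarray(values, dtype=float)
    mu = x.mean()
    var = x.var()  # second central moment (population variance)
    sigma = np.sqrt(var)
    k3 = ((x - mu) ** 3).mean()
    k4 = ((x - mu) ** 4).mean()
    skew = k3 / var ** 1.5 if var > 0 else 0.0
    kurt = k4 / var ** 2 if var > 0 else 0.0
    return np.array([mu, var, sigma, skew, kurt])

def tofs(flow_feature_36, window_values):
    """Assumed layout: 36 optical flow values + 5 window statistics."""
    flow = np.asarray(flow_feature_36, dtype=float)
    return np.concatenate([flow, window_statistics(window_values)])

windows = sliding_windows(n_frames=10, window=4, step=2)
stats = window_statistics([1, 2, 3, 4, 5])
feature = tofs(np.zeros(36), [1, 2, 3, 4, 5])
```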
The machine learning classification module comprises feature fusion and machine learning steps that build and train the classifier. Feature fusion takes the two sets of feature vectors obtained in the steps above, one for eye movement and one for expression, and combines them with a parallel feature fusion method: the two sets are merged into a single complex vector, forming a complex vector space. The dimension of this vector space is then reduced with principal component analysis (PCA), which includes the following steps:
(1) Arrange the original data by columns into a matrix X with n rows (features) and m columns (samples);
(2) Zero-center each row of X, that is, subtract the mean of that row;
(3) Compute the covariance matrix;
(4) Compute the eigenvalues of the covariance matrix and the corresponding eigenvectors;
(5) Arrange the eigenvectors as the rows of a matrix, from top to bottom in order of decreasing eigenvalue, and take the first k rows to form a matrix P; Y = PX is then the data reduced to k dimensions.
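Parallel fusion followed by the five PCA steps above can be sketched with NumPy. This is a hedged illustration rather than the patented implementation: the fusion is assumed to take the form eye + i·expression with zero-padding of the shorter vector, the covariance of the complex data is taken as the Hermitian matrix X Xᴴ / m, and the function names and random toy data are hypothetical.

```python
import numpy as np

def parallel_fusion(eye, expr):
    """Fuse two real feature matrices (samples x dims) into eye + 1j * expr."""
    eye = np.asarray(eye, dtype=float)
    expr = np.asarray(expr, dtype=float)
    d = max(eye.shape[1], expr.shape[1])
    # Zero-pad the shorter feature set so both have d dimensions.
    eye_p = np.pad(eye, ((0, 0), (0, d - eye.shape[1])))
    expr_p = np.pad(expr, ((0, 0), (0, d - expr.shape[1])))
    return eye_p + 1j * expr_p

def pca(X, k):
    """X has n rows (features) and m columns (samples). Returns (P, Y)."""
    Xc = X - X.mean(axis=1, keepdims=True)           # step (2)
    cov = Xc @ Xc.conj().T / X.shape[1]              # step (3)
    eigvals, eigvecs = np.linalg.eigh(cov)           # step (4), ascending
    order = np.argsort(eigvals)[::-1][:k]            # largest k eigenvalues
    P = eigvecs[:, order].conj().T                   # step (5): rows of P
    return P, P @ Xc

# Toy data: 20 samples, 5 eye movement and 41 expression dimensions.
rng = np.random.default_rng(0)
eye_feats = rng.normal(size=(20, 5))
expr_feats = rng.normal(size=(20, 41))
fused = parallel_fusion(eye_feats, expr_feats)
P, Y = pca(fused.T, k=3)
```

The reduced data Y is complex-valued here; how it is converted to real inputs for the classifier is not specified in the text and is left open.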
Training the classifier: the collected eye movement data and expression data are labeled according to whether the subject is a depressed patient. These data, together with the labels, are used as training data. A decision tree performs the classification calculation on the extracted effective feature matrix, and a classifier model is built and trained to obtain a classification model with high accuracy.
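The core operation of decision tree training, choosing the feature and threshold that best separate the two labels, can be illustrated with a small pure-Python sketch. The text does not state the split criterion, so Gini impurity is an assumption here; the function names and the four toy samples are made up for illustration and are not experimental data.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 = normal, 1 = depressed)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels):
    """Return (feature_index, threshold) with the lowest weighted Gini."""
    best_feature, best_threshold, best_score = None, None, float("inf")
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        # Candidate thresholds: midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_feature, best_threshold, best_score = j, t, score
    return best_feature, best_threshold

# Toy feature matrix (two fused features per subject) and labels.
toy_rows = [[0.1, 5.0], [0.2, 1.0], [0.8, 4.0], [0.9, 2.0]]
toy_labels = [0, 0, 1, 1]
split = best_split(toy_rows, toy_labels)
```

A full tree would apply this split search recursively to the left and right subsets until a stopping rule is met.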
The automatic evaluation module applies the classifier model obtained by the machine learning classification module. For a subject without a manual diagnosis, eye movement data and expression data are acquired, and feature extraction and feature fusion are carried out in the same way as described above to compute the effective features. These features are input into the trained classifier, which evaluates the degree of depression and outputs the depression classification of the subject. The classification result is either normal or depressed.

Claims (7)

1. The depression evaluation system based on the eye movement and the facial expression is characterized by comprising an emotion stimulation module, an expression acquisition module, an eye movement feature extraction module, an expression feature extraction module, a machine learning classification module and an automatic evaluation module; the expression acquisition module is used for acquiring expression information when a tested person views different emotion stimulation pictures output by the emotion stimulation module in a quiet environment;
the eye movement acquisition module is used for acquiring eye movement information when a tested person views different emotion stimulation pictures output by the emotion stimulation module in a quiet environment;
the eye movement feature extraction module extracts eye movement features from the obtained eye movement image information;
the expression feature extraction module is configured to obtain the face region and the time-frequency domain statistical features of the tested person, extract 66 feature points and 36 regions of interest (ROIs) of the face from the face region, extract 36-dimensional optical flow features of the optical flow field from the 66 feature points and 36 ROIs of the face, and then combine the mean, kurtosis, variance, standard deviation and skewness of the time-frequency domain statistical features with the 36-dimensional optical flow features to obtain the TOFS features;
the time-frequency domain statistical features are a plurality of feature vectors per unit time, obtained by performing feature extraction on video segments; the video segments are obtained by cutting the acquired data set of the tested person with a sliding window;
the machine learning classification module is configured to perform feature fusion on the eye movement features and the TOFS features, and perform machine learning classification on effective features obtained after feature fusion;
the automatic evaluation module evaluates the depression degree of the tested person according to the machine learning classification result.
2. The eye movement and facial expression based depression assessment system according to claim 1, wherein the expression acquisition module comprises an expression acquisition camera arranged at a suitable position in front of the subject to capture the complete face area of the subject; the camera is a Logitech C1000e with a resolution of 4096 x 2160 and a sampling rate of 60 fps.
3. The eye-movement and facial-expression-based depression assessment system according to claim 1, wherein the emotion-stimulating module comprises picture material capable of giving positive, neutral and negative different emotional stimuli to the subject, respectively, and audio material of the same emotion as the picture stimulus.
4. The depression evaluation system based on eye movement and facial expression according to any one of claims 1 to 3, wherein the eye movement feature extraction module extracts eye movement features from the obtained pupil image, which comprises extracting the pupil radius and pupil center coordinate information in the image using a Canny edge detection operator and a Hough circle detection algorithm, and from these calculating the eye movement trajectory and the change in pupil size; the eye movement data are denoised with Gaussian filtering, the Canny edge detection operator and Hough circle detection are then used to detect the center and size of the pupil, and the gaze area and gaze time of the tested person are calculated at the same time.
5. The eye movement and facial expression based depression assessment system of claim 4, wherein the machine learning classification module comprises the step of feature fusion: and carrying out multi-mode parallel feature fusion on the two groups of feature vectors of the extracted eye movement features and expression features, combining the two groups of feature vectors into a complex vector space through complex vectors, and then carrying out dimension reduction on the vector space through Principal Component Analysis (PCA).
6. The eye movement and facial expression based depression assessment system of claim 5, wherein the machine learning classification module comprises the step of training a classifier: labeling the collected eye movement data and expression data according to whether the tested person is a depressed patient, then using the eye movement data and the expression data together with these labels as training data, performing the classification calculation with a decision tree, and building and training a classifier model.
7. The eye movement and facial expression based depression assessment system of claim 6, wherein the automatic assessment module comprises the step of automatically assessing: collecting eye movement data and expression data of a person to be tested with unknown depression, extracting features and fusing the features, inputting the collected eye movement data and expression data into a trained classifier model, automatically evaluating whether the person to be tested has depression tendency or not by the classifier model according to the input features, and outputting depression degree classification, wherein the classification result is as follows: normal or depressed.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010692613.1A CN111933275B (en) 2020-07-17 2020-07-17 A Depression Assessment System Based on Eye Movement and Facial Expression

Publications (2)

Publication Number Publication Date
CN111933275A CN111933275A (en) 2020-11-13
CN111933275B true CN111933275B (en) 2023-07-28

Family

ID=73313299

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010692613.1A Active CN111933275B (en) 2020-07-17 2020-07-17 A Depression Assessment System Based on Eye Movement and Facial Expression

Country Status (1)

Country Link
CN (1) CN111933275B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109157231A (en) * 2018-10-24 2019-01-08 阿呆科技(北京)有限公司 Portable multi-channel Depression trend assessment system based on emotional distress task

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107212851B (en) * 2017-07-28 2024-05-31 温州市人民医院 Wireless eye movement instrument
CN107609516B (en) * 2017-09-13 2019-10-08 重庆爱威视科技有限公司 Adaptive eye movement method for tracing
CN107773248B (en) * 2017-09-30 2024-10-29 优视眼动科技(北京)有限公司 Eye tracker and image processing method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A Main Directional Mean Optical Flow Feature for Spontaneous Micro-Expression Recognition;Yong-Jin Liu等;《IEEE Transactions on Affective Computing》;20151001;第7卷(第4期);摘要,第3.1节 *
基于脑电及卷积神经网络的抑郁症实时监测方法研究;赵盛杰;《中国优秀硕士学位论文全文数据库信息科技辑》(第12期);I140-178 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant