CN120853236B - A method and system for eye movement correction based on codec and feature decoupling - Google Patents
A method and system for eye movement correction based on codec and feature decoupling
- Publication number
- CN120853236B (application CN202510957643.3A)
- Authority
- CN
- China
- Prior art keywords
- eye
- image
- rotation
- tar
- posture information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/165—Detection; Localisation; Normalisation using facial parts and geometric relationships
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/60—Image enhancement or restoration using machine learning, e.g. neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/77—Retouching; Inpainting; Scratch removal
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/18—Eye characteristics, e.g. of the iris
- G06V40/193—Preprocessing; Feature extraction
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Human Computer Interaction (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Databases & Information Systems (AREA)
- Ophthalmology & Optometry (AREA)
- Medical Informatics (AREA)
- Geometry (AREA)
- Image Analysis (AREA)
Abstract
The invention provides an eye correction (gaze correction) method and system based on a codec and feature decoupling. The method comprises: step a, acquiring an original face image I, and obtaining an eye image I c and head posture information H gt of a user from the original face image I; step b, extracting features from I c and H gt to obtain a vector attribute code z i representing static attribute features of the user, current eye posture information G pre representing the current eye posture, and a rotation attribute code z r representing the rotation attribute of the eye; step c, performing a three-dimensional transformation on z r according to G pre and preset target eye posture information G tar to obtain a rotation attribute code z pre corresponding to G tar; step d, generating a target eye image I pre from z i and z pre; and step e, pasting I pre back into the original face image I and outputting the eye-corrected face image. The method addresses the image distortion caused by confusion between attribute features and posture features in traditional methods, and improves the accuracy and naturalness of eye correction.
Description
Technical Field
The invention relates to the technical field of computer vision and artificial intelligence, and in particular to an eye correction method and system based on a codec and feature decoupling.
Background
Eye correction (gaze correction), a technique applied to facial feature processing, aims to align the user's gaze with a target line of sight by adjusting the direction or position of the eyeballs. The technology has wide application value in fields such as video conferencing, virtual reality, and augmented reality.
At present, technical schemes for eye correction fall into two main categories: hardware methods and software methods.
Traditional eye correction often relies on specialized hardware devices such as eye trackers or infrared cameras. These devices track the user's eyes in real time by capturing eye movement trajectories and positions, and change the apparent gaze direction by optical correction or hardware adjustment. However, such methods tend to have high equipment costs and often require the user to wear additional devices or fixtures, resulting in a poor user experience. Their applicability is also limited, making them difficult to deploy in mass-market settings such as video conferencing.
Software methods mainly use traditional image processing algorithms: they analyze facial feature points (such as the eyes, nose, and mouth) in the image and apply geometric transformation or pixel replacement to the eye region to achieve eye correction. This approach is flexible and fast and requires no special hardware. However, such conventional algorithms are typically based on static rules, have limited processing power, lack adaptive optimization, and struggle to handle diverse user demands.
With the development of deep learning models such as convolutional neural networks (CNNs) and generative adversarial networks (GANs), neural-network-based eye correction has gradually become a research hotspot. However, existing neural-network-based eye correction techniques still suffer from insufficient precision, poor naturalness, and low computational efficiency.
Disclosure of Invention
In view of the above shortcomings of the prior art, the present invention provides an eye correction method and system based on a codec and feature decoupling.
One aspect of the present invention provides an eye correction method based on codec and feature decoupling, comprising the steps of:
step a, acquiring an original face image I through image acquisition equipment, and acquiring an eye image I c and head posture information H gt of a user from the original face image;
Step b, extracting features of the eye image I c and the head posture information H gt to obtain a vector attribute code z i representing static attribute features of a user, current eye posture information G pre representing current eye posture and a rotation attribute code z r representing rotation attribute of the eye;
Step c, performing three-dimensional transformation on the rotation attribute code z r according to the current eye posture information G pre and preset target eye posture information G tar to obtain a rotation attribute code z pre corresponding to the target eye posture information G tar;
Step d, generating a target eye image I pre according to the vector attribute code z i and the rotation attribute code z pre; and
Step e, pasting the target eye image I pre back to the original face image I, and outputting the face image after eye correction.
Another aspect of the present invention provides a codec-and feature-decoupling-based eye correction system, comprising:
The data input module is used for acquiring an original face image I through the image acquisition equipment and acquiring an eye image I c and head posture information H gt of a user from the original face image;
The feature extraction module is used for extracting features of the eye image I c and the head posture information H gt to obtain a vector attribute code z i representing static attribute features of a user, current eye posture information G pre representing current eye posture and a rotation attribute code z r representing rotation attributes of eyes;
The feature transformation module is used for carrying out three-dimensional transformation on the rotation attribute code z r according to the current eye posture information G pre and preset target eye posture information G tar to obtain a rotation attribute code z pre corresponding to the target eye posture information G tar;
A decoding module that generates a target eye image I pre from the vector attribute code z i and the rotational attribute code z pre;
The image generation module is used for pasting the target eye image I pre back to the original face image I and outputting the face image after the eye correction.
The invention has the following beneficial effects:
1. Through feature-coding decoupling, the invention ensures that the generated image adjusts the gaze accurately and naturally while preserving the user's personalized characteristics. In addition, a generative adversarial network is used to optimize the generated texture details, so that the corrected image looks natural and lifelike.
2. The method offers strong real-time performance and suits low-resource devices. By applying re-parameterization to the trained multi-layer convolutional neural network, the computational cost at test time (during eye correction) is significantly reduced, so real-time correction can run on mobile devices. The method is therefore suitable for scenarios with high interactivity and real-time requirements, such as video conferencing and virtual reality.
Drawings
Fig. 1 is a flow chart of a method of eye correction based on codec and feature decoupling in accordance with a preferred embodiment of the present invention.
Fig. 2 is a block diagram of a codec and feature decoupling based eye correction system in accordance with a preferred embodiment of the present invention.
Fig. 3 is a schematic diagram of the re-parameterized multi-layer convolutional neural network used by the encoding module and the decoding module according to a preferred embodiment of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are intended only to aid understanding of the invention and not to limit its scope.
As shown in FIG. 1, the method according to a preferred embodiment of the present invention comprises the following steps a-e.
First, in step a, an original face image I containing face information is acquired through an image acquisition device such as a camera, and an eye image I c and head posture information H gt of the user are obtained from the original face image.
In a preferred embodiment, acquiring the eye image of the user in step a further comprises the following steps:
Step a1, identifying the corresponding 106 face key points in the original face image I;
Step a2, locating the eye region from the 106 face key points, wherein the left and right edges of the eye region are preferably kept 16 pixels from the left and right eye corners, and the lower edge 16 pixels from the lower orbit;
Step a3, cropping the original face image according to the eye region to generate the eye image I c. Preferably, the eye image measures 96 × 64 pixels.
In step a, the head posture information H gt includes a pitch angle (the up-down rotation angle) and a yaw angle (the left-right rotation angle).
Then, in step b, features are extracted from the eye image I c and the head posture information H gt to obtain a vector attribute code z i representing the static attribute features of the user, current eye posture information G pre representing the current eye posture, and a rotation attribute code z r representing the rotation attribute of the eye.
Preferably, in step b, the feature extraction module performs feature extraction using a re-parameterized convolutional neural network composed of a plurality of cascaded convolution layers and activation functions. Preferably, the static attribute features are personalized features of the user, including gender, age, skin tone, and the like. Preferably, the vector attribute code z i is a 256-dimensional vector. The current eye posture information before correction is G pre = (P pre, Y pre), where P pre denotes the current pitch angle and Y pre denotes the current yaw angle. Preferably, the rotation attribute code z r is 48-dimensional.
Next, in step c, a three-dimensional transformation is performed on the rotation attribute code z r according to the current eye posture information G pre and preset target eye posture information G tar to obtain a rotation attribute code z pre corresponding to the target eye posture information G tar.
Preferably, the target eye posture information G tar = (P tar, Y tar) is preset before this step, where P tar denotes the target pitch angle and Y tar denotes the target yaw angle, both expressed in radians. Preferably, in a video call scenario, the target posture may be set to the user looking straight ahead, that is, both the pitch angle and the yaw angle are 0, giving a target posture of (0, 0). In other scenarios, the target posture may be specified manually by the user, for example looking 15° upward and 10° to the left, or looking 15° downward and 10° to the right.
Preferably, step c further comprises the following steps c1-c3. Before steps c1-c3, z r is reshaped to 3 × 16, that is, it is treated as 16 three-dimensional features.
Step c1, acquiring a three-dimensional rotation matrix R pre corresponding to the current eye posture information G pre:
step c2, obtaining a three-dimensional rotation matrix R tar corresponding to the target eye posture information G tar:
wherein both the R pre matrix and the R tar matrix comprise a left-right rotation and an up-down rotation.
Step c3, inverting the rotation R pre and then applying the rotation matrix R tar to obtain the rotation attribute code corresponding to the target eye posture, $z_{pre} = R_{tar} R_{pre}^{-1} z_r$. Finally, the 3 × 16 code z pre is reshaped back to 48 dimensions, consistent with z r. This process completes the natural adjustment of the eye posture from the current direction G pre to the target direction G tar in latent space.
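Steps c1-c3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the rotation-matrix formulas are not reproduced in this text, so the convention used here (a yaw rotation about the y-axis composed with a pitch rotation about the x-axis) is an assumption, and the helper names `matmul`, `transpose`, `rotation_matrix`, and `redirect_code` are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def rotation_matrix(pitch, yaw):
    """3x3 rotation built from pitch (up-down) and yaw (left-right), in radians.

    Assumed convention: R = R_yaw (about y) * R_pitch (about x).
    """
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    r_pitch = [[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]]
    r_yaw = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    return matmul(r_yaw, r_pitch)


def redirect_code(z_r, g_pre, g_tar):
    """Rotate a 48-dimensional code from posture g_pre to posture g_tar.

    z_r is a flat list of 48 floats; g_pre and g_tar are (pitch, yaw) tuples.
    """
    if len(z_r) != 48:
        raise ValueError("expected a 48-dimensional rotation attribute code")
    # Reshape to 3 x 16: each column is one three-dimensional feature.
    features = [z_r[i * 16:(i + 1) * 16] for i in range(3)]
    r_pre = rotation_matrix(*g_pre)
    r_tar = rotation_matrix(*g_tar)
    # A rotation matrix is orthogonal, so its inverse is its transpose.
    transform = matmul(r_tar, transpose(r_pre))
    rotated = matmul(transform, features)
    # Flatten the 3 x 16 result back to 48 dimensions.
    return [value for row in rotated for value in row]
```

With identical current and target postures the combined transform is the identity, so the code is returned unchanged; with a target of (0, 0) only the inverse of the current rotation is applied.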
Next, in step d, a target eye image I pre is generated from the vector attribute code z i and the rotation attribute code z pre.
In step d, a decoding module formed by the multi-layer convolutional neural network performs feature extraction on the vector attribute code z i and the rotation attribute code z pre to generate the target eye image I pre.
Preferably, in step d, the decoding module performs feature extraction using a re-parameterized convolutional neural network composed of a bilinear upsampling layer, a plurality of cascaded convolution layers, and activation functions. Bilinear upsampling preserves detail and smooths edges in the generated image, and effectively avoids the artifacts caused by deconvolution.
Finally, in step e, the target eye image I pre is pasted back into the original face image I, and the eye-corrected face image is output. The system pastes the generated target eye image I pre back into the original face image I according to the face key points, so that it blends seamlessly with the original face and no visible discontinuity appears in the overall appearance. The corrected full face image is displayed on the screen in real time, so the user can directly check the adjusted gaze. In video communication or other scenarios, the adjusted image can be used directly for real-time transmission.
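The paste-back in step e can be illustrated with a short Python sketch. It assumes that the crop position (`top`, `left`) has already been derived from the face key points and that the eye patch is written over the original pixels without blending. Both are simplifying assumptions, and the function name `paste_back` is hypothetical.

```python
def paste_back(face, eye_patch, top, left):
    """Return a copy of `face` with `eye_patch` written at row `top`, column `left`.

    Both images are lists of rows of pixel values. The original face image
    is left unmodified.
    """
    height, width = len(eye_patch), len(eye_patch[0])
    if top < 0 or left < 0 or top + height > len(face) or left + width > len(face[0]):
        raise ValueError("eye patch does not fit inside the face image")
    result = [list(row) for row in face]
    for r in range(height):
        # Overwrite one row of the eye region with the generated pixels.
        result[top + r][left:left + width] = eye_patch[r]
    return result
```

A production system would likely feather or otherwise blend the patch border to hide seams; that step is outside this sketch.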
As described above, the invention uses the feature extraction module to encode the user's personalized attributes and eye posture features independently, generating the attribute code z i and the rotation attribute code z r respectively and thereby achieving feature decoupling. Specifically, the attribute code z i characterizes the user's personalized features, that is, static information such as gender, age, and skin color, while the rotation attribute code z r describes the dynamic rotation features of the user's eyes. During eye correction, only the rotation attribute code z r is adjusted, and the natural rotation of the eyeball is simulated through a three-dimensional posture transformation. The attribute code z i is kept unchanged, so the generated image adjusts the gaze accurately and naturally while preserving the personalized characteristics.
One embodiment of the present invention also provides a codec and feature decoupling based eye correction system 20 comprising:
The data input module 21 is used for acquiring an original face image I through image acquisition equipment and acquiring an eye image I c and head posture information H gt of a user from the original face image;
The feature extraction module 22 performs feature extraction on the eye image I c and the head pose information H gt to obtain a vector attribute code z i representing static attribute features of the user, current eye pose information G pre representing current eye pose, and a rotation attribute code z r representing rotation attributes of the eye;
The feature transformation module 23 performs three-dimensional transformation on the rotation attribute code z r according to the current eye posture information G pre and preset target eye posture information G tar to obtain a rotation attribute code z pre corresponding to the target eye posture information G tar;
A decoding module 24 that generates a target eye image I pre from the vector attribute code z i and the rotation attribute code z pre;
The image generating module 25 pastes the target eye image I pre back to the original face image I, and outputs the face image after eye correction.
The data input module 21 further includes a facial point recognition module 211 and a head pose prediction module 212.
The facial point recognition module 211 performs the following steps:
Step a1, identifying the corresponding 106 face key points in the original face image I;
Step a2, locating the eye region from the 106 face key points, wherein the left and right edges of the eye region are preferably kept 16 pixels from the left and right eye corners, and the lower edge 16 pixels from the lower orbit;
Step a3, cropping the original face image according to the eye region to generate the eye image I c.
The head pose prediction module 212 is configured to predict the head posture H gt of the face in the image I, including the pitch angle (up-down rotation) and the yaw angle (left-right rotation).
Preferably, the multi-layer convolutional neural networks of the feature extraction module (encoding module) 22 and the decoding module 24 adopt a simplified and optimized network design, addressing the prior art's inability to run in real time under limited resources.
In the preferred embodiment, the feature extraction module (encoding module) 22 and the decoding module 24 apply re-parameterization to the trained multi-layer convolutional neural network, thereby reducing computation and memory reads and writes at test time. Specifically, the structure of a convolution module is shown in Fig. 3. During training there are three branches: the first is a 3×3 convolution layer followed by a batch normalization (BN) layer, the second is a 1×1 convolution layer followed by a BN layer, and the third is a residual connection (present only when the number of input channels equals the number of output channels). In the training phase the three branches are trained together, effectively modeling both the residual structure and a spatially independent feature transformation. In the test phase (the eye correction processing phase), the 1×1 convolution can be regarded as a 3×3 kernel whose surrounding entries are 0, and the residual connection can be regarded as a 3×3 kernel whose center is 1 and whose surrounding entries are 0. The parameters of each batch normalization layer are then folded into the corresponding convolution kernel, and finally the parameters of the three kernels are added. In this way, two convolutions, two batch normalization layers, and one residual connection are merged into a single 3×3 convolution layer, greatly reducing the number of layers, the amount of computation, and memory reads and writes.
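The branch merging described above can be sketched in plain Python. This is an illustrative sketch, not the patent's code: it assumes bias-free convolutions, weights stored as nested lists indexed `[out][in][row][col]`, batch normalization parameters held in a dictionary with keys `gamma`, `beta`, `mean`, `var`, and an optional `eps`, and equal input and output channel counts so that the residual branch exists. All function names are hypothetical.

```python
import math


def fuse_bn(weight, bn):
    """Fold a batch normalization layer into the preceding bias-free convolution.

    Returns (fused_weight, fused_bias) such that conv(x, fused_weight) + fused_bias
    equals BN(conv(x, weight)) when BN uses fixed (inference) statistics.
    """
    eps = bn.get("eps", 1e-5)
    fused_weight, fused_bias = [], []
    for o, kernels in enumerate(weight):
        scale = bn["gamma"][o] / math.sqrt(bn["var"][o] + eps)
        fused_weight.append([[[v * scale for v in row] for row in k] for k in kernels])
        fused_bias.append(bn["beta"][o] - bn["mean"][o] * scale)
    return fused_weight, fused_bias


def pad_1x1_to_3x3(weight):
    """Place each 1x1 kernel at the center of an otherwise zero 3x3 kernel."""
    return [
        [[[0.0, 0.0, 0.0], [0.0, k[0][0], 0.0], [0.0, 0.0, 0.0]] for k in kernels]
        for kernels in weight
    ]


def identity_3x3(channels):
    """Express the residual connection as 3x3 kernels: center 1 where in == out."""
    return [
        [[[0.0, 0.0, 0.0], [0.0, 1.0 if i == o else 0.0, 0.0], [0.0, 0.0, 0.0]]
         for i in range(channels)]
        for o in range(channels)
    ]


def merge_branches(w3, bn3, w1, bn1):
    """Collapse the 3x3+BN, 1x1+BN, and residual branches into one 3x3 convolution."""
    channels = len(w3)
    k3, b3 = fuse_bn(w3, bn3)
    k1, b1 = fuse_bn(w1, bn1)
    k1 = pad_1x1_to_3x3(k1)
    kid = identity_3x3(channels)
    weight = [
        [[[k3[o][i][r][c] + k1[o][i][r][c] + kid[o][i][r][c] for c in range(3)]
          for r in range(3)]
         for i in range(channels)]
        for o in range(channels)
    ]
    bias = [b3[o] + b1[o] for o in range(channels)]
    return weight, bias
```

Because convolution and fixed-statistics batch normalization are both linear in the input, the merged 3×3 convolution with zero padding of 1 produces the same output as the sum of the three training-time branches.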
In the preferred embodiment, all convolution layers use 3×3 kernels, because 3×3 convolutions are considerably more computationally efficient than other kernel sizes while still taking neighborhood information into account.
In the preferred embodiment, all convolution modules have a single-path architecture without any side branches at test time. Although operations such as residual and cross-layer connections require little computation, they occupy a large amount of extra memory and thereby reduce operating efficiency. Therefore, the invention contains no residual or cross-layer connection structures in actual use.
In a preferred embodiment, the eye correction system 20 further includes a supervisory optimization module 26 that optimizes parameters of the various modules in the system by way of data-driven training. The module 26 uses five loss functions to improve the quality of the generated image and the accuracy of the pose adjustment.
(1) Reconstruction loss: measures the pixel-level difference between the generated image and the real image by computing the pixel error between the predicted image I pre and the target image I gt during training.
(2) Perceptual loss: computes the difference in perceptual features between the generated image I pre and the target image I gt during training, using a pre-trained VGG network.
(3) Adversarial loss: based on feedback from the discriminator of a generative adversarial network (GAN). The discriminator is trained to classify the predicted image I pre as fake and the target image I gt as real, while the generator tries to fool the discriminator, thereby enhancing the realism of the generated image I pre.
(4) Posture loss: measures the angular error between the currently predicted eye posture and the target posture by computing the squared angular distance between the posture information G pre and the target posture G tar during training.
(5) Attribute similarity loss: computes the cosine similarity between the attribute code z i of the input image and the attribute code z t of a target image of the same person, ensuring that different images of the same person have similar attribute codes.
The supervisory optimization module 26 optimizes the five loss functions jointly, so that the corrected image generated by the system is both visually realistic and accurate in posture adjustment.
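Two of the losses above are simple enough to sketch directly in Python. The exact formulas are not given in this text, so the following are assumptions: the posture loss is taken as the sum of squared pitch and yaw differences, the attribute term is taken as one minus the cosine similarity, and the loss weights are hypothetical values supplied by the caller. The function names are also hypothetical.

```python
import math


def posture_loss(g_pred, g_target):
    """Squared angular distance between two (pitch, yaw) pairs, in radians."""
    return sum((p - t) ** 2 for p, t in zip(g_pred, g_target))


def attribute_similarity_loss(z_i, z_t):
    """One minus the cosine similarity of two attribute codes.

    The loss is 0 when the codes point in the same direction, which is the
    desired outcome for two images of the same person.
    """
    dot = sum(a * b for a, b in zip(z_i, z_t))
    norm = math.sqrt(sum(a * a for a in z_i)) * math.sqrt(sum(b * b for b in z_t))
    return 1.0 - dot / max(norm, 1e-12)


def combined_loss(terms, weights):
    """Weighted sum of named loss terms; missing weights default to 1.0."""
    return sum(weights.get(name, 1.0) * value for name, value in terms.items())
```

The reconstruction, perceptual, and adversarial terms depend on an image model and a discriminator, so in this sketch they would be computed elsewhere and passed to `combined_loss` as plain numbers.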
It will be apparent to those skilled in the art that the above embodiments are provided for illustration only and not for limitation of the invention, and that variations and modifications of the above described embodiments are intended to fall within the scope of the claims of the invention as long as they fall within the true spirit of the invention.
Claims (9)
1. An eye correction method based on the decoupling of a codec and a feature is characterized by comprising the following steps:
step a, acquiring an original face image I through image acquisition equipment, and acquiring an eye image I c and head posture information H gt of a user from the original face image;
Step b, extracting features of the eye image I c and the head posture information H gt to obtain a vector attribute code z i representing static attribute features of a user, current eye posture information G pre representing current eye posture and a rotation attribute code z r representing rotation attribute of the eye;
Step c, performing three-dimensional transformation on the rotation attribute code z r according to the current eye posture information G pre and preset target eye posture information G tar to obtain a rotation attribute code z pre corresponding to the target eye posture information G tar;
Step d, extracting the characteristics of the vector attribute code z i and the rotation attribute code z pre to generate a target eye image I pre, and
Step e, pasting the target eye image I pre back to the original face image I, outputting the face image after eye correction,
Wherein step c further comprises the steps of:
Step c1, acquiring a three-dimensional rotation matrix R pre corresponding to the current eye posture information G pre;
Step c2, acquiring a three-dimensional rotation matrix R tar corresponding to the target eye posture information G tar;
Step c3, performing a rotation inversion operation on R pre, and performing a transformation operation through the R tar rotation matrix, to obtain the rotation attribute code z pre corresponding to the target eye posture.
2. The eye correction method based on codec and feature decoupling according to claim 1, wherein acquiring the eye image of the user in step a further comprises the steps of:
Step a1, identifying a plurality of corresponding face key points in the original face image I;
Step a2, locating an eye region from the plurality of face key points;
Step a3, cropping the original face image according to the eye region to generate the eye image I c.
3. The eye correction method based on codec and feature decoupling according to claim 1, wherein in step a, the head posture information H gt includes a pitch angle, i.e., the up-down rotation angle, and a yaw angle, i.e., the left-right rotation angle.
4. The eye correction method based on codec and feature decoupling according to claim 1, wherein in step b, the feature extraction operation is performed using a re-parameterized convolutional neural network composed of a plurality of cascaded convolution layers and activation functions.
5. The eye correction method based on codec and feature decoupling according to claim 1, wherein in steps c1 and c2, the current eye posture information G pre = (P pre, Y pre), wherein P pre represents the current pitch angle and Y pre represents the current yaw angle, and the target eye posture information G tar = (P tar, Y tar), wherein P tar represents the target pitch angle and Y tar represents the target yaw angle,
In the above formula, the R pre matrix includes a left-right rotation and an up-down rotation, and the R tar matrix includes a left-right rotation and an up-down rotation.
6. The eye correction method based on codec and feature decoupling according to claim 1, wherein in step d, the feature extraction operation is performed using a re-parameterized convolutional neural network composed of a bilinear upsampling layer, a plurality of cascaded convolution layers, and activation functions.
7. A codec-and feature-decoupling-based eye correction system using the method of any one of claims 1-6, comprising:
The data input module is used for acquiring an original face image I through the image acquisition equipment and acquiring an eye image I c and head posture information H gt of a user from the original face image;
The feature extraction module is used for extracting features of the eye image I c and the head posture information H gt to obtain a vector attribute code z i representing static attribute features of a user, current eye posture information G pre representing current eye posture and a rotation attribute code z r representing rotation attributes of eyes;
The feature transformation module is used for carrying out three-dimensional transformation on the rotation attribute code z r according to the current eye posture information G pre and preset target eye posture information G tar to obtain a rotation attribute code z pre corresponding to the target eye posture information G tar;
A decoding module that generates a target eye image I pre from the vector attribute code z i and the rotational attribute code z pre;
The image generation module is used for pasting the target eye image I pre back to the original face image I and outputting the face image after the eye correction.
8. The system of claim 7, wherein the feature extraction module and the decoding module apply re-parameterization to the trained multi-layer convolutional neural network used for eye correction.
9. The system of claim 7, further comprising a supervisory optimization module for optimizing the parameters of the convolutional neural networks of the feature extraction module and the decoding module through data-driven training, the supervisory optimization module using a plurality of loss functions.
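Claim 9 does not name the individual loss functions in this text. Purely as an illustration of how several losses are usually combined into one training objective, the sketch below forms a weighted sum of two hypothetical terms: a pixel-wise L1 loss on the generated eye image and an absolute-error loss on the (pitch, yaw) pair. The choice of terms, the weights and all function names are assumptions.

```python
def l1_loss(pred, target):
    """Mean absolute pixel difference between two equally sized images."""
    diffs = [abs(p - t)
             for pred_row, target_row in zip(pred, target)
             for p, t in zip(pred_row, target_row)]
    return sum(diffs) / len(diffs)

def pose_loss(g_est, g_ref):
    """Mean absolute error between two (pitch, yaw) pairs."""
    return sum(abs(a - b) for a, b in zip(g_est, g_ref)) / len(g_ref)

def total_loss(terms, weights):
    """Weighted sum of named loss terms; both dicts use the same keys."""
    return sum(weights[name] * value for name, value in terms.items())
```

For example, `total_loss({"image": l1_loss(pred, target), "pose": pose_loss(g_est, g_ref)}, {"image": 1.0, "pose": 0.5})` yields a single scalar that an optimizer could minimize.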
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202510957643.3A CN120853236B (en) | 2025-07-11 | 2025-07-11 | A method and system for eye movement correction based on codec and feature decoupling |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN120853236A (en) | 2025-10-28 |
| CN120853236B (en) | 2026-02-06 |
Family
ID=97407919
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202510957643.3A (Active), CN120853236B (en) | 2025-07-11 | 2025-07-11 | A method and system for eye movement correction based on codec and feature decoupling |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN120853236B (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112733795A (en) * | 2021-01-22 | 2021-04-30 | 腾讯科技(深圳)有限公司 | Method, device and equipment for correcting sight of face image and storage medium |
| CN118247830A (en) * | 2022-12-22 | 2024-06-25 | 杭州海康威视数字技术股份有限公司 | Method, device, electronic device and storage medium for generating sight line image samples |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3362946B1 (en) * | 2015-10-16 | 2020-08-26 | Magic Leap, Inc. | Eye pose identification using eye features |
| IL284572B2 (en) * | 2019-01-03 | 2024-12-01 | Immersix Ltd | Eye tracking system and method |
| US11694419B2 (en) * | 2021-09-06 | 2023-07-04 | Kickback Space Inc. | Image analysis and gaze redirection using characteristics of the eye |
| CN117315012B (en) * | 2022-06-20 | 2025-11-11 | 广州视源电子科技股份有限公司 | Self-calibration method, device, equipment and medium for eye line-of-sight deviation parameters |
| CN117115321B (en) * | 2023-10-23 | 2024-02-06 | 腾讯科技(深圳)有限公司 | Method, device, equipment and storage medium for adjusting eye posture of virtual character |
Also Published As
| Publication number | Publication date |
|---|---|
| CN120853236A (en) | 2025-10-28 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12131436B2 (en) | Target image generation method and apparatus, server, and storage medium | |
| US20240212252A1 (en) | Method and apparatus for training video generation model, storage medium, and computer device | |
| US12197640B2 (en) | Image gaze correction method, apparatus, electronic device, computer-readable storage medium, and computer program product | |
| CN112733795B (en) | Method, device and equipment for correcting sight of face image and storage medium | |
| JP7542740B2 (en) | Image line of sight correction method, device, electronic device, and computer program | |
| US20220358675A1 (en) | Method for training model, method for processing video, device and storage medium | |
| US11734889B2 (en) | Method of gaze estimation with 3D face reconstructing | |
| CN116385667B (en) | Reconstruction method of three-dimensional model, training method and device of texture reconstruction model | |
| CN114187165A (en) | Image processing method and device | |
| CN114998514B (en) | Method and device for generating virtual characters | |
| WO2024055379A1 (en) | Video processing method and system based on character avatar model, and related device | |
| CN117218246A (en) | Training method, device, electronic equipment and storage medium for image generation model | |
| CN111754622B (en) | Facial three-dimensional image generation method and related equipment | |
| US20250014149A1 (en) | Image synthesis method and apparatus, storage medium, and electronic device | |
| Sun et al. | Ssat++: A semantic-aware and versatile makeup transfer network with local color consistency constraint | |
| WO2025232361A1 (en) | Video generation method and related apparatus | |
| CN120853236B (en) | A method and system for eye movement correction based on codec and feature decoupling | |
| CN119484953A (en) | Method and device for generating digital human video based on multimodal feature fusion based on temporal position coding | |
| CN118115576A (en) | Image processing method, device and related equipment | |
| CN115714888B (en) | Video generation method, device, equipment and computer readable storage medium | |
| US20260080601A1 (en) | Automatic rigging with 2d supervised learning | |
| Wang et al. | DEDD: Stereo Image Super-Resolution Reconstruction Based on Disparity Estimation and Domain Diffusion | |
| CN120390061A (en) | Video frame generation method, device and electronic device | |
| CN119417953A (en) | Image animation generation method, device, electronic device and storage medium | |
| CN121937594A (en) | Three-dimensional digital person generation method and device and electronic equipment |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |