CN120635288A - Scene reconstruction method based on deferred rendering and 3D Gaussian - Google Patents

Scene reconstruction method based on deferred rendering and 3D Gaussian

Info

Publication number
CN120635288A
Authority
CN
China
Prior art keywords
gaussian
rendering
dimensional
reflection
normal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510837308.XA
Other languages
Chinese (zh)
Inventor
陈中衡
张世雄
魏文应
肖铁军
邓严萍
李宇浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Bohua Ultra Hd Innovation Center Co ltd
Original Assignee
Guangdong Bohua Ultra Hd Innovation Center Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Bohua Ultra Hd Innovation Center Co ltd
Priority to CN202510837308.XA
Publication of CN120635288A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 Three-dimensional [3D] image rendering
    • G06T15/10 Geometric effects
    • G06T15/20 Perspective computation
    • G06T15/205 Image-based rendering
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 Three-dimensional [3D] image rendering
    • G06T15/04 Texture mapping
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 Three-dimensional [3D] image rendering
    • G06T15/50 Lighting effects
    • G06T15/506 Illumination models
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three-dimensional [3D] modelling for computer graphics

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • General Physics & Mathematics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Image Generation (AREA)

Abstract

The invention provides a scene reconstruction method based on deferred rendering and three-dimensional Gaussians, comprising the following steps: S1, generating initial three-dimensional point cloud data from multi-view images; S2, constructing a trainable structure for three-dimensional Gaussian modeling; S3, implementing a differentiable, learnable normal reconstruction mechanism by performing normal initialization and residual optimization on the Gaussian ellipsoid primitives and combining a depth consistency constraint, so as to enhance the geometric expressiveness available to illumination modeling; S4, introducing a reflection training mechanism based on ambient light and reflection directions to generate view-dependent Gaussian attributes; and S5, generating a final image with a differentiable Gaussian splatting rendering algorithm and optimizing the pixel loss between the final image and the real image. The invention markedly improves the realism and geometric consistency of the Gaussian splatting model under complex illumination, and addresses the unrealistic rendering, inaccurate surface normal estimation, and weak normal propagation of existing three-dimensional Gaussian splatting models under such conditions.

Description

Scene reconstruction method based on deferred rendering and three-dimensional Gaussian
Technical Field
The invention belongs to the technical field of computer graphics and three-dimensional reconstruction, and particularly relates to a scene reconstruction method based on deferred rendering and three-dimensional Gaussians.
Background
Novel view synthesis (NVS), the generation of realistic novel views, has long been a central challenge in three-dimensional reconstruction. It aims to reconstruct a three-dimensional scene from sparse multi-view images and to render high-fidelity, geometrically coherent synthetic images from arbitrary new viewpoints. The technology has important application value in medical image reconstruction, industrial precision inspection, cultural heritage preservation, virtual reality systems, autonomous driving perception, and other fields.
Traditional three-dimensional scene representations follow two paradigms, explicit and implicit. Explicit representations rely on discrete geometric structures (such as point clouds and voxel grids) and achieve real-time rendering through hardware-accelerated graphics pipelines, but they are constrained by the Nyquist sampling theorem, so that microstructure reconstruction accuracy and high-frequency detail reproduction degrade exponentially. Implicit representations construct a differentiable volume rendering model from continuous mathematical functions (such as signed distance functions and radiance fields); the typical example is the neural radiance field (Neural Radiance Fields, NeRF [1]), which can achieve high-precision illumination reconstruction with view-dependent effects. Three-dimensional Gaussian splatting (3D Gaussian Splatting, 3DGS [2]) combines the advantages of explicit geometry and implicit functions: discrete three-dimensional Gaussian ellipsoid primitives are built from the input point cloud; local geometric features and optical properties are jointly represented through parametric modeling of position, covariance matrix, and opacity; and a differentiable rasterization pipeline drives adaptive, continuous optimization of the discrete Gaussian primitives across multiple views, achieving sub-millimeter reconstruction accuracy (DTU CD=0.42) while maintaining real-time rendering (> 30 fps).
However, 3DGS still has shortcomings when modeling specular materials in complex light fields: the ellipsoid primitives lack normal information, and insufficient coupling with ambient light distorts the surface structure and biases specular reflections; moreover, the GPU memory pressure caused by the independent illumination computation of existing methods restricts real-time rendering of large-scale scenes.
The main difficulty in solving the insufficient modeling of specular materials in complex light fields is that the primitives lack a well-defined surface structure, which makes normal estimation difficult and prevents a view-dependent physical illumination model from being introduced effectively. In addition, high-quality reflection modeling is often accompanied by a sharp increase in memory consumption, which challenges the original real-time rendering capability, and the differentiable rendering framework must be modified substantially to support a finer illumination-geometry coupling mechanism.
Overcoming these problems would markedly improve the photorealistic rendering quality of 3DGS and broaden its application prospects in scenarios that demand both high precision and high efficiency, such as industrial inspection and virtual reality. By fusing physical illumination modeling with Gaussian primitive optimization, it becomes possible to build a new generation of neural rendering paradigm that combines structural detail, realistic illumination, and real-time performance.
Disclosure of Invention
The invention provides a scene reconstruction method based on deferred rendering and three-dimensional Gaussians, which addresses the technical problems of existing three-dimensional Gaussian splatting models under complex illumination, such as unrealistic rendering, inaccurate surface normal estimation, and weak normal propagation.
The technical scheme of the invention is as follows:
The scene reconstruction method based on deferred rendering and three-dimensional Gaussians comprises the following steps: S1, generating initial three-dimensional point cloud data from multi-view images; S2, constructing a trainable structure for three-dimensional Gaussian modeling; S3, implementing a differentiable, learnable normal reconstruction mechanism by performing normal initialization and residual optimization on the Gaussian ellipsoid primitives and combining a depth consistency constraint, so as to enhance the geometric expressiveness available to illumination modeling; S4, introducing a reflection training mechanism based on ambient light and reflection directions to generate view-dependent Gaussian attributes; and S5, generating a final image with a differentiable Gaussian splatting rendering algorithm and optimizing the pixel loss between the final image and the real image.
Optionally, in the above scene reconstruction method based on deferred rendering and three-dimensional Gaussians, in step S1 the three-dimensional point cloud consists of a discrete set of three-dimensional points, each comprising three-dimensional coordinates in the world coordinate system and an associated RGB color attribute, and the camera parameters comprise two core components, the intrinsic matrix and the extrinsic matrix.
Optionally, in the above scene reconstruction method based on deferred rendering and three-dimensional Gaussians, in step S2 the sparse point cloud and the camera poses are taken as input to construct a three-dimensional Gaussian structure containing the learnable attributes of three-dimensional position, covariance matrix, opacity, and color. The structure characterizes the appearance and shape of each Gaussian point in three-dimensional space and provides parametric support for differentiable rendering and reflection modeling. In the world coordinate system, three-dimensional Gaussian ellipsoid primitives are defined with the sparse point cloud points as their centers; for a primitive with center $\mu$ and covariance matrix $\Sigma$, the spatial distribution is defined as:
$$G(x) = \exp\left(-\tfrac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)$$
Each Gaussian primitive is associated with learnable geometric attributes and rendering attributes. The geometric attribute is the covariance matrix $\Sigma$, which controls the shape of the ellipsoid; the rendering attributes include the opacity parameter $\alpha$, which controls the intensity of light attenuation during rendering, and the RGB color vector $c$, which is mapped to view-dependent spherical harmonic coefficients;
To ensure that the covariance matrix remains positive semi-definite during training and to prevent the three-dimensional Gaussian ellipsoid from degenerating into a low-dimensional structure such as a line or a point, the covariance matrix is decoupled into independently optimizable geometric components, a rotation matrix $R$ and a scaling matrix $S$:
$$\Sigma = R\,S\,S^{T}R^{T}$$
To ensure that $R$ is a valid rotation matrix, it is parameterized by a normalized quaternion $q = (q_r, q_i, q_j, q_k)$, from which the rotation matrix is built by the following formula:
$$R = \begin{pmatrix} 1-2(q_j^2+q_k^2) & 2(q_iq_j-q_rq_k) & 2(q_iq_k+q_rq_j) \\ 2(q_iq_j+q_rq_k) & 1-2(q_i^2+q_k^2) & 2(q_jq_k-q_rq_i) \\ 2(q_iq_k-q_rq_j) & 2(q_jq_k+q_rq_i) & 1-2(q_i^2+q_j^2) \end{pmatrix}$$
In the differentiable rendering stage, each three-dimensional Gaussian is first projected into the camera coordinate system, and the projected covariance $\Sigma'$ is computed as:
$$\Sigma' = J\,W\,\Sigma\,W^{T}J^{T}$$
where $W$ is the rigid-body transformation matrix from the world coordinate system to the camera coordinate system, and $J$ is the Jacobian matrix of the affine approximation of the projective transformation;
Rendering uses an alpha-blending mechanism based on physical light transport: the Gaussian primitives projected into a pixel's footprint are first sorted by depth (a back-to-front blending order is used by default), and color blending is achieved by accumulating light attenuation weights and opacities:
$$C = \sum_{i=1}^{N} T_i\,\alpha_i\,c_i,\qquad T_i = \prod_{j=1}^{i-1}(1-\alpha_j)$$
The transmittance $T_i$ is derived analytically from the eigenvalues of the projected covariance matrix $\Sigma'$ and characterizes the attenuation of light after it has passed through the preceding Gaussian points;
This stage uses a composite loss function combining $L_1$ and $L_{D\text{-}SSIM}$ to jointly optimize rendered image quality:
$$L = (1-\lambda)\,L_1 + \lambda\,L_{D\text{-}SSIM}$$
where $L_1$ constrains pixel-level color accuracy, $L_{D\text{-}SSIM}$ maintains structural similarity, and the hyperparameter $\lambda$ is set to 0.2.
Optionally, in the above scene reconstruction method based on deferred rendering and three-dimensional Gaussians, in step S3 a surface normal reconstruction mechanism for the three-dimensional ellipsoid primitives is introduced. For the covariance matrix $\Sigma$ of each ellipsoid primitive, the eigenvector corresponding to the smallest eigenvalue is extracted by eigendecomposition and normalized; it represents the direction of the shortest principal axis of the ellipsoid's local geometry. Because this direction may point either into or out of the surface, a view direction vector is defined as the normalized direction vector from the ellipsoid center to the camera optical center, and the shortest principal axis is aligned with it to give the initial normal.
The sign applied in this alignment ensures that the angle between the initial normal and the view direction is strictly less than 90°. A trainable residual vector is introduced to apply a nonlinear correction to the initial normal;
at the same time, a penalty term is applied to suppress excessive offset of the residual vector;
a depth penalty term is added. Given the camera optical center and the center of each ellipsoid, the view direction and the rendered depth of the ellipsoid are defined; for adjacent ellipsoids, the depth difference and the depth gradient direction are then computed, and the depth difference and the normal are required to satisfy a consistency condition.
The depth correction loss function is defined over a set of adjacent point pairs,
where the set of adjacent point pairs is obtained by K-nearest-neighbor or radius search, and each pair carries an adaptive weight;
one weight parameter controls the decay of the weight with spatial distance and is set to the average radius of the Gaussian ellipsoids, and a second parameter controls the decay of the weight with depth difference, which reduces the weight at depth discontinuities and avoids penalties across boundaries;
based on the optimized normal field, the reflection direction is computed from the normal and the incident ray direction by a reflection operator.
Optionally, in the above scene reconstruction method based on deferred rendering and three-dimensional Gaussians, in step S4 new Gaussian attributes are generated at this stage, namely a reflection intensity and a reflected-light coefficient s, which control the influence of ambient light on the final rendered color, and the degree-raising mechanism of the spherical harmonics is enabled.
For each view, the contribution of each attribute to each pixel is expressed through Gaussian weights, in a manner similar to step S3,
and the other Gaussian attributes are expressed in the same way.
Optionally, in the above scene reconstruction method based on deferred rendering and three-dimensional Gaussians, in step S5, during training and inference, the differentiable rendering process based on reflection information and normals models the color of each Gaussian point as the combination of a base color term and a specular reflection term; the final image is generated by the differentiable Gaussian splatting rendering algorithm and optimized through the pixel loss against the real image, wherein:
The rendered color function is decoupled into ambient light and the diffuse light of the object itself.
The ambient light contribution is obtained by querying a learned environment map in the reflection direction with bilinear interpolation.
The diffuse light color contribution is computed from the trained third-order spherical harmonics and the specular reflection coefficient;
During reflection training, noise is added to the non-reflective Gaussians during normal propagation.
According to the technical scheme of the invention, the beneficial effects are that:
Geometric reconstruction accuracy and stability are improved: decoupling the covariance matrix into a rotation matrix and a scaling matrix keeps the geometry of each Gaussian ellipsoid well-formed, effectively prevents degeneration into low-dimensional structures such as lines or points, and strengthens the structural expressiveness of the model. The normal estimation and propagation mechanism is explicit: an initial normal is obtained by aligning the shortest principal axis with the view direction, learned residual correction further improves the model's ability to capture complex surface detail, and an explicit propagation mechanism guided by reflection intensity diffuses correct normals across regions, improving the accuracy and consistency of the overall normal field. The reflection rendering mechanism has strong physical consistency: sampling of the ambient light is guided by the reflection direction, reflection intensity and specular coefficient attributes are added, and a view-dependent reflection weight function is introduced, so that rendering results are more physically consistent and realistic, and look more natural on specular materials and in highlight regions in particular.
The invention strengthens the cooperation among Gaussian primitives: the opacity and size of reflective Gaussians with correct normals are periodically increased so that they overlap adjacent Gaussians in pixel space, which drives the joint optimization of regions with wrong normals and markedly improves the generalization and adaptability of the model. The invention retains the efficiency and expressiveness of three-dimensional Gaussian splatting for structural modeling while introducing a normal-guided reflection modeling mechanism, which greatly improves rendering quality in scenes with complex illumination and makes the method applicable to high-precision graphics applications such as virtual reality, three-dimensional reconstruction, game engines, and video rendering.
For a better understanding of the concept, working principle, and effects of the invention, the invention is described in detail below through specific embodiments with reference to the accompanying drawings.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a flow chart of the scene reconstruction method based on deferred rendering and three-dimensional Gaussian splatting of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples, in order to make the objects, technical methods and advantages of the present invention more apparent. These examples are illustrative only and are not limiting of the invention.
The invention discloses a scene reconstruction method based on deferred rendering and three-dimensional Gaussians, which comprises a structure training stage and a reflection training stage. In structure training, a set of three-dimensional Gaussians is constructed from the sparse point cloud, a stable geometric structure is obtained by decoupling the covariance matrix, an initial normal estimation method combining the principal axis direction with view information is provided, and a learnable normal residual is then introduced for fine optimization. In the reflection training stage, the reflection direction is computed from the optimized normal and used to guide sampling of the environment map, spherical harmonics are combined to obtain the view-dependent reflection color, and the final output color is generated by combining diffuse and reflected light. After training, the opacity and size of Gaussians with correct normals are increased, so that their normal information propagates to the surrounding regions and the accuracy of the global normal field improves. The invention markedly improves the realism and geometric consistency of the Gaussian splatting model under complex illumination, and is suitable for three-dimensional reconstruction, virtual reality, graphics rendering, and similar scenarios.
Through a surface normal reconstruction mechanism constrained by the view direction, the invention introduces a Phong model to replace the plain spherical harmonic color representation, separates the shading contributions of ambient light and diffuse light on specular materials with the help of an environment map, and adds learnable specular attributes, so that the colors of reflective specular objects are reproduced more realistically. At the same time, a deferred rendering framework that decouples geometry from illumination is adopted, and the allocation of computing resources is optimized through mixed-precision storage and GPU memory reuse, overcoming the limitations of traditional methods and achieving high-precision real-time rendering of specular materials in complex light fields.
First, based on the input image sequence, a multi-view three-dimensional reconstruction method (such as COLMAP) is used to obtain the sparse three-dimensional point cloud and the camera pose information at the same time. Second, in the structure training stage, a learnable set of three-dimensional Gaussian ellipsoids is established around the sparse point cloud, its geometric stability is guaranteed by decoupling the covariance matrix, and parameters such as color and opacity are jointly optimized within the differentiable rendering framework. The principal axes of the Gaussian covariance matrix are then extracted by eigendecomposition, the initial normal direction is determined in combination with the camera view direction, and a trainable residual is introduced to reconstruct the normal field finely. Next, in the reflection training stage, the reflection direction is computed from the optimized normal, higher-order spherical harmonics are introduced to model the view-dependent reflection intensity, the reflection color is queried from an environment map, and the newly added reflection intensity and specular attributes are both obtained by training. Finally, in the rendering stage, the final color is decoupled into diffuse light and ambient light, which are fused according to the reflection intensity to obtain the final output image.
As shown in Fig. 1, the scene reconstruction method based on deferred rendering and three-dimensional Gaussians of the present invention comprises the following steps:
Step S1, generating initial three-dimensional point cloud data from the multi-view images. Based on the input image sequence, a multi-view stereo algorithm (such as COLMAP) is used to perform the initial three-dimensional point cloud reconstruction and the camera pose estimation at the same time.
The three-dimensional point cloud consists of a discrete set of three-dimensional points, each comprising three-dimensional coordinates in the world coordinate system and an associated RGB color attribute. The camera parameters comprise an intrinsic matrix (focal length, principal point, and distortion coefficients) and an extrinsic matrix (rotation matrix and translation vector).
Step S2, constructing a trainable structure for three-dimensional Gaussian modeling. The sparse point cloud and the camera poses are taken as input to construct a three-dimensional Gaussian structure containing learnable attributes such as three-dimensional position, covariance matrix, opacity, and color. The structure characterizes the appearance and shape of each Gaussian point in three-dimensional space and provides parametric support for differentiable rendering and reflection modeling. In the world coordinate system, three-dimensional Gaussian ellipsoid primitives are defined with the sparse point cloud points as their centers; for a primitive with center $\mu$ and covariance matrix $\Sigma$, the spatial distribution is:
$$G(x) = \exp\left(-\tfrac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)$$
Each Gaussian primitive is associated with learnable geometric attributes and rendering attributes. The geometric attribute is the covariance matrix $\Sigma$, which controls the shape of the ellipsoid; the rendering attributes include the opacity parameter $\alpha$, which controls the intensity of light attenuation during rendering, and the RGB color vector $c$, which is mapped to view-dependent spherical harmonic coefficients.
To ensure that the covariance matrix remains positive semi-definite during training and to prevent the three-dimensional Gaussian ellipsoid from degenerating into a low-dimensional structure (such as a line or a point), the covariance matrix is decoupled into independently optimizable geometric components, a rotation matrix $R$ and a scaling matrix $S$:
$$\Sigma = R\,S\,S^{T}R^{T}$$
To ensure that $R$ is a valid rotation matrix, it is parameterized by a normalized quaternion $q = (q_r, q_i, q_j, q_k)$. Specifically, the quaternion builds the rotation matrix by the following formula:
$$R = \begin{pmatrix} 1-2(q_j^2+q_k^2) & 2(q_iq_j-q_rq_k) & 2(q_iq_k+q_rq_j) \\ 2(q_iq_j+q_rq_k) & 1-2(q_i^2+q_k^2) & 2(q_jq_k-q_rq_i) \\ 2(q_iq_k-q_rq_j) & 2(q_jq_k+q_rq_i) & 1-2(q_i^2+q_j^2) \end{pmatrix}$$
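The decoupling of the covariance matrix into a quaternion-parameterized rotation and a per-axis scale can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `quat_to_rotmat` and `build_covariance`, the (w, x, y, z) quaternion ordering, and the example values are all hypothetical.

```python
import numpy as np

def quat_to_rotmat(q):
    """Rotation matrix from a quaternion in (w, x, y, z) order (assumed ordering).

    The quaternion is normalized first, so any non-zero 4-vector is valid input.
    """
    w, x, y, z = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def build_covariance(q, scale):
    """Covariance R S S^T R^T; positive semi-definite by construction."""
    M = quat_to_rotmat(q) @ np.diag(np.asarray(scale, dtype=float))
    return M @ M.T

# Hypothetical primitive: an arbitrary rotation and three axis lengths.
R = quat_to_rotmat([0.9, 0.1, -0.2, 0.3])
cov = build_covariance([0.9, 0.1, -0.2, 0.3], [0.5, 0.2, 0.05])
```

Because the covariance is assembled as a matrix times its own transpose, the optimizer can update the quaternion and the scales freely without ever producing an invalid covariance; the eigenvalues of the result are simply the squared scales.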
In the differentiable rendering stage, each three-dimensional Gaussian is first projected into the camera coordinate system, and the projected covariance $\Sigma'$ is computed as:
$$\Sigma' = J\,W\,\Sigma\,W^{T}J^{T}$$
where $W$ is the rigid-body transformation matrix from the world coordinate system to the camera coordinate system, and $J$ is the Jacobian matrix of the affine approximation of the projective transformation.
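As an illustration of how W and J enter the projection, the sketch below assumes a simple pinhole camera with focal lengths `fx` and `fy`, uses only the rotation part of the world-to-camera transform for W, and takes J as the Jacobian of the perspective division evaluated at the Gaussian center. The function name `project_covariance` and all parameter names are hypothetical.

```python
import numpy as np

def project_covariance(cov3d, mean_world, R_wc, t_wc, fx, fy):
    """Project a 3D covariance to a 2x2 image-space covariance.

    R_wc, t_wc: world-to-camera rotation and translation (assumed convention).
    J: local affine approximation of the pinhole projection at the center.
    """
    x, y, z = R_wc @ mean_world + t_wc          # center in camera coordinates
    J = np.array([
        [fx / z, 0.0,    -fx * x / z ** 2],
        [0.0,    fy / z, -fy * y / z ** 2],
    ])
    return J @ R_wc @ cov3d @ R_wc.T @ J.T

# Hypothetical example: axis-aligned Gaussian on the optical axis, 2 units away.
cov2d = project_covariance(
    np.diag([0.04, 0.01, 0.09]),
    np.array([0.0, 0.0, 0.0]),
    np.eye(3),
    np.array([0.0, 0.0, 2.0]),
    fx=500.0, fy=500.0,
)
```

For a Gaussian on the optical axis the off-axis terms of J vanish, so the projected covariance is just the x and y variances scaled by (f/z)², which makes the example easy to check by hand.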
Rendering uses an alpha-blending mechanism based on physical light transport: the Gaussian primitives projected into a pixel's footprint are first sorted by depth (a back-to-front blending order is used by default), and color blending is achieved by accumulating light attenuation weights and opacities:
$$C = \sum_{i=1}^{N} T_i\,\alpha_i\,c_i,\qquad T_i = \prod_{j=1}^{i-1}(1-\alpha_j)$$
The transmittance $T_i$ is derived analytically from the eigenvalues of the projected covariance matrix $\Sigma'$ and characterizes the attenuation of light after it has passed through the preceding Gaussian points.
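The per-pixel blending can be written either as a transmittance-weighted sum or as repeated back-to-front "over" compositing; the two give the same pixel color. The sketch below shows both for a single pixel. It is an assumption-laden illustration: the function names are hypothetical, and the inputs are taken to be already sorted nearest-first with the 2D Gaussian falloff folded into each alpha.

```python
import numpy as np

def composite_with_transmittance(colors, alphas):
    """Weighted sum with w_i = T_i * alpha_i, T_i = prod_{j<i} (1 - alpha_j).

    colors: (N, 3) values sorted nearest-first; alphas: (N,) effective opacities.
    Returns the blended color and the per-Gaussian weights.
    """
    colors = np.asarray(colors, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = transmittance * alphas
    return weights @ colors, weights

def composite_back_to_front(colors, alphas):
    """Equivalent 'over' compositing, starting from the farthest Gaussian."""
    out = np.zeros(3)
    for c, a in zip(colors[::-1], alphas[::-1]):
        out = a * np.asarray(c, dtype=float) + (1.0 - a) * out
    return out

colors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
alphas = [0.5, 0.5, 1.0]
pixel, weights = composite_with_transmittance(colors, alphas)
```

The same weights can accumulate any per-Gaussian attribute, not only color, which is what a deferred pipeline relies on when it writes normals or reflection intensities into screen-space buffers before shading.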
This stage uses a composite loss function combining $L_1$ and $L_{D\text{-}SSIM}$ to jointly optimize rendered image quality:
$$L = (1-\lambda)\,L_1 + \lambda\,L_{D\text{-}SSIM}$$
where $L_1$ constrains pixel-level color accuracy, $L_{D\text{-}SSIM}$ maintains structural similarity, and the hyperparameter $\lambda$ is set to 0.2.
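The weighting between the pixel-level term and the structural term can be sketched as follows. Two simplifications are assumptions of this sketch rather than statements about the patent: the structural term is taken as 1 − SSIM, and SSIM is computed over the whole image as a single window instead of with the usual local sliding window. The function names are hypothetical.

```python
import numpy as np

def l1_loss(render, target):
    """Mean absolute per-pixel color error."""
    return float(np.mean(np.abs(render - target)))

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Single-window SSIM over the whole image (simplification of local SSIM)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return float(((2 * mx * my + c1) * (2 * cxy + c2)) /
                 ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))

def composite_loss(render, target, lam=0.2):
    """(1 - lam) * L1 + lam * (1 - SSIM), with lam = 0.2 as stated in the text."""
    return (1.0 - lam) * l1_loss(render, target) + lam * (1.0 - global_ssim(render, target))

rng = np.random.default_rng(0)
target = rng.random((16, 16, 3))
render = np.clip(target + 0.1 * rng.standard_normal(target.shape), 0.0, 1.0)
loss_value = composite_loss(render, target)
```

With λ = 0.2 the loss is dominated by the L1 term, and the structural term acts as a regularizer that discourages blurry results that would otherwise score well on per-pixel error alone.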
In the initial geometric modeling process, only first-order spherical harmonics are used, that is, the color representation is simplified to a diffuse RGB attribute. The optimization converges the parameters of the geometric primitives, such as position, covariance matrix, and opacity, in about 3000 iterations.
Step S3, implementing a differentiable, learnable normal reconstruction mechanism by performing normal initialization and residual optimization on the Gaussian ellipsoid primitives and combining a depth consistency constraint, so as to enhance the geometric expressiveness available to illumination modeling.
A surface normal reconstruction mechanism for the three-dimensional ellipsoid primitives is introduced. For the covariance matrix $\Sigma$ of each ellipsoid primitive, the eigenvector corresponding to the smallest eigenvalue is extracted by eigendecomposition and normalized; it characterizes the direction of the shortest principal axis of the ellipsoid's local geometry. This direction may point either into or out of the surface. To eliminate the ambiguity, a view direction vector is defined as the normalized direction vector from the ellipsoid center to the camera optical center, and the shortest principal axis is aligned with it to give the initial normal.
The sign applied in this alignment ensures that the angle between the initial normal and the view direction is strictly less than 90°, thereby establishing an explicit association between geometry and viewpoint. To further enhance the ability of the normal field to represent complex surface detail, a trainable residual vector is introduced to apply a nonlinear correction to the initial normal.
At the same time, a penalty term is applied to suppress excessive offset of the residual vector.
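The normal initialization and residual correction described above can be sketched as follows. The eigendecomposition and the sign flip toward the camera follow the text directly; the way the residual is applied (added, then renormalized) and the squared-norm penalty are assumptions of this sketch, and all function names are hypothetical.

```python
import numpy as np

def initial_normal(cov, center, cam_pos):
    """Shortest principal axis of the ellipsoid, flipped to face the camera."""
    _, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    axis = eigvecs[:, 0]                        # eigenvector of the smallest one
    view = cam_pos - center
    view = view / np.linalg.norm(view)          # ellipsoid center -> optical center
    if np.dot(axis, view) < 0.0:
        axis = -axis                            # resolve the inside/outside ambiguity
    return axis, view

def corrected_normal(n_init, residual):
    """Apply a trainable residual and renormalize (assumed form of the correction)."""
    n = n_init + residual
    return n / np.linalg.norm(n)

def residual_penalty(residual):
    """Squared-norm penalty keeping the residual small (assumed form)."""
    return float(np.sum(np.square(residual)))

# Hypothetical flat ellipsoid lying in the xy-plane, camera on the negative z side.
n0, view = initial_normal(np.diag([0.25, 0.04, 0.0025]),
                          np.zeros(3), np.array([0.0, 0.0, -3.0]))
n = corrected_normal(n0, np.array([0.1, 0.0, 0.0]))
```

Renormalizing after adding the residual is what makes the correction nonlinear in the residual, and the penalty keeps the learned normal from drifting arbitrarily far from the geometric estimate.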
To increase the geometric consistency of the three-dimensional ellipsoids, a depth penalty term is added. 3DGS naturally provides depth information but lacks normals. The normals generated by the above method are not fully accurate, and the penalty term introduced so far only enforces consistency with the view direction, whereas a depth penalty term constrains the accuracy of the geometry. Given the camera optical center and the center of each ellipsoid, the view direction and the rendered depth of the ellipsoid are defined; for adjacent ellipsoids, the depth difference and the depth gradient direction are then computed, and the depth difference and the normal are required to satisfy a consistency condition.
The depth correction loss function is defined over a set of adjacent point pairs,
where the set of adjacent point pairs is obtained by K-nearest-neighbor or radius search, and each pair carries an adaptive weight.
One weight parameter controls the decay of the weight with spatial distance and is set to the average radius of the Gaussian ellipsoids. A second parameter controls the decay of the weight with depth difference; it reduces the weight at depth discontinuities and avoids penalties across boundaries.
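The exact depth-consistency formulas are not recoverable from the text, so the following is only one plausible instantiation, offered to make the roles of the two weight parameters concrete. It assumes Gaussian-shaped weights in spatial distance and depth difference, takes depth as the distance from the optical center, and uses a tangent-plane residual (the offset between neighboring centers should be perpendicular to the normal). Every name and functional form here is hypothetical.

```python
import numpy as np

def pair_weight(p_i, p_j, d_i, d_j, sigma_s, sigma_d):
    """Adaptive weight: decays with spatial distance and with depth difference."""
    spatial = np.exp(-np.sum((p_i - p_j) ** 2) / (2.0 * sigma_s ** 2))
    depth = np.exp(-((d_i - d_j) ** 2) / (2.0 * sigma_d ** 2))
    return spatial * depth

def depth_consistency_loss(centers, normals, cam_pos, pairs, sigma_s, sigma_d):
    """Weighted mean of squared tangent-plane residuals over neighbor pairs."""
    depths = np.linalg.norm(centers - cam_pos, axis=1)   # assumed depth definition
    total = 0.0
    for i, j in pairs:
        w = pair_weight(centers[i], centers[j], depths[i], depths[j], sigma_s, sigma_d)
        residual = np.dot(normals[i], centers[j] - centers[i])
        total += w * residual ** 2
    return total / max(len(pairs), 1)

# Hypothetical patch: three coplanar centers facing a camera at the origin.
cam = np.zeros(3)
centers = np.array([[0.0, 0.0, 2.0], [0.1, 0.0, 2.0], [0.0, 0.1, 2.0]])
pairs = [(0, 1), (0, 2), (1, 2)]
flat_normals = np.tile([0.0, 0.0, -1.0], (3, 1))
tilted_normals = np.tile(np.array([1.0, 0.0, -1.0]) / np.sqrt(2.0), (3, 1))
flat_loss = depth_consistency_loss(centers, flat_normals, cam, pairs, 0.1, 0.05)
tilted_loss = depth_consistency_loss(centers, tilted_normals, cam, pairs, 0.1, 0.05)
```

In this form the depth weight collapses toward zero across a depth discontinuity, so pairs that straddle an object boundary contribute almost nothing to the loss, which is the behavior the text attributes to the second weight parameter.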
Based on the optimized normal field, the reflection direction is computed from the normal and the incident ray direction by a reflection operator.
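A standard mirror-reflection operator is shown below as a sketch. It assumes the incident direction points from the camera toward the surface, in which case the reflected direction is d − 2(d·n)n; the source does not state its sign convention, so the function name `reflect` and this convention are assumptions.

```python
import numpy as np

def reflect(incident, normal):
    """Mirror the incident direction (camera -> surface) about the unit normal."""
    n = normal / np.linalg.norm(normal)
    return incident - 2.0 * np.dot(incident, n) * n

# A ray hitting an upward-facing surface at 45 degrees leaves at 45 degrees.
incident = np.array([1.0, 0.0, -1.0]) / np.sqrt(2.0)
reflected = reflect(incident, np.array([0.0, 0.0, 1.0]))
```

The operator preserves the length of the incident vector, so a unit incident direction yields a unit reflection direction that can be used directly to index an environment map.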
Step S4, introducing a reflection training mechanism based on ambient light and reflection directions to generate view-dependent Gaussian attributes.
At this stage, new Gaussian attributes are generated, namely a reflection intensity and a reflected-light coefficient s, which control the influence of ambient light on the final rendered color. The degree-raising mechanism of the spherical harmonics is enabled.
For each view, the contribution of each attribute to each pixel is expressed through Gaussian weights, in a manner similar to step S3,
and the other Gaussian attributes are expressed in the same way.
During experiments, it was observed that a small number of Gaussians may acquire a relatively large reflection intensity during optimization, and that such Gaussians have nearly correct normals. Therefore, the opacity of the Gaussian ellipsoids is periodically raised to at least 0.9, the reflection intensity is raised to at least 0.001, and the longest axis of each reflective Gaussian is enlarged by a factor of 1.5 while the shortest axis is left unchanged. This ensures that almost every reflective Gaussian overlaps its adjacent Gaussians and that each visible Gaussian contributes significantly to the surface normal. In this way, when a Gaussian with a nearly correct normal overlaps a Gaussian without a correct normal, the pixels they share can also obtain an approximately correct normal.
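The periodic enhancement can be sketched as a small array update. The numeric floors (0.9 and 0.001) and the 1.5 stretch factor come from the text; the reading that the floors apply to all Gaussians while the stretch applies only to reflective ones is an interpretation, and the reflection-intensity threshold is not given in the source, so `refl_threshold` is a hypothetical parameter that the caller must supply.

```python
import numpy as np

def boost_gaussians(opacity, refl, scales, refl_threshold,
                    min_opacity=0.9, min_refl=0.001, stretch=1.5):
    """Periodic propagation step (assumed interpretation of the description).

    opacity, refl: (N,) arrays; scales: (N, 3) per-axis extents.
    Returns new arrays; the inputs are left untouched.
    """
    reflective = refl > refl_threshold               # judged before any clamping
    new_opacity = np.maximum(opacity, min_opacity)   # raise opacity to at least 0.9
    new_refl = np.maximum(refl, min_refl)            # raise reflection to at least 0.001
    new_scales = scales.copy()
    rows = np.nonzero(reflective)[0]
    longest = np.argmax(scales[rows], axis=1)        # index of each longest axis
    new_scales[rows, longest] *= stretch             # other axes stay unchanged
    return new_opacity, new_refl, new_scales

opacity = np.array([0.3, 0.95])
refl = np.array([0.0, 0.5])
scales = np.array([[1.0, 2.0, 3.0], [0.1, 0.4, 0.2]])
new_opacity, new_refl, new_scales = boost_gaussians(opacity, refl, scales, refl_threshold=0.1)
```

Stretching only the longest axis widens each reflective Gaussian's footprint along the surface without thickening it, so it overlaps its neighbors in pixel space while its shortest axis, and hence its normal estimate, is preserved.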
Step S5, generating a final image with the differentiable Gaussian splatting rendering algorithm and optimizing it through the pixel loss between the final image and the real image.
In this step, during training and inference, the differentiable rendering process based on reflection information and normals models the color of each Gaussian point as the combination of a base color term and a specular reflection term; the final image is generated by the differentiable Gaussian splatting rendering algorithm and optimized through the pixel loss against the real image.
The rendered color function is decoupled into ambient light and the diffuse light of the object itself.
The ambient light contribution is obtained by querying a learned environment map in the reflection direction with bilinear interpolation.
The diffuse light color contribution is computed from the trained third-order spherical harmonics and the specular reflection coefficient.
During reflection training, diffuse light is prone to overfitting, which can affect the propagation mechanism of the reflection surface normal. To avoid this effect, there is no reflection Gaussian) Addition in normal propagationIs a noise of (a) a noise of (b).
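The per-pixel shading of step S5 can be illustrated with the sketch below. It rests on several assumptions that are not stated in the text: the blend is taken to be (1 - s) times the diffuse color plus s times the environment color, the environment map is stored as an equirectangular image, and the names `sample_env_bilinear` and `shade_pixel` are hypothetical.

```python
import numpy as np

def sample_env_bilinear(env, direction):
    """Bilinearly sample an equirectangular map env[H, W, 3] along a direction."""
    h, w, _ = env.shape
    x, y, z = direction / np.linalg.norm(direction)
    # Continuous pixel coordinates (longitude -> u, polar angle -> v).
    u = (np.arctan2(x, -z) / (2.0 * np.pi) + 0.5) * w - 0.5
    v = (np.arccos(np.clip(y, -1.0, 1.0)) / np.pi) * h - 0.5
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    fu, fv = u - u0, v - v0
    # Wrap around horizontally, clamp at the poles vertically.
    xs = (u0 % w, (u0 + 1) % w)
    ys = (min(max(v0, 0), h - 1), min(max(v0 + 1, 0), h - 1))
    top = (1.0 - fu) * env[ys[0], xs[0]] + fu * env[ys[0], xs[1]]
    bottom = (1.0 - fu) * env[ys[1], xs[0]] + fu * env[ys[1], xs[1]]
    return (1.0 - fv) * top + fv * bottom

def shade_pixel(diffuse, s, env, refl_dir):
    """Blend the diffuse color with the environment lookup using coefficient s."""
    return (1.0 - s) * diffuse + s * sample_env_bilinear(env, refl_dir)
```

In a deferred pipeline, `diffuse`, `s`, and the normal used to derive `refl_dir` would be read from screen-space buffers produced by splatting, so the environment map is queried once per pixel rather than once per Gaussian.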
The embodiment of the invention addresses two core bottlenecks of conventional three-dimensional Gaussian splatting when rendering specular materials: 1) specular reflection modeling errors caused by the absence of a surface normal constraint on ellipsoidal primitives, and 2) contention for video memory caused by the coupled, multi-component storage of geometric attributes and color computation in a forward rendering pipeline.
To verify the advantages of this scheme, it was compared with 3DGS on the Shiny dataset scenes of Ref-NeRF on an A600 server. The results are shown in Table 1.
Table 1. Comparison of the present invention with 3DGS on the Shiny dataset scenes of Ref-NeRF on an A600 server
To achieve the above objective, the system proposes a scene reconstruction method based on deferred rendering and three-dimensional Gaussians, in which a surface normal reconstruction module is constructed and the deferred rendering pipeline is decoupled in time. This reduces video memory usage while enabling physically accurate modeling of highly reflective surfaces under complex light fields.
The above describes the best mode of carrying out the inventive concept and its principles of operation. The above examples should not be construed as limiting the scope of the claims; other embodiments and combinations of implementations according to the inventive concept also fall within the scope of the invention.
References
[1] MILDENHALL B, SRINIVASAN P P, TANCIK M, et al. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis[A/OL]. arXiv, 2020[2025-04-01]. http://arxiv.org/abs/2003.08934. DOI:10.48550/arXiv.2003.08934.
[2] KERBL B, KOPANAS G, LEIMKÜHLER T, et al. 3D Gaussian Splatting for Real-Time Radiance Field Rendering[A/OL]. arXiv, 2023[2025-04-01]. http://arxiv.org/abs/2308.04079. DOI:10.48550/arXiv.2308.04079.

Claims (6)

1. A scene reconstruction method based on deferred rendering and three-dimensional Gaussians, characterized by comprising the following steps:
S1, generating initial three-dimensional point cloud data based on multi-view images;
S2, constructing a trainable structure for three-dimensional Gaussian modeling;
S3, performing normal initialization and residual optimization on the Gaussian ellipsoid primitives and, in combination with a depth consistency constraint, implementing a differentiable and learnable normal reconstruction mechanism to enhance the geometric expressiveness of illumination modeling;
S4, introducing a reflection training mechanism based on ambient light and the reflection direction to generate view-dependent Gaussian attributes; and
S5, generating a final image with a differentiable Gaussian splatting rendering algorithm, and optimizing using the pixel loss between the final image and the real image.
2. The scene reconstruction method based on deferred rendering and three-dimensional Gaussians according to claim 1, wherein in step S1 the three-dimensional point cloud consists of a discrete set of three-dimensional points, each comprising three-dimensional spatial coordinates in the world coordinate system and an associated RGB color attribute, and the camera parameter system comprises two sets of core parameters, namely an intrinsic matrix and an extrinsic matrix.
3. The scene reconstruction method based on deferred rendering and three-dimensional Gaussians according to claim 1, wherein in step S2, taking the sparse point cloud and the camera poses as input, a three-dimensional Gaussian structure containing the learnable attributes of three-dimensional position, covariance matrix, opacity, and color is constructed to characterize the appearance and shape of each Gaussian point in three-dimensional space and to provide parameterized support for differentiable rendering and reflection modeling; three-dimensional Gaussian ellipsoid primitives are defined in the world coordinate system with the sparse points as their centers, and their spatial distribution is defined as:
;
each Gaussian primitive is associated with learnable geometric attributes and rendering characteristics: the geometric attributes are defined by a covariance matrix, which controls the geometry of the ellipsoid; the rendering characteristics include an opacity parameter, which controls the intensity of light attenuation during rendering, and an RGB color vector, which is mapped to view-dependent spherical harmonic coefficients;
to ensure that the covariance matrix remains positive semi-definite during training and to prevent the three-dimensional Gaussian ellipsoid from degenerating into a low-dimensional structure such as a line or a point, the covariance matrix is decomposed into independently optimizable geometric components, namely a rotation matrix and a scaling matrix:
;
wherein, to ensure a valid rotation matrix, the quaternion that parameterizes it is normalized, and the rotation matrix is built from the quaternion by the following formula:
;
in the differentiable rendering stage, each three-dimensional Gaussian is first projected into the camera coordinate system, and the projected covariance Σ' is computed as:
;
wherein W is the rigid-body transformation matrix from the world coordinate system to the camera coordinate system, and J is the Jacobian matrix of the affine approximation of the projective transformation;
the rendering adopts an alpha-blending mechanism based on physical light transport: the Gaussian primitives projected into a pixel's region of influence are first sorted by depth, and colors are then blended by accumulating light attenuation weights and opacities, using a back-to-front blending order by default:
;
the transmittance is derived analytically from the eigenvalues of the projected covariance matrix Σ' and represents the degree to which light is attenuated after passing through the preceding Gaussian points;
this stage adopts a composite loss function to jointly optimize the rendered image quality:
;
wherein one term constrains pixel-level color accuracy, the other maintains structural similarity, and the hyperparameter is set to 0.2.
4. The scene reconstruction method based on deferred rendering and three-dimensional Gaussians according to claim 1, wherein in step S3 a surface normal reconstruction mechanism for three-dimensional ellipsoid primitives is introduced: for the covariance matrix of each ellipsoid primitive, the eigenvector corresponding to the smallest eigenvalue is extracted by eigendecomposition and normalized; it represents the direction of the shortest principal axis of the ellipsoid's local geometry, which may point toward either the inside or the outside of the surface; the viewing direction vector is defined as the normalized direction vector from the center of the ellipsoid to the optical center of the camera, and the shortest principal axis is aligned with it to serve as the initial normal:
;
wherein the alignment ensures that the angle between the initial normal and the viewing direction is strictly smaller than the prescribed bound; a trainable residual vector is introduced to apply a nonlinear correction to the initial normal:
;
a penalty term is applied at the same time to suppress excessive offsets of the residual vector:
;
a depth penalty term is added, wherein, given the optical center position of the camera and the center position of the ellipsoid, the viewing direction, the rendered depth, the depth difference between adjacent ellipsoids, and the depth gradient direction are defined, and the depth difference and the normal satisfy:
;
the depth correction loss function is defined as:
;
wherein the set of adjacent point pairs is obtained by K-nearest-neighbor or radius search, and each pair carries an adaptive weight;
;
one parameter controls the decay of the weight with spatial distance and is set to the average radius of the Gaussian ellipsoids, and another controls the decay of the weight with depth difference, reducing the weight at depth discontinuities to avoid penalizing across boundaries;
based on the optimized normal field and the incident ray direction, the reflection direction is computed with a reflection operator.
5. The method of claim 1, wherein in step S4 a new Gaussian attribute, the reflection intensity, is generated together with a reflected light coefficient s, which are used to control the influence of ambient light on the final rendered color, and the degree-raising mechanism of the spherical harmonics is enabled;
for each view, the contribution of each attribute to each pixel is represented by Gaussian weights, in a manner similar to step S3:
;
the other Gaussian attributes are expressed in the same way:
;
;
6. The method of claim 1, wherein in step S5 a differentiable rendering process based on reflection information and normals is used in both the training and inference stages, the color of each Gaussian point is modeled as a combination of a base color term and a specular reflection term, the final image is generated by a differentiable Gaussian splatting rendering algorithm, and optimization is performed through the pixel loss against the real image, wherein:
the rendering color function is decoupled into ambient light and the diffuse light of the object itself:
;
wherein the ambient light contribution is obtained by querying a learned environment map in the reflection direction with bilinear interpolation:
;
the diffuse color contribution is computed from the trained third-order spherical harmonics and the specular reflection coefficient;
;
during reflection training, noise is added to the non-reflective Gaussians during normal propagation.
CN202510837308.XA 2025-06-23 2025-06-23 Scene reconstruction method based on deferred rendering and 3D Gaussian Pending CN120635288A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510837308.XA CN120635288A (en) 2025-06-23 2025-06-23 Scene reconstruction method based on deferred rendering and 3D Gaussian

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510837308.XA CN120635288A (en) 2025-06-23 2025-06-23 Scene reconstruction method based on deferred rendering and 3D Gaussian

Publications (1)

Publication Number Publication Date
CN120635288A true CN120635288A (en) 2025-09-12

Family

ID=96974792

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510837308.XA Pending CN120635288A (en) 2025-06-23 2025-06-23 Scene reconstruction method based on deferred rendering and 3D Gaussian

Country Status (1)

Country Link
CN (1) CN120635288A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120976449A (en) * 2025-10-21 2025-11-18 国科大杭州高等研究院 A method and system for cross-source data 3D reconstruction based on improved Gaussian sputtering
CN121236305A (en) * 2025-12-04 2025-12-30 南京邮电大学 A method for 3D scene reconstruction of low-light blurred images based on Gaussian sputtering
CN121280638A (en) * 2025-12-10 2026-01-06 浙江大学 Near-infrared auxiliary low light scene three-dimensional reconstruction method based on 3D Gaussian splatter
CN121280588A (en) * 2025-12-10 2026-01-06 南京信息工程大学 A volumetric cloud rendering method and system based on 3D Gaussian splashing
CN121482292A (en) * 2026-01-07 2026-02-06 浙江大学 A Physical Property Inversion and 3D Reconstruction Method Based on Gaussian Splashing and Differentiable Rendering

Similar Documents

Publication Publication Date Title
Kopanas et al. Point‐based neural rendering with per‐view optimization
CN120635288A (en) Scene reconstruction method based on deferred rendering and 3D Gaussian
CN115797561A (en) Three-dimensional reconstruction method, device and readable storage medium
CN103530907B (en) Complicated three-dimensional model drawing method based on images
CA2424705A1 (en) Systems and methods for providing controllable texture sampling
CN102915559A (en) Real-time transparent object GPU (graphic processing unit) parallel generating method based on three-dimensional point cloud
CN118644605B (en) 3D Gaussian-based inverse rendering method, apparatus, equipment and storage medium
CN110335275A (en) A kind of space-time vectorization method of the flow surface based on ternary biharmonic B-spline
Sharma et al. Volumetric rendering with baked quadrature fields
CN119888029A (en) Digital human reconstruction method for focusing joint point perception
Mehta et al. Filtering Environment Illumination for Interactive Physically-Based Rendering in Mixed Reality.
Huang et al. Transparentgs: Fast inverse rendering of transparent objects with gaussians
CN118071909A (en) A Gaussian sputtering method based on delayed reflection calculation
CN120807757B (en) Real-time three-dimensional rendering method, device and medium based on reflection perception
CN115906703B (en) A GPU Fluid Simulation Method for Real-Time Interactive Applications
Chen et al. SP-SeaNeRF: Underwater neural radiance fields with strong scattering perception
Peng et al. Gaussian-plus-SDF SLAM: High-fidelity 3D reconstruction at 150+ fps
CN120219664A (en) A 3D representation method for underwater scenes based on 3D Gaussian splashing
CN119478173B (en) Three-dimensional scene novel view synthesis method based on matched rays
Dai et al. Interactive mixed reality rendering on holographic pyramid
CN118196281A (en) A triangular mesh extraction method based on segmentable neural radiation field
CN116385577A (en) Method and device for generating virtual viewpoint image
Wu et al. Reshadable Impostors with Level‐of‐Detail for Real‐Time Distant Objects Rendering
CN120219528B (en) Coloring model construction method, system, device and medium based on plane Gaussian
CN120259517B (en) Three-dimensional reconstruction method containing specular reflection in dynamic scene based on double environment map

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination