Gaussian Splatting SLAM [CVPR 2024]
The "Gaussian Splatting SLAM" paper presents a real-time 3D reconstruction method using 3D Gaussians, achieving high-quality reconstruction even of small and transparent objects. The work was supported by Dyson Technology Ltd.
The paper "Gaussian Splatting SLAM," presented at CVPR 2024, introduces a novel approach to incremental 3D reconstruction using a single moving monocular or RGB-D camera. The method operates in real-time at 3 frames per second and employs 3D Gaussians as the sole representation for tracking, mapping, and rendering. Key innovations include a new camera tracking formulation that optimizes directly against 3D Gaussians, enhancing tracking speed and robustness. Additionally, the method incorporates geometric verification and regularization to address ambiguities in dense reconstruction. The full SLAM system demonstrates state-of-the-art performance in novel view synthesis and trajectory estimation, successfully reconstructing small and transparent objects. The authors, Hidenobu Matsuki, Riku Murai, Paul H. J. Kelly, and Andrew J. Davison, acknowledge support from Dyson Technology Ltd. and express gratitude to various contributors for their insights. The research showcases results from self-captured sequences, highlighting the method's capability to reconstruct scenes in real-time using RGB images from an Intel RealSense D455 camera.
- The Gaussian Splatting SLAM method operates in real-time at 3fps.
- It utilizes 3D Gaussians for efficient tracking and mapping.
- The system achieves high-quality reconstruction of small and transparent objects.
- The research was supported by Dyson Technology Ltd.
- The authors contributed equally to the work presented.
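In rough terms, "optimizing directly against 3D Gaussians" means the camera pose is adjusted so that the image splatted from the current Gaussian map matches the live frame, with the gradient of a photometric error flowing back into the pose. Below is a minimal sketch of that idea, not the paper's implementation: it uses PyTorch, and render_gaussians is a hypothetical differentiable rasterizer standing in for the authors' renderer.

    # Minimal sketch of photometric pose tracking against a Gaussian map.
    # render_gaussians is a hypothetical differentiable rasterizer; the
    # paper's actual tracking uses its own renderer and derivations.
    import torch

    def se3_exp(xi):
        # Exponential map: 6-vector twist [v | w] -> 4x4 SE(3) matrix.
        v, w = xi[:3], xi[3:]
        zero = torch.zeros((), dtype=xi.dtype)
        skew = torch.stack([
            torch.stack([zero, -w[2], w[1]]),
            torch.stack([w[2], zero, -w[0]]),
            torch.stack([-w[1], w[0], zero]),
        ])
        top = torch.cat([skew, v.unsqueeze(1)], dim=1)            # 3x4 block
        return torch.matrix_exp(torch.cat([top, torch.zeros(1, 4)], dim=0))

    def track_frame(gaussians, observed_rgb, pose_init, iters=100, lr=1e-3):
        # Optimize a small se(3) increment on top of the previous pose guess.
        xi = torch.zeros(6, requires_grad=True)
        opt = torch.optim.Adam([xi], lr=lr)
        for _ in range(iters):
            opt.zero_grad()
            pose = se3_exp(xi) @ pose_init
            rendered = render_gaussians(gaussians, pose)          # hypothetical
            loss = (rendered - observed_rgb).abs().mean()         # photometric L1
            loss.backward()
            opt.step()
        return (se3_exp(xi) @ pose_init).detach()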
Related
Mip-Splatting: Alias-Free 3D Gaussian Splatting
The paper introduces Mip-Splatting, enhancing 3D Gaussian Splatting by addressing artifacts with a 3D smoothing filter and a 2D Mip filter, achieving alias-free renderings and improved image fidelity in 3D rendering applications.
WildGaussians: 3D Gaussian Splatting in the Wild
A novel method, WildGaussians, enhances 3D scene reconstruction for in-the-wild data by combining DINO features and appearance modeling with 3D Gaussian Splatting. It outperforms NeRFs and 3DGS in handling dynamic scenes.
Shape of Motion: 4D Reconstruction from a Single Video
Shape of Motion reconstructs 4D scenes from monocular videos by modeling 3D motion. It uses SE(3) motion bases and data-driven priors for global consistency. The method excels in 3D/2D motion estimation and view synthesis, as shown in experiments. Comparisons with other methods are included.
GLOMAP – Global Structure-from-Motion Revisited
The paper introduces GLOMAP, a new system for 3D structure recovery and camera motion estimation, outperforming COLMAP in accuracy and speed, and is available as open-source software.
InstantSplat: Sparse-View SfM-Free Gaussian Splatting in Seconds
InstantSplat is a new framework for novel view synthesis from sparse images, reducing training time significantly and improving 3D scene reconstruction efficiency without relying on traditional Structure-from-Motion methods.
My desk is currently set up such that I have a large monitor in the middle. I'd like to look at the center of the screen when taking calls. I'd also like it to appear as though I am looking straight into the camera, and the camera is pointed at my face. Obviously, I cannot physically place the camera right in front of the monitor, as that would be seriously inconvenient. Some laptops solve this, but I don't think their methods apply here, as the top of my monitor ends up quite a bit higher than what would look "good" for simple eye correction.
I have multiple webcams that I can place around the monitor to my liking. I would like to have something similar to what is seen when you open this webpage, but for video, and hopefully at higher quality since I'm not constrained to a monocular source.
I've dabbled a bit with OpenCV in the past, but the most I've done is a little camera calibration for de-warping fisheye lenses. Any ideas on what work I should look into to get started with this?
In my head, I'm picturing two camera sources: one above and one below the monitor. The "synthetic" projected perspective would be in the middle of the two.
Is capturing a point cloud from a stereo source and then reprojecting with splats the most "straightforward" way to do this? Any and all papers/advice are welcome. I'm a little rusty on the math side but I figure a healthy mix of Szeliski's Computer Vision, Wolfram Alpha, a chatbot, and of course perseverance will get me there.
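A minimal sketch of that stereo-to-reprojection idea with OpenCV, assuming the pair is already calibrated and rectified (cv2.stereoCalibrate / cv2.stereoRectify) and that both frames have been rotated 90° so the above/below baseline becomes horizontal; the SGBM settings, K_virtual, and the half-baseline offset are all assumptions, not a tuned setup.

    # Rough sketch: dense depth from a rectified stereo pair, then a naive
    # point "splat" into a virtual camera halfway along the baseline.
    import cv2
    import numpy as np

    def synth_middle_view(left_gray, right_gray, left_bgr, Q, K_virtual,
                          baseline, out_size):
        # 1. Dense disparity between the rectified views.
        sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128,
                                     blockSize=5, P1=8 * 3 * 25, P2=32 * 3 * 25)
        disp = sgbm.compute(left_gray, right_gray).astype(np.float32) / 16.0

        # 2. Back-project to a 3D point cloud with the rectification matrix Q.
        pts3d = cv2.reprojectImageTo3D(disp, Q)
        mask = disp > 0
        pts, colors = pts3d[mask], left_bgr[mask]

        # 3. Reproject into a virtual camera shifted by half the baseline.
        rvec = np.zeros(3)
        tvec = np.array([baseline / 2.0, 0.0, 0.0])
        uv, _ = cv2.projectPoints(pts.reshape(-1, 1, 3), rvec, tvec,
                                  K_virtual, None)
        uv = uv.reshape(-1, 2).astype(int)

        # 4. Naive z-buffered splat of the points into the output image.
        w, h = out_size
        out = np.zeros((h, w, 3), np.uint8)
        zbuf = np.full((h, w), np.inf)
        for (u, v), c, p in zip(uv, colors, pts):
            if 0 <= u < w and 0 <= v < h and 0 < p[2] < zbuf[v, u]:
                zbuf[v, u], out[v, u] = p[2], c
        return out

Hole filling and view blending between the two real cameras would still be needed on top of this; the splat-based papers above essentially replace step 4 with something far better behaved.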
This is all well and good when you are just using it for a pretty visualization, but it appears Gaussians have the same weakness as point clouds processed with structure from motion, in that you need lots of camera angles to get accurate surface reconstruction.
Are there any examples or algorithms that can turn this into 3D objects that could be used in a video game? Any examples of someone doing that?
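One quick-and-dirty route, if you just want something a game engine can load, is to treat the reconstructed points (or the exported Gaussian centers) as an oriented point cloud and run classic Poisson surface reconstruction; it inherits the same coverage problem mentioned above. A hedged Open3D sketch, where the .ply path is just a placeholder for however you export the points:

    # Rough sketch: mesh a point cloud (e.g. exported Gaussian centers) with
    # Poisson surface reconstruction in Open3D. "splat_centers.ply" is a
    # placeholder; the result is only as good as the view coverage.
    import numpy as np
    import open3d as o3d

    pcd = o3d.io.read_point_cloud("splat_centers.ply")
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))

    mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=9)

    # Drop low-density vertices, which tend to be hallucinated surface far
    # from any input points, then export for a game engine.
    densities = np.asarray(densities)
    mesh.remove_vertices_by_mask(densities < np.quantile(densities, 0.05))
    o3d.io.write_triangle_mesh("mesh.obj", mesh)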