A visual encyclopedia of one of the most important modern techniques for turning ordinary photos into photorealistic, navigable 3D scenes. Instead of triangles or a heavy neural network, the scene becomes a cloud of soft, colored 3D ellipsoids called Gaussians.
Gaussian Splatting is a 3D scene representation and rendering method. It learns many small, transparent, colored Gaussian blobs from calibrated photos, then projects those blobs into the camera view and blends them to create a new image.
A Gaussian is a smooth probability-shaped blob. In 3D Gaussian Splatting, each blob acts like a tiny soft piece of visible matter. Each one stores where it is, how stretched it is, how transparent it is, and what color it emits from different viewing directions.
The scene is not a mesh. It is also not hidden inside a neural network. It is an explicit set of optimized primitives that can be projected quickly onto the screen.
- **Mean position:** The center of the Gaussian in 3D space. It tells the renderer where the blob lives.
- **Covariance:** The size, orientation, and stretch of the ellipsoid. This is why splats can become thin, wide, or elongated.
- **Opacity:** How much the Gaussian contributes to the final pixel. Transparent blending is essential to the method.
- **Color:** Often stored with spherical harmonics, allowing the same splat to look different from different directions.
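Taken together, these parameters can be sketched as one small record per splat. The following is an illustrative NumPy sketch, not the layout of any particular implementation; the class and field names are made up, and the covariance is assembled as R S Sᵀ Rᵀ so the ellipsoid can stretch anisotropically.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Gaussian3D:
    """One splat. All names here are illustrative, not from any codebase."""
    mean: np.ndarray       # (3,) center in world space
    scale: np.ndarray      # (3,) per-axis stretch of the ellipsoid
    rotation: np.ndarray   # (4,) unit quaternion (w, x, y, z) orienting it
    opacity: float         # alpha in [0, 1]
    sh_coeffs: np.ndarray  # (k, 3) spherical-harmonic RGB coefficients

    def covariance(self) -> np.ndarray:
        """Sigma = R S S^T R^T: rotation times squared per-axis scales."""
        w, x, y, z = self.rotation
        R = np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])
        S = np.diag(self.scale)
        return R @ S @ S.T @ R.T
```

Storing scale and rotation separately (rather than a raw 3×3 matrix) keeps the covariance positive semi-definite by construction during optimization.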
The method begins like photogrammetry: capture several images, estimate camera poses, and create an initial sparse point cloud. Then the Gaussian parameters are optimized so rendered views match the original photos.
1. **Capture:** Take overlapping photos or frames of a mostly static scene.
2. **Pose estimation:** Structure-from-motion recovers camera poses and sparse points.
3. **Initialization:** Place one or more 3D Gaussians around the sparse points.
4. **Optimization:** Adjust position, size, rotation, opacity, and color to match the photos.
5. **Densification:** Split, clone, grow, or prune splats where detail is missing or unnecessary.
6. **Rendering:** Project splats onto the screen and alpha-blend them in visibility order.
This 2D toy demo shows the rendering intuition. Each circle or ellipse is a soft Gaussian contribution. The final image is built by accumulating many semi-transparent splats rather than drawing hard polygons.
In real 3D Gaussian Splatting, these are projected ellipses from 3D ellipsoids. The renderer sorts or tiles visible splats and blends them using alpha compositing.
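The same intuition fits in a few lines of NumPy: a couple of hand-picked elliptical splats accumulated into an image buffer with "over" blending. All positions, covariances, opacities, and colors below are arbitrary examples.

```python
import numpy as np

# Toy 2D "splatting": accumulate soft elliptical footprints into an image.
H, W = 64, 64
yy, xx = np.mgrid[0:H, 0:W]
image = np.zeros((H, W, 3))

# (center_x, center_y, 2x2 covariance, opacity, rgb color) -- invented values
splats = [
    (20.0, 24.0, np.array([[40.0, 12.0], [12.0, 10.0]]), 0.8,
     np.array([1.0, 0.2, 0.2])),
    (40.0, 36.0, np.array([[15.0, 0.0], [0.0, 60.0]]), 0.6,
     np.array([0.2, 0.4, 1.0])),
]

for cx, cy, cov, opacity, rgb in splats:
    inv = np.linalg.inv(cov)
    dx, dy = xx - cx, yy - cy
    # Evaluate the Gaussian footprint (Mahalanobis distance) at every pixel
    q = inv[0, 0]*dx*dx + 2*inv[0, 1]*dx*dy + inv[1, 1]*dy*dy
    alpha = opacity * np.exp(-0.5 * q)       # per-pixel soft coverage
    image = image * (1 - alpha[..., None]) + rgb * alpha[..., None]  # "over"
```

No polygon edges exist anywhere: every pixel is a weighted accumulation of overlapping soft contributions.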
Rendering means converting the 3D Gaussian cloud into a 2D image. The important trick is that each 3D ellipsoid becomes a 2D elliptical footprint in the camera view. The renderer evaluates and blends the splats that touch each pixel.
1. **Projection:** Transform each 3D Gaussian into camera space and approximate its footprint as a 2D ellipse.
2. **Culling and tiling:** Discard invisible splats and group the rest by screen tiles for fast GPU processing.
3. **Sorting:** Use depth ordering so closer splats correctly occlude what lies behind them.
4. **Blending:** Accumulate color and transparency until each pixel reaches its final rendered value.
5. **Training feedback:** During training, compare the rendered image with real photos and update the Gaussian parameters.
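The blending step is classic front-to-back alpha compositing. Here is a per-pixel sketch with invented values; real renderers run this per tile on the GPU.

```python
def composite(splats):
    """Front-to-back 'over' compositing for one pixel.

    splats: (alpha, rgb) pairs already sorted nearest-first.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # how much light still passes through to this depth
    for alpha, rgb in splats:
        weight = transmittance * alpha
        color = [c + weight * ch for c, ch in zip(color, rgb)]
        transmittance *= (1.0 - alpha)
        if transmittance < 1e-4:  # early exit once the pixel is saturated
            break
    return color

# A nearly opaque red splat in front hides most of the blue one behind it:
# the result is roughly [0.9, 0.0, 0.09].
pixel = composite([(0.9, (1.0, 0.0, 0.0)), (0.9, (0.0, 0.0, 1.0))])
```

The early-exit test is one reason the method is fast: once a pixel is effectively opaque, everything behind it can be skipped.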
Although the technique is called Gaussian Splatting, splitting is an important optimization idea: when a Gaussian is too large or cannot explain image detail, it can be split into smaller Gaussians. Weak or unnecessary splats can also be pruned.
- **Split:** A large or high-error Gaussian can become several smaller ones, improving detail around edges, corners, foliage, or thin structures.
- **Clone:** Useful Gaussians may be duplicated near areas where the optimizer needs more local capacity.
- **Prune:** Nearly invisible, unstable, or redundant Gaussians can be removed to reduce memory and improve performance.
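These heuristics amount to a simple pass over the splat list. The thresholds, field names, and split factor below are invented for illustration; real implementations typically drive the split/clone decision with accumulated view-space position gradients.

```python
import numpy as np

GRAD_THRESH = 0.0002   # high image-space gradient -> under-reconstructed area
SCALE_THRESH = 0.01    # large splats get split, small ones get cloned
MIN_OPACITY = 0.005    # nearly invisible splats are pruned

def densify_and_prune(splats):
    """Sketch of one densification pass over a list of splat dicts."""
    kept = []
    for s in splats:
        if s["opacity"] < MIN_OPACITY:
            continue                      # prune: contributes almost nothing
        if s["grad"] > GRAD_THRESH:
            if s["scale"] > SCALE_THRESH:
                # split: replace one large splat with two smaller, jittered ones
                for _ in range(2):
                    kept.append(dict(s, scale=s["scale"] / 1.6,
                                     mean=s["mean"]
                                     + np.random.normal(0, s["scale"], 3)))
                continue
            kept.append(dict(s))          # clone: duplicate a small splat
        kept.append(s)
    return kept
```

Run periodically during training, this keeps capacity where the photometric error says detail is missing and reclaims memory everywhere else.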
Gaussian Splatting sits between classic graphics and neural rendering. It has the explicit, rasterization-friendly nature of graphics primitives, but it learns appearance from photos like neural radiance field methods.
| Method | Representation | Strengths | Weaknesses | Best fit |
|---|---|---|---|---|
| Mesh rendering | Triangles, materials, textures | Editable, standard in games/CAD, physically meaningful surfaces | Hard to reconstruct perfect real scenes automatically; transparency and complex appearance can be difficult | Games, engineering, animation, product visualization |
| Photogrammetry | Dense geometry plus texture maps | Produces measurable geometry and traditional assets | Can struggle with reflections, textureless areas, holes, and heavy cleanup | Surveying, heritage, asset capture |
| NeRF | Implicit neural radiance field | High visual quality and elegant continuous representation | Often slower to render because many samples and network evaluations are required | Research, view synthesis, compact learned scenes |
| 3D Gaussian Splatting | Explicit cloud of optimized transparent ellipsoids | Excellent visual quality, fast rendering, direct rasterization-like pipeline | Memory can be high; editing and simulation are less mature than meshes | Real-time captured scenes, VR previews, digital twins, immersive media |
Its main strength is fast, high-quality novel-view synthesis: moving a virtual camera through a captured real place or object while preserving photographic appearance.
- Capture monuments, rooms, artifacts, and excavation sites as immersive navigable scenes. *(photoreal capture, archives)*
- Create walk-throughs from photo or video capture without manually modeling every surface. *(virtual tours, AEC)*
- Rapidly turn real environments into background assets for previs, VFX, and scene blocking. *(VFX, previsualization)*
- Real-time rendering makes captured spaces more practical for immersive experiences. *(immersive, real time)*
- Captured radiance fields can help with visual scene understanding and synthetic viewpoint generation. *(mapping, training data)*
- Objects can be captured with rich appearance and inspected interactively from many angles. *(e-commerce, 3D assets)*

Gaussian Splatting is powerful, but not magic. It works best when input views are well-covered, camera poses are accurate, and the scene is mostly static and consistently lit.
- **Dynamic content:** Moving people, changing lights, water, vegetation, and traffic can create inconsistent training signals.
- **Reflections and refractions:** Mirrors, glass, glossy metals, and refractions can be difficult because they change strongly with viewpoint.
- **Coverage gaps:** Unseen backsides, narrow gaps, or badly covered angles may produce holes, floaters, or blurry regions.
- **Memory:** High-quality scenes may require many Gaussians, which can be heavier than a compact neural representation.
- **Editing and simulation:** Moving walls, changing object topology, or adding physical simulation is harder than with triangle meshes.
- **Surface quality:** The result may look photorealistic but still lack clean, watertight, measurable surfaces.
These answers summarize the most important practical concepts behind the technique.
**Is 3D Gaussian Splatting the same as NeRF?** No. Both are used for novel-view synthesis, but NeRF commonly stores a scene inside an implicit neural network, while 3D Gaussian Splatting stores an explicit set of optimized Gaussian primitives that can be rasterized quickly.
**What does "anisotropic" mean here?** An anisotropic Gaussian can stretch differently in different directions. That lets one primitive approximate a flat surface patch, an elongated edge, or a thin visual feature better than a simple sphere.
**What is a "splat"?** A splat is the screen-space footprint of a primitive. In this method, a 3D Gaussian projects to a soft 2D ellipse, and that ellipse contributes color and opacity to pixels.
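For intuition, the footprint is computed with the standard EWA-style approximation Sigma_2D = J W Sigma Wᵀ Jᵀ, where J is the Jacobian of the perspective projection and W the camera rotation. The sketch below assumes an identity camera rotation (so W drops out) and an invented focal length.

```python
import numpy as np

def project_covariance(cov3d, mean_cam, f=500.0):
    """Project a 3D covariance to its 2D screen-space footprint.

    Assumes identity camera rotation; f is a made-up focal length in pixels.
    """
    x, y, z = mean_cam                 # splat center in camera space, z > 0
    # Jacobian of the perspective projection (u, v) = (f*x/z, f*y/z)
    J = np.array([
        [f / z, 0.0,   -f * x / z**2],
        [0.0,   f / z, -f * y / z**2],
    ])
    return J @ cov3d @ J.T             # 2x2 screen-space covariance

# A small isotropic splat 2 units in front of the camera, on the optical axis
cov2d = project_covariance(np.eye(3) * 0.01, np.array([0.0, 0.0, 2.0]))
```

The 1/z factors in the Jacobian are what make distant splats shrink on screen, exactly as perspective demands.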
**Does the method output a mesh?** Not directly. The output is a cloud of Gaussian ellipsoids. Mesh extraction is possible with additional processing, but the native representation is not a triangle mesh.
**Why is rendering fast?** The explicit Gaussian primitives can be processed with a rasterization-like GPU pipeline. The renderer avoids sampling a dense volume everywhere and focuses on visible splats.
A compact vocabulary for reading papers, code, and tutorials about Gaussian Splatting.