Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting
Existing feed-forward Gaussian Splatting methods can't scale to 4K. LGTM is the first native 4K feed-forward method that predicts compact textured Gaussians.
How LGTM Works
Existing feed-forward 3D Gaussian Splatting methods predict pixel-aligned primitives, leading to a quadratic growth in primitive count as resolution increases. This fundamentally limits their scalability, making high-resolution synthesis such as 4K intractable. Additionally, standard 3DGS couples appearance and geometry within each primitive, so richly textured regions on geometrically simple surfaces require an excessive number of Gaussians.
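To make the quadratic growth concrete, here is a minimal arithmetic sketch. It assumes exactly one Gaussian per input pixel per view, which is a simplification of how pixel-aligned methods behave; the function name `pixel_aligned_count` and the two-view setup are illustrative, not taken from any specific method.

```python
def pixel_aligned_count(width: int, height: int, views: int = 1) -> int:
    """Primitive count when every input pixel yields one Gaussian (assumed)."""
    return width * height * views


# Common 16:9 resolutions; each step doubles the side length.
RESOLUTIONS = {
    "540p": (960, 540),
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}

for name, (w, h) in RESOLUTIONS.items():
    count = pixel_aligned_count(w, h, views=2)
    print(f"{name}: {count:,} Gaussians from 2 views")
```

Under this assumption, each doubling of the side length quadruples the primitive count, so two 4K views alone would already produce about 16.6 million Gaussians.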
We introduce LGTM, the first native 4K feed-forward textured Gaussians method for high-resolution novel view synthesis. LGTM supports native 4K inputs and predicts 4K output in a single feed-forward pass. LGTM jointly trains:
- Gaussian primitive network: Predicts a compact set of Gaussian primitives.
- Texture network: Processes high-resolution inputs to predict per-Gaussian RGBA texture maps.
By jointly training these networks, LGTM predicts compact Gaussian primitives with per-primitive textures, decoupling geometric complexity from rendering resolution. This enables high-fidelity 4K novel view synthesis without per-scene optimization, a capability previously intractable for feed-forward methods, while using significantly fewer Gaussian primitives.
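The general idea of a per-primitive RGBA texture can be sketched as follows. This is not LGTM's renderer: it is a simplified 2D illustration that assumes the texture spans three standard deviations on each side of the Gaussian center and that texture alpha multiplies the usual Gaussian falloff. The names `bilinear_sample`, `shade`, and `extent` are hypothetical.

```python
import math


def bilinear_sample(texture, u, v):
    """Bilinearly sample an RGBA texture (rows of RGBA tuples) at u, v in [0, 1]."""
    rows, cols = len(texture), len(texture[0])
    x = min(max(u, 0.0), 1.0) * (cols - 1)
    y = min(max(v, 0.0), 1.0) * (rows - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    fx, fy = x - x0, y - y0
    out = []
    for c in range(4):
        top = texture[y0][x0][c] * (1 - fx) + texture[y0][x1][c] * fx
        bottom = texture[y1][x0][c] * (1 - fx) + texture[y1][x1][c] * fx
        out.append(top * (1 - fy) + bottom * fy)
    return tuple(out)


def shade(texture, local_x, local_y, extent=3.0):
    """Return (rgb, opacity) at a point in the Gaussian's local frame.

    local_x and local_y are measured in standard deviations from the center.
    The texture is assumed to cover [-extent, extent] on both axes.
    """
    falloff = math.exp(-0.5 * (local_x ** 2 + local_y ** 2))
    u = (local_x / extent + 1.0) / 2.0
    v = (local_y / extent + 1.0) / 2.0
    r, g, b, a = bilinear_sample(texture, u, v)
    return (r, g, b), a * falloff


# A 2x2 texture on a single primitive: four colors, fully opaque texels.
texture = [
    [(1, 0, 0, 1), (0, 1, 0, 1)],
    [(0, 0, 1, 1), (1, 1, 1, 1)],
]
center_rgb, center_alpha = shade(texture, 0.0, 0.0)
corner_rgb, corner_alpha = shade(texture, -3.0, -3.0)
```

In this sketch a single primitive carries color variation across its footprint, so appearance detail can grow with texture resolution while the number of primitives stays fixed.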
Key Advantages of LGTM
- Native 4K feed-forward reconstruction: First method to support native 4K inputs and predict 4K output in a single feed-forward pass without per-scene optimization.
- Significantly fewer primitives: Achieves high-quality rendering with far fewer Gaussian primitives than existing methods, improving both efficiency and scalability.
- Broadly applicable: Works with various baseline methods across monocular, two-view, and multi-view settings, with or without known camera poses.
- Superior visual quality: Consistently outperforms baselines with sharper details and better texture reproduction.