VGGT: Lightning-Fast 3D Scene Reconstruction from Images

2025-03-25

Facebook Research introduces VGGT (Visual Geometry Grounded Transformer), a feed-forward neural network that infers the key 3D attributes of a scene (extrinsic and intrinsic camera parameters, point maps, depth maps, and 3D point tracks) from one, a few, or hundreds of views within seconds. The model is built on a Transformer and comes with an interactive 3D visualization tool. Notably, VGGT also handles single-view reconstruction well, achieving results competitive with state-of-the-art monocular methods even though it was not explicitly trained for this task.
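
To make the listed outputs more concrete, here is a minimal sketch of how a depth map, camera intrinsics, and camera extrinsics relate to a point map under a standard pinhole camera model. It is not VGGT's code or API: the function names `unproject_depth` and `camera_to_world`, the parameters `fx`, `fy`, `cx`, `cy`, and the camera-to-world convention for the extrinsics are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def unproject_depth(
    depth: Sequence[Sequence[float]],
    fx: float, fy: float, cx: float, cy: float,
) -> List[List[Point]]:
    """Lift a depth map to per-pixel 3D points in the camera frame.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels. These names are illustrative only.
    """
    point_map = []
    for v, row in enumerate(depth):          # v: pixel row index
        out_row = []
        for u, z in enumerate(row):          # u: pixel column index
            x = (u - cx) * z / fx            # back-project along the ray
            y = (v - cy) * z / fy
            out_row.append((x, y, z))
        point_map.append(out_row)
    return point_map


def camera_to_world(
    p: Point,
    R: Sequence[Sequence[float]],
    t: Sequence[float],
) -> Point:
    """Apply a rigid transform X_world = R * p + t.

    Assumes the extrinsics are given as a camera-to-world rotation R (3x3)
    and translation t (3,). Real systems often store the inverse.
    """
    return tuple(
        sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)
    )


# Tiny example: a flat 2x2 depth map, 2 units in front of the camera.
depth = [[2.0, 2.0], [2.0, 2.0]]
points = unproject_depth(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
world_point = camera_to_world(points[0][0], identity, [0.0, 0.0, 1.0])
```

In this sketch the point map is simply the depth map back-projected through the intrinsics and then moved into a shared world frame by the extrinsics, which is why the quantities named above describe the same scene geometry from different angles.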

AI