metadata

license: apache-2.0
pipeline_tag: image-to-3d
tags:
  - dino
  - scene-understanding
  - semantic-scene-completion
  - unsupervised
library_name: pytorch

Feed-Forward SceneDINO for Unsupervised Semantic Scene Completion

Aleksandar Jevtić^1,* Christoph Reich^1,2,4,5,* Felix Wimbauer^1,4 Oliver Hahn^2 Christian Rupprecht^3 Stefan Roth^2,5,6 Daniel Cremers^1,4,5

^1 TU Munich ^2 TU Darmstadt ^3 University of Oxford ^4 MCML ^5 ELIZA ^6 hessian.AI *equal contribution

Overview

SceneDINO is an unsupervised method that infers 3D geometry and features from a single image in one feed-forward pass. Distilling and clustering SceneDINO's 3D feature field yields unsupervised semantic scene completion predictions. The model is trained using multi-view self-supervision.
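To make the clustering step concrete, the following is a minimal sketch and not the authors' pipeline. It assumes the feature field has already been sampled onto a voxel grid and that a boolean occupancy mask is available; random arrays stand in for model outputs. Plain cosine k-means is used as a stand-in for the clustering, and the distillation step is omitted. The function name `cluster_feature_field` and all shapes are hypothetical.

```python
import numpy as np


def cluster_feature_field(features, occupied, num_clusters, num_iters=20, seed=0):
    """Assign a cluster ID to every occupied voxel of a 3D feature field.

    features: float array (X, Y, Z, C), one feature vector per voxel.
    occupied: bool array (X, Y, Z), True where geometry is predicted.
    Returns an int array (X, Y, Z); free voxels are labelled -1.
    """
    rng = np.random.default_rng(seed)
    feats = features[occupied]  # (N, C) features of occupied voxels
    # Normalise so that dot products are cosine similarities.
    feats = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)

    # Initialise centroids from randomly chosen occupied voxels.
    centroids = feats[rng.choice(len(feats), size=num_clusters, replace=False)]
    for _ in range(num_iters):
        # Assign each voxel to its most similar centroid.
        assign = np.argmax(feats @ centroids.T, axis=1)
        for k in range(num_clusters):
            members = feats[assign == k]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                mean = members.mean(axis=0)
                centroids[k] = mean / (np.linalg.norm(mean) + 1e-8)

    assign = np.argmax(feats @ centroids.T, axis=1)
    labels = np.full(occupied.shape, -1, dtype=np.int64)
    labels[occupied] = assign
    return labels


# Toy stand-in for model outputs (assumption): random features on a small grid.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 8, 4, 16)).astype(np.float32)
occupied = rng.random((8, 8, 4)) < 0.3
labels = cluster_feature_field(features, occupied, num_clusters=5)
```

The resulting cluster IDs are arbitrary pseudo-classes. Comparing them against ground-truth semantic classes would require an additional matching step, which is outside the scope of this sketch.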

Installation & Quick Start

Please refer to our GitHub repository.

Citation

If you find our work useful, please consider giving it a star ⭐ and citing our paper.

@inproceedings{Jevtic:2025:SceneDINO,
    author  = {Aleksandar Jevti{\'c} and
               Christoph Reich and
               Felix Wimbauer and
               Oliver Hahn and
               Christian Rupprecht and
               Stefan Roth and
               Daniel Cremers},
    title   = {Feed-Forward {SceneDINO} for Unsupervised Semantic Scene Completion},
    booktitle = {IEEE/CVF International Conference on Computer Vision (ICCV)},
    year    = {2025},
}