Metric (Semi-)Monocular Depth Estimation
The goal of the project is to augment existing monocular depth estimation models with measured sparse metric depth and fuse the information into accurate metric depth maps.
Very recently, strong monocular depth models [1,2] have been proposed that deliver previously unseen performance. These models estimate fine depth maps, handle transparent and reflective surfaces as well as complex scenes, and are very efficient while generalizing well. They can also deliver metric depth maps, but appear fundamentally limited in this task by operating on a single image.
The idea of this thesis is to augment existing models with measured sparse metric depth and fuse the information via inpainting [3] or constrained diffusion [4] into accurate metric depth maps. We can assume the input comes from posed images in a temporal sequence; the sparse measurements can be obtained by point or line [5] matching and triangulation. To allow for application in an industrial environment, a pixelwise quality or uncertainty estimate could be part of the result.
(1) Yang et al., “Depth Anything V2”, arXiv 2024 (https://github.com/DepthAnything/Depth-Anything-V2)
(2) Hu et al., “DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos”, arXiv 2024 (https://depthcrafter.github.io)
(3) Hu et al., “Deep Depth Completion from Extremely Sparse Data: A Survey”, PAMI 2022 (https://arxiv.org/pdf/2205.05335)
(4) Zhang et al., “Adding Conditional Control to Text-to-Image Diffusion Models”, ICCV 2023 (https://github.com/mikonvergence/ControlNetInpaint)
(5) Liu et al., “3D Line Mapping Revisited”, CVPR 2023 (https://github.com/cvg/limap)
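To give a feel for the fusion problem, here is a minimal sketch (not the prescribed method of the thesis): a relative, affine-invariant monocular depth prediction is aligned to sparse metric measurements by a global least-squares scale/shift fit. Function and variable names are illustrative; a learned inpainting or constrained-diffusion component would replace this naive global alignment and could also produce a dense uncertainty map.

```python
import numpy as np

def align_to_sparse_metric(rel_depth, sparse_depth, sparse_mask):
    """Fit metric = s * rel + t on measured pixels and apply it densely.

    rel_depth:    (H, W) relative depth from a monocular model (e.g. Depth Anything V2)
    sparse_depth: (H, W) metric depth from triangulated point/line matches
    sparse_mask:  (H, W) boolean mask of valid sparse measurements
    """
    x = rel_depth[sparse_mask].ravel()
    y = sparse_depth[sparse_mask].ravel()
    # Solve [x 1] @ [s, t]^T = y in the least-squares sense.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    metric = s * rel_depth + t
    # Residuals at the sparse points give a crude quality signal for the fit;
    # a pixelwise uncertainty estimate would need a learned component.
    residuals = np.abs(s * x + t - y)
    return metric, residuals
```

Such a global fit already exposes the limitation mentioned above: a single scale and shift cannot correct locally inconsistent relative depth, which is exactly where sparse-measurement-guided inpainting or constrained diffusion is expected to help.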
How far can we push Metric Monocular Depth Estimation with Augmentation?
**Planning**
The earliest start will be 1st October 2024 (01.10.2024).
**Benefits**
At Microsoft we cannot provide a regular workplace in our office; you can work on this topic from home or at ETH.
We will set up a weekly meeting schedule where we discuss progress and ideas and decide on next steps; we can meet in the office or online, as desired.
Please send your CV and transcript to chvogel@microsoft.com
Website: www.microsoft.com/en-us/research/lab/mixed-reality-ai-zurich/