Visual Localization on 3D Scene Graphs
In this project, we propose to investigate if scene graph technology can assist place recognition in changing environments by considering the semantic information about the environment.
Keywords: localization, 3D scene graph
To operate reliably, robots need to know their precise location in the world at every point in time. The localization process typically starts with a “place recognition” task: the robot must recognize whether it has previously visited a particular place. Traditional approaches describe the scene through local image patches or global image descriptors, which do not account for the semantic content of the image. In this project, we therefore investigate how to incorporate semantic information about the objects the robot observes into the image descriptors, so that the robot becomes better at recognizing places and thus localizes more reliably.
In this project, we propose to investigate whether scene-graph technology can assist place recognition in changing environments by exploiting semantic information about the environment. We draw inspiration from the SGAligner [1] approach and will investigate its application to place recognition. A potential way of tackling this problem is to jointly learn embeddings of hierarchical scene-graph nodes and images. By hierarchy, we mean that a room node's high-dimensional descriptor is distilled from the object nodes it contains. The image and graph descriptors are learned such that they can be robustly matched.
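To make the hierarchical idea concrete, here is a minimal sketch of how a room-level descriptor could be distilled from its object embeddings and then matched against a query image descriptor. All names, the embedding dimension, and the mean-pooling aggregation are illustrative assumptions, not part of the actual project pipeline; in practice the aggregation and the encoders would be learned jointly.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64  # illustrative embedding dimension

def distill_room_embedding(obj_embs: np.ndarray) -> np.ndarray:
    """Distill a room-node descriptor from its object embeddings.
    Simple mean pooling stands in for a learned aggregation."""
    room = obj_embs.mean(axis=0)
    return room / np.linalg.norm(room)

def match_image_to_rooms(image_desc: np.ndarray,
                         room_descs: np.ndarray) -> int:
    """Return the index of the room whose descriptor is most
    similar (cosine similarity) to the query image descriptor."""
    image_desc = image_desc / np.linalg.norm(image_desc)
    sims = room_descs @ image_desc  # cosine, since rows are unit-norm
    return int(np.argmax(sims))

# Three hypothetical rooms, each with random stand-in object embeddings
# (in the real system these would come from a learned object encoder).
rooms = np.stack([
    distill_room_embedding(rng.normal(size=(4, DIM))),
    distill_room_embedding(rng.normal(size=(5, DIM))),
    distill_room_embedding(rng.normal(size=(6, DIM))),
])

# A query image descriptor that is close to room 1's descriptor
# should be matched back to room 1.
query = rooms[1] + 0.05 * rng.normal(size=DIM)
print(match_image_to_rooms(query, rooms))
```

In a learned version, the pooling and the image encoder would be trained with a matching objective (e.g. a contrastive loss) so that image and graph descriptors end up in a shared space, which is what allows the robust matching mentioned above.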
[1] Sayan Deb Sarkar, Ondrej Miksik, Marc Pollefeys, Daniel Barath, and Iro Armeni. SGAligner: 3D Scene Alignment with Scene Graphs. https://arxiv.org/abs/2304.14880