Direct Scene Understanding from 3D Point Clouds
To overcome the limitations of camera-based segmentation, this project aims to explore learning-based panoptic segmentation of a scene using point cloud or map data obtained from a LiDAR sensor on HEAP.
Keywords: deep learning, semantic segmentation, point clouds, complementary data
Camera-based panoptic and semantic segmentation approaches have shown promising results in recent years. Unfortunately, the output of visual sensors such as cameras operating within the visible light spectrum degrades strongly under environmental changes such as varying lighting conditions, rain, and fog. Using point clouds for 3D scene understanding could allow excavators to operate more independently of lighting and weather conditions.
The large improvement in the richness of data returned by modern LiDAR sensors allows for the generation of image-like representations [1]. Together with advances in network architectures for processing 3D point clouds [2] and the availability of large-scale datasets [3], this enables us to develop a system specifically tailored to 3D scene understanding from LiDAR data. To generalize to construction environments and handle site-specific classes, supervision from our previously used vision-based networks can be utilized.
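As a concrete illustration of the image-like representations mentioned above, the sketch below projects a raw point cloud onto a 2D range image via spherical projection, the common preprocessing step for range-image-based LiDAR networks. The resolution and vertical field-of-view values are placeholder assumptions, not the specs of the sensor on HEAP.

```python
import numpy as np

def range_image(points, h=64, w=1024, fov_up=15.0, fov_down=-15.0):
    """Project an (N, 3) LiDAR point cloud onto an (h, w) range image
    via spherical projection. Sensor geometry values are placeholders."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)                  # range per point
    yaw = np.arctan2(y, x)                              # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-8), -1.0, 1.0))

    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    u = 0.5 * (1.0 - yaw / np.pi) * w                   # column from azimuth
    v = (1.0 - (pitch - fov_down_r) / (fov_up_r - fov_down_r)) * h  # row from elevation

    u = np.clip(np.floor(u), 0, w - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int32)

    img = np.zeros((h, w), dtype=np.float32)            # 0 = no return
    order = np.argsort(-r)                              # draw far points first,
    img[v[order], u[order]] = r[order]                  # near points overwrite
    return img
```

Such a range image can then be fed to standard 2D convolutional backbones, which is one reason this representation is attractive for real-time LiDAR segmentation.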
Depending on the scope of the project, after evaluating predictions from the raw point cloud, a model that fuses LiDAR and RGB data early in the processing pipeline could be investigated. The hope is that this multi-modal approach increases the robustness of the semantic segmentation model by exploiting the complementary strengths of the two sensors.
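One simple form of the early LiDAR-RGB fusion described above is to "paint" each LiDAR point with the colour of the pixel it projects onto, before feeding the augmented cloud to the network. The sketch below assumes a pinhole camera with intrinsics `K` and an extrinsic transform `T_cam_lidar`; both are hypothetical calibration inputs, not real HEAP values.

```python
import numpy as np

def paint_points(points, image, K, T_cam_lidar):
    """Attach RGB values to (N, 3) LiDAR points by projecting them into a
    camera image; a minimal early-fusion sketch under assumed calibration."""
    n = points.shape[0]
    homo = np.hstack([points, np.ones((n, 1))])         # (N, 4) homogeneous
    cam = (T_cam_lidar @ homo.T).T[:, :3]               # points in camera frame
    in_front = cam[:, 2] > 1e-6                         # keep points ahead of camera

    z = np.where(np.abs(cam[:, 2:3]) < 1e-6, 1.0, cam[:, 2:3])  # avoid div by zero
    uv = (K @ cam.T).T[:, :2] / z                       # pinhole projection
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)

    h, w = image.shape[:2]
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    rgb = np.zeros((n, 3), dtype=float)                 # default: no colour
    rgb[valid] = image[v[valid], u[valid]]
    return np.hstack([points.astype(float), rgb]), valid  # (N, 6) painted cloud
```

The returned validity mask also indicates which points fall outside the camera frustum, where the model would have to rely on geometry alone.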
References:
[1] https://ouster.com/blog/the-camera-is-in-the-lidar/
[2] EfficientLPS: Efficient LiDAR Panoptic Segmentation, Sirohi et al. 2021
[3] https://github.com/waymo-research/waymo-open-datas
- Literature review on object detection and semantic segmentation with 3D point clouds
- Implementation and/or adaptation of an open-source model to our field
- Evaluation on the robot and on datasets
- Optional: integration with RGB data
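For the evaluation step above, a standard metric is mean intersection-over-union (mIoU) over the semantic classes. A minimal sketch via a confusion matrix, assuming integer class labels:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union from integer label arrays.
    Classes absent from both prediction and ground truth are ignored."""
    pred, gt = pred.ravel(), gt.ravel()
    # Confusion matrix: rows = ground truth class, columns = predicted class.
    conf = np.bincount(num_classes * gt + pred,
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(conf).astype(float)                 # correctly labelled points
    union = conf.sum(0) + conf.sum(1) - np.diag(conf)   # predicted + actual - overlap
    present = union > 0                                 # skip classes never seen
    return float(np.mean(inter[present] / union[present]))
```

The same routine applies whether the labels live on range-image pixels or directly on points, which makes it convenient for comparing against both image-based baselines and dataset benchmarks.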
- Experience in Python
- Experience training large neural networks
- Experience in PyTorch and TensorFlow
- Experience with LiDAR sensors and point cloud processing
- Julian Nubert: nubertj@ethz.ch
- Lorenzo Terenzi: lterenzi@ethz.ch