Direct Scene Understanding from 3D Point Clouds
To overcome the limitations of camera-based segmentation, this project aims to explore learning-based panoptic segmentation of a scene using point cloud or map data obtained from a LiDAR sensor on HEAP.
Keywords: deep learning, semantic segmentation, point clouds, complementary data
Camera-based panoptic and semantic segmentation approaches have shown promising results in recent years. Unfortunately, the output of visual sensors such as cameras operating within the visible light spectrum degrades strongly under environmental changes such as varying lighting conditions, rain, and fog. Using point clouds for 3D scene understanding could allow excavators to operate more independently of lighting and weather conditions.
The large improvement in the richness of data returned by modern LiDAR sensors allows for the generation of image-like representations [1]. Together with advances in network architectures for processing 3D point clouds [2] and the availability of large-scale datasets [3], this enables us to develop a system specifically tailored to 3D scene understanding from LiDAR data. To generalize to construction environments and handle site-specific classes, supervision from our previously used vision-based networks can be utilized.
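As a concrete illustration of the image-like representations mentioned above, the sketch below projects a raw point cloud onto a 2D range image via spherical projection, the common preprocessing step for range-image-based LiDAR networks. The resolution and vertical field-of-view values are placeholder assumptions, not the specs of the sensor on HEAP.

```python
import numpy as np

def range_image(points, h=64, w=1024, fov_up=15.0, fov_down=-15.0):
    """Project an (N, 3) LiDAR point cloud onto an (h, w) range image
    via spherical projection. Sensor geometry values are placeholders."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)                  # range per point
    yaw = np.arctan2(y, x)                              # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-8), -1.0, 1.0))

    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    u = 0.5 * (1.0 - yaw / np.pi) * w                   # column from azimuth
    v = (1.0 - (pitch - fov_down_r) / (fov_up_r - fov_down_r)) * h  # row from elevation

    u = np.clip(np.floor(u), 0, w - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int32)

    img = np.zeros((h, w), dtype=np.float32)            # 0 = no return
    order = np.argsort(-r)                              # draw far points first,
    img[v[order], u[order]] = r[order]                  # near points overwrite
    return img
```

Such a range image can then be fed to standard 2D convolutional backbones, which is one reason this representation is attractive for real-time LiDAR segmentation.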
Depending on the scope of the project, after evaluating predictions from the raw point cloud, a model that fuses LiDAR and RGB data early in the processing pipeline could be investigated. The hope is that this multi-modal approach increases the robustness of the semantic segmentation model by exploiting the complementary strengths of the two sensors.
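One simple form of the early LiDAR-RGB fusion described above is to "paint" each LiDAR point with the colour of the pixel it projects onto, before feeding the augmented cloud to the network. The sketch below assumes a pinhole camera with intrinsics `K` and an extrinsic transform `T_cam_lidar`; both are hypothetical calibration inputs, not real HEAP values.

```python
import numpy as np

def paint_points(points, image, K, T_cam_lidar):
    """Attach RGB values to (N, 3) LiDAR points by projecting them into a
    camera image; a minimal early-fusion sketch under assumed calibration."""
    n = points.shape[0]
    homo = np.hstack([points, np.ones((n, 1))])         # (N, 4) homogeneous
    cam = (T_cam_lidar @ homo.T).T[:, :3]               # points in camera frame
    in_front = cam[:, 2] > 1e-6                         # keep points ahead of camera

    z = np.where(np.abs(cam[:, 2:3]) < 1e-6, 1.0, cam[:, 2:3])  # avoid div by zero
    uv = (K @ cam.T).T[:, :2] / z                       # pinhole projection
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)

    h, w = image.shape[:2]
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    rgb = np.zeros((n, 3), dtype=float)                 # default: no colour
    rgb[valid] = image[v[valid], u[valid]]
    return np.hstack([points.astype(float), rgb]), valid  # (N, 6) painted cloud
```

The returned validity mask also indicates which points fall outside the camera frustum, where the model would have to rely on geometry alone.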
References:
[1] https://ouster.com/blog/the-camera-is-in-the-lidar/
[2] EfficientLPS: Efficient LiDAR Panoptic Segmentation, Sirohi et al. 2021
[3] https://github.com/waymo-research/waymo-open-datas
- Literature review on object detection and semantic segmentation with 3D point clouds
- Implementation and/or adaptation of an open-source model to our field
- Evaluation on the robot and on datasets
- Optional: integration with RGB data
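For the evaluation step above, a standard metric is mean intersection-over-union (mIoU) over the semantic classes. A minimal sketch via a confusion matrix, assuming integer class labels:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union from integer label arrays.
    Classes absent from both prediction and ground truth are ignored."""
    pred, gt = pred.ravel(), gt.ravel()
    # Confusion matrix: rows = ground truth class, columns = predicted class.
    conf = np.bincount(num_classes * gt + pred,
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(conf).astype(float)                 # correctly labelled points
    union = conf.sum(0) + conf.sum(1) - np.diag(conf)   # predicted + actual - overlap
    present = union > 0                                 # skip classes never seen
    return float(np.mean(inter[present] / union[present]))
```

The same routine applies whether the labels live on range-image pixels or directly on points, which makes it convenient for comparing against both image-based baselines and dataset benchmarks.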
- Experience in Python
- Experience training large neural networks
- Experience in PyTorch and TensorFlow
- Experience with LiDAR sensors and point cloud processing
- Julian Nubert: nubertj@ethz.ch
- Lorenzo Terenzi: lterenzi@ethz.ch