[PhD thesis topic 2023] Analysis of dynamic scenes from an implicit neural representation (NeRF) based on LiDAR-camera data

Team: TANGRAM (joint team between LORIA and Inria Nancy Grand Est)

Supervisors: Gilles Simon (TANGRAM, HDR) and Renato Martins (Université de Bourgogne and associate member of TANGRAM)

Contacts: gilles.simon@loria.fr ; renato.martins@u-bourgogne.fr

Application deadline: 15/05/2023

Keywords: Analysis and 3D reconstruction of dynamic scenes; multimodal and asynchronous data; NeRF (neural radiance fields); LiDAR-camera coupling; deep learning-based SLAM

Context:
LiDAR-camera data coupling is becoming increasingly common and is notably used for scene perception and analysis in applications such as autonomous driving, semantic 3D model acquisition and augmented reality. While RGB-D data is widely used for scene analysis and in recent deep learning-based Simultaneous Localization and Mapping (SLAM) approaches, the sparse and asynchronous nature of LiDAR data makes analysis more complex, especially in dynamic environments. LiDAR measurements are notably affected by distortions introduced both by the sensor's motion and by the camera's rolling shutter.

To address these scientific issues in the context of SLAM and relocalization in dynamic outdoor scenes, this thesis will explore the contribution of recent scene representation methods based on neural implicit surfaces, and in particular neural radiance fields (NeRF) [1]. Building on this concept, several works have improved the quality of reconstructed surfaces using RGB-D data [2] or have handled locally deformable scenes [3]. Others have used these models for localization [4] or within SLAM and structure-from-motion pipelines [5,6] in static environments. A central objective of this thesis is to study the relevance of such neural models for localization with LiDAR data and in dynamic environments.
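As a reminder of the underlying formulation in [1], NeRF encodes a scene as a network predicting, for each 3D point and viewing direction, a density and a view-dependent color; a pixel is rendered along its camera ray with the standard discrete volume-rendering approximation (notation follows the original paper):

\[
  \hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right) \mathbf{c}_i,
  \qquad T_i = \exp\!\Big(-\sum_{j<i} \sigma_j \delta_j\Big),
\]

where \(\sigma_i\) and \(\mathbf{c}_i\) are the density and color predicted at the i-th sample along the ray \(\mathbf{r}(t) = \mathbf{o} + t\,\mathbf{d}\), and \(\delta_i = t_{i+1} - t_i\) is the spacing between consecutive samples; training minimizes the photometric error between \(\hat{C}(\mathbf{r})\) and the observed pixel color.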

Subject:
The following points will be addressed:

• Spatio-temporal alignment of LiDAR-camera data within the NeRF representation process, taking into account the asynchronous nature of the sensors: a rolling-shutter camera and a LiDAR whose acquisition dynamics differ (see the illustrative sketch after this list).

• Dynamic scene representation: the objective will be to extend the use of implicit neural representations to dynamic scenes with LiDAR-camera data. Recent deep learning-based SLAM methods have been proposed, but most of them deal with rigid scenes, and their ability to generalize to viewpoints far from the training trajectories remains to be assessed. In this work, we will focus on dynamic, open-loop scenes containing moving objects with simple kinematics (piecewise-rigid objects).

• To make the modeling procedure more flexible, we will also investigate adapting an existing neural model to a new object of the same category from only a few images, as well as incorporating geometric priors present in the scene to guide the learning of the neural representation.
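To illustrate the first point above, here is a minimal sketch of one common way to handle per-measurement timestamps; all function names, pose parametrizations and timing conventions are hypothetical and only serve as an illustration, not as a prescribed design. Each camera scanline and each LiDAR return is assigned its own continuous-time sensor pose, obtained by interpolating between two keyframe poses, before rays or points are expressed in a common world frame that can supervise the neural representation.

# Illustrative sketch only: continuous-time pose interpolation for asynchronous
# rolling-shutter camera rows and LiDAR returns (hypothetical conventions).
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def interpolate_pose(t, t0, t1, R0, R1, p0, p1):
    """Interpolate a sensor pose (rotation R, position p) at time t between two
    keyframe poses given at times t0 < t1."""
    alpha = np.clip((t - t0) / (t1 - t0), 0.0, 1.0)
    t_c = t0 + alpha * (t1 - t0)                               # keep time inside [t0, t1]
    R_t = Slerp([t0, t1], Rotation.concatenate([R0, R1]))(t_c)  # rotation slerp
    p_t = (1.0 - alpha) * p0 + alpha * p1                       # translation lerp
    return R_t, p_t

def rolling_shutter_ray(row, col, K_inv, t_frame, line_dt, t0, t1, R0, R1, p0, p1):
    """World-space ray for pixel (row, col) of a rolling-shutter image, where each
    image row is exposed at its own time t_frame + row * line_dt."""
    t_row = t_frame + row * line_dt
    R_t, p_t = interpolate_pose(t_row, t0, t1, R0, R1, p0, p1)
    d_cam = K_inv @ np.array([col + 0.5, row + 0.5, 1.0])       # pixel -> camera-frame direction
    d_world = R_t.apply(d_cam / np.linalg.norm(d_cam))          # rotate into the world frame
    return p_t, d_world                                          # ray origin and direction

def lidar_point_to_world(p_sensor, t_point, t0, t1, R0, R1, p0, p1):
    """Undistort one LiDAR return by using the sensor pose at its own timestamp."""
    R_t, p_t = interpolate_pose(t_point, t0, t1, R0, R1, p0, p1)
    return R_t.apply(p_sensor) + p_t

Both modalities then live in a single consistent spatio-temporal frame before being fed to the radiance field; in practice the interpolated poses would themselves be refined jointly with the neural model, which is beyond the scope of this sketch.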

Skills:

• Master 2 or engineering degree in computer science, image processing/computer vision or statistical learning.

• Programming experience in Python and in a deep learning framework (PyTorch, TensorFlow, …).

• Previous experience and/or interest in image analysis, computer graphics or computer vision is a plus.

Working Environment:

The PhD student will join the TANGRAM team, shared by LORIA and Inria Nancy Grand Est (see https://team.inria.fr/tangram/), and will benefit from its research environment and its expertise in image processing and analysis. He/she will also benefit from the skills of M.-O. Berger (scientific leader of the TANGRAM team) and of C. Demonceaux, faculty member at the University of Burgundy and associate member of the TANGRAM team.

Bibliography:
[1] B. Mildenhall, P. Srinivasan, M. Tancik, J. Barron, R. Ramamoorthi, R. Ng. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. ECCV 2020.

[2] B. Roessle, J. Barron, B. Mildenhall, P. Srinivasan, M. Niessner. Dense Depth Priors for Neural Radiance Fields from Sparse Input Views. CVPR 2022.

[3] K. Park, U. Sinha, J. Barron, S. Bouaziz, D. Goldman, S. Seitz, R. Martin-Brualla. Nerfies: Deformable Neural Radiance Fields. ICCV 2021.

[4] L. Yen-Chen, P. Florence, J. Barron, A. Rodriguez, P. Isola, T.-Y. Lin. iNeRF: Inverting Neural Radiance Fields for Pose Estimation. IROS 2021. arXiv:2012.05877.

[5] Z. Zhu, S. Peng, V. Larsson, Z. Cui, M. Oswald, A. Geiger, M. Pollefeys. NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM. https://arxiv.org/pdf/2302.03594.pdf

[6] H. Matsuki, K. Tateno, M. Niemeyer, F. Tombari. NEWTON: Neural View-Centric Mapping for On-the-Fly Large-Scale SLAM. https://arxiv.org/pdf/2303.13654.pdf