Author: Guilherme Correa de Oliveira (Lattes CV)
Abstract
Underwater exploration is hindered by environmental conditions that limit the use of optical cameras, making imaging sonars a viable alternative. However, sonar images are inherently ambiguous and noisy, which complicates their interpretation for 3D reconstruction. While machine learning can mitigate these issues, its application is restricted by the scarcity of suitable datasets. This dissertation addresses these challenges by introducing a deep learning-based methodology that resolves sonar image ambiguity by estimating the elevation angle needed for 3D reconstruction. A central contribution of this work is the Synthetic Enclosed Echoes (SEE) dataset, a new, comprehensive collection of annotated synthetic and real-world sonar data, created within a high-fidelity simulation of a physical test tank. To process this data, ElevateNET-R is proposed: a regression-based neural network adapted to predict the per-pixel elevation angle from a single 2D sonar image. Quantitative experiments demonstrate that ElevateNET-R consistently outperforms existing methods from the literature, including classical approaches and other learning-based models. Furthermore, the methodology was validated in a sim-to-real experiment in which the model, trained exclusively on the synthetic SEE dataset, successfully performed 3D reconstruction on real-world sonar data. In summary, the main contributions are the public release of the expansive SEE dataset and its simulation environment to foster further research, and the successful validation of a regression-based network for correcting sonar ambiguity.
Keywords: Sonar Image; 3D Reconstruction; Underwater Robotics
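To illustrate why a per-pixel elevation estimate is enough for 3D reconstruction: a 2D imaging sonar pixel already encodes range and bearing, so adding the elevation angle fixes a point in space through the standard spherical-to-Cartesian conversion. The sketch below is not taken from the dissertation. The function names, the axis convention (x forward, y along bearing, z up), and the use of `None` to mark pixels without an echo are all assumptions made for illustration.

```python
import math


def sonar_pixel_to_xyz(range_m, bearing_rad, elevation_rad):
    """Convert one (range, bearing, elevation) triple to sensor-frame x, y, z.

    Assumed convention: x points forward, y grows with bearing, z grows with
    elevation. A real sensor may use a different frame.
    """
    horizontal = range_m * math.cos(elevation_rad)  # projection on the x-y plane
    x = horizontal * math.cos(bearing_rad)
    y = horizontal * math.sin(bearing_rad)
    z = range_m * math.sin(elevation_rad)
    return x, y, z


def reconstruct(ranges_m, bearings_rad, elevation_map):
    """Build a point cloud from a hypothetical per-pixel elevation map.

    elevation_map[i][j] is the predicted elevation (radians) for range bin i
    and beam j, or None where no echo was detected (an assumed encoding).
    """
    points = []
    for i, r in enumerate(ranges_m):
        for j, theta in enumerate(bearings_rad):
            phi = elevation_map[i][j]
            if phi is None:
                continue  # skip pixels without a predicted elevation
            points.append(sonar_pixel_to_xyz(r, theta, phi))
    return points
```

Without the elevation angle, every point on the arc at a given range and bearing maps to the same pixel, which is the ambiguity the abstract refers to.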