Cross-Domain Synthetic-to-Real In-the-Wild
Depth and Normal Estimation for 3D Scene Understanding

1 Indian Institute of Technology Madras

2 UNC Chapel Hill

OmniCV 2024 | Best Paper Award

Abstract



We present a cross-domain inference technique that learns from synthetic data to estimate depth and normals for in-the-wild omnidirectional 3D scenes encountered in real-world uncontrolled settings. To this end, we introduce UBotNet, an architecture that combines UNet and Bottleneck Transformer elements to predict consistent scene normals and depth. We also introduce the OmniHorizon synthetic dataset containing 24,335 omnidirectional images that represent a wide variety of outdoor environments, including buildings, streets, and diverse vegetation. This dataset is generated from expansive, lifelike virtual spaces and encompasses dynamic scene elements, such as changing lighting conditions, different times of day, pedestrians, and vehicles. Our experiments show that UBotNet achieves significantly improved accuracy in both depth and normal estimation compared to existing models. Lastly, we validate cross-domain synthetic-to-real depth and normal estimation on real outdoor images using UBotNet trained solely on our synthetic OmniHorizon dataset, demonstrating the potential of both the synthetic dataset and the proposed network for real-world scene understanding applications.



Dataset


The OmniHorizon dataset was generated using Unreal Engine 4 and features color images, scene depth, and world normals in a top-bottom (stereo) format, all rendered at 1024 x 512 resolution. The dataset provides 24,335 omnidirectional views covering a variety of outdoor scenarios, including parks, markets, traffic junctions, underpasses, uneven terrain, buildings, vehicles, and pedestrians.
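Since each sample stacks the two stereo views vertically, loading a render typically involves splitting it along the height axis. A minimal sketch in Python (the function name `split_top_bottom` is an illustration, not part of any official loader, and the example resolution assumes 1024 x 512 for the full stacked render):

```python
import numpy as np


def split_top_bottom(image: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Split a vertically stacked (top-bottom) stereo render into two views.

    Assumes the two views are stacked along the height axis, so the input
    height must be even. Works for H x W or H x W x C arrays.
    """
    height = image.shape[0]
    if height % 2 != 0:
        raise ValueError("Expected an even height for a top-bottom stereo image")
    top = image[: height // 2]
    bottom = image[height // 2 :]
    return top, bottom


# Hypothetical 512 x 1024 RGB render: each half is then 256 x 1024.
stereo = np.zeros((512, 1024, 3), dtype=np.uint8)
top_view, bottom_view = split_top_bottom(stereo)
print(top_view.shape, bottom_view.shape)  # (256, 1024, 3) (256, 1024, 3)
```

The same split applies unchanged if 1024 x 512 instead refers to each individual view (i.e., a 1024 x 1024 stacked render), since the function only requires an even height.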

Features of the OmniHorizon dataset

Outdoor Scenarios

Dynamic Lighting

Vehicles

Virtual Avatars

Benchmarks

Quantitative and qualitative benchmark results on OmniHorizon. See the paper and supplementary material for complete details.

Results on real-world images in the wild

Download

The OmniHorizon dataset was developed by the team at Touchlab, IIT Madras.

This work is supported by XTIC (eXperiential Technology Innovation Centre).

Please use the links below to access the corresponding resources:



Paper
(arXiv)


Code
Coming soon!


Dataset
Coming soon!

If you find our work useful, please consider citing:

@article{omnihorizon,
  title={Cross-Domain Synthetic-to-Real In-the-Wild Depth and Normal Estimation for 3D Scene Understanding},
  author={Bhanushali, Jay and Chakravarthula, Praneeth and Muniyandi, Manivannan},
  journal={arXiv preprint arXiv:2212.05040},
  year={2024}
}