Indoor synthetic data generation: A systematic review

Example renderings from the same scene using different tools.

Abstract

Objective: Deep learning-based object recognition, 6D pose estimation, and semantic scene understanding require large amounts of training data to generalize. Time-consuming annotation processes, together with privacy and security concerns, lead to a scarcity of real-world datasets. To overcome this lack of data, synthetic data generation has been proposed, including multiple facets of domain randomization to extend the data distribution. The objective of this review is to identify methods applied for synthetic data generation that aim to improve 6D pose estimation, object recognition, and semantic scene understanding in indoor scenarios. We further review methods used to extend the data distribution and discuss best practices for bridging the gap between synthetic and real-world data.

Methods: We adhered to the PRISMA guidelines for systematic reviews. We searched three databases (IEEE Xplore, Springer Link, and ACM) and conducted an additional manual search. In total, we identified 241 studies and included 34 in our systematic review.

Conclusion: In summary, synthetic data generation has been performed using crop-out methods, graphics APIs, 3D modeling or authoring tools, or game engine-based methods. To extend the data distribution, varying scene parameters, such as lighting conditions or textures, and the use of distracting objects in the scene are promising

Publication
Computer Vision and Image Understanding

We gratefully acknowledge funding for this study from d.hip campus and the Bundesministerium für Bildung und Forschung (BMBF) under grant number 16SV8973.