Introduction
Machine learning depends on access to large amounts of high-quality data, which is essential for training and validating systems that can accurately identify objects and subjects. Manually capturing and labeling images is laborious, error-prone, and time-intensive, and it is often the bottleneck in machine learning projects. Today, we share how Ekumen helped a manufacturing client overcome the limitations of manual image capture and labeling by implementing a photorealistic simulation for automatic synthetic data generation.
The Challenge
The client, from the manufacturing industry, needed to identify packages moving along conveyor belts and being deposited into containers. Their internal team faced several challenges:
Manual image capture: Thousands of pictures of boxes in various positions and conditions were required, taken either in a mock environment or at a collaborating client's facility.
Efficiency limitations and bottlenecks: The manual process was inefficient and costly in both time and human resources, which hindered scalability.
Data variance limitations: Datasets, whether from mock environments or existing client conditions, lacked the variability needed for robust ML training. New client facilities presented different package types, lighting conditions, and sensor positions, which led to poor model generalization and slowed deployment in those facilities.
The Solution
To address these challenges, Ekumen implemented a photorealistic simulation using NVIDIA Omniverse Isaac Sim and its OpenUSD support, then built a pipeline for scene randomization and automatic dataset generation.
Boxes cannot simply be placed programmatically and still look as disorganized as they do when they arrive on the real conveyor belt. To better reflect these scenarios, the randomization pipeline developed by Ekumen positions the boxes with NVIDIA PhysX. It also implements several performance improvements that let the physics engine run as fast as possible before all photorealistic capabilities are enabled and the captures are taken at the highest possible quality.
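The two-phase idea described above (settle the scene cheaply first, pay for the expensive capture afterwards) can be sketched in plain Python. This is a toy illustration only, not Ekumen's pipeline and not the Isaac Sim or PhysX API: the `Box` class, the pose ranges, and the functions `spawn_boxes`, `settle`, `capture`, and `generate_sample` are all hypothetical names, and the "physics" models nothing more than free fall onto the belt, whereas a real engine would also resolve collisions and tumbling.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Box:
    x: float    # position along the belt, in meters
    z: float    # height of the bottom face above the belt, in meters
    vz: float   # vertical velocity, in meters per second
    yaw: float  # rotation about the vertical axis, in radians


def spawn_boxes(n, rng):
    """Randomize initial poses above the belt (ranges are assumptions)."""
    return [
        Box(
            x=rng.uniform(0.0, 2.0),
            z=rng.uniform(0.2, 1.0),
            vz=0.0,
            yaw=rng.uniform(-math.pi, math.pi),
        )
        for _ in range(n)
    ]


def settle(boxes, dt=0.01, g=9.81, max_steps=10_000):
    """Phase 1: step the toy physics with no rendering until every box rests."""
    for step in range(max_steps):
        moving = False
        for b in boxes:
            if b.z > 0.0 or b.vz != 0.0:
                b.vz -= g * dt
                b.z += b.vz * dt
                if b.z <= 0.0:
                    # Perfectly inelastic landing: the box stops on the belt.
                    b.z, b.vz = 0.0, 0.0
                else:
                    moving = True
        if not moving:
            return step + 1
    return max_steps


def capture(boxes):
    """Phase 2: stand-in for the expensive render; returns ground truth only."""
    return [{"x": b.x, "z": b.z, "yaw": b.yaw} for b in boxes]


def generate_sample(seed, n_boxes=5):
    """Produce one reproducible sample: randomize, settle, then capture."""
    rng = random.Random(seed)
    boxes = spawn_boxes(n_boxes, rng)
    steps = settle(boxes)
    return {"seed": seed, "physics_steps": steps, "annotations": capture(boxes)}
```

Seeding each sample makes a scene reproducible, so a problematic image can be regenerated later for inspection. Keeping `settle` separate from `capture` mirrors the design choice in the text: the costly step runs once per scene, only after the boxes have come to rest.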
Benefits obtained
- Precise 3D ground truth information: Provides exact data on the position and shape of each object, significantly enhancing system training. The types of data that can be captured are highly varied, ranging from RGB images and semantic segmentation to depth images, instance segmentation, and 2D/3D bounding boxes.
- Reduced costs and time: Eliminates the need for human intervention in data capture and labeling, dramatically reducing data production costs and time.
- Scalability and efficiency: Because no human intervention is needed, data can be generated in the cloud and uploaded for direct model ingestion. Data production can be scaled up by spinning up additional cloud instances.
- High-quality data: The simulation's precision ensured that variations in lighting and shadows did not compromise package identification. Additionally, randomizing uncommon parameters (such as extremely shiny colors) helps prevent model overfitting.
- Photorealism: Using tools oriented towards photorealism helps close the sim2real gap.
- Scalability: The technology can be adapted to other industries and requirements, extending its use beyond manufacturing.
- Reliable ground truth: Datasets remain reusable, with accurate ground truth, regardless of the model input format.
- Adaptability: Training the model on new types of parcels is as simple as creating a new asset and including it in the randomization pipeline. This allows quick adaptation and eases deployment of the model for new clients with different needs.
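To illustrate why exact 3D ground truth is so useful, the sketch below derives a 2D bounding box from a known 3D box by projecting its corners through a pinhole camera model. It is a minimal example under stated assumptions, not the annotation code used in the project: the function names are hypothetical, the camera intrinsics (`fx`, `fy`, `u0`, `v0`) are made-up values, and the box is assumed to be axis-aligned in the camera frame with the z axis pointing forward.

```python
from itertools import product


def box_corners(center, size):
    """Return the 8 corners of an axis-aligned box in the camera frame."""
    px, py, pz = center
    sx, sy, sz = size
    return [
        (px + dx * sx / 2, py + dy * sy / 2, pz + dz * sz / 2)
        for dx, dy, dz in product((-1, 1), repeat=3)
    ]


def project(point, fx, fy, u0, v0):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + u0, fy * y / z + v0)


def bbox_2d(center, size, fx=600.0, fy=600.0, u0=320.0, v0=240.0):
    """Tight 2D box (u_min, v_min, u_max, v_max) around the projected corners."""
    pixels = [project(p, fx, fy, u0, v0) for p in box_corners(center, size)]
    us, vs = zip(*pixels)
    return (min(us), min(vs), max(us), max(vs))
```

For example, `bbox_2d((0.0, 0.0, 2.0), (0.4, 0.2, 0.4))` returns a box centered on the principal point, because the parcel sits on the optical axis. In a simulator the 3D pose is known exactly, so labels like this can be computed rather than drawn by hand.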
Applicability across industries
In a conversation with Christian Barcelo, one of our Senior Simulation Specialists, it became clear that this methodology isn’t limited to manufacturing. Any industry that relies on camera-based object identification, such as robotics, autonomous vehicles, or aerial lifting equipment, can benefit from this innovative technology.
Conclusion
Ekumen’s implementation of photorealistic simulations has proven to be an effective and scalable solution for optimizing machine learning projects in the manufacturing industry. By reducing dependency on manual labor and enhancing data accuracy, this methodology offers a valuable tool for any company looking to innovate and improve its automated identification processes.
If you’re interested in learning more about how we can help implement this solution or need guidance on adapting this technology to your industry, please feel free to contact our team: contact@ekumenlabs.com