Few-click segmentation mask labeling in Amazon SageMaker Ground Truth Plus
In 2019,
Extreme point clicking with a flexible number of corrective clicks
We have now enhanced the tool to accept extra clicks on boundary points, which give real-time feedback to the ML model and let you create a more accurate segmentation mask. In the following example, the initial segmentation result isn’t accurate because of the weak boundaries near the shadow. Importantly, the tool works interactively: it doesn’t require you to specify all points at once. Instead, you first make four mouse clicks, which trigger the ML model to produce a segmentation mask. You can then inspect this mask, locate any inaccuracies, and place additional clicks as appropriate to “nudge” the model toward the correct result.
Our previous labeling tool allowed you to place exactly four mouse clicks (red dots). The initial segmentation result (shaded red area) isn’t accurate because of the weak boundaries near the shadow (bottom-left of red mask).
With our enhanced labeling tool, you again start by making four mouse clicks (red dots in top figure). You then have the opportunity to inspect the resulting segmentation mask (shaded red area in top figure). You can make additional mouse clicks (green dots in bottom figure) to cause the model to refine the mask (shaded red area in bottom figure).
Compared with the original version of the tool, the enhanced version provides an improved result when objects are deformable, non-convex, and vary in shape and appearance.
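To make the starting point of this workflow concrete, the following is a minimal sketch of how the four initial extreme clicks could be derived from a binary mask, for example when simulating an annotator. It assumes the mask is a nested list of 0/1 values; the `extreme_points` helper is hypothetical and is not part of the Ground Truth tool.

```python
from typing import List, Tuple

Mask = List[List[int]]   # rows of 0/1 values; 1 marks an object pixel
Point = Tuple[int, int]  # (row, column)


def extreme_points(mask: Mask) -> List[Point]:
    """Return the top-most, bottom-most, left-most, and right-most object pixels."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask contains no object pixels")
    top = min(pixels, key=lambda p: p[0])
    bottom = max(pixels, key=lambda p: p[0])
    left = min(pixels, key=lambda p: p[1])
    right = max(pixels, key=lambda p: p[1])
    return [top, bottom, left, right]


# Hypothetical toy mask used only for illustration.
example_mask = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 1, 0],
]
clicks = extreme_points(example_mask)
print(clicks)
```

When several pixels tie for an extreme (as in the bottom row here), this sketch keeps the first one in row-major order; a human annotator could reasonably pick any of them.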
We simulated the performance of this improved tool on sample data by first running the baseline tool (with only four extreme clicks) to generate a segmentation mask and evaluating its mean Intersection over Union (mIoU), a common measure of accuracy for segmentation masks. Then we applied simulated corrective clicks and evaluated the improvement in mIoU after each simulated click. The following table summarizes these results. The first row shows the mIoU, and the second row shows the error (which is given by 100% minus the mIoU). With only five additional mouse clicks, we can reduce the error by 9 percentage points (from 27% to 18%) for this task!
Number of Corrective Clicks | 0 (Baseline) | 1 | 2 | 3 | 4 | 5 |
mIoU (%) | 72.72 | 76.56 | 77.62 | 78.89 | 80.57 | 81.73 |
Error (%) | 27 | 23 | 22 | 21 | 19 | 18 |
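As a rough illustration of the metric behind these numbers, the sketch below computes IoU for a pair of binary masks and then reproduces the error arithmetic from the reported mIoU values. The `iou` helper and the toy masks are illustrative assumptions; the post does not describe the actual evaluation code or the exact averaging convention behind mIoU.

```python
from typing import List

Mask = List[List[int]]  # rows of 0/1 values; 1 marks an object pixel


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over Union of two binary masks of the same shape."""
    inter = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly; treat that case as IoU = 1.
    return inter / union if union else 1.0


# Hypothetical toy masks: 2 shared pixels out of 4 covered pixels.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
print(iou(pred, truth))

# mIoU values (in %) from the table: baseline, then 1 to 5 corrective clicks.
miou_by_clicks = [72.72, 76.56, 77.62, 78.89, 80.57, 81.73]
errors = [100 - m for m in miou_by_clicks]
absolute_reduction = errors[0] - errors[-1]          # in percentage points
relative_reduction = absolute_reduction / errors[0]  # fraction of baseline error
print(f"{absolute_reduction:.2f} points, {relative_reduction:.0%} of baseline error")
```

The distinction matters when quoting the result: the error falls by about 9 percentage points, which is roughly one third of the baseline error.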
Integration with Ground Truth and performance profiling
To integrate this model with Ground Truth, we follow a standard architecture pattern as shown in the following diagram. First, we build the ML model into a Docker image and deploy it to
You can follow this generic pattern for your own use cases for purpose-built ML tools and to integrate them with custom Ground Truth task UIs. For more information, refer to
After provisioning this architecture and deploying our model using the
In the following figure, we show the ModelLatency metric natively emitted by SageMaker real-time inference endpoints. We can easily use various metric math functions in CloudWatch to show latency percentiles, such as p50 or p90 latency.
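The same percentiles can also be retrieved programmatically. The following is a minimal sketch using the CloudWatch `get_metric_statistics` API through boto3, assuming boto3 is installed and AWS credentials are configured. The endpoint name, the `AllTraffic` variant name, and both helper functions are placeholders for illustration. `ModelLatency` is reported in microseconds, so the sketch converts to milliseconds.

```python
from datetime import datetime, timedelta, timezone


def build_latency_query(endpoint_name, variant_name="AllTraffic", hours=1, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/SageMaker",
        "MetricName": "ModelLatency",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # aggregate into 5-minute buckets
        "ExtendedStatistics": ["p50", "p90"],
    }


def fetch_latency_percentiles_ms(endpoint_name, variant_name="AllTraffic"):
    """Query CloudWatch and return p50/p90 latency per period, in milliseconds."""
    import boto3  # imported lazily so the query builder works without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **build_latency_query(endpoint_name, variant_name)
    )
    datapoints = sorted(response["Datapoints"], key=lambda d: d["Timestamp"])
    return [
        {
            "timestamp": d["Timestamp"],
            "p50_ms": d["ExtendedStatistics"]["p50"] / 1000.0,  # microseconds to ms
            "p90_ms": d["ExtendedStatistics"]["p90"] / 1000.0,
        }
        for d in datapoints
    ]


# "my-segmentation-endpoint" is a hypothetical endpoint name.
query = build_latency_query(
    "my-segmentation-endpoint", now=datetime(2024, 1, 1, tzinfo=timezone.utc)
)
print(query["MetricName"], query["ExtendedStatistics"])
```

Separating the query construction from the network call keeps the parameters easy to inspect and reuse, for example when comparing several endpoints over the same time window.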
The following table summarizes these results for our enhanced extreme clicking tool for semantic segmentation for three instance types: p2.xlarge, p3.2xlarge, and g4dn.xlarge. Although the p3.2xlarge instance provides the lowest latency, the g4dn.xlarge instance provides the best cost-to-performance ratio. The g4dn.xlarge instance is only 8% slower (35 milliseconds) than the p3.2xlarge instance, but it is 81% less expensive on an hourly basis than the p3.2xlarge (see
SageMaker Instance Type | p90 Latency (ms) |
p2.xlarge | 751 |
p3.2xlarge | 424 |
g4dn.xlarge | 459 |
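The latency comparison can be checked with a few lines of arithmetic. The sketch below uses only the p90 values from the table. The `slowdown` and `hourly_saving` helpers are illustrative, and no hourly prices are hard-coded because they vary by Region and over time; look up current pricing before applying `hourly_saving`.

```python
# p90 latency in milliseconds, taken from the table.
p90_latency_ms = {"p2.xlarge": 751, "p3.2xlarge": 424, "g4dn.xlarge": 459}


def slowdown(candidate: str, baseline: str):
    """Return (absolute difference in ms, relative difference) vs. the baseline."""
    delta = p90_latency_ms[candidate] - p90_latency_ms[baseline]
    return delta, delta / p90_latency_ms[baseline]


def hourly_saving(candidate_price: float, baseline_price: float) -> float:
    """Fraction saved per hour by choosing the candidate over the baseline."""
    return 1 - candidate_price / baseline_price


delta_ms, relative = slowdown("g4dn.xlarge", "p3.2xlarge")
print(f"g4dn.xlarge is {delta_ms} ms ({relative:.0%}) slower than p3.2xlarge")
```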
Conclusion
In this post, we introduced an extension to the Ground Truth auto segment feature for semantic segmentation annotation tasks. Whereas the original version of the tool allows you to make exactly four mouse clicks, which triggers a model to provide a high-quality segmentation mask, the extension enables you to make corrective clicks and thereby update and guide the ML model to make better predictions. We also presented a basic architectural pattern that you can use to deploy and integrate interactive tools into Ground Truth labeling UIs. Finally, we summarized the model latency, and showed how the use of SageMaker real-time inference endpoints makes it easy to monitor model performance.
To learn more about how this tool can reduce labeling cost and increase accuracy, visit
About the authors
Jonathan Buck is a Software Engineer at Amazon Web Services working at the intersection of machine learning and distributed systems. His work involves productionizing machine learning models and developing novel software applications powered by machine learning to put the latest capabilities in the hands of customers.
Li Erran Li is the applied science manager at human-in-the-loop services, Amazon Web Services AI, Amazon. His research interests are 3D deep learning, and vision and language representation learning. Previously he was a senior scientist at Alexa AI, the head of machine learning at Scale AI, and the chief scientist at Pony.ai. Before that, he was with the perception team at Uber ATG and the machine learning platform team at Uber, working on machine learning for autonomous driving, machine learning systems, and strategic initiatives of AI. He started his career at Bell Labs and was an adjunct professor at Columbia University. He co-taught tutorials at ICML’17 and ICCV’19, and co-organized several workshops at NeurIPS, ICML, CVPR, and ICCV on machine learning for autonomous driving, 3D vision and robotics, machine learning systems, and adversarial machine learning. He has a PhD in computer science from Cornell University. He is an ACM Fellow and IEEE Fellow.