Amazon Web Services Solutions Library

Spot Tagging Bot Solution

What does this Amazon Web Services Solution do?

The Spot Tagging Bot solution enables customers to label digital assets (unstructured data such as photos, PDF documents, and videos) with machine learning models. Labeling assets with machine learning helps customers automate their business processes and build knowledge maps. The solution uses Amazon Simple Storage Service (Amazon S3) to secure customers' assets and Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances to lower customers' costs. The solution is an open source framework: customers can use the built-in bots for free, or contribute to the framework by building their own bots with their own fine-tuned models.

Amazon Web Services Solution Overview

The Spot Tagging Bot solution has two features: the model training feature and the asset tagging feature.

The model training feature allows customers to train their own container-based machine learning models. The trained model is uploaded to Amazon Elastic Container Registry (Amazon ECR).

The asset tagging feature allows customers to use machine learning models to process digital assets. All machine learning tasks run on an elastic computing platform managed by Amazon Batch, which provisions resources on demand, releases them when a task completes, and scales automatically with the task load. The underlying instances for the entire process are Amazon EC2 Spot Instances, which further reduces cost.

Spot Tagging Bot Solution Architecture

Amazon API Gateway sends the request to an Amazon Lambda function.
The Amazon Lambda function recursively scans all asset file paths in Amazon S3 and generates a task list.
The Amazon Lambda function saves the task list to Amazon Elasticsearch Service (Amazon ES).
The Amazon Lambda function launches an Amazon SageMaker inference endpoint.
The Amazon Lambda function launches one or more Amazon Batch jobs through Amazon Step Functions.
Each Amazon Batch job performs its assigned asset task: it reads the source file from the Amazon S3 bucket and calls the Amazon SageMaker endpoint for inference. The inference result is written to the Elasticsearch index, while the recognition result is written to the S3 path the customer specified.
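
The scan-and-dispatch steps above can be illustrated with a minimal Python sketch. It is not the solution's actual Lambda code: the function names, the extension filter, the `PENDING` status field, and the batch size are all hypothetical choices made for illustration. Only the `list_objects_v2` paginator is a real boto3 call, and it is imported lazily so the pure helpers can be tried without AWS credentials.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical set of file extensions a bot might accept (an assumption,
# not taken from the solution's source code).
SUPPORTED_EXTENSIONS = (".jpg", ".jpeg", ".png", ".pdf", ".txt")


def iter_s3_keys(bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every object key under a prefix, following S3 pagination."""
    import boto3  # lazy import: the helpers below work without boto3 installed

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            yield obj["Key"]


def build_task_list(
    bucket: str,
    keys: Iterable[str],
    extensions: Iterable[str] = SUPPORTED_EXTENSIONS,
) -> List[Dict[str, str]]:
    """Turn object keys into task records, skipping folders and other file types."""
    allowed = tuple(extensions)
    tasks = []
    for key in keys:
        if key.endswith("/"):  # zero-byte "folder" placeholder objects
            continue
        if not key.lower().endswith(allowed):
            continue
        tasks.append({"bucket": bucket, "key": key, "status": "PENDING"})
    return tasks


def chunk(tasks: List[Dict[str, str]], size: int) -> List[List[Dict[str, str]]]:
    """Split the task list into groups, one group per batch job."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [tasks[i:i + size] for i in range(0, len(tasks), size)]
```

In a deployment, `build_task_list(bucket, iter_s3_keys(bucket, prefix))` would produce the records to index, and each group returned by `chunk` could be handed to one job. Keeping the listing separate from the filtering makes the filtering logic easy to test locally.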

This solution currently contains three bots: the Car Model Classification Bot, the Sentiment Analysis Bot, and the Scene Text Recognition Bot.

1. Car Model Classification Bot

The Car Model Classification Bot takes vehicle images as input and identifies the car manufacturer, car model, and other information using an image classification model. The model is built with AutoGluon, an open source AutoML framework, and is fine-tuned from a ResNet50 model.

2. Sentiment Analysis Bot
The Sentiment Analysis Bot takes Chinese text as input and identifies its sentiment (positive or negative) in context using a natural language processing model. The model is fine-tuned from a pre-trained Chinese BERT model.

3. Scene Text Recognition Bot
The Scene Text Recognition Bot recognizes printed Chinese text in images. The bot uses two models for text recognition: the CTPN model extracts the coordinates of each text region, and the CRNN model recognizes the text content within those regions.
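
The two-stage flow described above (detect text regions, then recognize each region) can be sketched as follows. This is an illustrative outline only, not the bot's implementation: `detect` and `recognize` are hypothetical stand-ins for the CTPN and CRNN models, the image is modeled as a plain 2D list of pixel values, and the top-to-bottom, left-to-right reading order is an assumption.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[int]]      # rows of pixel values (simplified stand-in)
Box = Tuple[int, int, int, int]      # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the rectangular region described by box out of the image."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in image[top:bottom]]


def recognize_scene_text(
    image: Image,
    detect: Callable[[Image], List[Box]],            # stand-in for the detection model
    recognize: Callable[[List[List[int]]], str],     # stand-in for the recognition model
) -> List[Tuple[Box, str]]:
    """Run detection, then recognition on each region, in reading order."""
    results = []
    # Assumed reading order: sort boxes by top edge, then by left edge.
    for box in sorted(detect(image), key=lambda b: (b[1], b[0])):
        results.append((box, recognize(crop(image, box))))
    return results
```

Passing the two models in as callables keeps the pipeline independent of any particular framework, so either stage can be swapped or stubbed out during testing.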

Spot Tagging Bot

Version 1.0.0
Last updated: 08/2020
Author: Amazon Web Services

Estimated deployment time: 20 min

Source code

View deployment guide

Launch solution in the Amazon Web Services Console

Features

Easy to Use

The built-in pre-trained models in this solution make it easy for customers to use machine learning technologies. The Amazon Web Services services used in this solution, such as Amazon ECS, Amazon SageMaker, and Amazon Batch, enable customers to easily start, scale, and manage bot tasks. Task execution is visualized: customers can check task status at any time through the Amazon Step Functions console.

Cost Optimization

The Amazon EC2 Spot Instances used in the solution allow customers to run workloads at large scale. Customers can save significant costs when running massive workloads and can also speed up workloads by running tasks in parallel. Spot Instances can reduce costs by up to 90% compared to On-Demand Amazon EC2 Instances.

Open Source & Customization

All the models included in this solution are open source and come with a container-based training framework. Customers can train models with their own labeled data in the training framework and import the training results into the bot framework. The deployed Amazon Web Services environment automatically detects the imported model and uses it in bot jobs.