Amazon SageMaker Neo
Train models once, run anywhere with up to 2x performance improvement
Overview
Amazon SageMaker is a fully managed service that gives every developer and data scientist the ability to build, train, and deploy machine learning (ML) models quickly. SageMaker removes the heavy lifting from each step of the machine learning process to make it easier to develop high-quality models.
Developers spend a lot of time and effort delivering accurate machine learning models that can make fast, low-latency predictions in real time. This matters most on edge devices, where memory and processing power tend to be highly constrained but latency is critical. For example, sensors in autonomous vehicles typically need to process data in a thousandth of a second to be useful, so a round trip to the cloud and back isn’t possible. Edge devices also span a wide array of hardware platforms and processor architectures, and achieving high performance means spending weeks or months hand-tuning a model for each one. Because the tuning process is so complex, models are rarely updated after they are deployed to the edge, and developers miss the opportunity to retrain and improve them based on the data the edge devices collect.
Amazon SageMaker Neo automatically optimizes machine learning models to run up to twice as fast with no loss in accuracy. You start with a machine learning model built with MXNet, TensorFlow, PyTorch, or XGBoost and trained using Amazon SageMaker. You then choose a target hardware platform from Intel, NVIDIA, or ARM. With a single click, SageMaker Neo compiles the trained model into an executable. The compiler uses a neural network to discover and apply the specific performance optimizations that make your model run most efficiently on the target hardware platform. The model can then be deployed to start making predictions in the cloud or at the edge. AWS IoT Greengrass brings local compute and ML inference capabilities to the edge, and it supports Neo-optimized models so that you can deploy them directly to edge devices with over-the-air updates.
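The same compilation step can also be requested programmatically instead of through the console. The following is a minimal sketch, not an official sample. It assumes the AWS SDK for Python (boto3) and its SageMaker `create_compilation_job` operation. The job name, IAM role ARN, S3 locations, input tensor name and shape, and the `jetson_nano` target are illustrative placeholders, and the helper functions `build_compilation_request` and `submit_compilation_job` are hypothetical names introduced here.

```python
import json


def build_compilation_request(job_name, role_arn, model_s3_uri, output_s3_uri,
                              framework, input_shapes, target_device,
                              max_runtime_seconds=900):
    """Assemble keyword arguments for SageMaker's CreateCompilationJob call.

    All argument values are supplied by the caller; nothing is sent to AWS here.
    """
    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": {
            # Location of the trained model archive in S3.
            "S3Uri": model_s3_uri,
            # Input names and shapes are passed as a JSON string.
            "DataInputConfig": json.dumps(input_shapes),
            # The API expects upper-case framework names such as "MXNET".
            "Framework": framework.upper(),
        },
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            "TargetDevice": target_device,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_seconds},
    }


def submit_compilation_job(request, region_name="us-east-1"):
    """Send the request to SageMaker (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request can be built without it

    client = boto3.client("sagemaker", region_name=region_name)
    return client.create_compilation_job(**request)


# Placeholder values for illustration only; replace them with your own.
request = build_compilation_request(
    job_name="example-neo-job",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    model_s3_uri="s3://example-bucket/model/model.tar.gz",
    output_s3_uri="s3://example-bucket/compiled/",
    framework="mxnet",
    input_shapes={"data": [1, 3, 224, 224]},
    target_device="jetson_nano",
)
```

Calling `submit_compilation_job(request)` would start the job. It is left out of the example so that the request can be inspected without AWS credentials. Separating request construction from submission also makes it straightforward to reuse one trained model with several target devices by changing only `target_device`.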
Neo uses Apache TVM and other partner-provided compilers and kernel libraries. Neo is available as open source code as the Neo-AI project under the Apache Software License, enabling developers to customize the software for different devices and applications.