Amazon Web Services and Hugging Face collaborate to make generative AI more accessible and cost efficient

by Amazon Web Services

We’re thrilled to announce an expanded collaboration between Amazon Web Services and Hugging Face to accelerate the training, fine-tuning, and deployment of large language and vision models used to create generative AI applications. Generative AI applications can perform a variety of tasks, including text summarization, answering questions, code generation, image creation, and writing essays and articles.

Amazon Web Services has a deep history of innovation in generative AI. For example, Amazon uses AI to deliver a conversational experience with Alexa that customers interact with billions of times each week, and is increasingly using generative AI as part of new experiences like Create with Alexa. In addition, M5, a group within Amazon Search that helps teams across Amazon bring large models to their applications, trained large models to improve search results on Amazon.com. Amazon Web Services is constantly innovating across all areas of ML, including infrastructure, tools on Amazon SageMaker, and AI services such as Amazon CodeWhisperer, a service that improves developer productivity by generating code recommendations based on the code and comments in an IDE. Amazon Web Services also created purpose-built ML accelerators for the training (Amazon Web Services Trainium) and inference (Amazon Web Services Inferentia) of large language and vision models on Amazon Web Services.

Hugging Face selected Amazon Web Services because it offers flexibility across state-of-the-art tools to train, fine-tune, and deploy Hugging Face models, including Amazon SageMaker, Amazon Web Services Trainium, and Amazon Web Services Inferentia. Developers using Hugging Face can now easily optimize performance and lower cost to bring generative AI applications to production faster.

High-performance and cost-efficient generative AI

Building, training, and deploying large language and vision models is an expensive and time-consuming process that requires deep expertise in machine learning (ML). Since the models are very complex and can contain hundreds of billions of parameters, generative AI is largely out of reach for many developers.

To close this gap, Hugging Face is now collaborating with Amazon Web Services to make it easier for developers to access services on Amazon Web Services and deploy Hugging Face models specifically for generative AI applications. The benefits are faster training and scaling, plus low-latency, high-throughput inference. For example, the Amazon EC2 Trn1 instances powered by Amazon Web Services Trainium deliver faster time to train while offering up to 50% cost-to-train savings over comparable Amazon EC2 instances. Amazon EC2's new Inf2 instances, powered by the latest generation of Amazon Web Services Inferentia, are purpose-built to deploy the latest generation of large language and vision models, and they improve on Inf1 by delivering up to 4x higher throughput and up to 10x lower latency. Developers can use Amazon Web Services Trainium and Amazon Web Services Inferentia through managed services such as Amazon SageMaker, a service with tools and workflows for ML, or they can self-manage on Amazon EC2.
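To put the "up to" figures above in concrete terms, here is a minimal arithmetic sketch in Python. The baseline cost, throughput, and latency values are hypothetical numbers chosen only for illustration, and the function names are ours. The results are best-case bounds implied by the stated maximums, not measured or expected results, and "10x lower latency" is interpreted here as dividing latency by 10.

```python
def best_case_training_cost(baseline_cost: float, max_savings: float = 0.50) -> float:
    """Lowest cost implied by an 'up to 50%' cost-to-train saving.

    baseline_cost is a hypothetical cost on a comparable EC2 instance.
    """
    return baseline_cost * (1.0 - max_savings)


def best_case_inference(
    baseline_throughput: float,
    baseline_latency_ms: float,
    throughput_gain: float = 4.0,     # "up to 4x higher throughput"
    latency_reduction: float = 10.0,  # "up to 10x lower latency" (assumed: divide by 10)
) -> tuple[float, float]:
    """Best-case throughput and latency relative to a hypothetical Inf1 baseline."""
    return (
        baseline_throughput * throughput_gain,
        baseline_latency_ms / latency_reduction,
    )


# Hypothetical baselines, for illustration only.
print(best_case_training_cost(10_000.0))   # 5000.0
print(best_case_inference(100.0, 50.0))    # (400.0, 5.0)
```

Because the published figures are upper bounds, actual savings and speedups depend on the model and workload and may be smaller.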

Get started today

Customers can start using Hugging Face models on Amazon Web Services in three ways: through SageMaker JumpStart, the Hugging Face Amazon Web Services Deep Learning Containers (DLCs), or the tutorials to deploy your models to Amazon Web Services Trainium or Amazon Web Services Inferentia. The Hugging Face DLC is packed with optimized transformers, datasets, and tokenizers libraries to enable you to fine-tune and deploy generative AI applications at scale in hours instead of weeks, with minimal code changes. SageMaker JumpStart and the Hugging Face DLCs are available in all regions where Amazon SageMaker is available and come at no additional cost. Read the documentation and discussion forums to learn more, or try the sample notebooks today.
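As a rough illustration of the DLC route, the sketch below shows one way a Hugging Face Hub model might be deployed to a SageMaker endpoint with the SageMaker Python SDK. It is not taken from this announcement. It assumes the `sagemaker` package is installed, that valid credentials and an IAM role ARN are available, and that the framework versions and instance type shown are supported in your region; all of those values are assumptions to check against the current documentation. The helper names `build_hub_env` and `deploy_hub_model` are hypothetical.

```python
def build_hub_env(model_id: str, task: str) -> dict:
    """Environment variables that tell the Hugging Face inference container
    which Hub model to load and which pipeline task to serve."""
    return {"HF_MODEL_ID": model_id, "HF_TASK": task}


def deploy_hub_model(model_id: str, task: str, role_arn: str,
                     instance_type: str = "ml.m5.xlarge"):
    """Create a SageMaker endpoint backed by a Hugging Face DLC.

    Creates billable AWS resources. The version strings below are
    assumptions; use a combination listed as supported by the SDK.
    """
    # Imported lazily so the helper above can be used without the SDK installed.
    from sagemaker.huggingface import HuggingFaceModel

    model = HuggingFaceModel(
        env=build_hub_env(model_id, task),
        role=role_arn,
        transformers_version="4.26.0",  # assumed supported version
        pytorch_version="1.13.1",       # assumed supported version
        py_version="py39",              # assumed supported version
    )
    return model.deploy(initial_instance_count=1, instance_type=instance_type)


# Example usage (not executed here; requires AWS credentials and a real role ARN):
# predictor = deploy_hub_model("google/flan-t5-small", "text2text-generation", role_arn)
# print(predictor.predict({"inputs": "Summarize: ..."}))
# predictor.delete_endpoint()
```

Deleting the endpoint after testing avoids ongoing charges. Deploying to Trainium or Inferentia instances involves additional steps that are covered in the tutorials mentioned above.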


The mentioned AWS GenAI Services service names relating to generative AI are only available or previewed in the Global Regions. Amazon Web Services China promotes AWS GenAI Services relating to generative AI solely for China-to-global business purposes and/or advanced technology introduction.