Posted On: Oct 15, 2023

Amazon Step Functions announces an Optimized Integration for Amazon EMR Serverless, adding support for the Run a Job (.sync) integration pattern with 6 EMR Serverless API Actions (CreateApplication, StartApplication, StopApplication, DeleteApplication, StartJobRun, and CancelJobRun).

EMR Serverless is a serverless option in Amazon EMR that makes it easy for data analysts and engineers to run open-source big data analytics frameworks like Apache Spark and Apache Hive without configuring, managing, or scaling clusters or servers. Step Functions is a visual workflow service that makes it easy to compose services into scalable, reliable, and resilient application components. Customers use the visual authoring and operator experience of Step Functions to create resilient and manageable multi-step EMR data processing pipelines. With this new Optimized Integration, customers can simplify these pipelines by removing the steps that monitor completion of asynchronous jobs and replacing them with a single Step Functions state.
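
As a rough illustration of what "a single Step Functions state" can look like, the sketch below builds an Amazon States Language definition in Python with one Task state that starts an EMR Serverless job and waits for it to finish. It is a minimal sketch, not the Sample Project itself: the `emr-serverless:startJobRun.sync` resource string and the `ApplicationId`, `ExecutionRoleArn`, and `JobDriver` parameter names reflect our understanding of the integration and should be checked against the Step Functions Developer Guide, while the application ID, role ARN, and S3 path are hypothetical placeholders.

```python
import json

# Assumption: the China Regions use the "aws-cn" ARN partition.
PARTITION = "aws-cn"


def build_state_machine(application_id, execution_role_arn, entry_point):
    """Return a workflow definition with one Task state that starts an
    EMR Serverless job run and waits for it to complete (.sync)."""
    return {
        "Comment": "Run an EMR Serverless Spark job and wait for completion",
        "StartAt": "RunSparkJob",
        "States": {
            "RunSparkJob": {
                "Type": "Task",
                # The ".sync" suffix selects the Run a Job pattern, so no
                # separate polling or wait states are needed.
                "Resource": f"arn:{PARTITION}:states:::emr-serverless:startJobRun.sync",
                "Parameters": {
                    "ApplicationId": application_id,
                    "ExecutionRoleArn": execution_role_arn,
                    "JobDriver": {
                        "SparkSubmit": {"EntryPoint": entry_point}
                    },
                },
                "End": True,
            }
        },
    }


# Hypothetical placeholder values, for illustration only.
definition = build_state_machine(
    application_id="00example0000000",
    execution_role_arn="arn:aws-cn:iam::123456789012:role/ExampleEmrServerlessJobRole",
    entry_point="s3://example-bucket/scripts/job.py",
)

print(json.dumps(definition, indent=2))
```

The printed JSON could then be pasted into the code view of Workflow Studio or supplied as the definition when creating a state machine, after replacing the placeholders with real resources.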

These enhancements are now generally available in the Amazon Web Services China (Beijing) Region, operated by Sinnet, and the Amazon Web Services China (Ningxia) Region, operated by NWCD. To get started, you can use the new “Run an EMR Serverless Spark Job” Sample Project for Step Functions in the Management Console or build a workflow using Step Functions Workflow Studio. To learn more, see the Step Functions Developer Guide.