Posted On: Aug 31, 2020

EMR Notebooks is a service that provides a fully managed, Jupyter-based notebook for data scientists and engineers who write and experiment with ad-hoc jobs. You can now orchestrate EMR Notebooks non-interactively to run ETL workloads, including in production, in the Amazon Web Services China (Beijing) Region, operated by Sinnet, and the Amazon Web Services China (Ningxia) Region, operated by NWCD. Previously, executing a notebook required access to the Jupyter user interface through the Amazon Web Services Management Console.

The EMR Notebooks APIs enable Amazon CLI and SDK access to notebooks, so you can run notebook-based ETL workloads in an automated fashion. You can use orchestration services such as Amazon Step Functions and Apache Airflow to build resilient workflows, or use cron scripts to execute notebooks on a schedule without interaction. You can also pass input parameters to notebooks and debug any execution of a notebook by accessing the historical output of each execution. Previously, you had to create and modify a new copy of the notebook for every new combination of input values.
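
As a rough illustration of SDK access with input parameters, the sketch below assembles a request for the EMR `StartNotebookExecution` operation and submits it through a client object. It assumes a boto3 EMR client (for example `boto3.client("emr", region_name="cn-north-1")`). The helper names `build_execution_request` and `run_notebook`, and all IDs, paths, and role names shown in the usage note, are hypothetical placeholders and do not come from this announcement.

```python
import json


def build_execution_request(editor_id, relative_path, cluster_id,
                            service_role, params, name=None):
    """Assemble keyword arguments for the EMR StartNotebookExecution call.

    All argument values are supplied by the caller; nothing here is
    specific to a real account or notebook.
    """
    request = {
        "EditorId": editor_id,                 # ID of the EMR notebook
        "RelativePath": relative_path,         # notebook file to execute
        "ExecutionEngine": {"Id": cluster_id, "Type": "EMR"},
        "ServiceRole": service_role,
        # Input parameters are passed to the notebook as a JSON string.
        "NotebookParams": json.dumps(params),
    }
    if name:
        request["NotebookExecutionName"] = name
    return request


def run_notebook(emr_client, **request):
    """Start a notebook execution and return its execution ID."""
    response = emr_client.start_notebook_execution(**request)
    return response["NotebookExecutionId"]
```

A scheduled script might call `run_notebook(client, **build_execution_request("e-EXAMPLE", "etl.ipynb", "j-EXAMPLE", "EMR_Notebooks_DefaultRole", {"run_date": "2020-08-31"}))` once per parameter combination, then inspect each run later with the client's `describe_notebook_execution` call, rather than editing a copy of the notebook for every set of inputs.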

To get started with EMR Notebooks, please visit the EMR Notebooks page.

This feature is available on EMR release version 5.18.0 or later.