Q: What is AWS Batch?
AWS Batch is a set of batch management capabilities that enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources (e.g., CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs submitted. With AWS Batch, there is no need to install and manage batch computing software or server clusters, allowing you to instead focus on analyzing results and solving problems. AWS Batch plans, schedules, and executes your batch computing workloads using Amazon EC2 and Spot Instances.
Q: What is Batch Computing?
Batch computing is the execution of a series of programs ("jobs") on one or more computers without manual intervention. Input parameters are pre-defined through scripts, command-line arguments, control files, or job control language. A given batch job may depend on the completion of preceding jobs, or on the availability of certain inputs, making the sequencing and scheduling of multiple jobs important, and incompatible with interactive processing.
Q: What are the benefits of batch computing?
- It can shift the time of job processing to periods when greater or less expensive capacity is available.
- It avoids leaving compute resources idle while they wait for frequent manual intervention and supervision.
- It increases efficiency by driving higher utilization of compute resources.
- It enables the prioritization of jobs, aligning resource allocation with business objectives.
Why AWS Batch
Q: Why should I use AWS Batch?
AWS Batch handles job execution and compute resource management, allowing you to focus on developing applications or analyzing results instead of setting up and managing infrastructure. If you are considering running or moving batch workloads to AWS, you should consider using AWS Batch.
Q: What use cases is AWS Batch optimized for?
AWS Batch is optimized for batch computing and applications that scale through the execution of multiple jobs in parallel. Deep learning, genomics analysis, financial risk models, Monte Carlo simulations, animation rendering, media transcoding, image processing, and engineering simulations are all excellent examples of batch computing applications.
Q: What are the key features of AWS Batch?
AWS Batch manages compute environments and job queues, allowing you to easily run thousands of jobs of any scale using Amazon EC2 and EC2 Spot. You simply define and submit your batch jobs to a queue. In response, AWS Batch chooses where to run the jobs, launching additional AWS capacity if needed. AWS Batch carefully monitors the progress of your jobs. When capacity is no longer needed, AWS Batch will remove it. AWS Batch also provides the ability to submit jobs that are part of a pipeline or workflow, enabling you to express any interdependencies that exist between them as you submit jobs.
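As a minimal sketch of the pipeline support described above, the Python below builds the request arguments for two jobs where the second waits on the first. It assumes the SubmitJob request shape as exposed by the boto3 Batch client (`jobName`, `jobQueue`, `jobDefinition`, `dependsOn`). The queue name, job definition name, and job ID are hypothetical placeholders, and the helper `build_submit_job_request` is introduced here for illustration only.

```python
import json


def build_submit_job_request(job_name, job_queue, job_definition, depends_on_ids=()):
    """Build keyword arguments for a Batch SubmitJob call (illustrative helper)."""
    request = {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
    }
    if depends_on_ids:
        # Each entry names a job that must complete before this one starts.
        request["dependsOn"] = [{"jobId": job_id} for job_id in depends_on_ids]
    return request


# Hypothetical names and job ID, for illustration only.
preprocess = build_submit_job_request("preprocess", "example-queue", "example-job-def")
analyze = build_submit_job_request(
    "analyze",
    "example-queue",
    "example-job-def",
    depends_on_ids=["example-preprocess-job-id"],
)
print(json.dumps(analyze, indent=2))

# Assuming boto3 is installed and credentials are configured, the requests
# could be sent like this (not executed here):
#   import boto3
#   batch = boto3.client("batch")
#   first = batch.submit_job(**preprocess)
#   analyze["dependsOn"] = [{"jobId": first["jobId"]}]
#   batch.submit_job(**analyze)
```

In practice the dependency ID would come from the response to the first submission rather than from a hard-coded string.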
Q: What types of batch jobs does AWS Batch support?
AWS Batch supports any job that can be executed as a Docker container. Jobs specify their memory requirements and number of vCPUs.
Q: What is a Compute Resource?
An AWS Batch Compute Resource is an EC2 instance.
Q: What is a Compute Environment?
An AWS Batch Compute Environment is a collection of compute resources on which jobs are executed. AWS Batch supports two types of Compute Environments: Managed Compute Environments, which are provisioned and managed by AWS, and Unmanaged Compute Environments, which are managed by customers. Unmanaged Compute Environments provide a mechanism to leverage specialized resources such as Dedicated Hosts, larger storage configurations, and Amazon EFS.
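To make the difference between the two types concrete, here is a hedged Python sketch that builds request arguments for each. It assumes the CreateComputeEnvironment request shape as exposed by the boto3 Batch client. All names, the subnet, the security group, and the roles are hypothetical placeholders, and `build_compute_environment_request` is a helper invented for this example.

```python
def build_compute_environment_request(name, managed, service_role, compute_resources=None):
    """Build keyword arguments for a Batch CreateComputeEnvironment call."""
    request = {
        "computeEnvironmentName": name,
        "type": "MANAGED" if managed else "UNMANAGED",
        "state": "ENABLED",
        "serviceRole": service_role,
    }
    if managed:
        if compute_resources is None:
            raise ValueError("a managed environment needs a computeResources description")
        # AWS Batch launches and terminates instances within these limits.
        request["computeResources"] = compute_resources
    # For an unmanaged environment you launch and register the instances
    # yourself, so no computeResources are described here.
    return request


# Hypothetical identifiers, for illustration only.
managed_request = build_compute_environment_request(
    "example-managed-ce",
    managed=True,
    service_role="example-batch-service-role",
    compute_resources={
        "type": "EC2",
        "minvCpus": 0,
        "maxvCpus": 64,
        "instanceTypes": ["optimal"],
        "subnets": ["subnet-EXAMPLE"],
        "securityGroupIds": ["sg-EXAMPLE"],
        "instanceRole": "example-ecs-instance-role",
    },
)
unmanaged_request = build_compute_environment_request(
    "example-unmanaged-ce",
    managed=False,
    service_role="example-batch-service-role",
)

# Assuming boto3 is installed and credentials are configured (not executed here):
#   import boto3
#   boto3.client("batch").create_compute_environment(**managed_request)
```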
Q: What is a Job Definition?
A Job Definition describes the job to be executed, including its parameters, environment variables, compute requirements, and other information used to optimize the execution of the job. Job Definitions are defined in advance of submitting a job and can be shared with others.
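The pieces of a Job Definition listed above can be sketched as a request payload. The Python below assumes the RegisterJobDefinition request shape as exposed by the boto3 Batch client, using the `vcpus` and `memory` container properties (memory in MiB). The image, names, and values are hypothetical, and `build_job_definition_request` is a helper made up for this example.

```python
def build_job_definition_request(name, image, command, vcpus, memory_mib,
                                 environment=None, parameters=None):
    """Build keyword arguments for a Batch RegisterJobDefinition call."""
    request = {
        "jobDefinitionName": name,
        "type": "container",
        "containerProperties": {
            "image": image,
            "command": list(command),
            # Compute requirements: number of vCPUs and memory in MiB.
            "vcpus": vcpus,
            "memory": memory_mib,
            # Environment variables are passed as name/value pairs.
            "environment": [
                {"name": key, "value": value}
                for key, value in sorted((environment or {}).items())
            ],
        },
    }
    if parameters:
        # Default values for Ref::name placeholders used in the command.
        request["parameters"] = dict(parameters)
    return request


# Hypothetical values, for illustration only.
job_definition = build_job_definition_request(
    name="example-echo-job",
    image="busybox",
    command=["echo", "Ref::message"],
    vcpus=1,
    memory_mib=512,
    environment={"STAGE": "test"},
    parameters={"message": "hello"},
)

# Assuming boto3 is installed and credentials are configured (not executed here):
#   import boto3
#   boto3.client("batch").register_job_definition(**job_definition)
```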
Q: What is the Amazon ECS Agent and how is it used by AWS Batch?
AWS Batch uses Amazon ECS to execute containerized jobs and therefore requires the ECS Agent to be installed on compute resources within your AWS Batch Compute Environments. The ECS Agent is pre-installed in Managed Compute Environments.
Q: How does AWS Batch make it easier to use EC2 Spot?
AWS Batch Compute Environments can be composed of EC2 Spot instances. When creating a Managed Compute Environment, simply specify that you would like to use EC2 Spot and provide the percentage of On-Demand pricing that you are willing to pay, and AWS Batch will take care of the rest. Unmanaged Compute Environments can also include Spot instances that you launch, including those launched by EC2 Spot Fleet.
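As a rough illustration of the percentage-of-On-Demand setting described above, the following Python sketch computes the resulting price ceiling and builds a Spot `computeResources` description. It assumes the `bidPercentage` and `spotIamFleetRole` fields of the CreateComputeEnvironment API as exposed by boto3. The price, identifiers, and both helper functions are hypothetical.

```python
def max_spot_price(on_demand_price, bid_percentage):
    """Return the highest Spot price implied by a percentage of On-Demand."""
    if not 1 <= bid_percentage <= 100:
        raise ValueError("bid_percentage must be between 1 and 100")
    return on_demand_price * bid_percentage / 100


def build_spot_compute_resources(bid_percentage, max_vcpus, subnets,
                                 security_group_ids, instance_role, spot_fleet_role):
    """Build the computeResources section for a managed Spot environment."""
    max_spot_price(1.0, bid_percentage)  # reuse the range check above
    return {
        "type": "SPOT",
        "bidPercentage": bid_percentage,
        "minvCpus": 0,
        "maxvCpus": max_vcpus,
        "instanceTypes": ["optimal"],
        "subnets": list(subnets),
        "securityGroupIds": list(security_group_ids),
        "instanceRole": instance_role,
        "spotIamFleetRole": spot_fleet_role,
    }


# Hypothetical identifiers and price, for illustration only.
spot_resources = build_spot_compute_resources(
    bid_percentage=60,
    max_vcpus=128,
    subnets=["subnet-EXAMPLE"],
    security_group_ids=["sg-EXAMPLE"],
    instance_role="example-ecs-instance-role",
    spot_fleet_role="example-spot-fleet-role",
)
ceiling = max_spot_price(on_demand_price=1.00, bid_percentage=60)
```

The resulting dictionary would be passed as the `computeResources` argument when creating a Managed Compute Environment.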
Q: What is the pricing for AWS Batch?
There is no additional charge for AWS Batch. You only pay for the AWS resources (e.g., EC2 instances) you create to store and run your batch jobs.
Q: Can I use accelerators with AWS Batch?
Yes, you can use Batch to specify the number and type of accelerators your jobs require as job definition input variables, alongside the current options of vCPU and memory. AWS Batch will scale up instances appropriate for your jobs based on the required accelerators and isolate the accelerators according to each job’s needs, so only the appropriate containers can access them.
Q: Why should I use accelerators with AWS Batch?
By using accelerators with Batch, you can dynamically schedule and provision your jobs according to their accelerator needs, and Batch will ensure that the appropriate number of accelerators is reserved for your jobs. Batch will scale up your EC2 accelerated instances when you need them and scale them down when you’re done, allowing you to focus on your applications. Batch has native integration with EC2 Spot, meaning your accelerated jobs can take advantage of up to 90% savings when using accelerated instances.
Q: What accelerators can I use with AWS Batch?
Currently, you can use GPUs on P-family accelerated instances.
Q: How do I submit jobs requiring accelerated instances to Batch?
You can specify the number and type of accelerators in the Job Definition. You specify the accelerator by describing the accelerator type (e.g., GPU – currently the only supported accelerator) and the number of that type your job requires. Your specified accelerator type must be present on one of the instance types specified in your Compute Environments. For example, if your job needs 2 GPUs, also make sure that you have specified a P-family instance in your Compute Environment.
From the API:
"resourceRequirements" : [
  {
    "type" : "GPU",
    "value" : "1"
  }
]
Q: Can accelerator variables in the job definition be overridden at job submission?
Similar to vCPU and memory requirements, you can override the number and type of accelerators at job submission.
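A hedged sketch of both steps, declaring a GPU requirement in a Job Definition and overriding it at submission, might look like the Python below. It assumes the `resourceRequirements` field of `containerProperties` and of `containerOverrides` as exposed by the boto3 Batch client. The image, job, and queue names are hypothetical, and `accelerator_requirement` is a helper invented for this example.

```python
def accelerator_requirement(count, accelerator_type="GPU"):
    """Return one resourceRequirements entry; the API expects the value as a string."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return {"type": accelerator_type, "value": str(count)}


# Job Definition that reserves one GPU per job (hypothetical names and image).
gpu_job_definition = {
    "jobDefinitionName": "example-gpu-job",
    "type": "container",
    "containerProperties": {
        "image": "example-gpu-image",
        "vcpus": 4,
        "memory": 16384,
        "command": ["python", "train.py"],
        "resourceRequirements": [accelerator_requirement(1)],
    },
}

# Submission that overrides the definition and asks for two GPUs instead.
override_submission = {
    "jobName": "example-gpu-run",
    "jobQueue": "example-gpu-queue",
    "jobDefinition": "example-gpu-job",
    "containerOverrides": {
        "resourceRequirements": [accelerator_requirement(2)],
    },
}

# Assuming boto3 is installed and credentials are configured (not executed here):
#   import boto3
#   batch = boto3.client("batch")
#   batch.register_job_definition(**gpu_job_definition)
#   batch.submit_job(**override_submission)
```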
Q: Can accelerated instances be used for jobs that don't need the accelerators?
Currently, Batch avoids scheduling jobs that do not require acceleration on accelerated instances when possible. This avoids cases where long-running jobs occupy an accelerated instance without taking advantage of the accelerator, increasing cost. In rare cases, with Spot pricing and with accelerated instances among the allowed types, Batch may determine that an accelerated instance is the least expensive way to run your jobs, regardless of accelerator needs.
If you submit a job to a Compute Environment that only allows Batch to launch accelerated instances, Batch will run the job on those instances, regardless of its accelerator needs.
Q: How does Batch use the ECS GPU-Optimized AMI?
P-type instances launch by default with the ECS GPU-optimized AMI. This AMI contains the libraries and runtimes needed to run GPU-based applications. You can always point to a custom AMI as needed when creating a Compute Environment.
Q: How do I get started?
Follow the Getting Started Guide in our documentation.
Q: What do I need to provision to get started?
There is no need to manually launch your own compute resources in order to get started. The AWS Batch web console will guide you through the process of creating your first Compute Environment and Job Queue so that you can submit your first job. Resources within your compute environment will scale up as additional jobs are ready to run and scale down as the number of runnable jobs decreases.
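For readers who prefer the API to the console, the Job Queue step mentioned above can be sketched in Python. This assumes the CreateJobQueue request shape as exposed by the boto3 Batch client. The queue and Compute Environment names are hypothetical, and `build_job_queue_request` is a helper made up for this example.

```python
def build_job_queue_request(name, priority, compute_environments):
    """Build keyword arguments for a Batch CreateJobQueue call."""
    if not compute_environments:
        raise ValueError("a job queue needs at least one compute environment")
    return {
        "jobQueueName": name,
        "state": "ENABLED",
        # Queues with a higher integer priority are evaluated first.
        "priority": priority,
        # Compute environments are tried in ascending order.
        "computeEnvironmentOrder": [
            {"order": position, "computeEnvironment": environment}
            for position, environment in enumerate(compute_environments, start=1)
        ],
    }


# Hypothetical names, for illustration only.
queue_request = build_job_queue_request(
    "example-queue",
    priority=10,
    compute_environments=["example-managed-ce", "example-spot-ce"],
)

# Assuming boto3 is installed and credentials are configured (not executed here):
#   import boto3
#   boto3.client("batch").create_job_queue(**queue_request)
```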