Implementing Amazon Web Services Well-Architected best practices for Amazon SQS – Part 1
This blog is written by Chetan Makvana, Senior Solutions Architect and Hardik Vasa, Senior Solutions Architect.
To help you identify and implement these best practices, Amazon Web Services provides the
This three-part blog series covers each pillar of the Amazon Web Services Well-Architected Framework to implement best practices for SQS. This blog post, part 1 of the series, discusses best practices using the Operational Excellence Pillar.
See also the other two parts of the series:
- Implementing Amazon Web Services Well-Architected Best Practices for Amazon SQS – Part 2: Security and Reliability
- Implementing Amazon Web Services Well-Architected Best Practices for Amazon SQS – Part 3: Performance Efficiency, Cost Optimization, and Sustainability
Solution overview
This solution architecture shows an example of an inventory management system. The system leverages
These CSV files are then uploaded to an S3 bucket, consolidating and securing the inventory data for the inventory management system’s access. The system uses a Lambda function to read and parse the CSV file, extracting individual inventory update records. The backend Lambda function transforms each inventory update record into a message and sends it to an SQS queue. Another Lambda function continually polls the SQS queue for new messages. Upon receiving a message, it retrieves the inventory update details and updates the inventory levels in DynamoDB accordingly.
This ensures that the inventory quantities for each product are accurate and reflect the latest changes. This way, the inventory management system provides real-time visibility into inventory levels across different locations and suppliers, enabling the company to monitor product availability with precision. Find the example code for this solution in the
This example is used throughout this blog series to highlight how SQS best practices can be implemented based on the Amazon Web Services Well-Architected Framework.
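The step in which each inventory update record becomes an SQS message can be sketched in plain Python. This is a minimal illustration, not the code from the solution itself: the function name `csv_to_batches` and the column names in the usage note are hypothetical, and grouping into batches of 10 is an assumption based on the SQS SendMessageBatch limit of 10 entries per call.

```python
import csv
import io
import json
from typing import Dict, Iterator, List

MAX_BATCH_SIZE = 10  # SQS SendMessageBatch accepts at most 10 entries per call


def csv_to_batches(
    csv_text: str, batch_size: int = MAX_BATCH_SIZE
) -> Iterator[List[Dict[str, str]]]:
    """Turn CSV rows into batches of SQS batch entries (hypothetical helper).

    Each row becomes one entry whose MessageBody is the row serialized as JSON.
    The entry Id is the row index, which is unique within every batch.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    batch: List[Dict[str, str]] = []
    for row_number, row in enumerate(reader):
        batch.append({"Id": str(row_number), "MessageBody": json.dumps(row)})
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, partially filled batch
        yield batch
```

With boto3, each yielded batch could then be passed as the `Entries` argument of `send_message_batch` on an SQS client, together with the queue URL. A CSV file with assumed columns such as `product_id`, `location`, and `quantity` would produce one JSON message per row.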
Operational Excellence Pillar
The Operational Excellence Pillar includes the ability to support development and run workloads effectively, gain insight into their operation, and continuously improve supporting processes and procedures to deliver business value. To achieve operational excellence, the pillar recommends best practices such as defining workload metrics and implementing transaction traceability. This enables organizations to gain valuable insights into their operations, identify potential issues, and optimize services accordingly to improve customer experience. Furthermore, understanding the health of an application is critical to ensuring that it is functioning as expected.
Best practice: Use infrastructure as code to deploy SQS
For managing SQS resources, you can use different IaC tools like
This blog series showcases the use of Amazon Web Services CDK with Python to demonstrate best practices for working with SQS. For example, the following Amazon Web Services CDK code creates a new SQS queue:
from aws_cdk import (
    Duration,
    Stack,
    aws_sqs as sqs,
)
from constructs import Construct


class SqsCdBlogStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # The code that defines your stack goes here
        # example resource
        queue = sqs.Queue(
            self,
            "InventoryUpdatesQueue",
            visibility_timeout=Duration.seconds(300),
        )
Best practice: Configure CloudWatch alarms for ApproximateAgeOfOldestMessage
It is important to understand
One of the key metrics that SQS provides is the ApproximateAgeOfOldestMessage metric. By monitoring this metric, you can determine the age of the oldest message in the queue, and take appropriate action to ensure that messages are processed in a timely manner. To set up alerts for the ApproximateAgeOfOldestMessage metric, you can use CloudWatch alarms. You configure these alarms to issue alerts when messages remain in the queue for extended periods of time. You can use these alerts to act, for instance by scaling up consumers to process messages more quickly or investigating potential issues with message processing.
In the inventory management example, the ApproximateAgeOfOldestMessage metric provides valuable insight into the health and performance of the SQS queue. By monitoring this metric, you can detect processing delays, optimize performance, and ensure that inventory updates are processed within the desired timeframe, which keeps inventory levels accurate and up to date. The following code creates an alarm that is triggered when the oldest inventory update request has been in the queue for 600 seconds (10 minutes) or longer.
from aws_cdk import aws_cloudwatch as cloudwatch

# Inside the stack's __init__, after the queue has been created:
# Create a CloudWatch alarm for ApproximateAgeOfOldestMessage metric
alarm = cloudwatch.Alarm(
    self,
    "OldInventoryUpdatesAlarm",
    alarm_name="OldInventoryUpdatesAlarm",
    metric=queue.metric_approximate_age_of_oldest_message(),
    threshold=600,  # Specify your desired threshold value in seconds
    evaluation_periods=1,
    comparison_operator=cloudwatch.ComparisonOperator.GREATER_THAN_OR_EQUAL_TO_THRESHOLD,
)
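For teams that manage alarms outside of CDK, a comparable alarm can be expressed as parameters for the CloudWatch PutMetricAlarm API. The sketch below only builds the parameter dictionary. The function name, the queue name, the 300-second period, and the Maximum statistic are assumptions chosen to resemble the CDK alarm above, not values taken from the original solution.

```python
from typing import Any, Dict


def oldest_message_alarm_params(
    queue_name: str,
    threshold_seconds: int = 600,
    period_seconds: int = 300,
) -> Dict[str, Any]:
    """Build PutMetricAlarm parameters for a queue's oldest-message age.

    Hypothetical helper: it performs no AWS calls and only returns a dict.
    """
    return {
        "AlarmName": "OldInventoryUpdatesAlarm",
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateAgeOfOldestMessage",
        # SQS metrics are published per queue under the QueueName dimension
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "Statistic": "Maximum",  # assumed: alarm on the worst case in the period
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_seconds),
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. Keeping the parameter construction separate from the API call makes the threshold easy to unit test and to vary per environment.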
Best practice: Add a tracing header while sending a message to the queue to provide distributed tracing capabilities for faster troubleshooting
By implementing distributed tracing, you can gain a clear understanding of the flow of messages in SQS queues, identify any bottlenecks or potential issues, and proactively react to any signals that indicate an unhealthy state. Tracing provides a wider continuous view of an application and helps to follow a user journey or transaction through the application.
SQS carries the X-Ray trace header in the AWSTraceHeader message system attribute. AWSTraceHeader is available for use even when auto-instrumentation through the X-Ray SDK is not, for example, when building a tracing SDK for a new language. If you are using a Lambda downstream consumer, trace context propagation is automatic.
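When auto-instrumentation is not available, a producer can attach the header itself. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the header is built by hand in the X-Ray format (`Root=1-<8 hex digits of epoch time>-<24 hex digits>`) rather than taken from an existing trace. In a real service you would normally propagate the header of the trace that is already in progress.

```python
import secrets
import time
from typing import Any, Dict, Optional


def new_trace_header(sampled: bool = True, now: Optional[float] = None) -> str:
    """Create an X-Ray tracing header that starts a new trace (hypothetical helper)."""
    epoch_hex = format(int(time.time() if now is None else now), "08x")
    trace_id = "1-{}-{}".format(epoch_hex, secrets.token_hex(12))  # 24 hex digits
    return "Root={};Sampled={}".format(trace_id, 1 if sampled else 0)


def build_send_message_kwargs(
    queue_url: str, body: str, trace_header: str
) -> Dict[str, Any]:
    """Build keyword arguments for an SQS SendMessage call that carries the header."""
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        # The trace header travels as a message system attribute,
        # separate from user-defined message attributes.
        "MessageSystemAttributes": {
            "AWSTraceHeader": {"DataType": "String", "StringValue": trace_header}
        },
    }
```

With boto3, the returned dictionary could be passed as `sqs_client.send_message(**kwargs)`. The queue URL and message body are placeholders that the caller supplies.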
In the inventory management example, by utilizing distributed tracing with X-Ray for SQS, you can gain deep insights into the performance, behavior, and dependencies of the inventory management system. This visibility enables you to optimize performance, troubleshoot issues more effectively, and ensure the smooth and efficient operation of the system. The following code sets up a CSV processing Lambda function and a backend processing Lambda function with active tracing enabled. The Lambda function automatically receives the X-Ray TraceId from SQS.
from aws_cdk import aws_lambda as _lambda
from aws_cdk.aws_lambda import Tracing

# Inside the stack's __init__; `role` is an IAM role defined elsewhere in the stack.
# Create pre-processing Lambda function
csv_processing_to_sqs_function = _lambda.Function(
    self,
    "CSVProcessingToSQSFunction",
    runtime=_lambda.Runtime.PYTHON_3_8,
    code=_lambda.Code.from_asset("sqs_blog/lambda"),
    handler="CSVProcessingToSQSFunction.lambda_handler",
    role=role,
    tracing=Tracing.ACTIVE,  # Enable active tracing with X-Ray
)

# Create a post-processing Lambda function with the specified role
sqs_to_dynamodb_function = _lambda.Function(
    self,
    "SQSToDynamoDBFunction",
    runtime=_lambda.Runtime.PYTHON_3_8,
    code=_lambda.Code.from_asset("sqs_blog/lambda"),
    handler="SQSToDynamoDBFunction.lambda_handler",
    role=role,
    tracing=Tracing.ACTIVE,  # Enable active tracing with X-Ray
)
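Although Lambda propagates the trace context automatically, it can still be useful to log the trace ID next to each processed inventory update so that log lines and traces can be correlated. The sketch below assumes the trace header is exposed to the handler under `attributes["AWSTraceHeader"]` of each SQS event record; the function names are hypothetical and the code is not part of the original solution.

```python
from typing import Dict, List, Optional


def parse_trace_header(header: str) -> Dict[str, str]:
    """Split a header such as 'Root=...;Parent=...;Sampled=1' into its fields."""
    fields: Dict[str, str] = {}
    for part in header.split(";"):
        key, separator, value = part.partition("=")
        if separator:  # skip fragments that are not key=value pairs
            fields[key.strip()] = value.strip()
    return fields


def trace_ids_from_event(event: dict) -> List[Optional[str]]:
    """Return the root trace ID of each SQS record, or None when it is absent."""
    trace_ids: List[Optional[str]] = []
    for record in event.get("Records", []):
        header = record.get("attributes", {}).get("AWSTraceHeader")
        trace_ids.append(parse_trace_header(header).get("Root") if header else None)
    return trace_ids
```

A handler could call `trace_ids_from_event(event)` at the start of an invocation and include the returned IDs in its log output.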
Conclusion
This blog post explores best practices for SQS with a focus on the Operational Excellence Pillar of the Amazon Web Services Well-Architected Framework. It covers key considerations for ensuring the smooth operation and optimal performance of applications using SQS: monitoring message age with CloudWatch alarms and tracing messages with X-Ray. It also shows how infrastructure as code simplifies infrastructure management and how Amazon Web Services CDK can be used to provision and manage SQS resources.
For more serverless learning resources, visit