Implementing Amazon Web Services Well-Architected best practices for Amazon SQS – Part 2
This blog is written by Chetan Makvana, Senior Solutions Architect and Hardik Vasa, Senior Solutions Architect.
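This is the second part of a three-part blog post series that demonstrates implementing best practices for Amazon SQS using the Amazon Web Services Well-Architected Framework.
This blog post covers best practices using the Security Pillar and Reliability Pillar of the Amazon Web Services Well-Architected Framework.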
See also the other two parts of the series:
- Implementing Amazon Web Services Well-Architected Best Practices for Amazon SQS – Part 1: Operational Excellence
- Implementing Amazon Web Services Well-Architected Best Practices for Amazon SQS – Part 3: Performance Efficiency, Cost Optimization, and Sustainability
Security Pillar
The
Best practice: Configure server-side encryption
Server-Side Encryption (SSE) protects the privacy and integrity of the data stored on SQS. Use it if your application has a compliance requirement such as HIPAA, GDPR, or PCI-DSS that mandates encryption at rest, if you want to improve data security against unauthorized access, or if you want simplified key management for the messages sent to the SQS queue.
SQS and
SSE-KMS provides greater control and flexibility over encryption keys, while SSE-SQS simplifies the process by managing the encryption keys for you. Both options help you protect sensitive data and comply with regulatory requirements by encrypting data at rest in SQS queues. Note that SSE-SQS only encrypts the message body and not the message attributes.
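In the inventory management example, the SQS queue is encrypted at rest.
The following Amazon Web Services CDK code snippet creates the queue with SSE-KMS enabled.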
# Create the SQS queue with server-side encryption (SSE-KMS)
queue = sqs.Queue(
    self,
    "InventoryUpdatesQueue",
    visibility_timeout=Duration.seconds(300),
    encryption=sqs.QueueEncryption.KMS_MANAGED,
)
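Outside the CDK, the choice between SSE-SQS and SSE-KMS can be sketched as the queue attributes passed to the SQS CreateQueue API. This is a minimal illustration, not the authors' code: the helper name `encryption_attributes` and the key alias are hypothetical, while `SqsManagedSseEnabled`, `KmsMasterKeyId`, and `KmsDataKeyReusePeriodSeconds` are SQS queue attribute names.

```python
from typing import Dict, Optional


def encryption_attributes(
    kms_key_id: Optional[str] = None,
    data_key_reuse_seconds: int = 300,
) -> Dict[str, str]:
    """Build SQS queue attributes for server-side encryption.

    Hypothetical helper: with no key it selects SSE-SQS, with a key it
    selects SSE-KMS. SQS expects every attribute value as a string.
    """
    if kms_key_id is None:
        # SSE-SQS: SQS owns and manages the encryption keys.
        return {"SqsManagedSseEnabled": "true"}
    # SQS accepts a data key reuse period between 60 seconds and 24 hours.
    if not 60 <= data_key_reuse_seconds <= 86400:
        raise ValueError("data_key_reuse_seconds must be between 60 and 86400")
    # SSE-KMS: you choose the KMS key and how long data keys are reused.
    return {
        "KmsMasterKeyId": kms_key_id,
        "KmsDataKeyReusePeriodSeconds": str(data_key_reuse_seconds),
    }


# Example values; the key alias is an assumption for illustration only.
sse_sqs_attrs = encryption_attributes()
sse_kms_attrs = encryption_attributes("alias/inventory-queue-key", 600)
```

With boto3, either dictionary could be passed as the `Attributes` argument of `create_queue`.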
Best practice: Implement least-privilege access using access policy
For securing your resources in Amazon Web Services, implementing least-privilege access is critical. This means granting users and services the minimum level of access required to perform their tasks. Least-privilege access provides better security, allows you to meet your compliance requirements, and offers accountability via a clear audit trail of who accessed what resources and when.
By implementing least-privilege access using access policies, you can help reduce the risk of security breaches and ensure that your resources are only accessed by authorized users and services.
In the inventory management example, the CSV processing Lambda function doesn’t perform any other task beyond parsing the inventory updates file and sending the inventory records to the SQS queue for further processing. To ensure that the function has the permissions to send messages to the SQS queue, grant the IAM role that the Lambda function assumes access to the SQS queue. Granting only this role send access establishes a secure and controlled communication channel. The Lambda function can only interact with the SQS queue and doesn’t have unnecessary access or permissions that might compromise the system’s security.
# Create pre-processing Lambda function
csv_processing_to_sqs_function = _lambda.Function(
    self,
    "CSVProcessingToSQSFunction",
    runtime=_lambda.Runtime.PYTHON_3_8,
    code=_lambda.Code.from_asset("sqs_blog/lambda"),
    handler="CSVProcessingToSQSFunction.lambda_handler",
    role=role,
    tracing=Tracing.ACTIVE,
)
# Define the queue policy to allow messages from the Lambda function's role only
policy = iam.PolicyStatement(
    actions=["sqs:SendMessage"],
    effect=iam.Effect.ALLOW,
    principals=[iam.ArnPrincipal(role.role_arn)],
    resources=[queue.queue_arn],
)
queue.add_to_resource_policy(policy)
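For reference, a queue policy of this kind can be sketched as a plain JSON document. The snippet below is an illustration under stated assumptions, not the exact policy the CDK synthesizes: the function name, the `Sid`, and both ARNs are placeholders.

```python
import json
from typing import Any, Dict


def send_only_queue_policy(queue_arn: str, role_arn: str) -> Dict[str, Any]:
    """Return a queue policy that lets one IAM role send messages and nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowSendFromCsvProcessingRole",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "sqs:SendMessage",  # the only action this role needs
                "Resource": queue_arn,
            }
        ],
    }


# Placeholder ARNs; substitute real values from your account.
example_policy = send_only_queue_policy(
    "arn:aws:sqs:us-east-1:111122223333:InventoryUpdatesQueue",
    "arn:aws:iam::111122223333:role/CsvProcessingRole",
)
print(json.dumps(example_policy, indent=2))
```

Listing a single action and a single principal keeps the statement easy to audit: anything not named here is not granted by this policy.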
Best practice: Allow only encrypted connections over HTTPS using aws:SecureTransport
It is essential to have a secure and reliable method for transferring data between Amazon Web Services services and on-premises environments or other external systems. With HTTPS, a network-based attacker cannot eavesdrop on network traffic or manipulate it, using an attack such as man-in-the-middle.
With SQS, you can choose to allow only encrypted connections over HTTPS using the aws:SecureTransport condition key in the queue policy. With this condition in place, any requests made over non-secure HTTP receive a 400 InvalidSecurity error from SQS.
In the inventory management example, the CSV processing Lambda function sends inventory updates to the SQS queue. To ensure secure data transfer, the Lambda function uses the HTTPS endpoint provided by SQS. This guarantees that the communication between the Lambda function and the SQS queue remains encrypted and resistant to potential security threats.
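# Create a queue policy statement that denies any request not made over HTTPS
secure_transport_policy = iam.PolicyStatement(
    effect=iam.Effect.DENY,
    actions=["sqs:*"],
    principals=[iam.AnyPrincipal()],  # a resource policy statement needs a principal
    resources=[queue.queue_arn],
    conditions={
        "Bool": {
            "aws:SecureTransport": "false",
        },
    },
)
# Attach the statement to the queue's resource policy so that it takes effect
queue.add_to_resource_policy(secure_transport_policy)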
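Expressed as raw policy JSON, the deny-on-insecure-transport statement could look like the following sketch. The helper names and the `Sid` are hypothetical; `aws:SecureTransport` is the condition key named above, and the ARN is a placeholder.

```python
import copy
from typing import Any, Dict


def deny_insecure_transport(queue_arn: str) -> Dict[str, Any]:
    """Statement that denies every SQS action when the request is not sent over TLS."""
    return {
        "Sid": "DenyInsecureTransport",  # hypothetical statement ID
        "Effect": "Deny",
        "Principal": "*",
        "Action": "sqs:*",
        "Resource": queue_arn,
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }


def with_statement(policy: Dict[str, Any], statement: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the policy with one more statement, leaving the input untouched."""
    updated = copy.deepcopy(policy)
    updated.setdefault("Statement", []).append(statement)
    return updated


base_policy = {"Version": "2012-10-17", "Statement": []}
queue_arn = "arn:aws:sqs:us-east-1:111122223333:InventoryUpdatesQueue"  # placeholder
secured_policy = with_statement(base_policy, deny_insecure_transport(queue_arn))
```

Because an explicit deny overrides any allow, this statement can sit alongside the least-privilege allow statement without weakening it.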
Best practice: Use attribute-based access controls (ABAC)
Some use cases require granular access control, for example authorizing a user based on role, environment, department, or location. Some also require dynamic authorization that follows changing user attributes. In these cases, you need an access control mechanism based on user attributes.
ABAC for SQS queues enables two key use cases:
- Tag-based access control: use tags to control access to your SQS queues, including control plane and data plane API calls.
- Tag-on-create: enforce tags at the time of creation of an SQS queue and deny the creation of SQS resources without tags.
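Both use cases can be sketched as a single IAM identity policy. This is an illustrative example rather than a policy from the original post: the tag key `department`, the function name, and the statement IDs are assumptions, while `aws:ResourceTag`, `aws:PrincipalTag`, and `aws:RequestTag` are global IAM condition keys.

```python
from typing import Any, Dict


def abac_policy(tag_key: str) -> Dict[str, Any]:
    """Identity policy sketch covering tag-based access control and tag-on-create."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Tag-based access control: allow data plane calls only when the
                # queue's tag value matches the caller's own tag value.
                "Sid": "AllowQueueAccessWhenTagsMatch",
                "Effect": "Allow",
                "Action": ["sqs:SendMessage", "sqs:ReceiveMessage", "sqs:DeleteMessage"],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        f"aws:ResourceTag/{tag_key}": f"${{aws:PrincipalTag/{tag_key}}}"
                    }
                },
            },
            {
                # Tag-on-create: deny queue creation when the request carries no
                # value for the required tag key.
                "Sid": "DenyCreateQueueWithoutTag",
                "Effect": "Deny",
                "Action": "sqs:CreateQueue",
                "Resource": "*",
                "Condition": {"Null": {f"aws:RequestTag/{tag_key}": "true"}},
            },
        ],
    }


department_policy = abac_policy("department")  # hypothetical tag key
```

Because the first statement compares tags instead of naming queues, new queues become accessible to the right teams as soon as they are tagged, with no policy change.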
Reliability Pillar
The
Best practice: Configure dead-letter queues
In a distributed system, when messages flow between sub-systems, there is a possibility that some messages may not be processed right away. This could be because of the message being corrupted or downstream processing being temporarily unavailable. In such situations, it is not ideal for the bad message to block other messages in the queue.
In the inventory management example, a dead-letter queue (DLQ) plays a vital role in adding message resiliency and preventing situations where a single bad message blocks the processing of other messages. If the backend Lambda function fails after multiple attempts, the inventory update message is redirected to the DLQ. By inspecting these unconsumed messages, you can troubleshoot and redrive them to the primary queue or to a custom destination.
The following Amazon Web Services CDK code snippet creates a DLQ for the source queue and sets up a DLQ policy that only allows messages from the source SQS queue. It is recommended not to set the max_receive_count value to 1, especially when using a Lambda function as the consumer, to avoid accumulating many messages in the DLQ.
# Create the dead-letter queue (DLQ)
dlq = sqs.Queue(self, "InventoryUpdatesDlq", visibility_timeout=Duration.seconds(300))
# Create the SQS queue with DLQ setting
queue = sqs.Queue(
    self,
    "InventoryUpdatesQueue",
    visibility_timeout=Duration.seconds(300),
    dead_letter_queue=sqs.DeadLetterQueue(
        max_receive_count=3,  # Number of receives before the message moves to the DLQ
        queue=dlq,
    ),
)
# Create an SQS queue policy statement to allow the source queue to send messages to the DLQ
policy = iam.PolicyStatement(
    effect=iam.Effect.ALLOW,
    actions=["sqs:SendMessage"],
    principals=[iam.AnyPrincipal()],  # assumption: any principal, narrowed by the condition below
    resources=[dlq.queue_arn],
    conditions={"ArnEquals": {"aws:SourceArn": queue.queue_arn}},
)
# Attach the statement to the DLQ, which is the resource the policy protects
dlq.add_to_resource_policy(policy)
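At the SQS API level, the link between a source queue and its DLQ is carried by two queue attributes, `RedrivePolicy` on the source queue and `RedriveAllowPolicy` on the DLQ. The sketch below builds both as JSON strings; the helper names and ARNs are hypothetical, and it is meant as an illustration rather than a replacement for the CDK snippet above.

```python
import json
from typing import Dict


def source_queue_attributes(dlq_arn: str, max_receive_count: int = 3) -> Dict[str, str]:
    """RedrivePolicy for the source queue: where failed messages go, and after how many receives."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    redrive_policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    return {"RedrivePolicy": json.dumps(redrive_policy)}


def dlq_attributes(source_queue_arn: str) -> Dict[str, str]:
    """RedriveAllowPolicy for the DLQ: only the named source queue may use it as a DLQ."""
    allow_policy = {
        "redrivePermission": "byQueue",
        "sourceQueueArns": [source_queue_arn],
    }
    return {"RedriveAllowPolicy": json.dumps(allow_policy)}


# Placeholder ARNs for illustration only.
source_arn = "arn:aws:sqs:us-east-1:111122223333:InventoryUpdatesQueue"
dlq_arn = "arn:aws:sqs:us-east-1:111122223333:InventoryUpdatesDlq"
source_attrs = source_queue_attributes(dlq_arn, max_receive_count=3)
dlq_attrs = dlq_attributes(source_arn)
```

Setting `redrivePermission` to `byQueue` keeps unrelated queues from using the same DLQ, which makes it easier to trace each failed message back to its origin.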
Best practice: Process messages in a timely manner by configuring the right visibility timeout
Setting the appropriate visibility timeout is crucial for efficient message processing in SQS. The visibility timeout is the period during which SQS prevents other consumers from receiving and processing a message after it has been polled from the queue.
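To make the mechanism concrete, here is a toy in-memory model of visibility timeout. It only illustrates the behavior described above and is not how SQS is implemented; the class `ToyQueue` and its methods are invented for this sketch, and time is passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyQueue:
    """Toy model: a received message stays hidden until its visibility timeout elapses."""

    visibility_timeout: float
    # Maps message body -> time at which the message becomes visible again.
    _visible_at: Dict[str, float] = field(default_factory=dict)

    def send(self, body: str, now: float) -> None:
        self._visible_at[body] = now

    def receive(self, now: float) -> Optional[str]:
        for body, visible_at in self._visible_at.items():
            if visible_at <= now:
                # Hide the message from other consumers for the timeout period.
                self._visible_at[body] = now + self.visibility_timeout
                return body
        return None

    def delete(self, body: str) -> None:
        # A consumer deletes the message once processing succeeds.
        self._visible_at.pop(body, None)


toy = ToyQueue(visibility_timeout=30)
toy.send("sku-123", now=0)
first = toy.receive(now=1)         # a consumer receives the message
hidden = toy.receive(now=10)       # inside the 30-second window, nothing is returned
redelivered = toy.receive(now=31)  # never deleted, so it becomes visible again
```

The last call shows why a consumer that fails silently delays reprocessing by the full timeout, and why a timeout shorter than the processing time leads to duplicate deliveries.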
To determine the ideal visibility timeout for your application, consider your specific use case. If your application typically processes messages within a few seconds, set the visibility timeout to a few minutes. This ensures that multiple consumers don’t process the message simultaneously. If your application requires more time to process messages, consider breaking them down into smaller units or batching them to improve performance.
If a message fails to process and is returned to the queue, it will not be available for processing again until the visibility timeout period has elapsed. Increasing the visibility timeout will increase the overall latency of your application. Therefore, it’s important to balance the tradeoff between reducing the likelihood of message duplication and maintaining a responsive application.
In the inventory management example, setting the right visibility timeout helps the application fail fast and improve the message processing times. Since the Lambda function typically processes messages within milliseconds, a visibility timeout of 30 seconds is set in the following Amazon Web Services CDK code snippet.
queue = sqs.Queue(
    self,
    "InventoryUpdatesQueue",
    visibility_timeout=Duration.seconds(30),
)
It is recommended to keep the visibility timeout to at least six times the Lambda function timeout, plus the value of MaximumBatchingWindowInSeconds. This allows the Lambda function to retry the messages if the invocation fails.
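As a worked example of sizing the timeout for a Lambda consumer, the Lambda documentation suggests a queue visibility timeout of at least six times the function timeout plus the value of MaximumBatchingWindowInSeconds. The helper below is a hypothetical sketch of that arithmetic, and the input values are assumptions chosen for illustration.

```python
def recommended_visibility_timeout(
    function_timeout_seconds: int,
    max_batching_window_seconds: int = 0,
) -> int:
    """Lower bound for the queue visibility timeout when Lambda is the consumer."""
    if function_timeout_seconds <= 0:
        raise ValueError("function_timeout_seconds must be positive")
    if max_batching_window_seconds < 0:
        raise ValueError("max_batching_window_seconds cannot be negative")
    # Six times the function timeout leaves room for retries after a failed
    # or throttled invocation; the batching window is added on top.
    return 6 * function_timeout_seconds + max_batching_window_seconds


# Assumed values: a 3-second function timeout and a 5-second batching window.
minimum_timeout = recommended_visibility_timeout(3, 5)
```

Under these assumed values, a 30-second visibility timeout stays above the computed minimum of 23 seconds.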
Conclusion
This blog post explores best practices for SQS using the Security Pillar and Reliability Pillar of the Amazon Web Services Well-Architected Framework. We discuss various best practices and considerations to ensure the security of SQS. By following these best practices, you can create a robust and secure messaging system using SQS. We also highlight fault tolerance and processing a message in a timely manner as important aspects of building reliable applications using SQS.
For more serverless learning resources, visit