Introducing Amazon Web Services Lambda response streaming
Today, Lambda is introducing support for response payload streaming. You can use response streaming to send response data to callers as it becomes available. This can improve performance for web and mobile applications. Response streaming also allows you to build functions that return larger payloads and perform long-running operations while reporting incremental progress.
In traditional request-response models, the response needs to be fully generated and buffered before it is returned to the client. This can delay the time to first byte (TTFB) and slow latency-sensitive applications.
Other applications may have large payloads, like images, videos, large documents, or database results. Response streaming lets you transfer these payloads back to the client without having to buffer the entire payload in memory. You can use response streaming to send responses larger than Lambda’s 6 MB response payload limit up to a soft limit of 20 MB.
Response streaming currently supports the Node.js 14.x and subsequent managed runtimes. You can also implement response streaming using custom runtimes. You can progressively stream response payloads through Lambda function URLs or with the Amazon Web Services SDK.
Writing response streaming enabled functions
Writing the handler for response streaming functions differs from typical Node handler patterns. To indicate to the runtime that Lambda should stream your function’s responses, you must wrap your function handler with the streamifyResponse() decorator. This tells the runtime to use the correct stream logic path, allowing the function to stream responses.
This is an example handler with response streaming enabled:
exports.handler = awslambda.streamifyResponse(
  async (event, responseStream, context) => {
    responseStream.setContentType("text/plain");
    responseStream.write("Hello, world!");
    responseStream.end();
  }
);
The streamifyResponse decorator accepts the additional parameter responseStream, besides the default Node handler parameters event and context.
The new responseStream object provides a stream object that your function can write data to. Data written to this stream is sent immediately to the client. You can optionally set the Content-Type header of the response to pass additional metadata to your client about the contents of the stream.
Writing to the response stream
The responseStream object implements Node’s write() method to write information to the stream. However, we recommend that you use pipeline() wherever possible to write to the stream. This can improve performance, ensuring that a faster readable stream does not overwhelm the writable stream.
An example function using pipeline(), showing how you can stream compressed data:
const pipeline = require("util").promisify(require("stream").pipeline);
const zlib = require('zlib');
const { Readable } = require('stream');
exports.gzip = awslambda.streamifyResponse(async (event, responseStream, _context) => {
// As an example, convert event to a readable stream.
const requestStream = Readable.from(Buffer.from(JSON.stringify(event)));
await pipeline(requestStream, zlib.createGzip(), responseStream);
});
Ending the response stream
When using the write() method, you must end the stream before the handler returns. Use responseStream.end() to signal that you are not writing any more data to the stream. This is not required if you write to the stream with pipeline().
Reading streamed responses
Response streaming introduces a new InvokeWithResponseStream API. You can read a streamed response from your function via a Lambda function URL or use the Amazon Web Services SDK to call the new API directly.
Neither API Gateway nor Lambda’s target integration with Application Load Balancer supports chunked transfer encoding, so neither supports faster TTFB for streamed responses. You can, however, use response streaming with API Gateway to return larger payload responses, up to API Gateway’s 10 MB limit. To implement this, you must configure an HTTP_PROXY integration between your API Gateway and a Lambda function URL, instead of using the LAMBDA_PROXY integration.
You can also configure CloudFront with a function URL as origin. When streaming responses through a function URL and CloudFront, you can have faster TTFB performance and return larger payload sizes.
Using Lambda response streaming with function URLs
You can configure a function URL to invoke your function and stream the raw bytes back to your HTTP client via chunked transfer encoding. You configure the function URL to use the new InvokeWithResponseStream API by changing the invoke mode of your function URL from the default BUFFERED to RESPONSE_STREAM.
RESPONSE_STREAM enables your function to stream payload results as they become available if you wrap the function with the streamifyResponse() decorator. Lambda invokes your function using the InvokeWithResponseStream API. If InvokeWithResponseStream invokes a function that is not wrapped with streamifyResponse(), Lambda does not stream the response. However, the response is subject to the InvokeWithResponseStream API endpoint’s 20 MB soft limit rather than the 6 MB size limit.
You can configure the invoke mode in an Amazon Web Services SAM or CloudFormation template using the InvokeMode property:

MyFunctionUrl:
  Type: AWS::Lambda::Url
  Properties:
    TargetFunctionArn: !Ref StreamingFunction
    AuthType: AWS_IAM
    InvokeMode: RESPONSE_STREAM
Using generic HTTP client libraries with function URLs
Each language or framework may use different methods to form an HTTP request and parse a streamed response. Some HTTP client libraries only return the response body after the server closes the connection. These clients do not work with functions that return a response stream. To get the benefit of response streams, use an HTTP client that returns response data incrementally. Many HTTP client libraries already support streamed responses, including the Apache HttpClient for Java, Node’s built-in http client, and Python’s requests and urllib3 packages. Consult the documentation for the HTTP library that you are using.
Example applications
There are a number of example Lambda streaming applications in the Serverless Patterns repository. Clone the repository and explore the examples. The README file in each pattern folder contains additional information.
git clone https://github.com/aws-samples/serverless-patterns/
cd serverless-patterns
Time to first byte using write()
- To show how streaming improves time to first byte, deploy the lambda-streaming-ttfb-write-sam pattern.
- Use Amazon Web Services SAM to deploy the resources to your Amazon Web Services account. Run a guided deployment to set the default parameters for the first deployment.
- Enter a Stack Name and accept the initial defaults.
- Use curl with your Amazon Web Services credentials to view the streaming response, as the URL uses Amazon Web Services Identity and Access Management (IAM) for authorization. Replace the URL and Region parameters for your deployment.
cd lambda-streaming-ttfb-write-sam
sam deploy -g --stack-name lambda-streaming-ttfb-write-sam
For subsequent deployments you can use sam deploy.
Amazon Web Services SAM deploys a Lambda function with streaming support and a function URL.

Amazon Web Services SAM deploy -g
Once the deployment completes, Amazon Web Services SAM provides details of the resources.

Amazon Web Services SAM resources
The Amazon Web Services SAM output returns a Lambda function URL.
curl --request GET https://<url>.lambda-url.<Region>.on.aws/ --user AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY --aws-sigv4 'aws:amz:<Region>:lambda'
You can see the gradual display of the streamed response.

Using curl to stream the response from the write() function
Time to first byte using pipeline()
- To try an example using pipeline(), deploy the lambda-streaming-ttfb-pipeline-sam pattern.
- Use Amazon Web Services SAM to deploy the resources to your Amazon Web Services account. Run a guided deployment to set the default parameters for the first deployment.
- Enter a Stack Name and accept the initial defaults.
- Use curl with your Amazon Web Services credentials to view the streaming response. Replace the URL and Region parameters for your deployment.
cd ..
cd lambda-streaming-ttfb-pipeline-sam
sam deploy -g --stack-name lambda-streaming-ttfb-pipeline-sam
curl --request GET https://<url>.lambda-url.<Region>.on.aws/ --user AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY --aws-sigv4 'aws:amz:<Region>:lambda'
You can see the pipelined response stream returned.

Using curl to stream response from function
Large payloads
- To show how streaming enables you to return larger payloads, deploy the lambda-streaming-large-sam application. Amazon Web Services SAM deploys a Lambda function that returns a 7 MB PDF file, which is larger than Lambda’s non-streaming 6 MB response payload limit.
- The Amazon Web Services SAM output returns a Lambda function URL. Use curl with your Amazon Web Services credentials to view the streaming response.
cd ..
cd lambda-streaming-large-sam
sam deploy -g --stack-name lambda-streaming-large-sam
curl --request GET https://<url>.lambda-url.<Region>.on.aws/ --user AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY --aws-sigv4 'aws:amz:<Region>:lambda' -o SVS401-ri22.pdf -w '%{content_type}'
This downloads the PDF file SVS401-ri22.pdf to your current directory and displays the content type as application/pdf.
You can also use API Gateway to return a large payload with an HTTP_PROXY integration with a Lambda function URL.
Invoking a function with response streaming using the Amazon Web Services SDK
You can use the Amazon Web Services SDK to stream responses directly from the new Lambda InvokeWithResponseStream API. This provides additional functionality, such as handling midstream errors, which can be helpful when building, for example, internal microservices. Response streaming is supported with the Amazon Web Services SDK for JavaScript v3.
The SDK response returns an event stream that you can read from. The event stream contains two event types. PayloadChunk contains a raw binary buffer with partial response data received by the client. InvokeComplete signals that the function has completed sending data. It also contains additional metadata, such as whether the function encountered an error in the middle of the stream. Errors can include unhandled exceptions thrown by your function code and function timeouts.
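The two event types above can be consumed with a loop like the following sketch. drainEventStream is a hypothetical helper, and the mock async iterable stands in for a real SDK response so the example runs locally; it assumes the v3 SDK’s PayloadChunk/InvokeComplete event shape described above.

```javascript
// Sketch: draining an InvokeWithResponseStream-style event stream.
// Assumes events shaped like { PayloadChunk: { Payload } } and
// { InvokeComplete: { ErrorCode? } }, as described in the text.
async function drainEventStream(eventStream) {
  const decoder = new TextDecoder();
  let body = "";
  for await (const event of eventStream) {
    if (event.PayloadChunk) {
      body += decoder.decode(event.PayloadChunk.Payload); // partial response bytes
    } else if (event.InvokeComplete && event.InvokeComplete.ErrorCode) {
      // Midstream errors (unhandled exceptions, timeouts) surface here.
      throw new Error(`midstream error: ${event.InvokeComplete.ErrorCode}`);
    }
  }
  return body;
}

// Mock event stream mimicking the SDK's async iterable response.
async function* mockEventStream() {
  yield { PayloadChunk: { Payload: Buffer.from("Hello, ") } };
  yield { PayloadChunk: { Payload: Buffer.from("world!") } };
  yield { InvokeComplete: {} };
}

let result;
drainEventStream(mockEventStream()).then((body) => {
  result = body;
  console.log(body);
});
```

Because InvokeComplete arrives after the payload chunks, error metadata can describe failures that happened partway through the stream, after some data has already been delivered.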
Using the Amazon Web Services SDK for JavaScript v3
- To see how to use the Amazon Web Services SDK to stream responses from a function, deploy the lambda-streaming-sdk-sam pattern.
- Enter a Stack Name and accept the initial defaults.
cd ..
cd lambda-streaming-sdk-sam
sam deploy -g --stack-name lambda-streaming-sdk-sam
Amazon Web Services SAM deploys three Lambda functions with streaming support.
- HappyPathFunction: Returns a full stream.
- MidstreamErrorFunction: Simulates an error midstream.
- TimeoutFunction: Times out before the stream completes.
npm install @aws-sdk/client-lambda
node index.mjs
You can see each function’s output and how the midstream and timeout errors are returned to the SDK client.

Streaming midstream error

Streaming timeout error
Quotas and pricing
Streaming responses incur an additional cost for network transfer of the response payload. You are billed based on the number of bytes generated and streamed out of your Lambda function over the first 6 MB. For more information, see the Lambda pricing page.
There is an initial maximum response size of 20 MB, which is a soft limit you can increase. The first 6 MB of the response payload is streamed without any bandwidth constraints. Beyond 6 MB, there is a maximum bandwidth throughput limit of 16 Mbps (2 MB/s).
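A back-of-envelope calculation shows what the bandwidth cap means in practice. minStreamSeconds is an illustrative helper based only on the quotas above: the first 6 MB is unconstrained, and the remainder is capped at 2 MB/s. Real timings also depend on function execution and network conditions.

```javascript
// Rough lower bound on streaming time under the quotas above:
// the first 6 MB is unconstrained, anything beyond is capped at 2 MB/s.
function minStreamSeconds(payloadMB) {
  const throttledMB = Math.max(0, payloadMB - 6); // data beyond the free 6 MB
  return throttledMB / 2; // seconds at 2 MB/s
}

console.log(minStreamSeconds(20)); // a 20 MB response takes at least 7 seconds
console.log(minStreamSeconds(5));  // under 6 MB, the cap never applies: 0
```

So a payload at the 20 MB soft limit spends at least 7 seconds in the throttled region, which is worth factoring into client timeouts.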
Conclusion
Today, Lambda introduced support for response payload streaming. Response streaming can improve TTFB performance for web and mobile applications and lets functions return payloads larger than Lambda’s 6 MB response limit.
There are a number of example Lambda streaming applications in the Serverless Patterns repository.
Lambda response streaming support is also available through many popular frameworks and tools.
For more serverless learning resources, visit Serverless Land.