Amazon S3 has various features you can use to organize and manage your data in ways that support specific use cases, enable cost efficiencies, enforce security, and meet compliance requirements. Data is stored as objects within resources called “buckets”, and a single object can be up to 5 terabytes in size. S3 features include capabilities to append metadata tags to objects, move and store data across the S3 Storage Classes, configure and enforce data access controls, secure data against unauthorized users, run big data analytics, and monitor data at the object and bucket levels. Objects can be accessed through S3 Access Points or directly through the bucket hostname.

  • Each object is stored in a bucket and retrieved via a unique, developer-assigned key.
  • Objects stored in a specific Region never leave that Region unless you transfer them out.
  • Authentication mechanisms are provided to ensure that data is kept secure from unauthorized access. Objects can be made private or public, and rights can be granted to specific users.
  • Uses standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit.
  • Built to be flexible so that protocol or functional layers can easily be added. The default download protocol is HTTP, and the S3 API also supports HTTPS. Amazon CLI and SDK use secure HTTPS connections by default.
  • Provides functionality to simplify manageability of data through its lifetime. Includes options for segregating data by buckets, monitoring and controlling spend, and automatically archiving data to even lower cost storage options. These options can be easily administered from the Amazon S3 Management Console.

Protecting Your Data

Data stored in Amazon S3 is secure by default; only bucket and object owners have access to the Amazon S3 resources they create. Amazon S3 supports multiple access control mechanisms. With Amazon S3’s data protection features, you can protect your data from both logical and physical failures, guarding against data loss from unintended user actions, application errors, and infrastructure failures. For customers who must comply with regulatory standards, Amazon S3’s data protection features can be used as part of an overall strategy to achieve compliance. The various data security and reliability features offered by Amazon S3 are described in detail below.

Amazon S3 offers flexible security features to block unauthorized users from accessing your data. Use gateway VPC endpoints and interface VPC endpoints to connect to S3 resources from your Amazon Virtual Private Cloud (Amazon VPC) and from on-premises. Amazon S3 supports both server-side encryption (with three key management options) and client-side encryption for data uploads. Use S3 Inventory to check the encryption status of your S3 objects.

Audit Logs

Amazon S3 also supports logging of requests made against your Amazon S3 resources. You can configure your Amazon S3 bucket to create access log records for the requests made against it. These server access logs capture all requests made against a bucket or the objects in it and can be used for auditing purposes.

Versioning

Amazon S3 provides further protection with versioning capability. You can use versioning to preserve, retrieve, and restore every version of every object stored in your Amazon S3 bucket. This allows you to easily recover from both unintended user actions and application failures. By default, requests will retrieve the most recently written version. Older versions of an object can be retrieved by specifying a version in the request. Storage rates apply for every version stored. You can configure lifecycle rules to automatically control the lifetime and cost of storing multiple versions.
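
The retrieval behavior described above can be illustrated with a toy in-memory model. This is only a sketch of the semantics, not the S3 API: the `VersionedBucket` class, its method names, and the `v1`, `v2` style version IDs are all hypothetical.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not the S3 API)."""

    def __init__(self):
        self._versions = {}            # key -> list of (version_id, data), oldest first
        self._ids = itertools.count(1)

    def put(self, key, data):
        """Store a new version and return its (hypothetical) version ID."""
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def get(self, key, version_id=None):
        """Return the latest version by default, or a specific older version."""
        versions = self._versions[key]
        if version_id is None:
            return versions[-1][1]
        for vid, data in versions:
            if vid == version_id:
                return data
        raise KeyError(f"{key} has no version {version_id}")


bucket = VersionedBucket()
first = bucket.put("report.csv", b"draft")
bucket.put("report.csv", b"final")
latest = bucket.get("report.csv")            # most recently written version
restored = bucket.get("report.csv", first)   # explicitly requested older version
```

Every call to `put` keeps the earlier data, which is why storage rates apply to each version until a lifecycle rule removes it.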

Data Security Details

Amazon S3 supports several mechanisms that give you flexibility to control who can access your data as well as how, when, and where they can access it. Amazon S3 provides four different access control mechanisms: Identity and Access Management (IAM) policies, Access Control Lists (ACLs), bucket policies, and query string authentication. IAM enables organizations with multiple employees to create and manage multiple users under a single Amazon Web Services account. With IAM policies, you can grant IAM users fine-grained access to your Amazon S3 bucket or objects. You can use ACLs to selectively add (grant) certain permissions on individual objects. Amazon S3 bucket policies can be used to add or deny permissions across some or all of the objects within a single bucket. With query string authentication, you can share Amazon S3 objects through URLs that are valid for a predefined expiration time.
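
As a rough illustration of the bucket policy mechanism, the sketch below builds a policy document that lets one IAM user read objects under a single prefix. The bucket name, prefix, account ID, and user name are hypothetical, and the `aws-cn` ARN partition is an assumption based on this page describing the Amazon Web Services China Regions.

```python
import json


def read_only_prefix_policy(bucket, prefix, principal_arn, partition="aws-cn"):
    """Build a bucket policy document allowing s3:GetObject under one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadUnderPrefix",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:GetObject",
                # Trailing * matches every key that starts with the prefix.
                "Resource": f"arn:{partition}:s3:::{bucket}/{prefix}*",
            }
        ],
    }


# Hypothetical names, for illustration only.
policy = read_only_prefix_policy(
    bucket="example-bucket",
    prefix="reports/",
    principal_arn="arn:aws-cn:iam::111122223333:user/analyst",
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The resulting JSON is the kind of document you would attach to the bucket through the console, the REST API, or an SDK.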

You can securely upload/download your data to Amazon S3 via the SSL endpoints using the HTTPS protocol.
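
If you want to require HTTPS rather than merely allow it, one common approach is a bucket policy that denies any request made without TLS. The following is a minimal sketch of such a policy document; the bucket name is hypothetical and the `aws-cn` partition is an assumption for the China Regions.

```python
import json


def deny_insecure_transport_policy(bucket, partition="aws-cn"):
    """Build a bucket policy that denies every S3 action sent without TLS."""
    bucket_arn = f"arn:{partition}:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both bucket-level and object-level requests.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                # aws:SecureTransport is "false" for plain HTTP requests.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


https_only = deny_insecure_transport_policy("example-bucket")  # hypothetical bucket
print(json.dumps(https_only, indent=2))
```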

For more information on the security features available in Amazon S3, please refer to the Access Control topic in the Amazon S3 Developer Guide.

Amazon PrivateLink for Amazon S3

Amazon PrivateLink for S3 provides private connectivity between Amazon S3 and on-premises. You can provision interface VPC endpoints for S3 in your VPC to connect your on-premises applications directly with S3 over Amazon Direct Connect. Requests to interface VPC endpoints for S3 are automatically routed to S3 over the Amazon Web Services China network. You can set security groups and configure VPC endpoint policies for your interface VPC endpoints for additional access controls.

Data Durability and Reliability

Amazon S3 provides a highly durable storage infrastructure designed for mission-critical and primary data storage. Amazon S3 redundantly stores data in multiple facilities and on multiple devices within each facility. To increase durability, Amazon S3 synchronously stores your data across multiple facilities before confirming that the data has been successfully stored. In addition, Amazon S3 calculates checksums on all network traffic to detect corruption of data packets when storing or retrieving data. Unlike traditional systems, which can require laborious data verification and manual repair, Amazon S3 performs regular, systematic data integrity checks and is built to be automatically self-healing.

Standard is:

  • Backed with the Amazon S3 Service Level Agreement for availability.
  • Designed for 99.999999999% durability and 99.99% availability of objects over a given year.
  • Designed to sustain the concurrent loss of data in two facilities.

Standard - Infrequent Access is:

  • Backed with the Amazon S3 Service Level Agreement for availability.
  • Designed for 99.999999999% durability and 99.9% availability of objects over a given year.
  • Designed to sustain the concurrent loss of data in two facilities.

Amazon S3 Glacier is:

  • Designed for 99.999999999% durability of objects over a given year.
  • Designed to sustain the concurrent loss of data in two facilities.
  • Designed for configurable retrieval times, from minutes to hours.

Amazon S3 Glacier Deep Archive is:

  • Designed for durability of 99.999999999% of objects across multiple Availability Zones
  • Lowest cost storage class designed for long-term retention of data that will be retained for 7-10 years
  • Ideal alternative to magnetic tape libraries
  • Retrieval time within 12 hours

Amazon S3 Intelligent-Tiering is:

  • Designed for durability of 99.999999999% of objects across multiple Availability Zones
  • Designed for 99.9% availability over a given year
  • Automatically optimizes storage costs for data with changing access patterns
  • Stores objects in four access tiers, optimized for frequent, infrequent, and rare access
  • Frequent Access and Infrequent Access tiers have the same low latency and high throughput performance as S3 Standard
  • Activate optional automatic archive capabilities for objects that become rarely accessed
  • Archive Access and Deep Archive Access tiers have the same performance as Amazon S3 Glacier and Amazon S3 Glacier Deep Archive
  • Small monthly monitoring and auto-tiering fee
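
The design targets listed above can be turned into more tangible numbers with a little arithmetic. The sketch below is illustrative only: it treats the percentages as simple annual rates, and the 10,000,000-object count is a made-up example. The figures are design targets, not guarantees.

```python
from fractions import Fraction

MINUTES_PER_YEAR = 365 * 24 * 60


def expected_annual_losses(durability_percent, object_count):
    """Expected number of objects lost per year at the given design durability."""
    loss_probability = 1 - Fraction(durability_percent) / 100
    return loss_probability * object_count


def allowed_downtime_minutes(availability_percent):
    """Minutes per 365-day year that fall outside the design availability."""
    return (1 - Fraction(availability_percent) / 100) * MINUTES_PER_YEAR


# Percentages are passed as strings so Fraction parses them exactly.
losses = expected_annual_losses("99.999999999", 10_000_000)
print(f"Expected losses per year: {float(losses)}")
print(f"Years per expected loss:  {float(1 / losses):.0f}")
print(f"99.99% availability: {float(allowed_downtime_minutes('99.99')):.2f} min/year")
print(f"99.9% availability:  {float(allowed_downtime_minutes('99.9')):.2f} min/year")
```

Under these assumptions, eleven nines of durability across ten million objects works out to an expected loss of one object roughly every 10,000 years, while the availability targets correspond to about 53 minutes (99.99%) and about 8.8 hours (99.9%) per year.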

Storage Management

Amazon S3 makes it easy to manage your data by giving you actionable insight to your data usage patterns and the tools to manage your storage with management policies. All of these management capabilities can be easily administered using the Amazon S3 APIs or Management Console. The various data management features offered by Amazon S3 are described in detail below.

S3 Object Tagging

With Amazon S3 Object Tagging, you can manage and control access for Amazon S3 objects. S3 Object Tags are key-value pairs applied to S3 objects that can be created, updated, or deleted at any time during the lifetime of the object. With these, you can create Identity and Access Management (IAM) policies, set up S3 Lifecycle policies, and customize storage metrics. These object-level tags can then manage transitions between storage classes and expire objects in the background.
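
When tags are supplied at upload time, they are typically passed as a URL-encoded string of key-value pairs. The helper below is a small sketch of that encoding with some basic validation. The function name and example tags are hypothetical, and the limits in the constants reflect our understanding of current S3 limits, so verify them against the service documentation.

```python
from urllib.parse import urlencode

MAX_TAGS_PER_OBJECT = 10   # assumed S3 limit
MAX_KEY_LENGTH = 128       # assumed limit, in characters
MAX_VALUE_LENGTH = 256     # assumed limit, in characters


def encode_object_tags(tags):
    """Validate a dict of tags and encode it as a URL query string."""
    if len(tags) > MAX_TAGS_PER_OBJECT:
        raise ValueError(f"at most {MAX_TAGS_PER_OBJECT} tags per object")
    for key, value in tags.items():
        if not key or len(key) > MAX_KEY_LENGTH:
            raise ValueError(f"invalid tag key: {key!r}")
        if len(value) > MAX_VALUE_LENGTH:
            raise ValueError(f"tag value too long for key {key!r}")
    return urlencode(tags)


# Hypothetical tags, for illustration only.
tagging = encode_object_tags({"project": "apollo", "classification": "internal"})
print(tagging)
```

A string in this form can be supplied as the tagging value on an upload request, for example through the `Tagging` parameter of `put_object` in the Python SDK.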

S3 CloudWatch Metrics

Amazon S3 CloudWatch integration helps you improve your end-user experience by providing integrated monitoring and alarming on a host of different metrics. You can receive 1-minute CloudWatch Metrics, set CloudWatch alarms, and access CloudWatch dashboards to view real-time operations and performance of your Amazon S3 storage. For web and mobile applications that depend on cloud storage, these let you quickly identify and act on operational issues. These 1-minute metrics are available at the S3 bucket level. Additionally, you have the flexibility to define a filter for the metrics collected using a shared prefix or object tag allowing you to align metrics filters to specific business applications, workflows, or internal organizations.

S3 CloudTrail Integration

You can use Amazon CloudTrail to capture bucket-level (Management Events) and object-level API activity (Data Events) on S3 objects. Data Events include read operations such as GET, HEAD, and Get Object ACL, as well as write operations such as PUT and POST. The detail captured provides support for many types of security, auditing, governance, and compliance use cases.

Data Lifecycle Management

Amazon S3 can automatically assign and change cost and performance characteristics as your data evolves. It can even automate common data lifecycle management tasks, including capacity provisioning, automatic migration to lower cost tiers, regulatory compliance policies, and eventual scheduled deletions.

As your data ages, Amazon S3 takes care of automatically and transparently migrating your data to new hardware as hardware fails or reaches its end of life. This eliminates the need for you to perform expensive, time-consuming, and risky hardware migrations. You can set lifecycle policies that direct Amazon S3 to automatically migrate your data to lower cost storage as your data ages. You can define rules to automatically migrate Amazon S3 objects to Standard - Infrequent Access (Standard - IA), Amazon S3 Glacier, or Amazon S3 Glacier Deep Archive based on the age of the data. You can set lifecycle policies by bucket, prefix, or object tags, allowing you to specify the granularity most suited to your use case.

When your data reaches its end of life, Amazon S3 provides programmatic options for recurring and high volume deletions. For recurring deletions, rules can be defined to remove sets of objects after a predefined time period. These rules can be applied to objects stored in Standard or Standard - IA, and objects that have been archived to Amazon S3 Glacier or Amazon S3 Glacier Deep Archive.
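
To make the age-based rules above concrete, here is a minimal sketch that builds a lifecycle configuration as a plain dictionary. The rule ID, prefix, and day thresholds are hypothetical; the dictionary layout follows the shape the Python SDK accepts for `put_bucket_lifecycle_configuration`, which is not called here.

```python
def tiering_rule(rule_id, prefix, ia_days=30, glacier_days=90,
                 deep_archive_days=180, expire_days=365):
    """Build one lifecycle rule that steps objects down through cheaper classes."""
    if not ia_days < glacier_days < deep_archive_days < expire_days:
        raise ValueError("day thresholds must be strictly increasing")
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},           # apply only to keys under this prefix
        "Transitions": [
            {"Days": ia_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_days, "StorageClass": "GLACIER"},
            {"Days": deep_archive_days, "StorageClass": "DEEP_ARCHIVE"},
        ],
        "Expiration": {"Days": expire_days},    # scheduled deletion at end of life
    }


# Hypothetical rule: archive and eventually delete objects under logs/.
lifecycle_configuration = {"Rules": [tiering_rule("archive-logs", "logs/")]}
```

Keeping the thresholds as parameters makes it easy to review how long data stays in each class before you apply the rule to a bucket.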

You can also define lifecycle rules on versions of your Amazon S3 objects to reduce storage costs. For example, you can create rules to automatically and cleanly delete older versions of your objects when these versions are no longer needed, saving money and improving performance. Alternatively, you can create rules to automatically migrate older versions to Standard - IA, Amazon S3 Glacier, or Amazon S3 Glacier Deep Archive to further reduce your storage costs.
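
A version-aware rule might look like the following sketch, which archives older (noncurrent) versions and later deletes them. The rule ID, prefix, and day counts are hypothetical; the field names follow the lifecycle rule shape used by the Python SDK.

```python
def noncurrent_version_rule(rule_id, prefix, archive_after_days=30,
                            delete_after_days=365):
    """Build a lifecycle rule that archives, then deletes, older object versions."""
    if archive_after_days >= delete_after_days:
        raise ValueError("versions must be archived before they are deleted")
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        # Days are counted from the moment a version stops being the current one.
        "NoncurrentVersionTransitions": [
            {"NoncurrentDays": archive_after_days, "StorageClass": "GLACIER"}
        ],
        "NoncurrentVersionExpiration": {"NoncurrentDays": delete_after_days},
    }


# Hypothetical rule for a versioned documents/ prefix.
version_rule = noncurrent_version_rule("trim-old-versions", "documents/")
```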

Amazon S3 Intelligent-Tiering

Amazon S3 Intelligent-Tiering (S3 Intelligent-Tiering) is the only cloud storage class that delivers automatic cost savings by moving objects between four access tiers when access patterns change. The S3 Intelligent-Tiering storage class is designed to optimize costs by automatically moving data to the most cost-effective access tier, without operational overhead. It works by storing objects in four access tiers: two low-latency access tiers optimized for frequent and infrequent access, and two optional archive access tiers designed for asynchronous access that are optimized for rare access.

Objects uploaded or transitioned to S3 Intelligent-Tiering are automatically stored in the Frequent Access tier. For a small monthly monitoring and automation fee per object, Amazon S3 monitors access patterns of the objects in S3 Intelligent-Tiering and moves objects that have not been accessed for 30 consecutive days to the Infrequent Access tier. You can activate one or both archive access tiers to automatically move objects that haven’t been accessed for 90 days to the Archive Access tier and then after 180 days to the Deep Archive Access tier. If the objects are accessed later, S3 Intelligent-Tiering moves them back to the Frequent Access tier. This means all objects stored in S3 Intelligent-Tiering are always available when needed.

There are no retrieval fees when using the S3 Intelligent-Tiering storage class, and no additional tiering fees when objects are moved between access tiers. There is no minimum storage duration for S3 Intelligent-Tiering. Objects smaller than 128 KB are not eligible for auto-tiering. You can store these smaller objects in S3 Intelligent-Tiering, but they will not be monitored and will always be charged at the Frequent Access tier rates, with no monitoring and automation fee. It is the ideal storage class for data with access patterns that are unknown or unpredictable.
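
The tiering rules described above can be summarized as a small decision function. This is only a sketch of the documented thresholds, not how the service is implemented; the function name is hypothetical, and treating 128 KB as 128 × 1024 bytes is an assumption.

```python
MIN_AUTO_TIERING_BYTES = 128 * 1024   # assumes 1 KB = 1024 bytes


def access_tier(size_bytes, days_since_last_access,
                archive_enabled=False, deep_archive_enabled=False):
    """Return the access tier an object would sit in under the stated rules."""
    if size_bytes < MIN_AUTO_TIERING_BYTES:
        return "Frequent Access"          # too small to be monitored or moved
    if deep_archive_enabled and days_since_last_access >= 180:
        return "Deep Archive Access"
    if archive_enabled and days_since_last_access >= 90:
        return "Archive Access"
    if days_since_last_access >= 30:
        return "Infrequent Access"
    return "Frequent Access"


for days in (10, 45, 120, 200):
    print(days, access_tier(1_000_000, days, archive_enabled=True,
                            deep_archive_enabled=True))
```

Accessing an object resets its idle time to zero, which in this model is what returns it to the Frequent Access tier.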

Cost Monitoring and Controls

Amazon S3 offers several features for managing and controlling your costs. You can use the Amazon Web Services Management Console or the Amazon S3 APIs to apply tags to your Amazon S3 buckets, enabling you to allocate your costs across multiple business dimensions, including cost centers, application names, or owners. You can then view breakdowns of these costs using Amazon Web Services’ Cost Allocation Reports, which show your usage and costs aggregated by your tags. For more information on tagging your S3 buckets, please see the Bucket Tagging topic in the Amazon S3 Developer Guide.

Transferring Large Amounts of Data

You can use Amazon Direct Connect to transfer large amounts of data to Amazon S3. Amazon Direct Connect makes it easy to establish a dedicated network connection from your premises to Amazon Web Services. Using Amazon Direct Connect, you can establish private connectivity between Amazon Web Services and your datacenter, office, or colocation environment, which in many cases can reduce your network costs, increase bandwidth throughput, and provide a more consistent network experience than Internet-based connections.

Event Notifications

Amazon S3 event notifications can be sent in response to actions taken on objects uploaded or stored in Amazon S3. Notification messages can be sent through either Amazon SNS or Amazon SQS, or delivered directly to Amazon Lambda to invoke Amazon Lambda functions.

Amazon S3 event notifications enable you to run workflows, send alerts, or perform other actions in response to changes in your objects stored in Amazon S3. You can use Amazon S3 event notifications to set up triggers to perform actions including transcoding media files when they are uploaded, processing data files when they become available, and synchronizing Amazon S3 objects with other data stores. You can also set up event notifications based on object name prefixes and suffixes. For example, you can choose to receive notifications on object names that start with “images/”.

Amazon S3 event notifications are set up at the bucket level, and you can configure them through the Amazon S3 console, through the REST API, or by using an Amazon SDK.
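
As an illustration of prefix and suffix filtering, the sketch below builds a bucket notification configuration that would invoke a Lambda function for new objects under “images/”. The function ARN is hypothetical, and `key_matches` is only a local approximation of the filter for reasoning about which keys qualify. The dictionary follows the shape accepted by `put_bucket_notification_configuration` in the Python SDK, which is not called here.

```python
def lambda_notification(function_arn, prefix="", suffix="",
                        events=("s3:ObjectCreated:*",)):
    """Build a notification configuration with optional key name filters."""
    rules = []
    if prefix:
        rules.append({"Name": "prefix", "Value": prefix})
    if suffix:
        rules.append({"Name": "suffix", "Value": suffix})
    config = {"LambdaFunctionArn": function_arn, "Events": list(events)}
    if rules:
        config["Filter"] = {"Key": {"FilterRules": rules}}
    return {"LambdaFunctionConfigurations": [config]}


def key_matches(key, prefix="", suffix=""):
    """Local approximation of the prefix/suffix filter, for illustration."""
    return key.startswith(prefix) and key.endswith(suffix)


# Hypothetical function ARN in a China Region.
notification = lambda_notification(
    "arn:aws-cn:lambda:cn-north-1:111122223333:function:make-thumbnail",
    prefix="images/",
    suffix=".jpg",
)
```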

Storage Analytics & Insights

S3 Storage Lens

S3 Storage Lens delivers organization-wide visibility into object storage usage and activity trends, and makes actionable recommendations to improve cost-efficiency and apply data protection best practices. S3 Storage Lens is the first cloud storage analytics solution to provide a single view of object storage usage and activity across hundreds, or even thousands, of accounts in an organization, with drill-downs to generate insights at the account, bucket, or even prefix level. Drawing from more than 14 years of experience helping customers optimize their storage, S3 Storage Lens analyzes organization-wide metrics to deliver contextual recommendations for reducing storage costs and applying data protection best practices.

Storage Class Analysis

With Storage Class Analysis, you can monitor the access frequency of the objects within your S3 bucket in order to transition less frequently accessed storage to a lower cost storage class. Storage Class Analysis observes usage patterns to detect infrequently accessed storage to help you transition the right objects to S3 Standard-IA, S3 One Zone-IA, Amazon S3 Glacier, and Amazon S3 Glacier Deep Archive. You can configure a Storage Class Analysis policy to monitor an entire bucket, a prefix, or an object tag. Once Storage Class Analysis detects that data is a candidate for transition to S3 Standard-IA, S3 One Zone-IA, Amazon S3 Glacier, or Amazon S3 Glacier Deep Archive, you can easily create a new lifecycle policy based on these results. This feature also includes a detailed daily analysis of your storage usage at the specified bucket, prefix, or tag level that you can export to an S3 bucket.

S3 Inventory

You can simplify and speed up business workflows and big data jobs using S3 Inventory, which provides a scheduled alternative to Amazon S3’s synchronous List API. S3 Inventory provides a CSV (comma-separated values) flat-file output of your objects and their corresponding metadata on a daily or weekly basis for an S3 bucket or a shared prefix.
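
Once an inventory report has been delivered, processing it is ordinary CSV handling. The sketch below assumes the report has no header row and that the column order comes from the accompanying manifest; the schema string, sample rows, and function names are made up for illustration.

```python
import csv
import io

# Hypothetical schema and report contents, for illustration only.
FILE_SCHEMA = "Bucket, Key, Size, EncryptionStatus"
SAMPLE_REPORT = (
    '"example-bucket","images/cat.jpg","20480","SSE-S3"\n'
    '"example-bucket","logs/app.log","1024","NOT-SSE"\n'
)


def read_inventory(report_text, file_schema):
    """Parse header-less inventory CSV text into dicts keyed by column name."""
    columns = [name.strip() for name in file_schema.split(",")]
    reader = csv.reader(io.StringIO(report_text))
    return [dict(zip(columns, row)) for row in reader]


def unencrypted_keys(rows):
    """Return the keys of objects reported without server-side encryption."""
    return [row["Key"] for row in rows if row["EncryptionStatus"] == "NOT-SSE"]


rows = read_inventory(SAMPLE_REPORT, FILE_SCHEMA)
print(unencrypted_keys(rows))
```

A check like this is one way to audit encryption status across a bucket without issuing a List request for every prefix.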

Query in Place

S3 Select

Amazon S3 Select is designed to help analyze and process data within an object in Amazon S3 buckets, faster and cheaper. It works by providing the ability to retrieve a subset of data from an object in Amazon S3 using simple SQL expressions. Your applications no longer have to use compute resources to scan and filter the data from an object, potentially increasing query performance by up to 400%, and reducing query costs as much as 80%. You simply change your application to use SELECT instead of GET to take advantage of S3 Select.
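
As a sketch of what such a request might contain, the helper below assembles the parameters for an S3 Select query over a CSV object with a header row. The bucket, key, and column names are hypothetical. The parameter names mirror those of `select_object_content` in the Python SDK, which is not called here.

```python
def select_request(bucket, key, expression):
    """Build request parameters for an SQL query over a CSV object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # USE tells S3 Select to treat the first line as column names.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


# Hypothetical object and columns: return only the rows for one region.
params = select_request(
    "example-bucket",
    "sales/2020.csv",
    "SELECT s.region, s.amount FROM S3Object s WHERE s.region = 'north'",
)
```

Only the matching rows and columns are returned, so the application no longer downloads and scans the whole object.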

Performance

Amazon S3 provides industry-leading performance for cloud object storage. Amazon S3 supports parallel requests, which means you can scale your S3 performance in proportion to the size of your compute cluster, without making any customizations to your application. Performance scales per prefix, so you can use as many prefixes as you need in parallel to achieve the required throughput. There are no limits to the number of prefixes. Amazon S3 supports at least 3,500 requests per second to add data and 5,500 requests per second to retrieve data. Each S3 prefix can support these request rates, making it simple to increase performance significantly.
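
Because the request rates apply per prefix, sizing a workload is simple arithmetic. The sketch below uses the baseline figures quoted above; the function names and target rates are hypothetical.

```python
PUT_RPS_PER_PREFIX = 3_500   # baseline requests per second to add data
GET_RPS_PER_PREFIX = 5_500   # baseline requests per second to retrieve data


def aggregate_rps(prefix_count, per_prefix_rps):
    """Baseline request rate available when spreading load over prefixes."""
    return prefix_count * per_prefix_rps


def prefixes_needed(target_rps, per_prefix_rps):
    """Smallest number of prefixes whose combined baseline meets target_rps."""
    return -(-target_rps // per_prefix_rps)   # integer ceiling division


print(prefixes_needed(55_000, GET_RPS_PER_PREFIX))   # reads
print(prefixes_needed(20_000, PUT_RPS_PER_PREFIX))   # writes
```

For example, a job that needs 55,000 reads per second would spread its keys across at least 10 prefixes, and 20,000 writes per second would need at least 6.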

You do not need to randomize object prefixes to achieve this request rate performance. That means you can use logical or sequential naming patterns in S3 object naming without any performance implications. Refer to the Performance Guidelines for Amazon S3 and Performance Design Patterns for Amazon S3 for the most current information about performance optimization for Amazon S3.

Consistency

Amazon S3 delivers strong read-after-write consistency automatically for all applications for any storage request, without changes to performance or availability, without sacrificing regional isolation for applications, and at no additional cost. With strong consistency, S3 accelerates and simplifies the migration of on-premises analytics workloads, like Apache Spark and Apache Hadoop, by removing the need to make changes to applications, and reduces costs by removing the need for extra infrastructure to provide strong consistency.

Any request for S3 storage is strongly consistent. After a successful write of a new object or an overwrite of an existing object, any subsequent read request immediately receives the latest version of the object. S3 also provides strong consistency for list operations, so after a write, you can immediately perform a listing of the objects in a bucket with any changes reflected.

Intended Usage and Restrictions

Your use of this service is subject to the Amazon Web Services Customer Agreement.