We use machine learning technology to do auto-translation. Click "English" on top navigation bar to check Chinese version.
Get maximum value out of your cloud data warehouse with Amazon Redshift
Every day, customers are challenged with how to manage their growing data volumes and operational costs to unlock the value of data for timely insights and innovation, while maintaining consistent performance. Data creation, consumption, and storage are predicted to grow to 175 zettabytes by 2025, forecasted by the 2022 IDC Global DataSphere report.
As data workloads grow, costs to scale and manage data usage with the right governance typically increase as well. So how do organizational leaders drive their business forward with high performance, controlled costs, and high security? With the right analytics approach, this is possible.
In this post, we look at three key challenges that customers face with growing data and how a modern data warehouse and analytics system like
Building an optimal data system
As data grows at an extraordinary rate, data proliferation across your data stores, data warehouse, and data lakes can become a challenge. Different departments within an organization can place data in a data lake or within their data warehouse depending on the type of data and usage patterns of that department. Teams may place their unstructured data like social media feeds within their
A quintessential example of a company managing analytics on billions of data points across the data lake and the warehouse in a mission-critical business environment is
With Amazon Redshift, Nasdaq was able to query their warehouse and use
Robert Hunt, Vice President of Software Engineering for Nasdaq, shared, “We have to both load and consume the 30 billion records in a time period between market close and the following morning. Data loading delayed the delivery of our reports. We needed to be able to write or load data into our data storage solution very quickly without interfering with the reading and querying of the data at the same time.”
Nasdaq’s massive data growth meant they needed to evolve their data architecture to keep up. They built their foundation of a new data lake on Amazon S3 so they could deliver analytics using Amazon Redshift as a compute layer. Nasdaq’s peak volume of daily data ingestion reached 113 billion records, and they completed data loading for reporting 5 hours faster while running 32% faster queries.
Enabling newer personas with data warehousing and analytics
Another challenge is enabling newer data users and personas with powerful analytics to meet business goals and perform critical decision-making. Where traditionally it was the data engineer and the database administrator who set up and managed the warehouse, today line of business data analysts, data scientists, and developers are all using the data warehouse to get to near-real-time business decision-making.
These personas who don’t have specialized data management or data engineering skills don’t want to be concerned with managing the capacity of their analytics systems to handle unpredictable or spiky data workloads or wait for IT to optimize for cost and capacity. Customers want to get started with analytics on large amounts of data instantly and scale analytics quickly and cost-effectively without infrastructure management.
Take the case of mobile gaming company
Igor Ivanov, Technical Director of Playrix, stated, “Amazon Redshift Serverless is great for achieving the on-demand high performance that we need for massive queries.”
Playrix had a two-fold business goal, including marketing to its end-users (game players) with near-real-time data while also analyzing their historical data for the past 4–5 years. In seeking a solution, Playrix wanted to avoid disrupting other technical processes while also increasing cost savings. The company migrated to Redshift Serverless and scaled up to handle more complicated analytics on 600 TB from the past 5 years, all without storing two copies of the data or disrupting other analytics jobs. With Redshift Serverless, Playrix achieved a more flexible architecture and saved an overall 20% in costs of its marketing stack, decreasing its cost of customer acquisition.
“With no overhead and infrastructure management,” Ivanov shared, “we now have more time for experimenting, developing solutions, and planning new research.”
Breaking down data silos
Organizations need to easily access and analyze diverse types of structured and unstructured data, including log files, clickstreams, voice, and video. However, these wide-ranging data types are typically stored in silos across multiple data stores. To unlock the true potential of the data, organizations must break down these silos to unify and normalize all types of data and ensure that the right people have access to the right data.
Data unification can get expensive fast, with time and cost spent on building complex, custom extract, transform, load (ETL) pipelines that move or copy data from system to system. If not done right, you can end up with data latency issues, inaccuracies, and potential security and data governance risks. Instead, teams are looking for ways to share transactionally consistent, live, first-party and third-party data with each other or their end customers, without data movement or data copying.
Stripe, a payment processing platform for businesses, is an Amazon Redshift customer and a partner with thousands of end customers who require access to Stripe data for their applications. Stripe built the
A modern data architecture with Amazon Redshift
These stories about harnessing maximum value from siloed data across the organization and applying powerful analytics for business insights in a cost-efficient way are possible because of Amazon Web Services’s approach to a modern data architecture for their customers. Within this architecture, Amazon Web Services’s data warehousing solution Amazon Redshift is a fully managed petabyte scale system, deeply integrated with Amazon Web Services database, analytics, and machine learning (ML) services. Tens of thousands of customers use Amazon Redshift every day to run data warehousing and analytics in the cloud and process exabytes of data for business insights. Customers looking for a highly performing, cost-optimized cloud data warehouse solution choose Amazon Redshift for the following reasons:
- Its leadership in price-performance
- The ability to break through data silos for meaningful insights
- Easy analytics capabilities that cut down data engineering and administrative requirements
- Security and reliability features that are offered out of the box, at no additional cost
The price-performance in a cloud data warehouse benchmark metric is simply defined as the cost to perform a particular workload. Knowing how much your data warehouse is going to cost and how performance changes as your user base and data processing increases is crucial for planning, budgeting, and decision-making around choosing the best data warehouse.
Amazon Redshift is able to attain the
The
For example, the integration in Amazon Redshift with
Finally, customers don’t have to pay more to secure their critical data assets.
Conclusion
Overall, customers who choose Amazon Redshift innovate in a new reality where the data warehouse scales up and down automatically as workloads change, and maximizes the value of data for all cornerstones of their business.
For market leaders like Nasdaq, they are able to ingest billions of data points daily for trading and selling at high volume and velocity, all in time for proper billing and trading the following business day. For customers like Playrix, choosing Redshift Serverless means marketing to customers with comprehensive analytics in near-real time without getting bogged down by maintenance and overhead. For Stripe, it also means taking the complexity and TCO out of ETL, removing silos and unifying data.
Although data will continue to grow at unprecedented amounts, your bottom line doesn’t need to suffer. While organizational leaders face the pressures of solving for cost optimization in all types of economic environments, Amazon Redshift gives market leaders a space to innovate without compromising their data value, performance, and budgets of their cloud data warehouse.
About the authors
Sana Ahmed is a Sr. Product Marketing Manager for Amazon Redshift. She is passionate about people, products and problem-solving with product marketing. As a Product Marketer, she has taken 50+ products to market and worked at various different companies including Sprinklr, PayPal and Facebook. Her hobbies include tennis, museum-hopping and fun conversations with friends and family.
Sunaina AbdulSalah leads product marketing for Amazon Redshift. She focuses on educating customers about the impact of data warehousing and analytics and sharing Amazon Web Services customer stories. She has a deep background in marketing and GTM functions in the B2B technology and cloud computing domains. Outside of work, she spends time with her family and friends and enjoys traveling.
The mentioned AWS GenAI Services service names relating to generative AI are only available or previewed in the Global Regions. Amazon Web Services China promotes AWS GenAI Services relating to generative AI solely for China-to-global business purposes and/or advanced technology introduction.