How hc1 Turns Lab Data into Personalized Health Insights Using Amazon Web Services Serverless
By Gokhul Srinivasan, Sr. Partner Solutions Architect, Startup – Amazon Web Services
By Whitney Wilger, Sr. Data Engineer – hc1
Since 2011, hc1 has helped healthcare organizations turn lab data into personalized health insights.
Most hc1 customers are healthcare systems and independent laboratories that store data across disparate systems.
hc1 achieves this using the following solutions:
- hc1 Operations Management: Streamlines multiple areas of laboratory operations, from sales activities to customer and patient relationships to operations initiatives.
- hc1 Analytics: Provides automated reporting and key performance indicator (KPI) tracking in real time.
The data from the above solutions are classified into account, provider, and patient profiles that streamline complex healthcare relationships. In addition, each profile contains lab data attributes comprising orders, results, cases, tasks, and memos.
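The profile classification described above can be sketched as a simple data model. This is an illustrative sketch only; the field and class names are hypothetical and the actual hc1 data model is richer.

```python
from dataclasses import dataclass, field

# Illustrative sketch of the account/provider/patient profile model.
# Names are hypothetical, not hc1's actual schema.

@dataclass
class LabData:
    """Lab data attributes attached to each profile."""
    orders: list = field(default_factory=list)
    results: list = field(default_factory=list)
    cases: list = field(default_factory=list)
    tasks: list = field(default_factory=list)
    memos: list = field(default_factory=list)

@dataclass
class Profile:
    profile_type: str  # "account", "provider", or "patient"
    profile_id: str
    lab_data: LabData = field(default_factory=LabData)

patient = Profile(profile_type="patient", profile_id="P-1001")
patient.lab_data.orders.append({"order_id": "O-1", "test": "CBC"})
```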
Prior State
hc1 used a Pentaho-based solution to process customer data and create the analytics and reports. The Pentaho data integration suite and user interface components were deployed across dedicated Amazon EC2 instances. This approach had several drawbacks:
- Infrastructure was fragile and required manual management, and deployment and change management were not developer-friendly.
- Architecture did not support integration with Amazon Web Services CloudFormation to improve DevOps efficiency and build an automation pipeline.
- Data was spread across multiple systems, causing monolithic data silos:
- MySQL on Amazon EC2: Transactional data from hc1 CRM platform.
- MySQL on Amazon Aurora: Transactional data from the laboratory information system.
- Postgres on Amazon EC2: Audit data across all hc1 platforms.
- Postgres on Amazon Aurora: FHIR HL7 messages.
MySQL is the primary source for this process; however, the data lake stores data from MySQL, Postgres, and Amazon DynamoDB.
Solution
The approach was to build a multi-tenant, scalable architecture that addressed these operational challenges while improving ownership and accountability. After evaluating options, hc1 transformed into a next-generation architecture powered by Amazon Web Services serverless services, including Amazon Web Services Glue, Amazon Web Services Lake Formation, Amazon Web Services Lambda, Amazon SNS, Amazon SQS, and Amazon S3.
Alternate approaches were costly and created management overhead, while also requiring dedicated EC2 instances and 24/7 support. Amazon Web Services Glue consumes resources only when invoked, delivering faster data transfer and less lag time at a lower cost.
For hc1’s internal data teams, the resulting architecture improved ownership, accountability, and operational efficiency.
Architecture
The source MySQL and Postgres databases feed the new serverless data pipeline.
Figure 1 illustrates how the serverless architecture works. The architecture is split into three groups.
Raw Data Generation
This step classifies and extracts data from the source databases and moves it into S3 using Amazon Web Services Glue.
Amazon Web Services Glue Data Catalog stores the customer metadata and uses permissions from Lake Formation to safely publish data while protecting data access in a granular manner. This helps track the schema changes and build a comprehensive audit and governance process.
There are five Amazon Web Services Glue extract, transform, load (ETL) jobs that transform the data and produce the output raw file in Parquet format:
- Amazon Web Services Glue schema sync: Keeps Snowflake databases (analytics store) in sync with the MySQL source.
- Full load: Loads the entire customer table.
- Incremental load: Loads the incremental changes from the customer table.
- Dynamic full load: Loads the entire user-defined table for the customer.
- Dynamic incremental load: Loads incremental changes from the user-defined table.
These purpose-built Amazon Web Services Glue jobs isolate the flow to efficiently handle different business scenarios. In addition, some user-defined tables are large. The dynamic load jobs handle this volume independent of the incremental and full load jobs.
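The routing between these purpose-built jobs might look like the following sketch. The function and job names are hypothetical, not hc1's actual implementation; the logic assumes that incremental loads require change-tracking columns (as the load-type selection "based on the columns" later in this post suggests) and that user-defined tables always go through the dynamic jobs.

```python
# Hypothetical sketch: routing a table to one of the five purpose-built
# Glue job types. Job names and selection criteria are illustrative.

def select_glue_job(table_name: str, is_user_defined: bool,
                    has_change_tracking_columns: bool) -> str:
    """Pick the purpose-built Glue job for a table.

    Incremental loads need columns (e.g. an updated-at timestamp) that
    let the job detect changes; otherwise a full load is used.
    User-defined tables get their own 'dynamic' jobs so their volume
    does not impact the standard customer tables.
    """
    if table_name == "__schema__":
        return "glue-schema-sync"       # keeps Snowflake in sync with MySQL
    if is_user_defined:
        return ("dynamic-incremental-load" if has_change_tracking_columns
                else "dynamic-full-load")
    return ("incremental-load" if has_change_tracking_columns
            else "full-load")
```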
At the end of this step, the customer-specific raw data is isolated based on Lake Formation access control and moved to respective S3 buckets for data curation.
Data Curation
This process helps with the organization and integration of the raw data. The transformation provides a meaningful way to store reporting data by pivoting columns to rows. This process is decoupled using Amazon Web Services Lambda, Amazon Simple Notification Service (SNS), and Amazon Simple Queue Service (SQS).
Amazon Web Services Lambda is a serverless, event-driven compute service that lets you run code for virtually any type of application or backend service without provisioning or managing servers. SNS is a fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication. SQS is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications.
Amazon S3 event notification is used to trigger notification when raw data files are added to a specific S3 bucket. The notification configuration identifies the events and notifies SNS, and the SNS topic sends the message to the subscribed SQS queues.
You can use a Lambda function to process messages in an SQS queue. Lambda polls the queue and invokes your function synchronously with an event that contains queue messages. You can specify another queue to act as a dead-letter queue for messages that your Lambda function can’t process.
This process splits into a curation and transformation sequence, and the SNS topic notifies the respective SQS queue. Apart from the queue, each sequence contains a Lambda function and a dead-letter queue for messages that the Lambda can’t process.
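The S3 → SNS → SQS → Lambda chain above can be sketched as follows. This is a minimal sketch assuming the standard AWS event shapes (an SQS record whose body is an SNS notification wrapping an S3 "object created" event); bucket and key names are illustrative, and failed messages are assumed to be retried and moved to the dead-letter queue by SQS itself.

```python
import json

# Minimal sketch of the curation Lambda: it receives SQS messages whose
# bodies are SNS notifications wrapping S3 object-created events, and
# extracts the raw-file locations to curate.

def extract_s3_objects(sqs_event: dict) -> list:
    """Return (bucket, key) pairs from an SQS batch of SNS-wrapped S3 events."""
    objects = []
    for sqs_record in sqs_event["Records"]:
        sns_envelope = json.loads(sqs_record["body"])   # SNS notification
        s3_event = json.loads(sns_envelope["Message"])  # original S3 event
        for s3_record in s3_event["Records"]:
            bucket = s3_record["s3"]["bucket"]["name"]
            key = s3_record["s3"]["object"]["key"]
            objects.append((bucket, key))
    return objects

def handler(event, context):
    # In the real function this would read each raw file and write curated
    # output to the curated S3 bucket; here we only log the locations.
    for bucket, key in extract_s3_objects(event):
        print(f"curating s3://{bucket}/{key}")
```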
The curated and transformed files are then stored in separate S3 buckets. Curated files contain data from normalized tables, while transformed files contain changes to those tables such as denormalization and pivots.
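The column-to-row pivot mentioned in the transformation step can be sketched in miniature. The column names here are hypothetical examples, not hc1's actual schema; the point is only the shape of the unpivot.

```python
# Illustrative sketch of the "columns to rows" pivot: wide lab-result
# rows are unpivoted into one row per attribute, which is easier to
# store and query for reporting. Column names are hypothetical.

def unpivot(rows: list, id_columns: list) -> list:
    """Turn each non-id column of a wide row into its own narrow row."""
    narrow = []
    for row in rows:
        ids = {col: row[col] for col in id_columns}
        for col, value in row.items():
            if col in id_columns:
                continue
            narrow.append({**ids, "attribute": col, "value": value})
    return narrow

wide = [{"order_id": 1, "glucose": 98, "sodium": 140}]
# -> [{"order_id": 1, "attribute": "glucose", "value": 98},
#     {"order_id": 1, "attribute": "sodium", "value": 140}]
```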
Curation to Snowflake
The final step in this process employs an Amazon Web Services Glue job, CuratedToSnowflake , which creates the report. The job ingests the files from the curated and transformed S3 buckets and produces the report data for lab insights.
The data is pushed to Snowflake through the Snowflake admin API and a client database inside Snowflake.
Figure 1 – hc1 data lake ingestion architecture.
The Amazon Web Services Glue jobs support custom data movement and improved operational stability. The process splits data into batches and uses Amazon Web Services Glue bookmarks to track progress.
Amazon DynamoDB is a fast, flexible NoSQL database service for single-digit millisecond performance at any scale. The architecture uses DynamoDB to store Amazon Web Services Glue bookmarks and processing status at the table and database source level.
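The batch-plus-bookmark pattern can be sketched as follows. This is a hedged sketch, not hc1's implementation: an in-memory dict stands in for the DynamoDB bookmark table, and the row and attribute names are illustrative.

```python
# Sketch of incremental batch processing with a DynamoDB-style bookmark:
# each run records the last processed position per table so the next run
# resumes where the previous one stopped. The dict stands in for DynamoDB.

bookmarks = {}  # table name -> last processed row id

def process_incremental(table: str, rows: list, batch_size: int = 2) -> int:
    """Process unseen rows in batches and advance the bookmark."""
    last_seen = bookmarks.get(table, 0)
    new_rows = [r for r in rows if r["id"] > last_seen]
    processed = 0
    for start in range(0, len(new_rows), batch_size):
        batch = new_rows[start:start + batch_size]
        # ... transform the batch and write Parquet output to S3 ...
        processed += len(batch)
        bookmarks[table] = batch[-1]["id"]  # checkpoint after each batch
    return processed
```

Checkpointing after every batch (rather than once at the end) is what keeps recovery cheap: a failed run resumes from the last completed batch instead of restarting the whole load.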
The architecture uses Amazon Web Services CloudFormation and a DevOps automation pipeline for deployment and change management.
Using a data-driven, loosely coupled architecture, hc1 isolated the operation of the upstream and downstream platforms. The architecture is built on top of an existing application, avoids data duplication, and ensures high standards for data security and governance. This reduces overall friction for data flow within the hc1 platform.
Outcomes
Overall, the architecture adds better logging and alerting, decreases blast radius, and improves resilience by breaking the process into three stages instead of one monolithic step. The outcome is a single-tenant software-as-a-service (SaaS) offering with one tenant per customer. The Amazon Web Services Glue jobs are deployed in each customer tenant, and Lake Formation is multi-tenant, supporting all customers.
Through this architecture, hc1 modernized its data platforms with Amazon Web Services-native technologies that are highly scalable, feature-rich, and cost-effective. This approach enables hc1 internal teams to operate autonomously while providing central data discovery, governance, and auditing of the upstream and downstream applications.
hc1 can also integrate faster, implement efficiently, and quickly scale to meet internal and customer demands. This approach enables governance and easy data movement adhering to compliance and regulatory policies. Using the serverless architecture, hc1 avoided data loss, improved data sharing, improved security, and increased return on investment (ROI). This allows hc1 to turn lab data into personalized healthcare insights with speed and agility at scale.
Using the new architecture enables hc1 with three distinct, yet related, outcomes:
- Multi-modal analysis
- Lab insights
- Data security, governance, and compliance
Serverless Advantage
The architecture helps hc1 scale to thousands of active customers and focus on customer outcomes and quality improvement.
The multi-tenant Lake Formation layer supports all customers from a central data lake.
Key advantages include:
- Scalability: The architecture scales to support multiple customers and handles over 71 TB of data.
- Resilience: The Pentaho process moved data from source to destination in one giant step. The new architecture breaks out the steps and provides easy recovery with a multi-AZ implementation.
- Operational improvement: The Pentaho process began with full loads only, changing over to differential loads when full loads could no longer complete for large customers. The new approach selects the incremental or full load type based on the columns and further separates the dynamic tables.
- Eliminate dependency: The workload uses built-in Amazon Web Services service integration and avoids dependency on third-party platforms, training, and upgrades. It removes the management overhead of EC2 instances and Pentaho software.
- Cost optimization: With a pay-as-you-go model, hc1 optimized cost and never had to over-provision resources. This saving is in addition to the cost reduction from eliminating third-party licenses.
- Eliminate provisioning delays: hc1 can now scale and add more customers without capacity planning and provisioning delays.
- Audit: Access controls are pre-defined using Amazon Web Services Lake Formation and deployed with Amazon Web Services Glue changes. This simplifies HIPAA and HITRUST auditing and creates visibility into audit data.
Customer Benefits
The ability to run frequent, incremental loads helps hc1 meet its runtime service-level agreement (SLA), improving customer satisfaction. The overall solution helps hc1 activate customers within a shorter duration, at a lower cost, and with an improved customer experience.
Lab diagnostic data and operational metrics often reside in several different isolated systems. Customers now have the ability to generate automated quality reports and key performance indicators (KPI) in real time, eliminating delays.
For hc1, this architecture provides a repeatable blueprint to integrate new domains and applications. Customers can also design and use user-defined fields and tables to add customer-specific data. Separate Amazon Web Services Glue jobs support this customer-defined data processing. Lab insights are delivered faster in near real-time, helping labs innovate and deliver analytics-driven outcomes, improving patient health.
Customers enjoy the flexibility of user-defined tables, a marked improvement over the previous process. This empowers customers by reducing their dependency on product feature development. At present, the architecture handles over 117 GB of user-defined data, and this volume will continue to grow with increased customer adoption.
Conclusion
The Amazon Web Services serverless architecture adopted by hc1 enhances its customer experience, delivering personalized health insights. The approach helps hc1 to aggregate data from monolithic silos and improve efficiency through Amazon Web Services CloudFormation and a DevOps automation pipeline.
Amazon Web Services Glue and Amazon Web Services Lake Formation break the process into independent and resilient processing units and isolate the blast radius, improving platform reliability. Building on this foundation, hc1 can drive more innovation and analytics-driven solutions.
To learn more about how hc1 can help healthcare professionals transform lab data into personalized healthcare insights, visit the hc1 website.
hc1 – Amazon Web Services Partner Spotlight
hc1 is an Amazon Web Services Healthcare Competency Partner that ingests, organizes, and normalizes customer data to deliver analytics and improve operations management. As an outcome, customers use these insights to their fullest potential.