AWS for Industries
Real-time Analytics on Patient Bedside Medical Devices
Introduction
The Children’s Hospital of Philadelphia (CHOP) recently created a proof of concept (POC) to ingest real-time HL7 data from bedside medical devices into AWS for processing and analysis. CHOP is the first hospital in the United States dedicated exclusively to the care of children. While its flagship campus is located in the University City neighborhood of West Philadelphia, Pennsylvania, it has more than 50 pediatric offices, specialty care centers, and surgical centers in Pennsylvania and New Jersey, and it recently opened a second hospital campus, the Middleman Pavilion, in King of Prussia, PA. In addition, CHOP is home to one of the largest pediatric research facilities in the United States. The health system has more than 600 beds and nearly 1.5 million outpatient and inpatient visits annually.
To accommodate the needs of this complex enterprise, the Safety & Quality team at CHOP built a POC that streams real-time raw HL7 data from on-premises medical devices to a cloud data platform for processing and analysis.
Business Challenge
Medical device data is high in both volume and fidelity, which makes it challenging to tap into as a source for analytics. Data produced by bedside monitors, including alarm data, can be analyzed to assess the subjective workload on clinicians. Furthermore, in-depth analysis of this data offers an opportunity to improve patient safety and optimize clinical best practices.
A properly designed data infrastructure on AWS can effectively overcome this challenge and unlock new opportunities to improve patient safety through risk identification, optimize clinical best practices through data-driven insights, and enhance clinician workload management by understanding areas of strain and implementing efficient strategies.
POC Solution
CHOP worked with AWS to build a POC that ingests HL7 messages from on-premises bedside monitors into AWS. The solution architecture consists of the following components:
Figure 1. Architecture for ingesting real-time bedside monitoring data
1. An Apache Camel MLLP connector hosted in AWS Fargate receives real-time HL7 data from patient monitors via MLLP connections.
2. Received HL7 messages are ingested into Amazon Kinesis Data Streams for further processing.
3. Amazon Managed Service for Apache Flink runs an Apache Flink application that parses HL7 messages and writes patient vital signs data to Amazon Timestream.
4. Amazon Timestream, a serverless time-series database, stores patient vital signs as time-series data points.
5. Amazon QuickSight is used to create a dashboard on top of time series data stored in Timestream to visualize patient vital signs.
6. Another Apache Flink application processes HL7 messages by extracting relevant data and sending the extracted data (in JSON format) to a second Amazon Kinesis data stream.
7. The extracted data is made available in the Amazon Redshift data warehouse by using materialized views on top of Kinesis Data Streams.
8. The extracted data is also sent to the Amazon S3 data lake by utilizing Amazon Data Firehose.
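Steps 1, 3, and 6 can be illustrated with a minimal sketch of what "receive over MLLP, then parse HL7" involves. This is not the POC's Camel or Flink code, which the post does not show. It assumes HL7 v2 pipe-delimited messages, the standard MLLP envelope (a 0x0B start byte and a 0x1C 0x0D trailer), numeric OBX segments carrying the vital signs, and timestamps treated as UTC. The function names, the sample message, and the patient identifier are all hypothetical.

```python
from datetime import datetime, timezone

MLLP_START = b"\x0b"    # <VT> start-of-block byte
MLLP_END = b"\x1c\x0d"  # <FS><CR> end-of-block bytes


def unframe_mllp(frame: bytes) -> str:
    """Strip the MLLP envelope and return the HL7 message text."""
    if not (frame.startswith(MLLP_START) and frame.endswith(MLLP_END)):
        raise ValueError("not a complete MLLP frame")
    return frame[len(MLLP_START):-len(MLLP_END)].decode("utf-8")


def parse_vitals(message: str) -> list:
    """Return one dict per numeric OBX segment in an HL7 v2 message."""
    patient_id = None
    observed_at = None
    records = []
    for segment in message.split("\r"):  # HL7 v2 segments end with <CR>
        if not segment:
            continue
        fields = segment.split("|")
        name = fields[0]
        if name == "PID":
            # PID-3: patient identifier list; keep the ID component only
            patient_id = fields[3].split("^")[0]
        elif name == "OBR":
            # OBR-7: observation date/time (assumed here to be UTC)
            observed_at = datetime.strptime(
                fields[7][:14], "%Y%m%d%H%M%S"
            ).replace(tzinfo=timezone.utc)
        elif name == "OBX" and fields[2] == "NM":
            # OBX-3: observation identifier, OBX-5: value, OBX-6: units
            code, label = fields[3].split("^")[:2]
            records.append({
                "patient_id": patient_id,
                "code": code,
                "measure": label,
                "value": float(fields[5]),
                "unit": fields[6],
                "time": observed_at.isoformat() if observed_at else None,
            })
    return records


# Made-up sample message; not real patient data.
SAMPLE_MESSAGE = "\r".join([
    r"MSH|^~\&|MONITOR|ICU|ANALYTICS|HOSP|20240101120000||ORU^R01|MSG0001|P|2.6",
    "PID|||TEST-0001^^^HOSP^MR",
    "OBR|1|||VITALS|||20240101120000",
    "OBX|1|NM|8867-4^Heart rate^LN||82|/min",
    "OBX|2|NM|59408-5^SpO2^LN||97|%",
])
SAMPLE_FRAME = MLLP_START + SAMPLE_MESSAGE.encode("utf-8") + MLLP_END

records = parse_vitals(unframe_mllp(SAMPLE_FRAME))
for record in records:
    print(record)
```

Each resulting dictionary is the kind of flat, timestamped record that could be written as a time-series data point or serialized to JSON for the downstream stream. A production parser would also need to handle HL7 escape sequences, repeating fields, and per-device time zones.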
Security
Services used in this architecture are HIPAA eligible. The solution also follows security best practices to protect sensitive patient data. Patient monitor data is securely transmitted from on premises to the AWS Cloud by using a Virtual Private Network (VPN) over an AWS Direct Connect connection, ensuring end-to-end data encryption.
To secure data at rest, server-side encryption is enabled using AWS Key Management Service (AWS KMS) to meet strict data management requirements. AWS KMS is integrated with multiple services, including AWS Fargate, Amazon Kinesis Data Streams, Amazon Managed Service for Apache Flink, Amazon Timestream, Amazon Redshift, and Amazon S3, to encrypt data before storing it on disk.
The Transport Layer Security (TLS) protocol is used to encrypt data in transit between AWS services.
Services used in the solution
- Ingestion
- AWS Fargate (to host the Apache Camel connector) is a technology that you can use with Amazon ECS to run containers without having to manage servers or clusters of Amazon EC2 instances. With Fargate, you no longer have to provision, configure, or scale clusters of virtual machines to run containers. This removes the need to choose server types, decide when to scale your clusters, or optimize cluster packing.
- Amazon Kinesis Data Streams (to store raw data) – You can use it to collect and process large streams of data records in real time. You can create data-processing applications, known as Kinesis Data Streams applications. A typical Kinesis Data Streams application reads data from a data stream as data records.
- Processing
- Amazon Managed Service for Apache Flink lets you use Java, Python, Scala, or SQL to process and analyze streaming data. The service enables you to author and run code against streaming sources to perform time-series analytics, feed real-time dashboards, and create real-time metrics.
- Amazon Kinesis Data Streams (to store parsed data) – You can use it to collect and process large streams of data records in real time. You can create data-processing applications, known as Kinesis Data Streams applications. A typical Kinesis Data Streams application reads data from a data stream as data records.
- Storage
- Amazon Data Firehose (to archive the data into S3) is a fully managed service for delivering real-time streaming data to destinations such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, Amazon OpenSearch Service, Amazon OpenSearch Serverless, Splunk, and any custom HTTP endpoint or HTTP endpoints owned by supported third-party service providers, including Datadog, Dynatrace, LogicMonitor, MongoDB, New Relic, and Sumo Logic.
- Amazon Redshift (for data warehouse) is a fully managed, petabyte-scale data warehouse service in the cloud. Amazon Redshift Serverless lets you access and analyze data without all of the configurations of a provisioned data warehouse. Resources are automatically provisioned and data warehouse capacity is intelligently scaled to deliver fast performance for even the most demanding and unpredictable workloads.
- Amazon Timestream (optional component) is a fast, scalable, fully managed, purpose-built time series database that makes it easy to store and analyze trillions of time series data points per day. Timestream saves you time and cost in managing the lifecycle of time series data by keeping recent data in memory and moving historical data to a cost-optimized storage tier based upon user-defined policies.
- Visualization
- Amazon QuickSight (optional component) is a fully managed and secure service that enables you to build interactive dashboards and perform ad hoc analysis using data from multiple sources. You can add Amazon Timestream as a data source to bring time series data, such as patient vital signs, into QuickSight and create a dashboard that provides real-time visibility into patients’ health metrics.
Taking the architecture to the next level
Kinesis throughput is determined by the number of shards configured in the Kinesis data stream, and a stream scales horizontally by adding more shards. However, to make sure each patient's medical data is processed in the correct order, you could use the patient ID as the partition key when writing HL7 messages to the Kinesis data stream. This ensures that all messages belonging to the same patient are placed on the same shard, and therefore those messages are processed in order of arrival (that is, FIFO).
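The ordering argument can be sketched in a few lines. Kinesis Data Streams maps each partition key to a 128-bit integer with an MD5 hash and assigns the record to the shard whose hash key range contains that value. The simulation below assumes the shards split the hash space evenly, which is a simplification (ranges can differ after resharding), and it uses hypothetical patient IDs and payloads. It does not call the Kinesis API.

```python
import hashlib
from collections import defaultdict

HASH_SPACE = 2 ** 128  # MD5 digests are 128-bit integers


def shard_for(partition_key: str, shard_count: int) -> int:
    """Index of the shard whose (evenly split) hash range holds the key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # Clamp so the few values past the last full range land on the last shard.
    return min(hash_value // range_size, shard_count - 1)


def route(messages, shard_count: int):
    """Append each (patient_id, payload) to its shard in arrival order."""
    shards = defaultdict(list)
    for patient_id, payload in messages:
        shards[shard_for(patient_id, shard_count)].append((patient_id, payload))
    return shards


# Hypothetical interleaved messages from two patients.
messages = [
    ("patient-A", "hr=80"),
    ("patient-B", "hr=95"),
    ("patient-A", "hr=82"),
    ("patient-B", "hr=97"),
    ("patient-A", "hr=85"),
]
shards = route(messages, shard_count=4)

for shard_index in sorted(shards):
    print(shard_index, shards[shard_index])
```

Because the hash depends only on the partition key, every message for a given patient lands on one shard, and within that shard the arrival order is kept. Messages from different patients may share a shard or not; no ordering is implied across shards.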
You could also turn on the enhanced fan-out feature for Kinesis data stream consumers, so that each consumer gets dedicated throughput instead of sharing read throughput with other consumers. This configuration also allows us to scale up to 20 consumers per Kinesis data stream without sacrificing read performance.
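A rough capacity estimate shows how shard count and enhanced fan-out interact. The sketch below uses the documented per-shard quotas for provisioned streams: writes of up to 1 MB or 1,000 records per second, and reads of 2 MB per second that are shared among standard consumers but dedicated to each enhanced fan-out consumer. The message rate is the POC's reported figure of about 4,000 messages per minute; the 2 KB average message size and the function names are assumptions made for illustration.

```python
import math

# Per-shard quotas for a provisioned Kinesis data stream (approximate bytes).
SHARD_WRITE_BYTES_PER_SEC = 1_000_000
SHARD_WRITE_RECORDS_PER_SEC = 1_000
SHARD_READ_BYTES_PER_SEC = 2_000_000


def shards_needed(messages_per_minute: float, avg_message_bytes: int) -> int:
    """Smallest shard count that satisfies both write quotas."""
    per_second = messages_per_minute / 60
    by_records = per_second / SHARD_WRITE_RECORDS_PER_SEC
    by_bytes = per_second * avg_message_bytes / SHARD_WRITE_BYTES_PER_SEC
    return max(1, math.ceil(max(by_records, by_bytes)))


def read_bytes_per_consumer(shards: int, consumers: int,
                            enhanced_fan_out: bool) -> float:
    """Read throughput available to each consumer across the stream."""
    total = shards * SHARD_READ_BYTES_PER_SEC
    # Standard consumers share the per-shard read quota; enhanced fan-out
    # consumers each receive the full quota.
    return total if enhanced_fan_out else total / consumers


AVG_MESSAGE_BYTES = 2048  # assumed average HL7 message size

shards = shards_needed(4_000, AVG_MESSAGE_BYTES)
shared = read_bytes_per_consumer(shards, consumers=3, enhanced_fan_out=False)
dedicated = read_bytes_per_consumer(shards, consumers=3, enhanced_fan_out=True)
print(shards, round(shared), dedicated)
```

Under these assumptions the write side is far from its limits, so the number of consumers, rather than the ingest rate, is what would motivate enhanced fan-out. Hot partition keys can still overload a single shard even when the aggregate numbers look comfortable.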
Conclusion & Outcome
The architecture currently processes about 4,000 messages per minute from on-premises medical devices into AWS, and it can scale to match the needs of your organization. The real-time HL7 data is made available for downstream analytics to uncover insights that can improve patient care.
CHOP and AWS have shown how streaming analytics of real-time HL7 medical device data unlocks new opportunities for data-driven improvements in patient safety and care delivery. To learn more about building streaming data solutions using Kinesis, see Streaming Data Solution for Amazon Kinesis.