AWS Database Blog
Category: Analytics
Gather organization-wide Amazon RDS orphan snapshot insights using AWS Step Functions and Amazon QuickSight
In this post, we walk you through a solution that aggregates RDS orphan snapshots across accounts and AWS Regions, enabling automation and organization-wide visibility so you can optimize cloud spend based on data-driven insights. Cross-Region copied snapshots, Aurora cluster copied snapshots, and shared snapshots are out of scope for this solution. The solution uses AWS Step Functions orchestration together with AWS Lambda functions to generate orphan snapshot metadata across your organization. The generated metadata is stored in Amazon Simple Storage Service (Amazon S3) and transformed into an Amazon Athena table by AWS Glue. Amazon QuickSight uses the Athena table to generate orphan snapshot insights.
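As a rough illustration of the kind of check a Lambda function in such a workflow might perform, here is a minimal Python sketch. It assumes an "orphan" snapshot means a manual snapshot whose source DB instance no longer exists; the excerpt does not define the term, so treat that definition as an assumption. The function name `find_orphan_snapshots` and the sample data are hypothetical. The dictionary keys mirror fields returned by the RDS `DescribeDBSnapshots` API, but no AWS call is made here.

```python
from typing import Dict, Iterable, List


def find_orphan_snapshots(
    snapshots: Iterable[Dict[str, str]],
    live_instance_ids: Iterable[str],
) -> List[Dict[str, str]]:
    """Return manual snapshots whose source DB instance is not in live_instance_ids.

    Assumption: "orphan" = manual snapshot with no surviving source instance.
    Shared snapshots are skipped because the post lists them as out of scope.
    """
    live = set(live_instance_ids)
    return [
        snap
        for snap in snapshots
        if snap.get("SnapshotType") == "manual"
        and snap.get("DBInstanceIdentifier") not in live
    ]


if __name__ == "__main__":
    # Hypothetical sample records, shaped like DescribeDBSnapshots entries.
    sample = [
        {"DBSnapshotIdentifier": "snap-a", "DBInstanceIdentifier": "orders-db", "SnapshotType": "manual"},
        {"DBSnapshotIdentifier": "snap-b", "DBInstanceIdentifier": "legacy-db", "SnapshotType": "manual"},
        {"DBSnapshotIdentifier": "snap-c", "DBInstanceIdentifier": "legacy-db", "SnapshotType": "shared"},
    ]
    for snap in find_orphan_snapshots(sample, ["orders-db"]):
        print(snap["DBSnapshotIdentifier"])
```

In a real deployment, the snapshot and instance lists would come from each account and Region, and the result would be written to Amazon S3 for AWS Glue and Athena to pick up; those steps are omitted here.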
How Skello uses AWS DMS to synchronize data from a monolithic application to microservices
Skello is a human resources (HR) software-as-a-service (SaaS) platform that focuses on employee scheduling and workforce management. It caters to various sectors, including hospitality, retail, healthcare, construction, and industry. In this post, we show how Skello uses AWS Database Migration Service (AWS DMS) to synchronize data from a monolithic architecture to microservices and to ingest data from both the monolithic architecture and the microservices into our data lake.
How Channel Corporation modernized their architecture with Amazon DynamoDB, Part 2: Streams
Channel Corporation is a B2B software as a service (SaaS) startup that operates the all-in-one artificial intelligence (AI) messenger Channel Talk. In Part 1 of this series, we introduced our motivation for NoSQL adoption, technical problems with business growth, and considerations for migration from PostgreSQL to Amazon DynamoDB. In this post, we share our experience integrating with other services to solve areas that couldn’t be addressed with DynamoDB alone.
Build a streaming ETL pipeline on Amazon RDS using Amazon MSK
Customers who host their transactional database on Amazon Relational Database Service (Amazon RDS) often seek architecture guidance on building streaming extract, transform, load (ETL) pipelines to destination targets such as Amazon Redshift. This post outlines the architecture pattern for creating a streaming data pipeline using Amazon Managed Streaming for Apache Kafka (Amazon MSK). Amazon MSK offers a fully managed Apache Kafka service, enabling you to ingest and process streaming data in real time.
Modernize your legacy databases with AWS data lakes, Part 1: Migrate SQL Server using AWS DMS
This is the first post in a three-part series in which we discuss the end-to-end process of building a data lake from a legacy SQL Server database. In this post, we show you how to build data pipelines to replicate data from Microsoft SQL Server to a data lake in Amazon S3 using AWS DMS. You can extend the solution presented in this post to other database engines like PostgreSQL, MySQL, and Oracle.
Amazon Aurora PostgreSQL zero-ETL integration with Amazon Redshift is generally available
In this post, we discuss the challenges with traditional data analytics mechanisms, our approach to solving them, and how you can use Amazon Aurora PostgreSQL-Compatible Edition zero-ETL integration with Amazon Redshift, which is generally available as of October 15, 2024.
Vector search for Amazon DynamoDB with zero ETL for Amazon OpenSearch Service
As organizations increasingly rely on Amazon DynamoDB for their operational database needs, the demand for advanced data insights and enhanced search capabilities continues to grow. Leveraging the power of Amazon OpenSearch Service and Amazon Bedrock, you can now unlock generative artificial intelligence (AI) capabilities for your DynamoDB data. In this post, we show how you […]
How Prisma Cloud built Infinity Graph using Amazon Neptune and Amazon OpenSearch Service
Palo Alto Networks' Prisma Cloud is a leading cloud security platform protecting enterprise cloud adoption from code to cloud workflows. Palo Alto Networks chose Amazon Neptune Database and Amazon OpenSearch Service as the core services to power its Infinity Graph. In this post, we discuss the scale Palo Alto Networks requires from these core services and how we were able to design a solution to meet these needs. We focus on the Neptune design decisions and benefits, and explain how OpenSearch Service fits into the design without diving into implementation details.
Stream change data in a multicloud environment using AWS DMS, Amazon MSK, and Amazon Managed Service for Apache Flink
When workloads and their corresponding transactional databases are distributed across multiple cloud providers, using the data in near real time for advanced analytics becomes challenging. In this post, we discuss architecture, approaches, and considerations for streaming data changes from transactional databases deployed in other cloud providers to a streaming data solution deployed on AWS.
Analyze blockchain data with natural language using Amazon Bedrock
Data within public blockchain networks such as Bitcoin and Ethereum can be accessed by anyone. However, accessing and making sense of this information has traditionally been a complex and technical undertaking. Much of the data is encoded and stored as bytes, rather than in a human-readable format. In this post, we introduce a solution that demonstrates how you can chat with blockchain data using Amazon Bedrock and the AWS Public Blockchain datasets. We discuss Amazon Bedrock, review the solution architecture, provide example prompts, share interesting findings, and go over how you can extend the solution to integrate with different data sources.