AWS Database Blog

Category: Analytics

Privileged Database User Activity Monitoring using Database Activity Streams (DAS) and Amazon OpenSearch Service

In this post, we demonstrate how to create a centralized monitoring solution using Database Activity Streams and Amazon OpenSearch Service to meet audit requirements. The solution enables the security team to gather audit data from several Kinesis data streams, enrich, process, and store it with retention to meet compliance requirements, and produce relevant alarms and dashboards.

Turn petabytes of relational database records into a cost-efficient audit trail using Amazon Athena, AWS DMS, Amazon RDS, and Amazon S3

In this post, we show how you can use AWS Database Migration Service (AWS DMS) to migrate relational data from Amazon RDS into compressed archives on Amazon S3. We discuss partitioning strategies for the resulting archive objects and how to use S3 Object Lock to protect the archive objects from modification. Lastly, we demonstrate how to query the archive objects using SQL syntax through Athena with latency measured in seconds, even on large datasets.

Find and link similar entities in a knowledge graph using Amazon Neptune, Part 1: Full-text search

A knowledge graph combines data from many sources and links related entities. Because a knowledge graph is a gathering place for connected data, we expect many of its entities to be similar. When we find that two entities are similar to each other, we can materialize that fact as a relationship between them. In this […]

Tune replication performance with AWS DMS for an Amazon Kinesis Data Streams target endpoint – Part 3

In Part 1 of this series, we discussed the high-level architecture of multi-threaded full load and change data capture (CDC) settings, and how to tune the related parameters for better performance when replicating data to an Amazon Kinesis Data Streams target using AWS Database Migration Service (AWS DMS). In Part 2, we provided some examples of how we […]

Tune replication performance with AWS DMS for an Amazon Kinesis Data Streams target endpoint – Part 2

In Part 1 of this series, we discussed the architecture of multi-threaded full load and change data capture (CDC) settings, and considerations and best practices for configuring various parameters when replicating data using AWS Database Migration Service (AWS DMS) from a relational database system to Amazon Kinesis Data Streams. In this post, we demonstrate the […]

Tune replication performance with AWS DMS for an Amazon Kinesis Data Streams target endpoint – Part 1

AWS Database Migration Service (AWS DMS) makes it possible to replicate to Amazon Kinesis Data Streams from relational databases, data warehouses, NoSQL databases, and other types of data stores. You can use Kinesis data streams to collect and process large streams of data records in real time. Replicating data changes to a Kinesis data stream […]

Handle tables without primary keys while creating Amazon Aurora MySQL or Amazon RDS for MySQL zero-ETL integrations with Amazon Redshift

At AWS, we have been making steady progress towards bringing our zero-ETL vision to life. With Amazon Aurora zero-ETL integration to Amazon Redshift, you can bring together the transactional data of Amazon Aurora with the analytics capabilities of Amazon Redshift. The integration helps you derive holistic insights across many applications, break data silos in your […]

Handle tables without primary keys while creating Amazon Aurora PostgreSQL zero-ETL integrations with Amazon Redshift

At Amazon Web Services (AWS), we have been making steady progress towards bringing our zero-extract, transform, and load (ETL) vision to life. With Amazon Aurora zero-ETL integration to Amazon Redshift, you can bring together the transactional data of Amazon Aurora with the analytics capabilities of Amazon Redshift. The integration helps you derive holistic insights across […]

Run complex queries on massive amounts of data stored on your Amazon DocumentDB clusters using Apache Spark running on Amazon EMR

In this post, we demonstrate how to set up Amazon EMR to run complex queries on massive amounts of data stored in your Amazon DocumentDB (with MongoDB compatibility) clusters using Apache Spark. Amazon DocumentDB (with MongoDB compatibility) is a fully managed native JSON document database that makes it easy and cost effective to operate critical document […]

Create a Virtual Knowledge Graph with Amazon Neptune and an Amazon S3 data lake

It’s common in an enterprise for data that logically fits together to be separated into different databases. Some data is better suited to one data store than another, and it may not be feasible to locate all your data in one data store. But this data often needs to be linked back together to provide a […]