AWS Cloud Operations Blog

Category: Amazon Managed Service for Prometheus

Monitoring and Visualizing Amazon EKS signals with Kiali and AWS managed open-source services

Microservices architecture enables scalability and agility for modern applications. However, distributed systems can introduce complexity when troubleshooting issues across services on different machines. To gain observability into microservices environments, operators need tools to monitor, analyze, and debug the interconnected services. Istio service mesh connects, secures, and observes microservices communications. It provides a way to manage […]

Monitoring GPU workloads on Amazon EKS using AWS managed open-source services

As machine learning (ML) workloads continue to grow in popularity, many customers are looking to run them on Kubernetes with graphics processing unit (GPU) support. Amazon Elastic Compute Cloud (Amazon EC2) instances powered by NVIDIA GPUs deliver the scalable performance needed for fast ML training and cost-effective ML inference. Monitoring GPU utilization gives valuable information for researchers working […]

Monitor Amazon EKS Control Plane metrics using AWS Open Source monitoring services

Have you encountered situations where your Kubernetes API calls are constantly throttled by the control plane? Did you see the 429 HTTP response code “Too many requests” all over the place and have no clue what’s wrong with your cluster? In this blog post, we will talk about monitoring some of the key metrics […]

How to reduce Istio sidecar metric cardinality with Amazon Managed Service for Prometheus

The complexity of distributed systems has grown significantly, making monitoring and observability essential for application and infrastructure reliability. As organizations adopt microservice-based architectures and large-scale distributed systems, they face the challenge of managing an increasing volume of telemetry data, particularly high metric cardinality in systems like Prometheus. To address this, many are turning to service […]

Choice Hotels adopts Amazon Managed Service for Prometheus for operational excellence and cost efficiency

This post was co-written with Stephen Cihak, Senior Director; Abhiram Madadi, Principal Engineer; and Gopi Akula, Senior Manager at Choice Hotels. Who is Choice Hotels? Choice Hotels International is one of the largest lodging franchisors in the world. A challenger in the upscale segment and a leader in midscale and extended stay, Choice has […]

Enhance observability for Amazon RDS Custom for SQL Server using Amazon Managed Service for Prometheus and Amazon Managed Grafana

In this blog post, you will learn how to improve observability on your Amazon RDS Custom for SQL Server database. You will configure metric exporters and send those metrics to Amazon Managed Service for Prometheus, to be visualized in Amazon Managed Grafana. By utilizing both Amazon Managed Service for Prometheus and Amazon Managed Grafana, you […]

Monitor your Databricks Clusters with AWS managed open-source Services

Organizations rely heavily on cloud-based data processing and analytics platforms in today’s data-driven world to unlock valuable insights and make informed decisions. Databricks, a unified analytics platform, has emerged as a popular choice due to its seamless integration with Apache Spark, and its ability to efficiently handle large-scale data processing tasks. Many customers have implemented […]

Gain actionable business insights with monitoring of Amazon MSK with Amazon Managed Service for Prometheus and Amazon Managed Grafana

Introduction: Monitoring is a critical aspect of maintaining the health and performance of any distributed system. In the case of Apache Kafka-based applications, configuring robust monitoring on Kafka clusters becomes even more crucial due to the real-time nature of data processing. This blog post is intended for individuals or organizations utilizing Apache Kafka-based applications, specifically those facing […]

Migrating to Amazon Managed Service for Prometheus with the Prometheus Operator

The Prometheus Operator allows cluster administrators to manage Prometheus clusters running in Kubernetes. It makes it easy to deploy and manage Prometheus via native Kubernetes components. In this blog post, I will demonstrate how you can deploy Prometheus via the Prometheus Operator, and how you can easily migrate your monitoring workloads to take advantage of […]

Announcing AWS CDK Observability Accelerator for Amazon EKS

Today we are happy to announce the all-new AWS CDK Observability Accelerator, a set of opinionated modules to help you set up observability for your AWS environments with AWS-native services and AWS-managed observability services such as Amazon Managed Service for Prometheus, Amazon Managed Grafana, AWS Distro for OpenTelemetry (ADOT), and Amazon CloudWatch. AWS […]