AWS Cloud Operations & Migrations Blog

Category: Amazon Managed Service for Prometheus

Enhancing observability with a managed monitoring solution for Amazon EKS

Introduction: Keeping a watchful eye on your Kubernetes infrastructure is crucial for ensuring optimal performance, identifying bottlenecks, and troubleshooting issues promptly. In the ever-evolving world of cloud-native applications, Amazon Elastic Kubernetes Service (EKS) has emerged as a popular choice for deploying and managing containerized workloads. However, monitoring Kubernetes clusters can be challenging due to their […]

How StormForge reduces complexity and ensures scalability with Amazon Managed Service for Prometheus

This blog post was co-written by Brent Eager, Senior Software Engineer, StormForge. StormForge is the creator of Optimize Live, a Kubernetes vertical rightsizing solution that is compatible with the Kubernetes HorizontalPodAutoscaler (HPA). Using cluster-based agents, machine learning, and Amazon Managed Service for Prometheus, Optimize Live is able to continuously calculate and apply optimal resource requests, […]

Autoscaling Kubernetes workloads with KEDA using Amazon Managed Service for Prometheus metrics

Introduction: With the rising popularity of applications hosted on Amazon Elastic Kubernetes Service (Amazon EKS), a key challenge is handling increases in traffic and load efficiently. Traditionally, you would have to manually scale out your applications by adding more instances – an approach that’s time-consuming, inefficient, and prone to over- or under-provisioning. A better […]

VTEX scales to 150 million metrics using Amazon Managed Service for Prometheus

VTEX is a multi-tenant platform with a distributed engineering operation. Observing hundreds of services in real time in an efficient manner is a technical challenge for the business. In this blog, we will show how VTEX created a resilient open source-based architecture aligned with a sharding strategy, using Amazon Managed Service for Prometheus (AMP) to […]

How Unitary achieved automatic metric collection with Amazon Managed Service for Prometheus collector

This post was co-authored with Nicolas Fournier, Platform Engineer at Unitary. Every day, over 80 years’ worth of video content is uploaded online. Some of this content can also be harmful. Unitary knows that human moderators are the current gold standard for moderation, but this manual approach does not scale. While automated systems can scale, […]

Multi-tenant monitoring across accounts and regions using Amazon Managed Service for Prometheus

In this guest blog post, Nauman Noor (Managing Director), Fabio Dias (Cloud Developer), and Dylan Alibay (Cloud Developer) from the platform engineering team at State Street discuss their use of Amazon Managed Prometheus and AWS Distro for OpenTelemetry to enable monitoring in a multi-tenant, multi-account, and multi-region environment. In the ever-evolving financial services landscape, State […]

What’s new in AWS Observability at re:Invent 2023

Let’s recap the week at AWS re:Invent 2023 with a round-up of the AWS Observability launches across Amazon CloudWatch, Amazon Managed Grafana, and Amazon Managed Service for Prometheus. From automatic instrumentation and operation of applications in CloudWatch, to agentless scraping of Prometheus metrics in Managed Service for Prometheus, read on to learn about the features […]

Monitoring and Visualizing Amazon EKS signals with Kiali and AWS managed open-source services

Microservices architecture enables scalability and agility for modern applications. However, distributed systems can introduce complexity when troubleshooting issues across services on different machines. To gain observability into microservices environments, operators need tools to monitor, analyze, and debug the interconnected services. Istio service mesh connects, secures, and observes microservices communications. It provides a way to manage […]

Monitoring GPU workloads on Amazon EKS using AWS managed open-source services

As machine learning (ML) workloads continue to grow in popularity, many customers are looking to run them on Kubernetes with graphics processing unit (GPU) support. Amazon Elastic Compute Cloud (Amazon EC2) instances powered by NVIDIA GPUs deliver the scalable performance needed for fast ML training and cost-effective ML inference. Monitoring GPU utilization gives valuable information for researchers working […]

Monitor Amazon EKS Control Plane metrics using AWS Open Source monitoring services

Have you encountered situations where your Kubernetes API calls are constantly throttled by the control plane? Did you see the 429 HTTP response code “Too many requests” all over the place and have no clue about what’s wrong with your cluster? In this blog post, we will talk about monitoring some of the key metrics […]