AWS Cloud Operations & Migrations Blog

Tag: Management Tools

Gain operational insights for NVIDIA GPU workloads using Amazon CloudWatch Container Insights

As machine learning models grow more advanced, they require extensive computing power to train efficiently. Many organizations are turning to GPU-accelerated Kubernetes clusters for both model training and online inference. However, properly monitoring GPU usage is critical for machine learning engineers and cluster administrators to understand model performance and to optimize infrastructure utilization. Without visibility […]

Best practices to optimize costs after mergers and acquisitions with AWS Organizations

Mergers and acquisitions (M&As) offer organizations the opportunity to scale operations, diversify product lines, and capture new markets. However, they come with a set of challenges, such as integrating legacy IT systems, complying with stringent regulations, and maintaining business continuity. Eliminating the redundancy of resources and optimizing processes to bring consistency […]

Using the unified CloudWatch Agent to send traces to AWS X-Ray

Today, applications are more distributed than ever before and they no longer run in isolation. This is especially the case when using Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). A distributed workload or system is one that encompasses multiple small independent components, all working together to complete a task or job. […]

Unlock Faster Releases with AWS AppConfig: The Secret Weapon for Your CI/CD Strategy

Striking a Balance Between Reliability and Agility in Cloud Operations: The IT operations team of an enterprise serves as the first line of defense against potential business disruptions. They operate 24/7, act as a hub, and continuously monitor and manage the IT environment. The operations team handles and prioritizes critical IT incidents to minimize downtime and […]

How SMBs can deploy a multi-account environment quickly using AWS Organizations and AWS CloudFormation StackSets

Small and Medium Businesses (SMBs) need to operate with high availability and mitigate security risks while keeping costs low. An AWS multi-account environment with workload isolation, robust access control, cost visualization, and integrated security mechanisms can help SMBs build a platform to support growth. SMBs want to deploy a multi-account environment on AWS quickly and […]

Automating Alerts for AWS Global Network Performance

Have your applications hosted on AWS ever experienced inter-Region or inter-Availability Zone (AZ) latency, and have you wanted to be proactively notified of these latency changes? This blog post describes an automated mechanism to set up those alarms. AWS has made it possible to understand the performance of the AWS Global Network by introducing Infrastructure Performance, […]

Optimize AWS Resource Management with Tag Inventory Reports leveraging AWS Resource Explorer

Customers are increasingly seeking an efficient solution to manage their expanding AWS resources, spanning AWS accounts and Regions, amidst changes like mergers, acquisitions, and cloud migrations. AWS Tags offer an effective solution for organizing, identifying, and filtering resources by categorizing them based on criteria such as purpose, owner, or environment. AWS customers would like to […]

Easily set up Amazon CloudWatch Internet Monitor

Amazon CloudWatch Internet Monitor provides near-continuous internet measurements for your internet traffic, including availability and performance metrics, tailored to your specific workload footprint on AWS. With Internet Monitor, you can get insights into average internet performance metrics over time, as well as get alerts for issues (health events). You’re notified about events that impact your […]

AWS Health Events Intelligence Dashboards & Insights

Organizations operating mission-critical workloads on AWS need the ability to analyze and respond to AWS service events in a timely manner to maintain operational excellence. AWS Health sends AWS Health events on behalf of other AWS services in three main categories: notifications on account administration and security, operational issues that affect AWS services, and scheduled […]

Using AWS AppConfig to Manage Multi-Tenant SaaS Configurations

As a Software as a Service (SaaS) provider, you can benefit from a SaaS operating model in a number of ways. One of the most impactful benefits you can realize is improvements to your operational efficiency, and one of the fundamental techniques you can leverage is to maintain a single software version for all your […]