AWS for SAP

SAP on AWS: Network Analysis and Troubleshooting

Increasingly, SAP customers are choosing RISE with SAP on AWS to support their AWS transformation. RISE with SAP on AWS gives customers S/4HANA Cloud on the world’s most secure, reliable, and extensive cloud infrastructure, and the broadest set of services to transform their business.

However, thousands of customers still run SAP on AWS using traditional license models, and are seeking guidance on configuring their SAP systems for high availability while minimizing complexity.

Ensuring reliable network connectivity and optimal performance for SAP applications running on AWS is a critical challenge for many organizations. SAP systems are often distributed across multiple availability zones and regions, and connected to various on-premises and cloud environments. Troubleshooting network issues can therefore be complex for SAP administration teams, because SAP communication passes through several layers: security groups, network access control lists (NACLs), DNS, load balancers, gateways, and firewalls.

This blog post will provide you with valuable insights and practical knowledge on how to leverage various AWS tools specifically designed for network analysis and troubleshooting. By reading this blog, you’ll learn about the use cases of each tool in identifying and resolving network connectivity and performance issues that can impact your SAP applications.

Whether you’re facing challenges with network latency, connectivity issues, or performance bottlenecks, this blog will equip you with the right tools and techniques to gain visibility into the intricate network setup, diagnose problems efficiently, and ultimately ensure seamless operations for your SAP systems running on AWS.

Figure 1 shows a typical solution architecture for SAP workloads on AWS, distributed across multiple availability zones with the disaster recovery system in a separate region. This blog focuses mainly on the SAP HANA databases.

Figure 1 – SAP Architecture Multi Region HA/DR

The following AWS tools help collect, analyze, detect and remediate SAP networking problems.

1. VPC Flow Logs: VPC flow logs capture IP traffic header information for network interfaces in your VPC. They’re disabled by default, but you can easily create a flow log for a VPC, subnet or network interface to monitor the traffic to and from your Amazon Elastic Compute Cloud (EC2) instances running SAP workloads.


Here’s how to configure flow logs (Figure 2):

a. For the VPC in region 1, navigate to VPC in the AWS console → choose your VPC → select Flow Logs → Create flow log.

b. Choose CloudWatch Logs as the destination, which makes the logs easier to search.

Figure 2 – VPC Flow Log Settings
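The console steps above can also be scripted. The following is a minimal sketch, assuming you use the AWS SDK for Python (boto3) and already have a CloudWatch log group and an IAM role that allows flow log delivery. The VPC ID, log group name, and role ARN are hypothetical placeholders, not values from this blog.

```python
def build_flow_log_request(vpc_id, log_group_name, role_arn, traffic_type="ALL"):
    """Return keyword arguments for the EC2 CreateFlowLogs API call."""
    if traffic_type not in ("ACCEPT", "REJECT", "ALL"):
        raise ValueError(f"unsupported traffic type: {traffic_type}")
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",
        "TrafficType": traffic_type,
        "LogDestinationType": "cloud-watch-logs",
        "LogGroupName": log_group_name,
        "DeliverLogsPermissionArn": role_arn,
    }


def create_flow_log(region, request):
    """Send the request with boto3 (imported lazily; needs AWS credentials)."""
    import boto3

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.create_flow_logs(**request)


# Hypothetical identifiers, for illustration only.
request = build_flow_log_request(
    vpc_id="vpc-0123456789abcdef0",
    log_group_name="/sap/vpc-flow-logs",
    role_arn="arn:aws:iam::111122223333:role/flow-logs-role",
)
```

Calling `create_flow_log("us-east-1", request)` would then create the flow log. Building the request separately makes it easy to review the settings before anything is changed in the account.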

You can also use a custom format to select fields and define field order. Figure 3 describes all available fields.

Figure 3 – VPC Flow Log fields definition
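To make the field list concrete, here is a small sketch that splits one flow log record into named fields. It assumes the default (version 2) record format with space-separated fields. The sample record uses made-up account, interface, and timestamp values.

```python
# Field order of the default (version 2) flow log record format.
DEFAULT_FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_flow_record(line, fields=DEFAULT_FIELDS):
    """Split a space-separated flow log record into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} fields, got {len(values)}")
    return dict(zip(fields, values))


# Hypothetical record: a rejected TCP (protocol 6) connection to port 40002.
sample = (
    "2 111122223333 eni-0123456789abcdef0 10.0.2.63 10.0.3.201 "
    "49152 40002 6 3 180 1700000000 1700000060 REJECT OK"
)
record = parse_flow_record(sample)
```

If you define a custom format, pass your own field tuple as the `fields` argument so the names line up with the order you chose.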

Now that we enabled the Flow Logs, let’s troubleshoot an issue: I changed the security group inbound setting on the secondary HANA standby system to block traffic from the primary database, breaking the replication connection.

SAP HANA stores database event logs and traces in diagnosis files. The trace directory for the system DB is /usr/sap/<SID>/HDB<instance>/<host>/trace and for the tenant DB it’s /usr/sap/<SID>/HDB<instance>/<host>/trace/DB_<database_name>. Looking into the index server trace, we see the error below. This error message typically occurs when there is an issue establishing a network connection or communication channel:

generic; $TYPE$=OpenChannel; $MESSAGE$=an error occurred while opening the channel; $INFO$=internal error; $PARAM$=10.0.3.201:40002&0

To view the Flow Logs, navigate to CloudWatch Logs Insights. Because our flow logs are delivered to CloudWatch Logs, we can use CloudWatch Logs Insights to interactively search and analyze the log data. Select the log group you created in the first step and run the query below to get the number of rejected requests per source address:

filter action="REJECT"
| stats count(*) as numRejections by srcAddr
| sort numRejections desc
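If you export flow log records for offline analysis, the same aggregation is easy to reproduce. The sketch below mirrors the query (filter on REJECT, count per source address, sort descending). It assumes records are already parsed into dictionaries with `srcaddr` and `action` keys, and the sample data is invented.

```python
from collections import Counter


def count_rejections(records):
    """Count REJECT records per source address, highest count first."""
    counts = Counter(r["srcaddr"] for r in records if r["action"] == "REJECT")
    return counts.most_common()


# Hypothetical parsed records.
records = [
    {"srcaddr": "10.0.2.63", "action": "REJECT"},
    {"srcaddr": "10.0.2.63", "action": "REJECT"},
    {"srcaddr": "10.0.9.9", "action": "REJECT"},
    {"srcaddr": "10.0.2.63", "action": "ACCEPT"},
]

for address, rejections in count_rejections(records):
    print(address, rejections)
```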

You can also restrict the query to a custom time range to see details. The sample queries on the right side cover further cases, such as finding the IP addresses where flow records were skipped and the top packet transfers across hosts.

Figure 4: CloudWatch Logs Insights

Alternatively, navigate to CloudWatch → Log Groups and select the flow logs group you created. You’ll see the log streams with their ENI name. Select the SAP HANA secondary ENI and filter with the keyword REJECT to see rejected connections for the ENI. You can also filter based on timestamps and other data using fields.

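A Flow Log entry showing REJECT means that either a NACL or a security group is blocking the flow. A single REJECT record, as shown above, means the inbound flow is being rejected by a security group or NACL. If there are two records for the same exchange, with the first showing ACCEPT and the second showing REJECT, the inbound request was allowed but the response is being blocked by a NACL outbound rule. Security groups are stateful and automatically allow return traffic, whereas NACLs are stateless and evaluate each direction separately.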
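The interpretation rules above can be written down as a small decision function. This is a heuristic sketch only: it assumes the records were captured on the destination system's network interface and are already parsed into dictionaries, and the function name and return strings are illustrative.

```python
def classify_flow(records, client, server):
    """Interpret ACCEPT/REJECT records seen on the server's network interface."""
    request = {r["action"] for r in records
               if r["srcaddr"] == client and r["dstaddr"] == server}
    response = {r["action"] for r in records
                if r["srcaddr"] == server and r["dstaddr"] == client}

    if "REJECT" in request:
        # The request never got in: an inbound rule denied it.
        return "inbound blocked by security group or NACL"
    if "ACCEPT" in request and "REJECT" in response:
        # Request allowed, reply denied: only a stateless NACL does this.
        return "response blocked by NACL outbound rule"
    if "ACCEPT" in request:
        return "allowed by security group and NACL"
    return "no matching records"


# Hypothetical example: request accepted, response rejected.
example = [
    {"srcaddr": "10.0.2.63", "dstaddr": "10.0.3.201", "action": "ACCEPT"},
    {"srcaddr": "10.0.3.201", "dstaddr": "10.0.2.63", "action": "REJECT"},
]
print(classify_flow(example, "10.0.2.63", "10.0.3.201"))
```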

If you see a request with ACCEPT status but communication is still not happening between the primary HANA and secondary HANA, you need to evaluate the EC2 operating system (OS) level settings. Validate that all required ports are open and in listening status for communication to occur.
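One quick way to validate a port from the calling host is a plain TCP connection attempt. The sketch below uses only the Python standard library. The host and port in the commented call are taken from the error message earlier in this post, and which ports your HANA system actually requires is an assumption you should verify for your installation.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call, to be run from the primary HANA host:
# port_is_open("10.0.3.201", 40002)
```

A `False` result after the flow logs show ACCEPT suggests that nothing is listening on the port or that a host-level firewall is dropping the traffic.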

Troubleshooting Past Issues:
A common use case is validating past communication issues between SAP systems, such as intermittent communication errors in the SAP trace files. For example, if SAP HANA replication shows alerts of delayed log replay or disconnection from a secondary system, you can use Flow Logs to investigate by filtering the logs for the relevant time period. Filter on srcAddr and dstAddr to view traffic between the two HANA systems:

filter action="ACCEPT" and srcAddr="10.0.2.63"
| sort @timestamp desc
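For repeated investigations it can help to generate such queries and run them for a fixed time window. The following is a sketch assuming boto3 for the API call. The helper names are hypothetical, and pairing 10.0.2.63 with 10.0.3.201 as source and destination is an assumption based on the addresses that appear in this post.

```python
from datetime import datetime, timezone


def build_flow_query(src=None, dst=None, action=None, limit=100):
    """Build a CloudWatch Logs Insights query for VPC flow logs."""
    conditions = []
    if action:
        conditions.append(f'action="{action}"')
    if src:
        conditions.append(f'srcAddr="{src}"')
    if dst:
        conditions.append(f'dstAddr="{dst}"')
    lines = ["fields @timestamp, srcAddr, dstAddr, dstPort, action"]
    if conditions:
        lines.append("| filter " + " and ".join(conditions))
    lines.append("| sort @timestamp desc")
    lines.append(f"| limit {limit}")
    return "\n".join(lines)


def to_epoch_seconds(iso_utc):
    """Convert a naive ISO timestamp, interpreted as UTC, to epoch seconds."""
    moment = datetime.fromisoformat(iso_utc).replace(tzinfo=timezone.utc)
    return int(moment.timestamp())


def start_flow_query(log_group, query, start_iso, end_iso):
    """Start the query with boto3 (imported lazily; needs AWS credentials)."""
    import boto3

    logs = boto3.client("logs")
    response = logs.start_query(
        logGroupName=log_group,
        startTime=to_epoch_seconds(start_iso),
        endTime=to_epoch_seconds(end_iso),
        queryString=query,
    )
    return response["queryId"]


query = build_flow_query(src="10.0.2.63", dst="10.0.3.201", action="ACCEPT")
print(query)
```

The query ID returned by `start_flow_query` can then be passed to the `get_query_results` call of the same client to fetch the matching records.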

We can also expand the filters to include ports, ENI IDs, and the message field. If you need to troubleshoot flows outside the VPC’s networking scope, the following tools may help: Transit Gateway Flow Logs, AWS Network Firewall logging, AWS Site-to-Site VPN logs, Elastic Load Balancing access logs, and Route 53 query logging.

Summary: Flow Logs capture IP traffic information for network interfaces in your VPC, enabling network traffic monitoring and troubleshooting for SAP EC2 instances. You can create flow logs, configure settings, and analyze them using CloudWatch Logs Insights to diagnose issues like blocked traffic, communication errors, and delayed log replay between SAP systems by filtering and inspecting network traffic flows.

2. Reachability Analyzer

Consider a scenario where you’re configuring replication to a disaster recovery (DR) site, but replication fails due to the destination being unreachable. VPC Flow Logs may not provide enough details in this case. This is where AWS Reachability Analyzer can help.

Reachability Analyzer is a configuration analysis tool that enables you to analyze reachability between two resources, like two SAP systems (EC2 instances). It shows component-by-component details of the path between source and destination. If no reachable path exists, it provides an explanation to help understand why.

For example, you have an SAP HANA primary in us-east-1 (N. Virginia) and DR in us-east-2 (Ohio). System replication between them fails. Use Reachability Analyzer to pinpoint the misconfiguration blocking communication.

Figure 5 is an example where Reachability Analyzer checked communication between EC2 instances in different regions connected via an AWS Transit Gateway.

Figure 5: Reachability Analyzer Path Details

In this scenario, the error NO_ROUTE_TO_DESTINATION indicates that a route entry is missing from the route table. The VPC route table should have a route entry for the target destination 10.16.0.0/16.
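The underlying check is a longest-prefix match of the destination address against the route table. The sketch below illustrates that logic with the standard `ipaddress` module. It is a simplified model, not how Reachability Analyzer is implemented, and the route targets and destination host address are hypothetical.

```python
import ipaddress


def find_route(routes, destination_ip):
    """Return the most specific (cidr, target) route covering the IP, or None."""
    ip = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if ip in network:
            matches.append((network, target))
    if not matches:
        return None
    network, target = max(matches, key=lambda match: match[0].prefixlen)
    return str(network), target


# Route table with only the local route: the DR network is not covered.
routes = [("10.0.0.0/16", "local")]
print(find_route(routes, "10.16.1.10"))

# After adding a route to the DR network through a transit gateway.
routes.append(("10.16.0.0/16", "tgw-0123456789abcdef0"))
print(find_route(routes, "10.16.1.10"))
```

The first lookup finds no covering route, which corresponds to the NO_ROUTE_TO_DESTINATION finding. The second succeeds once the 10.16.0.0/16 entry exists.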

Summary:

Reachability Analyzer helps validate communication paths between components in your AWS environment. Specify source and destination details, and Reachability Analyzer indicates whether a reachable path exists between them. If a path exists, it shows the component-by-component details. If not, it provides an explanation to help understand why the path is unreachable. For SAP landscapes, you can use this to validate communication paths between EC2 instances and VPC endpoint services, or check connectivity between SAP HANA primary and secondary instances across availability zones.

3. AWS Transit Gateway Network Manager

AWS Transit Gateway Network Manager is a feature of AWS Transit Gateway. It centralizes management and monitoring of networking resources and connections to remote branch locations. It provides a comprehensive view of your AWS Transit Gateway network across regions, including attachments, cross-region peering, and on-premises connections. To get started, ensure you meet the prerequisites.

Many SAP environments span multiple AWS regions for disaster recovery or organizational compliance requirements. AWS Transit Gateway is commonly used to establish communication between VPCs across regions, accounts, and on-premises systems. AWS Transit Gateway Network Manager visualizes your global Transit Gateway network topology, monitors health status, and tracks metrics and events.

For example, you can view topologies for your SAP primary region, disaster recovery region, and on-premises corporate data center. The topology graph displays regions, attachments, and cross-region peerings. The topology tree organizes resources hierarchically. From the monitoring tab, you can track data transfer and packet drop statistics to monitor network performance.

Figure 6 shows the data transfer and packet drop statistics on the monitoring tab.

Figure 6: Transit Gateway Network Monitoring packet drops statistics

AWS Transit Gateway Network Manager also captures network events, policy changes, attachment events and topology changes.

Summary:
AWS Transit Gateway Network Manager centralizes management and monitoring of networking resources across regions. It provides a comprehensive view of Transit Gateway networks, attachments, peering, and connections. It helps visualize global network topologies, monitor health, track metrics, and view data transfer and packet drop statistics for performance monitoring.

4. AWS Network Manager: Infrastructure Performance

Another key metric SAP experts need during performance optimization is network latency between systems in the landscape. There are various SAP tools like ABAP Meter and NIPING to measure network latency between layers. In our reference architecture, SAP primary and secondary systems are in different availability zones, while SAP DR is in a different region. Each AWS Region has at least three availability zones engineered to be isolated from failures in others. They provide low-latency connectivity to other availability zones in the same region.

The question is which region and availability zones to choose for SAP deployments. AWS Network Manager Infrastructure Performance provides near real-time and historical network latency across AWS regions and availability zones for a period you specify. These metrics help when expanding landscapes across regions. For example, if you want a distributed high availability SAP application landscape, Infrastructure Performance shows historical statistics that help you select the availability zone pair with the lowest latency.

Check these metrics from the AWS account in which you plan to deploy SAP, because results viewed by availability zone name in another account may not apply. AWS maps physical availability zones randomly to availability zone names for each account (see: Availability Zone IDs). This distributes resources across the zones in a region, instead of concentrating them in zone “a”. As a result, us-east-1a for your account may not represent the same physical location as us-east-1a for another account.
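Availability Zone IDs, unlike names, refer to the same physical zone in every account. The sketch below compares two accounts using data in the shape returned by the EC2 DescribeAvailabilityZones API (`ZoneName` and `ZoneId` keys). The two mappings shown are invented for illustration. In practice you would obtain each response with boto3's `describe_availability_zones` call in the respective account.

```python
def zone_ids_by_name(describe_response):
    """Map zone names to zone IDs from a DescribeAvailabilityZones response."""
    return {zone["ZoneName"]: zone["ZoneId"]
            for zone in describe_response["AvailabilityZones"]}


def same_physical_zone(response_a, name_a, response_b, name_b):
    """Check whether two zone names in two accounts share one zone ID."""
    return zone_ids_by_name(response_a)[name_a] == zone_ids_by_name(response_b)[name_b]


# Hypothetical responses from two different accounts.
account_a = {"AvailabilityZones": [
    {"ZoneName": "us-east-1a", "ZoneId": "use1-az1"},
    {"ZoneName": "us-east-1b", "ZoneId": "use1-az2"},
]}
account_b = {"AvailabilityZones": [
    {"ZoneName": "us-east-1a", "ZoneId": "use1-az4"},
    {"ZoneName": "us-east-1b", "ZoneId": "use1-az1"},
]}

print(same_physical_zone(account_a, "us-east-1a", account_b, "us-east-1a"))
print(same_physical_zone(account_a, "us-east-1a", account_b, "us-east-1b"))
```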

Figure 7: Infrastructure Performance Monitor

You will see two key statistics in Figure 8. The first is the health status timeline, which shows whether network performance is within the expected range (green) or below it (yellow). The second is the latency between two regions, in this case us-east-1 and us-east-2. You can select multiple sources and targets, choosing the ones that best suit your SAP workloads’ requirements. You can also publish these network performance metrics to Amazon CloudWatch by subscribing to them.

Figure 8: Network Latency Statistics between two regions (us-east-1 and us-east-2)

Use cases:

One use case involves SAP high availability (HA) architectures where SAP application servers are distributed across two Availability Zones (AZs). Some application servers need to communicate with the SAP database server in another AZ, so look for AZ pairs with low latency. You can view the latency between AZs in any region on Network Manager’s Infrastructure Performance page in the AWS console, and plan your SAP HA landscape accordingly. See the documentation for additional details.

Another scenario arises when troubleshooting performance issues. You may want to validate if network latency has changed compared to past measurements. If latency remains consistent, you can rule out underlying network latency changes as the root cause.
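A simple way to make "has latency changed" measurable is to compare the median of recent samples against a historical baseline. The sketch below assumes you have exported latency values in milliseconds, for example from the metrics published to CloudWatch. The 20% tolerance and all sample values are illustrative assumptions, not recommendations.

```python
from statistics import median


def latency_shift(baseline_ms, current_ms, tolerance=0.2):
    """Compare median latency against a baseline using a relative tolerance."""
    base = median(baseline_ms)
    current = median(current_ms)
    change = (current - base) / base
    return {
        "baseline_ms": base,
        "current_ms": current,
        "relative_change": change,
        "changed": abs(change) > tolerance,
    }


# Hypothetical samples in milliseconds.
baseline = [11.8, 12.0, 12.1, 11.9, 12.2]
steady = [12.1, 12.3, 11.9]
degraded = [18.0, 19.0, 18.5]

print(latency_shift(baseline, steady)["changed"])
print(latency_shift(baseline, degraded)["changed"])
```

The median is used rather than the mean so that a single outlier sample does not trigger a false alarm.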

Summary: Monitoring network latency between systems is crucial for SAP performance optimization. AWS Network Manager’s Infrastructure Performance provides near real-time and historical network latency metrics across AWS regions and availability zones. This helps in selecting regions and availability zones with low latency for distributing SAP landscapes. The tool shows latency statistics between chosen sources and targets, allowing customers to measure and plan for low-latency architectures like SAP high availability deployments across availability zones. It can also help troubleshoot performance issues by comparing current and historical latency measurements.

5. Network Access Scope

There are scenarios where you need to understand the network access scope of your system, including the components that can access it, such as internet gateways or systems in other environments. The network access scope provides visibility into inbound and outbound traffic patterns, including sources, destinations, paths, and traffic types. One example is an SAP HANA database that needs to be accessed from a remote bastion host. In such cases, it’s crucial to ensure proper network access controls and security measures are in place to prevent unauthorized access while allowing legitimate traffic flows.

From the AWS console, go to VPC → Network Manager → Network Access Scope → Create Network Access Scope (Figure 9).

Figure 9: Network Access Scope Templates

You can select from predefined templates or build your own template and create a scope. For this scenario, I selected the template that validates network segmentation and provided my remote desktop as the source and the SAP HANA database as the destination, as shown in Figure 10.

Figure 10: Network Scope Analysis Outcome

You can expand the scope to internet communication or traffic between two subnets, or add exclude conditions.

Use Cases:

Validate SAP System Accessibility from The Internet:

Most SAP systems are not directly exposed to the internet. Use a Network Access Scope to verify whether any SAP application or SAP HANA system is directly reachable from the internet. Network Access Scopes are part of AWS Network Access Analyzer, which helps you identify unintended network access to your AWS resources. Assess components like NAT gateways and firewalls in the network path between the internet gateway and the SAP systems.
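As a lightweight complement to this analysis, you can scan security group rules for ingress that is open to any address. The sketch below works on data in the shape returned by the EC2 DescribeSecurityGroups API. It only inspects security group rules, not the full network path, so it does not replace the scope analysis. The group IDs and rules are hypothetical.

```python
ANY_ADDRESS = {"0.0.0.0/0", "::/0"}


def internet_open_rules(security_groups):
    """List (group id, from port, to port) for ingress rules open to any address."""
    findings = []
    for group in security_groups:
        for permission in group.get("IpPermissions", []):
            cidrs = {r["CidrIp"] for r in permission.get("IpRanges", [])}
            cidrs |= {r["CidrIpv6"] for r in permission.get("Ipv6Ranges", [])}
            if cidrs & ANY_ADDRESS:
                findings.append((
                    group["GroupId"],
                    permission.get("FromPort"),
                    permission.get("ToPort"),
                ))
    return findings


# Hypothetical security groups for illustration.
groups = [
    {"GroupId": "sg-0aaa", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 40002, "ToPort": 40002,
         "IpRanges": [{"CidrIp": "10.0.2.0/24"}]},
    ]},
    {"GroupId": "sg-0bbb", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ]},
]

print(internet_open_rules(groups))
```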

Isolate Non-Production from Production Landscapes:

Some SAP customers want to isolate direct communication between non-production and production SAP landscapes so they configure a separate management network with its own network interfaces. The management network allows management tools to access systems in PROD and non-PROD environments through a single network, while Network Access Control Lists (NACLs) and/or security groups block direct communications between PROD and non-PROD environments. Use Network Access Scope to validate accessibility and topology configuration.

Summary – Network Access Scope in AWS provides visibility into inbound and outbound traffic patterns, including sources, destinations, paths, and traffic types for your systems. This is crucial for ensuring proper network access controls and security, such as allowing legitimate traffic to SAP HANA databases from remote hosts while preventing unauthorized access.

6. Amazon CloudWatch Application Insights for SAP NetWeaver and HANA

Troubleshooting network issues in SAP workloads on AWS is crucial, but proactive notifications that identify potential problems before users are affected are even more helpful. This is where Amazon CloudWatch Application Insights for SAP NetWeaver and SAP HANA comes in. It can observe the entire SAP stack, from AWS infrastructure to SAP applications including high availability, with minimal configuration. It detects errors, alerts on problems within SAP applications, and helps identify root causes. Pre-configured alarms for various network metrics provide real-time insights into SAP network infrastructure health. Here’s a list of network-related alarms for monitoring the health of your SAP landscape:

HANA:
hanadb_network_collision_rate
hanadb_network_receive_rate
hanadb_network_transmit_rate
hanadb_network_packet_receive_rate
hanadb_network_packet_transmit_rate
hanadb_network_transmit_error_rate
hanadb_network_receive_error_rate

EC2: Linux
NetworkIn
NetworkOut
NetworkPacketsIn
NetworkPacketsOut
Network Interface Bytes Total/sec [Windows]

By leveraging these alarms alongside the troubleshooting techniques discussed previously, you can establish a robust monitoring strategy for your SAP network on AWS. This proactive approach allows you to identify and address network-related issues before they significantly impact your SAP applications and users, ensuring optimal performance and user experience.

Conclusion

AWS provides a comprehensive suite of tools to analyze and troubleshoot network connectivity for SAP workloads, and it is also a great place to run RISE with SAP. VPC Flow Logs capture detailed traffic information, enabling you to investigate past or current communication issues between SAP systems. Reachability Analyzer helps validate communication paths between resources, such as SAP application servers and databases across availability zones. AWS Transit Gateway Network Manager visualizes global network topologies spanning multiple regions, accounts, and on-premises connections, monitoring health and performance metrics.

Additionally, AWS Network Manager’s Infrastructure Performance feature provides near real-time and historical network latency data across AWS regions and availability zones, crucial for optimizing SAP high availability architectures. Network Access Scope analyzes network access patterns, ensuring proper isolation between production and non-production environments and validating accessibility from the internet. Furthermore, Amazon CloudWatch Application Insights for SAP offers proactive monitoring and alerting on various network metrics, enabling early detection and resolution of potential issues before they impact users.

By leveraging these tools, SAP administrators can gain deep insights into network traffic patterns, identify connectivity issues, and optimize performance across their complex SAP landscapes on AWS. Whether troubleshooting past incidents, validating configurations, or proactively monitoring network health, these AWS services provide the necessary visibility and control to ensure reliable and efficient SAP operations in the cloud.

How to get started:

1. Unleash the Power of AWS for Your SAP Workloads: Leverage AWS’s comprehensive suite of network analysis and troubleshooting tools to gain deep insights, validate configurations, and ensure optimal performance for your mission-critical SAP applications.

2. Maximize SAP Uptime and Availability on AWS: Utilize AWS Transit Gateway Network Manager, Reachability Analyzer, and Network Access Scope to visualize global network topologies, monitor health and performance metrics, and validate communication paths across your SAP landscape.

3. Proactive SAP Network Monitoring on AWS: Implement Amazon CloudWatch Application Insights for SAP to receive proactive monitoring, alerting, and early detection of potential network issues, ensuring uninterrupted operations for your SAP users.

4. Unravel SAP Network Complexities on AWS: Leverage VPC Flow Logs to capture detailed traffic information, enabling comprehensive analysis and investigation of past or current communication issues between your SAP systems.

5. Optimize SAP High Availability on AWS: Utilize AWS Network Manager’s Infrastructure Performance feature to access near real-time and historical network latency data across AWS regions and availability zones, ensuring optimal configuration of your SAP high availability architectures.

What’s Next

Use AWS Skill Builder to learn more about the services we discussed during this blog post, through relevant courses such as “AWS Network – Monitoring and Troubleshooting”. If you’ve never used AWS Skill Builder before, create a free account here and get learning!