AWS Partner Network (APN) Blog
How to Enable Mainframe Data Analytics on AWS Using BMC AMI Cloud Analytics
By Gil Peleg, Sr. Director of Product Management – BMC Software
By Sunil Bemarkar, Sr. Partner Solutions Architect – AWS
Data insight is critical for businesses to gain a competitive advantage. Mainframe proprietary storage solutions such as virtual tape libraries (VTLs) hold valuable data locked in a platform with complex tools. This can lead to higher compute and storage costs, and make it harder to retain existing employees or train new ones.
When mainframe data is stored in a cloud storage service, however, it can be accessed by a rich ecosystem of applications and analytics tools.
BMC Software is an AWS Specialization Partner with the Migration and Modernization Competency that helps enterprises benefit from cloud economics while accelerating digital transformation and adoption of hybrid cloud architectures.
BMC enables mainframe customers to benefit from cloud technologies and economics with backup and archive directly to AWS cloud storage services, such as Amazon Glacier and Amazon Simple Storage Service (Amazon S3).
BMC AMI Cloud Analytics makes data delivered to Amazon Web Services (AWS) storage services readable and structured, enabling new analytics use cases. In this post, we will describe BMC AMI Cloud’s new features and benefits with a step-by-step walkthrough and several customer use cases.
Backup and Archival with BMC AMI Cloud Data
BMC AMI Cloud’s patented technology lets mainframe customers take advantage of AWS storage services, from affordable long-term options like Amazon Glacier and Glacier Deep Archive, to highly durable, geographically dispersed, and flexible low-cost options such as Amazon S3 object storage.
Amazon Elastic Block Store (Amazon EBS) and Amazon Elastic File System (Amazon EFS) are also supported.
Figure 1 – BMC AMI Cloud Data architecture.
In the architecture above, you can see the two main components of the BMC AMI Cloud Data software product: a lightweight agent running on z/OS that provides secure data delivery and retrieval functions directly to Amazon S3, and a management server running on AWS.
BMC AMI Cloud Data provides storage, backup, restore, archive/migrate, and automatic recall for all mainframe data sets, volume types and z/OS UNIX files, as well as space management, stand-alone restore, and disaster recovery.
BMC AMI Cloud can run side-by-side with existing data management solutions to provide cloud capabilities and cost reductions. For more dramatic cost reductions, it can also completely replace on-premises VTL and legacy data management tools.
Mainframe Data Analytics via BMC AMI Cloud
BMC AMI Cloud Analytics recently added new features for delivering mainframe data directly to AWS and transforming it there, enabling easy and secure integration with popular cloud analytics tools, data lakes, data warehouses, databases, and ETL solutions running on AWS. The BMC AMI Cloud Analytics solution unifies data delivery for analytics with backup and space management processes.
Mainframe customers typically use virtual tapes as a secondary storage solution for three types of data:
- Daily incremental data set backups.
- Migrated/archived data sets as part of daily space management processing.
- DB2 database image copies.
When data is stored on virtual tapes, it can’t be accessed or processed by other tools. To avoid this issue, customers send daily updates and data sets to other platforms using ETL tools and data transfer software.
This double data movement incurs high costs and wastes CPU cycles. BMC AMI Cloud offers a new way of writing mainframe data directly to a storage platform, where it can be accessed and consumed by analytics tools without a second transfer.
Data set backups and archives created by BMC AMI Cloud Data and BMC AMI Cloud Vault in Amazon S3 can be transformed into readable textual or binary format and processed by analytics tools such as Amazon Athena. DB2 image copies delivered by BMC AMI Cloud products directly to S3 can be transformed to CSV or JSON format so that tables can be easily loaded into modern databases or data warehouses such as Amazon Aurora and Amazon Redshift.
How BMC AMI Cloud Analytics Works
BMC AMI Cloud Analytics runs on zIIP engines and delivers data sets directly to Amazon S3 cloud storage.
Compression and encryption can be optionally applied before data is sent over the network. On AWS, BMC AMI Cloud Analytics transforms data sets and databases to standard file formats (e.g. CSV or JSON) that can be consumed by analytics services.
When used together with BMC AMI Cloud Data, data transformation in the cloud is automatically applied to backed up and archived data, leveraging existing storage management scheduling policies and life cycle management.
Because mainframe data is kept in the cloud in its original format, it can be transformed in multiple ways to support future needs as your application requirements change.
Figure 2 – BMC AMI Cloud Analytics architecture.
BMC AMI Cloud uses FlashCopy, Concurrent Copy, and DFDSS for storage replication to deliver data to Amazon S3, where it’s organized, indexed, and tagged with metadata to enable identification, fast retrieval, and transformation.
Data sets can be transformed individually or extracted from full volume dumps. For example, sequential, partitioned, and VSAM data sets are transformed to JSON files, and DB2 image copies are transformed to CSV files.
Currently supported mainframe data sources for transformation include*:
- DB2 image copies
- VSAM data sets
- Sequential data sets
- Partitioned data sets
- Extended format data sets
* Data sources are updated regularly; please inquire with BMC for latest support.
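To make the idea of transforming a mainframe record into a readable format more concrete, here is a minimal Python sketch. It is not BMC AMI Cloud Analytics code and says nothing about how the product works internally. The record layout (an 8-byte EBCDIC name followed by a 3-byte packed-decimal balance), the code page (037), and all names are assumptions made for illustration.

```python
import json
from decimal import Decimal

# Hypothetical record layout (an assumption, not taken from the product):
#   bytes 0-7   customer name, EBCDIC text (code page 037), space padded
#   bytes 8-10  balance, packed decimal (COMP-3) with 2 implied decimal places
NAME_LEN = 8
BALANCE_LEN = 3
BALANCE_SCALE = 2


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal field: two digits per byte, sign in the last nibble."""
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)
    value = int("".join(str(d) for d in digits))
    if raw[-1] & 0x0F == 0x0D:  # a sign nibble of 0xD marks a negative value
        value = -value
    return Decimal(value).scaleb(-scale)


def decode_record(record: bytes) -> dict:
    """Turn one fixed-length EBCDIC record into a JSON-friendly dictionary."""
    name = record[:NAME_LEN].decode("cp037").rstrip()
    balance = unpack_comp3(record[NAME_LEN:NAME_LEN + BALANCE_LEN], BALANCE_SCALE)
    return {"name": name, "balance": str(balance)}


# Build a sample record in EBCDIC so the sketch runs without mainframe data.
sample = "ACME".ljust(NAME_LEN).encode("cp037") + bytes([0x12, 0x34, 0x5C])
print(json.dumps(decode_record(sample)))
```

Real data sets need a layout description (for example, from a COBOL copybook) to know where each field starts and how it is encoded, which is why the transformation step depends on metadata about the source.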
Customer Use Cases
In this section, we describe common customer use cases for leveraging mainframe data for analytics on AWS.
Data Retention Compliance
For companies with regulatory requirements to keep data for long retention periods, BMC AMI Cloud Analytics can securely archive mainframe data to Amazon S3, Amazon Glacier, or Glacier Deep Archive at attractive costs. Data sets remain available for transparent automatic recall by mainframe applications.
For additional protection and compliance with regulations (such as SEC 17a-4), Amazon S3 Object Lock may be applied to data delivered by BMC AMI Cloud Analytics. Object Lock provides a Write-Once-Read-Many (WORM) protection model that prevents data from being deleted or overwritten.
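As a rough illustration of the WORM model, the Python sketch below builds the parameters for setting a compliance-mode retention period on an object that is already in S3. The bucket name, object key, and retention length are hypothetical, and the article does not say how retention is configured in practice, so treat this as one possible approach rather than the product's procedure.

```python
from datetime import datetime, timedelta, timezone


def build_retention_request(bucket, key, retain_days, now=None):
    """Build keyword arguments for the S3 PutObjectRetention operation.

    COMPLIANCE mode means the retention period cannot be shortened or removed
    until the retain-until date has passed.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "Bucket": bucket,
        "Key": key,
        "Retention": {
            "Mode": "COMPLIANCE",
            "RetainUntilDate": now + timedelta(days=retain_days),
        },
    }


# Hypothetical bucket and key names, used only for illustration.
request = build_retention_request(
    bucket="example-mainframe-archive",
    key="backups/PROD.PAYROLL.DATA",
    retain_days=7 * 365,
)
print(request)
```

With boto3 installed and credentials configured, these arguments could be passed as `boto3.client("s3").put_object_retention(**request)`. The bucket must have been created with Object Lock enabled for the call to succeed.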
Some companies retain data even after their mainframe platform has been decommissioned. The BMC AMI Cloud management interface can be used to search for mainframe data sets stored in cloud storage, and then invoke BMC AMI Cloud Analytics running on AWS to make data available for applications and analytics tools with no need for a mainframe or for retaining old equipment.
Data Warehouse and Data Lake
Data stored on proprietary storage systems is complex to access and manipulate from outside the mainframe platform. Mainframe ETL and data integration services therefore transform the data on the mainframe before loading it into a target data store.
BMC AMI Cloud Analytics offers a new approach by delivering data to the cloud in its original format, enabling any transformation to occur outside the mainframe platform without access to the mainframe. This approach is known as extract, load, transform (ELT) in contrast to the traditional extract, transform, load (ETL) used by legacy mainframe tools.
Business Intelligence
Mainframes store core business data that customers analyze using modern business intelligence (BI) tools. Transforming this data on the mainframe for use by those tools consumes expensive mainframe CPU cycles and increases customer software license charges.
With BMC AMI Cloud Analytics, data transformation runs on AWS and consumes no mainframe CPU cycles. Data is then loaded directly into AWS analytics services such as Amazon Athena, or processed through Redshift or Aurora.
Operational Intelligence
Mainframes generate vast amounts of machine data, which is hard to use for DevOps, monitoring, and automation processes when stored on-premises. BMC AMI Cloud Analytics can send this data to S3, where it can be transformed and used by services running on AWS.
For example, System Management Facility (SMF) records, which are regularly collected by mainframe customers into tape or generation data sets, can be sent via BMC AMI Cloud Analytics to Amazon OpenSearch Service (successor to Amazon Elasticsearch Service) as part of the standard SMF collection process. This can be used together with machine data generated by other platforms to provide a complete monitoring picture.
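The sketch below shows, in Python, what a batch of already-transformed SMF records could look like when formatted for the OpenSearch bulk indexing API, which expects newline-delimited JSON with an action line before each document. The index name and the record fields are simplified assumptions for illustration; real SMF records carry far more detail, and the article does not describe the exact format BMC AMI Cloud Analytics produces.

```python
import json


def build_bulk_body(index_name, records):
    """Format records as newline-delimited JSON for the OpenSearch _bulk API."""
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index_name}}))  # action line
        lines.append(json.dumps(record))  # document line
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical, heavily simplified job accounting records.
smf_records = [
    {"record_type": 30, "system_id": "SYS1", "job_name": "PAYROLL", "cpu_seconds": 1.25},
    {"record_type": 30, "system_id": "SYS1", "job_name": "BILLING", "cpu_seconds": 0.4},
]

body = build_bulk_body("smf-records", smf_records)
print(body)
```

The resulting body would be sent as a POST request to the domain's `_bulk` endpoint with the `application/x-ndjson` content type, using whatever authentication the domain requires.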
Walkthrough
In this section, we demonstrate how to deliver and load a DB2 table into Amazon Athena and Redshift. Once data is on AWS, it can be queried and analyzed by a variety of AWS services.
Task #1: Deliver DB2 image copy to Amazon S3 and transform into a CSV file
The following JCL job is used to deliver DB2 data, stored in a DB2 table image copy, directly to S3 and then transform it into a CSV file that can be queried by Amazon Athena or loaded into Redshift.
This JCL or similar job can be integrated into mainframe automation processes or scheduled by standard job schedulers to ensure ongoing mainframe data delivery to AWS.
Figure 3 – JCL job to copy data to Amazon S3 and transform to CSV format.
The first job step creates an image copy from the specified DB2 table. Image copies (ICs) are a common DB2 table backup format, created using standard DB2 utilities as part of the daily DB2 backup process. ICs can be incremental or may contain the full table data.
In the second job step, the DB2 image copy is delivered by BMC AMI Cloud Analytics to S3. For simplicity, this job uses a predefined data delivery policy, but many controls can be set such as the S3 region, bucket name, object name prefix, and AWS credentials. This step may also be performed from the BMC AMI Cloud graphical management interface.
The last job step invokes the BMC AMI Cloud Analytics service on AWS, to transform the image copy on S3 to a CSV file.
Task #2: Query data in Amazon S3 using Amazon Athena
The screenshot below shows how to define a table in Amazon Athena from the transformed DB2 table in S3. The pictured DDL statement defines an external table over a file stored in S3, along with the schema used to read the file.
In this case, the file has been created by BMC AMI Cloud Data backup of a DB2 table and transformed to CSV format by BMC AMI Cloud Analytics.
Figure 4 – Amazon Athena query to define table from DB2 CSV file.
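For readers without access to the screenshot, the following Python sketch generates a comparable Athena DDL statement. The database, table, column names, column types, and S3 location are all hypothetical, and the statement assumes a plain comma-delimited file with no header row and no quoted fields; it is not a copy of the statement in the figure.

```python
def athena_external_table_ddl(database, table, columns, s3_location):
    """Build a CREATE EXTERNAL TABLE statement for a comma-delimited file in S3.

    columns is a list of (name, athena_type) pairs. Inputs are assumed to be
    trusted values; this sketch does no escaping.
    """
    column_sql = ",\n".join(f"  `{name}` {col_type}" for name, col_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"{column_sql}\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{s3_location}'"
    )


# Hypothetical schema and location; Athena expects a folder prefix, not a file.
ddl = athena_external_table_ddl(
    database="mainframe",
    table="customers",
    columns=[("customer_id", "int"), ("name", "string"), ("balance", "decimal(10,2)")],
    s3_location="s3://example-bucket/db2/customers/",
)
print(ddl)
```

The printed statement can be pasted into the Athena query editor. Because the table is external, Athena reads the CSV file in place and no data is copied.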
After the table is defined, it can be queried just like any other table and there’s no difference between mainframe data and data that originated from other platforms.
The screenshot below shows the result of querying all columns in the table.
Figure 5 – Amazon Athena query result showing DB2 data.
Task #3: Load DB2 table into Amazon Redshift data warehouse
The following screenshot demonstrates how to load a table in CSV format from Amazon S3 to Redshift.
The pictured statements define a table schema with multiple columns and copy the records from a file stored in S3 into the table based on the defined schema. The file was stored in Amazon S3 using BMC AMI Cloud Data or BMC AMI Cloud Vault backup of a DB2 table.
Figure 6 – Amazon Redshift table creation and DB2 CSV data load.
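As a stand-in for the screenshot, this Python sketch generates a comparable pair of Redshift statements: one that creates the table and one that loads the CSV file with the COPY command. The table name, columns, S3 URI, and IAM role ARN are hypothetical placeholders, and the statements are an approximation rather than a transcription of the figure.

```python
def redshift_load_statements(table, columns, s3_uri, iam_role_arn):
    """Build CREATE TABLE and COPY statements for loading a CSV file from S3.

    columns is a list of (name, redshift_type) pairs. Inputs are assumed to be
    trusted values; this sketch does no escaping.
    """
    column_sql = ",\n".join(f"  {name} {col_type}" for name, col_type in columns)
    create = f"CREATE TABLE IF NOT EXISTS {table} (\n{column_sql}\n);"
    copy = (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV;"
    )
    return create, copy


# Hypothetical names; the IAM role must allow Redshift to read the S3 object.
create_sql, copy_sql = redshift_load_statements(
    table="customers",
    columns=[("customer_id", "INTEGER"), ("name", "VARCHAR(64)"), ("balance", "DECIMAL(10,2)")],
    s3_uri="s3://example-bucket/db2/customers/customers.csv",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftLoadRole",
)
print(create_sql)
print(copy_sql)
```

Unlike the Athena external table, COPY physically loads the records into Redshift storage, so the load needs to be repeated when a newer image copy is delivered.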
After the table is created, it can be queried just like any other table and there is no difference between mainframe data and data that originated from other platforms.
The screenshot below shows the result of querying all columns in the table.
Figure 7 – Amazon Redshift query result showing DB2 data.
After completing the tasks above, you will be able to use mainframe data with standard analytics services on AWS, such as Amazon Athena and Amazon Redshift.
Summary
In this post, we discussed how to securely and efficiently deliver mainframe data directly to storage and analytics services on AWS using BMC AMI Cloud Data and BMC AMI Cloud Analytics.
These software solutions help avoid duplicate data movement for backup and data analytics, enabling you to leverage mainframe data to gain deep business insights.
BMC Software – AWS Partner Spotlight
BMC is an AWS Specialization Partner that helps enterprises benefit from cloud economics while accelerating digital transformation and adoption of hybrid cloud architectures.