AWS Machine Learning Blog

Category: Storage

Build flexible and scalable distributed training architectures using Kubeflow on AWS and Amazon SageMaker

In this post, we demonstrate how Kubeflow on AWS (an AWS-specific distribution of Kubeflow), used with AWS Deep Learning Containers and Amazon Elastic File System (Amazon EFS), simplifies collaboration and provides flexibility in training deep learning models at scale on both Amazon Elastic Kubernetes Service (Amazon EKS) and Amazon SageMaker using a hybrid architecture. […]

Configure a custom Amazon S3 query output location and data retention policy for Amazon Athena data sources in Amazon SageMaker Data Wrangler

Amazon SageMaker Data Wrangler reduces the time that it takes to aggregate and prepare data for machine learning (ML) from weeks to minutes in Amazon SageMaker Studio, the first fully integrated development environment (IDE) for ML. With Data Wrangler, you can simplify the process of data preparation and feature engineering, and complete each step of […]

Use RStudio on Amazon SageMaker to create regulatory submissions for the life sciences industry

Pharmaceutical companies seeking approval from regulatory agencies such as the US Food & Drug Administration (FDA) or Japanese Pharmaceuticals and Medical Devices Agency (PMDA) to sell their drugs on the market must submit evidence to prove that their drug is safe and effective for its intended use. A team of physicians, statisticians, chemists, pharmacologists, and […]

Cloud-based medical imaging reconstruction using deep neural networks

Medical imaging techniques like computed tomography (CT), magnetic resonance imaging (MRI), medical x-ray imaging, ultrasound imaging, and others are commonly used by doctors for various reasons. Some examples include detecting changes in the appearance of organs, tissues, and vessels, and detecting abnormalities such as tumors and various other types of pathologies. Before doctors can use […]

Demystifying machine learning at the edge through real use cases

October 2023: Starting on April 26, 2024, you can no longer access Amazon SageMaker Edge Manager. For more information about continuing to deploy your models to edge devices, see SageMaker Edge Manager end of life. Edge is a term that refers to a location, far from the cloud or a big data center, where you […]

Build and deploy a scalable machine learning system on Kubernetes with Kubeflow on AWS

In this post, we demonstrate Kubeflow on AWS (an AWS-specific distribution of Kubeflow) and the value it adds over open-source Kubeflow through the integration of highly optimized, cloud-native, enterprise-ready AWS services. Kubeflow is the open-source machine learning (ML) platform dedicated to making deployments of ML workflows on Kubernetes simple, portable and scalable. Kubeflow provides many […]

Securely search unstructured data on Windows file systems with the Amazon Kendra connector for Amazon FSx for Windows File Server

Critical information can be scattered across multiple data sources in your organization, including sources such as Windows file systems stored on Amazon FSx for Windows File Server. You can now use the Amazon Kendra connector for FSx for Windows File Server to index documents (HTML, PDF, MS Word, MS PowerPoint, and plain text) stored in […]

Machine learning inference at scale using AWS serverless

With the growing adoption of machine learning (ML) across industries, there is an increasing demand for faster and easier ways to run ML inference at scale. ML use cases, such as manufacturing defect detection, demand forecasting, fraud surveillance, and many others, involve tens or thousands of datasets, including images, videos, files, documents, and other artifacts. […]

Scan Amazon S3 buckets for content moderation using S3 Batch and Amazon Rekognition

Dealing with content at large scale is often challenging, costly, and a heavy lift. The volume of user-generated and third-party content has been increasing substantially in industries like social media, ecommerce, online advertising, and media sharing. Customers may want to review this content to ensure that it follows corporate governance and regulations. But they […]

Deploy multiple machine learning models for inference on AWS Lambda and Amazon EFS

You can deploy machine learning (ML) models for real-time inference with large libraries or pre-trained models. Common use cases include sentiment analysis, image classification, and search applications. These ML jobs typically vary in duration and require instant scaling to meet peak demand. You want to process latency-sensitive inference requests and pay only for what you […]