AWS Storage Blog

Category: Artificial Intelligence

Siemens builds Datalake2Go on AWS to analyze disparate data globally

Siemens is a technology company focused on industry, infrastructure, transport, and healthcare. From resource-efficient factories, resilient supply chains, and smart buildings and grids, to cleaner and more comfortable transportation and advanced healthcare, the company creates technology with purpose, adding real value for its customers. Siemens technology is everywhere, supporting the critical infrastructure and vital industries […]

Optimizing enterprise MLOps in the cloud with Domino Data Lab and Amazon Elastic File System

Domino Data Lab is an AWS Partner Network (APN) partner that provides a central system of record for data science activity across an organization. The Domino solution delivers orchestration for all data science artifacts, including AWS infrastructure, data and services. As part of the solution, Domino’s platform leverages the scale, security, reliability, and cost-effectiveness of […]

How Visual Layer builds high quality datasets on Amazon S3

Companies from different industries use data to help their Artificial Intelligence (AI) and Machine Learning (ML) systems make intelligent decisions. For ML systems to work well, it is crucial to make sure that the massive datasets used for training ML models are of the highest quality, minimizing noise that can contribute to less-than-optimal performance. Processing […]

Accelerating GPT large language model training with AWS services

GPT, or Generative Pre-trained Transformer, is a language model that has shown remarkable progress in various vertical industries. This technology has been used to generate human-like text in fields such as finance, healthcare, legal, marketing, and many others. In finance, GPT is being used to analyze financial data, generate reports, and assist with decision-making. In […]

Machine Learning with Kubeflow on Amazon EKS with Amazon EFS

Training machine learning models involves multiple steps, and it gets more complex and time-consuming when the training data set is in the range of hundreds of GBs. Data scientists run a large number of experiments and research, which includes testing and training a large number of models. Kubeflow provides various ML capabilities […]

High-performance cloud storage comes of age with Amazon FSx for Lustre

The rapid maturation of cloud tools for high-performance workloads in the past several years has made it possible for household names like T-Mobile, Toyota, and Rivian to move their high-performance analytics and AI/ML environments to the cloud. These are hugely data-intensive workflows that many companies five years ago believed would never be able to be […]

A gene-editing prediction engine with iterative learning cycles built on AWS

NRGene develops cutting-edge genomic analytics products that are reshaping agriculture worldwide. Among our customers are some of the biggest and most sophisticated companies in seed-development, food and beverages, paper, rubber, cannabis, and more. In the middle of 2020, NRGene joined a consortium of companies and academic institutions to build the best-in-class gene-editing prediction platform to […]

Reliable event processing with Amazon S3 Event Notifications

As AWS Solutions Architects, we help customers understand and plan AWS architectures that meet their business goals while remaining scalable, cost-effective, secure, and reliable. One pattern that comes up frequently is the desire to move from manual or polling-based strategies to reliable event processing, also known as event-driven architecture (EDA). This approach dovetails […]

Using high-performance storage for machine learning workloads on Kubernetes

Organizations are modernizing their applications by adopting containers and microservices-based architectures. Many customers are deploying high-performance workloads on containers to power microservices architecture, and require access to low latency and high throughput shared storage from these containers. Because containers are transient in nature, these long-running applications require data to be stored in durable storage. Amazon FSx […]

New on the Machine Learning blog: Speed up training on Amazon SageMaker using Amazon FSx for Lustre and Amazon EFS file systems

Deploying analytics applications and machine learning models requires storage that can scale in capacity and performance to handle workload demands with high throughput and low-latency file operations. A common use case we’re seeing centers around data science teams doing some form of analytics (e.g., machine learning, genomics). AWS offers two scalable, durable, highly available file […]