AWS HPC Blog

Tag: Machine Learning

Protein language model training with NVIDIA BioNeMo framework on AWS ParallelCluster

Protein language model training with NVIDIA BioNeMo framework on AWS ParallelCluster

In this new post, we discuss pre-training ESM-1nv for protein language modeling with NVIDIA BioNeMo on AWS. Learn how you can efficiently deploy and customize generative models like ESM-1nv on GPU clusters with ParallelCluster. Whether you’re studying protein sequences, predicting properties, or discovering new therapeutics, this post has tips to accelerate your protein AI workloads on the cloud.

Enhancing ML workflows with AWS ParallelCluster and Amazon EC2 Capacity Blocks for ML

Enhancing ML workflows with AWS ParallelCluster and Amazon EC2 Capacity Blocks for ML

No more guessing if GPU capacity will be available when you launch ML jobs! EC2 Capacity Blocks for ML let you lock in GPU reservations so you can start tasks on time. Learn how to integrate Caacity Blocks into AWS ParallelCluster to optimize your workflow in our latest technical blog post.

How Amazon’s Search M5 team optimizes compute resources and cost with fair-share scheduling on AWS Batch

How Amazon’s Search M5 team optimizes compute resources and cost with fair-share scheduling on AWS Batch

In this post, we share how Amazon Search optimizes their use of accelerated compute resources using AWS Batch fair-share scheduling to schedule distributed deep learning workloads.