Amazon SageMaker FAQs
General
What is the next generation of Amazon SageMaker?
The next generation of Amazon SageMaker is a unified platform for data, analytics, and AI. Bringing together widely-adopted AWS machine learning and analytics capabilities, the next generation of SageMaker delivers an integrated experience for analytics and AI with unified access to all your data. It enables you to collaborate and build faster from a unified studio (preview) using familiar AWS services for model development, generative AI, data processing, and SQL analytics, accelerated by Amazon Q Developer, the most capable generative AI assistant for software development. Additionally, you can access all your data whether it’s stored in data lakes, data warehouses, third-party applications, or federated data sources, with governance built in to address enterprise security needs.
How is the new SageMaker different from what I am using today for my ML workflows?
We expanded the widely-adopted Amazon SageMaker service with the comprehensive set of AWS data, analytics, and AI capabilities to deliver a unified platform of data, analytics, and AI. Going forward, the existing set of AI/ML capabilities in SageMaker for data wrangling, building, training, and deploying AI models will be referred to as Amazon SageMaker AI. Amazon SageMaker AI is integrated within the next generation of Amazon SageMaker and is also available as a standalone service for those who wish to focus specifically on building, training, and deploying AI and ML models at scale.
The next generation Amazon SageMaker includes:
- Amazon SageMaker Unified Studio (preview) – a single development environment to access and use familiar tools and functionality from purpose-built AWS analytics and AI/ML services like Amazon EMR, AWS Glue, Amazon Athena, Amazon Redshift, Amazon Bedrock, and Amazon SageMaker AI
- Amazon SageMaker Lakehouse – unified data access across Amazon S3 data lakes, Amazon Redshift data warehouses, and third-party and federated data sources
- Amazon SageMaker Data and AI governance – enabling you to discover, govern, and collaborate on data and AI securely
What capabilities are included with the next generation of Amazon SageMaker?
The next generation of Amazon SageMaker includes the following capabilities:
- Amazon SageMaker Unified Studio (preview) – Build with all your data and tools for analytics and AI in a single environment.
- Amazon SageMaker Lakehouse – Unify data across Amazon Simple Storage Service (Amazon S3) data lakes, Amazon Redshift data warehouses, and third-party and federated data sources.
- Data and AI governance – Securely discover, govern, and collaborate on data and AI with Amazon SageMaker Catalog, built on Amazon DataZone.
- Model development – Build, train, and deploy ML models and foundation models (FMs) with fully managed infrastructure, tools, and workflows with Amazon SageMaker AI (formerly Amazon SageMaker).
- Generative AI app development – Build and scale generative AI applications with Amazon Bedrock.
- SQL analytics – Gain insights with Amazon Redshift, the most price-performant SQL engine.
- Data processing – Analyze, prepare, and integrate data for analytics and AI using open-source frameworks on Amazon Athena, Amazon EMR, and AWS Glue.
Why should I use the next generation of Amazon SageMaker?
Amazon SageMaker is a unified platform for data, analytics, and AI. Bringing together widely-adopted AWS machine learning and analytics capabilities, the next generation of SageMaker delivers an integrated experience for analytics and AI with unified access to all your data. This unified approach helps you work more efficiently with your data, increase collaboration across teams, and enhance overall productivity.
Amazon SageMaker enables you to:
- Collaborate and build faster with a single data and AI development environment, using familiar AWS services for model development, generative AI, data processing, and SQL analytics.
- Develop and scale your AI use cases with a broad set of tools to train, customize, and deploy machine learning and foundation models, and rapidly create generative AI applications tailored to your business.
- Reduce data silos with an open lakehouse to unify all your data across Amazon S3 data lakes, Amazon Redshift data warehouses, third-party or federated data sources.
- Meet your enterprise security needs with built-in data and AI governance to control access to the right data, ML models, GenAI development artifacts, and compute, by the right user for the right purpose.
Can I use individual AWS services without using SageMaker?
Yes. You can continue to use individual AWS services such as Amazon SageMaker AI (formerly Amazon SageMaker), Amazon EMR for big data processing, AWS Glue, and Amazon Redshift for data warehousing independently based on your specific business requirements. There is no impact on how you use your individual services today.
Amazon SageMaker offers an additional benefit by providing a unified, user-friendly interface that enables access to these services. This approach helps you innovate more effectively with your data, increase collaboration across teams, and enhance overall productivity.
What existing AWS services can I use within SageMaker?
Amazon SageMaker brings together a comprehensive set of AWS AI and analytics services across Amazon SageMaker Unified Studio (preview), Amazon SageMaker Data and AI Governance, and Amazon SageMaker Lakehouse.
From Amazon SageMaker Unified Studio, you can access capabilities for data processing, SQL analytics, machine learning, and generative AI application development using existing AWS services. For data processing, services like Amazon Athena, AWS Glue, Amazon EMR, and Amazon Managed Workflows for Apache Airflow (Amazon MWAA) help you analyze, prepare, integrate, and orchestrate data for analytics and AI at any scale. For SQL analytics, Amazon Redshift integrates with Amazon SageMaker Lakehouse to provide SQL analytics on your unified data across Redshift data warehouses and Amazon S3 data lakes. Machine learning capabilities are delivered by Amazon SageMaker AI (previously known as Amazon SageMaker) for building, training, and deploying machine learning and foundation models. Additionally, you can develop generative AI applications using Amazon Bedrock IDE (preview).
Amazon SageMaker Data and AI Governance provides end-to-end, built-in governance through a unified data management experience in Amazon SageMaker Catalog, built on Amazon DataZone, to discover, govern, and collaborate on data and AI securely.
Amazon SageMaker Lakehouse is built on multiple catalog services across AWS Glue Data Catalog, AWS Lake Formation and Amazon Redshift to provide unified data access across Amazon S3 data lakes, Amazon Redshift data warehouses, third-party and federated data sources.
In addition, these services remain available as standalone capabilities through the AWS Management Console, giving you flexibility based on your use cases. We will enhance the Amazon SageMaker platform with more services in 2025 to unify experiences across analytics and AI. These include search analytics with Amazon OpenSearch Service, business intelligence with Amazon QuickSight, and streaming with the AWS streaming portfolio of services.
How do I get started with SageMaker?
Getting started with Amazon SageMaker is easy. The first step is to navigate to the Amazon SageMaker Unified Studio (preview) management console to create a domain, the organizing entity that connects your assets, users, and their projects for your business unit. In the management console, choose Create domain, and you will be presented with two options: Quick setup and Manual setup. Choose Quick setup to get started with a set of default configurations that can be customized later. Alternatively, choose Manual setup, which gives you full control over your settings as you create your domain. Once your domain is created, you can navigate to Amazon SageMaker Unified Studio (a browser-based web application), where you can use all your data and configured tools for analytics and AI. To learn more about how to get started, refer to the SageMaker documentation.
I currently use existing AWS services that are now included in SageMaker. How do I upgrade to the unified experience in SageMaker?
Your existing data development experiences in AWS services like Amazon EMR, AWS Glue, and Amazon Athena remain available. This means all existing code and resources you've created can continue to be used without disruption. We will provide easy-to-use upgrade scripts and comprehensive guidelines to bring your existing code base to the unified SageMaker experience in Q1 2025.
Is the next generation of Amazon SageMaker generally available?
We are extending Amazon SageMaker, a widely-adopted ML service, into a data and AI platform by integrating the comprehensive set of AWS data, analytics, and AI tools already used by customers today. We’ve also added new capabilities to the new SageMaker platform, including the SageMaker Unified Studio (preview), the SageMaker Lakehouse (GA), and the SageMaker Catalog (GA).
The new SageMaker platform includes virtually all of the components you need for SQL analytics with Amazon Redshift, data processing with Amazon EMR, AI model development with SageMaker AI, and generative AI app development with the new Bedrock IDE (preview), all delivered through an integrated development experience in the unified studio (preview).
Product experience
What is a project in SageMaker?
A project in SageMaker helps users organize their work and provides business context for the jobs they are performing. It provides a shared workspace where users can collaborate on data and artifacts such as ML models, notebooks, queries, dashboards, and generative AI applications. Projects are secured so that only users who are explicitly added to the project are able to access the data and tools within it. The project creates AWS Identity and Access Management (IAM) roles based on the project-selected capabilities (for example, a data lake) that provide users with the access required to do their job. Projects also provide work isolation within the same account, as well as a security boundary (security group and IAM roles).
How does Amazon Q Developer enhance productivity in SageMaker?
Amazon Q Developer is a generative AI conversational assistant integrated into the SageMaker experience that enhances your productivity throughout the development lifecycle. Through a chat interface, you can use natural language to ask questions about SageMaker, get help with code, and explore resources such as datasets. When you chat with Amazon Q Developer, it uses the context of your current conversation to provide personalized guidance and automated assistance throughout the SageMaker development experience. Amazon Q Developer can help you with code discussions, provide inline code completions, generate SQL queries, find and integrate datasets, and offer intelligent support tailored to your specific development needs.
By understanding the nuances of your work, Amazon Q Developer delivers targeted, context-aware assistance that streamlines your development process and enhances overall productivity in the SageMaker environment.
What tools are available in SageMaker for analytics and AI jobs?
SageMaker provides a unified, web-based environment that brings together powerful tools for complete data and AI workflows. Built-in IDEs enable AI/ML development, allowing you to process large data volumes from various sources using frameworks and services like PySpark, AWS Glue, and Amazon EMR.
For version control and workflow management, you can commit to Git and define workflows using Amazon MWAA. The integrated SQL query editor allows you to explore, analyze, and visualize data, with the ability to more easily save and share queries and create new datasets.
Model development is streamlined through familiar SageMaker AI tools, including Amazon SageMaker notebooks, JumpStart, HyperPod, MLflow, Pipelines, and Model Registry. Throughout these processes, Amazon Q Developer is integrated across SageMaker tools, providing assistance with data discovery, preparation, pipeline creation, model building and training, and code deployment.
How do I build generative AI applications in SageMaker?
Bedrock IDE (preview), integrated within SageMaker Unified Studio (preview), provides a comprehensive environment for developing generative AI applications. This intuitive interface helps you accelerate application development in a trusted and secure setting, offering access to the high-performing FMs and advanced customization capabilities of Amazon Bedrock.
You can use features such as Amazon Bedrock Knowledge Bases, Guardrails, Agents, and Prompt Flows, allowing your team to rapidly tailor generative AI applications to your specific business needs while adhering to your responsible AI guidelines. The platform supports governed access and enables secure cross-functional collaboration through access-controlled sharing and Git-backed auditability.
What types of data sources does SageMaker support?
Amazon SageMaker Lakehouse unifies data across AWS data lakes, data warehouses, third-party applications, and operational databases. It gives you fast, streamlined access to your data in one place through zero-ETL integrations, federated query sources, and 240+ connectors.
How do I ensure that the data in SageMaker is properly governed and secured?
Amazon SageMaker provides end-to-end, built-in governance through a unified data management experience in Amazon SageMaker Catalog, built on Amazon DataZone. This approach enables you to catalog, discover, access, analyze, and govern both structured and unstructured data assets, machine learning models, and applications across your organization. The platform ensures that the right people have the appropriate access to the right assets, maintaining robust security and compliance standards.
How do I create and manage data pipelines in SageMaker?
You can create and manage data pipelines in SageMaker in multiple ways. SageMaker Data Processing brings together Amazon EMR, Amazon Athena, AWS Glue, and Amazon MWAA to help you integrate, prepare, and explore your data in a unified experience. You can build pipelines for ML-specific model orchestration with SageMaker AI and data pipelines and workflows with Amazon MWAA. You can also use zero-ETL integrations, which simplify data movement by removing complex extract, transform, and load (ETL) processes and enabling direct data replication across services. Visit What is zero-ETL? to learn more.
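As a rough illustration of what workflow orchestration means here: Amazon MWAA runs Apache Airflow, which models a workflow as a directed acyclic graph (DAG) of tasks and runs each task only after the tasks it depends on. The sketch below is not Airflow or SageMaker code. It uses only the Python standard library, and the task names are hypothetical, to show how a dependency graph determines execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical task names for illustration only.
# Each task maps to the set of tasks that must finish before it starts.
pipeline = {
    "extract_orders": set(),
    "clean_orders": {"extract_orders"},
    "build_features": {"clean_orders"},
    "train_model": {"build_features"},
}


def execution_order(tasks):
    """Return task names in an order that respects every dependency.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    which is why a workflow graph must be acyclic.
    """
    return list(TopologicalSorter(tasks).static_order())


if __name__ == "__main__":
    for step, name in enumerate(execution_order(pipeline), start=1):
        print(step, name)
```

Because each hypothetical task depends on the one before it, the only valid order is the chain itself. An orchestrator such as Airflow adds scheduling, retries, and monitoring on top of this ordering.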
Pricing
How does SageMaker pricing work?
When using Amazon SageMaker, you are charged according to the pricing model of each AWS service you access through Amazon SageMaker. There is no separate cost for using the Amazon SageMaker Unified Studio (preview), the data and AI development environment that provides the integrated experience within Amazon SageMaker. For more information, visit the Amazon SageMaker pricing page.
Can I try SageMaker for free?
The SageMaker Free Tier helps you quickly get started innovating with data and AI at no cost. Refer to SageMaker pricing for details.
Availability
In which AWS Regions is SageMaker available?
The next generation of Amazon SageMaker is available in the US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland) AWS Regions. Amazon SageMaker Unified Studio and Amazon Bedrock IDE are available in preview in these AWS Regions. For future updates, please review the AWS Regional Services List.
Does SageMaker offer an SLA?
Yes. SageMaker is engineered to provide the consistent performance and uptime that mission-critical analytics and AI workloads demand. Because SageMaker is a unified platform composed of multiple service components, service availability is tied to the service component used.
For detailed information on the service level agreements (SLAs) for each individual service, refer to its respective SLA documentation. SLAs will provide you with the specific uptime guarantees and reliability commitments for the various services that make up the SageMaker experience.
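To make an uptime commitment concrete, the following sketch converts an uptime percentage into the downtime it permits over a 30-day period. The percentages are placeholders chosen for illustration, not figures from any AWS SLA; take the actual commitment for each component from its SLA documentation.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted over `days` days at the given uptime."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


if __name__ == "__main__":
    # Placeholder percentages, not taken from any AWS SLA.
    for percent in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(percent)
        print(f"{percent}% uptime allows about {minutes:.1f} minutes per 30 days")
```

Under this arithmetic, a 99.9% commitment over 30 days leaves roughly 43 minutes of permitted downtime.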
Available SLA documentation includes: