AWS Spatial Computing Blog

3D Gaussian Splatting: Performant 3D Scene Reconstruction at Scale

Introduction

Creating high-quality 3D content is traditionally a complex, time-consuming, and resource-intensive process. From 3D modeling and texturing to final scene assembly and rendering, companies require specialized skillsets. They must invest significant effort to craft realistic virtual environments, assets, and characters – particularly if their goal is to digitally recreate elements from the real world.

For example, a previous AWS blog details how reality capture technologies offer a compelling solution for digitizing real-world objects or environments through techniques like LiDAR scanning, photogrammetry, and Neural Radiance Fields (NeRFs). The key idea is to capture multiple perspectives of a scene, then analyze similarities and differences across viewpoints to estimate scene depth and geometry. This automated approach bypasses the need for manual 3D modeling, delivering high-quality results in a fraction of the time required by traditional methods. However, these techniques can require expensive, specialized hardware, are compute-intensive, and typically offer poor real-time rendering performance.

Recently, a new 3D scene reconstruction and rasterization technique called 3D Gaussian Splatting has demonstrated the potential to accelerate this workflow and still offer high quality results. In this post, we will explore what 3D Gaussian Splatting is, its advantages over previous 3D reconstruction and reality capture approaches, what this means for 3D content creation, and how organizations can leverage AWS to utilize 3D Gaussian Splatting to reconstruct real-world 3D assets at scale.

What is 3D Reconstruction?

The multiple stages of the 3D reconstruction process illustrated using a park bench

Figure 1. 3D reconstruction uses 2D input images to infer 3D depth by generating three-dimensional representations of subjects using computer vision techniques.

Before delving into 3D Gaussian Splatting, it is essential to understand the concept of 3D reconstruction. This process uses 2D inputs, such as photographs or video frames, to recreate a 3D scene or object. For example, this could mean capturing a scene with a camera and converting it into an interactive 3D environment. Unlike 3D scanning techniques such as LiDAR, which provide depth data explicitly as an input, 3D reconstruction techniques must estimate the camera pose and infer depth information from 2D images in order to reconstruct the scene in 3D space. In essence, 3D reconstruction is a complex computer vision problem that uses techniques like multi-view stereo, structure from motion, and shape from shading/texture/focus to recreate 3D from 2D.

Photogrammetry, the oldest form of 3D reconstruction, has been widely used in various applications since the 1980s, originally in aerial photography, surveying, and mapping. As explained in a previous AWS blog, photogrammetry leverages structure from motion and multi-view stereo to generate a point cloud of the scene. A polygon-based 3D mesh with overlaid textures can then be created from the point cloud using techniques such as Poisson or ball-pivoting surface reconstruction. More recently, in 2020, Neural Radiance Fields (“NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis”) introduced a new 3D reconstruction technique that represents reconstructed scenes as volumes rather than meshes. This is achieved by employing neural networks to estimate light radiance properties and synthesize artificial, novel viewpoints around a subject. Compared to photogrammetry, this approach is generally more efficient and generates outputs faster; however, the level of detail is sometimes inferior.
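The geometric core of structure from motion and multi-view stereo is triangulation: recovering a 3D point from its 2D projections in two calibrated views. As a rough illustration (not tied to any particular photogrammetry package, and with made-up camera values), a minimal linear (DLT) triangulation might look like this:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2 are 3x4 camera projection matrices; x1, x2 are the (u, v)
    image coordinates of the same point seen from each camera.
    """
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector for the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two identity-intrinsics cameras, the second shifted one unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.2, -0.1, 5.0])

X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
print(np.round(X_est, 6))  # recovers the original 3D point
```

Real photogrammetry pipelines solve this jointly for many thousands of points while simultaneously estimating the camera poses themselves, which is what makes the problem computationally demanding.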

What is 3D Gaussian Splatting?

Demonstration of the volumetric nature of 3D gaussian splats using a park bench

Figure 2. 3D Gaussian Splats are volumetric representations of 3D space and are different to traditional, polygon-based meshes.

In 2023, a paper published at SIGGRAPH introduced 3D Gaussian Splatting (“3D Gaussian Splatting for Real-Time Radiance Field Rendering”) as a new 3D reconstruction method building upon the concept of NeRFs. The key difference is that, rather than creating novel viewpoints by using neural networks to estimate light radiance as NeRFs do, 3D Gaussian Splatting generates novel viewpoints by populating a 3D space with view-dependent “gaussians.” These appear as fuzzy 3D primitives whose colors, densities, and positions are adjusted to mimic light behavior.

Instead of drawing triangles for a polygonal mesh, 3D Gaussian Splatting draws (or “splats”) gaussians to create a volumetric representation in which billions of gaussians can be used to recreate complex real-world environments. More importantly, 3D Gaussian Splatting differs from NeRFs in that it does not use neural networks; instead, it leverages traditional machine learning optimization methods such as stochastic gradient descent, a similar optimization process but without network layers, making it much more computationally efficient.
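To make the optimization idea concrete, the toy sketch below fits a handful of 1D gaussians to a target signal using plain gradient descent on their positions, widths, and amplitudes. This is a drastic simplification of the real method, which optimizes millions of anisotropic 3D gaussians against rendered training images; all names and values here are illustrative:

```python
import numpy as np

# Toy 1D analogue of splat optimization: all values are illustrative.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 200)
# The "scene" to reproduce: a signal with two bumps.
y = np.exp(-(x - 0.3) ** 2 / 0.02) + 0.5 * np.exp(-(x + 0.4) ** 2 / 0.05)

K = 8                               # number of gaussians
mu = rng.uniform(-1, 1, K)          # positions
s = np.full(K, 0.3)                 # widths
a = rng.uniform(0.1, 0.5, K)        # amplitudes (stand-ins for color/opacity)

lr = 0.02
loss0 = loss = None
for step in range(3000):
    e = np.exp(-(x[:, None] - mu) ** 2 / (2 * s ** 2))   # (200, K) basis
    r = (a * e).sum(axis=1) - y                          # residual vs target
    loss = np.mean(r ** 2)
    if loss0 is None:
        loss0 = loss
    # Analytic gradients of the mean-squared error for each parameter.
    grad_a = 2 * np.mean(r[:, None] * e, axis=0)
    grad_mu = 2 * np.mean(r[:, None] * a * e * (x[:, None] - mu) / s ** 2, axis=0)
    grad_s = 2 * np.mean(r[:, None] * a * e * (x[:, None] - mu) ** 2 / s ** 3, axis=0)
    a = a - lr * grad_a
    mu = mu - lr * grad_mu
    s = np.clip(s - lr * grad_s, 0.05, None)             # keep widths positive

print(f"loss: {loss0:.4f} -> {loss:.4f}")  # the fit improves as gaussians adapt
```

Because every parameter is optimized directly rather than through network layers, each step is cheap; the full method also periodically splits and prunes gaussians, which this sketch omits.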

The 3D Gaussian Splatting technique delivers photorealistic 3D reconstruction with generally improved visual quality and reduced generation times compared to previous techniques. Additionally, it offers other benefits, positioning it as the new state-of-the-art technique for 3D reconstruction.

Accelerating Photorealistic 3D Reconstruction

3D gaussian splat shown inside a game engine environment to illustrate extensibility

Figure 3. 3D Gaussian Splats can be used in Digital Content Creation (DCC) tools and game engines to build realistic, interactive environments.

3D Gaussian Splatting has several key benefits over previous 3D reconstruction techniques including Photogrammetry and NeRFs:

  1. Shorter time to generate: 3D Gaussian Splatting rasterizes view-dependent gaussians directly instead of using neural networks to explicitly model 3D space. This greatly reduces the computational effort required to reconstruct a 3D scene, both in processing power and time.
  2. Superior real-time performance: The volumetric, point cloud-based nature of 3D Gaussian Splats makes them easier to render in real-time compared to polygon-based meshes. 3D Gaussian Splat outputs are also more compact than NeRFs, which enables performant real-time applications on the web and even on-device.
  3. More robust output: The 3D Gaussian Splatting approach shows better resilience to noise, produces fewer visual artifacts, and handles traditionally challenging aspects of 3D reconstruction, such as transparency, reflectiveness, and empty space, more reliably than past techniques.
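The real-time rendering advantage follows from how direct the rasterization loop is: project each gaussian to the screen, sort by depth, and alpha-composite front-to-back. The heavily simplified 1D sketch below illustrates that compositing step; production renderers do this tile-wise on the GPU, and all values here are arbitrary:

```python
import numpy as np

# Illustrative values only: 50 "splats" rendered onto a 32-pixel 1D screen.
rng = np.random.default_rng(1)
W = 32
centers = rng.uniform(0, W, size=50)    # projected splat positions
depths = rng.uniform(1, 10, size=50)    # distance from the camera
colors = rng.uniform(0, 1, size=50)
opacities = rng.uniform(0.2, 0.8, size=50)
sigma = 1.5                             # splat footprint on screen

px = np.arange(W)
image = np.zeros(W)
transmittance = np.ones(W)              # how much light still passes through
for i in np.argsort(depths):            # front-to-back over depth-sorted splats
    alpha = opacities[i] * np.exp(-(px - centers[i]) ** 2 / (2 * sigma ** 2))
    image += transmittance * alpha * colors[i]
    transmittance *= 1.0 - alpha
print(image.round(3))
```

Sorting plus blending is cheap and embarrassingly parallel, which is why splat scenes can run at interactive frame rates even in a web browser.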

These advantages have helped significantly lower the barrier to capturing and recreating the real world in 3D. In practice, this means:

  1. Lower barrier to entry and faster time-to-market for 3D asset creation: Accelerated adoption and broadened usage of 3D beyond traditional specialist roles and use cases. Specialist knowledge is no longer required to model complex 3D objects – all that is required is a smartphone camera and an endpoint for a 3D reconstruction pipeline powered by 3D Gaussian Splatting.
  2. Increased accessibility to interactive, real-time 3D content consumption: Reduced hardware requirements for rendering and superior performance for real-time consumption mean the democratization of live 3D content across all mediums, particularly consumer computing, mobile, and AR/VR. Interactive web experiences with photorealistic 3D assets are now possible at scale even without expensive, accelerated computing.

3D Gaussian Splatting as a 3D reconstruction technology has the potential to reshape how we approach 3D asset creation across multiple industries, from virtual production to digital twins to e-commerce product visualization. These experiences benefit from the interactive, high-fidelity nature of 3D assets – all of which can be powered by the cloud to be more scalable, distributable, and performant.

3D Gaussian Splatting at Scale Using AWS

3D Gaussian Splatting, as with other 3D reconstruction techniques, lends itself well to the scale, performance, and content delivery capabilities of the cloud – for asset generation, management, and distribution to end-users across the globe. There are numerous ways to build such a workflow.

Here is an example of what a 3D Gaussian Splatting workflow could look like on AWS, starting with a high-level illustration of the overall process:

3D gaussian splatting workflow diagram showing stages of implementation

Figure 4. 3D Gaussian Splatting workflow example on AWS.

This workflow contains the following components:

  1. Input: 2D video and image media are ingested and used as inputs into the workflow.
  2. Workflow: The workflow comprises multiple pipelines that run serially, each decoupled from the others.
  3. Pipelines: Each pipeline is a self-contained task with standardized inputs and outputs. The use of pipelines can compartmentalize scope and decouple distinct processes that can be run independently of each other. This modularity helps keep the codebase of each pipeline optimized and allows the re-use of these processes in other workflows. In this example, we have pipelines for image processing, structure from motion, 3D Gaussian Splatting training and initialization, and the 3D Gaussian Splatting viewer.
  4. Output: For 3D Gaussian Splatting generation, the output will be a 3D object which a user can view from a web browser with a capable viewer.
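A workflow of decoupled pipelines with standardized inputs and outputs can be sketched as follows. The stage names and URI scheme are hypothetical; in a real deployment each stage would launch a container (for example, a structure-from-motion tool, then the splat trainer):

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch of the pipeline pattern: each stage is a self-contained
# step with a standardized contract (artifact URI in, artifact URI out), so
# stages can be reused in other workflows or swapped out independently.

@dataclass
class PipelineResult:
    name: str
    output_uri: str

Pipeline = Callable[[str], PipelineResult]

def make_pipeline(name: str) -> Pipeline:
    def run(input_uri: str) -> PipelineResult:
        # A real stage would run its containerized process here and write
        # results to the output location.
        return PipelineResult(name, f"{input_uri}/{name}")
    return run

workflow = [
    make_pipeline("image-processing"),
    make_pipeline("structure-from-motion"),
    make_pipeline("gaussian-splatting-training"),
    make_pipeline("viewer-export"),
]

# Stages run serially; each consumes the previous stage's output.
uri = "s3://example-bucket/job-123/input"
for stage in workflow:
    uri = stage(uri).output_uri
print(uri)
```

Keeping the contract this narrow is what lets a stage such as structure from motion be reused in an unrelated photogrammetry workflow without modification.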

The following illustration shows how the previously described 3D Gaussian Splatting workflow could be implemented using AWS services:

3D gaussian splatting AWS sample architecture diagram showing a possible implementation on AWS

Figure 5. 3D Gaussian Splatting workloads benefit significantly from leveraging AWS, including process orchestration and handling using AWS services as building blocks.

The previous illustration depicts the following key components:

  1. A developer uses the AWS Cloud Development Kit (CDK) to create and deploy the infrastructure once, before use.
  2. AWS CDK stores relevant infrastructure artifacts (URIs, ARNs) in AWS Systems Manager Parameter Store for retrieval during deployment.
  3. After deployment, a user launches the website URL in a local browser connected to the internet.
  4. The website is hosted on Amazon Simple Storage Service (S3) and distributed globally using Amazon CloudFront.
  5. A user enters credentials to log in using Amazon Cognito user pools.
  6. A user initiates a 3D Gaussian Splatting workflow in the UI by uploading a new asset (video or a series of images) and issuing an API command.
  7. Workflow state and asset details are stored in Amazon DynamoDB.
  8. An AWS Step Functions workflow will invoke Amazon Elastic Container Service (ECS) as a process handler to run the pipelines.
  9. Amazon ECS will use an image in Amazon Elastic Container Registry (ECR) for the container.
  10. Amazon ECS will spin up compute resources to host the container and run the 3D Gaussian Splatting pipelines.
  11. The pipelines load input assets from, and store generated assets to, Amazon S3.
  12. Once the process handler is done, the workflow is complete.
  13. The database is updated with the state and workflow details.
  14. Amazon Simple Notification Service (SNS) can be used to notify users upon asset creation completion.
  15. A user can view the resulting splat in a web browser.
  16. Logs can be accessed using Amazon CloudWatch.
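As a rough sketch of step 6, the API handler could assemble a workflow input and start the Step Functions execution with boto3. The payload shape, bucket, key, and state machine ARN below are all hypothetical, and the boto3 call is commented out so the snippet runs without AWS credentials:

```python
import json
import uuid

# Hypothetical payload for the workflow described above: the API handler
# records the asset in DynamoDB (step 7) and starts an execution (step 8)
# with an input shaped like this.
def build_workflow_input(bucket: str, asset_key: str) -> dict:
    return {
        "jobId": str(uuid.uuid4()),
        "input": {"bucket": bucket, "key": asset_key},
        "pipelines": [
            "image-processing",
            "structure-from-motion",
            "gaussian-splatting-training",
        ],
    }

payload = build_workflow_input("splat-assets", "uploads/bench.mp4")
print(json.dumps(payload, indent=2))

# With AWS credentials configured, the execution could then be started:
# import boto3
# sfn = boto3.client("stepfunctions")
# sfn.start_execution(
#     stateMachineArn=state_machine_arn,   # e.g., retrieved from SSM Parameter Store
#     name=payload["jobId"],
#     input=json.dumps(payload),
# )
```

Using the job ID as the execution name also gives idempotency for free, since Step Functions rejects duplicate execution names within a state machine.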

Workloads involving 3D Gaussian Splatting on AWS can integrate open-source software and third-party tools alongside AWS managed services. As an example, open-source tooling could be used for image preprocessing, while leveraging AWS managed services to run the reconstruction job.

There are many examples of other AWS services that could be integrated into this workflow. By running a 3D Gaussian Splatting workload on AWS, customers could extend application capabilities through:

  • Elastic, distributed processing: Use Amazon ECS, Amazon Elastic Kubernetes Service (EKS) and AWS Batch to package, parallelize and distribute 3D reconstruction jobs across multiple compute nodes to reduce time-to-generate and scale up and down when needed. Orchestrate the reconstruction pipeline using a compute farm manager like AWS Deadline Cloud, leveraging the portability of Open Job Description (OpenJD).
  • Content storage and management: Utilize AWS’s open-source Visual Asset Management System (VAMS) solution, which leverages Amazon S3 and Amazon DynamoDB for storage, to store, manage and retrieve scanned 3D assets for internal collaboration.
  • Graphics-accelerated computing: Leverage Amazon Elastic Compute Cloud (EC2) and the newest generation Amazon EC2 G6 Instances powered by NVIDIA L4 Tensor Core GPUs for performant accelerated compute to use 3D assets in graphics applications.
  • Content delivery and streaming: Use Amazon CloudFront to stream and deliver reconstructed, interactive 3D content to end users globally, with low latency at the edge, on the web and mobile.
  • Analytics and insights: Leverage AWS analytics services like Amazon Kinesis, Amazon Athena, and Amazon QuickSight to gain insights from 3D reconstruction pipelines. Track pipeline performance, identify bottlenecks, and optimize compute efficiency at scale.
  • AI/ML integration: Integrate AWS machine learning and generative AI services like Amazon SageMaker and Amazon Bedrock to further enhance 3D reconstructions through computer vision, for example through image segmentation and semantic labeling.
  • Digital twins: Power digital twin applications on AWS IoT TwinMaker using scanned 3D assets to represent real-world systems and operations.

AWS offers a range of services to support the entire lifecycle of 3D Gaussian Splatting workloads. This includes data ingestion, distributed processing, storage, delivery, rendering, analytics, AI/ML, and digital twins.

Conclusion

Since its introduction last year, 3D Gaussian Splatting has helped drive a renaissance in 3D asset and content creation by significantly increasing the accessibility and effectiveness of 3D reconstruction and reality capture technology.

The field of 3D Gaussian Splatting continues to evolve rapidly, with ongoing innovation from AWS partners. NVIDIA has explored an approach called “Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models”, which integrates 3D Gaussian Splatting and AI diffusion models to generate dynamic assets through text inputs. Additionally, Meta has worked on “Robust Gaussian Splatting”, focused on enhancing the robustness of 3D reconstruction outputs from handheld phone captures by reducing visual inconsistencies.

At AWS, we offer the cloud infrastructure, services, and ecosystem of partners to enable scalable 3D Gaussian Splatting pipelines – encompassing video ingestion, accelerated computing for model generation, storage and management of 3D assets, and real-time rendering and delivery to any device. This streamlines the process of incorporating photorealistic 3D capabilities into your applications across industries such as media, retail, manufacturing, and others.

We encourage you to experiment with 3D Gaussian Splatting and 3D reconstruction technologies yourself. Refer to the “3D Gaussian Splatting for Real-Time Radiance Field Rendering” repository on GitHub for steps on implementation and deploy this today on AWS.