What happened
AWS introduced AI Factories, enabling customers to deploy AWS AI infrastructure, including Trainium accelerators, Nvidia GPUs, networking, storage, and AI services, directly within their own data centres. The offering integrates Nvidia's NVLink Fusion into Trainium4 chips to simplify deployment and systems management. New Trainium3-based servers, with 144 chips per server, deliver greater computing power at lower energy consumption. Backed by AWS Nitro and EC2 UltraClusters, the system functions as a private AWS Region, providing compute, storage, database, and AI services alongside Nvidia's accelerated computing platform and software, while customers retain control of their data.
Why it matters
Deploying AWS AI Factories within customer data centres extends AWS's integrated infrastructure model into on-premises environments, introducing a new operational constraint. Because the "private AWS Region" runs on AWS-defined systems management and deployment mechanisms, internal IT operations and infrastructure teams face a visibility gap and potentially less granular control over underlying components. IT security and compliance teams therefore carry an increased oversight burden: even though customers retain data control, they must verify that this hybrid operational model meets internal security policies and regulatory requirements.