AWS re:Invent 2025 - AWS Trn3 UltraServers: Power next-generation...

🦄 Making great presentations more accessible. This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

📖 AWS re:Invent 2025 - AWS Trn3 UltraServers: Power next-generation enterprise AI performance (AIM3335)

In this video, AWS introduces Trainium3, its next-generation AI chip designed for agentic workloads and reasoning models. Joe Senerchia and Ron Diamant from AWS, together with Jonathan Gray from Anthropic, detail how Trainium3 delivers 360 petaflops of microscaled FP8 compute in 144-chip UltraServers connected via a NeuronSwitch topology. Ron explains architectural innovations such as hardware-accelerated microscaling quantization and optimized softmax instructions that maximize sustained performance. The team demonstrates 5x better tokens-per-megawatt efficiency versus Trainium2 and highlights Project Rainier's deployment of 1 million Trainium2 chips serving Claude models in production. Jonathan Gray showcases real kernel optimizations achieving 90% tensor engine utilization on Trainium3, including FP8 matrix multiplications, attention operation tuning, and SRAM-to-SRAM collectives. The presentation emphasizes native PyTorch integration, the open-sourced NKI compiler, and Neuron Explorer profiling tools that provide nanosecond-level observability for performance engineering.
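To make the "microscaling quantization" idea above more concrete, here is a minimal NumPy sketch of the general microscaling (MX) scheme: a tensor is split into small blocks, each block shares one power-of-two scale, and the scaled elements are cast to a narrow FP8-style format. This is only an illustration of the data layout, not the Trainium3 hardware path or any Neuron SDK API; the block size of 32 and the e4m3 range are assumptions taken from the common OCP MX convention, not figures from the talk.

```python
import numpy as np

BLOCK = 32        # elements sharing one scale (OCP MX convention; assumed, not from the talk)
E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 e4m3


def mx_quantize(x: np.ndarray):
    """Quantize a 1-D float32 vector into per-block (scale, fp8-like values).

    Each block of BLOCK values gets one shared power-of-two scale chosen so the
    block's largest magnitude fits within the e4m3 range. A real implementation
    would also pack the results into 8-bit storage; that is omitted for clarity.
    """
    x = x.astype(np.float32)
    pad = (-len(x)) % BLOCK
    x = np.pad(x, (0, pad))
    blocks = x.reshape(-1, BLOCK)

    # One shared power-of-two scale per block.
    max_abs = np.abs(blocks).max(axis=1, keepdims=True)
    max_abs = np.where(max_abs == 0, 1.0, max_abs)
    scales = 2.0 ** np.ceil(np.log2(max_abs / E4M3_MAX))

    # Scale, clamp to the e4m3 range, and round.
    # Integer rounding is a crude stand-in for a true e4m3 cast,
    # whose representable values are non-uniformly spaced.
    q = np.clip(blocks / scales, -E4M3_MAX, E4M3_MAX)
    q = np.round(q)
    return scales, q


def mx_dequantize(scales: np.ndarray, q: np.ndarray) -> np.ndarray:
    """Reconstruct an approximation of the original values."""
    return (q * scales).reshape(-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(scale=3.0, size=128).astype(np.float32)
    scales, q = mx_quantize(x)
    x_hat = mx_dequantize(scales, q)[: len(x)]
    print("max abs error:", np.abs(x - x_hat).max())
```

Per the session, Trainium3 computes these block scales in hardware on the fly, which is what "hardware-accelerated microscaling quantization" refers to; the sketch above mirrors only the block-and-scale layout, not the instruction-level behavior.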

This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Welcome everyone. My name is Joe Senerchia. I'm the EC2 product manager for our Inferentia and Trainium chips, and I'm super excited to have everyone here. Just a quick show of hands, how many are familiar with Inferentia and Trainium? Okay, what about Anthropic Claude models? Okay, a few more. Well, today I'm super excited because we have two experts on both of those things. We have the chief architect of Trainium, Ron Diamant, and we have Jonathan Gray, who's the Trainium inference lead for Anthropic, thinking about optimizing Claude models on Trainium.

So quickly, what we have in store today. I'll first walk through how AWS builds and thinks about building AWS AI infrastructure. Then I'll have Ron walk through Trainium and how he built it for performance, scale, and ease of use. And then Jonathan Gray will come up and look at how he actually optimizes…

Source: Dev.to