The AWS AI/ML Landscape in 2026 — Simplified

The AWS AI/ML Landscape in 2026 — Simplified

Source: Dev.to

The Three-Tier Architecture: How AWS Actually Thinks About AI/ML ## TIER 1: The Foundation Layer - Build Your Own ML Models ## Amazon SageMaker AI: The Complete ML Platform ## Core Components & Features: ## TIER 2: The GenAI Revolution - Amazon Bedrock ## Amazon Bedrock: Your Gateway to Foundation Models ## Available Foundation Models: ## Core Bedrock Features: ## TIER 3: Ready-to-Use AI Services - No ML Expertise Required ## Amazon Rekognition: Computer Vision Made Simple ## Key Features: ## Amazon Textract: Document Intelligence Beyond OCR ## Key Features: ## Amazon Comprehend: Natural Language Understanding ## Key Features: ## Amazon Polly: Text-to-Speech That Sounds Human ## Key Features: ## Amazon Transcribe: Speech-to-Text with Intelligence ## Key Features: ## Amazon Translate: Neural Machine Translation ## Key Features: ## Amazon Lex: Build Conversational Interfaces ## Key Features: ## Amazon Personalize: Real-Time Recommendations ## Key Features: ## Industry-Specific & Specialized AI Services ## Amazon Forecast: Time-Series Forecasting ## Key Features: ## Amazon Fraud Detector: ML-Powered Fraud Prevention ## Key Features: ## Amazon HealthLake: Healthcare Data Management ## Key Features: ## The New Generation: 2025 GenAI Services ## Amazon Q: The AI Assistant Family ## 1. Amazon Q Developer ## 2. Amazon Q Business ## 3. Amazon Q in QuickSight ## Kiro: Agentic IDE for Spec-Driven Development ## Amazon Nova Act: UI Automation Agent ## Amazon Bedrock AgentCore ## 🚀 Core Features ## 📌 1. Universal Framework & Model Support ## 🛠️ 2. Managed Runtime ## 🧠 3. AgentCore Memory ## 🔐 4. Identity & Security ## 🔗 5. AgentCore Gateway ## 🧪 6. Observability & Quality Controls ## 🧩 7. Tooling & Execution Enhancements ## ✅ Enterprise Benefits ## 📌 Solid Use Case: Enterprise IT Support Assistant ## Overview ## AgentCore Implementation ## Results ## Understanding the AWS AI/ML Stack vs. GenAI Stack ## The AWS Machine Learning (ML) Stack ## The AWS Generative AI (GenAI) Stack ## ML Stack vs. GenAI Stack: Decision Matrix ## Hybrid Approach: Combining ML and GenAI ## The AWS Advantage: Why This Ecosystem Matters ## 1. Breadth of Choice ## 2. Integration ## 3. Infrastructure Abstraction ## 4. Pay-as-You-Go ## 5. Security & Compliance ## 6. Performance ## 7. Innovation Velocity ## Getting Started: Your 4-Week Journey ## Week 1: Explore Ready-to-Use Services ## Week 2: Experiment with GenAI ## Week 3: Build a Custom ML Model ## Week 4: Build a Real Project ## Common Pitfalls to Avoid ## 1. Using GenAI When You Need ML ## 2. Using ML When You Need GenAI ## 3. Not Considering Costs ## 4. Ignoring Security ## 5. Skipping Monitoring ## 6. Not Planning for Scale ## The Future: What's Coming in AI/ML on AWS ## The Bottom Line ## Resources to Continue Learning A practical deep-dive into Amazon's AI/ML ecosystem and how to leverage it for real-world problems Remember when implementing machine learning meant assembling a team of PhDs, buying expensive GPU clusters, and spending months just to get a proof of concept running? Yeah, those days are gone. In 2025, AWS has transformed the AI/ML landscape into something that's actually accessible—whether you're a startup founder with a brilliant idea or an enterprise architect modernizing legacy systems. But here's the thing: AWS now offers over 30 AI/ML services. That's not a typo. Thirty. And if you're feeling overwhelmed just reading that number, you're not alone. The good news? They're not randomly thrown together. There's a method to this madness, and once you understand the architecture, everything clicks into place. AWS structures its AI/ML services like a pyramid, and understanding this structure is your secret weapon to picking the right tool for the job. Amazon SageMaker AI is the heavyweight champion of custom machine learning. This isn't just a service—it's an entire ecosystem for building, training, and deploying machine learning models at scale. 2. SageMaker Autopilot (AutoML) 3. SageMaker Feature Store 4. SageMaker Data Wrangler 5. SageMaker Training 6. SageMaker Inference 7. SageMaker Pipelines (MLOps) 9. SageMaker Model Monitor 10. SageMaker Debugger 11. SageMaker Ground Truth 13. SageMaker JumpStart Real-World Use Case: Healthcare Diagnostics A healthcare startup building a diagnostic tool for rare diseases has proprietary medical imaging data. They need a custom computer vision model because off-the-shelf solutions won't work for their specialized use case. Implementation with SageMaker: Result: From concept to production in 6 weeks instead of 6 months, with 94% diagnostic accuracy and full compliance with healthcare regulations. Amazon Bedrock is AWS's fully managed service for building generative AI applications. Instead of training foundation models from scratch (which costs millions), Bedrock gives you access to leading AI models through a single API. 1. Amazon Titan Models 4. AI21 Labs Jurassic 1. Knowledge Bases for Amazon Bedrock 2. Agents for Amazon Bedrock 3. Guardrails for Amazon Bedrock 4. Model Customization Real-World Use Case: E-Commerce AI Shopping Assistant A large e-commerce company wants to build an intelligent shopping assistant that understands customer queries, searches their product catalog, and provides personalized recommendations. Implementation with Bedrock: Step 1: Knowledge Base Setup Step 2: Agent Configuration These are fully managed, pre-trained services that you call via simple APIs. No model training, no infrastructure management—just add AI capabilities to your applications. What it does: Analyzes images and videos to detect objects, faces, text, scenes, and activities. Advanced Capabilities: Real-World Use Case: Social Media Content Moderation A social media platform receives 10 million image uploads daily and needs to moderate content before it goes live. What it does: Extracts text, handwriting, tables, forms, and structured data from scanned documents. Specialized Features: Real-World Use Case: Insurance Claims Processing An insurance company processes 50K claim forms monthly—mix of printed forms, handwritten notes, and attached receipts. What it does: Analyzes text to extract insights, sentiment, entities, and relationships. Key Phrase Extraction: PII Detection and Redaction: Custom Classification: Real-World Use Case: Customer Support Intelligence A SaaS company receives 10K support tickets daily across email, chat, and phone transcripts. What it does: Converts text into lifelike speech in 60+ languages. Speech Customization: Real-World Use Case: E-Learning Platform An online education platform offers 5K courses and wants to add audio narration in 20 languages without hiring voice actors. What it does: Converts audio and video to accurate text transcripts with advanced features. Accuracy Enhancement: Real-World Use Case: Legal Firm Deposition Management A law firm records 200+ client meetings, depositions, and court proceedings monthly and needs searchable transcripts. What it does: Translates text between 75+ languages in real-time with high accuracy. Translation Capabilities: Real-World Use Case: Global SaaS Platform A B2B SaaS company serves customers in 50 countries and needs to localize their application, documentation, and support content. What it does: Create chatbots and voice assistants with the same technology that powers Alexa. Natural Language Understanding: Real-World Use Case: Banking Customer Service Bot A bank wants to automate routine customer inquiries to reduce call center volume. Conversation Flow Example (CheckBalance): What it does: Provides personalized recommendations using the same technology as Amazon.com. Recommendation Types: Recipes (Algorithms): Real-World Use Case: Streaming Service A video streaming platform with 10M users wants to increase watch time and reduce churn. Recommendation Strategies: Business Rules Applied: What it does: Predicts future values based on historical time-series data using machine learning. Forecasting Capabilities: Data Types Supported: Domain-Specific Features: Real-World Use Case: Retail Chain Inventory Optimization A retail chain with 500 stores needs to forecast demand for 50K products to optimize inventory. What it does: Identifies potentially fraudulent online activities using machine learning. Fraud Types Detected: Real-World Use Case: Online Marketplace Fraud Prevention An online marketplace processes 1M transactions daily and loses $5M annually to fraud. What it does: Stores, transforms, and analyzes health data at scale with FHIR support. Real-World Use Case: Hospital Network Data Unification A hospital network with 5 facilities uses different EHR systems and needs unified patient records. Amazon Q is not a single product—it's a family of three specialized AI assistants, each designed for different use cases. What it does: AI-powered coding assistant for software developers. Real-World Use Case: Financial services company upgraded 500K lines of Java 8 code to Java 17 with Spring Boot 3 in 3 weeks (vs. 6 months manual), achieving 95% automated transformation with zero production bugs. What it does: Enterprise knowledge assistant that connects to your company's data sources. Real-World Use Case: Global consulting firm with 15K employees connected 10 years of project documentation, achieving 70% reduction in search time, 5 hours/week saved per consultant, and $10M annual productivity savings. What it does: Natural language interface for business intelligence. Real-World Use Case: Retail chain with 200 stores enabled executives to get answers in seconds vs. days, achieving 80% reduction in ad-hoc report requests and 100% executive adoption. What it does: Agentic coding service that transforms prompts into detailed specifications, then into working code, documentation, and tests. Conversational Development: Built on Amazon Bedrock: Real-World Use Case: Software teams use Kiro to go from prompt to feature with step-by-step guidance, reducing development time by automating documentation, test generation, and boilerplate code while maintaining code quality standards. What it does: Foundation model that can interact with user interfaces—clicking buttons, filling forms, navigating websites and applications. Visual Understanding: Multi-Step Workflows: Real-World Use Case: Accounting Firm Automation An accounting firm manually enters data into 5 different legacy systems without APIs. Amazon Bedrock AgentCore is a fully-managed agent platform built by AWS to help organizations build, deploy, operate, and scale AI agents in production, with enterprise-grade security, observability, and flexibility. Instead of just prototyping with a framework locally, AgentCore provides cloud-ready infrastructure and services so agents can run reliably at scale. These features make AgentCore suitable for real-world deployment where reliability, governance, and auditability are critical. An enterprise wants an AI agent that can handle internal IT support tickets automatically — from reading tickets and troubleshooting to resolving common issues or handing over to human support when needed. Runtime & Scaling Deploy an IT agent using AgentCore Runtime that can respond at scale as ticket volume fluctuates. Observability & Quality Now that we've explored all the services, let's create a clear distinction between the traditional ML stack and the GenAI stack—because choosing the right one matters. Philosophy: Build custom models trained on your specific data for your unique use case. 1. Amazon SageMaker AI (The Foundation) 2. Ready-to-Use ML Services (Pre-trained Models) 3. Supporting Services Typical ML Stack Architecture: Real-World ML Stack Example: Predictive Maintenance A manufacturing company wants to predict equipment failures before they happen. Why ML Stack (not GenAI): Philosophy: Use pre-trained foundation models for content generation, reasoning, and understanding. 1. Amazon Bedrock (The Foundation) 2. Amazon Q (AI Assistant) 3. Amazon Nova Act (UI Automation) 4. Amazon Bedrock AgentCore (Agent Platform) 5. Supporting GenAI Services Typical GenAI Stack Architecture: Real-World GenAI Stack Example: Enterprise Knowledge Assistant A consulting firm with 10K employees wants an AI assistant that can answer questions using their 20 years of project documentation. Why GenAI Stack (not ML): The most powerful solutions often combine both stacks: Example: Intelligent Customer Service Platform How They Work Together: Result: Best of both worlds—natural conversation with data-driven insights. Goal: Get hands-on with pre-trained AI services Time Investment: 5-10 hours Cost: Free (within Free Tier limits) Goal: Understand foundation models and Bedrock Time Investment: 10-15 hours Cost: ~$10-20 (token usage) Goal: Experience the full ML lifecycle Time Investment: 15-20 hours Cost: ~$20-50 (compute and storage) Goal: Combine multiple services into a working application Time Investment: 20-30 hours Cost: ~$50-100 Mistake: Using Bedrock for precise numerical predictions Solution: Use SageMaker for regression/classification tasks Mistake: Training a custom NLP model for document Q&A Solution: Use Bedrock with Knowledge Bases (RAG) Mistake: Running expensive GPU instances 24/7 Solution: Use Spot instances, serverless inference, or batch processing Mistake: Exposing API keys, not using VPC endpoints Solution: Use IAM roles, VPC endpoints, encryption Mistake: Deploy and forget Solution: Use Model Monitor, CloudWatch, set up alerts Mistake: Building for current load only Solution: Design for 10x growth, use auto-scaling Based on current trends and AWS's innovation velocity: 1. More Powerful Foundation Models 3. Easier Customization 5. Industry-Specific Solutions AWS has democratized AI/ML in a way that seemed impossible a decade ago. You don't need a PhD in machine learning to build intelligent applications anymore. You don't need millions in funding to train models. You don't need a team of infrastructure engineers to deploy at scale. Choose the ML Stack when: Choose the GenAI Stack when: Or combine both for the most powerful solutions. The AI revolution isn't coming—it's here. And with AWS's comprehensive AI/ML stack, you're equipped to be part of it. The tools are ready. The infrastructure is waiting. The only question is: what will you build? Official AWS Resources: Ready to start building? Pick one service from this guide, spend an hour experimenting, and see where it takes you. The best way to learn is by doing. Have questions or want to share your AWS AI/ML journey? The AWS community is incredibly helpful—don't hesitate to ask for help on re:Post or join local AWS user groups. Remember: Every expert was once a beginner. Every production system started as an experiment. Your AI/ML journey starts with a single API call. Now go build something amazing! 🚀 Templates let you quickly answer FAQs or store snippets for re-use. Are you sure you want to hide this comment? It will become hidden in your post, but will still be visible via the comment's permalink. Hide child comments as well For further actions, you may consider blocking this person and/or reporting abuse CODE_BLOCK: Data Sources (S3, Databases, Streams) ↓ Data Preparation (SageMaker Data Wrangler, Glue) ↓ Feature Engineering (SageMaker Feature Store) ↓ Model Training (SageMaker Training, Autopilot) ↓ Model Evaluation (SageMaker Clarify, Debugger) ↓ Model Registry (SageMaker Model Registry) ↓ Deployment (SageMaker Endpoints, Batch Transform) ↓ Monitoring (SageMaker Model Monitor) ↓ Retraining (SageMaker Pipelines) Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: Data Sources (S3, Databases, Streams) ↓ Data Preparation (SageMaker Data Wrangler, Glue) ↓ Feature Engineering (SageMaker Feature Store) ↓ Model Training (SageMaker Training, Autopilot) ↓ Model Evaluation (SageMaker Clarify, Debugger) ↓ Model Registry (SageMaker Model Registry) ↓ Deployment (SageMaker Endpoints, Batch Transform) ↓ Monitoring (SageMaker Model Monitor) ↓ Retraining (SageMaker Pipelines) CODE_BLOCK: Data Sources (S3, Databases, Streams) ↓ Data Preparation (SageMaker Data Wrangler, Glue) ↓ Feature Engineering (SageMaker Feature Store) ↓ Model Training (SageMaker Training, Autopilot) ↓ Model Evaluation (SageMaker Clarify, Debugger) ↓ Model Registry (SageMaker Model Registry) ↓ Deployment (SageMaker Endpoints, Batch Transform) ↓ Monitoring (SageMaker Model Monitor) ↓ Retraining (SageMaker Pipelines) CODE_BLOCK: User Query ↓ Application Layer (Web/Mobile/API) ↓ Amazon Bedrock Agent ↓ ├─→ Knowledge Base (RAG) │ ├─→ Vector Database (OpenSearch) │ └─→ Data Sources (S3, SharePoint, Confluence) │ ├─→ Foundation Model (Claude, Llama, Titan) │ ├─→ Guardrails (Safety, PII, Content Filtering) │ └─→ Action Groups (Lambda Functions, APIs) ├─→ Database Queries ├─→ External APIs └─→ Business Logic Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: User Query ↓ Application Layer (Web/Mobile/API) ↓ Amazon Bedrock Agent ↓ ├─→ Knowledge Base (RAG) │ ├─→ Vector Database (OpenSearch) │ └─→ Data Sources (S3, SharePoint, Confluence) │ ├─→ Foundation Model (Claude, Llama, Titan) │ ├─→ Guardrails (Safety, PII, Content Filtering) │ └─→ Action Groups (Lambda Functions, APIs) ├─→ Database Queries ├─→ External APIs └─→ Business Logic CODE_BLOCK: User Query ↓ Application Layer (Web/Mobile/API) ↓ Amazon Bedrock Agent ↓ ├─→ Knowledge Base (RAG) │ ├─→ Vector Database (OpenSearch) │ └─→ Data Sources (S3, SharePoint, Confluence) │ ├─→ Foundation Model (Claude, Llama, Titan) │ ├─→ Guardrails (Safety, PII, Content Filtering) │ └─→ Action Groups (Lambda Functions, APIs) ├─→ Database Queries ├─→ External APIs └─→ Business Logic - Fully integrated development environment (IDE) for ML - Web-based interface with JupyterLab notebooks - Visual workflow builder for ML pipelines - Real-time collaboration with shared spaces across teams - Git integration for version control - One-click access to compute resources - Automatically builds, trains, and tunes ML models - Supports classification and regression problems - Generates multiple model candidates and ranks them - Provides full visibility into model creation process - Exports Python code for customization - No ML expertise required to get started - Centralized repository for ML features - Online store for low-latency real-time inference (sub-millisecond) - Offline store for training and batch inference - Feature versioning and lineage tracking - Automatic feature discovery across teams - Point-in-time correct queries for historical data - Visual data preparation tool with 300+ built-in transformations - Import data from S3, Athena, Redshift, Snowflake - Interactive data quality insights and visualizations - Automatic data quality issue detection - Export workflows to SageMaker Pipelines - Generate Python code for custom transformations - Distributed training across multiple GPUs and instances - Supports TensorFlow, PyTorch, MXNet, scikit-learn, XGBoost - Managed spot training for up to 90% cost savings - Automatic model tuning (hyperparameter optimization) - SageMaker Training Compiler for 50% faster training - Checkpointing for fault tolerance - Real-time endpoints with auto-scaling - Serverless inference (no infrastructure management) - Batch transform for large-scale predictions - Multi-model endpoints (host multiple models on one endpoint) - Multi-container endpoints for ML pipelines - Shadow testing for A/B testing new models - CI/CD for machine learning workflows - Visual pipeline designer - Automated model retraining triggers - Integration with SageMaker Model Registry - Step caching to avoid redundant computations - Parallel execution of pipeline steps - Detect bias in training data and models - Explain model predictions with SHAP values - Feature importance analysis - Fairness metrics across demographic groups - Model explainability reports - Integration with SageMaker Model Monitor - Continuous monitoring of deployed models - Data quality monitoring (schema violations, missing values) - Model quality monitoring (accuracy drift) - Bias drift detection - Feature attribution drift - Automated alerts via CloudWatch and SNS - Real-time monitoring of training jobs - Automatic detection of training issues (vanishing gradients, overfitting) - Built-in rules for common problems - Tensor visualization and analysis - Profiling for system bottlenecks - Automatic termination of problematic jobs - Managed data labeling service - Human labeling workforce (Amazon Mechanical Turk, private, vendor) - Active learning to reduce labeling costs by 40% - Built-in workflows for images, text, video, 3D point clouds - Custom labeling workflows - Automatic data labeling using ML - Compile models for edge devices - Optimize models for 2x faster inference - Support for ARM, Intel, NVIDIA processors - Deploy to AWS IoT Greengrass - Reduce model size by up to 10x - No accuracy loss during optimization - 600+ pre-trained models from popular model hubs - One-click deployment of foundation models - Fine-tuning capabilities for domain adaptation - Solution templates for common use cases - Example notebooks for learning - Models from Hugging Face, PyTorch Hub, TensorFlow Hub - Use Ground Truth to label medical images with expert radiologists - Data Wrangler to preprocess and augment imaging data - Feature Store to manage extracted image features - Train custom ResNet model with SageMaker Training on GPU instances - Clarify to detect bias in predictions across patient demographics - Model Monitor to track model performance in production - Deploy with HIPAA-compliant endpoints for real-time diagnosis - Pipelines to automate retraining when new labeled data arrives - Titan Text: Text generation, summarization, Q&A (up to 32K tokens) - Titan Embeddings: Convert text to numerical vectors for semantic search - Titan Image Generator: Create realistic images from text descriptions - Titan Multimodal Embeddings: Process text and images together - Claude 4.5 Opus: Most capable, complex reasoning - Claude 4.5 Sonnet: Balanced performance and speed - Claude 4 Haiku: Fastest, most compact - 200K token context window - Strong at analysis, coding, math, creative writing - Open-source architecture - Multilingual support - Strong coding capabilities - Jurassic-2 Ultra and Mid - Optimized for enterprise use cases - Multilingual text generation - Command R and Command R+ - Retrieval-augmented generation (RAG) optimized - Multilingual support (10+ languages) - Stable Diffusion XL for image generation - High-quality, customizable images - Style control and fine-tuning - Connect your proprietary data sources (S3, SharePoint, Confluence, Salesforce) - Automatic data chunking and embedding - Vector database integration (Amazon OpenSearch, Pinecone, Redis) - Retrieval-Augmented Generation (RAG) without code - Automatic citation of sources in responses - Metadata filtering for precise retrieval - Hybrid search (keyword + semantic) - Build autonomous AI agents that take actions - Define agent instructions in natural language - Connect to APIs and Lambda functions - Multi-step task orchestration - Memory and context management - Action groups for organizing capabilities - Automatic API schema parsing - Content filtering (hate speech, violence, sexual content) - PII detection and redaction (names, addresses, SSN, credit cards) - Topic-based restrictions (block specific subjects) - Word filters (denied terms and phrases) - Contextual grounding checks (prevent hallucinations) - Toxicity thresholds (configurable sensitivity) - Apply to both inputs and outputs - Fine-tuning: Adapt models with your labeled data - Continued Pre-training: Train on large unlabeled datasets - Private training (data never leaves your VPC) - Custom model versioning - A/B testing between base and custom models - Automatic hyperparameter tuning - Built-in evaluation metrics (accuracy, toxicity, relevance) - Human evaluation workflows - Automatic benchmarking against test datasets - Compare multiple models side-by-side - Custom evaluation criteria - Save and version prompts - Prompt templates with variables - A/B test different prompts - Share prompts across teams - Prompt flow for multi-step workflows - Upload product catalog (100K products) to S3 - Create Bedrock Knowledge Base with product descriptions, specs, reviews - Enable hybrid search for both keyword and semantic matching - Create Bedrock Agent with Claude 3 Sonnet - Define agent instructions: "You are a helpful shopping assistant. Help customers find products, answer questions, and provide recommendations." - Connect action groups: check_inventory: Lambda function to check real-time stock get_pricing: API to fetch current prices and discounts create_cart: Add items to shopping cart track_order: Check order status - check_inventory: Lambda function to check real-time stock - get_pricing: API to fetch current prices and discounts - create_cart: Add items to shopping cart - track_order: Check order status - check_inventory: Lambda function to check real-time stock - get_pricing: API to fetch current prices and discounts - create_cart: Add items to shopping cart - track_order: Check order status - Block competitor mentions - Redact customer PII from logs - Prevent price promises ("I guarantee lowest price") - Filter inappropriate product searches - Contextual grounding to prevent hallucinated product features - Deploy agent with API Gateway - Integrate with website chat widget - Mobile app integration - Voice interface with Amazon Connect - 70% reduction in customer service tickets - 35% increase in conversion rate - Average response time: 2 seconds - Handles 50K concurrent conversations - 92% customer satisfaction score - ROI achieved in 3 months - Object and Scene Detection: Identify 10K+ objects (cars, furniture, animals) and scenes (beach, city, sunset) - Facial Analysis: Detect faces with attributes (age range, gender, emotions, glasses, beard, eyes open/closed) - Face Comparison: Compare two faces for similarity (useful for identity verification) - Celebrity Recognition: Identify 100K+ celebrities automatically - Text Detection (OCR): Extract text in multiple languages and orientations - Content Moderation: Detect explicit, suggestive, violent, or disturbing content with confidence scores - PPE Detection: Identify personal protective equipment (face covers, hand covers, head covers) - Custom Labels: Train custom models with as few as 10 images per category - Person Tracking: Track people across video frames with unique IDs - Activity Detection: Recognize activities (running, playing sports, dancing) - Object Tracking: Follow objects through video - Celebrity Recognition in Video: Identify when celebrities appear - Face Search in Video: Find specific people in video libraries - Content Moderation in Video: Detect inappropriate content with timestamps - Segment Detection: Identify black frames, color bars, end credits, shots - Technical Cue Detection: Find SMPTE color bars, black frames, opening/closing credits - Custom Moderation: Train adapters for brand-specific content policies - Streaming Video Analysis: Real-time analysis with Kinesis Video Streams - Batch Processing: Analyze thousands of images in parallel - Images uploaded to S3 trigger Lambda function - Rekognition DetectModerationLabels API analyzes each image - Custom Labels model trained to detect platform-specific violations (logo misuse, banned symbols) - Images with confidence > 90% automatically rejected - Images with 50-90% confidence sent to human moderators - Facial recognition prevents banned users from creating new accounts - Text detection identifies phone numbers and URLs in images - 95% of inappropriate content blocked automatically - Human moderation workload reduced by 80% - Average processing time: 300ms per image - Cost: $0.001 per image analyzed - False positive rate: < 2% - Printed Text Detection: Extract text with 99%+ accuracy - Handwriting Recognition: Read cursive and printed handwriting - Multi-language Support: 100+ languages including Arabic, Chinese, Japanese - Layout Understanding: Preserve document structure (paragraphs, columns, headers) - Confidence Scores: Per-word confidence levels - Key-Value Pair Detection: Automatically identify form fields and values - Checkbox Detection: Recognize selected/unselected checkboxes - Radio Button Detection: Identify selected options - Signature Detection: Locate signature fields - Relationship Mapping: Link keys to their corresponding values - Table Structure Recognition: Identify rows, columns, cells - Merged Cell Handling: Understand complex table layouts - Multi-page Tables: Track tables spanning multiple pages - Nested Tables: Extract tables within tables - Cell Relationships: Maintain row/column associations - Queries: Ask specific questions about documents ("What is the invoice total?") - AnalyzeExpense: Extract data from invoices and receipts (vendor, date, line items, tax, total) - AnalyzeID: Extract information from identity documents (passports, driver's licenses) - Custom Adapters: Train on your document types for improved accuracy - Layout Analysis: Understand document structure (titles, headers, footers, page numbers) - Claims submitted via mobile app or email - Documents uploaded to S3 - Textract AnalyzeDocument extracts: Policyholder information (name, policy number, date of birth) Claim details (incident date, description, amount claimed) Checkboxes (injury type, property damage) Handwritten notes from adjusters - Policyholder information (name, policy number, date of birth) - Claim details (incident date, description, amount claimed) - Checkboxes (injury type, property damage) - Handwritten notes from adjusters - Textract AnalyzeExpense processes receipts: Vendor names, dates, line items, totals - Vendor names, dates, line items, totals - Extracted data validated and inserted into claims system - Queries feature asks: "What is the total claim amount?" "When did the incident occur?" - Policyholder information (name, policy number, date of birth) - Claim details (incident date, description, amount claimed) - Checkboxes (injury type, property damage) - Handwritten notes from adjusters - Vendor names, dates, line items, totals - Processing time: 30 seconds (down from 10 minutes manual) - 98% extraction accuracy - 90% straight-through processing (no human intervention) - $2M annual savings in processing costs - Claims settled 5x faster - Document-level Sentiment: Overall positive, negative, neutral, or mixed - Targeted Sentiment: Sentiment toward specific entities ("The food was great but service was slow") - Confidence Scores: Probability for each sentiment - Multi-language Support: 100+ languages - Built-in Entity Types: Person, location, organization, date, quantity, title, event, brand, commercial item - Custom Entity Recognition: Train models for domain-specific entities (product codes, medical terms) - Entity Linking: Connect entities to knowledge bases - Confidence Scores: Per-entity confidence levels - Identify important phrases in text - Rank by relevance - Multi-language support - Identify dominant language in text - Support for 100+ languages - Confidence scores for each detected language - Part-of-speech tagging (noun, verb, adjective) - Tokenization - Sentence boundary detection - Discover topics in document collections - Unsupervised learning - Topic distribution per document - Identify personally identifiable information - Detect: names, addresses, SSN, credit cards, phone numbers, emails, IP addresses, passport numbers, driver's licenses - Redaction modes: mask, replace with entity type, or remove - Confidence scores - Train custom text classifiers - Multi-class and multi-label classification - As few as 50 training examples per class - Automatic model training and deployment - Extract medical entities (medications, conditions, procedures, anatomy, test results) - Detect protected health information (PHI) - Understand relationships (medication dosage, test results) - ICD-10-CM and RxNorm code linking - HIPAA eligible - All tickets ingested into S3 - Comprehend analyzes each ticket: Sentiment Analysis: Identify angry customers (priority routing) Entity Recognition: Extract product names, feature requests, error codes Custom Classification: Categorize by issue type (billing, technical, feature request) PII Detection: Redact customer data before storing in analytics database Key Phrases: Identify trending issues - Sentiment Analysis: Identify angry customers (priority routing) - Entity Recognition: Extract product names, feature requests, error codes - Custom Classification: Categorize by issue type (billing, technical, feature request) - PII Detection: Redact customer data before storing in analytics database - Key Phrases: Identify trending issues - Results feed into: Automatic ticket routing Priority queues (negative sentiment = high priority) Product team dashboard (feature requests, bugs) Knowledge base article suggestions - Automatic ticket routing - Priority queues (negative sentiment = high priority) - Product team dashboard (feature requests, bugs) - Knowledge base article suggestions - Sentiment Analysis: Identify angry customers (priority routing) - Entity Recognition: Extract product names, feature requests, error codes - Custom Classification: Categorize by issue type (billing, technical, feature request) - PII Detection: Redact customer data before storing in analytics database - Key Phrases: Identify trending issues - Automatic ticket routing - Priority queues (negative sentiment = high priority) - Product team dashboard (feature requests, bugs) - Knowledge base article suggestions - 60% faster ticket routing - 40% reduction in response time - 25% improvement in customer satisfaction - Identified 3 critical bugs within hours of first report - Automatic compliance with data privacy regulations - Neural TTS Voices: Most natural-sounding, human-like quality - Generative Voices: Create unique brand voices - Long-form Voices: Optimized for long content (audiobooks, articles) - Standard Voices: Cost-effective option - Newscaster Style: Professional news anchor tone - Conversational Style: Casual, friendly tone - 60+ Languages: Including English, Spanish, French, German, Japanese, Arabic, Hindi - SSML Support: Control pronunciation, emphasis, pauses, pitch, rate - Lexicons: Custom pronunciation for brand names, acronyms, technical terms - Speech Marks: Get metadata (phonemes, visemes, word timing) for lip-sync - Breathing Sounds: Add natural breathing for realism - Dynamic Range Compression: Optimize for different playback devices - Brand Voice: Create custom neural voice for your brand (requires voice talent recording) - Voice Cloning: Generate speech in specific person's voice (with consent) - Real-time Streaming: Stream audio as it's generated - Batch Synthesis: Generate hours of audio asynchronously - Multiple Output Formats: MP3, OGG, PCM - Course content stored as text in database - Polly generates audio narration: Neural voices for premium courses Long-form voices for lengthy lectures Newscaster style for formal content Conversational style for casual tutorials - Neural voices for premium courses - Long-form voices for lengthy lectures - Newscaster style for formal content - Conversational style for casual tutorials - Custom lexicons for: Technical terms (API, SQL, Kubernetes) Brand names (AWS, SageMaker) Acronyms (HTML, CSS, REST) - Technical terms (API, SQL, Kubernetes) - Brand names (AWS, SageMaker) - Acronyms (HTML, CSS, REST) - SSML for: Pauses between sections Emphasis on key concepts Slower speech for complex topics - Pauses between sections - Emphasis on key concepts - Slower speech for complex topics - Audio cached in CloudFront CDN - Students can adjust playback speed - Neural voices for premium courses - Long-form voices for lengthy lectures - Newscaster style for formal content - Conversational style for casual tutorials - Technical terms (API, SQL, Kubernetes) - Brand names (AWS, SageMaker) - Acronyms (HTML, CSS, REST) - Pauses between sections - Emphasis on key concepts - Slower speech for complex topics - $500K annual savings (vs. voice actors) - Audio generated in minutes (vs. weeks) - 20 languages supported (vs. 3 previously) - 40% increase in course completion rates - Accessibility compliance achieved - Update course audio in hours when content changes - Automatic Speech Recognition (ASR): 99%+ accuracy for clear audio - Real-time Streaming: Transcribe live audio with sub-second latency - Batch Transcription: Process pre-recorded audio files - Multi-language Support: 100+ languages and dialects - Automatic Language Identification: Detect language automatically - Multi-language Audio: Transcribe audio with multiple languages - Speaker Diarization: Identify and separate different speakers (up to 10 speakers) - Speaker Labels: Tag each utterance with speaker ID - Channel Identification: Separate audio channels (useful for call center recordings) - Custom Vocabulary: Add domain-specific terms, brand names, acronyms - Vocabulary Filtering: Mask or remove profanity and sensitive words - Custom Language Models: Train on your domain-specific text for better accuracy - Automatic Punctuation: Add periods, commas, question marks - Number Formatting: Convert spoken numbers to digits - Partial Results: Get transcripts as speech is detected (streaming) - Confidence Scores: Per-word confidence levels - Timestamps: Word-level and sentence-level timing - Redaction: Automatically redact PII (SSN, credit cards, names) - Content Moderation: Flag profanity and inappropriate content - Subtitle Generation: Create WebVTT and SRT subtitle files - Call Analytics: Specialized features for call center recordings Sentiment analysis per speaker Call categorization Issue detection Interruption tracking Talk time analysis Non-talk time detection - Sentiment analysis per speaker - Call categorization - Issue detection - Interruption tracking - Talk time analysis - Non-talk time detection - Sentiment analysis per speaker - Call categorization - Issue detection - Interruption tracking - Talk time analysis - Non-talk time detection - Medical terminology recognition - Specialty-specific vocabularies (cardiology, neurology, oncology) - Medication names and dosages - HIPAA eligible - Automatic PHI identification - Audio recordings uploaded to S3 - Transcribe processes with: Speaker diarization (identify attorney, client, witnesses) Custom vocabulary (legal terms, case-specific names, technical jargon) PII redaction for sensitive information Timestamps for easy reference - Speaker diarization (identify attorney, client, witnesses) - Custom vocabulary (legal terms, case-specific names, technical jargon) - PII redaction for sensitive information - Timestamps for easy reference - Transcripts stored in searchable database - Integration with case management system - Lawyers can search: "Find all mentions of contract breach in Smith deposition" - Speaker diarization (identify attorney, client, witnesses) - Custom vocabulary (legal terms, case-specific names, technical jargon) - PII redaction for sensitive information - Timestamps for easy reference - Transcription time: 30 minutes (vs. 8 hours manual) - Cost: $0.024 per minute of audio - 97% accuracy with custom vocabulary - Searchable archive of 10 years of recordings - Paralegals save 20 hours/week - Critical testimony found in seconds, not hours - 75+ Languages: Including major languages and regional dialects - Neural Machine Translation: Context-aware, fluent translations - Real-time Translation: Translate text instantly via API - Batch Translation: Translate large documents asynchronously - Automatic Language Detection: Identify source language automatically - Custom Terminology: Define how specific terms should be translated Brand names (keep unchanged) Technical terms (consistent translation) Industry jargon - Brand names (keep unchanged) - Technical terms (consistent translation) - Industry jargon - Parallel Data: Provide example translations to improve quality - Formality Control: Choose formal or informal tone (for supported languages) - Profanity Masking: Mask profane words in translations - Brand names (keep unchanged) - Technical terms (consistent translation) - Industry jargon - Document Translation: Translate Word, PowerPoint, Excel files while preserving formatting - Active Custom Translation: Real-time custom model training - Translation Quality Estimation: Confidence scores for translations - Brevity Control: Adjust translation length - HTML Translation: Translate HTML content while preserving tags - Application UI: All UI strings stored in resource files Translate API called at build time Custom terminology for product features ("Dashboard" → consistent across languages) Formality set to "formal" for business context - All UI strings stored in resource files - Translate API called at build time - Custom terminology for product features ("Dashboard" → consistent across languages) - Formality set to "formal" for business context - Help Documentation: 500 articles in English Batch translation to 20 languages Document translation preserves formatting Technical terms (API endpoints, code samples) kept in English - 500 articles in English - Batch translation to 20 languages - Document translation preserves formatting - Technical terms (API endpoints, code samples) kept in English - Customer Support: Real-time translation of support tickets Support agents respond in English, automatically translated to customer's language Custom terminology for product-specific terms - Real-time translation of support tickets - Support agents respond in English, automatically translated to customer's language - Custom terminology for product-specific terms - Marketing Content: Website content translated with formality control Regional dialect support (Spanish for Spain vs. Latin America) - Website content translated with formality control - Regional dialect support (Spanish for Spain vs. Latin America) - All UI strings stored in resource files - Translate API called at build time - Custom terminology for product features ("Dashboard" → consistent across languages) - Formality set to "formal" for business context - 500 articles in English - Batch translation to 20 languages - Document translation preserves formatting - Technical terms (API endpoints, code samples) kept in English - Real-time translation of support tickets - Support agents respond in English, automatically translated to customer's language - Custom terminology for product-specific terms - Website content translated with formality control - Regional dialect support (Spanish for Spain vs. Latin America) - 20 languages supported (vs. 3 manual translations) - Translation cost: $15 per million characters - Time to add new language: 1 day (vs. 3 months) - 35% increase in international revenue - 50% reduction in support response time for non-English customers - Consistent terminology across all touchpoints - Intents: Define what users want to accomplish - Slots: Extract specific information from user input (dates, names, numbers) - Slot Types: Built-in types (dates, numbers, cities) and custom types - Utterances: Example phrases users might say - Prompts: Questions bot asks to gather information - Confirmation: Ask users to confirm before taking action - Intent Recognition: Understand user's goal from natural language - Entity Extraction: Pull out key information (dates, locations, products) - Context Management: Remember conversation history - Multi-turn Conversations: Handle complex, multi-step interactions - Sentiment Detection: Understand user's emotional state - Automatic Speech Recognition: Voice input support - Lambda Integration: Execute business logic and API calls - Session Attributes: Store conversation state - Conditional Branching: Different conversation flows based on context - Slot Validation: Ensure collected information is valid - Fallback Intents: Handle unrecognized input gracefully - AMAZON.KendraSearchIntent: Search knowledge bases for answers - Multi-language Support: 20+ languages - Voice and Text: Same bot works for both modalities - Amazon Connect: Integrate with contact center - Facebook Messenger: Deploy to social media - Slack: Enterprise chat integration - Twilio SMS: Text message interface - Custom Applications: Web, mobile, IoT devices - CheckBalance: "What's my account balance?" - TransferFunds: "Transfer $500 from checking to savings" - PayBill: "Pay my electric bill" - ReportCard: "I lost my credit card" - FindATM: "Where's the nearest ATM?" - GetHelp: "I need to speak to someone" - User: "What's my balance?" - Bot: "I can help with that. Which account? Checking or savings?" - User: "Checking" - Bot: [Lambda calls banking API] - Bot: "Your checking account balance is $2,450.32. Anything else I can help with?" - Slot validation (account type must be checking/savings) - Lambda integration for real-time balance lookup - Session attributes to remember user's account preferences - Sentiment detection to escalate frustrated customers to human agents - Multi-factor authentication via SMS before showing sensitive info - Voice interface for phone banking - Text interface for mobile app and website - 70% of routine inquiries handled by bot - 500K calls/month deflected from human agents - $3M annual cost savings - Average interaction time: 45 seconds - 24/7 availability - Customer satisfaction: 4.2/5 stars - Escalation to human agent when needed: 15% of conversations - User Personalization: Recommend items based on user's history and preferences - Similar Items: "Customers who viewed this also viewed..." - Personalized Ranking: Rerank items based on user's preferences - Trending Now: Popular items with momentum - Next Best Action: Recommend optimal action for user engagement - Interactions: User behavior (clicks, purchases, views, ratings) - User Metadata: Demographics, preferences, subscription tier - Item Metadata: Categories, price, description, attributes - Contextual Data: Device type, location, time of day - Real-time Events: Update recommendations as users interact - Cold Start: Recommendations for new users and items - Business Rules: Apply filters and promotions Boost certain items Filter out out-of-stock items Promote seasonal content - Boost certain items - Filter out out-of-stock items - Promote seasonal content - A/B Testing: Compare recommendation strategies - Batch Recommendations: Generate recommendations for all users offline - Exploration: Balance popular items with discovery - Boost certain items - Filter out out-of-stock items - Promote seasonal content - User-Personalization: General-purpose recommendations - Personalized-Ranking: Rerank search results - Similar-Items: Item-to-item similarity - Popularity-Count: Most popular items - Next-Best-Action: Optimize for specific goals - User interactions: watch history, ratings, searches, pauses, skips - User metadata: age, location, subscription tier, device preferences - Content metadata: genre, actors, director, release year, duration, language - Homepage: User-Personalization recipe for "Recommended for You" - Video Page: Similar-Items for "Because you watched..." - Search Results: Personalized-Ranking to reorder results - Trending Section: Popularity-Count with time decay - Email Campaigns: Batch recommendations for weekly digest - Boost new releases for first 7 days - Filter content not available in user's region - Promote content user's subscription tier has access to - Reduce recommendations for genres user consistently skips - User watches 10 minutes of a show → immediately update recommendations - User rates a movie → adjust similar content recommendations - User searches for "comedy" → boost comedy recommendations - 25% increase in average watch time - 15% reduction in churn rate - 40% of content discovered through recommendations - 60% increase in email click-through rates - 30% improvement in new content discovery - ROI: 8x within first year - Automatic Model Selection: Tests multiple algorithms and picks the best - Built-in Algorithms: CNN-QR, DeepAR+, Prophet, NPTS, ARIMA, ETS - Probabilistic Forecasts: P10, P50, P90 quantiles for uncertainty - Multiple Time Series: Forecast thousands of related time series together - Missing Data Handling: Automatically fills gaps in historical data - Target Time Series: Historical values to forecast (sales, demand, traffic) - Related Time Series: Additional data that influences target (price, promotions, weather) - Item Metadata: Static attributes (product category, store location) - Retail Domain: Demand forecasting with promotions, holidays, stockouts - Inventory Planning: Optimize stock levels across locations - Workforce Planning: Predict staffing needs - EC2 Capacity: Forecast compute resource requirements - Web Traffic: Predict website visitors - Metrics: Forecast custom business metrics - Holiday Calendars: Built-in holiday effects for 250+ countries - Weather Index: Incorporate weather data automatically - What-If Analysis: Simulate different scenarios - Explainability: Understand which factors drive forecasts - Automatic Retraining: Keep models fresh with new data - Historical sales data (3 years) uploaded to S3 - Related time series: promotions, holidays, local events, weather - Item metadata: category, price tier, seasonality - Forecast generates predictions for next 12 weeks - P10 forecast for safety stock - P50 forecast for base inventory - P90 forecast for peak demand scenarios - Automated retraining weekly with latest sales data - 40% reduction in stockouts - 35% reduction in overstock - $15M annual savings in inventory costs - 25% improvement in forecast accuracy vs. previous statistical methods - Optimized distribution center allocation - Better promotional planning - Online Fraud: Fake account creation, payment fraud - Account Takeover: Unauthorized access to existing accounts - Transaction Fraud: Suspicious purchases and payments - Identity Verification: Validate user identity during onboarding - Online Fraud Insights: Pre-trained model for common fraud patterns - Transaction Fraud Insights: Detect suspicious transactions - Account Takeover Insights: Identify compromised accounts - Train on your historical fraud data - Automatic feature engineering - Model versioning and A/B testing - Continuous learning from new fraud patterns - Real-time Scoring: Evaluate transactions in milliseconds - Risk Scores: 0-1000 scale indicating fraud likelihood - Rules Engine: Combine ML predictions with business rules - Explainability: Understand why transaction was flagged - SageMaker Integration: Use custom ML models - Event Tracking: Monitor outcomes to improve models - Historical transaction data (2 years) with fraud labels - Features tracked: User behavior (account age, purchase history, login patterns) Transaction details (amount, payment method, shipping address) Device fingerprinting (IP address, browser, device ID) Velocity checks (transactions per hour, new addresses) - User behavior (account age, purchase history, login patterns) - Transaction details (amount, payment method, shipping address) - Device fingerprinting (IP address, browser, device ID) - Velocity checks (transactions per hour, new addresses) - Custom model trained on marketplace-specific fraud patterns - Rules engine: Block transactions with score > 900 Manual review for scores 700-900 Approve scores < 700 - Block transactions with score > 900 - Manual review for scores 700-900 - Approve scores < 700 - Real-time scoring at checkout - Feedback loop: confirmed fraud updates model - User behavior (account age, purchase history, login patterns) - Transaction details (amount, payment method, shipping address) - Device fingerprinting (IP address, browser, device ID) - Velocity checks (transactions per hour, new addresses) - Block transactions with score > 900 - Manual review for scores 700-900 - Approve scores < 700 - 60% reduction in fraud losses ($3M saved annually) - False positive rate reduced from 15% to 3% - Average scoring time: 50ms - Legitimate customers rarely impacted - Fraud detection rate: 95% - ROI: 15x in first year - FHIR Support: Fast Healthcare Interoperability Resources standard - Data Ingestion: Import from multiple EHR systems - Data Normalization: Standardize data from different sources - Medical NLP: Extract insights from clinical notes - Structured and Unstructured Data: Handle both types - Integrated Analytics: Query with Amazon Athena - Medical Entity Extraction: Medications, conditions, procedures - Temporal Queries: Track patient history over time - Population Health: Aggregate data for research - Cohort Identification: Find patients matching criteria - HIPAA Eligible: Meets healthcare privacy requirements - Encryption: At rest and in transit - Audit Logging: Track all data access - Access Controls: Fine-grained permissions - Data from Epic, Cerner, Meditech ingested into HealthLake - FHIR transformation normalizes data structure - Medical NLP extracts entities from clinical notes - Unified patient view across all facilities - Doctors access complete medical history regardless of where patient was treated - Research team queries de-identified data for clinical studies - Population health analytics identify high-risk patients - Complete patient history available in seconds - 50% reduction in duplicate tests - Improved care coordination - Faster diagnosis with complete information - Research insights from 500K patient records - Compliance with HIPAA maintained - Code generation in 15+ languages (Python, Java, JavaScript, TypeScript, C#, Go, Rust, etc.) - Code explanation and documentation generation - Security vulnerability detection (SQL injection, XSS, CSRF) - Automated code transformations and refactoring - Unit test generation - AWS infrastructure code generation (CloudFormation, Terraform) - IDE integration (VS Code, JetBrains, Visual Studio, Cloud9) - Free Tier: Basic code completions - Professional ($19/user/month): Unlimited completions, security scanning, code transformations - Enterprise (Custom): Private deployment, custom training, SSO - Natural language search across 40+ data sources (Slack, Teams, Confluence, SharePoint, Salesforce, S3, databases) - Semantic search with automatic source citations - Role-based access control (respects source system permissions) - PII detection and redaction - Conversational AI with multi-turn context - Analytics dashboard for query tracking - Lite ($3/user/month): 10 data sources, 100 queries/month - Plus ($20/user/month): Unlimited sources and queries - Enterprise (Custom): VPC deployment, custom training - Ask questions in plain English ("What were top 5 products last quarter?") - Automatic visualization selection and dashboard creation - Executive summaries and data storytelling - Proactive anomaly detection and insights - Trend identification and forecasting explanations - $250/month for 10 users, $25/user/month additional - Unlimited queries - Converts natural language prompts into structured specifications - Breaks down features into logical implementation steps - Generates requirements, design documents, and data flow diagrams - Creates code, tests, and API integrations - Chat with Kiro about your codebase - Request explanations for complex logic - Generate new features through conversation - Debug issues with AI assistance - Automated triggers for predefined actions - Execute tasks on file save, create, or delete events - Automate routine development tasks - Persistent project knowledge through markdown files - Define coding conventions and standards - Ensure consistent patterns across codebase - Uses multiple foundation models - Automated abuse detection - Enterprise-grade security - Free tier data may be used for service improvement - Enterprise users get customer-managed encryption keys - Granular access controls - Recognizes UI elements (buttons, forms, menus, links) - Understands screen layouts and context - Adapts to UI changes dynamically - Click, type, scroll, navigate - Fill forms with data - Submit information - Handle pop-ups and dialogs - Complete complex tasks across multiple screens - Chain actions together - Maintain context throughout workflow - Retry failed actions - Adapt when UI changes - Handle unexpected states - Provide detailed logs for audit - Web applications - Desktop applications - Mobile apps (future) - Legacy systems without APIs - Data entry automation across legacy systems - Automated testing of web applications - RPA (Robotic Process Automation) replacement - Integration with systems lacking APIs - Compliance and audit workflows - Nova Act agent trained to: Log into each system Navigate to data entry forms Fill in client information Submit and verify entries Handle error messages - Log into each system - Navigate to data entry forms - Fill in client information - Submit and verify entries - Handle error messages - Agent runs on schedule - Processes 500 entries/day - Logs all actions for audit compliance - Log into each system - Navigate to data entry forms - Fill in client information - Submit and verify entries - Handle error messages - 200 hours/month saved - 99.5% accuracy - Eliminates manual data entry errors - Frees staff for higher-value work - Works with any agent framework like LangChain, LangGraph, CrewAI, Strands Agents, etc. - Supports any foundation model, including Amazon Bedrock models (Claude, Nova, Titan) and external providers. - Purpose-built serverless environment to deploy and run agents and tools without managing servers. - Session isolation ensures each user’s context and data is protected. - Supports long-running tasks (up to hours), async jobs, streaming responses, and WebSocket interactions. - Built-in memory system for context retention across sessions and users. - Enables both short-term interaction context and long-term knowledge for personalization and coherence. - Identity management service to securely authenticate agents and sessions via OAuth, IAM, and external identity providers. - Protects credentials and supports secure access to third-party systems. - Acts as a bridge between AI agents and external APIs or Lambda functions, exposing them as tools that agents can call. - Features like debug messaging, custom encryption, semantic search for tools, and tagging for organization. - Integrated observability for metrics, logs, tracing, and dashboards so teams can monitor agent behavior. - New policy enforcement and evaluation features help ensure agents obey compliance and quality standards. - Code Interpreter tool allows agents to execute safe sandboxed code. - Browser tool lets agents interact with live websites securely at scale. - Runtime & Scaling Deploy an IT agent using AgentCore Runtime that can respond at scale as ticket volume fluctuates. - Memory & Context Memory stores session context such as user history, common resolutions, and preferences. - Memory stores session context such as user history, common resolutions, and preferences. - Identity Integration Authenticate users via corporate OAuth/SAML for secure access to internal systems. - Authenticate users via corporate OAuth/SAML for secure access to internal systems. - Tool Integrations Connect to internal APIs (e.g., helpdesk systems, knowledge base, asset inventory) using the Gateway. Agents can run diagnostic scripts via the Code Interpreter tool to gather logs or run fixes. - Connect to internal APIs (e.g., helpdesk systems, knowledge base, asset inventory) using the Gateway. - Agents can run diagnostic scripts via the Code Interpreter tool to gather logs or run fixes. - Observability & Quality Admins monitor agent effectiveness, ticket resolution rates, and anomalous behavior via observability dashboards. Built-in policy controls ensure agents don’t perform unsafe actions. - Admins monitor agent effectiveness, ticket resolution rates, and anomalous behavior via observability dashboards. - Built-in policy controls ensure agents don’t perform unsafe actions. - Memory stores session context such as user history, common resolutions, and preferences. - Authenticate users via corporate OAuth/SAML for secure access to internal systems. - Connect to internal APIs (e.g., helpdesk systems, knowledge base, asset inventory) using the Gateway. - Agents can run diagnostic scripts via the Code Interpreter tool to gather logs or run fixes. - Admins monitor agent effectiveness, ticket resolution rates, and anomalous behavior via observability dashboards. - Built-in policy controls ensure agents don’t perform unsafe actions. - Faster response times on common issues. - Reduced workload for human IT support. - Secure access to enterprise systems without exposing credentials. - Reliable audit trails for compliance. - You have proprietary data that gives you competitive advantage - Your problem is unique and pre-trained models won't work - You need complete control over model architecture and training - You want to optimize for specific metrics (accuracy, latency, cost) - You're solving prediction, classification, or regression problems - You need explainability and model governance - End-to-end ML platform - Build, train, deploy custom models - Complete control over ML lifecycle - MLOps and governance built-in - Amazon Rekognition: Computer vision (images, videos) - Amazon Textract: Document intelligence - Amazon Comprehend: Natural language processing - Amazon Transcribe: Speech-to-text - Amazon Polly: Text-to-speech - Amazon Translate: Language translation - Amazon Lex: Conversational AI - Amazon Personalize: Recommendations - Amazon Forecast: Time-series forecasting - Amazon Fraud Detector: Fraud detection - Amazon Augmented AI (A2I): Human review workflows - Amazon Lookout for Equipment: Anomaly detection for industrial equipment - Amazon Monitron: Equipment monitoring - AWS Panorama: Computer vision at the edge - Amazon DevOps Guru: ML-powered operations - Amazon CodeGuru: Code quality and performance - Unique sensor data from proprietary equipment - Need precise predictions (false positives are expensive) - Requires model explainability for maintenance teams - Must integrate with existing SCADA systems - Data Collection: IoT sensors → Kinesis → S3 - Feature Engineering: SageMaker Feature Store (temperature trends, vibration patterns, usage hours) - Model Training: SageMaker with custom XGBoost model - Deployment: Real-time endpoint for critical equipment, batch for others - Monitoring: Model Monitor tracks prediction drift - Human Review: A2I for borderline predictions - Retraining: Automated pipeline when new failure data arrives - 85% of failures predicted 48 hours in advance - 60% reduction in unplanned downtime - $10M annual savings - Model explainability helps maintenance teams understand why - You need to generate content (text, images, code) - You want conversational AI and natural language understanding - You need to reason over documents and data - You want to build AI agents that take actions - You don't have millions of labeled training examples - Time-to-market is critical - Access to leading foundation models (Claude, Llama, Titan, etc.) - Knowledge Bases for RAG - Agents for autonomous actions - Guardrails for safety - Fine-tuning for customization - AWS expertise and troubleshooting - Code generation and explanation - Business intelligence and analytics - Document search and summarization - AI agents that interact with UIs - Automate workflows across systems - RPA replacement with intelligence - Deploy agents at scale - Multi-framework support - Model flexibility - Amazon Kendra: Intelligent enterprise search (ML-powered, often used with GenAI) - Amazon Lex: Conversational interfaces (can integrate with Bedrock) - Amazon Comprehend: NLP for understanding (complements GenAI) - Need natural language understanding and generation - Don't have labeled training data - Documents are unstructured (reports, presentations, emails) - Need conversational interface - Want to deploy quickly - Data Ingestion: 500K documents → S3 - Knowledge Base: Bedrock Knowledge Base with OpenSearch vector store - Foundation Model: Claude 3 Sonnet for reasoning and generation - Agent Setup: Bedrock Agent with action groups: Search project database Check employee availability Create meeting invites Generate project proposals - Search project database - Check employee availability - Create meeting invites - Generate project proposals - Guardrails: Redact client PII Block competitor mentions Ensure professional tone - Redact client PII - Block competitor mentions - Ensure professional tone - Deployment: Slack bot + web interface - Amazon Q Integration: Help employees with AWS infrastructure questions - Search project database - Check employee availability - Create meeting invites - Generate project proposals - Redact client PII - Block competitor mentions - Ensure professional tone - Deployed in 4 weeks (vs. 6 months for custom ML) - 80% of internal questions answered without human help - Average response time: 3 seconds - 10K queries/day - 90% user satisfaction - Consultants save 5 hours/week searching for information - New employees onboard 50% faster - Bedrock Agent for conversational interface - Knowledge Base for product documentation - Claude for natural language understanding - SageMaker model for customer churn prediction - Personalize for product recommendations - Comprehend for sentiment analysis - Forecast for demand prediction - Customer asks question → Bedrock Agent (GenAI) - Agent retrieves answer from Knowledge Base (GenAI) - Agent checks customer sentiment → Comprehend (ML) - If negative sentiment → escalate to human - Agent suggests products → Personalize (ML) - Agent predicts churn risk → SageMaker (ML) - If high risk → offer retention discount - 30+ AI/ML services covering every use case - Choose the right tool for the job - Start simple, scale to complex - Services work seamlessly together - Unified IAM, VPC, CloudWatch - Data flows easily between services - No server management - Auto-scaling built-in - High availability by default - No upfront costs - Scale from prototype to production - Only pay for what you use - Encryption at rest and in transit - HIPAA, PCI-DSS, SOC 2, GDPR compliant - Your data stays in your account - Fine-grained access controls - Global infrastructure - Low-latency inference - Optimized for scale - New features released constantly - Access to latest models (Claude 3, Llama 3, etc.) - Backward compatibility maintained - Sign up for AWS Free Tier - Try Amazon Rekognition: Upload images, detect objects - Try Amazon Comprehend: Analyze text sentiment - Try Amazon Polly: Generate speech from text - Build a simple demo combining 2-3 services - Access Amazon Bedrock console - Try different foundation models (Claude, Llama, Titan) - Create a simple Knowledge Base with your documents - Build a basic chatbot using Bedrock Agent - Experiment with Guardrails - Choose a dataset (Kaggle, UCI ML Repository) - Use SageMaker Autopilot for automated ML - Explore SageMaker Studio notebooks - Train a simple model (classification or regression) - Deploy to a real-time endpoint - Test predictions via API - Document Intelligence App: Upload PDFs → Textract extracts data → Comprehend analyzes sentiment → Store in database - Content Generation Platform: Bedrock generates blog posts → Polly creates audio version → Translate to multiple languages - Smart Customer Service: Lex chatbot → Bedrock for complex queries → Personalize for recommendations - Predictive Analytics Dashboard: SageMaker model predicts outcomes → Forecast for time-series → QuickSight for visualization - Larger context windows (1M+ tokens) - Multimodal models (text + image + video + audio) - Faster inference times - Lower costs - Agents that can use any tool or API - Multi-agent collaboration - Long-running workflows - Better reasoning capabilities - Fine-tuning with less data - Faster training times - Better transfer learning - Automated prompt optimization - On-premises foundation models - Federated learning - Differential privacy - Confidential computing - Healthcare AI assistants - Financial services compliance tools - Manufacturing optimization - Retail personalization - A clear understanding of your problem - Knowledge of which AWS service fits your use case - Willingness to experiment and iterate - Focus on delivering value, not building infrastructure - You have unique data and unique problems - You need precise predictions - You want complete control - Explainability is critical - You need content generation and reasoning - You want natural language interfaces - Time-to-market is critical - You don't have labeled training data - AWS Machine Learning Blog - Amazon SageMaker Examples - Amazon Bedrock Samples - AWS AI/ML Workshops - AWS Skill Builder (Free ML courses) - AWS Certified Machine Learning - Specialty (Retired) - AWS Certified AI Practitioner - AWS Machine learning Associate - AWS Generative AI Developer Professional (Beta) - AWS re:Post (Q&A forum) - AWS Events (re:Invent, Summits, Webinars) - AWS Free Tier - Try most AI/ML services free - Many services include generous monthly free usage