AWS S3 Vectors: Finally, Cloud Scalable Vector Storage 🚀


Source: Dev.to

## What Are We Discussing Here?

Been building AI applications lately? Then you've probably felt the pain of vector embedding storage. Vector databases are powerful, yeah, but expensive and a hassle to work with. Surprise: AWS just launched something that can save you up to 90% on vector storage. Let me introduce you to Amazon S3 Vectors.

Let's get our feet wet in the fundamentals before jumping into S3 Vectors. If you already store vectors in Pinecone, Weaviate, or your own vector database, you know that vectors are numeric encodings of your data: text, images, audio, and so on. They capture semantic meaning so you can retrieve similar items, drive RAG applications, or give your AI agents some memory.

The problem? Traditional vector databases are:

- 💸 Expensive at scale (especially when you're storing millions or billions of vectors)
- 🔧 Complex to maintain and provision
- 📊 Often overkill if you're not querying constantly

Say hello to S3 Vectors: the first cloud object store with native vector support. It's like Amazon S3, but optimized for vectors. You get the legendary reliability and scalability of S3, plus the ability to query your vectors with sub-second performance.

## Why Should You Care? 🤔

Let's be realistic: we don't all need millisecond query times for every vector in our system. Sure, your recommendation engine has to be lightning fast, but what about:

- 📚 Long-term conversation history for your AI agents
- 🗂️ Archive data that you query only occasionally
- 📈 Training datasets that grow over time
- 🔍 Semantic search across millions of documents that doesn't need real-time responses

This is where S3 Vectors shines. Pay only for what you use, provision no infrastructure whatsoever, and scale from thousands to billions of vectors without breaking a sweat (or your bank).

## The Architecture: How It Actually Works

S3 Vectors introduces three core concepts.

### 1. Vector Buckets 🪣

A new type of S3 bucket specifically designed for vector data. This isn't your regular S3 bucket; it's built for vectors from the ground up.

### 2. Vector Indexes 📑

Inside each vector bucket, you create indexes to hold and organize your vectors. Indexes group related vectors together. Each bucket supports up to 10,000 indexes, and each index can hold tens of millions of vectors.

### 3. Vectors (Obviously) ⚡

Your actual vector embeddings live here. The best part? You can attach metadata as key-value pairs to each vector. Need to filter by date, category, user preference, or genre? No problem. S3 Vectors automatically optimizes your data as it evolves; no manual fiddling required.
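Before any of the code later in this post will run, you need a vector bucket and an index to put vectors into. Here's a minimal sketch of what that looks like with boto3; the bucket and index names are placeholders, and the parameter names shown (dataType, dimension, distanceMetric) reflect my reading of the current s3vectors API, so double-check them against the official docs:

```python
import boto3

# S3 Vectors has its own boto3 client, separate from the regular "s3" client
s3vectors = boto3.client("s3vectors", region_name="us-west-2")

# Create the vector bucket (name is a placeholder)
s3vectors.create_vector_bucket(vectorBucketName="my-movie-vectors")

# Create an index inside it. The dimension must match your embedding model;
# Titan Text Embeddings V2 produces 1024-dimensional vectors by default.
s3vectors.create_index(
    vectorBucketName="my-movie-vectors",
    indexName="movie-embeddings",
    dataType="float32",
    dimension=1024,
    distanceMetric="cosine",
)
```

Once the index exists, the put and query calls in the walkthrough below target it by bucket name and index name.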
### Strong Consistency FTW

Here's the key bit: writes are strongly consistent. When you add or update a vector, it's immediately available for queries. No eventual-consistency angst.

## Let's Get Practical: Code That Actually Works 💻

I know you're itching to see some code, so let's walk through a real example. Say you're building a movie recommendation system (because every tutorial needs movies, right? 🎬).

### Step 1: Generate Embeddings with Amazon Bedrock

First, turn each movie description into an embedding using the Titan Text Embeddings V2 model:

```python
import boto3
import json

# Set up the Bedrock runtime client
bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

# Your movie descriptions
texts = [
    "Star Wars: A farm boy joins rebels to fight an evil empire in space",
    "Jurassic Park: Scientists create dinosaurs in a theme park that goes wrong",
    "Finding Nemo: A father fish searches the ocean to find his lost son"
]

embeddings = []

# Get an embedding for each movie description
for text in texts:
    body = json.dumps({"inputText": text})
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=body
    )
    response_body = json.loads(response["body"].read())
    embedding = response_body["embedding"]
    embeddings.append(embedding)
```

### Step 2: Store Vectors in S3 Vectors

Next, write those embeddings to a vector index, attaching metadata you'll want to filter on later:

```python
# Create the S3 Vectors client
s3vectors = boto3.client("s3vectors", region_name="us-west-2")

# Insert your vectors, each with a key, the float32 data, and metadata
s3vectors.put_vectors(
    vectorBucketName="my-movie-vectors",
    indexName="movie-embeddings",
    vectors=[
        {
            "key": "v1",
            "data": {"float32": embeddings[0]},
            "metadata": {"id": "key1", "source_text": texts[0], "genre": "scifi"}
        },
        {
            "key": "v2",
            "data": {"float32": embeddings[1]},
            "metadata": {"id": "key2", "source_text": texts[1], "genre": "scifi"}
        },
        {
            "key": "v3",
            "data": {"float32": embeddings[2]},
            "metadata": {"id": "key3", "source_text": texts[2], "genre": "family"}
        }
    ]
)
```
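Because writes are strongly consistent, you can read those vectors back (or query them) immediately after put_vectors returns. A quick sketch, assuming the client exposes a get_vectors operation along these lines:

```python
# Fetch vectors back by key right after writing them: no waiting for an
# eventual-consistency window (keys and fields here follow the put above)
resp = s3vectors.get_vectors(
    vectorBucketName="my-movie-vectors",
    indexName="movie-embeddings",
    keys=["v1", "v2", "v3"],
    returnData=False,      # skip the raw floats, just confirm they're there
    returnMetadata=True,
)

for v in resp["vectors"]:
    print(v["key"], v["metadata"]["source_text"])
```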
{"float32": embeddings[0]}, "metadata": { "id": "key1", "source_text": texts[0], "genre": "scifi" } }, { "key": "v2", "data": {"float32": embeddings[1]}, "metadata": { "id": "key2", "source_text": texts[1], "genre": "scifi" } }, { "key": "v3", "data": {"float32": embeddings[2]}, "metadata": { "id": "key3", "source_text": texts[2], "genre": "family" } } ] ) Enter fullscreen mode Exit fullscreen mode COMMAND_BLOCK: # Create S3 Vectors client s3vectors = boto3.client('s3vectors', region_name='us-west-2') # Insert your vectors with metadata s3vectors.put_vectors( vectorBucketName="my-movie-vectors", indexName="movie-embeddings", vectors=[ { "key": "v1", "data": {"float32": embeddings[0]}, "metadata": { "id": "key1", "source_text": texts[0], "genre": "scifi" } }, { "key": "v2", "data": {"float32": embeddings[1]}, "metadata": { "id": "key2", "source_text": texts[1], "genre": "scifi" } }, { "key": "v3", "data": {"float32": embeddings[2]}, "metadata": { "id": "key3", "source_text": texts[2], "genre": "family" } } ] ) COMMAND_BLOCK: # Create S3 Vectors client s3vectors = boto3.client('s3vectors', region_name='us-west-2') # Insert your vectors with metadata s3vectors.put_vectors( vectorBucketName="my-movie-vectors", indexName="movie-embeddings", vectors=[ { "key": "v1", "data": {"float32": embeddings[0]}, "metadata": { "id": "key1", "source_text": texts[0], "genre": "scifi" } }, { "key": "v2", "data": {"float32": embeddings[1]}, "metadata": { "id": "key2", "source_text": texts[1], "genre": "scifi" } }, { "key": "v3", "data": {"float32": embeddings[2]}, "metadata": { "id": "key3", "source_text": texts[2], "genre": "family" } } ] ) COMMAND_BLOCK: # User asks: "List movies about adventures in space" input_text = "List the movies about adventures in space" # Create embedding for the query request = json.dumps({"inputText": input_text}) response = bedrock.invoke_model( modelId="amazon.titan-embed-text-v2:0", body=request ) model_response = json.loads(response["body"].read()) query_embedding = model_response["embedding"] # Do similarity search with metadata filtering query = s3vectors.query_vectors( vectorBucketName="my-movie-vectors", indexName="movie-embeddings", queryVector={"float32": query_embedding}, topK=3, filter={"genre": "scifi"}, # Only search sci-fi movies returnDistance=True, returnMetadata=True ) results = query["vectors"] print(results) # Star Wars will be your top match! Enter fullscreen mode Exit fullscreen mode COMMAND_BLOCK: # User asks: "List movies about adventures in space" input_text = "List the movies about adventures in space" # Create embedding for the query request = json.dumps({"inputText": input_text}) response = bedrock.invoke_model( modelId="amazon.titan-embed-text-v2:0", body=request ) model_response = json.loads(response["body"].read()) query_embedding = model_response["embedding"] # Do similarity search with metadata filtering query = s3vectors.query_vectors( vectorBucketName="my-movie-vectors", indexName="movie-embeddings", queryVector={"float32": query_embedding}, topK=3, filter={"genre": "scifi"}, # Only search sci-fi movies returnDistance=True, returnMetadata=True ) results = query["vectors"] print(results) # Star Wars will be your top match! 
## The Integration Story: Playing Nice with Others 🔗

S3 Vectors is not an isolated island. AWS made sure it plays nicely with tools you may already have on board.

### Amazon Bedrock Knowledge Bases 🧠

Create knowledge bases for RAG applications natively, with S3 Vectors as the storage layer, and cut costs without sacrificing functionality.

### Amazon OpenSearch Service 🔍

This is where things get interesting: use a tiered storage approach. Keep your low-frequency, long-term vectors in S3 Vectors (cheap), and move high-priority vectors up to OpenSearch when you need high-QPS, low-latency performance. Picture it like this:

- S3 Vectors: your budget-saving data warehouse for vectors
- OpenSearch: your high-performance query engine

You can even export S3 vector index snapshots directly to OpenSearch Serverless collections from the console. Best of both worlds.

### Amazon SageMaker Unified Studio 🎨

Build and prototype your generative AI workloads with native access to Bedrock and S3 Vectors in one integrated studio.

## When Should You Actually Use This? 🎯

S3 Vectors is a great fit if:

- ✅ You need to persist large vector datasets (millions to billions of vectors)
- ✅ Your query rate isn't measured in thousands of QPS
- ✅ Cost is a primary concern
- ✅ Sub-second query latency is good enough (you don't need microseconds)
- ✅ You don't want to worry about infrastructure
- ✅ You need strong consistency guarantees

Stick with traditional vector databases if:

- ❌ You need ultra-low latency (single-digit milliseconds)
- ❌ You're serving high-QPS, real-time applications
- ❌ You need advanced features like hybrid search or complex aggregations
- ❌ Your vectors are frequently updated and queried simultaneously

## The Nitty-Gritty Details 🔧

Some important specs to keep in mind:

- Supported distance metrics: cosine and Euclidean
- Metadata types: string, number, boolean, and lists
- Metadata filtering: all metadata is filterable out of the box
- Encryption: SSE-S3 by default, or bring your own KMS keys
- Access control: IAM policies, with a separate s3vectors namespace
- Block Public Access: always enabled (can't be disabled)
- Currently available in: US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Frankfurt), and Asia Pacific (Sydney)

## Getting Started (It's Easier Than You Think) ⚡

Via the console:

1. Navigate to S3 → Vector buckets
2. Click "Create vector bucket"
3. Create a vector index (specify dimensionality and distance metric)
4. Start inserting vectors via the SDK, CLI, or API

Via code: just use the boto3 S3 Vectors client as shown in the examples above; it's that straightforward.

Pro tip: check out the S3 Vectors Embed CLI on GitHub. It lets you create embeddings and store them in S3 Vectors with single commands. Super handy for testing and quick prototypes.

## Resources & Further Reading 📚

- AWS S3 Vectors Official Docs
- S3 Vectors Features Page
- AWS Blog: Introducing S3 Vectors
- Getting Started Guide
- S3 Vectors Embed CLI on GitHub