Multi-Tenant Design for Bedrock Knowledge Base: Solving the Account Limit with Metadata Filtering
2026-01-01
admin
## Introduction

Recently, while working with Bedrock Knowledge Base in my daily work, I ran into a limitation in its specifications that I'd like to share.

## Background

I'm currently developing a multi-tenant application using Bedrock Knowledge Base (referred to as KB below). Briefly, KB is an orchestrator for implementing RAG with LLMs: it handles vectorizing files into a vector store and, combined with Bedrock Agent, can generate context-aware conversations. We use OpenSearch as our vector store, and our original design created a separate KB and index for each tenant. This ensures data isolation between tenants, so it seemed like a natural design choice at the time.

## The Problem

While checking Bedrock's quotas, I found the "Knowledge bases per account" quota in the Knowledge Bases category. It caps the number of KBs you can create within an account, with a hard limit of 100. With our initial design, this meant the application could support at most 100 tenants, so we needed to reconsider the design.

## Solution

We changed the design so that KBs and indices are shared across multiple tenants. Since a KB has several parameters related to vectorization, such as ChunkStrategy, we created several KBs covering different combinations of ChunkStrategy and MaxToken and let users choose among them. Tenants that choose the same combination share a KB.

An important consideration with this approach is ensuring that one tenant's data is never referenced in another tenant's conversations. KB can attach custom metadata to documents during vectorization, so we attach a tenant_id-style metadata attribute to each document and filter documents by that ID during conversations.
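As a rough illustration of the sharing scheme, the sketch below maps each offered combination of chunking strategy and max tokens to one shared KB, so the number of KBs grows with the number of combinations rather than the number of tenants. The `ChunkingProfile` class, the registry, the KB IDs, and the specific strategy and token values are all hypothetical and only stand in for whatever options an application actually offers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkingProfile:
    """One vectorization option offered to tenants (values are illustrative)."""
    strategy: str
    max_tokens: int


# Hypothetical registry: one shared KB per profile, created ahead of time.
SHARED_KB_IDS = {
    ChunkingProfile("FIXED_SIZE", 300): "KB_FIXED_300",
    ChunkingProfile("FIXED_SIZE", 1000): "KB_FIXED_1000",
    ChunkingProfile("SEMANTIC", 300): "KB_SEMANTIC_300",
}


def resolve_kb_id(profile: ChunkingProfile) -> str:
    """Return the shared KB that serves every tenant on this profile."""
    try:
        return SHARED_KB_IDS[profile]
    except KeyError:
        raise ValueError(f"unsupported chunking profile: {profile}") from None


# 250 hypothetical tenants, each assigned one of the offered profiles.
profiles = list(SHARED_KB_IDS)
tenant_profiles = {
    f"tenant-{i}": profiles[i % len(profiles)] for i in range(250)
}
kb_ids_in_use = {resolve_kb_id(p) for p in tenant_profiles.values()}
print(len(tenant_profiles), "tenants share", len(kb_ids_in_use), "KBs")
```

With a registry like this, adding a tenant never creates a new KB; only adding a new profile does, which keeps the account well under the 100-KB quota.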
https://docs.aws.amazon.com/bedrock/latest/userguide/kb-metadata.html

Here's the conceptual approach:

Architecture:

- Shared Knowledge Base across multiple tenants
- Custom metadata (tenant_id) attached to each document
- Metadata filtering during retrieval to ensure data isolation

Below is sample code for attaching metadata to documents:

```python
# ingest_documents
import boto3

client = boto3.client("bedrock-agent")
tenant_id = "tenant-123"  # placeholder: ID of the tenant that owns the document

response = client.ingest_knowledge_base_documents(
    knowledgeBaseId="string",
    dataSourceId="string",
    clientToken="string",
    documents=[
        {
            # Attach the tenant ID as inline metadata on the document
            "metadata": {
                "type": "IN_LINE_ATTRIBUTE",
                "inlineAttributes": [
                    {
                        "key": "tenant_id",
                        "value": {
                            "type": "STRING",
                            "stringValue": tenant_id,
                        },
                    },
                ],
            },
            "content": { ... },  # document content, elided
        },
    ],
)
```

To filter documents by metadata when conversing with the agent, you can implement it with code like this:

```python
# invoke_agent
import boto3

client = boto3.client("bedrock-agent-runtime")
tenant_id = "tenant-123"  # placeholder: ID of the tenant making the request
knowledge_base_id = "string"  # placeholder: ID of the shared KB

response = client.invoke_agent(
    agentId="string",
    agentAliasId="string",
    sessionId="string",
    inputText="string",
    sessionState={
        "knowledgeBaseConfigurations": [
            {
                "knowledgeBaseId": knowledge_base_id,
                "retrievalConfiguration": {
                    "vectorSearchConfiguration": {
                        # Only retrieve documents tagged with this tenant's ID
                        "filter": {
                            "equals": {"key": "tenant_id", "value": tenant_id}
                        }
                    }
                },
            }
        ]
    },
)
```

With the above implementation, we can build the application without running into the per-account KB limit.

## Future Plans

In multi-tenant applications, it's crucial to keep verifying that one tenant cannot access another tenant's data, and I'm thinking of building a monitoring mechanism for this. For example, I'm considering creating multiple test tenants, inserting different documents for each into the same vector store, asking questions about other tenants' documents, and verifying that no answers are returned. This script could be run regularly in staging environments. While monitoring system resources such as CPU is important, I believe it's equally important to monitor the data itself, to ensure that data which shouldn't exist according to the system's specifications doesn't exist.

## Conclusion

These are the issues we ran into with the KB specifications and our countermeasures. Through this experience, I realized the importance of checking cloud service specifications before deciding on a system design. I hope this article is helpful to you.
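The cross-tenant check described under Future Plans could look something like the sketch below. It takes the retrieval step as a plain function, so the same check can run against a staging KB or, as here, against an in-memory stand-in. The `find_leaks` helper, the two stand-in retrievers, the tenant names, and the canary strings are all hypothetical; a real version would presumably wrap a filtered retrieval call against the shared KB.

```python
from typing import Callable, Dict, List, Tuple

# (tenant_id, query) -> texts retrieved on behalf of that tenant
Retriever = Callable[[str, str], List[str]]


def find_leaks(retrieve: Retriever, canaries: Dict[str, str]) -> List[Tuple[str, str]]:
    """Query as each tenant for every other tenant's canary string.

    Returns (asking_tenant, owning_tenant) pairs where the canary came back.
    """
    leaks = []
    for asker in canaries:
        for owner, canary in canaries.items():
            if owner == asker:
                continue
            results = retrieve(asker, canary)
            if any(canary in text for text in results):
                leaks.append((asker, owner))
    return leaks


# In-memory stand-in for a shared vector store (illustrative data only).
DOCS = [
    {"tenant_id": "tenant-a", "text": "canary-a-7f3: pricing memo"},
    {"tenant_id": "tenant-b", "text": "canary-b-91c: roadmap"},
]
CANARIES = {"tenant-a": "canary-a-7f3", "tenant-b": "canary-b-91c"}


def filtered_retrieve(tenant_id: str, query: str) -> List[str]:
    """Stand-in retriever that applies the tenant_id filter."""
    return [d["text"] for d in DOCS if d["tenant_id"] == tenant_id]


def unfiltered_retrieve(tenant_id: str, query: str) -> List[str]:
    """Stand-in retriever with the filter missing, to show a detected leak."""
    return [d["text"] for d in DOCS]


print(find_leaks(filtered_retrieve, CANARIES))
print(find_leaks(unfiltered_retrieve, CANARIES))
```

Including a deliberately unfiltered retriever is a design choice: it confirms the check itself can fail, so an empty result from the filtered path carries some weight.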