# Semantic Kernel Memory: Vector Stores, Embeddings, and Semantic Search
2026-02-27
admin
## Why Memory Matters

LLMs have a fundamental limitation: they're stateless. Every request starts fresh, with no memory of previous conversations or of your organization's knowledge. This is where Semantic Kernel's memory system comes in, transforming raw text into searchable vector embeddings that give your AI persistent, semantic understanding. In Part 2, we explored plugins. Now we'll dive deep into the memory layer that powers intelligent retrieval.

Consider a customer support bot. Without memory, it can't recall earlier conversations or draw on your organization's knowledge base. With Semantic Kernel memory, you transform unstructured text into vector embeddings: numerical representations that capture semantic meaning. Similar concepts cluster together in vector space, enabling semantic search that understands intent, not just keywords.

## Understanding Embeddings

Before diving into code, let's understand what's happening under the hood. When you send text to an embedding model, it returns a high-dimensional vector (typically 1536 or 3072 dimensions). These vectors have a remarkable property: semantically similar texts produce similar vectors.
```csharp
var embeddingService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();

var texts = new[]
{
    "The cat sat on the mat",
    "A feline rested on the rug",
    "The stock market crashed today",
    "Dogs are loyal companions"
};

var embeddings = await embeddingService.GenerateEmbeddingsAsync(texts);

// Calculate cosine similarity between vectors
float CosineSimilarity(ReadOnlyMemory<float> a, ReadOnlyMemory<float> b)
{
    var spanA = a.Span;
    var spanB = b.Span;
    float dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < spanA.Length; i++)
    {
        dot += spanA[i] * spanB[i];
        normA += spanA[i] * spanA[i];
        normB += spanB[i] * spanB[i];
    }
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}

// "cat on mat" vs "feline on rug" → ~0.92 (very similar!)
// "cat on mat" vs "stock market" → ~0.31 (unrelated)
Console.WriteLine($"Cat/Feline similarity: {CosineSimilarity(embeddings[0], embeddings[1]):F2}");
Console.WriteLine($"Cat/Market similarity: {CosineSimilarity(embeddings[0], embeddings[2]):F2}");
```
## Setting Up Memory with ISemanticTextMemory

The ISemanticTextMemory interface provides a simple abstraction for storing and searching memories:
```csharp
using Microsoft.SemanticKernel.Memory;
using Microsoft.SemanticKernel.Connectors.AzureOpenAI;

// Build the memory system
var memoryBuilder = new MemoryBuilder();

// Add embedding generation
memoryBuilder.WithAzureOpenAITextEmbeddingGeneration(
    deploymentName: "text-embedding-3-large",
    endpoint: config["AzureOpenAI:Endpoint"]!,
    apiKey: config["AzureOpenAI:Key"]!);

// Add a memory store (we'll explore options below)
memoryBuilder.WithMemoryStore(new VolatileMemoryStore());

var memory = memoryBuilder.Build();
```
## Storing Memories

With the memory built, persist entries with SaveInformationAsync:
```csharp
// Store individual memories
await memory.SaveInformationAsync(
    collection: "company-policies",
    id: "refund-policy",
    text: "Customers may return any item within 30 days of purchase for a full refund. " +
          "Items must be unused and in original packaging. Digital products are non-refundable.",
    description: "Company refund and return policy",
    additionalMetadata: "department=customer-service,version=2024-01,priority=high");

await memory.SaveInformationAsync(
    collection: "company-policies",
    id: "shipping-policy",
    text: "Free shipping on orders over $50. Standard shipping takes 5-7 business days. " +
          "Express shipping (2-3 days) available for $9.99. Overnight shipping $24.99.",
    description: "Shipping rates and timeframes",
    additionalMetadata: "department=logistics,version=2024-01");

await memory.SaveInformationAsync(
    collection: "company-policies",
    id: "warranty-policy",
    text: "All electronics come with a 1-year manufacturer warranty. Extended warranties " +
          "available for purchase. Warranty covers defects, not accidental damage.",
    description: "Product warranty information",
    additionalMetadata: "department=support,version=2024-01");
```
## Semantic Search

Now the magic: semantic search that understands meaning, not just keywords:
```csharp
// Search for relevant information
var searchResults = memory.SearchAsync(
    collection: "company-policies",
    query: "How long do I have to return something?", // Note: doesn't contain "refund"
    limit: 3,
    minRelevanceScore: 0.7);

await foreach (var result in searchResults)
{
    Console.WriteLine($"[{result.Relevance:P0}] {result.Metadata.Description}");
    Console.WriteLine($"   {result.Metadata.Text}");
    Console.WriteLine($"   Metadata: {result.Metadata.AdditionalMetadata}");
    Console.WriteLine();
}

// Output:
// [94%] Company refund and return policy
//    Customers may return any item within 30 days of purchase...
//    Metadata: department=customer-service,version=2024-01,priority=high
```
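Under the hood, SearchAsync embeds the query and ranks stored vectors by cosine similarity, discarding anything below minRelevanceScore. A minimal, self-contained sketch of that ranking step (plain C#, no Semantic Kernel dependency; the three-dimensional vectors and record names are made up for illustration, real embeddings have 1536+ dimensions):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical pre-computed embeddings keyed by record id
var store = new Dictionary<string, float[]>
{
    ["refund-policy"]   = new[] { 0.9f, 0.1f, 0.0f },
    ["shipping-policy"] = new[] { 0.2f, 0.9f, 0.1f },
    ["warranty-policy"] = new[] { 0.1f, 0.2f, 0.9f },
};

// Pretend this is the embedding of "How long do I have to return something?"
float[] query = { 0.8f, 0.2f, 0.1f };

float Cosine(float[] a, float[] b)
{
    float dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(na) * MathF.Sqrt(nb));
}

// Rank by similarity, keep only results above the relevance threshold
var hits = store
    .Select(kv => (Id: kv.Key, Score: Cosine(query, kv.Value)))
    .Where(h => h.Score >= 0.7f)
    .OrderByDescending(h => h.Score)
    .Take(3)
    .ToList();

foreach (var (id, score) in hits)
    Console.WriteLine($"[{score:P0}] {id}");
```

Only "refund-policy" survives the 0.7 cutoff here, which mirrors how the real search drops unrelated records.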
## Memory Store Implementations

Semantic Kernel supports multiple vector stores. Choose based on your requirements.

## VolatileMemoryStore (Development)

In-memory storage: fast but ephemeral.
```csharp
var store = new VolatileMemoryStore();
memoryBuilder.WithMemoryStore(store);

// Great for testing and prototyping
// Data lost when process ends
```
## Azure AI Search (Production Recommended)

Enterprise-grade with hybrid search (vector + keyword):
```csharp
using Microsoft.SemanticKernel.Connectors.AzureAISearch;

var store = new AzureAISearchMemoryStore(
    endpoint: config["AzureSearch:Endpoint"]!,
    apiKey: config["AzureSearch:Key"]!);

memoryBuilder.WithMemoryStore(store);

// Features:
// - Hybrid search (vector + BM25 keyword)
// - Semantic ranking
// - Faceted filtering
// - Geo-spatial queries
// - Enterprise security (RBAC, private endpoints)
```
## Qdrant (Self-Hosted Performance)

A high-performance, open-source vector database:
```csharp
using Microsoft.SemanticKernel.Connectors.Qdrant;

var store = new QdrantMemoryStore(
    host: "localhost",
    port: 6333,
    vectorSize: 3072); // Match your embedding model

memoryBuilder.WithMemoryStore(store);

// Features:
// - Excellent filtering capabilities
// - Horizontal scaling
// - Snapshot backups
// - gRPC for performance
```
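The vectorSize must match the dimensionality of your embedding model, or inserts and searches will fail. A small lookup sketch (the dimensions are the published defaults for OpenAI's embedding models; the helper function itself is hypothetical):

```csharp
using System;
using System.Collections.Generic;

// Default output dimensions for common OpenAI embedding models
var modelDimensions = new Dictionary<string, int>
{
    ["text-embedding-3-large"] = 3072,
    ["text-embedding-3-small"] = 1536,
    ["text-embedding-ada-002"] = 1536,
};

int VectorSizeFor(string model) =>
    modelDimensions.TryGetValue(model, out var dims)
        ? dims
        : throw new ArgumentException($"Unknown embedding model: {model}");

Console.WriteLine(VectorSizeFor("text-embedding-3-large")); // 3072
```

Deriving vectorSize from the model name keeps the store configuration and the embedding deployment from drifting apart.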
## PostgreSQL with pgvector

Use your existing Postgres infrastructure:
```csharp
using Microsoft.SemanticKernel.Connectors.Postgres;
using Npgsql;

var dataSource = NpgsqlDataSource.Create(config.GetConnectionString("Postgres")!);
var store = new PostgresMemoryStore(dataSource, vectorSize: 3072);

memoryBuilder.WithMemoryStore(store);

// Features:
// - Familiar SQL ecosystem
// - ACID transactions
// - Complex queries combining vectors with relational data
// - Cost-effective for smaller datasets
```
## Redis (Low-Latency Cache)

When milliseconds matter:
```csharp
using Microsoft.SemanticKernel.Connectors.Redis;
using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect(config["Redis:Connection"]!);
var store = new RedisMemoryStore(redis.GetDatabase(), vectorSize: 3072);

memoryBuilder.WithMemoryStore(store);

// Features:
// - Sub-millisecond latency
// - Automatic expiration (TTL)
// - Cluster support
// - Good for session/conversation memory
```
## Comparison Matrix

| Store | Persistence | Best for |
|---|---|---|
| VolatileMemoryStore | In-memory, lost on exit | Testing and prototyping |
| Azure AI Search | Managed cloud service | Production; hybrid search, semantic ranking, enterprise security |
| Qdrant | Self-hosted | High-performance workloads; rich filtering, horizontal scaling |
| PostgreSQL + pgvector | Self-hosted SQL | Existing Postgres infrastructure; vector + relational queries |
| Redis | In-memory, persistence options | Session/conversation memory; sub-millisecond latency, TTL |

## Memory Records Deep Dive

Each memory record contains:
```csharp
public class MemoryRecord
{
    // Core data
    public string Id { get; }                       // Unique identifier
    public string Text { get; }                     // Original text content
    public ReadOnlyMemory<float> Embedding { get; } // Vector representation

    // Metadata
    public MemoryRecordMetadata Metadata { get; }
}

public class MemoryRecordMetadata
{
    public string Id { get; }
    public string Text { get; }
    public string Description { get; }        // Human-readable description
    public string AdditionalMetadata { get; } // Custom key=value pairs
    public string ExternalSourceName { get; } // Source system reference
    public bool IsReference { get; }          // Is this a reference to external data?
}
```
## Working with Metadata

Use metadata for filtering and organization:
```csharp
// Store with rich metadata
await memory.SaveInformationAsync(
    collection: "support-tickets",
    id: $"ticket-{ticket.Id}",
    text: $"Issue: {ticket.Title}\n\nDescription: {ticket.Description}\n\nResolution: {ticket.Resolution}",
    description: $"Resolved ticket: {ticket.Title}",
    additionalMetadata: $"category={ticket.Category},priority={ticket.Priority}," +
                        $"resolved_date={ticket.ResolvedAt:yyyy-MM-dd},agent={ticket.AgentId}");

// Later, search and filter
var results = memory.SearchAsync(
    "support-tickets", "customer can't login", limit: 10, minRelevanceScore: 0.75);

await foreach (var result in results)
{
    // Parse metadata for filtering
    var metadata = result.Metadata.AdditionalMetadata
        .Split(',')
        .Select(p => p.Split('='))
        .ToDictionary(p => p[0], p => p[1]);

    if (metadata["category"] == "authentication")
    {
        Console.WriteLine($"Relevant auth ticket: {result.Metadata.Description}");
    }
}
```
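Since AdditionalMetadata is a flat "key=value,key=value" string, the inline parsing above can be factored into a small reusable helper. A self-contained sketch (plain C#, no Semantic Kernel dependency; ParseMetadata is a hypothetical name):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Parse a "key=value,key=value" metadata string into a dictionary
static Dictionary<string, string> ParseMetadata(string metadata) =>
    metadata
        .Split(',', StringSplitOptions.RemoveEmptyEntries)
        .Select(pair => pair.Split('=', 2)) // limit to 2 so values may contain '='
        .Where(parts => parts.Length == 2)
        .ToDictionary(parts => parts[0].Trim(), parts => parts[1].Trim());

var tags = ParseMetadata("category=authentication,priority=high,resolved_date=2024-01-15");
Console.WriteLine(tags["category"]); // authentication
Console.WriteLine(tags.Count);       // 3
```

Centralizing the parsing also gives you one place to handle malformed pairs instead of indexing blindly into split results.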
## Kernel Memory: Production RAG

For production RAG pipelines, Microsoft.KernelMemory provides a more robust solution:
```csharp
using Microsoft.KernelMemory;

var kernelMemory = new KernelMemoryBuilder()
    // LLM for summarization and answer generation
    .WithAzureOpenAITextGeneration(new AzureOpenAIConfig
    {
        Deployment = "gpt-4o",
        Endpoint = config["AzureOpenAI:Endpoint"]!,
        APIKey = config["AzureOpenAI:Key"]!,
        APIType = AzureOpenAIConfig.APITypes.ChatCompletion
    })
    // Embedding model
    .WithAzureOpenAITextEmbeddingGeneration(new AzureOpenAIConfig
    {
        Deployment = "text-embedding-3-large",
        Endpoint = config["AzureOpenAI:Endpoint"]!,
        APIKey = config["AzureOpenAI:Key"]!,
        APIType = AzureOpenAIConfig.APITypes.EmbeddingGeneration
    })
    // Vector storage
    .WithAzureAISearchMemoryDb(new AzureAISearchConfig
    {
        Endpoint = config["AzureSearch:Endpoint"]!,
        APIKey = config["AzureSearch:Key"]!
    })
    .Build<MemoryServerless>();
```
## Importing Documents

Kernel Memory handles document processing automatically:
```csharp
// Import a PDF
await kernelMemory.ImportDocumentAsync(
    filePath: "docs/product-manual.pdf",
    documentId: "manual-v2.1",
    tags: new TagCollection
    {
        { "product", "widget-pro" },
        { "version", "2.1" },
        { "type", "manual" }
    });

// Import a web page
await kernelMemory.ImportWebPageAsync(
    url: "https://docs.company.com/api-reference",
    documentId: "api-docs",
    tags: new TagCollection { { "type", "api-documentation" } });

// Import text directly
await kernelMemory.ImportTextAsync(
    text: "Our support hours are Monday-Friday 9am-5pm EST. " +
          "Emergency support available 24/7 for enterprise customers.",
    documentId: "support-hours",
    tags: new TagCollection { { "type", "policy" }, { "department", "support" } });

// Check import status (async processing)
while (!await kernelMemory.IsDocumentReadyAsync("manual-v2.1"))
{
    await Task.Delay(1000);
    Console.WriteLine("Processing document...");
}
Console.WriteLine("Document ready for queries!");
```
## Asking Questions with Citations

AskAsync returns a generated answer along with the source passages it drew from:
```csharp
var answer = await kernelMemory.AskAsync(
    question: "What are the safety warnings for the Widget Pro?",
    filters: new MemoryFilters().ByTag("product", "widget-pro"));

Console.WriteLine($"Answer: {answer.Result}");
Console.WriteLine("\nSources:");

foreach (var citation in answer.RelevantSources)
{
    Console.WriteLine($"  📄 {citation.SourceName}");
    foreach (var partition in citation.Partitions)
    {
        Console.WriteLine($"     Page {partition.PageNumber}: \"{partition.Text[..Math.Min(100, partition.Text.Length)]}...\"");
        Console.WriteLine($"     Relevance: {partition.Relevance:P0}");
    }
}

// Output:
// Answer: The Widget Pro has the following safety warnings: 1) Do not operate near water...
//
// Sources:
//   📄 product-manual.pdf
//      Page 15: "SAFETY WARNINGS: Do not operate the Widget Pro near water or in humid..."
//      Relevance: 94%
```
## Filtering by Tags

Tag filters scope a question to a subset of your indexed content:
// Only search enterprise documentation
var enterpriseAnswer = await kernelMemory.AskAsync( question: "How do I configure SSO?", filters: new MemoryFilters() .ByTag("audience", "enterprise") .ByTag("type", "configuration")); // Search across specific document versions
var v2Answer = await kernelMemory.AskAsync( question: "What's new in this version?", filters: new MemoryFilters() .ByTag("version", "2.0") .ByTag("version", "2.1")); // OR logic for same tag // Exclude certain content
var publicAnswer = await kernelMemory.AskAsync( question: "What are your pricing tiers?", filters: new MemoryFilters() .ByTag("visibility", "public")); // No internal docs Enter fullscreen mode Exit fullscreen mode CODE_BLOCK:
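Tag filters can only match tags that were attached when a document was imported. A hedged sketch of tagging at import time using Kernel Memory's `Document` builder; the document id and file name here are illustrative, and the builder API has shifted between `Microsoft.KernelMemory` releases, so verify against the version you're on:

```csharp
// Assumes an IKernelMemory instance named kernelMemory, configured elsewhere.
// "widget-pro-manual" and "product-manual.pdf" are hypothetical values.
await kernelMemory.ImportDocumentAsync(
    new Document("widget-pro-manual")
        .AddFile("product-manual.pdf")
        .AddTag("product", "widget-pro")
        .AddTag("audience", "enterprise")
        .AddTag("visibility", "public"));
```

A document can carry multiple values for the same tag, which is what makes the OR-style version filter above possible.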
```csharp
public class ConversationalMemoryService
{
    private readonly ISemanticTextMemory _longTermMemory;
    private readonly IMemoryStore _shortTermStore;
    private readonly Kernel _kernel;

    public async Task<string> ProcessMessageAsync(
        string conversationId,
        string userMessage)
    {
        // 1. Store the user message in short-term memory
        await _shortTermStore.UpsertAsync(
            collection: $"conversation-{conversationId}",
            record: MemoryRecord.LocalRecord(
                id: Guid.NewGuid().ToString(),
                text: $"User: {userMessage}",
                embedding: await GenerateEmbeddingAsync(userMessage)));

        // 2. Search long-term memory for relevant context
        var relevantKnowledge = await _longTermMemory
            .SearchAsync("knowledge-base", userMessage,
                limit: 3, minRelevanceScore: 0.75)
            .ToListAsync();

        // 3. Get recent conversation history
        var recentHistory = await GetRecentConversationAsync(conversationId, limit: 10);

        // 4. Build the prompt with both memory types
        var prompt = $"""
            You are a helpful assistant. Use the following context to answer the user's question.

            ## Relevant Knowledge:
            {string.Join("\n\n", relevantKnowledge.Select(r => r.Metadata.Text))}

            ## Recent Conversation:
            {string.Join("\n", recentHistory)}

            ## Current Question:
            {userMessage}

            Answer:
            """;

        var response = await _kernel.InvokePromptAsync<string>(prompt);

        // 5. Store the response in short-term memory
        await _shortTermStore.UpsertAsync(
            collection: $"conversation-{conversationId}",
            record: MemoryRecord.LocalRecord(
                id: Guid.NewGuid().ToString(),
                text: $"Assistant: {response}",
                embedding: await GenerateEmbeddingAsync(response!)));

        return response!;
    }
}
```
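The service above calls a `GetRecentConversationAsync` helper that isn't shown. One possible sketch, living inside the same class and mirroring the `GetBatchAsync(collection, limit)` usage from these examples; note it assumes record ids sort in insertion order, which the random GUIDs above do not guarantee, so in practice you'd store a timestamp in the record metadata and order by that:

```csharp
// Hypothetical helper: fetch the last N turns of a conversation.
private async Task<List<string>> GetRecentConversationAsync(
    string conversationId, int limit)
{
    var records = await _shortTermStore
        .GetBatchAsync($"conversation-{conversationId}", limit: int.MaxValue)
        .ToListAsync();

    return records
        .OrderBy(r => r.Metadata.Id)   // assumes time-sortable ids (e.g. ULIDs)
        .TakeLast(limit)
        .Select(r => r.Metadata.Text)  // "User: ..." / "Assistant: ..." lines
        .ToList();
}
```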
```csharp
public class MemoryMaintenanceService
{
    private readonly IMemoryStore _store;

    // Remove outdated memories
    public async Task PruneOldMemoriesAsync(string collection, TimeSpan maxAge)
    {
        var cutoff = DateTime.UtcNow - maxAge;
        var allRecords = await _store.GetBatchAsync(collection, limit: int.MaxValue).ToListAsync();

        foreach (var record in allRecords)
        {
            var metadata = ParseMetadata(record.Metadata.AdditionalMetadata);
            if (metadata.TryGetValue("created_at", out var createdStr) &&
                DateTime.Parse(createdStr) < cutoff)
            {
                await _store.RemoveAsync(collection, record.Metadata.Id);
            }
        }
    }

    // Re-embed memories with a new model
    public async Task ReembedCollectionAsync(
        string collection,
        ITextEmbeddingGenerationService oldService,
        ITextEmbeddingGenerationService newService)
    {
        var allRecords = await _store.GetBatchAsync(collection, limit: int.MaxValue).ToListAsync();

        foreach (var batch in allRecords.Chunk(100))
        {
            var texts = batch.Select(r => r.Metadata.Text).ToArray();
            var newEmbeddings = await newService.GenerateEmbeddingsAsync(texts);

            for (int i = 0; i < batch.Length; i++)
            {
                var updated = MemoryRecord.LocalRecord(
                    id: batch[i].Metadata.Id,
                    text: batch[i].Metadata.Text,
                    description: batch[i].Metadata.Description,
                    embedding: newEmbeddings[i],
                    additionalMetadata: batch[i].Metadata.AdditionalMetadata);

                await _store.UpsertAsync(collection, updated);
            }
        }
    }
}
```
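These maintenance jobs fit naturally into a scheduled background task. A minimal sketch using .NET's `BackgroundService` from `Microsoft.Extensions.Hosting`; the collection name, retention window, and daily interval are illustrative choices, not part of the API:

```csharp
// Hypothetical hosted service that prunes old conversation snippets once a day.
public class MemoryJanitor : BackgroundService
{
    private readonly MemoryMaintenanceService _maintenance;

    public MemoryJanitor(MemoryMaintenanceService maintenance)
        => _maintenance = maintenance;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            // Drop conversation memories older than 30 days.
            await _maintenance.PruneOldMemoriesAsync(
                "conversation-history", TimeSpan.FromDays(30));

            // Task.Delay throws on cancellation, ending the loop cleanly.
            await Task.Delay(TimeSpan.FromDays(1), stoppingToken);
        }
    }
}
```

Register it with `services.AddHostedService<MemoryJanitor>()` so the host starts and stops it with the application.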
- Remember what the customer said 5 messages ago
- Access your product documentation
- Know your company's policies
- Learn from resolved tickets

- Embeddings: Transforming text into searchable vectors
- ISemanticTextMemory: The abstraction for storing and searching
- Memory Stores: Azure AI Search, Qdrant, PostgreSQL, Redis
- Kernel Memory: Production-grade document processing with citations
- Conversational Memory: Combining short-term and long-term context
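The embeddings idea in that first bullet comes down to one formula: cosine similarity, the comparison most vector stores use under the hood. A self-contained toy sketch; the three-dimensional vectors are made-up values (real embeddings have 1536+ dimensions):

```csharp
using System;

public static class CosineDemo
{
    // Cosine similarity: dot product divided by the product of magnitudes.
    // Ranges from -1 (opposite) through 0 (unrelated) to 1 (identical direction).
    public static double Cosine(double[] a, double[] b)
    {
        double dot = 0, magA = 0, magB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            magA += a[i] * a[i];
            magB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(magA) * Math.Sqrt(magB));
    }

    public static void Main()
    {
        double[] cat = { 0.9, 0.1, 0.2 };       // toy "cat" vector
        double[] kitten = { 0.85, 0.15, 0.25 }; // toy "kitten" vector
        double[] invoice = { 0.1, 0.9, 0.6 };   // toy "invoice" vector

        Console.WriteLine($"cat vs kitten:  {Cosine(cat, kitten):F3}");  // ≈ 0.996
        Console.WriteLine($"cat vs invoice: {Cosine(cat, invoice):F3}"); // ≈ 0.298
    }
}
```

Related texts land close together in vector space, which is exactly why a search for "kitten" can surface "cat" content that keyword search would miss.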