# From Messy Med-Reports to Smart Insights: Building a FHIR-Powered Medical RAG with Milvus


Source: Dev.to

We've all been there: staring at a stack of PDF medical reports, trying to remember whether that one blood test result from three years ago was "normal" or "borderline." In the age of AI, why is our personal health data still trapped in static documents?

Today, we are building a medical-grade personal knowledge base. We aren't just doing basic PDF parsing; we are implementing a full Medical RAG (Retrieval-Augmented Generation) pipeline. By combining the FHIR standard for clinical data interoperability, Milvus for high-performance vector search, and BGE embeddings for semantic precision, we will transform scattered PDFs into a searchable, time-aware health history.

## The Architecture: From Pixels to Structured Knowledge

Handling medical data requires more than a Ctrl+F search. We need to normalize the data so the AI understands that "High BP" and "Hypertension" refer to the same clinical concept. Here is how the data flows from a raw PDF to a queryable insight:

```mermaid
graph TD
    A[Raw PDF Reports] --> B[Unstructured.io Partitioning]
    B --> C{LLM Extraction}
    C -->|Map to Standard| D[FHIR JSON Resources]
    D --> E[BGE Embeddings Model]
    E --> F[(Milvus Vector Database)]
    G[User Query: 'How has my glucose changed?'] --> H[Query Embedding]
    H --> I[Milvus Semantic Search]
    I --> J[Contextual Augmented Response]
    F --> I
```

## Prerequisites

To follow this tutorial, you'll need:

- Python 3.9+
- Unstructured.io: for heavy-duty PDF partitioning.
- Milvus: our vector powerhouse (using Milvus Lite for easy setup).
- BGE Embeddings: specifically `BAAI/bge-small-en-v1.5`, for a great balance of speed and accuracy.
- FHIR (Fast Healthcare Interoperability Resources): the industry standard for health data.

## Step 1: Parsing PDFs with Unstructured.io

Medical reports are notoriously messy: tables, headers, and footnotes everywhere. We use `unstructured` to extract clean text while maintaining document hierarchy.

```python
from unstructured.partition.pdf import partition_pdf

# Partition the PDF into manageable elements
elements = partition_pdf(
    filename="blood_test_report_2023.pdf",
    infer_table_structure=True,
    strategy="hi_res",
)

# Join the text for downstream processing
raw_text = "\n\n".join(str(el) for el in elements)
print(f"Extracted {len(elements)} elements from the document.")
```

## Step 2: Normalizing to the FHIR Standard

Standardizing data is the "secret sauce" of medical AI. Using the FHIR standard ensures that our RAG system understands clinical context. Instead of storing "Blood Sugar: 110", we store an `Observation` resource.

```python
import json

# A simplified FHIR Observation resource
fhir_observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {
        "coding": [
            {"system": "http://loinc.org", "code": "2339-0", "display": "Glucose"}
        ]
    },
    "valueQuantity": {"value": 110, "unit": "mg/dL"},
    "effectiveDateTime": "2023-10-27T10:00:00Z",
}

# In a real app, an LLM would map the 'raw_text' to this JSON structure
```

## Step 3: Vectorizing with BGE & Storing in Milvus

Now we need to make this data searchable. We use BGE embeddings to turn our FHIR resources into vectors and store them in Milvus.

```python
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

# Initialize the BGE embedding model
model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# Milvus Lite: a local, file-backed instance
client = MilvusClient("health_history.db")

# Create a collection in Milvus
client.create_collection(
    collection_name="medical_records",
    dimension=384,  # output dimension of bge-small
)

# Convert the FHIR resource to a string and embed it
data_str = json.dumps(fhir_observation)
vector = model.encode(data_str)

# Insert into Milvus
client.insert(
    collection_name="medical_records",
    data=[{"id": 1, "vector": vector, "text": data_str}],
)
```

## Advanced Patterns & Production Readiness

While this setup works for a personal project, building a production-grade medical system involves HIPAA compliance, complex FHIR terminology mapping (SNOMED CT, LOINC), and handling multi-modal data (like X-rays). For deeper dives into advanced medical data patterns and building more robust healthcare AI, I highly recommend checking out the technical deep-dives at the WellAlly Blog. They cover production-ready RAG architectures specifically tuned for data privacy and clinical accuracy.

Some directions worth exploring from here:

- Multi-document tracking: compare results across different years.
- Agentic RAG: let the AI suggest when you should schedule your next check-up based on the data.

## Step 4: Semantic Retrieval

Finally, we can query our knowledge base. Instead of keyword matching, we search for the meaning of the query.

```python
query = "What were my last blood sugar readings?"
query_vector = model.encode(query)

results = client.search(
    collection_name="medical_records",
    data=[query_vector],
    limit=3,
    output_fields=["text"],
)

for res in results[0]:
    print(f"Found Record: {res['entity']['text']}")
```

## Conclusion: The Future of Personal Health

By combining FHIR, Milvus, and BGE, we've moved from "dumb" PDFs to a structured, semantically searchable medical knowledge base. This is the foundation for an AI health assistant that can truly track your longitudinal health history.

Are you building something in the Medical AI space? Drop a comment below or share your thoughts on health data privacy! If you found this tutorial helpful, don't forget to ❤️ and 🦄! For more high-level architectural patterns, visit the WellAlly Blog.
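**Appendix: from retrieved records to an augmented response.** The last hop in the architecture diagram, turning Milvus hits into a "Contextual Augmented Response", is left implicit in the walkthrough. Below is a minimal sketch of that step: the `build_augmented_prompt` helper is hypothetical (not part of any library), and the hard-coded hit is a stand-in that mirrors the shape of the `client.search` results; in practice you would pass `results[0]` and send the prompt to your LLM of choice.

```python
import json

def build_augmented_prompt(question: str, hits: list) -> str:
    """Format retrieved FHIR Observation JSON strings into an LLM prompt.

    `hits` is assumed to mirror the dicts Milvus returns in results[0],
    i.e. each one carries the stored text under hit["entity"]["text"].
    """
    context_lines = []
    for hit in hits:
        obs = json.loads(hit["entity"]["text"])
        display = obs["code"]["coding"][0]["display"]
        qty = obs["valueQuantity"]
        context_lines.append(
            f"- {obs['effectiveDateTime']}: {display} {qty['value']} {qty['unit']}"
        )
    context = "\n".join(context_lines)
    return (
        "You are a careful health assistant. Using ONLY the records below, "
        f"answer the question.\n\nRecords:\n{context}\n\n"
        f"Question: {question}"
    )

# Stand-in for one search hit (same FHIR Observation as in Step 2)
hit = {"entity": {"text": json.dumps({
    "resourceType": "Observation",
    "status": "final",
    "code": {"coding": [{"system": "http://loinc.org",
                         "code": "2339-0", "display": "Glucose"}]},
    "valueQuantity": {"value": 110, "unit": "mg/dL"},
    "effectiveDateTime": "2023-10-27T10:00:00Z",
})}}

prompt = build_augmented_prompt("How has my glucose changed?", [hit])
print(prompt)
```

Keeping the prompt grounded in the retrieved FHIR records (rather than letting the model free-associate) is what makes the "R" in RAG pull its weight for clinical accuracy.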