
Tools: I Built a Production RAG System on Azure AKS for $40/month — 's...

2026-02-09

Posted on Feb 9

• Originally published at github.com

A cloud architect's opinionated walkthrough: from blank terminal to 13 pods serving AI-powered answers, with cost breakdowns you can actually verify.

Last month, I set out to build something specific: a Retrieval-Augmented Generation system that could run on Azure Kubernetes Service — not as a proof-of-concept that lives in a Jupyter notebook, but as a real, deployable platform with ingestion pipelines, caching, observability, and a chat interface. The kind of system you'd hand to a team and say "here, extend this."

The constraint I gave myself was equally specific: keep the monthly bill under $50.

This article walks through what I built, the trade-offs I navigated, and the decisions I'd make differently if I were doing it again. If you're evaluating RAG architectures on Azure, this should save you a few weeks of trial and error.

All of this runs on a single Azure Kubernetes Service node.

Rather than describe the architecture in prose, here's the full cloud topology:

Cloud architecture: Azure managed services on the left, AKS cluster with 13 pods across 4 namespaces on the right.
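A quick way to check the "13 pods across 4 namespaces" figure against a live cluster is to count pods per namespace. The sketch below is illustrative only: it assumes `kubectl` is already pointed at the cluster, and the helper names are hypothetical, not taken from the original repository.

```python
import json
import subprocess
from collections import Counter


def pods_per_namespace(pod_list_json: str) -> Counter:
    """Count pods per namespace from `kubectl get pods -o json` output."""
    pod_list = json.loads(pod_list_json)
    # A Kubernetes PodList keeps its pods under "items"; each pod
    # records its namespace under metadata.namespace.
    return Counter(
        item["metadata"]["namespace"] for item in pod_list.get("items", [])
    )


def fetch_pod_list_json() -> str:
    """Hypothetical helper: shell out to kubectl for all namespaces."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `pods_per_namespace(fetch_pod_list_json())` returns a `Counter` keyed by namespace. Summing its values gives the total pod count, which will also include system pods in `kube-system` unless you filter them out.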

Every component is deployed via Helm. Every Azure resource is provisioned via Terraform. The entire system goes from az login to serving queries in about 12 minutes.
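The article does not list the exact commands behind that 12-minute path, so the following is only a sketch of what an `az login` to Helm sequence could look like. The step order, the resource group, cluster, release, chart path, and namespace are all assumptions made for illustration; only the CLI subcommands and flags themselves are standard.

```python
import shlex
import subprocess
from typing import List


def bootstrap_commands(
    resource_group: str,
    cluster: str,
    release: str,
    chart: str,
    namespace: str,
) -> List[List[str]]:
    """Build an assumed command sequence: authenticate, provision, deploy."""
    return [
        ["az", "login"],
        # Terraform provisions the Azure resources (assumed to run
        # from the directory holding the .tf files).
        ["terraform", "init"],
        ["terraform", "apply", "-auto-approve"],
        # Merge the new cluster's credentials into the local kubeconfig.
        ["az", "aks", "get-credentials",
         "--resource-group", resource_group, "--name", cluster],
        # Install the chart, or upgrade it if the release already exists.
        ["helm", "upgrade", "--install", release, chart,
         "--namespace", namespace, "--create-namespace"],
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

With the default `dry_run=True`, `run(bootstrap_commands("rg-demo", "aks-demo", "rag", "./charts/rag", "rag"))` only prints the plan, which makes it easy to review before anything touches a subscription. A real setup with several charts would repeat the Helm step once per release.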

Architecture diagrams are nice. But the real value is in why you chose one path over another. Here are the decisions I spent the most time on — and the reasoning I'd present to a team or a hiring manager.

Source: Dev.to
