Tools: How I deployed a RAG engine to production with Docker, Nginx and DigitalOcean

The context

The infrastructure I chose

Docker Compose: dev vs. production

Development

Production: the differences that matter

The key differences

Dockerfiles: multi-stage builds

Backend: pre-download the models

Frontend: build + static Nginx

The frontend's internal Nginx (SPA)

Nginx: the reverse proxy that ties it all together

The SSE block that almost broke everything

The deploy script

Why maintenance mode?

The backend's start script

Automated maintenance with cron

Cloudflare + SSL: the configuration

Bonus: Cloudflare as a free CDN

RAM distribution

Security checklist

Real numbers

Lessons learned

1. Docker + firewall = be careful

2. Pre-download ML models at build time

3. One Uvicorn worker is enough (if it's async)

4. Maintenance mode > zero-downtime rolling deploys

5. PostgreSQL defaults assume a dedicated server

6. Cloudflare origin certs > Let's Encrypt

What's next

I deployed a complete RAG engine (FastAPI + PostgreSQL + pgvector + Redis) on a 4 GB VPS for $24/month. In this article I share the real deployment architecture: Docker multi-stage builds, PostgreSQL tuned for limited resources, Nginx as a reverse proxy with SSE support, maintenance-mode deploys, automatic backups, and monitoring with cron.

The context

In the previous article I built a production RAG pipeline with hybrid search, cross-encoder reranking, and a semantic cache. Everything worked perfectly in local Docker. The problem: getting it into production on a budget VPS without it blowing up. A RAG system is not a typical CRUD app.

The infrastructure I chose

Why not Kubernetes? Because for a single VPS it is overkill. Docker Compose with restart policies and health checks covers 95% of what you need for a web service with a few thousand users.

Development

Nothing surprising: exposed ports, volumes for hot reload, basic health checks.

The key differences

1. Ports bound to localhost only. If you publish a port without 127.0.0.1, Docker rewrites iptables and bypasses the system firewall. It is a classic mistake.

2. PostgreSQL tuned for 4 GB. PostgreSQL's defaults assume a dedicated server with 1 GB+ of RAM just for PG. On a VPS shared with four other services, you need to be conservative.

3. Redis with a strict limit. Redis without maxmemory can grow indefinitely and wake up the OOM killer. With allkeys-lru, when it hits the limit it evicts the least-used keys instead of returning errors.

4. Memory limits on the backend. With the embedding models loaded, the backend uses ~800 MB-1.2 GB. The 1536 MB limit gives it headroom without letting a memory leak eat the whole VPS.

Backend: pre-download the models

Why pre-download models at build time? If you don't, the first request after every deploy takes 30-60 seconds while the models download. With pre-download, the container starts ready to serve. Why CPU-only PyTorch? The CUDA build weighs ~2 GB more; on a VPS without a GPU, that is dead weight.
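To see why 1536 MB is a comfortable ceiling, a back-of-envelope budget helps. The breakdown below is illustrative: the individual component figures are my assumptions, only the ~1.2 GB peak and the ~500 MB model footprint come from the measurements above.

```python
# Hypothetical memory budget for the backend's 1536 MB container limit.
# Component figures are assumptions; only the ~1.2 GB peak total and the
# ~500 MB model footprint are quoted in the article.
base_app_mb = 300          # Python runtime + FastAPI + libraries (assumed)
embedding_models_mb = 500  # sentence-transformers + cross-encoder copies
working_set_mb = 400       # request buffers, tokenization, spikes (assumed)

expected_peak_mb = base_app_mb + embedding_models_mb + working_set_mb
limit_mb = 1536

print(f"expected peak ~{expected_peak_mb} MB, "
      f"headroom {limit_mb - expected_peak_mb} MB before the limit")
```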
Frontend: build + static Nginx

In production the frontend does NOT run Vite. It is Nginx serving static files. The image goes from ~400 MB (node + deps) to ~25 MB (nginx alpine + dist).

The SSE block that almost broke everything

Without these headers, Nginx buffers the SSE tokens and sends them all at once at the end. The user sees "nothing, nothing, nothing... then the full text in one burst". These six parameters are mandatory for real streaming through a reverse proxy.

Why maintenance mode?

The backend takes ~15 seconds to start: it runs Alembic migrations and then Uvicorn loads the embedding models. Without maintenance mode, Nginx returns 502 Bad Gateway during those 15 seconds. With the /etc/nginx/maintenance.on file in place, Nginx returns a styled 503 page instead. The health check (/api/v1/health) stays exempt so the deploy script can verify when the backend is ready.

The backend's start script

Why one worker? Each Uvicorn worker loads its own copy of the embedding models (~500 MB). With two workers you are already at 1.3 GB for the backend alone. On a 4 GB VPS that also runs PostgreSQL, Redis, and Nginx, one worker is the safe choice. FastAPI is async, so a single worker handles concurrency well: I/O operations (DB, LLM API, Redis) don't block the event loop.

Automated maintenance with cron

This runs every Sunday at 4 a.m. It cleans up Docker garbage, checks disk and RAM, and logs everything. Simple but effective: it has saved me twice from running out of disk because of accumulated Docker images.

Cloudflare + SSL: the configuration

Why not Let's Encrypt? With the Cloudflare proxy enabled, Let's Encrypt cannot verify the domain via HTTP challenge (Cloudflare intercepts it). Cloudflare origin certificates last 15 years and take 2 minutes to set up.

Bonus: Cloudflare as a free CDN

With the proxy enabled, Cloudflare automatically caches static assets (JS, CSS, images). The VPS only receives API and HTML requests, which significantly reduces traffic to the server.

RAM distribution

On a 4 GB VPS, the services above take up roughly half the memory; the ~1.5-2 GB left free act as headroom. Tip: always configure swap (2 GB) as a safety net. Without swap, the OOM killer terminates processes without warning when RAM fills up.
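The one-async-worker argument above is easy to demonstrate without FastAPI at all. A minimal asyncio sketch (my own example, not from the article) shows concurrent I/O waits overlapping instead of stacking up:

```python
import asyncio
import time

async def fake_io_request(i: int) -> int:
    # Simulates a non-blocking I/O call (DB query, LLM API call, Redis hit)
    await asyncio.sleep(0.2)
    return i

async def main() -> None:
    start = time.perf_counter()
    results = await asyncio.gather(*(fake_io_request(i) for i in range(50)))
    elapsed = time.perf_counter() - start
    # The 50 waits overlap on one event loop, so total time is ~0.2 s,
    # not 50 * 0.2 s = 10 s. This is why a single async Uvicorn worker
    # can hold many in-flight requests at once.
    print(f"{len(results)} simulated requests finished in {elapsed:.2f} s")

asyncio.run(main())
```

The same mechanism is why one worker suffices here: while a request awaits the database or the LLM API, the event loop serves the others.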
Security checklist

Before considering the deploy "done", work through a security checklist. The lessons below cover the items that matter most on this setup.

Lessons learned

1. Docker + firewall = be careful. Docker modifies iptables directly. If you publish a port without 127.0.0.1, ufw deny does not block it. Always use 127.0.0.1: in production and let Nginx be the only entry point.

2. Pre-download ML models at build time. If the models download at runtime, the first cold start takes a minute. Worse: if HuggingFace has downtime, your deploy fails. Pre-downloading in the Dockerfile eliminates both problems.

3. One Uvicorn worker is enough (if it's async). The temptation is to set 4 workers "just in case". But each one loads ~500 MB of models. With async FastAPI, a single worker handles hundreds of concurrent requests. Scale out workers only when you have evidence that CPU is the bottleneck.

4. Maintenance mode > zero-downtime rolling deploys. On a single-node VPS, a rolling deploy requires complex orchestration. A 503 page for 15 seconds is infinitely simpler and nobody complains, especially when the health check guarantees it is lifted automatically.

5. PostgreSQL defaults assume a dedicated server. PostgreSQL's defaults assume it has all the RAM to itself. On a VPS shared with four other services, not tuning PostgreSQL is a guarantee of OOM kills. shared_buffers, effective_cache_size, and max_connections are the first parameters to adjust.

6. Cloudflare origin certs > Let's Encrypt. With the proxy enabled, Let's Encrypt is more complicated to configure and renew. Cloudflare's origin certs last 15 years; you configure them once and forget about them.

Conclusion

Deploying a RAG system is not like deploying a CRUD app. Embedding models consume real RAM, SSE streaming needs specific Nginx configuration, and PostgreSQL with pgvector needs careful tuning on constrained servers. But you don't need Kubernetes or a $200/month cluster either. A $24 VPS with a well-configured Docker Compose, Nginx as a reverse proxy, and deploy scripts with maintenance mode + health checks is enough to serve thousands of queries a day with consistent latencies. The most important thing: measure first, scale later.
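Lesson 5 can be made concrete with the numbers from the PostgreSQL settings used here. A rough ceiling, keeping in mind that work_mem can be allocated several times per query (once per sort or hash node), so the real worst case can be higher:

```python
# Back-of-envelope PostgreSQL memory ceiling for the settings in this article
shared_buffers_mb = 128
work_mem_mb = 4
max_connections = 50

# Simplified worst case: every connection running one work_mem-sized sort
sorts_mb = work_mem_mb * max_connections
total_mb = shared_buffers_mb + sorts_mb
print(f"~{total_mb} MB if all {max_connections} connections sort at once")
```

With the stock defaults (max_connections=100 and a larger shared_buffers on many installs), the same arithmetic lands uncomfortably close to the RAM the other four services need.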
Starting with one worker, 4 GB of RAM, and basic monitoring gives you all the information you need to make infrastructure decisions based on real data instead of guesswork.

# -weight: 500;">docker-compose.yml services: db: image: pgvector/pgvector:pg16 ports: - "5433:5432" environment: POSTGRES_DB: ragdb POSTGRES_USER: raguser POSTGRES_PASSWORD: localpass123 volumes: - pgdata:/var/lib/postgresql/data healthcheck: test: ["CMD-SHELL", "pg_isready -U raguser -d ragdb"] interval: 5s retries: 5 redis: image: redis:7-alpine ports: - "6379:6379" backend: build: ./backend ports: - "8000:8000" volumes: - ./backend/app:/app/app # Hot reload depends_on: db: condition: service_healthy redis: condition: service_started frontend: build: ./frontend ports: - "5173:5173" volumes: - ./frontend/src:/app/src # Hot reload # -weight: 500;">docker-compose.yml services: db: image: pgvector/pgvector:pg16 ports: - "5433:5432" environment: POSTGRES_DB: ragdb POSTGRES_USER: raguser POSTGRES_PASSWORD: localpass123 volumes: - pgdata:/var/lib/postgresql/data healthcheck: test: ["CMD-SHELL", "pg_isready -U raguser -d ragdb"] interval: 5s retries: 5 redis: image: redis:7-alpine ports: - "6379:6379" backend: build: ./backend ports: - "8000:8000" volumes: - ./backend/app:/app/app # Hot reload depends_on: db: condition: service_healthy redis: condition: service_started frontend: build: ./frontend ports: - "5173:5173" volumes: - ./frontend/src:/app/src # Hot reload # -weight: 500;">docker-compose.yml services: db: image: pgvector/pgvector:pg16 ports: - "5433:5432" environment: POSTGRES_DB: ragdb POSTGRES_USER: raguser POSTGRES_PASSWORD: localpass123 volumes: - pgdata:/var/lib/postgresql/data healthcheck: test: ["CMD-SHELL", "pg_isready -U raguser -d ragdb"] interval: 5s retries: 5 redis: image: redis:7-alpine ports: - "6379:6379" backend: build: ./backend ports: - "8000:8000" volumes: - ./backend/app:/app/app # Hot reload depends_on: db: condition: service_healthy redis: condition: service_started frontend: build: ./frontend ports: - "5173:5173" volumes: - ./frontend/src:/app/src # Hot reload # -weight: 500;">docker-compose.prod.yml services: db: image: 
pgvector/pgvector:pg16 container_name: app-db -weight: 500;">restart: always env_file: .env.production ports: - "127.0.0.1:5432:5432" # Solo localhost volumes: - pgdata:/var/lib/postgresql/data command: > postgres -c shared_buffers=128MB -c effective_cache_size=256MB -c max_connections=50 -c work_mem=4MB -c maintenance_work_mem=64MB -c random_page_cost=1.1 -c effective_io_concurrency=200 -c wal_buffers=4MB -c checkpoint_completion_target=0.9 healthcheck: test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER -d $$POSTGRES_DB"] interval: 10s retries: 5 redis: image: redis:7-alpine container_name: app-redis -weight: 500;">restart: always command: redis-server --maxmemory 64mb --maxmemory-policy allkeys-lru ports: - "127.0.0.1:6379:6379" # Solo localhost volumes: - redisdata:/data backend: build: context: ./backend dockerfile: Dockerfile container_name: app-backend -weight: 500;">restart: always env_file: .env.production ports: - "127.0.0.1:8000:8000" # Solo localhost, Nginx al frente deploy: resources: limits: memory: 1536M depends_on: db: condition: service_healthy redis: condition: service_started frontend: build: context: ./frontend dockerfile: Dockerfile.prod # Multi-stage con Nginx container_name: app-frontend -weight: 500;">restart: always ports: - "127.0.0.1:5173:5173" # Solo localhost deploy: resources: limits: memory: 64M # -weight: 500;">docker-compose.prod.yml services: db: image: pgvector/pgvector:pg16 container_name: app-db -weight: 500;">restart: always env_file: .env.production ports: - "127.0.0.1:5432:5432" # Solo localhost volumes: - pgdata:/var/lib/postgresql/data command: > postgres -c shared_buffers=128MB -c effective_cache_size=256MB -c max_connections=50 -c work_mem=4MB -c maintenance_work_mem=64MB -c random_page_cost=1.1 -c effective_io_concurrency=200 -c wal_buffers=4MB -c checkpoint_completion_target=0.9 healthcheck: test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER -d $$POSTGRES_DB"] interval: 10s retries: 5 redis: image: redis:7-alpine 
container_name: app-redis -weight: 500;">restart: always command: redis-server --maxmemory 64mb --maxmemory-policy allkeys-lru ports: - "127.0.0.1:6379:6379" # Solo localhost volumes: - redisdata:/data backend: build: context: ./backend dockerfile: Dockerfile container_name: app-backend -weight: 500;">restart: always env_file: .env.production ports: - "127.0.0.1:8000:8000" # Solo localhost, Nginx al frente deploy: resources: limits: memory: 1536M depends_on: db: condition: service_healthy redis: condition: service_started frontend: build: context: ./frontend dockerfile: Dockerfile.prod # Multi-stage con Nginx container_name: app-frontend -weight: 500;">restart: always ports: - "127.0.0.1:5173:5173" # Solo localhost deploy: resources: limits: memory: 64M # -weight: 500;">docker-compose.prod.yml services: db: image: pgvector/pgvector:pg16 container_name: app-db -weight: 500;">restart: always env_file: .env.production ports: - "127.0.0.1:5432:5432" # Solo localhost volumes: - pgdata:/var/lib/postgresql/data command: > postgres -c shared_buffers=128MB -c effective_cache_size=256MB -c max_connections=50 -c work_mem=4MB -c maintenance_work_mem=64MB -c random_page_cost=1.1 -c effective_io_concurrency=200 -c wal_buffers=4MB -c checkpoint_completion_target=0.9 healthcheck: test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER -d $$POSTGRES_DB"] interval: 10s retries: 5 redis: image: redis:7-alpine container_name: app-redis -weight: 500;">restart: always command: redis-server --maxmemory 64mb --maxmemory-policy allkeys-lru ports: - "127.0.0.1:6379:6379" # Solo localhost volumes: - redisdata:/data backend: build: context: ./backend dockerfile: Dockerfile container_name: app-backend -weight: 500;">restart: always env_file: .env.production ports: - "127.0.0.1:8000:8000" # Solo localhost, Nginx al frente deploy: resources: limits: memory: 1536M depends_on: db: condition: service_healthy redis: condition: service_started frontend: build: context: ./frontend dockerfile: 
Dockerfile.prod # Multi-stage con Nginx container_name: app-frontend -weight: 500;">restart: always ports: - "127.0.0.1:5173:5173" # Solo localhost deploy: resources: limits: memory: 64M ports: - "127.0.0.1:8000:8000" # ✅ Solo Nginx puede acceder # vs - "8000:8000" # ❌ Abierto al mundo ports: - "127.0.0.1:8000:8000" # ✅ Solo Nginx puede acceder # vs - "8000:8000" # ❌ Abierto al mundo ports: - "127.0.0.1:8000:8000" # ✅ Solo Nginx puede acceder # vs - "8000:8000" # ❌ Abierto al mundo shared_buffers=128MB # 25% de la RAM disponible para PG (~512MB) effective_cache_size=256MB # Lo que el OS puede cachear max_connections=50 # No necesitás 100 con un backend async work_mem=4MB # Cuidado: se multiplica por conexión × sort ops random_page_cost=1.1 # SSD, no disco rotacional shared_buffers=128MB # 25% de la RAM disponible para PG (~512MB) effective_cache_size=256MB # Lo que el OS puede cachear max_connections=50 # No necesitás 100 con un backend async work_mem=4MB # Cuidado: se multiplica por conexión × sort ops random_page_cost=1.1 # SSD, no disco rotacional shared_buffers=128MB # 25% de la RAM disponible para PG (~512MB) effective_cache_size=256MB # Lo que el OS puede cachear max_connections=50 # No necesitás 100 con un backend async work_mem=4MB # Cuidado: se multiplica por conexión × sort ops random_page_cost=1.1 # SSD, no disco rotacional maxmemory 64mb maxmemory-policy allkeys-lru maxmemory 64mb maxmemory-policy allkeys-lru maxmemory 64mb maxmemory-policy allkeys-lru deploy: resources: limits: memory: 1536M deploy: resources: limits: memory: 1536M deploy: resources: limits: memory: 1536M # === Stage 1: Builder === FROM python:3.12-slim AS builder WORKDIR /build COPY requirements.txt . 
# PyTorch CPU-only (ahorro ~1.5GB vs versión con CUDA) RUN -weight: 500;">pip -weight: 500;">install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu \ && -weight: 500;">pip -weight: 500;">install --no-cache-dir -r requirements.txt # Pre-descargar modelos de embeddings y cross-encoder RUN python -c " from sentence_transformers import SentenceTransformer, CrossEncoder SentenceTransformer('paraphrase-multilingual-MiniLM-L12-v2') CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2') " # === Stage 2: Runtime === FROM python:3.12-slim AS runtime WORKDIR /app RUN -weight: 500;">apt-get -weight: 500;">update && -weight: 500;">apt-get -weight: 500;">install -y --no--weight: 500;">install-recommends \ libfontconfig1 -weight: 500;">curl ca-certificates && rm -rf /var/lib/-weight: 500;">apt/lists/* # Copiar dependencias y modelos pre-descargados COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages COPY --from=builder /usr/local/bin /usr/local/bin COPY --from=builder /root/.cache /root/.cache COPY . . RUN chmod +x -weight: 500;">start.sh EXPOSE 8000 CMD ["./-weight: 500;">start.sh"] # === Stage 1: Builder === FROM python:3.12-slim AS builder WORKDIR /build COPY requirements.txt . 
# PyTorch CPU-only (ahorro ~1.5GB vs versión con CUDA) RUN -weight: 500;">pip -weight: 500;">install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu \ && -weight: 500;">pip -weight: 500;">install --no-cache-dir -r requirements.txt # Pre-descargar modelos de embeddings y cross-encoder RUN python -c " from sentence_transformers import SentenceTransformer, CrossEncoder SentenceTransformer('paraphrase-multilingual-MiniLM-L12-v2') CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2') " # === Stage 2: Runtime === FROM python:3.12-slim AS runtime WORKDIR /app RUN -weight: 500;">apt-get -weight: 500;">update && -weight: 500;">apt-get -weight: 500;">install -y --no--weight: 500;">install-recommends \ libfontconfig1 -weight: 500;">curl ca-certificates && rm -rf /var/lib/-weight: 500;">apt/lists/* # Copiar dependencias y modelos pre-descargados COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages COPY --from=builder /usr/local/bin /usr/local/bin COPY --from=builder /root/.cache /root/.cache COPY . . RUN chmod +x -weight: 500;">start.sh EXPOSE 8000 CMD ["./-weight: 500;">start.sh"] # === Stage 1: Builder === FROM python:3.12-slim AS builder WORKDIR /build COPY requirements.txt . 
# PyTorch CPU-only (ahorro ~1.5GB vs versión con CUDA) RUN -weight: 500;">pip -weight: 500;">install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu \ && -weight: 500;">pip -weight: 500;">install --no-cache-dir -r requirements.txt # Pre-descargar modelos de embeddings y cross-encoder RUN python -c " from sentence_transformers import SentenceTransformer, CrossEncoder SentenceTransformer('paraphrase-multilingual-MiniLM-L12-v2') CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2') " # === Stage 2: Runtime === FROM python:3.12-slim AS runtime WORKDIR /app RUN -weight: 500;">apt-get -weight: 500;">update && -weight: 500;">apt-get -weight: 500;">install -y --no--weight: 500;">install-recommends \ libfontconfig1 -weight: 500;">curl ca-certificates && rm -rf /var/lib/-weight: 500;">apt/lists/* # Copiar dependencias y modelos pre-descargados COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages COPY --from=builder /usr/local/bin /usr/local/bin COPY --from=builder /root/.cache /root/.cache COPY . . RUN chmod +x -weight: 500;">start.sh EXPOSE 8000 CMD ["./-weight: 500;">start.sh"] # === Stage 1: Build === FROM node:20-alpine AS build WORKDIR /app COPY package*.json ./ RUN -weight: 500;">npm ci COPY . . RUN -weight: 500;">npm run build # === Stage 2: Serve === FROM nginx:1.27-alpine # Copiar build estático COPY --from=build /app/dist /usr/share/nginx/html # Config para SPA COPY nginx.conf /etc/nginx/conf.d/default.conf EXPOSE 5173 CMD ["nginx", "-g", "daemon off;"] # === Stage 1: Build === FROM node:20-alpine AS build WORKDIR /app COPY package*.json ./ RUN -weight: 500;">npm ci COPY . . 
RUN -weight: 500;">npm run build # === Stage 2: Serve === FROM nginx:1.27-alpine # Copiar build estático COPY --from=build /app/dist /usr/share/nginx/html # Config para SPA COPY nginx.conf /etc/nginx/conf.d/default.conf EXPOSE 5173 CMD ["nginx", "-g", "daemon off;"] # === Stage 1: Build === FROM node:20-alpine AS build WORKDIR /app COPY package*.json ./ RUN -weight: 500;">npm ci COPY . . RUN -weight: 500;">npm run build # === Stage 2: Serve === FROM nginx:1.27-alpine # Copiar build estático COPY --from=build /app/dist /usr/share/nginx/html # Config para SPA COPY nginx.conf /etc/nginx/conf.d/default.conf EXPOSE 5173 CMD ["nginx", "-g", "daemon off;"] server { listen 5173; root /usr/share/nginx/html; index index.html; # Assets con hash de Vite → cache agresivo location /assets/ { expires 1y; add_header Cache-Control "public, immutable"; } # index.html → nunca cachear (para que nuevos deploys se reflejen) location = /index.html { add_header Cache-Control "no-cache"; } # SPA fallback: toda ruta → index.html location / { try_files $uri $uri/ /index.html; } gzip on; gzip_types text/plain text/css application/json application/javascript; } server { listen 5173; root /usr/share/nginx/html; index index.html; # Assets con hash de Vite → cache agresivo location /assets/ { expires 1y; add_header Cache-Control "public, immutable"; } # index.html → nunca cachear (para que nuevos deploys se reflejen) location = /index.html { add_header Cache-Control "no-cache"; } # SPA fallback: toda ruta → index.html location / { try_files $uri $uri/ /index.html; } gzip on; gzip_types text/plain text/css application/json application/javascript; } server { listen 5173; root /usr/share/nginx/html; index index.html; # Assets con hash de Vite → cache agresivo location /assets/ { expires 1y; add_header Cache-Control "public, immutable"; } # index.html → nunca cachear (para que nuevos deploys se reflejen) location = /index.html { add_header Cache-Control "no-cache"; } # SPA fallback: toda ruta → 
index.html location / { try_files $uri $uri/ /index.html; } gzip on; gzip_types text/plain text/css application/json application/javascript; } # === HTTPS principal === server { listen 443 ssl http2; server_name tu-dominio.com; # Certificados origin (Cloudflare → VPS) ssl_certificate /etc/ssl/certs/origin.pem; ssl_certificate_key /etc/ssl/private/origin.key; ssl_protocols TLSv1.2 TLSv1.3; client_max_body_size 50M; # Para upload de documentos # === Maintenance mode === set $maintenance 0; if (-f /etc/nginx/maintenance.on) { set $maintenance 1; } # Health check siempre disponible (para monitoreo) location = /api/v1/health { proxy_pass http://127.0.0.1:8000; } # Si maintenance mode → 503 if ($maintenance) { return 503; } # === API Backend === location /api/ { proxy_pass http://127.0.0.1:8000; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; # SSE streaming: CRÍTICO proxy_buffering off; proxy_cache off; proxy_read_timeout 300s; proxy_set_header Connection ''; proxy_http_version 1.1; chunked_transfer_encoding off; } # === Widget JS (cacheable) === location = /widget.js { proxy_pass http://127.0.0.1:8000; proxy_cache_valid 200 1h; } # === Frontend SPA === location / { proxy_pass http://127.0.0.1:5173; } # === Gzip === gzip on; gzip_comp_level 4; gzip_min_length 256; gzip_types text/plain text/css application/json application/javascript text/xml application/xml text/javascript image/svg+xml; # === Security headers === add_header X-Frame-Options "SAMEORIGIN" always; add_header X-Content-Type-Options "nosniff" always; add_header Referrer-Policy "strict-origin-when-cross-origin" always; } # === HTTP → HTTPS redirect === server { listen 80; server_name tu-dominio.com; return 301 https://$server_name$request_uri; } # === www → non-www === server { listen 443 ssl http2; server_name www.tu-dominio.com; ssl_certificate /etc/ssl/certs/origin.pem; 
ssl_certificate_key /etc/ssl/private/origin.key; return 301 https://tu-dominio.com$request_uri; } # === Página 503 de mantenimiento === error_page 503 @maintenance; location @maintenance { default_type text/html; return 503 '<!DOCTYPE html> <html><head><meta charset="UTF-8"><title>Mantenimiento</title> <style>body{font-family:system-ui;display:flex;justify-content:center; align-items:center;min-height:100vh;background:#0f172a;color:#e2e8f0; text-align:center}h1{font-size:2rem}p{color:#94a3b8}</style></head> <body><div><h1>En mantenimiento</h1> <p>Volvemos en unos minutos.</p></div></body></html>'; } # === HTTPS principal === server { listen 443 ssl http2; server_name tu-dominio.com; # Certificados origin (Cloudflare → VPS) ssl_certificate /etc/ssl/certs/origin.pem; ssl_certificate_key /etc/ssl/private/origin.key; ssl_protocols TLSv1.2 TLSv1.3; client_max_body_size 50M; # Para upload de documentos # === Maintenance mode === set $maintenance 0; if (-f /etc/nginx/maintenance.on) { set $maintenance 1; } # Health check siempre disponible (para monitoreo) location = /api/v1/health { proxy_pass http://127.0.0.1:8000; } # Si maintenance mode → 503 if ($maintenance) { return 503; } # === API Backend === location /api/ { proxy_pass http://127.0.0.1:8000; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; # SSE streaming: CRÍTICO proxy_buffering off; proxy_cache off; proxy_read_timeout 300s; proxy_set_header Connection ''; proxy_http_version 1.1; chunked_transfer_encoding off; } # === Widget JS (cacheable) === location = /widget.js { proxy_pass http://127.0.0.1:8000; proxy_cache_valid 200 1h; } # === Frontend SPA === location / { proxy_pass http://127.0.0.1:5173; } # === Gzip === gzip on; gzip_comp_level 4; gzip_min_length 256; gzip_types text/plain text/css application/json application/javascript text/xml application/xml text/javascript image/svg+xml; # 
=== Security headers === add_header X-Frame-Options "SAMEORIGIN" always; add_header X-Content-Type-Options "nosniff" always; add_header Referrer-Policy "strict-origin-when-cross-origin" always; } # === HTTP → HTTPS redirect === server { listen 80; server_name tu-dominio.com; return 301 https://$server_name$request_uri; } # === www → non-www === server { listen 443 ssl http2; server_name www.tu-dominio.com; ssl_certificate /etc/ssl/certs/origin.pem; ssl_certificate_key /etc/ssl/private/origin.key; return 301 https://tu-dominio.com$request_uri; } # === Página 503 de mantenimiento === error_page 503 @maintenance; location @maintenance { default_type text/html; return 503 '<!DOCTYPE html> <html><head><meta charset="UTF-8"><title>Mantenimiento</title> <style>body{font-family:system-ui;display:flex;justify-content:center; align-items:center;min-height:100vh;background:#0f172a;color:#e2e8f0; text-align:center}h1{font-size:2rem}p{color:#94a3b8}</style></head> <body><div><h1>En mantenimiento</h1> <p>Volvemos en unos minutos.</p></div></body></html>'; } # === HTTPS principal === server { listen 443 ssl http2; server_name tu-dominio.com; # Certificados origin (Cloudflare → VPS) ssl_certificate /etc/ssl/certs/origin.pem; ssl_certificate_key /etc/ssl/private/origin.key; ssl_protocols TLSv1.2 TLSv1.3; client_max_body_size 50M; # Para upload de documentos # === Maintenance mode === set $maintenance 0; if (-f /etc/nginx/maintenance.on) { set $maintenance 1; } # Health check siempre disponible (para monitoreo) location = /api/v1/health { proxy_pass http://127.0.0.1:8000; } # Si maintenance mode → 503 if ($maintenance) { return 503; } # === API Backend === location /api/ { proxy_pass http://127.0.0.1:8000; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; # SSE streaming: CRÍTICO proxy_buffering off; proxy_cache off; proxy_read_timeout 300s; proxy_set_header 
Connection ''; proxy_http_version 1.1; chunked_transfer_encoding off; } # === Widget JS (cacheable) === location = /widget.js { proxy_pass http://127.0.0.1:8000; proxy_cache_valid 200 1h; } # === Frontend SPA === location / { proxy_pass http://127.0.0.1:5173; } # === Gzip === gzip on; gzip_comp_level 4; gzip_min_length 256; gzip_types text/plain text/css application/json application/javascript text/xml application/xml text/javascript image/svg+xml; # === Security headers === add_header X-Frame-Options "SAMEORIGIN" always; add_header X-Content-Type-Options "nosniff" always; add_header Referrer-Policy "strict-origin-when-cross-origin" always; } # === HTTP → HTTPS redirect === server { listen 80; server_name tu-dominio.com; return 301 https://$server_name$request_uri; } # === www → non-www === server { listen 443 ssl http2; server_name www.tu-dominio.com; ssl_certificate /etc/ssl/certs/origin.pem; ssl_certificate_key /etc/ssl/private/origin.key; return 301 https://tu-dominio.com$request_uri; } # === Página 503 de mantenimiento === error_page 503 @maintenance; location @maintenance { default_type text/html; return 503 '<!DOCTYPE html> <html><head><meta charset="UTF-8"><title>Mantenimiento</title> <style>body{font-family:system-ui;display:flex;justify-content:center; align-items:center;min-height:100vh;background:#0f172a;color:#e2e8f0; text-align:center}h1{font-size:2rem}p{color:#94a3b8}</style></head> <body><div><h1>En mantenimiento</h1> <p>Volvemos en unos minutos.</p></div></body></html>'; } location /api/ { proxy_buffering off; # Nginx NO debe buffear la respuesta proxy_cache off; # Ni cachearla proxy_read_timeout 300s; # SSE puede durar minutos proxy_set_header Connection ''; # Deshabilitar keep-alive del upstream proxy_http_version 1.1; # HTTP/1.1 requerido para chunked chunked_transfer_encoding off; } location /api/ { proxy_buffering off; # Nginx NO debe buffear la respuesta proxy_cache off; # Ni cachearla proxy_read_timeout 300s; # SSE puede durar minutos 
location /api/ {
    proxy_buffering off;             # Nginx must NOT buffer the response
    proxy_cache off;                 # ...nor cache it
    proxy_read_timeout 300s;         # an SSE stream can last minutes
    proxy_set_header Connection '';  # disable upstream keep-alive
    proxy_http_version 1.1;          # HTTP/1.1 is required for chunked transfer
    chunked_transfer_encoding off;
}

#!/bin/bash
set -e

SERVER="user@server-ip"
SSH_KEY="$HOME/.ssh/deploy_key"      # "~" does not expand inside quotes, so use $HOME
PROJECT_DIR="/opt/mi-app"
BACKUP_DIR="/opt/mi-app/backups/db"
DEPLOY_MODE="${1:-full}"             # frontend | backend | full

ssh_cmd() {
    ssh -i "$SSH_KEY" "$SERVER" "$1"
}

echo "=== Deploy: $DEPLOY_MODE ==="

# 1. Push the code
git push origin main

# 2. Pull on the server
ssh_cmd "cd $PROJECT_DIR && git fetch origin && git reset --hard origin/main"

# 3. Database backup (backend/full only)
if [[ "$DEPLOY_MODE" != "frontend" ]]; then
    echo "Creating DB backup..."
    TIMESTAMP=$(date +%Y%m%d_%H%M%S)
    ssh_cmd "docker exec app-db pg_dump -U \$POSTGRES_USER \$POSTGRES_DB \
        | gzip > $BACKUP_DIR/backup_${TIMESTAMP}.gz"

    # Turn maintenance mode on
    ssh_cmd "touch /etc/nginx/maintenance.on && nginx -s reload"
    echo "Maintenance mode: ON"
fi

# 4. Rebuild and restart
case $DEPLOY_MODE in
    frontend)
        ssh_cmd "cd $PROJECT_DIR && docker compose -f docker-compose.prod.yml build frontend \
            && docker compose -f docker-compose.prod.yml up -d frontend"
        ;;
    backend)
        ssh_cmd "cd $PROJECT_DIR && docker compose -f docker-compose.prod.yml build backend \
            && docker compose -f docker-compose.prod.yml up -d backend"
        ;;
    full)
        ssh_cmd "cd $PROJECT_DIR && docker compose -f docker-compose.prod.yml up -d --build"
        ;;
esac

# 5. Turn maintenance mode off
if [[ "$DEPLOY_MODE" != "frontend" ]]; then
    sleep 15   # give the backend time to load the models
    ssh_cmd "rm -f /etc/nginx/maintenance.on && nginx -s reload"
    echo "Maintenance mode: OFF"
fi

# 6. Health check
echo "Checking health..."
MAX_RETRIES=30
for i in $(seq 1 $MAX_RETRIES); do
    # "|| echo 000" keeps set -e from aborting while the backend is still starting
    STATUS=$(ssh_cmd "curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:8000/api/v1/health" || echo "000")
    if [[ "$STATUS" == "200" ]]; then
        echo "Deploy successful. Health: OK"
        exit 0
    fi
    echo "Attempt $i/$MAX_RETRIES... (status: $STATUS)"
    sleep 5
done

echo "ERROR: health check failed after $MAX_RETRIES attempts"
exit 1
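The deploy script only creates and removes a flag file; the switch itself lives on the Nginx side. A minimal sketch of a config that reads that flag, assuming a static maintenance page (the page name and root path are illustrative, not from the article):

```nginx
# If the flag file exists, answer every request with a 503 maintenance page.
server {
    # ... listen, server_name, ssl_certificate, proxy config ...

    if (-f /etc/nginx/maintenance.on) {
        return 503;
    }

    error_page 503 @maintenance;
    location @maintenance {
        root /var/www/html;                      # illustrative path
        rewrite ^(.*)$ /maintenance.html break;  # illustrative page name
    }
}
```

Because `nginx -s reload` is cheap and never drops in-flight connections, toggling the file is effectively instant from the deploy script's point of view.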
#!/bin/bash
set -e

echo "Running database migrations..."
alembic upgrade head

echo "Starting FastAPI server..."

# 1 worker  = ~800MB with the models loaded
# 2 workers = ~1.3GB (only if you have RAM to spare)
WORKERS="${UVICORN_WORKERS:-1}"

exec uvicorn app.main:app \
    --host 0.0.0.0 \
    --port 8000 \
    --workers "$WORKERS" \
    --log-level "${UVICORN_LOG_LEVEL:-info}" \
    "$@"

#!/bin/bash
# Install with: crontab -e → 0 4 * * 0 /opt/mi-app/scripts/maintenance.sh

LOG="/var/log/app-maintenance.log"
echo "=== Maintenance $(date) ===" >> "$LOG"

# 1. Remove dangling Docker images older than 7 days
docker image prune -f --filter "until=168h" >> "$LOG" 2>&1

# 2. Remove stopped containers
docker container prune -f --filter "until=168h" >> "$LOG" 2>&1

# 3. Check disk usage
DISK_USAGE=$(df / --output=pcent | tail -1 | tr -dc '0-9')
if [ "$DISK_USAGE" -gt 80 ]; then
    echo "ALERT: disk at ${DISK_USAGE}%" >> "$LOG"
fi

# 4. Check memory usage
MEM_USAGE=$(free | awk '/Mem:/ {printf "%.0f", $3/$2 * 100}')
if [ "$MEM_USAGE" -gt 90 ]; then
    echo "ALERT: RAM at ${MEM_USAGE}%" >> "$LOG"
fi

# 5. Container status
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Size}}" >> "$LOG"

# 6. Backup inventory
BACKUP_COUNT=$(ls /opt/mi-app/backups/db/*.gz 2>/dev/null | wc -l)
LATEST=$(ls -t /opt/mi-app/backups/db/*.gz 2>/dev/null | head -1)
echo "Backups: $BACKUP_COUNT files. Latest: $LATEST" >> "$LOG"

echo "=== Done ===" >> "$LOG"
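One thing the cron script inventories but never does is delete old backups, and on a small VPS the compressed dumps eventually fill the disk. A helper it could call (the 30-day window and the function name `prune_backups` are assumptions, not from the article):

```shell
# Hypothetical retention helper: delete compressed backups older than N days.
# The backup_*.gz naming matches what the deploy script's pg_dump step produces.
prune_backups() {
  local dir="$1" days="${2:-30}"
  find "$dir" -name 'backup_*.gz' -mtime +"$days" -delete
}
```

Called as `prune_backups /opt/mi-app/backups/db 30`, it would slot in right after the inventory step.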
Internet → Cloudflare (proxy) → VPS (Nginx 443) → Docker containers

┌──────────────────────────────────────────┐
│ 4GB RAM Total                            │
├──────────────────────────────────────────┤
│ OS + Docker Engine      ~400MB           │
│ PostgreSQL              ~200-400MB       │
│ Backend (1 worker)      ~800MB-1.2GB     │
│ Redis                   ≤64MB            │
│ Nginx + Frontend        ~30MB            │
│ Free / Buffer           ~1.5-2GB         │
├──────────────────────────────────────────┤
│ Swap (2GB)              emergency only   │
└──────────────────────────────────────────┘

fallocate -l 2G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo '/swapfile none swap sw 0 0' >> /etc/fstab

- Embedding models that consume ~500MB of RAM per worker
- PostgreSQL with heavy extensions (pgvector + HNSW indexes)
- SSE streaming, which needs long-lived connections
- Redis for rate limiting and caching
- All of it competing for 4GB of RAM

- DNS on Cloudflare: an A record pointing at the VPS, with the proxy enabled (orange cloud)
- SSL on Cloudflare: "Full (strict)", which encrypts both the browser→Cloudflare and the Cloudflare→VPS legs
- Origin certificate: generated in the Cloudflare Dashboard → SSL/TLS → Origin Server → Create Certificate
- On the VPS: copy the certificate and key to /etc/ssl/certs/origin.pem and /etc/ssl/private/origin.key

- The OS filesystem cache (which helps PostgreSQL)
- Traffic spikes
- Maintenance operations (backups, builds)

- [x] Docker ports bound to localhost only (127.0.0.1:port:port)
- [x] Firewall active (ufw allow 22,80,443/tcp && ufw enable)
- [x] SSH by key only (password auth disabled in /etc/ssh/sshd_config)
- [x] .env.production kept out of the repo (.gitignore)
- [x] Secrets never in compose files (use an env_file reference)
- [x] DB with no external port (reachable only via the Docker network)
- [x] Automated backups with periodic verification
- [x] Health check built into the deploy script
- [x] Swap configured to avoid OOM kills

- Monitoring with Prometheus + Grafana: latency, error, and resource-usage metrics (currently just logs + cron)
- Offsite backups: copy backups to an S3/R2 bucket instead of keeping them only on the same VPS
- Blue-green deploys: once traffic justifies a second VPS
- CI/CD with GitHub Actions: automate the deploy script (currently a manual ./deploy.sh backend)
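To make the first checklist item concrete, the port mappings in docker-compose.prod.yml would look roughly like this (service names are illustrative):

```yaml
services:
  backend:
    ports:
      - "127.0.0.1:8000:8000"   # loopback only: a bare "8000:8000" would bypass ufw via iptables
  db:
    expose:
      - "5432"                  # no "ports:" entry at all, so only the Docker network can reach it
```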