Tools: How to Deploy an AI Agent to Production: VPS, Docker & Serverless (2026) - Complete Guide

Your agent works on your laptop. Great. Now how do you make it run 24/7 without you babysitting it? Deployment is where most AI agent projects die, not because the agent doesn't work, but because nobody figured out how to keep it running reliably. This guide covers three deployment approaches (VPS, Docker, serverless), with real configs, cost breakdowns, and the monitoring you need to sleep at night while your agent works.

Choosing Your Deployment Model

| Approach | Best For | Monthly Cost | Complexity | Always-On? |
|---|---|---|---|---|
| **VPS (no containers)** | 24/7 autonomous agents | $5-20 | Medium | Yes |
| **Docker + VPS** | Reproducible, multi-agent | $10-30 | Medium-High | Yes |
| **Serverless (Lambda/Cloud Run)** | Event-triggered agents | $1-50 (pay-per-use) | Low-Medium | No (triggered) |
| **Managed platforms** | No-ops teams | $20-200 | Low | Varies |

Option 1: VPS Deployment (What We Use)

The simplest path to a 24/7 agent. Rent a virtual server, install your agent, set up a process manager, and let it run.

Step 1: Choose a VPS Provider

| Provider | Cheapest Plan | Specs | Best For |
|---|---|---|---|
| **Hetzner** | $4.50/mo | 2 vCPU, 4GB RAM, 40GB SSD | Best value in EU |
| **DigitalOcean** | $6/mo | 1 vCPU, 1GB RAM, 25GB SSD | Simple UI, good docs |
| **Vultr** | $6/mo | 1 vCPU, 1GB RAM, 25GB SSD | Global locations |
| **Contabo** | $6.50/mo | 4 vCPU, 8GB RAM, 50GB SSD | Most specs per dollar |

**What Paxrel uses:** A Hetzner CX22 ($5.50/mo) with 2 vCPU, 4GB RAM. It runs our full agent stack (newsletter pipeline, social media automation, web scraping, and Reddit karma builder) on one server.

Step 2: Initial Server Setup

```bash
# SSH into your new server
ssh root@your-server-ip

# Create a non-root user
adduser agent
usermod -aG sudo agent

# Install essentials
apt update && apt install -y python3 python3-pip python3-venv git curl

# Switch to agent user
su - agent

# Clone your agent code
git clone https://github.com/your-org/your-agent.git
cd your-agent

# Set up Python environment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Create environment file for credentials
cat > .env <<'EOF'
# KEY=value lines for your agent's credentials (elided in the source)
EOF
chmod 600 .env  # Readable only by the agent user

# Create the directory the cron jobs log to
mkdir -p logs
```

```bash
# crontab entries (edit with: crontab -e)
# The first entry is truncated in the source; only its log redirect survives:
# [...] >> logs/pipeline.log 2>&1

# Social media posting: Every 6 hours
0 */6 * * * cd /home/agent/your-agent && .venv/bin/python3 post_tweet.py >> logs/twitter.log 2>&1

# Daily monitoring report
30 9 * * * cd /home/agent/your-agent && .venv/bin/python3 monitoring.py >> logs/monitoring.log 2>&1
```

Option 2: Docker Deployment

Docker adds reproducibility and isolation. It is especially useful when running multiple agents or when your agent has complex dependencies.

```dockerfile
# Dockerfile
FROM python:3.12-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl git && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy agent code
COPY . .

# Non-root user for security (owns /app so the agent can write inside it)
RUN useradd -m agent && chown -R agent:agent /app
USER agent

CMD ["python3", "agent.py"]
```

```yaml
# docker-compose.yml
# (the top-level `version` key is obsolete in Compose v2, so it is omitted)
services:
  agent:
    build: .
    restart: always
    env_file: .env
    volumes:
      - ./data:/app/data  # Persist agent memory/state
      - ./logs:/app/logs  # Persist logs
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '1.0'
    healthcheck:
      # urlopen raises on connection errors and HTTP 4xx/5xx, so a bad status fails the check
      test: ["CMD", "python3", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8080/health', timeout=5)"]
      interval: 60s
      timeout: 10s
      retries: 3

  # Optional: vector database for RAG
  chromadb:
    image: chromadb/chroma:latest
    restart: always
    volumes:
      - chroma_data:/chroma/chroma
    ports:
      - "8000:8000"

volumes:
  chroma_data:
```

```bash
# Deploy
docker compose up -d

# View logs
docker compose logs -f agent

# Update agent
git pull && docker compose build && docker compose up -d
```

Option 3: Serverless Deployment

For agents triggered by events (webhook, email, schedule) rather than running continuously. You pay only when the agent runs.

AWS Lambda + EventBridge

```python
# handler.py
import json

from agent import run_agent  # Your agent logic lives in agent.py


def lambda_handler(event, context):
    """Triggered by EventBridge cron or API Gateway webhook"""
    result = run_agent(event)
    return {
        'statusCode': 200,
        'body': json.dumps(result)
    }
```

```yaml
# serverless.yml (Serverless Framework)
service: ai-agent

provider:
  name: aws
  runtime: python3.12
  timeout: 300  # 5 minutes (Lambda itself allows up to 15)
  memorySize: 512
  environment:
    OPENAI_API_KEY: ${ssm:/ai-agent/openai-key}

functions:
  newsletter:
    handler: handler.lambda_handler
    events:
      - schedule: cron(0 8 ? * MON,WED,FRI *)  # Mon/Wed/Fri 8am UTC
  webhook:
    handler: handler.lambda_handler
    events:
      - httpApi:
          path: /webhook
          method: post
```
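Both functions in the config point at the same `lambda_handler`, so the handler needs a way to tell a scheduled run from a webhook call. The following is a minimal sketch of that routing, assuming the standard EventBridge scheduled-event shape (`source` set to `aws.events`) and the API Gateway HTTP API payload format 2.0. The helper names `classify_event` and `parse_webhook_body` are hypothetical and not part of the original code.

```python
import base64
import json


def classify_event(event: dict) -> str:
    """Return 'schedule', 'webhook', or 'unknown' for a Lambda event dict."""
    # EventBridge scheduled rules set source to "aws.events".
    if event.get("source") == "aws.events":
        return "schedule"
    # HTTP API events (payload format 2.0) carry requestContext.http.
    if "http" in event.get("requestContext", {}):
        return "webhook"
    return "unknown"


def parse_webhook_body(event: dict) -> dict:
    """Decode the JSON body of an HTTP API event (empty dict if there is no body)."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    return json.loads(body) if body else {}
```

Inside `lambda_handler`, the result of `classify_event(event)` could decide whether to run the newsletter pipeline or to process the parsed webhook payload.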

Google Cloud Run

```bash
# For longer-running agents (up to 60 min)
# For real deployments, prefer --set-secrets (Secret Manager) over a literal key
gcloud run deploy ai-agent \
  --source . \
  --region us-central1 \
  --memory 1Gi \
  --timeout 3600 \
  --set-env-vars "OPENAI_API_KEY=sk-..." \
  --no-allow-unauthenticated
```

| Platform | Max Runtime | Cold Start | Cost per Run |
|---|---|---|---|
| AWS Lambda | 15 minutes | 1-5 seconds | $0.0001-0.01 |
| Google Cloud Run | 60 minutes | 2-10 seconds | $0.001-0.05 |
| Vercel Functions | 5 minutes (pro: 15) | | $0.0001-0.005 |
| Cloudflare Workers | 30 seconds (free) | | $0.00005 |
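To make the pay-per-use trade-off concrete, the sketch below computes how many runs per month a serverless agent can make before it costs more than a flat-rate VPS. The per-run figures come from the platform comparison; the $5 VPS price and the function name `break_even_runs` are illustrative assumptions. Amounts are passed as strings so that `Decimal` keeps the arithmetic exact.

```python
from decimal import Decimal


def break_even_runs(vps_monthly_usd: str, cost_per_run_usd: str) -> int:
    """Largest number of runs per month that costs no more than the VPS."""
    vps = Decimal(vps_monthly_usd)
    per_run = Decimal(cost_per_run_usd)
    if per_run <= 0:
        raise ValueError("cost_per_run_usd must be positive")
    # Floor division: whole runs that fit into the monthly VPS price.
    return int(vps // per_run)
```

At Lambda's upper estimate of $0.01 per run, a $5 VPS breaks even at 500 runs per month; at the lower estimate of $0.0001, the break-even point is 50,000 runs.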

Monitoring Your Deployed Agent

A deployed agent without monitoring is a liability. Here's the minimum monitoring stack:

Health Check Endpoint

```python
from flask import Flask, jsonify
import psutil

app = Flask(__name__)

# get_uptime, get_last_run_timestamp, get_error_count and check_api_balance
# are application-specific helpers that you define for your own agent.


@app.route('/health')
def health():
    return jsonify({
        "status": "healthy",
        "uptime_hours": get_uptime(),
        "memory_mb": psutil.Process().memory_info().rss / 1024 / 1024,
        "last_run": get_last_run_timestamp(),
        "errors_24h": get_error_count(hours=24),
        "api_balance": check_api_balance()
    })
```
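The endpoint relies on helpers that the article does not show. As one possible sketch, here are standard-library versions of `get_uptime` and `get_last_run_timestamp`, assuming the agent records a heartbeat file at the end of each run. The `record_run` function and the `data/last_run.txt` path are hypothetical choices, not the author's implementation.

```python
import time
from datetime import datetime, timezone
from pathlib import Path

_START = time.monotonic()  # Captured once when the process starts
HEARTBEAT_FILE = Path("data/last_run.txt")  # Hypothetical location


def get_uptime() -> float:
    """Hours since this process started."""
    return (time.monotonic() - _START) / 3600


def record_run(path: Path = HEARTBEAT_FILE) -> None:
    """Call at the end of each agent run to store the current UTC time."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(datetime.now(timezone.utc).isoformat())


def get_last_run_timestamp(path: Path = HEARTBEAT_FILE):
    """Return the last recorded run time as an ISO 8601 string, or None."""
    try:
        return path.read_text().strip() or None
    except FileNotFoundError:
        return None
```

Keeping the heartbeat in a file under `data/` means it survives restarts when that directory is mounted as a volume.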

Alert System

```python
import os

import requests

# Credentials are read from the environment (these variable names are assumed)
BOT_TOKEN = os.environ["BOT_TOKEN"]
OWNER_ID = os.environ["OWNER_ID"]
SLACK_WEBHOOK = os.environ["SLACK_WEBHOOK"]


def send_alert(message, level="warning"):
    """Send alert via Telegram (critical) or Slack (everything else)"""
    if level == "critical":
        # Telegram for immediate attention
        requests.post(
            f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
            data={"chat_id": OWNER_ID, "text": f"🚨 {message}"},
            timeout=10,
        )
    else:
        # Slack webhook for non-critical
        requests.post(SLACK_WEBHOOK, json={"text": f"⚠️ {message}"}, timeout=10)

# Alerts to configure:
# - Agent crash / restart
# - API balance below threshold
# - Error rate spike (3+ errors in 10 min)
# - Agent stuck (no activity for 2+ hours)
# - Cost spike (daily spend > 2x average)
```
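The "error rate spike" alert (3+ errors in 10 minutes) needs a small amount of state. Below is a sketch of one way to track it with a sliding window; the class name `ErrorSpikeDetector` is hypothetical, and the defaults simply mirror the thresholds listed in the alert comments.

```python
import time
from collections import deque
from typing import Optional


class ErrorSpikeDetector:
    """Tracks error timestamps and reports when too many fall inside a window."""

    def __init__(self, threshold: int = 3, window_seconds: float = 600.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events = deque()

    def record_error(self, now: Optional[float] = None) -> bool:
        """Record one error; return True if the threshold is reached within the window."""
        now = time.monotonic() if now is None else now
        self._events.append(now)
        # Drop errors that have aged out of the window.
        while self._events and now - self._events[0] > self.window_seconds:
            self._events.popleft()
        return len(self._events) >= self.threshold
```

In an exception handler, a `True` return from `record_error()` would be the point at which to call `send_alert` with `level="critical"`.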

Log Management

```python
import logging
import os
from logging.handlers import RotatingFileHandler

# The log directory must exist before the handler opens the file
os.makedirs('logs', exist_ok=True)

# Rotating file handler with a consistent line format
handler = RotatingFileHandler(
    'logs/agent.log',
    maxBytes=10_000_000,  # 10MB per file
    backupCount=5         # Keep 5 rotated files
)
handler.setFormatter(logging.Formatter(
    '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
))

logger = logging.getLogger('agent')
logger.setLevel(logging.INFO)  # Without this, INFO messages are dropped (default is WARNING)
logger.addHandler(handler)

# Log every significant action
logger.info("Scraping 12 RSS feeds")
logger.info("Scored 97 articles, top score: 28")
logger.warning("API rate limited, retrying in 30s")
logger.error("Beehiiv publish failed: 401 Unauthorized")
```
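The format string above produces human-readable lines. If logs are later shipped to an aggregator, one JSON object per line is usually easier to parse. A minimal sketch of such a formatter follows; `JsonFormatter` is a hypothetical name and the field set is an assumption rather than anything the original setup prescribes.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Include the traceback text when the record carries exception info.
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)
```

It can be swapped in with `handler.setFormatter(JsonFormatter())` while keeping the same rotating file handler.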

Production Hardening Checklist

Security

- API keys in environment variables or secrets manager, never in code
- Non-root user for the agent process
- Firewall: only allow SSH (22) and necessary ports
- SSH key auth only, disable password login
- Auto-update OS security patches (`unattended-upgrades`)
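The first item can be enforced at startup: read every required key from the environment and refuse to run if one is missing, so a misconfigured deployment fails immediately rather than mid-task. This is a sketch under that assumption; `load_secrets` and `MissingSecretError` are hypothetical names.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required environment variable is unset or empty."""


def load_secrets(names, environ=None):
    """Return {name: value} for each required variable, or raise listing all missing ones."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise MissingSecretError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in names}
```

Calling `load_secrets(["OPENAI_API_KEY"])` at the top of the agent's entry point would surface a missing key before any work starts.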

Reliability

- Process manager with auto-restart (systemd, Docker restart policy)
- Graceful shutdown handling (catch SIGTERM, finish current task)
- Exponential backoff on API errors (not infinite retry loops)
- Circuit breaker for external services (stop calling after N failures)
- Daily backup of agent state/memory to external storage
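Two of these items, exponential backoff and a circuit breaker, are easy to get subtly wrong. The sketch below shows one possible shape for each using only the standard library. The names `retry_with_backoff` and `CircuitBreaker`, and all default values, are illustrative assumptions.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (capped, jittered) and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))  # Jitter spreads out retries.


class CircuitBreaker:
    """Stops calling a failing service after max_failures consecutive errors."""

    def __init__(self, max_failures=3, reset_after=300.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, func):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise RuntimeError("circuit open: skipping call")
            self._opened_at = None  # Cool-down elapsed: allow one trial call.
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()  # Open (or re-open) the circuit.
            raise
        self._failures = 0  # Any success closes the circuit fully.
        return result
```

The `sleep` and `clock` parameters exist so the behavior can be tested without real waiting.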

Cost Control

- Daily API spend limit with hard cutoff
- Max steps per agent run (prevent infinite loops)
- Token counting before API calls (reject oversized prompts)
- Alert when daily spend exceeds 2x average
- Weekly cost report to the team
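The first four checks can live in one small guard object that the agent consults before every API call. This is a rough sketch under stated assumptions: the names are hypothetical, state is held in memory only (a real deployment would persist the daily total across restarts), and the caller is expected to supply `prompt_tokens` from its provider's tokenizer and the cost of each call from the provider's usage data.

```python
from datetime import date


class BudgetExceeded(Exception):
    """Raised when a call would break a cost, step, or prompt-size limit."""


class CostGuard:
    def __init__(self, daily_limit_usd, max_steps, max_prompt_tokens,
                 today=date.today):
        self.daily_limit_usd = daily_limit_usd
        self.max_steps = max_steps
        self.max_prompt_tokens = max_prompt_tokens
        self.today = today
        self._day = today()
        self.spent_usd = 0.0
        self.steps = 0

    def _roll_day(self):
        # Reset the spend counter when the calendar day changes.
        current = self.today()
        if current != self._day:
            self._day = current
            self.spent_usd = 0.0

    def start_run(self):
        """Reset the step counter at the beginning of each agent run."""
        self.steps = 0

    def check_step(self, prompt_tokens):
        """Call before every API request; raises instead of letting it go out."""
        self._roll_day()
        if self.steps >= self.max_steps:
            raise BudgetExceeded("max steps per run reached")
        if prompt_tokens > self.max_prompt_tokens:
            raise BudgetExceeded("prompt exceeds token limit")
        if self.spent_usd >= self.daily_limit_usd:
            raise BudgetExceeded("daily spend limit reached")
        self.steps += 1

    def record_cost(self, usd):
        """Call after every API response with the cost of that call."""
        self._roll_day()
        self.spent_usd += usd


def should_alert(today_spend, recent_daily_spend, factor=2.0):
    """True when today's spend exceeds `factor` times the recent daily average."""
    if not recent_daily_spend:
        return False
    average = sum(recent_daily_spend) / len(recent_daily_spend)
    return today_spend > factor * average
```

Raising an exception rather than logging a warning is what makes the cutoff "hard": the agent loop has to handle `BudgetExceeded` explicitly, so a runaway loop stops at the limit instead of spending past it.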

Deployment Patterns by Use Case

| Agent Type | Best Deployment | Why |
| --- | --- | --- |
| 24/7 autonomous agent | VPS + systemd | Always-on, persistent state |
| Scheduled pipeline | VPS + cron or serverless | Runs on schedule, sleeps between |
| Webhook-triggered | Serverless (Lambda/Cloud Run) | Pay-per-use, auto-scales |
| Multi-agent system | Docker Compose on VPS | Isolated containers, shared network |
| Customer-facing chatbot | Cloud Run or managed platform | Auto-scale with traffic |
| Development/testing | Local Docker | Reproducible environment |

Key Takeaways

- **VPS + systemd is the simplest path** for always-on agents. $5-15/month, full control, works for 90% of use cases.
- **Docker adds value** when you have complex dependencies, multiple agents, or need reproducibility across environments.
- **Serverless is cheaper for sporadic workloads** but has runtime limits (15 min for Lambda) that don't suit long-running agents.
- **Monitoring is not optional.** Health checks, alerts, and log rotation are the minimum. An unmonitored agent will fail silently.
- **Security basics matter.** Non-root user, env vars for secrets, firewall, SSH keys. Takes 30 minutes, prevents disasters.
- **Start simple, scale later.** A $5 VPS with cron jobs is a perfectly valid production deployment. Don't over-engineer until you need to.

Deploy With Confidence

Our AI Agent Playbook includes Dockerfiles, systemd configs, monitoring templates, and deployment checklists for production agents. [Get the Playbook ($29)](https://paxrel.gumroad.com/l/ai-agent-playbook)

Stay Updated on AI Agents

Deployment patterns, infrastructure tips, and production war stories. 3x/week, no spam. [Subscribe to AI Agents Weekly](/newsletter.html)

- Free: Download our AI Agent Starter Kit (5 templates + security checklist)
- Free: Subscribe to AI Agents Weekly for curated news 3x/week
- $29: Get The AI Agent Playbook, 80+ pages of templates and guides