Build Private AI Agents in Your Homelab with Ollama and n8n

Run local LLMs with Ollama and wire them into automated workflows with n8n. A practical guide to building private AI agents that monitor logs, triage emails, summarize feeds, and more—all on your own hardware, at zero cost.

[Cover image: AI neural network visualization. Photo by Cash Macanaya / Unsplash]

You’re running a dozen self-hosted services. You’ve got dashboards, alerts, RSS feeds, logs, maybe a photo library and a media server. Everything works, but the pieces don’t talk to each other intelligently. You’re still the glue—reading logs, triaging notifications, manually kicking off tasks.

What if your homelab could think for itself?

In 2026, that’s no longer a hypothetical. With Ollama running large language models locally and n8n orchestrating workflows across 400+ integrations, you can build private AI agents that automate real work—without sending a single byte to the cloud and at zero ongoing cost.


Why Self-Hosted AI Matters

You can already use ChatGPT or Claude to answer questions. But hosted APIs have three problems for homelab automation:

  • Privacy — Every prompt you send includes your data. Server logs, email content, personal notes—all leaving your network.
  • Cost — API calls add up fast when you’re running automated workflows 24/7. A log monitoring agent that fires every 5 minutes will drain your credits.
  • Latency and availability — Your automation breaks when the API is down or rate-limited. Local inference has no such dependency.

Running your own LLM on your own hardware solves all three. And with the current generation of open models (Llama 4, Qwen 3, DeepSeek V3, Gemma 4), local doesn’t mean compromising on quality anymore.


The Stack

We’re building with four components:

  • Ollama — Runs LLMs locally with a single command. No Python environment, no dependency hell. Just pull a model and go.
  • n8n — Open-source workflow automation with a visual editor and 400+ integrations. Think Zapier, but self-hosted and with first-class AI agent support.
  • Qdrant — Vector database for semantic search. Needed if you want your agent to query your own documents (RAG).
  • PostgreSQL — Persistent storage for n8n workflows and execution history.

n8n maintains an official Self-Hosted AI Starter Kit that bundles all of this into a single Docker Compose file. We’ll use that as our starting point.


Hardware Requirements

You don’t need a server rack. Here’s what actually works:

| RAM    | Model Size       | Example Models               |
|--------|------------------|------------------------------|
| 8 GB   | 7–8B parameters  | Llama 3.1 8B, Gemma 4 7B     |
| 16 GB  | 14B parameters   | Qwen 3 14B, DeepSeek V3 14B  |
| 24 GB  | 32B parameters   | Llama 4 32B                  |
| 32 GB+ | 70B parameters   | Llama 4 70B, Qwen 3 72B      |

For most homelab workflows—summarization, classification, log analysis—a 14B model on 16 GB of RAM is the sweet spot. If you have an NVIDIA GPU, inference will be significantly faster, but CPU-only works fine for non-interactive tasks like scheduled automations.

An M4 Mac mini with 24 GB of unified memory (~$800) is arguably the best value right now: silent, 30W idle, and fast enough for 32B models.
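The table above tracks a rough rule of thumb: a 4-bit quantized model needs on the order of 0.6–0.7 GB of RAM per billion parameters for its weights, plus a couple of gigabytes for the KV cache and runtime. A back-of-the-envelope check (the 0.62 GB/B and 2 GB figures are approximations, not Ollama's actual allocator behavior):

```shell
# Rough RAM estimate for a Q4-quantized model.
# Assumptions: ~0.62 GB per billion parameters for weights, ~2 GB overhead
# for KV cache and runtime. Real usage grows with context length.
estimate_ram_gb() {
  awk -v p="$1" 'BEGIN { printf "%.1f\n", p * 0.62 + 2 }'
}

estimate_ram_gb 14   # ≈ 10.7 GB → comfortable on a 16 GB machine
```

By the same estimate a 32B model lands around 22 GB, which is why 24 GB is the floor for that tier.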


Deployment

Option 1: The Official Starter Kit

git clone https://github.com/n8n-io/self-hosted-ai-starter-kit.git
cd self-hosted-ai-starter-kit
cp .env.example .env

# GPU (NVIDIA)
docker compose --profile gpu-nvidia up -d

# CPU only
docker compose --profile cpu up -d

# AMD GPU (Linux)
docker compose --profile gpu-amd up -d

This gets you n8n, Ollama, Qdrant, and PostgreSQL in one shot. Access n8n at http://localhost:5678 and complete the initial setup.

Option 2: Minimal Custom Compose

If you already run PostgreSQL or prefer a leaner setup:

services:
  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama-data:/root/.ollama
    # Uncomment for NVIDIA GPU support:
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - capabilities: [gpu]

  n8n:
    image: n8nio/n8n:latest
    restart: unless-stopped
    ports:
      - "5678:5678"
    environment:
      - N8N_HOST=localhost
      - N8N_PORT=5678
      - N8N_PROTOCOL=http
    volumes:
      - n8n-data:/home/node/.local/share/n8n
    extra_hosts:
      - "host.docker.internal:host-gateway"

volumes:
  ollama-data:
  n8n-data:

In this Compose file both containers share the default network, so n8n can reach Ollama directly at ollama:11434. The extra_hosts directive covers the other case: if you run Ollama natively on the host instead of in a container, n8n can reach it at host.docker.internal:11434.

Pulling Your First Model

# Pull a model
docker exec -it ollama ollama pull llama3.2

# Verify it works
docker exec -it ollama ollama run llama3.2 "Summarize what a reverse proxy does in one sentence."

Connecting n8n to Ollama

Once both services are running:

  • Open n8n at http://localhost:5678
  • Go to Credentials → Add Credential → Ollama
  • Set the base URL to http://ollama:11434 (or http://host.docker.internal:11434 if running Ollama on the host)
  • Save, and the connection is live

You now have a local LLM available as a node in any n8n workflow. Drag it in, pick your model, write a prompt, and wire it to triggers and actions.
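Under the hood, the Ollama node POSTs to Ollama's /api/generate endpoint, and n8n's generic HTTP Request node can call it the same way. It's worth exercising the endpoint directly when debugging a credential. A sketch (the sed one-liner is a crude jq-free field extractor and assumes the reply contains no escaped quotes):

```shell
# The request the Ollama node makes, reproduced with curl:
#   curl -s http://ollama:11434/api/generate \
#     -d '{"model": "llama3.2", "prompt": "Say CLEAR", "stream": false}'
# The non-streaming reply is a single JSON object. Pulling out the
# "response" field without jq:
extract_response() {
  sed -n 's/.*"response":"\([^"]*\)".*/\1/p'
}

sample='{"model":"llama3.2","response":"CLEAR","done":true}'
printf '%s\n' "$sample" | extract_response   # → CLEAR
```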


Practical Workflows

Here’s where it gets useful. These are real automations you can build in minutes.

1. Automated Log Analysis

A workflow that runs every 15 minutes, tails your server logs, sends them to Ollama for analysis, and alerts you on Discord or email only when something looks wrong.

Nodes:

  1. Cron Trigger — Every 15 minutes
  2. Execute Command — tail -n 200 /var/log/syslog
  3. Ollama — Prompt: "Analyze these server logs. Identify any errors, warnings, or unusual patterns. If everything looks normal, respond with CLEAR. Otherwise, summarize the issues."
  4. IF — Check if response contains "CLEAR"
  5. Discord/Email — Send alert with the summary (only if not CLEAR)

You’ve just built an AI-powered log monitor. No SaaS subscription, no data leaving your network.
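The IF node in step 4 boils down to a substring check on the model's reply. As a shell sketch (the CLEAR sentinel comes from the prompt above; the sample alert text is illustrative):

```shell
# Alert only when the model's reply does NOT contain the CLEAR sentinel.
needs_alert() {
  case "$1" in
    *CLEAR*) return 1 ;;   # all quiet: suppress the notification
    *)       return 0 ;;   # anything else: fire the Discord/email step
  esac
}

if needs_alert "3 failed SSH logins from unknown IPs"; then echo "alert"; fi
if needs_alert "CLEAR"; then echo "alert"; fi   # prints nothing
```

A sentinel word is more robust than asking a small model for structured JSON, but keep the prompt's instruction strict ("respond with CLEAR") so the model doesn't bury the sentinel inside a longer sentence.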

2. Email Triage Agent

Connect your email (IMAP or Gmail node) to an Ollama-powered classifier:

Nodes:

  1. Email Trigger — New incoming email
  2. Ollama — Prompt: "Classify this email into one of: URGENT, ACTION_REQUIRED, FYI, SPAM. Then write a one-line summary."
  3. Switch — Route by classification
  4. Slack/Notification — URGENT → immediate ping; ACTION_REQUIRED → daily digest; SPAM → archive

Your inbox is now triaged by AI, running entirely on your own machine.
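The Switch node in step 3 is a straight mapping from label to destination. Sketched in shell (the destination names are hypothetical, not n8n node names):

```shell
# Route a classification label to a destination. Unknown labels fall back
# to the inbox rather than being silently dropped.
route_email() {
  case "$1" in
    URGENT)          echo "slack:ping-now" ;;
    ACTION_REQUIRED) echo "digest:daily" ;;
    FYI)             echo "inbox:keep" ;;
    SPAM)            echo "archive" ;;
    *)               echo "inbox:keep" ;;
  esac
}

route_email URGENT   # → slack:ping-now
```

The fallback branch matters: local models occasionally return a label with extra punctuation or prose around it, so either normalize the reply first or route unknowns somewhere safe.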

3. RSS Feed Summarizer

Stay on top of news without doomscrolling:

Nodes:

  1. RSS Feed Trigger — Monitor your favorite feeds
  2. Ollama — Prompt: "Summarize this article in 2–3 bullet points. Flag if it’s relevant to: [your topics]."
  3. Filter — Only keep relevant articles
  4. Notion/Markdown File — Append to a daily digest document
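The markdown-file variant of step 4 is a one-line append per article. A sketch (the digest filename and heading format are arbitrary choices, not n8n defaults):

```shell
# Append one summarized article to today's digest file.
append_digest() {
  title=$1; summary=$2
  file="digest-$(date +%F).md"
  printf '## %s\n\n%s\n\n' "$title" "$summary" >> "$file"
}

append_digest "Kernel release notes" "- new scheduler knobs
- driver updates"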

4. Document Q&A with RAG

This is where Qdrant comes in. Ingest your documents (PDFs, notes, manuals) into the vector database, then ask questions in natural language:

Nodes:

  1. Chat Trigger — You ask a question
  2. Vector Store Retriever — Find relevant document chunks from Qdrant
  3. Ollama — Prompt: "Based on the following context, answer the user’s question. If the context doesn’t contain the answer, say so."
  4. Response — Return the answer

Your private, searchable knowledge base—powered by your own hardware.
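One step the node list glosses over is ingestion: before Qdrant can retrieve anything, documents have to be split into chunks small enough to embed. n8n ships text-splitter nodes for this; as a minimal illustration of the idea, here is a crude word-boundary chunker (the chunk width is an arbitrary choice):

```shell
# Split stdin into lines of at most N bytes, breaking at word boundaries.
# Real splitters also overlap adjacent chunks so context isn't lost at
# the boundaries.
chunk_text() {
  fold -s -w "${1:-500}"
}

printf 'the quick brown fox jumps over the lazy dog\n' | chunk_text 16
```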


The AI Agent Node: Autonomous Workflows

Everything above uses Ollama as a processing step in a linear workflow. But n8n’s AI Agent node goes further: it lets the LLM decide what to do next.

You give the agent:

  • A goal (e.g., “Investigate why disk usage spiked”)
  • A set of tools (shell commands, API calls, database queries)
  • Access to your local LLM via Ollama

The agent then reasons about the problem, selects the right tools, executes them, analyzes the results, and loops until the task is complete. This is the same agentic pattern that powers tools like Claude Code and GitHub Copilot Workspace—except it’s running on your hardware, with your data, under your control.
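Stripped to its skeleton, that loop is: ask the model for an action, execute it, feed the observation back, stop when the model says it's done. A miniature sketch with the model stubbed out (call_model stands in for the Ollama round-trip, and the RUN/DONE convention is invented for illustration, not n8n's actual tool-calling format):

```shell
# Stub model: first turn requests a tool call, second turn concludes.
call_model() {
  step=$((step + 1))
  case $step in
    1) reply="RUN echo 97% of /var used by logs" ;;
    *) reply="DONE disk spike traced to log growth" ;;
  esac
}

step=0
observation=""
while :; do
  call_model "$observation"
  case "$reply" in
    DONE*) echo "${reply#DONE }"; break ;;          # agent finished
    RUN*)  observation=$(sh -c "${reply#RUN }") ;;  # execute the tool
  esac
done
```

The real AI Agent node adds what this sketch omits: a constrained tool schema, conversation memory, and a step limit so a confused model can't loop forever.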


Tips for Production Use

  • Use quantized models — 4-bit or 6-bit quantizations give you 80–90% of full-precision quality at a fraction of the memory. For automated tasks (not creative writing), this is a no-brainer.
  • Pin model versions — Use llama3.2:3b-instruct-q4_K_M instead of llama3.2:latest. You don’t want a model update to silently change your workflow’s behavior.
  • Set timeouts — LLM inference on CPU can be slow for large inputs. Configure n8n node timeouts to avoid hanging workflows.
  • Monitor with Uptime Kuma — Point it at Ollama’s /api/tags endpoint to ensure the service is healthy.
  • Put it behind Traefik — If you want to access n8n remotely, route it through your existing reverse proxy with HTTPS and authentication.

Wrapping Up

The combination of Ollama and n8n is the most practical AI setup for a homelab in 2026. It’s not a toy—it’s a genuinely useful system that automates real work, runs on modest hardware, costs nothing after the initial setup, and keeps your data entirely under your control.

Start with one workflow. The log analyzer or RSS summarizer are good first projects—simple enough to build in 15 minutes, useful enough that you’ll actually keep them running. Once you see what’s possible, you’ll find yourself wiring AI into everything.
