Build Private AI Agents in Your Homelab with Ollama and n8n
Run local LLMs with Ollama and wire them into automated workflows with n8n. A practical guide to building private AI agents that monitor logs, triage emails, summarize feeds, and more—all on your own hardware, with no API fees.
You’re running a dozen self-hosted services. You’ve got dashboards, alerts, RSS feeds, logs, maybe a photo library and a media server. Everything works, but nothing talks to each other intelligently. You’re still the glue—reading logs, triaging notifications, manually kicking off tasks.
What if your homelab could think for itself?
In 2026, that’s no longer a hypothetical. With Ollama running large language models locally and n8n orchestrating workflows across 400+ integrations, you can build private AI agents that automate real work—without sending a single byte to the cloud and at zero ongoing cost.
Why Self-Hosted AI Matters
You can already use ChatGPT or Claude to answer questions. But hosted APIs have three problems for homelab automation:
- Privacy — Every prompt you send includes your data. Server logs, email content, personal notes—all leaving your network.
- Cost — API calls add up fast when you’re running automated workflows 24/7. A log monitoring agent that fires every 5 minutes will drain your credits.
- Latency and availability — Your automation breaks when the API is down or rate-limited. Local inference has no such dependency.
Running your own LLM on your own hardware solves all three. And with the current generation of open models (Llama 4, Qwen 3, DeepSeek V3, Gemma 4), local doesn’t mean compromising on quality anymore.
The Stack
We’re building with four components:
- Ollama — Runs LLMs locally with a single command. No Python environment, no dependency hell. Just pull a model and go.
- n8n — Open-source workflow automation with a visual editor and 400+ integrations. Think Zapier, but self-hosted and with first-class AI agent support.
- Qdrant — Vector database for semantic search. Needed if you want your agent to query your own documents (RAG).
- PostgreSQL — Persistent storage for n8n workflows and execution history.
n8n maintains an official Self-Hosted AI Starter Kit that bundles all of this into a single Docker Compose file. We’ll use that as our starting point.
Hardware Requirements
You don’t need a server rack. Here’s what actually works:
| RAM | Model Size | Example Models |
|---|---|---|
| 8 GB | 7B parameters | Llama 3.2 7B, Gemma 4 7B |
| 16 GB | 14B parameters | Qwen 3 14B, DeepSeek V3 14B |
| 24 GB | 32B parameters | Llama 4 32B |
| 32 GB+ | 70B parameters | Llama 4 70B, Qwen 3 72B |
For most homelab workflows—summarization, classification, log analysis—a 14B model on 16 GB of RAM is the sweet spot. If you have an NVIDIA GPU, inference will be significantly faster, but CPU-only works fine for non-interactive tasks like scheduled automations.
An M4 Mac mini with 24 GB of unified memory (~$800) is arguably the best value right now: silent, 30W idle, and fast enough for 32B models.
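The RAM column above follows from simple arithmetic: a model's weights take roughly parameter count × bytes per weight, plus some headroom for the KV cache and runtime. A back-of-the-envelope sketch (the overhead figure is a rough assumption, not Ollama's exact accounting):

```python
def estimate_model_ram_gb(params_billions: float, bits_per_weight: float,
                          overhead_gb: float = 1.5) -> float:
    """Rough RAM estimate: weight storage plus a flat allowance for
    KV cache and runtime overhead."""
    weight_gb = params_billions * bits_per_weight / 8  # 1e9 params * bits/8 bytes / 1e9
    return weight_gb + overhead_gb

# A 14B model at 4-bit quantization: ~7 GB of weights plus overhead,
# which is why it fits comfortably in 16 GB of RAM.
print(round(estimate_model_ram_gb(14, 4), 1))
```

The same math explains the table's other rows: 7B at 4-bit fits in 8 GB, while 70B at 4-bit needs roughly 35 GB of weights alone.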
Deployment
Option 1: The Official Starter Kit
```bash
git clone https://github.com/n8n-io/self-hosted-ai-starter-kit.git
cd self-hosted-ai-starter-kit
cp .env.example .env

# GPU (NVIDIA)
docker compose --profile gpu-nvidia up -d

# CPU only
docker compose --profile cpu up -d

# AMD GPU (Linux)
docker compose --profile gpu-amd up -d
```
This gets you n8n, Ollama, Qdrant, and PostgreSQL in one shot. Access n8n at http://localhost:5678 and complete the initial setup.
Option 2: Minimal Custom Compose
If you already run PostgreSQL or prefer a leaner setup:
```yaml
services:
  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama-data:/root/.ollama
    # Uncomment for NVIDIA GPU support:
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - capabilities: [gpu]

  n8n:
    image: n8nio/n8n:latest
    restart: unless-stopped
    ports:
      - "5678:5678"
    environment:
      - N8N_HOST=localhost
      - N8N_PORT=5678
      - N8N_PROTOCOL=http
    volumes:
      - n8n-data:/home/node/.n8n
    extra_hosts:
      - "host.docker.internal:host-gateway"

volumes:
  ollama-data:
  n8n-data:
```
The `extra_hosts` directive lets n8n reach an Ollama instance running on the host via `host.docker.internal:11434`. If both containers are defined in the same Compose file (and therefore share its default network), you can use the service name `ollama:11434` directly instead.
Pulling Your First Model
```bash
# Pull a model
docker exec -it ollama ollama pull llama3.2

# Verify it works
docker exec -it ollama ollama run llama3.2 "Summarize what a reverse proxy does in one sentence."
```
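Under the hood, both the CLI and n8n talk to Ollama's HTTP API. A minimal sketch of the same request against the `POST /api/generate` endpoint (the endpoint and its fields are Ollama's; the helper names are mine):

```python
import json
from urllib import request

def build_payload(prompt: str, model: str = "llama3.2") -> dict:
    """Request body for Ollama's /api/generate endpoint."""
    # stream=False returns one JSON object instead of a stream of chunks,
    # which is easier to handle in simple scripts.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, host: str = "http://localhost:11434") -> str:
    """Send a prompt to a local Ollama instance and return the reply text."""
    req = request.Request(
        f"{host}/api/generate",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# generate("Summarize what a reverse proxy does in one sentence.")
```

Anything that can issue an HTTP POST can use your local model this way, which is exactly how n8n's Ollama node works.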
Connecting n8n to Ollama
Once both services are running:
- Open n8n at `http://localhost:5678`
- Go to Credentials → Add Credential → Ollama
- Set the base URL to `http://ollama:11434` (or `http://host.docker.internal:11434` if running Ollama on the host)
- Save, and the connection is live
You now have a local LLM available as a node in any n8n workflow. Drag it in, pick your model, write a prompt, and wire it to triggers and actions.
Practical Workflows
Here’s where it gets useful. These are real automations you can build in minutes.
1. Automated Log Analysis
A workflow that runs every 15 minutes, tails your server logs, sends them to Ollama for analysis, and alerts you on Discord or email only when something looks wrong.
Nodes:
- Cron Trigger — Every 15 minutes
- Execute Command — `tail -n 200 /var/log/syslog`
- Ollama — Prompt: "Analyze these server logs. Identify any errors, warnings, or unusual patterns. If everything looks normal, respond with CLEAR. Otherwise, summarize the issues."
- IF — Check if response contains "CLEAR"
- Discord/Email — Send alert with the summary (only if not CLEAR)
You’ve just built an AI-powered log monitor. No SaaS subscription, no data leaving your network.
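The IF node's gate is just a string check on the model's reply. In Python terms (the function name and exact-match choice are mine, not n8n's):

```python
def should_alert(model_reply: str) -> bool:
    """Alert unless the model explicitly gave the all-clear.
    Exact-matching the stripped reply against the CLEAR sentinel avoids
    false negatives when an issue summary happens to contain the word
    'clear' (e.g. 'cleared the journal')."""
    return model_reply.strip().upper() != "CLEAR"

assert should_alert("Repeated OOM-killer events in syslog") is True
assert should_alert("CLEAR\n") is False
```

Asking the model for a fixed sentinel token, then checking it mechanically, is generally more reliable than asking it a free-form "is everything okay?" question.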
2. Email Triage Agent
Connect your email (IMAP or Gmail node) to an Ollama-powered classifier:
Nodes:
- Email Trigger — New incoming email
- Ollama — Prompt: "Classify this email into one of: URGENT, ACTION_REQUIRED, FYI, SPAM. Then write a one-line summary."
- Switch — Route by classification
- Slack/Notification — URGENT → immediate ping; ACTION_REQUIRED → daily digest; SPAM → archive
Your inbox is now triaged by AI, running entirely on your own machine.
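Local models don't always return a bare label, so it pays to parse defensively before the Switch node routes on the result. A sketch of that parsing step (helper name and fallback behavior are my choices):

```python
# The label set from the classification prompt above.
LABELS = ("URGENT", "ACTION_REQUIRED", "FYI", "SPAM")

def extract_label(model_reply: str, default: str = "FYI") -> str:
    """Return the first known label found in the reply, else a safe default.

    Falling back to FYI means a malformed reply lands in the daily digest
    rather than being silently archived or paged as urgent."""
    upper = model_reply.upper()
    for label in LABELS:
        if label in upper:
            return label
    return default

assert extract_label("URGENT. Your server invoice is overdue.") == "URGENT"
assert extract_label("I'd classify this as spam.") == "SPAM"
assert extract_label("no recognizable label here") == "FYI"
```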
3. RSS Feed Summarizer
Stay on top of news without doomscrolling:
Nodes:
- RSS Feed Trigger — Monitor your favorite feeds
- Ollama — Prompt: "Summarize this article in 2–3 bullet points. Flag if it’s relevant to: [your topics]."
- Filter — Only keep relevant articles
- Notion/Markdown File — Append to a daily digest document
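The final node just appends each summary to a dated digest. If you go the markdown-file route, the logic amounts to something like this (path layout and entry format are illustrative assumptions):

```python
from datetime import date
from pathlib import Path

def append_to_digest(title: str, bullets: list[str],
                     digest_dir: Path = Path("digests")) -> Path:
    """Append one summarized article to today's markdown digest file."""
    digest_dir.mkdir(exist_ok=True)
    path = digest_dir / f"{date.today().isoformat()}.md"
    entry = f"## {title}\n" + "".join(f"- {b}\n" for b in bullets) + "\n"
    with path.open("a") as f:  # append mode: one file accumulates all day
        f.write(entry)
    return path

# append_to_digest("New Ollama release", ["Faster startup", "New model format"])
```

One file per day keeps the digest easy to review each morning and trivial to sync into Notion, Obsidian, or anything else that reads markdown.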
4. Document Q&A with RAG
This is where Qdrant comes in. Ingest your documents (PDFs, notes, manuals) into the vector database, then ask questions in natural language:
Nodes:
- Chat Trigger — You ask a question
- Vector Store Retriever — Find relevant document chunks from Qdrant
- Ollama — Prompt: "Based on the following context, answer the user’s question. If the context doesn’t contain the answer, say so."
- Response — Return the answer
Your private, searchable knowledge base—powered by your own hardware.
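Qdrant handles this at scale, but the retrieval step itself is just nearest-neighbor search over embedding vectors. A toy illustration with hand-made 3-dimensional vectors (a real workflow would embed chunks with an embedding model served by Ollama, and the vectors would have hundreds of dimensions):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: angle between two vectors, ignoring magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec: list[float], chunks: list[dict], top_k: int = 2) -> list[str]:
    """Rank document chunks by similarity to the query embedding."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in ranked[:top_k]]

chunks = [
    {"text": "Backups run nightly at 02:00 via restic.", "vec": [0.9, 0.1, 0.0]},
    {"text": "The NAS uses ZFS with weekly scrubs.",      "vec": [0.2, 0.8, 0.1]},
    {"text": "Traefik terminates TLS for all services.",  "vec": [0.1, 0.1, 0.9]},
]
# A query embedding close to the first chunk's direction:
print(retrieve([0.85, 0.2, 0.05], chunks, top_k=1))
```

The retrieved chunks are then pasted into the Ollama prompt as context, which is all "RAG" means at its core.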
The AI Agent Node: Autonomous Workflows
Everything above uses Ollama as a processing step in a linear workflow. But n8n’s AI Agent node goes further: it lets the LLM decide what to do next.
You give the agent:
- A goal (e.g., “Investigate why disk usage spiked”)
- A set of tools (shell commands, API calls, database queries)
- Access to your local LLM via Ollama
The agent then reasons about the problem, selects the right tools, executes them, analyzes the results, and loops until the task is complete. This is the same agentic pattern that powers tools like Claude Code and GitHub Copilot Workspace—except it’s running on your hardware, with your data, under your control.
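Stripped to its skeleton, that agentic pattern is a loop: ask the model which tool to use, run it, feed the result back, and stop when the model answers. A minimal sketch with a stubbed model (the real AI Agent node handles prompting, tool schemas, and output parsing for you; everything here is illustrative):

```python
def run_agent(goal: str, tools: dict, model, max_steps: int = 5) -> str:
    """Loop: the model picks a tool or finishes; we execute and record results."""
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        decision = model(history)           # e.g. {"tool": "df"} or {"answer": "..."}
        if "answer" in decision:
            return decision["answer"]
        result = tools[decision["tool"]]()  # execute the chosen tool
        history.append(f"{decision['tool']} -> {result}")
    return "Step limit reached without an answer"

# Stub model: inspect disk usage first, then answer based on what it saw.
def stub_model(history):
    if len(history) == 1:
        return {"tool": "df"}
    return {"answer": "Disk spike caused by /var/log at 93% usage"}

tools = {"df": lambda: "/var/log 93% used"}
print(run_agent("Investigate why disk usage spiked", tools, stub_model))
```

The `max_steps` cap matters: an agent with shell access and no step limit is a footgun, locally or otherwise.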
Tips for Production Use
- Use quantized models — 4-bit or 6-bit quantizations give you 80–90% of full-precision quality at a fraction of the memory. For automated tasks (not creative writing), this is a no-brainer.
- Pin model versions — Use a specific tag like `llama3.2:7b-q4_K_M` instead of `llama3.2:latest`. You don’t want a model update to silently change your workflow’s behavior.
- Set timeouts — LLM inference on CPU can be slow for large inputs. Configure n8n node timeouts to avoid hanging workflows.
- Monitor with Uptime Kuma — Point it at Ollama’s `/api/tags` endpoint to ensure the service is healthy.
- Put it behind Traefik — If you want to access n8n remotely, route it through your existing reverse proxy with HTTPS and authentication.
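Beyond a plain up/down check, you can verify that the specific model your workflows depend on is actually installed by parsing the `/api/tags` response (the endpoint and its `models` field are Ollama's; the helper function is mine, and the sample payload is trimmed to the relevant field):

```python
import json

def model_available(tags_json: str, model: str) -> bool:
    """Check an /api/tags response body for a specific model name."""
    models = json.loads(tags_json).get("models", [])
    # Prefix match so "llama3.2" also matches "llama3.2:latest".
    return any(m["name"].startswith(model) for m in models)

sample = '{"models": [{"name": "llama3.2:latest"}, {"name": "qwen3:14b"}]}'
assert model_available(sample, "llama3.2")
assert not model_available(sample, "mistral")
```

A small cron job running this check can alert you before a workflow fails at 3 a.m. because a model was pruned or renamed.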
Wrapping Up
The combination of Ollama and n8n is the most practical AI setup for a homelab in 2026. It’s not a toy—it’s a genuinely useful system that automates real work, runs on modest hardware, costs nothing after the initial setup, and keeps your data entirely under your control.
Start with one workflow. The log analyzer or RSS summarizer are good first projects—simple enough to build in 15 minutes, useful enough that you’ll actually keep them running. Once you see what’s possible, you’ll find yourself wiring AI into everything.