The difference between an AI assistant and an AI agent is action. An assistant answers questions. An agent completes tasks — reading documents, searching the web, running code, calling APIs, storing results, and repeating until the objective is achieved.
Cloud AI providers have been building agentic capabilities for the past year: OpenAI’s Codex runs autonomous coding tasks; Anthropic’s Computer Use navigates applications; Google’s agents interact with Workspace tools. All of these send your tasks to external servers and require trust in the provider’s infrastructure.
Ollama makes local agents possible. Using Ollama as the reasoning backbone with appropriate tools attached, you can build autonomous agents that work entirely on your hardware — reading your files, executing your code, calling your internal APIs, and completing multi-step tasks without any data leaving your machine.
This guide builds three levels of local agents: no-code automation with n8n, Python agents with LangChain, and stateful multi-agent systems with LangGraph.
🔗 This is Post #18 in the Ollama Unlocked series. For the Ollama API that powers these agents, see Post #9. For building applications on top of agent outputs, see Building AI Apps With Ollama (Post #11).
What Makes a Good Local Agent Model
Not every model is well-suited to agentic tasks. Agents require:
Tool calling / function calling: The model must reliably output structured JSON to invoke tools. Models not trained for this produce inconsistent structured output.
Instruction adherence: Agents need models that follow complex multi-step instructions without drifting.
Multi-step reasoning: Planning a 5-step task and executing each step correctly requires stronger reasoning than answering a single question.
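One way to make the tool-calling requirement concrete is to measure how often a model's raw output parses as a usable tool call. The sketch below is illustrative only: the `{"name": ..., "arguments": {...}}` shape, the tool names, and the sample outputs are assumptions made for the example, not a format that Ollama or any specific model guarantees.

```python
import json

def is_valid_tool_call(raw: str, known_tools: dict) -> bool:
    """True if raw is a JSON object naming a known tool with exactly the expected argument names."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(call, dict):
        return False
    name = call.get("name")
    args = call.get("arguments")
    if not isinstance(name, str) or name not in known_tools:
        return False
    if not isinstance(args, dict):
        return False
    return set(args) == known_tools[name]

def tool_call_success_rate(outputs: list, known_tools: dict) -> float:
    """Fraction of raw model outputs that are valid tool calls."""
    if not outputs:
        return 0.0
    valid = sum(is_valid_tool_call(o, known_tools) for o in outputs)
    return valid / len(outputs)

# Hypothetical tool registry: tool name -> required argument names
tools = {"web_search": {"query"}, "read_file": {"path"}}

# Hypothetical raw outputs, standing in for real model responses
samples = [
    '{"name": "web_search", "arguments": {"query": "fusion energy"}}',
    'Sure! I will search: {"name": "web_search"}',        # prose around JSON
    '{"name": "read_file", "arguments": {"path": "/tmp/a.txt"}}',
    '{"name": "delete_everything", "arguments": {}}',     # unknown tool
]
print(tool_call_success_rate(samples, tools))  # 0.5
```

Running the same prompts against two candidate models and comparing the rates gives a rough, task-specific signal before committing to one of them.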
Best models for local agents (May 2026):
# Best overall agentic capability
ollama pull kimi-k2.6 # MIT licensed, strong tool use
# Best for agentic coding specifically
ollama pull devstral:24b # Purpose-built for software engineering agents
# Best balance of speed and capability
ollama pull qwen3.6:27b # Strong tool calling, reliable instruction following
# Lightweight option for simpler agents
ollama pull llama4:scout # Fast, good instruction following
Level 1: No-Code Agents With n8n
n8n is a workflow automation tool similar to Zapier but self-hosted, open-source, and with native Ollama integration. It lets you build AI agents visually — no programming required.
Installing n8n
# Docker (recommended)
docker run -it --rm \
  --name n8n \
  -p 5678:5678 \
  -v ~/.n8n:/home/node/.n8n \
  n8nio/n8n
# Open http://localhost:5678
Connecting Ollama to n8n
- In n8n, go to Settings → Credentials → Add Credential
- Search for “Ollama”
- Set URL: http://host.docker.internal:11434 (if n8n is in Docker) or http://localhost:11434
- Save
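Before saving the credential, it can help to confirm that the URL actually reaches Ollama. The following is a minimal sketch that queries Ollama's model-listing endpoint (`/api/tags`) using only the standard library. The helper name `list_ollama_models` is made up for this example, and the URL should be whichever of the two above matches your setup.

```python
import json
import urllib.error
import urllib.request
from typing import Optional

def list_ollama_models(base_url: str, timeout: float = 5.0) -> Optional[list]:
    """Return the model names the Ollama server reports, or None if it is unreachable."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, DNS failure, timeout, or a non-JSON response
        return None
    if not isinstance(payload, dict):
        return None
    return [m.get("name", "") for m in payload.get("models", [])]

if __name__ == "__main__":
    models = list_ollama_models("http://localhost:11434")
    if models is None:
        print("Ollama is not reachable at this URL")
    else:
        print("Available models:", models)
```

If the call returns None from inside the n8n container but works on the host, the container is probably not resolving the host address, which points to the URL choice rather than to Ollama itself.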
Building a Local Research Agent in n8n
This agent: receives a research topic → searches the web → summarizes findings → saves to a file.
n8n workflow nodes:
[Webhook/Manual Trigger]
↓
[Set Node: Define topic variable]
↓
[HTTP Request: DuckDuckGo Search API]
↓
[Ollama Chat Model: Summarize search results]
↓
[Write Binary File: Save summary to disk]
↓
[Respond to Webhook: Return summary]
The Ollama node configuration:
- Model: llama4:scout
- System prompt:
```
You are a research assistant. You receive raw search results and produce a concise, accurate research summary. Structure your output as:
SUMMARY: [2-3 sentence overview]
KEY FINDINGS:
- [Finding 1]
- [Finding 2]
- [Finding 3]
SOURCES: [URLs from the search results]
LIMITATIONS: [What this research does not cover]
```
n8n Agentic Loop Pattern
For tasks requiring multiple steps and decisions:
[Trigger]
↓
[Ollama Agent Node] ←── loops back until "done"
↓ (tool calls)
[Tool Router]
├── Web Search
├── Read File
├── Write File
└── Run Code
↓ (tool results fed back to agent)
[Check: Is task complete?]
├── YES → [Output node]
└── NO → [Back to Ollama Agent]
n8n’s AI Agent node handles this loop automatically when connected to tool nodes.
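For readers who want to see what that loop amounts to, here is a rough Python sketch of the same pattern: ask the model for the next action, route it to a tool, feed the result back, and stop when the model says it is done or a step limit is hit. This is not n8n's implementation. The `decide` callable is a scripted stand-in for the model, and the action dictionary shape is an assumption made for illustration.

```python
from typing import Callable, Dict, List

def run_agent_loop(
    decide: Callable[[List[str]], dict],
    tools: Dict[str, Callable[[str], str]],
    task: str,
    max_steps: int = 5,
) -> str:
    """Repeat decide -> tool -> observe until the model reports it is done."""
    history = [f"TASK: {task}"]
    for _ in range(max_steps):
        action = decide(history)
        if action["type"] == "done":
            return action["answer"]
        tool = tools.get(action["tool"])
        if tool is None:
            # Unknown tool: report it back so the model can correct itself
            result = f"Error: unknown tool {action['tool']!r}"
        else:
            result = tool(action["input"])
        history.append(f"{action['tool']} -> {result}")
    return "Stopped: step limit reached"

def scripted_decide(history: List[str]) -> dict:
    """Hypothetical stand-in for the model: call one tool, then finish."""
    if len(history) == 1:
        return {"type": "tool", "tool": "word_count", "input": "local agents run offline"}
    return {"type": "done", "answer": f"Finished after: {history[-1]}"}

tools = {"word_count": lambda text: str(len(text.split()))}
print(run_agent_loop(scripted_decide, tools, "Count the words"))
# Finished after: word_count -> 4
```

The step limit plays the role of the "Is task complete?" check failing forever: without it, a model that never returns "done" would keep the loop running.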
Level 2: Python Agents With LangChain
LangChain provides the tools, memory, and agent frameworks for Python-based agents on top of Ollama.
Setup
pip install langchain langchain-community langchain-ollama \
duckduckgo-search wikipedia tavily-python
Basic ReAct Agent
The ReAct (Reasoning + Acting) pattern is the most common agent architecture:
# react_agent.py
from langchain_ollama import ChatOllama
from langchain.agents import create_react_agent, AgentExecutor
from langchain.tools import Tool
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_community.utilities import WikipediaAPIWrapper
from langchain import hub
import subprocess
# Initialize the local LLM
llm = ChatOllama(
model="qwen3.6:27b",
temperature=0,
num_ctx=32768
)
# Define tools the agent can use
search = DuckDuckGoSearchRun()
wikipedia = WikipediaAPIWrapper()
def run_python(code: str) -> str:
    """Execute Python code in a subprocess (30s timeout, not sandboxed) and return its output."""
    try:
        result = subprocess.run(
            ["python3", "-c", code],
            capture_output=True,
            text=True,
            timeout=30
        )
        return result.stdout or result.stderr
    except subprocess.TimeoutExpired:
        return "Error: Code execution timed out (30s limit)"
    except Exception as e:
        return f"Error: {e}"

def read_file(path: str) -> str:
    """Read a local file and return its contents."""
    try:
        with open(path.strip(), 'r', encoding='utf-8') as f:
            content = f.read()
        return content[:5000]  # Limit to 5000 chars
    except FileNotFoundError:
        return f"Error: File not found: {path}"
    except Exception as e:
        return f"Error reading file: {e}"

tools = [
Tool(
name="web_search",
func=search.run,
description="Search the web for current information. "
"Input: search query string"
),
Tool(
name="wikipedia",
func=wikipedia.run,
description="Search Wikipedia for factual information. "
"Input: topic or question"
),
Tool(
name="run_python",
func=run_python,
description="Execute Python code. Use for calculations, data processing. "
"Input: complete Python code as a string"
),
Tool(
name="read_file",
func=read_file,
description="Read a local file. "
"Input: absolute file path"
)
]
# Use the ReAct prompt template
prompt = hub.pull("hwchase17/react")
# Create the agent
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
agent=agent,
tools=tools,
verbose=True, # Show agent's reasoning
max_iterations=10,
handle_parsing_errors=True
)
def run_agent(task: str) -> str:
    """Run the agent on a task."""
    result = agent_executor.invoke({"input": task})
    return result["output"]

# Example tasks
if __name__ == "__main__":
    # Research task
    result = run_agent(
        "Research the current state of fusion energy in 2026. "
        "Find 3 recent developments and summarize what each means."
    )
    print(result)
    # Calculation task
    result = run_agent(
        "Calculate the compound interest on $10,000 invested for 10 years "
        "at 7% annual return, compounded monthly. Show the formula and result."
    )
    print(result)
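The ReAct agent above depends on the model writing plain text in a fixed layout (an "Action:" line followed by "Action Input:", or a "Final Answer:" line), which the framework then parses. The sketch below shows roughly what that parsing involves and why a small deviation in the model's text becomes a parsing error. It is a simplified illustration, not LangChain's actual parser, and the function name is made up.

```python
import re

ACTION_RE = re.compile(
    r"Action\s*:\s*(?P<tool>.+?)\s*\n\s*Action Input\s*:\s*(?P<input>.*)",
    re.DOTALL,
)
FINAL_RE = re.compile(r"Final Answer\s*:\s*(?P<answer>.*)", re.DOTALL)

def parse_react_step(text: str) -> dict:
    """Classify one model turn as a final answer, a tool action, or a parse error."""
    final = FINAL_RE.search(text)
    if final:
        return {"kind": "final", "answer": final.group("answer").strip()}
    action = ACTION_RE.search(text)
    if action:
        return {
            "kind": "action",
            "tool": action.group("tool").strip(),
            "input": action.group("input").strip(),
        }
    # The model ignored the expected layout; the caller must recover somehow
    return {"kind": "error", "raw": text.strip()}

print(parse_react_step("Thought: I need data\nAction: web_search\nAction Input: fusion energy 2026"))
print(parse_react_step("Thought: done\nFinal Answer: 42"))
print(parse_react_step("I'll just search the web for that."))
```

The third call returns an error record because the text never names an action. This is the situation that `handle_parsing_errors=True` is there to absorb in the executor above.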
Tool-Calling Agent (Function Calling)
Models with native tool-calling support (Qwen3, Kimi K2.6) produce more reliable structured output than the ReAct text-based approach:
# tool_calling_agent.py
from langchain_ollama import ChatOllama
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage, SystemMessage
from typing import Annotated
import json
import requests
# Define tools using the @tool decorator
@tool
def get_weather(city: Annotated[str, "The city name to get weather for"]) -> str:
    """Get current weather for a city."""
    # Using a public weather API (no key required for basic data)
    try:
        resp = requests.get(
            f"https://wttr.in/{city}?format=j1",
            timeout=10
        )
        data = resp.json()
        temp_c = data["current_condition"][0]["temp_C"]
        desc = data["current_condition"][0]["weatherDesc"][0]["value"]
        return f"{city}: {temp_c}°C, {desc}"
    except Exception as e:
        return f"Could not get weather for {city}: {e}"

@tool
def calculate(
    expression: Annotated[str, "Mathematical expression to evaluate"]
) -> str:
    """Safely evaluate a mathematical expression."""
    import ast
    import operator
    # Only allow safe operations
    allowed_ops = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
        ast.Pow: operator.pow,
        ast.USub: operator.neg,
    }

    def eval_expr(node):
        # Accept numeric literals only (string and other constants are rejected)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        elif isinstance(node, ast.BinOp):
            return allowed_ops[type(node.op)](
                eval_expr(node.left), eval_expr(node.right)
            )
        elif isinstance(node, ast.UnaryOp):
            return allowed_ops[type(node.op)](eval_expr(node.operand))
        else:
            raise ValueError(f"Unsupported operation: {type(node)}")

    try:
        tree = ast.parse(expression, mode='eval')
        result = eval_expr(tree.body)
        return str(result)
    except Exception as e:
        return f"Calculation error: {e}"

tools = [get_weather, calculate]
# Create model with tools bound
llm = ChatOllama(model="qwen3.6:27b", temperature=0)
llm_with_tools = llm.bind_tools(tools)
# Tool execution mapping
tool_map = {t.name: t for t in tools}
def run_agent_with_tools(user_message: str) -> str:
    """Run a single-turn agent with tool calling."""
    from langchain_core.messages import ToolMessage
    messages = [
        SystemMessage(content="You are a helpful assistant with access to tools."),
        HumanMessage(content=user_message)
    ]
    # Agent loop
    max_iterations = 5
    for _ in range(max_iterations):
        response = llm_with_tools.invoke(messages)
        messages.append(response)
        # Check if there are tool calls
        if not response.tool_calls:
            return response.content
        # Execute tool calls
        for tool_call in response.tool_calls:
            tool_name = tool_call["name"]
            tool_args = tool_call["args"]
            if tool_name in tool_map:
                tool_result = tool_map[tool_name].invoke(tool_args)
            else:
                tool_result = f"Error: Tool '{tool_name}' not found"
            # Add tool result to messages
            messages.append(ToolMessage(
                content=str(tool_result),
                tool_call_id=tool_call["id"]
            ))
    return "Max iterations reached without completing task"

# Test
result = run_agent_with_tools(
"What's the weather in Tokyo and Paris? "
"Also, if I have 42 items at $3.75 each, what's the total?"
)
print(result)
Level 3: Stateful Multi-Agent Systems With LangGraph
LangGraph extends LangChain with graph-based workflow control — enabling complex agent patterns with state, loops, and multiple specialized agents working together.
Setup
pip install langgraph
The Supervisor + Worker Pattern
A supervisor agent delegates to specialized worker agents:
# multi_agent.py
from langchain_ollama import ChatOllama
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated, Sequence
import operator
# Shared state for the multi-agent system
class AgentState(TypedDict):
    messages: Annotated[Sequence, operator.add]
    task: str
    subtasks: list[str]
    results: dict
    final_answer: str
    next_agent: str

# Initialize specialized models
fast_model = ChatOllama(model="llama4:scout", temperature=0)
code_model = ChatOllama(model="qwen3.6:27b", temperature=0)
research_model = ChatOllama(model="llama4:scout", temperature=0.2)
def supervisor_agent(state: AgentState) -> AgentState:
    """Breaks down the task and routes to specialized agents."""
    import json
    import re
    response = fast_model.invoke([
        SystemMessage(content="""You are a supervisor that breaks complex tasks
into subtasks and routes them to the right specialists.
Available specialists:
- researcher: web research, fact-finding, current events
- coder: writing code, debugging, technical analysis
- writer: summarizing, drafting text, editing
For the given task, decide:
1. Which specialist should handle it?
2. What specific subtask should they do?
Respond in JSON format:
{
"specialist": "researcher|coder|writer",
"subtask": "specific instruction for the specialist"
}"""),
        HumanMessage(content=f"Task: {state['task']}")
    ])
    # Extract JSON from response
    json_match = re.search(r'\{.*\}', response.content, re.DOTALL)
    if json_match:
        try:
            routing = json.loads(json_match.group())
        except json.JSONDecodeError:
            routing = {}  # Malformed JSON: fall back to the defaults below
        specialist = routing.get("specialist", "writer")
        # An unknown specialist name has no edge in the graph, so default to writer
        if specialist not in ("researcher", "coder", "writer"):
            specialist = "writer"
        return {
            **state,
            "next_agent": specialist,
            "subtasks": [routing.get("subtask", state["task"])]
        }
    return {**state, "next_agent": "writer"}

def researcher_agent(state: AgentState) -> AgentState:
    """Handles research tasks."""
    subtask = state["subtasks"][0] if state["subtasks"] else state["task"]
    response = research_model.invoke([
        SystemMessage(content="""You are a research specialist.
Provide accurate, well-organized research findings.
Note any uncertainty in your information."""),
        HumanMessage(content=subtask)
    ])
    return {
        **state,
        "results": {**state.get("results", {}), "research": response.content},
        "next_agent": "writer"
    }

def coder_agent(state: AgentState) -> AgentState:
    """Handles coding tasks."""
    subtask = state["subtasks"][0] if state["subtasks"] else state["task"]
    response = code_model.invoke([
        SystemMessage(content="""You are a coding specialist.
Write clean, well-commented code. Include error handling.
Explain what the code does after the code block."""),
        HumanMessage(content=subtask)
    ])
    return {
        **state,
        "results": {**state.get("results", {}), "code": response.content},
        "next_agent": "writer"
    }

def writer_agent(state: AgentState) -> AgentState:
    """Synthesizes results into a final response."""
    context = "\n\n".join([
        f"{k.upper()}:\n{v}"
        for k, v in state.get("results", {}).items()
    ])
    response = fast_model.invoke([
        SystemMessage(content="""You are a writer who synthesizes
specialist findings into clear, complete responses."""),
        HumanMessage(content=f"""Original task: {state['task']}
Specialist findings:
{context}
Write a complete, well-organized response to the original task.""")
    ])
    return {**state, "final_answer": response.content}

def route_agent(state: AgentState) -> str:
    """Determines which agent to call next."""
    return state.get("next_agent", "writer")

# Build the graph
workflow = StateGraph(AgentState)
# Add nodes
workflow.add_node("supervisor", supervisor_agent)
workflow.add_node("researcher", researcher_agent)
workflow.add_node("coder", coder_agent)
workflow.add_node("writer", writer_agent)
# Add edges
workflow.set_entry_point("supervisor")
workflow.add_conditional_edges(
"supervisor",
route_agent,
{
"researcher": "researcher",
"coder": "coder",
"writer": "writer"
}
)
workflow.add_edge("researcher", "writer")
workflow.add_edge("coder", "writer")
workflow.add_edge("writer", END)
# Compile
app = workflow.compile()
def run_multi_agent(task: str) -> str:
    """Run the multi-agent system on a task."""
    initial_state = AgentState(
        messages=[],
        task=task,
        subtasks=[],
        results={},
        final_answer="",
        next_agent=""
    )
    result = app.invoke(initial_state)
    return result["final_answer"]

# Test
if __name__ == "__main__":
    answer = run_multi_agent(
        "Write a Python function to detect if a string is a palindrome, "
        "including tests and a brief explanation of the algorithm."
    )
    print(answer)
Practical Agent Use Cases
Document Processing Agent
# Processes a folder of documents automatically
task = """
Process all PDF files in /documents/contracts/ and for each one:
1. Extract the contract parties (who signed it)
2. Find the expiration date
3. Identify the three most important obligations
4. Flag any unusual clauses
Save results to /documents/contract_summary.json
"""
Automated Code Review Agent
# Reviews a git diff and creates a PR comment
task = """
Read the git diff in /tmp/pr_diff.txt
Review it for: security issues, logic errors, missing error handling
Write a structured code review comment suitable for posting on GitHub
Save to /tmp/review_comment.md
"""
Research and Report Agent
# Researches a topic and writes a report
task = """
Research "quantum computing enterprise adoption 2026"
Find at least 3 specific examples of enterprise use cases
Write a 500-word executive briefing suitable for a non-technical audience
Include: current state, key use cases, timeline for broader adoption
"""
Common Agent Mistakes
Mistake 1: Using models not trained for tool calling
Not all models produce reliable structured JSON for tool calls. Test tool calling specifically: some models produce inconsistent output that breaks the agent loop. Qwen3 and Kimi K2.6 are the most reliable for tool use.
Mistake 2: No iteration limits
Without an iteration limit, an agent that gets stuck in a loop can run indefinitely. Always set a maximum iteration count, as the max_iterations settings in the examples above do.
Mistake 3: Unsafe code execution
Never allow agents to run arbitrary code in production without sandboxing. The run_python tool above uses subprocess — in production, use a containerized sandbox.
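As one possible shape for such a sandbox, the sketch below runs the code inside a throwaway Docker container with networking disabled and resource caps applied. It assumes Docker is installed and that the `python:3.12-slim` image is acceptable for your workload. The function names and the specific limits are illustrative choices, not a vetted security configuration.

```python
import subprocess

def build_sandbox_command(
    code: str,
    image: str = "python:3.12-slim",
    memory: str = "256m",
    cpus: str = "1.0",
) -> list:
    """Build a `docker run` command that executes code in an isolated container."""
    return [
        "docker", "run", "--rm",
        "--network", "none",     # no network access from inside the container
        "--memory", memory,      # cap RAM
        "--cpus", cpus,          # cap CPU
        "--pids-limit", "64",    # limit process count (guards against fork bombs)
        "--read-only",           # root filesystem is read-only
        image, "python3", "-c", code,
    ]

def run_python_sandboxed(code: str, timeout: int = 30) -> str:
    """Run code in the container and return its output, or an error message."""
    try:
        result = subprocess.run(
            build_sandbox_command(code),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return f"Error: Code execution timed out ({timeout}s limit)"
    except FileNotFoundError:
        return "Error: docker executable not found"
    return result.stdout or result.stderr

print(build_sandbox_command("print(2 + 2)"))
```

Note that the timeout here stops the local `docker` client process. A production setup would also need to make sure the container itself is stopped, and would likely mount only the specific directories the agent is allowed to read.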
Mistake 4: No error handling in tools
If a tool raises an exception and there is no error handling, the agent crashes. Wrap all tool functions in try-except and return error messages the agent can reason about.
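Rather than repeating try-except in every tool, one option is a small decorator that converts any exception into an error string. The sketch below is a generic Python pattern, not part of LangChain. The `safe_tool` name and the `divide` example tool are hypothetical.

```python
import functools

def safe_tool(func):
    """Wrap a tool so exceptions come back as text the agent can read."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return str(func(*args, **kwargs))
        except Exception as exc:
            # Include the exception type so the model can tell failures apart
            return f"Error in {func.__name__}: {type(exc).__name__}: {exc}"
    return wrapper

@safe_tool
def divide(expression: str) -> float:
    """Hypothetical tool: evaluate 'a/b' for two numbers."""
    left, right = expression.split("/")
    return float(left) / float(right)

print(divide("10/4"))   # 2.5
print(divide("1/0"))    # an error string instead of a crash
print(divide("oops"))   # also an error string
```

Because `functools.wraps` preserves the function's name and docstring, the wrapped function can still be passed as `func=` to a `Tool(...)` definition like the ones earlier in this guide.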
Conclusion
Local AI agents running on Ollama give you automation capability without data leaving your machine. n8n handles no-code automation workflows. LangChain enables Python-based agents with tools. LangGraph enables complex multi-agent coordination.
The key constraint versus cloud agents is model reasoning quality — local agents are slightly less reliable on the hardest multi-step planning tasks. For straightforward automation, document processing, and research synthesis, local agents work well today.
Your next step: Install n8n with Docker. Build the research agent workflow from this guide. Give it a research task relevant to your actual work. The experience of watching an agent search, synthesize, and save results without any manual steps makes the value of agents immediately tangible.
Last updated: June 2026. LangChain and LangGraph release updates frequently. Verify current API syntax at python.langchain.com and langchain-ai.github.io/langgraph.