A Quick Setup for Running hermes agent Locally — macOS, Ollama, and qwen3.6:35b for a Fully Local Stack

Tadashi Shigeoka ·  Sat, April 18, 2026

hermes agent, published by NousResearch, is a self-improving AI agent built for people who live in the terminal. Its tagline is “The Agent That Grows With You”: it acquires skills as you use it and builds a deepening model of who you are across sessions.

For model providers, it goes beyond cloud options like Nous Portal and OpenRouter: you can point it at any OpenAI-compatible endpoint. That means with Ollama you can run a fully closed stack on local LLMs alone. This post is the shortest path to installing hermes agent on macOS and wiring it to qwen3.6:35b running locally under Ollama. No cloud API is involved at any point.

It comes down to three steps:

  1. Pull qwen3.6:35b with Ollama (prep you can run ahead of time)
  2. Install hermes agent and point it at local Ollama during hermes setup
  3. Launch the TUI

Pull qwen3.6:35b with Ollama

First, prepare the model that will serve as the backend on the Ollama side. This is prep you can finish before installing hermes, and since the download takes a while, kicking it off first cuts down on waiting later. If you don’t have Ollama yet, install the macOS build from the download page.

Pulling the model is a single command.

ollama pull qwen3.6:35b

qwen3.6:35b is a model from the Qwen3.6 series: a mixture-of-experts (A3B) build that activates roughly 3B of its ~36B total parameters per token, with a Q4_K_M quantization and a 24 GB download (see the model page). It’s tuned for the repository-level reasoning and coding fluency that matter for agentic use.

At 24 GB, running it comfortably on macOS needs headroom in unified memory. On Apple Silicon, 32 GB or more is a safe target. The MoE design keeps inference fast for its size, but the full weights still have to fit in memory, so if things are tight, step down to the smaller qwen3.6:27b.

After the pull completes, do a quick sanity check that it responds locally.

ollama run qwen3.6:35b "hi!"

Install hermes agent and point it at local Ollama during setup

While the model download runs, install hermes agent itself. For macOS, Linux, and WSL2, the fastest route is to pipe the official install script through curl (see the installation guide).

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

The only prerequisite is Git. The installer handles the rest of the dependencies for you, including Python 3.11, Node.js v22, ripgrep, and ffmpeg. It also clones the repo, creates the virtual environment, and sets up the global hermes command automatically.

Once that finishes, run the first-time setup wizard.

hermes setup

hermes setup is an interactive wizard that configures everything at once: model, terminal, gateway, tools, and agent. You can also configure the provider separately with the dedicated hermes model command, but there’s no need to split it out: point things at local Ollama right inside setup’s model step.

When the wizard asks you to choose a model provider, pick the “custom / self-hosted endpoint” option and point it at the OpenAI-compatible endpoint that Ollama exposes.

http://localhost:11434/v1

Ollama serves an OpenAI-compatible API on port 11434 by default (see the Ollama API docs), so hermes can treat it as a custom OpenAI-compatible endpoint. When prompted for an API key, note that Ollama requires no authentication locally, so any dummy string will do. For the model name, use the same qwen3.6:35b you pulled.

At this point, every request from hermes flows to local Ollama, and nothing routes through an external cloud API.

Launch the TUI

Finally, launch the terminal UI.

hermes --tui

hermes --tui brings up the graphical terminal UI instead of the classic CLI (equivalent to setting HERMES_TUI=1). It includes the full set of features for terminal dwellers: multiline editing, slash-command autocomplete, conversation history, and streaming tool output.

Send a prompt, and if the Ollama process spins up and a response comes back from your local qwen3.6:35b, the stack is complete. If something won’t connect, hermes doctor is the quick way to diagnose configuration issues, and hermes logs lets you inspect the logs.

Wrap-up

Pulling the steps together:

  • Prepare the backend model first with ollama pull qwen3.6:35b
  • Install hermes agent with curl ... install.sh | bash, then in hermes setup’s model step point the custom endpoint http://localhost:11434/v1 at local Ollama
  • Launch the TUI with hermes --tui

The appeal of this stack is running a self-improving agent entirely on your own machine, without a cloud API. Your training data and conversation history never leave the box, which makes it a convenient setup for kicking the tires in a personal workspace first.

Once you’re comfortable, you can grow the agent further: connect it to messaging platforms like Telegram or Slack with hermes gateway, or add capabilities with hermes skills. The official hermes agent docs are the place to explore those one at a time.

That’s all from the field, running hermes agent against a local Ollama.