Octofriend is an open-source, command-line coding assistant that works with popular Large Language Models (LLMs) like GPT-5, Claude 4, GLM-4.5, and Kimi K2.
It allows you to write and edit code by chatting with an AI, but with a key difference: you aren’t locked into a single model. If one AI gets stuck or gives a poor response, you can switch to another, like from GPT-5 to Claude 4, within the same session to get the job done.
Features
- Safety Mode: Default confirmation prompts for edits, with an “unchained” mode for experienced users.
- Multi-Model Support: Switch between different AI models during the same conversation without losing context.
- Custom-Trained Autofix Models: Uses specialized ML models to automatically handle tool call and code edit failures from main coding models.
- Zero Telemetry: Complete privacy protection with no data tracking or collection.
- MCP Server Integration: Connect to Model Context Protocol servers for rich data access.
- Local LLM Support: Works with self-hosted models running on your machine via Ollama or llama.cpp.
- Instruction File System: Uses OCTO.md, CLAUDE.md, or AGENTS.md files for project-specific or global rules.
- Thinking Token Management: Optimized handling of encrypted content from advanced reasoning models.
How to Use It
1. Octofriend is a Node.js package, so you’ll need Node installed. Then, install Octofriend with npm:

```sh
npm install --global octofriend
```

2. Type octofriend in your terminal to start the assistant. The first time you run it, it will guide you through setting up your first AI model by asking for an API key.
3. To use Octofriend with a local model (like one running via Ollama or llama.cpp), select Add a custom model... from the menu.
- It will prompt for your API base URL, which is typically something like http://localhost:3000.
- For the credential, you can enter any non-empty value, as most local servers don’t require an API key.
4. For project-specific instructions, create a file named OCTO.md in your project’s root directory. You can describe the project’s architecture, coding style, or any other context the AI should know. Octofriend will read this file and apply the rules to its responses.
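For instance, a minimal OCTO.md might look like the following; the contents here are purely illustrative placeholders, and you should describe your own project instead:

```sh
# Everything below is placeholder project context; adapt it to your codebase.
cat > OCTO.md <<'EOF'
# Project context for Octo
- This is a TypeScript monorepo; packages live under packages/.
- Run `npm test` before considering a change complete.
- Prefer small, focused diffs over large rewrites.
EOF
```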
5. If an API call fails and you want to see the detailed error message, you can run the tool with an environment variable:
```sh
OCTO_VERBOSE=1 octofriend
```

Connecting Octo to MCP Servers
After you run octofriend for the first time, it creates a configuration file located at ~/.config/octofriend/octofriend.json5. To connect to an MCP server, you just need to edit this file and add an mcpServers object.
For example, to connect Octo to your Linear workspace, you would add the following to the config file:
```json5
mcpServers: {
  linear: {
    command: "npx",
    arguments: [ "-y", "mcp-remote", "https://mcp.linear.app/sse" ],
  },
},
```
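The same pattern works for other MCP servers. As a second, purely illustrative sketch, the official filesystem server could be wired up like this (the directory path is a placeholder):

```json5
mcpServers: {
  filesystem: {
    command: "npx",
    arguments: [ "-y", "@modelcontextprotocol/server-filesystem", "/path/to/your/project" ],
  },
},
```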
Using Octo with Local LLMs
If you’re an advanced user already running a local LLM API server like Ollama or llama.cpp, you can easily connect Octo to it.
There are two ways to do this:
1. Through the Octofriend Interface
This is the simplest method. When you run octofriend and the menu appears:
- Select Add a custom model....
- It will ask for your API base URL. This will likely be http://localhost:3000, or whatever port your local server uses.
- Next, it will ask for an environment variable to use as a credential. You can use any variable that isn’t empty; most local LLM servers ignore credentials, so it just needs to be present (see the sketch after this list).
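For example, you could export a throwaway variable before launching Octo; the variable name and value here are arbitrary placeholders:

```sh
# Most local servers ignore the credential entirely; it just has to be non-empty.
export LOCAL_LLM_KEY="not-a-real-key"
octofriend
```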
2. By Editing the Config File Directly
You can also add a local model by manually editing the ~/.config/octofriend/octofriend.json5 file. Add an object like this to your list of models:
```json5
{
  nickname: "My Local Llama",
  baseUrl: "http://localhost:3000",
  apiEnvVar: "ANY_NON_EMPTY_VAR",
  model: "The model string your API server uses, e.g., codellama/CodeLlama-7b-hf",
}
```

This gives you a permanent, configured option for your local model that you can switch to at any time.
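If your local server is Ollama, a sketch of the same entry might look like the following. Ollama exposes an OpenAI-compatible API, usually at http://localhost:11434/v1; the nickname, variable name, and model string below are illustrative, and the model must match one you have pulled locally:

```json5
{
  nickname: "Ollama (local)",
  baseUrl: "http://localhost:11434/v1", // Ollama's OpenAI-compatible endpoint
  apiEnvVar: "OLLAMA_KEY",              // any non-empty variable works
  model: "qwen2.5-coder",               // must match a model from `ollama list`
}
```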
Pros
- Open-Source: The entire codebase is available on GitHub, so you can inspect it and understand how it works.
- Privacy-Focused: With zero telemetry and the ability to use local LLMs, it’s a strong choice for developers who handle sensitive code.
- LLM Flexibility: You are not tied to a single AI provider or model. This freedom allows you to use the best tool for each specific task.
- Error Correction: The custom-trained models that fix minor AI mistakes are a practical feature that reduces friction during development.
- High Customization: Project-specific and global rule files give you fine-grained control over the AI’s behavior and context awareness.
Cons
- Command-Line Only: It lacks a graphical user interface, which might be a barrier for developers who are not comfortable working exclusively in the terminal.
- Initial Setup Required: You need to have your own API keys for the models you want to use and go through a configuration process for each one.
Related Resources
- Synthetic AI Provider: Privacy-focused LLM provider recommended by Octofriend developers for secure code assistance
- Model Context Protocol Documentation: Official guide for integrating MCP servers with AI tools
- Ollama Local LLM Setup: Tool for running large language models locally on your machine
FAQs
Q: Can I use Octofriend with free AI models?
A: Yes, you can use Octofriend with local models running on your machine through Ollama or llama.cpp. You can also use it with free tiers of various API providers, though usage limits may apply.
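For example, assuming Ollama is installed, getting a free local model running takes only a couple of commands (the model choice is illustrative):

```sh
ollama pull qwen2.5-coder   # download a free, locally-run coding model
ollama serve                # start the local API, if it isn't already running
octofriend                  # then select "Add a custom model..." to connect it
```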
Q: How does model switching work during conversations?
A: Octofriend maintains conversation context when switching between models. If one model gets stuck or produces poor results, you can switch to another model that might handle the specific problem better, without losing the conversation history or current project state.
Q: Is my code sent to third-party servers?
A: Only if you choose to use cloud-based API providers like OpenAI or Anthropic. With local LLMs or privacy-focused providers like Synthetic, your code never leaves your control. Octofriend itself has zero telemetry.
Q: Is Octofriend completely free?
A: Yes, Octofriend is an open-source tool and is free to use. However, you are responsible for the costs associated with the LLM APIs you connect to it, as those are billed by the respective providers (like OpenAI or Anthropic).
Q: What makes Octofriend different from GitHub Copilot?
A: The main differences are flexibility and privacy. Octofriend lets you switch between various LLMs from different providers, including local ones, while Copilot is tied to Microsoft’s ecosystem. Additionally, Octofriend has zero telemetry, offering a more private experience.
Q: Can I use Octofriend without an internet connection?
A: Yes, if you configure it to use an LLM that you are running locally on your own machine with a tool like Ollama or llama.cpp.
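For instance, a minimal offline sketch with llama.cpp might look like this; the model path is a placeholder for a GGUF file you have already downloaded:

```sh
# llama.cpp's bundled server exposes an OpenAI-compatible API (port 8080 by default).
llama-server -m ./models/your-model.gguf --port 8080
# Point Octofriend's baseUrl at http://localhost:8080 to use it.
```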