
Lynkr - Claude Code Proxy with Multi-Provider Support


Production-ready Claude Code proxy supporting 9+ LLM providers with 60-80% cost reduction through token optimization.


Overview

Lynkr is a self-hosted proxy server that unlocks Claude Code CLI and Cursor IDE by enabling:

  • 🚀 Any LLM Provider - Databricks, AWS Bedrock (100+ models), OpenRouter (100+ models), Ollama (local), llama.cpp, Azure OpenAI, Azure Anthropic, OpenAI, LM Studio
  • 💰 60-80% Cost Reduction - Built-in token optimization with smart tool selection, prompt caching, and memory deduplication
  • 🔒 100% Local/Private - Run completely offline with Ollama or llama.cpp
  • 🎯 Zero Code Changes - Drop-in replacement for Anthropic's backend
  • 🏢 Enterprise-Ready - Circuit breakers, load shedding, Prometheus metrics, health checks

Perfect for:

  • Developers who want provider flexibility and cost control
  • Enterprises needing self-hosted AI with observability
  • Privacy-focused teams requiring local model execution
  • Teams seeking 60-80% cost reduction through optimization

💰 Cost Savings

Lynkr reduces AI costs by 60-80% through intelligent token optimization:

Real-World Savings Example

Scenario: 100,000 API requests/month, 50k input tokens, 2k output tokens per request

Provider                       | Without Lynkr | With Lynkr | Monthly Savings | Annual Savings
Claude Sonnet 4.5 (Databricks) | $16,000       | $6,400     | $9,600          | $115,200
GPT-4o (OpenRouter)            | $12,000       | $4,800     | $7,200          | $86,400
Ollama (Local)                 | API costs     | $0         | $12,000+        | $144,000+
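
These figures assume the conservative 60% end of the reduction range. Working through the Databricks row:

$16,000/month × 60% = $9,600/month saved, leaving $6,400/month in spend
$9,600/month × 12   = $115,200/year saved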

How We Achieve 60-80% Cost Reduction

6 Token Optimization Phases:

  1. Smart Tool Selection (50-70% reduction)

    • Filters tools based on request type
    • Chat queries don't get file/git tools
    • Only sends relevant tools to model
  2. Prompt Caching (30-45% reduction)

    • Caches repeated prompts and system messages
    • Reuses context across conversations
    • Reduces redundant token usage
  3. Memory Deduplication (20-30% reduction)

    • Removes duplicate conversation context
    • Compresses historical messages
    • Eliminates redundant information
  4. Tool Response Truncation (15-25% reduction)

    • Truncates long tool outputs intelligently
    • Keeps only relevant portions
    • Reduces tool result tokens
  5. Dynamic System Prompts (10-20% reduction)

    • Adapts prompts to request complexity
    • Shorter prompts for simple queries
    • Full prompts only when needed
  6. Conversation Compression (15-25% reduction)

    • Summarizes old conversation turns
    • Keeps recent context detailed
    • Archives historical context

📖 Detailed Token Optimization Guide


🚀 Key Features

Multi-Provider Support (9+ Providers)

  • ✅ Cloud Providers: Databricks, AWS Bedrock (100+ models), OpenRouter (100+ models), Azure OpenAI, Azure Anthropic, OpenAI
  • ✅ Local Providers: Ollama (free), llama.cpp (free), LM Studio (free)
  • ✅ Hybrid Routing: Automatically route between local (fast/free) and cloud (powerful) based on complexity
  • ✅ Automatic Fallback: Transparent failover if the primary provider is unavailable

Cost Optimization

  • 💰 60-80% Token Reduction - 6-phase optimization pipeline
  • 💰 $77k-$115k Annual Savings - For typical enterprise usage (100k requests/month)
  • 💰 100% FREE Option - Run completely locally with Ollama or llama.cpp
  • 💰 Hybrid Routing - 65-100% cost savings by using local models for simple requests

Privacy & Security

  • 🔒 100% Local Operation - Run completely offline with Ollama/llama.cpp
  • 🔒 Air-Gapped Deployments - No internet required for local providers
  • 🔒 Self-Hosted - Full control over your data and infrastructure
  • 🔒 Local Embeddings - Private @Codebase search with Ollama/llama.cpp
  • 🔐 Policy Enforcement - Git restrictions, test requirements, web fetch controls
  • 🔐 Sandboxing - Optional Docker isolation for MCP tools

Enterprise Features

  • 🏢 Production-Ready - Circuit breakers, load shedding, graceful shutdown
  • 🏢 Observability - Prometheus metrics, structured logging, health checks
  • 🏢 Kubernetes-Ready - Liveness, readiness, startup probes
  • 🏢 High Performance - ~7µs overhead, 140K req/sec throughput
  • 🏢 Reliability - Exponential backoff, automatic retries, error resilience
  • 🏢 Scalability - Horizontal scaling, connection pooling, load balancing
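
As a quick operational sanity check, the observability endpoints can be probed with curl. The paths below are assumptions based on common conventions (Prometheus scraping /metrics, a /health liveness route); confirm the exact routes in the Lynkr deployment docs.

# Assumed conventional paths — verify against your Lynkr version
curl http://localhost:8081/health    # liveness/readiness status
curl http://localhost:8081/metrics   # Prometheus scrape target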

IDE Integration

  • ✅ Claude Code CLI - Drop-in replacement for the Anthropic backend
  • ✅ Cursor IDE - Full OpenAI API compatibility (requires Cursor Pro)
  • ✅ Continue.dev - Works with any OpenAI-compatible client
  • ✅ Cline + VS Code - Configure it like Cursor, using the OpenAI-compatible settings

Advanced Capabilities

  • 🧠 Long-Term Memory - Titans-inspired memory system with surprise-based filtering
  • 🧠 Semantic Memory - FTS5 search with multi-signal retrieval (recency, importance, relevance)
  • 🧠 Automatic Extraction - Zero-latency memory updates (<50ms retrieval, <100ms async extraction)
  • 🔧 MCP Integration - Automatic Model Context Protocol server discovery
  • 🔧 Tool Calling - Full tool support with server and client execution modes
  • 🔧 Custom Tools - Easy integration of custom tool implementations
  • 🔍 Embeddings Support - 4 options: Ollama (local), llama.cpp (local), OpenRouter, OpenAI
  • 📊 Token Tracking - Real-time usage monitoring and cost attribution

Developer Experience

  • 🎯 Zero Code Changes - Works with existing Claude Code CLI/Cursor setups
  • 🎯 Hot Reload - Development mode with auto-restart
  • 🎯 Comprehensive Logging - Structured logs with request ID correlation
  • 🎯 Easy Configuration - Environment variables or .env file
  • 🎯 Docker Support - docker-compose with GPU support
  • 🎯 400+ Tests - Comprehensive test coverage for reliability

Streaming & Performance

  • ⚡ Real-Time Streaming - Token-by-token streaming for all providers
  • ⚡ Low Latency - Minimal overhead (~7µs per request)
  • ⚡ High Throughput - 140K requests/second capacity
  • ⚡ Connection Pooling - Efficient connection reuse
  • ⚡ Prompt Caching - LRU cache with SHA-256 keying

📖 Complete Feature Documentation


Quick Start

Installation

Option 1: NPM Package (Recommended)

# Install globally
npm install -g lynkr

# Or run directly with npx
npx lynkr

Option 2: Git Clone

# Clone repository
git clone https://github.com/vishalveerareddy123/Lynkr.git
cd Lynkr

# Install dependencies
npm install

# Create .env from example
cp .env.example .env

# Edit .env with your provider credentials
nano .env

# Start server
npm start

Option 3: Homebrew (macOS/Linux)

brew tap vishalveerareddy123/lynkr
brew install lynkr
lynkr start

Option 4: Docker

docker-compose up -d

Supported Providers

Lynkr supports 9+ LLM providers:

Provider        | Type  | Models                                     | Cost   | Privacy
AWS Bedrock     | Cloud | 100+ (Claude, Titan, Llama, Mistral, etc.) | $$-$$$ | Cloud
Databricks      | Cloud | Claude Sonnet 4.5, Opus 4.5                | $$$    | Cloud
OpenRouter      | Cloud | 100+ (GPT, Claude, Llama, Gemini, etc.)    | $-$$   | Cloud
Ollama          | Local | Unlimited (free, offline)                  | FREE   | 🔒 100% Local
llama.cpp       | Local | GGUF models                                | FREE   | 🔒 100% Local
Azure OpenAI    | Cloud | GPT-4o, GPT-5, o1, o3                      | $$$    | Cloud
Azure Anthropic | Cloud | Claude models                              | $$$    | Cloud
OpenAI          | Cloud | GPT-4o, o1, o3                             | $$$    | Cloud
LM Studio       | Local | Local models with GUI                      | FREE   | 🔒 100% Local

📖 Full Provider Configuration Guide


Claude Code Integration

Configure Claude Code CLI to use Lynkr:

# Set Lynkr as backend
export ANTHROPIC_BASE_URL=http://localhost:8081
export ANTHROPIC_API_KEY=dummy

# Run Claude Code
claude "Your prompt here"

That's it! Claude Code now uses your configured provider.
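
To verify the proxy without going through the CLI, you can send a request in Anthropic Messages format directly. This assumes Lynkr exposes the standard /v1/messages route it proxies for Claude Code; use whatever model name your configured provider expects.

# Hedged sanity check against the local proxy (Anthropic Messages format)
curl http://localhost:8081/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: dummy" \
  -d '{"model": "claude-3-5-sonnet", "max_tokens": 128, "messages": [{"role": "user", "content": "Say hello from Lynkr"}]}'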

📖 Detailed Claude Code Setup


Cursor Integration

Configure Cursor IDE to use Lynkr:

  1. Open Cursor Settings

    • Mac: Cmd+, | Windows/Linux: Ctrl+,
    • Navigate to: Features → Models
  2. Configure OpenAI API Settings

    • API Key: sk-lynkr (any non-empty value)
    • Base URL: http://localhost:8081/v1
    • Model: claude-3.5-sonnet (or your provider's model)
  3. Test It

    • Chat: Cmd+L / Ctrl+L
    • Inline edits: Cmd+K / Ctrl+K
    • @Codebase search: Requires embeddings setup
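
If Cursor cannot reach the proxy, the same OpenAI-compatible endpoint can be exercised from a terminal. This sketch assumes the standard /v1/chat/completions route behind the Base URL above; the model name should match your provider configuration.

# OpenAI-compatible request against the local proxy (assumed standard route)
curl http://localhost:8081/v1/chat/completions \
  -H "content-type: application/json" \
  -H "authorization: Bearer sk-lynkr" \
  -d '{"model": "claude-3.5-sonnet", "messages": [{"role": "user", "content": "ping"}]}'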

📖 Full Cursor Setup Guide | Embeddings Configuration


Documentation

Getting Started

IDE Integration

Features & Capabilities

Deployment & Operations

Support


External Resources


Key Features Highlights

  • ✅ Multi-Provider Support - 9+ providers including local (Ollama, llama.cpp) and cloud (Bedrock, Databricks, OpenRouter)
  • ✅ 60-80% Cost Reduction - Token optimization with smart tool selection, prompt caching, memory deduplication
  • ✅ 100% Local Option - Run completely offline with Ollama/llama.cpp (zero cloud dependencies)
  • ✅ OpenAI Compatible - Works with Cursor IDE, Continue.dev, and any OpenAI-compatible client
  • ✅ Embeddings Support - 4 options for @Codebase search: Ollama (local), llama.cpp (local), OpenRouter, OpenAI
  • ✅ MCP Integration - Automatic Model Context Protocol server discovery and orchestration
  • ✅ Enterprise Features - Circuit breakers, load shedding, Prometheus metrics, K8s health checks
  • ✅ Streaming Support - Real-time token streaming for all providers
  • ✅ Memory System - Titans-inspired long-term memory with surprise-based filtering
  • ✅ Tool Calling - Full tool support with server and passthrough execution modes
  • ✅ Production Ready - Battle-tested with 400+ tests, observability, and error resilience

Architecture

┌─────────────────┐
│ Claude Code CLI │  or  Cursor IDE
└────────┬────────┘
         │ Anthropic/OpenAI Format
         ↓
┌─────────────────┐
│  Lynkr Proxy    │
│  Port: 8081     │
│                 │
│ • Format Conv.  │
│ • Token Optim.  │
│ • Provider Route│
│ • Tool Calling  │
│ • Caching       │
└────────┬────────┘
         │
         ├──→ Databricks (Claude 4.5)
         ├──→ AWS Bedrock (100+ models)
         ├──→ OpenRouter (100+ models)
         ├──→ Ollama (local, free)
         ├──→ llama.cpp (local, free)
         ├──→ Azure OpenAI (GPT-4o, o1)
         ├──→ OpenAI (GPT-4o, o3)
         └──→ Azure Anthropic (Claude)

📖 Detailed Architecture


Quick Configuration Examples

100% Local (FREE)

export MODEL_PROVIDER=ollama
export OLLAMA_MODEL=qwen2.5-coder:latest
export OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
npm start

AWS Bedrock (100+ models)

export MODEL_PROVIDER=bedrock
export AWS_BEDROCK_API_KEY=your-key
export AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0
npm start

OpenRouter (simplest cloud)

export MODEL_PROVIDER=openrouter
export OPENROUTER_API_KEY=sk-or-v1-your-key
npm start
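
The same settings can also live in a .env file (as created from .env.example in the Git Clone steps) instead of shell exports. A minimal sketch using the variable names shown above; values are placeholders:

# .env — keep exactly one provider block active
MODEL_PROVIDER=ollama
OLLAMA_MODEL=qwen2.5-coder:latest
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text

# MODEL_PROVIDER=openrouter
# OPENROUTER_API_KEY=sk-or-v1-your-key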

📖 More Examples


Contributing

We welcome contributions! Please see:


License

Apache 2.0 - See LICENSE file for details.


Community & Support


Made with ❤️ by developers, for developers.