πŸ™ŒπŸ» ka_ba 🦚

AI Agent Runtime (ka) and agent orchestration application (ba).


Features, Goals and Vision

  • Agent creation
    • β˜‘οΈ Modular system prompt dependent on tools
    • β˜‘οΈ System context
      • Add project path with initial 2-layer-deep file list (100 files max)
  • Agent runtime
    • Basic CLI interaction with LLM
    • β˜‘οΈ Tools
      • CLI
      • Generated code execution (eg. generate custom pythong/go/js code and run it in docker container as a tool output)
      • Browser
    • MCP
    • Docker container
  • LLM Providers
    • Add LM Studio, Clarifai, Google Gemini, OpenRouter
    • Custom rate-limit per provider
  • Multiagent orchestration of tasks with multiple agents
    • Agent-to-Agent calls
    • Automatic discovery (DNS/k8s)
    • Sequential / parallel workflows
  • Task processing
    • β˜‘οΈ task duplication
    • β˜‘οΈ automatic task creation from other tasks
      • prompt: to know when to split tasks
    • Parallel tool processing if possible
    • multitasking (N tasks processing at the same time)
    • priorities (focus on high-priorities first)
    • prompt: use ephemeral TODO buffer for local tasks
    • result preview: show inline artifacts (links to files, browser window)
  • Permanent thinking agent
    • Sentry mode (security, code style, refactoring, marketing)
    • Watching filesystem
  • Security:
    • Credential management for MCPs
    • limit agent access to specific folders only
    • Rate limiting (max RPM to LLM provider)
    • Max task processing
    • detect unsafe commands
    • prevent loops within task messages
    • prevent loops in tasks creating tasks
    • prevent loops in agent-to-agent calls
  • I/O
    • push notifications
    • multimodal inputs
    • audio input
    • video input
  • Portability
    • Embeddable UI: a script to place on a website to talk to one or more agents
    • Multiple agents talking within same chat window
    • Desktop app
    • Mobile app

Usage

# frontend (ba)
npm install
npm run dev
open "http://localhost:5173"

# backend
cd backend
npm install
npm run dev

Architecture

Kaba is a project combining three main components:

  1. ka: (Located in ka/) A Go-based AI agent runtime compatible with the Agent-to-Agent (A2A) communication protocol. It provides a CLI and an HTTP server for task management.
  2. ba: (Located in src/) A web-based UI (Vite/React) acting as a control layer for A2A agents, including ka.
  3. backend: (Located in backend/) A potential backend service for ba, intended for spawning local ka instances.
flowchart LR
ba["ba UI"] --"manage agents and tasks <br> graphql, subscriptions"--> backend --"spawn new process, create tasks"--> ka["ka agent"] --"evaluate task"--> LLM
ka -."use tools".-> filesystem
ka -."call MCPs (todo)".-> MCP-server
ka -."store task state as JSON file".-> disk

ba - A2A Agent UI, Control, and Orchestration Layer (src/)

ba is a web application built with Vite and React (located in the src/ directory), designed to serve as a user interface, control panel, and orchestration layer for Agent-to-Agent (A2A) compliant AI agents. It aims to provide a unified interface for interacting with various agents, including the local ka agent runtime.

Purpose

The primary goal of ba is to enable users to:

  • Manage and interact with A2A-compliant AI agents.
  • Spawn and control local instances of the ka agent.
  • Submit tasks to selected agents with various input types.
  • Monitor task status and view results, including streaming output and artifacts.
  • Eventually, orchestrate complex workflows involving multiple agents.

ba Features

ba is being developed to include the following features (see TODO.md for detailed tasks and progress):

  • Agent Management: Add, remove, and list A2A agent endpoints.
  • Local ka Control: Spawn and stop local ka agent processes with configurable settings.
  • Task Interaction: Send tasks to selected agents, handle different input types (text, files, data), and receive/display streaming responses and final results.
  • Task Monitoring: View the status and history of tasks.
  • Artifact Handling: Retrieve and display artifacts generated by agents.
  • Input Handling: Provide necessary input to agents when a task requires it.
  • Orchestration: (Planned) Define and execute workflows involving multiple agents.
  • Agent Discovery: (Planned) Fetch and display agent capabilities from their Agent Cards (/.well-known/agent.json).
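
Since an Agent Card is plain JSON served over HTTP, fetching one is straightforward. Below is a minimal Go sketch; the AgentCard field names here are assumptions for illustration, and the authoritative schema comes from the A2A specification.

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// AgentCard uses hypothetical field names for illustration; the real
// schema is defined by the A2A specification.
type AgentCard struct {
	Name        string `json:"name"`
	Description string `json:"description"`
}

func main() {
	resp, err := http.Get("http://localhost:8080/.well-known/agent.json")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var card AgentCard
	if err := json.NewDecoder(resp.Body).Decode(&card); err != nil {
		panic(err)
	}
	fmt.Printf("agent: %s (%s)\n", card.Name, card.Description)
}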

A2A Agent Requirements

For effective interaction and future orchestration capabilities within ba, A2A-compliant agents are expected to implement the following standard endpoint:

  • /agents/update (POST): This endpoint, as defined in the A2A specification, allows ba (or other control layers) to inform an agent about the presence and details of other agents in the network. Implementing this is crucial for enabling features like agent-to-agent task delegation and collaborative workflows managed by ba.
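
For illustration only, here is a sketch of how a control layer might call this endpoint from Go. The payload shape (a list of name/url records) is a guess for the sake of the example, not taken from the A2A specification:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// AgentInfo is a hypothetical payload shape; the real
// /agents/update body may differ.
type AgentInfo struct {
	Name string `json:"name"`
	URL  string `json:"url"`
}

func main() {
	peer := AgentInfo{Name: "ka-worker-1", URL: "http://localhost:8081"}
	body, _ := json.Marshal([]AgentInfo{peer})

	// Tell the agent at :8080 about the peer at :8081.
	resp, err := http.Post("http://localhost:8080/agents/update",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}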

Getting Started

Prerequisites

  • Node.js and npm/yarn/pnpm
  • Go (if you plan to spawn local ka agents)
  • An A2A-compliant agent running and accessible via a URL (e.g., a running ka instance).

ba Project Structure

  • src/: Frontend source code (React/TypeScript).
  • backend/: Placeholder for the planned backend service (see Architecture above).
  • vite.config.ts, package.json, etc. in the root directory configure the frontend build.

ka - AI Agent Runtime (ka/)

Located in the ka/ directory, this project implements a Go-based agent runtime compatible with the Agent-to-Agent (A2A) communication protocol. It provides both a command-line interface (ka) for direct interaction with a configured LLM and an HTTP server exposing A2A-compliant endpoints for task management.

The primary goal is to create a flexible and extensible runtime that can manage tasks, interact with LLMs (initially via LM Studio's OpenAI-compatible API), and potentially integrate with other tools and capabilities.

ka Features

Basic CLI interaction with LLM

Uses GEMINI_API_KEY from environment variables to talk to a Google Gemini model:

./ka --provider google "hi"
  • A2A HTTP Server:
    • Serves agent self-description at /.well-known/agent.json.
    • Implements core A2A task endpoints:
      • /tasks/send: Accepts tasks for asynchronous processing.
      • /tasks/sendSubscribe: Accepts tasks and streams responses via Server-Sent Events (SSE).
      • /tasks/status: Retrieves the status and details of a task.
      • /tasks/input: Allows providing input to tasks waiting in the input-required state.
      • /tasks/artifact: Retrieves artifacts generated by tasks.
      • /tasks/pushNotification/set: Placeholder for push notification registration.
    • Supports different input Part types (TextPart, FilePart, DataPart) for task submission (see the sketch after this list).
      • Basic handling for file:// URIs in FilePart is included.
    • Handles input-required state transitions based on LLM response markers.
  • Task Management:
    • Defines a Task model with states (submitted, working, input-required, completed, failed, canceled).
    • Includes both an InMemoryTaskStore (default, non-persistent) and a FileTaskStore (persistent, saves tasks as JSON files in _tasks/).
  • ka Command-Line Tool:
    • Provides direct interaction with the configured LLM (requires LM Studio running).
    • Supports piping input (cat file | ka).
    • Supports streaming responses (ka --stream "prompt").
    • Can output the agent's self-description (ka --describe).
  • LLM Integration:
    • Connects to OpenAI-compatible APIs (tested with LM Studio).
    • Configurable via environment variables (see llm/llm.go).
    • Supports streaming responses from the LLM.
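
As a rough illustration of the Part types mentioned above, here is a sketch of how they might look as Go structs. The field names are assumptions for illustration; the authoritative definitions live in the ka/a2a/ package.

package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical Go shapes for the three part types; the actual
// definitions in ka/a2a/ may differ.

// TextPart carries plain text input.
type TextPart struct {
	Type string `json:"type"` // "text"
	Text string `json:"text"`
}

// FilePart references file content, e.g. a file:// URI.
type FilePart struct {
	Type string `json:"type"` // "file"
	URI  string `json:"uri"`
}

// DataPart carries arbitrary structured data.
type DataPart struct {
	Type string         `json:"type"` // "data"
	Data map[string]any `json:"data"`
}

func main() {
	p := TextPart{Type: "text", Text: "Hello, world!"}
	b, _ := json.Marshal(p)
	fmt.Println(string(b)) // {"type":"text","text":"Hello, world!"}
}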

ka Getting Started

ka Prerequisites

  • Go (version 1.22 or later recommended)
  • LM Studio installed and running (lms server start) or another OpenAI-compatible API endpoint accessible for LLM interaction.

Building & testing ka

Note: These commands should be run from the ka/ directory.

cd ka
make build
make test

Running the ka A2A Server

Note: Run these commands from the project root (kaba/).

Start the server (uses in-memory task storage by default):

./ka server
# Or simply:
# ./ka

The server defaults to port 8080.

To use persistent file-based task storage:

./ka server --task-store file --task-store-path ./_ka_tasks

(This will create and use the _ka_tasks directory in the project root).
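
To make the storage idea concrete, here is a minimal sketch of a file-backed task store. The real TaskStore interface and FileTaskStore in ka/a2a/ may look different; this only illustrates the behaviour described above of saving each task as a JSON file.

package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// Task is a pared-down stand-in for the real a2a.Task model.
type Task struct {
	ID    string `json:"id"`
	State string `json:"state"`
}

// TaskStore is a hypothetical version of the storage interface.
type TaskStore interface {
	Save(t Task) error
	Load(id string) (Task, error)
}

// FileTaskStore persists each task as a JSON file in a directory.
type FileTaskStore struct{ Dir string }

func (s FileTaskStore) Save(t Task) error {
	b, err := json.Marshal(t)
	if err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(s.Dir, t.ID+".json"), b, 0o644)
}

func (s FileTaskStore) Load(id string) (Task, error) {
	var t Task
	b, err := os.ReadFile(filepath.Join(s.Dir, id+".json"))
	if err != nil {
		return t, err
	}
	err = json.Unmarshal(b, &t)
	return t, err
}

func main() {
	_ = os.MkdirAll("./_ka_tasks", 0o755)
	store := FileTaskStore{Dir: "./_ka_tasks"}
	_ = store.Save(Task{ID: "demo", State: "submitted"})
	t, _ := store.Load("demo")
	fmt.Println(t)
}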

The server exposes standard A2A endpoints:

  • http://localhost:8080/.well-known/agent.json
  • http://localhost:8080/tasks/send (POST)
  • http://localhost:8080/tasks/sendSubscribe (POST)
  • http://localhost:8080/tasks/status?id={task_id} (GET)
  • http://localhost:8080/tasks/input (POST)
  • http://localhost:8080/tasks/artifact?id={task_id}&artifact_id={artifact_id} (GET)
  • ... and others as defined in the A2A specification.
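
As a client-side sketch, submitting a task and polling its status could look like the following Go program. The request JSON is a hypothetical shape for illustration; consult the A2A specification for the exact schema.

package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Hypothetical /tasks/send body with a single text part;
	// the real schema is defined by the A2A specification.
	body := []byte(`{"message":{"parts":[{"type":"text","text":"Hello, world!"}]}}`)

	resp, err := http.Post("http://localhost:8080/tasks/send",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out)) // response should include the task id

	// Poll status for a task id (replace TASK_ID with the id returned above).
	st, err := http.Get("http://localhost:8080/tasks/status?id=TASK_ID")
	if err != nil {
		panic(err)
	}
	defer st.Body.Close()
	out, _ = io.ReadAll(st.Body)
	fmt.Println(string(out))
}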

Using the ka CLI Tool

Note: Run these commands from the project root (kaba/). Ensure the ka executable exists there.

Basic Interaction:

./ka "Hello, world!"

Streaming:

./ka --stream "Tell me a story."

Piping Input:

cat README.md | ./ka ai "Summarize this file."

Agent Description (from server):

./ka describe

Maximum Context Length:

./ka --max_context_length 4096 "Prompt requiring specific context length"

ka Configuration (Environment Variables)

The ka agent (both server and CLI) uses environment variables for configuration, primarily for connecting to the LLM:

  • LLM_API_BASE: (Required) The base URL of the OpenAI-compatible API (e.g., http://localhost:1234/v1 for LM Studio).
  • LLM_API_KEY: The API key for the LLM service (often optional for local models like LM Studio, e.g., lm-studio).
  • LLM_MODEL: The model identifier to use (e.g., local-model). If not set, ka might use a default or the first available model.
  • KA_SERVER_PORT: Port for the A2A HTTP server (defaults to 8080).
  • KA_TASK_STORE: Type of task store (memory or file, defaults to memory).
  • KA_TASK_STORE_PATH: Path for the file task store (defaults to _tasks/ relative to where ka is run).

Example .env file (place in kaba/ and source it or use a tool like direnv):

LLM_API_BASE="http://localhost:1234/v1"
LLM_API_KEY="lm-studio"
LLM_MODEL="local-model"
KA_SERVER_PORT="8081"
KA_TASK_STORE="file"
KA_TASK_STORE_PATH="./_ka_tasks"
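
For reference, here is a sketch of how these variables might be read at startup, using the defaults listed above. The actual configuration logic lives in llm/llm.go and main.go and may differ.

package main

import (
	"fmt"
	"os"
)

// getenv returns the value of key, or def if it is unset.
func getenv(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

func main() {
	apiBase := os.Getenv("LLM_API_BASE") // required, no default
	model := getenv("LLM_MODEL", "local-model")
	port := getenv("KA_SERVER_PORT", "8080")
	storeType := getenv("KA_TASK_STORE", "memory")
	storePath := getenv("KA_TASK_STORE_PATH", "_tasks/")

	fmt.Println(apiBase, model, port, storeType, storePath)
}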

ka Development (ka/ directory)

  • Code Structure:
    • main.go: Entrypoint for CLI commands (ai, server, describe) and server startup.
    • http.go: A2A HTTP server setup, routing, and agent card definition.
    • ai.go: Implementation of the ka ai CLI command logic.
    • a2a/: Package containing A2A protocol types (Task, Message, Part, Artifact), TaskStore interface/implementations, and HTTP handlers.
    • llm/: Package for LLM client abstraction and interaction.

ka Agent Workflow

The ka agent implements a sophisticated workflow for handling tasks and LLM interactions. Below is a detailed explanation of how tasks are processed, with a focus on the LLM execution flow.

Task Lifecycle

flowchart TD
    Create[Task Created] --> Submit[Task Submitted]
    Submit --> Working[Agent Working]
    Working -->|Tool Calls Detected| Tools[Execute Tools]
    Tools --> Working
    Working -->|Input Needed| InputReq[Waiting for User Input]
    InputReq -->|User Responds| Working
    Working -->|Task Finished| Complete[Task Completed]
    Working -->|Error Occurs| Failed[Task Failed]
    Working -->|User Cancels| Canceled[Task Canceled]

Tasks in ka follow a state machine pattern:

  1. submitted: Initial state when a task is created
  2. working: The agent is actively processing the task
  3. input-required: The agent needs additional input from the user
  4. completed: The task has been successfully completed
  5. failed: An error occurred during task processing
  6. canceled: The task was canceled by the user or system
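
The same lifecycle can be expressed compactly in Go. The state names below follow the list above; the transition table is an illustrative reading of the diagram, not code from ka.

package main

import "fmt"

type TaskState string

const (
	Submitted     TaskState = "submitted"
	Working       TaskState = "working"
	InputRequired TaskState = "input-required"
	Completed     TaskState = "completed"
	Failed        TaskState = "failed"
	Canceled      TaskState = "canceled"
)

// allowed mirrors the lifecycle diagram above: which states a task
// may move to from each current state.
var allowed = map[TaskState][]TaskState{
	Submitted:     {Working},
	Working:       {Working, InputRequired, Completed, Failed, Canceled},
	InputRequired: {Working, Canceled},
}

func canTransition(from, to TaskState) bool {
	for _, s := range allowed[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(Submitted, Working)) // true
	fmt.Println(canTransition(Completed, Working)) // false: terminal state
}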

LLM Execution Flow

sequenceDiagram
    participant User as User/Client
    participant Executor as Task Executor
    participant LLMHandler as HandleLLMExecution
    participant LLM as LLM Client
    participant Store as Task Store
    participant ToolMgr as Tool Dispatcher

    User->>Executor: Submit Task
    Executor->>Store: Create Task (submitted)
    Executor->>Store: Update State (working)
    Executor->>LLMHandler: Process Messages
    
    LLMHandler->>LLM: Stream Request
    LLM-->>LLMHandler: Stream Response
    
    LLMHandler->>Store: Update with Assistant Message
    
    alt Tool Calls Detected
        LLMHandler->>Store: Set State (working)
        Executor->>ToolMgr: Execute Tool Calls
        ToolMgr-->>Executor: Tool Results
        Executor->>LLMHandler: Process with Tool Results
    else Input Required
        LLMHandler->>Store: Set State (input-required)
        User->>Executor: Provide Input
        Executor->>LLMHandler: Process with Input
    else No More Work
        LLMHandler->>Store: Set State (completed)
    end
    
    Store-->>User: Return Final Result

HandleLLMExecution Function

The HandleLLMExecution function in ka/a2a/executor_llm.go is a core component that:

  1. Streams LLM Responses: Always uses streaming for real-time updates
  2. Manages Task State: Updates task state based on LLM response
  3. Parses Tool Calls: Extracts XML-formatted tool calls from LLM responses
  4. Handles Errors: Manages error conditions and state transitions

Key processing steps:

  1. Sets up a buffer and signaller to detect the first write from the LLM
  2. Calls the LLM with streaming enabled
  3. Processes the full response, extracting any tool calls in XML format
  4. Updates the task with the assistant's message
  5. Determines the next state based on:
    • Presence of tool calls (→ working)
    • Need for user input (→ input-required)
    • Completion with no further actions (→ completed)
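
Condensed into a sketch, the final state decision might look like this (names are illustrative; the actual logic is in ka/a2a/executor_llm.go and may differ):

package main

import "fmt"

type TaskState string

const (
	Working       TaskState = "working"
	InputRequired TaskState = "input-required"
	Completed     TaskState = "completed"
)

// ToolCall is a stand-in for a parsed tool invocation.
type ToolCall struct{ Name string }

// nextState sketches the decision at the end of HandleLLMExecution.
func nextState(toolCalls []ToolCall, inputMarkerFound bool) TaskState {
	switch {
	case len(toolCalls) > 0:
		return Working // tools must run, then loop back to the LLM
	case inputMarkerFound:
		return InputRequired // wait for user input via /tasks/input
	default:
		return Completed // nothing left to do
	}
}

func main() {
	fmt.Println(nextState(nil, false))                       // completed
	fmt.Println(nextState([]ToolCall{{Name: "cli"}}, false)) // working
}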

Tool Execution Cycle

flowchart LR
    Response[LLM Response] --> Parse[Extract XML Tool Calls]
    Parse --> Dispatch[Route to Tools]
    Dispatch --> Execute[Run Tool Functions]
    Execute --> Results[Collect Tool Results]
    Results --> NextCall[Next LLM Interaction]

When the LLM generates tool calls in XML format:

  1. The response is parsed to extract structured tool calls
  2. The ToolDispatcher handles routing to the appropriate tools
  3. Tool results are added to the conversation context
  4. The task remains in the "working" state for another LLM iteration
  5. This cycle continues until the LLM produces a final response with no tool calls
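
Here is a hedged sketch of the extraction step, assuming a hypothetical <tool_call> element format. The markup ka actually emits is defined by its system prompt and parser and may differ.

package main

import (
	"encoding/xml"
	"fmt"
	"regexp"
)

// toolCall assumes a hypothetical <tool_call name="..."><arg .../></tool_call>
// format; ka's actual tool-call markup may differ.
type toolCall struct {
	Name string `xml:"name,attr"`
	Args []struct {
		Key   string `xml:"key,attr"`
		Value string `xml:",chardata"`
	} `xml:"arg"`
}

var toolCallRe = regexp.MustCompile(`(?s)<tool_call.*?</tool_call>`)

// extractToolCalls pulls every tool_call element out of an LLM response.
func extractToolCalls(response string) []toolCall {
	var calls []toolCall
	for _, m := range toolCallRe.FindAllString(response, -1) {
		var c toolCall
		if err := xml.Unmarshal([]byte(m), &c); err == nil {
			calls = append(calls, c)
		}
	}
	return calls
}

func main() {
	resp := `Sure, let me check.
<tool_call name="cli"><arg key="command">ls -la</arg></tool_call>`
	for _, c := range extractToolCalls(resp) {
		fmt.Println(c.Name, c.Args)
	}
}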
