This document outlines my current AI stack and the tools I use regularly. Given how rapidly the AI landscape evolves, this stack changes often. I'm always tinkering and experimenting, but these are the components I've found particularly valuable for enhancing my productivity. You can find out more about my AI projects and thoughts on my homepage.
My AI Stack
These are the foundational elements of my AI stack. They represent the key technologies and infrastructure that support most of my AI-related activities.
I heavily rely on LLMs via API, preferring them over self-hosted options for ease of use and resource management. Using APIs allows me to avoid hardware stress and simplifies deployment.
- I use OpenRouter to consolidate billing and access a wide variety of APIs, enabling me to select models best suited for specific tasks.
- Google's Gemini 2.0 Flash is my primary go-to model due to its fast inference, large context window, and reasonable pricing. While not the best for complex reasoning, its versatility makes it suitable as the backing model for all my Assistant configurations.
- For code generation that isn't agentic, or for debugging purposes, I often turn to Qwen's models. I find Qwen's coder models particularly underrated.
- Cohere is useful for instructional text-based tasks.
- While I sometimes use Anthropic's Claude, especially Sonnet 3.7 for agentic code generation, I've found its recent performance to be inconsistent.
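OpenRouter exposes an OpenAI-compatible chat completions endpoint, which is what makes swapping models per task a one-string change. A minimal stdlib-only sketch of how I call it (the helper names and the exact model slug are illustrative assumptions; check OpenRouter's model list for current slugs):

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, system_prompt: str, user_prompt: str) -> dict:
    """Build an OpenAI-compatible chat completion payload for OpenRouter."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }

def ask(model: str, system_prompt: str, user_prompt: str) -> str:
    """Send one chat turn to OpenRouter and return the assistant's reply."""
    payload = build_request(model, system_prompt, user_prompt)
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Usage (model slug is an example, not a guarantee it still exists):
#   ask("google/gemini-2.0-flash-001", "You are concise.", "Say hello.")
```

Because the payload shape is the standard OpenAI one, the same `ask` helper works for Qwen, Cohere, or Claude models routed through OpenRouter by changing only the model slug.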
I've tested numerous AI tool frontends, and Open Web UI stands out as the most impressive; I share some of my configurations with its community. For long-term use, I recommend starting with a PostgreSQL database rather than SQLite.
While the container defaults to Chroma DB, you can configure it to use Milvus, Qdrant, or other options. I initially self-hosted purely for experimentation, but once the setup became robust enough to replace commercial tools, I re-architected it for long-term stability and began choosing components more carefully.
Transitioning to speech-to-text has been transformative! After unsatisfactory experiences with dictation software a decade ago, I've found that Whisper finally makes it reliable enough for everyday use.
I use a Whisper-based Chrome extension for speech-to-text, often for many hours per day.
For Android, the open source Futo Keyboard project shows promise, but its performance depends on local hardware.
While I recognize the use case, I prefer not to run speech-to-text or most AI models locally. On my Linux desktop, I've used generative AI tools to build simple notepads of my own that send audio to Whisper via the API.
I am developing a personal managed context data store for creating personalized AI experiences. This is a long-term project and my approach is likely to change over time. I'm using a multi-agent workflow to proactively generate contextual data. You can see some of my related projects here:
- Agentic-Context-Development-Interview-Demo
- Personal-RAG-Agent-Workflow
- My-LLM-Context-Repo-Public
- Personal-Context-Repo-Idea
The project involves creating markdown files based on interviews detailing aspects of my life. I've also used the inverse approach of putting non-contextual data through an LLM pipeline to isolate context data. These workflows can be implemented with complex agent systems like Crew AI or by creating assistants using system prompts.
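The markdown-generation step of this workflow can be sketched in a few lines. This is a simplified illustration, not the actual project code; the function names and file layout are assumptions:

```python
from pathlib import Path

def qa_to_markdown(topic: str, qa_pairs: list[tuple[str, str]]) -> str:
    """Render one interview session as a markdown context file."""
    lines = [f"# Context: {topic}", ""]
    for question, answer in qa_pairs:
        lines.append(f"## {question}")
        lines.append(answer)
        lines.append("")
    return "\n".join(lines)

def save_context(out_dir: str, topic: str, qa_pairs: list[tuple[str, str]]) -> Path:
    """Write the rendered session into a context repo directory."""
    path = Path(out_dir) / f"{topic.lower().replace(' ', '-')}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(qa_to_markdown(topic, qa_pairs))
    return path
```

The interview agent supplies the question/answer pairs; keeping the output as plain markdown files is what makes the context repo portable across RAG pipelines and frontends.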
For vector storage, I avoid OpenAI assistants to prevent vendor lock-in and instead use Qdrant to decouple my personal context data from other parts of the project.
Storing AI outputs more robustly doesn't require specialized solutions; regular databases suffice.
MongoDB and PostgreSQL are my preferred databases. PostgreSQL is especially beneficial, as it can easily be extended with PGVector.
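The PGVector extension adds a `vector` column type and distance operators directly to PostgreSQL, so relational data and embeddings live in one database. A sketch of the relevant SQL, held as Python constants (table name and embedding dimension are assumptions for illustration):

```python
# Enable the extension once per database (requires pgvector installed on the server).
CREATE_EXTENSION = "CREATE EXTENSION IF NOT EXISTS vector;"

# An ordinary relational table with an embedding column alongside regular fields.
# The dimension (768 here) must match whatever embedding model you use.
CREATE_TABLE = """
CREATE TABLE notes (
    id SERIAL PRIMARY KEY,
    body TEXT NOT NULL,
    embedding vector(768)
);
"""

# Nearest-neighbour search: `<->` is pgvector's L2 distance operator
# (`<=>` gives cosine distance, `<#>` negative inner product).
SEARCH = """
SELECT body
FROM notes
ORDER BY embedding <-> %(query_embedding)s
LIMIT 5;
"""
```

Running these through any PostgreSQL driver (e.g. psycopg) turns the same database that stores application data into the semantic-search layer, which is exactly why it avoids needing a specialized solution.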
These tools enhance and extend the capabilities of the core components, enabling more complex workflows and creative applications.
I've explored the field of AI agents and assistants, noting that many interesting projects lack well-developed frontends. You can explore some of my AI assistants in my AI Assistants Library.
I'm a strong advocate for simple system prompt-based agents and have open-sourced over 600 system prompts since discovering AI in 2024. I currently use these in Open Web UI, sharing my library with the Open Web UI community. I've also tested Dify AI, but found it less effective with such a large agent network.
While having over 600 assistants may seem excessive, it's manageable when each assistant is highly focused on a small, distinct task. For example, I have assistants for changing the persona of text, formalizing it, informalizing it, and other common writing tasks. My current focus is on orchestration and tool usage.
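The reason hundreds of assistants stay manageable is that each one reduces to a name plus a tightly scoped system prompt. A minimal sketch of the pattern (the prompts and names here are invented examples, not entries from my actual library):

```python
# Each assistant is nothing more than a tightly scoped system prompt.
ASSISTANTS = {
    "formalizer": "Rewrite the user's text in a formal register. Change nothing else.",
    "informalizer": "Rewrite the user's text in a casual register. Change nothing else.",
    "persona-pirate": "Rewrite the user's text as if spoken by a pirate.",
}

def build_messages(assistant: str, text: str) -> list[dict]:
    """Pair a piece of text with the chosen assistant's system prompt."""
    return [
        {"role": "system", "content": ASSISTANTS[assistant]},
        {"role": "user", "content": text},
    ]
```

Orchestration then becomes a routing problem: decide which assistant a task needs, build the messages, and send them to whatever backing model the frontend is configured with.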
Here's a look at some other generative AI tools I've been playing with:
I use Leonardo AI for text-to-image generation. I appreciate the diversity of models and their configurable parameters.
While I haven't explored text-to-video as extensively, I use Runway ML for creating animations from frames.
My main interest in AI systems lies in making this rapidly growing technology more effective and versatile through tool use, workflow management, and orchestration.
I use N8N to provision and orchestrate agents. I am trying out different stack combinations, prioritizing fewer components. I also like the idea of pipelines and tools within Open Web UI to enable actions on external services. I believe that we will see stack consolidation this year.
Langflow provides a user-friendly interface for visually building complex workflows with language models, making it easier to prototype and experiment with different LLM configurations.
This section covers the tools and workflows I use that leverage agentic AI principles, including my choice of IDEs and how I integrate AI with my computer usage.
I currently subscribe to Windsurf, valuing its integrated experience for agent-driven code generation, despite some recent performance issues.
I also use Aider, especially for single-script projects where precise context specification is advantageous.
I use OpenSUSE Linux as my daily desktop, which influences my choice of tools.
I've found Open Interpreter impressive for running LLMs directly within the terminal and see significant potential in this project. It requires careful provisioning before you let it debug and act on your computer, but it's worth exploring.
This repository includes a docker-compose.yaml file that encapsulates my AI stack. This setup allows for easy deployment and management of the various components.
Key Components:
- OpenWebUI: My primary frontend for interacting with LLMs.
- PostgreSQL: The main database for storing application data.
- Qdrant: A vector database essential for semantic search and RAG applications.
- Redis: Used for caching and performance optimization.
- Langflow: Facilitates workflow management for language models.
- Linkwarden: A bookmark and web content manager for research and reference.
- N8N: My chosen workflow automation platform.
- Unstructured: For extracting content from a variety of file formats.
In addition to these core services, the Docker Compose configuration includes:
- Monitoring and Backup: Glances for system monitoring and Duplicati for backups, ensuring a robust and maintainable system.
This implementation demonstrates a practical deployment of the tools and services, including necessary environment variables, volume mounts, networking configurations, and health checks.
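As a hedged illustration of how two of these services wire together, here is a minimal compose fragment in the same spirit (this is not the repository's actual file; image tags are the projects' published defaults, and the Open Web UI environment variables follow its documentation at the time of writing, so verify them against current docs):

```yaml
# Illustrative fragment only -- see the repository's docker-compose.yaml for the real setup.
services:
  openwebui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - DATABASE_URL=postgresql://openwebui:${DB_PASSWORD}@postgres:5432/openwebui
      - VECTOR_DB=qdrant
      - QDRANT_URI=http://qdrant:6333
    depends_on:
      postgres:
        condition: service_healthy
  postgres:
    image: postgres:16
    environment:
      - POSTGRES_USER=openwebui
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=openwebui
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U openwebui"]
      interval: 10s
      retries: 5
  qdrant:
    image: qdrant/qdrant
```

The health check on PostgreSQL matters here: it keeps Open Web UI from starting before the database can accept connections.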
I leverage specialized APIs in conjunction with LLMs to enhance specific tasks.
- Tavily: This search API provides relevant, up-to-date information, making it ideal for RAG applications and ensuring LLMs have access to current knowledge.
- Sonar by Perplexity: Perplexity's API delivers powerful search capabilities with built-in summarization and information synthesis, particularly effective for research and gathering comprehensive information on specific topics.
These APIs complement the LLM capabilities, enabling more robust AI applications with access to real-time data.
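Wiring a search API into an LLM workflow is mostly a matter of shaping one HTTP request. A stdlib-only sketch for Tavily (the request-body fields follow Tavily's docs, but the authentication style has varied across API versions, so treat the header here as an assumption to verify):

```python
import json
import os
import urllib.request

TAVILY_URL = "https://api.tavily.com/search"

def build_search(query: str, max_results: int = 5) -> dict:
    """Build a Tavily search request body."""
    return {"query": query, "max_results": max_results}

def search(query: str) -> list[dict]:
    """Run a Tavily search and return the result entries."""
    req = urllib.request.Request(
        TAVILY_URL,
        data=json.dumps(build_search(query)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['TAVILY_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"]
```

In a RAG setup, the returned snippets get concatenated into the LLM prompt as fresh context, which is what gives the model access to post-training-cutoff information.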
