Browser Cascading Guide

This guide shows how to use cascadeflow in browser environments for client-side AI applications.

Table of Contents

  1. Overview
  2. Security First
  3. Architecture Patterns
  4. Quick Start
  5. Examples
  6. Production Deployment
  7. Cost Tracking
  8. Best Practices
  9. Troubleshooting

Overview

cascadeflow's TypeScript library enables browser-based AI applications with the same 40-85% cost savings as the Python version.

Why Browser Cascading?

  • ✅ Lower Latency - Edge functions run globally, close to users
  • ✅ Better UX - Real-time AI responses in web apps
  • ✅ Cost Savings - Same cascade logic, 40-85% cheaper than direct API calls
  • ✅ Scalability - Serverless auto-scaling for traffic spikes

Supported Environments

| Environment        | Status         | Best For                     |
|--------------------|----------------|------------------------------|
| Node.js 18+        | ✅ Production  | Backend APIs, CLI tools      |
| Vercel Edge        | ✅ Production  | Global web apps              |
| Cloudflare Workers | ✅ Production  | Ultra-low latency            |
| Browser (direct)   | ⚠️ With proxy | When you control the proxy   |

Security First

⚠️ CRITICAL: Never expose API keys in browser code!

// ❌ NEVER DO THIS
const agent = new CascadeAgent({
  models: [
    { name: 'gpt-4o-mini', apiKey: 'sk-...' }  // ❌ Exposed to users!
  ]
});

// ✅ ALWAYS USE A BACKEND PROXY
const agent = new CascadeAgent({
  models: [
    {
      name: 'gpt-4o-mini',
      provider: 'openai',
      proxyUrl: '/api/cascade'  // ✅ API key stays on server
    }
  ]
});

Why This Matters

  • API keys in browser code can be stolen from DevTools
  • Attackers can drain your OpenAI credits
  • Keys can be scraped from bundled JavaScript

Solution: Always use a backend proxy or edge function.
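
For reference, here is a minimal sketch of what such a proxy can look like. It is illustrative only: it assumes the browser client sends OpenAI-compatible chat completion bodies to your endpoint, and the path and forwarding details are placeholders rather than cascadeflow's prescribed contract.

// api/cascade.ts — minimal proxy sketch (illustrative, not cascadeflow's prescribed contract).
// Assumes the browser client posts OpenAI-compatible chat completion bodies.
export default async function handler(req: Request): Promise<Response> {
  const body = await req.text();

  // Forward the request upstream, injecting the API key server-side.
  const upstream = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,  // key never reaches the browser
    },
    body,
  });

  // Relay the upstream response to the browser.
  return new Response(upstream.body, {
    status: upstream.status,
    headers: { 'Content-Type': 'application/json' },
  });
}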


Architecture Patterns

Pattern 1: Edge Function (Recommended)

Best for: Public web apps, global users, low latency

User Browser
    ↓
Edge Function (Vercel/Cloudflare)
    ├── cascadeflow Logic
    └── API Key (secure)
    ↓
OpenAI API

Pros:

  • Global distribution (runs close to users)
  • No infrastructure management
  • Auto-scaling
  • Secure (API keys never exposed)

Cons:

  • Vendor-specific (Vercel, Cloudflare)
  • Cold starts (minimal with edge)

Code:

// Edge function (api/chat.ts)
import { CascadeAgent } from '@cascadeflow/core';

export const config = { runtime: 'edge' };

export default async function handler(req: Request) {
  const agent = new CascadeAgent({
    models: [
      { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015, apiKey: process.env.OPENAI_API_KEY },
      { name: 'gpt-4o', provider: 'openai', cost: 0.00625, apiKey: process.env.OPENAI_API_KEY }
    ]
  });

  const { query } = await req.json();
  const result = await agent.run(query);

  return Response.json(result);
}

Pattern 2: Backend API + Browser Client

Best for: Enterprise apps, existing backends, fine-grained control

User Browser
    ↓ fetch('/api/cascade')
Backend API (Express/Fastify)
    ├── cascadeflow Logic
    └── API Key (secure)
    ↓
OpenAI API

Pros:

  • Full control over infrastructure
  • Can add custom auth, rate limiting
  • Works with any backend framework

Cons:

  • Need to manage servers
  • Single region (higher latency for global users)

Code:

// Backend (Express)
import 'dotenv/config';  // loads OPENAI_API_KEY from .env
import { CascadeAgent } from '@cascadeflow/core';
import express from 'express';

const app = express();
app.use(express.json());

const agent = new CascadeAgent({
  models: [
    { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015, apiKey: process.env.OPENAI_API_KEY },
    { name: 'gpt-4o', provider: 'openai', cost: 0.00625, apiKey: process.env.OPENAI_API_KEY }
  ]
});

app.post('/api/cascade', async (req, res) => {
  const result = await agent.run(req.body.query);
  res.json(result);
});

app.listen(3000);

// Frontend (Browser)
const response = await fetch('/api/cascade', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: 'What is TypeScript?' })
});

const result = await response.json();
console.log(`Saved ${result.savingsPercentage}%`);

Pattern 3: Multi-Provider Browser Support

Best for: Using multiple AI providers in browser environments

All cascadeflow providers automatically work in both Node.js and browser environments through runtime detection:

import { CascadeAgent } from '@cascadeflow/core';

// All providers work in browser automatically!
const agent = new CascadeAgent({
  models: [
    {
      name: 'gpt-4o-mini',
      provider: 'openai',
      cost: 0.00015,
      proxyUrl: 'https://your-proxy.com/api/openai'  // Your proxy
    },
    {
      name: 'claude-3-haiku',
      provider: 'anthropic',
      cost: 0.00075,
      proxyUrl: 'https://your-proxy.com/api/anthropic'  // Your proxy
    }
  ]
});

const result = await agent.run('Hello!');

Supported providers in browser:

  • ✅ OpenAI (automatic runtime detection)
  • ✅ Anthropic (automatic runtime detection)
  • ✅ Groq (automatic runtime detection)
  • ✅ Together AI
  • ✅ Ollama
  • ✅ HuggingFace
  • ✅ vLLM
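
For intuition, runtime detection typically comes down to checking for browser globals and routing requests through proxyUrl when no API key can safely be present. The sketch below is illustrative only; cascadeflow's actual internals may differ.

// Illustrative only — cascadeflow's actual detection logic may differ.
// A typical runtime check picks between a direct provider call (Node.js,
// key available) and a proxied call (browser, no key).
const isBrowser = typeof window !== 'undefined' && typeof document !== 'undefined';

function resolveEndpoint(model: { proxyUrl?: string }): string {
  if (isBrowser) {
    if (!model.proxyUrl) {
      throw new Error('proxyUrl is required in browser environments');
    }
    return model.proxyUrl;                 // browser: route through your proxy
  }
  return 'https://api.openai.com/v1';     // Node.js: call the provider directly
}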

Quick Start

1. Install

npm install @cascadeflow/core openai

2. Choose Your Deployment

Option A: Vercel Edge Function (60 seconds)

# Clone example
git clone https://github.com/cascadeflow/examples
cd examples/browser/vercel-edge

# Set API key
vercel env add OPENAI_API_KEY

# Deploy
vercel deploy --prod

Option B: Express Backend (5 minutes)

# Create project
mkdir my-cascade-app && cd my-cascade-app
npm init -y
npm install @cascadeflow/core openai express dotenv

# Create .env
echo "OPENAI_API_KEY=sk-..." > .env

# Create server.js (see Pattern 2 above)

# Run
node server.js

Examples

Example 1: Simple Web App

HTML:

<!DOCTYPE html>
<html>
<head>
  <title>cascadeflow Demo</title>
</head>
<body>
  <textarea id="query" placeholder="Ask anything..."></textarea>
  <button onclick="ask()">Ask AI</button>
  <div id="result"></div>
  <div id="savings"></div>

  <script>
    async function ask() {
      const query = document.getElementById('query').value;

      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ query })
      });

      const result = await response.json();

      document.getElementById('result').textContent = result.content;
      document.getElementById('savings').textContent =
        `Saved ${result.savingsPercentage}% vs best model`;
    }
  </script>
</body>
</html>

Edge Function (api/chat.ts):

import { CascadeAgent } from '@cascadeflow/core';

export const config = { runtime: 'edge' };

export default async function handler(req: Request) {
  const agent = new CascadeAgent({
    models: [
      { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015, apiKey: process.env.OPENAI_API_KEY },
      { name: 'gpt-4o', provider: 'openai', cost: 0.00625, apiKey: process.env.OPENAI_API_KEY }
    ]
  });

  const { query } = await req.json();
  const result = await agent.run(query);

  return Response.json(result);
}

Example 2: React App

import { useState } from 'react';

function CascadeChat() {
  const [query, setQuery] = useState('');
  const [result, setResult] = useState(null);

  async function handleSubmit(e) {
    e.preventDefault();

    const response = await fetch('/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query })
    });

    const data = await response.json();
    setResult(data);
  }

  return (
    <div>
      <form onSubmit={handleSubmit}>
        <input
          value={query}
          onChange={(e) => setQuery(e.target.value)}
          placeholder="Ask anything..."
        />
        <button type="submit">Ask AI</button>
      </form>

      {result && (
        <div>
          <p>{result.content}</p>
          <p>💰 Saved {result.savingsPercentage}% vs best model</p>
          <p>{result.latencyMs}ms</p>
        </div>
      )}
    </div>
  );
}

Example 3: Next.js API Route

// app/api/cascade/route.ts
import { CascadeAgent } from '@cascadeflow/core';
import { NextRequest } from 'next/server';

export async function POST(req: NextRequest) {
  const { query } = await req.json();

  const agent = new CascadeAgent({
    models: [
      { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015, apiKey: process.env.OPENAI_API_KEY },
      { name: 'gpt-4o', provider: 'openai', cost: 0.00625, apiKey: process.env.OPENAI_API_KEY }
    ]
  });

  const result = await agent.run(query);

  return Response.json(result);
}
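
A small variant worth considering: hoisting the agent to module scope lets warm serverless invocations reuse it instead of rebuilding it on every request. This is a sketch of the same route, not a required pattern; the per-request construction above also works.

// app/api/cascade/route.ts (variant) — module-scope agent, reused across warm invocations.
import { CascadeAgent } from '@cascadeflow/core';
import { NextRequest } from 'next/server';

const agent = new CascadeAgent({
  models: [
    { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015, apiKey: process.env.OPENAI_API_KEY },
    { name: 'gpt-4o', provider: 'openai', cost: 0.00625, apiKey: process.env.OPENAI_API_KEY }
  ]
});

export async function POST(req: NextRequest) {
  const { query } = await req.json();
  return Response.json(await agent.run(query));
}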

Production Deployment

Vercel

# Install Vercel CLI
npm install -g vercel

# Set environment variables
vercel env add OPENAI_API_KEY

# Deploy
vercel deploy --prod

vercel.json:

{
  "functions": {
    "api/**/*.ts": {
      "maxDuration": 30
    }
  }
}

Cloudflare Workers

# Install Wrangler
npm install -g wrangler

# Set secrets
wrangler secret put OPENAI_API_KEY

# Deploy
wrangler deploy

wrangler.toml:

name = "cascadeflow-worker"
main = "src/index.ts"
compatibility_date = "2024-01-01"

[env.production]
vars = { ENVIRONMENT = "production" }
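
The wrangler.toml above points at src/index.ts, which is not shown. A minimal sketch of that Worker follows; note that Workers expose secrets set via `wrangler secret put` through the `env` parameter of `fetch()`, not through process.env. The model list mirrors the earlier examples.

// src/index.ts — minimal Worker sketch (illustrative).
import { CascadeAgent } from '@cascadeflow/core';

export interface Env {
  OPENAI_API_KEY: string;  // set with `wrangler secret put OPENAI_API_KEY`
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const agent = new CascadeAgent({
      models: [
        { name: 'gpt-4o-mini', provider: 'openai', cost: 0.00015, apiKey: env.OPENAI_API_KEY },
        { name: 'gpt-4o', provider: 'openai', cost: 0.00625, apiKey: env.OPENAI_API_KEY }
      ]
    });

    const { query } = await req.json() as { query: string };
    const result = await agent.run(query);

    return Response.json(result);
  }
};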

Railway / Render / Fly.io

# Set environment variable
export OPENAI_API_KEY=sk-...

# Deploy (platform-specific)
railway up  # Railway
render deploy  # Render
fly deploy  # Fly.io

Cost Tracking

Display Savings to Users

const result = await agent.run(query);

// Show savings
console.log(`💰 Saved ${result.savingsPercentage}%`);
console.log(`📊 Cost: $${result.totalCost.toFixed(6)}`);
console.log(`🎯 Model: ${result.modelUsed}`);

// Note when the draft was rejected and escalated (expected behavior for hard queries)
if (!result.draftAccepted) {
  console.log('⚠️ Draft rejected, escalated to verifier');
}

Aggregate Analytics

// Track cumulative savings
let totalSaved = 0;
let totalCost = 0;

async function runWithTracking(query: string) {
  const result = await agent.run(query);

  totalCost += result.totalCost;
  totalSaved += result.costSaved || 0;

  console.log(`Total saved: $${totalSaved.toFixed(4)}`);
  console.log(`Total cost: $${totalCost.toFixed(4)}`);

  return result;
}

Best Practices

1. Rate Limiting

Protect your API from abuse:

// Simple in-memory rate limiter
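// Note: on serverless/edge platforms this Map is per-instance and resets on
// cold starts; use a shared store (e.g. Redis) for production-grade limiting.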
const rateLimiter = new Map();

export default async function handler(req: Request) {
  const ip = req.headers.get('x-forwarded-for') ?? 'unknown';
  const now = Date.now();

  if (rateLimiter.has(ip)) {
    const lastRequest = rateLimiter.get(ip);
    if (now - lastRequest < 1000) {  // 1 request per second
      return new Response('Rate limit exceeded', { status: 429 });
    }
  }

  rateLimiter.set(ip, now);

  // ... rest of handler
}

2. Error Handling

export default async function handler(req: Request) {
  try {
    const { query } = await req.json();
    const agent = new CascadeAgent({ /* config */ });
    const result = await agent.run(query);
    return Response.json(result);
  } catch (error) {
    console.error('Cascade error:', error);
    return new Response(
      JSON.stringify({ error: 'Failed to process request' }),
      { status: 500, headers: { 'Content-Type': 'application/json' } }
    );
  }
}

3. CORS Configuration

export default async function handler(req: Request) {
  // Handle preflight
  if (req.method === 'OPTIONS') {
    return new Response(null, {
      headers: {
        'Access-Control-Allow-Origin': 'https://yourdomain.com',
        'Access-Control-Allow-Methods': 'POST',
        'Access-Control-Allow-Headers': 'Content-Type',
      },
    });
  }

  const { query } = await req.json();
  const result = await agent.run(query);

  return new Response(JSON.stringify(result), {
    headers: {
      'Content-Type': 'application/json',
      'Access-Control-Allow-Origin': 'https://yourdomain.com',
    },
  });
}

4. Monitoring

import * as Sentry from '@sentry/node';

export default async function handler(req: Request) {
  const startTime = Date.now();
  const { query } = await req.json();

  try {
    const result = await agent.run(query);

    // Log metrics
    Sentry.metrics.distribution('cascade.latency', Date.now() - startTime);
    Sentry.metrics.distribution('cascade.cost', result.totalCost);
    Sentry.metrics.distribution('cascade.savings', result.savingsPercentage);

    return Response.json(result);
  } catch (error) {
    Sentry.captureException(error);
    throw error;
  }
}

Troubleshooting

"API key not found"

Solution: Set environment variable in your deployment platform:

# Vercel
vercel env add OPENAI_API_KEY

# Cloudflare
wrangler secret put OPENAI_API_KEY

# Railway
railway variables set OPENAI_API_KEY=sk-...

CORS Errors

Solution: Add proper CORS headers:

headers: {
  'Access-Control-Allow-Origin': '*',  // Or specific domain
  'Access-Control-Allow-Methods': 'POST, OPTIONS',
  'Access-Control-Allow-Headers': 'Content-Type'
}

Timeout Errors

Solution: Increase function timeout:

// vercel.json
{
  "functions": {
    "api/**/*.ts": {
      "maxDuration": 60  // 60 seconds
    }
  }
}

High Costs

Solution: Check cascade is working:

const result = await agent.run(query);

if (result.draftAccepted) {
  console.log('✅ Draft accepted (cheap)');
} else {
  console.log('⚠️ Escalated to verifier (expensive)');
  console.log('Reason:', result.rejectionReason);
}
