ZennVid is a full-stack platform for generating, processing, and serving multimedia content — video, audio, and text — using advanced AI/ML capabilities. The platform supports multiple languages, voice styles, and two primary video modes (SadTalker and Magic Video). This README documents the user and developer flows, the architecture, the APIs, and usage examples.
ZennVid enables users to create short videos from text or prompts using AI models. Users can authenticate, buy credits, pick a video mode, and produce a final video which is stored and surfaced in a dashboard. Developers can generate API keys and use platform APIs under a credit model.
- Authentication
  - Users sign up or log in via the Next.js frontend, choosing either OTP login or Google OAuth. OTPs are generated, stored temporarily, and validated; OAuth flows use the Google APIs and are validated by Express/Next.js middleware.
- Access & Roles
  - Verified users are granted access to the Dashboard and, depending on role, to the Admin and Developer portals.
  - Admins can manage users and view transactions, developer usage, and video analytics.
- Developer Portal
  - Developers can request API keys (stored in MongoDB), receive them by email (Nodemailer), and consume platform APIs (translation, audio generation, caption generation) under a credit model (e.g., 10 credits per API call). An example developer request appears after the API reference below.
- Dashboard & Credits
  - The Dashboard (Next.js) is the primary product UI, where users purchase credits (Razorpay), upload assets, choose video modes, preview jobs, and view generated videos.
- Video Modes
  - SadTalker (20 credits): input script -> XTTS voice clone generates the audio -> SadTalker (Python) generates the talking-face video -> Whisper generates captions -> FFmpeg exports the final video.
  - Magic Video (20 credits): GPT-OSS-20B generates the script -> Edge-TTS creates the audio -> stable-diffusion-xl-base-1.0 (Hugging Face) generates images -> Whisper generates captions -> FFmpeg exports the final video.
  - Both pipelines share the FFmpeg export step; a sketch of it appears after this list.
- Export, Storage & Metadata
  - Final videos are saved to Cloudinary. Video metadata (owner, title, URLs, thumbnails, duration, shares, stats) is saved to MongoDB via the Express backend (an illustrative schema also appears after this list). An API endpoint provides retrieval for the Dashboard and Feed pages.
- Share & Feed
  - Users can share generated videos to a Feed (Reel-style). The Feed page queries shared videos and renders them for browsing.
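Both modes end in the same FFmpeg export step. The sketch below shows one way to drive it from Node.js: muxing the narration audio and burning the Whisper captions into the final MP4. The file names and the `.srt` caption format are assumptions for illustration; the real pipeline lives in `ai-service/`.

```ts
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Hypothetical paths; the actual intermediate files are produced by ai-service/.
async function exportFinalVideo(
  videoPath: string,   // talking-face video (SadTalker) or image-sequence video (Magic)
  audioPath: string,   // XTTS / Edge-TTS narration
  captionsPath: string // Whisper output converted to .srt
): Promise<string> {
  const outPath = 'final.mp4';
  // Mux the narration track and burn the captions in a single pass.
  await execFileAsync('ffmpeg', [
    '-y',
    '-i', videoPath,
    '-i', audioPath,
    '-vf', `subtitles=${captionsPath}`,
    '-map', '0:v', '-map', '1:a',
    '-c:v', 'libx264', '-c:a', 'aac',
    '-shortest',
    outPath,
  ]);
  return outPath;
}
```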
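The metadata record stored per video might look like the following Mongoose schema. The field names are illustrative, inferred from the list above; the authoritative schema lives in `api-server/src`.

```ts
import { Schema, model } from 'mongoose';

// Illustrative fields only; see api-server/src for the actual schema.
const videoSchema = new Schema(
  {
    owner: { type: Schema.Types.ObjectId, ref: 'User', required: true },
    title: { type: String, required: true },
    videoUrl: { type: String, required: true },  // Cloudinary URL
    thumbnailUrl: String,
    duration: Number,                            // seconds
    shared: { type: Boolean, default: false },   // surfaced in the Feed when true
    stats: {
      views: { type: Number, default: 0 },
      shares: { type: Number, default: 0 },
    },
  },
  { timestamps: true }
);

export const Video = model('Video', videoSchema);
```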
Base URL (local development): `http://localhost:3000/api`
- POST `/api/auth/otp` — request OTP (body: `{ phoneOrEmail }`)
- POST `/api/auth/verify-otp` — verify OTP (body: `{ id, otp }`)
- POST `/api/auth/oauth/google` — handle Google OAuth callback / token exchange
- POST `/api/tts` — generate speech audio from text (XTTS/Edge-TTS)
- POST `/api/video` — submit a video generation job (SadTalker or Magic)
- GET `/api/videos` — list videos for a user or the feed
- POST `/api/videos/:id/share` — mark a video as shared to the feed
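A minimal OTP login round-trip against these endpoints could look like the sketch below. The `{ id }` field in the first response is an assumption inferred from the `verify-otp` body documented above.

```ts
const base = 'http://localhost:3000/api';

// Step 1: request an OTP. The response shape ({ id }) is assumed from the
// verify-otp body above.
const otpRes = await fetch(`${base}/auth/otp`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ phoneOrEmail: 'user@example.com' }),
});
const { id } = await otpRes.json();

// Step 2: the user reads the OTP from email/SMS and submits it for verification.
const verifyRes = await fetch(`${base}/auth/verify-otp`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ id, otp: '123456' }),
});
console.log(await verifyRes.json());
```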
Refer to the api-server folder for route implementations and schemas.
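Developer calls carry the issued API key and deduct credits per request (see the credit model above). The `x-api-key` header name below is an assumption; check the middleware in `api-server/src` for the actual one.

```ts
// Hypothetical developer call: the header name and response shape are assumptions.
// Each call deducts credits (10 per API call) from the developer's balance.
const res = await fetch('http://localhost:3000/api/tts', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-api-key': process.env.ZENNVID_API_KEY ?? '',
  },
  body: JSON.stringify({ text: 'Hello from the developer API!' }),
});
console.log(await res.json());
```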
- XTTS — voice cloning (SadTalker pipeline)
- SadTalker — talking face video generation (Python)
- GPT-OSS-20B — in-house / self-hosted script generation (Magic Video)
- Edge-TTS — audio generation for Magic Video
- stabilityai/stable-diffusion-xl-base-1.0 — image generation (Hugging Face)
- Whisper — caption generation
- FFmpeg — final export and containerization
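For the image-generation step, one possible setup is the hosted Hugging Face Inference API for `stabilityai/stable-diffusion-xl-base-1.0`; the ZennVid pipeline may invoke the model differently, and the token and prompt below are placeholders.

```ts
// Text-to-image via the hosted Hugging Face Inference API (one possible setup).
const res = await fetch(
  'https://api-inference.huggingface.co/models/stabilityai/stable-diffusion-xl-base-1.0',
  {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.HF_TOKEN ?? ''}`, // placeholder token
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ inputs: 'A neon cyberpunk city street at night' }),
  }
);
const imageBytes = Buffer.from(await res.arrayBuffer()); // raw PNG/JPEG bytes
```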
- Language: TypeScript (Node.js)
- Framework: Express.js
- Database: MongoDB (metadata, API keys, users, transactions)
- Storage: Cloudinary (videos and thumbnails)
- Payment: Razorpay (credits)
See api-server/src for implementation details and routes.
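Credit purchases run through Razorpay. Server-side, order creation with the official Node SDK looks roughly like the sketch below; the amount and receipt values are placeholders, and the real flow is in `api-server/src`.

```ts
import Razorpay from 'razorpay';

// Placeholder amount/receipt; the actual purchase flow lives in api-server/src.
const razorpay = new Razorpay({
  key_id: process.env.RAZORPAY_KEY_ID!,
  key_secret: process.env.RAZORPAY_KEY_SECRET!,
});

const order = await razorpay.orders.create({
  amount: 50000,          // smallest currency unit (paise), i.e., ₹500
  currency: 'INR',
  receipt: 'credits_0001',
});
console.log(order.id);    // handed to the Razorpay checkout on the client
```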
- Framework: Next.js (React)
- Pages: Login, Dashboard, Upload, Developer Portal, Feed, Admin
- Components: video preview, share controls, transactions, credit purchase flow
See web-client/app and web-client/components for the UI code.
```js
// POST /api/tts
const response = await fetch('/api/tts', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    text: "Hello, ZennVid!",
    voice: "en-US-AriaNeural",
    style: "realistic"
  })
});
const data = await response.json();
console.log(data.audioUrl);
```

```js
// POST /api/video
const response = await fetch('/api/video', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    mode: 'sadTalker', // or 'magic'
    script: 'Welcome to ZennVid!',
    options: { style: 'anime' }
  })
});
const data = await response.json();
console.log(data.jobId);
```

Supported styles:

- Realistic
- Anime
- Cartoon
- Cyberpunk
- Sketch
- Pixel Art
- SadTalker: 20 credits
- Magic Video: 20 credits
- Developer API calls: 10 credits per API
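One way to enforce these costs server-side is an atomic, guarded decrement, so concurrent jobs can never overdraw a balance. This is a sketch assuming a Mongoose user model with a numeric `credits` field, not the platform's actual implementation.

```ts
import { Schema, model } from 'mongoose';

// Minimal stand-in for the real user model; only the credits field matters here.
const User = model('User', new Schema({ credits: { type: Number, default: 0 } }));

// Deduct `cost` credits atomically: the filter and $inc run as one operation,
// so the balance can never go negative even under concurrent requests.
async function deductCredits(userId: string, cost: number): Promise<boolean> {
  const updated = await User.findOneAndUpdate(
    { _id: userId, credits: { $gte: cost } },
    { $inc: { credits: -cost } },
    { new: true }
  );
  return updated !== null; // false => insufficient credits, reject the job
}
```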
Refer to `audio.json` and `api-server/src/constants/Voicemappping.ts` for the full language and voice mappings.
- `web-client/` — Next.js frontend (pages & components)
- `api-server/` — Express backend, routes, schemas, and utilities
- `ai-service/` — Python helpers, XTTS/SadTalker orchestration, media pipelines
- `SadTalker/` — SadTalker model code and helpers
Note: this README presents a high-level view tied to the current code layout. For implementation details, inspect the route files and services inside `api-server/src` and the frontend components inside `web-client/components` and `web-client/app`.
