Pranab Sarkar
AI Researcher & Senior Software Engineer
Building intelligent systems at the intersection of industry-scale engineering and AI research.
About
I'm a Senior Software Developer at Walmart Global Tech with 14+ years of experience building scalable, production-grade systems. My work sits at the intersection of industry-scale engineering and AI research.
Beyond production systems, I research and publish on AI infrastructure — exploring ideas like persistent KV caching for tool-augmented LLMs, schema compression for tool use, and cognitive memory architectures for AI agents. I also build open-source MCP servers that enable multi-agent AI workflows.
Educated at Techno India (2006–2010). Recognized as a GitHub Copilot Champion. Provisional patent holder.
Research & Publications
Preprints on LLM optimization, tool use, and AI infrastructure
ContextCache
Persistent KV Cache with Content-Hash Addressing for Zero-Degradation Tool Schema Caching
A persistent KV cache system that accelerates tool-augmented LLM inference by caching the prefilled key-value states of tool schema prefixes, addressed by SHA-256 content hash. On a cache hit, only the user query needs prefilling, which reduces time-to-first-token by 6.9x (from 787 ms to 114 ms) with zero quality degradation.
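As a rough illustration of content-hash addressing, the sketch below keys a cache by the SHA-256 digest of the tool schemas, so identical schemas always map to the same entry. It is not the ContextCache implementation: `schema_key`, `PrefixCache`, and `fake_prefill` are hypothetical names, the canonical-JSON step is an assumption about how the hash input might be normalized, and an in-memory dict stands in for persistent storage of real KV tensors.

```python
import hashlib
import json


def schema_key(tool_schemas):
    """Return a SHA-256 hex digest of the tool schemas.

    Assumption: schemas are serialized as canonical JSON (sorted keys,
    no extra whitespace) so that key order does not change the hash.
    """
    canonical = json.dumps(tool_schemas, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class PrefixCache:
    """Hypothetical in-memory stand-in for a persistent KV-state store."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_prefill(self, tool_schemas, prefill):
        key = schema_key(tool_schemas)
        if key in self._store:
            self.hits += 1  # hit: the schema prefix is not prefilled again
        else:
            self.misses += 1  # miss: run the expensive prefill once and keep it
            self._store[key] = prefill(tool_schemas)
        return self._store[key]


def fake_prefill(tool_schemas):
    # Placeholder for the model's prefill pass; returns an opaque stand-in object.
    return {"kv_state_for_tools": len(tool_schemas)}


tools = [{"name": "search", "parameters": {"q": "string"}}]
same_tools_reordered = [{"parameters": {"q": "string"}, "name": "search"}]

cache = PrefixCache()
cache.get_or_prefill(tools, fake_prefill)                 # first call: miss
cache.get_or_prefill(same_tools_reordered, fake_prefill)  # same content: hit

# The reported speedup follows from the two latencies quoted above.
speedup = 787 / 114
print(cache.hits, cache.misses, round(speedup, 1))
```

Because the key is derived from content rather than from a name or version tag, any change to a schema produces a different digest and therefore a fresh cache entry.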
ToolFormerMicro
Composable Tool Schema Compression via Gated Cross-Attention
A ~428M parameter encoder-decoder model that compresses verbose tool schemas into compact 8-token gist vectors via gated cross-attention. Achieves 0.818 Tool Selection Accuracy with zero false positives across seen, held-out, and unseen tool splits.
YantrikDB
A Cognitive Memory Engine for Persistent AI Systems
An embedded cognitive memory engine that unifies five index types in a single database: vector (HNSW), knowledge graph, temporal, decay heap, and key-value. Implements multi-signal retrieval scoring with relevance-gated importance amplification. Written in 16,000 lines of Rust.
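One way to read "relevance-gated importance amplification" is that a memory's importance only boosts its score once the memory is relevant enough to the query. The sketch below illustrates that reading and nothing more: the formula, the `retrieval_score` name, and the `half_life_days`, `gate`, and `boost` parameters are all assumptions, not YantrikDB's actual scoring or API.

```python
def retrieval_score(similarity, age_days, importance,
                    half_life_days=30.0, gate=0.5, boost=1.0):
    """Combine three signals into one score (illustrative formula only).

    similarity: query-to-memory relevance in [0, 1]
    age_days:   time since the memory was stored
    importance: stored importance in [0, 1]
    """
    # Temporal signal: exponential decay with an assumed half-life.
    recency = 0.5 ** (age_days / half_life_days)
    score = similarity * recency
    # Gate: importance amplifies the score only for sufficiently relevant memories.
    if similarity >= gate:
        score *= 1.0 + boost * importance
    return score


# (text, similarity, age_days, importance); made-up example values
memories = [
    ("user prefers dark mode", 0.8, 30.0, 1.0),
    ("unrelated but important note", 0.3, 0.0, 1.0),
    ("recent relevant chat", 0.6, 0.0, 0.0),
]

ranked = sorted(
    memories,
    key=lambda m: retrieval_score(m[1], m[2], m[3]),
    reverse=True,
)
for text, *_ in ranked:
    print(text)
```

With these values the important but irrelevant note ranks last, because the gate keeps its importance from inflating a low-relevance match.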
SDF
Convert Once, Consume Many: Cacheable, Typed Semantic Extraction from Web Pages
An open, schema-validated JSON protocol for publishing pre-extracted, agent-oriented semantic representations of web content. A pipeline of fine-tuned 1.5B and 3B models achieves a 4.1x latency reduction versus a 14B baseline, with 90% exact extraction accuracy.
Open Source
MCP servers and developer tools for AI agent workflows
brainstorm-mcp
MCP server for multi-round AI brainstorming debates between multiple models (GPT, DeepSeek, Groq, Ollama, etc.)
- Multi-round cross-model debates
- Supports GPT, DeepSeek, Groq, Ollama
ClawBrain
AI memory and personalization system that enables truly personalized AI-human communication with evolving personality traits and real-time mood detection.
- Evolving personality traits
- Encrypted local-first storage
saga-mcp
A Jira-like project tracker MCP server for AI agents. SQLite-backed with 22 built-in tools.
- 22 built-in tools
- SQLite-backed storage
MCP Registry Portal
Enterprise-grade web portal for managing MCP applications with multi-environment support, API key management, and security provider integration.
- JWT auth with role-based access
- 5 secret management backends
Cognitive Memory Engine with Instinct-Driven Proactive Behavior, Unified In-Process Companion Runtime, and Contradiction-Aware Adaptive Retrieval
63/991,357
SARKAR-2026-001
February 26, 2026
Pranab Sarkar
Get in touch
Let's Connect
Interested in research collaboration, open-source contributions, or discussing AI infrastructure? I'd love to hear from you.