Senior Software Developer @ Walmart Global Tech

Pranab Sarkar

AI Researcher & Senior Software Engineer

Building intelligent systems at the intersection of industry-scale engineering and AI research.

4 Research Papers

14+ Years Experience

1 Patent

View Research GitHub LinkedIn

About

I'm a Senior Software Developer at Walmart Global Tech with 14+ years of experience building scalable, production-grade systems. My work sits at the intersection of industry-scale engineering and AI research.

Beyond production systems, I research and publish on AI infrastructure — exploring ideas like persistent KV caching for tool-augmented LLMs, schema compression for tool use, and cognitive memory architectures for AI agents. I also build open-source MCP servers that enable multi-agent AI workflows.

Educated at Techno India (2006–2010). Recognized as a GitHub Copilot Champion. Provisional patent holder.

Research & Publications

Preprints on LLM optimization, tool use, and AI infrastructure

01 · Feb 2026

preprint

ContextCache

Persistent KV Cache with Content-Hash Addressing for Zero-Degradation Tool Schema Caching

6.9x TTFT speedup

A persistent KV cache system that accelerates tool-augmented LLM inference by caching prefilled key-value states of tool schema prefixes with SHA-256 content-hash addressing. On cache hits, only the user query requires prefilling, reducing time-to-first-token by 6.9x (787ms to 114ms) with zero quality degradation.

LLM Inference, KV Cache, Tool Use

Read paper
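
The content-hash addressing described for ContextCache can be illustrated with a small sketch. This is not the paper's implementation: `schema_key`, `PrefixCache`, and the `prefill` callable are hypothetical names, the store is an in-memory dictionary rather than a persistent one, and the cached "state" is an opaque stand-in for real key-value tensors. The only part taken from the description is the idea of keying cached prefill work by a SHA-256 hash of the tool schema content.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List


def canonical_form(tool_schemas: List[Dict[str, Any]]) -> str:
    """Serialize schemas deterministically so equal content gives equal text."""
    return json.dumps(tool_schemas, sort_keys=True, separators=(",", ":"))


def schema_key(tool_schemas: List[Dict[str, Any]]) -> str:
    """Return the SHA-256 hex digest of the canonical schema text."""
    return hashlib.sha256(canonical_form(tool_schemas).encode("utf-8")).hexdigest()


class PrefixCache:
    """Hypothetical cache: content hash -> opaque prefilled state."""

    def __init__(self, prefill: Callable[[str], Any]) -> None:
        self._prefill = prefill  # assumed stand-in for the expensive prefill step
        self._store: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    def get_state(self, tool_schemas: List[Dict[str, Any]]) -> Any:
        key = schema_key(tool_schemas)
        if key in self._store:
            # Cache hit: the schema prefix does not need to be prefilled again.
            self.hits += 1
            return self._store[key]
        # Cache miss: run prefill once and remember the result under the hash.
        self.misses += 1
        state = self._prefill(canonical_form(tool_schemas))
        self._store[key] = state
        return state
```

Because the key depends only on content, two requests that carry the same schemas with different key ordering resolve to the same entry, and any change to a schema produces a new key instead of reusing stale state.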

02 · Feb 2026

preprint

ToolFormerMicro

Composable Tool Schema Compression via Gated Cross-Attention

0.818 TSA, zero false positives

A ~428M parameter encoder-decoder model that compresses verbose tool schemas into compact 8-token gist vectors via gated cross-attention. Achieves 0.818 Tool Selection Accuracy with zero false positives across seen, held-out, and unseen tool splits.

Transformer, Compression, Tool Use

Read paper
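
To give a feel for gated cross-attention as a compression step, here is a minimal pure-Python sketch. It assumes a Flamingo-style scalar `tanh` gate and omits the learned query, key, and value projections entirely; the function names and the exact gating form are assumptions for illustration, not details taken from the ToolFormerMicro paper. A fixed set of gist vectors attends over an arbitrary number of schema token embeddings, so the output size stays constant no matter how long the schema is.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def gated_cross_attention(
    gists: List[Vector], tokens: List[Vector], gate: float
) -> List[Vector]:
    """Update each gist vector with a gated summary of the schema tokens.

    Assumed form: gist + tanh(gate) * attention(gist, tokens, tokens).
    """
    scale = 1.0 / math.sqrt(len(tokens[0]))
    strength = math.tanh(gate)  # 0.0 when gate == 0.0, so the update is switched off
    updated = []
    for query in gists:
        weights = softmax([dot(query, token) * scale for token in tokens])
        attended = [
            sum(w * token[i] for w, token in zip(weights, tokens))
            for i in range(len(query))
        ]
        updated.append([q + strength * a for q, a in zip(query, attended)])
    return updated
```

With 8 gist vectors as input, the result is always 8 vectors, whether the schema contributes 50 token embeddings or 5,000.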

03 · Feb 2026

preprint

YantrikDB

A Cognitive Memory Engine for Persistent AI Systems

5 unified index types

An embedded cognitive memory engine that unifies five index types (vector via HNSW, knowledge graph, temporal, decay heap, and key-value) in a single database. Implements multi-signal retrieval scoring with relevance-gated importance amplification. Implemented in 16,000 lines of Rust.

Rust, Database, AI Memory, HNSW

Read paper
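
One way to read "multi-signal retrieval scoring with relevance-gated importance amplification" is that importance only boosts a memory's score once the memory is relevant enough to the query. The sketch below encodes that reading. The formula, the `Memory` fields, and the parameters `half_life_days`, `gate_threshold`, and `boost` are all assumptions made for illustration; the actual YantrikDB scoring function is not given on this page, and the engine itself is written in Rust rather than Python.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    similarity: float  # assumed: vector similarity to the query, in [0, 1]
    importance: float  # assumed: stored importance, in [0, 1]
    age_days: float    # assumed: time since the memory was last reinforced


def retrieval_score(
    memory: Memory,
    half_life_days: float = 30.0,
    gate_threshold: float = 0.5,
    boost: float = 1.0,
) -> float:
    """Combine relevance, recency, and importance into one score (hypothetical)."""
    # Recency signal: halves every `half_life_days`.
    recency = 0.5 ** (memory.age_days / half_life_days)
    # Relevance gate: importance counts only for sufficiently relevant memories.
    gate = 1.0 if memory.similarity >= gate_threshold else 0.0
    amplification = 1.0 + gate * boost * memory.importance
    return memory.similarity * recency * amplification
```

Under these assumptions, a highly important but off-topic memory cannot outrank a relevant one, because its importance never passes the gate.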

04 · Feb 2026

preprint

SDF

Convert Once, Consume Many: Cacheable, Typed Semantic Extraction from Web Pages

4.1x latency reduction

An open, schema-validated JSON protocol for publishing pre-extracted, agent-oriented semantic representations of web content. A fine-tuned 1.5B + 3B pipeline achieves 4.1x latency reduction versus a 14B baseline with 90% exact extraction accuracy.

Data Format, Web, Semantic Extraction

Read paper
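
The "schema-validated" part of SDF means a consumer can check a pre-extracted document before trusting it. The sketch below shows that kind of check using only the Python standard library. The field names in `REQUIRED_FIELDS` are hypothetical and do not come from the SDF specification; a real consumer would validate against the published schema instead.

```python
import json
from typing import Any, Dict, List

# Hypothetical required fields and their expected JSON types (not the real SDF schema).
REQUIRED_FIELDS = {
    "url": str,
    "type": str,
    "title": str,
    "entities": list,
}


def validate_document(doc: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors; an empty list means the document passes."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in doc:
            errors.append(f"missing field: {name}")
        elif not isinstance(doc[name], expected):
            found = type(doc[name]).__name__
            errors.append(f"{name}: expected {expected.__name__}, got {found}")
    return errors


def load_document(raw: str) -> Dict[str, Any]:
    """Parse a JSON string and raise ValueError if it fails validation."""
    doc = json.loads(raw)
    errors = validate_document(doc)
    if errors:
        raise ValueError("; ".join(errors))
    return doc
```

Validating at load time is what lets a document be converted once and consumed by many agents: each consumer can reject malformed input cheaply instead of re-extracting the page.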

Open Source

MCP servers and developer tools for AI agent workflows

brainstorm-mcp

MCP server for multi-round AI brainstorming debates between multiple models (GPT, DeepSeek, Groq, Ollama, etc.)

Multi-round cross-model debates
Supports GPT, DeepSeek, Groq, Ollama

TypeScript, MCP, Multi-Agent

View repository
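
A multi-round, cross-model debate can be pictured as a loop in which each model sees what its peers said in the previous round. The sketch below is a simplified illustration in Python, not the brainstorm-mcp code (which is a TypeScript MCP server); `run_debate`, the prompt layout, and the idea of passing models as plain callables are assumptions made to keep the example self-contained.

```python
from typing import Callable, Dict, List

# Assumed interface: a model is any function from prompt text to reply text.
Model = Callable[[str], str]


def run_debate(topic: str, models: Dict[str, Model], rounds: int) -> List[Dict[str, str]]:
    """Run a fixed number of rounds and return one {model name: reply} dict per round."""
    transcript: List[Dict[str, str]] = []
    for round_index in range(rounds):
        previous = transcript[-1] if transcript else {}
        replies: Dict[str, str] = {}
        for name, ask in models.items():
            # Show each model its peers' replies from the previous round only.
            peer_lines = [
                f"{peer}: {text}" for peer, text in previous.items() if peer != name
            ]
            prompt = "\n".join([f"Topic: {topic}", f"Round {round_index + 1}", *peer_lines])
            replies[name] = ask(prompt)
        transcript.append(replies)
    return transcript
```

In practice each callable would wrap a call to a provider such as GPT, DeepSeek, Groq, or Ollama; stub functions are enough to exercise the loop.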

ClawBrain

AI memory and personalization system that tailors AI-human communication using evolving personality traits and real-time mood detection.