Showing posts with label Anthropic. Show all posts

Sunday, 22 March 2026

Induced Demand Loop: Anthropic Sells You the Problem, Then the Solution

Anthropic built Claude Code to write your software. They have done an impressive job of making it the most widely preferred agentic coding tool: it is designed to produce good code on the first attempt, or within short iteration loops.

Now it sells Claude to review what Claude wrote. The snake has found its tail — and this is not an accident.


There is a pattern in business history that feels, the first time you notice it, like a conspiracy. A company creates a category of problem, then creates the solution, then collects rent from the gap between the two. 

Security consultancies that audited the systems they also architected. 

ERP vendors that sold implementation services for the complexity they introduced. 

Management consultants who institutionalized the inefficiencies they were paid to eliminate.

The AI era has produced its own version of this. It is more elegant than the historical ones — structurally self-reinforcing in a way the older models could only approximate. And Anthropic, with the quiet launch of code review as a product category following Claude Code, has demonstrated the loop with unusual clarity.

First, They Shipped the Generator

Claude Code is, at its core, an autonomous coding agent. It reads your codebase, writes implementations, refactors modules, scaffolds tests, and submits pull requests with the confidence of a senior engineer who has never experienced the social cost of a bad review. It is fast, tireless, and cheap. It is also — and this matters — statistically wrong in ways that are difficult to detect without reading every line it produces.



The product was sold, correctly, as a productivity multiplier. The pitch was straightforward: software engineering is bottlenecked on implementation speed, and Claude Code removes that bottleneck. Ship faster. Do more with fewer engineers. The implementation is no longer the hard part.

What this framing quietly omitted was the second-order effect. If you remove the implementation bottleneck, you do not get the same system running faster — you get a different system running under entirely new constraints. The bottleneck shifts. And the new bottleneck, almost inevitably, is verification.


The speed of generation outpaces the speed of comprehension. Code review was already the slowest lane on the engineering highway. Claude Code just added ten more lanes of traffic.

Every line that Claude Code writes must be read by someone who understands it well enough to sign off on it. That person is, in most organizations, increasingly rare. 

The engineers who remain after a round of AI-enabled headcount reduction are the ones reviewing output, not producing it. They were already stretched. Now they are reviewing five times as much code per day. Quality degrades. Bugs ship. Technical debt accumulates at the speed of token generation.


Then, They Shipped the Reviewer

The code review product is the second half of the loop. It reads the code — implicitly, the code that Claude Code wrote — and identifies issues, suggests improvements, flags security concerns, enforces architectural consistency. It is, in essence, an AI that reviews the output of a different AI trained by the same company, sold to the same customer, billed on the same invoice.

The symmetry is so clean it almost obscures the mechanism. But the mechanism is precise: Claude Code created the supply of unreviewed code. Code review created the demand for reviewing it. The company captures value on both ends of the transaction. The customer pays twice for a problem they did not have before they adopted the first product.


The Pattern, Precisely

This is not identical to the older consulting-firm model, where the problem was manufactured through advice. Here, the problem is an emergent property of the product itself. Claude Code does not intend to create review debt — it simply does, structurally, as a consequence of its own efficiency. It is the rational response to a real problem. The fact that the same company profits from both sides is not malfeasance. It is alignment.



This is what I call the induced demand pattern — AI tools that structurally generate the conditions for their own expansion. The code generation category is the clearest instance yet. Generate more code, create more review surface, sell more review tooling, use that revenue to train better generation models, which generate more code. The loop is not just self-sustaining. It is self-accelerating.


Why the Snake Eats Its Own Tail

The ancient image of a serpent consuming itself, the ouroboros, was originally a symbol of cyclical renewal. The snake does not die; it feeds itself, perpetually. This is an accurate metaphor for what Anthropic has constructed.

The model that reviews the code learns from what it reviews. The patterns it flags become training signal for the model that writes the code next time. The review product improves the generation product, which increases the volume of code requiring review, which expands the market for the review product. There is no exterior — no part of this loop that does not feed back into the loop itself.






Compare this to the classical tech platform flywheel, where more users attract more sellers who attract more users. That loop is linear in its dependencies — it requires external participants at every node. The AI coding loop is tighter. The only external participant is the engineer, and even the engineer's role is progressively compressed as each generation of the model improves. The loop internalizes its own demand generation.


Implication for Engineers

The engineer who adopts Claude Code and then adopts the code review product has not automated away two separate problems. They have enrolled in a subscription to a problem-solution pair that is jointly managed by a vendor whose revenue depends on both sides of it remaining necessary. This is not a reason to reject the tools — the productivity gains are real, and the competitive pressure to adopt them is overwhelming. But it is a reason to be precise about what is actually happening.

The skills that used to be valuable in this workflow — the ability to write clean code quickly, to hold an architectural pattern in your head while implementing it — are being hollowed out from below. The skills that survive this compression are the ones at the top of the evaluation chain: the ability to read code written by someone else (or something else) and judge it accurately. The ability to know what a correct system feels like before you have built it. The ability to detect subtle errors in logic that no statistical model will flag because no statistical model has ever understood what the code is supposed to do.


The review product is not your ally in this dynamic. It is a product that profits most when the gap between what gets generated and what is actually correct remains large enough to require continuous attention.

This is the tension that no product announcement will name directly. Code review tooling, like all automated verification, has an incentive structure that is subtly misaligned with actually closing the verification gap. 

A perfect reviewer would put itself out of business. A profitable reviewer finds just enough to flag that you keep paying — while the deeper architectural drift, the slow divergence between what the system does and what it should do, accumulates beneath the surface of any automated check.


What the Pattern Predicts

If the induced demand pattern holds — and structurally, I believe it will — the next several years of AI developer tooling will follow a predictable shape. Every tool that accelerates a phase of the engineering lifecycle will create a corresponding tool that manages the debt that acceleration produces. Test generation will be followed by test quality analysis. Documentation generation will be followed by documentation accuracy verification. Architecture suggestion will be followed by architecture review.

Each pair will be sold by the same vendors, or by vendors whose incentives are structurally identical. Each pair will be presented as the solution to a problem, while quietly sustaining the conditions that make the problem recur. The stack will grow upward, each layer extracting value from the gap created by the layer below it.

The engineers who navigate this without becoming permanently dependent on it are the ones who maintain a clear model of what the system is supposed to do — not just what it currently does. That model is not a product. It cannot be sold, automated, or subscribed to. It is built slowly, through exposure to consequences, through the experience of being wrong in ways that matter and learning why.

Judgment compounds. Skills depreciate.

Human Judgment as a Cloud Function

Anthropic is not cynically manufacturing problems. The induced demand here is emergent, not engineered. But emergent does not mean neutral. The structure rewards continued dependence, punishes the development of in-house evaluation capability, and gradually transfers the judgment function — the most valuable thing an engineering team possesses — to a vendor whose model of your system is forever incomplete.

The snake eats its tail. The tail grows back. The snake is always hungry.



Wednesday, 11 February 2026

Blind Spots in Anthropic's Agentic Coding Report

Anthropic's 2026 Agentic Coding Trends Report documents a real shift in how software gets built. The data from Rakuten, CRED, TELUS, and Zapier shows engineers increasingly orchestrating AI agents rather than writing code directly. The trend lines are clear: 60% of development work now involves AI, and output volume is rising.

But as someone building production systems with these tools, I found myself returning to what the report didn't address. Not because Anthropic's data is wrong—it isn't—but because the gaps reveal assumptions that deserve scrutiny. These aren't minor omissions. They're the difference between a marketing document and an honest assessment of where this technology actually stands.

Here are seven critical areas where the report's silence speaks louder than its claims.


The Cost Model Is Conspicuously Absent



The report asserts that "total cost of ownership decreases" as agents augment engineering capacity. There's a chart. The line goes in the right direction. What's missing is any actual cost analysis.

Running multi-agent systems at the scale Anthropic envisions requires substantial compute. A coordinated team of agents working across separate context windows, iterating over hours or days, generates significant API costs. For a well-funded enterprise, this might be absorbed easily. For smaller teams, especially those in markets with different economic realities, this is a first-order consideration.

The absence of cost modeling isn't accidental—it's strategic. Anthropic benefits when organizations focus on productivity gains rather than infrastructure costs. But builders need both sides of the equation to make informed decisions.

Without cost data, you can't calculate ROI. You can't compare agent-augmented workflows against traditional development. You can't determine which tasks justify agent delegation and which don't. The report gives you trend lines but no decision framework.

This matters particularly for the "long-running workflows" trend the report highlights. If tasks stretch across days with multiple agents maintaining state and coordinating actions, the compute bill scales accordingly. Organizations need to understand these economics before committing to these architectures.
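To make this concrete, here is a back-of-envelope cost sketch. The per-token price and the workload numbers are purely illustrative assumptions, not Anthropic's actual pricing; the point is only that cost scales multiplicatively with agent count, iteration count, and context size.

```java
// Back-of-envelope cost model for a long-running multi-agent workflow.
// The price per token and the workload numbers below are illustrative
// assumptions, not actual Anthropic pricing.
public class AgentCostModel {

    // Hypothetical blended price per million tokens (input and output averaged).
    static final double USD_PER_MILLION_TOKENS = 6.0;

    // Each agent re-processes a large context on every iteration, so total
    // tokens grow with agents * iterations * context size.
    static double estimateCostUsd(int agents, int iterations, long tokensPerIteration) {
        long totalTokens = (long) agents * iterations * tokensPerIteration;
        return totalTokens / 1_000_000.0 * USD_PER_MILLION_TOKENS;
    }

    public static void main(String[] args) {
        // Three coordinated agents, 200 iterations across a multi-day task,
        // 50k tokens of context per iteration: 30M tokens in total.
        System.out.printf("Estimated cost: $%.2f%n", estimateCostUsd(3, 200, 50_000));
    }
}
```

Even with these made-up numbers, a single multi-day workflow lands in the hundreds of dollars; run it across a team daily and the bill becomes a first-order budget line.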


The Junior Developer Paradox



The report positions role transformation optimistically: engineers evolve from implementers to orchestrators. This framing works for experienced developers who already possess deep systems knowledge. It sidesteps a harder question about how that knowledge gets built in the first place.

Consider what the report itself acknowledges through an Anthropic engineer's quote: "I'm primarily using AI in cases where I know what the answer should be or should look like. I developed that ability by doing software engineering 'the hard way.'"

This creates a structural problem. If agents handle the implementation work that traditionally builds developer intuition—debugging complex issues, understanding why certain patterns fail, developing architectural taste—where does the next generation of experienced engineers come from?

This isn't a philosophical concern about automation displacing jobs. It's a practical question about skill development pipelines. Organizations adopting the orchestrator model need engineers who can effectively direct agents. Those engineers need deep systems understanding. But if the path to developing that understanding increasingly involves reviewing agent output rather than building from scratch, the pipeline breaks.

The report assumes a steady supply of experienced engineers capable of orchestration. It doesn't address how to maintain that supply in a world where early-career development looks fundamentally different.


Failure Modes at Scale Aren't Examined



Rakuten's case study highlights "99.9% numerical accuracy" for a seven-hour autonomous coding task. This is impressive. It's also potentially misleading as a success metric.

In production systems, 99.9% accuracy can translate to hundreds or thousands of subtle bugs at scale. More importantly, agent-generated bugs differ qualitatively from human-generated ones. Traditional debugging assumes you can reconstruct the reasoning that produced the code. Agent-generated code breaks this assumption.

When code fails, the standard approach is to examine the implementation and understand what the author intended. With agent-generated code, there's no author to query and no reasoning to reconstruct. The agent followed patterns and produced output that satisfied its objectives. Understanding why the code works a certain way requires reverse-engineering rather than recall.

The report doesn't discuss what happens when agents produce code that passes tests but contains architectural flaws that only manifest under load. Or when multi-agent systems create emergent complexity that no single reviewer can fully evaluate. Or when errors compound over multi-day tasks because early decisions affect later implementation in ways the orchestrating engineer didn't anticipate.

As agent-generated code becomes a larger percentage of codebases, these failure modes need systematic study. The report treats increased output as an unqualified success. It should be examining what happens when that output fails in production.


Global Access Barriers Remain Invisible

Every case study features well-resourced organizations in developed markets: Rakuten (Japan), TELUS (Canada), CRED (venture-backed India), Zapier (US). The "democratization" trends discuss non-technical users gaining coding abilities but remain silent on geographic and economic access disparities.

Agentic coding at scale requires reliable infrastructure, API access with scalable billing, and often English language proficiency for optimal results. These requirements create structural barriers for developers in many markets.

The cost consideration from section one compounds this. If running agent workflows at meaningful scale requires substantial API spend, access becomes stratified by organizational resources. A developer at a startup in Lagos faces different constraints than one at Rakuten.

This matters because software development has been more democratized than many industries—you need a computer and internet access, not expensive capital equipment. If agentic coding raises the resource bar significantly, it doesn't democratize development. It concentrates it.

The report's vision of transformation reflects only the experience of well-funded organizations in specific markets. If this genuinely represents the future of software development, unequal access to these tools doesn't create a temporary gap. It creates stratification in who participates in that future.


Verification Doesn't Scale With Generation



The report celebrates increased output volume: more features shipped, more bugs fixed, more experiments run. It notes that 27% of AI-assisted work consists of tasks "that wouldn't have been done otherwise."

This creates a bottleneck the report doesn't examine. If output increases significantly while humans can only fully delegate 0-20% of tasks (per the report's own data), verification load increases proportionally. Someone must review the additional code. Someone must validate the architectural decisions. Someone must ensure the implementation is correct.

The report proposes "agentic quality control" as a solution—using AI to review AI-generated code. This doesn't resolve the problem; it relocates it. If you can't trust the agent to write code without review, the logical basis for trusting it to review code is unclear. You've created a verification loop that still requires human judgment at some point.

The fundamental constraint isn't code generation—agents demonstrably excel at that. The constraint is verification. Human reviewers can only evaluate so much code, especially code they didn't write and can't query about intent.

Organizations that scale output without proportionally scaling verification capacity aren't increasing velocity sustainably. They're accumulating technical debt and increasing the probability of errors reaching production.


Legal and IP Questions Are Unaddressed



When agents autonomously generate code, questions arise that the report doesn't acknowledge: Who owns the intellectual property? If agent-generated code replicates patterns from training data, who bears copyright liability? When legal teams use agents to build self-service tools (as the report highlights), what's the liability framework if those tools produce incorrect guidance?

These aren't theoretical concerns. They're active legal questions that enterprises must resolve before scaling agentic workflows to the levels Anthropic envisions. The report mentions that Anthropic's legal team built tools to streamline processes but doesn't address what happens when automated legal work produces errors.

Enterprises adopt new technologies slowly not primarily due to technical limitations but due to legal and compliance uncertainty. A forward-looking report that ignores these questions optimizes for excitement over practical adoption guidance.

Organizations need frameworks for:

  • IP ownership when agents generate substantial code independently
  • Copyright compliance when agent output may reflect training data patterns
  • Professional liability when agents augment knowledge work in regulated fields
  • Responsibility allocation when multi-agent systems make decisions over extended periods

The absence of any discussion of these points suggests they're considered solved problems. They're not.


Vendor Lock-In Isn't Mentioned

The report positions Anthropic as the infrastructure for agentic coding's future. Every case study uses Claude. Every workflow assumes access to Anthropic's tools. There's no discussion of what organizations should do to maintain strategic flexibility.

What happens when your multi-agent architecture, your long-running workflows, your team's entire development process is built around one provider's models and tools? When that provider changes pricing, when model capabilities shift, when new competitors emerge with better offerings?

Building deep dependencies on any single vendor creates strategic risk. In a market where model capabilities evolve rapidly and pricing structures change frequently, organizations need abstraction strategies.

The report understandably doesn't highlight this—Anthropic benefits from deep integration. But readers evaluating long-term adoption should be thinking carefully about portability. Today's best model becomes tomorrow's commodity. The investment is in workflows and processes, not specific model endpoints.

Organizations need to consider:

  • How to abstract agent interactions so models can be swapped
  • What standards exist for agent framework portability
  • How to structure workflows to minimize provider-specific dependencies
  • What the exit costs look like if they need to migrate
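The first bullet can be sketched in a few lines. The names here (AgentClient, PortabilityDemo) are hypothetical, not from any real framework; the idea is that application code depends on a small interface you own, so swapping providers means changing one construction site rather than every call site.

```java
// Minimal sketch of a provider-agnostic abstraction layer. The names
// (AgentClient, PortabilityDemo) are hypothetical, not from any real
// framework. Application code depends only on the interface, so a
// provider swap touches one construction site, not every call site.
public class PortabilityDemo {

    // The only surface application code is allowed to see.
    interface AgentClient {
        String complete(String prompt);
    }

    // Application logic written against the interface, never a vendor SDK.
    static String summarize(AgentClient client, String text) {
        return client.complete("Summarize: " + text);
    }

    public static void main(String[] args) {
        // Swap providers by changing this one line: a stub here, but it
        // could wrap Claude, GPT, or a local model behind the same interface.
        AgentClient stub = prompt -> "stub-response for [" + prompt + "]";
        System.out.println(summarize(stub, "vendor lock-in"));
    }
}
```

The exit cost then reduces to writing one new adapter, rather than rewriting every workflow that calls the model.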

The report envisions a future built on Anthropic's infrastructure. Strategic planning requires thinking about that future without assuming permanent vendor relationships.


What's Actually Happening

The trends Anthropic documents are real. Agentic coding is changing software development in meaningful ways. The data showing 60% AI involvement with only 0-20% full delegation is honest and valuable—it describes actual practice rather than aspirational vision.

But this is a vendor report designed to drive adoption, and it accomplishes that goal effectively. What it doesn't do is provide the complete picture builders need to make strategic decisions.

The most important questions about agentic coding in 2026 aren't about capabilities—agents demonstrably work. The questions are about economics, skill development, failure modes, access equity, verification scalability, legal frameworks, and strategic flexibility.

The agentic capabilities are impressive, but the gaps in the analysis are equally significant.

Understanding both is necessary for making informed decisions about how deeply to integrate these tools into your development process.

Wednesday, 5 March 2025

Building a Universal Java Client for Large Language Models


In today's rapidly evolving AI landscape, developers often need to work with multiple Large Language Model (LLM) providers to find the best solution for their specific use case. Whether you're exploring OpenAI's GPT models, Anthropic's Claude, or running local models via Ollama, having a unified interface can significantly simplify development and make it easier to switch between providers.

The Java LLM Client project provides exactly this: a clean, consistent API for interacting with various LLM providers through a single library. Let's explore how this library works and how you can use it in your Java applications.

Core Features

The library offers several key features that make working with LLMs easier:

  1. Unified Interface: Interact with different LLM providers through a consistent API
  2. Multiple Provider Support: Currently supports OpenAI, Anthropic, Google, Groq, and Ollama
  3. Chat Completions: Send messages and receive responses from language models
  4. Embeddings: Generate vector representations of text where supported
  5. Factory Pattern: Easily create service instances for different providers

Architecture Overview

The library is built around a few key interfaces and classes:

  • GenerativeAIService: The main interface for interacting with LLMs
  • GenerativeAIFactory: Factory interface for creating service instances
  • GenerativeAIDriverManager: Registry that manages available services
  • Provider-specific implementations in separate packages

This design follows the classic factory pattern, allowing you to:

  1. Register service factories with the GenerativeAIDriverManager
  2. Create service instances through the manager
  3. Use a consistent API to interact with different providers

Getting Started

To use the library, first add it to your Maven project:

xml
<dependency>
    <groupId>org.llm</groupId>
    <artifactId>llmapi</artifactId>
    <version>1.0.0</version>
</dependency>


Basic Usage Example

Here's how to set up and use the library:

java
// Register service providers
GenerativeAIDriverManager.registerService(OpenAIFactory.NAME, new OpenAIFactory());
GenerativeAIDriverManager.registerService(AnthropicAIFactory.NAME, new AnthropicAIFactory());
// Register more providers as needed

// Create an OpenAI service
Map<String, Object> properties = Map.of("apiKey", System.getenv("gpt_key"));
var service = GenerativeAIDriverManager.create(OpenAIFactory.NAME, "https://api.openai.com/", properties);

// Create and send a chat request
var message = new ChatMessage("user", "Hello, how are you?");
var conversation = new ChatRequest("gpt-4o-mini", List.of(message));
var reply = service.chat(conversation);
System.out.println(reply.message());

// Generate embeddings
var vector = service.embedding(new EmbeddingRequest("text-embedding-3-small", "How are you"));
System.out.println(Arrays.toString(vector.embedding()));

Working with Different Providers

OpenAI

java
Map<String, Object> properties = Map.of("apiKey", System.getenv("gpt_key"));
var service = GenerativeAIDriverManager.create(OpenAIFactory.NAME, "https://api.openai.com/", properties);

// Chat with GPT-4o mini
var conversation = new ChatRequest("gpt-4o-mini", List.of(new ChatMessage("user", "Hello, how are you?")));
var reply = service.chat(conversation);

Anthropic

java
Map<String, Object> properties = Map.of("apiKey", System.getenv("ANTHROPIC_API_KEY"));
var service = GenerativeAIDriverManager.create(AnthropicAIFactory.NAME, "https://api.anthropic.com", properties);

// Chat with Claude
var conversation = new ChatRequest("claude-3-7-sonnet-20250219", List.of(new ChatMessage("user", "Hello, how are you?")));
var reply = service.chat(conversation);

Ollama (Local Models)

java
// No API key needed for local models
Map<String, Object> properties = Map.of();
var service = GenerativeAIDriverManager.create(OllamaFactory.NAME, "http://localhost:11434", properties);

// Chat with locally hosted Llama model
var conversation = new ChatRequest("llama3.2", List.of(new ChatMessage("user", "Hello, how are you?")));
var reply = service.chat(conversation);

Under the Hood

The library uses an RPC (Remote Procedure Call) client to handle the HTTP communication with various APIs. Each provider's implementation:

  1. Creates appropriate request objects with the required format
  2. Sends requests to the corresponding API endpoints
  3. Parses responses into a consistent format
  4. Handles errors gracefully

The RpcBuilder creates proxy instances of service interfaces, handling the HTTP communication details so you don't have to.
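The post doesn't show RpcBuilder's internals, but the JDK mechanism such a builder is typically built on, java.lang.reflect.Proxy, can be illustrated in a few lines. The handler below only echoes the call; a real implementation would translate each method invocation into an HTTP request and parse the JSON response into the declared return type.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Illustration of the JDK dynamic-proxy mechanism a class like RpcBuilder
// can be built on. The handler here only echoes the invoked method; a real
// client would build an HTTP request for it, send it, and parse the response.
public class ProxyDemo {

    interface ChatApi {
        String chat(String message);
    }

    static ChatApi buildProxy() {
        // Called for every method invocation on the proxy instance.
        InvocationHandler handler = (proxy, method, args) ->
                "handled " + method.getName() + "(" + args[0] + ")";
        return (ChatApi) Proxy.newProxyInstance(
                ChatApi.class.getClassLoader(),
                new Class<?>[]{ChatApi.class},
                handler);
    }

    public static void main(String[] args) {
        System.out.println(buildProxy().chat("hello")); // handled chat(hello)
    }
}
```

This is why the library's service interfaces can stay free of HTTP details: the proxy intercepts each call and routes it through one shared transport layer.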

Supported Models

The library currently supports several models across different providers:

  • OpenAI: all
  • Anthropic: all
  • Google: gemini-2.0-flash
  • Groq: all
  • Ollama: any model you have available locally

Extending the Library

One of the strengths of this design is how easily it can be extended to support new providers or features:

  1. Create a new implementation of GenerativeAIFactory
  2. Implement GenerativeAIService for the new provider
  3. Create necessary request/response models
  4. Register the new factory with GenerativeAIDriverManager
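Those four steps can be sketched with simplified stand-ins for the library's interfaces. The real GenerativeAIFactory and GenerativeAIService signatures may differ, so treat this as the shape of the extension, not the actual API.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the four extension steps using simplified stand-ins for the
// library's interfaces; the real GenerativeAIFactory / GenerativeAIService
// signatures may differ.
public class ExtensionSketch {

    // Stand-in for GenerativeAIService.
    interface Service { String chat(String message); }

    // Stand-in for GenerativeAIFactory.
    interface Factory { Service create(String url, Map<String, Object> props); }

    // Step 4: a registry in the spirit of GenerativeAIDriverManager.
    static final Map<String, Factory> REGISTRY = new HashMap<>();

    public static void main(String[] args) {
        // Steps 1-3: a factory that builds a service handling the new
        // provider's request/response format (stubbed here).
        Factory myProvider = (url, props) ->
                message -> "myprovider@" + url + " says: " + message;
        REGISTRY.put("myprovider", myProvider);

        // Callers resolve the provider by name, exactly as with built-ins.
        Service service = REGISTRY.get("myprovider").create("https://api.example.com", Map.of());
        System.out.println(service.chat("ping"));
    }
}
```

Because callers only ever resolve providers by name through the registry, a new provider becomes available everywhere without touching existing call sites.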

Conclusion

The Java LLM Client provides a clean, consistent way to work with multiple LLM providers in Java applications. By abstracting away the differences between APIs, it allows developers to focus on their application logic rather than the details of each provider's implementation.

Whether you're building a chatbot, generating embeddings for semantic search, or experimenting with different LLM providers, this library offers a straightforward way to integrate these capabilities into your Java applications.

The project's use of standard Java patterns like factories and interfaces makes it easy to understand and extend, while its modular design allows you to use only the providers you need. As the LLM ecosystem continues to evolve, this type of abstraction layer will become increasingly valuable for developers looking to build flexible, future-proof applications.


Link to GitHub project - llmapi