Inference

@inference_net

360 posts

Build and monitor self-improving agents. Get started: docs.inference.net/get-started/ca…

San Francisco, CA

Joined March 2024

Pinned
Inference
@inference_net
Apr 14
Catalyst is live. Built for teams shipping agents in production: train & deploy frontier LLMs in minutes, using the data your application is already generating. Get Started:
Sam Hogan 🇺🇸
@samhogan
Apr 14
Introducing Catalyst: a developer platform to monitor, train & deploy self-improving AI models, built for teams operating AI products at scale. Catalyst can automatically:
- collect traces from your agents
- curate training data & evals
- train & deploy models on par w/ Opus 4.6
Catalyst by Inference.net - Inference.net Documentation
From docs.inference.net
14K
Inference reposted
Mike Pollard
@mikepollard_dev
Jun 17
Put together a video of what we've been building over at @inference_net. Check out how you can optimize your production agents with HALO and Catalyst.
2.6K
Inference reposted
Sam Hogan 🇺🇸
@samhogan
Jun 13
Schematron volume is now 4M requests per day, 10s of billions of tokens. Growing 10% WoW. It's the best HTML-to-JSON model for apps/agents: Sonnet 4.5 quality at 9B model prices. Crazy fast. Get $50 in free Schematron credit with code EXTRACT on @inference_net. Only 100 available.
3.5K
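To put the growth figure in the post above in perspective, here is a small sketch of the compounding arithmetic. It assumes the 10% week-over-week rate stays constant, which the post does not claim; the function names are illustrative only.

```python
import math

DAILY_REQUESTS = 4_000_000  # "4M requests per day", from the post
WEEKLY_GROWTH = 0.10        # "10% WoW", from the post


def projected_daily_requests(weeks: int) -> float:
    """Daily request volume after `weeks` weeks of constant compounding (assumption)."""
    return DAILY_REQUESTS * (1 + WEEKLY_GROWTH) ** weeks


def weeks_to_double() -> float:
    """Weeks needed for volume to double at the assumed constant weekly rate."""
    return math.log(2) / math.log(1 + WEEKLY_GROWTH)


if __name__ == "__main__":
    print(f"after 4 weeks: {projected_daily_requests(4):,.0f} requests/day")
    print(f"doubling time: {weeks_to_double():.1f} weeks")
```

Under that assumption, volume would double in a little over seven weeks.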
Inference reposted
Sam Hogan 🇺🇸
@samhogan
Jun 10
Had a great time sitting down with @compliantvc to have a very serious conversation about @inference_net, startup culture, and all things compliance
Henrick Johansson
@compliantvc
Jun 10
I sat down with @samhogan He raised $11M to build private AI infrastructure. Then he asked for another $11M. He says companies should stop sending data to OpenAI. Naturally, I asked about SOC 2. He had one. This was upsetting. I prefer when founders are easier to prosecute.
2.2K
Inference
@inference_net
Jun 10
Specialized models are becoming a practical path to better AI UX. Olive moved from a frontier model to a custom model trained with Inference Catalyst for their food verdict workflow. After a user scans a product, the model now delivers near-instant verdicts on what to watch out
How Olive Delivers Real-Time Food Verdicts on a Model It Owns | Inference.net
From inference.net
1.4K
Inference reposted
AVB
@neural_avb
May 27
Finished recording the Tiny Models DPO video. Working on manim illustrations now. What it'll cover:
- Measuring diversity in LM responses
- Generate preference data locally
- DPO (+ RM, ORPO)
- Training DPO w Unsloth/TRL
- Evals
Thanks to @inference_net for sponsoring!
AVB
@neural_avb
May 27
Deep learning bros and sisters, don't sleep on this. You can cluster millions of documents in embedding space, mass-annotate them, visualize them... basically for free and within seconds.
6.8K
Inference reposted
Sam Hogan 🇺🇸
@samhogan
May 28
3 weeks ago we open-sourced HALO. This led to talking with dozens of teams running agents at scale. We realized the current agent monitoring tools aren't built for the future that we so clearly see ahead of us. Today we’re releasing native OpenTelemetry-compatible agent tracing.
Sam Hogan 🇺🇸
@samhogan
Apr 29
We’re introducing HALO 😇: Hierarchal Agent Loop Optimizer. HALO is an RLM-based agent optimization technique capable of recursively self-improving agents by analyzing their execution traces and suggesting changes. This work is inspired by the Mismanaged Genius Hypothesis.
37K
Inference
@inference_net
May 19
The best production model is the one trained for the job. Gravity Ads replaced a 70B model on Cerebras with a specialized 1B model trained for their actual workload. Same quality, much faster and cheaper inference:
- p50: 152ms
- p99: 5.7x lower
- cost: ~10x lower
- model: 70x
How Gravity Ads Trains Specialized LLMs to Power Their AI-Native Ad Network | Inference.net
From inference.net
8.4K
Inference reposted
Amar Singh
@AmarSVS
May 18
Article
Agent Performance: Model-Bound versus Harness-Bound
You might think that as models get smarter, the harnesses matter less. Just give them all the tools and surface area to do what they want, and they will find their way since models are getting more...
13K
Inference reposted
Sam Hogan 🇺🇸
@samhogan
May 14
Article
HALO: Using RLMs to build self-improving agents
AI Agents are composed of two key elements: a model and a harness. The model decides what to do; the harness does it. An AI agent is what you get when you put a model into the harness and start a...
24K
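The model/harness split described in the article preview above can be illustrated with a toy loop. This is a minimal sketch, not HALO or any Inference.net API: `toy_model`, `TOOLS`, `Action`, and `run_agent` are all hypothetical names, and the "model" is a hard-coded stub standing in for an LLM.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    tool: str      # name of the tool the model wants to run, or "stop"
    argument: str  # single string argument, kept simple for the sketch


def toy_model(history: list[str]) -> Action:
    """Stub for an LLM: decides the next action from prior tool results."""
    if not history:
        return Action("add", "2,3")
    return Action("stop", history[-1])


# The harness owns the tools and executes whatever the model asks for.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split(","))),
}


def run_agent(model: Callable[[list[str]], Action], max_steps: int = 5):
    """Harness loop: ask the model, run the tool, feed the result back."""
    history: list[str] = []
    trace: list[Action] = []  # execution trace of every decision the model made
    for _ in range(max_steps):
        action = model(history)
        trace.append(action)
        if action.tool == "stop":
            return action.argument, trace
        history.append(TOOLS[action.tool](action.argument))
    raise RuntimeError("agent did not stop within max_steps")


result, trace = run_agent(toy_model)
```

The `trace` list is the kind of execution record that a trace-analysis approach would inspect; here it simply holds the two actions the stub model chose.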
Inference reposted
Amar Singh
@AmarSVS
May 11
Article
Recursive Agent Optimization on Terminal Bench
Debugging agent harnesses is still a massive challenge, especially with modern harnesses ballooning traces to hundreds of thousands of tokens. Because of that, there might be subtle patterns of...
37K
Inference reposted
Amar Singh
@AmarSVS
May 5
Article
Recursive Agent Optimization Actually Works
If you have built an agent harness before, you know how difficult it is to start debugging LLM behaviors, especially once it starts looking even remotely complex. You might rely on external...
22K
Inference reposted
Sam Hogan 🇺🇸
@samhogan
Apr 29
Excited to launch Day One support for tracing the Cursor Agent SDK with @inference_net. 3 lines of code is all you need to track agent performance across executions and iterate to perfection. Docs below 👇
Cursor
@cursor_ai
Apr 29
We’re introducing the Cursor SDK so you can build agents with the same runtime, harness, and models that power Cursor. Run agents from CI/CD pipelines, create automations for end-to-end workflows, or embed agents directly inside your products.
9.4K