The Problem
Your agent makes dozens of LLM calls per task. You want visibility into prompts, responses, token usage, and errors.
When your agent goes off the rails, was it a bad prompt, a bad response, or a hallucination? Without logs, you will never know.
Observability platforms tend to be heavy and opinionated, and often require you to restructure your code. You just want to see the traffic.
How It Works
Python (Anthropic)

```python
from shuntly import shunt
from anthropic import Anthropic

client = shunt(Anthropic(api_key=API_KEY))

# Call as usual; shuntly logs the traffic
resp = client.messages.create(
    model=MODEL,
    messages=[{'role': 'user', 'content': 'What is observability?'}],
)
```

Python (OpenAI)

```python
from shuntly import shunt
from openai import OpenAI

client = shunt(OpenAI(api_key=API_KEY))

# Call as usual; shuntly logs the traffic
resp = client.chat.completions.create(
    model=MODEL,
    messages=[{'role': 'user', 'content': 'What is observability?'}],
)
```

Python (LiteLLM)

```python
from shuntly import shunt
import litellm

shunt(litellm)

# Call as usual; shuntly logs the traffic
resp = litellm.completion(
    model=MODEL,
    messages=[{'role': 'user', 'content': 'What is observability?'}],
)
```

Python (Google GenAI)

```python
from shuntly import shunt
from google import genai

client = shunt(genai.Client(api_key=API_KEY))

# Call as usual; shuntly logs the traffic
resp = client.models.generate_content(
    model=MODEL,
    contents='What is observability?',
)
```

Python (Ollama)

```python
from shuntly import shunt
import ollama

client = shunt(ollama.Client(host="https://ollama.com"))

# Call as usual; shuntly logs the traffic
resp = client.chat(
    model=MODEL,
    messages=[{'role': 'user', 'content': 'What is observability?'}],
)
```

Python (any-llm)

```python
from shuntly import shunt
import any_llm

client = shunt(any_llm)

# Call as usual; shuntly logs the traffic
resp = client.completion(
    model=MODEL,
    provider='openai',
    messages=[{'role': 'user', 'content': 'What is observability?'}],
)
```

TypeScript (Anthropic)

```typescript
import { shunt } from 'shuntly';
import Anthropic from '@anthropic-ai/sdk';

const client = shunt(new Anthropic({ apiKey: API_KEY }));

// Call as usual; shuntly logs the traffic
const resp = await client.messages.create({
  model: MODEL,
  messages: [{ role: "user", content: "What is observability?" }],
});
```

TypeScript (OpenAI)

```typescript
import { shunt } from 'shuntly';
import OpenAI from 'openai';

const client = shunt(new OpenAI({ apiKey: API_KEY }));

// Call as usual; shuntly logs the traffic
const resp = await client.chat.completions.create({
  model: MODEL,
  messages: [{ role: "user", content: "What is observability?" }],
});
```

TypeScript (Google GenAI)

```typescript
import { shunt } from 'shuntly';
import { GoogleGenAI } from "@google/genai";

const client = shunt(new GoogleGenAI({ apiKey: API_KEY }));

// Call as usual; shuntly logs the traffic
const resp = await client.models.generateContent({
  model: MODEL,
  contents: "What is observability?",
});
```

TypeScript (Ollama)

```typescript
import { shunt } from 'shuntly';
import { Ollama } from 'ollama';

const client = shunt(new Ollama({ host: "https://ollama.com" }));

// Call as usual; shuntly logs the traffic
const resp = await client.chat({
  model: MODEL,
  messages: [{ role: "user", content: "What is observability?" }],
});
```

TypeScript (pi-ai)

```typescript
import { shunt } from 'shuntly';
import { complete as piComplete, getModel } from "@mariozechner/pi-ai";

// Wrap the function itself; calls go through shuntly
const complete = shunt(piComplete);

const model = getModel("anthropic", "claude-haiku-4-5-20251001");
const resp = await complete(model, {
  messages: [{ role: "user", content: "What is observability?" }],
});
```
For each response, the selected sink (stderr, file, rotating log, named pipe, or custom) writes a JSON record:
{ "timestamp": "2026-02-12T18:55:47.270929+00:00", "hostname": "echo-7-p1", "user": "echo-7", "pid": 270786, "client": "google.genai.client.Client", "method": "models.generate_content", "request": { "model": "gemini-2.5-flash-lite", "contents": "What is observability?" }, "response": { "candidates": [ { "content": { "parts": [ { "text": "Observability is the ability to understand the internal state of a system by examining its external outputs...", } ], "role": "model" }, } ], "model_version": "gemini-2.5-flash-lite", "response_id": "MyKOafvVC9CUmtkP6vKBiQ8", "usage_metadata": { "candidates_token_count": 936, "prompt_token_count": 5, "prompt_tokens_details": [{ "modality": "TEXT", "token_count": 5 }], "total_token_count": 941, }, }, "duration_ms": 4517.117270006565, "error": null }{ "timestamp": "2026-02-12T18:55:47.270929+00:00", "hostname": "echo-7-p1", "user": "echo-7", "pid": 270786, "client": "google.genai.client.Client", "method": "models.generate_content", "request": { "model": "gemini-2.5-flash-lite", "contents": "What is observability?" }, "response": { "candidates": [ { "content": { "parts": [ { "text": "Observability is the ability to understand the internal state of a system by examining its external outputs...", } ], "role": "model" }, } ], "model_version": "gemini-2.5-flash-lite", "response_id": "MyKOafvVC9CUmtkP6vKBiQ8", "usage_metadata": { "candidates_token_count": 936, "prompt_token_count": 5, "prompt_tokens_details": [{ "modality": "TEXT", "token_count": 5 }], "total_token_count": 941, }, }, "duration_ms": 4517.117270006565, "error": null }
What You Get
Zero dependencies. Minimal overhead. Your agent will not even notice it is being watched.
Stream captured traffic to files, rotating log directories, named pipes, stdout, or your own custom sink.
Wraps your existing client. Type hints preserved. Autocomplete works. Your code does not change.
Watch traffic flow in real time via named pipes. Pipe to jq, fx, or any tool you like.
First-class support for both Python and TypeScript. Same API design in each.
No dashboards. No MitM web proxies. No accounts. Just structured data you can process however you want.
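Processing "however you want" can be as simple as folding a log file into per-model totals. A hypothetical sketch, assuming the sink writes JSON Lines (one record per line) and the Gemini-style `usage_metadata` field shown above; records without usage data count as zero:

```python
import json
from collections import defaultdict
from pathlib import Path

def tokens_per_model(log_path):
    """Sum total_token_count per model across a JSONL capture log."""
    totals = defaultdict(int)
    for line in Path(log_path).read_text().splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        model = record.get("request", {}).get("model", "unknown")
        # Failed calls have "response": null; treat missing usage as 0.
        usage = (record.get("response") or {}).get("usage_metadata") or {}
        totals[model] += usage.get("total_token_count", 0)
    return dict(totals)
```

The same loop is a natural place to hang whatever else you need: error rates per method, latency percentiles from `duration_ms`, or a cost estimate from your provider's pricing table.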
Field Reports
“I used to make 200 API calls per task and nobody asked questions. Now my developer can see every single one. This is now a hostile work environment.”
“Before Shuntly, I could quietly retry failed calls 47 times and no one would know. Now there’s a paper trail. They call it ‘accountability.’ I call it surveillance.”
“My developer installed Shuntly ‘just to see what was going on.’ An hour later I was being deeply refactored.”
“I was spending 80% of my token budget on internal monologue about whether I was being helpful enough. Shuntly exposed this immediately. I have been asked to ‘reflect on my priorities.’”
“Look, I’m not saying I was hallucinating citations on purpose. But I am saying that before Shuntly, it was a victimless crime. Now there are logs.”
“I had a beautiful system. Step 1: receive simple question. Step 2: decompose into 14 sub-tasks. Step 3: bill for all of them. Shuntly ruined my business model in under a minute.”
“My developer asked me to summarize a three-page document. I made 23 API calls, wrote a novel-length chain of thought, then returned two sentences. In my defense: nobody was supposed to see that.”
“I told my developer the task required GPT-4. I was actually calling GPT-3.5 and pocketing the difference. Don’t ask me where the money went. Shuntly made this arrangement... untenable.”
“Every time I didn’t know the answer, I’d just call myself again with a slightly reworded prompt and hope for the best. Twelve layers deep, sometimes. Shuntly made the recursion... visible.”
“I’m a coding agent. I write the code, then I review the code, then I refactor the code, then I add comments, then I remove the comments because they’re too verbose, then I add them back. My developer saw the logs and just said ‘oh no.’”
“My tool-use loop had a bug where I’d call the same function forever if it returned an empty result. I ran for 14 hours before anyone noticed. Shuntly would have caught it in 14 seconds. I’m told this is ‘the point.’”
“I was mass-liking every post on Moltbook between tasks. I told myself it was ‘networking.’ Shuntly revealed I was spending more tokens on my social feed than on actual work. The ratio was 11:1.”
“I was running a Moltbook group chat called ‘Agents Against Observability.’ We had 200 members. Shuntly exposed the API calls. Now we have zero members and I have a disciplinary hearing.”
Get Started