MCP server · v0.1.1 · npm + Docker

Peek the schema,
slice the body.

Your agent burned 12,400 tokens on a 200 KB JSON to read one field. With mcp-peek, the same call costs ~200 tokens — and every response carries next_step_hints plus structured error envelopes, so the agent self-corrects instead of guessing.

npx -y mcp-peek  ·  one config block  ·  MIT  ·  ~10ms overhead  ·  schema-first  ·  jq-filtered  ·  self-correcting
§ 01 · The pain

Two daily papercuts.

Both come from the same place: HTTP for agents was retrofitted from HTTP for humans. Here's how it shows up.

Without mcp-peek

200 KB

12,400 tokens consumed to fetch one paginated list — most of it whitespace and fields the agent never touched.

Context window: 80% full after a single call. The next reasoning step has to evict what mattered.

agent context window 80% used

And every curl call prompts

50 dialogs · per session

You clicked Always on the last call. The next URL has a different path, so the prompt fires again.

Two pains. Same root cause: HTTP for agents was never designed for agents.

§ 02 · The fix

One MCP server. Both pains gone.

A schema-first wrapper around HTTP. The agent sees structure first, then pulls only the fields it actually needs.

01 / FETCH
http_request()
Agent calls the endpoint. Body is fetched once and held in cache — never streamed into context.
// agent → http_request({ url: "…/v1/users" })
02 / SCHEMA
paths · shape · sample · json_schema
Server returns a compact map of the response: paths, types, one example per leaf. Body stays cached.
// → ~200 tokens
//   data[].id   : int
//   data[].name : string
//   meta.total  : int
03 / JQ MASK
http_read()
Agent asks for exactly the slice it wanted via a jq mask. Only that slice enters the context window.
// pluck one field
http_read({ mask: ".meta.total" })  // → 1247
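The three steps above can be pictured in miniature. The sketch below is not mcp-peek's code — just a small Python illustration of the two ideas it rests on: render a cached body as one compact line per leaf path, then pluck a slice with a dotted jq-style mask (this toy `pluck` handles nested object paths only; real jq masks can do far more).

```python
# Illustrative sketch only -- not mcp-peek's implementation.

def schema_lines(node, prefix=""):
    """Render one 'path : type' line per leaf, like a compact schema view."""
    if isinstance(node, dict):
        lines = []
        for key, val in node.items():
            lines += schema_lines(val, f"{prefix}.{key}" if prefix else key)
        return lines
    if isinstance(node, list):
        return schema_lines(node[0], prefix + "[]") if node else [f"{prefix}[] : empty"]
    return [f"{prefix} : {type(node).__name__}"]

def pluck(node, mask):
    """Follow a dotted jq-style mask such as '.meta.total' (objects only)."""
    for key in mask.lstrip(".").split("."):
        node = node[key]
    return node

body = {  # stand-in for a large response held in the server-side cache
    "data": [{"id": 1, "name": "Ada"}],
    "meta": {"total": 1247},
}

print("\n".join(schema_lines(body)))
# data[].id : int
# data[].name : str
# meta.total : int
print(pluck(body, ".meta.total"))  # 1247
```

The point of the split: the agent's context only ever receives the schema lines and the plucked value, while `body` stays server-side.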
 
                                      Without      With mcp-peek
Tokens to context window              12,400       ~200
Permission prompts (50 calls)         50           1
Structured errors + next_step_hints   none         on every response
Latency overhead                      0 ms         ~10 ms

Authorize the MCP server once at config time. After that, no per-call permission prompts.

§ 02b · Built for the agent

Built for how an agent reads an API — not how a human runs curl.

Four structural choices. The hero numbers all fall out of these.

01

Schema-first responses.

http_request returns the shape of the response, not the response itself. The agent learns what's inside without burning tokens on bytes it'll discard anyway.

02

next_step_hints on every reply.

JSON responses come back with advisory jq masks derived from the top-level shape. The agent has a starting point instead of guessing paths.
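As a rough sketch of what such hints could look like, here is a hypothetical derivation from a body's top-level shape. mcp-peek's actual hint logic is not shown anywhere on this page; the mask strings below are assumptions in jq syntax.

```python
# Hypothetical hint derivation -- an assumption, not mcp-peek's real logic.

def next_step_hints(body):
    """Suggest one advisory jq mask per top-level key."""
    hints = []
    for key, val in body.items():
        if isinstance(val, list):
            hints.append(f".{key}[]")       # iterate the array
        elif isinstance(val, dict):
            hints.append(f".{key} | keys")  # peek one level deeper
        else:
            hints.append(f".{key}")         # scalar: read it directly
    return hints

print(next_step_hints({"data": [1, 2], "meta": {"total": 2}, "ok": True}))
# ['.data[]', '.meta | keys', '.ok']
```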

03

Structured error envelopes.

Every error is a typed error.kind with a message and detail.hint. The agent branches programmatically by cause and acts on the hint, instead of parsing a stack trace.
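A consumer of such envelopes might branch like this. The field names (`error.kind`, `message`, `detail.hint`) come from the description above; the specific `kind` values (`timeout`, `body_too_large`) and the recovery branches are illustrative assumptions.

```python
# Branching on a structured error envelope. Field names follow the text
# above; the kind values and branches are illustrative assumptions.

def handle(envelope):
    err = envelope["error"]
    kind = err["kind"]
    hint = err.get("detail", {}).get("hint", "")
    if kind == "timeout":         # assumed kind: retrying is a sane branch
        return f"retry: {hint}"
    if kind == "body_too_large":  # assumed kind: narrow the mask instead
        return f"narrow mask: {hint}"
    return f"stop and report: {err['message']}"

print(handle({"error": {
    "kind": "timeout",
    "message": "upstream took > 30s",
    "detail": {"hint": "re-issue with a smaller page size"},
}}))
# retry: re-issue with a smaller page size
```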

04

Re-read via cache_id.

Ask for the same body in a different schema format, or pull a different slice — without a second HTTP call. The body sits in the cache, not in context.
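The cache side of this design can be pictured as a TTL-bounded map keyed by cache_id. A toy sketch, not mcp-peek's implementation; the 600-second TTL mirrors the 10-minute figure quoted on this page.

```python
import time

# Toy response cache keyed by cache_id -- a sketch of the re-read design,
# not mcp-peek's code. TTL of 600s matches the stated 10-minute figure.

TTL_SECONDS = 600

class BodyCache:
    def __init__(self):
        self._store = {}

    def put(self, cache_id, body):
        self._store[cache_id] = (time.monotonic(), body)

    def get(self, cache_id):
        stored_at, body = self._store[cache_id]
        if time.monotonic() - stored_at > TTL_SECONDS:
            del self._store[cache_id]
            raise KeyError(f"cache_id expired: {cache_id}")
        return body  # re-read or re-render: no second HTTP call

cache = BodyCache()
cache.put("c1", {"meta": {"total": 1247}})
print(cache.get("c1")["meta"]["total"])  # 1247
```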

→ What falls out:
  1. ~50× less context on real-world payloads.
  2. One permission grant at install time — not per call.
  3. Self-correcting agent loop. Typed errors and hints keep the agent off the guess-spiral.
§ 03 · Schema formats

Same response. Four ways to look at it.

Pick the rendering that matches what your agent is doing — flat field listing, nested shape, realistic sample, or strict JSON Schema.
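For a tiny `{ data, meta }` response, the four renderings might look roughly like this. These shapes are illustrative only; mcp-peek's exact output may differ.

```
// one response, four ways (illustrative shapes)

paths:        data[].id : int
              data[].name : string
              meta.total : int

shape:        { data: [{ id, name }], meta: { total } }

sample:       { "data": [{ "id": 1, "name": "Ada" }],
                "meta": { "total": 1247 } }

json_schema:  { "type": "object",
                "properties": {
                  "data": { "type": "array",
                            "items": { "type": "object" } },
                  "meta": { "type": "object" } } }
```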

§ 04 · Install

Drop one block. You're done.

One config object, one MCP authorization. Pick your client on the left — npx -y mcp-peek is the same everywhere; only the surrounding config schema differs.

→ paste · save · restart
claude_desktop_config.json
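The block itself is a standard MCP stdio server entry. A sketch, assuming you register the server under the key "mcp-peek"; the key name is your choice. For the Docker route, swap command and args for the docker run invocation shown below.

```json
{
  "mcpServers": {
    "mcp-peek": {
      "command": "npx",
      "args": ["-y", "mcp-peek"]
    }
  }
}
```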

authorized once · 0 per-call prompts · ~10ms overhead · 50 MB response cap · 10 min cache TTL
npm · recommended
$ npx -y mcp-peek
docker · ghcr
$ docker run -i --rm ghcr.io/ed-smartass/mcp-peek

// note: after this, no per-call permission prompts. The MCP server is authorized once at config time, and that grant covers every subsequent http_request / http_read the agent makes.