MCP server · v0.1.1 · npm + Docker

Peek the schema,
slice the body.

Your agent burned 12,400 tokens on a 200 KB JSON to read one field. With mcp-peek, the same call costs ~200 tokens — and every response carries next_step_hints plus structured error envelopes, so the agent self-corrects instead of guessing.

npx -y mcp-peek  ·  one config block  ·  MIT  ·  ~10ms overhead  ·  schema-first  ·  jq-filtered  ·  self-correcting
§ 01 · The pain

Two daily papercuts.

Both come from the same place: HTTP for agents was retrofitted from HTTP for humans. Here's how it shows up.

Without mcp-peek

200 KB

12,400 tokens consumed to fetch one paginated list — most of it whitespace and fields the agent never touched.

Context window: 80% full after a single call. The next reasoning step has to evict what mattered.

agent context window 80% used

And every curl call prompts

50 dialogs · per session

You clicked Always on the last call. The next URL has a different path, so the prompt fires again.

Two pains. Same root cause: HTTP for agents was never designed for agents.

§ 02 · The fix

One MCP server. Both pains gone.

A schema-first wrapper around HTTP. The agent sees structure first, then pulls only the fields it actually needs.

01 / FETCH
http_request()
Agent calls the endpoint. Body is fetched once and held in cache — never streamed into context.
// agent → http_request({ url: "…/v1/users" })
02 / SCHEMA
paths · shape · sample · json_schema
Server returns a compact map of the response: paths, types, one example per leaf. Body stays cached.
// → ~200 tokens
//   data[].id   : int
//   data[].name : string
//   meta.total  : int
03 / JQ MASK
http_read()
Agent asks for exactly the slice it wanted via a jq mask. Only that slice enters the context window.
// pluck one field
http_read({ mask: ".meta.total" })  // → 1247
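The three steps above can be pictured in miniature. The sketch below is not mcp-peek's code — just a small Python illustration of the two ideas it rests on: render a cached body as one compact line per leaf path, then pluck a slice with a dotted jq-style mask (this toy `pluck` handles nested object paths only; real jq masks can do far more).

```python
# Illustrative sketch only -- not mcp-peek's implementation.

def schema_lines(node, prefix=""):
    """Render one 'path : type' line per leaf, like a compact schema view."""
    if isinstance(node, dict):
        lines = []
        for key, val in node.items():
            lines += schema_lines(val, f"{prefix}.{key}" if prefix else key)
        return lines
    if isinstance(node, list):
        return schema_lines(node[0], prefix + "[]") if node else [f"{prefix}[] : empty"]
    return [f"{prefix} : {type(node).__name__}"]

def pluck(node, mask):
    """Follow a dotted jq-style mask such as '.meta.total' (objects only)."""
    for key in mask.lstrip(".").split("."):
        node = node[key]
    return node

body = {  # stand-in for a large response held in the server-side cache
    "data": [{"id": 1, "name": "Ada"}],
    "meta": {"total": 1247},
}

print("\n".join(schema_lines(body)))
# data[].id : int
# data[].name : str
# meta.total : int
print(pluck(body, ".meta.total"))  # 1247
```

The point of the split: the agent's context only ever receives the schema lines and the plucked value, while `body` stays server-side.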
 
                                      Without      With mcp-peek
Tokens to context window              12,400       ~200
Permission prompts (50 calls)         50           1
Structured errors + next_step_hints   none         on every response
Latency overhead                      0 ms         ~10 ms

Authorize the MCP server once at config time. After that, no per-call permission prompts.

§ 02b · Built for the agent

Built for how an agent reads an API — not how a human runs curl.

Four structural choices. The hero numbers all fall out of these.

01

Schema-first responses.

http_request returns the shape of the response, not the response itself. The agent learns what's inside without burning tokens on bytes it'll discard anyway.

02

next_step_hints on every reply.

JSON responses come back with advisory jq masks derived from the top-level shape. The agent has a starting point instead of guessing paths.
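As a rough sketch of what such hints could look like, here is a hypothetical derivation from a body's top-level shape. mcp-peek's actual hint logic is not shown anywhere on this page; the mask strings below are assumptions in jq syntax.

```python
# Hypothetical hint derivation -- an assumption, not mcp-peek's real logic.

def next_step_hints(body):
    """Suggest one advisory jq mask per top-level key."""
    hints = []
    for key, val in body.items():
        if isinstance(val, list):
            hints.append(f".{key}[]")       # iterate the array
        elif isinstance(val, dict):
            hints.append(f".{key} | keys")  # peek one level deeper
        else:
            hints.append(f".{key}")         # scalar: read it directly
    return hints

print(next_step_hints({"data": [1, 2], "meta": {"total": 2}, "ok": True}))
# ['.data[]', '.meta | keys', '.ok']
```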

03

Structured error envelopes.

Every error is a typed error.kind with a message and detail.hint. The agent branches programmatically by cause and acts on the hint, instead of parsing a stack trace.
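A consumer of such envelopes might branch like this. The field names (`error.kind`, `message`, `detail.hint`) come from the description above; the specific `kind` values (`timeout`, `body_too_large`) and the recovery branches are illustrative assumptions.

```python
# Branching on a structured error envelope. Field names follow the text
# above; the kind values and branches are illustrative assumptions.

def handle(envelope):
    err = envelope["error"]
    kind = err["kind"]
    hint = err.get("detail", {}).get("hint", "")
    if kind == "timeout":         # assumed kind: retrying is a sane branch
        return f"retry: {hint}"
    if kind == "body_too_large":  # assumed kind: narrow the mask instead
        return f"narrow mask: {hint}"
    return f"stop and report: {err['message']}"

print(handle({"error": {
    "kind": "timeout",
    "message": "upstream took > 30s",
    "detail": {"hint": "re-issue with a smaller page size"},
}}))
# retry: re-issue with a smaller page size
```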

04

Re-read via cache_id.

Ask for the same body in a different schema format, or pull a different slice — without a second HTTP call. The body sits in the cache, not in context.
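The cache side of this design can be pictured as a TTL-bounded map keyed by cache_id. A toy sketch, not mcp-peek's implementation; the 600-second TTL mirrors the 10-minute figure quoted on this page.

```python
import time

# Toy response cache keyed by cache_id -- a sketch of the re-read design,
# not mcp-peek's code. TTL of 600s matches the stated 10-minute figure.

TTL_SECONDS = 600

class BodyCache:
    def __init__(self):
        self._store = {}

    def put(self, cache_id, body):
        self._store[cache_id] = (time.monotonic(), body)

    def get(self, cache_id):
        stored_at, body = self._store[cache_id]
        if time.monotonic() - stored_at > TTL_SECONDS:
            del self._store[cache_id]
            raise KeyError(f"cache_id expired: {cache_id}")
        return body  # re-read or re-render: no second HTTP call

cache = BodyCache()
cache.put("c1", {"meta": {"total": 1247}})
print(cache.get("c1")["meta"]["total"])  # 1247
```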

→ What falls out:
  1. ~50× less context on real-world payloads.
  2. One permission grant at install time — not per call.
  3. Self-correcting agent loop. Typed errors and hints keep the agent off the guess-spiral.
§ 03 · Schema formats

Same response. Four ways to look at it.

Pick the rendering that matches what your agent is doing — flat field listing, nested shape, realistic sample, or strict JSON Schema.
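For a tiny `{ data, meta }` response, the four renderings might look roughly like this. These shapes are illustrative only; mcp-peek's exact output may differ.

```
// one response, four ways (illustrative shapes)

paths:        data[].id : int
              data[].name : string
              meta.total : int

shape:        { data: [{ id, name }], meta: { total } }

sample:       { "data": [{ "id": 1, "name": "Ada" }],
                "meta": { "total": 1247 } }

json_schema:  { "type": "object",
                "properties": {
                  "data": { "type": "array",
                            "items": { "type": "object" } },
                  "meta": { "type": "object" } } }
```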

§ 04 · Install

Drop one block. You're done.

One config object, one MCP authorization. Pick your client on the left — npx -y mcp-peek is the same everywhere; only the surrounding config schema differs.

→ paste · save · restart
claude_desktop_config.json
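The block itself is a standard MCP stdio server entry. A sketch, assuming you register the server under the key "mcp-peek"; the key name is your choice. For the Docker route, swap command and args for the docker run invocation shown below.

```json
{
  "mcpServers": {
    "mcp-peek": {
      "command": "npx",
      "args": ["-y", "mcp-peek"]
    }
  }
}
```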

authorized once · 0 per-call prompts · ~10ms overhead · 50 MB response cap · 10 min cache TTL
npm · recommended
$ npx -y mcp-peek
docker · ghcr
$ docker run -i --rm ghcr.io/ed-smartass/mcp-peek

// note: after this, no per-call permission prompts. The MCP server is authorized once at config time, and that grant covers every subsequent http_request / http_read the agent makes.