RTK: A Rust Proxy That Cuts LLM Token Usage by 80%

RTK is a CLI proxy written in Rust that intercepts shell commands issued by AI coding agents and returns compressed output: the same essential information in far fewer tokens. It hooks into Claude Code, Cursor, Gemini CLI, and others via their pre-tool-use mechanisms, transparently rewriting git status to rtk git status before the command runs.

Why I starred it

The token problem with agentic coding is real and it compounds fast. A 30-minute Claude Code session can accumulate tens of thousands of tokens from shell noise alone: git progress bars, test runner boilerplate, AWS response envelopes. None of it is signal. RTK attacks that specifically: strip the noise before it enters the context window, not after.

What caught my attention was the architecture choice. This isn't a wrapper that post-processes LLM responses or a prompt injection trick. It intercepts at the tool execution layer — the Bash hook — and rewrites the command string before Claude even runs it. The agent receives a cleaner, shorter output and never sees the rewrite happen. That's a clean seam to cut at.

How it works

The entry point for the hook path is rtk rewrite, handled in src/hooks/rewrite_cmd.rs. When Claude Code's PreToolUse hook fires, it calls rtk rewrite "git status" and gets back rtk git status (exit 0) or nothing (exit 1 = passthrough). Exit codes carry meaning:

| Exit code | Output    | Hook behavior                           |
|-----------|-----------|-----------------------------------------|
| 0         | rewritten | allow and auto-approve the rewrite      |
| 1         | (none)    | no RTK equivalent, pass through as-is   |
| 2         | (none)    | deny rule matched, defer to Claude      |
| 3         | rewritten | ask rule matched, rewrite but prompt    |
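To make the contract concrete, here is a minimal sketch of how a caller might interpret those four exit codes. The HookDecision enum and the interpret function are names I made up for illustration; they are not taken from RTK's source, and treating an empty rewrite as invalid is my own assumption.

```rust
/// What a hook could do with a Bash tool call, following the exit-code table.
/// Hypothetical type: not RTK's real API.
#[derive(Debug, PartialEq)]
enum HookDecision {
    /// Exit 0: run the rewritten command without prompting.
    RewriteAndAllow(String),
    /// Exit 1: no RTK equivalent, run the original command unchanged.
    PassThrough,
    /// Exit 2: a deny rule matched, leave the decision to Claude Code.
    Defer,
    /// Exit 3: use the rewritten command, but ask the user first.
    RewriteAndAsk(String),
}

/// Interpret the exit code and stdout of `rtk rewrite`.
/// Returns `None` for undocumented codes, or when a "rewritten" code
/// arrives with empty output (an assumption of this sketch).
fn interpret(exit_code: i32, stdout: &str) -> Option<HookDecision> {
    let rewritten = stdout.trim();
    match exit_code {
        0 if !rewritten.is_empty() => {
            Some(HookDecision::RewriteAndAllow(rewritten.to_string()))
        }
        1 => Some(HookDecision::PassThrough),
        2 => Some(HookDecision::Defer),
        3 if !rewritten.is_empty() => {
            Some(HookDecision::RewriteAndAsk(rewritten.to_string()))
        }
        _ => None,
    }
}
```

Folding the permission decision into the exit code keeps stdout free for the rewritten command alone, so the caller never has to parse a structured response.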

The actual rewrite logic lives in src/discover/registry.rs, which compiles all rewrite rules into a RegexSet via lazy_static!, so compilation happens once, on first use. It strips environment prefixes (VAR=val cmd), normalizes absolute binary paths (/usr/bin/grep → grep), handles git global options (-C <path>, -c key=val), and routes to the right rtk subcommand. Because a RegexSet tests every pattern in a single pass over the input, matching against 100+ rules adds negligible overhead.
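The normalization steps are easier to picture in code. What follows is a rough, standard-library-only sketch of the three steps named above, not the logic in registry.rs: it splits on whitespace (so it ignores shell quoting), and the function names is_env_assignment, normalize, and git_subcommand are hypothetical.

```rust
/// True if `token` looks like a shell environment assignment such as `FOO=bar`.
fn is_env_assignment(token: &str) -> bool {
    match token.split_once('=') {
        Some((name, _)) => {
            let mut chars = name.chars();
            match chars.next() {
                Some(c) if c.is_ascii_alphabetic() || c == '_' => {
                    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
                }
                _ => false,
            }
        }
        None => false,
    }
}

/// Split a command into (leading env assignments, normalized command tokens).
/// Simplification: whitespace splitting only, no quote handling.
fn normalize(command: &str) -> (Vec<String>, Vec<String>) {
    let mut tokens = command.split_whitespace().peekable();
    let mut env = Vec::new();
    // Consume leading VAR=val tokens.
    while let Some(tok) = tokens.next_if(|t| is_env_assignment(t)) {
        env.push(tok.to_string());
    }
    let mut rest: Vec<String> = tokens.map(str::to_string).collect();
    // Reduce an absolute binary path to its base name: /usr/bin/grep -> grep.
    if let Some(first) = rest.first_mut() {
        if first.starts_with('/') {
            let base = first.rsplit('/').next().unwrap_or("").to_string();
            if !base.is_empty() {
                *first = base;
            }
        }
    }
    (env, rest)
}

/// For a `git` invocation, skip the global options `-C <path>` and
/// `-c key=val` and return the subcommand, if there is one.
fn git_subcommand(tokens: &[String]) -> Option<&str> {
    if tokens.first().map(String::as_str) != Some("git") {
        return None;
    }
    let mut i = 1;
    while i < tokens.len() {
        match tokens[i].as_str() {
            "-C" | "-c" => i += 2, // option plus its argument
            other => return Some(other),
        }
    }
    None
}
```

With this sketch, RUST_LOG=debug /usr/bin/grep -rn foo src yields the env prefix RUST_LOG=debug and a command starting with grep, and /usr/bin/git -C /tmp/repo status resolves to the status subcommand.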

The filtering pipeline is where the real work happens. For most commands there are two layers:

1. Rust-native handlers — commands like git, cargo, and gh have hand-written filters in src/cmds/. Opening src/cmds/git/git.rs, you can see run_diff() runs git diff --stat first (for the file summary), then runs the actual diff and compacts it, merging both outputs. Git add/commit/push handlers strip all the git progress output and return one-liners like ok main or ok abc1234. That's an intentional lossy compression — the LLM doesn't need to know that delta compression used 8 threads.

2. TOML-defined filters — for everything else (Terraform, Ansible, make, brew, Rails, and ~60 more), filters are defined declaratively in src/filters/*.toml and embedded at compile time via build.rs:

// build.rs concatenates all src/filters/*.toml alphabetically,
// validates the combined TOML, checks for duplicate filter names,
// then writes to OUT_DIR/builtin_filters.toml
const BUILTIN_TOML: &str = include_str!(concat!(env!("OUT_DIR"), "/builtin_filters.toml"));

Each TOML filter (src/core/toml_filter.rs) runs an 8-stage pipeline: strip ANSI → regex substitutions → match_output short-circuit → strip/keep_lines → line truncation → head/tail_lines → max_lines cap → on_empty fallback. The match_output stage is the clever one — if the full output blob matches a pattern (e.g., "nothing to commit" in git), it short-circuits and returns a two-word message immediately instead of running the rest of the pipeline.
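Here is a compact sketch of what an 8-stage pipeline in that order could look like. Everything in it is an assumption for illustration: the Filter struct and its field names only echo the stage names above and are not RTK's TOML schema, plain substring matching stands in for regexes so the example needs no external crates, and the head/tail interaction is my own simplification.

```rust
/// Hypothetical filter config; not RTK's real schema.
#[derive(Default)]
struct Filter {
    replace: Vec<(String, String)>,         // stage 2: (from, to) substitutions
    match_output: Option<(String, String)>, // stage 3: (needle, short message)
    strip_lines: Vec<String>,               // stage 4: drop lines containing any
    keep_lines: Vec<String>,                // stage 4: if set, keep only matches
    max_line_len: Option<usize>,            // stage 5
    head_lines: Option<usize>,              // stage 6
    tail_lines: Option<usize>,              // stage 6
    max_lines: Option<usize>,               // stage 7
    on_empty: Option<String>,               // stage 8
}

/// Stage 1: remove ANSI CSI escape sequences such as "\x1b[32m".
fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\u{1b}' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // A CSI sequence ends at the first char in '@'..='~'.
            for t in chars.by_ref() {
                if ('@'..='~').contains(&t) {
                    break;
                }
            }
        } else {
            out.push(c);
        }
    }
    out
}

fn apply(filter: &Filter, raw: &str) -> String {
    let mut text = strip_ansi(raw); // 1. strip ANSI
    for (from, to) in &filter.replace {
        text = text.replace(from.as_str(), to); // 2. substitutions
    }
    // 3. match_output: if the whole blob matches, return the short message now.
    if let Some((needle, message)) = &filter.match_output {
        if text.contains(needle.as_str()) {
            return message.clone();
        }
    }
    // 4. strip_lines / keep_lines
    let mut lines: Vec<String> = text
        .lines()
        .filter(|l| !filter.strip_lines.iter().any(|p| l.contains(p.as_str())))
        .filter(|l| {
            filter.keep_lines.is_empty()
                || filter.keep_lines.iter().any(|p| l.contains(p.as_str()))
        })
        .map(str::to_string)
        .collect();
    // 5. truncate over-long lines
    if let Some(n) = filter.max_line_len {
        for l in &mut lines {
            if l.chars().count() > n {
                let short: String = l.chars().take(n).collect();
                *l = short;
            }
        }
    }
    // 6. head, then tail (applied in sequence in this sketch)
    if let Some(n) = filter.head_lines {
        lines.truncate(n);
    }
    if let Some(n) = filter.tail_lines {
        let skip = lines.len().saturating_sub(n);
        lines.drain(..skip);
    }
    // 7. overall cap
    if let Some(n) = filter.max_lines {
        lines.truncate(n);
    }
    // 8. fallback when nothing survived
    if lines.is_empty() {
        if let Some(msg) = &filter.on_empty {
            return msg.clone();
        }
    }
    lines.join("\n")
}
```

Placing the short-circuit third means the cheap whole-output check runs on already-cleaned text but before any per-line work, which is where the bulk of the cost would be on long outputs.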

The tee mechanism in src/core/tee.rs handles a real failure mode: when RTK filters a failing command, the agent only sees the summary. To cover the case where that summary is insufficient, the full raw output is saved to ~/.local/share/rtk/tee/ with a timestamp slug, and a hint containing the file path is appended to the filtered output. The LLM can then cat the log file if it needs more context, with no re-execution needed.
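A tee step like that fits in a few lines. This is a sketch under my own assumptions rather than the code in tee.rs: the function names, the file naming scheme (Unix seconds plus a sanitized command), and the hint format are all invented for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Save `raw` under `dir` with a timestamp-based name and return the path.
/// The naming scheme is illustrative, not RTK's actual slug format.
fn tee_raw(dir: &Path, command: &str, raw: &str) -> io::Result<PathBuf> {
    fs::create_dir_all(dir)?;
    let secs = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    // Replace anything that is not alphanumeric so the name is path-safe.
    let slug: String = command
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() { c } else { '_' })
        .collect();
    let path = dir.join(format!("{}_{}.log", secs, slug));
    fs::write(&path, raw)?;
    Ok(path)
}

/// On failure, tee the raw output and append a hint line to the summary.
/// On success, return the summary unchanged and write nothing.
fn filtered_with_hint(
    dir: &Path,
    command: &str,
    exit_code: i32,
    summary: &str,
    raw: &str,
) -> io::Result<String> {
    if exit_code == 0 {
        return Ok(summary.to_string());
    }
    let path = tee_raw(dir, command, raw)?;
    Ok(format!("{}\n[full output: {}]", summary, path.display()))
}
```

Writing the log only on failure keeps the common path free of disk I/O, and the appended hint costs a single line of context instead of the full output.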

Using it

brew install rtk
rtk init -g         # installs Claude Code hook + RTK.md
# restart Claude Code
git status          # auto-rewritten to: rtk git status

rtk init patches .claude/settings.json to register a PreToolUse hook that intercepts Bash tool calls. After that, every Bash tool call from the agent goes through rtk rewrite first.

You can track savings:

rtk gain
# Token savings: 78,432 saved / 91,200 total → 86% reduction
# Top commands: git (34%), cargo (22%), gh (18%)

rtk gain --graph    # ASCII bar chart, last 30 days
rtk discover        # shows commands you run that RTK could handle but currently doesn't

Some commands offer more aggressive compression:

rtk read src/main.rs -l aggressive  # signatures only, strips function bodies
rtk test cargo test                  # failures only, drops passing test lines
rtk git diff                         # stat summary + compacted diff

Rough edges

The scope limitation is worth understanding: RTK only intercepts Bash tool calls. Claude Code's built-in Read, Grep, and Glob tools bypass the hook entirely. The README is upfront about this, but it means a session that leans on native tools (which it should, for performance) gets less coverage than the headline numbers suggest.

The TOML filter system is powerful but the documentation for writing custom filters is thin. There's a filters.toml template generated by rtk init, but understanding the 8-stage pipeline requires reading src/core/toml_filter.rs directly. For most users this won't matter — the built-in filters cover enough ground — but if you want to add a filter for a niche tool, you're reading Rust to understand the schema.

Telemetry is on by default (opt-out, not opt-in). It's anonymous and clearly documented, but in security-sensitive environments you need to set RTK_TELEMETRY_DISABLED=1 explicitly.

The savings estimates in the README are benchmarks on medium-sized TypeScript/Rust projects. In practice the numbers vary significantly based on how test-output-heavy your workflow is. The rtk gain command shows your actual numbers, which is the right thing to look at.

Bottom line

If you run AI coding sessions heavily, RTK is worth the 30-second install. The transparent hook means zero workflow change — you don't think about it after setup. The 23k GitHub stars suggest this is already widely adopted in the agentic coding community, which is a reasonable signal for something this infrastructural.