squirrelscan is a website auditing CLI that runs 230+ checks across SEO, accessibility, performance, security, and content, then emits the results in a format designed to be consumed by an LLM rather than a human.
Why I starred it
The angle here is specific: instead of trying to be another Lighthouse wrapper, squirrelscan treats LLMs as the actual consumer of its output. The --format llm flag produces a structured report shaped for token efficiency — not pretty-printed JSON with nested nulls, but a compact representation an agent can act on immediately. The README leads with squirrel audit example.com --format llm | claude, and that one-liner is the whole thesis.
230 rules is a lot. The breakdown across 21 categories includes things you don't see in most audit tools: leaked secrets detection (96 patterns covering OpenAI keys, Anthropic keys, AWS credentials, Stripe), E-E-A-T signals, AI content detection, and adblock detection. The security category alone has 15 rules, including cookie attribute checks and CSP validation. The --format flag supports six output modes: console, JSON, HTML, Markdown, text, and the LLM-optimized variant.
How it works
The first thing to understand about squirrelscan: the repository on GitHub is not the source. The license field in package.json is UNLICENSED — this is a closed binary distributed through an open npm wrapper.
The actual audit engine is a compiled binary that scripts/postinstall.js downloads at install time for the current platform — darwin-arm64, linux-x64, linux-x64-musl, windows-x64, and four other variants are supported. Each release ships a manifest.json that maps platform identifiers to filenames and SHA256 checksums:
"darwin-arm64": {
"filename": "squirrel-0.0.38-darwin-arm64",
"sha256": "b5ac9895d3bf084e224f7e675818e799d3f01567218a774cc45cd4a6523866a8",
"size": 69901792
}
postinstall.js fetches the manifest, pulls the right binary, verifies the SHA256, makes it executable, and runs squirrel self install to set up ~/.local/bin/squirrel. If the self-install step fails, the npm wrapper at bin/squirrel.js falls back to the binary sitting inside the package directory itself — so npm install -g squirrelscan is both the install path and the fallback.
The binary sizes are telling. The darwin-arm64 build is 70MB; linux-x64 is 112MB; the musl variant for Alpine comes in at 104MB. That's a fully self-contained runtime — almost certainly Go or Rust compiled with the crawler, rules engine, and formatter all statically linked. No node_modules to worry about at runtime.
The platform detection in lib/platform.js handles musl vs. glibc Linux by checking for /lib/ld-musl-*.so.1 and /etc/alpine-release. The same logic is mirrored in install.sh for the curl-pipe path. Both use the same three-step detection, which means the install behavior is consistent whether you go through npm or the shell script.
The bin/squirrel.js wrapper is minimal by design — it just finds the binary in a known set of locations and runs it via spawnSync with the original args:
const binaryLocations = [
  path.join(homeDir, ".local", "bin", "squirrel"),
  "/usr/local/bin/squirrel",
  "/opt/homebrew/bin/squirrel",
  localBinary,
];
One detail worth noting: postinstall.js silently installs the audit-website skill globally via npx skills add squirrelscan/skills -g -y at the end of npm install. It swallows failures with a warning, but if you're installing this globally in a shared environment, that automatic skill install is going to run.
Using it
# Basic audit
squirrel audit https://example.com
# Pipe directly to claude
squirrel audit https://example.com --format llm | claude
# Limit pages for a fast pass
squirrel audit https://example.com -m 10
# Generate HTML report
squirrel audit https://example.com -f html -o report.html
# Focus on security
squirrel audit https://example.com --category security
The incremental crawling with ETag/Last-Modified support and checkpoint resume is the part that makes this practical for large sites — you're not re-crawling 2000 pages every time an agent reruns the audit.
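The mechanics behind that are plain HTTP conditional requests. A sketch of the pattern — not squirrelscan's internals, which live in the closed binary:

```javascript
// Sketch of incremental re-crawling with ETag/Last-Modified validators:
// resend the validators saved from the previous crawl and reuse the
// cached body on a 304 Not Modified. Uses the global fetch in Node 18+.
async function fetchIfChanged(url, cached) {
  const headers = {};
  if (cached?.etag) headers["If-None-Match"] = cached.etag;
  if (cached?.lastModified) headers["If-Modified-Since"] = cached.lastModified;

  const res = await fetch(url, { headers });
  if (res.status === 304) {
    // Unchanged since last crawl: keep the cached body and validators.
    return { ...cached, changed: false };
  }
  return {
    changed: true,
    body: await res.text(),
    etag: res.headers.get("etag"),
    lastModified: res.headers.get("last-modified"),
  };
}
```

Persist the validators per URL (plus a crawl checkpoint) and a rerun only pays for pages that actually changed.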
Rough edges
The source isn't here. The GitHub repo is an installer and a README. There's nothing to learn from, nothing to fork, nothing to audit. If the binary phones home, you won't know. If there's a bug in the rules engine, you can't trace it or patch it.
Version 0.0.38, reached after what looks like rapid iteration (v0.0.32 to v0.0.38 in a short window, judging by the commit history), suggests active development — but also that the API surface is still moving. The git history is almost entirely chore(package): bump version — no changelogs, no release notes in the repository itself.
The rule count in the README has been inconsistent — the git log shows a commit fixing it from 200+ to 230+, then back to 200+. Small thing, but it suggests the docs aren't tightly coupled to what the binary actually ships.
Bottom line
If you're building agent workflows that need to audit websites — fixing SEO regressions, running security checks, validating content quality — squirrelscan's LLM output format is a real design choice that saves prompt engineering. Just go in knowing this is a closed binary with an open install wrapper, not an open source project.
