# Code Mode — 1 tool, 57 capabilities

One tool. Runs sandboxed JavaScript. Calls every Ganglia tool through typed wrappers. Chains N tool calls server-side so intermediate results never re-enter your agent’s context window.
A typical agent workflow — “find dead functions that have no tests” — is a 50-step sequence:
- Call `code_dead` → 50 functions come back
- Call `code_test_for` for each of the 50 → 50 more results in context
- Filter manually → final answer
Each step forces the LLM to re-read the entire conversation. Schema tokens + accumulated results grow ~quadratically with the number of tool calls. On real workflows this burns 100,000+ tokens that never needed to be there.
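The ~quadratic growth is easy to make concrete with a toy model (illustrative constants, not the benchmark’s measurements): on turn *k* the model re-reads the tool schemas plus every result accumulated on turns 1 through *k − 1*.

```javascript
// Toy model of context cost when every tool result stays in the conversation.
// SCHEMA_TOKENS and RESULT_TOKENS are assumed values for illustration.
const SCHEMA_TOKENS = 6000; // tool schemas, re-read every turn
const RESULT_TOKENS = 300;  // average tokens per tool result

function tokensReadByModel(toolCalls) {
  let total = 0;
  for (let turn = 1; turn <= toolCalls; turn++) {
    // Each turn re-reads the schemas plus every prior result.
    total += SCHEMA_TOKENS + (turn - 1) * RESULT_TOKENS;
  }
  return total;
}

console.log(tokensReadByModel(10)); // 73500
console.log(tokensReadByModel(50)); // 667500 — ~9× more for 5× the calls
```

Linear in calls for the schemas, quadratic for the accumulated results: five times the tool calls costs roughly nine times the tokens here, and the gap keeps widening.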
## How Code Mode fixes it

```js
gl_run({ code: `
  const dead = gl.code.dead({});
  const coverage = dead.map(fn => gl.code.test_for({ name: fn.name }));
  return dead.filter((_, i) => coverage[i].tests.length === 0);
` })
```

- One turn. Agent writes code once, gets final answer back.
- Server runs the chain internally. 50 intermediate results never touch the context.
- Only the filtered output returns. The 47 functions that had tests are silently discarded server-side.
## Real benchmarks

Running `scripts/benchmark_code_mode.py` on a 462-file Rust project with 5 realistic workflows:
| Metric | Individual tools | Code Mode | Reduction |
|---|---|---|---|
| Schema load (one-time) | 6,376 tokens | 575 tokens | 91% |
| Wire tokens per workflow | 4,014 | 2,316 | 1.7× |
| Effective LLM context | 170,424 | 9,191 | 18.5× |
The third row is what your agent actually pays for — tokens that enter the model’s context window across all turns. On longer workflows (10+ tool calls) the ratio exceeds 30×.
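The reduction column follows directly from the two measurement columns, so you can recompute it yourself:

```javascript
// Recompute the "Reduction" column from the benchmark table's raw numbers.
const rows = {
  schema:  { individual: 6376,   codeMode: 575  },  // one-time schema load
  wire:    { individual: 4014,   codeMode: 2316 },  // wire tokens per workflow
  context: { individual: 170424, codeMode: 9191 },  // effective LLM context
};

const pctReduction = r => Math.round((1 - r.codeMode / r.individual) * 100);
const ratio = r => (r.individual / r.codeMode).toFixed(1);

console.log(pctReduction(rows.schema) + '%'); // 91%
console.log(ratio(rows.wire) + '×');          // 1.7×
console.log(ratio(rows.context) + '×');       // 18.5×
```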
## The sandbox

Code runs in QuickJS with hard limits:

- 256 MB memory — can’t allocate forever
- 30 s timeout — can’t loop forever
- No filesystem, no network, no process — the only escape hatch is `gl.*` wrappers back into the host
- No recursion — `gl_run` can’t call itself
The footprint stays lean: the QuickJS runtime is embedded in the `gl` binary (~500 KB), has no external dependencies, and starts in milliseconds.
## The API

All wrappers return the tool’s output synchronously. No `await`.

```js
const gl = {
  code: {
    grep, get, dead, hotspots, impact, callers, callees,
    semantic_search, similar, /* … 45 more */
  },
  doc: { index, list, toc, get, query },
  smart: { read, grep, diff, context },
  deliberation: { start, opinion, status, result },
  call: (name, args) => { /* escape hatch for any registered MCP tool */ },
};
```

The full typed API is available as an MCP resource — `ganglia://types/api.d.ts`. Claude Code fetches and caches it automatically if your client supports resources.
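One way such wrappers can be built (a sketch under assumptions; the actual Ganglia implementation isn’t shown here) is a Proxy over a generic dispatcher. Intercepting unknown property access is also what makes the typo recovery described below possible. `dispatch` is a hypothetical stand-in for the host bridge:

```javascript
// Hypothetical host bridge: in the real sandbox this would cross into the host.
function dispatch(tool, args) {
  return { tool, args }; // stub: echo the call for illustration
}

// Build a namespace like gl.code from a method list; unknown names fail loudly.
function namespace(prefix, methods) {
  const target = Object.fromEntries(
    methods.map(m => [m, args => dispatch(`${prefix}_${m}`, args)])
  );
  return new Proxy(target, {
    get(obj, prop) {
      // Pass through known methods and symbol lookups (e.g. from console.log).
      if (typeof prop === 'symbol' || prop in obj) return obj[prop];
      throw new Error(
        `Unknown method gl.${prefix}.${String(prop)} — available: ${methods.join(', ')}`
      );
    },
  });
}

const gl = { code: namespace('code', ['grep', 'get', 'dead', 'callers']) };

console.log(gl.code.grep({ pattern: 'foo' }).tool); // "code_grep"
// gl.code.rgep({}) → throws: Unknown method gl.code.rgep — available: grep, get, dead, callers
```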
## When to use gl_run vs. individual tools

Use `gl_run` when:

- Your workflow would take 3+ sequential tool calls
- You want to filter / transform / aggregate results before they hit the LLM
- You’re writing a multi-step analysis (“find X and show me Y for each”)

Use individual tools when:

- It’s a one-shot lookup (`code_get`, `code_grep`)
- You need to see the intermediate result to decide what to do next
- You’re exploring interactively
## Typo recovery

Guessed the wrong method name? The Proxy wrapper helps:

```js
> gl.code.rgep({pattern: "foo"})
Error: Unknown method gl.code.rgep — available: annotate, build, callees, callers, ..., grep, hotspots, impact, ..., search, ...
```

Or when calling via `gl.call(name)`:

```js
> gl.call("code_rgep", {...})
Error: Unknown tool: code_rgep (did you mean `code_grep`?)
```

## Can I stack it with semantic search?

Yes. The whole point.

```js
gl_run({ code: `
  const hits = gl.code.semantic_search({
    query: "payment retry with exponential backoff",
    top_k: 5
  });
  return hits.map(h => ({
    name: h.name,
    callers: gl.code.callers({ name: h.name }),
    impact: gl.code.impact({ name: h.name, depth: 2 })
  }));
` })
```

One call. Three layers of retrieval (semantic → graph → blast radius). Returns a structured digest. Your agent’s context doesn’t blow up.
## Further reading

- Semantic search — the RAG layer under `gl.code.semantic_search`
- Changelog — what landed when
- Benchmark script — run the numbers on your own codebase