
Semantic search (RAG)

Regex search finds text. Structural search follows the graph. Neither finds code by meaning — a query like “payment retry with exponential backoff” misses functions named attempts_remaining or stripe_backoff even though they do exactly that.

Semantic search fixes this by embedding every symbol with a small language model and indexing the vectors. At query time it embeds the natural-language query and returns the top-K nearest symbols by cosine similarity.
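The ranking step is plain vector math. A minimal JavaScript sketch of the query side, using a toy 3-dimensional index with made-up symbol names (real embeddings are hundreds of dimensions, and the actual ranking happens inside gl, not in user code):

```javascript
// Score two vectors by cosine similarity.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank every indexed symbol against the query vector, keep the best k.
function topK(queryVec, index, k) {
  return index
    .map(({ name, vec }) => ({ name, score: cosine(queryVec, vec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Toy index: two symbols with hand-written 3-dim "embeddings".
const index = [
  { name: "authenticate_user", vec: [0.9, 0.1, 0.0] },
  { name: "render_sidebar",    vec: [0.0, 0.2, 0.9] },
];
const hits = topK([1, 0, 0], index, 1);
```

Brute force like this is exactly what the backends do today (see the storage notes below); the clever part is what gets embedded, not how it is scored.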

code_embed_index # build or refresh vectors (idempotent)
code_semantic_search(query) # natural language → top-K symbols
code_similar(name) # find symbols like this one

All three are wired into Code Mode — gl.code.semantic_search({query, top_k}) works inside gl_run.

Ollama is the default provider. If it’s running, nothing else is needed:

# One-time: install Ollama + pull the embedding model
ollama pull nomic-embed-text
# Watch your project — embeddings auto-build in the background
gl watch .

You’ll see in the log:

Building semantic search index in background...
Semantic index ready: 47 embedded, 0 unchanged (model: nomic-embed-text)
Supported embedding providers:

  • ollama (default): nomic-embed-text, 768 dims.
    Env: OLLAMA_URL=http://localhost:11434, GL_OLLAMA_EMBED_MODEL=nomic-embed-text
  • openai: text-embedding-3-small, 1536 dims.
    Env: GL_EMBED_PROVIDER=openai, OPENAI_API_KEY=sk-...
  • fastembed: BGE-small-en-v1.5 (ONNX), 384 dims.
    Feature-gated: cargo build --features local-embeddings

Switch at runtime by setting GL_EMBED_PROVIDER. Vectors are tagged with their model — switching triggers a background re-embed.
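For example, to point a shell session at the OpenAI provider (the key value is a placeholder, as in the provider list above):

```shell
# Select the OpenAI embedding provider for this shell. Vectors tagged with
# the previous model stay in the store; the next index pass re-embeds in
# the background under the new model tag.
export GL_EMBED_PROVIDER=openai
export OPENAI_API_KEY=sk-...
```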

We don’t embed raw code bodies. Bodies are noisy — full of boilerplate, error handling, imports. Instead we compose a short, dense “fingerprint” per symbol that captures intent:

function authenticate_user(email, password) -> Result<User, AuthError>
  docstring: Validates credentials and returns the user if valid.
  calls: hash_password, lookup_user, issue_session
  callers: login_handler, reset_password
  file: src/auth/service.rs

The fingerprint is blake3-hashed so code_embed_index can skip symbols whose fingerprint hasn’t changed. Running on every gl watch cycle is effectively free after the first pass.

// Natural language
gl.code.semantic_search({ query: "validate user credentials", top_k: 5 })
// → authenticate_user (0.654 cosine), hash_password (0.514), ...

// File filter
gl.code.semantic_search({
  query: "route handler",
  file_filter: "src/api/"
})

// Similarity from an anchor
gl.code.similar({ name: "retry_with_backoff" })
// → exponential_delay, circuit_breaker_reset, ...

Semantic search alone returns symbols by meaning. Stack it with graph tools for “meaning + structure”:

gl_run({ code: `
  // Find payment-related code
  const hits = gl.code.semantic_search({
    query: "payment processing and retries",
    top_k: 10
  });

  // Get blast radius for each one
  return hits.map(h => ({
    name: h.name,
    file: h.file,
    impact: gl.code.impact({ name: h.name, depth: 2 })
  }));
` })

Vectors are stored in the project’s graph backend:

  • Cozo (default): embedding relation with raw little-endian f32 bytes. Brute-force cosine in Rust (~50 ms for 10k symbols).
  • FalkorDB: :Embedding nodes with JSON-serialized vectors. Same brute-force approach.

HNSW indexing is on the roadmap for v2 — it’ll matter once we have 100k+ symbols per project.
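The Cozo storage detail, raw little-endian f32 bytes scored by brute force, can be mimicked in a few lines. A sketch only: gl does this in Rust, and the Buffer below stands in for bytes read out of the embedding relation.

```javascript
// Decode a vector stored as raw little-endian f32 bytes.
function decodeF32LE(buf) {
  const out = new Float32Array(buf.length / 4);
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  for (let i = 0; i < out.length; i++) out[i] = view.getFloat32(i * 4, true);
  return out;
}

// Brute-force cosine over decoded vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Round-trip a 3-dim vector [1, 0, 0] through the byte encoding.
const bytes = Buffer.alloc(12);
bytes.writeFloatLE(1, 0);
const vec = decodeF32LE(bytes);
```

A flat scan of 4-byte floats keeps the storage format trivial and backend-agnostic, which is why both Cozo and FalkorDB share the same approach until HNSW lands.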

If you don’t want gl watch to build embeddings automatically:

export GL_EMBED_AUTO=false

You can then build them manually when needed:

code_embed_index({})

Common issues:
  • “Embedding provider unavailable” — Ollama isn’t running. Run ollama serve in another terminal, then ollama pull nomic-embed-text.
  • “No embeddings yet for model X” — run code_embed_index({}) once. Or enable auto-index.
  • Queries return nothing relevant — try code_embed_index({force: true}) to rebuild from scratch, then re-query. The fingerprint compose logic improves over releases.