# Semantic search (RAG)
Regex search finds text. Structural search follows the graph. Neither finds code by meaning — a query like “payment retry with exponential backoff” misses functions named `attempts_remaining` or `stripe_backoff` even though they do exactly that.
Semantic search fixes this by embedding every symbol with a small language model and indexing the vectors. At query time it embeds the natural-language query and returns the N nearest symbols by cosine similarity.
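The retrieval step is simple enough to sketch. Below is an illustrative brute-force version of “embed the query, rank every symbol vector by cosine similarity” — the function names and in-memory index are hypothetical, not the actual `gl` internals:

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank every indexed symbol against the (already-embedded) query vector
// and keep the k best matches.
function topK(queryVec, index, k) {
  return Object.entries(index)
    .map(([name, vec]) => ({ name, score: cosine(queryVec, vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Brute force is linear in the number of symbols, which is why small projects don’t need a vector index at all.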
## Three tools
Section titled “Three tools”code_embed_index # build or refresh vectors (idempotent)code_semantic_search(query) # natural language → top-K symbolscode_similar(name) # find symbols like this oneAll three are wired into Code Mode — gl.code.semantic_search({query, top_k}) works inside gl_run.
## Zero-config setup
Ollama is the default provider. If it’s running, nothing else is needed:
```sh
# One-time: install Ollama + pull the embedding model
ollama pull nomic-embed-text

# Watch your project — embeddings auto-build in the background
gl watch .
```

You’ll see in the log:

```
Building semantic search index in background...
Semantic index ready: 47 embedded, 0 unchanged (model: nomic-embed-text)
```

## Alternate providers
| Provider | Model | Dim | Env vars |
|---|---|---|---|
| `ollama` | `nomic-embed-text` (default) | 768 | `OLLAMA_URL=http://localhost:11434`, `GL_OLLAMA_EMBED_MODEL=nomic-embed-text` |
| `openai` | `text-embedding-3-small` | 1536 | `GL_EMBED_PROVIDER=openai`, `OPENAI_API_KEY=sk-...` |
| `fastembed` | `BGE-small-en-v1.5` (ONNX) | 384 | Feature-gated: `cargo build --features local-embeddings` |
Switch at runtime by setting `GL_EMBED_PROVIDER`. Vectors are tagged with their model — switching triggers a background re-embed.
## How the “fingerprint” works
We don’t embed raw code bodies. Bodies are noisy — full of boilerplate, error handling, and imports. Instead we compose a short, dense “fingerprint” per symbol that captures intent:
```
function authenticate_user(email, password) -> Result<User, AuthError>
docstring: Validates credentials and returns the user if valid.
calls: hash_password, lookup_user, issue_session
callers: login_handler, reset_password
file: src/auth/service.rs
```

The fingerprint is blake3-hashed so `code_embed_index` can skip symbols whose fingerprint hasn’t changed. Running on every `gl watch` cycle is effectively free after the first pass.
## Query examples
Section titled “Query examples”// Natural languagegl.code.semantic_search({ query: "validate user credentials", top_k: 5 })// → authenticate_user (0.654 cosine), hash_password (0.514), ...
// File filtergl.code.semantic_search({ query: "route handler", file_filter: "src/api/"})
// Similarity from an anchorgl.code.similar({ name: "retry_with_backoff" })// → exponential_delay, circuit_breaker_reset, ...Stacking with the graph
Semantic search alone returns symbols by meaning. Stack it with graph tools for “meaning + structure”:
```js
gl_run({ code: `
  // Find payment-related code
  const hits = gl.code.semantic_search({ query: "payment processing and retries", top_k: 10 });

  // Get blast radius for each one
  return hits.map(h => ({
    name: h.name,
    file: h.file,
    impact: gl.code.impact({ name: h.name, depth: 2 })
  }));
` })
```

## Storage
Vectors are stored in the project’s graph backend:
- **Cozo** (default): an `embedding` relation with raw little-endian f32 bytes. Brute-force cosine in Rust (~50 ms for 10k symbols).
- **FalkorDB**: `:Embedding` nodes with JSON-serialized vectors. Same brute-force approach.
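As a rough illustration of the Cozo byte layout (not the actual storage code), a vector round-trips through raw little-endian f32 bytes like this:

```javascript
// Serialize a vector as raw little-endian f32 bytes (4 bytes per component).
function vecToBytes(vec) {
  const buf = Buffer.alloc(vec.length * 4);
  vec.forEach((v, i) => buf.writeFloatLE(v, i * 4));
  return buf;
}

// Read the bytes back into a plain array of numbers.
function bytesToVec(buf) {
  const out = new Array(buf.length / 4);
  for (let i = 0; i < out.length; i++) out[i] = buf.readFloatLE(i * 4);
  return out;
}
```

At f32 precision, a 768-dim `nomic-embed-text` vector is 3,072 bytes per symbol — small enough that brute-force scans stay cheap.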
HNSW indexing is on the roadmap for v2 — it’ll matter once we have 100k+ symbols per project.
## Disabling auto-index
If you don’t want `gl watch` to build embeddings automatically:

```sh
export GL_EMBED_AUTO=false
```

You can then build them manually when needed:
```js
code_embed_index({})
```

## Troubleshooting
- **“Embedding provider unavailable”** — Ollama isn’t running. Run `ollama serve` in another terminal, then `ollama pull nomic-embed-text`.
- **“No embeddings yet for model X”** — run `code_embed_index({})` once, or enable auto-index.
- **Queries return nothing relevant** — try `code_embed_index({force: true})` to rebuild from scratch, then re-query. The fingerprint compose logic improves over releases.