# Similarity & Vector Search
This section is the reference for everything vector in Curiosity Workspace: what each embedding index does, which built-in models are available, how to call them from IQuery, how to compose multiple signals into a similarity scenario, and how to cluster and visualize the result.
If you only need to add semantic retrieval to the search UI, start with AI Search — that page is the operator's view. The pages below are the builder's view: what you call from C# in endpoints, code indexes, and the shell.
## The three embedding indexes
Curiosity ships three embedding indexes. Each produces a vector per node, stores it in an HNSW (Hierarchical Navigable Small World) approximate nearest-neighbor index, and exposes the same ISimilarityIndex / ITextSimilarityIndex surface to the rest of the system. They differ only in where the vector comes from:
| Index | Vector source | Use when |
|---|---|---|
| Sentence Embeddings | A transformer model encodes a text field on the node. | You want semantic search or recommendations over free-text fields (names, summaries, bodies). |
| Graph Embeddings (PageSpace) | A self-supervised model trains over the graph topology — nodes near each other in the graph end up near each other in vector space. | You want "structurally similar" results — e.g. users with similar behavior, products with similar buying patterns, regardless of text. |
| Raw Embeddings | You supply your own vectors (from an external provider, a domain-specific model, or any code you can run). | You already have embeddings, or you need a model the workspace doesn't ship with. |
All three implement ISimilarityIndex. From the query layer, a similar-products lookup looks identical regardless of which index is configured — the difference is purely in how the vector was produced.
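Because the indexes share one interface, the calling code does not change when the vector source does. Below is a minimal, purely illustrative C# sketch of an index-agnostic lookup. The only names taken from this page are IQuery.Similar and the index types; the Query() entry point, the StartAt overload, the maxResults parameter, the Product schema, and the EmitUIDs() terminal are all hypothetical assumptions, not the actual Curiosity API.

```csharp
// Hypothetical sketch only: everything except IQuery.Similar is an
// assumed name used for illustration, not the real Curiosity.Library API.
public static string[] SimilarProducts(Graph graph, string productUid)
{
    return graph.Query()
                .StartAt("Product", productUid)   // seed the chain at one node (assumed overload)
                .Similar(maxResults: 10)          // replace the set with its nearest neighbors
                .EmitUIDs();                      // return scored UIDs (assumed terminal call)
}
```

Whether Similar(...) pulls from Sentence Embeddings, PageSpace, or a Raw index is decided by which index is configured for the node type, not by this code.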
## Where similarity surfaces in code
| Layer | What it does |
|---|---|
| Index (SentenceEmbeddingsIndex, PageSpaceEmbeddingsIndex, RawEmbeddingsIndex) | Computes vectors and serves nearest-neighbor lookups. |
| IQuery.Similar(...) | Inside a query chain, replaces the current set with each node's neighbors. Cheap, single-index. See IQuery Similarity Search. |
| IQuery.StartAtSimilarTextAsync(...) | Starts a query from text — encodes the text and pulls neighbors from an ITextSimilarityIndex. |
| IQuery.ToSimilarity(...) | Builds a multi-signal scenario: combine vector neighbors with graph traversals or external lookups, fuse the signals, and apply rules. See Similarity Engine. |
| WeightedGraph<T>.Cluster(...) | Takes a list of weighted similarity edges and groups nodes into clusters. See Clustering & Visualization. |
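To make the IQuery entry points concrete, here is a hedged sketch of starting a query from free text instead of from an existing node. Only the StartAtSimilarTextAsync name comes from this page; the entry point, the index-selection parameter, the example text, and the result handling are illustrative assumptions rather than the real signature.

```csharp
// Hypothetical sketch: encode free text via a text-similarity index and
// start the query chain at its nearest neighbors. Parameter names and
// the surrounding API shape are assumptions.
var hits = await graph.Query()
                      .StartAtSimilarTextAsync(
                          "wireless noise-cancelling headphones",  // text to encode
                          index: "Product-Description")            // assumed way to pick the index
                      .EmitUIDs();                                 // assumed terminal returning scored UIDs
```

The same chain can then continue with ordinary graph steps, which is what distinguishes this from a plain vector-database lookup.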
## In this section
### Sentence Embeddings

Embed text fields with built-in transformer models (MiniLM, ArcticXS) or an external provider. The default choice for semantic search.

### Graph Embeddings (PageSpace)

Self-supervised embeddings derived from the graph's topology. Find structurally similar nodes regardless of their text content.

### Raw Embeddings

Bring your own vectors. Index pre-computed embeddings from any source via Curiosity.Library.

### IQuery Similarity Search

Consume vectors from inside IQuery: Similar(), StartAtSimilarTextAsync(), narrowing to a specific index, and building endpoints that return scored UIDs.
## See also
- AI Search — operator-facing configuration of the same embedding indexes.
- Indexes overview — how indexes are queued and processed.
- Code Indexes — how to compute derived text or your own vectors inside the workspace.
- Graph Query Language — the full IQuery surface.