# Sentence Embeddings
The SentenceEmbeddingsIndex reads a text field off each node, runs it through a transformer encoder, and stores the resulting vector in an HNSW index. It's the default choice for semantic search and text-based recommendations.
The class lives in `Mosaik.GraphDB.Indexes.SentenceEmbeddingsIndex`. It implements `ITextSimilarityIndex`, so anything that consumes embeddings — `IQuery.StartAtSimilarTextAsync`, hybrid search, the similarity engine — works against it.
## When to use it
Reach for sentence embeddings when the content of the field is what makes two nodes similar: product names, support case summaries, article bodies, code snippets, chat messages. If the words don't matter and only the graph structure does, use Graph Embeddings instead. If you already have vectors from a domain-specific model, use Raw Embeddings.
## Built-in models
The index ships with three encoder choices (plus a `None` sentinel), selected by `SentenceEncoderModel`:
| Model | Source | Runs | Max chunk | Notes |
|---|---|---|---|---|
| MiniLM | all-MiniLM-L6-v2 ONNX | In-process (CPU/GPU) | ~256 tokens | Fast, low-RAM. Good default for short to medium fields. |
| ArcticXS | snowflake-arctic-embed-xs ONNX | In-process (CPU/GPU) | ~512 tokens | Default for new indexes. Higher recall than MiniLM at comparable cost. |
| External | Any OpenAI-compatible embeddings endpoint (OpenAI, Azure OpenAI, Cohere, Google, custom URL) | Remote HTTP | 4096 tokens (`ExternalSentenceEncoder.DefaultMaxChunkLength`) | Use when you need a hosted model — set provider, URL, model name, API key on the options. |
| None | — | — | — | Sentinel; disables embedding. |
ArcticXS is `SentenceEmbeddingsIndex.SentenceEncoderModelDefaultModel`, the default selected by the AI Search controller and most UI flows.
The External model is configured via `SentenceEmbeddingsIndexOptions`:

```csharp
opts.SentenceEncoderModel = SentenceEncoderModel.External;
opts.ExternalProviderName = "OpenAi";  // "OpenAi" | "AzureOpenAi" | "Cohere" | "Google"
opts.ExternalProviderUrl = "";         // override base URL if needed
opts.ExternalProviderModel = "text-embedding-3-small";
opts.ExternalProviderApiKey = "sk-…";
```
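For context on what the External encoder sends over the wire, an OpenAI-compatible embeddings call is a plain JSON POST to the provider's `/v1/embeddings` route. A minimal Python sketch of the request body (illustrative only; the actual HTTP plumbing lives inside `ExternalSentenceEncoder`, and the function name here is hypothetical):

```python
import json

def build_embeddings_request(model: str, texts: list[str]) -> str:
    """Build the JSON body for an OpenAI-compatible POST /v1/embeddings call."""
    return json.dumps({"model": model, "input": texts})

# The API key travels in an "Authorization: Bearer …" header, not in the body.
payload = build_embeddings_request("text-embedding-3-small",
                                   ["battery drains overnight"])
```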
## How indexing happens
Each indexed node turns into either one vector (no chunking) or many vectors (chunked, with the parent UID stored alongside each chunk). At query time, chunk hits dedupe back to the parent node.
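The chunking and dedupe behaviour described above can be sketched in a few lines of Python (a language-neutral illustration of the idea, not the index's actual C# code):

```python
def chunk_tokens(tokens: list, window: int, overlap: int) -> list:
    """Split a token sequence into overlapping windows (the ChunkText idea)."""
    if len(tokens) <= window:
        return [tokens]          # short values produce a single vector
    step = window - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break                # last window already covers the tail
    return chunks

def dedupe_to_parents(chunk_hits: list) -> list:
    """Collapse chunk hits back to parent nodes, keeping each parent's best score.
    chunk_hits: (parent_uid, score) pairs as returned by the vector search."""
    best = {}
    for uid, score in chunk_hits:
        if score > best.get(uid, float("-inf")):
            best[uid] = score
    return sorted(best.items(), key=lambda kv: -kv[1])
```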
## Registering the index

Use the `Graph.Indexes` extension method. Two common cases:
1. Default settings — short field.
```csharp
var index = await Graph.Indexes.AddSentenceEmbeddingsIndexAsync(
    nodeType: N.Product.Type,
    fieldName: N.Product.Name,
    model: SentenceEncoderModel.ArcticXS);
```
2. Long-form content with chunking + AI Search enabled.
```csharp
var settings = new SettingsHolder();
settings.ManuallySet(nameof(SentenceEmbeddingsIndexOptions.ChunkText), "True");
settings.ManuallySet(nameof(SentenceEmbeddingsIndexOptions.ChunkOverlap), "50");
settings.ManuallySet(nameof(SentenceEmbeddingsIndexOptions.MaximumChunks), "200");
settings.ManuallySet(nameof(SentenceEmbeddingsIndexOptions.MinimumLength), "20");
settings.ManuallySet(nameof(SentenceEmbeddingsIndexOptions.EnableAISearch), "True");
settings.ManuallySet(nameof(SentenceEmbeddingsIndexOptions.InjectResultCutoff), "0.50");
settings.ManuallySet(nameof(SentenceEmbeddingsIndexOptions.RerankResultCutoff), "0.40");

var index = await Graph.Indexes.AddSentenceEmbeddingsIndexAsync(
    nodeType: N.SupportCase.Type,
    fieldName: N.SupportCase.Content,
    model: SentenceEncoderModel.ArcticXS,
    setting: settings);
```
You can also register the same configuration through the admin UI under Settings → Indexes → Code Indexes → Sentence Embeddings, or via a migration. The UI sets the same SentenceEmbeddingsIndexOptions fields under the hood.
## Index options reference

The `SentenceEmbeddingsIndexOptions` class drives every knob:
| Option | Type | Default | Effect |
|---|---|---|---|
| `SentenceeEncoderModel` | enum | `ArcticXS` | Which encoder runs. Changing it rebuilds the index. |
| `MinimumLength` | int | 20 | Skip values shorter than this; small strings hurt precision. |
| `InferencingCores` | int | 1 | ONNX session intra-op threads. |
| `ParallelInferencing` | int | 1 | How many texts encode concurrently. |
| `ChunkText` | bool | false | Split long values into overlapping windows before encoding. |
| `ChunkOverlap` | int | 50 | Tokens shared between adjacent chunks (only used when `ChunkText` is true). |
| `MaximumChunks` | int | 100 | Cap chunks per value to bound work on very long docs. |
| `EnableAISearch` | bool | false | Allow the search controller to inject these vectors into hybrid queries. |
| `InjectResultCutoff` | float | 0.50 | Min cosine similarity for a vector hit to enter the search result set. |
| `RerankResultCutoff` | float | 0.40 | Min similarity to reorder an existing BM25 hit. |
| `ResultsToExpand` | int | 100 | Top-N pulled from HNSW before cutoff is applied. |
| `Binary` | bool | false | Store vectors as 1-bit values (smaller, faster, less accurate). |
| `Buckets` | int | 1 | Shards the HNSW across N searchers — increase for very large corpora. |
| `FileTypes` | enum[] | null | Restrict to files matching these `FilesType` values (used for `_FileEntry`). |
| `SourcesToIndex` | string[] | null | Restrict to nodes ingested by specific connectors. |
| `ExternalProviderName` / `…Url` / `…Model` / `…ApiKey` | string | — | External encoder configuration (only used with `SentenceEncoderModel.External`). |
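To make the interplay of `ResultsToExpand`, `InjectResultCutoff`, and `RerankResultCutoff` concrete, here is a hedged Python sketch of how the hybrid merge plausibly behaves (the real controller logic is more involved, and BM25 and cosine scores are not directly comparable in practice, but the cutoff roles match the table above):

```python
def merge_vector_hits(bm25_hits: dict, vector_hits: list,
                      results_to_expand: int = 100,
                      inject_cutoff: float = 0.50,
                      rerank_cutoff: float = 0.40) -> list:
    """bm25_hits: uid -> score; vector_hits: (uid, cosine) pairs, best first.
    Returns uids ranked best-first after applying both cutoffs."""
    merged = dict(bm25_hits)
    for uid, cos in vector_hits[:results_to_expand]:   # top-N pulled from HNSW
        if uid in bm25_hits:
            if cos >= rerank_cutoff:
                merged[uid] = cos          # reorder an existing BM25 hit
        elif cos >= inject_cutoff:
            merged[uid] = cos              # inject a pure vector hit
    return sorted(merged, key=lambda uid: -merged[uid])
```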
## Consuming the vectors

Once the index is built, you query it like any other text similarity index. From an endpoint:
```csharp
// 1. Pure semantic retrieval (text → similar nodes).
var query = await Q()
    .StartAtSimilarTextAsync(
        text: "battery drains overnight",
        count: 20,
        nodeTypes: new[] { N.SupportCase.Type });
var hits = query.EmitWithScores();

// 2. From a seed node — find neighbors of an existing UID using only this index.
var index = Graph.Indexes
    .OfType<SentenceEmbeddingsIndex>(N.Product.Type)
    .First(i => i.FieldName == N.Product.Name);

return Q().StartAt(productUID)
    .Similar(IndexTypes.SentenceEmbeddingsIndex, index.UID, count: 20)
    .EmitWithScores();
```
See IQuery Similarity Search for the full set of IQuery methods that consume sentence embeddings, including how to filter by index UID.
## Inspecting an index in code
```csharp
foreach (var ix in Graph.Indexes.OfType<SentenceEmbeddingsIndex>())
{
    Logger.LogInformation("{Field} on {Type}: {Vectors} vectors, model={Model}",
        ix.FieldName, ix.NodeType, ix.VectorsCount, ix.SentenceEncoderModel);
}
```
`ITextSimilarityIndex.PredictVectorAsync(text, ct)` returns the raw vector for a string — useful when you want to feed the encoder's output into a non-Curiosity store, or to debug similarity scores.
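For instance, once you have two vectors back from `PredictVectorAsync`, the score the index ranks by is plain cosine similarity, which is easy to recompute while debugging (Python sketch):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```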
## See also
- AI Search — operator-facing tuning of the same options.
- IQuery Similarity Search — calling the index from queries and endpoints.
- Similarity Engine — combining sentence embeddings with graph signals.
- Raw Embeddings — when you have vectors from elsewhere.
- Indexes overview — how the indexing queue works.