Curiosity - Sentence embeddings

Flow diagram showing Ticket.Body, text segments, and Vector index with arrows and a network-pattern background.

Sentence embeddings

Embeddings are the foundation. Each text field you mark as vector-indexed gets encoded into a dense vector and stored in an approximate nearest-neighbour (ANN) index. Two nodes with similar vectors are semantically similar.

Built-in models:

Model	Max tokens	Best for
MiniLM	256	Fast, low-RAM, good for short fields
ArcticXS	512	Higher recall, default for new indexes
External (OpenAI-compatible)	4096	Highest quality, text leaves your network

Register a vector index in code:

await Graph.Indexes.AddSentenceEmbeddingsIndexAsync(
    nodeType:  N.Ticket.Type,
    fieldName: N.Ticket.Body,
    model:     SentenceEncoderModel.ArcticXS);

Or enable it in Settings → Search → Indexes without writing code.

Find similar nodes by text:

var query = await Q().StartAtSimilarTextAsync(
    "battery drain overnight", count: 10, nodeTypes: new[] { N.Ticket.Type });

return query.EmitWithScores();

Find neighbours of a known node:

return Q().StartAt(seedUID)
          .Similar(indexUID: ticketBodyIndex, count: 10)
          .EmitWithScores();

EmitWithScores() is the only emitter that keeps the similarity score — you need it to rank and explain. The next steps build on these two lookups.

→ Sentence embeddings

Go back 01-what-is-similarity-and-recommendation Next step 03-graph-and-raw-embeddings