# Vector Search
Vector search (semantic search) retrieves results by meaning rather than by exact keywords. Curiosity Workspace achieves this by embedding text fields into vectors and indexing those vectors for fast similarity queries.
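For intuition, here is a minimal, self-contained sketch of the embed-then-compare idea. It uses the open-source sentence-transformers package purely as an illustration of the concept, not the Curiosity Workspace API; the model name and documents are example values.

```python
# Illustrative only: shows "embed text, then compare vectors" with the
# open-source sentence-transformers package, not the Workspace API.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # example model choice

documents = [
    "Customer reported a billing discrepancy on the March invoice.",
    "Notes from the onboarding call with the new supplier.",
    "Steps to reset a forgotten account password.",
]
doc_vectors = model.encode(documents, normalize_embeddings=True)

query_vector = model.encode("user cannot log in", normalize_embeddings=True)

# With normalized vectors, the dot product equals cosine similarity.
scores = doc_vectors @ query_vector
best = int(np.argmax(scores))
print(documents[best], float(scores[best]))
```

Note that the query shares no keywords with the best match ("reset a forgotten account password"); the overlap is in meaning, which is exactly what vector search captures.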
## When to use vector search
Vector search is most useful when:
- users don’t know the right keywords
- the text is long and varied (notes, conversations, articles)
- you need “similar items” experiences (similar cases, related documents)
## Core concepts
- Embedding model: converts text into a vector representation.
- Vector index: stores vectors to support fast approximate nearest-neighbor search.
- Similarity score: higher means “closer in meaning”.
- Chunking: splitting long fields into smaller pieces so embeddings preserve important content.
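To make the similarity score concrete, the sketch below computes cosine similarity between hand-made three-dimensional vectors. Real embeddings have hundreds of dimensions, but the idea is the same: the higher the score, the closer the meaning.

```python
# A minimal sketch of the "similarity score" concept: cosine similarity
# between embedding vectors. The vectors here are made up for illustration.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v_query = np.array([0.12, 0.87, 0.45])
v_close = np.array([0.10, 0.90, 0.40])   # similar meaning -> high score
v_far   = np.array([0.95, -0.20, 0.05])  # unrelated meaning -> low score

print(cosine_similarity(v_query, v_close))  # close to 1.0
print(cosine_similarity(v_query, v_far))    # much lower
```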
## What to index with embeddings
Good candidates:
- descriptions, summaries, bodies
- conversations, transcripts
- knowledge base content
Avoid embedding:
- short identifiers (IDs, codes)
- fields where exact matching is required (unless used only as a supplement)
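As an illustration, a per-field setup might look like the hypothetical configuration below. The field names and flags are invented for this example and are not the actual Workspace schema syntax; the point is the split between embedded free-text fields and exact-match fields.

```python
# Hypothetical field configuration (not the actual Curiosity Workspace
# schema syntax): embed long free-text fields, keep IDs and codes as
# exact-match/keyword fields only.
case_fields = {
    "CaseId":      {"embed": False, "keyword": True},   # short identifier: exact match only
    "Status":      {"embed": False, "keyword": True},   # controlled vocabulary: facet/filter
    "Title":       {"embed": True,  "keyword": True},   # short but meaningful text
    "Description": {"embed": True,  "keyword": True},   # long free text: main embedding target
    "Transcript":  {"embed": True,  "keyword": False, "chunk": True},  # long content, chunked
}
```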
## Chunking strategy (practical guidance)
If your content can exceed typical embedding context lengths:
- enable chunking for those fields
- choose chunk sizes that preserve semantic units (paragraphs, messages):
  - too small: context is lost across chunk boundaries
  - too large: chunks get truncated or produce a diluted representation
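A simple paragraph-based chunker illustrates the trade-off. It assumes plain text with blank lines between paragraphs; the character limit is an arbitrary example value, not a recommended setting.

```python
# A minimal paragraph-based chunker: pack whole paragraphs into chunks
# up to a size limit, so semantic units are never split mid-paragraph.
def chunk_text(text: str, max_chars: int = 1500) -> list[str]:
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # Start a new chunk if adding this paragraph would exceed the limit.
        if current and len(current) + len(paragraph) + 2 > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = f"{current}\n\n{paragraph}" if current else paragraph
    if current:
        chunks.append(current)
    return chunks
```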
## Common pitfalls
- Embedding everything: increases cost and can reduce precision.
- No grounding: semantic results are better when constrained by facets/graph context.
- No evaluation: tune similarity cutoffs and chunking with real user queries.
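The sketch below shows grounding and a cutoff working together: candidates are first restricted by a facet value, then weak semantic matches are dropped before ranking. The result records, scores, and the "team" facet are hypothetical.

```python
# Illustrative post-processing of semantic results: ground by a facet,
# then apply a similarity cutoff. Data structures are made up.
results = [
    {"title": "Invoice dispute notes",   "team": "finance", "score": 0.82},
    {"title": "Password reset how-to",   "team": "support", "score": 0.78},
    {"title": "Quarterly planning memo", "team": "finance", "score": 0.31},
]

SIMILARITY_CUTOFF = 0.6  # tune against real user queries, not by guesswork

grounded = [r for r in results if r["team"] == "finance"]          # facet/graph constraint
kept = [r for r in grounded if r["score"] >= SIMILARITY_CUTOFF]     # drop weak matches
kept.sort(key=lambda r: r["score"], reverse=True)
print(kept)  # only "Invoice dispute notes" survives both constraints
```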
## Next steps
- Combine with keywords for robust relevance: Hybrid Search
- Tune ranking and cutoffs: Ranking Tuning