k-NN search
KNNSearch is the read side of SmallWorld<TItem, TDistance>. Given a query item and a k, it returns up to k items judged closest to the query under your distance function.
A single query
IList<SmallWorld<float[], float>.KNNSearchResult> hits =
graph.KNNSearch(query, k: 10);
Each KNNSearchResult exposes:
Id— the integer ID assigned byAddItems.Item— the originalTItemreference.Distance— the numeric distance from the query.
The result list is not pre-sorted — sort by Distance for ranked output:
foreach (var hit in hits.OrderBy(r => r.Distance))
{
Console.WriteLine($"id={hit.Id,8} dist={hit.Distance:F4}");
}
Tuning recall with EfSearch
EfSearch controls how aggressively the search walks the graph. Higher values raise recall — at the cost of latency.
You can change EfSearch between queries without touching the graph:
graph.Parameters.EfSearch = 32; // fast, lower recall
var fast = graph.KNNSearch(query, 10);
graph.Parameters.EfSearch = 256; // slower, higher recall
var precise = graph.KNNSearch(query, 10);
A typical workflow:
- Capture a query set with known ground truth (e.g. exact bruteforce results).
- Sweep
EfSearchover{50, 100, 200, 400, 800}and record latency and recall@k. - Pick the smallest
EfSearchthat hits your recall target.
The relationship between EfSearch and recall is monotone but sub-linear — doubling EfSearch typically buys a few percentage points of recall once you're already in the 90s.
Filtering at query time
Pass a Func<TItem, bool> to keep only items that pass a predicate. The filter runs on the candidate set during the walk, so HNSW still navigates the graph but discards anything that fails.
var todayOnly = graph.KNNSearch(query, 10,
filterItem: item => item.Date >= DateTime.UtcNow.Date);
When the predicate is very selective (say, only 1% of items pass), basic filtering can run out of candidates and return fewer results than k. For those cases, build the graph with OptimizeForFiltering = true to use ACORN-γ — see Filtering.
Cancellation
KNNSearch accepts a CancellationToken. The token is only meaningful when a filterItem is provided — the algorithm checks it between candidate evaluations.
using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(50));
try
{
var hits = graph.KNNSearch(query, 10,
filterItem: HardPredicate,
cancellationToken: cts.Token);
}
catch (OperationCanceledException)
{
// search exceeded the budget
}
Single-threaded fast path
When you know only one thread will touch the graph, build it with threadSafe: false:
var graph = new SmallWorld<float[], float>(
CosineDistance.SIMDForUnits,
DefaultRandomGenerator.Instance,
parameters,
threadSafe: false); // no internal RW-lock
This shaves the lock overhead off every call. Don't enable unless you are certain — KNNSearch from a second thread while AddItems is mid-flight corrupts the graph.
See Thread safety for the full picture.
Common pitfalls
Sort the results yourself
KNNSearch returns an unsorted IList<KNNSearchResult>. If you need a ranked top-K, call .OrderBy(r => r.Distance). Don't assume the order is meaningful.
Re-use the query vector
The library doesn't mutate the query — feel free to keep a single float[] and pass it repeatedly. Allocating a fresh one per call adds GC pressure with no benefit.