HNSW-Sharp

Progress reporting

Building an HNSW graph over millions of vectors takes time. IProgressReporter is the hook the library uses to push progress out during a long AddItems call.

The contract

public interface IProgressReporter
{
    void Progress(int current, int total);
}

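The library calls Progress on the inserting thread, passing the number of items inserted so far and the total number of items in this AddItems call. The hook itself adds no allocation and no synchronisation overhead, just one method call per progress tick. Because the call is synchronous, any time spent inside Progress is time taken away from indexing, so keep implementations cheap.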

A minimal reporter

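class ConsoleReporter : IProgressReporter
{
    int _last;

    public void Progress(int current, int total)
    {
        // A drop in current means a new AddItems call; reset so a reused reporter keeps printing.
        if (current < _last) _last = 0;
        // Print at most once per 1000 items, but always print the final tick.
        if (current - _last < 1000 && current != total) return;
        _last = current;
        Console.Write($"\r  {current,8}/{total} ({100.0 * current / total:F1}%)");
    }
}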

graph.AddItems(vectors, new ConsoleReporter());

The throttling matters: without it, a build with cheap distance computations (low-dimensional vectors, for example) can spend more time printing than indexing.
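Count-based throttling ties the print rate to insertion speed, which varies with vector size and graph size. A time-based throttle is one alternative. The sketch below is an illustration only: TimedConsoleReporter and its PrintCount property are hypothetical names, and the interface is redeclared so the snippet compiles on its own.

```csharp
using System;
using System.Diagnostics;

// Redeclared from the contract above so this sketch compiles standalone;
// in a real project, use the library's own interface.
public interface IProgressReporter
{
    void Progress(int current, int total);
}

// Hypothetical reporter: prints at most once per minInterval, plus the final tick.
public sealed class TimedConsoleReporter : IProgressReporter
{
    readonly long _minIntervalTicks;
    long _lastPrint;
    bool _printed;

    // Number of lines actually written; handy for checking the throttle.
    public int PrintCount { get; private set; }

    public TimedConsoleReporter(TimeSpan minInterval)
    {
        // Convert the interval into Stopwatch ticks once, up front.
        _minIntervalTicks = (long)(minInterval.TotalSeconds * Stopwatch.Frequency);
    }

    public void Progress(int current, int total)
    {
        long now = Stopwatch.GetTimestamp();
        bool due = !_printed || now - _lastPrint >= _minIntervalTicks;
        if (!due && current != total) return;

        _printed = true;
        _lastPrint = now;
        PrintCount++;
        Console.Write($"\r  {current,8}/{total} ({100.0 * current / total:F1}%)");
    }
}
```

The cost of a suppressed tick is one Stopwatch.GetTimestamp call and a comparison, so the reporter stays cheap whether insertions are fast or slow.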

Reporting to a UI

Progress is called on the inserting thread, which is normally not the UI thread. For a UI client, marshal the update onto the UI thread:

class WpfReporter : IProgressReporter
{
    readonly System.Windows.Threading.Dispatcher _dispatcher;
    readonly Action<int, int> _onProgress;
    int _last;

    public WpfReporter(System.Windows.Threading.Dispatcher dispatcher, Action<int, int> onProgress)
    {
        _dispatcher = dispatcher;
        _onProgress = onProgress;
    }

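    public void Progress(int current, int total)
    {
        // A drop in current means a new AddItems call; reset so a reused reporter keeps reporting.
        if (current < _last) _last = 0;
        if (current - _last < 100 && current != total) return;
        _last = current;
        // BeginInvoke queues the callback on the UI thread and returns without blocking the build.
        _dispatcher.BeginInvoke(_onProgress, current, total);
    }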
}
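If you would rather not depend on a specific UI framework, one option is to bridge to the standard System.IProgress<T> interface. The adapter below is a sketch, not part of the library: ProgressAdapter and its step parameter are hypothetical, and the interface is redeclared so the snippet compiles on its own.

```csharp
using System;

// Redeclared from the contract above so this sketch compiles standalone.
public interface IProgressReporter
{
    void Progress(int current, int total);
}

// Hypothetical adapter: forwards throttled ticks to any IProgress<T> sink.
public sealed class ProgressAdapter : IProgressReporter
{
    readonly IProgress<(int Current, int Total)> _sink;
    readonly int _step;
    int _last;

    public ProgressAdapter(IProgress<(int Current, int Total)> sink, int step = 100)
    {
        _sink = sink ?? throw new ArgumentNullException(nameof(sink));
        _step = step;
    }

    public void Progress(int current, int total)
    {
        // A drop in current means a new AddItems call; restart the throttle.
        if (current < _last) _last = 0;
        if (current - _last < _step && current != total) return;
        _last = current;
        _sink.Report((current, total));
    }
}
```

Passing a System.Progress<T> created on the UI thread as the sink gives you the marshalling for free, since Progress<T> captures the SynchronizationContext at construction and invokes its callback there.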

For ASP.NET workloads, write each update to a Channel<T> and surface the reader side to the client, for example as an IAsyncEnumerable<T>.
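A minimal sketch of the channel approach follows. ChannelReporter and BuildProgress are hypothetical names, and the interface is redeclared so the snippet compiles on its own. The channel holds a single item and drops the oldest on overflow, so a slow client only ever sees the latest update and the inserting thread never blocks.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;

// Redeclared from the contract above so this sketch compiles standalone.
public interface IProgressReporter
{
    void Progress(int current, int total);
}

// Hypothetical payload type for one progress update.
public readonly record struct BuildProgress(int Current, int Total);

// Hypothetical reporter: one instance per AddItems call, because the
// channel is completed on the final tick and cannot be reopened.
public sealed class ChannelReporter : IProgressReporter
{
    readonly Channel<BuildProgress> _channel = Channel.CreateBounded<BuildProgress>(
        new BoundedChannelOptions(1)
        {
            // Keep only the newest update; TryWrite then never fails for lack of space.
            FullMode = BoundedChannelFullMode.DropOldest
        });

    public ChannelReader<BuildProgress> Reader => _channel.Reader;

    public void Progress(int current, int total)
    {
        _channel.Writer.TryWrite(new BuildProgress(current, total));
        if (current == total) _channel.Writer.TryComplete();
    }

    // Convenience wrapper for endpoints that stream an IAsyncEnumerable.
    public IAsyncEnumerable<BuildProgress> ReadAllAsync(CancellationToken ct = default)
        => _channel.Reader.ReadAllAsync(ct);
}
```

If AddItems can throw, the caller would also need to complete the writer in a finally block, otherwise a consumer awaiting ReadAllAsync waits forever.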

Total semantics

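total is the number of items in this AddItems call, not the cumulative size of the graph. If you build the graph in chunks, each call reports against its own chunk size, so the percentage restarts from zero for every chunk: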

graph.AddItems(chunk1, reporter);   // total = chunk1.Count
graph.AddItems(chunk2, reporter);   // total = chunk2.Count

If you want cumulative progress, track it yourself in the reporter:

class CumulativeReporter : IProgressReporter
{
    readonly int _grandTotal;
    int _completedFromEarlierChunks;
    int _last;

    public CumulativeReporter(int grandTotal) => _grandTotal = grandTotal;

    public void Progress(int current, int total)
    {
        int globalCurrent = _completedFromEarlierChunks + current;
        if (globalCurrent - _last < 1000 && current != total) return;
        _last = globalCurrent;
        Console.Write($"\r  {globalCurrent,8}/{_grandTotal} ({100.0 * globalCurrent / _grandTotal:F1}%)");

        if (current == total) _completedFromEarlierChunks += total;
    }
}

Reporting from KNNSearch

KNNSearch does not take an IProgressReporter. A single query typically completes in microseconds, so per-query progress reporting isn't meaningful. For long bulk-query workloads, track progress in your own loop:

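int done = 0;
foreach (var query in queries)
{
    var hits = graph.KNNSearch(query, 10);
    // ... consume hits here ...

    // A plain increment is enough in a sequential loop; use
    // Interlocked.Increment instead if the loop is parallelised.
    if (++done % 1000 == 0 || done == queries.Count)
        Console.Write($"\r  query {done}/{queries.Count}");
}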
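If the queries are spread across threads, the counter has to be updated atomically. The helper below is a sketch under assumptions: BulkQuery.RunAll is a hypothetical name, and the search delegate stands in for a call such as q => graph.KNNSearch(q, 10). Whether KNNSearch is safe to call concurrently is something to verify against the library before using this pattern.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BulkQuery
{
    // Runs `search` over every query in parallel and reports progress every
    // `reportEvery` completed queries, plus once when the last one finishes.
    // Returns the number of queries completed.
    public static int RunAll<TQuery>(
        IReadOnlyList<TQuery> queries,
        Action<TQuery> search,
        Action<int, int> onProgress,
        int reportEvery = 1000)
    {
        int done = 0;
        Parallel.ForEach(queries, query =>
        {
            search(query);

            // Interlocked.Increment returns a unique value per completion,
            // so each reporting threshold fires exactly once.
            int n = Interlocked.Increment(ref done);
            if (n % reportEvery == 0 || n == queries.Count)
                onProgress(n, queries.Count);
        });
        return done;
    }
}
```

Note that onProgress runs on whichever worker thread crossed the threshold, so updates can arrive slightly out of order and the callback itself must be thread-safe.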

ETA estimation

For human-facing progress, an ETA is more useful than a raw percentage:

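class EtaReporter : IProgressReporter
{
    long _start = Environment.TickCount64;
    int _last;

    public void Progress(int current, int total)
    {
        // A drop in current means a new AddItems call: restart the clock and the throttle.
        if (current < _last) { _start = Environment.TickCount64; _last = 0; }
        if (current - _last < 5000 && current != total) return;
        _last = current;

        long elapsed = Environment.TickCount64 - _start;
        // Linear extrapolation: assumes every insertion costs the same.
        long eta = current > 0 ? elapsed * (total - current) / current : 0;

        // hh\:mm\:ss rather than mm\:ss, so builds longer than an hour do not wrap.
        Console.Write($"\r  {current,8}/{total}, {TimeSpan.FromMilliseconds(elapsed):hh\\:mm\\:ss} elapsed, {TimeSpan.FromMilliseconds(eta):hh\\:mm\\:ss} remaining");
    }
}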

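HNSW build time is super-linear in N (each insertion gets slightly more expensive as the graph grows), so a linear extrapolation like the one above is consistently optimistic: it underestimates the remaining time, most of all early in the build, and only converges on the true figure near the end. Be honest about the precision in your UI.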
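One way to reduce that bias is to extrapolate against a cost model instead of a raw item count. The sketch below assumes total build cost grows roughly as n·ln(n) and that the graph was empty when the call started. Both are assumptions, not measured properties of this library, and EtaModel is a hypothetical helper.

```csharp
using System;

public static class EtaModel
{
    // Assumed cumulative work after n insertions: n * ln(n).
    // For n <= 1 the log term is zero or undefined, so fall back to n.
    static double Work(int n) => n <= 1 ? n : n * Math.Log(n);

    // Scales elapsed time by the ratio of remaining work to completed work
    // under the assumed cost model.
    public static TimeSpan EstimateRemaining(TimeSpan elapsed, int current, int total)
    {
        if (current <= 0 || current >= total) return TimeSpan.Zero;

        double done = Work(current);
        double all = Work(total);
        return TimeSpan.FromMilliseconds(elapsed.TotalMilliseconds * (all - done) / done);
    }
}
```

Halfway through a 2000-item build, a linear estimate says the remaining time equals the elapsed time, while this model predicts about 20% more. The real curve depends on the index parameters and the data, so treat the result as a better guess rather than a guarantee.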

© 2026 HNSW-Sharp. All rights reserved.