Progress reporting
UMAP runs can take seconds (for thousands of vectors) to minutes (for hundreds of thousands). To keep the UI responsive or log meaningful CI output, pass a ProgressReporter delegate to the Umap constructor.
The delegate
public delegate void ProgressReporter(float progress);
The reporter is called repeatedly with a value between 0.0 and 1.0 representing approximate completion across the entire job — both InitializeFit and the Step loop combined.
Basic usage
var umap = new Umap(progressReporter: p =>
Console.WriteLine($"{p:P0}"));
var epochs = umap.InitializeFit(vectors);
for (var i = 0; i < epochs; i++)
{
umap.Step();
}
You will see output that smoothly progresses from 0% to 100%:
5%
10%
15%
...
80%
85%
90%
95%
100%
How the budget is split
For large inputs InitializeFit accounts for roughly 80% of total work, with the Step loop covering the remaining 20%. UMAP-Sharp does the scaling internally — you do not need to track which phase you are in.
Within InitializeFit, the split (as a share of the whole run) is approximately:
- 30% — k-nearest-neighbour descent (forest construction + neighbor refinement)
- 50% — fuzzy simplicial set computation (sparse-matrix operations)
The exact distribution depends on dataset size; the reporter smooths over it.
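If a log line should also name the phase, the approximate split above can be turned into a lookup. This is a minimal sketch: `PhaseOf` is a hypothetical helper, not part of UMAP-Sharp, and its 0.30 and 0.80 boundaries are the rough large-input figures quoted above, so treat the label as a hint rather than a guarantee. `Action<float>` is used as a stand-in with the same shape as ProgressReporter.

```csharp
using System;

// Hypothetical helper: maps overall progress to a phase label using the
// approximate 30% / 50% / 20% split described above (assumed, not exact).
static string PhaseOf(float progress) =>
    progress < 0.30f ? "nearest-neighbour descent"
    : progress < 0.80f ? "fuzzy simplicial set"
    : "optimization (Step loop)";

// Example reporter body that prefixes each update with the guessed phase.
Action<float> report = p => Console.WriteLine($"{PhaseOf(p)}: {p:0.00}");
report(0.10f);
report(0.55f);
report(0.95f);
```

Because the real distribution shifts with dataset size, a label near a boundary may be one phase off.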
Step-only progress
Progress inside the Step loop is scaled against the recommended number of epochs, i.e. the value returned by InitializeFit, so the reporter reaches 100% only if you call Step that many times. If you call Step fewer times, the reporter tops out below 100%. If you call it more times, it stays pinned at 100% for the extra calls.
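The arithmetic behind this behaviour can be sketched as follows. This is an illustrative model only, not UMAP-Sharp's actual code: `ModelledProgress` is a hypothetical function that assumes InitializeFit covers the first 80% and that the Step loop is scaled linearly over the remaining 20%.

```csharp
using System;

// Illustrative model only (assumption): the first 80% belongs to InitializeFit,
// the last 20% is spread linearly over the recommended number of epochs.
static float ModelledProgress(int stepsDone, int recommendedEpochs)
{
    // Clamp so that extra Step calls cannot push progress past 100%.
    var stepFraction = Math.Min(1f, (float)stepsDone / recommendedEpochs);
    return 0.8f + 0.2f * stepFraction;
}

var epochs = 200;
Console.WriteLine(ModelledProgress(100, epochs)); // stopped halfway: stays below 100%
Console.WriteLine(ModelledProgress(200, epochs)); // recommended count: reaches 100%
Console.WriteLine(ModelledProgress(250, epochs)); // extra calls: still pinned at 100%
```

Under this model, stopping halfway through the Step loop leaves the reporter at about 90%.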
Integrating with IProgress<T>
IProgress<float> is the canonical .NET progress abstraction. Paired with Progress<float>, it marshals progress callbacks back to the UI thread automatically:
IProgress<float> progress = new Progress<float>(value =>
{
progressBar.Value = (int)(value * 100);
});
var umap = new Umap(progressReporter: p => progress.Report(p));
The Progress<T> constructor captures the current SynchronizationContext, so the callback runs on the UI thread in a WPF / WinForms / MAUI app, provided the Progress<T> instance is created on that thread.
Throttling
The reporter can be called frequently — many times per InitializeFit phase, plus once per epoch in Step. If your reporter is expensive (UI updates, logging to disk, telemetry), throttle it:
var lastReported = -1f;
var umap = new Umap(progressReporter: p =>
{
if (p - lastReported >= 0.01f)
{
lastReported = p;
progress.Report(p);
}
});
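This caps the run at roughly 100 updates. Because updates less than 1% apart are dropped, the final 100% can be skipped if the previous report was already above 99%.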
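Throttling by elapsed time is an alternative when the reporter drives a UI or writes to disk. The sketch below is one possible approach, not something UMAP-Sharp provides: `ThrottleByTime` is a hypothetical helper, and `Action<float>` stands in for ProgressReporter. It always forwards the first update and the first time progress reaches 1.0, so the final value is never dropped.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper: forwards at most one update per minInterval, plus the
// first update and the first time progress reaches 1.0.
// Not thread-safe; assumes the reporter is invoked from one thread at a time.
static Action<float> ThrottleByTime(Action<float> inner, TimeSpan minInterval)
{
    var clock = Stopwatch.StartNew();
    var lastReport = TimeSpan.Zero;
    var started = false;
    var completed = false;

    return p =>
    {
        var now = clock.Elapsed;
        var due = !started || now - lastReport >= minInterval;
        var firstCompletion = p >= 1f && !completed;
        if (!due && !firstCompletion) return;

        started = true;
        completed |= p >= 1f;
        lastReport = now;
        inner(p);
    };
}

// Demo with a very long interval, so only the first and the final update get through.
var seen = new List<float>();
var throttled = ThrottleByTime(seen.Add, TimeSpan.FromHours(1));
foreach (var value in new[] { 0.1f, 0.2f, 0.3f, 1f, 1f }) throttled(value);
Console.WriteLine(seen.Count); // 2
```

To plug this into the constructor, something like `new Umap(progressReporter: p => throttled(p))` should work, since the lambda matches the ProgressReporter signature.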
Cancellation
There is no built-in cancellation token, but you can implement one by checking a flag inside the Step loop:
var cts = new CancellationTokenSource();
var umap = new Umap(progressReporter: progress.Report);
var epochs = umap.InitializeFit(vectors);
for (var i = 0; i < epochs; i++)
{
if (cts.IsCancellationRequested) break;
umap.Step();
}
InitializeFit itself is not interruptible — once you call it, it runs to completion. If you need to bail out of a very long fit, run it on a background thread and abandon the result.
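The background-thread approach can be sketched as below. `RunFitAsync` is a hypothetical helper, not part of UMAP-Sharp. Its `initializeFit` and `step` parameters are stand-ins for `umap.InitializeFit(vectors)` and `umap.Step()`, which keeps the example runnable without the library. Cancellation is only observed between epochs, which matches the limitation described above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper. initializeFit and step stand in for
// umap.InitializeFit(vectors) and umap.Step(). Returns true when every epoch ran.
static Task<bool> RunFitAsync(Func<int> initializeFit, Action step, CancellationToken token) =>
    Task.Run(() =>
    {
        var epochs = initializeFit();   // runs to completion; cannot be interrupted
        for (var i = 0; i < epochs; i++)
        {
            if (token.IsCancellationRequested) return false;   // checked between epochs
            step();
        }
        return true;
    });

// Demo with fake work: request cancellation during the third of ten epochs.
using var cts = new CancellationTokenSource();
var stepsRun = 0;
var finished = await RunFitAsync(
    () => 10,
    () => { if (++stepsRun == 3) cts.Cancel(); },
    cts.Token);
Console.WriteLine($"finished={finished}, stepsRun={stepsRun}"); // finished=False, stepsRun=3
```

With the real library, the call would look something like `await RunFitAsync(() => umap.InitializeFit(vectors), () => umap.Step(), cts.Token)`. A false result means the embedding is incomplete and should be discarded.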
A full example with a logger
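using System.Diagnostics;            // Stopwatch
using Microsoft.Extensions.Logging;
using UMAP;
// Assumes an ILogger named logger and the input vectors are already in scope.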
var stopwatch = Stopwatch.StartNew();
var lastTenth = 0;
var umap = new Umap(progressReporter: p =>
{
var tenth = (int)(p * 10);
if (tenth > lastTenth)
{
lastTenth = tenth;
logger.LogInformation("UMAP {progress:P0} after {elapsed}",
p, stopwatch.Elapsed);
}
});
var epochs = umap.InitializeFit(vectors);
for (var i = 0; i < epochs; i++) umap.Step();
logger.LogInformation("Done in {elapsed}", stopwatch.Elapsed);
Next
- Parallelization: control whether Step fans out to all cores.
- Reproducibility: pair progress reporting with deterministic runs in tests.