Parallelization
UMAP-Sharp parallelizes the SGD optimization step across all available cores by default. This section explains how the parallelism works, when to disable it, and the tradeoffs involved.
How it works
UMAP-Sharp follows the same lock-free, multi-threaded optimization approach used by Facebook's fastText:
- Randomise the order in which sample pairs are visited.
- Run the SGD updates in parallel across cores, each thread updating the shared embedding array.
- Accept the (small) probability that two threads write to the same coordinate at the same time, on the assumption that collisions are rare and their effect averages out as the optimization converges.
The relevant call inside Umap.Step() is a Parallel.For guarded by a single property check:
if (_random.IsThreadSafe)
{
    Parallel.For(0, _optimizationState.EpochsPerSample.Length, Iterate);
}
else
{
    for (var i = 0; i < _optimizationState.EpochsPerSample.Length; i++)
    {
        Iterate(i);
    }
}
The _random.IsThreadSafe flag comes from your IProvideRandomValues implementation. The default DefaultRandomGenerator.Instance returns true for this; DefaultRandomGenerator.DisableThreading returns false.
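The three steps listed above can be illustrated with a small, self-contained toy. This is only a sketch under stated assumptions: the update rule (pulling paired points toward each other in one dimension) is a stand-in rather than UMAP-Sharp's actual gradient, and the names LockFreeSgdSketch, Run, and Spread are hypothetical. What it does share with the description above is the shape of the loop: shuffle the pairs, then let Parallel.For update one shared array with no locks.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class LockFreeSgdSketch
{
    // Toy objective (an assumption for illustration): each pair (a, b)
    // pulls its two points toward each other on a 1-D line.
    public static float[] Run(int pointCount, int epochs, float learningRate, int seed)
    {
        var rng = new Random(seed);
        var embedding = Enumerable.Range(0, pointCount)
            .Select(_ => (float)rng.NextDouble())
            .ToArray();

        // Pair each point with its ring successor (keeps the graph connected)
        // plus four random partners; self-pairs are dropped.
        var pairs = Enumerable.Range(0, pointCount)
            .SelectMany(i => Enumerable.Range(0, 5)
                .Select(k => (A: i, B: k == 0 ? (i + 1) % pointCount : rng.Next(pointCount))))
            .Where(p => p.A != p.B)
            .ToArray();

        for (var epoch = 0; epoch < epochs; epoch++)
        {
            // Step 1: randomise the visiting order (Fisher-Yates shuffle).
            for (var i = pairs.Length - 1; i > 0; i--)
            {
                var j = rng.Next(i + 1);
                (pairs[i], pairs[j]) = (pairs[j], pairs[i]);
            }

            // Steps 2 and 3: every thread writes to the shared array without locks.
            // Two threads may race on the same element; the update is simply not protected.
            Parallel.For(0, pairs.Length, index =>
            {
                var (a, b) = pairs[index];
                var delta = learningRate * (embedding[b] - embedding[a]);
                embedding[a] += delta;
                embedding[b] -= delta;
            });
        }

        return embedding;
    }

    // Distance between the largest and smallest coordinate.
    public static float Spread(float[] values) => values.Max() - values.Min();
}
```

Calling LockFreeSgdSketch.Run(200, 100, 0.25f, 1) and passing the result to Spread should give a value close to zero: the points still collapse toward a common value even though some updates race. The exact coordinates can differ from run to run, which is the same tradeoff the real optimizer accepts.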
The two built-in modes
| Generator | IsThreadSafe | Behaviour |
|---|---|---|
| DefaultRandomGenerator.Instance (default) | true | SGD runs in parallel across all available cores. |
| DefaultRandomGenerator.DisableThreading | false | SGD runs single-threaded. |
// Parallel: the default, and what you almost always want
var parallelUmap = new Umap();
// Single-threaded
var singleThreadedUmap = new Umap(random: DefaultRandomGenerator.DisableThreading);
When to disable threading
The default is right for most workloads, but there are real cases for the single-threaded mode:
- Concurrent requests in a server. If your service runs many small UMAP jobs in parallel, letting each one fan out to all cores hurts overall throughput. Use DisableThreading so each job uses one core and the thread pool schedules them.
- Shared / constrained environments. Containers with CPU limits, CI jobs, or laptops on battery where you do not want one operation to peg every core.
- Determinism with a thread-safe RNG. Even with a seeded generator, the SGD update order is non-deterministic across threads. If you need fully reproducible output, run single-threaded — see Reproducibility.
- Debugging / profiling. Stack traces are simpler when the loop is single-threaded.
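For the server case in the list above, one possible arrangement is to move the parallelism to the job level: each job runs single-threaded, and an outer loop runs as many jobs at once as there are cores. The helper below is a generic sketch, not part of UMAP-Sharp; SingleThreadedJobRunner and RunAll are hypothetical names, and the job is passed in as a delegate so the example compiles without the library.

```csharp
using System;
using System.Threading.Tasks;

public static class SingleThreadedJobRunner
{
    // Runs each job on the calling worker thread only; the outer Parallel.For
    // supplies the concurrency, capped at the number of logical cores.
    public static TResult[] RunAll<TInput, TResult>(
        TInput[] inputs,
        Func<TInput, TResult> singleThreadedJob)
    {
        var results = new TResult[inputs.Length];
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.For(0, inputs.Length, options, i =>
        {
            // Each slot is written by exactly one iteration, so no locking is needed.
            results[i] = singleThreadedJob(inputs[i]);
        });

        return results;
    }
}
```

In a real service the delegate would construct new Umap(random: DefaultRandomGenerator.DisableThreading) and run the fit and step loop for one request, so that no single job tries to claim every core.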
Custom thread-safe generators
You can supply your own IProvideRandomValues. Return true from IsThreadSafe only if your implementation is safe to call concurrently; UMAP will then parallelize the SGD loop. Return false to force single-threaded execution.
public sealed class MyRandom : IProvideRandomValues
{
    public bool IsThreadSafe => true;
    public int Next(int min, int max) { /* ... */ }
    public float NextFloat() { /* ... */ }
    public void NextFloats(Span<float> buffer) { /* ... */ }
}
var umap = new Umap(random: new MyRandom());
See Reproducibility for a complete deterministic-generator example.
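As one way to fill in the method bodies, the sketch below gives every thread its own System.Random through ThreadLocal<Random>, so no two threads ever share generator state. It is an illustration under assumptions: the interface declared at the top is a local stand-in that mirrors the four members shown above so the example compiles on its own (remove it when referencing the real library), and PerThreadRandom is a hypothetical name.

```csharp
using System;
using System.Threading;

// Stand-in mirroring the members shown above so this sketch compiles alone.
// Delete it when compiling against the real UMAP-Sharp interface.
public interface IProvideRandomValues
{
    bool IsThreadSafe { get; }
    int Next(int min, int max);
    float NextFloat();
    void NextFloats(Span<float> buffer);
}

public sealed class PerThreadRandom : IProvideRandomValues
{
    private readonly ThreadLocal<Random> _rng;
    private int _threadsSeen;

    public PerThreadRandom(int baseSeed)
    {
        // Each thread lazily creates its own Random with a distinct seed.
        _rng = new ThreadLocal<Random>(
            () => new Random(baseSeed + Interlocked.Increment(ref _threadsSeen)));
    }

    // Safe to report true: no Random instance is ever shared between threads.
    public bool IsThreadSafe => true;

    public int Next(int min, int max) => _rng.Value.Next(min, max);

    // The cast from double can, very rarely, round up to exactly 1.0f.
    public float NextFloat() => (float)_rng.Value.NextDouble();

    public void NextFloats(Span<float> buffer)
    {
        var rng = _rng.Value;
        for (var i = 0; i < buffer.Length; i++)
        {
            buffer[i] = (float)rng.NextDouble();
        }
    }
}
```

This generator is thread-safe but not reproducible: which thread receives which seed depends on scheduling, so two runs with the same baseSeed can still draw different sequences.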
Collisions and convergence
The lock-free strategy means two threads can simultaneously read and write the same embedding row. In practice the probability is low: the optimisation performs on the order of N × nNeighbors × nEpochs sample-pair updates spread across the available threads, and each update touches only two rows out of N, so two threads rarely hit the same row at the same instant. SGD's averaging behaviour absorbs the occasional lost update.
The result is that multi-threaded and single-threaded runs produce different embeddings even with the same RNG seed. The qualitative structure (clusters, neighbourhoods, separation) is preserved; the exact coordinates are not.
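A back-of-the-envelope estimate gives a feel for how low the collision probability is. It rests on simplifying assumptions that do not hold exactly in practice: each update picks its two rows uniformly and independently from the N rows (real neighbour graphs have hubs), and all T threads are mid-update at the same instant (T is a symbol introduced here for the thread count). Under those assumptions, the chance that one update shares a row with another given update, and a union bound over the other T − 1 threads, are:

```latex
P(\text{overlap})
  = 1 - \frac{\binom{N-2}{2}}{\binom{N}{2}}
  = \frac{4N - 6}{N(N-1)}
  \approx \frac{4}{N},
\qquad
P(\text{overlap with any of the other } T-1 \text{ updates})
  \le (T-1)\,\frac{4N - 6}{N(N-1)}
  \approx \frac{4(T-1)}{N}.
```

For example, with N = 100,000 points and T = 8 threads the bound is roughly 28 / 100,000 ≈ 2.8 × 10⁻⁴ per update, and an actual lost write also requires the two updates to overlap in time, which makes it rarer still.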
Threading scope
Only the SGD Step loop is parallelized. The fit phase (InitializeFit — nearest-neighbour descent, fuzzy simplicial set, KNN graph construction) uses sequential code internally. For large inputs the fit phase dominates total time, so disabling SGD threading slows down the optimisation portion but not the bulk of the work.
Next
- Reproducibility — get bit-for-bit identical runs.
- Progress Reporting — observe the fit + step phases live.