Clustering Similarity Results & Visualizing as a Graph
A single similarity lookup gives you the neighbors of one seed. The interesting product features start when you compute that for many seeds and treat the resulting scored pairs as a weighted graph: nodes that pull each other closer end up in the same cluster. From there a force-directed view makes the structure visible to users.
This page walks through:
- Collecting (source, target, score) triples from many similarity searches.
- Feeding them into WeightedGraph<T>.Cluster(...) to extract groups.
- Rendering the result with ForceGraphView.
The algorithm lives in Mosaik.Core.Algorithms (WeightedGraph<T>, Node<T>, WeightedEdge<T>, Cluster<T>). The UI component is Mosaik.Components.ForceGraphView.
1. Compute similarity edges for many seeds
Building on the /similar-products endpoint from IQuery Similarity Search, expand the input from one UID to a set of seed UIDs and collect the neighbors of each:
public record SimilarityGraphRequest(
    string[] ProductIds,
    int PerSeedK = 10,
    float MinScore = 0.5f);

public record GraphResponse(
    IReadOnlyList<GraphNodeDto> Nodes,
    IReadOnlyList<GraphEdgeDto> Edges);

public record GraphNodeDto(string Id, string Label, int Cluster);
public record GraphEdgeDto(string Source, string Target, double Weight);

var input = Body.FromJson<SimilarityGraphRequest>();
if (input?.ProductIds is null || input.ProductIds.Length == 0)
    return BadRequest("`productIds` is required.");

// Resolve seed UIDs.
var seedUIDs = input.ProductIds
    .Select(id => Graph.TryGetReadOnlyContent<Product>(N.Product.Type, id, out var n) ? n.UID : default)
    .Where(uid => uid != default)
    .Distinct()
    .ToArray();

if (seedUIDs.Length == 0)
    return NotFound("No matching products found.");

// Pick the index we want to drive similarity from.
var nameIndex = Graph.Indexes
    .OfType<SentenceEmbeddingsIndex>(N.Product.Type)
    .First(i => i.FieldName == N.Product.Name);

// Run a similarity query per seed and collect scored pairs.
var pairs = new List<(UID128 src, UID128 tgt, float score)>();
foreach (var seedUID in seedUIDs)
{
    // Ask for one extra result because the seed itself comes back as a neighbor.
    var neighbors = Q().StartAt(seedUID)
        .Similar(indexUID: nameIndex.UID, count: input.PerSeedK + 1)
        .AsScoredUIDEnumerable();

    foreach (var s in neighbors)
    {
        if (s.UID.UID == seedUID) continue;        // skip self
        if (s.Score < input.MinScore) continue;    // weak edges hurt clustering
        pairs.Add((seedUID, s.UID.UID, s.Score));
    }
}
AsScoredUIDEnumerable() is the form of IQuery materialization that preserves similarity scores — see Querying the Graph.
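When two seeds are neighbors of each other, the loop above records the same pair twice, once in each direction. Whether WeightedGraph<T>.Cluster merges such duplicates is not stated on this page, so one conservative option is to collapse them before building edges. The following is a minimal sketch under that assumption: `PairDeduper` is a hypothetical helper (not part of Mosaik), and `Guid` stands in for `UID128` so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Mosaik. Guid is a stand-in for UID128.
public static class PairDeduper
{
    // Collapses (a, b) and (b, a) into one undirected pair and keeps the highest score.
    public static List<(Guid src, Guid tgt, float score)> Undirected(
        IEnumerable<(Guid src, Guid tgt, float score)> pairs)
    {
        return pairs
            .Where(p => p.src != p.tgt)                   // drop self-loops
            .Select(p => p.src.CompareTo(p.tgt) <= 0      // order the endpoints canonically
                ? (a: p.src, b: p.tgt, p.score)
                : (a: p.tgt, b: p.src, p.score))
            .GroupBy(p => (p.a, p.b))                     // one group per undirected pair
            .Select(g => (src: g.Key.a, tgt: g.Key.b, score: g.Max(p => p.score)))
            .ToList();
    }
}
```

Keeping the maximum score is a design choice; averaging the two directions would be equally defensible if the scores are not symmetric.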
2. Build a WeightedGraph and extract clusters
WeightedGraph<T>.Cluster takes a node list and an edge list and returns a list of Cluster<T> instances. Each cluster enumerates the nodes that belong to it; every node also carries its assigned cluster id (Node<T>.Cluster):
// Distinct UID set: includes seeds plus everything they pulled in.
var allUIDs = pairs
    .SelectMany(p => new[] { p.src, p.tgt })
    .Distinct()
    .ToList();

// Wrap each UID in a graph node.
var nodes = allUIDs.Select(u => new Mosaik.Core.Algorithms.Node<UID128>(u)).ToList();

// Convert pairs to weighted edges.
var lookup = nodes.ToDictionary(n => n.Value);
var edges = pairs.Select(p =>
    new Mosaik.Core.Algorithms.WeightedEdge<UID128>(
        source: lookup[p.src],
        target: lookup[p.tgt],
        weight: p.score))
    .ToList();

// Run hierarchical clustering; returns one Cluster<UID128> per top-level group.
// Side-effect: each node's `.Cluster` field is set to its cluster index.
var clusters = Mosaik.Core.Algorithms.WeightedGraph<UID128>.Cluster(nodes, edges);
WeightedGraph<T>.Cluster does a hierarchical edge-cut: it iteratively splits the graph by removing the weakest edges and falls back to a per-component flat clustering. The result is robust to disconnected sub-graphs (each one becomes its own top-level cluster).
After the call:
- clusters[i] enumerates the nodes in cluster i.
- node.Cluster is the cluster index assigned to each node, which is convenient when you've stored the Node<UID128> list alongside other per-node data.
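To build intuition for why weak edges separate groups and why each disconnected sub-graph ends up as its own cluster, here is a deliberately simplified stand-in. It is not the algorithm inside WeightedGraph<T>.Cluster: it only drops edges below a fixed cut-off and labels the remaining connected components with union-find. The class name, the string node ids, and the `minWeight` parameter are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Illustration only: NOT the algorithm inside WeightedGraph<T>.Cluster.
// Drops edges below a cut-off, then labels connected components with union-find.
public static class ThresholdComponents
{
    // Every edge endpoint must appear in `nodes`; unknown ids throw KeyNotFoundException.
    public static Dictionary<string, int> Label(
        IReadOnlyList<string> nodes,
        IEnumerable<(string src, string tgt, double weight)> edges,
        double minWeight)
    {
        var parent = new Dictionary<string, string>();
        foreach (var n in nodes) parent[n] = n;

        string Find(string x)
        {
            while (parent[x] != x)
            {
                parent[x] = parent[parent[x]];   // path halving keeps trees shallow
                x = parent[x];
            }
            return x;
        }

        foreach (var (src, tgt, weight) in edges)
        {
            if (weight < minWeight) continue;    // treat weak edges as cut
            var a = Find(src);
            var b = Find(tgt);
            if (a != b) parent[a] = b;           // merge the two components
        }

        // Assign compact cluster indexes in first-seen order.
        var clusterOfRoot = new Dictionary<string, int>();
        var result = new Dictionary<string, int>();
        foreach (var n in nodes)
        {
            var root = Find(n);
            if (!clusterOfRoot.TryGetValue(root, out var id))
            {
                id = clusterOfRoot.Count;
                clusterOfRoot[root] = id;
            }
            result[n] = id;
        }
        return result;
    }
}
```

With nodes a, b, c, d, edges a-b (0.9), b-c (0.2), c-d (0.8), and a cut-off of 0.5, the weak b-c edge is ignored, so a and b share one label while c and d share another.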
3. Shape the response for a force-directed view
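ForceGraphView consumes GraphExplorerNode[] (drawn nodes) and GraphExplorerEdge[] (drawn links). The endpoint itself returns plain DTOs: each node carries its cluster id, which the front end maps into those shapes and uses as the colour key. The palette helper is shown here for reference and is repeated inside the view in step 4: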
// Pick a colour per cluster (cycle through a palette).
string[] palette = { "#4F8BF9", "#F97171", "#7FD17F", "#F9C74F", "#9B59B6", "#26C6DA" };
string ColorFor(int cluster) => palette[((cluster % palette.Length) + palette.Length) % palette.Length];

var nodeDtos = nodes.Select(n =>
{
    Graph.TryGetReadOnlyContent<Product>(n.Value, out var p);
    return new GraphNodeDto(
        Id: n.Value.ToString(),
        Label: p?.Name ?? n.Value.ToString(),   // fall back to the UID when the product is missing
        Cluster: n.Cluster);
}).ToList();

var edgeDtos = pairs.Select(p =>
    new GraphEdgeDto(p.src.ToString(), p.tgt.ToString(), p.score)
).ToList();

return Ok(new GraphResponse(nodeDtos, edgeDtos).ToJson(), "application/json");
The endpoint's JSON output is everything a front-end view needs: each node has an id, a label, and a cluster (used to colour the bubble); each edge has source/target/weight.
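To make that shape concrete, the sketch below serializes a tiny response with System.Text.Json. Two things are assumptions: the sample ids and labels are made up, and the camelCase naming policy is chosen here for illustration. The platform's own `ToJson()` helper may use a different serializer or different property casing, so check the real payload before binding a client to it.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Same DTO shapes as the endpoint, redeclared so the sample compiles on its own.
public record GraphNodeDto(string Id, string Label, int Cluster);
public record GraphEdgeDto(string Source, string Target, double Weight);
public record GraphResponse(
    IReadOnlyList<GraphNodeDto> Nodes,
    IReadOnlyList<GraphEdgeDto> Edges);

public static class GraphResponseSample
{
    // Builds a two-node, one-edge response and returns it as indented JSON.
    public static string Build()
    {
        var response = new GraphResponse(
            new[]
            {
                new GraphNodeDto("p1", "Trail Shoe", 0),    // made-up sample data
                new GraphNodeDto("p2", "Running Shoe", 0),
            },
            new[] { new GraphEdgeDto("p1", "p2", 0.75) });

        var options = new JsonSerializerOptions
        {
            PropertyNamingPolicy = JsonNamingPolicy.CamelCase, // assumption, see lead-in
            WriteIndented = true,
        };
        return JsonSerializer.Serialize(response, options);
    }
}
```

Under these options the payload has two top-level arrays, `nodes` and `edges`, with the fields `id`, `label`, `cluster` and `source`, `target`, `weight` respectively.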
4. Render with ForceGraphView
ForceGraphView lives in the workspace front-end project (Mosaik.Components) and renders a D3 force-directed canvas. The component already powers the workspace's built-in graph explorer; you can drop it into a custom front-end view without re-implementing the layout.
A minimal page that fetches the JSON above and renders it:
public class ProductSimilarityView : IComponent
{
    private static readonly string[] Palette =
        { "#4F8BF9", "#F97171", "#7FD17F", "#F9C74F", "#9B59B6", "#26C6DA" };

    private readonly ForceGraphView _view = new ForceGraphView(
        enableInteraction: true,
        isEmbeddedView: false);

    public ProductSimilarityView(string[] seedIds)
    {
        // Fire-and-forget: exceptions thrown inside LoadAsync are not observed here.
        _ = LoadAsync(seedIds);
    }

    private async Task LoadAsync(string[] seedIds)
    {
        var response = await Endpoints.SimilarityGraph(new
        {
            productIds = seedIds,
            perSeedK = 10,
            minScore = 0.5f
        });

        // Map server JSON → ForceGraphView inputs.
        var nodes = response.Nodes.Select(n => new GraphExplorerNode
        {
            id = n.Id,
            Label = n.Label,
            ShortLabel = n.Label,
            NodeType = N.Product.Type,
            Color = ColorFor(n.Cluster),
            radius = 12
        }).ToArray();

        var edges = response.Edges.Select(e => new GraphExplorerEdge
        {
            UID = $"{e.Source}->{e.Target}",
            SourceUID = e.Source,
            TargetUID = e.Target,
            EdgeTypeName = "Similar"
        }).ToArray();

        _view.SetData(nodes, edges);
    }

    public HTMLElement Render() => _view.Render();

    // The double modulo keeps the index valid even for a negative cluster id.
    private static string ColorFor(int c) =>
        Palette[((c % Palette.Length) + Palette.Length) % Palette.Length];
}
Key points when wiring data into ForceGraphView:
- GraphExplorerNode.id must be unique per node; using the UID string from the endpoint response is the natural choice.
- GraphExplorerEdge.SourceUID / TargetUID must match the id of a node already passed to SetData; orphan edges are dropped.
- Color is what produces the per-cluster visual grouping. Cluster index → palette colour is the simplest mapping; for many clusters use an HSL ramp.
- radius drives node size; combine with EdgeCount if you want hubs to read bigger.
- SetData replaces the current data. Call AddData to layer new nodes onto an existing render (e.g. when the user expands a cluster).
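The HSL ramp and the degree-based radius mentioned in the list can be sketched as two small helpers. `ClusterStyle`, its parameter names, and the default constants are hypothetical and not part of Mosaik. The colour is returned as a `#RRGGBB` string to match the palette format used on this page, and the numeric type that `radius` expects is not specified here, so a cast may be needed.

```csharp
using System;

// Hypothetical helpers; names and constants are illustrative, not part of Mosaik.
public static class ClusterStyle
{
    // Spreads `clusterCount` hues evenly around the colour wheel and returns "#RRGGBB".
    public static string ColorFor(
        int cluster, int clusterCount, double saturation = 0.65, double lightness = 0.55)
    {
        if (clusterCount <= 0) throw new ArgumentOutOfRangeException(nameof(clusterCount));

        int index = ((cluster % clusterCount) + clusterCount) % clusterCount; // tolerate negative ids
        double hue = 360.0 * index / clusterCount;                            // always in [0, 360)

        // Standard HSL -> RGB conversion.
        double c = (1 - Math.Abs(2 * lightness - 1)) * saturation;
        double x = c * (1 - Math.Abs(hue / 60.0 % 2 - 1));
        double m = lightness - c / 2;
        (double r, double g, double b) = ((int)(hue / 60)) switch
        {
            0 => (c, x, 0.0),
            1 => (x, c, 0.0),
            2 => (0.0, c, x),
            3 => (0.0, x, c),
            4 => (x, 0.0, c),
            _ => (c, 0.0, x),
        };
        return $"#{ToByte(r + m):X2}{ToByte(g + m):X2}{ToByte(b + m):X2}";
    }

    // Grows the radius with the square root of the edge count,
    // so hubs stand out without dwarfing their neighbours.
    public static double RadiusFor(int edgeCount, double baseRadius = 8, double scale = 3)
        => baseRadius + scale * Math.Sqrt(Math.Max(0, edgeCount));

    private static int ToByte(double v) => (int)Math.Round(Math.Clamp(v, 0, 1) * 255);
}
```

The square root is a judgment call: a linear scale makes a single high-degree hub cover its whole cluster, while a logarithm flattens the differences almost completely.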
The component also exposes OnNodeClick, OnNodeSelect, OnNodeContextClick, etc. — wire those to push details into a side panel when the user explores the graph:
_view.OnNodeClick((sender, evt) =>
{
    var clicked = evt.Data;
    OpenDetailsFor(clicked.id);
});
Built-in physics applies "same-colour attracts" via the forceCluster JS hook the component installs at construction, so cluster nodes naturally settle into separated bubbles without extra work.
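The component's actual hook runs in JavaScript and is not reproduced on this page. As a conceptual sketch of what a "same-colour attracts" force generally does, the snippet below nudges every body toward the centroid of the bodies that share its colour on each simulation tick. `Body`, `ClusterForce`, and the `strength` parameter are hypothetical names, and the real implementation may differ in every detail.

```csharp
using System.Collections.Generic;
using System.Linq;

// Conceptual sketch only; the real forceCluster hook lives in the component's JS.
public sealed class Body
{
    public string Color = "";
    public double X, Y;
}

public static class ClusterForce
{
    // One simulation tick: move every body a fraction `strength` of the way
    // toward the centroid of the bodies that share its colour.
    public static void Tick(IReadOnlyList<Body> bodies, double strength = 0.1)
    {
        // Centroids are computed up front so the update order does not matter.
        var centroids = bodies
            .GroupBy(b => b.Color)
            .ToDictionary(
                g => g.Key,
                g => (x: g.Average(b => b.X), y: g.Average(b => b.Y)));

        foreach (var body in bodies)
        {
            var (cx, cy) = centroids[body.Color];
            body.X += (cx - body.X) * strength;
            body.Y += (cy - body.Y) * strength;
        }
    }
}
```

In a full layout this pull competes with link forces and node repulsion, which is why clusters settle into separate bubbles instead of collapsing to a point.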
End-to-end shape
Each stage is independent: you can drive the clustering from any embedding index (just swap nameIndex.UID for a PageSpaceEmbeddingsIndex or RawEmbeddingsIndex UID), and you can render the result anywhere ForceGraphView is available without changing the endpoint.
See also
- IQuery Similarity Search: the per-seed lookup used to populate edges.
- Similarity Engine: when you want each edge to be a fused score from multiple signals before clustering.
- Custom Front-End: wiring a Tesserae view that hosts ForceGraphView.
- Graph Query Language: full IQuery reference.