Memory.Introspect

Sampling profiles

CollectSamplingProfileAsync captures CPU samples over a fixed time window and returns the methods that appeared most often. It's the in-process equivalent of dotnet-trace collect --profile cpu-sampling followed by a top-methods analysis.

Use it to answer:

  • "Which methods are using the most CPU right now?"
  • "Where does this batch job spend its time?"
  • "Is the bottleneck in my code, in a library, or in the runtime?"

The basic capture

using System.Diagnostics;   // Process
using Memory.Introspect;

int pid = Process.GetCurrentProcess().Id;

var introspector = MemoryIntrospector.Create();

SamplingProfileResult profile = await introspector.CollectSamplingProfileAsync(
    pid,
    duration: TimeSpan.FromSeconds(30));

The call is asynchronous, but the returned task does not complete until at least duration has elapsed: that is the sampling window. Once the await returns, profile.TopMethods lists the methods ranked by sample count.

Reading the top methods

foreach (var method in profile.TopMethods.Take(20))
{
    Console.WriteLine($"{method.SampleCount,8}  {method.Name}");
}

Sample counts are absolute, not percentages. To get a percentage, divide each count by the total across the listed methods:

long total = profile.TopMethods.Sum(m => m.SampleCount);

foreach (var method in profile.TopMethods.Take(10))
{
    double pct = 100.0 * method.SampleCount / total;
    Console.WriteLine($"{pct,6:F1}%  {method.Name}");
}
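
The same calculation can be wrapped in a helper that guards against an empty result. This is an illustrative sketch only: MethodSample is a hypothetical stand-in for whatever element type TopMethods exposes, and SampleShares is not part of Memory.Introspect. The shares are relative to the methods you pass in, which is not necessarily every sample the capture saw.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the element type of profile.TopMethods.
public record MethodSample(string Name, long SampleCount);

public static class SampleShares
{
    // Returns each method's share of the listed samples as a percentage,
    // highest first. An empty or all-zero input yields an empty list
    // instead of NaN values.
    public static IReadOnlyList<(string Name, double Percent)> Compute(
        IEnumerable<MethodSample> methods)
    {
        var list = methods.ToList();
        long total = list.Sum(m => m.SampleCount);
        if (total == 0)
        {
            return Array.Empty<(string Name, double Percent)>();
        }

        return list
            .OrderByDescending(m => m.SampleCount)
            .Select(m => (Name: m.Name, Percent: 100.0 * m.SampleCount / total))
            .ToList();
    }
}
```

With real results you would project TopMethods into the stand-in type first, or adapt the helper to the library's own type.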

Excluding library frames

When you sample the process you're running in, Memory.Introspect's own frames show up in the output. The library filters out a small default list of modules to keep results readable. You can customise the list:

var introspector = MemoryIntrospector.Create(new()
{
    SamplingExcludedModules = new[]
    {
        "Memory.Introspect",
        "Microsoft.Diagnostics.NETCore.Client",
        "System.Private.CoreLib",
    },
});

Pass an empty list to disable filtering entirely:

// Inside the options initializer passed to MemoryIntrospector.Create:
SamplingExcludedModules = Array.Empty<string>(),

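Each entry in SamplingExcludedModules is matched against the simple name of the assembly that owns a frame (for example System.Private.CoreLib, with no version, culture, or file extension).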
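
To make "assembly simple name" concrete, here is a minimal sketch of simple-name filtering. It shows the general idea, not the library's actual implementation: ModuleFilter, the tuple shape used for frames, and the case-insensitive comparison are all assumptions. Only System.Reflection.AssemblyName is a real .NET API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class ModuleFilter
{
    // Extracts the simple name ("System.Private.CoreLib") from a full
    // assembly display name ("System.Private.CoreLib, Version=8.0.0.0, ...").
    public static string SimpleName(string assemblyDisplayName) =>
        new AssemblyName(assemblyDisplayName).Name ?? string.Empty;

    // Keeps only the frames whose owning assembly is not in the excluded set.
    // Case-insensitive comparison is an assumption, not documented behavior.
    public static IEnumerable<(string Assembly, string Method)> Apply(
        IEnumerable<(string Assembly, string Method)> frames,
        IEnumerable<string> excludedModules)
    {
        var excluded = new HashSet<string>(excludedModules, StringComparer.OrdinalIgnoreCase);
        return frames.Where(f => !excluded.Contains(SimpleName(f.Assembly)));
    }
}
```

The practical takeaway is that entries should look like "System.Private.CoreLib", not "System.Private.CoreLib.dll" or a full display name.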

Sampling duration

The right duration depends on how busy the process is:

  • Idle process — samples will be sparse. Run for 30–60 seconds.
  • Busy request loop — 5–10 seconds is usually enough to see the top contenders.
  • One-shot batch job — wrap the entire job and use a duration ≥ the expected run time. The capture stops at duration, not at the end of the job.
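
For the one-shot batch case, "wrapping" the job means starting the capture first and running the job while the sampling window is open. The helper below is a hedged sketch of that pattern: BatchCapture is a hypothetical name, and the capture is passed in as a delegate so the helper does not depend on any particular Memory.Introspect signature.

```csharp
using System;
using System.Threading.Tasks;

public static class BatchCapture
{
    // Starts the capture, runs the job while sampling is in progress,
    // then waits for the capture window to finish.
    //
    // Hypothetical usage (RunBatchJobAsync is your own method):
    //   var profile = await BatchCapture.AroundAsync(
    //       () => introspector.CollectSamplingProfileAsync(pid, TimeSpan.FromMinutes(5)),
    //       RunBatchJobAsync);
    public static async Task<TProfile> AroundAsync<TProfile>(
        Func<Task<TProfile>> startCapture,
        Func<Task> job)
    {
        Task<TProfile> capture = startCapture();   // capture task is started here
        await job();                               // workload runs inside the window
        return await capture;                      // completes once the window closes
    }
}
```

If the job throws, this sketch rethrows immediately and leaves the capture task unobserved. Pass a cancellation token to the capture if you need to stop it early in that case.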

Sampling rate is fixed by the runtime (default ~1 kHz). Longer captures don't sample more densely; they just collect more samples.
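
A back-of-the-envelope estimate can help when choosing a duration. The sketch below assumes the ~1 kHz figure above and that the number of samples scales with how many threads are busy on average; both are rough assumptions. SampleBudget is a hypothetical helper, and the error formula treats samples as independent draws, which real stack samples are not.

```csharp
using System;

public static class SampleBudget
{
    // Rough number of samples to expect: rate x window x average busy threads.
    public static double ExpectedSamples(double rateHz, TimeSpan window, double busyThreads) =>
        rateHz * window.TotalSeconds * busyThreads;

    // Standard error of a method's measured share (0..1), treating samples
    // as independent draws. Returns NaN when there are no samples.
    public static double ShareStandardError(double share, double totalSamples) =>
        totalSamples <= 0 ? double.NaN : Math.Sqrt(share * (1 - share) / totalSamples);
}
```

Under these assumptions, one fully busy thread sampled for 10 seconds gives about 10,000 samples, and a method at a 5% share is measured to within roughly ±0.2 percentage points. With only 100 samples, the same method's share is uncertain by about ±2 points, which is why idle processes need longer windows.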

A "what's slow right now?" pattern

app/Diagnostics/CpuWatcher.cs
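using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Memory.Introspect;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

public class CpuWatcher : BackgroundService
{
    readonly MemoryIntrospector _introspector;
    readonly ILogger<CpuWatcher> _logger;

    // Both dependencies are expected to come from the DI container.
    public CpuWatcher(MemoryIntrospector introspector, ILogger<CpuWatcher> logger)
    {
        _introspector = introspector;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            // Wait between captures; cancellation here ends the service.
            await Task.Delay(TimeSpan.FromMinutes(10), stoppingToken);

            try
            {
                var profile = await _introspector.CollectSamplingProfileAsync(
                    Process.GetCurrentProcess().Id,
                    TimeSpan.FromSeconds(20),
                    stoppingToken);

                foreach (var method in profile.TopMethods.Take(10))
                {
                    _logger.LogInformation("CPU sample: {Samples,8}  {Name}",
                        method.SampleCount, method.Name);
                }
            }
            catch (OperationCanceledException) { }   // shutdown requested mid-capture
            catch (Exception ex)
            {
                // A failed capture should not stop the background service.
                _logger.LogWarning(ex, "CPU sampling capture failed");
            }
        }
    }
}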

Schedule on a long interval — capturing every 10 minutes is normally enough to spot trends without producing log noise.
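
Wiring the watcher into an application might look like the sketch below. It assumes the .NET generic host (Host.CreateApplicationBuilder requires .NET 7 or later) and that CpuWatcher receives MemoryIntrospector and ILogger<CpuWatcher> through its constructor. Registering the introspector as a singleton is an assumption, not a documented requirement.

```csharp
using Memory.Introspect;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

var builder = Host.CreateApplicationBuilder(args);

// One shared introspector for the process; singleton lifetime is an assumption.
builder.Services.AddSingleton(_ => MemoryIntrospector.Create());

// Runs CpuWatcher.ExecuteAsync for the lifetime of the host.
builder.Services.AddHostedService<CpuWatcher>();

await builder.Build().RunAsync();
```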

Configuration knobs

Option                   Effect
CircularBufferSizeInMB   Size of the EventPipe ring buffer. Increase for long-duration captures on busy processes.
DiagnosticPort           Use a named diagnostic port instead of the default per-PID pipe. Useful for sidecars and remote capture.
SamplingExcludedModules  Assemblies whose frames are filtered out of TopMethods.

See Configuration for the full list.

Common pitfalls

Sampling profiles aren't allocation profiles

CollectSamplingProfileAsync measures CPU time. For "where is memory being allocated?", capture a .gcdump before and after the workload and compare — see Memory graphs.

Use real wall-clock duration

The duration parameter is wall-clock time, not CPU time: the call doesn't return early just because the process is idle. Pick a value you can afford to wait.

© 2026 Memory.Introspect. All rights reserved.