Bulk loading with SST files
Writing a billion keys with Put is slow — every key goes through the WAL, the MemTable, and (eventually) compaction. RocksDB has a much faster path for offline loads: build an SST file out-of-band, then ingest it into the database directly.
Use it for initial seeds, ETL imports, and migration jobs. Upstream reference: Creating and Ingesting SST files.
Building an SST file
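SstFileWriter writes keys in sorted order to a single SST file. Keys must be strictly increasing under the column family's comparator: the writer rejects an out-of-order or duplicate key as soon as it is added, so a bad sort fails the build step rather than the later ingest.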
var envOptions = new EnvOptions();
var cfOptions = new ColumnFamilyOptions();
using var writer = new SstFileWriter(envOptions, cfOptions);
writer.Open("/tmp/seed.sst");
foreach (var kv in keysInSortedOrder) // strictly increasing keys!
{
    writer.Add(kv.Key, kv.Value);
}
writer.Finish();
Ingesting the file
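IngestExternalFiles atomically adds one or more SST files to a column family. RocksDB places each file at the lowest level where its key range does not overlap existing data (usually L6, the bottom level with default settings, when the range is empty), so the files are immediately queryable.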
var ingest = new IngestExternalFileOptions()
    .SetMoveFiles(true) // move instead of copy
    .SetSnapshotConsistency(true);
db.IngestExternalFiles(
    files: new[] { "/tmp/seed.sst" },
    ingestOptions: ingest,
    cf: db.GetDefaultColumnFamily());
Once ingested, the data is visible to all subsequent reads. No compaction is required to make it visible — though future compactions may rewrite the file.
End-to-end example
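using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using RocksDbSharp;

var dbPath = "/var/lib/myapp/state";
var sstPath = "/tmp/import.sst";

// RocksDB's default comparator orders keys by unsigned bytes, so sort the
// encoded keys, not the strings: ordinal string order can differ from UTF-8
// byte order for characters outside the Basic Multilingual Plane.
var byBytes = Comparer<byte[]>.Create((a, b) => a.AsSpan().SequenceCompareTo(b));

// 1) Build the SST out of process / offline.
{
    using var w = new SstFileWriter(new EnvOptions(), new ColumnFamilyOptions());
    w.Open(sstPath);
    // LoadFromCsv stands in for your own loader; it is assumed to yield
    // (k, v) string pairs with unique keys.
    var rows = LoadFromCsv("import.csv")
        .Select(p => (Key: Encoding.UTF8.GetBytes(p.k), Value: Encoding.UTF8.GetBytes(p.v)))
        .OrderBy(p => p.Key, byBytes);
    foreach (var (key, value) in rows)
    {
        w.Add(key, value);
    }
    w.Finish();
}

// 2) Ingest into the live DB.
{
    using var db = RocksDb.Open(new DbOptions().SetCreateIfMissing(true), dbPath);
    var ingest = new IngestExternalFileOptions()
        .SetMoveFiles(true)
        .SetSnapshotConsistency(true);
    db.IngestExternalFiles(new[] { sstPath }, ingest);
    Console.WriteLine($"Ingested. Sequence now {db.GetLatestSequenceNumber()}");
}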
When to use bulk loading
| Scenario | Bulk-load or write loop? |
|---|---|
| Initial migration from another store | Bulk-load: typically 10–100× faster than Put. |
| Daily ETL with millions of rows | Bulk-load: it also avoids memtable and WAL pressure. |
| Mixed workload during ingest | Write loop or WriteBatch: bulk-loaded files land directly at an LSM level and bypass the memtable. |
| Sub-second latency on writes | Write loop. |
Caveats
Keys must be strictly sorted
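If you Add a key that is out of order, or that duplicates the previous key, SstFileWriter rejects it at Add time and the build fails. Sort beforehand by unsigned byte order, which matches RocksDB's default bytewise comparator. If your keys start as strings, sort the encoded bytes: ordinal string comparison works on UTF-16 code units and can disagree with UTF-8 byte order for characters outside the Basic Multilingual Plane.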
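As a minimal sketch of that sorting step, assuming the keys are strings stored as UTF-8 and the destination column family uses the default bytewise comparator, the snippet below encodes first, sorts by unsigned bytes, and fails fast on duplicates. `CompareBytes` is a hypothetical helper written for this example, not part of any RocksDB binding, and the sample keys are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper: unsigned lexicographic comparison of two byte arrays.
static int CompareBytes(byte[] a, byte[] b)
{
    var shared = Math.Min(a.Length, b.Length);
    for (var i = 0; i < shared; i++)
    {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    // One key is a prefix of the other: the shorter key sorts first.
    return a.Length.CompareTo(b.Length);
}

// Made-up sample keys, including two non-ASCII characters.
var keys = new[] { "b", "a", "\uFFFD", "\U0001F600", "ab" };

// Encode to UTF-8 first, then sort the bytes.
var sorted = keys
    .Select(k => Encoding.UTF8.GetBytes(k))
    .OrderBy(k => k, Comparer<byte[]>.Create(CompareBytes))
    .ToList();

// SST keys must be strictly increasing, so equal neighbours are an error.
for (var i = 1; i < sorted.Count; i++)
{
    if (CompareBytes(sorted[i - 1], sorted[i]) == 0)
        throw new InvalidOperationException($"Duplicate key at position {i}");
}

foreach (var k in sorted)
    Console.WriteLine(Convert.ToHexString(k));
```

With these sample keys, U+FFFD sorts before U+1F600 in byte order, while an ordinal string sort would place them the other way round.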
Column family options must match
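The comparator in the ColumnFamilyOptions passed to SstFileWriter must match the destination CF's comparator, or ingestion fails. Keep the prefix extractor and merge operator consistent with the destination as well, so that filters and merge operands in the file are interpreted correctly after ingest. Compression is stored with the file's blocks, so it does not have to match the destination's setting.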
Atomic across multiple files
IngestExternalFiles accepts an array. All files in the array are ingested atomically — either all succeed or none do — which lets you split a huge import into chunks but still expose it in one step.
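The chunking itself can be sketched without touching RocksDB. The snippet below assumes the keys are already sorted and unique, and plans one SST file per fixed-size chunk; the file naming scheme, the chunk size, and the sample keys are all illustrative assumptions. Each planned file would be written with its own SstFileWriter, and the full list of paths passed to a single IngestExternalFiles call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Already-sorted, unique keys (stand-ins for real import rows).
var sortedKeys = Enumerable.Range(0, 10).Select(i => $"key{i:D4}").ToList();

const int keysPerFile = 4; // illustrative; real chunks would be far larger

// Chunk() (.NET 6+) preserves order, so consecutive chunks cover
// disjoint, increasing key ranges.
var plan = sortedKeys
    .Chunk(keysPerFile)
    .Select((chunk, index) => (
        FilePath: $"/tmp/import-{index:D3}.sst", // hypothetical naming scheme
        First: chunk[0],
        Last: chunk[^1],
        KeyCount: chunk.Length))
    .ToList();

foreach (var file in plan)
    Console.WriteLine($"{file.FilePath}: {file.First}..{file.Last} ({file.KeyCount} keys)");

// All planned paths go into one array for a single ingest call.
var allPaths = plan.Select(f => f.FilePath).ToArray();
Console.WriteLine($"{allPaths.Length} files to ingest together");
```

Fixed-size chunks keep the sketch short; splitting by approximate output size instead would be a reasonable alternative when values vary a lot in length.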
Use `SetMoveFiles(true)` when you can
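With SetMoveFiles(true), the file is linked into the DB directory instead of copied, and the original path is removed once the ingest succeeds, so do not expect the source file to still exist afterwards. This only works when the SST file and the DB directory are on the same filesystem; a file built under /tmp, for example, may sit on a different mount than a database under /var/lib.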