Benchmarks

Every number, with its caveat.

The 2026-05-19 publishable sweep. Four slices: the engine against Kuzu and Neo4j local, the cloud against Neo4j Aura, bulk-write throughput against R2 EU, and a raw R2 PUT/GET microbench. Methodology, hardware, and the runs that don't flatter us, all in the open. The engine has since reached its first stable release; the next sweep widens LDBC query coverage and we publish it here when it lands.

Engine: LDBC SNB

Bench A, warm p50

NamiDB engine vs Kuzu (embedded) vs Neo4j 5-community (local Docker, NVMe). Same dataset, same Cypher, same hardware. The plan-cache, decoded-batch tier, and sidecar index land at the right place. Once warm, NamiDB beats Kuzu on every query.

p50, 30 warm runs, 3 parameter sets

Query	Description	NamiDB	Kuzu	Neo4j local	NamiDB vs Kuzu
IC02	Recent messages from your friends	1.09 ms	3.72 ms	0.61 ms	3-4x
IC07	Recent likers of your messages	1.19 ms	1.50 ms	0.43 ms	1.3x
IC08	Recent replies to your messages	0.95 ms	3.68 ms	0.38 ms	~4x
IC09	Friends of friends, with their messages	2.88 ms	6.55 ms	0.84 ms	1.4-2x

Reading the numbers honestly: Neo4j local wins everything because it runs on attached NVMe with a tuned B-tree. NamiDB's value isn't engine-only. It's the rest: object storage, multi-tenant, zero-egress. NamiDB cold-start is still 5-50x slower than Kuzu and Neo4j warm-disk paths (cold IC09 especially), where Parquet decode and first-time SST scan dominate. RFC-021 and a foyer-tier decoded-batch cache close most of that gap.
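The "vs Kuzu" ratios can be checked directly from the published p50 figures. The sketch below is only that arithmetic; the dictionary layout and the `speedup` helper are illustrative names, not part of the harness. A ratio of two published medians will not necessarily match a per-query range such as 1.4-2x, which presumably reflects spread across parameter sets.

```python
# Warm p50 latencies in milliseconds, copied from the Bench A figures above.
P50_MS = {
    "IC02": {"namidb": 1.09, "kuzu": 3.72, "neo4j_local": 0.61},
    "IC07": {"namidb": 1.19, "kuzu": 1.50, "neo4j_local": 0.43},
    "IC08": {"namidb": 0.95, "kuzu": 3.68, "neo4j_local": 0.38},
    "IC09": {"namidb": 2.88, "kuzu": 6.55, "neo4j_local": 0.84},
}


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline (>1 means faster)."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    for query, row in P50_MS.items():
        vs_kuzu = speedup(row["kuzu"], row["namidb"])
        vs_neo4j = speedup(row["neo4j_local"], row["namidb"])
        print(f"{query}: {vs_kuzu:.2f}x vs Kuzu, {vs_neo4j:.2f}x vs Neo4j local")
```

A value below 1 in the second column means Neo4j local is the faster engine for that query, which is the case on all four.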

Cloud: server-side

Bench B, wall minus WAN

NamiDB cloud (Hetzner fsn1 to R2 EU) against Neo4j Aura (GCP us-east1). Stripped of the WAN floor, the warm server-side cost on IC02 / IC07 / IC08 is essentially nothing. The engine, worker, and R2 path land within 0-2 ms of the network round-trip.

Server-side warm p50, WAN subtracted

Query	Description	NamiDB cloud	Neo4j Aura	Result
IC02	Recent messages from your friends	~2 ms	~12 ms	6x faster
IC07	Recent likers of your messages	< 1 ms	~12 ms	Within WAN floor
IC08	Recent replies to your messages	< 1 ms	~11 ms	Within WAN floor
IC09	Friends of friends, with their messages	8-18 ms	~12-13 ms	Comparable

Reading the numbers honestly: in wall-clock terms from this LATAM laptop, Aura is closer (RTT ~72 ms) than fsn1 (RTT ~205 ms), an accident of test geography, not architecture. The cold IC09 path still bites (~370 ms server-side), where Aura's tuned NVMe is cheaper on first touch. For EU-resident workloads where data-plane locality matters, NamiDB cloud is the right call today; for a US-east client, it isn't yet a drop-in Aura replacement.
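To make the "wall minus WAN" framing concrete, here is a minimal sketch that assumes each query costs exactly one network round-trip on top of the server-side time. That one-RTT model is an assumption made for illustration, and the harness may measure the WAN floor differently. The RTTs and the IC02 server-side figures are the ones quoted above; the function and key names are hypothetical.

```python
# Round-trip times from the LATAM laptop, as quoted above (milliseconds).
RTT_MS = {"namidb_fsn1": 205.0, "neo4j_aura": 72.0}

# Approximate warm server-side p50 for IC02, from the Bench B figures (milliseconds).
SERVER_SIDE_IC02_MS = {"namidb_fsn1": 2.0, "neo4j_aura": 12.0}


def server_side_ms(wall_ms: float, rtt_ms: float) -> float:
    """Subtract the WAN floor from a wall-clock latency (assumes one round-trip)."""
    return max(wall_ms - rtt_ms, 0.0)


def implied_wall_ms(server_ms: float, rtt_ms: float) -> float:
    """Inverse of server_side_ms: what the client would observe end to end."""
    return server_ms + rtt_ms


if __name__ == "__main__":
    for target, rtt in RTT_MS.items():
        wall = implied_wall_ms(SERVER_SIDE_IC02_MS[target], rtt)
        print(f"{target}: ~{wall:.0f} ms wall-clock, "
              f"{server_side_ms(wall, rtt):.0f} ms server-side")
```

Under this model the laptop sees Aura answer sooner end to end even though the NamiDB server-side cost is lower, which is the geography effect described above.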

Bulk write throughput

Bench D, 540 K elements

Engine-level and cloud-level write throughput. The multipart PUT path (5 MiB parts, 8-way concurrency) eats most of the WAN-bound serial-PUT overhead; an in-region anchor adds another 1.6x. Aura's LOAD CSV isn't comparable (it's bottlenecked by Bolt round-trips), so the cloud lane is NamiDB-only.

Lane	Tier	Run	Throughput (elements/s)	Notes
Kuzu COPY (CSV to local disk)	engine	540 K elements, 0.70 s	769 K (769,821)	Tuned columnar batch encoder for offline import. Wins engine-level writes ~2x.
NamiDB engine: bulk_load_inmemory	engine	540 K elements, 1.37 s	394 K (394,138)	In-memory upper bound for the engine path. Kuzu wins this slice.
NamiDB cloud: Hetzner fsn1 to R2 EU	cloud	In-region, ~3 ms RTT, 10.5 s	51.5 K (51,522)	Multipart PUT, 5 MiB parts, 8-way concurrency. 9x the pre-multipart baseline.
NamiDB cloud: laptop to R2 EU	cloud	LATAM client, ~205 ms RTT, 16.9 s	31.9 K (31,870)	Transatlantic, residential link. 5.5x the pre-multipart baseline.
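The multipart PUT path mentioned above (5 MiB parts, 8-way concurrency) can be sketched in a few lines. This is not the NamiDB implementation: `upload_part` is a hypothetical callable standing in for whatever S3-compatible client issues the part upload, and the in-memory store exists only so the example runs without a network.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

PART_SIZE = 5 * 1024 * 1024  # 5 MiB parts, as in the bench
CONCURRENCY = 8              # 8 parts in flight at once


def split_parts(payload: bytes, part_size: int = PART_SIZE) -> List[Tuple[int, bytes]]:
    """Return (part_number, chunk) pairs; part numbers start at 1."""
    return [
        (offset // part_size + 1, payload[offset:offset + part_size])
        for offset in range(0, len(payload), part_size)
    ]


def upload_multipart(
    payload: bytes,
    upload_part: Callable[[int, bytes], str],  # hypothetical: returns a tag per part
    concurrency: int = CONCURRENCY,
) -> List[Tuple[int, str]]:
    """Upload all parts with bounded concurrency; results stay in part order."""
    parts = split_parts(payload)
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        tags = list(pool.map(lambda part: upload_part(*part), parts))
    return [(number, tag) for (number, _), tag in zip(parts, tags)]


if __name__ == "__main__":
    store = {}  # stand-in for the object store

    def fake_upload_part(number: int, chunk: bytes) -> str:
        store[number] = chunk
        return f"tag-{number}"

    manifest = upload_multipart(bytes(12 * 1024 * 1024 + 1), fake_upload_part)
    print(manifest, sum(len(c) for c in store.values()))
```

With a serial PUT, every request pays the full round-trip before the next one starts; keeping several parts in flight overlaps those waits, which is why the gain is larger on the in-region lane's baseline comparison than raw bandwidth alone would suggest.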

R2 PUT / GET microbench

Bench F, laptop to R2 EU

Raw object-storage latency from a LATAM laptop. TTFB stays at ~140-170 ms regardless of object size, the RTT to the nearest Cloudflare PoP. The edge-cache effect: small reads are 3x faster warm. The bandwidth-bound regime kicks in at 16 MiB; a 100 MiB cold GET sustains about 96 Mbps.

Object size	PUT p50	GET cold p50	GET warm p50	TTFB p50
1 KiB	845 ms	432 ms	155 ms	n/a
64 KiB	754 ms	527 ms	152 ms	n/a
1 MiB	1.38 s	752 ms	235 ms	141 ms
16 MiB	1.56 s	1.70 s	1.32 s	170 ms
100 MiB	6.82 s	8.76 s	8.52 s	156 ms
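The bandwidth figure quoted for the 100 MiB cold GET follows from the table. The sketch below shows the conversion, using binary MiB for object size and decimal megabits per second for throughput; the helper name is illustrative.

```python
MIB = 1024 * 1024  # bytes per MiB


def throughput_mbps(size_bytes: int, seconds: float) -> float:
    """Sustained throughput in decimal megabits per second."""
    return size_bytes * 8 / seconds / 1_000_000


# Cold GET p50 values from the table above: (object size in bytes, seconds).
COLD_GET_P50 = {
    "16 MiB": (16 * MIB, 1.70),
    "100 MiB": (100 * MIB, 8.76),
}

if __name__ == "__main__":
    for label, (size_bytes, seconds) in COLD_GET_P50.items():
        print(f"{label}: {throughput_mbps(size_bytes, seconds):.1f} Mbps")
```

The same division applied to the small-object rows would be misleading, since those are dominated by round-trip latency, not bandwidth.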

Reproduce it yourself

Full harness

The harness is deterministic: same seed, same dataset, same numbers. Output is structured JSON with p50, p95, p99 per query, per parameter set. Run it on your hardware and compare against the numbers published here.

# Engine 3-way (NamiDB / Kuzu / Neo4j)
python3 scripts/bench_publish/bench_a_ldbc_engine.py \
    --scale 1.0 --dataset-dir /tmp/snb-1.0 \
    --warm-runs 30 --param-count 3 \
    --engines namidb-engine,kuzu,neo4j

# Cloud (NamiDB fsn1 vs Aura)
WORKER_URL=http://<fsn1-ip>:8081 NAMESPACE=<your-ns> \
    bash scripts/bench_publish/run_b_now.sh

# Bulk write to R2
namidb-bench load-r2 --scale 1.0 \
    --bucket "$NAMIDB_WORKER_TEST_BUCKET" \
    --namespace "bench-publish-$(date +%s)"
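When comparing your own run against the published figures, it helps to summarize raw latencies the same way. The sketch below computes p50, p95, and p99 with the nearest-rank method and emits one JSON record per query and parameter set. The record field names and the choice of nearest-rank are assumptions made for illustration; the harness's actual JSON schema and percentile method are not documented on this page.

```python
import json
import math
from typing import Dict, List, Union


def percentile(samples: List[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(query: str, param_set: int, samples_ms: List[float]) -> Dict[str, Union[str, int, float]]:
    """Build one summary record (hypothetical field names)."""
    return {
        "query": query,
        "param_set": param_set,
        "runs": len(samples_ms),
        "p50_ms": percentile(samples_ms, 50),
        "p95_ms": percentile(samples_ms, 95),
        "p99_ms": percentile(samples_ms, 99),
    }


if __name__ == "__main__":
    fake_latencies = [float(i) for i in range(1, 31)]  # placeholder data, not a measurement
    print(json.dumps(summarize("IC02", 0, fake_latencies), indent=2))
```

With only 30 runs per parameter set, p95 and p99 rest on one or two samples each, so treat tail comparisons as indicative.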

Methodology

Dataset: LDBC SNB Interactive, synthetic, deterministic (scale=1.0, 540 K elements)
Bench client: MacBook M4 Pro (12 cores), macOS 24.1.0, located in LATAM
Cloud worker: Hetzner CCX13 (2 vCPU AMD EPYC, 8 GiB), fsn1 (Falkenstein, DE)
Object store: Cloudflare R2 EU, bucket namidb
Compared: Kuzu 0.11.3, Neo4j 5-community (Docker), Neo4j Aura free, GCP us-east1
Protocol: 30 warm runs x 3 parameter sets per query, median reported
Storage (engine): object_store::InMemory for query benches
Build: Engine commit 1f7166c, cloud commit 34eec6d, 2026-05-19
Reproducibility: Harness scripts public, same seed, same numbers
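The protocol line above (30 warm runs x 3 parameter sets, median reported) can be sketched as a small timing loop. Several details here are assumptions, not statements about the published harness: one untimed warm-up call per parameter set, samples pooled across parameter sets before taking the median, and `run_query` as a hypothetical callable that executes one query.

```python
import statistics
import time
from typing import Any, Callable, List, Sequence, Tuple


def warm_p50_ms(
    run_query: Callable[[Any], Any],  # hypothetical: executes the query for one parameter set
    param_sets: Sequence[Any],
    warm_runs: int = 30,
    warmup_runs: int = 1,             # assumption: untimed calls that populate caches
) -> Tuple[float, List[float]]:
    """Return (median latency in ms, all timed samples in ms)."""
    samples: List[float] = []
    for params in param_sets:
        for _ in range(warmup_runs):
            run_query(params)  # not timed
        for _ in range(warm_runs):
            start = time.perf_counter()
            run_query(params)
            samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), samples


if __name__ == "__main__":
    def fake_query(params: int) -> int:
        return sum(range(1000 * params))  # stand-in workload, not a graph query

    p50, samples = warm_p50_ms(fake_query, param_sets=[1, 2, 3])
    print(f"p50 = {p50:.4f} ms over {len(samples)} timed runs")
```

`time.perf_counter` is used because it is monotonic and has the highest available resolution, which matters when the quantity being measured is around a millisecond.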

Headline numbers

Engine warm-path: 3-4x over Kuzu on IC02 / IC08.
Cloud server-side: 2-6x over Neo4j Aura, WAN subtracted.
Bulk-write to R2 EU: 51.5 K elem/s from in-region fsn1, 31.9 K /s from a transatlantic laptop.
Cold IC09 still slower than Aura first-touch. RFC-021 and a foyer-tier cache close ~80% of the residual.

Not covered yet

The remaining 10 LDBC Complex Read queries.
LDBC Short Reads (IS1-IS7).
Classical LDBC SF1 from the official Hadoop datagen (11 M elements).
Aura in gcp-europe-west1 for a symmetric WAN test.
A second in-region anchor (e.g. AWS Frankfurt) for the multipart write lane.
Per-tenant cloud cost, once the cloud is in public beta.

Run it on your own bucket.

Early access to the Cloud is open. One launch email when the engine is ready, never spam.
