ER vendor comparison

This page summarizes the entity-resolution landscape as recorded in the repository’s vendor comparison. The intent is honest positioning: where GoldenMatch leads, and where other engines lead.

Pricing is indicative, vendor surfaces change quickly, and academic F1 numbers are not enterprise outcomes on your real data. GoldenMatch numbers are reproducible from docs/reproducing-benchmarks.md in the repository.

GoldenMatch baseline

Axis	GoldenMatch
License	MIT (every package).
Languages	Python (headline) + TypeScript (parity) + Rust (Postgres/DuckDB).
Runtimes	Polars (≤500K), DuckDB (500K–50M), Ray (≥50M); Postgres via pgrx; DuckDB UDFs; edge JS.
Throughput	1M-record dedupe in ~12.3 min on 4-core / 16 GB; 100K-record fuzzy match in ~39 s.
Accuracy (DBLP-ACM)	F1 0.964 zero-config (hand-tuned ceiling 0.918).
Accuracy (NCVR)	F1 0.972 zero-config.
Zero-config	Introspective auto-config controller with cross-run memory and LLM fallback.
PPRL	Bloom-filter PPRL, F1 0.924 on FEBRL4.
AI-native surface	69 MCP tools, agent skills, REST API.
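
The PPRL row refers to Bloom-filter encoding. As a rough illustration of the general technique (not GoldenMatch's implementation), the sketch below encodes character bigrams into a keyed Bloom filter and compares two encodings with the Dice coefficient, so similar strings can be matched without exchanging plaintext. The function names, the 1000-bit filter size, the 10 hash functions, and the shared secret are all assumptions chosen for the example.

```python
import hashlib
import hmac


def bigrams(value: str) -> set[str]:
    """Split a normalized string into padded character bigrams."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value: str, secret: bytes, size: int = 1000, num_hashes: int = 10) -> set[int]:
    """Return the set of bit positions that `value` switches on.

    Positions come from keyed double hashing: (h1 + i * h2) mod size.
    The parameters are illustrative, not tuned for privacy or accuracy.
    """
    positions: set[int] = set()
    for gram in bigrams(value):
        data = gram.encode("utf-8")
        h1 = int.from_bytes(hmac.new(secret, data, hashlib.sha256).digest(), "big")
        h2 = int.from_bytes(hmac.new(secret, data, hashlib.sha512).digest(), "big")
        for i in range(num_hashes):
            positions.add((h1 + i * h2) % size)
    return positions


def dice(a: set[int], b: set[int]) -> float:
    """Dice coefficient of two encodings (1.0 means identical bit sets)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Both parties must use the same secret for their encodings to be comparable.
secret = b"hypothetical-shared-secret"
similar = dice(bloom_encode("Katherine Smith", secret), bloom_encode("Catherine Smith", secret))
different = dice(bloom_encode("Katherine Smith", secret), bloom_encode("Robert Jones", secret))
```

Here `similar` comes out much higher than `different`, because the two spellings share most of their bigrams. A production scheme would also need hardening against frequency attacks, which this sketch ignores.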

OSS engines

Engine	License	Where it beats GoldenMatch	Where GoldenMatch beats it
Splink (UK MoJ)	MIT	Distributed Fellegi-Sunter at 1B+ rows (Spark); mature interactive m/u and comparison-viewer charting UI.	Non-PII accuracy (DBLP-ACM 0.964 vs 0.728); zero-config; polyglot; AI-native; PPRL. The probabilistic head-to-head gap has closed and reversed: under the shared `bench_er_headtohead` evaluator (pairwise F1), GoldenMatch’s probabilistic (Fellegi-Sunter) auto-config beats Splink on every dataset Splink scores: historical_50k (Splink’s flagship) 0.778 vs 0.757, febrl3 0.991 vs 0.965, synthetic_person 0.998 vs 0.996; at the cluster level on historical_50k, B-cubed F1 is 0.844 vs 0.789. These results are deterministic as of #829. Note on metrics: the figures above are pairwise F1 under one shared evaluator; the often-cited ~0.97 on historical_50k is a cluster/entity-level metric, not exhaustive within-cluster pairwise F1, and Splink itself scores ~0.75 pairwise under the same harness. Splink skips the bibliographic dblp_acm dataset; GoldenMatch’s probabilistic path is weak there (0.377), so use the weighted controller, which scores 0.964. Full bake-off: `docs/benchmarks/2026-06-09-splink-bakeoff.md`. The `type: probabilistic` matchkey also has EM-trained m/u, supervised m from labels, model persistence, match-weight waterfall, posterior calibration, and label-driven threshold analysis, plus FS scale-out on the bucket/Ray path (measured 6M dedupe in 162.6 s / 11.3 GB single-node).
dedupe.io	BSD	Cleanest active-learning UX in OSS.	Accuracy without labels; throughput at scale; polyglot; AI-native; PPRL.
RecordLinkage Toolkit	BSD	DBLP-ACM F1 0.923 (narrowly).	Throughput, scale, PPRL, AI-native, zero-config, polyglot.
Zingg	AGPL + commercial	Native Spark scale.	MIT license; polyglot; AI-native; zero-config; PPRL.
Senzing G2 CE	Apache wrapper + closed core	Native identity graph; bundled reference data; real-time latency.	OSS license; zero-config; AI-native; bibliographic/product matching; PPRL.
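
The Splink row separates pairwise F1 from cluster-level B-cubed F1, and the two can diverge on the same output. The sketch below shows one common way to compute each from a predicted and a true clustering. It is not the `bench_er_headtohead` evaluator; the function names and the toy clusters are hypothetical, and it assumes both clusterings cover exactly the same records.

```python
from itertools import combinations


def pairs(clusters: list[set[str]]) -> set[frozenset[str]]:
    """All unordered within-cluster record pairs."""
    return {frozenset(p) for c in clusters for p in combinations(sorted(c), 2)}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total


def pairwise_f1(pred: list[set[str]], true: list[set[str]]) -> float:
    """F1 over record pairs: every within-cluster pair counts once."""
    pred_pairs, true_pairs = pairs(pred), pairs(true)
    if not pred_pairs or not true_pairs:
        return 0.0
    hits = len(pred_pairs & true_pairs)
    return f1(hits / len(pred_pairs), hits / len(true_pairs))


def bcubed_f1(pred: list[set[str]], true: list[set[str]]) -> float:
    """B-cubed F1: precision and recall are averaged per record."""
    pred_of = {r: c for c in pred for r in c}
    true_of = {r: c for c in true for r in c}
    records = list(true_of)
    precision = sum(len(pred_of[r] & true_of[r]) / len(pred_of[r]) for r in records) / len(records)
    recall = sum(len(pred_of[r] & true_of[r]) / len(true_of[r]) for r in records) / len(records)
    return f1(precision, recall)


# Toy case: one true entity of four records is split into two predicted clusters.
true_clusters = [{"a", "b", "c", "d"}, {"e"}]
pred_clusters = [{"a", "b"}, {"c", "d"}, {"e"}]
print(round(pairwise_f1(pred_clusters, true_clusters), 2))  # 0.5
print(round(bcubed_f1(pred_clusters, true_clusters), 2))    # 0.75
```

Splitting one four-record entity loses four of its six true pairs, so pairwise recall drops to 1/3, while B-cubed recall only drops to 0.6 because each record still sits with half of its true cluster. This is why a cluster-level score can sit well above the pairwise score for the same predictions.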

Identity-graph and cloud-managed

Engine	Where it beats GoldenMatch	Where GoldenMatch beats it
Quantexa	Graph-native output; decisioning layer; enterprise governance.	OSS and composable; time-to-value; AI-native; cost.
Tilores	Identity-graph engine; GraphQL API; hosted by default.	OSS; edge runtimes; zero-config; AI-native; PPRL.
AWS Entity Resolution	Scale is AWS’s problem; Glue/Lake Formation integration.	Portability (laptop, Postgres, edge worker); zero-config; AI-native; multi-cloud.
Snowflake Cortex Match	Runs in-warehouse, no data movement.	Full engine (blocking, clustering, golden records); portable; published F1.

Research SOTA

Engine	Where it beats GoldenMatch	Where GoldenMatch beats it
Ditto (Megagon)	Abt-Buy F1 0.893 (product-matching SOTA) vs GoldenMatch 0.722 (+LLM) / 0.817 (+Vertex).	No training labels; non-product domains; throughput; AI-native; cheaper than fine-tuning.

Cross-cutting observations

Zero-config is uncontested. No other vendor publishes zero-config benchmark numbers.
Identity graph is the next frontier. Mainstream OSS engines emit clusters; Senzing, Quantexa, and Tilores emit a graph.
AI-native is uncontested but unproven. Nobody else ships MCP, agent skills, or LLM-budget-controlled borderline scoring.
License matters. MIT and BSD engines have outgrown AGPL and closed-core peers partly on embedding permissions.
Portability is the wedge against warehouse-native managed services: the same engine runs in app code, an edge worker, or a Postgres trigger.
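
To make "LLM-budget-controlled borderline scoring" concrete, here is a hypothetical sketch of the idea: pairs whose similarity score falls in an ambiguous band are escalated to a more expensive judge until a spending cap is reached, and everything else is decided by a plain threshold. This is not GoldenMatch's API. The `route_pairs` function, the band limits, the per-call cost, and the `judge` callable are all assumptions, and no real LLM is called.

```python
from typing import Callable, Iterable

Pair = tuple[str, str]


def route_pairs(
    scored: Iterable[tuple[Pair, float]],
    judge: Callable[[Pair], bool],
    low: float = 0.6,
    high: float = 0.85,
    budget: float = 1.0,
    cost_per_call: float = 0.25,
) -> tuple[dict[Pair, bool], float]:
    """Decide match/non-match per pair, escalating borderline pairs within budget.

    Returns the decisions and the amount of budget spent.
    """
    decisions: dict[Pair, bool] = {}
    spent = 0.0
    threshold = (low + high) / 2  # fallback cut-off when the judge is not used
    # Visit the most ambiguous pairs first so the budget goes where it helps most.
    for pair, score in sorted(scored, key=lambda item: abs(item[1] - threshold)):
        borderline = low <= score < high
        if borderline and spent + cost_per_call <= budget:
            decisions[pair] = judge(pair)  # stand-in for an LLM call
            spent += cost_per_call
        else:
            decisions[pair] = score >= threshold
    return decisions, spent


# Usage with a stub judge and a budget that covers a single escalation.
scored_pairs = [(("a", "b"), 0.95), (("c", "d"), 0.70), (("e", "f"), 0.72), (("g", "h"), 0.30)]
decisions, spent = route_pairs(scored_pairs, judge=lambda pair: True, budget=0.25)
```

With this input only `("e", "f")`, the pair closest to the midpoint of the band, reaches the judge; `("c", "d")` is also borderline but falls back to the threshold once the budget is exhausted.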
