This page summarizes the entity-resolution landscape as recorded in the repository’s vendor comparison. The intent is honest positioning: where GoldenMatch leads, and where other engines lead.
Pricing is indicative, vendor surfaces change quickly, and academic F1 numbers are not enterprise outcomes on your real data. GoldenMatch numbers are reproducible from docs/reproducing-benchmarks.md in the repository.
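
For readers checking these figures against their own data, the following is a minimal sketch of a pairwise F1 score, the metric usually reported on benchmarks such as DBLP-ACM. The function name `pairwise_f1` and the toy record IDs are hypothetical and are not part of GoldenMatch; the repository's own procedure is the one in docs/reproducing-benchmarks.md.

```python
def pairwise_f1(predicted_pairs, true_pairs):
    """Return (precision, recall, f1) over unordered record-ID pairs."""
    # Normalise each pair so that (a, b) and (b, a) count as the same match.
    predicted = {frozenset(pair) for pair in predicted_pairs}
    truth = {frozenset(pair) for pair in true_pairs}

    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data: 3 predicted matches, 4 true matches, 2 in common.
predicted = [("a1", "a2"), ("b1", "b2"), ("c1", "d1")]
truth = [("a2", "a1"), ("b1", "b2"), ("e1", "e2"), ("f1", "f2")]

precision, recall, f1 = pairwise_f1(predicted, truth)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Running the same function over a labeled sample of your own records is one way to see how far a published benchmark F1 carries over to your data.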

GoldenMatch baseline

Axis | GoldenMatch
License | MIT (every package).
Languages | Python (headline) + TypeScript (parity) + Rust (Postgres/DuckDB).
Runtimes | Polars (≤500K), DuckDB (500K–50M), Ray (≥50M); Postgres via pgrx; DuckDB UDFs; edge JS.
Throughput | 1M dedupe in ~12.3 min on 4-core / 16 GB; 100K fuzzy ~39s.
Accuracy (DBLP-ACM) | F1 0.964 zero-config (hand-tuned ceiling 0.918).
Accuracy (NCVR) | F1 0.972 zero-config.
Zero-config | Introspective auto-config controller with cross-run memory and LLM fallback.
PPRL | Bloom-filter PPRL, F1 0.924 on FEBRL4.
AI-native surface | 35+ MCP tools, agent skills, REST API.
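
The runtime tiers and throughput figures in the table can be read as simple arithmetic. The sketch below is illustrative only: `pick_runtime` and `records_per_second` are hypothetical helpers, not GoldenMatch functions, and the tie-break at exactly 50M rows (where the table lists both DuckDB and Ray) is an assumption.

```python
POLARS_MAX_ROWS = 500_000        # table: Polars (≤500K)
DUCKDB_MAX_ROWS = 50_000_000     # table: DuckDB (500K–50M), Ray (≥50M)


def pick_runtime(row_count: int) -> str:
    """Map a row count to the runtime tier named in the table."""
    if row_count < 0:
        raise ValueError("row_count must be non-negative")
    if row_count <= POLARS_MAX_ROWS:
        return "polars"
    # Assumption: exactly 50M rows goes to Ray, since the table says "≥50M".
    if row_count < DUCKDB_MAX_ROWS:
        return "duckdb"
    return "ray"


def records_per_second(records: int, seconds: float) -> float:
    """Convert a wall-clock figure into an average rate."""
    return records / seconds


# The two throughput figures from the table, expressed as rates.
dedupe_rate = records_per_second(1_000_000, 12.3 * 60)  # ≈ 1355 records/s
fuzzy_rate = records_per_second(100_000, 39)            # ≈ 2564 records/s

for rows in (100_000, 2_000_000, 80_000_000):
    print(rows, pick_runtime(rows))
```

The two rates describe different workloads on different input sizes, so they are not directly comparable to each other.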

OSS engines

Engine | License | Where it beats GoldenMatch | Where GoldenMatch beats it
Splink (UK MoJ) | MIT | Throughput (1B+ rows); PII F1 on Febrl (0.998). | Non-PII accuracy (DBLP-ACM 0.964 vs 0.728); zero-config; polyglot; AI-native; PPRL.
dedupe.io | BSD | Cleanest active-learning UX in OSS. | Accuracy without labels; throughput at scale; polyglot; AI-native; PPRL.
RecordLinkage Toolkit | BSD | DBLP-ACM F1 0.923 (narrowly). | Throughput, scale, PPRL, AI-native, zero-config, polyglot.
Zingg | AGPL + commercial | Native Spark scale. | MIT license; polyglot; AI-native; zero-config; PPRL.
Senzing G2 CE | Apache wrapper + closed core | Native identity graph; bundled reference data; real-time latency. | OSS license; zero-config; AI-native; bibliographic/product matching; PPRL.
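
PPRL recurs in the right-hand column above. As background, here is a minimal sketch of the general Bloom-filter technique for privacy-preserving record linkage: each party hashes character bigrams with a shared secret and compares the resulting bit positions with a Dice coefficient. This is the textbook construction, not GoldenMatch's implementation; the function names, the 1024-bit length, the 4 hash functions, and the secret are all assumptions chosen for illustration.

```python
import hashlib
import hmac


def bigrams(value: str) -> set:
    """Split a normalised, padded string into character bigrams."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value: str, secret: bytes,
                 num_bits: int = 1024, num_hashes: int = 4) -> set:
    """Return the set of bit positions set in the Bloom filter for value."""
    positions = set()
    for gram in bigrams(value):
        for seed in range(num_hashes):
            # Keyed hash: without the shared secret the positions cannot
            # be recomputed from a guessed name.
            message = f"{seed}:{gram}".encode("utf-8")
            digest = hmac.new(secret, message, hashlib.sha256).digest()
            positions.add(int.from_bytes(digest[:8], "big") % num_bits)
    return positions


def dice_similarity(a: set, b: set) -> float:
    """Dice coefficient over set bit positions, in [0, 1]."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


secret = b"hypothetical-shared-secret"  # placeholder, agreed out of band
smith = bloom_encode("Smith", secret)
smyth = bloom_encode("Smyth", secret)
jones = bloom_encode("Jones", secret)

print(dice_similarity(smith, smyth))  # similar names share most positions
print(dice_similarity(smith, jones))  # unrelated names share few or none
```

Only the encoded positions are exchanged between parties, so fuzzy comparison still works while the raw values stay local.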

Identity-graph and cloud-managed

Engine | Where it beats GoldenMatch | Where GoldenMatch beats it
Quantexa | Graph-native output; decisioning layer; enterprise governance. | OSS and composable; time-to-value; AI-native; cost.
Tilores | Identity-graph engine; GraphQL API; hosted by default. | OSS; edge runtimes; zero-config; AI-native; PPRL.
AWS Entity Resolution | Scale is AWS’s problem; Glue/Lake Formation integration. | Portability (laptop, Postgres, edge worker); zero-config; AI-native; multi-cloud.
Snowflake Cortex Match | Runs in-warehouse, no data movement. | Full engine (blocking, clustering, golden records); portable; published F1.

Research SOTA

Engine | Where it beats GoldenMatch | Where GoldenMatch beats it
Ditto (Megagon) | Abt-Buy F1 0.893 (product-matching SOTA) vs GoldenMatch 0.722 (+LLM) / 0.817 (+Vertex). | No training labels; non-product domains; throughput; AI-native; cheaper than fine-tuning.

Cross-cutting observations

  • Zero-config is uncontested. No other vendor publishes zero-config benchmark numbers.
  • Identity graph is the next frontier. Mainstream OSS engines emit clusters; Senzing, Quantexa, and Tilores emit a graph.
  • AI-native is uncontested but unproven. Nobody else ships MCP, agent skills, or LLM-budget-controlled borderline scoring.
  • License matters. MIT and BSD engines have outgrown AGPL and closed-core peers, partly because permissive licenses allow them to be embedded in other products.
  • Portability is the wedge against warehouse-native managed services: the same engine runs in app code, an edge worker, or a Postgres trigger.
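
The cluster-versus-graph distinction in the second observation can be made concrete with a small sketch. Given the same scored match pairs, a cluster output keeps only group membership, while a graph output keeps every edge and its score. The helpers below are hypothetical and generic; they do not describe how Senzing, Quantexa, Tilores, or GoldenMatch build their outputs.

```python
from collections import defaultdict


def clusters_from_pairs(pairs):
    """Collapse matched pairs into clusters with union-find."""
    parent = {}

    def find(record):
        parent.setdefault(record, record)
        while parent[record] != record:
            parent[record] = parent[parent[record]]  # path halving
            record = parent[record]
        return record

    for left, right, _score in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    groups = defaultdict(set)
    for record in list(parent):
        groups[find(record)].add(record)
    return sorted(sorted(group) for group in groups.values())


def graph_from_pairs(pairs):
    """Keep every match as an undirected edge with its score."""
    adjacency = defaultdict(dict)
    for left, right, score in pairs:
        adjacency[left][right] = score
        adjacency[right][left] = score
    return dict(adjacency)


# Hypothetical scored matches: (record_a, record_b, similarity).
matches = [("r1", "r2", 0.97), ("r2", "r3", 0.81), ("r4", "r5", 0.93)]

clusters = clusters_from_pairs(matches)
graph = graph_from_pairs(matches)

print(clusters)     # [['r1', 'r2', 'r3'], ['r4', 'r5']]
print(graph["r2"])  # {'r1': 0.97, 'r3': 0.81}
```

In the cluster view, r1 and r3 are simply members of the same entity. The graph view preserves that they are linked only through r2, and with a weaker score on one side, which is the kind of evidence an identity-graph engine exposes.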