SynthArena

A Unified Evaluation Framework for AI-Driven Retrosynthesis

The Evaluation Crisis

The Babel of Formats: AiZynthFinder outputs bipartite graphs; Retro* outputs precursor maps; DirectMultiStep outputs recursive dictionaries. Comparing them requires bespoke parsers for every model.

Inconsistent Stocks: Starting material definitions vary by roughly three orders of magnitude, from curated catalogs of 300k molecules to speculative screening libraries of 230M+ compounds, making reported solvability scores incomparable.

Solvability ≠ Validity: Routes marked as "solved" are validated only by endpoint availability, with no guarantee that intermediate transformations are chemically feasible.
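To make the last two points concrete, here is a minimal sketch of an endpoint-only solvability check. The `Node` class, `leaves`, and `is_solved` are hypothetical names for illustration, not part of any tool named here, and the stocks are toy sets. The same route flips from unsolved to solved when the stock grows, and no reaction step is ever inspected.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """A molecule in a route tree; children are the precursors used to make it."""
    smiles: str
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[str]:
    """Collect the SMILES of all starting materials (nodes without precursors)."""
    if not node.children:
        return [node.smiles]
    found: List[str] = []
    for child in node.children:
        found.extend(leaves(child))
    return found


def is_solved(route: Node, stock: Set[str]) -> bool:
    """Endpoint-only check: every leaf must be in the stock.

    Intermediate reactions are never examined, which is exactly the gap
    between solvability and validity.
    """
    return all(smiles in stock for smiles in leaves(route))


# Toy route: acetanilide from aniline and acetyl chloride.
route = Node("CC(=O)Nc1ccccc1", [Node("Nc1ccccc1"), Node("CC(=O)Cl")])
small_stock = {"Nc1ccccc1"}
large_stock = {"Nc1ccccc1", "CC(=O)Cl"}

solved_small = is_solved(route, small_stock)
solved_large = is_solved(route, large_stock)
```

Because the verdict depends only on the leaves and the stock, two models evaluated against different stocks cannot be compared on their solve rates alone.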

The Solution

RetroCast: A universal translation layer providing adapters for 10+ models (AiZynthFinder, Retro*, ASKCOS, DirectMultiStep, and more), casting all outputs into a canonical schema with cryptographic manifests for reproducibility.
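The adapter idea can be sketched as follows. This is an illustration under stated assumptions, not RetroCast's actual schema or API: the canonical form here is a nested `{"smiles", "children"}` dictionary, the two input shapes are simplified stand-ins for a precursor map and a recursive dictionary, and the "manifest" is reduced to a SHA-256 digest of the canonical JSON. All function names are hypothetical.

```python
import hashlib
import json


def from_precursor_map(target, precursors):
    """Cast a product -> reactants map into the nested canonical tree.

    Assumes the map is acyclic; molecules absent from the map are leaves.
    """
    return {
        "smiles": target,
        "children": [from_precursor_map(r, precursors)
                     for r in precursors.get(target, [])],
    }


def from_recursive_dict(node):
    """Cast a recursive dictionary into the same tree, making empty children explicit."""
    return {
        "smiles": node["smiles"],
        "children": [from_recursive_dict(c) for c in node.get("children", [])],
    }


def fingerprint(route):
    """SHA-256 digest of a deterministic JSON serialization of the route."""
    payload = json.dumps(route, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# The same two-step route expressed in two different (simplified) formats.
precursor_map = {
    "CC(=O)Nc1ccccc1": ["Nc1ccccc1", "CC(=O)Cl"],
    "Nc1ccccc1": ["O=[N+]([O-])c1ccccc1"],
}
recursive = {
    "smiles": "CC(=O)Nc1ccccc1",
    "children": [
        {"smiles": "Nc1ccccc1",
         "children": [{"smiles": "O=[N+]([O-])c1ccccc1"}]},
        {"smiles": "CC(=O)Cl"},
    ],
}

route_a = from_precursor_map("CC(=O)Nc1ccccc1", precursor_map)
route_b = from_recursive_dict(recursive)
digest_a = fingerprint(route_a)
digest_b = fingerprint(route_b)
```

Once both outputs share one schema, identical routes produce identical digests, so a recorded digest can later confirm that an evaluation ran on unmodified predictions. A real adapter would also need to canonicalize SMILES and child order, which this sketch leaves out.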

Curated Benchmarks: Stratified evaluation sets that correct the distribution skew in PaRoutes. The mkt- series uses commercial stocks (Buyables) for practical utility; the ref- series uses standardized stocks for fair algorithmic comparison.
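As a rough sketch of what stratification does to a skewed pool, the snippet below draws an equal number of targets from each stratum. The stratum variable (route length), the pool, and the function name `stratified_sample` are assumptions made for illustration; the text above does not say how the benchmark sets are actually stratified.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(items, key, per_stratum, seed=0):
    """Draw up to `per_stratum` items from each group defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    sample = []
    for stratum in sorted(groups):
        members = groups[stratum]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical skewed pool: (target id, route length), 90 short and 10 long routes.
pool = [("short-%d" % i, 2) for i in range(90)]
pool += [("long-%d" % i, 6) for i in range(10)]

subset = stratified_sample(pool, key=lambda t: t[1], per_stratum=5)
counts = Counter(length for _, length in subset)
```

In the raw pool a model could score well by handling only short routes; in the stratified subset each route length carries equal weight.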

SynthArena: This platform provides side-by-side route comparison with diff overlays, bootstrapped confidence intervals, and a living leaderboard—turning evaluation from a static exercise into an ongoing community process.
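A bootstrapped confidence interval for a solve rate can be sketched as below. This is a plain percentile bootstrap using only the standard library; the function name, resample count, and the toy outcomes are assumptions, and the platform may compute its intervals differently.

```python
import random


def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of 0/1 outcomes."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(outcomes)
    # Resample targets with replacement and record the solve rate each time.
    rates = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    lower = rates[int((alpha / 2) * n_resamples)]
    upper = rates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical results: 70 of 100 targets solved.
outcomes = [1] * 70 + [0] * 30
point_estimate = sum(outcomes) / len(outcomes)
lower, upper = bootstrap_ci(outcomes)
```

Reporting the interval alongside the point estimate shows whether a gap between two leaderboard entries is larger than the resampling noise on a benchmark of that size.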

Platform Statistics

Models & Predictions

Metric | Count

Stock Molecules

Stock | Molecules

Benchmark Series

Benchmark | Targets | Runs


Evaluating MultiStep Retrosynthetic Solutions | SynthArena by isChemist Group