Benchmarks

Stratified evaluation subsets from PaRoutes designed to measure performance across route lengths, topologies, and material availability.

Available Benchmarks

Benchmark    Series  Targets  Description                                           Stock
mkt-lin-500  mkt     500      Linear routes of lengths 2–6 (100 each)               buyables-stock
mkt-cnv-160  mkt     160      Convergent routes of depths 2–5 (40 each)             buyables-stock
ref-lin-600  ref     600      Linear routes of lengths 2–7 (100 each)               n5-stock
ref-cnv-400  ref     400      Convergent routes of depths 2–5 (100 each)            n5-stock
ref-lng-84   ref     84       All available routes with length 8–10 from n1 and n5  n1-n5-stock
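
The fixed per-length quotas (for example, 100 routes for each length from 2 to 6) can be read as a stratified sampling scheme. The following is a minimal sketch of that idea, not the procedure actually used to build these benchmarks. The function name `stratified_sample`, the `(route_id, length)` input format, and the toy pool are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(routes, lengths, per_stratum, seed=0):
    """Draw `per_stratum` route ids for each length in `lengths`.

    `routes` is an iterable of (route_id, length) pairs (hypothetical format).
    """
    rng = random.Random(seed)
    by_length = defaultdict(list)
    for route_id, length in routes:
        by_length[length].append(route_id)

    sample = {}
    for length in lengths:
        # Sort the pool so the draw depends only on the seed, not on input order.
        pool = sorted(by_length[length])
        if len(pool) < per_stratum:
            raise ValueError(f"only {len(pool)} routes of length {length}")
        sample[length] = rng.sample(pool, per_stratum)
    return sample

# Toy pool (made up): 300 routes for each length from 2 to 6.
toy_routes = [(f"r{length}-{i}", length) for length in range(2, 7) for i in range(300)]
subset = stratified_sample(toy_routes, lengths=range(2, 7), per_stratum=100, seed=42)
total = sum(len(ids) for ids in subset.values())  # 5 strata x 100 routes = 500
```

Fixing the seed makes the draw reproducible, which is what allows different candidate subsets to be compared against each other.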

Which Benchmark Should I Use?

Are you a chemist looking for the best model to use right now with commercially available materials? Use Market Series (mkt-*).

Are you an algorithm developer who wants fair comparison against other approaches? Use Reference Series (ref-*).

Why Stratified Benchmarks?

Metric Insensitivity

74% of routes in PaRoutes n5 have lengths 3–4. Aggregate metrics computed over such a skewed distribution can mask significant performance differences on longer routes (5+ steps) or on specific topologies (linear vs. convergent).
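
To illustrate the masking effect, the sketch below compares an overall solve rate with a per-length breakdown. The outcomes are made up for illustration and are not results from any model. The helper name `solve_rate_by_length` and the `(route_length, solved)` input format are assumptions.

```python
from collections import defaultdict

def solve_rate_by_length(results):
    """Return {route_length: solve_rate} from (route_length, solved) pairs."""
    counts = defaultdict(lambda: [0, 0])  # length -> [solved, total]
    for length, solved in results:
        counts[length][0] += int(solved)
        counts[length][1] += 1
    return {length: s / t for length, (s, t) in sorted(counts.items())}

# Made-up outcomes: 74 short routes (length 3) and 26 long routes (length 6).
toy = [(3, True)] * 70 + [(3, False)] * 4 + [(6, True)] * 5 + [(6, False)] * 21

overall = sum(solved for _, solved in toy) / len(toy)  # 75 solved out of 100
per_length = solve_rate_by_length(toy)

print(f"overall: {overall:.2f}")
for length, rate in per_length.items():
    print(f"length {length}: {rate:.2f}")
```

In this toy case the overall rate looks reasonable even though most long routes fail, which is the situation stratified subsets are meant to expose.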

Stock Definition

Only ~46% of PaRoutes leaf molecules are in the Buyables stock, which suggests that many routes are arbitrary fragments, cut off where the patent description ended rather than at commercially available starting materials.
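
A coverage figure of this kind can be computed as the fraction of leaf molecules found in a stock set. The sketch below counts unique leaf molecules; whether the ~46% figure counts unique molecules or occurrences is not stated here, so that choice is an assumption, as are the function name `leaf_stock_coverage` and the toy SMILES.

```python
def leaf_stock_coverage(route_leaves, stock):
    """Fraction of unique leaf molecules that are present in `stock`.

    `route_leaves` is a list of sets of leaf SMILES, one set per route
    (hypothetical format); `stock` is a set of SMILES.
    """
    leaves = set().union(*route_leaves)
    if not leaves:
        return 0.0
    return len(leaves & stock) / len(leaves)

# Toy data (made up): three routes, four unique leaves, two of them purchasable.
toy_leaves = [
    {"CCO", "CC(=O)O"},
    {"CCO", "Brc1ccccc1"},
    {"OB(O)c1ccccc1"},
]
toy_stock = {"CCO", "CC(=O)O"}

coverage = leaf_stock_coverage(toy_leaves, toy_stock)
```

A real comparison would need both sides canonicalized in the same way before the set intersection; that step is omitted here.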

Validation: We used seed stability analysis across 15 candidate subsets to ensure each benchmark is internally representative and minimizes variance.

Key Terminology

Convergent Route
Contains at least one reaction combining ≥2 non-leaf molecules. Represents complex synthetic strategies where multiple intermediates are brought together.
Linear Route
All reactions use at most one non-leaf molecule, representing sequential transformations.
Route Length
The number of reaction steps in the synthesis route from target to starting materials.
Stock
The set of commercially available or defined starting materials that can be used as leaves in a synthesis route.
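
The definitions above can be sketched on a simple tree representation. This is a minimal illustration under stated assumptions: each molecule is a dict with a `name` and a list of `children` (the reactants of the single reaction that produces it), and a leaf is a molecule with no children. This layout and the helper names are hypothetical, not the PaRoutes file format. Route length is taken here as the longest chain of reactions from the target to a leaf, which is one possible reading of the definition; for linear routes it equals the total number of reactions.

```python
def mol(name, *children):
    """Build a molecule node; `children` are the reactants that produce it."""
    return {"name": name, "children": list(children)}

def is_leaf(node):
    return not node["children"]

def is_convergent(node):
    """True if any reaction in the route combines two or more non-leaf molecules."""
    if is_leaf(node):
        return False
    non_leaf = [child for child in node["children"] if not is_leaf(child)]
    if len(non_leaf) >= 2:
        return True
    return any(is_convergent(child) for child in non_leaf)

def route_depth(node):
    """Longest chain of reactions from this molecule down to a leaf."""
    if is_leaf(node):
        return 0
    return 1 + max(route_depth(child) for child in node["children"])

def leaves(node):
    """Names of all leaf molecules in the route."""
    if is_leaf(node):
        return {node["name"]}
    return set().union(*(leaves(child) for child in node["children"]))

# Placeholder names, not real molecules.
linear = mol("T", mol("I1", mol("A"), mol("B")), mol("C"))
convergent = mol("T", mol("I1", mol("A"), mol("B")), mol("I2", mol("C"), mol("D")))

stock = {"A", "B", "C"}  # toy stock
linear_in_stock = leaves(linear) <= stock          # all leaves purchasable
convergent_in_stock = leaves(convergent) <= stock  # "D" is missing from stock
```

Both example routes have depth 2, but only the second contains a reaction whose reactants are both intermediates.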

Next Steps