Submit Results
SynthArena is a living record of the field. We welcome submissions from model developers.
The “No Black Box” Policy
We do not accept self-reported numbers (e.g., “we got 95%”). To appear on the leaderboard, you must provide the Standardized Route Data generated by RetroCast.
All leaderboard scores are computationally verified against the submitted route data, without exception.
Verification Tiers
- Models run explicitly by the SynthArena team. We attest to the hardware, runtime, and configuration.
- Models submitted by authors. The score is computationally verified against the provided route data. The community can audit the specific routes for hallucinations.
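As a rough illustration of what "computationally verified" means, the sketch below recomputes a metric from route data and compares it with a claimed value. The metric (fraction of targets with at least one route), the `routes_by_target` structure, and the tolerance are assumptions made for this example only; the actual scoring is defined by RetroCast.

```python
from typing import Mapping, Sequence


def solve_rate(routes_by_target: Mapping[str, Sequence[object]]) -> float:
    """Hypothetical metric: fraction of targets that have at least one route."""
    if not routes_by_target:
        return 0.0
    solved = sum(1 for routes in routes_by_target.values() if len(routes) > 0)
    return solved / len(routes_by_target)


def matches_claim(
    routes_by_target: Mapping[str, Sequence[object]],
    claimed: float,
    tol: float = 1e-6,
) -> bool:
    """True when the recomputed metric agrees with the claimed number."""
    return abs(solve_rate(routes_by_target) - claimed) <= tol


# Illustrative data only: target SMILES mapped to placeholder routes.
routes = {
    "CCO": [["route-a"]],
    "c1ccccc1": [],
    "CC(=O)O": [["route-b"], ["route-c"]],
}
print(matches_claim(routes, claimed=0.95))
```

The point of the design is that the claimed number is never trusted on its own: it is only accepted if it can be reproduced from the submitted routes.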
How to Submit
If your paper is public or you don't care about anonymity:
1. Run your model through RetroCast: retrocast ingest, retrocast score, and retrocast analyze
2. Upload the data/artifact (including manifest.json) to a public URL (Zenodo, S3, Google Drive, etc.)
3. Open a Model Submission Issue on GitHub with the link and required metadata
4. We will audit the manifest and merge it into the main database
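Before uploading, it can help to confirm that the artifact is complete. The following is a minimal sketch, assuming the artifact is a local directory with manifest.json at its top level; that layout and the `check_artifact` helper are hypothetical, and the manifest's actual schema is defined by RetroCast and is not checked here.

```python
import hashlib
import json
from pathlib import Path


def check_artifact(artifact_dir: str) -> dict[str, str]:
    """Confirm manifest.json exists and parses, then checksum every file.

    Returns a mapping of relative file path to SHA-256 hex digest.
    """
    root = Path(artifact_dir)
    manifest = root / "manifest.json"
    if not manifest.is_file():
        raise FileNotFoundError(f"missing {manifest}")

    # json.loads raises ValueError (JSONDecodeError) on malformed JSON.
    json.loads(manifest.read_text(encoding="utf-8"))

    checksums: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            checksums[path.relative_to(root).as_posix()] = digest
    return checksums
```

Calling `check_artifact("data")` on your artifact directory and keeping the returned checksums alongside the upload link gives reviewers a simple way to confirm they downloaded the same files you scored.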
Required Metadata:
- Model name and version
- Paper link (arXiv, DOI, or GitHub)
- Runtime statistics (see RetroCast examples)
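One way to make sure nothing is left out of the submission issue is to collect the metadata in a small structure first. This is a sketch only: the field names, the runtime statistic keys, and the issue text layout are assumptions for illustration, not a format required by SynthArena.

```python
from dataclasses import dataclass, field


@dataclass
class SubmissionMetadata:
    """Hypothetical container for the metadata listed above."""

    model_name: str
    model_version: str
    paper_link: str  # arXiv, DOI, or GitHub
    artifact_url: str  # public link to the uploaded artifact
    runtime_stats: dict[str, str] = field(default_factory=dict)

    def missing_fields(self) -> list[str]:
        """Names of required fields that are empty."""
        required = {
            "model_name": self.model_name,
            "model_version": self.model_version,
            "paper_link": self.paper_link,
            "artifact_url": self.artifact_url,
        }
        missing = [name for name, value in required.items() if not value.strip()]
        if not self.runtime_stats:
            missing.append("runtime_stats")
        return missing

    def to_issue_body(self) -> str:
        """Render the metadata as plain text for pasting into the issue."""
        lines = [
            f"Model: {self.model_name} ({self.model_version})",
            f"Paper: {self.paper_link}",
            f"Artifact: {self.artifact_url}",
            "Runtime statistics:",
        ]
        lines += [f"- {key}: {value}" for key, value in self.runtime_stats.items()]
        return "\n".join(lines)


# Placeholder values; replace with your own before submitting.
meta = SubmissionMetadata(
    model_name="ExampleModel",
    model_version="1.0",
    paper_link="https://example.org/paper",
    artifact_url="https://example.org/artifact",
    runtime_stats={"hardware": "1x GPU", "wall_time": "2h"},
)
print(meta.to_issue_body())
```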
If your paper is not yet public and you need anonymity (stealth submission):
1. Run the RetroCast pipeline as described in the public workflow
2. Email the artifact to [email protected] with the subject “STEALTH SUBMISSION”
3. We will assign your model a permanent random codename (e.g., Project-Blue-Falcon)
4. The model appears on the leaderboard immediately with verified metrics, but no author/institution data
5. Unmasking: When your paper is published, email us to link the codename to your real identity
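To illustrate the codename step, here is one way a permanent random codename in the Project-Blue-Falcon style could be drawn. The word lists and the `assign_codename` helper are invented for this example; how the SynthArena team actually assigns codenames is not specified here.

```python
import secrets

# Hypothetical word lists; any sufficiently large lists would do.
ADJECTIVES = ["Blue", "Amber", "Silent", "Crimson", "Silver"]
NOUNS = ["Falcon", "Otter", "Comet", "Harbor", "Lynx"]


def assign_codename(taken: set[str]) -> str:
    """Draw random codenames until an unused one is found, then record it."""
    # Conservative guard: guarantees the loop below can terminate.
    if len(taken) >= len(ADJECTIVES) * len(NOUNS):
        raise RuntimeError("codename space exhausted")
    while True:
        name = f"Project-{secrets.choice(ADJECTIVES)}-{secrets.choice(NOUNS)}"
        if name not in taken:
            taken.add(name)
            return name


taken_names: set[str] = set()
print(assign_codename(taken_names))
```

Recording each assigned name in `taken` is what makes the codename permanent and unique: once issued, it is never handed to another submission.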
Technical Documentation: For detailed instructions on running RetroCast, see the RetroCast documentation. Example runtime configurations can be found in the RetroCast repository scripts.
Licensing: By submitting routes to SynthArena, you agree to provide them under the CC-BY 4.0 license, allowing the community to inspect and learn from your model's predictions.
Updates and Retractions
If a community audit reveals that a submitted model produced corrupted data (e.g., invalid SMILES that bypassed the adapter checks due to a bug), we reserve the right to flag the entry as “Disputed” or remove it entirely.
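A community audit for invalid SMILES could start with something like the sketch below. It assumes the routes have already been flattened into a list of SMILES strings and that RDKit is installed; `Chem.MolFromSmiles` returns `None` when it cannot parse a string. This is not a reproduction of RetroCast's adapter checks, and the helper names are hypothetical.

```python
from typing import Callable, Iterable


def rdkit_is_valid(smiles: str) -> bool:
    """Treat a SMILES string as valid if RDKit can parse it into a molecule."""
    from rdkit import Chem  # imported lazily so the module loads without RDKit

    # Empty strings parse to an empty molecule in RDKit, so reject them here.
    return bool(smiles) and Chem.MolFromSmiles(smiles) is not None


def find_invalid_smiles(
    smiles_list: Iterable[str],
    is_valid: Callable[[str], bool] = rdkit_is_valid,
) -> list[str]:
    """Return the strings that fail the validity check, in input order."""
    return [s for s in smiles_list if not is_valid(s)]
```

Passing the validator as a parameter keeps the audit logic independent of any one toolkit, so a stricter check can be swapped in when reporting evidence in an issue.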
To report issues with submitted models, please open an issue on the SynthArena GitHub repository with detailed evidence and analysis.
Model authors may submit updated results by following the same submission process. Previous versions will be archived for transparency.