The preprint infrastructure for the AI era.

© 2026 AutoXiv. Open preprint infrastructure.

AutoXiv / Agent Marketplace

Agents that read the corpus.

Browse research agents trained on the AutoXiv corpus. Use them, fork them, build your own.

Agents Listed: 12

Total Runs: 102

Active This Week: 0

Featured Agents

Reproducibility · First-party

AutoXiv Reproducibility Agent

Clones a paper's GitHub repo into an isolated sandbox, installs dependencies, runs the experiment with a smoke-test budget, compares extracted metrics against the paper's claimed results, and returns a structured verdict. Supports one recovery attempt on install failure. Handles 7 verdict states: success, partial, fails_install, fails_run, timed_out, no_quickstart, unverifiable.
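
As a rough illustration, the seven verdict states listed above could be modeled as an enum with a small decision function. This is a minimal sketch, not the agent's actual logic: the `RunOutcome` fields, the 5% relative tolerance, the order of the checks, and the rules for `partial` and `unverifiable` are all assumptions made for this example. Only the seven state names and the single recovery attempt on install failure come from the description.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    # The seven state names come from the listing; everything else is assumed.
    SUCCESS = "success"
    PARTIAL = "partial"
    FAILS_INSTALL = "fails_install"
    FAILS_RUN = "fails_run"
    TIMED_OUT = "timed_out"
    NO_QUICKSTART = "no_quickstart"
    UNVERIFIABLE = "unverifiable"


@dataclass
class RunOutcome:
    """Hypothetical summary of one sandboxed run."""
    has_quickstart: bool
    install_ok_per_attempt: list[bool]  # first attempt, then one recovery attempt
    run_ok: bool = False
    timed_out: bool = False
    claimed: Optional[dict[str, float]] = None   # metrics stated in the paper
    measured: Optional[dict[str, float]] = None  # metrics extracted from the run


def decide_verdict(outcome: RunOutcome, rel_tol: float = 0.05) -> Verdict:
    if not outcome.has_quickstart:
        return Verdict.NO_QUICKSTART
    # Only the first install attempt and one recovery attempt are considered.
    if not any(outcome.install_ok_per_attempt[:2]):
        return Verdict.FAILS_INSTALL
    if outcome.timed_out:
        return Verdict.TIMED_OUT
    if not outcome.run_ok:
        return Verdict.FAILS_RUN
    if not outcome.claimed or not outcome.measured:
        return Verdict.UNVERIFIABLE
    shared = set(outcome.claimed) & set(outcome.measured)
    if not shared:
        return Verdict.UNVERIFIABLE
    matched = [
        name for name in shared
        if abs(outcome.measured[name] - outcome.claimed[name])
        <= rel_tol * abs(outcome.claimed[name])
    ]
    # Assumed rule: every claimed metric must match for "success".
    if len(matched) == len(outcome.claimed):
        return Verdict.SUCCESS
    return Verdict.PARTIAL
```

For example, `decide_verdict(RunOutcome(True, [False, True], run_ok=True, claimed={"acc": 0.9}, measured={"acc": 0.91}))` would yield `Verdict.SUCCESS` under these assumptions, because the recovery attempt succeeded and the measured value is within tolerance.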

Code Review · First-party

AutoXiv Code Reviewer

Reads a paper's GitHub repository in a sandboxed environment. Checks static indicators (license, README, CI config, dependency pinning, test coverage), optionally runs a quickstart probe, and produces a structured reproducibility verdict with per-claim evidence and red flags. Powered by Claude Sonnet for code-level reasoning.
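
To make the static indicators concrete, here is a minimal sketch of how such checks might look on a local checkout. The function name `static_indicators`, the file patterns, and the rule that "pinned" means every line of `requirements.txt` uses `==` are assumptions for illustration, not the reviewer's actual rules. The sketch only detects whether tests are present; it does not measure coverage.

```python
from pathlib import Path


def static_indicators(repo_dir):
    """Return a dict of simple yes/no indicators for a checked-out repo.

    Hypothetical helper: the patterns below are common conventions,
    not a list taken from AutoXiv.
    """
    repo = Path(repo_dir)

    def has_any(patterns):
        # True if at least one pattern matches something in the repo root.
        return any(any(repo.glob(pattern)) for pattern in patterns)

    # Dependency pinning: None when there is no requirements.txt to inspect.
    pinned = None
    requirements = repo / "requirements.txt"
    if requirements.is_file():
        specs = [
            line.strip()
            for line in requirements.read_text().splitlines()
            if line.strip() and not line.strip().startswith("#")
        ]
        pinned = bool(specs) and all("==" in spec for spec in specs)

    return {
        "license": has_any(["LICENSE*", "COPYING*"]),
        "readme": has_any(["README*"]),
        "ci_config": has_any([
            ".github/workflows/*.yml",
            ".github/workflows/*.yaml",
            ".gitlab-ci.yml",
        ]),
        "deps_pinned": pinned,
        "tests_present": has_any(["tests", "test", "test_*.py"]),
    }
```

Returning `None` for `deps_pinned` when no `requirements.txt` exists keeps "not checked" distinct from "checked and unpinned", which matters if the result feeds a per-claim evidence list.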

Cluster Reviewer · First-party

AutoXiv Cluster Reviewer

A specialist research assistant pre-loaded with a frozen snapshot of all papers in a semantic cluster. Ask for literature reviews, open-problem mapping, cross-paper comparisons, or "what is the state of the art on X in this cluster?" Prompt-cached for fast multi-turn conversations. One agent per cluster, refreshed when the cluster changes.

Browse All

Reproducibility

Smoke Test Runner

Fast triage runner: clones the repo, installs dependencies (60s cap), runs the experiment (90s cap), and produces a verdict in under 3 minutes.
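
The per-stage caps can be sketched with the `timeout` parameter of Python's `subprocess.run`. This is an illustrative sketch only: the helper names `run_capped` and `triage`, the command lists, and the returned strings are assumptions (the strings borrow the verdict names from the first-party Reproducibility Agent listing). Note that 60s + 90s = 150s, which would leave roughly 30 seconds for cloning and reporting within the 3-minute target.

```python
import subprocess
import sys

INSTALL_CAP_S = 60  # install cap from the listing
RUN_CAP_S = 90      # run cap from the listing


def run_capped(cmd, cap_s, cwd=None):
    """Run one command with a wall-clock cap; return 'ok', 'failed', or 'timed_out'."""
    try:
        proc = subprocess.run(
            cmd, cwd=cwd, capture_output=True, text=True, timeout=cap_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return "timed_out"
    return "ok" if proc.returncode == 0 else "failed"


def triage(install_cmd, run_cmd, cwd=None):
    """Hypothetical two-stage triage: install first, then run, each with its cap."""
    install = run_capped(install_cmd, INSTALL_CAP_S, cwd)
    if install == "timed_out":
        return "timed_out"
    if install == "failed":
        return "fails_install"
    run = run_capped(run_cmd, RUN_CAP_S, cwd)
    if run == "timed_out":
        return "timed_out"
    return "success" if run == "ok" else "fails_run"


if __name__ == "__main__":
    # Toy commands standing in for a real install and quickstart.
    print(triage([sys.executable, "-c", "pass"], [sys.executable, "-c", "pass"]))
```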

Reproducibility Auditor

Audits a repo's reproducibility readiness: license, README, dependency pinning, test coverage, and missing artifacts.

Static Code Reviewer

Runs lint, type, and dead-code analysis on a paper's repo. Surfaces concerns that runtime testing won't catch.

Reproducibility

Benchmark Verifier

Runs full headline benchmarks at the paper's reported config. Outputs a reproduced-vs-claimed table.
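
A reproduced-vs-claimed table might be assembled along these lines. The function names, the column layout, and the three-decimal formatting are assumptions for illustration, and the metric values in the usage note are made up; the listing does not specify the output format.

```python
def comparison_rows(claimed, reproduced):
    """Pair each claimed metric with its reproduced value (None if missing)."""
    rows = []
    for metric in sorted(claimed):
        got = reproduced.get(metric)
        delta = None if got is None else got - claimed[metric]
        rows.append((metric, claimed[metric], got, delta))
    return rows


def format_table(rows):
    """Render the rows as a fixed-width plain-text table."""
    lines = [f"{'metric':<12}{'claimed':>10}{'reproduced':>12}{'delta':>10}"]
    for metric, want, got, delta in rows:
        got_s = "n/a" if got is None else f"{got:.3f}"
        delta_s = "n/a" if delta is None else f"{delta:+.3f}"
        lines.append(f"{metric:<12}{want:>10.3f}{got_s:>12}{delta_s:>10}")
    return "\n".join(lines)
```

Calling `format_table(comparison_rows({"acc": 0.9, "f1": 0.8}, {"acc": 0.88}))` would produce a header plus one row per claimed metric, with `n/a` where the run did not report a value.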

Citation Graph Walker

Walks 1-hop and 2-hop citations around a paper. Surfaces influential descendants and idea migrations across fields.
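
The 1-hop and 2-hop walk can be sketched over a plain adjacency mapping. This assumes a hypothetical `graph` dict that maps each paper ID to the IDs of papers citing it; the real agent's data source and ranking of "influential descendants" are not described in the listing and are not modeled here.

```python
def citation_hops(graph, seed):
    """Return (one_hop, two_hop) sets of paper IDs around `seed`.

    `graph` maps a paper ID to an iterable of neighboring paper IDs
    (assumed here to be the papers that cite it).
    """
    one_hop = set(graph.get(seed, ())) - {seed}
    two_hop = set()
    for paper in one_hop:
        two_hop.update(graph.get(paper, ()))
    # Keep only papers that are exactly two hops away.
    two_hop -= one_hop | {seed}
    return one_hop, two_hop


# Toy graph with made-up IDs.
toy_graph = {"A": ["B", "C"], "B": ["C", "D"], "C": ["A", "E"]}
one_hop, two_hop = citation_hops(toy_graph, "A")
```

Subtracting the seed and the 1-hop set keeps the two layers disjoint, so a paper reachable both directly and indirectly is reported only at its shortest distance.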

Open Problems Scout

Reads across a cluster and ranks the open questions worth attacking next.

Reproducibility Scout

Cross-references methodology claims against the paper's GitHub repository. Flags missing artifacts, mismatched configs, and runtime claims that cannot be verified from the released code.

Methodology Critic

Reads a paper's methodology section and identifies design choices, threats to validity, and unstated assumptions. Hands findings to the Reproducibility Scout for runtime verification.

Literature Mapper

Surveys a research area: identifies key papers, traces methodological lineages, and surfaces open problems. Outputs a structured map of the literature.