Getting Started
Install AgenticSwarmBench and run your first benchmark in under 2 minutes.
Installation
AgenticSwarmBench is available on PyPI and requires Python 3.9+. uv is recommended, but pip works too.
uv pip install agentic-swarm-bench   # with uv (recommended)
pip install agentic-swarm-bench      # or with pip
For proxy support (required for asb agent, asb record, and Anthropic passthrough):
uv pip install "agentic-swarm-bench[proxy]"
Record a Real Session, Then Replay It Anywhere
The fastest way to get meaningful numbers: record what you actually do with your agent, then replay that exact session against any endpoint.
1. Start the recording proxy
asb record -e http://your-gpu-server:8000 -m your-model
2. Point your agent at the proxy (listens on localhost:19000)
ANTHROPIC_BASE_URL=http://localhost:19000 claude
3. Do your normal work, Ctrl+C when done. Then replay anywhere.
asb replay \
-e http://new-server:8000 \
-m my-model \
-w my-session.jsonl
This captures your real context patterns, real token counts, and real multi-turn behavior, then lets you A/B test endpoints with your actual workload. See Record & Replay for the full workflow.
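To A/B test endpoints with one recording, the replay command can be run once per endpoint. The sketch below is one way to script that; `replay_all` is a hypothetical helper, not part of asb. It uses only the `-e`, `-m`, and `-w` flags shown above and assumes every endpoint serves the model under the same name.

```shell
#!/bin/sh
# Sketch: replay one recorded session against several endpoints in turn.
# replay_all is a hypothetical helper, not part of asb.
# Usage: replay_all WORKLOAD MODEL ENDPOINT [ENDPOINT...]
replay_all() {
  if [ "$#" -lt 3 ]; then
    echo "usage: replay_all WORKLOAD MODEL ENDPOINT [ENDPOINT...]" >&2
    return 2
  fi
  ra_workload=$1
  ra_model=$2
  shift 2
  for ra_endpoint in "$@"; do
    echo "== replaying $ra_workload against $ra_endpoint =="
    # Same flags as the replay command above; stop at the first failure.
    asb replay -e "$ra_endpoint" -m "$ra_model" -w "$ra_workload" || return 1
  done
}

# Example:
#   replay_all my-session.jsonl my-model http://your-gpu-server:8000 http://new-server:8000
```

Running the endpoints one after another, rather than in parallel, keeps each replay from competing with the others for client-side bandwidth and CPU.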
Run a Synthetic Speed Test
If you don't have a recording yet, asb speed generates realistic agentic context synthetically and sweeps context sizes and concurrency levels.
asb speed \
--endpoint http://localhost:8000 \
--model my-model \
--suite quick
asb is the short alias. agentic-swarm-bench also works.
Full Suite with Report
Sweep all context sizes (6K → 400K) and concurrency levels. Generate a Markdown report with verdicts, charts, and recommendations.
asb speed \
--endpoint http://localhost:8000 \
--model my-model \
--suite full \
--output report.md
Endpoint URL Handling
Pass any URL. If it doesn't end with /v1/chat/completions, the path is appended automatically. Both of these work:
asb speed -e http://localhost:8000 -m my-model
asb speed -e https://api.example.com/v1/chat/completions -m my-model
Authentication
By default, --api-key is sent as Authorization: Bearer <key>. If your endpoint uses a different header:
asb speed -e URL -m MODEL -k MY_KEY --api-key-header X-API-Key
Dry Run
Preview exactly what will be sent to the endpoint without making any requests. Useful for validating configuration.
asb speed -e URL -m MODEL --dry-run
Docker Quickstart
Run without installing Python. Results are mounted to your host via volume:
docker run --rm -v "$(pwd)/results:/results" \
swarmone/agentic-swarm-bench speed \
--endpoint http://host.docker.internal:8000 \
--model my-model \
--suite quick \
--output /results/report.md
Use host.docker.internal to reach services running on your host machine from inside the container.
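On a plain Linux Docker Engine, host.docker.internal is not defined by default the way it is in Docker Desktop. The sketch below wraps the command above with `--add-host=host.docker.internal:host-gateway` (available in Docker Engine 20.10 and later) and creates the results directory up front, since Docker otherwise creates a missing bind-mount source itself, typically owned by root. `run_asb_docker` is a hypothetical helper, not something the project ships.

```shell
#!/bin/sh
# Sketch: run the Docker quickstart so that it also works on a Linux host.
# run_asb_docker is a hypothetical helper, not part of the project.
# Usage: run_asb_docker SUBCOMMAND [ARGS...]
run_asb_docker() {
  rad_results="$(pwd)/results"
  # Create the results directory first so it belongs to the current user.
  mkdir -p "$rad_results" || return 1
  # --add-host maps host.docker.internal to the host gateway; Docker Desktop
  # already resolves the name, and the extra mapping should be harmless there.
  docker run --rm \
    --add-host=host.docker.internal:host-gateway \
    -v "$rad_results:/results" \
    swarmone/agentic-swarm-bench "$@"
}

# Example:
#   run_asb_docker speed \
#     --endpoint http://host.docker.internal:8000 \
#     --model my-model \
#     --suite quick \
#     --output /results/report.md
```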
Next Steps
- Deep dive on Record & Replay - the headline feature
- Learn about all 5 CLI modes
- Understand context profiles and prefix cache poisoning
- Read about reports and verdicts