A standalone benchmark dashboard for AgentX - visualizing and comparing performance metrics across multiple LLM providers (OpenAI, Anthropic, Google Gemini, Ollama).
- 📊 Real-time Performance Metrics: TTFT, throughput, success rates
- 📈 Historical Trend Analysis: Track performance over time
- 💰 Cost Comparison: Compare pricing across providers
- 🎯 Category Breakdown: Performance by task type (math, coding, reasoning)
- 🚀 Static Deployment: Runs entirely in browser with SQL.js
- 📱 Responsive Design: Works on desktop and mobile
Visit the live dashboard at: https://codenotary.github.io/agentx-benchmark-ui/
# Clone the repository
git clone https://github.com/codenotary/agentx-benchmark-ui.git
cd agentx-benchmark-ui
# Install dependencies
npm install
# Run development server
npm run dev:static
# Open http://localhost:5173
If you have AgentX running benchmarks:
# Copy latest benchmark database from AgentX
cp ../agentx/benchmark_history.db .
# Update and optimize for web
npm run update-db
# Deploy to GitHub Pages
npm run deploy
- Place your benchmark_history.db file in the root directory
- Run npm run update-db
- Deploy with npm run deploy
- Fork this repository
- Enable GitHub Pages:
  - Go to Settings → Pages
  - Source: Deploy from branch
  - Branch: gh-pages
  - Path: / (root)
- Update configuration:
  # Edit package.json and vite.config.ts
  # Replace codenotary with your GitHub username
- Deploy:
  npm run deploy
Your dashboard will be available at: https://<your-username>.github.io/agentx-benchmark-ui/
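When you deploy a fork, GitHub Pages serves the site from a project path under your own account, so both the public URL and the asset base path depend on your username and repository name. The sketch below derives both values. The helper names `pagesBasePath` and `pagesUrl` are hypothetical and are not part of this repository. It also assumes that vite.config.ts sets the sub-path through Vite's standard `base` option.

```typescript
// Hypothetical helpers for illustration; they are not part of this repository.

/** Base path a GitHub Pages project site is served from, e.g. "/agentx-benchmark-ui/". */
function pagesBasePath(repo: string): string {
  // Strip any leading or trailing slashes, then wrap the name in exactly one pair.
  return `/${repo.replace(/^\/+|\/+$/g, "")}/`;
}

/** Public URL of a GitHub Pages project site for a given user and repository. */
function pagesUrl(username: string, repo: string): string {
  return `https://${username}.github.io${pagesBasePath(repo)}`;
}

console.log(pagesBasePath("agentx-benchmark-ui"));
console.log(pagesUrl("your-username", "agentx-benchmark-ui"));
```

Under that assumption, the first value is what `base` would be set to in vite.config.ts, and the second is the address the deployed dashboard would be reachable at.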
Create .github/workflows/update-dashboard.yml:
name: Update Dashboard
on:
  workflow_dispatch:
    inputs:
      database_url:
        description: 'URL to benchmark_history.db file'
        required: false
  schedule:
    - cron: '0 0 * * *' # Daily at midnight
jobs:
  update:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pages: write
    steps:
      - uses: actions/checkout@v3
      - name: Setup Node
        uses: actions/setup-node@v3
        with:
          node-version: '18'
      - name: Install dependencies
        run: npm install
      - name: Download database (if URL provided)
        if: github.event.inputs.database_url
        run: |
          wget -O benchmark_history.db "${{ github.event.inputs.database_url }}"
      - name: Update and deploy
        run: |
          npm run update-db
          npm run build:static
      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./dist
- Frontend: React + TypeScript + Vite
- UI Components: Tailwind CSS + Recharts
- Database: SQLite via SQL.js-httpvfs
- Deployment: GitHub Pages (static hosting)
┌─────────────────┐
│  GitHub Pages   │
│  (Static CDN)   │
└────────┬────────┘
         │
    ┌────▼────┐
    │  React  │
    │   SPA   │
    └────┬────┘
         │
    ┌────▼────────┐
    │   SQL.js    │
    │ Web Worker  │
    └────┬────────┘
         │
    ┌────▼────────┐
    │benchmark.db │
    │   (96KB)    │
    └─────────────┘
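In the flow above, query results travel from the SQL.js worker to the React components. sql.js's `Database.exec` returns each result set as a list of column names plus an array of value rows, which is usually less convenient for components than plain objects. The following is a minimal sketch of that conversion. The helper `toRows` and the sample data are hypothetical, and whether the dashboard does this itself or relies on a wrapper that already returns objects is an assumption.

```typescript
// Shape of one result set as returned by sql.js `Database.exec`.
type SqlValue = number | string | Uint8Array | null;

interface QueryExecResult {
  columns: string[];
  values: SqlValue[][];
}

/** Convert a column/values result set into one plain object per row. */
function toRows(result: QueryExecResult): Record<string, SqlValue>[] {
  return result.values.map((row) => {
    const obj: Record<string, SqlValue> = {};
    result.columns.forEach((col, i) => {
      // Missing cells are treated as NULL.
      obj[col] = row[i] ?? null;
    });
    return obj;
  });
}

// Hypothetical sample; the column names are illustrative only.
const sample: QueryExecResult = {
  columns: ["provider", "success_rate"],
  values: [
    ["openai", 0.98],
    ["ollama", 0.91],
  ],
};

const rows = toRows(sample);
console.log(rows);
```

Keeping the conversion in one small function means chart components can consume row objects without knowing how the database layer represents results.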
The dashboard reads from these main tables:
- benchmark_runs: Overall benchmark execution metadata
- test_results: Individual test results with timings
- model_performance: Aggregated performance metrics
- category_performance: Performance breakdown by category
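To illustrate how per-test rows relate to the aggregated tables, here is a rough sketch that rolls individual results up into a success rate, mean TTFT, and throughput per provider. The field names (`provider`, `success`, `ttftMs`, `outputTokens`, `durationMs`) are assumptions made for this example and may not match the actual column names in benchmark_history.db. The `summarize` function is likewise hypothetical.

```typescript
// Hypothetical row shape; the real test_results columns may differ.
interface TestResult {
  provider: string;
  success: boolean;
  ttftMs: number; // time to first token, in milliseconds
  outputTokens: number;
  durationMs: number;
}

interface ProviderSummary {
  runs: number;
  successRate: number; // fraction of runs that succeeded, 0..1
  meanTtftMs: number; // averaged over successful runs only
  tokensPerSecond: number; // total tokens / total time of successful runs
}

function summarize(results: TestResult[]): Record<string, ProviderSummary> {
  // Group rows by provider.
  const groups: Record<string, TestResult[]> = {};
  for (const r of results) {
    const group = groups[r.provider] ?? [];
    group.push(r);
    groups[r.provider] = group;
  }

  const summary: Record<string, ProviderSummary> = {};
  for (const provider of Object.keys(groups)) {
    const rs = groups[provider] ?? [];
    const ok = rs.filter((r) => r.success);
    const totalTokens = ok.reduce((sum, r) => sum + r.outputTokens, 0);
    const totalMs = ok.reduce((sum, r) => sum + r.durationMs, 0);
    const totalTtft = ok.reduce((sum, r) => sum + r.ttftMs, 0);
    summary[provider] = {
      runs: rs.length,
      successRate: ok.length / rs.length,
      meanTtftMs: ok.length > 0 ? totalTtft / ok.length : 0,
      tokensPerSecond: totalMs > 0 ? totalTokens / (totalMs / 1000) : 0,
    };
  }
  return summary;
}

// Made-up sample data for illustration.
const summary = summarize([
  { provider: "openai", success: true, ttftMs: 200, outputTokens: 100, durationMs: 1000 },
  { provider: "openai", success: true, ttftMs: 400, outputTokens: 300, durationMs: 3000 },
  { provider: "ollama", success: true, ttftMs: 100, outputTokens: 50, durationMs: 500 },
  { provider: "ollama", success: false, ttftMs: 0, outputTokens: 0, durationMs: 0 },
]);
console.log(summary);
```

Failed runs count toward the success rate but are excluded from the timing averages, so a provider that fails quickly does not appear faster than it is.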
agentx-benchmark-ui/
├── src/
│ ├── components/ # React components
│ ├── services/ # SQLite and API services
│ └── types/ # TypeScript definitions
├── public/
│ ├── benchmark.db # SQLite database
│ └── sql-wasm.wasm # SQL.js WebAssembly
├── scripts/
│ └── update-db.sh # Database update script
└── vite.config.ts # Vite configuration
- npm run dev - Start development server
- npm run dev:static - Development with static SQLite
- npm run build - Build for production
- npm run build:static - Build with embedded database
- npm run deploy - Deploy to GitHub Pages
- npm run update-db - Update and optimize database
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Open a Pull Request
This project is licensed under the Apache License 2.0.
- AgentX - The main multi-agent orchestration system
- AgentX Benchmark Tool - CLI tool for running benchmarks
Copyright © 2025 Codenotary Inc