AgentX Benchmark UI

A standalone benchmark dashboard for AgentX that visualizes and compares performance metrics across multiple LLM providers (OpenAI, Anthropic, Google Gemini, Ollama).

🌐 View Live Dashboard: https://codenotary.github.io/agentx-benchmark-ui/

Features

- 📊 Real-time Performance Metrics: time to first token (TTFT), throughput, success rates
- 📈 Historical Trend Analysis: Track performance over time
- 💰 Cost Comparison: Compare pricing across providers
- 🎯 Category Breakdown: Performance by task type (math, coding, reasoning)
- 🚀 Static Deployment: Runs entirely in the browser with SQL.js
- 📱 Responsive Design: Works on desktop and mobile
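
As a rough illustration of how the three headline metrics relate to raw timings, here is a minimal sketch. The `Sample` shape, its field names, and the `summarize` helper are hypothetical and not taken from the AgentX codebase. Throughput is assumed here to mean output tokens divided by total request time, which may differ from the definition the dashboard uses.

```typescript
// Hypothetical shape of one benchmark sample; field names are illustrative only.
interface Sample {
  success: boolean;
  requestStartMs: number; // when the request was sent
  firstTokenMs: number;   // when the first token arrived
  endMs: number;          // when the response finished
  outputTokens: number;   // tokens produced by the model
}

interface Summary {
  meanTtftMs: number;      // mean time to first token, successful samples only
  tokensPerSecond: number; // assumed definition: total tokens / total request time
  successRate: number;     // successful samples / all samples
}

function summarize(samples: Sample[]): Summary {
  const ok = samples.filter((s) => s.success);
  const successRate = samples.length === 0 ? 0 : ok.length / samples.length;
  if (ok.length === 0) {
    return { meanTtftMs: 0, tokensPerSecond: 0, successRate };
  }
  const totalTtftMs = ok.reduce((sum, s) => sum + (s.firstTokenMs - s.requestStartMs), 0);
  const totalTokens = ok.reduce((sum, s) => sum + s.outputTokens, 0);
  const totalSeconds = ok.reduce((sum, s) => sum + (s.endMs - s.requestStartMs), 0) / 1000;
  return {
    meanTtftMs: totalTtftMs / ok.length,
    tokensPerSecond: totalSeconds > 0 ? totalTokens / totalSeconds : 0,
    successRate,
  };
}
```

Failed samples count against the success rate but are left out of the timing averages, so a provider that fails fast does not appear artificially quick.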

Quick Start

Option 1: Use Hosted Version

Visit the live dashboard at: https://codenotary.github.io/agentx-benchmark-ui/

Option 2: Run Locally

```
# Clone the repository
git clone https://github.com/codenotary/agentx-benchmark-ui.git
cd agentx-benchmark-ui

# Install dependencies
npm install

# Run development server
npm run dev:static

# Open http://localhost:5173
```

Updating Benchmark Data

From AgentX Repository

If you have AgentX running benchmarks:

```
# Copy latest benchmark database from AgentX
cp ../agentx/benchmark_history.db .

# Update and optimize for web
npm run update-db

# Deploy to GitHub Pages
npm run deploy
```

Manual Database Upload

1. Place your `benchmark_history.db` file in the root directory.
2. Run `npm run update-db`.
3. Deploy with `npm run deploy`.

GitHub Pages Deployment

Initial Setup

1. Fork this repository.
2. Enable GitHub Pages:
   - Go to Settings → Pages
   - Source: Deploy from branch
   - Branch: gh-pages
   - Path: / (root)
3. Update configuration: edit package.json and vite.config.ts, replacing codenotary with your GitHub username.
4. Deploy:

```
npm run deploy
```

Your dashboard will be available at https://<your-username>.github.io/agentx-benchmark-ui/ (the upstream deployment lives at https://codenotary.github.io/agentx-benchmark-ui/).
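
GitHub Pages serves a project site under `https://<username>.github.io/<repo>/`, so a Vite build deployed there generally needs its `base` option set to `/<repo>/`. The sketch below shows how the two values relate. `pagesSettings` is a hypothetical helper for illustration, not something in this repository, and it assumes the repository name is unchanged in your fork.

```typescript
// Hypothetical helper: derives the values you would put in vite.config.ts
// (base) and expect in the browser (url) for a GitHub Pages project site.
interface PagesSettings {
  base: string; // value for Vite's `base` option
  url: string;  // public URL of the deployed dashboard
}

function pagesSettings(username: string, repo: string): PagesSettings {
  return {
    base: `/${repo}/`,
    url: `https://${username}.github.io/${repo}/`,
  };
}

// Example with a placeholder username
const settings = pagesSettings("your-username", "agentx-benchmark-ui");
console.log(settings.base);
console.log(settings.url);
```

If you rename the fork, both the base path and the URL change with it, which is why the username and repository name are kept as separate parameters.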

Automated Updates with GitHub Actions

Create .github/workflows/update-dashboard.yml:

```
name: Update Dashboard

on:
  workflow_dispatch:
    inputs:
      database_url:
        description: 'URL to benchmark_history.db file'
        required: false
  schedule:
    - cron: '0 0 * * *'  # Daily at midnight UTC

jobs:
  update:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pages: write

    steps:
      - uses: actions/checkout@v3

      - name: Setup Node
        uses: actions/setup-node@v3
        with:
          node-version: '18'

      - name: Install dependencies
        run: npm install

      - name: Download database (if URL provided)
        if: github.event.inputs.database_url
        env:
          # Pass the input through an environment variable rather than
          # interpolating it into the script, to avoid shell injection.
          DATABASE_URL: ${{ github.event.inputs.database_url }}
        run: |
          wget -O benchmark_history.db "$DATABASE_URL"

      - name: Update database and build
        run: |
          npm run update-db
          npm run build:static

      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./dist
```

Technology Stack

- Frontend: React + TypeScript + Vite
- UI Components: Tailwind CSS + Recharts
- Database: SQLite via SQL.js-httpvfs
- Deployment: GitHub Pages (static hosting)

Architecture

```
┌─────────────────┐
│  GitHub Pages   │
│   (Static CDN)  │
└────────┬────────┘
         │
    ┌────▼────┐
    │  React  │
    │   SPA   │
    └────┬────┘
         │
    ┌────▼─────────┐
    │  SQL.js      │
    │  Web Worker  │
    └────┬─────────┘
         │
    ┌────▼─────────┐
    │ benchmark.db │
    │   (96KB)     │
    └──────────────┘
```

Data Schema

The dashboard reads from these main tables:

- benchmark_runs: Overall benchmark execution metadata
- test_results: Individual test results with timings
- model_performance: Aggregated performance metrics
- category_performance: Performance breakdown by category
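
SQL.js returns query results as a list of column names plus an array of value rows, which is awkward to feed to chart components directly. The sketch below shows one way to turn such a result into plain row objects. The `QueryResult` type mirrors the shape SQL.js produces, while `toRows` and the column names in the example (`provider`, `avg_ttft_ms`) are hypothetical and not the dashboard's actual schema.

```typescript
// Value types that can appear in a SQLite result cell.
type SqlValue = number | string | Uint8Array | null;

// Shape of one result set: column names plus positional value rows.
interface QueryResult {
  columns: string[];
  values: SqlValue[][];
}

// Convert positional rows into objects keyed by column name.
function toRows(result: QueryResult): Record<string, SqlValue>[] {
  return result.values.map((row) => {
    const obj: Record<string, SqlValue> = {};
    result.columns.forEach((col, i) => {
      obj[col] = row[i] ?? null;
    });
    return obj;
  });
}

// Made-up example data; real column names and values will differ.
const exampleResult: QueryResult = {
  columns: ["provider", "avg_ttft_ms"],
  values: [
    ["openai", 120],
    ["ollama", 45],
  ],
};
console.log(toRows(exampleResult));
```

Keeping the conversion in one small function means components can work with named fields and do not depend on the column order of a particular query.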

Development

Project Structure

```
agentx-benchmark-ui/
├── src/
│   ├── components/     # React components
│   ├── services/       # SQLite and API services
│   └── types/          # TypeScript definitions
├── public/
│   ├── benchmark.db    # SQLite database
│   └── sql-wasm.wasm   # SQL.js WebAssembly
├── scripts/
│   └── update-db.sh    # Database update script
└── vite.config.ts      # Vite configuration
```

Available Scripts

- `npm run dev` - Start development server
- `npm run dev:static` - Development with static SQLite
- `npm run build` - Build for production
- `npm run build:static` - Build with embedded database
- `npm run deploy` - Deploy to GitHub Pages
- `npm run update-db` - Update and optimize database

Contributing

1. Fork the repository.
2. Create your feature branch.
3. Commit your changes.
4. Push to the branch.
5. Open a Pull Request.

License

This project is licensed under the Apache License 2.0.

Related Projects

- AgentX - The main multi-agent orchestration system
- AgentX Benchmark Tool - CLI tool for running benchmarks
