context-labs/cliptagger-playground

ClipTagger Demo

ClipTagger Demo is a Next.js app that annotates images and video keyframes using the inference-net/cliptagger-12b model via the Inference.net Chat Completions API. It lets you:

  • Upload a single image for annotation
  • Drop a short video to extract 5 uniformly spaced frames client‑side and annotate each one
  • View strict JSON output with timing/attempt metadata
  • Copy ready‑to‑use API code samples (cURL, TypeScript, Python)

The server route keeps your API key on the server and returns only the model output to the client.


Quickstart

Prerequisites

  • Node.js 18+ (20+ recommended)
  • A package manager (pnpm, npm, or yarn)
  • Inference.net API key

1) Install dependencies

pnpm install
# or
npm install

Avoid having multiple lockfiles in the project to prevent install/build warnings. Use a single tool consistently.

2) Configure environment

Create a .env file in the project root:

INFERENCE_API_KEY=YOUR_API_KEY_HERE
# Optional: Upstash Redis for rate limiting.
# Remove or comment out the two lines below to run the API without rate limiting.
UPSTASH_REDIS_REST_URL="https://YOUR-UPSTASH-URL"
UPSTASH_REDIS_REST_TOKEN="YOUR_UPSTASH_TOKEN"
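
As a rough illustration of how these variables could be interpreted on the server, the sketch below requires the API key and turns rate limiting on only when both Upstash values are present. The `readConfig` helper and the `AppConfig` shape are hypothetical names chosen for this example; they are not taken from the repository.

```typescript
// Hypothetical helper: names and return shape are assumptions for illustration.
type Env = Record<string, string | undefined>;

interface AppConfig {
  apiKey: string;
  // null means "run without rate limiting".
  rateLimit: { url: string; token: string } | null;
}

function readConfig(env: Env): AppConfig {
  const apiKey = env.INFERENCE_API_KEY;
  if (!apiKey) {
    // Same message as the documented error response.
    throw new Error("Missing INFERENCE_API_KEY server environment variable");
  }
  const url = env.UPSTASH_REDIS_REST_URL;
  const token = env.UPSTASH_REDIS_REST_TOKEN;
  // Rate limiting is enabled only when both Upstash values are set.
  return { apiKey, rateLimit: url && token ? { url, token } : null };
}
```

In a Node.js process this would be called as `readConfig(process.env)`.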

3) Run the app

pnpm dev
# or
npm run dev

Visit http://localhost:3000 and try uploading an image or a video.

4) Production build

pnpm build && pnpm start
# or
npm run build && npm start

How it works

  • UI: app/page.tsx provides the image flow; components/VideoAnnotator.tsx extracts 5 frames from a dropped video and invokes the same API per frame.
  • Server: app/api/annotate/route.ts calls Inference.net’s Chat Completions endpoint with a strict prompt and returns parsed JSON.
  • Prompts: lib/prompts.ts contains the system and user prompts that shape the response.
  • Code samples: components/CodeModal.tsx + lib/code-snippets/* show how to call the upstream API directly (cURL, TS, Python).

Key model: inference-net/cliptagger-12b


API

POST /api/annotate

Accepts a base64 data URL for an image and returns structured JSON.

Request body:

{
  "imageDataUrl": "..."
}
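
For callers that start from raw image bytes, a data URL for the `imageDataUrl` field can be assembled as in the sketch below. The `toDataUrl` helper is a hypothetical name used for illustration, and it assumes a runtime where the global `btoa` is available (browsers and recent Node.js versions).

```typescript
// Hypothetical helper for building the imageDataUrl field; not part of the repository.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  // btoa expects a "binary string" with one character per byte.
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return "data:" + mimeType + ";base64," + btoa(binary);
}

// Build the JSON body expected by POST /api/annotate.
function buildAnnotateBody(bytes: Uint8Array, mimeType: string): string {
  return JSON.stringify({ imageDataUrl: toDataUrl(bytes, mimeType) });
}
```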

Successful response (shape):

{
  "success": true,
  "result": {
    "description": "...",
    "objects": ["..."],
    "actions": ["..."],
    "environment": "...",
    "content_type": "...",
    "specific_style": "...",
    "production_quality": "...",
    "summary": "...",
    "logos": ["..."]
  },
  "usage": {},
  "upstreamStatus": 200,
  "attempts": 1,
  "timings": { "upstreamMs": 0, "totalMs": 0 }
}
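
A client consuming this route may find it convenient to type the response. The interfaces below are a sketch inferred from the example shape above; the field types (for instance, arrays of strings for `objects`) are assumptions, and `isAnnotateSuccess` is a hypothetical helper name.

```typescript
// Types inferred from the documented example; treat them as assumptions.
interface AnnotationResult {
  description: string;
  objects: string[];
  actions: string[];
  environment: string;
  content_type: string;
  specific_style: string;
  production_quality: string;
  summary: string;
  logos: string[];
}

interface AnnotateSuccess {
  success: true;
  result: AnnotationResult;
  usage: Record<string, unknown>;
  upstreamStatus: number;
  attempts: number;
  timings: { upstreamMs: number; totalMs: number };
}

interface AnnotateError {
  error: string;
}

// Narrow an unknown JSON body to the success shape with a shallow check.
function isAnnotateSuccess(body: unknown): body is AnnotateSuccess {
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  return b.success === true && typeof b.result === "object" && b.result !== null;
}
```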

Error response (example):

{
  "error": "Missing INFERENCE_API_KEY server environment variable"
}

Test locally with cURL:

curl -X POST http://localhost:3000/api/annotate \
  -H 'Content-Type: application/json' \
  -d '{
    "imageDataUrl": "..."
  }'

Security note: the server route reads INFERENCE_API_KEY from the server environment and never exposes it to the browser.

If Upstash Redis variables are set (UPSTASH_REDIS_REST_URL, UPSTASH_REDIS_REST_TOKEN), the route enforces:

  • 100 requests per IP per hour
  • 500 requests per minute globally

If those variables are not set, no rate limiting is applied.
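
To make the two limits concrete, here is a minimal in-memory sketch of a fixed-window counter applied per IP and globally. It is illustrative only: the app uses Upstash Redis, the README does not say which windowing algorithm is used, and `FixedWindowLimiter` and `allowRequest` are hypothetical names.

```typescript
// In-memory illustration only; the real route relies on Upstash Redis.
class FixedWindowLimiter {
  private limit: number;
  private windowMs: number;
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request identified by `key` is allowed at time `nowMs`.
  allow(key: string, nowMs: number): boolean {
    const entry = this.counts.get(key);
    if (!entry || nowMs - entry.windowStart >= this.windowMs) {
      // Start a fresh window for this key.
      this.counts.set(key, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

const perIpLimiter = new FixedWindowLimiter(100, 60 * 60 * 1000); // 100 per IP per hour
const globalLimiter = new FixedWindowLimiter(500, 60 * 1000); // 500 per minute overall

function allowRequest(ip: string, nowMs: number): boolean {
  // Check the per-IP limit first so a blocked IP does not consume global budget.
  return perIpLimiter.allow(ip, nowMs) && globalLimiter.allow("global", nowMs);
}
```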


Direct API usage (optional)

If you prefer to call Inference.net directly from your own backend, see:

  • lib/code-snippets/curl-code.ts
  • lib/code-snippets/ts-code.ts
  • lib/code-snippets/python-code.ts

These examples target https://api.inference.net/v1/chat/completions with the inference-net/cliptagger-12b model and a strict JSON response format.
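
For orientation, a request payload for that endpoint might be built as sketched below. This assumes the OpenAI-compatible message layout (a `text` part plus an `image_url` part) and a `json_object` response format; both are assumptions, and the prompt strings are left as parameters because the real prompts live in lib/prompts.ts. The snippet files listed above remain the authoritative reference.

```typescript
const ENDPOINT = "https://api.inference.net/v1/chat/completions";
const MODEL = "inference-net/cliptagger-12b";

// Hypothetical builder; message layout and response_format are assumptions.
function buildChatRequest(imageDataUrl: string, systemPrompt: string, userPrompt: string) {
  return {
    model: MODEL,
    messages: [
      { role: "system", content: systemPrompt },
      {
        role: "user",
        content: [
          { type: "text", text: userPrompt },
          { type: "image_url", image_url: { url: imageDataUrl } },
        ],
      },
    ],
    // Ask the upstream API for JSON output.
    response_format: { type: "json_object" },
  };
}
```

The resulting object would be sent as the JSON body of a POST to `ENDPOINT`, with an `Authorization: Bearer` header carrying your API key.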


UI notes

  • Image upload accepts JPEG/PNG/WebP/GIF (client‑side size limit of about 4.5 MB).
  • Video workflow extracts 5 uniformly spaced frames client‑side and annotates each independently.
  • Result JSON is syntax‑highlighted and accompanied by timing/attempt metadata.
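
The upload constraints can be expressed as a small client-side check. The sketch below is not the repository's code: `validateImage` is a hypothetical name, and it assumes the 4.5 MB limit is counted in units of 1024 × 1024 bytes.

```typescript
const ACCEPTED_TYPES = ["image/jpeg", "image/png", "image/webp", "image/gif"];
// Assumption: "4.5MB" is interpreted as 4.5 * 1024 * 1024 bytes.
const MAX_BYTES = 4.5 * 1024 * 1024;

// Returns an error message, or null when the file is acceptable.
function validateImage(mimeType: string, sizeBytes: number): string | null {
  if (ACCEPTED_TYPES.indexOf(mimeType) === -1) {
    return "Unsupported image type: " + mimeType;
  }
  if (sizeBytes > MAX_BYTES) {
    return "Image exceeds the 4.5 MB client limit";
  }
  return null;
}
```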

You can change the number of video frames by editing NUM_FRAMES in components/VideoAnnotator.tsx.
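
One plausible way to pick uniformly spaced frames is to sample the midpoint of each equal-length segment of the video, which avoids the very first and last instants. The sketch below shows that calculation; `frameTimestamps` is a hypothetical helper, and the component may space its frames differently.

```typescript
// Hypothetical helper: midpoint sampling is an assumption about the spacing.
function frameTimestamps(durationSec: number, numFrames: number): number[] {
  if (durationSec <= 0 || numFrames <= 0) return [];
  const segment = durationSec / numFrames;
  const times: number[] = [];
  for (let i = 0; i < numFrames; i++) {
    // Midpoint of the i-th segment.
    times.push((i + 0.5) * segment);
  }
  return times;
}

// A 10-second clip with 5 frames yields timestamps at 1, 3, 5, 7, and 9 seconds.
const example = frameTimestamps(10, 5);
```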


Tech stack

  • Next.js 15, React 19
  • Tailwind CSS 4
  • Radix UI (Dialog, Tabs)
  • lucide-react icons
  • highlight.js for code/JSON display

Troubleshooting

  • 500 from /api/annotate: ensure INFERENCE_API_KEY is set server‑side.
  • Install/build warning about multiple lockfiles: use a single package manager and delete extra lockfiles.
  • Slow or inconsistent upstream responses: the server implements basic retries with backoff; try again or check your network/API quota.
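
For readers unfamiliar with the pattern, retrying with exponential backoff can be sketched as follows. The attempt count, base delay, and cap are illustrative assumptions, and `withRetry` and `backoffDelayMs` are hypothetical names; the server route's actual retry policy may differ.

```typescript
// Delay before the next try: base * 2^attempt, capped at maxMs (values are assumptions).
function backoffDelayMs(attempt: number, baseMs: number = 250, maxMs: number = 4000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Runs fn up to maxAttempts times and reports how many attempts were needed.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts: number = 3
): Promise<{ value: T; attempts: number }> {
  let lastError: unknown = new Error("withRetry: maxAttempts must be at least 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return { value: await fn(), attempts: attempt + 1 };
    } catch (err) {
      lastError = err;
      // Wait before the next try, but not after the final failure.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```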

Deploy

On platforms like Vercel:

  1. Set INFERENCE_API_KEY in your project’s environment variables.
  2. Deploy as usual; the API route runs on the Node.js runtime and does not expose your key to the client.

License

MIT
