Powerful PDF data extraction library powered by AI vision models. Transform PDFs into structured, validated data using TypeScript, Zod, and AI providers like Scaleway and Ollama.
Updated Sep 14, 2025 - TypeScript
AI-powered OCR for Diablo II: Resurrected - batch-extract item tooltips from screenshots using Vision LLMs (OpenAI, Groq, OpenRouter, LM Studio/Ollama). No Tesseract or EasyOCR needed.
Multimodal AI-powered medical assistant with LLMs, speech, and image understanding.
Car Damage Assessment using Vision LLM
This repository focuses on customizing the Qwen2.5-Vision model for specific tasks. It provides step-by-step guidance, scripts, and best practices for fine-tuning the model on custom datasets. Ideal for developers and researchers, it ensures optimal performance and accuracy tailored to unique use cases.
🖼️ Extract Diablo II: Resurrected item tooltips from screenshots in batches, using AI for accurate categorization and searchable databases.