Vision-Augmented Retrieval and Generation (VARAG) - Vision first RAG Engine
Updated Jul 23, 2025 - Python
Multimodal RAG to search and interact locally with technical documents of any kind
Official code release for ARTEMIS: Attention-based Retrieval with Text-Explicit Matching and Implicit Similarity (published at ICLR 2022)
Evaluation code and datasets for the ACL 2024 paper, VISTA: Visualized Text Embedding for Universal Multi-Modal Retrieval. The original code and model can be accessed at FlagEmbedding.
Official Implementation of GENIUS: A Generative Framework for Universal Multimodal Search, CVPR 2025
This repository contains the dataset and source files to reproduce the results in the publication Müller-Budack et al. 2021: "Multimodal news analytics using measures of cross-modal entity and context consistency", In: International Journal on Multimedia Information Retrieval (IJMIR), Vol. 10, Art. no. 2, 2021.
[CVPR 2025] Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval
Explores early-fusion and late-fusion approaches for multimodal medical image retrieval
A Survey of Multimodal Retrieval-Augmented Generation
The official code of "Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search"
Formalizing Multimedia Recommendation through Multimodal Deep Learning, accepted in ACM Transactions on Recommender Systems.
Multimodal retrieval in art with context embeddings.
The code used to train and run inference with MMDocIR
A list of research papers on knowledge-enhanced multimodal learning
Official Implementation of "Composed Object Retrieval: Object-level Retrieval via Composed Expressions"
A generalized self-supervised training paradigm for unimodal and multimodal alignment and fusion.
Mini-batch selective sampling for knowledge adaption of VLMs for mammography.
iPatent - Interactive Patent Search and Analysis
Evaluating dense model-based approaches for multimodal medical case retrieval.