This project builds a portrait similarity retrieval system that combines:
- Visual features from DINO Vision Transformer (ViT)
- Fast approximate nearest neighbor search with FAISS (using IVF+PQ index structure)
- Weak labels taken from filenames (e.g. "painting_001.jpg") to restrict the search space by class
The system supports query-by-example similarity search on artworks from the National Gallery of Art Open Access dataset.
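
As a rough illustration of the weak-label idea, the sketch below derives a class from a filename prefix and groups files by it. It assumes names follow a `<class>_<number>.<ext>` pattern, as in "painting_001.jpg"; the function names `weak_label` and `group_by_label` and the "drawing" class are hypothetical and not taken from the project code.

```python
from collections import defaultdict
from pathlib import Path


def weak_label(filename: str) -> str:
    """Return the class prefix of a name such as 'painting_001.jpg'.

    Assumes a '<class>_<number>.<ext>' naming pattern (an assumption
    based on the example filenames, not a documented rule).
    """
    stem = Path(filename).stem            # drop directory and extension
    prefix, _, _ = stem.rpartition("_")   # split on the last underscore
    return prefix or stem                 # no underscore: use the whole stem


def group_by_label(filenames):
    """Map each weak label to the list of filenames that carry it."""
    groups = defaultdict(list)
    for name in filenames:
        groups[weak_label(name)].append(name)
    return dict(groups)


# Hypothetical file list; 'drawing' is an invented class for illustration.
files = ["painting_001.jpg", "drawing_002.jpg", "painting_016.jpg"]
groups = group_by_label(files)
```

Here `groups["painting"]` holds the two painting files, so a query whose filename starts with `painting` would only be compared against that subset.
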
Below is an example output generated by `demo.ipynb`:
- Left: Query Portrait
- Right: Top-5 Similar Results retrieved from the same category (e.g., "painting")
| Step | Description |
|---|---|
| 1. Preprocessing | Download portrait artworks from NGA and assign weak labels from filenames (e.g. `painting_16.jpg`) |
| 2. Feature Extraction | Use DINO ViT to extract visual embeddings (768-dim) |
| 3. PCA Reduction | Apply PCA to reduce embeddings to 256 dimensions, improving retrieval speed and efficiency |
| 4. FAISS Indexing | Build a FAISS IVF+PQ index or Flat index over reduced embeddings |
| 5. Class-Aware Retrieval | Restrict each query to its own class using the filename prefix and sub-index filtering |
| 6. Interactive Demo | Visualize query + top-K results and save the output for reports or evaluation |
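
Steps 3 to 5 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the project's implementation: random vectors stand in for DINO embeddings, PCA is computed with a plain SVD, and an exact brute-force L2 search stands in for the FAISS index (a FAISS Flat index is also exact, while IVF+PQ is approximate). The helper names `fit_pca`, `project`, and `search_same_class`, and the "drawing" class, are hypothetical.

```python
import numpy as np


def fit_pca(x, n_components):
    """Fit PCA via SVD; return (mean, components) with components (n_components, d)."""
    mean = x.mean(axis=0)
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(x, mean, components):
    """Center x and project it onto the principal components."""
    return (x - mean) @ components.T


def search_same_class(query_vec, query_label, vectors, labels, k=5):
    """Exact L2 search restricted to rows whose label equals query_label."""
    candidates = np.flatnonzero(labels == query_label)
    dists = np.linalg.norm(vectors[candidates] - query_vec, axis=1)
    order = np.argsort(dists)[:k]
    return candidates[order], dists[order]


# Synthetic stand-in data: 400 random 768-dim vectors, two hypothetical classes.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((400, 768)).astype(np.float32)
labels = np.array(["painting"] * 250 + ["drawing"] * 150)

# Step 3: reduce 768-dim embeddings to 256 dimensions.
mean, components = fit_pca(embeddings, 256)
reduced = project(embeddings, mean, components)

# Steps 4-5: search only among items that share the query's weak label.
query_id = 3
ids, dists = search_same_class(reduced[query_id], labels[query_id], reduced, labels, k=5)
```

Because the query here is itself part of the searched set, the first hit is the query at distance zero; a real demo would likely drop it or request `k + 1` results. Filtering candidates by label before the distance computation is one way to get class-aware results; keeping a separate sub-index per class is another.
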


