- Baseline CNN for sign language gesture recognition
- Transfer learning using ResNet-50 and EfficientNet-B0
- Hand landmark extraction via MediaPipe, with a GRU-based classifier on the landmarks
- Comprehensive evaluation: accuracy, macro/micro F1, Cohen's kappa, confusion matrix
- Robustness testing (lighting, noise)
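The robustness tests listed above perturb the input images and re-measure accuracy. The following is a minimal sketch of that idea, not the repository's `src.eval.robustness` implementation: the helper names, the perturbation strengths, and the toy `dummy_predict` classifier are all assumptions made for illustration, and images are assumed to be float arrays scaled to [0, 1].

```python
import numpy as np


def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor` and clip back to [0, 1]."""
    return np.clip(image * factor, 0.0, 1.0)


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


def accuracy_under_perturbation(predict_fn, images, labels, perturb):
    """Accuracy of `predict_fn` on perturbed copies of `images`."""
    perturbed = np.stack([perturb(img) for img in images])
    predictions = predict_fn(perturbed)
    return float(np.mean(predictions == labels))


def dummy_predict(batch):
    """Hypothetical stand-in for a trained model: class 1 if the image is bright."""
    return (batch.mean(axis=(1, 2, 3)) > 0.5).astype(int)


rng = np.random.default_rng(0)
# Two dark images (class 0) and two bright images (class 1).
images = np.concatenate([np.full((2, 32, 32, 3), 0.2), np.full((2, 32, 32, 3), 0.8)])
labels = np.array([0, 0, 1, 1])

clean_acc = accuracy_under_perturbation(dummy_predict, images, labels, lambda x: x)
dark_acc = accuracy_under_perturbation(
    dummy_predict, images, labels, lambda x: adjust_brightness(x, 0.5)
)
noisy_acc = accuracy_under_perturbation(
    dummy_predict, images, labels, lambda x: add_gaussian_noise(x, 0.05, rng)
)
print(f"clean={clean_acc:.2f} dark={dark_acc:.2f} noisy={noisy_acc:.2f}")
```

To apply the same pattern to a real model, replace `dummy_predict` with a function that runs the model on a batch and returns the predicted class index for each image.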
The repository includes:
- Scripts for training and evaluation.
- Sample outputs (`results.json`, plots).
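For reference, the evaluation metrics named in the feature list (accuracy, macro and micro F1, Cohen's kappa, confusion matrix) can all be derived from the confusion matrix. The sketch below uses only the Python standard library and is an illustration, not the repository's evaluation code; the function names and the returned dictionary layout are assumptions. Classes with no true or predicted samples contribute an F1 of 0 to the macro average here, which may differ from the convention the evaluation script uses.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns index the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def summarize(y_true, y_pred, num_classes):
    """Compute accuracy, macro/micro F1, and Cohen's kappa from integer labels."""
    cm = confusion_matrix(y_true, y_pred, num_classes)
    n = len(y_true)
    classes = range(num_classes)
    tp = [cm[k][k] for k in classes]
    true_counts = [sum(cm[k]) for k in classes]                     # TP + FN
    pred_counts = [sum(cm[r][k] for r in classes) for k in classes]  # TP + FP

    accuracy = sum(tp) / n
    # Per-class F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
    f1 = [
        2 * tp[k] / (true_counts[k] + pred_counts[k])
        if (true_counts[k] + pred_counts[k]) else 0.0
        for k in classes
    ]
    macro_f1 = sum(f1) / num_classes
    # Micro F1 pools TP, FP, FN over classes; for single-label data it equals accuracy.
    micro_f1 = 2 * sum(tp) / (sum(true_counts) + sum(pred_counts))
    # Cohen's kappa compares observed agreement with chance agreement p_e.
    p_e = sum(true_counts[k] * pred_counts[k] for k in classes) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0
    return {
        "accuracy": accuracy,
        "macro_f1": macro_f1,
        "micro_f1": micro_f1,
        "kappa": kappa,
        "confusion_matrix": cm,
    }


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
report = summarize(y_true, y_pred, num_classes=3)
for name in ("accuracy", "macro_f1", "micro_f1", "kappa"):
    print(f"{name}: {report[name]:.3f}")
```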
```bash
# Create and activate a virtual environment, then install dependencies
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Train baseline CNN on mini-sample
python -m src.models.train_cnn --config configs/config.yaml
# Evaluate the trained baseline CNN, then test its robustness
python -m src.eval.evaluate --config configs/config.yaml --weights outputs/baseline_cnn_best.h5
python -m src.eval.robustness --config configs/config.yaml --weights outputs/baseline_cnn_best.h5

# Transfer Learning
python -m src.models.train_transfer --config configs/config.yaml --backbone resnet50
python -m src.models.train_transfer --config configs/config.yaml --backbone efficientnetb0

# Landmark + GRU (stub example)
python -m src.models.train_landmark_gru --config configs/config.yaml
```
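The landmark + GRU path works on hand keypoints rather than raw pixels. MediaPipe Hands returns 21 landmarks per detected hand, each with x, y, and z coordinates. One possible way to turn them into a fixed-shape GRU input is sketched below. The wrist-centred normalisation, the zero padding, and the names `normalize_landmarks` and `to_sequence` are assumptions for illustration; the feature layout actually used by `train_landmark_gru` is not shown in this README and may differ.

```python
import numpy as np

NUM_LANDMARKS = 21  # MediaPipe Hands returns 21 landmarks per detected hand


def normalize_landmarks(landmarks):
    """Centre landmarks on the wrist (index 0) and scale to unit size.

    `landmarks` is array-like with shape (21, 3) holding x, y, z values.
    Returns a flat float32 vector of length 63.
    """
    pts = np.asarray(landmarks, dtype=np.float32).reshape(NUM_LANDMARKS, 3)
    pts = pts - pts[0]  # translate so the wrist sits at the origin
    scale = np.linalg.norm(pts, axis=1).max()
    if scale > 0:
        pts = pts / scale  # farthest landmark ends up at distance 1
    return pts.reshape(-1)


def to_sequence(frames, seq_len):
    """Stack per-frame features into a (seq_len, 63) array.

    Longer inputs are truncated, shorter ones are zero-padded at the end.
    Expects at least one frame.
    """
    feats = np.stack([normalize_landmarks(f) for f in frames[:seq_len]])
    if len(feats) < seq_len:
        pad = np.zeros((seq_len - len(feats), feats.shape[1]), dtype=np.float32)
        feats = np.concatenate([feats, pad])
    return feats


rng = np.random.default_rng(0)
# Five frames of synthetic landmarks standing in for MediaPipe output.
frames = [rng.random((NUM_LANDMARKS, 3)) for _ in range(5)]
sequence = to_sequence(frames, seq_len=8)
print(sequence.shape)
```

An array of shape `(seq_len, 63)` matches the `(timesteps, features)` layout that recurrent layers such as a GRU typically expect per sample.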
## Demo
### Streamlit (browser)
To run a simple browser-based demo using Streamlit:
```bash
pip install -r requirements.txt
streamlit run demo/streamlit_app.py
```
Upload a hand image (PNG or JPEG) and optionally adjust the weights path. The app displays the predicted letter and confidence as well as per-class probabilities.
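The predicted letter, confidence, and per-class probabilities relate to each other as sketched below. This is an illustration only: the 26-letter `CLASS_NAMES` list is a hypothetical label set (the repository's actual classes are not listed here), and the explicit `softmax` step applies only if the model outputs raw logits rather than probabilities.

```python
import math
import string

# Hypothetical label set; the real class list depends on the training data.
CLASS_NAMES = list(string.ascii_uppercase)


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(probabilities, class_names):
    """Return the most likely class name and its probability."""
    idx = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[idx], probabilities[idx]


# Synthetic logits in which the third class ("C") scores highest.
logits = [0.1] * len(CLASS_NAMES)
logits[2] = 3.0

probs = softmax(logits)
letter, confidence = top_prediction(probs, CLASS_NAMES)
print(f"Predicted: {letter} ({confidence:.1%})")
```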
### Gradio (browser)
To run a Gradio demo:
```bash
pip install -r requirements.txt
python demo/gradio_app.py
```
This serves a web interface where you can upload an image and see the probability distribution across classes and the top prediction.
### Webcam
To run a live webcam demo (requires a webcam):
```bash
python scripts/webcam_demo.py
```
## Citation
If you use this work, please cite it as follows:
```bibtex
@misc{shukla_gupta_2025_advancing_sign_language,
  author = {Shukla, Manish and Gupta, Harsh},
  title  = {Advancing Sign Language Interpretation with Transfer Learning and Multimodal Features},
  year   = {2025},
  month  = sep,
  note   = {Preprint (Version 1), Research Square},
  doi    = {10.21203/rs.3.rs-7586144/v1},
  url    = {https://doi.org/10.21203/rs.3.rs-7586144/v1}
}
```
Shukla, M., & Gupta, H. (2025, September 12). Advancing Sign Language Interpretation with Transfer Learning and Multimodal Features (Version 1) [Preprint]. Research Square. https://doi.org/10.21203/rs.3.rs-7586144/v1