Real-time multi-target regression system for pallet counting using CNN ensembles and Grad-CAM explainability.

Simo-dg/cnn-pallet-counting

📦 Deep Learning Ensemble for Automated Pallet Counting

This repository provides a real-time solution for automated pallet counting in operational warehouses using deep learning ensemble models. Our system predicts the total, CHEP, and EPAL pallet counts from RGB images through a multi-target regression approach with explainable AI.

πŸ” Overview

Manual pallet counting is still widely used despite being error-prone and labor-intensive. We propose an ensemble of CNN backbones to replace this with a vision-based system that is:

  • Accurate: the ensemble reaches a total-count MAE of 1.15 pallets
  • Explainable: Grad-CAM for model transparency

📊 Key Results

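| Model           | MAE (Total) | MAE (CHEP) | MAE (EPAL) | R²    |
|-----------------|-------------|------------|------------|-------|
| EfficientNet-B3 | 1.401       | 0.991      | 1.009      | 0.840 |
| ResNet-50       | 1.918       | 1.682      | 1.475      | 0.729 |
| ConvNeXt-Tiny   | 1.696       | 0.721      | 1.426      | 0.702 |
| Ensemble        | 1.151       | 0.918      | 0.857      | 0.875 |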
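
The README does not say how the ensemble combines its members or how the metrics were computed. As a rough illustration only, the sketch below assumes an unweighted average of per-model predictions and the standard definitions of MAE and R². The function names and the numbers are made up for the example and are not taken from the notebook or the dataset.

```python
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def ensemble_average(per_model_preds):
    """Unweighted mean across models for each image (assumed combination rule)."""
    return [mean(column) for column in zip(*per_model_preds)]


# Made-up total pallet counts for four images, for illustration only.
y_true = [10, 12, 8, 15]
per_model_preds = [
    [11, 12, 7, 14],  # hypothetical model A
    [9, 13, 9, 16],   # hypothetical model B
    [10, 11, 9, 15],  # hypothetical model C
]

y_ens = ensemble_average(per_model_preds)
print("ensemble MAE:", mae(y_true, y_ens))
print("ensemble R2:", r2(y_true, y_ens))
```

In this toy data the individual errors partly cancel, so the averaged prediction has a lower MAE than any single model. That is the usual motivation for averaging, not a guarantee.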

🧠 Model Architecture

  • Backbones: EfficientNet-B3, ResNet-50, ConvNeXt-Tiny
  • Task: Multi-target regression for [CHEP, EPAL] with total = CHEP + EPAL
  • Loss: Smooth L1
  • Explainability: Grad-CAM visualizations on final conv layers
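
A minimal sketch of the setup listed above, assuming each backbone is reduced to a feature vector and followed by a two-unit linear head. `CountRegressor` and `toy_backbone` are hypothetical names; the tiny convolutional stack stands in for EfficientNet-B3, ResNet-50, or ConvNeXt-Tiny so that the example runs without pretrained weights. The actual head design in the notebook may differ.

```python
import torch
from torch import nn


class CountRegressor(nn.Module):
    """Backbone plus a 2-unit linear head predicting [CHEP, EPAL] counts."""

    def __init__(self, backbone: nn.Module, feature_dim: int):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Linear(feature_dim, 2)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(images))


# Stand-in backbone (assumption): maps an image batch to 16-dim feature vectors.
toy_backbone = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

model = CountRegressor(toy_backbone, feature_dim=16)
criterion = nn.SmoothL1Loss()

images = torch.randn(4, 3, 224, 224)  # dummy RGB batch
targets = torch.tensor([[3.0, 5.0], [0.0, 8.0], [6.0, 6.0], [2.0, 0.0]])  # [CHEP, EPAL]

preds = model(images)             # shape (4, 2)
loss = criterion(preds, targets)  # Smooth L1 over both targets
total = preds.sum(dim=1)          # total = CHEP + EPAL, derived rather than predicted
```

Deriving the total from the two predicted counts keeps the three numbers consistent by construction, which matches the `total = CHEP + EPAL` relation stated in the list.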

🛠 Training & Augmentation

  • Framework: PyTorch
  • Image size: 224×224
  • Augmentations: Crop, flip, affine, color jitter, blur, compression, dropout
  • Optimizer: AdamW with One-Cycle LR

πŸ“ Dataset

A proprietary dataset of 130 real-world warehouse images with manual annotations. Due to privacy concerns, the dataset is not publicly available. Contact the authors about potential access or collaboration.

📂 Repository Structure

├── CNN-Pallet.ipynb
├── dataset/
│   ├── images/
│   └── labels.csv
└── README.md

📈 Explainable AI

Grad-CAM visualizations show attention regions for each model:

  • EfficientNet focuses on stack edges and shadows
  • ResNet highlights CHEP markings and textures
  • ConvNeXt captures broader stack areas with robustness to lighting
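
For readers unfamiliar with Grad-CAM on a regression output, here is a minimal sketch under stated assumptions. `TinyRegressor` is a hypothetical stand-in network, and `grad_cam` is an illustrative helper rather than the notebook's implementation. The heatmap weights each feature map of the last convolutional layer by the average gradient of one predicted count with respect to that map.

```python
import torch
import torch.nn.functional as F
from torch import nn


class TinyRegressor(nn.Module):
    """Stand-in CNN with a 2-output regression head ([CHEP, EPAL])."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(16, 2)

    def forward(self, x):
        return self.head(self.features(x).mean(dim=(2, 3)))


def grad_cam(model, target_layer, image, output_index):
    """Return a heatmap in [0, 1] for one regression output of a single image."""
    stored = {}
    handle = target_layer.register_forward_hook(
        lambda module, inputs, output: stored.update(acts=output)
    )
    try:
        score = model(image)[0, output_index]  # scalar count to explain
    finally:
        handle.remove()
    acts = stored["acts"]                                    # (1, C, h, w)
    grads = torch.autograd.grad(score, acts)[0]              # d(score) / d(acts)
    weights = grads.mean(dim=(2, 3), keepdim=True)           # per-channel importance
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))  # weighted sum, positives only
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    cam = cam - cam.min()
    return (cam / (cam.max() + 1e-8)).squeeze().detach()


model = TinyRegressor().eval()
image = torch.randn(1, 3, 224, 224)  # dummy input
# features[2] is the last convolution of the stand-in network; index 0 is the CHEP output.
heatmap = grad_cam(model, model.features[2], image, output_index=0)
```

The resulting `heatmap` has the same spatial size as the input and can be overlaid on the image. Passing `output_index=1` would explain the EPAL count instead.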

💡 Future Work

  • Expand dataset across multiple facilities
  • Integrate RGB-D or stereo vision for depth-aware counting
  • Explore temporal modeling with video streams
