Real time AI camera combining YOLOv8 (COCO + OpenImages) with Weighted Box Fusion for broad spectrum, duplicate-free object detection.

Filip-2002/Multi-Model-Real-Time-Object-Detection-AI-Camera

Multi Model Real Time Object Detection AI Camera by Filip Ilovsky

This project is an AI camera system that runs two YOLOv8 models, one trained on COCO (80 classes) and one on OpenImages V7 (600+ classes), for broad-spectrum object recognition. Their predictions are combined with Weighted Box Fusion (WBF), which merges overlapping detections of the same object into a single box, so the output is duplicate-free and draws on the confidence of both models rather than just one. The pipeline uses OpenCV to process webcam feeds and video files in real time, detecting more than 600 object categories.
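To make the merging step concrete, here is a minimal, simplified sketch of the idea behind WBF. It is not the code used in this repository. It assumes both models' class names have already been normalized to a shared spelling (for example lowercase), and the names `Det`, `iou`, `fuse`, and `simple_wbf`, as well as the 0.55 IoU threshold, are illustrative choices.

```python
from collections import namedtuple

# One detection: corner coordinates, confidence score, and class name.
Det = namedtuple("Det", "x1 y1 x2 y2 score label")

def iou(a, b):
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0

def fuse(cluster, num_models):
    """Collapse a cluster into one box using score-weighted coordinates."""
    total = sum(d.score for d in cluster)
    coords = [sum(getattr(d, k) * d.score for d in cluster) / total
              for k in ("x1", "y1", "x2", "y2")]
    # Mean score, scaled down when fewer than num_models boxes agree.
    score = (total / len(cluster)) * min(len(cluster), num_models) / num_models
    return Det(*coords, score, cluster[0].label)

def simple_wbf(per_model_dets, iou_thr=0.55):
    """Fuse same-label boxes from several models that overlap above iou_thr."""
    num_models = len(per_model_dets)
    dets = sorted((d for m in per_model_dets for d in m), key=lambda d: -d.score)
    clusters = []
    for d in dets:
        for c in clusters:
            if c[0].label == d.label and iou(fuse(c, num_models), d) > iou_thr:
                c.append(d)
                break
        else:
            clusters.append([d])  # no matching cluster: start a new one
    return [fuse(c, num_models) for c in clusters]

# Made-up detections: both models see the same person, only one sees a lantern.
coco = [Det(10, 10, 110, 110, 0.9, "person")]
oiv7 = [Det(12, 12, 112, 112, 0.7, "person"), Det(200, 200, 260, 260, 0.6, "lantern")]
fused = simple_wbf([coco, oiv7])
for d in fused:
    print(d.label, round(d.score, 3))
```

The two "person" boxes collapse into one box whose score is the average of both models, while the "lantern" box, seen by only one model, is kept with a reduced score.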

This project demonstrates expertise in artificial intelligence, machine learning, deep learning model integration, ensemble methods, and real-time computer vision systems.

✨ Features

  • Combines COCO + OpenImages V7 models for broad object coverage
  • Uses Weighted Box Fusion (WBF) for duplicate-free, high-confidence detections
  • Supports both real-time webcam feeds and video file input/output
  • Saves processed videos with detections to videos/outputs/
  • Configurable model sizes (s/m/l) and image resolutions to balance speed vs accuracy

🎥 Demo

Real-time webcam detection:

Webcam Demo

Real-time mp4 detection:

Video Demo

🚀 Setup

If you run into any problems, see ⚠️ Notes below.

  1. Clone the repository:

    git clone https://github.com/Filip-2002/AI-live-camera.git
    cd AI-live-camera
    
    
  2. Create and activate a virtual environment:

    Windows

    python -m venv .venv
    .\.venv\Scripts\activate

    MacOS/Linux

    python3 -m venv .venv
    source .venv/bin/activate
  3. Install dependencies:

    pip install -r requirements.txt
    
    
  4. Download YOLOv8 pretrained weights and place them in the project folder:

    Default (what this project uses):

    Optional (other sizes available for different speed/accuracy trade-offs):

    Performance vs Accuracy:

    • s (small models): Fastest, runs well on lower end machines, but less accurate.
    • m (medium models): Good balance between speed and accuracy.
    • l (large models): Most accurate, but slower, best on more powerful hardware.
    • More information about how to change models in ⚠️ Notes

    More model sizes can be found on the Ultralytics YOLOv8 releases page.

  5. Run:

    Webcam

    Default (works on all machines; runs on CPU when no GPU is available):

    python run_webcam_wbf.py

    Use GPU for faster performance (requires an NVIDIA GPU with CUDA):

    python run_webcam_wbf.py --device cuda

    Force CPU explicitly (useful if CUDA is installed but you prefer CPU):

    python run_webcam_wbf.py --device cpu

    Save your webcam output by adding the --save flag, for example:

    python run_webcam_wbf.py --device cpu --save

    Video Files

    Run a video from the videos/ folder:

    python run_webcam_wbf.py --source example.mp4

    Run and save the output video (saved to videos/outputs/):

    python run_webcam_wbf.py --source example.mp4 --save
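For orientation, the flags used in the commands above could be wired up roughly as follows. This is a sketch using only the standard library, not the actual contents of run_webcam_wbf.py. The helper names `build_parser`, `resolve_source`, and `output_path`, the flag defaults, and the `_detected.mp4` naming scheme are all assumptions made for illustration.

```python
import argparse
from pathlib import Path

VIDEO_DIR = Path("videos")           # input folder named in this README
OUTPUT_DIR = VIDEO_DIR / "outputs"   # where --save writes results

def build_parser():
    p = argparse.ArgumentParser(description="Webcam / video detection (sketch)")
    p.add_argument("--source", default=None,
                   help="file name inside videos/; omit to use the webcam")
    p.add_argument("--device", choices=["cpu", "cuda"], default=None,
                   help="omit to let the script choose automatically")
    p.add_argument("--save", action="store_true",
                   help="write the annotated video to videos/outputs/")
    return p

def resolve_source(source):
    """Return camera index 0 for the webcam, or a path under videos/."""
    return 0 if source is None else VIDEO_DIR / source

def output_path(source):
    """Hypothetical naming scheme: <stem>_detected.mp4 ('webcam' for camera input)."""
    stem = "webcam" if source is None else Path(source).stem
    return OUTPUT_DIR / f"{stem}_detected.mp4"

# Equivalent to: python run_webcam_wbf.py --source example.mp4 --save
args = build_parser().parse_args(["--source", "example.mp4", "--save"])
print(resolve_source(args.source), output_path(args.source), args.save)
```

Keeping `--device` unset by default leaves room for automatic selection, while `--save` as a plain switch matches how it is used in the commands above.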

⚠️ Notes

  • macOS users: Allow your Terminal app access to the Camera in System Settings → Privacy & Security → Camera.

  • If you see ModuleNotFoundError (e.g., cv2), make sure your virtual environment is activated (look for (.venv) in your terminal prompt).

  • Windows users: If python doesn’t work, try using python3 or the py launcher instead.

  • If you get pip version errors, upgrade pip inside the virtual environment:

    Windows

    python -m pip install --upgrade pip

    MacOS/Linux

    python3 -m pip install --upgrade pip
  • When switching between projects, deactivate your virtual environment with:

    deactivate
    
  • If you see OpenCV camera errors on macOS, make sure no other application (e.g. Zoom, Teams, or browser) is already using the webcam.

  • You can adjust the input image size in run_webcam_wbf.py on line 55 ("--imgsz") depending on your machine’s performance:

    • If your machine is struggling, set the default to 320.
    • If your machine is powerful, set the default to 1280.
    • The default value (640) is a balanced option.
  • You can change the model size in run_webcam_wbf.py on lines 52 ("--coco") and 53 ("--oiv7") depending on your machine’s performance:

    • If your machine is struggling, use "yolov8s.pt" and "yolov8s-oiv7.pt", or "yolov8m.pt" and "yolov8m-oiv7.pt".
    • Make sure to download the corresponding model weights in Step 4.
  • By default, the script automatically selects the GPU if one is available and otherwise falls back to CPU. You can override this with --device cuda (GPU) or --device cpu.

  • Use the --save flag to save processed output (works for both webcam and video files). Saved videos go into videos/outputs/.
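The device fallback and image size options described in these notes can be pictured with the short sketch below. It is an illustration under stated assumptions, not the script's real logic: `pick_device` and the `IMGSZ` table are hypothetical names, and PyTorch is imported lazily so the snippet also runs on a machine without it.

```python
def pick_device(requested=None):
    """Honor an explicit --device value; otherwise prefer CUDA when usable."""
    if requested in ("cpu", "cuda"):
        return requested
    try:
        import torch  # lazy import: the sketch still runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

# Image sizes taken from the notes above, keyed by hypothetical machine tiers.
IMGSZ = {"struggling": 320, "balanced": 640, "powerful": 1280}

print(pick_device(), IMGSZ["balanced"])
```

Calling `pick_device("cpu")` always returns "cpu", which mirrors forcing CPU on a machine that has CUDA installed.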
