🤖 End-to-end MLOps platform for fraud detection, integrating a Data Engineering pipeline (GCP, Airflow) and the ML model lifecycle (CatBoost).

arthurcornelio88/automatic_fraud_detection

Automatic Fraud Detection Project

This project provides a complete pipeline for fraud detection, from data engineering to model training and serving. It is composed of two main services, each in its own repository.


📂 Project Repositories

The two repositories are:

  • ➡️ dataops_pipeline: Contains all code for the data engineering pipeline, including data ingestion, cleaning, and preparation (ETL), orchestrated by Airflow.

  • ➡️ model_training: Contains the notebooks and scripts for training, evaluating, and serving the classification model for fraud detection.


🚀 Getting Started (Development Environment)

To run this project locally, build and run each service independently in its own containers with Docker Compose. This requires two separate terminal windows, one per service.

  1. Clone both repositories to your local machine:

    git clone git@github.com:arthurcornelio88/stripe_model_training.git
    git clone git@github.com:arthurcornelio88/stripe_dataops_pipeline.git

  2. Open two terminal windows.

  3. In the first terminal, navigate to the cloned stripe_dataops_pipeline directory and start its services using Docker Compose:

    cd stripe_dataops_pipeline
    docker-compose up --build

  4. In the second terminal, navigate to the cloned stripe_model_training directory and start its services:

    cd stripe_model_training
    docker-compose up --build

Each service will now be running in its own isolated, containerized environment.
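
If you would rather not keep two terminals open, the steps above can be approximated from a single terminal by starting each service in detached mode (`-d`). The following is only a sketch and not part of either repository: the `start_service` helper and the `DRY_RUN` variable are hypothetical names, and the directory names are assumed to be the defaults that `git clone` derives from the URLs in step 1. It defaults to a dry run that only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch: start both services from one terminal in detached mode.
# Assumption: both repositories were cloned into the current directory
# under the default names derived from the clone URLs.

# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# start_service DIR: build and start the Compose services defined in DIR.
start_service() {
  dir="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "(cd $dir && docker-compose up --build -d)"
    return 0
  fi
  if [ ! -d "$dir" ]; then
    echo "skipping $dir: directory not found" >&2
    return 1
  fi
  # The subshell keeps the directory change local to this one command.
  ( cd "$dir" && docker-compose up --build -d )
}

for service_dir in stripe_dataops_pipeline stripe_model_training; do
  start_service "$service_dir" || echo "failed to start $service_dir" >&2
done
```

Once the printed commands look right, rerun with `DRY_RUN=0`. Because the containers run in the background, follow their output with `docker-compose logs -f` from inside the relevant directory.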


☁️ Production Deployment

For detailed instructions on deploying these services to a production environment (such as GCP VMs), refer to the documentation in each repository's README.md or /docs folder.
