🛡️ Persian Swear Detector

A Python tool that detects offensive Persian (Farsi) text by combining rule-based checks with a machine learning (ML) classifier.

🚀 Features

Hybrid Detection: Combines rule-based matching with an ML classifier to produce a single final prediction
Confidence Scores: Provides confidence levels for predictions
Persian Language Support: Handles Persian text preprocessing and normalization
CLI Interface: Simple command-line interface for quick testing
Model Persistence: Save and load trained models for fast deployment
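
The preprocessing and normalization step is not spelled out in this README. As a rough illustration only, the sketch below shows the kind of normalization commonly applied to Persian text: mapping Arabic code points to their Persian forms, stripping diacritics, and collapsing whitespace. The function name `normalize_persian` and the exact set of rules (including replacing the zero-width non-joiner with a space) are assumptions, not necessarily what `swear_detector.py` does.

```python
import re
import unicodedata

# Arabic code points that often appear in Persian text, mapped to Persian forms.
# This rule set is an assumption for illustration, not the project's actual table.
_CHAR_MAP = str.maketrans({
    "\u064a": "\u06cc",  # Arabic yeh -> Persian yeh
    "\u0649": "\u06cc",  # Arabic alef maksura -> Persian yeh
    "\u0643": "\u06a9",  # Arabic kaf -> Persian kaf
    "\u200c": " ",       # zero-width non-joiner -> plain space (design choice)
    "\u0640": "",        # tatweel (kashida) -> removed
})

_DIACRITICS = re.compile(r"[\u064b-\u0652]")  # fathatan through sukun
_WHITESPACE = re.compile(r"\s+")


def normalize_persian(text: str) -> str:
    """Return a normalized copy of `text` suitable for matching or vectorizing."""
    text = unicodedata.normalize("NFC", text)
    text = text.translate(_CHAR_MAP)
    text = _DIACRITICS.sub("", text)
    return _WHITESPACE.sub(" ", text).strip()
```

Normalizing before both the rule check and the ML model means that visually identical spellings typed on Arabic and Persian keyboards are treated as the same word.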

📦 Project Structure

├── swear_detector.py         # Main detector script
├── requirements.txt          # Python dependencies
├── dataset/
│   └── dataset.json         # Labeled dataset for training (Offensive/Normal)
├── models/
│   └── model.pkl            # Trained ML model
├── Dockerfile               # Docker support
├── docker-compose.yml       # Docker Compose config
└── README.md               # Documentation

🛠️ Installation

Clone the repository
Create a virtual environment and install the dependencies:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

🚀 Usage

Run the detector:

python3 swear_detector.py

Each prediction includes:

Original text
Final prediction (Offensive/Normal)
Confidence score
ML confidence score (for offensive predictions)

📊 Dataset

The project uses a labeled dataset (dataset.json) containing:

Offensive texts: Inappropriate or offensive content
Normal texts: Regular, non-offensive content
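
The README does not describe the layout of dataset.json. The sketch below assumes, purely for illustration, that the file is a JSON array of records with a `text` field and a `label` field set to `Offensive` or `Normal`; the field names and the `load_dataset` helper are hypothetical and should be adjusted to the real file.

```python
import json
from collections import Counter


def load_dataset(raw_json: str):
    """Parse records of the ASSUMED form {"text": ..., "label": ...}."""
    records = json.loads(raw_json)
    texts = [record["text"] for record in records]
    labels = [record["label"] for record in records]
    return texts, labels


# Hypothetical sample standing in for the contents of dataset/dataset.json.
sample = """
[
  {"text": "سلام دوست من", "label": "Normal"},
  {"text": "<offensive example>", "label": "Offensive"}
]
"""

texts, labels = load_dataset(sample)
label_counts = Counter(labels)  # useful for checking class balance before training
```

To read the real file, pass the result of `open("dataset/dataset.json", encoding="utf-8").read()` instead of `sample`.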

🤖 Model

The system uses a hybrid approach:

Machine Learning: TF-IDF + Logistic Regression
Rule-based detection
Combined scoring for final prediction
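
How the two signals are combined is not specified here. One simple possibility, shown as a sketch under stated assumptions, is to treat a rule hit as a score of 1.0 and take the maximum of that and the ML probability. The names `Prediction`, `rule_hit`, and `predict`, the blocklist, the 0.5 threshold, and the max-based rule are all hypothetical; `ml_proba` stands in for whatever function returns the offensive-class probability from the TF-IDF + Logistic Regression model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Prediction:
    text: str
    label: str                      # "Offensive" or "Normal"
    confidence: float               # confidence in the final label
    ml_confidence: Optional[float]  # set only for offensive predictions


def rule_hit(text: str, blocklist: Iterable[str]) -> bool:
    """Return True if any blocklisted word appears as a whitespace-separated token."""
    tokens = set(text.split())
    return any(word in tokens for word in blocklist)


def predict(
    text: str,
    ml_proba: Callable[[str], float],
    blocklist: Iterable[str],
    threshold: float = 0.5,
) -> Prediction:
    """Combine the rule signal and the ML probability (assumed scheme: max of both)."""
    p_offensive = ml_proba(text)
    rule_score = 1.0 if rule_hit(text, blocklist) else 0.0
    score = max(rule_score, p_offensive)
    if score >= threshold:
        return Prediction(text, "Offensive", score, p_offensive)
    return Prediction(text, "Normal", 1.0 - score, None)
```

With this scheme a rule hit always yields an offensive label, while the ML model catches phrasing the blocklist misses. A weighted average of the two scores would be a softer alternative.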

🐳 Docker Support

Build and run with Docker:

docker-compose up --build
