This project predicts whether a loan application will be Approved (1) or Rejected (0) based on applicant details, using machine learning classification models such as Logistic Regression and Decision Tree. We explore and preprocess the dataset, apply the classification models, and compare their performance with evaluation metrics.
Source: Kaggle - Loan Approval Prediction Dataset
Number of Rows: 4269
Number of Columns: 13
Features:
- loan_id → Unique ID of the loan application
- no_of_dependents → Number of dependents of the applicant
- education → Graduate / Not Graduate
- self_employed → Yes / No
- income_annum → Annual income of the applicant
- loan_amount → Loan amount requested
- loan_term → Loan repayment term in months
- cibil_score → Credit score of the applicant
- residential_assets_value → Value of residential assets
- commercial_assets_value → Value of commercial assets
- luxury_assets_value → Value of luxury assets
- bank_asset_value → Value of bank assets
- loan_status → Target variable (Approved=1, Rejected=0)
Preprocessing Steps:
- Removed duplicates
- Handled missing values
- Dropped loan_id (irrelevant for prediction)
- Encoded categorical variables into numeric form
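The preprocessing steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper name `preprocess`, the choice to drop rows with missing values (rather than impute them), the whitespace stripping, and the specific 0/1 mappings for `education` and `self_employed` are all assumptions. The tiny `raw` frame is made-up demo data.

```python
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper applying the four preprocessing steps in order."""
    df = df.copy()
    # Assumption: column names may carry stray whitespace, so normalise them.
    df.columns = df.columns.str.strip()

    # 1. Remove exact duplicate rows.
    df = df.drop_duplicates()

    # 2. Handle missing values (assumption: simply drop incomplete rows).
    df = df.dropna()

    # 3. Drop the identifier column, which carries no predictive signal.
    df = df.drop(columns=["loan_id"])

    # 4. Encode categorical variables as integers (mappings are assumptions,
    #    except loan_status, which follows Approved=1 / Rejected=0).
    df["education"] = df["education"].str.strip().map({"Graduate": 1, "Not Graduate": 0})
    df["self_employed"] = df["self_employed"].str.strip().map({"Yes": 1, "No": 0})
    df["loan_status"] = df["loan_status"].str.strip().map({"Approved": 1, "Rejected": 0})
    return df


# Made-up demo rows: one exact duplicate and one row with a missing value.
raw = pd.DataFrame({
    "loan_id": [1, 2, 2, 3],
    " education": [" Graduate", " Not Graduate", " Not Graduate", " Graduate"],
    " self_employed": [" No", " Yes", " Yes", None],
    " income_annum": [9600000, 4100000, 4100000, 8200000],
    " loan_status": [" Approved", " Rejected", " Rejected", " Approved"],
})
clean = preprocess(raw)
print(clean)
```

To use it on the real data, one would pass `pd.read_csv("loan_approval_dataset.csv")` to the helper instead of the demo frame.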
- Logistic Regression
  - Well suited to binary classification.
  - Outputs probabilities between 0 and 1.
  - Its coefficients help in understanding the relationship between features and target.
- Decision Tree Classifier
  - Splits data into branches using conditions on feature values.
  - Easy to visualize and interpret.
  - Can overfit if not tuned properly (for example, by limiting tree depth).
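As a rough sketch of how these two models can be trained and compared with scikit-learn, the example below fits both on synthetic data. Everything about the data is an assumption: `make_classification` stands in for the loan dataset (11 features, matching the 13 columns minus `loan_id` and `loan_status`), and the 80/20 split, `random_state`, and `max_iter` values are illustrative choices, not the project's settings. The accuracies it prints therefore say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data (assumption, not the real dataset).
X, y = make_classification(
    n_samples=500, n_features=11, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    # Limiting depth is one way to keep the tree from overfitting.
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # score() returns mean accuracy on the given data.
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"{name}: train={scores[name][0]:.3f}, test={scores[name][1]:.3f}")

# Logistic regression exposes class probabilities; column 1 is P(class == 1).
proba = models["Logistic Regression"].predict_proba(X_test)[:, 1]
```

A large gap between training and testing accuracy is the usual sign of overfitting, which is why both numbers are reported side by side.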
The following metrics were used to evaluate model performance:
- Accuracy → Correct predictions / Total predictions
- Precision → Out of predicted "Approved", how many were actually "Approved"
- Recall (Sensitivity) → Out of actual "Approved", how many were correctly predicted
- F1 Score → Harmonic mean of Precision & Recall
- Confusion Matrix → Table showing True/False Positives and Negatives
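To make the definitions above concrete, here is a small pure-Python sketch that derives each metric from the confusion-matrix counts. The function names and the ten example labels are made up for illustration; in practice `sklearn.metrics` provides equivalents such as `accuracy_score`, `precision_score`, `recall_score`, `f1_score`, and `confusion_matrix`.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating 1 (Approved) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 1 = Approved, 0 = Rejected.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
print(counts)   # (4, 1, 1, 4): 4 TP, 1 FP, 1 FN, 4 TN
print(metrics)
```

In this example one approved loan is missed and one rejected loan is wrongly approved, so accuracy, precision, recall and F1 all come out at 0.8.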
- Logistic Regression
  - Training Accuracy: 79.5%
  - Testing Accuracy: 79.8%
  - Balanced performance across training and testing, but lower accuracy.
- Decision Tree (max_depth=5)
  - Training Accuracy: 97.5%
  - Testing Accuracy: 96.8%
  - Higher accuracy, with overfitting kept in check by the depth limit.
✅ Decision Tree performed better on this dataset.
- Python 🐍
- Pandas, NumPy → Data handling
- Matplotlib, Seaborn → Visualization
- Scikit-learn → Machine Learning models & metrics
- Clone the repository (or download the files):
git clone https://github.com/Adeeba-Shahzadi/LoanApprovalPrediction-BinaryClassificationModel.git
cd LoanApprovalPrediction-BinaryClassificationModel
- Install required dependencies:
pip install -r requirements.txt
- LoanApprovalPrediction.ipynb → Jupyter Notebook with step-by-step implementation.
- loan_approval_dataset.csv → Dataset used for training and testing.
- loanapprovalprediction.py → Python script version.
- requirements.txt → Required libraries.
- README.md → Project documentation.
- Run with Jupyter Notebook:
jupyter notebook LoanApprovalPrediction.ipynb
- Run with Python:
python loanapprovalprediction.py