This project analyzes traffic data collected over two months. Using data visualization and a machine learning model, the goal is to uncover traffic patterns and build a predictive model of the traffic situation and traffic density.
- Perform exploratory data analysis (EDA) on traffic data.
- Visualize patterns with Matplotlib, Seaborn, and Plotly.
- Apply feature preprocessing (scaling, encoding).
- Train a classification model (RandomForestClassifier) to predict the traffic situation category.
- Evaluate model performance with accuracy_score and classification_report.
- File used: TrafficTwoMonth.csv
- The dataset contains traffic-related information, including:
  - Hourly data
  - Vehicle counts (car, bus, truck, bike, and total)
  - Traffic situation categories
  - Other features, such as the time of each record on a given date
- Python
- Pandas & NumPy: data manipulation and analysis
- Matplotlib & Seaborn: static visualizations
- Plotly: interactive visualizations
- Scikit-learn: data preprocessing, RandomForestClassifier, Pipeline, ColumnTransformer
- Initial data inspection (`head()`, `info()`, `describe()`)
- Traffic trend analysis across different hours and dates
- Distribution plots for vehicle counts
- Interactive plots for comparative analysis
- Traffic patterns per weekend
- Count comparisons between different vehicle types per hour
- Interactive histogram of vehicles per hour on weekdays
- Line plot of the average vehicle count per hour
- Visualizations combining total traffic by category
- Line plot of the average vehicle count by day of the month
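
The inspection and aggregation steps above can be sketched as follows. This is a minimal illustration, not the project's actual notebook code: the column names (`Time`, `Date`, `CarCount`, `BikeCount`, `BusCount`, `TruckCount`, `Total`) and the tiny in-memory sample are assumptions standing in for TrafficTwoMonth.csv.

```python
import pandas as pd

# Synthetic stand-in for TrafficTwoMonth.csv; column names are assumed.
df = pd.DataFrame({
    "Time": ["12:00:00 AM", "12:15:00 AM", "08:00:00 AM",
             "08:15:00 AM", "05:00:00 PM", "05:15:00 PM"],
    "Date": [10, 10, 10, 11, 11, 11],  # day of the month
    "CarCount": [10, 14, 90, 110, 120, 100],
    "BikeCount": [2, 4, 20, 30, 25, 15],
    "BusCount": [1, 1, 10, 14, 12, 8],
    "TruckCount": [5, 3, 8, 6, 3, 7],
})
vehicle_cols = ["CarCount", "BikeCount", "BusCount", "TruckCount"]
df["Total"] = df[vehicle_cols].sum(axis=1)

# Initial data inspection.
print(df.head())
df.info()
print(df.describe())

# Derive the hour of day from the 12-hour time string.
df["Hour"] = pd.to_datetime(df["Time"], format="%I:%M:%S %p").dt.hour

# Average total vehicles per hour (input for an hourly line plot).
avg_total_per_hour = df.groupby("Hour")["Total"].mean()

# Average count of each vehicle type per hour (input for type comparisons).
avg_by_type_per_hour = df.groupby("Hour")[vehicle_cols].mean()

# Average total vehicles per day of the month.
avg_total_per_day = df.groupby("Date")["Total"].mean()

# Hour with the highest average traffic in this sample.
peak_hour = avg_total_per_hour.idxmax()
print("Peak hour:", peak_hour)
```

Each resulting Series or DataFrame can be passed to Matplotlib, Seaborn, or Plotly to draw the corresponding line plot or histogram.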
- Preprocessing:
  - Standard scaling
  - One-hot encoding for categorical features
- Models used:
  - Random Forest Classifier
- Evaluation:
  - accuracy_score
  - classification_report
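
One way these steps could fit together is sketched below. It is an illustration under stated assumptions rather than the project's exact code: the data is randomly generated, the column names (`Hour`, `Day of the week`, `CarCount`, `BusCount`, `Total`, `Traffic Situation`) are assumed, and the labels come from a made-up threshold rule on `Total`.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; real features would come from TrafficTwoMonth.csv.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "Hour": rng.integers(0, 24, n),
    "Day of the week": rng.choice(["Monday", "Tuesday", "Saturday", "Sunday"], n),
    "CarCount": rng.integers(5, 180, n),
    "BusCount": rng.integers(0, 50, n),
})
df["Total"] = df["CarCount"] + df["BusCount"]

# Hypothetical labelling rule, used only so the example has a target.
df["Traffic Situation"] = pd.cut(
    df["Total"], bins=[-1, 80, 160, 1000], labels=["low", "normal", "heavy"]
).astype(str)

numeric_cols = ["Hour", "CarCount", "BusCount", "Total"]
categorical_cols = ["Day of the week"]

# Scale numeric columns and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

# Chain preprocessing and the classifier into a single estimator.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=42)),
])

X = df[numeric_cols + categorical_cols]
y = df["Traffic Situation"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print(classification_report(y_test, y_pred))
```

Wrapping the preprocessing in a `Pipeline` means the scaler and encoder are fitted on the training split only, which avoids leaking test data into the model. Tree ensembles do not strictly need scaled inputs; the scaler is kept here to mirror the steps listed above.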
- The Random Forest classifier provides an estimate of the traffic level (situation category).
- The visualizations highlight peak hours and traffic conditions.