This project was completed during a workshop on "Data Analytics in Python" at IIT (BHU) in June 2024. It involves analyzing a hypothetical tourism dataset to extract actionable insights through data analytics and visualization techniques.
Data Collection:
- Loaded and explored the dataset to understand its structure and key attributes.
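The loading and exploration step might look roughly like the sketch below. The column names (`Country`, `Year`, `Tourists`, `Revenue`) and the inline sample rows are assumptions made for illustration; they are not taken from the actual `Dataset.csv`, whose schema may differ.

```python
import io

import pandas as pd

# Hypothetical stand-in for Dataset.csv so the sketch runs on its own.
# The real file's columns and values may differ.
SAMPLE_CSV = """Country,Year,Tourists,Revenue
A,2018,1000,50000
A,2019,1200,60000
B,2018,800,48000
B,2019,900,54000
"""


def load_dataset(source):
    """Read a CSV path or file-like object into a DataFrame."""
    return pd.read_csv(source)


# For the real file this would be: load_dataset("Dataset.csv")
df = load_dataset(io.StringIO(SAMPLE_CSV))

# First look at structure and key attributes
print(df.shape)   # (rows, columns)
print(df.dtypes)  # data type of each column
print(df.head())  # first few rows
df.info()         # non-null counts and memory usage
```

`head()`, `dtypes`, and `info()` together usually give enough of an overview to decide which columns need cleaning.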
Data Cleaning:
- Checked for missing values and handled them appropriately.
- Removed duplicate rows to ensure data integrity.
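As a minimal sketch of the cleaning step, assuming the same hypothetical columns (`Country`, `Year`, `Tourists`, `Revenue`) and made-up rows: the notebook does not state how missing values were handled, so dropping incomplete rows here is only one possible choice (imputation would be another).

```python
import numpy as np
import pandas as pd

# Made-up rows containing one exact duplicate and one missing value.
df = pd.DataFrame({
    "Country": ["A", "A", "A", "B", "B"],
    "Year": [2018, 2019, 2019, 2018, 2019],
    "Tourists": [1000, 1200, 1200, 800, np.nan],
    "Revenue": [50000.0, 60000.0, 60000.0, 48000.0, 54000.0],
})

# Count missing values per column and fully duplicated rows
missing_per_column = df.isna().sum()
duplicate_rows = int(df.duplicated().sum())

# Assumed strategy: drop duplicates, then drop rows missing a key metric
cleaned = (
    df.drop_duplicates()
      .dropna(subset=["Tourists", "Revenue"])
      .reset_index(drop=True)
)

print(missing_per_column)
print(f"duplicates removed: {duplicate_rows}, rows remaining: {len(cleaned)}")
```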
Data Analysis:
- Performed descriptive statistics to summarize the data.
- Analyzed trends like the number of tourists over time and the relationship between tourists and revenue.
- Calculated average spending per tourist and identified high-revenue countries.
- Measured yearly changes in tourist numbers for each country.
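The analysis steps above can be sketched with pandas as follows. The data and column names are hypothetical, and "average spending per tourist" is computed here as total revenue divided by total tourists per country, which is one reasonable definition rather than necessarily the one used in the notebook.

```python
import pandas as pd

# Made-up data for illustration; not the workshop dataset.
df = pd.DataFrame({
    "Country": ["A", "A", "A", "B", "B", "B"],
    "Year": [2018, 2019, 2020, 2018, 2019, 2020],
    "Tourists": [1000, 1200, 600, 800, 900, 450],
    "Revenue": [50000, 60000, 30000, 48000, 54000, 27000],
})

# Descriptive statistics for the numeric columns
summary = df[["Tourists", "Revenue"]].describe()

# Trend: total tourists per year
tourists_by_year = df.groupby("Year")["Tourists"].sum()

# Relationship between tourists and revenue (Pearson correlation)
correlation = df["Tourists"].corr(df["Revenue"])

# Average spending per tourist and total revenue, per country
totals = df.groupby("Country")[["Revenue", "Tourists"]].sum()
avg_spend = (totals["Revenue"] / totals["Tourists"]).sort_values(ascending=False)
total_revenue = totals["Revenue"].sort_values(ascending=False)

# Year-over-year change in tourists for each country, in percent
df = df.sort_values(["Country", "Year"])
df["TouristChangePct"] = df.groupby("Country")["Tourists"].pct_change() * 100

print(summary)
print(tourists_by_year)
print(f"correlation: {correlation:.3f}")
print(avg_spend)
print(total_revenue)
print(df)
```

Sorting by country and year before calling `pct_change()` matters, because the change is computed against the previous row within each group.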
Data Visualization:
- Visualized trends in tourists over time.
- Created revenue distribution plots for better insights.
- Used interactive visualizations for deeper exploration of the dataset.
Libraries used:
- NumPy: Handling numerical operations.
- Pandas: Data manipulation and analysis.
- Matplotlib: Basic plotting and visualizations.
- Seaborn: Visualizing data patterns with advanced aesthetics.
- Plotly Express: Creating interactive and dynamic visualizations.

Key findings:
- The number of tourists generally increased from 2015 onward, then dropped sharply in 2020, likely due to external factors.
- Tourists and revenue showed a perfect positive correlation, meaning that in this dataset revenue rises linearly with the number of tourists.
- Country C had the highest average spending per tourist, while Country F generated the highest total revenue.
Files:
- Tourism_Dataset_Analysis.ipynb: Jupyter Notebook containing Python code, analysis, and visualizations.
- Dataset.csv: Hypothetical tourism dataset used for the analysis.
This project shows how Python can be used to analyze and visualize data to uncover meaningful trends, and it demonstrates foundational data analytics techniques.
How to run:
- Download the files.
- Open Tourism_Dataset_Analysis.ipynb in Jupyter Notebook or JupyterLab.
- Ensure the required Python libraries are installed. Use the following command to install missing dependencies:
pip install numpy pandas matplotlib seaborn plotly
- Run the cells in the notebook sequentially to view the analysis and visualizations.
This project was part of the workshop "Data Analytics in Python" conducted by IIT (BHU).
Feel free to provide feedback or suggestions! 😊