
This repository contains my project Tourism Dataset Analysis, completed during a 5-day workshop on "Data Analytics in Python" organized by IIT (BHU) in June 2024. The project focuses on analyzing hypothetical tourism data to identify trends, relationships, and patterns while leveraging Python libraries for data analytics and visualization.


GPA95/Tourism-Dataset-Analytics


Tourism Dataset Analysis

This project was completed during a workshop on "Data Analytics in Python" at IIT (BHU) in June 2024. It involves analyzing a hypothetical tourism dataset to extract actionable insights through data analytics and visualization techniques.

Features

  1. Data Collection:

    • Loaded and explored the dataset to understand its structure and key attributes.
  2. Data Cleaning:

    • Checked for missing values and handled them appropriately.
    • Removed duplicate rows to ensure data integrity.
  3. Data Analysis:

    • Performed descriptive statistics to summarize the data.
    • Analyzed trends like the number of tourists over time and the relationship between tourists and revenue.
    • Calculated average spending per tourist and identified high-revenue countries.
    • Measured yearly changes in tourist numbers for each country.
  4. Data Visualization:

    • Visualized trends in tourists over time.
    • Created revenue distribution plots for better insights.
    • Used interactive visualizations for deeper exploration of the dataset.
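
The cleaning and analysis steps listed above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`Country`, `Year`, `Tourists`, `Revenue`) and the inline sample rows are assumptions standing in for Dataset.csv, and dropping rows with missing values is only one possible way of handling them.

```python
import pandas as pd

# Hypothetical sample rows; the real Dataset.csv schema may differ.
raw = pd.DataFrame({
    "Country": ["A", "A", "A", "B", "B", "B", "B"],
    "Year": [2015, 2016, 2017, 2015, 2016, 2017, 2017],
    "Tourists": [100.0, 120.0, None, 200.0, 220.0, 210.0, 210.0],
    "Revenue": [1000.0, 1300.0, 1500.0, 1800.0, 2100.0, 2000.0, 2000.0],
})

def clean(df):
    """Drop exact duplicate rows, then drop rows with missing values."""
    return df.drop_duplicates().dropna().reset_index(drop=True)

def analyze(df):
    """Add spending per tourist and year-over-year tourist change per country."""
    out = df.sort_values(["Country", "Year"]).reset_index(drop=True)
    out["SpendPerTourist"] = out["Revenue"] / out["Tourists"]
    # pct_change within each country; the first year of each country is NaN
    out["YoYChangePct"] = out.groupby("Country")["Tourists"].pct_change() * 100
    return out

cleaned = clean(raw)
result = analyze(cleaned)

print(raw.isna().sum())    # missing values per column before cleaning
print(cleaned.describe())  # descriptive statistics
print(result)
```

With the real dataset, `raw` would instead come from something like `pd.read_csv("Dataset.csv")`, with the column names adjusted to match the file.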

Libraries Used

  • NumPy: Handling numerical operations.
  • Pandas: Data manipulation and analysis.
  • Matplotlib: Basic plotting and visualizations.
  • Seaborn: Visualizing data patterns with advanced aesthetics.
  • Plotly Express: Creating interactive and dynamic visualizations.
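
As a rough illustration of how Pandas and Matplotlib fit together here, the sketch below aggregates tourists per year and draws a simple trend line. The data and column names are hypothetical, and the `Agg` backend is chosen only so the snippet runs without a display; inside a notebook the figure would normally render inline.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data; not taken from Dataset.csv.
df = pd.DataFrame({
    "Country": ["A", "A", "A", "B", "B", "B"],
    "Year": [2015, 2016, 2017, 2015, 2016, 2017],
    "Tourists": [100, 120, 130, 200, 220, 210],
})

# Total tourists per year across all countries
yearly = df.groupby("Year")["Tourists"].sum()

fig, ax = plt.subplots()
ax.plot(yearly.index, yearly.values, marker="o")
ax.set_xlabel("Year")
ax.set_ylabel("Tourists")
ax.set_title("Tourists over time (hypothetical data)")
```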

Data Analysis Workflow

[Image: Tourism-Dataset-Analytics]

Project Highlights

Key Insights:

  • The number of tourists generally increased from 2015 through 2019, followed by a sharp decline in 2020, likely due to external factors.
  • Tourists and revenue show a perfect positive correlation in this hypothetical dataset, meaning revenue rises linearly with the number of tourists.
  • Country C had the highest average spending per tourist, while Country F generated the highest total revenue.
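
The kind of checks behind these insights can be sketched as below. The data is made up (countries `X` and `Y` are placeholders) and does not reproduce the notebook's results; revenue is constructed as an exact linear function of tourists, which is what a perfect positive correlation implies. Averaging the per-row spending ratio is one assumed definition of "average spending per tourist"; dividing total revenue by total tourists is another.

```python
import pandas as pd

# Hypothetical data: Revenue = 500 + 5 * Tourists for every row
df = pd.DataFrame({
    "Country": ["X", "X", "Y", "Y"],
    "Tourists": [100, 200, 400, 500],
    "Revenue": [1000, 1500, 2500, 3000],
})

# Pearson correlation between tourists and revenue
corr = df["Tourists"].corr(df["Revenue"])

# Per-country total revenue and mean spending per tourist
per_country = (
    df.assign(SpendPerTourist=df["Revenue"] / df["Tourists"])
      .groupby("Country")
      .agg(
          TotalRevenue=("Revenue", "sum"),
          AvgSpendPerTourist=("SpendPerTourist", "mean"),
      )
)

top_revenue = per_country["TotalRevenue"].idxmax()
top_spend = per_country["AvgSpendPerTourist"].idxmax()

print(f"correlation: {corr:.3f}")
print(per_country)
print("highest total revenue:", top_revenue)
print("highest average spend per tourist:", top_spend)
```

In this toy data the country with the highest total revenue is not the one with the highest spending per tourist, which mirrors how the two measures can point to different countries.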

Files in the Repository

  • Tourism_Dataset_Analysis.ipynb: Jupyter Notebook containing Python code, analysis, and visualizations.
  • Dataset.csv: Hypothetical tourism dataset used for the analysis.

Conclusion

This project showcases the power of Python in analyzing and visualizing data to uncover meaningful trends. It’s an excellent demonstration of foundational data analytics techniques.

How to Run the Project

  1. Download the files.
  2. Open the Jupyter Notebook file (.ipynb) in Jupyter Notebook or Jupyter Lab.
  3. Ensure the required Python libraries are installed. Use the following command to install missing dependencies:
    • pip install numpy pandas matplotlib seaborn plotly
  4. Run the cells in the notebook sequentially to view the analysis and visualizations.
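
Before running the notebook, a quick check like the one below can report which of the required libraries are not yet installed. It is an optional helper sketched here for convenience, not part of the repository; the `missing_packages` function name is hypothetical.

```python
from importlib.util import find_spec

# Import names of the libraries listed in the pip command above
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "plotly"]

def missing_packages(names):
    """Return the names that cannot be located as importable modules."""
    return [name for name in names if find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```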

Acknowledgment

This project was part of the workshop "Data Analytics in Python" conducted by IIT (BHU).

Feedback

Feel free to provide feedback or suggestions! 😊
