Mrityunjay Pathak TheMrityunjayPathak

About     Skills     Projects     Certificates     Blogs

About

Hello! My name is Mrityunjay Pathak.

I'm a data scientist passionate about building real-world, end-to-end data solutions - from data analysis and dashboards to machine learning and deployment. I love creating projects that don't just stay in notebooks, but live on the internet - making them interactive, accessible and valuable for everyone.

Some projects I've worked on :

AutoIQ : Car Price Prediction

Built a car price prediction system using FastAPI and Docker, trained on 2,800+ scraped car records from Cars24.

Deployed an interactive HTML/CSS/JS application on GitHub Pages that connects to the API, allowing users to get real-time price predictions.

Pickify : Movie Recommender System

Built a content-based movie recommender system using metadata from 5,000+ movies.

Integrated the TMDB API to fetch and display movie posters dynamically, delivering a personalized user experience.

Dashly : Live Sales Dashboard

Built a live Power BI dashboard connected to a Neon PostgreSQL database, containing 50,000+ sales records.

Developed an automated ETL pipeline using GitHub Actions to collect and ingest data daily, keeping the dashboard continuously updated with the latest insights.

Tools and Technologies I've worked with :

Programming Language : Python

Libraries : NumPy, Pandas, Matplotlib, Seaborn, Plotly

Machine Learning : Scikit-learn

Database : MySQL, PostgreSQL

BI Tool : Power BI

Web Framework : FastAPI

Containerization : Docker

Version Control : Git

Automation : GitHub Actions

I'm currently looking for opportunities in Data Science/Data Analytics, where I can contribute to building data-driven solutions that create measurable business impact.

If you're looking for someone who's eager to learn, collaborate and deliver results, I'd love to connect and explore how I can add value to your team.

📫 Connect with Me

Kaggle  |  LinkedIn  |  GitHub  |  Medium  |  Portfolio

Skills

Projects

AutoIQ : Car Price Prediction



➔ Problem

In the used car market, buyers and sellers often struggle to determine a fair price for a vehicle.

This project aims to provide accurate and transparent pricing for used cars by analyzing real-world data.

➔ Solution

Built and deployed an end-to-end machine learning pipeline to predict used car prices from real-world data.

Collected and cleaned 2,800+ used car records from Cars24 using Selenium and BeautifulSoup.

Optimized memory consumption of the dataset by downcasting data types and converting to Parquet format.

Trained models with Scikit-learn Pipelines and ColumnTransformer so that preprocessing is fit only on the training data, avoiding data leakage.

Deployed the machine learning model as an API using FastAPI on Render.

Built an HTML/CSS/JS frontend hosted on GitHub Pages to interact with the API and display predictions in real time.

Containerized the entire application using Docker and pushed to Docker Hub for reproducibility.
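The leakage-safe training step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names, the toy values and the choice of `Ridge` as the regressor are all assumptions made for the example.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy stand-in for the scraped records; columns and values are hypothetical.
cars = pd.DataFrame({
    "km_driven": [42000, 15000, 88000, 60000, 23000, 71000, 35000, 52000],
    "year": [2017, 2021, 2013, 2016, 2020, 2014, 2018, 2017],
    "fuel_type": ["Petrol", "Diesel", "Petrol", "Diesel",
                  "Petrol", "CNG", "Petrol", "Diesel"],
    "price": [4.5, 8.2, 2.1, 4.0, 7.1, 2.6, 5.3, 4.8],  # illustrative units
})

X = cars.drop(columns="price")
y = cars["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scale numeric columns and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["km_driven", "year"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["fuel_type"]),
])

# Because preprocessing lives inside the pipeline, its statistics are
# learned from the training split only when fit() is called.
model = Pipeline([("preprocess", preprocess), ("regressor", Ridge(alpha=1.0))])
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```

Keeping the scaler and encoder inside the pipeline means the same fitted object can be serialized and served behind an API without re-implementing the preprocessing by hand.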

➔ Results

Reduced dataset memory usage by 90% using optimized storage techniques.

Achieved a 30% lower MAE and a 12% higher R² score compared to the baseline model.

Improved model stability by 70%, resulting in more consistent predictions.
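The memory reduction reported above comes from downcasting data types. A minimal sketch of that idea with pandas is shown below; the `downcast` helper, the column names and the toy data are hypothetical, and the savings on a real dataset will differ.

```python
import numpy as np
import pandas as pd

def downcast(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with smaller numeric dtypes and categorical text columns."""
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="float")
        elif (pd.api.types.is_object_dtype(out[col])
              or pd.api.types.is_string_dtype(out[col])):
            # Repeated strings are stored once when the column is categorical.
            out[col] = out[col].astype("category")
    return out

# Hypothetical sample data with default 64-bit and text columns.
raw = pd.DataFrame({
    "year": np.array([2015, 2018, 2020] * 1000, dtype="int64"),
    "km_driven": np.array([42000.0, 15500.0, 88000.0] * 1000, dtype="float64"),
    "fuel_type": ["Petrol", "Diesel", "Petrol"] * 1000,
})

small = downcast(raw)
before = raw.memory_usage(deep=True).sum()
after = small.memory_usage(deep=True).sum()
print(f"Memory: {before} bytes -> {after} bytes")
```

The downcasted frame can then be written with `DataFrame.to_parquet`, which needs a Parquet engine such as `pyarrow` installed.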

➔ Impact

Helps car owners quickly find the right selling price for their vehicles based on real-world data.

Makes it easier for buyers to know if a car is fairly priced before making a purchase.

Pickify : Movie Recommender System



➔ Problem

With the rise of streaming services, viewers now have access to thousands of movies across platforms.

As a result, many viewers spend more time browsing than actually watching.

This problem can lead to frustration, lower satisfaction and less time spent on the platform.

Ultimately, this impacts both user experience and business performance.

➔ Solution

Built a content-based movie recommender system trained on 5,000+ movie metadata records.

Recommends the top 5 similar titles for any selected movie in about 2.5 seconds.

Integrated the TMDB API to dynamically fetch and display movie posters, enhancing user experience.

Deployed as a Streamlit web app, used by 100+ users to discover personalized movie suggestions.
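The core idea of a content-based recommender can be sketched with the standard library alone. This is an illustration under assumptions, not the project's implementation: it uses a simple bag-of-words cosine similarity, and the catalog, the placeholder titles and the tags are all made up for the example.

```python
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Turn a tag string into a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def recommend(title: str, catalog: dict, top_n: int = 5) -> list:
    """Return the top_n titles whose tags are most similar to the given title."""
    target = vectorize(catalog[title])
    scores = [
        (other, cosine(target, vectorize(tags)))
        for other, tags in catalog.items()
        if other != title
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scores[:top_n]]

# Hypothetical catalog: placeholder titles mapped to metadata tags.
catalog = {
    "Movie A": "space adventure alien crew",
    "Movie B": "space alien invasion war",
    "Movie C": "romance paris comedy",
    "Movie D": "comedy wedding romance family",
    "Movie E": "space crew survival adventure",
    "Movie F": "war history drama",
    "Movie G": "family animation adventure",
}

top_five = recommend("Movie A", catalog)
print(top_five)
```

With a few thousand titles, precomputing the full similarity matrix once and looking up rows at request time is a common way to keep each recommendation fast.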

➔ Impact

If scaled and integrated with a streaming service, this system could :

Reduce the time users spend choosing what to watch.

Increase user engagement, watch time and customer satisfaction.

Help streaming platforms retain users by offering better personalized content.

Netflix Data Analysis



➔ Problem Statement

To analyze Netflix content data, uncovering valuable insights into how the platform evolves over time.

➔ Some Key Findings

Cleaned and analyzed a dataset of 8,000+ Netflix Movies and TV Shows.

More than 60% of the content on Netflix is rated for mature audiences.

Suggests that Netflix targets adult viewers to boost engagement and retention.

More than 25% of the Movies and TV Shows were released on the 1st day of the month.

Shows a consistent release schedule, likely aligned with subscription renewal cycles.

More than 40% of the content on Netflix is exclusive to the United States.

Shows a strong focus on the U.S. market and content availability by location.

More than 20% of the content on Netflix falls under the "Drama" genre.

Confirms that "Drama" is a key part of Netflix's content library.

More than 23% of the content on Netflix was released in 2019 alone.

Indicates a major content push that year, possibly tied to growth or user acquisition efforts.
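Findings like the ones above reduce to simple share calculations. As a small sketch, the snippet below computes the percentage of titles dated on the 1st day of a month. The `share_on_first_day` helper, the "Month D, YYYY" date format and the sample dates are assumptions for illustration, not values from the dataset.

```python
from datetime import datetime

def share_on_first_day(dates: list[str]) -> float:
    """Percentage of non-empty date strings that fall on day 1 of a month."""
    parsed = [
        datetime.strptime(d.strip(), "%B %d, %Y")
        for d in dates
        if d and d.strip()
    ]
    if not parsed:
        return 0.0
    first_day_count = sum(1 for d in parsed if d.day == 1)
    return 100 * first_day_count / len(parsed)

# Hypothetical sample; empty strings stand in for missing values.
sample_dates = [
    "January 1, 2020",
    "March 15, 2019",
    "",
    "July 1, 2018",
    "November 30, 2019",
]

share = share_on_first_day(sample_dates)
print(f"{share:.1f}% of titles fall on the 1st day of the month")
```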

Supermarket Sales Analysis



➔ Problem Statement

To analyze Supermarket Sales data, identifying key factors for improving profitability and operational efficiency.

➔ Some Key Findings

Analyzed purchasing patterns of 9,000+ customers of a Supermarket.

More than 15% of the products sold were Snacks.

Shows that Snacks are a convenient choice and a major source of revenue.

More than 32% of total sales came from the West region of the Supermarket.

Suggests that the West region performs strongly compared to the others.

Health and Soft drinks were the most profitable sub-categories in Beverages.

Shows that both types of drinks perform well among customers.

November was the most profitable month, contributing about 15% of the total annual profit.

Makes it an ideal time for running promotions and special offers.

Certificates



Blogs
