A comprehensive text summarization system that condenses lengthy news articles and blogs into concise summaries using both extractive and abstractive methods. Built with spaCy, BERT, and GPT via HuggingFace Transformers, the project explores the strengths of traditional and modern NLP techniques.
📄 Includes a detailed Report.pdf and explanatory ProjectVideo.mp4 for academic presentation or documentation.
📦 Repository: https://github.com/ahsankhizar5/text-summarization-cnn-dailymail.git
- 📚 Preprocessing of large-scale text data
- ✂️ Extractive summarization using spaCy
- 🤖 Abstractive summarization using BERT and GPT via HuggingFace
- 🎯 Fine-tuning transformer models for improved output
- 🧪 Evaluation of summaries on real-world content
- 📄 Includes Report & Presentation Video
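To make the extractive step concrete, here is a minimal, dependency-free sketch of frequency-based sentence scoring, a common extractive approach. It is not taken from Code.ipynb: the regex tokenizer, the tiny stop-word list, and the names `split_sentences` and `extractive_summary` are illustrative assumptions that stand in for spaCy's tokenizer and sentence segmentation.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; spaCy ships a much fuller one).
STOP_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "to", "in", "on", "for",
    "is", "are", "was", "were", "it", "this", "that", "with", "as", "by", "at", "be",
}


def content_words(text):
    """Lowercase the text and return its words, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))
    if not freq:
        return ""

    def score(sentence):
        # Average document-level frequency of the sentence's content words.
        tokens = content_words(sentence)
        return sum(freq[w] for w in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Averaging by sentence length keeps long sentences from winning on word count alone. Swapping the regex helpers for spaCy's `doc.sents` and token attributes would likely give better sentence boundaries on real news text.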
git clone https://github.com/ahsankhizar5/text-summarization-cnn-dailymail.git
cd text-summarization-cnn-dailymail
pip install spacy transformers datasets torch nltk
python -m nltk.downloader punkt
✅ Use Python 3.7 or later for compatibility with HuggingFace. Recent releases of transformers may require a newer Python version.
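A quick way to confirm the setup before opening the notebook is to check the interpreter version and whether each installed package can be found. The sketch below uses only the standard library; the function name `check_environment` is a hypothetical helper, and the package list simply mirrors the `pip install` line above.

```python
import sys
from importlib.util import find_spec

# Top-level module names matching the pip install command in this README.
REQUIRED = ["spacy", "transformers", "datasets", "torch", "nltk"]


def check_environment(min_python=(3, 7)):
    """Report whether Python is new enough and which packages are importable."""
    report = {"python_ok": sys.version_info >= min_python}
    for name in REQUIRED:
        # find_spec returns None when a top-level module is not installed.
        report[name] = find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'MISSING'}")
```

This only checks that the packages are present. It does not verify versions or that model weights can be downloaded.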
Open the Code.ipynb notebook and follow the cells step-by-step to run the extractive and abstractive summarization pipelines.
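For readers who want to try the abstractive side outside the notebook, the following is a rough sketch of zero-shot summarization with GPT-2 by appending a "TL;DR:" prompt. It is an assumption about one possible approach, not the notebook's confirmed method. The helper names `load_gpt2_generator` and `tldr_summary`, the `gpt2` checkpoint, and the word-based truncation are all illustrative choices.

```python
def load_gpt2_generator(model_name="gpt2"):
    """Create a HuggingFace text-generation pipeline (downloads weights on first use)."""
    # Imported lazily so tldr_summary can be used and tested without transformers.
    from transformers import pipeline

    return pipeline("text-generation", model=model_name)


def tldr_summary(article, generator, max_words=400, max_new_tokens=60):
    """Summarize by prompting a text-generation callable with 'TL;DR:'."""
    # Rough guard for GPT-2's 1024-token context: keep only the first max_words words.
    truncated = " ".join(article.split()[:max_words])
    prompt = truncated + "\nTL;DR:"

    # Greedy decoding (do_sample=False) keeps the output deterministic.
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    generated = outputs[0]["generated_text"]

    # The pipeline returns prompt + continuation by default; keep only the continuation.
    if generated.startswith(prompt):
        generated = generated[len(prompt):]
    return generated.strip()


# Example usage (requires transformers and torch, and downloads the model):
#   generator = load_gpt2_generator()
#   print(tldr_summary(long_article_text, generator))
```

A pre-trained GPT-2 used this way tends to produce loose, sometimes off-topic summaries, which is one reason fine-tuning is worth exploring.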
- Python
- spaCy – Extractive summarization
- HuggingFace Transformers – Abstractive summarization
- BERT, GPT-2 – Pre-trained language models
- NLTK, PyTorch – NLP and deep learning backends
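The feature list mentions evaluating summaries, but this README does not name a metric. As an assumption, the sketch below computes ROUGE-1 (unigram overlap), a metric commonly used for summarization, in plain Python. The function name `rouge1` and the simple `\w+` tokenization are illustrative. An established ROUGE package would be the better choice for reported results.

```python
import re
from collections import Counter


def rouge1(candidate, reference):
    """Unigram-overlap precision, recall and F1 between two texts."""
    cand = Counter(re.findall(r"\w+", candidate.lower()))
    ref = Counter(re.findall(r"\w+", reference.lower()))

    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge1("the cat sat", "the cat sat on the mat")
    print(scores)
```

Recall shows how much of the reference summary is covered, and precision shows how much of the generated summary is on target. Short extractive summaries often score high on precision and low on recall.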
├── Code.ipynb
├── Report.pdf
└── ProjectVideo.mp4
- Fork the repo
- Create a branch
  git checkout -b feature/your-feature
- Commit your changes
  git add .
  git commit -m "Add your feature"
- Push and submit a PR
  git push origin feature/your-feature
MIT License — free to use, modify, and distribute.
If this project helped you, inspired you, or saved you time — consider giving it a ⭐ on GitHub!
🧠 "In a world full of information, clarity is power."