A simple, interactive web application that lets you upload one or more PDF documents and chat with their contents in natural language. It is built with Streamlit, LangChain, and OpenAI (or any other supported LLM) and turns your PDFs into conversational knowledge bases.
- PDF Upload: Upload one or more PDF files via the sidebar.
- Document Parsing & Chunking: Extracts text from the PDFs and splits it into manageable “chunks” for better context handling.
- Vector Embeddings & Search: Converts chunks into embeddings and stores them in a vector store (e.g. FAISS or Chroma) for fast semantic retrieval.
- Chat Interface: Ask questions in free form; the app finds the most relevant passages and generates answers grounded in them.
- Session State: Maintains chat history within your browser session so you can carry on a back-and-forth conversation.
- Customizable LLM: Swap in any supported LLM, such as OpenAI’s GPT models, Anthropic’s Claude, or a local model.
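The retrieval step behind the chat interface can be illustrated with a small, dependency-free sketch. It is not the app's code: the `embed`, `cosine`, and `top_k` names are hypothetical, and a bag-of-words count vector stands in for real embeddings and for a FAISS or Chroma index. The idea is the same, though: turn the query and each chunk into vectors, then return the chunks closest to the query.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "Invoices are due within 30 days of receipt.",
    "The warranty covers parts and labor for two years.",
    "Support is available by email on weekdays.",
]
best = top_k("How long is the warranty?", chunks, k=1)
print(best)
```

In the real app the retrieved chunks would be passed to the LLM as context; a learned embedding model also matches paraphrases, which simple word overlap cannot.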
- Frontend & UI: Streamlit
- PDF Processing: PyPDF2 or pypdf
- LangChain: prompt templates, text splitting, LLM interface
- Embeddings & Vector Store: FAISS, Chroma, or any LangChain-compatible vector store
- LLM Provider: OpenAI (via the openai SDK) by default
- Clone the repo
    git clone https://github.com/Rishi-Kora/PDF-Chatbot-using-streamlit.git
    cd PDF-Chatbot-using-streamlit
- Create & activate a virtual environment (optional, but recommended)
    python3 -m venv venv
    Linux / macOS:
        source venv/bin/activate
    Windows:
        venv\Scripts\activate
- Install dependencies
    pip install streamlit langchain openai pypdf faiss-cpu
- Set your OpenAI API key
    Linux / macOS:
        export OPENAI_API_KEY="your_api_key_here"
    Windows (Command Prompt; omit the quotes, which cmd would store as part of the value):
        set OPENAI_API_KEY=your_api_key_here
- Run the app
    streamlit run pdf_chatbot.py
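A missing API key usually shows up only when the first question is sent to the LLM. A small check at startup can surface the problem earlier. The helper below is a hypothetical sketch, not part of pdf_chatbot.py; the name `require_api_key` and its parameters are assumptions made for illustration.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the key from an environment mapping, or raise with a clear message."""
    # Default to the real process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; set it before running the app.")
    return key
```

Calling `require_api_key()` near the top of the script would stop the app with a readable error instead of failing on the first request.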
- Upload PDFs: Click the “Browse files” button in the sidebar and select one or more PDFs.
- Start chatting: Type your question in the chat input at the bottom. The app displays your query and the LLM’s response in a chat-style format.
- Continue the conversation: Ask follow-up questions; session state preserves the history.
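The way session state preserves history across follow-up questions can be sketched with a plain dictionary. Streamlit's `st.session_state` is dict-like, so the same pattern would apply there; a regular `dict` is used here so the example runs outside Streamlit. The `add_message` and `transcript` helpers and the `"messages"` key are assumptions for illustration, not names taken from the app.

```python
from typing import MutableMapping


def add_message(state: MutableMapping, role: str, content: str) -> None:
    """Append one chat turn to the history kept under state['messages']."""
    if "messages" not in state:
        state["messages"] = []  # first turn of the session
    state["messages"].append({"role": role, "content": content})


def transcript(state: MutableMapping) -> str:
    """Render the stored history as one 'role: content' line per turn."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in state.get("messages", []))


state = {}  # stand-in for st.session_state
add_message(state, "user", "What is the refund policy?")
add_message(state, "assistant", "Refunds are issued within 14 days.")
add_message(state, "user", "Does that include weekends?")
print(transcript(state))
```

Because every turn is appended to the same list, a follow-up question can be sent to the LLM together with the earlier turns.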
- pdf_chatbot.py
  - Imports and configures Streamlit, LangChain, and the vector store
  - Defines helper functions:
    - load_pdfs(files): Reads and concatenates text from the uploaded PDFs
    - split_into_chunks(text): Splits the text using LangChain’s text splitter
    - create_vector_store(chunks): Builds or loads a FAISS/Chroma index
    - ask_question(query): Retrieves relevant chunks and calls the LLM
  - Sets up the Streamlit UI: sidebar for uploads, main chat window, custom CSS
- LICENSE: MIT License, free to use and modify.
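To make the chunking step concrete, here is a minimal pure-Python sketch of what a helper like split_into_chunks might do. The app itself uses LangChain's text splitter; this stand-in only illustrates fixed-size windows with overlap. The `chunk_size` and `overlap` parameters and their defaults are assumptions, not values taken from pdf_chatbot.py.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence cut at
    a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


demo = split_into_chunks("abcdefghij", chunk_size=4, overlap=1)
print(demo)  # ['abcd', 'defg', 'ghij']
```

A character-based window is the simplest option; LangChain's splitters additionally try to break on paragraph and sentence boundaries, which tends to give more coherent chunks.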
PDF-Chatbot-using-streamlit/
├── LICENSE
├── pdf_chatbot.py # Main Streamlit application
└── README.md # This file
- Fork the repository
- Create a feature branch (git checkout -b feature/YourIdea)
- Commit your changes (git commit -m "Add some feature")
- Push to your branch (git push origin feature/YourIdea)
- Open a Pull Request; we’d love to see what you build!
Feel free to open an issue or reach out to me at korarishi@gmail.com.
This project is licensed under the MIT License. See LICENSE for details.