Effortlessly package your entire codebase for analysis by Large Language Models. This command-line tool recursively reads a directory, respects your .gitignore
rules, and formats the contents into a single, clean text block that you can paste directly into prompts for models like GPT-5, Claude, or Gemini.
Imagine this project structure:
.
├── assets
│   ├── css
│   │   └── style.css
│   ├── js
│   │   └── app.js
│   └── logo.png
├── vendor
│   └── some-library
│       └── file.php
├── .gitignore
└── index.php
Running the tool is as simple as:
$ php llm-format.php --use-gitignore --copy
Skipping binary file: assets/logo.png
Formatted content has been copied to the clipboard.
Now your clipboard contains the formatted contents of index.php, assets/css/style.css, and assets/js/app.js, ready for your LLM. The vendor directory (excluded via .gitignore) and the binary logo.png were skipped automatically.
- 📁 Recursive Scanning: Traverses your entire project directory structure automatically.
- 🚫 Smart Ignoring:
  - Natively supports .gitignore rules to exclude irrelevant files and directories.
  - Allows for additional custom ignore patterns via the command line (e.g., dist/*, *.log).
- 🧩 MIME Type Detection: Includes the MIME type for each file, giving the LLM better context.
- 💨 Skips Binaries: Intelligently detects and skips binary files (like images, executables, etc.) to keep the output clean.
- 📋 Clipboard Integration: Uses the powerful OSC 52 escape sequence to copy the output directly to your system clipboard, even over an SSH connection!
- 💻 Cross-Platform: Written in PHP, it runs anywhere you have the PHP CLI and standard command-line tools.
When you need an LLM to understand or work with a multi-file project, you face several challenges:
- Manual Labor: Opening each file, copying its content, and pasting it into the prompt is tedious and error-prone.
- Loss of Context: You have to manually add file paths to tell the LLM which code belongs to which file.
- Clutter: You might accidentally include build artifacts, dependencies (node_modules, vendor), or local configuration files (.env), which wastes tokens and confuses the model.
This tool solves all three problems by creating a single, perfectly formatted text block representing your project's context in seconds.
- PHP (version 7.4 or newer is recommended).
- file command-line utility: This tool is used for reliable binary file detection. It's pre-installed on virtually all Linux and macOS systems. For Windows users, it's available through Git for Windows (included in Git Bash) or WSL.
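To make the role of the file utility concrete, here is a minimal sketch of binary detection and MIME lookup from the shell. The function names is_binary and mime_type are hypothetical, and this is an assumption about the general approach, not necessarily how llm-format.php invokes file internally.

```shell
# Hypothetical helpers; llm-format.php may call `file` differently.

# Succeeds (exit status 0) when `file` reports the content encoding as "binary".
is_binary() {
  [ "$(file --brief --mime-encoding -- "$1")" = "binary" ]
}

# Prints the MIME type of a file, e.g. text/x-php or image/png.
mime_type() {
  file --brief --mime-type -- "$1"
}
```

For example, `is_binary assets/logo.png && echo "Skipping binary file: assets/logo.png"` would mirror the message shown in the sample run.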
- Clone the repository:
    git clone https://github.com/arthurdick/llm-formatter.git
    cd llm-formatter
- Make the script executable:
    chmod +x llm-format.php
- (Recommended) Add to your PATH: For easy access from anywhere, move the script to a directory in your system's PATH, or add the script's directory to PATH in your shell's configuration file (e.g., ~/.bashrc, ~/.zshrc).
    # Example: move to /usr/local/bin
    sudo mv llm-format.php /usr/local/bin/llm-format
    # Now you can run it from any directory
    llm-format --help
The script is controlled via command-line options.
php llm-format.php [options]
Short | Long | Description | Default |
---|---|---|---|
-d | --dir <path> | The directory to process. | Current directory |
-g | --use-gitignore | Exclude files and directories matched by the .gitignore files it finds. | Disabled |
-i | --ignore <csv> | A comma-separated list of glob patterns to ignore (e.g., "*.log,build/*"). | None |
-c | --copy | Copy the output to the system clipboard instead of printing to the terminal. | Disabled |
-h | --help | Display the help message. | - |
1. Process the current directory and print to terminal:
php llm-format.php
2. Process a specific directory, respecting .gitignore, and copy to clipboard:
php llm-format.php -d ./my-project -g -c
3. Process the current directory, ignoring log files and the dist folder:
php llm-format.php --ignore "*.log,dist/*"
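To illustrate how a comma-separated list of glob patterns can be applied to a path, here is a minimal POSIX shell sketch. The function name matches_ignore is hypothetical, and the matching rules shown (a `*` that also matches `/`) are an assumption; the tool's own pattern semantics may differ.

```shell
# Hypothetical helper: returns 0 if PATH matches any glob in the
# comma-separated LIST, and 1 otherwise.
matches_ignore() {
  path=$1
  list=$2
  while [ -n "$list" ]; do
    pattern=${list%%,*}            # take the first pattern
    case "$list" in
      *,*) list=${list#*,} ;;      # drop it along with its comma
      *)   list= ;;                # that was the last pattern
    esac
    # An unquoted $pattern is treated as a glob by `case`;
    # here `*` matches any characters, including `/`.
    case "$path" in
      $pattern) return 0 ;;
    esac
  done
  return 1
}
```

With the list "*.log,dist/*", both debug.log and dist/app.js match, while src/index.php does not.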
The generated output is formatted for clarity, with clear separators for each file.
--- BEGIN FILE: src/index.php (MIME: text/x-php) ---
<?php
require_once 'helpers.php';
echo render_page("Welcome!");
--- END FILE: src/index.php ---
--- BEGIN FILE: src/helpers.php (MIME: text/x-php) ---
<?php
function render_page(string $title): string {
    return "<html><head><title>{$title}</title></head></html>";
}
--- END FILE: src/helpers.php ---
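Because every file is introduced by a predictable BEGIN FILE line, the output is easy to inspect before pasting it into a prompt. As a sketch, the hypothetical helper below (list_packed_files is not part of the tool) prints only the paths that were included, assuming the marker format is exactly as shown above.

```shell
# Hypothetical helper: reads formatted output on stdin and prints
# the path from each "--- BEGIN FILE: <path> (MIME: <type>) ---" line.
list_packed_files() {
  sed -n 's/^--- BEGIN FILE: \(.*\) (MIME: .*) ---$/\1/p'
}
```

Usage: `php llm-format.php -g | list_packed_files`. A source file that itself contains a line in the marker format would show up as an extra entry, so treat the result as a quick check only.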
The --copy feature uses the OSC 52 terminal escape sequence. This is a modern and secure way to access the system clipboard that works seamlessly, even over remote SSH sessions.
For it to work, you must be using a compatible terminal emulator, such as:
- iTerm2
- Kitty
- WezTerm
- Windows Terminal
- Alacritty
Older terminals may not support this feature. If --copy doesn't work, you can always pipe the output to your system's clipboard command:
# macOS
php llm-format.php | pbcopy
# Linux (requires xclip)
php llm-format.php | xclip -selection clipboard
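For the curious, an OSC 52 copy request is the clipboard data, base64-encoded and wrapped in an escape sequence that the terminal emulator interprets. The sketch below builds that sequence in POSIX shell; the function name osc52_sequence is hypothetical, and this is an illustration of the sequence format, not necessarily how llm-format.php produces it.

```shell
# Hypothetical helper: reads data on stdin and writes an OSC 52
# "set clipboard" sequence to stdout.
#   ESC ] 52 ; c ; <base64 data> BEL
# The "c" selects the system clipboard.
osc52_sequence() {
  # Some base64 implementations wrap long output, so strip newlines.
  payload=$(base64 | tr -d '\n')
  printf '\033]52;c;%s\a' "$payload"
}
```

Usage: `php llm-format.php | osc52_sequence`, run in a terminal that supports OSC 52. Because the terminal emulator on your local machine handles the sequence, this also works when the command runs on a remote host over SSH.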
This project is licensed under the MIT License. See the LICENSE file for details.