A reusable theme component to manage and generate the site's robots.txt.
- Features
- Demo
- Installation
- Configuration
- Compatibility
- Contributing
- Licensing, copyright
- Author information
- Automatically excludes all bots and crawlers in non-production environments by default.
- Sane and useful Disallow: defaults.
- Supports crawler-specific blocking and per-path exclusions.
- Automatically manages sitemap references:
  - If a sitemap is enabled (default: sitemap.xml), its URL is added to robots.txt.
  - If a sitemap is disabled or renamed, the reference is updated or omitted accordingly.
 
Clone the repository and run the included example content (requires Hugo, Go, and Git):
git clone https://github.com/foundata/hugo-component-robotstxt.git
cd ./hugo-component-robotstxt/exampleSite
HUGO_MODULE_WORKSPACE=hugo.work hugo server --ignoreVendorPaths "**"
Or look at the following pages using this theme component:
Add the following module path(s) to your theme: configuration:
theme:
  - "golang.foundata.com/hugo-component-robotstxt"
Hugo automatically fetches and imports theme module paths as Go/Hugo modules, so you do not need to list them under module.imports manually. Using modules requires Hugo, Go, and Git to be installed on your system.
From the root directory of your Hugo site, initialize a new Git repository (if you haven't already), then add the theme as a Git submodule:
git submodule add https://github.com/foundata/hugo-component-robotstxt.git themes/robotstxt
Now reference the theme directory name in your theme: configuration:
theme:
  - "robotstxt"
ℹ️ Heads-up: You have to set enableRobotsTXT: true (which is false by default) and make sure robotstxt is not listed in disableKinds (which should be OK by default). Otherwise, no robots.txt will be created.
Example:
# Enable generation of robots.txt file.
enableRobotsTXT: true
params:
  robotsTxt:
    # Block all user agents ("Disallow: /") in non-production environments.
    excludeNonProduction: true
    exclude:
      # Version control
      - "/.git/"
      # System and metadata dirs
      - "/.well-known/"
      # Log and temp files
      - "/*.log$"
      - "/*.tmp$"
      - "/*.bak$"
    excludeCrawlers:
      - "GPTBot" # OpenAI / ChatGPT indexing
      - "ChatGPT-User" # OpenAI / ChatGPT plugins, used for direct actions in the name of a ChatGPT user
This section documents the theme options you can place under params.robotsTxt in your Hugo configuration. The example configurations are safe to copy and paste. All keys are optional, and the theme falls back to sensible behavior unless otherwise noted.
- Type: Boolean.
- Default: true
- Purpose: When true, the template adds the following directive in non-production builds:
    User-agent: *
    Disallow: /
  Production detection is based on either:
  - hugo.IsProduction
  - .Site.Params.env == "production"
- Example (config):
    params:
      robotsTxt:
        excludeNonProduction: true
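The second detection criterion listed above can be met from the site configuration. The following is a minimal sketch, assuming the build does not already run in Hugo's production environment; it only combines keys named in this section and is not taken from the component's own example site:

```yaml
# Sketch: mark the site as production via a site parameter.
# Per the list above, params.env == "production" counts as a
# production signal alongside hugo.IsProduction.
params:
  env: "production"
  robotsTxt:
    # Keep the blanket "Disallow: /" for every non-production build.
    excludeNonProduction: true
```

Hugo itself selects the production environment for a plain hugo build and the development environment for hugo server, unless this is overridden with the --environment flag.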
- Type: List of strings.
- Default: ["/.git/", "/*.log$", "/*.tmp$", "/*.bak$", "/.well-known/"]
- Purpose:
  - List of path patterns.
  - Each entry becomes a Disallow: rule for all crawlers (User-agent: *).
- Example (config):
    params:
      robotsTxt:
        exclude:
          - "/download/"
          - "*.asc$"
  becomes the following in robots.txt:
    User-agent: *
    Disallow: /download/
    Disallow: *.asc$
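The * and $ characters in these patterns follow the wildcard conventions of the Robots Exclusion Protocol (RFC 9309): * matches any sequence of characters and $ anchors the end of the URL path. A minimal sketch with hypothetical paths (/private/ and the .pdf rule are illustrative only and not part of the component). Whether a custom list replaces or extends the defaults is not stated here, so the sketch makes no assumption about it:

```yaml
params:
  robotsTxt:
    exclude:
      # Hypothetical directory: blocks every URL below /private/.
      - "/private/"
      # "*" matches any characters and "$" anchors the end of the path,
      # so this rule targets URLs ending in ".pdf".
      - "/*.pdf$"
```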
- Type: List of strings.
- Default: [] (empty list)
- Purpose:
  - List of crawler user-agent names to exclude. Most companies publish the user-agent names of their crawlers.
  - Reminder: robots.txt is an advisory mechanism. It prevents compliant crawlers from fetching URLs, but does not protect sensitive files from direct access.
- Example (config): Each entry creates a crawler-specific block:
    params:
      robotsTxt:
        excludeCrawlers:
          - "ia_archiver"
          - "GPTBot"
  becomes the following in robots.txt:
    User-agent: ia_archiver
    Disallow: /

    User-agent: GPTBot
    Disallow: /
There is nothing to configure, but the component is aware of Hugo's sitemap configuration:
- By default, Hugo generates the sitemap as /sitemap.xml.
- If the sitemap is disabled (disableKinds = ["sitemap"]) or if sitemap.filename is set to an empty string, no Sitemap: line is emitted.
- If a custom filename is set (e.g. sitemap.filename = "mysite-map.xml"), the generated robots.txt will correctly reference it.
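In YAML form, matching the other configuration examples in this document, the two sitemap cases could look like the following sketch. It uses only Hugo's own sitemap.filename and disableKinds settings; the filename is the illustrative one from the list above:

```yaml
# Case 1: rename the sitemap. Per the description above, the generated
# robots.txt should then reference mysite-map.xml instead of sitemap.xml.
sitemap:
  filename: "mysite-map.xml"

# Case 2 (alternative): disable the sitemap entirely, so that no
# "Sitemap:" line is emitted. Uncomment to use instead of case 1.
# disableKinds:
#   - "sitemap"
```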
This project is compatible with Hugo (extended) ≥ v0.148.0 and should always work with the latest Hugo release (we usually run the latest Hugo ourselves and fix issues promptly). It has been tested at least with:
If your version isn't listed, it might still work. Just give it a try.
See CONTRIBUTING.md if you want to get involved.
This project's functionality is mature, so there might be little activity on the repository in the future. Don't be fooled by this: the project is under active maintenance and used daily by the maintainers.
Copyright (c) 2025 foundata GmbH (https://foundata.com)
This project is licensed under the GNU General Public License v3.0 or later (SPDX-License-Identifier: GPL-3.0-or-later), see LICENSES/GPL-3.0-or-later.txt for the full text.
The REUSE.toml file provides detailed licensing and copyright information in a human- and machine-readable format. This includes parts that may be subject to different licensing or usage terms, such as third-party components. The repository conforms to the REUSE specification. You can use reuse spdx to create an SPDX software bill of materials (SBOM).
This project was created and is maintained by foundata.