ToolPrime

robots.txt for GitHub Pages

A robots.txt template for GitHub Pages sites and documentation, with a simple configuration suited to static hosting.

robots.txt

User-agent: *
Disallow: /404.html
Allow: /

Sitemap: https://username.github.io/repo/sitemap.xml

Line-by-Line Explanation

User-agent: * — applies to all crawlers

Disallow: /404.html — prevents the custom 404 page from being indexed

Allow: / — allows all content pages to be crawled

Sitemap — declares the sitemap location (update the URL to match your site)
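The rules above can be sanity-checked with Python's standard-library robots.txt parser. A minimal sketch, using the template's placeholder URLs (not real endpoints):

```python
import urllib.robotparser

# The template's rules, as they would be served from the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /404.html
Allow: /
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# /404.html is blocked for every crawler; regular pages stay crawlable.
print(parser.can_fetch("*", "https://username.github.io/404.html"))    # False
print(parser.can_fetch("*", "https://username.github.io/index.html"))  # True
```

Rules are matched as path prefixes in order, so the more specific `Disallow: /404.html` wins over the broad `Allow: /` for the 404 page only.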


Frequently Asked Questions

Where do I put robots.txt for GitHub Pages?
Place it in the root of your publishing source (the main or gh-pages branch, or the /docs folder if you configured that as the source). It will be served at yourdomain/robots.txt. Note that crawlers only honor robots.txt at the domain root, so this works for user sites (username.github.io) and custom domains; a project site served under a subpath cannot control crawling with its own robots.txt.
Does GitHub Pages need a complex robots.txt?
No. GitHub Pages sites are static HTML and inherently crawl-friendly. A minimal robots.txt with a sitemap reference is all you need.
How do I create a sitemap for GitHub Pages?
If using Jekyll, the jekyll-sitemap plugin generates one automatically. For other static generators, use their respective sitemap plugins. You can also create sitemap.xml manually.
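For Jekyll, enabling the plugin is a one-line addition to _config.yml. A minimal sketch (jekyll-sitemap is among the plugins GitHub Pages supports by default, but verify it against your Gemfile setup):

```yaml
# _config.yml — enable automatic sitemap.xml generation
plugins:
  - jekyll-sitemap
```

On the next build, the plugin writes sitemap.xml to the site root, which is the URL your robots.txt Sitemap line should reference.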
