robots.txt for GitHub Pages
A robots.txt template for GitHub Pages sites and documentation: a simple configuration for static hosting.
robots.txt
User-agent: *
Disallow: /404.html
Allow: /
Sitemap: https://username.github.io/repo/sitemap.xml
Line-by-Line Explanation
User-agent: * — applies to all crawlers
Disallow: /404.html — prevents the custom 404 page from being indexed
Allow: / — allows all content pages to be crawled
Sitemap — tells crawlers where to find the sitemap (replace the example URL with your own)
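The rules above can be sanity-checked locally with Python's standard-library robots.txt parser. A minimal sketch, using the placeholder URLs from the template:

```python
# Verify the template's rules with the stdlib robots.txt parser.
from urllib import robotparser

ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /404.html",
    "Allow: /",
]

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_LINES)

# The custom 404 page is blocked; everything else is crawlable.
print(rp.can_fetch("*", "https://username.github.io/404.html"))   # False
print(rp.can_fetch("*", "https://username.github.io/index.html")) # True
```

This is handy for testing rule changes before pushing them to the live site.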
Best Practices for GitHub Pages
- ✓ Place robots.txt in the repository root (or /docs folder if using that as the source).
- ✓ GitHub Pages serves static files directly — robots.txt works out of the box.
- ✓ Use Jekyll, Hugo, or a CI script to auto-generate sitemap.xml.
- ✓ Custom domains require updating the sitemap URL to match your domain.
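The CI-script option above can be as small as a few lines of Python. A hedged sketch, assuming the built site lives in a local output directory and using the placeholder base URL from the template (names here are illustrative, not a standard tool):

```python
# Minimal sitemap generator for a built static site (sketch).
from pathlib import Path

def generate_sitemap(site_dir: str, base_url: str) -> str:
    """Return sitemap.xml content listing every .html page under site_dir."""
    entries = []
    for page in sorted(Path(site_dir).rglob("*.html")):
        rel = page.relative_to(site_dir).as_posix()
        if rel == "404.html":  # keep the error page out of the sitemap
            continue
        entries.append(f"  <url><loc>{base_url}/{rel}</loc></url>")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(entries)
        + "\n</urlset>\n"
    )
```

Run it after the build step (for example against `_site/` with `https://username.github.io/repo` as the base URL) and write the result to `sitemap.xml` in the published output.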
Frequently Asked Questions
Where do I put robots.txt for GitHub Pages?
Place it in the root of your GitHub Pages source branch (main or gh-pages) or in the /docs folder if you configured that as the source. It will be served at yourdomain/robots.txt.
Does GitHub Pages need a complex robots.txt?
No. GitHub Pages sites are static HTML and inherently crawl-friendly. A minimal robots.txt with a sitemap reference is all you need.
How do I create a sitemap for GitHub Pages?
If using Jekyll, the jekyll-sitemap plugin generates one automatically. For other static generators, use their respective sitemap plugins. You can also create sitemap.xml manually.
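For Jekyll (the generator GitHub Pages runs by default), enabling the plugin is a one-line addition to `_config.yml`; a minimal sketch:

```yaml
# _config.yml: enable automatic sitemap generation
plugins:
  - jekyll-sitemap
```

jekyll-sitemap is on GitHub Pages' list of supported plugins, so no extra build setup is required.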
Related Templates
- Gatsby — robots.txt template for Gatsby static sites. Optimized for the static site generation build process.
- Astro — robots.txt template for Astro framework sites. Minimal configuration needed thanks to the static-first architecture.
- React (Single Page Application) — robots.txt template for client-side React applications. Handles build artifacts and ensures proper crawling of SPA routes.