How to Use robots.txt for SEO: Guide and Examples
Use a robots.txt file placed in your website's root directory to tell search engines which pages or folders they may crawl and which they should avoid. It uses simple rules: User-agent specifies the crawler, and Disallow blocks paths. This helps manage SEO by controlling which content crawlers can access.
Syntax
The robots.txt file uses simple lines to control web crawlers. Each section starts with User-agent to specify which crawler the rules apply to. Disallow tells the crawler which pages or folders it should NOT visit. Allow can be used to override disallow rules for specific paths. Comments start with #.
plaintext
User-agent: *
Disallow: /private/
Allow: /private/public-info.html
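You can check how a crawler would interpret rules like these with Python's standard-library urllib.robotparser. This is a sketch with a made-up domain; note that Python's parser applies rules in file order (first match wins), while Google applies the most specific (longest) matching rule, so the Allow line is listed first here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules mirroring the example above.
# Allow comes first because Python's parser uses first-match order.
rules = """\
User-agent: *
Allow: /private/public-info.html
Disallow: /private/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Blocked: falls under /private/
print(rp.can_fetch("*", "https://example.com/private/secret.html"))
# Allowed: matched by the Allow rule
print(rp.can_fetch("*", "https://example.com/private/public-info.html"))
# Allowed: no rule matches, so crawling defaults to allowed
print(rp.can_fetch("*", "https://example.com/index.html"))
```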
Example
This example blocks all crawlers from accessing the /admin/ folder but allows them to crawl everything else on the site.
plaintext
User-agent: *
Disallow: /admin/
Output
Search engines will not crawl any URL starting with /admin/ but will crawl all other pages.
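A quick way to confirm this behavior is to feed the same rules to urllib.robotparser and probe a few paths (the domain and paths below are illustrative):

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /admin/",
])

# Everything under /admin/ is blocked; all other paths remain crawlable.
for path in ("/admin/login", "/admin/", "/blog/post-1", "/"):
    allowed = rp.can_fetch("*", "https://example.com" + path)
    print(path, "->", "crawl" if allowed else "blocked")
```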
Common Pitfalls
- Placing robots.txt in the wrong folder, so crawlers cannot find it.
- Using Disallow: / unintentionally, which blocks the entire site.
- Expecting robots.txt to hide pages from search results; it only blocks crawling, and pages linked from elsewhere can still be indexed.
- Incorrect syntax, such as missing colons, which can cause rules to be ignored.
plaintext
Wrong:
User-agent *
Disallow /private/

Right:
User-agent: *
Disallow: /private/
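One way to catch the accidental "Disallow: /" pitfall before deploying is a small sanity check with urllib.robotparser (a sketch; the homepage URL is a placeholder):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content about to be deployed.
robots_txt = """\
User-agent: *
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# If even the homepage is blocked for all crawlers, the whole site is off-limits.
if not rp.can_fetch("*", "https://example.com/"):
    print("WARNING: robots.txt blocks the entire site")
```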
Quick Reference
| Directive | Purpose | Example |
|---|---|---|
| User-agent | Specifies which crawler the rules apply to | User-agent: Googlebot |
| Disallow | Blocks crawler from specified path | Disallow: /secret/ |
| Allow | Allows crawling of a path even if parent is disallowed | Allow: /secret/public.html |
| Sitemap | Specifies location of sitemap file | Sitemap: https://example.com/sitemap.xml |
| # | Adds a comment line | # This is a comment |
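Putting the directives from the table together, a complete robots.txt might look like the following (the domain and paths are placeholders):

```plaintext
# Rules for all crawlers
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /private/public-info.html

# Rules for Googlebot only
User-agent: Googlebot
Disallow: /staging/

Sitemap: https://example.com/sitemap.xml
```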
Key Takeaways
Place robots.txt in your website's root folder to control crawler access.
Use User-agent and Disallow directives to block or allow specific crawlers and paths.
robots.txt controls crawling but does not guarantee pages won't appear in search results.
Check syntax carefully to avoid blocking your entire site accidentally.
Use the Sitemap directive to help crawlers find your sitemap.