Robots.txt configuration in SEO Fundamentals - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is the purpose of a robots.txt file?
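A robots.txt file tells web robots (like search engine crawlers) which parts of a website they may or may not crawl. It controls crawling rather than indexing, so it limits what crawlers fetch but does not by itself guarantee a page stays out of search results.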
beginner
What does the User-agent directive specify in a robots.txt file?
The User-agent directive specifies which web robot the following rules apply to. For example, User-agent: Googlebot targets Google's crawler.
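As a rough illustration of how User-agent groups are matched, the sketch below uses Python's standard-library `urllib.robotparser`. The rules, the bot name `ExampleBot`, and the `example.com` URL are hypothetical, and real search engines may apply matching rules that differ from this parser's.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one group for Googlebot, one catch-all group.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/private/report.html"  # placeholder URL

# The Googlebot group applies to Googlebot; every other bot falls
# back to the "*" group, which disallows nothing.
googlebot_allowed = parser.can_fetch("Googlebot", url)
other_bot_allowed = parser.can_fetch("ExampleBot", url)

print(googlebot_allowed)  # False
print(other_bot_allowed)  # True
```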
beginner
How do you block all web crawlers from accessing your entire website using robots.txt?
You write:
User-agent: *
Disallow: /
This tells all robots not to visit any pages on the site.
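One way to sanity-check a block-all rule set is to feed it to Python's standard-library `urllib.robotparser` and ask whether a few crawlers may fetch a page. This is a minimal sketch: the agent names and the `example.com` URL are placeholders, and the result reflects this parser's reading of the rules, not any particular search engine's.

```python
from urllib.robotparser import RobotFileParser

# The two-line block-all rule set from the answer above.
block_all = RobotFileParser()
block_all.parse(["User-agent: *", "Disallow: /"])

page = "https://example.com/any/page.html"  # placeholder URL
agents = ("Googlebot", "Bingbot", "ExampleBot")  # ExampleBot is made up

# Every agent falls under "*", and every path starts with "/".
results = [block_all.can_fetch(agent, page) for agent in agents]
print(results)  # [False, False, False]
```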
beginner
What does the Disallow directive do in a robots.txt file?
The Disallow directive tells the specified user-agent which paths or pages it should NOT crawl.
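To see path-level Disallow rules in action, the following sketch checks a handful of paths against two Disallow lines using Python's standard-library `urllib.robotparser`. The paths, the `ExampleBot` agent name, and the `example.com` host are assumptions made for illustration. This parser treats each Disallow value as a plain path prefix.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules blocking two folders for every crawler.
path_rules = RobotFileParser()
path_rules.parse([
    "User-agent: *",
    "Disallow: /admin/",
    "Disallow: /tmp/",
])

paths = ["/", "/admin/login", "/tmp/cache.html", "/blog/post"]

# Map each path to whether the (made-up) ExampleBot may crawl it.
crawlable = {
    path: path_rules.can_fetch("ExampleBot", "https://example.com" + path)
    for path in paths
}

for path, allowed in crawlable.items():
    print(f"{path}: {'allowed' if allowed else 'blocked'}")
```

Paths under `/admin/` and `/tmp/` come back blocked, while everything else stays crawlable.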
intermediate
Can robots.txt prevent a page from being indexed if other sites link to it?
No. robots.txt only controls crawling. If other sites link to a blocked page, search engines might still index its URL, typically without any of the page's content.
What does User-agent: * mean in a robots.txt file?
A. It applies rules only to Googlebot
B. It allows all users to access the website
C. It blocks all users from the website
D. It applies rules to all web crawlers
How do you allow all web crawlers to access your entire website?
A. User-agent: *
   Disallow: /
B. User-agent: *
   Disallow:
C. User-agent: Googlebot
   Disallow: /
D. User-agent: *
   Allow: /private
Which directive blocks a specific folder from being crawled?
A. Disallow: /folder/
B. Allow: /folder/
C. User-agent: /folder/
D. Block: /folder/
If a page is blocked by robots.txt, can it still appear in search results?
A. Only if the page is on the homepage
B. No, it will never appear
C. Yes, if other sites link to it
D. Only if the page has a sitemap
Where should the robots.txt file be placed on a website?
A. In the root directory of the website
B. In the images folder
C. In the CSS folder
D. Anywhere on the website
Explain how a robots.txt file controls web crawler access to a website.
Think about how you tell robots where they can and cannot go.
Describe a scenario where blocking a page with robots.txt might not prevent it from appearing in search results.
Consider what happens if other sites link to a blocked page.