FEATURE: let Googlebot crawl pages so Google can remove them from its index

Google insists on being able to crawl a page before it will decide whether
the page can be removed from the index.

see: https://support.google.com/webmasters/answer/6332384?hl=en

This change ensures that we have special behavior for Googlebot:
we allow crawling, but block the actual indexing via the
X-Robots-Tag header.
This commit is contained in:
Sam Saffron
2020-05-11 12:14:21 +10:00
parent 4a74f18e95
commit bb4e8899c4
4 changed files with 18 additions and 2 deletions


@@ -1,4 +1,13 @@
# See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
#
# Googlebot must be allowed to index so it can remove items from the index
# we return the X-Robots-Tag with noindex, nofollow which will ensure
# indexing is minimized and nothing shows up in Google search results
User-agent: googlebot
Allow: <%= Discourse.base_uri + "/" %>
Disallow: <%= Discourse.base_uri + "/uploads/*" %>

User-agent: *
Disallow: <%= Discourse.base_uri + "/" %>
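
The robots.txt change above only opens the door for Googlebot to crawl; the commit message says the actual indexing is blocked by an `X-Robots-Tag` response header. As a minimal sketch of that mechanism (a hypothetical Rack middleware, not the actual Discourse implementation), the header could be attached to responses served to Googlebot like this:

```ruby
# Hypothetical sketch: allow Googlebot to crawl pages, but send
# "X-Robots-Tag: noindex, nofollow" so crawled pages are kept out
# of Google search results. Class and matching logic are illustrative.
class GooglebotNoindex
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    # Only tag responses for Googlebot; other agents are handled by robots.txt.
    if env["HTTP_USER_AGENT"].to_s.match?(/Googlebot/i)
      headers["X-Robots-Tag"] = "noindex, nofollow"
    end
    [status, headers, body]
  end
end
```

With a header like this in place, Googlebot can fetch a page, see the `noindex` directive, and drop it from the index, which is exactly why robots.txt must not fully block it.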