WebFaction

We have www.ourdomain.com set up with round-robin DNS to serve our Django application from several servers. We also have www1.ourdomain.com, www2.ourdomain.com, www3.ourdomain.com, etc., so that each server can be accessed directly (useful for debugging when one server in the DNS pool stops responding).

This has worked well for us, but recently the numbered subdomains have started showing up in Google search results.

I want to serve a different robots.txt depending on which domain is requested, so that Googlebot will not crawl the numbered subdomains. I would like to do this in Apache's httpd.conf, but I am not sure what the best way is.

asked 17 May '11, 01:26

Jesse


You should be able to do this using mod_rewrite, the %{HTTP_HOST} variable, and [P] (proxy) redirection. For example, something like:

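RewriteEngine on
# Match any of the numbered hosts; the last condition carries no OR flag
RewriteCond %{HTTP_HOST} ^www1\.ourdomain\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^www2\.ourdomain\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^www3\.ourdomain\.com$ [NC]
# Proxy robots.txt requests on those hosts to the no-crawl version
RewriteRule ^/robots\.txt$ /robots_nocrawl.txt [P,L]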

Then have your application serve both /robots.txt and /robots_nocrawl.txt.

What this effectively says is: check the incoming host name case-insensitively (NC = "no case"). If the requested host is www1.ourdomain.com, OR www2.ourdomain.com, OR www3.ourdomain.com, AND the request is for the file /robots.txt (in URL space), then serve /robots_nocrawl.txt (in URL space) instead, using proxy redirection so that the client still sees "/robots.txt" and it doesn't look like any magic is happening.

Hope that helps!

Note: you could use a regular expression instead of explicitly using www1, www2, etc, but that would have made the example less readable.
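For reference, the regular-expression version might look something like the sketch below. It assumes your numbered hosts all follow the www<digits>.ourdomain.com pattern and that clients do not send a port in the Host header; adjust the pattern if either assumption does not hold for your setup.

```apache
RewriteEngine on
# One condition covers www1, www2, www3, ... (any run of digits after "www")
RewriteCond %{HTTP_HOST} ^www[0-9]+\.ourdomain\.com$ [NC]
# Same rule as before: proxy robots.txt to the no-crawl version
RewriteRule ^/robots\.txt$ /robots_nocrawl.txt [P,L]
```

Since plain www.ourdomain.com has no digits after "www", it does not match the condition and keeps serving the normal /robots.txt.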


answered 17 May '11, 01:43

ryans ♦♦

edited 17 May '11, 01:44

Looks good. However, is mod_proxy compiled into the standard httpd that WebFaction provides, or can it be loaded in httpd.conf via a LoadModule directive?

(17 May '11, 02:29) Jesse

No, the proxy modules aren't automatically available in the Django or mod_wsgi installers, but they are already on the server and can be copied directly from /usr/lib/httpd/modules/ to the apache2/modules/ directory of your Django application. You will need mod_proxy.so and mod_proxy_http.so.

After copying them, activate them in your httpd.conf using:

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

Edit: thanks to Jesse for providing this information.

(17 May '11, 02:59) ryans ♦♦
question asked: 17 May '11, 01:26

question was seen: 19,568 times

last updated: 17 May '11, 20:12
