Monday, June 15, 2009

FDA cleans its robots.txt file

In late April I posted about how the FDA's robots.txt file had various peculiar sections, including a note indicating that one section was added at the request of Bristol-Myers Squibb. Well, the FDA recently (probably in late May, when they revamped the entire site) tidied up the robots.txt file, removing most of the peculiarities (like "area 51").

The comment about Bristol-Myers still stands. The rest is now vanilla.

See for yourselves.

#Added for Bristol-Myers on Sept 2005
User-agent: vspider
Disallow: /

#For all other crawlers
User-agent: *
Disallow: /Management/ # don't crawl healthcheck
Hit-rate: 30 # wait 30 seconds before starting a new URL request default=30
Visiting-hours: 23:00EDT-05:00EDT #index this site between 11PM - 5AM EDT
Concurrent-hits: 2 # limit concurrent active URLS to 2 for each index server
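
If you want to check what these rules mean for a given crawler, here is a minimal sketch using Python's standard urllib.robotparser. The host in BASE, the "examplebot" agent name, and the page paths are my own placeholders for illustration, not anything taken from the FDA site; only the robots.txt text is copied from above.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt text as quoted in this post.
ROBOTS_TXT = """\
#Added for Bristol-Myers on Sept 2005
User-agent: vspider
Disallow: /

#For all other crawlers
User-agent: *
Disallow: /Management/ # don't crawl healthcheck
Hit-rate: 30 # wait 30 seconds before starting a new URL request default=30
Visiting-hours: 23:00EDT-05:00EDT #index this site between 11PM - 5AM EDT
Concurrent-hits: 2 # limit concurrent active URLS to 2 for each index server
"""

# Assumption: host used only to build full URLs for the check.
BASE = "http://www.fda.gov"

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network fetch is needed.
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # Hypothetical agents and paths, chosen to exercise each rule.
    checks = [
        ("vspider", "/some/page.html"),
        ("examplebot", "/some/page.html"),
        ("examplebot", "/Management/healthcheck"),
    ]
    for agent, path in checks:
        print(agent, path, allowed(agent, path))
```

Note that this parser only understands the standard-ish directives. Hit-rate, Visiting-hours and Concurrent-hits are not something it acts on, so whether any crawler other than the one they were written for honors them is anyone's guess.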
