Janet Systems DotNetNuke Websites, Hosting, Custom Servers

Robots, also called spiders, visit web sites to discover and analyse their content. Depending on who operates the robot, the discovered content may be indexed for use by search engines or harvested for e-mail addresses.


Web Robot Articles

Details for the CCBot Internet robot, including its owner, description, HTTP user agent and whether it adheres to the robots exclusion standard.

Owner CommonCrawl Foundation
Country USA
Description

The aim of CommonCrawl is to develop a comprehensive crawl of the Internet.

IP Addresses 38.103.63.59 
HTTP User Agent CCBot/1.0 (+http://www.commoncrawl.org/bot.html)
Exclusion

The reference page listed under Further Info confirms that CCBot supports the robots.txt exclusion standard, which is described at http://www.robotstxt.org/wc/exclusion.html#robotstxt.

Further Info. http://www.commoncrawl.org/bot.html
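
Because CCBot is reported to honour robots.txt, a site owner can in principle keep it out with a rule that names its user agent. The sketch below is illustrative only: the robots.txt content, the example.com URL and the is_allowed helper are assumptions made for this example and do not come from CommonCrawl. It uses Python's standard urllib.robotparser module to show how such a rule would be evaluated for the user agent string listed above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: exclude CCBot from the whole site,
# leave every other robot unrestricted.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Disallow:
"""

# User agent string as given in the entry above.
CCBOT_UA = "CCBot/1.0 (+http://www.commoncrawl.org/bot.html)"


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/index.html"  # placeholder URL
    print("CCBot allowed:", is_allowed(CCBOT_UA, page))
    print("Other robot allowed:", is_allowed("ExampleBot/2.0", page))
```

The parser matches on the product token before the slash ("CCBot"), so the rule keeps applying if the version number in the user agent changes. A robots.txt rule is only a request; it takes effect because the robot chooses to obey it.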

NAT August 2008


Copyright © 2004-2008 Janet Systems Ltd.
