Yahoo releases a new web page spider

  • coolslko
  • Proficient
  • Posts: 288
  • Loc: India

Post 3+ Months Ago

Hi,

Yahoo recently released a new web page spider: Slurp 3.0.

Slurp 3.0 honors the same user-agent name and all robots.txt directives written for 'Yahoo! Slurp', though it identifies itself as Slurp 3.0 in your web logs.

Webmasters will see some changes because of the new spider, as described below by Sharad Verma from Yahoo Search.

"a) The crawlers will start crawling from a different and much smaller set of IP addresses, but it'll still be from the crawl.yahoo.net domain. Any reverse DNS checks to identify our crawler will continue to work. Please note that if you're using IP-based recognition of our crawlers, you might see a drop in crawl/coverage from Yahoo! We strongly recommend that you move to reverse DNS-based identification of Yahoo! Slurp if you're using any other method to avoid this problem. The current set of IPs will disappear from your web logs in the next several weeks.

b) The crawlers will also publish a new user-agent, 'Yahoo! Slurp/3.0.' Existing robots.txt directives for 'Slurp' or 'Yahoo! Slurp' will continue to work, but if you have directives specific to 'Slurp/2.0,' they won't be recognized by the new crawler (though usage of the 'Slurp/2.0' user-agent is very rare on the web, so you won't likely be affected). We recommend specifying the shorter version of: User-agent: Slurp. Check out "How do I prevent my site or certain subdirectories from being crawled?" on our Help page for more details.

These changes will affect the main Yahoo! Web Search crawlers. Crawlers that similarly respect the Yahoo! Slurp directive but identify themselves more specifically, such as Yahoo! Slurp China and others, will not be impacted."
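
For anyone moving from IP lists to the reverse DNS check recommended in (a), here is a minimal Python sketch of a forward-confirmed reverse DNS lookup. The function name `is_yahoo_crawler` and the injectable `reverse_lookup` / `forward_lookup` parameters are my own, added so the logic can be tried without touching the network. Only the crawl.yahoo.net domain comes from the quote above. It handles IPv4 only and is an illustration under those assumptions, not Yahoo's documented procedure.

```python
import socket


def is_yahoo_crawler(ip, reverse_lookup=None, forward_lookup=None,
                     domain="crawl.yahoo.net"):
    """Return True if `ip` reverse-resolves into `domain` and the
    resulting host name resolves back to the same IP."""
    # Default to real DNS; callers may pass stand-ins for testing.
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    # Step 1: reverse (PTR) lookup of the visiting IP address.
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        return False

    # Step 2: the host name must be inside the crawler's domain.
    if host != domain and not host.endswith("." + domain):
        return False

    # Step 3: forward lookup must lead back to the original IP,
    # otherwise anyone controlling their own PTR record could spoof it.
    try:
        return ip in forward_lookup(host)
    except OSError:
        return False
```

Calling `is_yahoo_crawler(ip)` with an address taken from your access log uses live DNS. Because the check keys on the host name rather than the address, it should keep working when the crawler moves to a new set of IPs.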

Source: http://www.ysearchblog.com/archives/000531.html
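
To see how the shorter 'User-agent: Slurp' form recommended in (b) behaves, here is a small Python sketch using the standard library's `urllib.robotparser`. The `/private/` path, the `ROBOTS_TXT` sample, and the helper name `is_allowed` are made up for illustration. Keep in mind this shows how Python's parser matches user-agent tokens, which is only an approximation of what Yahoo's crawler does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt using the short user-agent token.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, path):
    """Parse robots.txt text and report whether `user_agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for agent in ("Yahoo! Slurp/3.0", "Googlebot"):
        for path in ("/private/page.html", "/index.html"):
            print(agent, path, is_allowed(ROBOTS_TXT, agent, path))
```

In Python's parser, the 'Slurp' rule applies to an agent token of 'Yahoo! Slurp/3.0' because the version suffix is dropped and the rule name is matched as a substring, so the short form keeps working across version bumps.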

  • Bigwebmaster
  • Site Admin
  • Posts: 9088
  • Loc: Seattle, WA & Phoenix, AZ

Post 3+ Months Ago

I looked at that link briefly and noticed a mention of Yahoo possibly misidentifying itself as Wget. I find that interesting because I recently banned Wget from retrieving pages on Ozzu after I found someone using it to make numerous requests per second. Now I wonder whether that was in fact Yahoo, as I never checked the IP address.

Is anybody else seeing Yahoo! bots identified as Wget instead of Slurp 3.0?
  • lajocar
  • Proficient
  • Posts: 272
  • Loc: South Africa

Post 3+ Months Ago

Cool, I must check my stats to see whether the new Yahoo bot shows up in my logs.
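
If you would rather scan the raw access log than a stats package, something like the Python sketch below could tally Slurp versions and Wget hits. It assumes Apache-style Combined Log Format, where the client IP is the first field and the user agent is the last quoted field. The name `tally_crawlers` and both regular expressions are my own guesses at a workable pattern, so adjust them to your server's log layout.

```python
import re
from collections import Counter

# Assumption: Combined Log Format, user agent is the last quoted field.
LOG_LINE = re.compile(r'^(?P<ip>\S+) .*"(?P<agent>[^"]*)"\s*$')
# Matches "Slurp" with an optional "/<version>" suffix.
SLURP = re.compile(r'Slurp(?:/(?P<version>[\d.]+))?')


def tally_crawlers(lines):
    """Count log lines per crawler label ('Slurp/3.0', 'Wget', ...)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent")
        slurp = SLURP.search(agent)
        if slurp:
            counts["Slurp/" + (slurp.group("version") or "unversioned")] += 1
        elif "wget" in agent.lower():
            counts["Wget"] += 1
    return counts
```

Passing an open log file works, for example `tally_crawlers(open("access.log"))` with whatever path your server uses. A user-agent string alone proves nothing about who sent the request, so a reverse DNS check on the IP is still needed before treating a Wget hit as Yahoo.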


© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.