Yahoo releases a new web page spider

  • coolslko
  • Proficient
  • Joined: 18 Aug 2007
  • Posts: 302
  • Loc: India
  • Status: Offline

Post April 16th, 2008, 4:56 am

Hi,

Yahoo recently released a new web page spider: Slurp 3.0.

The new Yahoo! Slurp 3.0 honors the same user-agent name and all robots.txt directives written for 'Yahoo! Slurp,' though it will identify itself as Slurp 3.0 in your web logs.
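As a rough illustration of why a short `User-agent: Slurp` directive keeps matching the new crawler name, here is a minimal sketch using Python's standard `urllib.robotparser`. It shows how Python's parser matches user-agent tokens, which is an assumption standing in for Yahoo's own matching logic. The robots.txt content, the `/private/` path, and the `slurp_may_fetch` helper are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the /private/ path is invented for illustration.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /private/
"""


def slurp_may_fetch(url, user_agent="Yahoo! Slurp/3.0"):
    """Return True if the rules above allow `user_agent` to fetch `url`.

    Python's parser drops everything after the first "/" in the
    user-agent string and then does a case-insensitive substring match,
    so the token "Slurp" matches "Yahoo! Slurp/3.0". Yahoo's crawler may
    match differently; this only demonstrates the general idea.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling `slurp_may_fetch("http://example.com/private/page.html")` applies the `Slurp` rules to the 3.0 user-agent string, so the disallowed path is refused while other paths stay allowed.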

Webmasters will see some changes because of the new spider, as described below by Sharad Verma of Yahoo! Search:

"a) The crawlers will start crawling from a different and much smaller set of IP addresses, but it'll still be from the crawl.yahoo.net domain. Any reverse DNS checks to identify our crawler will continue to work. Please note that if you're using IP-based recognition of our crawlers, you might see a drop in crawl/coverage from Yahoo! We strongly recommend that you move to reverse DNS-based identification of Yahoo! Slurp if you're using any other method to avoid this problem. The current set of IPs will disappear from your web logs in the next several weeks.

b) The crawlers will also publish a new user-agent, 'Yahoo! Slurp/3.0.' Existing robots.txt directives for 'Slurp' or 'Yahoo! Slurp' will continue to work, but if you have directives specific to 'Slurp/2.0,' they won't be recognized by the new crawler (though usage of the 'Slurp/2.0' user-agent is very rare on the web, so you won't likely be affected). We recommend specifying the shorter version of: User-agent: Slurp. Check out "How do I prevent my site or certain subdirectories from being crawled?" on our Help page for more details.

These changes will affect the main Yahoo! Web Search crawlers. Crawlers that similarly respect the Yahoo! Slurp directive but identify themselves more specifically, such as Yahoo! Slurp China and others, will not be impacted."

Source: http://www.ysearchblog.com/archives/000531.html
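For anyone moving from IP lists to the reverse DNS check recommended in the announcement, a minimal Python sketch might look like the one below. The `crawl.yahoo.net` domain comes from the quoted text. The forward-confirmation step (resolving the hostname back to the IP) is a common precaution I am assuming here, not something the announcement spells out, and the `is_yahoo_crawler` function name is hypothetical.

```python
import socket

# Domain named in the quoted announcement.
CRAWLER_DOMAIN = ".crawl.yahoo.net"


def is_yahoo_crawler(ip,
                     reverse_lookup=socket.gethostbyaddr,
                     forward_lookup=socket.gethostbyname_ex):
    """Best-effort check that `ip` belongs to a crawl.yahoo.net host.

    The lookup functions are parameters so the check can be exercised
    without network access; by default they are the standard socket calls.
    """
    try:
        # Reverse DNS: IP address -> hostname.
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False

    # Require the hostname to end with the crawler domain, so a name
    # like "crawl.yahoo.net.example.com" does not pass.
    if not hostname.lower().rstrip(".").endswith(CRAWLER_DOMAIN):
        return False

    try:
        # Forward DNS (assumed extra step): hostname -> IP addresses.
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False

    return ip in addresses
```

In practice you would call `is_yahoo_crawler(remote_ip)` with the address from your web log and cache the result, since each call can trigger two DNS lookups.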

  • Bigwebmaster
  • Site Admin
  • Joined: 20 Dec 2002
  • Posts: 6443
  • Loc: Seattle, WA
  • Status: Offline

Post April 16th, 2008, 9:18 am

I looked at that link briefly and noticed a mention of Yahoo possibly identifying itself incorrectly as Wget. I find that interesting because I recently banned Wget from retrieving pages on Ozzu after I found numerous requests per second from someone using it. Now I wonder whether that was in fact Yahoo, since I never checked the IP address.

Anybody else noticing Yahoo! Bots being identified as Wget instead of Slurp 3.0?
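For anyone who wants to check their own logs, here is a small Python sketch that counts requests per IP for a given user-agent substring such as "Wget" or "Slurp/3.0". It assumes an Apache-style combined log format where the user-agent is the last quoted field on each line; the `count_ips_by_agent` name is made up, and other log formats would need a different pattern.

```python
import re
from collections import Counter

# Assumes the client IP is the first field and the user-agent is the
# last double-quoted field on the line (Apache "combined" style).
LOG_LINE = re.compile(r'^(?P<ip>\S+) .*"(?P<agent>[^"]*)"\s*$')


def count_ips_by_agent(lines, needle):
    """Count requests per client IP whose user-agent contains `needle`.

    The match is a case-insensitive substring test. Lines that do not
    fit the assumed format are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and needle.lower() in match.group("agent").lower():
            counts[match.group("ip")] += 1
    return counts
```

Something like `count_ips_by_agent(open("access.log"), "Wget").most_common(10)` would list the busiest Wget clients (the file name is a placeholder). Those IPs can then be checked with a reverse DNS lookup to see whether any resolve to crawl.yahoo.net.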
  • lajocar
  • Proficient
  • Joined: 26 Mar 2007
  • Posts: 279
  • Loc: South Africa
  • Status: Offline

Post April 19th, 2008, 2:19 am

Cool, I must check my stats to see whether the new Yahoo bot shows up in my logs.
