Google Announces it Processes Over 1 Trillion Unique URLs

  • Bigwebmaster
  • Site Admin
  • Posts: 9074
  • Loc: Seattle, WA & Phoenix, AZ

Post 3+ Months Ago

Quote:
We've known it for a long time: the web is big. The first Google index in 1998 already had 26 million pages, and by 2000 the Google index reached the one billion mark. Over the last eight years, we've seen a lot of big numbers about how much content is really out there. Recently, even our search engineers stopped in awe about just how big the web is these days -- when our systems that process links on the web to find new content hit a milestone: 1 trillion (as in 1,000,000,000,000) unique URLs on the web at once!


http://googleblog.blogspot.com/2008/07/ ... s-big.html

I find the timing of this information interesting, as today was the day Cuil launched, announcing that it has the biggest index on the internet:

other-search-engines/search-engine-cuil-launched-compete-with-google-t90243.html

Quote:
For starters, Cuil's search index spans 120 billion Web pages.

Patterson believes that's at least three times the size of Google's index, although there is no way to know for certain. Google stopped publicly quantifying its index's breadth nearly three years ago when the catalog spanned 8.2 billion Web pages.


Google announced the 1 trillion mark only a few days ago. News is also spreading that Cuil is having major problems and that its attempt to compete with Google is downright embarrassing. It returns no results for common words like "frog" and "toad", and not even for its own name, "Cuil". Ozzu doesn't show up either, and many other websites are missing.
  • joebert
  • Sledgehammer
  • Genius
  • Posts: 13496
  • Loc: Florida

Post 3+ Months Ago

I still haven't figured out if it's actually a trillion pages, or a trillion links. I've been seeing conflicting news on this subject. :scratchhead:
  • Bigwebmaster
  • Site Admin
  • Posts: 9074
  • Loc: Seattle, WA & Phoenix, AZ

Post 3+ Months Ago

It is kind of confusing how they say it. In their first paragraph they talk about how many pages they had indexed over their history, then they throw out the 1 trillion number but use the term "links" instead (all in the same paragraph). Another part of the blog says this:

Quote:
We don't index every one of those trillion pages -- many of them are similar to each other, or represent auto-generated content similar to the calendar example that isn't very useful to searchers. But we're proud to have the most comprehensive index of any search engine, and our goal always has been to index all the world's data.


So it sounds like they are saying they examine 1 trillion links every day, but since many are duplicates, not every page is listed. Either way, from what I can tell Google has way more indexed than Cuil; I don't even get results there for many things that should come up. Or Cuil is just buggy at the moment.
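
To make the links-versus-pages distinction concrete, here is a rough Python sketch of how several distinct URLs can collapse into far fewer pages once near-duplicates are folded together. This is only an illustration, not how Google actually does it: the `canonicalize` and `count_unique` helpers, the `IGNORED_PARAMS` list, and the example.com URLs are all made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical query parameters treated as noise (an assumption for this sketch).
IGNORED_PARAMS = {"sessionid", "ref"}


def canonicalize(url):
    """Collapse trivially different URLs into one canonical form."""
    parts = urlsplit(url)
    # Keep only parameters outside the ignore list, sorted so order doesn't matter.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key.lower() not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    # Lowercase the scheme and host, and drop the fragment entirely.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def count_unique(urls):
    """Return (distinct raw URLs, distinct canonical URLs)."""
    raw = set(urls)
    canonical = {canonicalize(u) for u in urls}
    return len(raw), len(canonical)


# Five different link strings that point at only two distinct pages.
crawled = [
    "http://example.com/page",
    "http://EXAMPLE.com/page?ref=home",
    "http://example.com/page?sessionid=123",
    "http://example.com/page#top",
    "http://example.com/other",
]

print(count_unique(crawled))
```

The first number counts every link string seen, the second counts what is left after folding; a real crawler would also have to compare page content, which this sketch doesn't attempt.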
  • joebert
  • Sledgehammer
  • Genius
  • Posts: 13496
  • Loc: Florida

Post 3+ Months Ago

Now a trillion page pool makes sense.
The most results I've found for any one word at Google don't even number 50 billion, and that's for words like "a" or the number "1".

Maybe Cuil had issues adding more hardware and misplaced some data. I remember seeing something about it going down to add capacity a while ago.
  • brandrocker
  • Novice
  • Posts: 22

Post 3+ Months Ago

They may have developed a decent database of links, but they need to work on the algorithm. I'm not sure whether it can pose a tough challenge to Google, or whether it will just grab part of Live's already shrunken share of the search engine market.
  • George L.
  • Bronze Member
  • Posts: 2209
  • Loc: Malaysia

Post 3+ Months Ago

I visited Cuil about half an hour ago, looking for information about RAR files because I needed to download the extractor. I typed "download RAR". What came up in the three columns of links would have been quite embarrassing if someone had been standing next to me.

I think the results were badly skewed, as most analysts would have said. Another blog points out that Cuil is only a day old, while Google has been in search for about a decade, so we can't really judge yet.

My own conclusion for tonight: I don't think it can beat Google's search relevancy, not today and not in the next 10 years.

I don't see a niche for people who want size rather than quality.
  • AnarchY SI
  • Web Master
  • Posts: 2521
  • Loc: /usr/src/MI

Post 3+ Months Ago

Bigwebmaster wrote:
Ozzu doesn't show up, as well as many other missing websites.


interestingly, i searched ozzu.com and a post on the v7n boards came up (O_o)? lol

http://www.v7n.com/forums/seo-forum/34047-ozzu-com.html

i couldn't get to the site they were talking about as i got a 403 error, but it was weird to read someone mimicked ozzu (o_O)
  • eautocad
  • Graduate
  • Posts: 195
  • Loc: Spring Hill, Florida

Post 3+ Months Ago

that is just mind boggling... 1 T links.... crazay

Post Information

  • Total Posts in this topic: 8 posts

© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.