Google Indexing Nonexistent Files

  • Tandem
  • Born
  • Posts: 3

Post 3+ Months Ago

My error logs show quite a few 404 entries.
The referer is usually google.gr, google.ee, google.com.br, etc.

Please keep in mind, this is not a case of broken links, renamed or removed files. The files in question never existed on the sites (as far as I know).

Also, the sites and the directories that the indexed URLs point to all have robots.txt files with the following:
User-agent: *
Disallow: /

The sites are for private use and are not indexed by SEs. I am aware that bots can ignore the robots.txt files.
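For reference, the effect of those two directives on a well-behaved crawler can be checked with Python's standard `urllib.robotparser`. This is only a sketch: the user-agent names and the path are made-up examples, and it says nothing about bots that ignore robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The two directives quoted above, fed to the parser as lines of text.
ROBOTS_LINES = ["User-agent: *", "Disallow: /"]


def allowed(user_agent: str, path: str) -> bool:
    """Return True if a compliant crawler may fetch `path` under ROBOTS_LINES."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical path, used only for illustration.
    print(allowed("Googlebot", "/DIR/example.html"))
```

A compliant crawler that honors these rules should not fetch any path on the site.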

What concerns me is that the file names are usually something along the lines of:
....serial-free.html
....CD-key-changer.html
...something-sex.html and so on.

Does anyone have any idea what is going on? How do these end up in Google's index?

Thank you for your time.
  • Don2007
  • Web Master
  • Posts: 4923
  • Loc: NY

Post 3+ Months Ago

It sounds like a bot looking for cracks. Don't worry about it.
  • Tandem
  • Born
  • Posts: 3

Post 3+ Months Ago

That's what it looks like, but the referer is google.
  • Don2007
  • Web Master
  • Posts: 4923
  • Loc: NY

Post 3+ Months Ago

Let's say I type into the Google search box

site:tandemsite.com inurl:/CD-key-changer.html and press Enter. Wouldn't that make Google the referrer? That's what I think is happening, just automated rather than done by hand.

See if you can find googlehacking.pdf

Johnny's "I Hack Stuff" site seems to be down for the moment, but the PDF should still be available somewhere.
  • joebert
  • Fart Bubbles
  • Genius
  • Posts: 13504
  • Loc: Florida

Post 3+ Months Ago

Tandem wrote:
That's what it looks like, but the referer is google.


The referer by itself isn't of much importance. Check the IP address against a list of known Google networks to confirm.
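A rough sketch of that check in Python, using the standard `ipaddress` module. The single range below is an assumption used as a placeholder (it is commonly associated with Googlebot, but it is not an authoritative or complete list), and the function name is made up for this example.

```python
import ipaddress

# ASSUMPTION: illustrative range only. Replace it with the ranges Google
# currently publishes for its crawlers before relying on the result.
KNOWN_NETWORKS = [ipaddress.ip_network("66.249.64.0/19")]


def in_known_networks(ip: str, networks=KNOWN_NETWORKS) -> bool:
    """Return True if `ip` falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


if __name__ == "__main__":
    # 203.0.113.5 is a documentation address, not a real client.
    for candidate in ("66.249.66.1", "203.0.113.5"):
        print(candidate, in_known_networks(candidate))
```

A client address from the error log that falls outside every listed network did not come from those networks, whatever the referer says.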

If you send me the address of one of the sites in question, I can send it a request for "/i-was-here.joebert" right now that appears to come from Google, assuming you only consider the referer.
Referers can be forged, and IP addresses can kinda be spoofed via proxy, but there's an extremely tiny, right next to nonexistent, chance that an IP address pointing back to Google's network can be forged.
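To show how little the referer proves, here is a minimal sketch that builds (but does not send) a request with an arbitrary `Referer` header using Python's `urllib.request`. The host `example.com` and the search URL are placeholders, and `build_request` is just a name picked for this example.

```python
from urllib.request import Request


def build_request(url: str, referer: str) -> Request:
    """Build a request whose Referer header is whatever the caller chooses."""
    # Nothing validates this header; the server simply logs what it is given.
    return Request(url, headers={"Referer": referer})


# Placeholder URLs for illustration only; the request is never sent here.
req = build_request(
    "http://example.com/i-was-here.joebert",
    "http://www.google.com/search?q=example",
)

if __name__ == "__main__":
    print(req.get_header("Referer"))
```

Passing `req` to `urllib.request.urlopen` would deliver it, and the server's log would show the Google search URL as the referer.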

Are you sure they're "in Google's index"? Did you look for them by searching Google?
Or did you just assume they're in the index because of the referer in your own logs?
  • Tandem
  • Born
  • Posts: 3

Post 3+ Months Ago

Thank you all for your input.

Yes, the referer was Google; I verified the index.

This is the latest entry:
[Sun Apr 12 06:26:29 2009] [error] [client 78.160.xxx.xxx] File does not exist: /home/SITENAME/public_html/DIR/Smileys, referer: http://www.google.com.tr/search?hl=tr&r ... edir&meta=
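For anyone who wants to pull the client address and referer host out of entries like this one, here is a small Python sketch. The regular expression is an assumption based only on the line above, so other Apache error-log formats may need adjusting; `parse_404` is a name made up for this example.

```python
import re
from urllib.parse import urlparse

# Pattern inferred from the single sample entry above (an assumption).
LINE_RE = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[(?P<level>[^\]]+)\] \[client (?P<client>[^\]]+)\] "
    r"File does not exist: (?P<path>.*?)(?:, referer: (?P<referer>.*))?$"
)


def parse_404(line: str):
    """Return a dict of fields for a 'File does not exist' entry, else None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    referer = entry.get("referer")
    entry["referer_host"] = urlparse(referer).hostname if referer else None
    return entry


# The entry quoted above, kept exactly as posted (including the elided parts).
SAMPLE = (
    "[Sun Apr 12 06:26:29 2009] [error] [client 78.160.xxx.xxx] "
    "File does not exist: /home/SITENAME/public_html/DIR/Smileys, "
    "referer: http://www.google.com.tr/search?hl=tr&r ... edir&meta="
)

if __name__ == "__main__":
    print(parse_404(SAMPLE))
```

The `client` field is the value to compare against known networks; the `referer_host` field is only what the client claimed.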


My site is no longer among the results because I removed it from the index using Google's Remove URLs tool.
  • joebert
  • Fart Bubbles
  • Genius
  • Posts: 13504
  • Loc: Florida

Post 3+ Months Ago

That really is strange.


© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.