Meta Tag to block pages from being scanned by spiders/bots

  • musik
  • Legend
  • Super Moderator
  • User avatar
  • Joined: Aug 06, 2003
  • Posts: 6892
  • Loc: up a tree
  • Status: Offline

Post October 19th, 2003, 10:56 pm

Is there a way to block email harvesters/spiders/bot thingys from accessing a page? I'm asking because maybe we could use it only on the pages where email addresses are held (if you have dedicated pages for contact info).

Anyone know?
Opportunity To Do - Changing the lives of children around the world.
Rose.id.au - Doing Life.
  • Bigwebmaster
  • Site Admin
  • Site Admin
  • User avatar
  • Joined: Dec 20, 2002
  • Posts: 8926
  • Loc: Seattle, WA & Phoenix, AZ
  • Status: Offline

Post October 20th, 2003, 1:46 am

Well, you could make your page accept only certain browsers, identified by the User-Agent field they send. However, I think most harvesters simply fake this field to pass themselves off as a regular IE browser, so you might stop some harvesters, but many would still get through as normal.
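
As a rough sketch of the User-Agent idea, assuming a hypothetical allowlist of browser names (the list, the function name `isAllowedUserAgent`, and the sample strings are made up for illustration), the check could look something like this. It also shows the weakness mentioned above: a harvester that sends a browser-like string passes.

```javascript
// Hypothetical allowlist: substrings we expect in a real browser's User-Agent.
const ALLOWED_BROWSERS = ['MSIE', 'Mozilla', 'Opera'];

// Returns true only if the User-Agent string mentions an allowed browser.
function isAllowedUserAgent(userAgent) {
  if (typeof userAgent !== 'string' || userAgent.length === 0) {
    return false; // no header at all: treat the request as a bot
  }
  return ALLOWED_BROWSERS.some((name) => userAgent.includes(name));
}

// A made-up bot name is rejected...
console.log(isAllowedUserAgent('ExampleHarvester/1.0'));
// ...but the same bot faking an IE string is accepted.
console.log(isAllowedUserAgent('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'));
```

So a check like this only filters out bots that identify themselves honestly.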
  • musik

Post October 20th, 2003, 2:12 am

mmm bummer! stupid spammers!!
  • dr nick
  • Proficient
  • Proficient
  • No Avatar
  • Joined: Sep 10, 2003
  • Posts: 263
  • Loc: Frankfurt
  • Status: Offline

Post October 20th, 2003, 2:30 pm

I'm assuming you're probably not going to want to do this, but you could have a contact page that is only accessible by registered members using passwords.

I'm also thinking, you could make life difficult for spiders/bots by using unconventional code (like using contact info inside iframes), but I'm assuming you want to do this for convenience, and anyway that wouldn't catch all spiders out.

Looks like you'll probably have to stick with contact info as images --- you could write a PHP script or something that takes text as an input and automatically places an image containing the text!?
  • musik

Post October 20th, 2003, 4:02 pm

mmm the image thing sounds good but then I would have to have the email in the Source code :cry:
  • Alan2
  • Graduate
  • Graduate
  • User avatar
  • Joined: Aug 20, 2003
  • Posts: 112
  • Status: Offline

Post October 21st, 2003, 4:28 pm

Oooh, I read somewhere about using document.write to build email addresses, so spiders see them as random text and not addresses.

I will find it out. :)
  • dr nick

Post October 21st, 2003, 5:14 pm

musik wrote:
the image thing sounds good but then I would have to have the email in the Source code


I don't quite understand what you mean -- I was referring to simply doing an <img src="emailpic.png">. Rather than reading a file containing a list of email addresses as text to display, you could read a list of images that display the email addresses as PNGs or something. The source code wouldn't need to contain any 'email@address.com' at all...
  • musik

Post October 21st, 2003, 5:30 pm

oh! I misunderstood :D

yes I suppose I could do that :)
  • Alan2

Post October 22nd, 2003, 11:12 am

ROSE... :)

I found it. Register with Bravenet; they give out good webmaster tips :)

Code: [ Select ]
<script language="javascript">
var first = 'ma';
var second = 'il';
var third = 'to:';

// example: user554554
var address = 'tips';

// example: hotmail
var domain = 'bravenet';

var ext = 'com';
document.write('<a target="_blank" href="http://mail.yahoo.com/config/login?/');
document.write(first+second+third);
document.write(address);
document.write('@');
document.write(domain);
document.write('.');
document.write(ext);
document.write('">');
document.write('Email Me!</a>');
</script>


That bit of code, once you tweak it with your own details, should fool any bot or spider that only scans the raw page source for addresses :D
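
For what it's worth, the same trick can be sketched as a small function. This is only an illustration, not the Bravenet code: the name `buildMailtoLink` is made up, and it uses a plain mailto: link instead of the Yahoo login URL in the snippet above.

```javascript
// Build the link from fragments so the complete address never appears
// as one literal string in the page source.
function buildMailtoLink(user, domain, ext, label) {
  const scheme = 'ma' + 'il' + 'to:';               // "mailto:" split up
  const address = user + '@' + domain + '.' + ext;  // assembled at run time
  return '<a href="' + scheme + address + '">' + label + '</a>';
}

const link = buildMailtoLink('tips', 'bravenet', 'com', 'Email Me!');
console.log(link); // <a href="mailto:tips@bravenet.com">Email Me!</a>

// In a browser you would then put it on the page, e.g. document.write(link);
```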
  • lolaholic
  • Beginner
  • Beginner
  • User avatar
  • Joined: Oct 18, 2003
  • Posts: 43
  • Loc: Twinnieville!
  • Status: Offline

Post October 22nd, 2003, 11:27 am

rose hunni, u thought about a robots.txt file?

you can include/exclude the files and directories you want bots/spiders to have access to

there's a short tutorial here http://www.searchengineworld.com/robots ... torial.htm

not sure if it'll work, but it's worth a try :)
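
A rough sketch of what such a file contains, written as a little generator so the output is easy to see. The function name `buildRobotsTxt` and the paths are made up for illustration. Keep in mind that robots.txt is only a request: well-behaved search engine crawlers honour it, but an email harvester is free to ignore it.

```javascript
// Produce robots.txt text that asks all crawlers to stay out of the given paths.
function buildRobotsTxt(disallowedPaths) {
  const lines = ['User-agent: *']; // applies to every crawler that reads the file
  for (const path of disallowedPaths) {
    lines.push('Disallow: ' + path);
  }
  return lines.join('\n') + '\n';
}

// Hypothetical paths where contact details might live.
console.log(buildRobotsTxt(['/contact/', '/staff/emails.html']));
```

The resulting text would be saved as robots.txt in the root directory of the site.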
  • Alan2

Post October 22nd, 2003, 12:37 pm

lolaholic wrote:
rose hunni, u thought about a robots.txt file?

you can include/exclude the files and directories you want bots/spiders to have access to

there's a short tutorial here http://www.searchengineworld.com/robots ... torial.htm

not sure if it'll work, but it's worth a try
:)


OK, I just found some useful info about Atomic. ;) :)

They use robots.txt to block search engines, as they were crawling the database and slowing the system to a halt.

Anyway, my idea for emails:

The addresses could be written out by the document.write code, kept in an email.js file. Then you just set the variables in your document.

webpage stuff wrote:
<script language="javascript">

var first = 'ma';
var second = 'il';
var third = 'to:';

// example: user554554
var address = 'abheath2003';

// example: hotmail
var domain = 'Yahoo';

var ext = 'co.uk';
</script>

<script language="javascript" src="email.js">
</script>


that is the HTML code

this is the JS script.
Code: [ Select ]
document.write('<a target="_blank" href="http://mail.yahoo.com/config/login?/');
document.write(first+second+third);
document.write(address);
document.write('@');
document.write(domain);
document.write('.');
document.write(ext);
document.write('">');
document.write('Email Me!</a>');


If you place the above JS into an email.js file, you can call it after setting the variables as shown above. This inserts an email address without the spiders seeing it, because it is built from two separate files. :D
  • musik

Post October 22nd, 2003, 3:40 pm

It's 8:30 in the morning and I haven't finished my Corn Flakes yet - I will tackle this later today - don't the email harvesters read text off the actual website, not just the code?

xox for your input!

I know DD had to do that - it went through every user and slowed everything down - in fact one week I got so many through mine it counted them as visits and put me in the top ten !! :lol: :lol:
  • Alan2

Post October 22nd, 2003, 3:55 pm

musik wrote:
It's 8:30 in the morning and I haven't finished my Corn Flakes yet - I will tackle this later today - don't the email harvesters read text off the actual website, not just the code?

xox for your input!

I know DD had to do that - it went through every user and slowed everything down - in fact one week I got so many through mine it counted them as visits and put me in the top ten !! :lol: :lol:


Ok rose hun,

Lol@ DD getting top spot. :P

The code and the variables build the email from totally separate documents: the variables are set in the page and the address is assembled in email.js. So a spider or robot that only reads the source never realises it has found an email address, just some more code and a file that writes something, which the page calls.

It would have to be a smart spider to assemble the code and write the document before it searched. ;)
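
To make that concrete, here is a small sketch of a naive harvester, assuming it just runs a simple email pattern over whatever text it downloads. The pattern, the `harvest` function, and the placeholder address are all made up for illustration. It finds nothing in either source file, but it would find the address in the text a browser shows after running the script, which is why a harvester that executes JavaScript would still catch it.

```javascript
// A simple (far from complete) pattern for things that look like email addresses.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Return every email-looking string found in the text.
function harvest(text) {
  return text.match(EMAIL_PATTERN) || [];
}

// What a source-only spider downloads: the page with the variables...
const pageSource =
  "var address = 'someone'; var domain = 'example'; var ext = 'com';";
// ...and the separate script file that writes the pieces out.
const scriptSource =
  "document.write(address); document.write('@'); " +
  "document.write(domain); document.write('.'); document.write(ext);";

// What a browser ends up with once the script has run.
const rendered = 'someone' + '@' + 'example' + '.' + 'com';

console.log(harvest(pageSource));   // []
console.log(harvest(scriptSource)); // []
console.log(harvest(rendered));     // [ 'someone@example.com' ]
```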

I should be here in the "morning" since I'm off work with a bad toothache... I had a tooth pulled out today and am in ickle pain. No sympathy please, it's my fault. << Katie will complain if you give me sympathy :P she says I've been milking it LOL....
  • musik

Post October 22nd, 2003, 4:05 pm

I am going to try to do it later for sure !!! :D
Anything to stop those dumb spam thingies!!! :twisted:
  • Alan2

Post October 22nd, 2003, 5:38 pm

musik wrote:
I am going to try to do it later for sure !!! :D
Anything to stop those dumb spam thingies!!! :twisted:


well,

I should be back in the morning. So if you need any help and I'm on MSN, just shout :)
Post Information

  • Total Posts in this topic: 16 posts

© 2011 Unmelted, LLC. Ozzu® is a registered trademark of Unmelted, LLC.