Block website archivers and email spam collectors

  • Bigwebmaster
  • Site Admin
  • Posts: 9099
  • Loc: Seattle, WA & Phoenix, AZ

Post 3+ Months Ago

Here is something simple you can do to protect against website archiving programs and email collector programs. One reason to block these malicious programs is that they hog your server's resources: most will request as many pages per second as their computer can handle, which can seriously bog down or even crash your server. To use this, your host must allow the Limit directive in .htaccess files. To block some of the common programs, place this in your .htaccess file:

Code:
SetEnvIfNoCase User-Agent "HTTrack" bad_bot
SetEnvIfNoCase User-Agent "Download Ninja 2.0" bad_bot
SetEnvIfNoCase User-Agent "JBH Agent 2.0" bad_bot

SetEnvIfNoCase User-Agent "EmailCollector/1.0" spam_bot
SetEnvIfNoCase User-Agent "EmailSiphon" spam_bot
SetEnvIfNoCase User-Agent "EmailWolf 1.00" spam_bot
SetEnvIfNoCase User-Agent "ExtractorPro" spam_bot
SetEnvIfNoCase User-Agent "Crescent Internet ToolPak HTTP OLE Control v.1.0" spam_bot
SetEnvIfNoCase User-Agent "Mozilla/2\.0 \(compatible; NEWT ActiveX; Win32\)" spam_bot
SetEnvIfNoCase User-Agent "CherryPicker/1.0" spam_bot
SetEnvIfNoCase User-Agent "CherryPickerSE/1.0" spam_bot
SetEnvIfNoCase User-Agent "CherryPickerElite/1.0" spam_bot
SetEnvIfNoCase User-Agent "NICErsPRO" spam_bot
SetEnvIfNoCase User-Agent "WebBandit/2.1" spam_bot
SetEnvIfNoCase User-Agent "WebBandit/3.50" spam_bot
SetEnvIfNoCase User-Agent "webbandit/4.00.0" spam_bot
SetEnvIfNoCase User-Agent "WebEMailExtractor/1.0B" spam_bot
SetEnvIfNoCase User-Agent "autoemailspider" spam_bot

<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
Deny from env=spam_bot
</Limit>
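
On Apache 2.4 and later, Order, Allow, and Deny only work when mod_access_compat is loaded. Below is a minimal sketch of the same deny rule in the newer Require syntax. It assumes mod_authz_core is available and that the host's AllowOverride setting permits AuthConfig in .htaccess; both are assumptions about your server, not something tested in this thread.

```apache
# Sketch for Apache 2.4+ (assumed). Keep the SetEnvIfNoCase lines above
# unchanged; this block only replaces the <Limit> ... </Limit> section.
<RequireAll>
    # Allow everyone by default...
    Require all granted
    # ...except requests flagged by the SetEnvIfNoCase rules.
    Require not env bad_bot
    Require not env spam_bot
</RequireAll>
```

Unlike the Limit GET POST HEAD version, this applies to every request method, not only GET, POST, and HEAD.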


If anybody knows of any more site archivers or email collectors to block, please post them here.
  • joebert
  • Fart Bubbles
  • Genius
  • Posts: 13504
  • Loc: Florida

Post 3+ Months Ago

Might be interesting to see if any of these bots are still around, & what new ones have surfaced in the last few years. :D


Here's a list of additions I'd come across the other day.
Code:
SetEnvIfNoCase user-agent "^BlackWidow" bad_bot
SetEnvIfNoCase user-agent "^Bot\ mailto:craftbot@yahoo.com" bad_bot
SetEnvIfNoCase user-agent "^ChinaClaw" bad_bot
SetEnvIfNoCase user-agent "^Custo" bad_bot
SetEnvIfNoCase user-agent "^DISCo" bad_bot
SetEnvIfNoCase user-agent "^Download\ Demon" bad_bot
SetEnvIfNoCase user-agent "^eCatch" bad_bot
SetEnvIfNoCase user-agent "^EirGrabber" bad_bot
SetEnvIfNoCase user-agent "^EmailSiphon" bad_bot
SetEnvIfNoCase user-agent "^EmailWolf" bad_bot
SetEnvIfNoCase user-agent "^Express\ WebPictures" bad_bot
SetEnvIfNoCase user-agent "^ExtractorPro" bad_bot
SetEnvIfNoCase user-agent "^EyeNetIE" bad_bot
SetEnvIfNoCase user-agent "^FlashGet" bad_bot
SetEnvIfNoCase user-agent "^GetRight" bad_bot
SetEnvIfNoCase user-agent "^GetWeb!" bad_bot
SetEnvIfNoCase user-agent "^Go!Zilla" bad_bot
SetEnvIfNoCase user-agent "^Go-Ahead-Got-It" bad_bot
SetEnvIfNoCase user-agent "^GrabNet" bad_bot
SetEnvIfNoCase user-agent "^Grafula" bad_bot
SetEnvIfNoCase user-agent "^HMView" bad_bot
SetEnvIfNoCase user-agent "HTTrack" bad_bot
SetEnvIfNoCase user-agent "^Image\ Stripper" bad_bot
SetEnvIfNoCase user-agent "^Image\ Sucker" bad_bot
SetEnvIfNoCase user-agent "Indy\ Library" bad_bot
SetEnvIfNoCase user-agent "^InterGET" bad_bot
SetEnvIfNoCase user-agent "^Internet\ Ninja" bad_bot
SetEnvIfNoCase user-agent "^JetCar" bad_bot
SetEnvIfNoCase user-agent "^JOC\ Web\ Spider" bad_bot
SetEnvIfNoCase user-agent "^larbin" bad_bot
SetEnvIfNoCase user-agent "^LeechFTP" bad_bot
SetEnvIfNoCase user-agent "^Mass\ Downloader" bad_bot
SetEnvIfNoCase user-agent "^MIDown\ tool" bad_bot
SetEnvIfNoCase user-agent "^Mister\ PiX" bad_bot
SetEnvIfNoCase user-agent "^Navroad" bad_bot
SetEnvIfNoCase user-agent "^NearSite" bad_bot
SetEnvIfNoCase user-agent "^NetAnts" bad_bot
SetEnvIfNoCase user-agent "^NetSpider" bad_bot
SetEnvIfNoCase user-agent "^Net\ Vampire" bad_bot
SetEnvIfNoCase user-agent "^NetZIP" bad_bot
SetEnvIfNoCase user-agent "^Octopus" bad_bot
SetEnvIfNoCase user-agent "^Offline\ Explorer" bad_bot
SetEnvIfNoCase user-agent "^Offline\ Navigator" bad_bot
SetEnvIfNoCase user-agent "^PageGrabber" bad_bot
SetEnvIfNoCase user-agent "^Papa\ Foto" bad_bot
SetEnvIfNoCase user-agent "^pavuk" bad_bot
SetEnvIfNoCase user-agent "^pcBrowser" bad_bot
SetEnvIfNoCase user-agent "^RealDownload" bad_bot
SetEnvIfNoCase user-agent "^ReGet" bad_bot
SetEnvIfNoCase user-agent "^SiteSnagger" bad_bot
SetEnvIfNoCase user-agent "^SmartDownload" bad_bot
SetEnvIfNoCase user-agent "^SuperBot" bad_bot
SetEnvIfNoCase user-agent "^SuperHTTP" bad_bot
SetEnvIfNoCase user-agent "^Surfbot" bad_bot
SetEnvIfNoCase user-agent "^tAkeOut" bad_bot
SetEnvIfNoCase user-agent "^Teleport\ Pro" bad_bot
SetEnvIfNoCase user-agent "^VoidEYE" bad_bot
SetEnvIfNoCase user-agent "^Web\ Image\ Collector" bad_bot
SetEnvIfNoCase user-agent "^Web\ Sucker" bad_bot
SetEnvIfNoCase user-agent "^WebAuto" bad_bot
SetEnvIfNoCase user-agent "^WebCopier" bad_bot
SetEnvIfNoCase user-agent "^WebFetch" bad_bot
SetEnvIfNoCase user-agent "^WebGo\ IS" bad_bot
SetEnvIfNoCase user-agent "^WebLeacher" bad_bot
SetEnvIfNoCase user-agent "^WebReaper" bad_bot
SetEnvIfNoCase user-agent "^WebSauger" bad_bot
SetEnvIfNoCase user-agent "^Website\ eXtractor" bad_bot
SetEnvIfNoCase user-agent "^Website\ Quester" bad_bot
SetEnvIfNoCase user-agent "^WebStripper" bad_bot
SetEnvIfNoCase user-agent "^WebWhacker" bad_bot
SetEnvIfNoCase user-agent "^WebZIP" bad_bot
SetEnvIfNoCase user-agent "^Widow" bad_bot
SetEnvIfNoCase user-agent "^WWWOFFLE" bad_bot
SetEnvIfNoCase user-agent "^Xaldon\ WebSpider" bad_bot
SetEnvIfNoCase user-agent "^Zeus" bad_bot
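
Since the second argument to SetEnvIfNoCase is a regular expression, a long list like this could probably be collapsed into a few alternation patterns. A rough sketch using a handful of the names above (the grouping is only an illustration; extend it with the remaining names as needed):

```apache
# Sketch: one rule per group of names instead of one rule per name.
# Spaces are escaped the same way as in the list above.
SetEnvIfNoCase User-Agent "^(BlackWidow|ChinaClaw|Custo|DISCo|Download\ Demon|eCatch|EirGrabber)" bad_bot
SetEnvIfNoCase User-Agent "^(WebCopier|WebReaper|WebStripper|WebWhacker|WebZIP)" bad_bot
```

Both lines set the same bad_bot variable, so the existing "Deny from env=bad_bot" rule covers them without any other change.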
  • tan_go
  • Banned
  • Posts: 65

Post 3+ Months Ago

Thanks for that detailed info.


© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.