Using Fail2ban to Block Junk Scraper Bots and Protect an Nginx Server

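Earlier articles covered installing Fail2ban and using it to stop brute-force attempts against SSH and Postfix, and the tool has proven quite useful. This post shows how to use it to protect an nginx server by blocking junk crawlers and providing basic protection against simple attacks.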


Step 1: Create the filter rule

Fail2ban needs a filter rule before it can act on a log. Create one named nginx-badbots.conf by copying the Apache version.

cd /etc/fail2ban/filter.d
cp apache-badbots.conf nginx-badbots.conf
vim nginx-badbots.conf

Edit the file so that it reads as follows:

# Fail2Ban configuration file
#
# Regexp to catch known spambots and software alike. Please verify
# that it is your intent to block IPs which were driven by
# above mentioned bots.


[Definition]

badbotscustom = EmailCollector|WebEMailExtrac|TrackBack/1\.02|sogou music spider
badbots = -|Atomic_Email_Hunter/4\.0|atSpider/1\.0|autoemailspider|bwh3_user_agent|China Local Browse 2\.6|ContactBot/0\.2|ContentSmartz|DataCha0s/2\.0|DBrowse 1\.4b|DBrowse 1\.4d|Demo Bot DOT 16b|Demo Bot Z 16b|DSurf15a 01|DSurf15a 71|DSurf15a 81|DSurf15a VA|EBrowse 1\.4b|Educate Search VxB|EmailSiphon|EmailSpider|EmailWolf 1\.00|ESurf15a 15|ExtractorPro|Franklin Locator 1\.8|FSurf15a 01|Full Web Bot 0416B|Full Web Bot 0516B|Full Web Bot 2816B|Guestbook Auto Submitter|Industry Program 1\.0\.x|ISC Systems iRc Search 2\.1|IUPUI Research Bot v 1\.9a|LARBIN-EXPERIMENTAL \(efp@gmx\.net\)|LetsCrawl\.com/1\.0 \+http\://letscrawl\.com/|Lincoln State Web Browser|LMQueueBot/0\.2|LWP\:\:Simple/5\.803|Mac Finder 1\.0\.xx|MFC Foundation Class Library 4\.0|Microsoft URL Control - 6\.00\.8xxx|Missauga Locate 1\.0\.0|Missigua Locator 1\.9|Missouri College Browse|Mizzu Labs 2\.2|Mo College 1\.9|MVAClient|Mozilla/2\.0 \(compatible; NEWT ActiveX; Win32\)|Mozilla/3\.0 \(compatible; Indy Library\)|Mozilla/3\.0 \(compatible; scan4mail \(advanced version\) http\://www\.peterspages\.net/?scan4mail\)|Mozilla/4\.0 \(compatible; Advanced Email Extractor v2\.xx\)|Mozilla/4\.0 \(compatible; Iplexx Spider/1\.0 http\://www\.iplexx\.at\)|Mozilla/4\.0 \(compatible; MSIE 5\.0; Windows NT; DigExt; DTS Agent|Mozilla/4\.0 efp@gmx\.net|Mozilla/5\.0 \(Version\: xxxx Type\:xx\)|NameOfAgent \(CMS Spider\)|NASA Search 1\.0|Nsauditor/1\.x|PBrowse 1\.4b|PEval 1\.4b|Poirot|Port Huron Labs|Production Bot 0116B|Production Bot 2016B|Production Bot DOT 3016B|Program Shareware 1\.0\.2|PSurf15a 11|PSurf15a 51|PSurf15a VA|psycheclone|RSurf15a 41|RSurf15a 51|RSurf15a 81|searchbot admin@google\.com|ShablastBot 1\.0|snap\.com beta crawler v0|Snapbot/1\.0|Snapbot/1\.0 \(Snap Shots, \+http\://www\.snap\.com\)|sogou develop spider|Sogou Orion spider/3\.0\(\+http\://www\.sogou\.com/docs/help/webmasters\.htm#07\)|sogou spider|Sogou web spider/3\.0\(\+http\://www\.sogou\.com/docs/help/webmasters\.htm#07\)|sohu agent|SSurf15a 11 |TSurf15a 11|Under the Rainbow 2\.2|User-Agent\: Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1\)|VadixBot|WebVulnCrawl\.unknown/1\.0 libwww-perl/5\.803|Wells Search II|WEP Search 00|ZmEu|spiderman|sqlmap|FeedDemon|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|Feedly|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|YisouSpider|HttpClient|MJ12bot|heritrix|EasouSpider|LinkpadBot|YandexBot|RU_Bot|200PleaseBot|DuckDuckGo-Favicons-Bot|Wotbox|SeznamBot|Exabot|SemrushBot|PictureBot|SMTBot|SEOkicks-Robot|AdvBot|TrueBot|BLEXBot|WangIDSpider|Ezooms

failregex = ^<HOST> -.*"(GET|POST|HEAD).*HTTP.*" \d+ \d+ ".*" "(?:%(badbots)s|%(badbotscustom)s)" (-|.*)$

ignoreregex =

# DEV Notes:
# List of bad bots fetched from http://www.user-agents.org
# Generated on Thu Nov  7 14:23:35 PST 2013 by files/gen_badbots.
#
# Author: Yaroslav Halchenko

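The stock badbots list is short and fairly dated, so I added more entries to suit my own needs. The leading "-" entry also catches requests with an empty user agent, which nginx logs as "-".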
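To make the filter easier to reason about, here is a minimal Python sketch of how the failregex is put together and what it matches. It rests on several assumptions: the two bot lists are shortened stand-ins, the log lines are made up (documentation IP address, a log format with one extra field after the user agent), and the IPv4 group is only a rough substitute for Fail2ban's own <HOST> handling.

```python
import re

# Shortened stand-ins for the lists in nginx-badbots.conf (the real file has far more).
badbots = r"-|EmailSiphon|sqlmap|ZmEu"
badbotscustom = r"EmailCollector|WebEMailExtrac|TrackBack/1\.02|sogou music spider"

# The filter's failregex; the %(name)s references are filled in from the lists above.
failregex = (r'^<HOST> -.*"(GET|POST|HEAD).*HTTP.*" \d+ \d+ ".*" '
             r'"(?:%(badbots)s|%(badbotscustom)s)" (-|.*)$')
expanded = failregex % {"badbots": badbots, "badbotscustom": badbotscustom}

# Assumption: a plain IPv4 group stands in for Fail2ban's <HOST> placeholder.
pattern = re.compile(expanded.replace("<HOST>", r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"))

# Made-up log lines for illustration only.
prefix = '203.0.113.5 - - [05/Nov/2017:12:41:00 +0800] "GET / HTTP/1.1" 200 512 "-" '
empty_ua = prefix + '"-" -'                      # empty user agent, extra trailing field
browser = prefix + '"Mozilla/5.0 (X11; Linux x86_64)" -'
partial = prefix + '"sqlmap/1.0" -'              # contains a listed name, but is not equal to it
no_trailing_field = prefix + '"-"'               # line ends right after the user agent

for name, line in [("empty_ua", empty_ua), ("browser", browser),
                   ("partial", partial), ("no_trailing_field", no_trailing_field)]:
    m = pattern.search(line)
    print(name, "->", m.group("host") if m else "no match")
```

Two things stand out in this sketch. As written, the pattern seems to require the whole user-agent field to equal one of the listed entries, so a longer string that merely contains a listed name does not match. It also expects a space and at least one more field after the user agent, so a log format that ends at the user agent's closing quote would not match without adjusting the end of the expression.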

Step 2: Test the filter rule

To check whether the regular expression is written correctly, run the fail2ban-regex command against a log file and the filter, as shown below.

fail2ban-regex /var/log/nginx/access.log /etc/fail2ban/filter.d/nginx-badbots.conf
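When the tool is not at hand, the same kind of check can be roughly approximated in Python. The sketch below is not fail2ban-regex itself: the pattern is a simplified stand-in with a hypothetical two-entry bot list, \S+ replaces <HOST>, and the sample lines are made up. It counts matched and missed lines and tallies hits per client address, which is the number a jail's maxretry is compared against.

```python
import re
from collections import Counter

# Simplified stand-in for the filter's failregex (hypothetical two-entry bot list).
PATTERN = re.compile(
    r'^(?P<host>\S+) -.*"(GET|POST|HEAD).*HTTP.*" \d+ \d+ ".*" "(?:-|ZmEu)" (-|.*)$'
)

def summarize(lines):
    """Return (matched, missed, hits_per_host) for an iterable of log lines."""
    hits = Counter()
    missed = 0
    for line in lines:
        m = PATTERN.search(line.rstrip("\n"))
        if m:
            hits[m.group("host")] += 1
        else:
            missed += 1
    return sum(hits.values()), missed, hits

# Made-up sample lines; pass an open log file instead to try real data.
SAMPLE = [
    '203.0.113.5 - - [05/Nov/2017:12:41:00 +0800] "GET / HTTP/1.1" 200 512 "-" "ZmEu" -',
    '203.0.113.5 - - [05/Nov/2017:12:41:01 +0800] "GET /a HTTP/1.1" 404 162 "-" "-" -',
    '198.51.100.7 - - [05/Nov/2017:12:41:02 +0800] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0" -',
]

matched, missed, hits = summarize(SAMPLE)
print(f"{matched} matched, {missed} missed")
for host, count in hits.most_common():
    print(host, count)
```

If nothing matches on a log that clearly contains bad bots, the log format and the end of the failregex are the first things worth comparing.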

Step 3: Create the jail rule

As with SSH and the other services, create a separate jail file for nginx.

vim /etc/fail2ban/jail.d/nginx.local

For example:

[nginx-badbots]

enabled  = true
port     = http,https
filter   = nginx-badbots
logpath  = /home/wwwlogs/access.log
           /home/wwwlogs/www.sijitao.net.log
maxretry = 3

logpath accepts multiple log files: put each additional path on its own indented continuation line.
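The continuation-line syntax can be illustrated with Python's standard configparser, which reads the same INI style. This is only a sketch of how the multi-line logpath value breaks down into separate paths; it is not a claim about how Fail2ban loads jails internally.

```python
import configparser

# The jail definition from above, embedded as a string for illustration.
JAIL = """\
[nginx-badbots]

enabled  = true
port     = http,https
filter   = nginx-badbots
logpath  = /home/wwwlogs/access.log
           /home/wwwlogs/www.sijitao.net.log
maxretry = 3
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(JAIL)
section = parser["nginx-badbots"]

# The indented second line is folded into the same value; split on whitespace to list paths.
log_paths = section["logpath"].split()
enabled = section.getboolean("enabled")
max_retry = section.getint("maxretry")

print(log_paths)
print(enabled, max_retry)
```

A path on a line that is not indented would be read as a new, malformed option rather than a second log file, so the leading whitespace matters.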

Step 4: Restart fail2ban

Restart the service so the new filter and jail are loaded:

service fail2ban restart

Then check whether the iptables rules have taken effect:

iptables --list -n

Last modified: November 5, 2017, 12:41 PM

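10 comments

  1. 简单生活

    Learned something new. I'll go try it out.

    1. 明月登楼
      @简单生活

      Ha, thanks for the support!

  2. 今日新闻头条

    Nice article, showing my support.

    1. 明月登楼
      @今日新闻头条

      Thanks, and do come back often!

  3. 历史趣谈

    This is really good.

    1. 明月登楼
      @历史趣谈

      Thanks for the support!

  4. 老站新站长

    How does this actually work? And how effective is it?

    1. 明月登楼
      @老站新站长

      It essentially gives iptables an extension for putting IPs in a "jail". As for how effective it is, I'll have to watch the logs over the next few days to find out.

  5. 懿古今

    明月兄 is getting more and more familiar with nginx. I still don't really dare to tinker with servers, so I can only learn slowly.

    1. 明月登楼
      @懿古今

      Honestly, my "tinkering" stays within features that already exist or are already integrated, so it is basically the kind that can't break anything!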
