Ban shitty little robots with the help of nginx and fail2ban

Apr 20, 2020

If you have ever maintained a site, chances are you have something like this in your logs:

134.175.102.205 - - [19/Apr/2020:09:11:24 +0000] "GET /phpiMyAdmin/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:28 +0000] "GET /phpNyAdmin/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:32 +0000] "GET /1/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:32 +0000] "GET /download/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:33 +0000] "GET /phpMyAdmin_111/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:36 +0000] "GET /phpmadmin/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:40 +0000] "GET /321/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:40 +0000] "GET /123131/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:44 +0000] "GET /phpMyAdminn/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:44 +0000] "GET /phpMyAdminhf/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:48 +0000] "GET /sbb/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:48 +0000] "GET /WWW/phpMyAdmin/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:48 +0000] "GET /phpMyAdmln/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:52 +0000] "GET /phpMyAdmin_ai/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:52 +0000] "GET /__phpMyAdmin/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:52 +0000] "GET /program/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:56 +0000] "GET /shopdb/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:56 +0000] "GET /phppma/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:11:56 +0000] "GET /phpmy/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:12:00 +0000] "GET /mysql/admin/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:12:04 +0000] "GET /mysql/dbadmin/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"
134.175.102.205 - - [19/Apr/2020:09:12:04 +0000] "GET /mysql/sqlmanager/index.php HTTP/1.1" 404 178 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0"

Unless you are running PHP, chances are you don’t really care about any of these requests. Well, I say ban the shit robots regardless. Why give them hope when you can crush their little mechanical dreams of hacking your site with a simple little app called fail2ban? I’m guessing that if you are reading this blog you are not using PHP. If on the off chance you are, then this solution is not for you, because it assumes that any request for something ending in .php came from a shitty little robot, and fail2ban will ban the requesting IP after 2 attempts.

I’m not going to go over how to install fail2ban; there are many good tutorials on how to do that. If you need one, here’s a link to a good one for Ubuntu: https://www.digitalocean.com/community/tutorials/how-to-protect-ssh-with-fail2ban-on-ubuntu-14-04

Anyhow, the idea is simple: if you request something that ends in .php, I log you to a custom log file, and fail2ban treats every line in that file as a match.

First, assuming you are using nginx, we log any request assumed to come from a shit robot to our custom ban_me.log:

# logs anything ending in .php
location ~ \.php$ {
  access_log /var/log/nginx/ban_me.log;
}

# logs anything ending in .cgi
location ~ \.cgi$ {
  access_log /var/log/nginx/ban_me.log;
}

# logs anything ending in .asp
location ~ \.asp$ {
  access_log /var/log/nginx/ban_me.log;
}

We do it this way because we don’t want fail2ban to have to read the main access log. nginx is already good at routing these requests into our custom log, so we let it do that instead.

From there, we point a fail2ban filter at this log and ban everything it finds.

For that I hijack fail2ban’s nginx-botsearch filter by copying /etc/fail2ban/filter.d/nginx-botsearch.conf to /etc/fail2ban/filter.d/nginx-botsearch.local. Then I update the regex match like so:

[INCLUDES]

# Load regexes for filtering
before = botsearch-common.conf

[Definition]

failregex = ^<HOST>.*?$

ignoreregex =

datepattern = {^LN-BEG}%%ExY(?P<_sep>[-/.])%%m(?P=_sep)%%d[T ]%%H:%%M:%%S(?:[.,]%%f)?(?:\s*%%z)?
              ^[^\[]*\[({DATE})
              {^LN-BEG}
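
The filter only tells fail2ban what to match; a jail is what ties it to the log file and sets the ban threshold. Here is a minimal sketch of such a jail, assuming it lives in /etc/fail2ban/jail.local and reuses the stock nginx-botsearch jail name. The file location and the port line are assumptions, so adjust them to your setup.

```ini
# /etc/fail2ban/jail.local  (assumed location)
[nginx-botsearch]
enabled  = true
# Assumed: the site is served on the standard HTTP/HTTPS ports
port     = http,https
# Picks up filter.d/nginx-botsearch.conf plus the .local override
filter   = nginx-botsearch
# The custom log that nginx writes the .php/.cgi/.asp requests to
logpath  = /var/log/nginx/ban_me.log
# Ban after 2 matching lines, matching the "2 attempts" mentioned earlier
maxretry = 2
# findtime and bantime are not set here, so they fall back to [DEFAULT]
```

Because the failregex matches every line, maxretry is effectively "how many junk requests an IP gets before it is banned".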

Make sure to restart fail2ban, and that’s it.

In the end, this should improve the signal-to-noise ratio in your logs by removing a good majority of the bad requests made by 💩 little 🤖.

Photo by: Tim Robinson