BitBucket push surprise

Today I was very surprised and even scared by the response I got when pushing my changes to a repository hosted on BitBucket.

Not that I am against people celebrating whatever they want, changing site logos and so on, but this is way too much, especially for this kind of task. When you do a lot of pushes all the time, your mind gets used to recognising patterns in the response messages and acting accordingly, but this one throws it off completely. To put it simply: imagine if traffic lights changed their colours from time to time for similar occasions…

Amazon AWS Subnet Custom Gateway

While Amazon provides different ways to route traffic within and out of your subnets by means of internet gateways and NAT gateways, it is not always the case that these will suit your needs. If you want full control with lots of possibilities for customisation, you might consider building your own firewall instance and pushing all traffic through it.

Amazon provides NAT instances, but they also have some limitations, so to get the full feature set it is possible to build a custom EC2 instance with whatever AMI and settings you like, attach two network interfaces to it (one in a private subnet, one in a public subnet) and do classic iptables NAT on it.
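Just as an illustration (not a complete firewall setup), the NAT part on such an instance could look roughly like the snippet below; the eth0/eth1 interface names are assumptions, so adjust them to your AMI:

# enable forwarding between the two interfaces
sysctl -w net.ipv4.ip_forward=1

# classic source NAT: masquerade everything leaving via the public-subnet interface
# (eth0 = public subnet, eth1 = private subnet - assumed names)
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
iptables -A FORWARD -i eth1 -o eth0 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth1 -m state --state RELATED,ESTABLISHED -j ACCEPT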

For all instances in the private subnet to be routed via your custom firewall, you need to adjust the routing table for that subnet and point the default route to the network interface of the firewall instance that sits in that same private subnet.
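If you prefer to script this instead of clicking through the console, a rough sketch with the AWS CLI could look like the following; the rtb-, eni- and i- identifiers are placeholders for your own resources:

# point the private subnet's default route at the firewall's private-subnet ENI
# (all IDs below are placeholders)
aws ec2 create-route --route-table-id rtb-xxxxxxxx \
    --destination-cidr-block 0.0.0.0/0 \
    --network-interface-id eni-xxxxxxxx

# the firewall instance also needs source/destination checking disabled,
# otherwise it will not forward traffic that is not addressed to itself
aws ec2 modify-instance-attribute --instance-id i-xxxxxxxx --no-source-dest-check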

All of the above works pretty well, but looks a bit weird. Let's assume the following:

  • We have a VPC 10.10.0.0/16
  • We have a private subnet 10.10.20.0/24 within our VPC
  • We have a public subnet 10.10.10.0/24 within our VPC
  • We have a firewall instance with:
    • public subnet IP: 10.10.10.10
    • private subnet IP: 10.10.20.20
  • We have a host in the private subnet with IP 10.10.20.50

Now, the first question is why we don't use 10.10.20.1 on the firewall instance in the private subnet? Easy: it is used by the Amazon gateway, and even though we have created a routing table that throws everything to 10.10.20.20, on an actual host in the private subnet the routing table will be:

default via 10.10.20.1 dev eth0 
10.10.20.0/24 dev eth0 proto kernel scope link src 10.10.20.50 

This means that the host will send traffic to the AWS gateway, which will then pass it over to our firewall. Probably the idea behind such a configuration is that AWS still needs to check security groups and so on before it hands the traffic over to us.

Cool, and this works fine, but there is a small issue: if you try to ping or access any services at the firewall's public IP (10.10.10.10) from the host in your private subnet – you will fail! Moreover, if you fire up tcpdump on the firewall server, listening on the private-subnet interface (10.10.20.20) for any packets from the host in the private subnet, and try to ping 10.10.10.10 from the private host – you will see absolutely nothing related to this. Nor will you see any other activity from your private host towards the public IP address of your firewall.
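For reference, a capture along these lines is what stays completely silent in that case; eth1 is assumed to be the firewall interface attached to the private subnet, so adjust the name to your instance:

# assuming eth1 is the private-subnet interface of the firewall (adjust as needed)
tcpdump -ni eth1 host 10.10.20.50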

Want it even weirder? If you try to access any other host in the public subnet from your private host (for the sake of example, assume you have another host with IP 10.10.10.99 and you try to ping it from 10.10.20.50): this will work as expected and traffic will flow via the firewall as configured.

Not sure why or how, but Amazon seems to block access via 10.10.20.1 from any host in the 10.10.20.0/24 network to 10.10.10.10, because that IP belongs to the firewall that acts as the default gateway for hosts on 10.10.20.0/24 (even though the IP itself is in another subnet).

The solution to this problem (if it is a problem in your case) is either to put a direct route to 10.10.10.10 via 10.10.20.20 on the private host, so that the private host avoids using Amazon's 10.10.20.1 for this destination:

default via 10.10.20.1 dev eth0 
10.10.10.10/32 via 10.10.20.20 dev eth0
10.10.20.0/24 dev eth0 proto kernel scope link src 10.10.20.50

or to ignore 10.10.20.1 completely and set the default gateway to 10.10.20.20:

default via 10.10.20.20 dev eth0 
10.10.20.0/24 dev eth0 proto kernel scope link src 10.10.20.50

Either way, you won't be able to do this via AWS routing tables; you will have to configure routing directly on the private host with the ip route tool, or with a route-ethN file for persistence.
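For example, the first variant on the private host could be done along these lines (a sketch assuming a RHEL/CentOS-style image with eth0 as the only interface):

# add a host route to the firewall's public IP via its private-subnet interface
# (eth0 is an assumption - use whatever interface faces 10.10.20.0/24)
ip route add 10.10.10.10/32 via 10.10.20.20 dev eth0

To make it persistent on such systems, the same "10.10.10.10/32 via 10.10.20.20 dev eth0" line can go into /etc/sysconfig/network-scripts/route-eth0.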

Keep in mind that if you divert all traffic via 10.10.20.20, you might lose some security group checks within the subnet, so make sure to implement whatever security you need on the actual firewall.

Fixing very outdated Let’s Encrypt

Following my brother's post about Fixing outdated Let's Encrypt, which is pretty useful when sorting out the SSL stuff on servers, I ran into a problem where even the given solution doesn't help and you still receive a message about the missing zope.interface, just like in the initial post.

Luckily, the comment from @skatsumata on GitHub proposes a working solution:

# pip install pip --upgrade
# pip install virtualenv --upgrade
# virtualenv -p /usr/bin/python27 venv27
# . venv27/bin/activate

After the above is done, you still need to re-init Let's Encrypt as per my brother's post:

# rm -rf /root/.local/share/letsencrypt
# /opt/letsencrypt/letsencrypt-auto --debug

And then renew the certs as you normally do.

HAProxy abuse filtering and rate limiting

Just recently I covered Nginx rate limit by user agent (control bots), which is all cool and handy, but what if you have a number of Nginx servers behind HAProxy and want to offload some of the job to it? Fortunately, HAProxy is very easy to configure and very flexible with ACLs. Here is a simple example of how to do different blacklists and rate limiting (just a part of the configuration, apply where appropriate):

frontend HTTP
 bind *:80
 description Incoming traffic to port 80
 # IP address white/blacklist
 tcp-request connection accept if { src -f /etc/haproxy/whitelist.lst }
 tcp-request connection reject if { src -f /etc/haproxy/blacklist.lst }

 # Maximum possible time delay on inspection; it is also the wait imposed
 # on clients that are too fast (see the WAIT_END rule below)
 tcp-request inspect-delay 15s

 # ACLs for blacklisted UAs and paths
 acl abuse_ua hdr_sub(user-agent) -f /etc/haproxy/blacklist_ua.lst
 acl abuse_path path_beg -f /etc/haproxy/blacklist_path.lst

 # Reject blacklisted UAs and paths
 tcp-request content reject if abuse_ua
 tcp-request content reject if abuse_path

 # The frontend is accepting 10 or more new sessions per second
 acl too_fast fe_sess_rate ge 10

 # Fast path - accept the connection straight away unless the rate is exceeded
 tcp-request content accept unless too_fast

 # Too-fast clients get here, meaning they have to wait the full inspect-delay
 tcp-request content accept if WAIT_END

Whenever you refer to a list file from your configuration, make sure the file is actually in place (even if it is empty), otherwise HAProxy will refuse to start.
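For illustration only, these files are plain lists with one pattern per line (lines starting with # are ignored); the entries below are made up:

# /etc/haproxy/blacklist_ua.lst - substrings matched against the User-Agent header (examples)
SemrushBot
python-requests

# /etc/haproxy/whitelist.lst - IP addresses or networks (examples)
203.0.113.10
10.0.0.0/8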

The only limitation of the above is that you can't really check headers if you are using an HAProxy SSL frontend with SSL SNI, but in that case you can still implement those limits on the Nginx side. The fe_sess_rate limit, though, is still applicable.

One note that I forgot to mention in my previous post on Nginx rate limits: you can also adjust it to work based on requested paths, not only user agents.
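As a rough sketch (the path, zone name and rate are made up for illustration, mirroring the user-agent maps from that post), a path-based limit could be built the same way:

# map requests for a hypothetical /search path to the client address, everything else to ""
map $uri $limit_path {
        default "";
        ~*^/search $binary_remote_addr;
}

limit_req_zone $limit_path zone=paths:10m rate=1r/s;
limit_req zone=paths burst=5 nodelay;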

P.S.: When dealing with configuration changes, make sure to check the validity of the config file before restarting/reloading the service. You can do it with haproxy -c -f /etc/haproxy/haproxy.cfg for HAProxy or nginx -t for Nginx.

 

Nginx rate limit by user agent (control bots)

As search engine indexing bots are getting more and more intelligent, and thus more aggressive, sometimes they become really annoying or can even affect the performance of the system.

While Nginx is a very powerful and flexible system, it is not always clear how to put all the configuration together to do the job. It gets even harder when a single Nginx server serves multiple virtual hosts and you want to apply the same policy to all of them from within the http section of the configuration, instead of the server section of each site.

For request rate limiting, Nginx uses the ngx_http_limit_req_module module, and it is pretty much straightforward to limit by IP address or any other simple value, but for advanced configuration you need to use maps with either static keys or regexps. That's where things get more confusing, especially if you want to have some defaults and whitelisting (exclusion from limiting) based on a certain condition.

The Nginx documentation for rate limiting, with regard to exclusions, states:

The key can contain text, variables, and their combination. Requests with an empty key value are not accounted.

Sounds easy, but finding the correct way to implement such an exclusion took me quite some time of googling, reading, trying, failing, googling again and so on. So, just to have a correct solution documented somewhere closer to me, I will cover it here.

For clarity, and a more extended solution, let's assume we want to limit user agents matching the (GoogleBot|bingbot|YandexBot|mj12bot) pattern to 1 request per IP per minute, burstable to 2, and the rest of the world to 10 requests per IP per second, burstable to 15. To do this, the http section of nginx.conf has to contain the following part:

map $http_user_agent $isbot_ua {
        default 0;
        ~*(GoogleBot|bingbot|YandexBot|mj12bot) 1;
}
map $isbot_ua $limit_bot {
        0       "";
        1       $binary_remote_addr;
}

limit_req_zone $limit_bot zone=bots:10m rate=1r/m;
limit_req_zone $binary_remote_addr zone=one:10m rate=10r/s;

limit_req zone=bots burst=2 nodelay;
limit_req zone=one burst=15 nodelay;

The trick here is that we need to use two maps: the first one sets $isbot_ua to 0 or 1 based on $http_user_agent, and the second one sets $limit_bot to an empty value for everyone who got 0 in the first map, but to $binary_remote_addr for those who got 1. The idea is that for nginx to exclude a request from a limit zone, the zone key has to be empty, so the first map classifies the user agent, and the second map turns that classification into either an empty key (not limited) or the client address (limited).

The rest of the configuration parameters are pretty easy to understand and I won't cover them here, since you can easily refer to the nginx documentation.

To make things even nicer, we can also tell nginx to send a proper HTTP 429 code (Too Many Requests) when someone is above the limit, and hope that the requester will interpret it accordingly and slow down. To do this, just add the following line in the same part of the nginx configuration:

limit_req_status 429;

If you are using the limit_conn directive anywhere in nginx, you can add the same thing for it as well:

limit_conn_status 429;

Hope the above will save me, and maybe even someone else, some time. More similar posts are to come later as I get my hands on different things.

P.S.: do you know the difference between “~” and “~*” in nginx configuration when dealing with regexes? The answer is pretty simple: the first one matches case-sensitively, while the latter matches case-insensitively ;-)