HAProxy automatic failover

My aims were simple:
  • If a server fails, stop using it.
  • If said server starts working again (i.e. because the problem is fixed) start using it again.
  • If all servers in the load balancer pool fail, serve a temporary static page from another location.
But first: Some basic HAProxy concepts
HAProxy as a load balancer is fairly simple, and works on the basis of defined frontends and backends. A frontend is simply an IP and port declaration that you want the load balancer to listen on. A backend is the set of servers that requests to a frontend are sent to. Your HAProxy configuration lists the frontends and their respective backends, as well as the load balancing algorithm you wish to use to distribute traffic amongst the backend nodes. Here’s an example: one frontend proxying to two backend nodes.
 frontend my_tls_frontend
     mode tcp
     bind *:443 # Port/IP to bind on
     bind ipv6@:443
     default_backend my_tls_backend # Backend to proxy to

 ...

 backend my_tls_backend
     mode tcp
     balance roundrobin # Balancing algorithm
     server node1 10.0.0.1:443 # Server 1
     server node2 10.0.0.2:443 # Server 2
The ‘balance’ statement here defines a load balancing algorithm to use:
  • roundrobin: This is the default option, and simply picks each node in turn. It can also be weighted to prioritise some nodes over others.
  • leastconn: Picks the node with the fewest active connections (i.e. the least busy server).
  • source: Hashes the visitor’s IP to determine which node to send them to. This ensures that, assuming nothing changes in the pool, a visitor will always hit the same backend node.
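For example, the weighting and source-hashing behaviour mentioned above might look something like this (the weights and addresses here are just placeholders, not part of my actual setup):

 backend weighted_backend
     balance roundrobin
     server node1 10.0.0.1:443 weight 3 # Receives roughly three times the traffic of node2
     server node2 10.0.0.2:443 weight 1

 backend sticky_backend
     balance source # Hash the visitor IP so they always land on the same node
     server node1 10.0.0.1:443
     server node2 10.0.0.2:443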
I was interested in HAProxy’s health-checking abilities:

Introducing: HAProxy health checking

HAProxy has some fairly broad health-checking features which allow it to check the state of any server in a backend pool on a set interval.
At its most basic, you simply add the ‘check’ statement to each backend node:

  server node1 10.0.0.1:443 check # Server 1
  server node2 10.0.0.2:443 check # Server 2

This instructs HAProxy to perform basic health checks on the servers by opening a TCP connection and verifying that it gets a response. If it doesn’t, it marks the offending server as “Down” and stops directing traffic to it.

That’s all well and good, but what if we need a more detailed check? Fortunately, HAProxy can check services in more depth. In this case, we want to send an HTTP request to each backend and check whether we get a good HTTP response back (i.e. a 2XX or 3XX code).

This is done with the ‘option httpchk’ statement in your backend configuration, which tells HAProxy to send an HTTP request to each server. In your backend definition, add:

  option httpchk GET / HTTP/1.1\r\nHost:\ blog.jcdev.org

(I know HAProxy’s method is a little hacky, with the whole request on one line.)
The above sends a GET request for / to each backend node, setting the Host header as appropriate to ensure the backend servers handle the request correctly.
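As an aside, newer HAProxy releases (2.2 and later) let you split the request definition out of the ‘option httpchk’ line, which is a little tidier. Something like this should be equivalent to the check above:

 option httpchk
 http-check send meth GET uri / hdr Host blog.jcdev.org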

Next, we need to define the health-check interval, and the conditions for marking a node as ‘Down’ and then (hopefully) ‘Up’ again. This is done in the backend with a ‘default-server’ line carrying the ‘inter’, ‘fall’ and ‘rise’ parameters, which looks like this:
 default-server inter 60s fall 2 rise 5

What this says is:
  • inter: The interval at which to check each node.
  • fall: The number of consecutive checks that must fail before the server is marked as ‘Down’.
  • rise: The number of consecutive checks that must succeed before the server is marked as ‘Up’ again.
The above settings are quite trigger-happy, so feel free to adjust them to your needs.
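These parameters can also be set per server if one node needs different thresholds from the rest; for example (the values here are purely illustrative):

 default-server inter 60s fall 2 rise 5
 server node1 10.0.0.1:443 check
 server node2 10.0.0.2:443 check inter 10s fall 3 rise 2 # Checked more aggressively than the default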

Backup nodes

Finally: in the event that all of the nodes we’re using fail, we want to define a ‘backup’ server to direct requests to. Although HAProxy has custom error page support for when no nodes in a backend are available, I wanted something a little more flexible, so I spun up an instance of Nginx on port 8443 to serve a simple message. Adding this to our load balancer configuration is as simple as adding the node with the ‘backup’ keyword:
server backup 127.0.0.1:8443 backup
This ensures that it isn’t part of the normal pool for serving load-balanced traffic; it only receives requests once all the other nodes are down.
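For completeness, the Nginx side of this is nothing special. Because my example configuration runs the backend in TCP mode (so TLS passes straight through to the servers), the holding-page instance needs its own certificate; a minimal server block along these lines would do, with the certificate and page paths being whatever suits your setup:

 server {
     listen 127.0.0.1:8443 ssl;
     ssl_certificate     /etc/nginx/ssl/holding.crt;
     ssl_certificate_key /etc/nginx/ssl/holding.key;

     root /var/www/holding; # Directory containing the static holding page
     index index.html;
 }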
Our completed backend configuration would look something like this:
 backend my_tls_backend
     mode tcp
     balance roundrobin
     default-server inter 60s fall 1 rise 2
     option httpchk GET / HTTP/1.1\r\nHost:\ blog.jcdev.org
     server node1 10.0.0.1:443 check check-ssl
     server node2 10.0.0.2:443 check check-ssl
     server backup 127.0.0.1:8443 backup
Note that I’ve used the ‘check-ssl’ parameter alongside the ‘check’ keyword, which ensures the health check connects over TLS. If you don’t do this, HAProxy sends a plain HTTP request to port 443, which your backend servers won’t like.
With the above configuration, we now have a working load balancer which will fail over in the event of a problem, and fail back automatically when the issue is fixed.
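If you want to watch the failover happen, HAProxy’s runtime stats are useful. Assuming you have a stats socket enabled in your global section (the socket path below is just an example), you can query the state of every server in the backend:

 global
     stats socket /var/run/haproxy.sock mode 660 level admin

With that in place, running ‘echo "show servers state my_tls_backend" | socat stdio /var/run/haproxy.sock’ from the shell reports the operational state of each node, including the backup.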

Bonus: Email Alert notifications.

Although it’s a good idea to monitor the health of each node separately (using a monitoring tool such as Nagios), HAProxy 1.6 and later can also send email notifications when a server changes state.
More information on this feature can be found in the HAProxy documentation.
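As a rough sketch (the SMTP relay and email addresses below are placeholders), the configuration involves a ‘mailers’ section plus a few ‘email-alert’ statements in the backend:

 mailers my_mailers
     mailer smtp1 192.0.2.25:25

 backend my_tls_backend
     email-alert mailers my_mailers
     email-alert from haproxy@example.org
     email-alert to ops@example.org
     email-alert level notice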
