Usually, you can use keepalived + VRRP to make HAProxy redundant and provide good service availability.
In day-to-day operations, there are many cases where you have to take down a whole HAProxy host: server reboots (because of kernel/OS updates), hardware maintenance, OS distribution upgrades, or installing/upgrading HAProxy itself.
For these types of planned maintenance, DNS failover or other DNS-based solutions aren’t an option because you can’t control DNS caching (ISP resolvers, browsers, etc.) on the client side.
If you want to do planned maintenance on the MASTER of a keepalived/VRRP failover pair, the VIPs/services have to be failed over to the BACKUP node. Even with the lowest VRRP timers, you get a downtime of 3.6 seconds while keepalived fails the VIP over to the other node. Even worse, all current TCP sessions are terminated and your service is disrupted.
This problem can be solved at layer 3 with the help of routing protocols. In the solution I’ve implemented, the service IPs (VIPs) are configured as /32 addresses on the loopback interface of the HAProxy servers. These IPs are then announced from multiple HAProxy nodes with Quagga (you can use BIRD as well) as OSPF routes to the upstream layer 3 devices. This technique is also called “Route Health Injection” and is used here as a sort of anycast.
The upstream layer 3 devices then have multiple paths to these IPs and can balance the traffic over all available paths (with ECMP, for example) or use a preferred path.
I would have preferred BGP as the routing protocol for this use case, but it wasn’t available on the Cisco network gear in this project. If you can use BGP, you can also use a lightweight solution (ExaBGP/GoBGP) instead of Quagga/BIRD to announce the IPs.
This design is an active/passive setup of the HAProxy boxes. The HAProxy nodes are multihomed (connected to two upstream layer 3 devices) for redundancy. Which node is active is controlled by the different OSPF costs announced by Quagga. If you can use ECMP, you can turn this into an active/active solution and scale the HAProxy nodes horizontally (scale-out). The general network setup is shown in the figure below:
The setup above uses a VLAN interface on each L3 switch (due to the existing network architecture). In other network environments you can use a real point-to-point L3 connection for every link.
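If you go for the active/active (ECMP) variant mentioned above, both HAProxy nodes announce the service IPs with the same OSPF cost and the upstream devices install multiple equal-cost paths. A minimal sketch for the Cisco side (the maximum-paths limit and its default are platform/IOS-version dependent, so verify this on your gear):
router ospf 10
maximum-paths 4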
Configuration of the upstream layer 3 devices (Cisco)
sw01:
interface Vlan150
ip address 10.25.150.1 255.255.255.0
ip ospf hello-interval 5
ip ospf dead-interval 60
router ospf 10
router-id 10.25.150.1
log-adjacency-changes detail
passive-interface default
no passive-interface Vlan150
network 10.25.150.0 0.0.0.255 area 51
distribute-list prefix AllowedOSPFRoutes in
ip prefix-list AllowedOSPFRoutes seq 5 deny 0.0.0.0/0
ip prefix-list AllowedOSPFRoutes seq 6 deny 10.25.160.0/24
ip prefix-list AllowedOSPFRoutes seq 10 permit 0.0.0.0/0 le 32
sw02:
interface Vlan160
ip address 10.25.160.1 255.255.255.0
ip ospf hello-interval 5
ip ospf dead-interval 60
router ospf 10
router-id 10.25.160.1
log-adjacency-changes detail
passive-interface default
no passive-interface Vlan160
network 10.25.160.0 0.0.0.255 area 51
distribute-list prefix AllowedOSPFRoutes in
ip prefix-list AllowedOSPFRoutes seq 5 deny 0.0.0.0/0
ip prefix-list AllowedOSPFRoutes seq 10 deny 10.25.150.0/24
ip prefix-list AllowedOSPFRoutes seq 15 permit 0.0.0.0/0 le 32
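To double-check that the route filtering is in place on the switches, you can inspect the prefix list and the OSPF distribute-list settings (output omitted here):
sw01# sh ip prefix-list AllowedOSPFRoutes
sw01# sh ip protocols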
Configure Loopback IPs on HAProxy nodes
haproxy01:
auto lo
iface lo inet loopback
up ip addr add 10.46.46.46/32 dev lo
# to sw01
auto eth0
iface eth0 inet static
address 10.25.150.10
netmask 255.255.255.0
# to sw02
auto eth1
iface eth1 inet static
address 10.25.160.10
netmask 255.255.255.0
haproxy02:
auto lo
iface lo inet loopback
up ip addr add 10.46.46.46/32 dev lo
# to sw01
auto eth0
iface eth0 inet static
address 10.25.150.11
netmask 255.255.255.0
# to sw02
auto eth1
iface eth1 inet static
address 10.25.160.11
netmask 255.255.255.0
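After an ifup of the interfaces (or a reboot), a quick sanity check makes sure the /32 service IP is really bound to the loopback:
ip addr show dev lo
The output should list 10.46.46.46/32 in addition to 127.0.0.1/8.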
Quagga configuration
haproxy01:
!
hostname haproxy01
log file /var/log/quagga/zebra.log
log file /var/log/quagga/ospfd.log
!
password PLEASECHANGEME
enable password PLEASECHANGEME
!
interface eth0
ip ospf dead-interval 60
ip ospf hello-interval 5
ip ospf priority 0
ipv6 nd suppress-ra
no link-detect
!
interface eth1
ip ospf dead-interval 60
ip ospf hello-interval 5
ip ospf priority 0
ipv6 nd suppress-ra
no link-detect
!
interface lo
ip ospf cost 300
ip ospf priority 0
no link-detect
!
router ospf
ospf router-id 1.1.1.15
log-adjacency-changes detail
passive-interface default
no passive-interface eth0
no passive-interface eth1
no passive-interface lo
network 10.25.150.0/24 area 0.0.0.51
network 10.25.160.0/24 area 0.0.0.51
network 10.46.46.46/32 area 0.0.0.51
!
line vty
exec-timeout 0 0
!
haproxy02:
!
hostname haproxy02
log file /var/log/quagga/zebra.log
log file /var/log/quagga/ospfd.log
!
password PLEASECHANGEME
enable password PLEASECHANGEME
!
interface eth0
ip ospf dead-interval 60
ip ospf hello-interval 5
ip ospf priority 0
ipv6 nd suppress-ra
no link-detect
!
interface eth1
ip ospf dead-interval 60
ip ospf hello-interval 5
ip ospf priority 0
ipv6 nd suppress-ra
no link-detect
!
interface lo
ip ospf cost 400
ip ospf priority 0
no link-detect
!
router ospf
ospf router-id 1.1.1.20
log-adjacency-changes detail
passive-interface default
no passive-interface eth0
no passive-interface eth1
no passive-interface lo
network 10.25.150.0/24 area 0.0.0.51
network 10.25.160.0/24 area 0.0.0.51
network 10.46.46.46/32 area 0.0.0.51
!
line vty
exec-timeout 0 0
!
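Once zebra and ospfd are running on both nodes, you can verify from the HAProxy side that the OSPF adjacencies to both switches come up, for example via vtysh:
vtysh -c 'show ip ospf neighbor'
vtysh -c 'show ip ospf route'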
Setting the default route on the HAProxy Boxes
Because the servers are multihomed, you can’t simply configure a single static default gateway on one interface. One way to distribute a default route to the HAProxy nodes is OSPF:
sw01:
router ospf 10
default-information originate always metric 15
sw02:
router ospf 10
default-information originate always metric 20
With these settings, the HAProxy nodes receive two default routes and install the one with the lower metric.
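You can check on the HAProxy nodes which of the two originated default routes was actually installed (it should point to sw01 as long as its metric is the lower one):
ip route show default
vtysh -c 'show ip route 0.0.0.0/0'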
A disadvantage of this solution is that it can lead to asymmetric routing (the return traffic is sent out over an interface different from the incoming one). This can be difficult to debug and can cause problems when state is involved (firewalls, for example). Therefore, I’ve used a static configuration without OSPF to manage the default routes. This includes the use of Linux policy routing to send the return traffic out through the interface on which the traffic came in.
Configure “rp_filter” (also needed if the IP is only configured on a loopback device)
haproxy01 / haproxy02:
for i in /proc/sys/net/ipv4/conf/*/rp_filter; do echo 2 > $i; done
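The loop above does not survive a reboot. A minimal sketch to make the loose-mode setting persistent, assuming a sysctl.d-based system (the file name is just an example):
cat /etc/sysctl.d/99-rp-filter.conf:
net.ipv4.conf.all.rp_filter = 2
net.ipv4.conf.default.rp_filter = 2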
Configure iptables connection/packet marking based on the incoming interface
haproxy01 / haproxy02:
iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark
iptables -t mangle -A OUTPUT -j CONNMARK --restore-mark
iptables -t mangle -A INPUT -i eth0 -j MARK --set-mark 1
iptables -t mangle -A INPUT -i eth1 -j MARK --set-mark 2
iptables -t mangle -A INPUT -j CONNMARK --save-mark
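These rules are not persistent either; save them with whatever your distribution provides (iptables-persistent/netfilter-persistent on Debian, for example). To see that connections are actually being marked, the packet counters in the mangle table are helpful:
iptables -t mangle -L -n -v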
Configure Linux policy routing (enabling “ip_forward” is not required in this setup, nor desired)
haproxy01 / haproxy02:
echo 100 vl150 >> /etc/iproute2/rt_tables
echo 200 vl160 >> /etc/iproute2/rt_tables
ip rule add prio 100 from all fwmark 1 lookup vl150
ip rule add prio 110 from all fwmark 2 lookup vl160
ip route add table vl150 default via 10.25.150.1 dev eth0 metric 100
ip route add table vl160 default via 10.25.160.1 dev eth1 metric 100
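A quick way to verify that the marks select the intended tables is to look at the rules and the per-table routes:
ip rule show
ip route show table vl150
ip route show table vl160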
Verification of the setup
After all configuration steps are completed, verify that the routes are injected properly:
sw01# sh ip route ospf
10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
O 10.46.46.46/32 [110/301] via 10.25.150.10, 01:53:46, Vlan150
sw02# sh ip route ospf
10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
O 10.46.46.46/32 [110/301] via 10.25.160.10, 00:04:55, Vlan160
Service Healthcheck
If you use route health injection, you also have to monitor the HAProxy process itself.
If the HAProxy process fails, the routes to this node have to be withdrawn from OSPF; otherwise traffic is still sent to the node even though HAProxy isn’t running, and you are “blackholing” the traffic.
If you use BGP, there are many solutions with integrated healthcheck capabilities (ExaBGP/GoBGP, for example). If you use BIRD + BGP/OSPF, take a look at anycast-healthchecker.
I haven’t found something similar for Quagga, so I created something simple with monit. The following is a simple example that stops Quagga when the HAProxy process fails. When Quagga is stopped, the OSPF neighbors detect this and remove the routes (after the failure/timeout threshold).
‘monit’ configuration on haproxy01 / haproxy02
cat /etc/monit/conf.d/haproxy-quagga:
check process haproxy with pidfile /var/run/haproxy.pid
  if does not exist for 5 cycles then exec "/etc/init.d/quagga stop"
  else if succeeded for 6 cycles then exec "/etc/init.d/quagga start"
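After placing the file in /etc/monit/conf.d/, reload monit and check that the new check is picked up:
monit reload
monit summary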
Stopping the Quagga process is only one possibility; there are several ways to make the reaction more granular, for example:
- only set a higher OSPF cost instead of shutting down the Quagga process (see the sketch after this list)
- only remove specific routes for specific IPs
- remove routes when HAProxy has no backends available in a pool
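As a rough sketch of the first option (the script name and cost values are only examples for haproxy01, not a tested implementation), a small check script could raise the OSPF cost of the loopback via vtysh while HAProxy is down and restore it afterwards:
cat /usr/local/bin/haproxy-ospf-cost.sh:
#!/bin/sh
# keep announcing the service IP, but make this node unattractive while haproxy is down
if pidof haproxy >/dev/null; then
  vtysh -c 'conf t' -c 'interface lo' -c 'ip ospf cost 300'
else
  vtysh -c 'conf t' -c 'interface lo' -c 'ip ospf cost 1000'
fi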
How to do zero-downtime maintenance
Simply go to the active node (haproxy01 in our case) and set the OSPF cost higher than on haproxy02:
root@haproxy01:~# vtysh -c 'conf t' -c 'int lo' -c 'ip ospf cost 500'
After executing the command above, wait a few seconds until the upstream devices have switched their preferred path for the service IPs to haproxy02. Not a single packet is lost when the upstream devices change the path.
Before switchover:
sw01# sh ip route ospf
10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
O 10.46.46.46/32 [110/301] via 10.25.150.10, 3w2d, Vlan150
After switchover:
sw01# sh ip route ospf
10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
O 10.46.46.46/32 [110/401] via 10.25.150.11, 00:00:06, Vlan150
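Once the maintenance on haproxy01 is finished, you can fail back by restoring its original cost (300 in this setup); the switchover back is just as hitless:
root@haproxy01:~# vtysh -c 'conf t' -c 'int lo' -c 'ip ospf cost 300'
Keep in mind that changes made via vtysh only affect the running configuration; save them with 'write memory' in vtysh if they should survive a Quagga restart.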