Installing the HAProxy Load Balancer on CentOS

This translation was prepared ahead of the start of the course «Linux administrator. Virtualization and Clustering».

Load balancing is a common solution for scaling out web applications across multiple hosts while providing users with a single point of access to a service. HAProxy is one of the most popular open source load balancers, and it also provides high availability and proxying functionality.

HAProxy aims to optimize resource usage, maximize throughput, minimize response time, and avoid overloading any single resource. It can be installed on a variety of Linux distributions, such as CentOS 8, which we will focus on in this guide, as well as Debian 8 and Ubuntu 16.

HAProxy is particularly suited to very high traffic websites and is therefore often used to improve the reliability and performance of multi-server web service configurations. This guide outlines the steps to set up HAProxy as a load balancer on a CentOS 8 cloud host, which then routes traffic to your web servers.

As a prerequisite, you should have at least two web servers and a separate load balancing server. The web servers must be running at least a basic web service, such as nginx or httpd, so that you can test load balancing between them.

Installing HAProxy on CentOS 8

Because HAProxy is a rapidly evolving open source application, the version available in the standard CentOS repositories may not be the latest one. To see which version the repositories offer, run the following command:

sudo yum info haproxy

HAProxy always provides three stable versions to choose from: the two most recent supported versions and the third, older version that is still receiving critical updates. You can always check the latest stable version listed on the HAProxy website and then decide which version you want to work with.

In this guide, we will install the latest stable version, 2.0, which was not yet available in the standard repositories at the time of writing. You will need to build it from source. But first, make sure the prerequisites for downloading and compiling the program are installed:

sudo yum install gcc pcre-devel tar make -y

Download the source code using the command below. You can check whether a newer version is available on the HAProxy download page.

wget http://www.haproxy.org/download/2.0/src/haproxy-2.0.7.tar.gz -O ~/haproxy.tar.gz

Once the download is complete, extract the files using the command below:

tar xzvf ~/haproxy.tar.gz -C ~/

Change to the unpacked source directory:

cd ~/haproxy-2.0.7

Then compile the program for your system:

make TARGET=linux-glibc

And finally install HAProxy itself:

sudo make install

HAProxy is now installed, but it needs some additional setup before it will work. Let's continue by configuring the software and services below.

Setting up HAProxy for your server

Next, create the following directories and the statistics file for HAProxy:

sudo mkdir -p /etc/haproxy
sudo mkdir -p /var/lib/haproxy 
sudo touch /var/lib/haproxy/stats

Create a symbolic link to the binary so you can run HAProxy commands as a normal user:

sudo ln -s /usr/local/sbin/haproxy /usr/sbin/haproxy

If you want to add the proxy to your system as a service, copy the haproxy.init file from examples to your /etc/init.d directory. Edit the file's permissions so the script will run, then reload the systemd daemon:

sudo cp ~/haproxy-2.0.7/examples/haproxy.init /etc/init.d/haproxy
sudo chmod 755 /etc/init.d/haproxy
sudo systemctl daemon-reload

You also need to enable the service so that it starts automatically at boot:

sudo chkconfig haproxy on

It is also recommended to add a dedicated system user to run HAProxy:

sudo useradd -r haproxy

After that, you can check the installed version number again with the following command:

haproxy -v
HA-Proxy version 2.0.7 2019/09/27 - https://haproxy.org/

In our case, the version should be 2.0.7, as shown in the sample output above.

Finally, the default firewall in CentOS 8 is quite restrictive for this project. Use the following commands to allow the required service and port, and then reload the firewall:

sudo firewall-cmd --permanent --zone=public --add-service=http
sudo firewall-cmd --permanent --zone=public --add-port=8181/tcp
sudo firewall-cmd --reload

Setting up a load balancer

Setting up HAProxy is a fairly simple process. Essentially, all you need to do is tell HAProxy which connections it should listen on and where to relay them.

This is done by creating a configuration file, /etc/haproxy/haproxy.cfg, that contains the defining settings. You can read about the HAProxy configuration options on the documentation page if you want to know more.

Load balancing at the transport layer (layer 4)

Let's start with the basic setup. Create a new configuration file, for example using vi with the command below:

sudo vi /etc/haproxy/haproxy.cfg

Add the following sections to the file. Replace server_name with the name your servers should be listed under on the statistics page, and private_ip with the private IP addresses of the servers you want to direct web traffic to. You can check the private IP addresses in the UpCloud control panel, on the Private network tab of the Network menu.

global
   log /dev/log local0
   log /dev/log local1 notice
   chroot /var/lib/haproxy
   stats timeout 30s
   user haproxy
   group haproxy
   daemon

defaults
   log global
   mode http
   option httplog
   option dontlognull
   timeout connect 5000
   timeout client 50000
   timeout server 50000

frontend http_front
   bind *:80
   stats uri /haproxy?stats
   default_backend http_back

backend http_back
   balance roundrobin
   server server_name1 private_ip1:80 check
   server server_name2 private_ip2:80 check

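This defines a load balancer with a frontend named http_front listening on port 80, which routes traffic to the default backend named http_back. It is used here in the layer 4 style, forwarding connections without making any decision based on the request content, although mode http in the defaults section means HAProxy still parses the HTTP traffic; a strict layer 4 setup would use mode tcp instead. The additional stats uri /haproxy?stats line enables the statistics page at the specified address.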

Load balancing algorithms

Listing servers in the backend section lets HAProxy balance load across them, here with the round robin algorithm, for as long as they are available.

Balancing algorithms are used to determine which server in the backend each connection is sent to. Here are some of the useful options:

  • Roundrobin: each server is used in turn according to its weight. This is the smoothest and fairest algorithm when the processing time of the servers remains evenly distributed. The algorithm is dynamic, which allows you to adjust a server's weight on the fly.
  • Leastconn: the server with the fewest connections is selected. Round robin is performed between servers with the same load. This algorithm is recommended for long sessions such as LDAP, SQL, TSE, etc., but is not well suited to short sessions such as HTTP.
  • First: the first server with available connection slots receives the connection. Servers are selected from the lowest numeric ID to the highest, which by default corresponds to the position of the server in the farm. Once a server reaches its maxconn value, the next server is used.
  • Source: the source IP address is hashed and divided by the total weight of the running servers to determine which server will receive the request. Thus, the same client IP address will always reach the same server as long as the set of servers does not change.
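As a rough illustration of the difference between roundrobin and source, the selection logic can be sketched in a few lines of shell. This is a simplified sketch, not HAProxy's implementation: the server names web1 to web3 and the helper functions are hypothetical, all servers are assumed to have equal weight, and cksum stands in for HAProxy's own hash function.

```shell
#!/bin/sh
# Illustrative only: HAProxy itself also handles weights, health checks
# and servers that come and go at runtime.

# Hypothetical backend list (placeholder names, not real hosts).
SERVERS="web1 web2 web3"

# server_count: print how many servers are in the list.
server_count() {
    set -- $SERVERS
    echo "$#"
}

# nth_server N: print the N-th (0-based) entry of the list.
nth_server() {
    i=0
    for s in $SERVERS; do
        if [ "$i" -eq "$1" ]; then
            echo "$s"
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# roundrobin (equal weights): request number N goes to server N mod count.
pick_roundrobin() {
    nth_server $(( $1 % $(server_count) ))
}

# source (equal weights): hash the client IP and take it modulo the server
# count, so one IP keeps mapping to one server while the list is unchanged.
pick_source() {
    sum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    nth_server $(( sum % $(server_count) ))
}

# Four consecutive requests cycle through the list and wrap around.
for n in 0 1 2 3; do
    pick_roundrobin "$n"
done
```

The loop prints web1, web2, web3 and then web1 again. Calling pick_source twice with the same address, for example 203.0.113.7, prints the same server both times, which is the stickiness the source algorithm provides.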

Configuring load balancing at the application layer (layer 7)

Another option is to configure the load balancer to work at the application layer (layer 7), which is useful when parts of your web application are located on different hosts. This is done by making the routing of a connection conditional, for example on the requested URL.

Open the HAProxy configuration file with a text editor:

sudo vi /etc/haproxy/haproxy.cfg

Then set up the frontend and backend segments according to the example below:

frontend http_front
   bind *:80
   stats uri /haproxy?stats
   acl url_blog path_beg /blog
   use_backend blog_back if url_blog
   default_backend http_back

backend http_back
   balance roundrobin
   server server_name1 private_ip1:80 check
   server server_name2 private_ip2:80 check

backend blog_back
   server server_name3 private_ip3:80 check

The frontend declares an ACL rule named url_blog that applies to all connections with paths beginning with /blog. The use_backend line specifies that connections matching the url_blog condition should be served by the backend named blog_back, while all other requests are handled by the default backend.

On the back end, the configuration sets up two server groups: http_back, as before, and a new one called blog_back, which handles connections to example.com/blog.
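The decision the frontend makes for each request can be mimicked with a shell case statement. This is only a sketch of the matching rule, not something HAProxy runs; choose_backend is a hypothetical helper name, while the backend names are the ones from the configuration above.

```shell
#!/bin/sh
# choose_backend PATH: mimic "acl url_blog path_beg /blog" followed by
# "use_backend blog_back if url_blog" and "default_backend http_back".
choose_backend() {
    case "$1" in
        /blog*) echo "blog_back" ;;   # path begins with /blog
        *)      echo "http_back" ;;   # everything else
    esac
}
```

For example, choose_backend /blog/post-1 prints blog_back and choose_backend /about prints http_back. Because path_beg is a plain prefix match, a path such as /blogger is also sent to blog_back; if that is not what you want, matching on /blog/ instead is one option.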

After changing the settings, save the file and restart HAProxy with the following command:

sudo systemctl restart haproxy

If you get any warnings or errors while starting, check the configuration for mistakes, make sure you have created all the necessary files and folders, and then try restarting again.
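One way to catch configuration mistakes before they interrupt the service is to run HAProxy in check mode (the -c flag) against the file and restart only if it parses. A minimal sketch, assuming the configuration path used in this guide; check_and_restart is a hypothetical helper name:

```shell
#!/bin/sh
# check_and_restart [CFG]: validate the configuration file first and
# restart the service only when the check succeeds.
check_and_restart() {
    cfg="${1:-/etc/haproxy/haproxy.cfg}"
    if haproxy -c -f "$cfg"; then
        sudo systemctl restart haproxy
    else
        echo "Configuration check failed; HAProxy was not restarted." >&2
        return 1
    fi
}
```

Run check_and_restart without arguments to use the default path. When the check fails, HAProxy reports the offending line, and the running instance is left untouched.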

Testing the setup

Once HAProxy is configured and running, open the public IP address of the load balancer server in a browser and check if you have connected to the backend correctly. The stats uri parameter in the configuration creates a statistics page at the specified address.

http://load_balancer_public_ip/haproxy?stats

When you load the stats page, if all your servers are green, then setup was successful!

The statistics page contains useful information for tracking your web hosts, including up/down time and number of sessions. If a server is marked red, make sure it is up and that you can ping it from the load balancer.

If your load balancer is not responding, make sure HTTP connections are not being blocked by a firewall. Also make sure HAProxy is running, using the command below:

sudo systemctl status haproxy

Protecting the statistics page with a password

If the stats page is only declared in the frontend, it is open to the public, which might not be a good idea. Instead, you can move it to its own port and protect it with a password by adding the example below to the end of your haproxy.cfg file. Replace username and password with something secure:

listen stats
   bind *:8181
   stats enable
   stats uri /
   stats realm Haproxy Statistics
   stats auth username:password

After adding the new listener group, remove the old stats uri reference from the frontend group. When finished, save the file and restart HAProxy.

sudo systemctl restart haproxy

Then open the load balancer again with the new port number and log in with the username and password you specified in the configuration file.

http://load_balancer_public_ip:8181
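The protected page can also be queried from the command line, since HAProxy serves the same statistics as CSV when ;csv is appended to the stats URI. The sketch below is an illustration rather than part of the original setup: server_status is a hypothetical helper, and it assumes that the status column is the 18th CSV field, as in the documented column order. Check the header line of your own output if the result looks wrong.

```shell
#!/bin/sh
# server_status BASE_URL USER PASSWORD: fetch the statistics as CSV using
# HTTP basic auth and print proxy name, server name and status per line.
server_status() {
    curl -s --max-time 5 -u "$2:$3" "$1/;csv" | cut -d ',' -f 1,2,18
}
```

For example, server_status http://load_balancer_public_ip:8181 username password lists one line per frontend, backend and server, with healthy servers reported as UP.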

Make sure all your servers are still green and then open only the load balancer IP without any port numbers in your browser.

http://load_balancer_public_ip/

If the landing pages on your internal servers differ from each other, you will notice that every time you reload the page, you get a response from a different host. You can try different balancing algorithms in the backend section of the configuration or consult the full documentation.
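Instead of reloading the page by hand, the distribution of responses can be sampled with curl. A minimal sketch, assuming each backend serves a page whose first line identifies the host; sample_backends is a hypothetical helper name:

```shell
#!/bin/sh
# sample_backends URL [COUNT]: request URL COUNT times (default 10) and
# tally how often each distinct first line of the response was seen.
sample_backends() {
    url="$1"
    count="${2:-10}"
    i=0
    while [ "$i" -lt "$count" ]; do
        body=$(curl -s --max-time 5 "$url" | head -n 1)
        printf '%s\n' "$body"
        i=$((i + 1))
    done | sort | uniq -c
}
```

With roundrobin and two healthy servers, sample_backends http://load_balancer_public_ip/ 10 should report roughly five responses from each host.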

Conclusion: HAProxy Load Balancer

Congratulations on successfully setting up your HAProxy load balancer! Even with a basic load balancing setup, you can greatly improve the performance and availability of your web application. This guide is just an introduction to load balancing with HAProxy, which is capable of much more than can be described in a quick setup guide. We recommend experimenting with different configurations, using the extensive documentation available for HAProxy, and then planning the load balancing for your production environment.

While using multiple hosts protects your web service with redundancy, the load balancer itself can still be a single point of failure. You can further improve high availability by setting up a floating IP between multiple load balancers. You can find out more about this in our article about floating IP addresses on UpCloud.

Source: habr.com