
Instructor: Hadi Alnabriss

WHAT IS HAPROXY?
• Even if you optimize your service configuration, your service will
sometimes fail
• This is normal: any software or hardware has a maximum capacity,
beyond which it cannot accept any more connections

Q) How can we increase the capacity of our service?


INCREASE YOUR SERVICE CAPACITY
• To increase service capacity, you can run more than one copy (image) of your server
• Clients can then access any of the servers

• But how?
CLIENT REQUESTS DISTRIBUTION
• How can you distribute client requests among your web servers?

• Maybe with DNS, by publishing one A record per server:

Myservice.com A 5.5.5.1
Myservice.com A 5.5.5.2
Myservice.com A 5.5.5.3

• But this has problems:
• DNS cache: clients keep using the address they have cached
• If one server fails, DNS still hands out its address
SOLUTION
• Use a load balancer
• All requests are directed to the load balancer
• The load balancer forwards requests to the web
servers according to its configuration
WHAT IS HAPROXY?
• a TCP proxy: it can accept a TCP connection from a listening socket,
connect to a server and attach these sockets together allowing traffic to
flow in both directions
WHAT IS HAPROXY?
• HTTP reverse-proxy: it presents itself as a server, receives HTTP requests over
connections accepted on a listening TCP socket, and passes the requests from
these connections to servers using different connections.
WHAT IS HAPROXY?
• an SSL terminator: SSL/TLS may be used on the connection coming from the
client, on the connection going to the server, or even on both connections.

WHAT IS HAPROXY?
• a TCP normalizer: abnormal traffic such as invalid packets or incomplete
connections (SYN floods) can be dropped here
WHAT IS HAPROXY?
• an HTTP normalizer: when configured to process HTTP traffic, only valid complete
requests are passed.
• This protects against a lot of protocol-based attacks.
WHAT IS HAPROXY?
• a server load balancer: it can load balance TCP connections and HTTP requests.
• In TCP mode, load balancing decisions are taken for the whole connection.
• In HTTP mode, decisions are taken per request.
WHAT IS HAPROXY?
• a Traffic Regulator: it can apply some rate limiting at various points, protect the
servers against overloading, adjust traffic priorities based on the contents, and
even pass such information to lower layers and outer network components by
marking packets.

Max Connections : 5000


WHY HAPROXY?
• Load Balancer
• Fast, reliable
• Comprehensive statistics and monitoring
• HAProxy is an open source project covered by the GPLv2 license,
meaning that everyone is allowed to redistribute it provided that access to the
sources is also provided upon request, especially if any modifications were
made.
HAPROXY TASKS
• process incoming connections
• periodically check the servers' status (known as health checks)
HAPROXY COMPONENTS
• Frontend system: defines the IP address and port on which the proxy listens
• Back-end system: a pool of real servers; it also defines the
load balancing (scheduling) algorithm
HAPROXY SCHEDULING ALGORITHMS
(1) Round-Robin (roundrobin)
• Distributes each request sequentially around the pool of real servers.
• All the real servers are treated as equals without regard to capacity or load.
HAPROXY SCHEDULING ALGORITHMS
• Round-Robin (roundrobin)
• For example, assume the following scenario:
• The URL requested on the 1st server needs 5 seconds to finish
• The URL requested on the 2nd server needs 1 second to finish
• The URL requested on the 3rd server needs 5 seconds to finish

What is the situation after 2 seconds? The 2nd server is idle again, while the 1st and 3rd servers are still busy.


HAPROXY SCHEDULING ALGORITHMS
• Round-Robin (roundrobin)
• What if three new requests arrive now?
• Round-robin still hands one request to each server in turn, so more load is added to servers that are still busy
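As a minimal sketch of how this scheduler is selected, a backend can name the algorithm with the balance directive. The backend name app and the server names and addresses below are hypothetical placeholders, not values from the slides.

```haproxy
backend app
    # each server receives requests in turn, regardless of how busy it is
    balance roundrobin
    server app1 192.168.1.1:80 check   # hypothetical address
    server app2 192.168.1.2:80 check   # hypothetical address
    server app3 192.168.1.3:80 check   # hypothetical address
```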
HAPROXY SCHEDULING ALGORITHMS
(2) Least-Connection (leastconn)
• Distributes more requests to real servers with fewer active connections.
• Administrators with a dynamic environment with varying session or connection
lengths may find this scheduler a better fit for their environments.
• It is also ideal for an environment where the servers have different
capacities
• Supports server weights
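A possible configuration for this scheduler, assuming two servers of different capacity, could look like the following. The names, addresses and weight values are illustrative assumptions only.

```haproxy
backend app
    # prefer the server with the fewest active connections
    balance leastconn
    # the weights are hypothetical: app2 is assumed to be twice as powerful as app1
    server app1 192.168.1.1:80 check weight 10
    server app2 192.168.1.2:80 check weight 20
```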
HAPROXY SCHEDULING ALGORITHMS
(3) Source
• The same client IP always reaches the same server as long as no server goes up or down
• This algorithm is generally used in TCP mode, where cookies cannot be inserted.
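The sketch below shows the source algorithm on a TCP backend. The SMTP service, the backend name mail and the addresses are assumptions chosen only to illustrate a non-HTTP case.

```haproxy
backend mail
    mode tcp
    # hash the client IP address so the same client keeps reaching the same server
    balance source
    server mail1 192.168.1.11:25 check   # hypothetical address
    server mail2 192.168.1.12:25 check   # hypothetical address
```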
HAPROXY SCHEDULING ALGORITHMS
(4) First
• The first server with available connection slots receives the connection. Once a
server reaches its maxconn value, the next server is used.
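Because this algorithm depends on each server's maxconn value, a sketch needs a per-server limit. The limit of 100 and the server names and addresses are hypothetical.

```haproxy
backend app
    # fill app1 first; app2 is only used once app1 holds 100 connections
    balance first
    server app1 192.168.1.1:80 check maxconn 100
    server app2 192.168.1.2:80 check maxconn 100
```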
HAPROXY SCHEDULING ALGORITHMS
(5) URL Parameter
• This static algorithm can only be used on an HTTP backend
• The specified URL parameter is looked up in the query string of each HTTP
GET request, for example page in http://test.com/?page=index
• If the parameter that is found is followed by an equal
sign and a value, the value is hashed and divided by
the total weight of the running servers.
• If the parameter is missing from the URL, the
scheduler defaults to round-robin scheduling
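Using the page parameter from the example URL, a backend for this algorithm might be sketched as follows. The backend and server names and the addresses are assumptions.

```haproxy
backend app
    mode http
    # hash the value of the "page" query-string parameter, e.g. /?page=index
    balance url_param page
    server app1 192.168.1.1:80 check   # hypothetical address
    server app2 192.168.1.2:80 check   # hypothetical address
```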
HAPROXY SCHEDULING ALGORITHMS
(6) URI
• This algorithm hashes either the left part of the URI (before the question mark)
or the whole URI
• Example URL: http://test.com/?page=index
• This ensures that the same URI will always be
directed to the same server as long as no server goes
up or down. This is used with proxy caches and
anti-virus proxies in order to maximize the cache hit
rate.
• Note that this algorithm may only be used in an HTTP
backend
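A hedged sketch of both variants is shown below; the backend name cache and the server entries are hypothetical, and only one balance line would be active at a time.

```haproxy
backend cache
    mode http
    # hash only the part of the URI before the question mark
    balance uri
    # alternative: hash the whole URI, including the query string
    # balance uri whole
    server cache1 192.168.1.41:80 check   # hypothetical address
    server cache2 192.168.1.42:80 check   # hypothetical address
```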
HAPROXY SCHEDULING ALGORITHMS
(7) Header
• Distributes requests to servers by looking up a particular header name in each
HTTP request, hashing its value, and dividing the hash by the total weight of
all running servers.
• If the header is absent, the scheduler defaults to round-robin scheduling.
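As an illustration, the following sketch hashes the User-Agent header. The choice of header, the backend name and the addresses are assumptions made for the example.

```haproxy
backend app
    mode http
    # hash the value of the User-Agent request header
    balance hdr(User-Agent)
    server app1 192.168.1.1:80 check   # hypothetical address
    server app2 192.168.1.2:80 check   # hypothetical address
```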
HAPROXY CONFIGURATION
HAProxy is configured by editing the /etc/haproxy/haproxy.cfg file.

The configuration file includes the following sections:

• Global Settings
• Default Settings
• Frontend Settings
• Backend Settings
HAPROXY CONFIGURATION
(1) Global Settings
• Parameters in the "global" section are process-wide and often OS-specific.
• They are generally set once and do not need to be changed once
correct. Some of them have command-line equivalents.
HAPROXY CONFIGURATION
• The log parameter sends all log entries to the local syslog server.
• The maxconn parameter specifies the maximum number of concurrent
connections.
• The user and group parameters specify the user name and group name
that the haproxy process runs as.
• The daemon parameter specifies that haproxy runs as a background process.
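The parameters described above could appear in a global section like the one sketched here. The syslog facility local2 and the maxconn value are assumptions, not values taken from the slides.

```haproxy
global
    log     127.0.0.1 local2   # send log entries to the local syslog server
    maxconn 4000               # hypothetical limit on concurrent connections
    user    haproxy            # run the process as this user
    group   haproxy            # and this group
    daemon                     # run in the background
```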
HAPROXY CONFIGURATION
(2) Default Settings
• Sets default parameters for all other sections following its declaration.
HAPROXY CONFIGURATION
• mode specifies the protocol for the HAProxy instance.
  • http mode connects source requests to real servers based on HTTP; it is ideal for load balancing web servers.
  • For other applications, use tcp mode.
  • http mode allows algorithms that need HTTP information, such as URL parameter.

• log specifies the log address and syslog facility to which log entries are written.
• option httplog enables logging of various values of an HTTP session, including HTTP requests, session
status, connection numbers, source address, and connection timers, among other values.
• option dontlognull disables logging of null connections, meaning that HAProxy will not log connections
in which no data has been transferred.
  • Null connections can indicate malicious activity such as port scanning for vulnerabilities.
HAPROXY CONFIGURATION
• retries: the number of times a connection attempt should be retried on a server when a
connection is either refused or times out
• timeout http-request 10s: period to wait for a complete HTTP request from a client
• timeout queue 1m: period a request may wait in the queue before it is dropped and the client
receives a 503 "Service Unavailable" error
• timeout connect 10s: period to wait for a successful connection to a server
• timeout client 1m: period a client can remain inactive (neither accepting nor sending data)
• timeout server 1m: period a server is given to accept or send data before a timeout occurs
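Put together, the default settings described above might look like this sketch. The timeout values come from the bullets; the retries value of 3 is an assumption.

```haproxy
defaults
    mode    http
    log     global                 # reuse the log target from the global section
    option  httplog
    option  dontlognull
    retries 3                      # hypothetical retry count
    timeout http-request 10s
    timeout queue        1m
    timeout connect      10s
    timeout client       1m
    timeout server       1m
```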
HAPROXY CONFIGURATION
(3) Frontend Section:
• The frontend settings configure the proxy's listening sockets for client
connection requests

• The frontend is called main
• It is configured to listen on the socket 192.168.0.10:80
• Once a client is connected, the backend directive (use_backend or default_backend)
specifies that all sessions connect to the app back end
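A frontend matching this description can be sketched as below. It uses default_backend to send every session to app, which is one assumed way of expressing the rule.

```haproxy
frontend main
    bind 192.168.0.10:80     # listening socket for client connections
    default_backend app      # all sessions go to the "app" back end
```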
HAPROXY CONFIGURATION
(4) Backend Section
• Specifies the real server IP addresses as well as the load balancer scheduling algorithm.

• The back end is named app.
• The balance directive specifies the load balancer scheduling algorithm to be used.
• The server lines specify the servers available in the back end.
  • app1 to app4 are the names assigned internally to each real server.
HAPROXY CONFIGURATION
• The check option flags a server for periodic health checks.
  • inter 2s: interval between health checks
  • rise 4: number of consecutive valid health checks before the server is considered UP
  • fall 3: number of consecutive invalid health checks before the server is considered DOWN
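The backend and health-check options above can be combined as in the following sketch. The names app and app1 to app4 come from the slides; the addresses, the choice of roundrobin and the backup flag on app4 are assumptions.

```haproxy
backend app
    balance roundrobin
    server app1 192.168.1.1:80 check
    server app2 192.168.1.2:80 check
    # check every 2s; 4 good checks mark the server UP, 3 bad checks mark it DOWN
    server app3 192.168.1.3:80 check inter 2s rise 4 fall 3
    # assumed: app4 is only used when the other servers are down
    server app4 192.168.1.4:80 backup
```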
EXAMPLE CONFIGURATIONS
THE LISTEN BLOCK
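A listen block combines a frontend and a backend in a single section. The sketch below is one assumed example; the section name, bind address and server entries are hypothetical.

```haproxy
listen main
    bind 192.168.0.10:80               # frontend part: the listening socket
    mode http
    balance roundrobin                 # backend part: algorithm and servers
    server app1 192.168.1.1:80 check   # hypothetical address
    server app2 192.168.1.2:80 check   # hypothetical address
```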
PRACTICAL EXAMPLE
1. Prepare three CentOS 7 minimal installations
2. Disable firewalld and SELinux
3. Install Apache on two servers
4. Install HAProxy on the third server

Topology:
HAProxy   haproxy    192.168.132.145
Apache    Websrv01   192.168.132.143
Apache    Websrv02   192.168.132.144
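For this lab, a minimal /etc/haproxy/haproxy.cfg could look like the sketch below. The server addresses are the ones listed above; port 80, the section names, the roundrobin algorithm and the timeout values are assumptions.

```haproxy
global
    log     127.0.0.1 local2
    maxconn 4000
    user    haproxy
    group   haproxy
    daemon

defaults
    mode    http
    log     global
    option  httplog
    timeout connect 10s
    timeout client  1m
    timeout server  1m

frontend main
    bind *:80                  # assumed: Apache and HAProxy both use port 80
    default_backend app

backend app
    balance roundrobin
    server websrv01 192.168.132.143:80 check
    server websrv02 192.168.132.144:80 check
```

With a configuration like this, repeated requests to 192.168.132.145 should alternate between the two Apache servers.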
STATISTICS
• You can enable statistics in HAProxy to monitor the status of your servers
STATISTICS
• Add the following to the frontend
stats enable
stats auth admin:password
stats hide-version
stats show-node
stats refresh 60s
stats uri /haproxy?stats
TCP AND HTTP MODE
• You need to choose one mode for your backends (TCP or HTTP)
• What is the difference between them?
TCP AND HTTP MODES
• TCP works at a lower layer than HTTP (see networking concepts and the OSI model)
• You have to understand that HTTP data is carried by the TCP protocol
• The TCP protocol has general information about:
  • Source and destination ports
  • Specific flags like ACK, SYN and FIN
  • Information used to guarantee delivery and ordering of data

[Diagram: a TCP segment (Source Port: 5158, Destination Port: 80) carrying HTTP traffic]
TCP AND HTTP MODES
• HTTP carries more information about the request itself, such as the URL and the headers
TCP AND HTTP MODE
• If you only need to forward the traffic received on the frontend port to your backend, using
scheduling algorithms such as roundrobin, use TCP mode

• If you need scheduling algorithms that use information from the HTTP
header, or access lists that read the HTTP header, you have to use HTTP mode
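The two cases can be contrasted in one sketch. The database service, section names, addresses and the /static rule are all hypothetical and only illustrate which features each mode allows.

```haproxy
# TCP mode: HAProxy forwards the connection without looking inside it
listen db
    bind *:3306
    mode tcp
    balance roundrobin
    server db1 192.168.1.21:3306 check
    server db2 192.168.1.22:3306 check

# HTTP mode: ACLs and header-based algorithms can read the request
frontend web
    bind *:80
    mode http
    acl is_static path_beg /static
    use_backend static if is_static
    default_backend app

backend static
    mode http
    server static1 192.168.1.31:80 check

backend app
    mode http
    balance hdr(Host)
    server app1 192.168.1.1:80 check
    server app2 192.168.1.2:80 check
```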
FORWARDFOR OPTIONS
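Why do we need the forwardfor option?
FORWARDFOR OPTIONS
• Without forwardfor, the Apache access logs show the client IP as 192.168.132.145 (the HAProxy address) instead of the real client address, 192.168.132.1

[Diagram: Client 192.168.132.1 → HAProxy (haproxy) 192.168.132.145 → Apache Websrv01 192.168.132.143 and Apache Websrv02 192.168.132.144]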
FORWARDFOR OPTIONS
• To see the original IP you need to:
  • Keep the forwardfor option enabled in HAProxy
  • Add %{X-Forwarded-For}i to your log configuration in Apache

[Diagram: Client 192.168.132.1 → HAProxy (haproxy) 192.168.132.145 → Apache Websrv01 192.168.132.143 and Apache Websrv02 192.168.132.144]
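The two steps can be sketched as follows. The HAProxy part uses the forwardfor option named above; the Apache lines are shown as comments because they belong in the Apache configuration, and the format nickname proxy and the log path are assumptions.

```haproxy
defaults
    mode http
    # add an X-Forwarded-For header with the client IP to each forwarded request
    option forwardfor

# On each Apache server (httpd.conf, not HAProxy syntax), a log format such as:
#   LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b" proxy
#   CustomLog logs/access_log proxy
```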
ACCESS LISTS
• The purpose of using Access Control Lists (ACLs) is to provide a flexible way to
make decisions based on content extracted from the request, the response, or
any environmental status.
ACCESS LISTS
• The ACL Syntax

acl <aclname> <criterion> [Flags] [operators] <pattern>

acl host_1 hdr(host) -i mydomain.com


example from: https://www.haproxy.com/documentation/aloha/9-5/traffic-management/lb-layer7/writing-conditions/

acl url_static path_beg /static /images /img /css
acl url_static path_end .gif .png .jpg .css .js
acl host_www hdr_beg(host) -i www
acl host_static hdr_beg(host) -i img. video. download. ftp.

# now use backend "static" for all static-only hosts, and for static urls
# of host "www". Use backend "www" for the rest.
use_backend static if host_static or host_www url_static
use_backend www if host_www
STICKY SESSIONS IN HAPROXY
• What is the problem with sessions in HAProxy?
• HTTP is not a connected protocol: the session is totally
independent from the TCP connections.
• Session information is saved on the web server
• The problem:
  • The client creates a session on Websrv01
  • HAProxy then sends the client's next request to Websrv02
  • Websrv02 does not know the session and asks the client to log in again

[Diagram: Client 192.168.132.1 → HAProxy (haproxy) 192.168.132.145 → Apache Websrv01 192.168.132.143 and Apache Websrv02 192.168.132.144]
STICKY SESSIONS IN HAPROXY
• Solutions:
  • Use shared storage for the session files
  • Save sessions in a database

[Diagram: HAProxy (haproxy) 192.168.132.145 → Apache Websrv01 192.168.132.143 and Apache Websrv02 192.168.132.144, both using a shared Sessions store]
STICKY SESSIONS IN HAPROXY
• Solutions:
  • Use the source scheduling algorithm
  • This guarantees that the same client always reaches the same server
• But what if clients reach our environment through a proxy server? They all share one source IP address, so they are all sent to the same server.

[Diagram: Client 192.168.132.1 → HAProxy (haproxy) 192.168.132.145 → Apache Websrv01 192.168.132.143 and Apache Websrv02 192.168.132.144]
STICKY SESSIONS IN HAPROXY
• Solution (1)
  • Inject a cookie into the client's browser
  • The cookie lets the client tell HAProxy: "I was sent to server 01,
    always send me to server 01"
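Cookie injection can be sketched with the cookie directive and a per-server cookie value. The cookie name SERVERID, the backend name and port 80 are assumptions; the addresses are those of the lab servers.

```haproxy
backend app
    mode http
    balance roundrobin
    # HAProxy inserts a SERVERID cookie in the response; the client sends it back
    cookie SERVERID insert indirect nocache
    # a request carrying SERVERID=websrv01 is always sent to websrv01
    server websrv01 192.168.132.143:80 check cookie websrv01
    server websrv02 192.168.132.144:80 check cookie websrv02
```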
STICKY SESSIONS IN HAPROXY
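• Solution (2)
  • Let HAProxy learn the application's own session cookie (here PHPSESSID) and keep each session on the server that issued it:

appsession PHPSESSID len 64 timeout 3h request-learn prefix

  • Note: appsession is only available in HAProxy versions before 1.6; it was removed in 1.6, where stick tables cover this use case.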

SSL CERTIFICATES
• If your web servers have HTTPS enabled, HAProxy sits in the middle of the encrypted
connection, in the same position as an attacker performing a man-in-the-middle attack
• So the SSL certificates must be installed on your HAProxy system

[Diagram: Client 192.168.132.1 → HAProxy (haproxy) 192.168.132.145 → Apache Websrv01 192.168.132.143 and Apache Websrv02 192.168.132.144]
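SSL CERTIFICATES
• Configuration:
  • Create a combined .pem file (certificate and private key in one file)
  • Then add a frontend to receive HTTPS traffic

frontend www-https
    bind *:443 ssl crt /etc/haproxy/mydomain.combined.pem
    # reqadd works on older releases; it was removed in HAProxy 2.1, where
    # "http-request set-header X-Forwarded-Proto https" is used instead
    reqadd X-Forwarded-Proto:\ https
    default_backend app

[Diagram: Client 192.168.132.1 → HAProxy (haproxy) 192.168.132.145 → Apache Websrv01 192.168.132.143 and Apache Websrv02 192.168.132.144]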
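SPOF
• With a single HAProxy node, the load balancer itself is a single point of failure (SPOF): if it fails, no client can reach the web servers

[Diagram: Client 192.168.132.1 → HAProxy (haproxy) 192.168.132.145 → Apache Websrv01 192.168.132.143 and Apache Websrv02 192.168.132.144]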
AVOID SPOF
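How can we avoid a SPOF for HAProxy?

• Run two HAProxy nodes (192.168.132.145 and 192.168.132.146) managed by Pacemaker
• Clients connect to a virtual IP (VIP), 192.168.132.147, instead of a single node's address

[Diagram: VIP 192.168.132.147 → HAProxy 192.168.132.146 and HAProxy 192.168.132.145 (Pacemaker) → Apache Websrv01 192.168.132.143 and Apache Websrv02 192.168.132.144]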
CONCLUSION
• HAProxy can be used for load balancing and fault tolerance
• It is stable, free and open source
• It can work with the HTTP protocol and extract information from the HTTP header
• It can also be used for any application-layer protocol that runs over TCP
• It provides many different scheduling algorithms
• It can be configured to display statistics and monitoring information
• You can configure it as an SSL terminator
• It can work together with Pacemaker to avoid a SPOF
Rate how much this course was helpful for you

If you have any questions, you can add them to the course comments
