Table of Contents
[HOWTO] Public balancing Gateways (Tengine+F5)
Documentation | |
---|---|
Name: | [HOWTO] Public balancing Gateways (Tengine+F5) |
Description: | A production-ready way to balance gateways |
Modification date: | 24/01/2020 |
Owner: | dodger |
Notify changes to: | Owner |
Tags: | ceph, object storage |
Escalate to: | Thefuckingbofh |
Pre-Requirements
- Tengine documentation about proxy request buffering (the proxy_request_buffering option)
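Request buffering matters here because clients push large objects through the gateways: with buffering on, nginx would spool every upload to disk before forwarding it to Ceph. A minimal sketch of the relevant directives (the location block is illustrative; the real values appear later in the clover virtualhost):

```nginx
location / {
    client_max_body_size 0;        # no size cap on object uploads
    proxy_request_buffering off;   # stream the request body to the upstream as it arrives
    proxy_buffering off;           # likewise stream responses back to the client
    proxy_pass http://ceph;
}
```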
Instructions
Setup Tengine
Tengine must be built from source; there's no official repository for CentOS.
Libraries required for a complete build:
yum -y install pcre-devel openssl-devel libxslt-devel gd-devel GeoIP-devel
Download the latest version from my repository https://github.com/Jorge-Holgado/tengine.
Configure, make and install:
./configure \
    --with-http_ssl_module \
    --with-http_v2_module \
    --with-http_realip_module \
    --with-http_addition_module \
    --with-http_xslt_module=dynamic \
    --with-http_image_filter_module=dynamic \
    --with-http_geoip_module=dynamic \
    --with-http_sub_module \
    --with-http_gunzip_module \
    --with-http_random_index_module \
    --with-http_secure_link_module \
    --with-http_degradation_module \
    --with-http_slice_module \
    --with-http_stub_status_module \
    --conf-path=/etc/nginx/nginx.conf \
    --error-log-path=/var/log/nginx/error.log \
    --http-log-path=/var/log/nginx/access.log \
    --pid-path=/run/nginx.pid
make
make install
VERY IMPORTANT
Nginx/Tengine has an internal module, src/http/ngx_http_special_response.c, that generates built-in error pages compiled directly into the binary. To avoid serving those pages, I had to modify the source code. The patch is in my repo.
VERY IMPORTANT
Configuration of Tengine
Structure
I've split the Tengine configuration into multiple files:
file/folder name | type | description |
---|---|---|
nginx.conf | file | Main configuration file; it only contains the very basic setup and includes |
conf.d | Folder | contains multiple config files that will be common to all hosts |
sites-available | Folder | contains all the virtual hosts config files |
sites-enabled | Folder | contains the active virtual host configs |
bucket.d | Folder | contains the active public buckets |
Main config files
The current main file nginx.conf for clover is:
- nginx.conf
worker_processes 64;
error_log /var/log/nginx/error.log info;

events {
    worker_connections 1024;
}

http {
    access_log /var/log/nginx/access.log;
    include mime.types;
    include conf.d/blacklist.conf;
    include conf.d/security_headers.conf;
    include conf.d/security_request_limit_zones.conf;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;
    server_tokens off;
    include sites-enabled/*;
}
- conf.d/blacklist.conf
#-*- mode: nginx; mode: flyspell-prog; ispell-local-dictionary: "american" -*-
### This file implements a blacklist for certain user agents and
### referrers. It's a first line of defense. It must be included
### inside a http block.

## Add here all user agents that are to be blocked.
map $http_user_agent $bad_bot {
    default 0;
    libwww-perl 1;
    ~(?i)(httrack|htmlparser|libwww) 1;
}

## Add here all referrers that are to be blocked.
map $http_referer $bad_referer {
    default 0;
    ~(?i)(babes|click|diamond|forsale|girl|jewelry|love|nudit|organic|poker|porn|poweroversoftware|sex|teen|webcam|zippo|casino|replica) 1;
}
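Note that the maps above only set the $bad_bot and $bad_referer variables; they still have to be enforced in a server or location block. A minimal sketch of that enforcement (not shown in the configs in this document):

```nginx
if ($bad_bot) {
    return 444;   # close the connection without sending a response
}
if ($bad_referer) {
    return 403;
}
```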
- conf.d/security_headers.conf
add_header X-Content-Type-Options nosniff;
add_header X-Frame-Options "SAMEORIGIN";
add_header X-XSS-Protection "1; mode=block";
add_header X-Robots-Tag none;
- conf.d/ceph_public_request_method.conf
if ($request_method !~ ^(GET)$ ) {
    #return 444;
    return 403;
}
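Public buckets are read-only, hence the GET-only whitelist. If clients also need to probe objects before downloading them, HEAD could be whitelisted as well; a hedged variant, not part of the current config:

```nginx
if ($request_method !~ ^(GET|HEAD)$ ) {
    #return 444;   # alternative: drop the connection without a response
    return 403;
}
```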
- conf.d/upstream_ceph.conf
upstream ceph {
    ip_hash;
    server avmlp-osgw-001.ciberterminal.net;
    server avmlp-osgw-002.ciberterminal.net;
    server avmlp-osgw-003.ciberterminal.net;
    server avmlp-osgw-004.ciberterminal.net;
}
- conf.d/security_request_limit_zones.conf
limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
limit_req_zone $binary_remote_addr zone=ten:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=hundred:10m rate=100r/s;
limit_req_zone $binary_remote_addr zone=thousand:10m rate=1000r/s;
- bucket.d/PUBLIC_BUCKET_TEMPLATE.conf
location /PUBLIC_BUCKET_NAME {
    include conf.d/ceph_public_request_method.conf;
    proxy_pass http://ceph;
}
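Publishing a new bucket then amounts to instantiating the template under bucket.d and reloading. A sketch, where the bucket name is illustrative and a temp dir stands in for the real /etc/nginx/bucket.d:

```shell
#!/bin/sh
BUCKET=my-bucket         # illustrative bucket name
CONFDIR=$(mktemp -d)     # stand-in for /etc/nginx/bucket.d

# The template as shipped (normally already present in bucket.d).
cat > "$CONFDIR/PUBLIC_BUCKET_TEMPLATE.conf" <<'EOF'
location /PUBLIC_BUCKET_NAME {
    include conf.d/ceph_public_request_method.conf;
    proxy_pass http://ceph;
}
EOF

# Replace the placeholder and drop the result next to the template.
sed "s/PUBLIC_BUCKET_NAME/$BUCKET/" \
    "$CONFDIR/PUBLIC_BUCKET_TEMPLATE.conf" > "$CONFDIR/$BUCKET.conf"

grep "location /$BUCKET" "$CONFDIR/$BUCKET.conf"
```

The clover virtualhost picks the new file up through its bucket.d/*.conf include, so an nginx -t plus a reload is enough afterwards.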
clover virtualhost
- clover.ciberterminal.net.conf
include conf.d/upstream_ceph.conf;

server {
    listen 80;
    server_name clover.ciberterminal.net clover.devoluiva.com clover;
    access_log /var/log/nginx/clover.ciberterminal.net.access.log;
    error_log /var/log/nginx/clover.ciberterminal.net.error.log;
    root /usr/local/nginx/html;
    client_max_body_size 0;
    proxy_buffering off;
    proxy_request_buffering off;
    # This must be set to clover.ciberterminal.net as it is the internal name known by the gateways
    proxy_set_header Host clover.ciberterminal.net;
    proxy_set_header X-Forwarded-For $remote_addr;
    # limiting requests per second
    limit_req zone=ten burst=30 nodelay;
    # Nginx status
    include /etc/nginx/conf.d/nginx_status.conf;
    # PUBLIC BUCKETS
    include bucket.d/*.conf;
    location / {
        allow 127.0.0.1;
        # f5 ip's
        allow 10.20.0.5;
        allow 10.20.0.6;
        allow 10.20.0.7;
        allow 10.20.0.8;
        deny all;
    }
}
systemd setup
Systemd Unit:
- /usr/lib/systemd/system/tengine.service
[Unit]
Description=The tengine HTTP and reverse proxy server
After=network.target remote-fs.target nss-lookup.target

[Service]
Type=forking
PIDFile=/run/nginx.pid
# tengine will fail to start if /run/tengine.pid already exists but has the wrong
# SELinux context. This might happen when running `tengine -t` from the cmdline.
# https://bugzilla.redhat.com/show_bug.cgi?id=1268621
ExecStartPre=/usr/bin/rm -f /run/tengine.pid
ExecStartPre=/usr/local/nginx/sbin/nginx -t
ExecStart=/usr/local/nginx/sbin/nginx
ExecReload=/bin/kill -s HUP $MAINPID
KillSignal=SIGQUIT
TimeoutStopSec=5
KillMode=process
PrivateTmp=true

[Install]
WantedBy=multi-user.target
Reload daemon:
systemctl daemon-reload
Start&enable tengine/nginx:
systemctl start tengine
systemctl enable tengine
logrotate
cat >/etc/logrotate.d/nginx<<'EOF'
/var/log/nginx/*.log {
    daily
    missingok
    rotate 30
    compress
    delaycompress
    notifempty
    create 640 nginx adm
    sharedscripts
    postrotate
        if [ -f /var/run/nginx.pid ]; then
            kill -USR1 `cat /var/run/nginx.pid`
        fi
    endscript
}
EOF
Load balancing with F5
We've decided to perform the load balancing with F5 instead of using Keepalived+VIP.
That means that both servers are answering requests.
- Pool used: LTM-PoolRD0PROD-DMZciberterminal-CEPH-VIP80 (Partition: PROD-DMZ-FE).
- Node health check: ICMP.
- Service health check: LTM-MonitorCOMMONhttpHEAD-root-healthStatusCode-2XX-3XX (HEAD /health, expecting a 2xx or 3xx response code).
Removing one server from the pool
The fastest way to remove one server from the F5 pool is to remove its health page:
rm -fv /usr/local/nginx/html/health
Re-adding one server to the pool
echo "OK" > /usr/local/nginx/html/health
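The two operations above can be wrapped in a small drain/undrain helper. A sketch, using a temp dir in place of the real docroot /usr/local/nginx/html:

```shell
#!/bin/sh
# Toggle the health page that the F5 service monitor polls (HEAD /health).
DOCROOT=$(mktemp -d)     # stand-in for /usr/local/nginx/html

drain()   { rm -f "$DOCROOT/health"; }         # monitor gets 404 -> node leaves the pool
undrain() { echo "OK" > "$DOCROOT/health"; }   # monitor gets 200 -> node rejoins

undrain
cat "$DOCROOT/health"
drain
test ! -e "$DOCROOT/health" && echo "drained"
```

Because the monitor only looks at the response code, the file's content is irrelevant; "OK" is just a human-friendly convention.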
Security with limit_req+fail2ban
Start and enable firewalld:
systemctl enable firewalld
systemctl start firewalld
Allow http & https:
firewall-cmd --zone=public --add-service=http
firewall-cmd --zone=public --add-service=https
firewall-cmd --zone=public --add-service=snmp
firewall-cmd --permanent --zone=public --add-service=https
firewall-cmd --permanent --zone=public --add-service=http
firewall-cmd --permanent --zone=public --add-service=snmp
Install fail2ban:
yum -y install fail2ban-all
Enable the pre-defined nginx jails:
- nginx-botsearch
- nginx-limit-req
This is done by editing /etc/fail2ban/jail.conf
and adding:
enabled=true
to the corresponding sections. Here is a patch file with the changes:
- jail.conf.patch
--- jail.conf	2020-02-10 18:05:03.815727022 +0100
+++ jail.conf.bck	2020-02-10 18:18:32.869001684 +0100
@@ -349,14 +349,13 @@
 [nginx-limit-req]
 port    = http,https
 logpath = %(nginx_error_log)s
-enabled = true
 
 [nginx-botsearch]
 port     = http,https
 logpath  = %(nginx_error_log)s
 maxretry = 2
-enabled = true
+
 # Ban attackers that try to use PHP's URL-fopen() functionality
 # through GET/POST variables. - Experimental, with more than a year
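An alternative to patching jail.conf directly, and one that survives fail2ban package upgrades, is to put the overrides in /etc/fail2ban/jail.local, which fail2ban reads on top of jail.conf. A minimal sketch:

```ini
; /etc/fail2ban/jail.local (assumed path; loaded after jail.conf)
[nginx-limit-req]
enabled = true

[nginx-botsearch]
enabled = true
```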
The fail2ban jail nginx-limit-req requires that limit_req is configured inside nginx. So I've written the main config options inside security_request_limit_zones.conf, with the following zones:
zone name | Max requests per second |
---|---|
one | 1 |
ten | 10 |
hundred | 100 |
thousand | 1000 |
Keepalived setup (UNUSED)
Setup keepalived
UNUSED NOW, we're using F5 for load balancing.
- keepalived.conf
global_defs {
    notification_email {
        dodger@ciberterminal.net
    }
    notification_email_from clover@ciberterminal.net
    smtp_server mta4.bavel.biz
    smtp_connect_timeout 30
    ! router_id LVS_DEVEL
    ! vrrp_skip_check_adv_addr
    ! vrrp_strict
    ! vrrp_garp_interval 0
    ! vrrp_gna_interval 0
}

vrrp_script chk_haproxy {
    script "killall -0 nginx"   # check the nginx process
    interval 2                  # every 2 seconds
    weight 2                    # add 2 points if OK
}

vrrp_instance VI_1 {
    interface eth0              # interface to monitor
    state MASTER                # MASTER on haproxy, BACKUP on haproxy2
    virtual_router_id 51
    priority 101                # 101 on haproxy, 100 on haproxy2
    virtual_ipaddress {
        10.20.54.0              # virtual ip address
    }
    track_script {
        chk_haproxy
    }
    smtp_alert
}
On the secondary node, you'll have to change the line:
state MASTER # MASTER on haproxy, BACKUP on haproxy2
Setup PMTA to allow sending unauthenticated emails
# avmlp-osnx-001
<source 10.20.0.46>
    always-allow-relaying yes
    default-virtual-mta operativa
    smtp-service yes
    require-auth false
    dsn-return-default full
</source>

# avmlp-osnx-002
<source 10.20.0.47>
    always-allow-relaying yes
    default-virtual-mta operativa
    smtp-service yes
    require-auth false
    dsn-return-default full
</source>

# clover.ciberterminal.net public
<source 10.20.0.45>
    always-allow-relaying yes
    default-virtual-mta operativa
    smtp-service yes
    require-auth false
    dsn-return-default full
</source>
Restart all
systemctl restart tengine
systemctl restart keepalived.service