====== [HOWTO] Public balancing Gateways (Tengine+F5) ======

^ Documentation ^|
^Name:| [HOWTO] Public balancing Gateways (Tengine+F5) |
^Description:| A production-ready way to balance gateways |
^Modification date :|24/01/2020|
^Owner:|dodger|
^Notify changes to:|Owner |
^Tags:|ceph, object storage |
^Escalate to:|The_fucking_bofh|

====== Pre-Requirements ======

  * [[linux:ceph:howtos:using_amazon_dns_bucket_naming|Setup S3 naming method]]
  * [[https://ceph.io/geen-categorie/a-use-case-of-tengine-a-drop-in-replacement-and-fork-of-nginx/|Why tengine]].
  * Tengine [[http://tengine.taobao.org/document/http_core.html|documentation]] about proxy requests buffering (''proxy_request_buffering'' option)

====== Instructions ======

===== Setup Tengine =====

Tengine must be built from source; there is no official repository for CentOS.\\
Libraries required for a complete setup:

  yum -y install pcre-devel openssl-devel libxslt-devel gd-devel GeoIP-devel

Download the latest version from my repository [[https://github.com/Jorge-Holgado/tengine]].\\
Configure, make and install:

  ./configure \
      --with-http_ssl_module \
      --with-http_v2_module \
      --with-http_realip_module \
      --with-http_addition_module \
      --with-http_xslt_module=dynamic \
      --with-http_image_filter_module=dynamic \
      --with-http_geoip_module=dynamic \
      --with-http_sub_module \
      --with-http_gunzip_module \
      --with-http_random_index_module \
      --with-http_secure_link_module \
      --with-http_degradation_module \
      --with-http_slice_module \
      --with-http_stub_status_module \
      --conf-path=/etc/nginx/nginx.conf \
      --error-log-path=/var/log/nginx/error.log \
      --http-log-path=/var/log/nginx/access.log \
      --pid-path=/run/nginx.pid
  make
  make install

\\
\\
**VERY IMPORTANT**\\
Nginx/Tengine have an internal module: ''src/http/ngx_http_special_response.c''.
This module generates the default error pages dynamically, **and their HTML is hard-coded in the binary**.\\
To get rid of that page, I had to modify the source code.\\
The patch is in my repository.\\
**VERY IMPORTANT**\\

===== Configuration of Tengine =====

==== Structure ====

I've split the Tengine configuration into multiple files:
^ file/folder name ^ type ^ description ^
| ''nginx.conf'' | file | Main configuration file; it contains only the very basic setup and the includes |
| ''conf.d'' | Folder | contains multiple config files that are common to all hosts |
| ''sites-available'' | Folder | contains all the virtual host config files |
| ''sites-enabled'' | Folder | contains the **active** virtual host configs |
| ''bucket.d'' | Folder | contains the active public buckets |

==== Main config files ====

\\
The current main file ''nginx.conf'' for ''clover'' is:

  worker_processes  64;
  error_log /var/log/nginx/error.log info;
  events {
      worker_connections  1024;
  }
  http {
      access_log /var/log/nginx/access.log ;
      include       mime.types;
      include       conf.d/blacklist.conf;
      include       conf.d/security_headers.conf;
      include       conf.d/security_request_limit_zones.conf;
      default_type  application/octet-stream;
      sendfile        on;
      keepalive_timeout  65;
      server_tokens off;
      include sites-enabled/*;
  }

''conf.d/blacklist.conf'':

  #-*- mode: nginx; mode: flyspell-prog; ispell-local-dictionary: "american" -*-
  ### This file implements a blacklist for certain user agents and
  ### referrers. It's a first line of defense. It must be included
  ### inside a http block.
  ## Add here all user agents that are to be blocked.
  map $http_user_agent $bad_bot {
      default 0;
      libwww-perl 1;
      ~(?i)(httrack|htmlparser|libwww) 1;
  }
  ## Add here all referrers that are to be blocked.
  map $http_referer $bad_referer {
      default 0;
      ~(?i)(babes|click|diamond|forsale|girl|jewelry|love|nudit|organic|poker|porn|poweroversoftware|sex|teen|webcam|zippo|casino|replica) 1;
  }

''conf.d/security_headers.conf'':

  add_header X-Content-Type-Options nosniff;
  add_header X-Frame-Options "SAMEORIGIN";
  add_header X-XSS-Protection "1; mode=block";
  add_header X-Robots-Tag none;

''conf.d/ceph_public_request_method.conf'':

  if ($request_method !~ ^(GET)$ ) {
      #return 444;
      return 403;
  }

''conf.d/upstream_ceph.conf'':

  upstream ceph {
      ip_hash;
      server avmlp-osgw-001.ciberterminal.net;
      server avmlp-osgw-002.ciberterminal.net;
      server avmlp-osgw-003.ciberterminal.net;
      server avmlp-osgw-004.ciberterminal.net;
  }

''conf.d/security_request_limit_zones.conf'':

  limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
  limit_req_zone $binary_remote_addr zone=ten:10m rate=10r/s;
  limit_req_zone $binary_remote_addr zone=hundred:10m rate=100r/s;
  limit_req_zone $binary_remote_addr zone=thousand:10m rate=1000r/s;

Each public bucket gets a file in ''bucket.d'' like this one:

  location /PUBLIC_BUCKET_NAME {
      include conf.d/ceph_public_request_method.conf;
      proxy_pass http://ceph;
  }

==== clover virtualhost ====

  include conf.d/upstream_ceph.conf ;
  server {
      listen 80;
      server_name clover.ciberterminal.net clover.devoluiva.com clover;
      access_log /var/log/nginx/clover.ciberterminal.net.access.log;
      error_log /var/log/nginx/clover.ciberterminal.net.error.log;
      root /usr/local/nginx/html ;
      client_max_body_size 0;
      proxy_buffering off;
      proxy_request_buffering off;
      # This must be set to clover.ciberterminal.net as it is the internal name known by the gateways
      proxy_set_header Host clover.ciberterminal.net;
      proxy_set_header X-Forwarded-For $remote_addr;
      # limiting requests per second
      limit_req zone=ten burst=30 nodelay;
      # Nginx status
      include /etc/nginx/conf.d/nginx_status.conf;
      # PUBLIC BUCKETS
      include bucket.d/*.conf ;
      location / {
          allow;   # f5 ip's
          allow;
          allow;
          allow;
          allow;
          deny all;
      }
  }

===== systemd setup =====

Systemd Unit:

  [Unit]
  Description=The tengine HTTP and reverse proxy server
  After=network.target remote-fs.target nss-lookup.target
  [Service]
  Type=forking
  PIDFile=/run/nginx.pid
  # tengine will fail to start if
  # /run/nginx.pid already exists but has the wrong
  # SELinux context. This might happen when running `nginx -t` from the cmdline.
  # https://bugzilla.redhat.com/show_bug.cgi?id=1268621
  ExecStartPre=/usr/bin/rm -f /run/nginx.pid
  ExecStartPre=/usr/local/nginx/sbin/nginx -t
  ExecStart=/usr/local/nginx/sbin/nginx
  ExecReload=/bin/kill -s HUP $MAINPID
  KillSignal=SIGQUIT
  TimeoutStopSec=5
  KillMode=process
  PrivateTmp=true
  [Install]
  WantedBy=multi-user.target

Reload daemon:

  systemctl daemon-reload

Start and enable tengine/nginx:

  systemctl start tengine
  systemctl enable tengine

===== logrotate =====

  cat >>/etc/logrotate.d/nginx<

====== Load balancing with F5 ======

We've decided to do the load balancing with F5 instead of using Keepalived+VIP.\\
That means that **both** servers answer requests.\\
  * Pool used is: ''LTM-Pool_RD0_PROD-DMZ_ciberterminal-CEPH-VIP_80'' (Partition: ''PROD-DMZ-FE'').
  * Nodes health check: ICMP
  * Service health check: ''LTM-Monitor_COMMON_http_HEAD-root-health_StatusCode-2XX-3XX'' (**HEAD /health** and expect a 2xx or 3xx response code).
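
The service health check above can be reproduced by hand before touching the pool. The following is only a minimal sketch of the same rule (HEAD request to ''/health'', 2xx or 3xx means healthy); the function names and the host argument are assumptions made for illustration and are not part of the F5 monitor itself.

```shell
#!/bin/sh
# Sketch: mimic the F5 service monitor (HEAD /health, accept 2xx or 3xx).
# Function names are hypothetical, chosen for this example only.

# Return 0 when an HTTP status code counts as healthy (2xx or 3xx).
is_healthy_status() {
    case "$1" in
        2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Send a HEAD request to http://HOST/health and classify the answer.
# HOST is an assumption: pass whatever name or IP reaches the Tengine node.
# curl prints 000 when the connection fails, which is treated as unhealthy.
check_health() {
    code=$(curl -s -o /dev/null -I -m 5 -w '%{http_code}' "http://$1/health")
    is_healthy_status "$code"
}
```

For example, ''check_health NODE && echo "in pool"'' would report whether a node currently passes. Run it from an address that the ''location /'' access list allows, otherwise Tengine answers 403 and the check fails even though the node is fine.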
\\

===== Removing one server from the pool =====

The fastest way to remove one server from the F5 pool is to remove the //health// page, so that the service health check fails:\\

  rm -fv /usr/local/nginx/html/health

===== Re-adding one server to the pool =====

  echo "OK" > /usr/local/nginx/html/health

====== Security with limit_req+fail2ban ======

Start and enable firewalld:

  systemctl enable firewalld
  systemctl start firewalld

Allow http, https and snmp:

  firewall-cmd --zone=public --add-service=http
  firewall-cmd --zone=public --add-service=https
  firewall-cmd --zone=public --add-service=snmp
  firewall-cmd --permanent --zone=public --add-service=https
  firewall-cmd --permanent --zone=public --add-service=http
  firewall-cmd --permanent --zone=public --add-service=snmp

Install fail2ban:

  yum -y install fail2ban-all

Enable the pre-defined nginx jails:
  * nginx-botsearch
  * nginx-limit-req

This is done by editing ''/etc/fail2ban/jail.conf'' and adding:

  enabled=true

to the corresponding sections. Here is a patch file with the changes (the diff compares the edited file against its backup, so the added lines show up with a ''-''):

  --- jail.conf   2020-02-10 18:05:03.815727022 +0100
  +++ jail.conf.bck       2020-02-10 18:18:32.869001684 +0100
  @@ -349,14 +349,13 @@
   [nginx-limit-req]
   port    = http,https
   logpath = %(nginx_error_log)s
  -enabled = true
   [nginx-botsearch]
   port    = http,https
   logpath = %(nginx_error_log)s
   maxretry = 2
  -enabled = true
  +
   # Ban attackers that try to use PHP's URL-fopen() functionality
   # through GET/POST variables. - Experimental, with more than a year

The Fail2ban jail ''nginx-limit-req'' requires ''limit_req'' to be configured in nginx.\\
So I've written the main config options in ''security_request_limit_zones.conf'' with the following zones:
^ zone name ^ Max requests per second ^
| one | 1 |
| ten | 10 |
| hundred | 100 |
| thousand | 1000 |

====== Keepalived setup (UNUSED) ======

===== Setup keepalived =====

UNUSED NOW: we're using F5 for load balancing.

  global_defs {
      notification_email {
          dodger@ciberterminal.net
      }
      notification_email_from clover@ciberterminal.net
      smtp_server mta4.bavel.biz
      smtp_connect_timeout 30
  !   router_id LVS_DEVEL
  !   vrrp_skip_check_adv_addr
  !   vrrp_strict
  !   vrrp_garp_interval 0
  !   vrrp_gna_interval 0
  }
  vrrp_script chk_haproxy {
      script "killall -0 nginx"   # check the nginx process
      interval 2                  # every 2 seconds
      weight 2                    # add 2 points if OK
  }
  vrrp_instance VI_1 {
      interface eth0              # interface to monitor
      state MASTER                # MASTER on haproxy, BACKUP on haproxy2
      virtual_router_id 51
      priority 101                # 101 on haproxy, 100 on haproxy2
      virtual_ipaddress {
          # virtual ip address
      }
      track_script {
          chk_haproxy
      }
      smtp_alert
  }

On the secondary node, you'll have to change the following line (''BACKUP'' instead of ''MASTER''), and lower ''priority'' to 100 as noted in the comments:

  state MASTER                # MASTER on haproxy, BACKUP on haproxy2

===== setup pmta to allow sending un-authenticated emails =====

  # avmlp-osnx-001
  always-allow-relaying yes
  default-virtual-mta operativa
  smtp-service yes
  require-auth false
  dsn-return-default full
  # avmlp-osnx-002
  always-allow-relaying yes
  default-virtual-mta operativa
  smtp-service yes
  require-auth false
  dsn-return-default full
  # clover.ciberterminal.net public
  always-allow-relaying yes
  default-virtual-mta operativa
  smtp-service yes
  require-auth false
  dsn-return-default full

===== Restart all =====

  systemctl restart tengine
  systemctl restart keepalived.service