[HOWTO] Public balancing Gateways (Tengine+F5)

Documentation
Name: [HOWTO] Public balancing Gateways (Tengine+keepalived)
Description: A production-ready way to balance gateways
Modification date: 24/01/2020
Owner: dodger
Notify changes to: Owner
Tags: ceph, object storage
Escalate to: The_fucking_bofh

Pre-Requirements

Instructions

Setup Tengine

Tengine must be built from source because there is no official repository for CentOS.
Libraries required for a complete build:

yum -y install pcre-devel openssl-devel libxslt-devel gd-devel GeoIP-devel

Download the latest version from my repository https://github.com/Jorge-Holgado/tengine.
Configure, make and install:

./configure \
  --with-http_ssl_module \
  --with-http_v2_module \
  --with-http_realip_module \
  --with-http_addition_module \
  --with-http_xslt_module=dynamic \
  --with-http_image_filter_module=dynamic \
  --with-http_geoip_module=dynamic \
  --with-http_sub_module \
  --with-http_gunzip_module \
  --with-http_random_index_module \
  --with-http_secure_link_module \
  --with-http_degradation_module \
  --with-http_slice_module \
  --with-http_stub_status_module \
  --conf-path=/etc/nginx/nginx.conf \
  --error-log-path=/var/log/nginx/error.log \
  --http-log-path=/var/log/nginx/access.log \
  --pid-path=/run/nginx.pid
make
make install
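
Once the build finishes, it can be worth confirming that the modules requested at configure time actually ended up in the binary. The following is only a sketch: it assumes the default install prefix used elsewhere on this page (/usr/local/nginx/sbin/nginx) and that the binary prints its configure arguments with -V, as stock nginx does. NGINX_BIN, has_module and report_modules are hypothetical names introduced for illustration.

```shell
# Path to the freshly built binary (assumption: default prefix /usr/local/nginx).
NGINX_BIN="${NGINX_BIN:-/usr/local/nginx/sbin/nginx}"

# has_module NAME: succeed if the binary was configured with --with-NAME.
# `nginx -V` writes its configure arguments to stderr, hence the 2>&1.
has_module() {
    "$NGINX_BIN" -V 2>&1 | grep -q -e "--with-$1"
}

# report_modules NAME...: print one "ok"/"MISSING" line per module.
report_modules() {
    for mod in "$@"; do
        if has_module "$mod"; then
            echo "ok      $mod"
        else
            echo "MISSING $mod"
        fi
    done
}
```

For example: report_modules http_ssl_module http_v2_module http_stub_status_module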



VERY IMPORTANT
Nginx/Tengine has an internal module: src/http/ngx_http_special_response.c. This module generates a dynamic web page whose content is hard-coded in the binary.
To avoid that page, I had to modify the source code.
The patch is in my repo.
VERY IMPORTANT

Configuration of Tengine

Structure

I've split Tengine configuration into multiple files:

Name             Type    Description
nginx.conf       File    Main configuration file; it contains only the basic setup and the includes
conf.d           Folder  Config files common to all hosts
sites-available  Folder  All the virtual host config files
sites-enabled    Folder  The active virtual host configs
bucket.d         Folder  The active public buckets
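
The page does not say how a virtual host moves from sites-available to sites-enabled. A common convention, assumed here and not confirmed by this setup, is to symlink the file. A minimal sketch, where NGINX_CONF_DIR, enable_site and disable_site are hypothetical names:

```shell
# Base directory of the configuration (assumption: the directory holding nginx.conf).
NGINX_CONF_DIR="${NGINX_CONF_DIR:-/etc/nginx}"

# enable_site FILE: link sites-available/FILE into sites-enabled/.
enable_site() {
    site="$1"
    if [ ! -f "$NGINX_CONF_DIR/sites-available/$site" ]; then
        echo "no such site: $site" >&2
        return 1
    fi
    # Relative link, so the tree can be copied elsewhere without breaking.
    ln -sfn "../sites-available/$site" "$NGINX_CONF_DIR/sites-enabled/$site"
}

# disable_site FILE: remove the link; the file in sites-available is untouched.
disable_site() {
    rm -f "$NGINX_CONF_DIR/sites-enabled/$1"
}
```

For example: enable_site clover.ciberterminal.net.conf, followed by a configuration test and a reload.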

Main config files


The main nginx.conf file currently used for clover is:

nginx.conf
worker_processes  64;
error_log  /var/log/nginx/error.log  info;
events {
    worker_connections  1024;
}
http {
    access_log  /var/log/nginx/access.log;
 
    include       mime.types;
    include       conf.d/blacklist.conf;
    include       conf.d/security_headers.conf;
    include       conf.d/security_request_limit_zones.conf;
 
    default_type  application/octet-stream;
    sendfile        on;
    keepalive_timeout  65;
    server_tokens off;
 
    include sites-enabled/*;
}
conf.d/blacklist.conf
#-*- mode: nginx; mode: flyspell-prog; ispell-local-dictionary: "american" -*-
### This file implements a blacklist for certain user agents and
### referrers. It's a first line of defense. It must be included
### inside a http block.
 
 
## Add here all user agents that are to be blocked.
map $http_user_agent $bad_bot {
    default 0;
    libwww-perl                      1;
    ~(?i)(httrack|htmlparser|libwww) 1;
}
 
## Add here all referrers that are to be blocked.
map $http_referer $bad_referer {
    default 0;
    ~(?i)(babes|click|diamond|forsale|girl|jewelry|love|nudit|organic|poker|porn|poweroversoftware|sex|teen|webcam|zippo|casino|replica) 1;
}
conf.d/security_headers.conf
add_header X-Content-Type-Options nosniff;
add_header X-Frame-Options "SAMEORIGIN";
add_header X-XSS-Protection "1; mode=block";
add_header X-Robots-Tag none;
conf.d/ceph_public_request_method.conf
if ($request_method !~ ^(GET)$ ) {
    #return 444;
    return 403;
}
conf.d/upstream_ceph.conf
upstream ceph {
    ip_hash;
    server avmlp-osgw-001.ciberterminal.net;
    server avmlp-osgw-002.ciberterminal.net;
    server avmlp-osgw-003.ciberterminal.net;
    server avmlp-osgw-004.ciberterminal.net;
}
conf.d/security_request_limit_zones.conf
limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
limit_req_zone $binary_remote_addr zone=ten:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=hundred:10m rate=100r/s;
limit_req_zone $binary_remote_addr zone=thousand:10m rate=1000r/s;
bucket.d/PUBLIC_BUCKET_TEMPLATE.conf
location /PUBLIC_BUCKET_NAME {
    include conf.d/ceph_public_request_method.conf;
    proxy_pass http://ceph;
}   
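
Publishing a new bucket then amounts to copying the template and replacing the PUBLIC_BUCKET_NAME placeholder. A minimal sketch of that step, assuming the configuration lives under /etc/nginx. BUCKET_TEMPLATE and render_bucket_conf are hypothetical names, and the character check is a deliberately strict assumption, not a rule taken from Ceph:

```shell
# Location of the template (assumption: configuration lives in /etc/nginx).
BUCKET_TEMPLATE="${BUCKET_TEMPLATE:-/etc/nginx/bucket.d/PUBLIC_BUCKET_TEMPLATE.conf}"

# render_bucket_conf NAME: print the template with PUBLIC_BUCKET_NAME replaced.
render_bucket_conf() {
    bucket="$1"
    # Allow only lowercase letters, digits, dots and hyphens, so the name is
    # safe to use both in the sed expression and in the nginx location.
    case "$bucket" in
        ''|*[!a-z0-9.-]*)
            echo "invalid bucket name: '$bucket'" >&2
            return 1
            ;;
    esac
    sed "s/PUBLIC_BUCKET_NAME/$bucket/g" "$BUCKET_TEMPLATE"
}
```

For example, for a hypothetical bucket called mybucket: render_bucket_conf mybucket > /etc/nginx/bucket.d/mybucket.conf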

clover virtualhost

clover.ciberterminal.net.conf
include conf.d/upstream_ceph.conf ;
 
server {
    listen       80;
    server_name  clover.ciberterminal.net clover.devoluiva.com clover;
    access_log /var/log/nginx/clover.ciberterminal.net.access.log;
    error_log /var/log/nginx/clover.ciberterminal.net.error.log;
    root /usr/local/nginx/html ;
 
    client_max_body_size 0;
 
    proxy_buffering off;
    proxy_request_buffering off;
    # This must be set to clover.ciberterminal.net, as it is the internal name known by the gateways
    proxy_set_header Host clover.ciberterminal.net;
    proxy_set_header X-Forwarded-For $remote_addr;
 
 
    # limiting requests per second
    limit_req zone=ten burst=30 nodelay;
 
    # Nginx status
    include /etc/nginx/conf.d/nginx_status.conf;
    # PUBLIC BUCKETS
    include bucket.d/*.conf ;
 
    location / {
        allow 127.0.0.1;
        # F5 IPs
        allow 10.20.0.5;
        allow 10.20.0.6;
        allow 10.20.0.7;
        allow 10.20.0.8;
        deny all;
    }   
 
}

systemd setup

Systemd Unit:

/usr/lib/systemd/system/tengine.service
[Unit]
Description=The tengine HTTP and reverse proxy server
After=network.target remote-fs.target nss-lookup.target
 
[Service]
Type=forking
PIDFile=/run/nginx.pid
# nginx will fail to start if /run/nginx.pid already exists but has the wrong
# SELinux context. This might happen when running `nginx -t` from the cmdline.
# https://bugzilla.redhat.com/show_bug.cgi?id=1268621
ExecStartPre=/usr/bin/rm -f /run/nginx.pid
ExecStartPre=/usr/local/nginx/sbin/nginx -t
ExecStart=/usr/local/nginx/sbin/nginx
ExecReload=/bin/kill -s HUP $MAINPID
KillSignal=SIGQUIT
TimeoutStopSec=5
KillMode=process
PrivateTmp=true
 
[Install]
WantedBy=multi-user.target

Reload daemon:

systemctl daemon-reload

Start and enable tengine/nginx:

systemctl start tengine
systemctl enable tengine
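
Since the unit defines ExecReload, later configuration changes can be applied with a reload instead of a restart. A minimal sketch of a reload that first validates the configuration, using the binary path from the unit file. NGINX_BIN and safe_reload are hypothetical names:

```shell
# Binary used for the configuration test (path taken from the unit file).
NGINX_BIN="${NGINX_BIN:-/usr/local/nginx/sbin/nginx}"

# safe_reload: reload the service only if the configuration test passes.
safe_reload() {
    if "$NGINX_BIN" -t; then
        systemctl reload tengine
    else
        echo "configuration test failed, not reloading" >&2
        return 1
    fi
}
```

With this, a broken include in bucket.d or sites-enabled leaves the running workers untouched.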

logrotate

cat >/etc/logrotate.d/nginx <<'EOF'
/var/log/nginx/*.log {
        daily
        missingok
        rotate 30
        compress
        delaycompress
        notifempty
        create 640 nginx adm
        sharedscripts
        postrotate
                if [ -f /var/run/nginx.pid ]; then
                        kill -USR1 `cat /var/run/nginx.pid`
                fi
        endscript
}
EOF

Load balancing with F5

We decided to do the load balancing with F5 instead of Keepalived+VIP.
This means that both servers answer requests.

  • Pool used is: LTM-Pool_RD0_PROD-DMZ_ciberterminal-CEPH-VIP_80 (Partition: PROD-DMZ-FE).
  • Nodes health check: ICMP
  • Service health check: LTM-Monitor_COMMON_http_HEAD-root-health_StatusCode-2XX-3XX (HEAD /health and expect a 2xx or 3xx response code).


Removing one server from the pool

The fastest way to remove one server from the F5 pool is to remove the health page:

rm -fv /usr/local/nginx/html/health

Re-adding one server to the pool

echo "OK" > /usr/local/nginx/html/health
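
Both operations work because the F5 service monitor requests /health and expects a 2xx or 3xx code: with the file gone, the check fails and the node is taken out of the pool. The two commands can be wrapped in small helpers. This is only a sketch, and HEALTH_FILE, pool_leave, pool_join and pool_status are hypothetical names:

```shell
# Health page polled by the F5 monitor (path taken from the commands above).
HEALTH_FILE="${HEALTH_FILE:-/usr/local/nginx/html/health}"

# pool_leave: remove the health page so the F5 check starts failing.
pool_leave() {
    rm -f "$HEALTH_FILE"
}

# pool_join: recreate the health page so the F5 check succeeds again.
pool_join() {
    echo "OK" > "$HEALTH_FILE"
}

# pool_status: print "in" if the health page exists, "out" otherwise.
pool_status() {
    if [ -f "$HEALTH_FILE" ]; then
        echo "in"
    else
        echo "out"
    fi
}
```

Note that pool_status only reflects the local file. The F5 still needs to run its next health check before the pool membership actually changes.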

Security with limit_req+fail2ban

Start and enable firewalld:

systemctl enable firewalld
systemctl start firewalld

Allow http, https and snmp, both in the running configuration and permanently:

firewall-cmd --zone=public --add-service=http
firewall-cmd --zone=public --add-service=https
firewall-cmd --zone=public --add-service=snmp
firewall-cmd --permanent --zone=public --add-service=https
firewall-cmd --permanent --zone=public --add-service=http
firewall-cmd --permanent --zone=public --add-service=snmp
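
The same six calls can be expressed as a loop, which makes it harder to forget the permanent variant for one of the services. A minimal sketch; FIREWALL_CMD and open_services are hypothetical names introduced here:

```shell
# Command used to talk to firewalld (overridable, e.g. for a dry run).
FIREWALL_CMD="${FIREWALL_CMD:-firewall-cmd}"

# open_services SERVICE...: allow each service in the public zone, both in the
# running configuration and in the permanent one.
open_services() {
    for svc in "$@"; do
        "$FIREWALL_CMD" --zone=public --add-service="$svc" || return 1
        "$FIREWALL_CMD" --permanent --zone=public --add-service="$svc" || return 1
    done
}
```

For example: open_services http https snmp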

Install fail2ban:

yum -y install fail2ban-all

Enable the pre-defined nginx jails:

  • nginx-botsearch
  • nginx-limit-req

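This is done by editing /etc/fail2ban/jail.conf and adding the following line to the corresponding sections:

enabled=true

Here is a patch file with the changes. Note that the diff was generated from the modified file (jail.conf) to the backup (jail.conf.bck), so the enabled lines appear as removals; apply it in reverse (patch -R) or add the lines by hand: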

jail.conf.patch
--- jail.conf   2020-02-10 18:05:03.815727022 +0100
+++ jail.conf.bck       2020-02-10 18:18:32.869001684 +0100
@@ -349,14 +349,13 @@
 [nginx-limit-req]
 port    = http,https
 logpath = %(nginx_error_log)s
-enabled = true
 
 [nginx-botsearch]
 
 port     = http,https
 logpath  = %(nginx_error_log)s
 maxretry = 2
-enabled = true
+
 
 # Ban attackers that try to use PHP's URL-fopen() functionality
 # through GET/POST variables. - Experimental, with more than a year

The fail2ban jail nginx-limit-req requires limit_req to be configured in nginx.
So I've defined the main options in conf.d/security_request_limit_zones.conf, with the following zones:

Zone name  Max requests per second
one        1
ten        10
hundred    100
thousand   1000
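
Before trusting the jail, it helps to see which clients are actually hitting the limit. The sketch below counts rejected requests per client in an nginx error log. It assumes the usual nginx message of the form: limiting requests, excess: ... by zone "...", client: ADDRESS, ... The function name count_limited_clients and the default log path are assumptions for illustration; the virtual host on this page writes to its own error log.

```shell
# count_limited_clients [LOGFILE]: print "COUNT ADDRESS" lines, busiest first,
# for every client that appears in a "limiting requests" error line.
count_limited_clients() {
    log="${1:-/var/log/nginx/error.log}"
    grep 'limiting requests' "$log" \
        | sed -n 's/.*client: \([^,]*\),.*/\1/p' \
        | sort \
        | uniq -c \
        | sort -rn
}
```

For example: count_limited_clients /var/log/nginx/clover.ciberterminal.net.error.log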

Keepalived setup (UNUSED)

Setup keepalived

NO LONGER USED: we now use F5 for load balancing.

keepalived.conf
global_defs {
   notification_email {
     dodger@ciberterminal.net
   }
   notification_email_from clover@ciberterminal.net
   smtp_server mta4.bavel.biz
   smtp_connect_timeout 30
!   router_id LVS_DEVEL
!   vrrp_skip_check_adv_addr
!   vrrp_strict
!   vrrp_garp_interval 0
!   vrrp_gna_interval 0
}
 
vrrp_script chk_haproxy {
  script "killall -0 nginx" # check the nginx process
  interval 2 # every 2 seconds
  weight 2 # add 2 points if OK
}
 
vrrp_instance VI_1 {
  interface eth0 # interface to monitor
  state MASTER # MASTER on haproxy, BACKUP on haproxy2
  virtual_router_id 51
  priority 101 # 101 on haproxy, 100 on haproxy2
  virtual_ipaddress {
    10.20.54.0 # virtual ip address
  }
  track_script {
    chk_haproxy
  }
  smtp_alert
}

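On the secondary node, you'll have to change the following line, setting state to BACKUP (and lower the priority to 100, as noted in the config comments):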

state MASTER # MASTER on haproxy, BACKUP on haproxy2

Setup pmta to allow sending unauthenticated emails

# avmlp-osnx-001
<source 10.20.0.46>
        always-allow-relaying yes 
        default-virtual-mta operativa
        smtp-service yes 
        require-auth false
        dsn-return-default full
</source>
 
# avmlp-osnx-002
<source 10.20.0.47>
        always-allow-relaying yes 
        default-virtual-mta operativa
        smtp-service yes 
        require-auth false
        dsn-return-default full
</source>
 
# clover.ciberterminal.net public
<source 10.20.0.45>
        always-allow-relaying yes
        default-virtual-mta operativa
        smtp-service yes
        require-auth false
        dsn-return-default full
</source>

Restart all

systemctl restart tengine
systemctl restart keepalived.service
linux/ceph/howtos/balancing_gateways_nginx.txt · Last modified: 2022/02/11 11:36 (external edit)