Base Rules

This is the basic set of rules to keep NGINX in good condition.

🔰 Organising Nginx configuration

Rationale

When your NGINX configuration grows, the need to organise it grows as well. Well-organised code is:

  • easier to understand
  • easier to maintain
  • easier to work with

Use the include directive to move common server settings into separate files and to attach your specific code to the global config, contexts, and so on.

I always try to keep multiple directories in the root of the configuration tree. These directories store all configuration files which are attached to the main file. I prefer the following structure:

  • html - for default static files, e.g. global 5xx error page
  • master - for main configuration, e.g. acls, listen directives, and domains
    • _acls - for access control lists, e.g. geo or map modules
    • _basic - for rate limiting rules, redirect maps, or proxy params
    • _listen - for all listen directives; also stores SSL configuration
    • _server - for domains (localhost) configuration; also stores all backend definitions
  • modules - for modules which are dynamically loaded into NGINX
  • snippets - for NGINX aliases, configuration templates, e.g. logrotate

I attach some of them, if necessary, to files which have server directives.

Example
# Store this configuration in https.conf for example:
listen 10.240.20.2:443 ssl;

ssl_certificate /etc/nginx/master/_server/example.com/certs/nginx_example.com_bundle.crt;
ssl_certificate_key /etc/nginx/master/_server/example.com/certs/example.com.key;

# Include this file to the server section:
server {

  include /etc/nginx/master/_listen/10.240.20.2/https.conf;

  # And other:
  include /etc/nginx/master/_static/errors.conf;
  include /etc/nginx/master/_server/_helpers/global.conf;

  ...

  server_name domain.com www.domain.com;

  ...
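As a rough sketch of how the directory structure described above could be wired into the main file, the fragment below attaches each directory with include. The file names and glob patterns (load-modules.conf, servers.conf, *.conf) are assumptions made for illustration and are not part of the original layout; it also assumes every file in _acls and _basic is valid in the http context.

```nginx
# /etc/nginx/nginx.conf (illustrative sketch; file names are hypothetical)

# Dynamically loaded modules (the load_module lines live in this file):
include /etc/nginx/modules/load-modules.conf;

events {
  worker_connections 1024;
}

http {

  # Access control lists (geo/map definitions):
  include /etc/nginx/master/_acls/*.conf;

  # Rate limiting rules, redirect maps, proxy params:
  include /etc/nginx/master/_basic/*.conf;

  # One directory per domain, each holding its own server blocks:
  include /etc/nginx/master/_server/*/servers.conf;

}
```

Files from _listen are not included here because, as in the example above, they are attached inside individual server blocks.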

🔰 Format, prettify and indent your Nginx code

Rationale

Working with unreadable configuration files is terrible: if the syntax isn't readable, it makes your eyes sore and gives you headaches.

When your code is formatted, it is significantly easier to maintain, debug, optimise, and can be read and understood in a short amount of time. You should eliminate code style violations from your NGINX configuration files.

Choose your formatter style and set up a common config for it. Some rules are universal, but the most important thing is to keep a consistent NGINX code style throughout your code base:

  • use whitespace and blank lines to arrange and separate code blocks
  • use tabs for indents - they are consistent, customizable and allow mistakes to be more noticeable (unless you are a 4 space kind of guy)
  • use comments to explain why things are done, not what is done
  • use meaningful naming conventions
  • simple is better than complex but complex is better than complicated

Some would say that NGINX's files are written in their own language or syntax, so we should not overdo it with the above rules. I think it's worth sticking to the general (programming) rules to make your life, and the lives of other NGINX administrators, easier.

Example
# Bad code style:
http {
  include    nginx/proxy.conf;
  include    /etc/nginx/fastcgi.conf;
  index    index.html index.htm index.php;

  default_type application/octet-stream;
  log_format   main '$remote_addr - $remote_user [$time_local]  $status '
    '"$request" $body_bytes_sent "$http_referer" '
    '"$http_user_agent" "$http_x_forwarded_for"';
  access_log   logs/access.log    main;
  sendfile on;
  tcp_nopush   on;
  server_names_hash_bucket_size 128; # this seems to be required for some vhosts

  ...

# Good code style:
http {

  # Attach global rules:
  include         /etc/nginx/proxy.conf;
  include         /etc/nginx/fastcgi.conf;

  index           index.html index.htm index.php;

  default_type    application/octet-stream;

  # Standard log format:
  log_format      main '$remote_addr - $remote_user [$time_local]  $status '
                       '"$request" $body_bytes_sent "$http_referer" '
                       '"$http_user_agent" "$http_x_forwarded_for"';

  access_log      /var/log/nginx/access.log main;

  sendfile        on;
  tcp_nopush      on;

  # This seems to be required for some vhosts:
  server_names_hash_bucket_size 128;

  ...

🔰 Use reload option to change configurations on the fly

Rationale

Use the reload option to achieve a graceful reload of the configuration without stopping the server or dropping any packets. If the new configuration cannot be applied, the master process rolls back the changes and continues to work with the old, stable configuration.

This ability of NGINX is critical in high-uptime, dynamic environments for keeping the load balancer or standalone server online.

The master process checks the syntax validity of the new configuration and tries to apply all changes. If this succeeds, the master process creates new worker processes and sends shutdown messages to the old ones. Old workers stop accepting new connections after receiving the shutdown signal, but current requests are still processed. After that, the old workers exit.

When you restart the NGINX service you might encounter a situation in which NGINX stops and won't start again because of a syntax error. Reloading is safer than restarting because the new configuration file is parsed before the old process is terminated, and the whole procedure is aborted if there are any problems with it.

To stop NGINX while letting the worker processes finish serving current requests, use the nginx -s quit command (graceful shutdown). It's better than nginx -s stop, which performs a fast shutdown.

From NGINX documentation:

In order for NGINX to re-read the configuration file, a HUP signal should be sent to the master process. The master process first checks the syntax validity, then tries to apply new configuration, that is, to open log files and new listen sockets. If this fails, it rolls back changes and continues to work with old configuration. If this succeeds, it starts new worker processes, and sends messages to old worker processes requesting them to shut down gracefully. Old worker processes close listen sockets and continue to service old clients. After all clients are serviced, old worker processes are shut down.

Example
# 1)
systemctl reload nginx

# 2)
service nginx reload

# 3)
/etc/init.d/nginx reload

# 4)
/usr/sbin/nginx -s reload

# 5)
kill -HUP $(cat /var/run/nginx.pid)
# or
kill -HUP $(pgrep -f "nginx: master")

# 6)
/usr/sbin/nginx -g 'daemon on; master_process on;' -s reload

🔰 Separate listen directives for 80 and 443

Rationale

If you serve HTTP and HTTPS with the exact same config (a single server that handles both HTTP and HTTPS requests), NGINX is intelligent enough to ignore the SSL directives if loaded over port 80.

Best practice with NGINX is to use a separate server for a redirect like this (not shared with the server of your main configuration), to hardcode everything, and not use regular expressions at all.

I don't like duplicating the rules, but separate listen directives certainly help you maintain and modify your configuration.

It's useful if you pin multiple domains to one IP address. This allows you to attach one listen directive (e.g. if you keep it in a separate configuration file) to multiple domain configurations.

It may also be necessary to hardcode the domains if you're using HTTPS, because you have to know upfront which certificates you'll be providing.

Example
# For HTTP:
server {

  listen 10.240.20.2:80;

  ...

}

# For HTTPS:
server {

  listen 10.240.20.2:443 ssl;

  ...

}
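The "separate server for a redirect" advice above could look roughly like the sketch below: one server block that only redirects plain HTTP, with names hardcoded and no regular expressions, and a second one that carries the real configuration. The domain names and certificate paths are placeholders chosen for illustration.

```nginx
# HTTP: only redirects; names are hardcoded, no regular expressions.
server {

  listen 10.240.20.2:80;

  server_name example.com www.example.com;

  return 301 https://example.com$request_uri;

}

# HTTPS: the main configuration.
server {

  listen 10.240.20.2:443 ssl;

  server_name example.com;

  # Hypothetical certificate paths:
  ssl_certificate     /etc/nginx/master/_server/example.com/certs/example.com.crt;
  ssl_certificate_key /etc/nginx/master/_server/example.com/certs/example.com.key;

  # The rest of the site configuration goes here.

}
```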

🔰 Define the listen directives explicitly with address:port pair

Rationale

NGINX translates all incomplete listen directives by substituting missing values with their default values.

What's more, NGINX will only evaluate the server_name directive when it needs to distinguish between server blocks that match at the same level of specificity in the listen directive.

Set the IP address and port number explicitly to prevent soft mistakes which may be difficult to debug.

Example
# Client side:
curl -Iks http://api.random.com

# Server side:
server {

  # This block will be processed:
  listen 192.168.252.10;  # --> 192.168.252.10:80

  ...

}

server {

  listen 80;  # --> *:80 --> 0.0.0.0:80
  server_name api.random.com;

  ...

}
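In the example above the request is handled by the first block because its listen directive matches the destination address more specifically, so server_name in the second block is never consulted. A possible fix, assuming api.random.com resolves to 192.168.252.10 and using www.random.com as a made-up second name, is to give both blocks the same explicit address:port pair:

```nginx
server {

  listen 192.168.252.10:80;

  # Hypothetical name for the other site on this address:
  server_name www.random.com;

}

server {

  # Same explicit address:port pair, so server_name now decides:
  listen 192.168.252.10:80;

  server_name api.random.com;

}
```

With identical listen values, NGINX falls back to comparing the Host header against server_name, and the request for api.random.com reaches the second block.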

🔰 Prevent processing requests with undefined server names

Rationale

NGINX should prevent processing requests with undefined server names (also on IP address). It protects against configuration errors, e.g. traffic forwarding to incorrect backends. The problem is easily solved by creating a default dummy vhost that catches all requests with unrecognized Host headers.

If none of the listen directives has the default_server parameter, then the first server with the address:port pair will be the default server for this pair (which means that NGINX always has a default server).

If someone makes a request using an IP address instead of a server name, the Host request header field will contain the IP address and the request can be handled using the IP address as the server name.

The server name _ is not required in modern versions of NGINX. If a server with a matching listen and server_name cannot be found, NGINX will use the default server. If your configurations are spread across multiple files, their evaluation order will be ambiguous, so you need to mark the default server explicitly.

NGINX routes a request to a server block by matching the Host header against server_name, and it can read that header only after the TLS handshake has completed. This means that for an SSL server, NGINX must be able to accept the SSL connection, which boils down to having a certificate/key. The cert/key can be any, e.g. self-signed.

It is a simple procedure for all undefined server names:

  • one server block, with...
  • complete listen directive, with...
  • default_server parameter, with...
  • only one server_name definition, and...
  • preventively I add it at the beginning of the configuration

It is also a good idea to use return 444; in the default server, because this closes the connection without sending a response (while still logging the request) for any domain that isn't defined in NGINX.

Example
# Place it at the beginning of the configuration file to prevent mistakes:
server {

  # For ssl option remember about SSL parameters (private key, certs, cipher suites, etc.);
  # add default_server to your listen directive in the server that you want to act as the default:
  listen 10.240.20.2:443 default_server ssl;

  # We catch:
  #   - invalid domain names
  #   - requests without the "Host" header
  #   - and all others (also due to the above setting)
  #   - default_server in server_name directive is not required - I add this for a better understanding and I think it's an unwritten standard
  # ...but you should know that it's irrelevant, really, you can put in everything there.
  server_name _ "" default_server;

  ...

  return 444;

  # We can also serve:
  # location / {

    # static file (error page):
    #   root /etc/nginx/error-pages/404;
    # or redirect:
    #   return 301 https://badssl.com;

    # return 444;

  # }

}

server {

  listen 10.240.20.2:443 ssl;

  server_name domain.com;

  ...

}

server {

  listen 10.240.20.2:443 ssl;

  server_name domain.org;

  ...

}
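The example above covers port 443 only. If the same address also accepts plain HTTP, a matching catch-all for port 80 might look like the sketch below; that the address listens on port 80 is an assumption for illustration.

```nginx
# Catch-all for plain HTTP on the same address:
server {

  listen 10.240.20.2:80 default_server;

  # The name is irrelevant here; default_server makes this block the fallback:
  server_name _;

  return 444;

}
```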

🔰 Never use a hostname in a listen or upstream directives

Rationale

Generally, using hostnames in the listen or upstream directives is a bad practice.

In the worst case NGINX won't be able to bind to the desired TCP socket which will prevent NGINX from starting at all.

The best and safer way is to know the IP address that needs to be bound to and use that address instead of the hostname. This also prevents NGINX from needing to look up the address and removes dependencies on external and internal resolvers.

Using the $hostname variable (the machine's hostname) in the server_name directive is also an example of bad practice (it's similar to using a hostname label).

I believe it is also necessary to set the IP address and port number pair to prevent soft mistakes which may be difficult to debug.

Example

Bad configuration:

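# The upstream directive requires a name ("backend" is a placeholder),
# and its server directive takes an address without a scheme:
upstream backend {

  server x-9s-web01-prod:8080;

}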

server {

  listen rev-proxy-prod:80;

  ...

}

Good configuration:

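# The upstream directive requires a name ("backend" is a placeholder),
# and its server directive takes an address without a scheme:
upstream backend {

  server 192.168.252.200:8080;

}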

server {

  listen 10.10.100.20:80;

  ...

}

🔰 Use only one SSL config for the listen directive

Rationale

For me, this rule makes the configuration easier to debug and maintain.

Remember that regardless of SSL parameters you are able to use multiple SSL certificates on the same listen directive (IP address).

To share a single IP address between several HTTPS servers, in my opinion you should use one SSL config (e.g. protocols, ciphers, curves). This prevents mistakes and configuration mismatches.

Also remember about the configuration for the default server. It's important because if none of the listen directives has the default_server parameter, then the first server in your configuration will be the default server. So you should use only one SSL setup with several names on the same IP address.

From NGINX documentation:

This is caused by SSL protocol behaviour. The SSL connection is established before the browser sends an HTTP request and nginx does not know the name of the requested server. Therefore, it may only offer the default server’s certificate.

Also take a look at this:

A more generic solution for running several HTTPS servers on a single IP address is TLS Server Name Indication extension (SNI, RFC6066 - Transport Layer Security (TLS) Extensions: Extension Definitions), which allows a browser to pass a requested server name during the SSL handshake and, therefore, the server will know which certificate it should use for the connection.

Another good idea is to move common server settings into a separate file, i.e. common/example.com.conf and then include it in separate server blocks.

Example
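# Store this configuration in e.g. https.conf.
# Do not put default_server here: the file is included by several server blocks,
# and NGINX rejects a duplicate default server for the same address:port pair:
listen 192.168.252.10:443 ssl http2;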

ssl_protocols TLSv1.2;
ssl_ciphers "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384";

ssl_prefer_server_ciphers on;

ssl_ecdh_curve secp521r1:secp384r1;

...

# Include this file to the server context (attach domain-a.com for specific listen directive):
server {

  include             /etc/nginx/https.conf;

  server_name         domain-a.com;

  ssl_certificate     domain-a.com.crt;
  ssl_certificate_key domain-a.com.key;

  ...

}

# Include this file to the server context (attach domain-b.com for specific listen directive):
server {

  include             /etc/nginx/https.conf;

  server_name         domain-b.com;

  ssl_certificate     domain-b.com.crt;
  ssl_certificate_key domain-b.com.key;

  ...

}

🔰 Use geo/map modules instead of allow/deny

Rationale

Use the map or geo modules (one of them) to prevent users from abusing your servers. They allow you to create variables with values depending on the client IP address.

Since variables are evaluated only when used, the mere existence of even a large number of declared e.g. geo variables does not cause any extra costs for request processing.

These directives provide the perfect way to block invalid visitors, e.g. with ngx_http_geoip_module. For example, the geo module is great for conditionally allowing or denying IP addresses.

The geo module (watch out: don't mistake this module for GeoIP) builds an in-memory radix tree when loading configs. This is the same data structure as used in routing, and lookups are really fast. If you have many unique values per network, a long load time is caused by searching for duplicates of data in an array. Otherwise, it may be caused by insertions into the radix tree.

I use both modules for large lists. Think it through first, though, because this rule requires several if conditions. I think that allow/deny directives are the better solution for simple lists, after all. Take a look at the example below:

# Allow/deny:
location /internal {

  include acls/internal.conf;
  allow   192.168.240.0/24;
  deny    all;

  ...

}

# vs geo/map:
location /internal {

  if ($globals_internal_map_acl) {
    set $pass 1;
  }

  if ($pass = 1) {
    proxy_pass http://localhost:80;
  }

  if ($pass != 1) {
    return 403;
  }

  ...

}
Example
# Map module:
map $remote_addr $globals_internal_map_acl {

  # Status code:
  #  - 0 = false
  #  - 1 = true
  default 0;

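  ### INTERNAL ###
  # map compares strings (it does not understand CIDR notation),
  # so networks are written as regular expressions:
  ~^10\.255\.10\. 1;
  ~^10\.255\.20\. 1;
  ~^10\.255\.30\. 1;
  ~^192\.168\.    1;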

}

# Geo module:
geo $globals_internal_geo_acl {

  # Status code:
  #  - 0 = false
  #  - 1 = true
  default 0;

  ### INTERNAL ###
  10.255.10.0/24 1;
  10.255.20.0/24 1;
  10.255.30.0/24 1;
  192.168.0.0/16 1;

}
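Once the geo variable is defined, the location from the comparison above can often be reduced to a single condition. The sketch below assumes the $globals_internal_geo_acl variable from the example and reuses the http://localhost:80 backend purely for illustration.

```nginx
location /internal {

  # $globals_internal_geo_acl is 1 for internal networks and 0 otherwise:
  if ($globals_internal_geo_acl = 0) {
    return 403;
  }

  proxy_pass http://localhost:80;

}
```

Using only return inside the if block keeps the condition simple, and the proxying itself stays outside of it.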

🔰 Map all the things...

Rationale

Manage a large number of redirects with maps and use them to customise your key-value pairs.

The map directive maps strings, so it is possible to represent e.g. 192.168.144.0/24 as a regular expression and continue to use the map directive.

The map module provides a more elegant solution for clearly parsing a big list of regexes, e.g. User-Agents, Referrers.

You can also use the include directive for your maps so your config files stay tidy.

Example
map $http_user_agent $device_redirect {

  default "desktop";

  ~(?i)ip(hone|od) "mobile";
  ~(?i)android.*(mobile|mini) "mobile";
  ~Mobile.+Firefox "mobile";
  ~^HTC "mobile";
  ~Fennec "mobile";
  ~IEMobile "mobile";
  ~BB10 "mobile";
  ~SymbianOS.*AppleWebKit "mobile";
  ~Opera\sMobi "mobile";

}

# Turn on in a specific context (e.g. location):
if ($device_redirect = "mobile") {

  return 301 https://m.domain.com$request_uri;

}
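The rationale also mentions managing a large number of redirects with maps. A minimal sketch of that idea follows; the URIs, the variable name $redirect_target, and the listen address are made up for illustration.

```nginx
# http context: old URI -> new URI (hypothetical paths).
map $uri $redirect_target {

  default            "";

  /old-page          /new-page;
  /docs/v1/install   /docs/install;

}

server {

  listen 10.240.20.2:80;

  server_name example.com;

  # An empty value means that no redirect is defined for this URI:
  if ($redirect_target) {
    return 301 $redirect_target;
  }

}
```

For long lists, the key-value pairs could live in a separate file that is pulled into the map block with include, which keeps the main configuration short.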

🔰 Set global root directory for unmatched locations

Rationale

Set a global root inside the server block. It specifies the root directory for requests that end up in locations without their own root.

From official documentation:

If you add a root to every location block then a location block that isn’t matched will have no root. Therefore, it is important that a root directive occur prior to your location blocks, which can then override this directive if they need to.

Example
server {

  server_name domain.com;

  root /var/www/domain.com/public;

  location / {

    ...

  }

  location /api {

    ...

  }

  location /static {

    root /var/www/domain.com/static;

    ...

  }

}

🔰 Use return directive for URL redirection (301, 302)

Rationale

It's a simple rule. You should use server blocks and return statements as they're way faster than evaluating regular expressions.

It is simpler and faster because NGINX stops processing the request (and doesn't have to process regular expressions).

Example
server {

  # Redirect the bare domain to www (with www.example.com here the block would redirect to itself):
  server_name example.com;

  # return    301 https://$host$request_uri;
  return      301 $scheme://www.example.com$request_uri;

}
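To make the comparison with regular expressions concrete, the sketch below shows an equivalent rewrite rule next to the return form. The listen address is an assumption for illustration.

```nginx
server {

  listen 10.240.20.2:80;

  server_name example.com;

  # Slower: the regular expression is evaluated for every request.
  # rewrite ^/(.*)$ https://www.example.com/$1 permanent;

  # Faster: no regular expression, and request processing stops here.
  return 301 https://www.example.com$request_uri;

}
```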

🔰 Configure log rotation policy

Rationale

Log files give you feedback about the activity and performance of the server as well as any problems that may be occurring. They record details about requests and NGINX internals. Unfortunately, logs take up more and more disk space over time.

You should define a process which periodically archives the current log file and starts a new one: it renames and optionally compresses the current log files, deletes old log files, and forces the logging system to begin using new log files.

I think the best tool for this is logrotate. I use it everywhere I want to manage logs automatically, and for a good night's sleep as well. It is a simple program to rotate logs and is run from cron. It's a scheduled job, not a daemon, so there is no need to reload its configuration.

Example
  • for manual rotation:

    # Check manually (all log files):
    logrotate -dv /etc/logrotate.conf
    
    # Check manually with force rotation (specific log file):
    logrotate -dv --force /etc/logrotate.d/nginx
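  • for automated rotation:

    # The delimiter is quoted so that the shell leaves backticks and trailing backslashes untouched:
    cat > /etc/logrotate.d/nginx << '__EOF__'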
    /var/log/nginx/*.log {
      daily
      missingok
      rotate 14
      compress
      delaycompress
      notifempty
      create 0640 nginx nginx
      sharedscripts
      prerotate
        if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
          run-parts /etc/logrotate.d/httpd-prerotate; \
        fi \
      endscript
      postrotate
        # test ! -f /var/run/nginx.pid || kill -USR1 `cat /var/run/nginx.pid`
        invoke-rc.d nginx reload >/dev/null 2>&1
      endscript
    }
    
    /var/log/nginx/localhost/*.log {
      daily
      missingok
      rotate 14
      compress
      delaycompress
      notifempty
      create 0640 nginx nginx
      sharedscripts
      prerotate
        if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
          run-parts /etc/logrotate.d/httpd-prerotate; \
        fi \
      endscript
      postrotate
        # test ! -f /var/run/nginx.pid || kill -USR1 `cat /var/run/nginx.pid`
        invoke-rc.d nginx reload >/dev/null 2>&1
      endscript
    }
    
    /var/log/nginx/domains/example.com/*.log {
      daily
      missingok
      rotate 14
      compress
      delaycompress
      notifempty
      create 0640 nginx nginx
      sharedscripts
      prerotate
        if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
          run-parts /etc/logrotate.d/httpd-prerotate; \
        fi \
      endscript
      postrotate
        # test ! -f /var/run/nginx.pid || kill -USR1 `cat /var/run/nginx.pid`
        invoke-rc.d nginx reload >/dev/null 2>&1
      endscript
    }
    __EOF__

🔰 Don't duplicate index directive, use it only in the http block

Rationale

Use the index directive only once. It only needs to occur in your http context and it will be inherited by the levels below.

I think we should be careful about duplicating the same rules. But, of course, rules duplication is sometimes okay or not necessarily a great evil.

Example

Bad configuration:

http {

  ...

  index index.php index.htm index.html;

  server {

    server_name www.example.com;

    location / {

      index index.php index.html index.$geo.html;

      ...

    }

  }

  server {

    server_name www.example.com;

    location / {

      index index.php index.htm index.html;

      ...

    }

    location /data {

      index index.php;

      ...

    }

    ...

  }

}

Good configuration:

http {

  ...

  index index.php index.htm index.html index.$geo.html;

  server {

    server_name www.example.com;

    location / {

      ...

    }

  }

  server {

    server_name www.example.com;

    location / {

      ...

    }

    location /data {

      ...

    }

    ...

  }

}

Debugging

NGINX has many methods for troubleshooting configuration problems. In this chapter I will present a few ways to deal with them.

🔰 Use custom log formats

Rationale

Anything you can access as a variable in the NGINX config, you can log, including non-standard HTTP headers, so it's simple to create your own log format for specific situations.

This is extremely helpful for debugging specific location directives.

Example
# Default main log format from the NGINX repository:
log_format main
                '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';

# Extended main log format:
log_format main-level-0
                '$remote_addr - $remote_user [$time_local] '
                '"$request_method $scheme://$host$request_uri '
                '$server_protocol" $status $body_bytes_sent '
                '"$http_referer" "$http_user_agent" '
                '$request_time';

# Debug log formats:
log_format debug-level-0
                '$remote_addr - $remote_user [$time_local] '
                '"$request_method $scheme://$host$request_uri '
                '$server_protocol" $status $body_bytes_sent '
                '$request_id $pid $msec $request_time '
                '$upstream_connect_time $upstream_header_time '
                '$upstream_response_time "$request_filename" '
                '$request_completion';
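A format only takes effect once an access_log directive refers to it. The sketch below, which assumes the log_format definitions above live in the http context, attaches the verbose format to a single location; the paths, the listen address, and the backend address are placeholders.

```nginx
server {

  listen 10.240.20.2:80;

  server_name example.com;

  # Regular traffic uses the extended format:
  access_log /var/log/nginx/example.com-access.log main-level-0;

  location /api {

    # Only this location writes the verbose debug format:
    access_log /var/log/nginx/example.com-api-debug.log debug-level-0;

    # Hypothetical backend:
    proxy_pass http://192.168.252.200:8080;

  }

}
```

An access_log set inside a location replaces the one inherited from the server level, so requests to /api are written only to the debug file.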

🔰 Use debug mode to track down unexpected behaviour

Rationale

Debug-level logging will probably give you more detail than you want, but it can sometimes be a lifesaver (keep in mind that the log file grows rapidly on very high-traffic sites).

Generally, the error_log directive is specified in the main context, but you can also specify it inside a particular server or location block. In that case the global settings are overridden and that error_log directive sets its own path to the log file and its own level of logging.

It is possible to enable the debugging log for a particular IP address or a range of IP addresses (see examples).

An alternative method of storing the debug log is to keep it in memory (in a cyclic memory buffer). The memory buffer on the debug level does not have a significant impact on performance even under high load.

If you want to log the processing of ngx_http_rewrite_module directives (at the notice level), enable rewrite_log on; in the http, server, or location context.

Words of caution:

  • never leave debug logging to a file on in production
  • don't forget to revert the debug level for error_log on very high-traffic sites
  • absolutely use log rotation policy
Example
  • Debugging log to a file:
# Turn on in a specific context, e.g.:
#   - global    - for global logging
#   - http      - for http and all locations logging
#   - location  - for specific location
error_log /var/log/nginx/error-debug.log debug;
  • Debugging log to memory:

    error_log memory:32m debug;

    You can read more about that in the Show debug log in memory chapter.

  • Debugging log for an IP address/range:

    events {
    
      debug_connection    192.168.252.15/32;
      debug_connection    10.10.10.0/24;
    
    }
  • Debugging log for each server:

    error_log /var/log/nginx/debug.log debug;
    
    ...
    
    http {
    
      server {
    
        # To enable debugging:
        error_log /var/log/nginx/domain.com/domain.com-debug.log debug;
        # To disable debugging:
        error_log /var/log/nginx/domain.com/domain.com-debug.log;
    
        ...
    
      }
    
    }
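The rationale mentions rewrite_log, which the examples above do not show. A minimal sketch follows; the log path, the listen address, and the rewrite rule are made up for illustration.

```nginx
server {

  listen 10.240.20.2:80;

  server_name example.com;

  # rewrite_log writes at the notice level, so error_log must allow that level:
  error_log   /var/log/nginx/example.com-rewrite.log notice;
  rewrite_log on;

  location /old {

    # Each evaluation of this rule is now recorded in the error log:
    rewrite ^/old/(.*)$ /new/$1 last;

  }

}
```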

🔰 Disable daemon, master process, and all workers except one

Rationale

These directives, with the following values, are mainly used during development and debugging, e.g. while testing a bug/feature.

For example, daemon off and master_process off let me test configurations rapidly.

In normal production use the NGINX server starts in the background (daemon on). In this way NGINX and other services are running and talking to each other. One server runs many services.

In a development or debugging environment (you should never run NGINX in production like this), using master_process off, I usually run NGINX in the foreground without the master process and press ^C (SIGINT) to terminate it.

worker_processes 1 is also very useful because it reduces the number of worker processes and the data they generate, which makes debugging more comfortable.

Example
# From configuration file (global context):
daemon            off;
master_process    off;
worker_processes  1;

# From shell (oneliner); -t only tests the configuration, drop it to actually start NGINX in the foreground:
/usr/sbin/nginx -t -g 'daemon off; master_process off; worker_processes 1;'

🔰 Use core dumps to figure out why NGINX keeps crashing

Rationale

A core dump is basically a snapshot of the memory when the program crashed.

NGINX is a very stable daemon, but sometimes a running NGINX process can terminate unexpectedly.

There are two important directives that should be enabled if you want the memory dumps to be saved (worker_rlimit_core and working_directory); however, in order to properly handle memory dumps, there are a few more things to do. For full information about it see the Dump a process's memory chapter (from this handbook).

You should always enable core dumps when your NGINX instance receives an unexpected error or when it has crashed.

Example
worker_rlimit_core    500m;
worker_rlimit_nofile  65535;
working_directory     /var/dump/nginx;

Performance

NGINX is insanely fast, but you can adjust a few things to make sure it's as fast as possible for your use case.

🔰 Adjust worker processes

Rationale

The worker_processes directive is the sturdy spine of life for NGINX. This directive is responsible for letting our virtual server know how many workers to spawn once it has become bound to the proper IP and port(s), and its value is helpful in CPU-intensive work.

The safest setting is to use the number of cores by passing auto. You can adjust this value to maximise throughput under high concurrency.

How many worker processes do you need? Do some load testing. Hit the app hard and see what happens with only one. Then add some more to it and hit it again. At some point you'll reach a point of truly saturating the server resources. That's when you know you have the right balance.

I think for high-load proxy servers (also standalone servers) an interesting value is ALL_CORES - 1 (or more), because if you're running NGINX with other critical services on the same server, you're just going to thrash the CPUs with all the context switching required to manage all of those processes.

Rule of thumb: If much time is spent blocked on I/O, worker processes should be increased further.

Increasing the number of worker processes is a great way to overcome a single CPU core bottleneck, but it may open a whole new set of problems.

The official NGINX documentation says:

When one is in doubt, setting it to the number of available CPU cores would be a good start (the value "auto" will try to autodetect it). [...] running one worker process per CPU core - makes the most efficient use of hardware resources.

Example
# The safest way:
worker_processes auto;

# VCPU = 4 , expr $(nproc --all) - 1
worker_processes 3;

🔰 Use HTTP/2

Rationale

HTTP/2 will make our applications faster, simpler, and more robust. The primary goals for HTTP/2 are to reduce latency by enabling full request and response multiplexing, minimise protocol overhead via efficient compression of HTTP header fields, and add support for request prioritisation and server push.

HTTP/2 is backwards-compatible with HTTP/1.1, so it would be possible to ignore it completely and everything would continue to work as before: a client that does not support HTTP/2 will never ask the server for an HTTP/2 communication upgrade, and the communication between them will be fully HTTP/1.1.

Note that HTTP/2 multiplexes many requests within a single TCP connection. Typically, a single TCP connection is established to a server when HTTP/2 is in use.

You should also include the ssl parameter; it is required because browsers do not support HTTP/2 without encryption.

HTTP/2 has an extremely large blacklist of old and insecure ciphers, so you should avoid them.

Example
server {

  listen 10.240.20.2:443 ssl http2;

  ...

🔰 Maintaining SSL sessions

Rationale

Reusing SSL sessions improves performance from the clients' perspective, because it eliminates the need for a new (and time-consuming) SSL handshake to be conducted each time a request is made.

The TLS RFC recommends that sessions are not kept alive for more than 24 hours (it is the maximum time). But a while ago, I came across recommendations to set ssl_session_timeout to less time (e.g. 15 minutes) to prevent abuse by advertisers like Google and Facebook.

The default, "built-in" session cache is not optimal as it can be used by only one worker process and can cause memory fragmentation. It is much better to use a shared cache.

When using ssl_session_cache, the performance of keep-alive connections over SSL might be enormously increased. A value of 10M is a good starting point (a 1 MB shared cache can hold approximately 4,000 sessions). A shared cache is shared between all worker processes (a cache with the same name can be used in several virtual servers).

Most servers do not purge sessions or ticket keys, thus increasing the risk that a server compromise would leak data from previous (and future) connections.

Ivan Ristić (Founder of Hardenize) says:

Session resumption either creates a large server-side cache that can be broken into or, with tickets, kills forward secrecy. So you have to balance performance (you don't want your users to use full handshakes on every connection) and security (you don't want to compromise it too much). Different projects dictate different settings. [...] One reason not to use a very large cache (just because you can) is that popular implementations don't actually delete any records from there; even the expired sessions are still in the cache and can be recovered. The only way to really delete is to overwrite them with a new session. [...] These days I'd probably reduce the maximum session duration to 4 hours, down from 24 hours currently in my book. But that's largely based on a gut feeling that 4 hours is enough for you to reap the performance benefits, and using a shorter lifetime is always better.

Example
ssl_session_cache   shared:NGX_SSL_CACHE:10m;
ssl_session_timeout 12h;
ssl_session_tickets off;
ssl_buffer_size     1400;
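
If you prefer to follow the shorter lifetime suggested in the quote above, the example can be adapted as in the sketch below. The 4h value comes from that quote, the cache name and size are the same as above, and whether this trade-off suits your traffic is an assumption you should verify yourself.

```nginx
# Shared between all worker processes; roughly 4,000 sessions per megabyte:
ssl_session_cache   shared:NGX_SSL_CACHE:10m;

# Shorter lifetime, following the "4 hours" suggestion quoted above:
ssl_session_timeout 4h;

# Keep session tickets disabled so forward secrecy is not weakened:
ssl_session_tickets off;
```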
External resources

🔰 Use exact names in a server_name directive where possible

Rationale

Exact names, wildcard names starting with an asterisk, and wildcard names ending with an asterisk are stored in three hash tables bound to the listen ports.

The exact names hash table is searched first. If a name is not found, the hash table with wildcard names starting with an asterisk is searched. If the name is not found there, the hash table with wildcard names ending with an asterisk is searched. Searching wildcard names hash table is slower than searching exact names hash table because names are searched by domain parts.

Regular expressions are tested sequentially and therefore are the slowest method and are non-scalable. For these reasons, it is better to use exact names where possible.

Example
# It is more efficient to define them explicitly:
server {

    listen       192.168.252.10:80;

    server_name  example.org  www.example.org  *.example.org;

    ...

}

# Than to use the simplified form:
server {

    listen       192.168.252.10:80;

    server_name  .example.org;

    ...

}
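
For contrast, the slowest form mentioned above, a regular expression name, might look like the sketch below. It is only an illustration of what to avoid when an exact name will do; the pattern itself is my own example.

```nginx
# Regular expression names are tested sequentially,
# after the three hash tables have been searched:
server {

    listen       192.168.252.10:80;

    server_name  ~^(www\.)?example\.org$;

    # rest of the configuration goes here

}
```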
External resources

🔰 Avoid checking server_name with the if directive

Rationale

When NGINX receives a request, this if directive is always evaluated, no matter which subdomain is being requested, be it www.example.com or just the plain example.com. You are asking NGINX to check the Host header for every request, which is extremely inefficient.

Instead use two server directives like the example below. This approach decreases NGINX processing requirements.

Example

Bad configuration:

server {

  server_name                 domain.com www.domain.com;

  if ($host = www.domain.com) {

    return                    301 https://domain.com$request_uri;

  }

  ...

}

Good configuration:

server {

    server_name               www.domain.com;

    return                    301 $scheme://domain.com$request_uri;

    # If you force your web traffic to use HTTPS:
    #                         301 https://domain.com$request_uri;

    ...

}

server {

    listen                    192.168.252.10:80;

    server_name               domain.com;

    ...

}
External resources

🔰 Use $request_uri to avoid using regular expressions

Rationale

With the built-in $request_uri variable we can effectively avoid doing any capturing or matching at all. Regular expressions are costly and slow down request processing.

This rule addresses passing the URL unchanged to a new host. In that case return is more efficient, because it just passes through the existing URI.

I think the best explanation comes from the official documentation:

Don’t feel bad here, it’s easy to get confused with regular expressions. In fact, it’s so easy to do that we should make an effort to keep them neat and clean.

Example

Bad configuration:

# 1)
rewrite ^/(.*)$ https://example.com/$1 permanent;

# 2)
rewrite ^ https://example.com$request_uri? permanent;

Good configuration:

return 301 https://example.com$request_uri;
External resources

🔰 Use try_files directive to ensure a file exists

Rationale

try_files is definitely a very useful directive. You can use it to check whether files exist, in a specified order.

You should use try_files instead of the if directive. It is definitely a better way of doing this, because the if directive is extremely inefficient: it is evaluated for every request.

The advantage of using try_files is that the behaviour switches immediately with one command. I also think the code is more readable.

try_files allows you:

  • to check if the file exists from a predefined list
  • to check if the file exists from a specified directory
  • to use an internal redirect if none of the files are found
Example

Bad configuration:

  ...

  root /var/www/example.com;

  location /images {

    if (-f $request_filename) {

      expires 30d;
      break;

    }

  ...

}

Good configuration:

  ...

  root /var/www/example.com;

  location /images {

    try_files $uri =404;

  ...

}
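
The third point from the list above, an internal redirect when none of the files is found, can be sketched as follows. The /index.html fallback is a hypothetical file used only for illustration.

```nginx
root /var/www/example.com;

location / {

  # Try the exact file, then a directory of that name,
  # and finally fall back to an internal redirect to /index.html:
  try_files $uri $uri/ /index.html;

}
```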
External resources

🔰 Use return directive instead of rewrite for redirects

Rationale

You should use server blocks and return statements, as they are much simpler and faster than evaluating regular expressions via location blocks. The return directive stops processing and returns the specified code to the client.

Example

Bad configuration:

server {

  ...

  if ($host = api.domain.com) {

    rewrite     ^/(.*)$ http://example.com/$1 permanent;

  }

  ...

Good configuration:

server {

  ...

  if ($host = api.domain.com) {

    return      403;

    # or other examples:
    #   return    301 https://domain.com$request_uri;
    #   return    301 $scheme://$host$request_uri;

  }

  ...
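
In many cases the if can be dropped entirely by giving the host its own server block, which is what the rationale above suggests. A minimal sketch, assuming api.domain.com should simply be redirected to domain.com (the listen address is an assumption borrowed from other examples in this document):

```nginx
server {

  listen      192.168.252.10:80;

  server_name api.domain.com;

  # No regular expression and no if: the request is answered immediately
  return      301 https://domain.com$request_uri;

}
```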
External resources

🔰 Enable PCRE JIT to speed up processing of regular expressions

Rationale

The pcre_jit directive enables the use of just-in-time compilation (JIT) for regular expressions to speed up their processing.

By compiling NGINX with the PCRE library, you can perform complex manipulations with your location blocks and use the powerful rewrite and return directives.

PCRE JIT can speed up processing of regular expressions significantly. Regular expression matching in NGINX can be much faster with pcre_jit than without it.

If you try to use pcre_jit on; without JIT available, or if NGINX was compiled with JIT available but the currently loaded PCRE library does not support JIT, NGINX will warn you during configuration parsing.

The --with-pcre-jit option is only needed when you compile the PCRE library using the NGINX configure script (./configure --with-pcre=). When using a system PCRE library, whether or not JIT is supported depends on how the library was compiled.

From NGINX documentation:

The JIT is available in PCRE libraries starting from version 8.20 built with the --enable-jit configuration parameter. When the PCRE library is built with nginx (--with-pcre=), the JIT support is enabled via the --with-pcre-jit configuration parameter.

Example
# In global context:
pcre_jit on;
External resources

🔰 Make an exact location match to speed up the selection process

Rationale

Exact location matches are often used to speed up the selection process by immediately ending the execution of the algorithm.

Example
# Matches the query / only and stops searching:
location = / {

  ...

}

# Matches the query /v9 only and stops searching:
location = /v9 {

  ...

}

...

# Matches any query, because all queries begin with /,
# but regular expressions and any longer conventional blocks will be matched first:
location / {

  ...

}
External resources

🔰 Use limit_conn to improve limiting the download speed

Rationale

NGINX provides two directives for limiting download speed:

  • limit_rate_after - sets the amount of data transferred before the limit_rate directive takes effect
  • limit_rate - allows you to limit the transfer rate of individual client connections (after limit_rate_after is exceeded)

This solution limits the download speed per connection, so if one user opens multiple connections (e.g. to several video files), they will be able to download at X multiplied by the number of connections they have opened.

To prevent this situation use limit_conn_zone and limit_conn directives.

Example
# Create limit connection zone:
limit_conn_zone $binary_remote_addr zone=conn_for_remote_addr:1m;

# Add rules to limiting the download speed:
limit_rate_after 1m;  # run at maximum speed for the first 1 megabyte
limit_rate 250k;      # and set rate limit after 1 megabyte

# Limit concurrent connections:
location /videos {

  # At most 10 concurrent connections per client address,
  # so one client gets at most limit_rate * 10 in total:
  limit_conn conn_for_remote_addr 10;

  ...
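
To see how the two limits combine: the rate limit applies per connection, so the worst-case bandwidth for one client address is roughly limit_rate multiplied by the limit_conn value. A stricter sketch, assuming one download per address is acceptable for your users (the zone name is reused from above; the 429 status is my own choice):

```nginx
# In http context:
limit_conn_zone $binary_remote_addr zone=conn_for_remote_addr:1m;

# In server context:
location /videos {

  limit_rate_after 1m;    # first megabyte at full speed
  limit_rate       250k;  # then at most 250 kilobytes per second per connection

  # One connection per address: 1 * 250k is the most a single client can get
  limit_conn        conn_for_remote_addr 1;

  # Reply with 429 instead of the default 503 when the limit is exceeded:
  limit_conn_status 429;

}
```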
External resources

Hardening

In this chapter I will talk about some of the NGINX hardening approaches and security standards.

🔰 Always keep NGINX up-to-date

Rationale

NGINX is very secure and stable, but vulnerabilities in the main binary itself do pop up from time to time. This is the main reason to keep NGINX as up-to-date as you can.

A very safe way to plan the update is to apply it once a new stable version is released, but for me the most common way to handle NGINX updates is to wait a few weeks after the stable release.

Before you update or upgrade NGINX, remember to do it in a testing environment first.

Most modern GNU/Linux distros will not push the latest version of NGINX into their default package lists, so you may want to consider installing it from sources.

External resources

🔰 Run as an unprivileged user

Rationale

There is no real difference in security just by changing the process owner name. On the other hand, the principle of least privilege states that an entity should be given no more permission than necessary to accomplish its goals within a given system. This way only the master process runs as root.

This is the default NGINX behaviour, but remember to check it.

Example
# Edit nginx.conf:
user nginx;

# Set owner and group for root (app, default) directory:
chown -R nginx:nginx /var/www/domain.com
External resources

🔰 Disable unnecessary modules

Rationale

It is recommended to disable any modules which are not required as this will minimise the risk of any potential attacks by limiting the operations allowed by the web server.

The best way to leave out unused modules is to use the configure options during installation. If a module has been statically linked, you have to re-compile NGINX to remove it.

Use only high-quality modules and keep this in mind:

Unfortunately, many third‑party modules use blocking calls, and users (and sometimes even the developers of the modules) aren’t aware of the drawbacks. Blocking operations can ruin NGINX performance and must be avoided at all costs.

Example
# 1) During installation:
./configure --without-http_autoindex_module

# 2) Comment out modules in the configuration file, e.g. modules.conf:
# load_module                 /usr/share/nginx/modules/ndk_http_module.so;
# load_module                 /usr/share/nginx/modules/ngx_http_auth_pam_module.so;
# load_module                 /usr/share/nginx/modules/ngx_http_cache_purge_module.so;
# load_module                 /usr/share/nginx/modules/ngx_http_dav_ext_module.so;
load_module                   /usr/share/nginx/modules/ngx_http_echo_module.so;
# load_module                 /usr/share/nginx/modules/ngx_http_fancyindex_module.so;
load_module                   /usr/share/nginx/modules/ngx_http_geoip_module.so;
load_module                   /usr/share/nginx/modules/ngx_http_headers_more_filter_module.so;
# load_module                 /usr/share/nginx/modules/ngx_http_image_filter_module.so;
# load_module                 /usr/share/nginx/modules/ngx_http_lua_module.so;
load_module                   /usr/share/nginx/modules/ngx_http_perl_module.so;
# load_module                 /usr/share/nginx/modules/ngx_mail_module.so;
# load_module                 /usr/share/nginx/modules/ngx_nchan_module.so;
# load_module                 /usr/share/nginx/modules/ngx_stream_module.so;
External resources

🔰 Protect sensitive resources

Rationale

Hidden directories and files should never be web accessible - sometimes critical data is published during application deployment. If you use a version control system, you should definitely deny access to critical hidden directories like .git or .svn to prevent exposing the source code of your application.

Sensitive resources contain items that abusers can use to fully recreate the source code used by the site and look for bugs, vulnerabilities, and exposed passwords.

Example
if ($request_uri ~ "/\.git") {

  return 403;

}

# or
location ~ /\.git {

  deny all;

}

# or
location ~* ^.*(\.(?:git|svn|htaccess))$ {

  return 403;

}

# or all . directories/files except .well-known
location ~ /\.(?!well-known\/) {

  deny all;

}
External resources

🔰 Hide Nginx version number

Rationale

Disclosing the version of NGINX running can be undesirable, particularly in environments sensitive to information disclosure.

But the Official Apache Documentation (yep, it's not a joke, in my opinion that's an interesting point of view) says:

Setting ServerTokens to less than minimal is not recommended because it makes it more difficult to debug interoperational problems. Also note that disabling the Server: header does nothing at all to make your server more secure. The idea of "security through obscurity" is a myth and leads to a false sense of safety.

Example
server_tokens off;
External resources

🔰 Hide Nginx server signature

Rationale

One of the easiest first steps is to prevent the web server from revealing the software it runs via the Server header. Certainly, there are several reasons why you would like to change the Server header. It could be security, it could be redundant systems, load balancers etc.

In my opinion there is no real reason or need to show this much information about your server. It is easy to look up particular vulnerabilities once you know the version number.

You should compile NGINX from sources with ngx_headers_more to use the more_set_headers directive, or use the nginx-remove-server-header.patch.

Example
more_set_headers "Server: Unknown";
External resources

🔰 Hide upstream proxy headers

Rationale

Securing a server goes far beyond not showing what's running, but I think less is more.

When NGINX is used to proxy requests to an upstream server (such as a PHP-FPM instance), it can be beneficial to hide certain headers sent in the upstream response (e.g. the version of PHP running).

Example
proxy_hide_header X-Powered-By;
proxy_hide_header X-AspNetMvc-Version;
proxy_hide_header X-AspNet-Version;
proxy_hide_header X-Drupal-Cache;
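
Note that proxy_hide_header only applies to responses received through proxy_pass. If PHP-FPM is attached over FastCGI, the equivalent directive is fastcgi_hide_header. A minimal sketch; the socket path is a hypothetical placeholder:

```nginx
location ~ \.php$ {

  include      fastcgi_params;

  # Hypothetical socket path, adjust to your PHP-FPM pool:
  fastcgi_pass unix:/run/php/php-fpm.sock;

  # Do not reveal the PHP version sent by the backend:
  fastcgi_hide_header X-Powered-By;

}
```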
External resources

🔰 Force all connections over TLS

Rationale

TLS provides two main services. For one, it validates the identity of the server that the user is connecting to. It also protects the transmission of sensitive information from the user to the server.

In my opinion you should always use HTTPS instead of HTTP to protect your website, even if it doesn’t handle sensitive communications. The application can have many sensitive places that should be protected.

Always put the login page, registration forms, all subsequent authenticated pages, contact forms, and payment details forms behind HTTPS to prevent injection and sniffing. They must be accessed only over TLS to ensure your traffic is secure.

If a page is available over TLS, it must be composed completely of content which is transmitted over TLS. Requesting subresources using the insecure HTTP protocol weakens the security of the entire page and of HTTPS itself. Modern browsers should block or report all active mixed content delivered via HTTP on such pages by default.

Also remember to implement the HTTP Strict Transport Security (HSTS).

We currently have the first free and open CA - Let's Encrypt - so generating and implementing certificates has never been so easy. It was created to provide free and easy-to-use TLS and SSL certificates.

Example
  • force all traffic to use TLS:

    server {
    
      listen 10.240.20.2:80;
    
      server_name domain.com;
    
      return 301 https://$host$request_uri;
    
    }
    
    server {
    
      listen 10.240.20.2:443 ssl;
    
      server_name domain.com;
    
      ...
    
    }
  • force e.g. login page to use TLS:

    server {
    
      listen 10.240.20.2:80;
    
      server_name domain.com;
    
      ...
    
      location ^~ /login {
    
        return 301 https://domain.com$request_uri;
    
      }
    
    }
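
The HSTS policy mentioned in the rationale is sent as a response header from the TLS server block. A minimal sketch, assuming a two-year max-age is acceptable for your domain (the value is my assumption; start with a much smaller one while testing):

```nginx
server {

  listen 10.240.20.2:443 ssl;

  server_name domain.com;

  # "always" adds the header to error responses as well:
  add_header Strict-Transport-Security "max-age=63072000; includeSubDomains" always;

  # certificates and the rest of the configuration go here

}
```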
External resources

🔰 Use only the latest supported OpenSSL version

Rationale

Before you start, see the Release Strategy Policies and the Changelog on the OpenSSL website. The criteria for choosing an OpenSSL version can vary and depend entirely on your use case.

The latest versions of the major OpenSSL branches are (this may change):

  • the next version of OpenSSL will be 3.0.0
  • version 1.1.1 will be supported until 2023-09-11 (LTS)
    • last minor version: 1.1.1d (September 10, 2019)
  • version 1.1.0 will be supported until 2019-09-11
    • last minor version: 1.1.0k (May 28, 2019)
  • version 1.0.2 will be supported until 2019-12-31 (LTS)
    • last minor version: 1.0.2s (May 28, 2019)
  • any other versions are no longer supported

In my opinion the only safe way is to rely on an up-to-date and still supported version of OpenSSL. What's more, I recommend sticking to the latest versions (e.g. 1.1.1), but you should know one thing: OpenSSL 1.1.1 has a different API than 1.0.2, so switching is not just a simple flick of the switch.

If your system repositories do not have the newest OpenSSL, you can compile it yourself (see OpenSSL sub-section).

I also recommend tracking the official Vulnerabilities newsletter if you want to know about security bugs and issues fixed in OpenSSL.

External resources

🔰 Use min. 2048-bit private keys

Rationale

The truth is, the industry/community are split on this topic. I am in the "use 2048, because 4096 gives us almost nothing, while costing us quite a lot" camp myself.

Advisories recommend 2048 for now. Security experts are projecting that 2048 bits will be sufficient for commercial use until around the year 2030 (as per NIST). The latest version of FIPS-186 also says the U.S. Federal Government generates (and uses) digital signatures with 1024, 2048, or 3072 bit key lengths.

Generally there is no compelling reason to choose 4096 bit keys over 2048 provided you use sane expiration intervals. While it is true that a longer key provides better security, doubling the length of the key from 2048 to 4096 increases the bits of security by only 18, a mere 16% (while the time to sign a message increases by 7x, and the time to verify a signature increases by more than 3x in some cases). Moreover, besides requiring more storage, longer keys also translate into increased CPU usage and higher power consumption.

The real advantage of using a 4096-bit key nowadays is future proofing. If you want to get A+ with 100% scores on SSL Labs (for Key Exchange) you should definitely use 4096 bit private keys. That's the main (and for me the only) reason why you should use them.

Longer keys take more time to generate and require more CPU and power when used for encrypting and decrypting. The SSL handshake at the start of each connection will also be slower. This also has a small impact on the client side (e.g. browsers).

Use OpenSSL's speed command to benchmark the two key sizes and compare results, e.g. openssl speed rsa2048 rsa4096 or openssl speed rsa. Remember, however, that in OpenSSL speed tests you see differences in block cipher speed, while in real life most CPU time is spent on asymmetric algorithms during the SSL handshake. On the other hand, modern processors are capable of executing at least 1k RSA 1024-bit signatures per second on a single core, so this isn't usually an issue.

An alternative solution is an ECC Certificate Signing Request (CSR): ECDSA certificates contain an ECC public key. ECC keys are better than RSA and DSA keys in that the ECC algorithm is harder to break.

The "SSL/TLS Deployment Best Practices" book says:

The cryptographic handshake, which is used to establish secure connections, is an operation whose cost is highly influenced by private key size. Using a key that is too short is insecure, but using a key that is too long will result in "too much" security and slow operation. For most web sites, using RSA keys stronger than 2048 bits and ECDSA keys stronger than 256 bits is a waste of CPU power and might impair user experience. Similarly, there is little benefit to increasing the strength of the ephemeral key exchange beyond 2048 bits for DHE and 256 bits for ECDHE.

Konstantin Ryabitsev (Reddit):

Generally speaking, if we ever find ourselves in a world where 2048-bit keys are no longer good enough, it won't be because of improvements in brute-force capabilities of current computers, but because RSA will be made obsolete as a technology due to revolutionary computing advances. If that ever happens, 3072 or 4096 bits won't make much of a difference anyway. This is why anything above 2048 bits is generally regarded as a sort of feel-good hedging theatre.

My recommendation:

Use a 2048-bit key instead of a 4096-bit one at this moment.

Example
### Example (RSA):
( _fd="domain.com.key" ; _len="2048" ; openssl genrsa -out ${_fd} ${_len} )

# Let's Encrypt:
certbot certonly -d domain.com -d www.domain.com --rsa-key-size 2048

### Example (ECC):
# _curve: prime256v1, secp521r1, secp384r1
( _fd="domain.com.key" ; _fd_csr="domain.com.csr" ; _curve="prime256v1" ; \
openssl ecparam -out ${_fd} -name ${_curve} -genkey ; \
openssl req -new -key ${_fd} -out ${_fd_csr} -sha256 )

# Let's Encrypt (from above):
certbot --csr ${_fd_csr} -[other-args]

For x25519:

( _fd="private.key" ; _curve="x25519" ; \
openssl genpkey -algorithm ${_curve} -out ${_fd} )

  → ssllabs score: 100%

( _fd="domain.com.key" ; _len="2048" ; openssl genrsa -out ${_fd} ${_len} )

# Let's Encrypt:
certbot certonly -d domain.com -d www.domain.com

  → ssllabs score: 90%

External resources

🔰 Keep only TLS 1.3 and TLS 1.2

Rationale

It is recommended to run TLS 1.2/1.3 and fully disable SSLv2, SSLv3, TLS 1.0 and TLS 1.1, which have protocol weaknesses and use older cipher suites (they do not provide any modern cipher modes).

TLS 1.0 and TLS 1.1 must not be used (see Deprecating TLSv1.0 and TLSv1.1) and were superseded by TLS 1.2, which has now itself been superseded by TLS 1.3 (must be included by January 1, 2024). They are also actively being deprecated in accordance with guidance from government agencies (e.g. NIST Special Publication (SP) 800-52 Revision 2) and industry consortia such as the Payment Card Industry Association (PCI) PCI-TLS - Migrating from SSL and Early TLS (Information Supplement).

TLS 1.2 and TLS 1.3 are both without known security issues. Only these versions provide modern cryptographic algorithms. TLS 1.3 is a new TLS version that will power a faster and more secure web for the next few years. What's more, TLS 1.3 comes without a ton of legacy stuff that was removed: renegotiation, compression, and many legacy algorithms such as DSA, RC4, SHA1, MD5, and CBC MAC-then-Encrypt ciphers. TLS 1.0 and TLS 1.1 protocols will be removed from browsers at the beginning of 2020.

TLS 1.2 does require careful configuration to ensure obsolete cipher suites with identified vulnerabilities are not used in conjunction with it. TLS 1.3 removes the need to make these decisions. TLS 1.3 also improves on the security, privacy and performance of TLS 1.2.

Before enabling a specific protocol version, you should check which ciphers are supported by the protocol. So if you turn on both TLS 1.2 and TLS 1.3, remember to configure the correct (and strong) ciphers to handle them. Otherwise they will not work, because without supported ciphers no TLS handshake will succeed.

I think the best way to deploy a secure configuration is to enable TLS 1.2 without any CBC ciphers (which is safe enough); only TLS 1.3 is safer, because of its handling improvements and the exclusion of everything that became obsolete since TLS 1.2 came out.

If you tell NGINX to use TLS 1.3, it will use TLS 1.3 only where it is available. NGINX has supported TLS 1.3 since version 1.13.0 (released in April 2017), when built against OpenSSL 1.1.1 or newer.

For TLS 1.3, think about using ssl_early_data to allow TLS 1.3 0-RTT handshakes.
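
Keep in mind that requests sent in early data can be replayed, so this is a trade-off rather than a free speed-up. A minimal sketch, assuming NGINX 1.15.4 or newer built against OpenSSL 1.1.1; the upstream name backend is hypothetical:

```nginx
server {

  listen 10.240.20.2:443 ssl;

  ssl_protocols  TLSv1.3 TLSv1.2;
  ssl_early_data on;

  location / {

    # Tell the application that the request arrived in early data,
    # so it can reject non-idempotent requests (e.g. with 425 Too Early):
    proxy_set_header Early-Data $ssl_early_data;

    # Hypothetical upstream name:
    proxy_pass       http://backend;

  }

}
```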

My recommendation:

Use only TLSv1.3 and TLSv1.2.

Example

TLS 1.3 + 1.2:

ssl_protocols TLSv1.3 TLSv1.2;

TLS 1.2:

ssl_protocols TLSv1.2;

  → ssllabs score: 100%

TLS 1.3 + 1.2 + 1.1:

ssl_protocols TLSv1.3 TLSv1.2 TLSv1.1;

TLS 1.2 + 1.1:

ssl_protocols TLSv1.2 TLSv1.1;

  → ssllabs score: 95%

External resources

🔰 Use only strong ciphers

Rationale

This parameter changes quite often; the recommended configuration for today may be out of date tomorrow.

To check ciphers supported by OpenSSL on your server: openssl ciphers -s -v, openssl ciphers -s -v ECDHE or openssl ciphers -s -v DHE.

For more security use only strong and not vulnerable cipher suites. Place ECDHE and DHE suites at the top of your list. The order is important: because ECDHE suites are faster, you want to use them whenever clients support them. Ephemeral DHE/ECDHE are recommended and support Perfect Forward Secrecy.
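
The order of the list only matters if the server's preference is enforced. A minimal sketch, assuming ssl_prefer_server_ciphers is not already enabled elsewhere in your configuration; the suites are taken from the lists in this section:

```nginx
# Prefer the server's order over the client's:
ssl_prefer_server_ciphers on;

# ECDHE suites first, DHE as a fallback:
ssl_ciphers "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384";
```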

For backward compatibility with older software components you should use less restrictive ciphers. Not only do you have to enable at least one special AES128 cipher for HTTP/2 support according to RFC7540: TLS 1.2 Cipher Suites, you also have to allow prime256 elliptic curves, which reduces the score for key exchange by another 10% even if a secure server preferred order is set.

Modern cipher suites (e.g. from Mozilla recommendations) also suffer from compatibility troubles, mainly because they drop SHA-1. But be careful if you want to use ciphers with HMAC-SHA-1 - there's a perfectly good explanation why.

If you want to get A+ with 100% scores on SSL Labs (for Cipher Strength) you should definitely disable 128-bit ciphers. That's the main reason why you should not use them.

In my opinion 128-bit symmetric encryption is not less secure. Moreover, 128-bit ciphers are about 30% faster and still secure. For example TLS 1.3 uses TLS_AES_128_GCM_SHA256 (0x1301) (for TLS-compliant applications).

We currently don't have the ability to control TLS 1.3 cipher suites, because this requires NGINX to support the new API. NGINX isn't able to influence that, so at this moment all available TLS 1.3 ciphers are always on (even if you disable potentially weak ciphers in NGINX). On the other hand, the ciphers in TLSv1.3 have been restricted to only a handful of completely secure ciphers by leading crypto experts.
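
On newer NGINX versions this has changed: as far as I know, since 1.19.4 the ssl_conf_command directive can pass the Ciphersuites command to OpenSSL, which sets the TLS 1.3 suites. A sketch under that assumption (the Ciphersuites command also needs OpenSSL 1.1.1 or newer):

```nginx
ssl_protocols    TLSv1.3 TLSv1.2;

# TLS 1.3 suites use their standard names, separated by colons:
ssl_conf_command Ciphersuites TLS_CHACHA20_POLY1305_SHA256:TLS_AES_256_GCM_SHA384:TLS_AES_128_GCM_SHA256;
```

The ssl_ciphers directive still controls the TLS 1.2 suites, so both settings are usually needed side by side.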

For TLS 1.2 you should consider disabling weak ciphers without forward secrecy, such as ciphers with the CBC algorithm. Using them also reduces the final grade because they don't use ephemeral keys. In my opinion you should use ciphers with AEAD encryption (TLS 1.3 supports only these suites) because they don't have any known weaknesses.

Recently, new vulnerabilities like Zombie POODLE, GOLDENDOODLE, 0-Length OpenSSL and Sleeping POODLE were published for websites that use CBC (Cipher Block Chaining) block cipher modes. These vulnerabilities are applicable only if the server uses TLS 1.2, TLS 1.1 or TLS 1.0 with CBC cipher modes. Look at the Zombie POODLE, GOLDENDOODLE, & How TLSv1.3 Can Save Us All presentation from Black Hat Asia 2019.

Disable TLS cipher modes (all ciphers that start with TLS_RSA_WITH_*) that use RSA encryption because they are vulnerable to the ROBOT attack. Not all servers that support RSA key exchange are vulnerable, but it is recommended to disable RSA key exchange ciphers as they do not support forward secrecy.

You should also absolutely disable weak ciphers regardless of the TLS version you use, like those with DSS, DSA, DES/3DES, RC4, MD5, SHA1, null, anon in the name.
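
With OpenSSL cipher strings, the exclusions from the last two paragraphs can be expressed with ! entries. A minimal sketch; the keywords are standard OpenSSL cipher string aliases, but the exact selection is my own, so check the result with openssl ciphers -v on your build:

```nginx
# "!" removes the matching suites permanently from the list:
#   kRSA  - RSA key exchange (no forward secrecy)
#   aNULL - no authentication, eNULL - no encryption
ssl_ciphers "EECDH+AESGCM:EECDH+CHACHA20:EDH+AESGCM:!kRSA:!aNULL:!eNULL:!DSS:!3DES:!RC4:!MD5";
```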

We have a nice online tool for testing the compatibility of cipher suites with user agents: CryptCheck. I think it will be very helpful for you.

My recommendation:

Use only TLSv1.3 and TLSv1.2 with the cipher suites below:

ssl_ciphers "TLS13-CHACHA20-POLY1305-SHA256:TLS13-AES-256-GCM-SHA384:TLS13-AES-128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES128-GCM-SHA256";
Example

Cipher suites for TLS 1.3:

ssl_ciphers "TLS13-CHACHA20-POLY1305-SHA256:TLS13-AES-256-GCM-SHA384";

Cipher suites for TLS 1.2:

ssl_ciphers "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES256-SHA384";

  → ssllabs score: 100%

Cipher suites for TLS 1.3:

ssl_ciphers "TLS13-CHACHA20-POLY1305-SHA256:TLS13-AES-256-GCM-SHA384:TLS13-AES-128-GCM-SHA256";

Cipher suites for TLS 1.2:

# 1)
ssl_ciphers "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES256-SHA384";

# 2)
ssl_ciphers "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES128-GCM-SHA256";

# 3)
ssl_ciphers "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256";

# 4)
ssl_ciphers "EECDH+CHACHA20:EDH+AESGCM:AES256+EECDH:AES256+EDH";

Cipher suites for TLS 1.1 + 1.2:

# 1)
ssl_ciphers "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES128-GCM-SHA256";

# 2)
ssl_ciphers "ECDHE-ECDSA-CHACHA20-POLY1305:ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:!AES256-GCM-SHA256:!AES256-GCM-SHA128:!aNULL:!MD5";

  → ssllabs score: 90%

This will also give a baseline for comparison with Mozilla SSL Configuration Generator:

  • Modern profile with OpenSSL 1.1.0b (TLSv1.2)
ssl_ciphers 'ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256';
  • Intermediate profile with OpenSSL 1.1.0b (TLSv1, TLSv1.1 and TLSv1.2)
ssl_ciphers 'ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS';
External resources

🔰 Use more secure ECDH Curve

Rationale

In my opinion your main source of knowledge should be The SafeCurves web site. This site reports security assessments of various specific curves.

For an SSL server certificate, an "elliptic curve" certificate will be used only with digital signatures (ECDSA algorithm). NGINX provides the ssl_ecdh_curve directive to specify a curve for ECDHE ciphers.

x25519 is a more secure (also with SafeCurves requirements) but slightly less compatible option. I think to maximise interoperability with existing browsers and servers, stick to P-256 prime256v1 and P-384 secp384r1 curves. Of course there's tons of different opinions about P-256 and P-384 curves.

NSA Suite B says that NSA uses curves P-256 and P-384 (in OpenSSL, they are designated as, respectively, prime256v1 and secp384r1). There is nothing wrong with P-521, except that it is, in practice, useless. Arguably, P-384 is also useless, because the more efficient P-256 curve already provides security that cannot be broken through accumulation of computing power.

Bernstein and Lange believe that the NIST curves are not optimal and there are better (more secure) curves that work just as fast, e.g. x25519.

Keep an eye also on this: Secure implementations of the standard curves are theoretically possible but very hard.

The SafeCurves say:

  • NIST P-224, NIST P-256 and NIST P-384 are UNSAFE

Of the curves described here, only x25519 meets all SafeCurves requirements.

I think you can use P-256 to minimise trouble. If you feel that your manhood is threatened by using a 256-bit curve where a 384-bit curve is available, then use P-384: it will increase your computational and network costs.

If you use TLS 1.3 you should enable the prime256v1 curve. Without it, SSL Labs reports the TLS_AES_128_GCM_SHA256 (0x1301) suite as weak.

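If you do not set ssl_ecdh_curve, NGINX uses its default setting (auto since version 1.11.0, which means the curve list built into OpenSSL 1.0.2 or higher, or prime256v1 with older OpenSSL versions). With that default a client such as Chrome will typically prefer x25519. Relying on the default is not recommended, because you cannot control the resulting curve list from the NGINX configuration.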

Explicitly setting ssl_ecdh_curve X25519:prime256v1:secp521r1:secp384r1; decreases the Key Exchange SSL Labs rating.

Definitely do not use the secp112r1, secp112r2, secp128r1, secp128r2, secp160k1, secp160r1, secp160r2, secp192k1 curves. They are too small for security applications according to the NIST recommendation.

My recommendation:

Use only TLSv1.3 and TLSv1.2 and only strong ciphers with above curves:

ssl_ecdh_curve X25519:secp521r1:secp384r1:prime256v1;
Example

Curves for TLS 1.2:

ssl_ecdh_curve secp521r1:secp384r1:prime256v1;

  :arrow_right: ssllabs score: 100%

# Alternative (this one doesn’t affect compatibility, by the way; it’s just a question of the preferred order).

# This setup downgrades the Key Exchange score but is recommended for TLS 1.2 + 1.3:
ssl_ecdh_curve X25519:secp521r1:secp384r1:prime256v1;
External resources

🔰 Use strong Key Exchange with Perfect Forward Secrecy

Rationale

To use signature-based authentication you need some kind of DH exchange (fixed or ephemeral/temporary) to exchange the session key. If you use DHE, the ssl_dhparam directive supplies the Ephemeral Diffie-Hellman (DHE) parameters that define how the Diffie-Hellman (DH) key exchange is performed. Before version 1.11.0, NGINX fell back to built-in parameters when none were configured; these use a weak key (1024 bit) that gets lower scores.

You should always use the Elliptic Curve Diffie Hellman Ephemeral (ECDHE). Due to increasing concern about pervasive surveillance, key exchanges that provide Forward Secrecy are recommended, see for example RFC 7525 - 6.3. Forward Secrecy.

For greater compatibility while keeping the key exchange secure, you should prefer the latter E (ephemeral) over the former E (EC). The recommended order is: ECDHE > DHE (with min. 2048 bit size) > ECDH. With this order, a client that cannot negotiate ECDHE can still fall back to DHE.
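
The preference order above could be expressed roughly as follows. This is only a sketch: the server name, the certificate paths and the exact cipher selection are assumptions made for illustration, and the DH parameters file is assumed to have been generated beforehand with at least 2048 bits.

```nginx
server {

  listen 443 ssl;
  server_name example.com;                               # hypothetical domain

  ssl_certificate     /etc/nginx/ssl/example.com.crt;    # hypothetical paths
  ssl_certificate_key /etc/nginx/ssl/example.com.key;

  ssl_protocols TLSv1.2;

  # Enforce the server-side order: ECDHE suites first, DHE as a fallback.
  ssl_prefer_server_ciphers on;
  ssl_ciphers "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384";

  # Only consulted when a DHE suite is negotiated (min. 2048 bit):
  ssl_dhparam /etc/nginx/ssl/dhparam_2048.pem;

}
```

If the DHE suite is removed from ssl_ciphers, the ssl_dhparam line can be dropped as well.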

DHE is slower than ECDHE. If you are concerned about performance, prioritize ECDHE-ECDSA over DHE. OWASP estimates that the TLS handshake with DHE hinders the CPU by a factor of 2.4 compared to ECDHE.

Diffie-Hellman requires some set-up parameters to begin with. Parameters from ssl_dhparam (which are generated with openssl dhparam ...) define how OpenSSL performs the Diffie-Hellman (DH) key exchange. They include a field prime p and a generator g. These parameters are customisable so that everyone can use their own, which helps to avoid being affected by the Logjam attack.

Modern clients prefer ECDHE over other variants, and if your NGINX accepts this preference then the handshake will not use the DH parameters at all, since it will do an ECDHE key exchange rather than a DHE one. Thus, if no plain DH/DHE ciphers are configured on your server but only elliptic curve DH (e.g. ECDHE), then you don't need to set the ssl_dhparam directive. Enabling DHE requires us to take care of our DH primes (a.k.a. dhparams) and to trust in DHE.

Elliptic curve Diffie-Hellman is a modified Diffie-Hellman exchange which uses Elliptic curve cryptography instead of the traditional RSA-style large primes. So while I'm not sure what parameters it may need (if any), I don't think it needs the kind you're generating (ECDH is based on curves, not primes, so I don't think the traditional DH params will do you any good).

Cipher suites using DHE key exchange in OpenSSL require tmp_DH parameters, which the ssl_dhparam directive provides. The same is true for DH_anon key exchange, but in practice nobody uses those. The OpenSSL wiki page for Diffie Hellman Parameters says: To use perfect forward secrecy cipher suites, you must set up Diffie-Hellman parameters (on the server side). Look also at SSL_CTX_set_tmp_dh_callback.

If you use ECDH/ECDHE key exchange please see Use more secure ECDH Curve rule.

The default key size in OpenSSL is 1024 bits - it's vulnerable and breakable. For the best security configuration use your own DH group (min. 2048 bit) or use one of the known safe pre-defined DH groups from Mozilla (recommended).

A 2048 bit modulus is generally expected to be safe and is already very far into the "cannot break it zone". However, years ago people expected 1024 bit to be safe, so if you are after long term resistance you would go up to 4096 bit (for both RSA keys and DH parameters). It's also important if you want to get 100% on Key Exchange of the SSL Labs test.

You should remember that a 4096 bit modulus will make DH computations slower and won’t actually improve security.

There is good explanation about DH parameters recommended size:

Current recommendations from various bodies (including NIST) call for a 2048-bit modulus for DH. Known DH-breaking algorithms would have a cost so ludicrously high that they could not be run to completion with known Earth-based technology. See this site for pointers on that subject.

You don't want to overdo the size because the computational usage cost rises relatively sharply with prime size (somewhere between quadratic and cubic, depending on some implementation details) but a 2048-bit DH ought to be fine (a basic low-end PC can do several hundreds of 2048-bit DH per second).

Look also at this answer by Matt Palmer:

Indeed, characterising 2048 bit DH parameters as "weak as hell" is quite misleading. There are no known feasible cryptographic attacks against arbitrary strong 2048 bit DH groups. To protect against future disclosure of a session key due to breaking DH, sure, you want your DH parameters to be as long as is practical, but since 1024 bit DH is only just getting feasible, 2048 bits should be OK for most purposes for a while yet.

My recommendation:

If you use only TLS 1.3 - ssl_dhparam is not required (not used). Also, if you use ECDHE/ECDH - ssl_dhparam is not required (not used). If you use DHE/DH - ssl_dhparam with DH parameters is required (min. 2048 bit). By default no parameters are set, and therefore DHE ciphers will not be used.

Example
# To generate DH parameters:
openssl dhparam -out /etc/nginx/ssl/dhparam_4096.pem 4096

# Or, to produce "DSA-like" DH parameters (faster to generate):
openssl dhparam -dsaparam -out /etc/nginx/ssl/dhparam_4096.pem 4096

# NGINX configuration only for DH/DHE (the path must match the file generated above):
ssl_dhparam /etc/nginx/ssl/dhparam_4096.pem;

  :arrow_right: ssllabs score: 100%

# To generate DH parameters:
openssl dhparam -out /etc/nginx/ssl/dhparam_2048.pem 2048

# Or, to produce "DSA-like" DH parameters (faster to generate):
openssl dhparam -dsaparam -out /etc/nginx/ssl/dhparam_2048.pem 2048

# NGINX configuration only for DH/DHE:
ssl_dhparam /etc/nginx/ssl/dhparam_2048.pem;

  :arrow_right: ssllabs score: 90%

External resources

🔰 Prevent Replay Attacks on Zero Round-Trip Time

Rationale

This rule is only important for TLS 1.3. By default, enabling TLS 1.3 will not enable 0-RTT support. Before you enable it, you should be fully aware of all the potential exposure factors and related risks of this option.

0-RTT handshakes are part of the replacement for TLS Session Resumption and were inspired by the QUIC protocol.

0-RTT creates a significant security risk. With 0-RTT, a threat actor can intercept an encrypted client message and resend it to the server, tricking the server into improperly extending trust to the threat actor and thus potentially granting the threat actor access to sensitive data.

On the other hand, including 0-RTT (Zero Round Trip Time Resumption) results in a significant increase in efficiency and connection times. TLS 1.3 has a faster handshake that completes in 1-RTT. Additionally, it has a particular session resumption mode where, under certain conditions, it is possible to send data to the server on the first flight (0-RTT).

For example, Cloudflare only supports 0-RTT for GET requests with no query parameters in an attempt to limit the attack surface. Moreover, to better identify connection resumption attempts, they relay this information to the origin by adding an extra header to 0-RTT requests. This header uniquely identifies the request, so if one gets repeated, the origin will know it's a replay attack (the application needs to track the values received in that header and reject duplicates on non-idempotent endpoints).

To protect against such attacks at the application layer, the $ssl_early_data variable should be used. You'll also need to ensure that the Early-Data header is passed to your application. $ssl_early_data returns 1 if TLS 1.3 early data is used and the handshake is not complete.
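
Besides passing the header to the application, NGINX itself could refuse early data for requests that are not safe to replay. The following is a sketch only, under the assumption that GET and HEAD are the only replay-safe methods in your application; the variable names $unsafe_method and $reject_early_data are made up for this example, and both map blocks belong in the http context.

```nginx
# 1 for every method except the ones assumed to be safe to replay:
map $request_method $unsafe_method {
  default 1;
  GET     0;
  HEAD    0;
}

# $ssl_early_data is "1" for early data and empty otherwise,
# so the combined value is "11" only for early data with an unsafe method:
map "$ssl_early_data$unsafe_method" $reject_early_data {
  default 0;
  "11"    1;
}

server {

  listen 443 ssl;
  server_name example.com;    # hypothetical domain

  ssl_protocols  TLSv1.2 TLSv1.3;
  ssl_early_data on;

  # 425 (Too Early) asks the client to retry after the handshake has completed:
  if ($reject_early_data) {

    return 425;

  }

}
```

This does not replace replay tracking in the application; it only narrows what can arrive as early data.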

However, as part of the upgrade, you should disable 0-RTT until you can audit your application for this class of vulnerability.

In order to send early-data, client and server must support PSK exchange mode (session cookies).

In addition, I would like to recommend this great discussion about TLS 1.3 and 0-RTT.

If you are unsure whether to enable 0-RTT, look at what Cloudflare says about it:

Generally speaking, 0-RTT is safe for most web sites and applications. If your web application does strange things and you’re concerned about its replay safety, consider not using 0-RTT until you can be certain that there are no negative effects. [...] TLS 1.3 is a big step forward for web performance and security. By combining TLS 1.3 with 0-RTT, the performance gains are even more dramatic.

Example

Test 0-RTT with OpenSSL:

# 1)
_host="example.com"

# The empty line before __EOF__ terminates the request headers:
cat > req.in << __EOF__
HEAD / HTTP/1.1
Host: $_host
Connection: close

__EOF__
# or:
# echo -e "GET / HTTP/1.1\r\nHost: $_host\r\nConnection: close\r\n\r\n" > req.in

openssl s_client -connect ${_host}:443 -tls1_3 -sess_out session.pem -ign_eof < req.in
openssl s_client -connect ${_host}:443 -tls1_3 -sess_in session.pem -early_data req.in

# 2)
python -m sslyze --early_data "$_host"

Enable 0-RTT with $ssl_early_data variable:

server {

  ...

  ssl_protocols   TLSv1.2 TLSv1.3;
  # To enable 0-RTT (TLS 1.3):
  ssl_early_data  on;

  location / {

    proxy_pass       http://backend_x20;
    # Lets the application detect early data and protect itself against replays:
    proxy_set_header Early-Data $ssl_early_data;

  }

  ...

}
External resources

🔰 Defend against the BEAST attack

Rationale

Generally the BEAST attack relies on a weakness in the way CBC mode is used in SSL/TLS.

More specifically, to successfully perform the BEAST attack, there are some conditions which need to be met:

  • a vulnerable version of SSL/TLS must be used with a block cipher (CBC mode in particular)
  • JavaScript or a Java applet injection - should be in the same origin of the web site
  • data sniffing of the network connection must be possible

To prevent BEAST attacks you should enable server-side protection, which causes the server ciphers to be preferred over the client ciphers, and completely exclude TLS 1.0 from your protocol stack.

Example
ssl_prefer_server_ciphers on;
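
The rationale above also calls for excluding TLS 1.0, which the single directive does not do on its own. A minimal sketch of both measures together, assuming your NGINX and OpenSSL builds support TLS 1.3 (otherwise keep only TLSv1.2):

```nginx
# TLS 1.0 is excluded; TLS 1.1 is left out as well, in line with the other rules:
ssl_protocols TLSv1.2 TLSv1.3;

# Prefer the server's cipher order over the client's:
ssl_prefer_server_ciphers on;
```
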
External resources

🔰 Mitigation of CRIME/BREACH attacks

Rationale

Disable HTTP compression or compress only zero sensitive content.

You should probably never use TLS compression. Some user agents (at least Chrome) will disable it anyways. Disabling SSL/TLS compression stops the attack very effectively. A deployment of HTTP/2 over TLS 1.2 must disable TLS compression (please see RFC 7540 - 9.2. Use of TLS Features).

CRIME exploits SSL/TLS compression, which has been disabled since nginx 1.3.2. BREACH exploits HTTP compression.

Some attacks are possible (although a real BREACH attack is complicated) because gzip (HTTP compression, not TLS compression) is enabled on SSL requests. In most cases, the best action is to simply disable gzip for SSL.

Compression is not the only requirement for the attack, so using it does not mean that the attack will succeed. Generally you should consider whether having an accidental performance drop on HTTPS sites is better than HTTPS sites being accidentally vulnerable.

You shouldn't use HTTP compression on private responses when using TLS.

I would prioritise security over performance, but I think it can be okay to HTTP compress publicly available static content like CSS or JS, and HTML content with zero sensitive info (like an "About Us" page).
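
One way to express this split is to keep dynamic compression off by default and switch it on only where the content is known to be public. This is a sketch under assumptions: the /assets/ prefix and the list of MIME types are illustrative and need to be adapted to your site.

```nginx
server {

  listen 443 ssl;
  server_name example.com;    # hypothetical domain

  # No dynamic compression for private or dynamic responses:
  gzip off;

  # Hypothetical location that serves only public static assets:
  location /assets/ {

    gzip on;
    # Only text/html is compressed by default; list the other types explicitly:
    gzip_types text/css application/javascript;

  }

}
```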

Remember: by default, NGINX doesn't compress image files using its per-request gzip module.

The gzip static module is better, for two reasons:

  • you don't have to gzip for each request
  • you can use a higher gzip level

You should put the gzip_static on; inside the blocks that configure static files, but if you’re only running one site, it’s safe to just put it in the http block.

Example
# Disable dynamic HTTP compression:
gzip off;

# Enable dynamic HTTP compression for specific location context:
location / {

  gzip on;

  ...

}

# Enable static gzip compression:
location ^~ /assets/ {

  gzip_static on;

  ...

}
External resources

🔰 HTTP Strict Transport Security

Rationale

Generally HSTS is a way for websites to tell browsers that the connection should only ever be encrypted. This prevents MITM attacks, downgrade attacks, sending plain text cookies and session ids.

The header indicates for how long a browser should unconditionally refuse to take part in an unsecured HTTP connection for a specific domain.

When a browser knows that a domain has enabled HSTS, it does two things:

  • always uses an https:// connection, even when clicking on an http:// link or after typing a domain into the location bar without specifying a protocol
  • removes the ability for users to click through warnings about invalid certificates

I recommend setting the max-age to a big value like 31536000 (12 months) or 63072000 (24 months).

There are a few simple best practices for HSTS (from The Importance of a Proper HTTP Strict Transport Security Implementation on Your Web Server):

  • The strongest protection is to ensure that all requested resources use only TLS with a well-formed HSTS header. Qualys recommends providing an HSTS header on all HTTPS resources in the target domain

  • It is advisable to assign the max-age directive’s value to be greater than 10368000 seconds (120 days) and ideally to 31536000 (one year). Websites should aim to ramp up the max-age value to ensure heightened security for a long duration for the current domain and/or subdomains

  • RFC 6797, section 14.4 advocates that a web application must aim to add the includeSubDomains directive in the policy definition whenever possible. The directive’s presence ensures the HSTS policy is applied to the domain of the issuing host and all of its subdomains, e.g. example.com and www.example.com

  • The application should never send an HSTS header over a plaintext HTTP connection, as doing so makes the connection vulnerable to SSL stripping attacks

  • It is not recommended to provide an HSTS policy via the http-equiv attribute of a meta tag. According to HSTS RFC 6797, user agents don’t heed the http-equiv="Strict-Transport-Security" attribute on <meta> elements in the received content

To meet the HSTS preload list standard a root domain needs to return a strict-transport-security header that includes both the includeSubDomains and preload directives and has a minimum max-age of one year. Your site must also serve a valid SSL certificate on the root domain and all subdomains, as well as redirect all HTTP requests to HTTPS on the same host.

You had better be pretty sure that your website is indeed all HTTPS before you turn this on because HSTS adds complexity to your rollback strategy. Google recommends enabling HSTS this way:

  1. Roll out your HTTPS pages without HSTS first
  2. Start sending HSTS headers with a short max-age. Monitor your traffic both from users and other clients, and also dependents' performance, such as ads
  3. Slowly increase the HSTS max-age
  4. If HSTS doesn't affect your users and search engines negatively, you can, if you wish, ask your site to be added to the HSTS preload list used by most major browsers
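
The staged rollout above might look roughly like this in the configuration. The concrete max-age values for the intermediate steps are assumptions chosen for illustration; only one of the headers should be active at a time.

```nginx
# Step 2: start with a short max-age (here: 5 minutes) and monitor traffic:
add_header Strict-Transport-Security "max-age=300" always;

# Step 3: increase it gradually, e.g. one week, then 30 days:
# add_header Strict-Transport-Security "max-age=604800" always;
# add_header Strict-Transport-Security "max-age=2592000" always;

# Step 4: only when everything works, switch to a preload-ready policy:
# add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;
```
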
Example
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains" always;

  :arrow_right: ssllabs score: A+

External resources

🔰 Reduce XSS risks (Content-Security-Policy)

Rationale

CSP reduces the risk and impact of a wide range of attacks, including cross-site scripting and other cross-site injections, in modern browsers. It is a good defence-in-depth measure that makes exploitation of an accidental lapse less likely.

The inclusion of CSP policies significantly impedes successful XSS attacks, UI Redressing (Clickjacking), malicious use of frames or CSS injections.

Whitelisting known-good resource origins, refusing to execute potentially dangerous inline scripts, and banning the use of eval are all effective mechanisms for mitigating cross-site scripting attacks.

The default policy that starts building a header is: block everything. By modifying the CSP value, the programmer loosens restrictions for specific groups of resources (e.g. separately for scripts, images, etc.).

You should approach this very individually and never deploy sample CSP values found on the Internet or anywhere else. Blindly deploying "standard/recommended" versions of the CSP header will break most web apps.

Before enabling this header, you should discuss the CSP parameters with developers and application architects. They will probably have to update the web application to remove any inline scripts and styles, and make some additional modifications there.

Strict policies will significantly increase security, and higher code quality will reduce the overall number of errors. CSP can never replace secure code - new restrictions help reduce the effects of attacks (such as XSS), but they are not mechanisms to prevent them!

You should always validate a CSP before implementing it.

If you generate a policy with a tool, remember that these types of tools may become outdated or have errors.
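
One way to validate a policy before enforcing it is to send it in report-only mode first, so that browsers report violations without blocking anything. This is a sketch, not a recommended policy: the directive values mirror the example in this rule, and /csp-report is a hypothetical endpoint that your application would have to provide.

```nginx
# Report-only mode: violations are reported but nothing is blocked.
# /csp-report is a hypothetical endpoint implemented by the application.
add_header Content-Security-Policy-Report-Only "default-src 'none'; script-src 'self'; connect-src 'self'; img-src 'self'; style-src 'self'; report-uri /csp-report" always;
```

Once the reports stop showing legitimate resources being flagged, the same value can be moved to the enforcing Content-Security-Policy header.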

Example
# This policy allows images, scripts, AJAX, and CSS from the same origin, and does not allow any other resources to load:
add_header Content-Security-Policy "default-src 'none'; script-src 'self'; connect-src 'self'; img-src 'self'; style-src 'self';" always;
External resources

🔰 Control the behaviour of the Referer header (Referrer-Policy)

Rationale

Referrer policy deals with what information (related to the URL) the browser ships to a server when it retrieves an external resource.

Basically this is a privacy enhancement: it lets you hide from the owner of a linked domain the fact that the user came from your website.

I think the most secure value is no-referrer which specifies that no referrer information is to be sent along with requests made from a particular request client to any origin. The header will be omitted entirely.

The use of no-referrer has its advantages because it hides the Referer HTTP header, which increases online privacy and the security of the users themselves. On the other hand, it mainly affects analytics (in theory, it should not have any SEO impact) because no-referrer hides exactly that kind of information.
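
If hiding the referrer completely hurts your analytics too much, a less strict value is a possible compromise. The choice of strict-origin-when-cross-origin below is an assumption for illustration, not a recommendation from this rule:

```nginx
# Same-origin requests get the full URL, cross-origin requests get only the origin,
# and nothing is sent when going from HTTPS to HTTP:
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
```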

Mozilla has a good table explaining how each of the referrer policy options works. It comes from Mozilla's reference documentation about Referrer-Policy.

Example
# This policy does not send information about the referring site after clicking the link:
add_header Referrer-Policy "no-referrer";
External resources

🔰 Provide clickjacking protection (X-Frame-Options)

Rationale

It helps to protect your visitors against clickjacking attacks by declaring a policy on whether your application may be embedded in other (external) pages using frames.

It is recommended that you use the X-Frame-Options header on pages which should not be rendered inside a frame.

This header allows 3 parameters, but in my opinion you should consider only two: a deny parameter to disallow embedding the resource in general or a sameorigin parameter to allow embedding the resource on the same host/origin.

It has a lower priority than CSP but in my opinion it is worth using as a fallback.
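
A sketch of using both mechanisms side by side, with the CSP frame-ancestors directive as the primary control and X-Frame-Options as the fallback. It assumes that no other Content-Security-Policy header is set at this level; if one is, add frame-ancestors to that existing policy instead of sending a second header.

```nginx
# Legacy header, kept as a fallback for older browsers:
add_header X-Frame-Options "SAMEORIGIN" always;

# CSP counterpart; browsers that support frame-ancestors use it in preference to X-Frame-Options:
add_header Content-Security-Policy "frame-ancestors 'self'" always;
```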

Example
# Only pages from the same domain can "frame" this URL:
add_header X-Frame-Options "SAMEORIGIN" always;
External resources

🔰 Prevent some categories of XSS attacks (X-XSS-Protection)

Rationale

Enable the cross-site scripting (XSS) filter built into modern web browsers.

It's usually enabled by default anyway, so the role of this header is to re-enable the filter for this particular website if it was disabled by the user.

I think you can set this header without consulting web application architects about its value, but all well-written apps should emit the header X-XSS-Protection: 0 and just forget about this feature. If you want the extra security that better user agents can provide, use a strict Content-Security-Policy header. There is an exact answer by Mikko Rantalainen.
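
If you follow that advice and rely on a strict Content-Security-Policy instead, the filter can be switched off explicitly. A minimal sketch of that alternative:

```nginx
# Explicitly disable the legacy XSS filter (use together with a strict CSP):
add_header X-XSS-Protection "0" always;
```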

Example
add_header X-XSS-Protection "1; mode=block" always;
External resources

🔰 Prevent Sniff Mimetype middleware (X-Content-Type-Options)

Rationale

It prevents the browser from doing MIME-type sniffing.

Setting this header will prevent the browser from interpreting files as something else than declared by the content type in the HTTP headers.

Example
# Disallow content sniffing:
add_header X-Content-Type-Options "nosniff" always;
External resources

🔰 Deny the use of browser features (Feature-Policy)

Rationale

This header protects your site from third parties using APIs that have security and privacy implications, and also from your own team adding outdated APIs or poorly optimised images.

Example
add_header Feature-Policy "geolocation 'none'; midi 'none'; notifications 'none'; push 'none'; sync-xhr 'none'; microphone 'none'; camera 'none'; magnetometer 'none'; gyroscope 'none'; speaker 'none'; vibrate 'none'; fullscreen 'none'; payment 'none'; usb 'none';";
External resources

🔰 Reject unsafe HTTP methods

Rationale

An ordinary web server supports the HEAD, GET and POST methods to retrieve static and dynamic content. Other (e.g. OPTIONS, TRACE) methods should not be supported on public web servers, as they increase the attack surface.

Example
add_header Allow "GET, POST, HEAD" always;

if ($request_method !~ ^(GET|POST|HEAD)$) {

  return 405;

}
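
An alternative sketch for a single location uses limit_except instead of if. Note the differences: it is only valid inside a location block, allowing GET also allows HEAD, and rejected methods get a 403 rather than a 405.

```nginx
location / {

  # GET also implies HEAD; any other method is rejected with 403:
  limit_except GET POST {

    deny all;

  }

}
```
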
External resources

🔰 Prevent caching of sensitive data

Rationale

This policy should be implemented by the application architect, however, I know from experience that this does not always happen.

Do not cache or persist sensitive data. As browsers have different default behaviour for caching HTTPS content, pages containing sensitive information should include a Cache-Control header to ensure that the contents are not cached.

One option is to add anticaching headers to relevant HTTP/1.1 and HTTP/2 responses, e.g. Cache-Control: no-cache, no-store and Expires: 0.

To cover various browser implementations the full set of headers to prevent content being cached should be:

Cache-Control: no-cache, no-store, private, must-revalidate, max-age=0, no-transform
Pragma: no-cache
Expires: 0
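
In NGINX the full set could be emitted as follows. This is a sketch: /account/ is a hypothetical location standing in for whatever part of your application returns sensitive data.

```nginx
# Hypothetical location that returns sensitive data:
location /account/ {

  add_header Cache-Control "no-cache, no-store, private, must-revalidate, max-age=0, no-transform" always;
  add_header Pragma "no-cache" always;
  add_header Expires "0" always;

}
```

Keep in mind that add_header directives are inherited from the outer level only if the current level defines none of its own, so headers set in the server block need to be repeated inside such a location.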

Example
location /api {

  expires 0;
  add_header Cache-Control "no-cache, no-store";

}
External resources

🔰 Control Buffer Overflow attacks

Rationale

Buffer overflow attacks are made possible by writing data to a buffer beyond that buffer's boundary and overwriting memory fragments of a process. To prevent this in NGINX we can set buffer size limitations for all clients.

Example
client_body_buffer_size 100k;
client_header_buffer_size 1k;
client_max_body_size 100k;
large_client_header_buffers 2 1k;
External resources

🔰 Mitigating Slow HTTP DoS attacks (Closing Slow Connections)

Rationale

You can close connections that are writing data too infrequently, which can represent an attempt to keep connections open as long as possible (thus reducing the server’s ability to accept new connections).

Example
client_body_timeout 10s;
client_header_timeout 10s;
keepalive_timeout 5s 5s;
send_timeout 10s;
External resources

Reverse Proxy

One of the frequent uses of NGINX is setting it up as a proxy server that can offload many of the infrastructure concerns of a high-volume distributed web application.

🔰 Use pass directive compatible with backend protocol

Rationale

All proxy_* directives are related to the backends that use the specific backend protocol.

You should use proxy_pass only for HTTP servers working on the backend layer (also set the http:// protocol before referencing the HTTP backend) and other *_pass directives only for non-HTTP backend servers (like uWSGI or FastCGI).

Directives such as uwsgi_pass, fastcgi_pass, or scgi_pass are designed specifically for non-HTTP apps, and you should use them instead of proxy_pass when the backend does not speak HTTP.

For example: uwsgi_pass uses the uwsgi protocol, while proxy_pass uses normal HTTP to talk to the uWSGI server. The uWSGI docs claim that the uwsgi protocol is better, faster and can benefit from all of uWSGI's special features. You can tell uWSGI what type of data you are sending and which uWSGI plugin should be invoked to generate the response. With HTTP (proxy_pass) you won't get that.

Example

Bad configuration:

server {

  location /app/ {

    # For this, you should use uwsgi_pass directive.
    proxy_pass      192.168.154.102:4000;         # backend layer: uWSGI Python app.

  }

  ...

}

Good configuration:

server {

  location /app/ {

    proxy_pass      http://192.168.154.102:80;    # backend layer: OpenResty as a front for app.

  }

  location /app/v3 {

    uwsgi_pass      192.168.154.102:8080;         # backend layer: uWSGI Python app.

  }

  location /app/v4 {

    fastcgi_pass    192.168.154.102:8081;         # backend layer: php-fpm app.

  }
  ...

}
External resources

🔰 Be careful with trailing slashes in proxy_pass directive

Rationale

NGINX replaces the matched part literally and you could end up with a strange URL.

If proxy_pass is used without a URI (i.e. without a path after server:port), NGINX will pass the URI from the original request exactly as it was, with all double slashes, ../ and so on.

A URI in proxy_pass acts like the alias directive: NGINX will replace the part that matches the location prefix with the URI in the proxy_pass directive (which I intentionally made the same as the location prefix), so the URI will be the same as requested but normalized (without double slashes and all that stuff).
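
The literal replacement is easiest to see side by side. In this sketch the location prefixes and the /v1 path are made up for illustration; the comments show what the backend receives for a request ending in x.

```nginx
location /app/ {

  # No URI part: the request URI is passed unchanged.
  proxy_pass http://127.0.0.1:8080;        # /app/x -> /app/x

}

location /api/ {

  # URI part "/": the matched prefix /api/ is replaced by /.
  proxy_pass http://127.0.0.1:8080/;       # /api/x -> /x

}

location /svc/ {

  # URI part without trailing slash: /svc/ is replaced by /v1 literally.
  proxy_pass http://127.0.0.1:8080/v1;     # /svc/x -> /v1x

}

location /img/ {

  # URI part with trailing slash: /img/ is replaced by /v1/.
  proxy_pass http://127.0.0.1:8080/v1/;    # /img/x -> /v1/x

}
```

The third case is the usual source of strange URLs: the trailing slash of the location prefix is consumed but not put back.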

Example
location = /a {

  proxy_pass http://127.0.0.1:8080/a;

  ...

}

location ^~ /a/ {

  proxy_pass http://127.0.0.1:8080/a/;

  ...

}
External resources

🔰 Set and pass Host header only with $host variable

Rationale

You should almost always use $host as an incoming host variable, because it's the only one guaranteed to have something sensible regardless of how the user-agent behaves, unless you specifically need the semantics of one of the other variables.

$host is simply $http_host with some processing (stripping port number and lowercasing) and a default value (of the server_name), so there's no less "exposure" to the Host header sent by the client when using $http_host. There's no danger in this though.

The variable $host is the host name from the request line or the http header. The variable $server_name is the name of the server block we are in right now.

The difference is explained in the NGINX documentation:

  • $host contains "in this order of precedence: host name from the request line, or host name from the Host request header field, or the server name matching a request"
  • $http_host contains the content of the HTTP Host header field, if it was present in the request (equals always the HTTP_HOST request header)
  • $server_name contains the server_name of the virtual host which processed the request, as it was defined in the NGINX configuration. If a server contains multiple server names, only the first one will be present in this variable

$http_host, moreover, is better than $host:$server_port because it uses the port as present in the URL, unlike $server_port which uses the port that NGINX listens on.

Example
proxy_set_header    Host    $host;
External resources

🔰 Set properly values of the X-Forwarded-For header

Rationale

In light of the latest httpoxy vulnerabilities, there is really a need for a full example of how to use HTTP_X_FORWARDED_FOR properly. In short, the load balancer sets the 'most recent' part of the header. In my opinion, for security reasons, the proxy servers must be specified by the administrator manually.

X-Forwarded-For is the custom HTTP header that carries along the original IP address of a client so the app at the other end knows what it is. Otherwise it would only see the proxy IP address, and that makes some apps angry.

The X-Forwarded-For depends on the proxy server, which should actually pass the IP address of the client connecting to it. Where a connection passes through a chain of proxy servers, X-Forwarded-For can give a comma-separated list of IP addresses with the first being the furthest downstream (that is, the user). Because of this, servers behind proxy servers need to know which of them are trustworthy.

The proxy used can set this header to anything it wants to, and therefore you can't trust its value. Most proxies do set the correct value though. This header is mostly used by caching proxies, and in those cases you're in control of the proxy and can thus verify that it gives you the correct information. In all other cases its value should be considered untrustworthy.

Some systems also use X-Forwarded-For to enforce access control. A good number of applications rely on knowing the actual IP address of a client to help prevent fraud and enable access.

Value of the X-Forwarded-For header field can be set at the client's side - this can also be termed as X-Forwarded-For spoofing. However, when the web request is made via a proxy server, the proxy server modifies the X-Forwarded-For field by appending the IP address of the client (user). This will result in 2 comma separated IP addresses in the X-Forwarded-For field.

A reverse proxy is not source IP address transparent. This is a pain when you need the client source IP address to be correct in the logs of the backend servers. I think the best solution of this problem is configure the load balancer to add/modify an X-Forwarded-For header with the source IP of the client and forward it to the backend in the correct form.

Unfortunately, we are not able to solve this problem on the proxy side (every solution can be spoofed), so it is important that this header is correctly interpreted by application servers. Doing so ensures that the apps or downstream services have accurate information on which to make their decisions, including those regarding access and authorization.

There is also an interesting idea about what to do in this situation:

To prevent this we must distrust that header by default and follow the IP address breadcrumbs backwards from our server. First we need to make sure the REMOTE_ADDR is someone we trust to have appended a proper value to the end of X-Forwarded-For. If so then we need to make sure we trust the X-Forwarded-For IP to have appended the proper IP before it, so on and so forth. Until, finally we get to an IP we don’t trust and at that point we have to assume that’s the IP of our user. - it comes from Proxies & IP Spoofing by Xiao Yu.
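
The backwards walk described in the quote is close to what NGINX's ngx_http_realip_module does when real_ip_recursive is enabled. Below is a minimal sketch, assuming the module is compiled in (--with-http_realip_module) and that the addresses shown, which are placeholders and not part of any real setup, belong to proxies you operate yourself:

```nginx
# Valid in the http, server, or location context.
# Placeholder addresses: list only the proxies you operate and trust.
set_real_ip_from  10.240.20.0/24;
set_real_ip_from  192.168.252.10;

# Take the client address from the X-Forwarded-For request header,
# but only when the connecting peer is one of the trusted addresses above.
real_ip_header    X-Forwarded-For;

# Walk the header from right to left, skipping trusted addresses;
# the last non-trusted address is used as the client address.
real_ip_recursive on;
```

With this in place, $remote_addr holds the last non-trusted address from the header, while $realip_remote_addr keeps the address of the directly connected peer.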

Example
# $proxy_add_x_forwarded_for exists precisely to do the appending:
proxy_set_header    X-Forwarded-For    $proxy_add_x_forwarded_for;
# Roughly equivalent to the above when the request already carries X-Forwarded-For
# (if the header is absent, this form leaves a leading comma):
proxy_set_header    X-Forwarded-For    $http_x_forwarded_for,$remote_addr;
# Same idea with http_realip_module, where $remote_addr may have been rewritten
# and $realip_remote_addr keeps the original peer address:
proxy_set_header    X-Forwarded-For    "$http_x_forwarded_for, $realip_remote_addr";
External resources

🔰 Don't use X-Forwarded-Proto with $scheme behind reverse proxy

Rationale

X-Forwarded-Proto can be set by the reverse proxy to tell the app whether it is HTTPS or HTTP or even an invalid name.

The $scheme variable (i.e. http or https) is evaluated only on demand and reflects only the current request.

Using $scheme distorts the value when a request passes through more than one proxy. For example, if the client goes to https://example.com, the first proxy sees the scheme as HTTPS. If the communication between that proxy and the next-level proxy takes place over HTTP, then the next-level proxy sees the scheme as HTTP. So if you set X-Forwarded-Proto to $scheme on the next-level proxy, the app will see a different value than the one the client came with.

To resolve this problem you can also use this configuration snippet.

Example
# 1) client <-> proxy <-> backend
proxy_set_header    X-Forwarded-Proto  $scheme;

# 2) client <-> proxy <-> proxy <-> backend
# proxy_set_header  X-Forwarded-Proto  https;
proxy_set_header    X-Forwarded-Proto  $proxy_x_forwarded_proto;
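
Note that $proxy_x_forwarded_proto is not a built-in NGINX variable, so the second case only works if the variable is defined somewhere. One common way to do that is a map block; the following is a sketch under that assumption, not necessarily the snippet the text refers to:

```nginx
# Must be placed in the http context.
# Reuse the X-Forwarded-Proto value received from the previous proxy,
# and fall back to the scheme of the current connection when it is missing.
map $http_x_forwarded_proto $proxy_x_forwarded_proto {
  default $http_x_forwarded_proto;
  ''      $scheme;
}
```

This should be used only on an inner proxy that receives traffic exclusively from proxies you control, because a client connecting directly could send any X-Forwarded-Proto value it likes.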
External resources

🔰 Always pass Host, X-Real-IP, and X-Forwarded headers to the backend

Rationale

When using NGINX as a reverse proxy you may want to pass some information about the remote client through to your backend web server. I think it's good practice because it gives you more control over the forwarded headers.

It's very important for servers behind a proxy because it allows them to interpret the client correctly. Proxies are the "eyes" of such servers, so they should not allow a distorted perception of reality. If not all requests pass through a proxy, requests received directly from clients may contain, for example, inaccurate IP addresses in headers.

X-Forwarded headers are also important for statistics and filtering. Another example is access control rules in your app: without these headers, the filtering mechanism may not work properly.

If you use a front-end service like Apache or whatever else as the front-end to your APIs, you will need these headers to understand what IP or hostname was used to connect to the API.

Forwarding these headers is also important if you use the https protocol (it has become a standard nowadays).

However, I would not rely on either the presence of all X-Forwarded headers, or the validity of their data.

Example
location / {

  proxy_pass          http://bk_upstream_01;

  # The following headers also should pass to the backend:
  #   - Host - host name from the request line, or host name from the Host request header field, or the server name matching a request
  # proxy_set_header  Host               $host:$server_port;
  # proxy_set_header  Host               $http_host;
  proxy_set_header    Host               $host;

  #   - X-Real-IP - forwards the real visitor remote IP address to the proxied server
  proxy_set_header    X-Real-IP          $remote_addr;

  # X-Forwarded headers stack:
  #   - X-Forwarded-For - mark origin IP of client connecting to server through proxy
  # proxy_set_header  X-Forwarded-For    $remote_addr;
  # proxy_set_header  X-Forwarded-For    $http_x_forwarded_for,$remote_addr;
  # proxy_set_header  X-Forwarded-For    "$http_x_forwarded_for, $realip_remote_addr";
  proxy_set_header    X-Forwarded-For    $proxy_add_x_forwarded_for;

  #   - X-Forwarded-Host - mark origin host of client connecting to server through proxy
  # proxy_set_header  X-Forwarded-Host   $host:443;
  proxy_set_header    X-Forwarded-Host   $host:$server_port;

  #   - X-Forwarded-Server - the hostname of the proxy server
  proxy_set_header    X-Forwarded-Server $host;

  #   - X-Forwarded-Port - defines the original port requested by the client
  # proxy_set_header  X-Forwarded-Port   443;
  proxy_set_header    X-Forwarded-Port   $server_port;

  #   - X-Forwarded-Proto - mark protocol of client connecting to server through proxy
  # proxy_set_header  X-Forwarded-Proto  https;
  # proxy_set_header  X-Forwarded-Proto  $proxy_x_forwarded_proto;
  proxy_set_header    X-Forwarded-Proto  $scheme;

}
External resources

🔰 Use custom headers without X- prefix

Rationale

The Internet Engineering Task Force published RFC 6648, which recommends deprecating the X- prefix.

The X- in front of a header name customarily has denoted it as experimental/non-standard/vendor-specific. Once it's a standard part of HTTP, it'll lose the prefix.

If it's possible for a new custom header to be standardized, use an unused and meaningful header name.

The use of custom headers with the X- prefix is not forbidden but discouraged. In other words, you can keep using X- prefixed headers, but it's not recommended and you should not document them as if they were a public standard.

Example

Not recommended configuration:

add_header X-Backend-Server $hostname;

Recommended configuration:

add_header Backend-Server   $hostname;
External resources

Load Balancing

Load balancing is a useful mechanism to distribute incoming traffic across several capable servers. A few rules can improve how NGINX works as a load balancer.

🔰 Tweak passive health checks

Rationale

Health monitoring is important for all types of load balancing, mainly for business continuity. Passive checks watch for failed or timed-out connections as client requests pass through NGINX.

This functionality is enabled by default but the parameters mentioned here allow you to tweak their behaviour. Default values are: max_fails=1 and fail_timeout=10s.

Example
upstream backend {

  server bk01_node:80 max_fails=3 fail_timeout=5s;
  server bk02_node:80 max_fails=3 fail_timeout=5s;

}
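
Which failures count towards max_fails is defined by proxy_next_upstream (or its fastcgi, uwsgi, scgi, memcached, and grpc counterparts). The sketch below shows how the two pieces fit together; the timeout and retry values are illustrative assumptions, not recommendations from this handbook:

```nginx
upstream backend {

  # After 3 unsuccessful attempts within 5s the server is considered
  # unavailable for the next 5s.
  server bk01_node:80 max_fails=3 fail_timeout=5s;
  server bk02_node:80 max_fails=3 fail_timeout=5s;

}

server {

  listen 80;

  location / {

    proxy_pass http://backend;

    # Illustrative value: give up on a connection attempt after 2s.
    proxy_connect_timeout 2s;

    # error, timeout, and invalid_header always count as unsuccessful attempts;
    # the http_5xx codes count only because they are listed here.
    proxy_next_upstream error timeout http_502 http_503 http_504;

    # Illustrative value: try at most 2 servers per request.
    proxy_next_upstream_tries 2;

  }

}
```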
External resources

🔰 Don't disable backends by comments, use down parameter

Rationale

Sometimes we need to turn off backends, e.g. during maintenance. I think a good solution is to mark the server as permanently unavailable with the down parameter, even if the downtime is short.

It's also important if you use the IP Hash load balancing technique. If one of the servers needs to be temporarily removed, it should be marked with this parameter in order to preserve the current hashing of client IP addresses.

Comments are good for servers that are disabled permanently or if you want to leave information for historical purposes.

NGINX also provides a backup parameter which marks the server as a backup server. It will be passed requests when the primary servers are unavailable. I use this option rarely for the above purposes and only if I am sure that the backends will work at the maintenance time.

Example
upstream backend {

  server bk01_node:80 max_fails=3 fail_timeout=5s down;
  server bk02_node:80 max_fails=3 fail_timeout=5s;

}
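
The two cases from the rationale can be sketched as follows. The upstream names and bk03_node are hypothetical and used only for illustration. Note that backup cannot be combined with the hash, ip_hash, or random balancing methods, which is why it gets a separate upstream here:

```nginx
# IP Hash: keep the server in the list and mark it down, so the mapping
# of client addresses to the remaining servers stays the same.
upstream backend_iphash {

  ip_hash;

  server bk01_node:80 down;
  server bk02_node:80;
  server bk03_node:80;

}

# Default (round-robin) balancing with a backup server that receives
# requests only when the primary servers are unavailable.
upstream backend_with_backup {

  server bk01_node:80 max_fails=3 fail_timeout=5s;
  server bk02_node:80 max_fails=3 fail_timeout=5s;
  server bk03_node:80 backup;

}
```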
External resources

Others

These rules aren't strictly related to NGINX, but in my opinion they're also a very important aspect of security.

🔰 Enable DNS CAA Policy

Rationale

A DNS CAA policy helps you control which Certificate Authorities are allowed to issue certificates for your domain. If no CAA record is present, any CA is allowed to issue a certificate for the domain.

Example

Generic configuration (Google Cloud DNS, Route 53, OVH, and other hosted services) for Let's Encrypt:

example.com. CAA 0 issue "letsencrypt.org"

Standard Zone File (BIND, PowerDNS and Knot DNS) for Let's Encrypt:

example.com. IN CAA 0 issue "letsencrypt.org"
External resources

🔰 Define security policies with security.txt

Rationale

The main purpose of security.txt is to help make things easier for companies and security researchers when trying to secure platforms. It also provides information to assist in disclosing security vulnerabilities.

When security researchers detect potential vulnerabilities in a page or application, they will try to contact someone "appropriate" to "responsibly" reveal the problem. It's worth taking care of getting to the right address.

This file should be placed under the /.well-known/ path, e.g. /.well-known/security.txt (RFC5785) of a domain name or IP address for web properties.

Example
curl -ks https://example.com/.well-known/security.txt

Contact: security@example.com
Contact: +1-209-123-0123
Encryption: https://example.com/pgp.txt
Preferred-Languages: en
Canonical: https://example.com/.well-known/security.txt
Policy: https://example.com/security-policy.html

And from Google:

curl -ks https://www.google.com/.well-known/security.txt

Contact: https://g.co/vulnz
Contact: mailto:security@google.com
Encryption: https://services.google.com/corporate/publickey.txt
Acknowledgements: https://bughunter.withgoogle.com/
Policy: https://g.co/vrp
Hiring: https://g.co/SecurityPrivacyEngJobs
# Flag: BountyCon{075e1e5eef2bc8d49bfe4a27cd17f0bf4b2b85cf}
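
On the NGINX side, serving the file can be as simple as an exact-match location. This is a minimal sketch to be placed inside an existing server block; the file path is a hypothetical placeholder:

```nginx
# Exact match, so no other location can shadow the well-known path.
location = /.well-known/security.txt {

  # Hypothetical path: point this at wherever you keep the file.
  alias        /var/www/security/security.txt;

  # Fallback in case the .txt extension is not mapped in mime.types.
  default_type text/plain;

}
```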
External resources