1. General syntax of the nginx configuration file

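The set of modules in an nginx binary is fixed at build time, and every module provides its own unique configuration directives. All of these directives follow the same syntax rules. The main rules are listed below: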

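nginx configuration syntax:

The configuration file consists of directives and directive blocks
Each directive ends with a semicolon (;); the directive and its parameters are separated by whitespace
A directive block groups several directives inside braces {}
The include directive combines multiple configuration files to improve maintainability
The # sign starts a comment, which improves readability
The $ sign references a variable
The parameters of some directives support regular expressions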

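Configuration parameters: time units
ms: milliseconds    d: days
s:  seconds         w: weeks
m:  minutes         M: months (30 days)
h:  hours           y: years (365 days)

Configuration parameters: size units
default: bytes
k/K: kilobytes
m/M: megabytes
g/G: gigabytes

Directive blocks of the HTTP configuration: http, server (corresponds to one or more domain names), upstream, location (a URL expression)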
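
The syntax rules above can be seen together in one small file. This is only a sketch: the upstream name, ports, domain and the /user/ location are made up for illustration, and the sizes and timeouts are arbitrary.

```nginx
# Illustrative nginx.conf; every name and value here is an assumption.
worker_processes 1;

events {
    worker_connections 1024;
}

http {
    include              mime.types;   # include: pull in another file
    client_max_body_size 8m;           # size parameter: 8 megabytes
    keepalive_timeout    1m;           # time parameter: 1 minute

    upstream backend {                 # directive block with a name
        server 127.0.0.1:8081;
    }

    server {
        listen      8080;
        server_name example.com;

        # regular-expression parameter; $1 is the first capture group
        location ~ ^/user/(\d+)$ {
            return 200 "user id: $1\n";
        }

        location / {
            proxy_pass http://backend;
        }
    }
}
```

Running nginx -t -c with the path of such a file checks it against these syntax rules without starting the server.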

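1.1 Which value wins when configuration directives conflict?

A directive provided by an HTTP module can often appear in several contexts: in location, in server, in http, and even in blocks such as if. When a directive appears in more than one block, its values may conflict. Which one takes precedence?

Nesting of configuration blocks: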

http {
    upstream {...}
    split_clients {...}
    map {...}
    geo {...}
    server {
        if () {...}
        location {
            limit_except {...}
        }
        location {
            location {
            }
        }
    }
    server {
    }
}

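When a directive appears in several blocks at once, its values can be merged, but not every directive can be merged. Here is the overall rule for directive merging.

  Value directives: store a configuration value | Action directives: specify a behavior
<------------------------------------------------------------------------------------->
   * can be merged                              |  * cannot be merged
   * examples                                   |  * examples
      * root                                    |        * rewrite
      * access_log                              |        * proxy_pass
      * gzip                                    |  * phases where they take effect
                                                |        * server_rewrite phase
                                                |        * rewrite phase
                                                |        * content phase

Inheritance rule for value directives: the child overrides the parent.
When the child block does not set the directive, the parent block's value is used.
When the child block sets the directive, its value overrides the parent block's.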

server {
    listen 8080;
    root /home/geek/nginx/html;
    access_log logs/geek.access.log main;
    location /test {
        # the child block sets the directive, so it overrides the parent block
        root /home/geek/nginx/test;
        access_log logs/access.test.log main;
    }
    location /dlib {
        alias dlib/;
    }
    location / {
        # the child block does not set the directive, so the parent's value is used;
        # the root defined under server therefore takes effect here
    }
}

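1.2 How HTTP modules implement configuration merging

For third-party modules that do not follow the rules above and whose documentation is not detailed, we have to read the source code to determine which value wins when their value directives conflict. What should we look for in the source?

In which block does the directive take effect?
In which blocks is the directive allowed to appear?
Takes effect in the server block, merging directives from http into server: char *(*merge_srv_conf)(ngx_conf_t *cf, void *prev, void *conf);
Takes effect in the location block, merging directives into location: char *(*merge_loc_conf)(ngx_conf_t *cf, void *prev, void *conf);

For example:

// Every nginx module has a structure of type ngx_module_t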
ngx_module_t  ngx_http_referer_module = {
    NGX_MODULE_V1,
    &ngx_http_referer_module_ctx,             /* module context */
    ngx_http_referer_commands,                /* module directives */
    NGX_HTTP_MODULE,                          /* module type */
    NULL,                                     /* init master */
    NULL,                                     /* init module */
    NULL,                                     /* init process */
    NULL,                                     /* init thread */
    NULL,                                     /* exit thread */
    NULL,                                     /* exit process */
    NULL,                                     /* exit master */
    NGX_MODULE_V1_PADDING
};

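// All directives provided by the module above are listed in an array of ngx_command_t:
static ngx_command_t ngx_http_referer_commands[] = {
    { ngx_string("valid_referers"),
      // NGX_HTTP_SRV_CONF: the directive may appear in a server block
      // NGX_HTTP_LOC_CONF: the directive may appear in a location block
      // NGX_CONF_1MORE: the directive takes one or more parameters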
      NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_1MORE,
      ngx_http_valid_referers,
      NGX_HTTP_LOC_CONF_OFFSET,
      0,
      NULL
    },
    { ngx_string("referer_hash_max_size"),
      NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_TAKE1,
      ngx_conf_set_num_slot,
      NGX_HTTP_LOC_CONF_OFFSET,
      offsetof(ngx_http_referer_conf_t, referer_hash_max_size),
      NULL
    },
    ngx_null_command                     /* terminates the directive array */
};

// ngx_http_module_t defines the callback methods; among them, the
// ngx_http_referer_merge_conf callback defines the configuration merging rules
static ngx_http_module_t  ngx_http_referer_module_ctx = {
    ngx_http_referer_add_variables,      /* preconfiguration */
    NULL,                                /* postconfiguration */

    NULL,                                /* create main configuration */
    NULL,                                /* init main configuration */

    NULL,                                /* create server configuration */
    NULL,                                /* merge server configuration */

    ngx_http_referer_create_conf,        /* create location configuration */
    ngx_http_referer_merge_conf          /* merge location configuration */
};

// The merge method
static char *
ngx_http_referer_merge_conf(ngx_conf_t *cf, void *parent, void *child)
{
    ngx_http_referer_conf_t *prev = parent;
    ngx_http_referer_conf_t *conf = child;

    ngx_uint_t                    n;
    ngx_hash_init_t               hash;
    ngx_http_server_name_t        *sn;
    ngx_http_core_srv_conf_t      *cscf;

    if (conf->keys == NULL) {
        conf->hash = prev->hash;
#if (NGX_PCRE)
        ngx_conf_merge_ptr_value(conf->regex, prev->regex, NULL);
        ngx_conf_merge_ptr_value(conf->server_name_regex,
                                 prev->server_name_regex, NULL);
#endif
        ngx_conf_merge_value(conf->no_referer, prev->no_referer, 0);
        ngx_conf_merge_value(conf->blocked_referer, prev->blocked_referer, 0);
        ngx_conf_merge_uint_value(conf->referer_hash_max_size,
                                  prev->referer_hash_max_size, 2048);
        ngx_conf_merge_uint_value(conf->referer_hash_bucket_size,
                                  prev->referer_hash_bucket_size, 64);
        return NGX_CONF_OK;
    }
    if (conf->server_names == 1) {
        ...
    }
    ...
}

2. Using the listen directive

The listen directive:

Syntax:

listen address[:port] [default_server]
       [ssl] [http2 | spdy] [proxy_protocol] [setfib=number] [fastopen=number]
       [backlog=number] [rcvbuf=size] [sndbuf=size] [accept_filter=filter]
       [deferred] [bind] [ipv6only=on|off] [reuseport]
       [so_keepalive=on|off|[keepidle]:[keepintvl]:[keepcnt]];

listen port [default_server]
       [ssl] [http2 | spdy] [proxy_protocol] [setfib=number] [fastopen=number]
       [backlog=number] [rcvbuf=size] [sndbuf=size] [accept_filter=filter]
       [deferred] [bind] [ipv6only=on|off] [reuseport]
       [so_keepalive=on|off|[keepidle]:[keepintvl]:[keepcnt]];

listen unix:path [default_server]
       [ssl] [http2 | spdy] [proxy_protocol]
       [backlog=number] [rcvbuf=size] [sndbuf=size] [accept_filter=filter]
       [deferred] [bind]
       [so_keepalive=on|off|[keepidle]:[keepintvl]:[keepcnt]];

Default: listen *:80 | *:8000;

Context: server
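
As a rough illustration of the three forms of the syntax, the fragment below uses one listen line per form. The addresses, ports and socket path are arbitrary assumptions, and the block belongs inside http { }.

```nginx
server {
    listen 127.0.0.1:8000;             # address:port form
    listen 8080 default_server;        # port-only form, marked as the default server
    listen [::]:8080;                  # IPv6 addresses go in square brackets
    listen unix:/var/run/nginx.sock;   # UNIX-domain socket form

    return 200 "listening\n";
}
```

A single server block may carry several listen directives, so the same virtual server can be reached through every one of these sockets.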

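2.1 Flow of processing the HTTP request headers

Handling the connection

OS kernel                     |  Event module             |    HTTP module
  SYN                         |                           |
SYN+ACK    load balancing:    |                           |
  ACK      a worker on some   |                           |
           CPU is selected    |                           |
--------------------------------->epoll_wait              |
                              | read event?  |            |
                              |              V            |
                              | {accept, allocate the }   |
                              | {connection memory pool}  |
                              | connection_pool_size:  \  |
                              | 512                     ->-
                              |                           | \
                              |                           |  \
                              |                           |   > {ngx_http_init_connection   }
                              |                           |     {set callbacks, epoll_ctl,  }
                              |                           |     {add the timeout timer      }
                              |                           |     client_header_timeout: 60s
                              |                           |
    DATA                      |                           |
     ACK---------------------------->epoll_wait---------->{ngx_http_wait_request_handler      }
                              |                           |{allocate memory, read() into buffer}
                              |                           |                |
                              |                           |                V
                              |                           | client_header_buffer_size: 1k
                              |                           |

Handling the request

What happens after the 1k specified by client_header_buffer_size has been received? Handling a connection is different from handling a request. For a connection, nginx only has to read the data into its memory. For a request, it may need to do a lot of context analysis: parsing the HTTP protocol and every header. That is the point at which a request memory pool is allocated.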

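           [receive URI]               |           [receive headers]
                 |                     |
                 V                     |
   {allocate request memory pool}      |  +--->{state machine parses headers}
                 |                     |  |              |
        request_pool_size: 4k          |  |              V
                 V                     |  |    {allocate large buffers}
 {state machine parses request line}   |  |              |
                 |                     |  | large_client_header_buffers: 4 8k
                 V                     |  |              V
      {allocate large buffers}         |  |       {mark the headers}
                 |                     |  |              |
 large_client_header_buffers: 4 8k     |  |              V
                 V                     |  |    {remove the timeout timer}
 {state machine parses request line}   |  |              |
                 |                     |  | client_header_timeout: 60s
                 V                     |  ^              V
          {mark the URL}------->-------|--+  {start the 11 phases of HTTP request processing}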
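
The directives named in the two diagrams can be written out as follows. This is a sketch that simply restates the values shown in the diagrams; the port is an assumption, and the block belongs inside http { }.

```nginx
server {
    listen 8080;

    connection_pool_size        512;    # per-connection memory pool
    client_header_timeout       60s;    # timer added when the connection is set up
    client_header_buffer_size   1k;     # first read buffer for the request
    request_pool_size           4k;     # per-request memory pool
    large_client_header_buffers 4 8k;   # used when the request line or headers exceed 1k
}
```

The connection pool can stay small because it only tracks the connection, while the request pool has to hold the parsed request line and headers.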

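2.2 Finding the server block that handles the request

2.2.1 The server_name directive decides which server block is used before the HTTP modules of the 11 phases run.

The directive can be followed by several domain names; the first one is the primary name
Wildcard names (*): the * is supported only at the very beginning or the very end, for example: server_name *.taohui.tech;
Regular expressions: prefixed with ~, for example: server_name www.taohui.tech ~^www\d+\.taohui\.tech$;

Creating variables from a regular expression: use parentheses ()

# By index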
server {
    server_name ~^(www\.)?(.+)$;
    location / { root /sites/$2; }
}

# By name
server {
    server_name ~^(www\.)?(?<domain>.+)$;
    location / { root /sites/$domain; }
}

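Other forms

.taohui.tech matches both taohui.tech and *.taohui.tech
_ is not special: it is an invalid domain name that never matches a real Host, so it is conventionally used as the name of a catch-all default server
"" matches requests that carry no Host header

server_name_in_redirect controls how the primary name is used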

Syntax:    server_name_in_redirect on | off;
Default:   server_name_in_redirect off;
Context:   http,server,location

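When a redirect is issued and server_name_in_redirect is off, the domain in the returned Location header is the same as the domain of the request; when server_name_in_redirect is on, the domain in the Location header is the primary server name.

2.2.2 Matching order across multiple server blocks

1. Exact match
2. Wildcard name with a leading *
3. Wildcard name with a trailing *
4. Regular-expression names, tried in the order they appear in the file
   Note: steps 1, 2 and 3 do not depend on order; when several regular expressions match, the earliest one is chosen
5. When steps 1 to 4 find nothing, the default server is used
- the first server block
- the server block whose listen directive specifies default_server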
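
A minimal sketch of this order, using made-up example.* names and port 8080 (all assumptions); the blocks belong inside http { }.

```nginx
server {
    listen 8080 default_server;
    server_name _;
    return 200 "default server\n";
}
server {
    listen 8080;
    server_name ~^www\d+\.example\.net$;
    return 200 "regular expression\n";
}
server {
    listen 8080;
    server_name www.example.*;
    return 200 "trailing wildcard\n";
}
server {
    listen 8080;
    server_name *.example.test;
    return 200 "leading wildcard\n";
}
server {
    listen 8080;
    server_name www.example.test;
    return 200 "exact name\n";
}
# Host: www.example.test  -> exact name (wins even though it is listed last)
# Host: api.example.test  -> leading wildcard
# Host: www.example.org   -> trailing wildcard
# Host: www1.example.net  -> regular expression
# Host: other.invalid     -> default server
```

The blocks are deliberately listed in reverse priority to show that file order matters only among regular-expression names.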

3. The 11 phases of the HTTP modules

Apart from HTTP filter modules and nginx modules that only provide variables, every HTTP module must process requests within the 11 phases defined by nginx. Whether and when an HTTP module takes effect therefore depends on which phase the request has reached.

Schematic of how nginx handles a request (it does not match the real flow exactly):

                            Internet Requests
                                   V
                            +----------------------+
                            | Read Request Headers |
             +------------->| (steps of sec. 2.1)  |---+
+---+        ^              +----------------------+   | enter the 11 phases
|log|<-----<-+                                         |
+---+        |                                         V
             |                                  +------------------------+
             |                                  | Identify               |
             |                  +-------------->| Configuration Block    |
             |                  |               | (find the location)    |
             ^                  ^               +------------------------+
             |                  |                     V
     +----------------+    +-------------+      +------------------------+
     |Response Filters|--->|Subrequests  |<-----| Apply Rate Limits      |
     |                |    |and redirects|      | (throttle or not)      |
     +----------------+    +-------------+      +------------------------+
             ^                  ^   ^                 V
             |                  |   |           +------------------------+
             |                  |   +---<-------| Perform Authentication |
             |                  |               | (access checks)        |
             |                  |               +------------------------+
             |            +------------------+        |
             |            | Generate Content |        |
             +---<--------| (build response) |<-------+
                          +------------------+
                                   ^
                                   V
                       +-------------------------+
                       |Upstream Services        |
                       +-------------------------+

The actual flow consists of the following 11 phases:

Phase	Modules
POST_READ	realip
SERVER_REWRITE	rewrite
FIND_CONFIG
REWRITE	rewrite
POST_REWRITE
PREACCESS	limit_conn,limit_req
ACCESS	auth_basic,access,auth_request
POST_ACCESS
PRECONTENT	try_files
CONTENT	index,autoindex,concat
LOG	access_log
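Each phase may contain several HTTP modules, and the order in which they run also matters.

3.1 Sequential processing of the 11 phases

When an HTTP request enters the 11 phases, each phase may contain zero or more HTTP modules. If a module stops passing the request on, the modules after it are never executed. Modules within the same phase are not all guaranteed to run either: an earlier module may hand the request straight to the modules of the next phase.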

-------->{realip}-------------------------+ postread phase
                                          |
                                          |
 +--[rewrite]<-[find-config]<-[rewrite]<--+
 |
 |
 +-->{limit_req}->{limit_conn}--+ preaccess phase
                                |
    +----------------<----------+
    V
{access}->{auth_basic}->{auth_request}--+ access phase
                                        |
         +------------------------------+
         V
    {try_files}->{mirror}--+ precontent phase
                           |
  +------------------------+
  V
{concat}->{random_index}->{index}->{autoindex}->{static} content phase
                                                    |
     {log}<-----------------------------------------+ log phase

The order of the modules within a phase is determined by the ngx_modules array defined in objs/ngx_modules.c: the later a module appears in the array, the earlier it runs.

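3.2 postread phase: the realip module, which obtains the real client address

How can we obtain the real user IP in a network full of proxies?
The HTTP header X-Forwarded-For carries the chain of IPs
The HTTP header X-Real-IP carries the user's IP

User
 |   private IP: 192.168.0.x
 |
 V
ADSL
 |   carrier public IP: 115.204.33.1
 |
 V
CDN
 |   IP address: 1.1.1.1    X-Forwarded-For: 115.204.33.1
 |                          X-Real-IP: 115.204.33.1
 V
A reverse proxy (clb)
 |   IP address: 2.2.2.2    X-Forwarded-For: 115.204.33.1,1.1.1.1
 |                          X-Real-IP: 115.204.33.1
 V
Nginx  user address: 115.204.33.1
       remote_addr: 2.2.2.2

The variables binary_remote_addr and remote_addr originally hold the IP address of the client that is directly connected to nginx. After the realip module has run, their values are overwritten with the address taken from X-Forwarded-For or X-Real-IP.

The module is not compiled into nginx by default; enable it with --with-http_realip_module
Variables
realip_remote_addr: the value of remote_addr before the replacement
realip_remote_port: the value of remote_port before the replacement
Directives

# Set the trusted addresses:
# only when the source IP of the connection matches one of them
# does the realip module replace remote_addr and remote_port.
# The default is -: no address is trusted, so nothing is replaced
# until at least one set_real_ip_from is configured.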
Syntax:  set_real_ip_from address | CIDR | unix;
Default: -
Context: http,server,location

Syntax:  real_ip_header field | X-Real-IP | X-Forwarded-For | proxy_protocol;
Default: real_ip_header X-Real-IP;
Context: http,server,location

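# Recursive search: when on, nginx walks X-Forwarded-For from the last address
# backwards, skips every address that matches a trusted address, and uses the
# last non-trusted address; when off, it uses the last address in the header.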
Syntax:  real_ip_recursive on | off;
Default: real_ip_recursive off;
Context: http,server,location
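
Applied to the topology drawn above, the three directives might be combined as in the sketch below. Trusting both the reverse proxy (2.2.2.2) and the CDN (1.1.1.1) is an assumption made for this example, as is the port; the block belongs inside http { } and needs a build with --with-http_realip_module.

```nginx
server {
    listen 8080;

    set_real_ip_from  2.2.2.2;          # the reverse proxy that connects to nginx
    set_real_ip_from  1.1.1.1;          # the CDN in front of it
    real_ip_header    X-Forwarded-For;
    real_ip_recursive on;

    location / {
        # with X-Forwarded-For: 115.204.33.1,1.1.1.1 arriving from 2.2.2.2,
        # $remote_addr becomes 115.204.33.1 and $realip_remote_addr stays 2.2.2.2
        return 200 "remote_addr=$remote_addr realip_remote_addr=$realip_remote_addr\n";
    }
}
```

With real_ip_recursive off, the same request would yield 1.1.1.1, the last address in the header. When testing from the local machine, 127.0.0.1 has to be added as a trusted address for any replacement to happen.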

3.3 The rewrite module in the rewrite phases

3.3.1 The return directive

Syntax of the return directive:

Syntax:  return code [text];
         return code URL;
         return URL;
Default: -
Context: server,location,if

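Status codes:

nginx-specific
- 444: closes the connection without sending anything to the client, so the client never receives the 444 code
HTTP 1.0 standard
- 301: permanent redirect (does not specify whether the method may change)
- 302: temporary redirect, must not be cached (does not specify whether the method may change)
HTTP 1.1 standard
- 303: temporary redirect, the method may change, must not be cached
- 307: temporary redirect, the method must not change, must not be cached
- 308: permanent redirect, the method must not change

Every redirect response contains a Location header that specifies the target URL.

Syntax of the error_page directive, which renders the response for a status code: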

Syntax:  error_page code ... [=[response]] uri;
Default: -
Context: http,server,location,if in location

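When one or more of the listed status codes occur, the request can be redirected to another URI, and the status code itself can be changed.

Examples:

1. error_page 404 /404.html;
2. error_page 500 502 503 504 /50x.html;
3. error_page 404 =200 /empty.gif;
4. error_page 404 = /404.php;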
5. location / {
        error_page 404 = @fallback;
   }
   location @fallback {
        proxy_pass http://backend;
   }
6. error_page 403 http://example.com/forbidden.html;
7. error_page 404 =301 http://example.com/notfound.html;

Relationship between the return and error_page directives:

server {
    server_name return.taohui.tech;
    listen 8080;

    root html/;
    error_page 404 /403.html;
    location / {
        # Because this return directive specifies a response body,
        # it overrides the behavior of the error_page directive above
        return 404 "find nothing!\n";
    }
}

3.3.2 The rewrite directive

Syntax:

Syntax:  rewrite regex replacement [flag];
Default: -
Context: server,location,if

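Function, illustrated by: rewrite ^/admin/website/article/(\d+)/change/uploads/(\w+)\/(\w+)\.(png|jpg|gif|jpeg|bmp)$ /static/uploads/$2/$3.$4 last;

Replaces the URL matched by regex with the new URL given by replacement
When replacement starts with http://, https:// or $scheme, a 302 redirect is returned directly
The rewritten URL is then handled according to flag
- no flag: only the URL is replaced, and processing continues with the next rewrite-module directive in the current block
- last: starts a new location match using the replacement URI
- break: stops processing the current set of rewrite-module directives, equivalent to a standalone break directive
- redirect: returns a 302 redirect
- permanent: returns a 301 redirect

Example: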

server {
    server_name rewrite.taohui.tech;
    # rewrite_log is off by default; once enabled, the rewrite process is recorded in error_log
    rewrite_log on;
    error_log logs/rewrite_error.log notice;

    root html/;
    location /first {
        rewrite /first(.*) /second$1 last;
        return 200 'first!\n';
    }

    location /second {
        rewrite /second(.*) /third$1 break;
        #rewrite /second(.*) /third$1;
        return 200 'second!\n';
    }
    
    location /third {
        return 200 'third!\n';
    }
}

3.3.3 The if directive

Syntax:

Syntax:  if (condition) {...}
Default: -
Context: server,location

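Rules for evaluating the condition:

To check whether a variable is empty or "0", use the variable by itself
To compare a variable with a string, use = or !=
To match a variable against a regular expression
- case-sensitive: ~ or !~
- case-insensitive: ~* or !~*
To check whether a file exists, use -f or !-f
To check whether a directory exists, use -d or !-d
To check whether a file, directory or symbolic link exists, use -e or !-e
To check whether a file is executable, use -x or !-x

if ($http_user_agent ~ MSIE) {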
    rewrite ^(.*)$ /msie/$1 break;
}
if ($http_cookie ~* "id=([^;]+)(?:;|$)") {
    # Capture the value of the id attribute from the cookie into $id
    # A leading ?: inside parentheses makes the group non-capturing, which saves memory
    set $id $1;
}
if ($request_method = POST) {
    return 405;
}
if ($slow) {
    limit_rate 10k;
}
if ($invalid_referer) {
    return 403;
}

3.4 find_config phase: finding the location block that handles the request

Directives

Syntax:  location [ =|~|~*|^~ ] uri {...}
         location @name {...}
Default: -
Context: server,location

Syntax:  merge_slashes on | off;
Default: merge_slashes on;
Context: http,server
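Function: merges consecutive slashes (/) in the URI; turn it off only when the URI carries base64-encoded content

location matching rules: only the URI is matched; unlike the rewrite directive, the query arguments cannot be used

Prefix strings
- plain prefix-string match
- ^~: prefix-string match; once it matches, regular expressions are no longer tried
- =: exact match
Regular expressions
- ~: case-sensitive regular-expression match
- ~*: case-insensitive regular-expression match
Named location for internal jumps: @

location matching order

              {walk all prefix-string locations}: stored in a binary tree
                                   |
      matched an = string          V      matched a ^~ string
      +--------------------------< ? >-----------------------+
      |                            |                         |
      V                            |                         V
{use the matched = exact location} |        {use the matched ^~ prefix location}
      V                            |                         V
   {return}                        |                      {return}
                                   V
              {remember the longest matching prefix location}
                                   V
         +->{try regex locations in the order they appear in nginx.conf}
         |                         |
         |no match                 V      match
         +-----------------------< ? >---------->{use the matched regex location}
                                   |
                           no regex matched
                                   V
                {use the longest matching prefix location}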
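
The matching order in the diagram can be tried out with a small set of locations. The paths and response strings below are made up for illustration, and the block belongs inside http { }.

```nginx
server {
    listen 8080;

    location = /exact        { return 200 "exact match\n"; }
    location ^~ /static/     { return 200 "^~ prefix, regex skipped\n"; }
    location /docs/          { return 200 "longest plain prefix\n"; }
    location ~ \.html$       { return 200 "case-sensitive regex\n"; }
    location ~* \.(png|jpg)$ { return 200 "case-insensitive regex\n"; }
    location /               { return 200 "fallback prefix\n"; }
}
# /exact          -> exact match
# /static/a.html  -> ^~ prefix, regex skipped
# /docs/a.html    -> case-sensitive regex (the remembered /docs/ prefix loses to the regex)
# /docs/a.txt     -> longest plain prefix
# /img/A.PNG      -> case-insensitive regex
# /other          -> fallback prefix
```

The /docs/a.html case is the one that most often surprises: a plain prefix match is only remembered, and any matching regular expression still takes precedence over it.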

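3.5 preaccess phase

3.5.1 The limit_conn module: limiting connections

Compiled into nginx by default; disable it with --without-http_limit_conn_module

Scope:

All worker processes (based on shared memory)
The effectiveness of the limit depends on the design of the key: it relies on the realip module in the postread phase to obtain the real IP

Steps:

Define the shared memory zone (including its size) and the key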

Syntax:  limit_conn_zone key zone=name:size;
Default: -
Context: http

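Because the limit_conn_zone directive defines a shared memory zone (note the _zone suffix), the limit applies across all worker processes.

Limit the number of concurrent connections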

Syntax:  limit_conn zone number;
Default: -
Context: http,server,location

Other

# When the limit is hit, an entry is written to error_log; the directive below sets its log level
Syntax:  limit_conn_log_level info | notice | warn | error;
Default: limit_conn_log_level error;
Context: http,server,location

# Status code returned to the client when the limit is hit
Syntax:  limit_conn_status code;
Default: limit_conn_status 503;
Context: http,server,location

Example

# $binary_remote_addr is used instead of $remote_addr because its binary form is more compact and cheaper to handle
limit_conn_zone $binary_remote_addr zone=addr:10m;

server {
    server_name limit.taohui.tech;
    root html/;
    error_log logs/myerror.log info;

    location / {
        limit_conn_status 500;
        limit_conn_log_level warn;
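        limit_rate 50;      # limit the response rate to 50 bytes per second, which makes it easy to issue two requests at the same time
        limit_conn addr 1;  # limit the number of concurrent connections to 1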
    }
}

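3.5.2 The limit_req module: limiting requests

Compiled into nginx by default; disable it with --without-http_limit_req_module
Algorithm: leaky bucket
* Bursty Flow -> Leaky Bucket -> Fixed Flow
* What if water keeps dripping in when the bucket is full? nginx immediately returns 503 to the user.
Scope:
* all worker processes (based on shared memory)
* has no effect before the request enters the preaccess phase

Steps:

Define the shared memory zone (including its size), the key, and the rate limit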

Syntax:  limit_req_zone key zone=name:size rate=rate;
Default: -
Context: http
   * rate is given in r/s or r/m and sets the speed at which the bucket leaks

Limit the request rate

Syntax:  limit_req zone=name [burst=number] [nodelay];
Default: -
Context: http,server,location
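    * burst defaults to 0 and sets how many requests the bucket can hold
    * nodelay: requests within the burst are no longer delayed but processed immediately; requests beyond the burst are still rejected

Other

# Log level used when the limit is hit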
Syntax:  limit_req_log_level info | notice | warn | error;
Default: limit_req_log_level error;
Context: http,server,location

# Status code returned to the client when the limit is hit
Syntax:  limit_req_status code;
Default: limit_req_status 503;
Context: http,server,location

Example

limit_conn_zone $binary_remote_addr zone=addr:10m;
limit_req_zone $binary_remote_addr zone=one:10m rate=2r/m;

server {
    server_name limit.taohui.tech;
    root html/;
    error_log logs/myerror.log info;

    location / {
        limit_conn_status 500;
        limit_conn_log_level warn;
        limit_rate 50;
        limit_conn addr 1;
        #limit_req zone=one burst=3 nodelay;
        limit_req zone=one;
    }
}

If limit_req and limit_conn both apply and limit_req rejects the request, limit_conn is short-circuited, because the limit_req module runs before the limit_conn module.
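
To compare the effect of burst with and without nodelay, a configuration along the following lines could be used. The zone name, rate, paths, port and the 429 status are all assumptions chosen for this sketch; the fragment belongs inside http { }.

```nginx
limit_req_zone $binary_remote_addr zone=perip:10m rate=2r/s;

server {
    listen 8080;
    limit_req_status 429;               # returned instead of the default 503

    location /queued/ {
        # up to 3 excess requests wait in the bucket and are released at 2r/s
        limit_req zone=perip burst=3;
        return 200 "queued\n";
    }

    location /instant/ {
        # up to 3 excess requests are served at once; further ones get 429
        limit_req zone=perip burst=3 nodelay;
        return 200 "instant\n";
    }
}
```

Sending a quick series of requests to each path shows the difference: /queued/ answers slowly but successfully until the bucket is full, while /instant/ answers immediately and then starts rejecting.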

3.6 access phase

3.6.1 The access module: restricting by IP

Compiled into nginx by default; disable it with --without-http_access_module

Syntax

Syntax:  allow address | CIDR | unix: | all;
Default: -
Context: http,server,location,limit_except

Syntax:  deny address | CIDR | unix: | all;
Default: -
Context: http,server,location,limit_except

Example:

location / {
    # Note: the rules are checked in order, and checking stops at the first rule that matches.
    deny  192.168.1.1;
    allow 192.168.1.0/24;
    allow 10.1.1.0/16;
    allow 2001:0db8::/32;
    deny all;
}

3.6.2 The auth_basic module: restricting by user name and password

RFC2617: HTTP Basic Authentication

Flow:

Browser                                                                                     Nginx
   |                                                                                          |
   |------------>----------------------1.---------------------------------------------------->|
   |                                                                                          |
   |                                                                                          |
   |                                    Hypertext Transfer Protocol                           |
   |                                     HTTP/1.1 401 Unauthorized\r\n                        |
   |                                     Server: openresty/1.13.6.2\r\n                       |
   |                                     Date: Sat, 10 Nov 2018 01:32:41 GMT\r\n              |
   |                                     Content-Type: text/html\r\n                          |
   |                                     Content-Length: 603\r\n                              |
   |                                     Connection: keep-alive\r\n                           |
   |                                     WWW-Authenticate: Basic realm="test auth_basic"\r\n  |
   |                                     \r\n                                                 |
   |<-----------<----------------------2.-----------------------------------------------------|
   |                                                                                          |
   |------->the browser pops up a dialog                                                      |
   |        asking for user name and password                                                 |
   |                                                                                          |
   | Hypertext Transfer Protocol                                                              |
   |  GET /  HTTP/1.1\r\n                                                                     |
   |  HOST: access.taohui.tech\r\n                                                            |
   |  Connection: keep-alive\r\n                                                              |
   |  Pragma: no-cache\r\n                                                                    |
   |  Cache-Control: no-cache\r\n                                                             |
   |  Authorization: Basic dXNlcjpwYXNzd29yZA==\r\n                                           |
   |  Upgrade-Insecure-Requests: 1\r\n                                                        |
   |------------>----------------------3.---------------------------------------------------->|
   |                                                                                          |

* The output of echo -n "user:password" | base64 is: dXNlcjpwYXNzd29yZA==

Compiled into nginx by default; disable it with --without-http_auth_basic_module.

Directives:

Syntax:  auth_basic string | off;
Default: auth_basic off;
Context: http,server,location,limit_except
Function: sets the realm string shown in the browser's login dialog

Syntax:  auth_basic_user_file file;
Default: -
Context: http,server,location,limit_except
Function: sets the path of the file that stores the user name and password pairs

Format of the password file:

# comment
name1:password1
name2:password2:comment
name3:password3

Tool for generating it: htpasswd

# Install the package that provides it
$ yum install -y httpd-tools

# Generate the file
$ htpasswd -b -c file user pass

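3.6.3 The auth_request module: delegating access control to a third party

The auth_request module is not compiled into nginx by default; add it with --with-http_auth_request_module.

How it works: after receiving a request, nginx puts it on hold and generates a subrequest with the same content as the original request. The subrequest is passed to an upstream server through reverse proxying, and the response to the subrequest decides whether the original request is processed.

What it does: forwards the request to an upstream service. If the upstream returns a 2xx status code, processing continues; if it returns 401 or 403, that response is returned to the client.

It provides two main directives: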

Syntax:    auth_request uri | off;
Default:   auth_request off;
Context:   http,server,location

Syntax:    auth_request_set $variable value;
Default:   -
Context:   http,server,location

Example:

server {
    server_name access.taohui.tech;
    error_log logs/error.log debug;
    root html/;
    default_type text/plain;

    location / {
        auth_request /test_auth;
    }

    location = /test_auth {
        proxy_pass http://127.0.0.1:8090/auth_upstream;
        proxy_pass_request_body off;
        proxy_set_header Content-Length "";
        proxy_set_header X-Original-URI $request_uri;
    }
}

The configuration of the other nginx instance, which listens on port 8090, is as follows:

http {
    include    mime.types;
    default_type  application/octet-stream;
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log logs/access.log main;

    # 
    sendfile on;
    tcp_nopush on;

    server {
        listen 8090;
        location /auth_upstream {
            return 401 'auth success!';
        }
    }
}

3.6.4 The satisfy directive (provided by the HTTP framework)

Syntax:

Syntax: satisfy all | any;
Default: satisfy all;
Context: http,server,location

Modules in the access phase:

access module
auth_basic module
auth_request module
other modules

                  run one access module
                           V
                          / \
          +--------------<   >--------------+
       allowed            \ /            rejected
          |                |                |
          |                |                |
          V                |                V
    check satisfy       ignored       check satisfy
          |                |                |
          V                |                V
         / \               |               / \
      +-<   >--all----->---+--<--any------<   >-+
   any|  \ /               |               \ /  |all
      |                    |                    |
      V                    V                    V
access phase passes  run the next access module  reject
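
A common use of satisfy any is to let an internal network in without a password while everyone else must authenticate. The sketch below assumes a hypothetical /private/ path, network range and password file; the block belongs inside a server { }.

```nginx
location /private/ {
    satisfy any;                        # one successful access module is enough

    allow 192.168.1.0/24;               # access module: the internal network passes
    deny  all;

    auth_basic           "restricted";  # auth_basic module: others need a valid login
    auth_basic_user_file conf/htpasswd;
}
```

With satisfy all, which is the default, a client would have to come from the allowed network and also present valid credentials.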

3.7 precontent phase

3.7.1 The try_files module: trying resources in order

This module is provided by the HTTP framework and cannot be removed.

The try_files directive:

Syntax:   try_files file ... uri;
          try_files file ... =code;
Default:  -
Context:  server,location

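Function: tries in turn to access the files that correspond to several URLs (resolved with the root or alias directive). When a file exists, its content is returned directly; if none of the files exist, the request is handled according to the last URL or the code.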

server {
    location /first {
        try_files /system/maintenance.html
               $uri $uri/index.html $uri.html
               @lasturl;
    }
    location @lasturl {
        return 200 'lasturl!\n';
    }

    location /second {
        try_files $uri $uri/index.html $uri.html =404;
    }
}

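3.7.2 The mirror module: copying traffic in real time

Compiled into nginx by default; remove it with --without-http_mirror_module.

Function: while a request is processed, a subrequest is generated to access another service; the response to the subrequest is ignored.

Directives:

Syntax:  mirror uri | off;
Default: mirror off;
Context: http,server,location

Syntax:  mirror_request_body on | off;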
Default: mirror_request_body on;
Context: http,server,location
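
A minimal sketch of traffic mirroring follows. The two backend addresses and the /mirror location name are assumptions for illustration, and the blocks belong inside a server { }.

```nginx
location / {
    mirror /mirror;                     # copy every request to the location below
    mirror_request_body off;            # do not send the body to the mirror
    proxy_pass http://127.0.0.1:8081;   # the real backend
}

location = /mirror {
    internal;                           # reachable only through the subrequest
    proxy_pass http://127.0.0.1:10020$request_uri;
    proxy_pass_request_body off;
    proxy_set_header Content-Length "";
}
```

The client only ever sees the response from the real backend; the mirror target can be a test environment that receives the same URIs and headers.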

3.8 content phase

3.8.1 The static module

Syntax:  alias path;
Default: -
Context: location

Syntax:  root path;
Default: root html;
Context: http,server,location,if in location

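Function: maps a URL to a file path in order to return the content of a static file
Difference: root maps the complete URL into the file path, while alias maps only the part of the URL after the location prefix into the file path

Three related variables for building the path of the file to be accessed
Question: when /RealPath/1.txt is requested, what is the value of each of these three variables?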

location /RealPath/ {
    alias html/realpath/;
}

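request_filename: the full path of the file to be accessed
document_root: the directory path generated from the URI and the root/alias rule
realpath_root: document_root with symbolic links and the like resolved to the real path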
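
One way to answer the question is to let nginx print the three variables. The sketch below reuses the location shown above and adds only a return directive; it assumes that html/realpath exists under the nginx prefix, because $realpath_root has to resolve a real directory.

```nginx
location /RealPath/ {
    alias html/realpath/;
    # prints the three values separated by colons for the requested URI
    return 200 '$request_filename:$document_root:$realpath_root\n';
}
```

Requesting /RealPath/1.txt then shows the file path, the directory derived from alias, and the same directory with any symbolic links resolved.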

Content-Type when returning static files

Syntax:     types {...}
Default:    types { text/html html; image/gif gif; image/jpeg jpg; }
Context:    http,server,location

Syntax:     default_type mime-type;
Default:    default_type text/plain;
Context:    http,server,location

Syntax:     types_hash_bucket_size size;
Default:    types_hash_bucket_size 64;
Context:    http,server,location

Syntax:     types_hash_max_size size;
Default:    types_hash_max_size 1024;
Context:    http,server,location

Error log when the file is not found

Syntax:    log_not_found on | off;
Default:   log_not_found on;
Context:   http,server,location

[error] 10156#0: *10723 open() "/html/first/2.txt/root/2.txt" failed(2:No such file or directory)

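Problem: what if a directory is requested without a trailing / in the URL?
When the static module resolves root/alias and finds that the target is a directory but the URL does not end with /, it returns a 301 redirect.

Domain name used in the redirect: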

Syntax:   server_name_in_redirect on | off;
Default:  server_name_in_redirect off;
Context:  http,server,location

Syntax:   port_in_redirect on | off;
Default:  port_in_redirect on;
Context:  http,server,location

Syntax:   absolute_redirect on | off;
Default:  absolute_redirect on;
Context:  http,server,location

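3.8.2 The index and autoindex modules

Handling requests for /: the index module in the content phase
Function: specifies the index file whose content is returned when a URL ending in / is requested
Module: ngx_http_index_module
Syntax: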

Syntax:  index file ...;
Default: index index.html;
Context: http,server,location

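Displaying directory contents: the autoindex module in the content phase
Function: when the URL ends with /, tries to return the structure of the directory that root/alias points to, in html/xml/json/jsonp format
Module: ngx_http_autoindex_module, compiled into nginx by default; disable it with --without-http_autoindex_module

Directives of the autoindex module: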

Syntax:   autoindex on | off;
Default:  autoindex off;
Context:  http,server,location

Syntax:   autoindex_exact_size on | off;
Default:  autoindex_exact_size on;
Context:  http,server,location

Syntax:   autoindex_format html | xml | json | jsonp;
Default:  autoindex_format html;
Context:  http,server,location

Syntax:   autoindex_localtime on | off;
Default:  autoindex_localtime off;
Context:  http,server,location

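3.8.3 The concat module: improving performance for many small files

Improving performance: the concat module in the content phase
Function: when a page needs many small files, their contents are merged into a single HTTP response to improve performance
Module: ngx_http_concat_module, developed by Tengine (https://github.com/alibaba/nginx-http-concat); add it with --add-module=../nginx-http-concat/
Usage: append ?? to the URI, then list the files separated by commas. If there are parameters, append them at the end with ?
Example: https://g.alicdn.com/??kissy/k/6.2.4/seed-min.js,kg/global-util/1.0.7/index-min.js,tb/tracker/4.3.5/index.js,kg/tb-nav/2.5.3/index-min.js,secdev/sufei_data/3.3.5/index.js

Directives of the concat module: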

concat:   on | off
default:  concat off
Context:  http,server,location

concat_delimiter: string
Default: NONE
Context: http,server,location

concat_types: MIME types
Default: concat_types:text/css application/x-javascript
Context: http,server,location

concat_unique: on | off
Default: concat_unique on
Context: http,server,location

concat_ignore_file_error: on | off
Default: off
Context: http,server,location

concat_max_files: number
Default: concat_max_files 10
Context: http,server,location

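3.9 log phase

3.9.1 Detailed usage of the access log

log phase: the log module records the access log
Function: records information about HTTP requests in a log
Module: ngx_http_log_module, cannot be disabled

Access log format: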

Syntax:   log_format name [escape=default|json|none] string ...;
Default:  log_format combined "...";
Context:  http

The default combined log format:
log_format combined '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent"';
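
As an illustration of what a line in the combined format looks like to a consumer, the following Python sketch parses one entry with a regular expression. The sample line is made up for this example, and the pattern assumes the default combined layout with no quotes inside the quoted fields.

```python
import re

# One named group per variable of the combined format.
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def parse_combined(line):
    """Return the fields of one combined-format log line, or None."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


# Hypothetical sample entry, not taken from a real log.
sample = ('127.0.0.1 - - [14/Nov/2018:15:55:37 +0800] '
          '"GET /index.html HTTP/1.1" 200 612 "-" "curl/7.61.0"')
fields = parse_combined(sample)
print(fields["status"], fields["request"])
```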

Configuring the log file path:

Syntax:    access_log path [format [buffer=size][gzip[=level]] [flush=time] [if=condition]];
           access_log off;
Default:   access_log logs/access.log combined;
Context:   http,server,location,if in location, limit_except

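> The path may contain variables: without a cache, the log file has to be opened and closed for every log entry
    For example, with the $host variable in the path, each request is logged to a file named after its Host header. Because the variable can differ from request to request, different requests may end up in different files. nginx therefore cannot keep a single file open: for every entry it has to open the file, write, and close it again.
> if: a variable value controls whether the request is logged
> Log buffering
  * Function: write the log entries held in memory to disk in batches
  * Entries are written to disk when
    * the pending log data exceeds the buffer size
    * the time set by flush has expired
    * the worker process runs the reopen command or is shutting down
> Log compression
  * Function: compress the buffered log data in batches before writing it to disk
  * The buffer size defaults to 64KB
  * The compression level defaults to 1 (1 is fastest with the lowest ratio, 9 is slowest with the highest ratio)

Optimization when the log file name contains variables: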

Syntax:    open_log_file_cache max=N [inactive=time] [min_uses=N] [valid=time];
           open_log_file_cache off;
Default:   open_log_file_cache off;
Context:   http,server,location

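max:       maximum number of file descriptors in the cache; beyond that, entries are evicted with the LRU algorithm
inactive:  a cached descriptor is closed if the file is not accessed during this time. Default: 10 seconds
min_uses:  minimum number of uses within the inactive time for the descriptor to stay in the cache. Default: 1
valid:     after this time, nginx checks whether the cached log file still exists. Default: 60 seconds
off:       disables the cache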

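4. HTTP filter modules

4.1 Processing flow

Returning a response means processing the response content:
HTTP filter modules -> copy_filter: copies the body -> HTTP filter modules -> postpone_filter: handles subrequests -> HTTP filter modules -> header_filter: builds the response headers -> write_filter: sends the response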

&ngx_http_write_filter_module, +
&ngx_http_header_filter_module, +
&ngx_http_chunked_filter_module,
&ngx_http_v2_filter_module,
&ngx_http_range_header_filter_module,
&ngx_http_gzip_filter_module,
&ngx_http_postpone_filter_module, +
&ngx_http_ssi_filter_module,
&ngx_http_charset_filter_module,
&ngx_http_sub_filter_module,
&ngx_http_addition_filter_module,
&ngx_http_userid_filter_module,
&ngx_http_headers_filter_module,
&ngx_http_echo_module,
&ngx_http_xss_filter_module,
&ngx_http_srcache_filter_module,
&ngx_http_lua_module,
&ngx_http_headers_more_filter_module,
&ngx_http_rds_json_filter_module,
&ngx_http_rds_csv_filter_module,
&ngx_http_copy_filter_module, +
&ngx_http_range_body_filter_module,
&ngx_http_not_modified_filter_module,

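The order of these HTTP filter modules also matters when a response is returned. Just as the order of the 11 phases, and of the HTTP modules within each phase, affects how a request is processed, the filter order has a significant effect on the response. What is that order? Look at the module array in nginx's ngx_modules.c: the filter modules are listed there in the order shown above, and the list is read from bottom to top. A response is handled first by the filter module at the bottom and then, one by one, by the modules above it. Four modules in this sequence deserve special attention: ngx_http_copy_filter_module, ngx_http_postpone_filter_module, ngx_http_header_filter_module, and ngx_http_write_filter_module.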
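
The bottom-to-top order can be pictured with a small Python sketch. It assumes each filter module, when initialised in array order, places itself at the head of a linked chain and remembers the previous head as its "next" filter; the list is a trimmed, single-chain simplification of the array above (real nginx keeps separate header and body chains).

```python
# Trimmed copy of the array above, in the order it is written (top to bottom).
MODULES = [
    "write_filter",
    "header_filter",
    "chunked_filter",
    "gzip_filter",
    "postpone_filter",
    "sub_filter",
    "copy_filter",
    "not_modified_filter",
]


def build_chain(modules):
    """Initialise modules in array order; each new one becomes the chain head."""
    top = None
    for name in modules:
        top = (name, top)  # (this filter, next filter in the chain)
    return top


def run_order(top):
    """Walk the chain from its head, as a response would."""
    order = []
    while top is not None:
        name, top = top
        order.append(name)
    return order


order = run_order(build_chain(MODULES))
print(" -> ".join(order))
```

The module listed last ends up at the head of the chain, so it sees the response first, and write_filter, listed first, runs last and actually sends the data.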

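4.2 The sub module

Replacing strings in the response: the sub module
Function: replace a given string in the response with a new string.
Module: ngx_http_sub_filter_module; not compiled into Nginx by default, enable it with --with-http_sub_module.

Directives of the sub module: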

Syntax:   sub_filter string  replacement;
Default:  --
Context:  http,server,location

Syntax:   sub_filter_last_modified on | off;
Default:  sub_filter_last_modified off;
Context:  http,server,location

Syntax:   sub_filter_once on | off;
Default:  sub_filter_once on;
Context:  http,server,location

Syntax:   sub_filter_types mime-type ...;
Default:  sub_filter_types text/html;
Context:  http,server,location

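4.3 The addition module

Adding content before or after the response: the addition module
Function: add content before or after the response body; the added content is the response of a newly issued subrequest.
Module: ngx_http_addition_filter_module; not compiled into nginx by default, enable it with --with-http_addition_module.

Directives of the addition module: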

Syntax:  add_before_body uri;
Default: --
Context: http,server,location

Syntax:  add_after_body uri;
Default: --
Context: http,server,location

Syntax:  addition_types mime-type ...;
Default: addition_types text/html;
Context: http,server,location

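5. Nginx variables

5.1 How variables work

Variables are a very powerful tool in nginx. In nginx.conf we can use them to change how each module handles a request, which makes them an excellent means of decoupling. Variables are also very useful in OpenResty's Lua code. Let's first look at how they work.

Modules that provide variables and modules that use them: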

+----------------------------------+   | nginx starts
| Module that provides variables   |   |            +---------------------------+
|+--------------------------------+|   |            | Module that uses          |
|| preconfiguration defines the   ||   |            | variables, e.g. the HTTP  |
|| new variables                  ||   |            | access log                |
||  {method that resolves it} <---||<--|----.-------|-----+                     |
||              ^                 ||   |     \      |     |                     |
||              v                 ||   v      \     | {usage defined while      |
||       {variable name} <--------||--- HTTP   \    |  parsing nginx.conf}      |
|+--------------------------------+|   headers  \   |                           |
+----------------------------------+   read      *->|-{variable value}          |
                                                 \  |  handle the request       |
                                                  \ +-------------|-------------+
                                                   *------<-------+

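Every variable involves a module that provides it and a module that uses it. How does a module provide a variable? When nginx starts and finds an HTTP module (we looked briefly at some of the source earlier, and part six covers it in more detail), it invokes a callback named preconfiguration. The name itself is not important, but it tells us something: configuration refers to reading the configuration file, and pre means the callback runs before that happens. This is where the module registers the new variables it provides.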

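Properties of variables:

Lazy evaluation
A variable's value can change at any moment; what you get is the value at the moment it is used.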
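
Lazy evaluation can be sketched in a few lines of Python. The VariableTable class is hypothetical: it stores only a callback per variable name, and the callback runs when a consumer asks for the value, not when the variable is defined.

```python
class VariableTable:
    """Maps a variable name to the callback that computes its value."""

    def __init__(self):
        self._getters = {}
        self.evaluations = 0  # how many times a value was actually computed

    def define(self, name, getter):
        # Registering a variable computes nothing.
        self._getters[name] = getter

    def get(self, name, request):
        # The value is computed only now, from the request's current state.
        self.evaluations += 1
        return self._getters[name](request)


variables = VariableTable()
variables.define("uri", lambda req: req["uri"])

request = {"uri": "/old"}
request["uri"] = "/new"  # e.g. changed by an internal rewrite before use

value = variables.get("uri", request)
print(value)
```

Because nothing is stored at definition time, the consumer sees /new, the value at the moment of use.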

The hash table that stores variables:

Syntax:  variables_hash_bucket_size size;
Default: variables_hash_bucket_size 64;
Context: http

Syntax:  variables_hash_max_size size;
Default: variables_hash_max_size 1024;
Context: http

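5.2 Request-related variables provided by the HTTP framework

Variables provided by the HTTP framework: 01 variables related to the HTTP request; 02 variables related to the TCP connection; 03 variables produced while Nginx processes the request; 04 variables related to sending the HTTP response; 05 Nginx system variables

Variables related to the HTTP request:

arg_<name>:      value of one specific argument in the URL
query_string:    identical to args
args:            all URL arguments
is_args:         ? if the request URL has arguments, otherwise empty
content_length:  value of the Content-Length header, which gives the length of the request body
content_type:    value of the Content-Type header, which gives the type of the request body
uri:             URI of the request (unlike the URL, without the arguments after ?)
document_uri:    identical to uri
request_uri:     URL of the request (the URI plus the complete arguments)
scheme:          protocol name, e.g. http or https
request_method:  request method, e.g. GET or POST
request_length:  size of the whole request, including request line, headers, and body
remote_user:     user name passed in via HTTP Basic Authentication
request_body_file:
        > temporary file holding the request body
          * a very small body is not written to a file
          * client_body_in_file_only forces every body into a file and decides whether the file is deleted
request_body:
        > the request body; only available when the request is proxied and the body is buffered in memory
request:
        > the original request line, including method and protocol version, e.g. GET /?a=1&b=22 HTTP/1.1
host:
        > taken from the request line first
        > if the request line has no host name, the value of the Host header is used
        > if neither is available, the matched server_name is used
http_<header name>: value of one specific request header
        > handled specially:
          * http_host
          * http_user_agent
          * http_referer
          * http_via
          * http_x_forwarded_for
          * http_cookie
        > all other headers: handled generically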
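
How several of these variables relate to one another is easiest to see on a concrete request line. The Python sketch below derives them from the example GET /?a=1&b=22 HTTP/1.1. The function request_variables is hypothetical and simplified: it does not normalise or decode the URI, and it treats argument names as case-sensitive.

```python
def request_variables(request_line):
    """Derive a few request variables from a raw request line."""
    method, request_uri, _protocol = request_line.split(" ")
    uri, _, args = request_uri.partition("?")
    variables = {
        "request": request_line,
        "request_method": method,
        "request_uri": request_uri,
        "uri": uri,
        "document_uri": uri,
        "args": args,
        "query_string": args,
        "is_args": "?" if args else "",
    }
    # $arg_<name>: the first occurrence of each argument wins.
    for pair in args.split("&"):
        name, _, value = pair.partition("=")
        if name:
            variables.setdefault("arg_" + name, value)
    return variables


v = request_variables("GET /?a=1&b=22 HTTP/1.1")
print(v["uri"], v["args"], v["is_args"], v["arg_b"])
```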

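5.3 Other variables provided by the HTTP framework

Variables related to the TCP connection:

binary_remote_addr>  client address in binary form: 4 bytes for IPv4, 16 bytes for IPv6
connection>  incrementing connection serial number
connection_requests> number of requests handled so far on the current connection; meaningful for keepalive connections
remote_addr> client address
remote_port> client port
proxy_protocol_addr> the address from the PROXY protocol header if proxy_protocol is in use, otherwise empty
proxy_protocol_port> the port from the PROXY protocol header if proxy_protocol is in use, otherwise empty

server_addr> server address
server_port> server port
TCP_INFO> kernel-level TCP parameters: $tcpinfo_rtt, $tcpinfo_rttvar, $tcpinfo_snd_cwnd, $tcpinfo_rcv_space
server_protocol> protocol of the request, e.g. HTTP/1.1

Variables produced while Nginx processes the request:

request_time: time spent on the request so far, in seconds with millisecond resolution
server_name: the server_name that matched the request
https: on if TLS/SSL is enabled, otherwise empty
request_completion: OK if the request has completed, otherwise empty
request_id: request identifier printed in hexadecimal; it consists of 16 randomly generated bytes
request_filename: full path of the file being accessed
document_root: directory path derived from the URI and the root/alias rules
realpath_root: document_root with symbolic links resolved to the real path
limit_rate: upper limit on the rate of the response to the client, in bytes per second. It can be changed with the set directive to affect the request.

Variables related to sending the HTTP response:

body_bytes_sent> length of the response body
bytes_sent> length of the whole HTTP response
status> status code of the HTTP response
sent_trailer_<name>> value of the named field in the response trailer
sent_http_<header name>> value of one specific response header:
        > handled specially
         * sent_http_content_type
         * sent_http_content_length
         * sent_http_location
         * sent_http_last_modified
         * sent_http_connection
         * sent_http_keep_alive
         * sent_http_transfer_encoding
         * sent_http_cache_control
         * sent_http_link
        > all other headers: handled generically

Nginx system variables:

time_local>  current time in local time format, e.g. 14/Nov/2018:15:55:37 +0800
time_iso8601> current time in ISO 8601 format, e.g. 2018-11-14T15:55:37+08:00
nginx_version> Nginx version number
pid> process id of the worker process handling the request
pipe> p if the request was pipelined, otherwise .
hostname> host name of the server, the same as the output of the hostname command
msec> time since January 1, 1970, in seconds with millisecond resolution after the decimal point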

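5.4 Hotlink protection with variables: the referer module

Some HTTP modules only provide variables or change variable values. The referer module, introduced next, is one of them, and it is a simple, direct, and effective way to deal with hotlinking.

A simple and effective anti-hotlinking measure: the referer module
Scenario: a site links to your page by URL. When a user clicks that URL in the browser, the HTTP request carries the URL of the page the user was on in the Referer header, telling the server which page triggered the request.
Goal: refuse access to our resources from sites that should not be using them.
Approach: use the referer module's invalid_referer variable to decide, based on the configuration, whether the Referer header is acceptable.
Module: compiled into Nginx by default; disable it with --without-http_referer_module.

Directives of the referer module: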

Syntax:  valid_referers none | blocked | server_names | string ...;
Default: --
Context: server,location

Syntax:  referer_hash_bucket_size size;
Default: referer_hash_bucket_size 64;
Context: server,location

Syntax:  referer_hash_max_size size;
Default: referer_hash_max_size 2048;
Context: server,location

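The valid_referers directive: it accepts several parameters at once, and a Referer header that matches any of them is valid. Parameter values:

none> allow requests that have no Referer header
blocked> allow requests whose Referer header is present but carries no usable value (one that does not start with http:// or https://)
server_names> allow the request if the host in the Referer matches one of this server's server_name values
string> a string giving a domain name and, optionally, a URL prefix; the domain may carry a `*` wildcard at the beginning or the end
    the request is allowed if the Referer value matches the string
regular expression> starts with ~
    the request is allowed if the Referer value matches the regular expression

The invalid_referer variable:

* empty when access is allowed
* 1 when access is not allowed

Question:

server {
    server_name referer.taohui.tech;

    location / {
        valid_referers none blocked server_names *.taohui.pub www.taohui.org.cn/nginx/ ~\.google\.;

        if ($invalid_referer) {
            return 403;
        }

        return 200 'valid\n';
    }
}

Which of the following requests are rejected?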

curl -H 'referer: http://www.taohui.org.cn/ttt' referer.taohui.tech/
curl -H 'referer: http://www.taohui.org/ttt' referer.taohui.tech/
curl -H 'referer:' referer.taohui.tech/
curl referer.taohui.tech/
curl -H 'referer: http://www.taohui.tech' referer.taohui.tech/
curl -H 'referer: http://referer.taohui.tech' referer.taohui.tech/
curl -H 'referer: http://image.baidu.com/search/detail' referer.taohui.tech/
curl -H 'referer: http://image.google.com/search/detail' referer.taohui.tech/
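
One way to reason about the question is to model the matching rules in Python. The sketch below is a simplified model written for this note, not the module's code: is_valid_referer and its parameters are hypothetical names, wildcard handling covers only the leading `*.` form used here, and the regular expression is assumed to be matched case-insensitively against the Referer value after the scheme. Note that curl -H 'referer:' removes the header instead of sending an empty one, so that request behaves like the one with no Referer at all.

```python
import re


def is_valid_referer(referer, *, none, blocked, hosts, patterns):
    """Return True when the request is allowed ($invalid_referer stays empty).

    hosts maps a host pattern (exact name or "*.suffix") to a required
    URI prefix, or to None when any URI is acceptable.
    """
    if referer is None:
        return none
    scheme = re.match(r"(?i)https?://", referer)
    if scheme is None:
        return blocked
    rest = referer[scheme.end():]
    host = re.split(r"[/:]", rest, maxsplit=1)[0].lower()
    slash = rest.find("/")
    path = rest[slash:] if slash != -1 else ""
    for pattern, prefix in hosts.items():
        if pattern.startswith("*."):
            matched = host.endswith(pattern[1:])
        else:
            matched = host == pattern
        if matched:
            return prefix is None or path.startswith(prefix)
    return any(re.search(p, rest, re.IGNORECASE) for p in patterns)


rules = dict(
    none=True,
    blocked=True,
    hosts={
        "referer.taohui.tech": None,       # from server_names
        "*.taohui.pub": None,
        "www.taohui.org.cn": "/nginx/",
    },
    patterns=[r"\.google\."],
)

referers = [
    "http://www.taohui.org.cn/ttt",
    "http://www.taohui.org/ttt",
    None,  # no Referer header (both the 'referer:' and the plain curl call)
    "http://www.taohui.tech",
    "http://referer.taohui.tech",
    "http://image.baidu.com/search/detail",
    "http://image.google.com/search/detail",
]
for value in referers:
    verdict = "allowed" if is_valid_referer(value, **rules) else "403"
    print(value, "->", verdict)
```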

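5.5 Hotlink protection in practice with variables: the secure_link module

The referer module depends on what the browser sends. Forging a Referer header costs an attacker almost nothing, and once that happens the referer module's hotlink protection is essentially useless.

Another approach to hotlink protection: the secure_link module
Variables: secure_link, secure_link_expires
Module: ngx_http_secure_link_module; not compiled into nginx by default, add it with --with-http_secure_link_module
Function: protect against hotlinking by verifying a hash value in the URL
> Process
  * a server (which can also be nginx) generates the hashed secure URL and returns it to the client
  * the client requests nginx with the secure URL, and nginx's secure_link variable tells whether verification passed
> Principle
  * the hash algorithm is irreversible
  * the client only ever sees the URL after hashing
  * only the server that generates the URL and the nginx that verifies it hold the original string that goes into the hash
  * the original string usually consists of the following parts, in a fixed order:
    1. resource location, e.g. the URL of the resource in HTTP, so that an attacker who obtains one secure URL cannot use it for arbitrary resources
    2. user information, e.g. the client IP address, so that other users cannot reuse the secure URL
    3. timestamp, so that the secure URL expires in time
    4. secret key, known only to the servers, which makes the original string harder to guess

Directives of the secure_link module: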

Syntax:   secure_link expression;
Default:  --
Context:  http,server,location

Syntax:   secure_link_md5 expression;
Default:  --
Context:  http,server,location

Syntax:   secure_link_secret word;
Default:  --
Context:  location

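Variable values and a configuration example with an expiry time:

Variables: >secure_link: empty string (verification failed)
                         0 (the URL has expired)
                         1 (verification passed)
           >secure_link_expires: the timestamp value

Generating a secure link on the command line:

Original request:

/test1.txt?md5=<generated md5 value>&expires=<timestamp> (e.g. 2147483647)

Generating the md5 (timestamp, URL, and client IP are concatenated directly; a space precedes the secret):

echo -n '<timestamp><URL><client IP> <secret>' | openssl md5 -binary | openssl base64 | tr +/ -_ | tr -d =

Nginx configuration:

secure_link $arg_md5,$arg_expires;
secure_link_md5 "$secure_link_expires$uri$remote_addr secret";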

server {
    server_name securelink.taohui.tech;
    error_log logs/myerror.log info;
    default_type text/plain;
    location / {
        secure_link $arg_md5,$arg_expires;
        secure_link_md5 "$secure_link_expires$uri$remote_addr secret";

        if ($secure_link = "") {
            return 403;
        }
        if ($secure_link = "0") {
            return 410;
        }
        return 200 "$secure_link:$secure_link_expires\n";
    }

    location /p/ {
        secure_link_secret mysecret2;

        if ($secure_link = "") {
            return 403;
        }
        rewrite ^ /secure/$secure_link;
    }

    location /secure/ {
        alias html/;
        internal;
    }
}
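
The openssl pipeline shown earlier for location / can also be written in Python, which is convenient when the links are generated by an application. This is a sketch under the same assumptions as the configuration above: the hashed string is "$secure_link_expires$uri$remote_addr secret", and the helper names secure_link_md5 and signed_url are made up for this example.

```python
import base64
import hashlib


def secure_link_md5(expires, uri, client_ip, secret="secret"):
    """MD5 of '<expires><uri><client_ip> <secret>' as URL-safe base64 without padding."""
    raw = f"{expires}{uri}{client_ip} {secret}".encode()
    digest = hashlib.md5(raw).digest()
    # Equivalent to: openssl base64 | tr +/ -_ | tr -d =
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode()


def signed_url(expires, uri, client_ip, secret="secret"):
    """Build the request path with the md5 and expires arguments."""
    token = secure_link_md5(expires, uri, client_ip, secret)
    return f"{uri}?md5={token}&expires={expires}"


token = secure_link_md5(2147483647, "/test1.txt", "127.0.0.1")
print(signed_url(2147483647, "/test1.txt", "127.0.0.1"))
```

Since the client IP is part of the hashed string, the same URL requested from another address fails verification and $secure_link is empty.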

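A simpler method that hashes only the URL:
Principle:
1. The request URL is split into three parts: /prefix/hash/link
2. The hash is the md5 of the link concatenated with the secret ("linksecret")
3. The secret is configured with secure_link_secret secret;
Generating a secure link on the command line:
> original request: link
> generated secure request: /prefix/md5/link
> generating the md5: echo -n 'linksecret' | openssl md5 -hex
Nginx configuration:
> secure_link_secret secret;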
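
For the simple method, the equivalent of the openssl command is a hex MD5 of the link followed by the secret. The Python sketch below builds the /prefix/hash/link path; the function name simple_secure_path is hypothetical, and the values p, test1.txt, and mysecret2 are only sample inputs chosen to match the /p/ location configured earlier.

```python
import hashlib


def simple_secure_path(prefix, link, secret):
    """Return /prefix/<md5 hex of link+secret>/link."""
    digest = hashlib.md5((link + secret).encode()).hexdigest()
    return f"/{prefix}/{digest}/{link}"


path = simple_secure_path("p", "test1.txt", "mysecret2")
print(path)
```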

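5.6 Creating new variables for complex business logic: the map module

Often it is hard to implement a feature by testing existing variables directly. The map module offers a way out: it evaluates one variable, or a combination of several, and sets a new variable according to the result. Testing the new variable then makes quite complex business logic possible.

Function: create a new variable from existing ones with a syntax similar to switch {case: ... default: ...}, opening up more possibilities for other modules that act on variable values.
Module: ngx_http_map_module, compiled into Nginx by default; disable it with --without-http_map_module.

Directives of the map module: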

Syntax:   map string $variable {...}
Default:  --
Context:  http

Syntax:   map_hash_bucket_size size;
Default:  map_hash_bucket_size 32|64|128;
Context:  http

Syntax:   map_hash_max_size size;
Default:  map_hash_max_size 2048;
Context:  http

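Source value:
> a string
> one or more variables
> a combination of variables and strings
case rules:
> exact string match
> with the hostnames parameter, a domain may use a leading * wildcard
> with the hostnames parameter, a domain may use a trailing * wildcard
> regular expression match with ~ or ~*; the latter ignores case
default rule:
> used when no rule matches
> if default is missing, the new variable is set to an empty string
Other:
> include improves readability
> volatile disables caching of the variable value

Question: given the map configuration below, what is the value of the name variable for each of the following requests?

* 'Host: map.taohui.org.cn'
* 'Host: map.tao123.org.cn'
* 'Host: map.taohui.pub'
* 'Host: map.taohui.tech'

nginx configuration: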

map $http_host $name {
    hostnames;
    default 0;
    ~map\.tao\w+\.org.cn 1;
    *.taohui.org.cn 2;
    map.taohui.tech 3;
    map.taohui.* 4;
}

map $http_user_agent $mobile {
    default 0;
    "~Opera Mini" 1;
}
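
The first map above can be worked through with a small Python model of the lookup order: exact string first, then the longest leading-wildcard name, then the longest trailing-wildcard name, then the regular expressions in order, and finally default. The function lookup_hostnames_map and its arguments are hypothetical; the sketch only covers the pattern forms used in this example.

```python
import re


def lookup_hostnames_map(value, exact, prefix_wild, suffix_wild, regexes, default=""):
    """Resolve a map with the hostnames parameter for one source value."""
    value = value.lower()
    if value in exact:
        return exact[value]
    # "*.example.com": compare the end of the name, longest pattern first.
    for pattern in sorted(prefix_wild, key=len, reverse=True):
        if value.endswith(pattern[1:]):
            return prefix_wild[pattern]
    # "example.*": compare the beginning of the name, longest pattern first.
    for pattern in sorted(suffix_wild, key=len, reverse=True):
        if value.startswith(pattern[:-1]):
            return suffix_wild[pattern]
    for pattern, result in regexes:
        if re.search(pattern, value):
            return result
    return default


def name_for(host):
    return lookup_hostnames_map(
        host,
        exact={"map.taohui.tech": "3"},
        prefix_wild={"*.taohui.org.cn": "2"},
        suffix_wild={"map.taohui.*": "4"},
        regexes=[(r"map\.tao\w+\.org.cn", "1")],
        default="0",
    )


for host in ["map.taohui.org.cn", "map.tao123.org.cn", "map.taohui.pub", "map.taohui.tech"]:
    print(host, "->", name_for(host))
```

The order in which the rules are written inside the map block does not matter for the first three kinds of rule; only regular expressions are tried in the order they appear.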

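5.7 A/B testing on a small share of users with variables: the split_clients module

Module: ngx_http_split_clients_module, compiled into Nginx by default; disable it with --without-http_split_clients_module
Function: create a new variable from existing ones, opening up more possibilities for A/B testing

How it works:
* the value of the source string is hashed with MurmurHash2 into a 32-bit integer, called hash
* the largest 32-bit unsigned integer, 2^32-1, is called max
* dividing the two, hash/max, gives a percentage, called percent
* the directive lists ranges built from percentages, e.g. 0-1%, 1%-5%, together with the value for each range; the new variable takes the value of the range that percent falls into
Rules:
Source value:
* a string
* one or more variables
* a combination of variables and strings
case rules:
* xx.xx%, with up to two decimal places; the percentages of all entries must not add up to more than 100%
* *, which matches the remaining percentage (100% minus the sum of all the entries above)

Directives of the split_clients module: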

Syntax:  split_clients string $variable { ... }
Default: --
Context: http

Is there anything wrong with the configuration below?

split_clients "${http_testcli}" $variant {
    0.51%       .one;
    20.0%       .two;
    50.5%       .three;
    40%         .four;
    *           "";
}

Yes: 0.51%, 20.0%, 50.5%, and 40% add up to 111.01%, which is more than 100%.
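
The range arithmetic can be sketched in Python as follows. The hash itself is left out (any 32-bit value can be passed in), and build_ranges and pick are hypothetical helpers that follow the description above: percentages are accumulated in hundredths of a percent and scaled to the 32-bit range.

```python
MAX_HASH = 0xFFFFFFFF  # largest 32-bit unsigned integer


def build_ranges(parts):
    """Turn [(percent, value), ...] into [(upper_bound, value), ...].

    An upper bound of None marks the catch-all "*" entry.
    """
    total = 0  # running sum in hundredths of a percent
    ranges = []
    for percent, value in parts:
        if percent == "*":
            ranges.append((None, value))
            continue
        total += round(float(percent.rstrip("%")) * 100)
        if total > 10000:
            raise ValueError(f"percentages add up to {total / 100:.2f}%, more than 100%")
        ranges.append((total * MAX_HASH // 10000, value))
    return ranges


def pick(hash_value, ranges):
    """Return the value of the first range the hash falls into."""
    for upper, value in ranges:
        if upper is None or hash_value < upper:
            return value
    return ""


from_document = [("0.51%", ".one"), ("20.0%", ".two"), ("50.5%", ".three"),
                 ("40%", ".four"), ("*", "")]
try:
    build_ranges(from_document)
    error = None
except ValueError as exc:
    error = str(exc)
print(error)

# Dropping the 40% entry makes the configuration valid.
valid = build_ranges([("0.51%", ".one"), ("20.0%", ".two"), ("50.5%", ".three"), ("*", "")])
print(repr(pick(0, valid)), repr(pick(MAX_HASH, valid)))
```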

5.8 Creating new variables by matching IP address ranges: the geo module

Directives of the geo module:

Syntax:  geo [$address] $variable {...}
Default: --
Context: http

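Function: create a new variable based on the IP address
Module: ngx_http_geo_module, compiled into nginx by default; disable it with --without-http_geo_module
Rules:
> if $address is omitted after geo, the $remote_addr variable is used as the IP address
> matching of the entries inside {}: the longest match wins
  * an IP range is defined by an address and a subnet mask; when the IP address falls inside the range, the new variable takes the value that follows
  * default sets the value of the new variable when none of the ranges match
  * proxy names trusted addresses (compare the realip module); when the request comes from one of them, the address used for the lookup is the last IP address in the X-Forwarded-For header
  * proxy_recursive enables recursive address search
  * include improves readability
  * delete removes the specified network

geo module example: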

geo $country {
    default ZZ;
    #include conf/geo.conf;
    proxy  116.62.160.193;

    127.0.0.0/24 US;
    127.0.0.1/32 RU;
    10.1.0.0/16  RU;
    192.168.1.0/24 UK;
}

Question: what is the value of the country variable for each of the following commands? (The client address is the proxy address.)

curl -H 'X-Forwarded-For: 10.1.0.0,127.0.0.2' geo.taohui.tech
curl -H 'X-Forwarded-For: 10.1.0.0,127.0.0.1' geo.taohui.tech
curl -H 'X-Forwarded-For: 10.1.0.0,127.0.0.1,1.2.3.4' geo.taohui.tech
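
The longest-match rule can be checked with Python's ipaddress module. This sketch assumes the requests arrive from the trusted proxy and that proxy_recursive is not set, so the lookup uses the last address in X-Forwarded-For; geo_lookup and lookup_address are hypothetical helper names.

```python
import ipaddress

# The networks from the geo block above.
GEO = {
    "127.0.0.0/24": "US",
    "127.0.0.1/32": "RU",
    "10.1.0.0/16": "RU",
    "192.168.1.0/24": "UK",
}


def geo_lookup(address, table=GEO, default="ZZ"):
    """Return the value of the most specific network containing the address."""
    ip = ipaddress.ip_address(address)
    best = None
    for cidr, value in table.items():
        network = ipaddress.ip_network(cidr)
        if ip in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, value)
    return best[1] if best else default


def lookup_address(x_forwarded_for):
    """Last address of X-Forwarded-For (trusted proxy, no proxy_recursive)."""
    return x_forwarded_for.split(",")[-1].strip()


for header in ["10.1.0.0,127.0.0.2", "10.1.0.0,127.0.0.1", "10.1.0.0,127.0.0.1,1.2.3.4"]:
    print(header, "->", geo_lookup(lookup_address(header)))
```

127.0.0.1 is covered by both 127.0.0.0/24 and 127.0.0.1/32; the /32 entry is more specific and wins.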

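5.9 Getting the user's geographic location with variables: the geoip module

Deriving variables from the client address with the MaxMind database: the geoip module
Function: create new variables based on the IP address
Module: ngx_http_geoip_module; not compiled into nginx by default, enable it with --with-http_geoip_module
Steps:
> install MaxMind's geoip C library (https://dev.maxmind.com/geoip/legacy/downloadable)
> build nginx with the --with-http_geoip_module option
> download the binary address database from MaxMind
> configure nginx.conf with the geoip_country or geoip_city directive
> run (or upgrade) nginx

The geoip_country directive: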

Syntax:  geoip_country file;
Default: --
Context: http

Syntax:  geoip_proxy address | CIDR;
Default: --
Context: http

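Variables provided by geoip_country:
> $geoip_country_code: two-letter country code, e.g. CN or US
> $geoip_country_code3: three-letter country code, e.g. CHN or USA
> $geoip_country_name: country name, e.g. "China" or "United States"

The geoip_city directive: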

Syntax:  geoip_city file;
Default: --
Context: http

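Variables provided by the geoip_city directive:

> $geoip_latitude: latitude
> $geoip_longitude: longitude
> $geoip_city_continent_code: the continent, e.g. EU or AS
> overlapping with the variables produced by `geoip_country`:
    * $geoip_city_country_code: two-letter country code, e.g. CN or US
    * $geoip_city_country_code3: three-letter country code, e.g. CHN or USA
    * $geoip_city_country_name: country name, e.g. "China" or "United States"
> $geoip_region: code of the state or province, e.g. 02
> $geoip_region_name: name of the state or province, e.g. Zhejiang or Saint Petersburg
> $geoip_city: city name
> $geoip_postal_code: postal code
> $geoip_area_code: telephone area code, used only in the US, e.g. 408
> $geoip_dma_code:  DMA code, used only in the US, e.g. 807

Configuring GeoIP in nginx.conf: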

geoip_country   /usr/local/share/GeoIP/GeoIP.dat;
geoip_city      /usr/local/share/GeoIP/GeoLiteCity.dat;
geoip_proxy     116.62.160.193/32;
geoip_proxy_recursive on;

server {
    server_name geoip.taohui.tech;
    error_log logs/myerror.log info;
    location / {
        keepalive_requests 2;
        keepalive_timeout 75s 20;
        return 200 'country:$geoip_country_code,$geoip_country_code3,$geoip_country_name
country from city:$geoip_city_country_code,$geoip_city_country_code3,$geoip_city_country_name
city: $geoip_area_code,$geoip_city_continent_code,$geoip_dma_code
$geoip_latitude,$geoip_longitude,$geoip_region,$geoip_region_name,$geoip_city,$geoip_postal_code';
    }
}

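Testing: from http://www.goubanjia.com

6. Using keepalive with clients to make connections more efficient

Directives that control keepalive behavior toward clients (this means HTTP keepalive, not TCP keepalive)
Function: several HTTP requests reuse one TCP connection, which

* reduces the number of handshakes
* lowers server resource consumption by reducing the number of concurrent connections
* lessens the impact of TCP congestion control

Protocol:
> Connection header: close or keep-alive; the former closes the connection once the request has been handled, the latter reuses the connection for the next request
> Keep-Alive header: its value is timeout=n, with n in seconds, telling the client that the connection is kept for at least n seconds

Directives: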

Syntax:  keepalive_disable none | browser ...;
Default: keepalive_disable msie6;
Context: http,server,location

Syntax:  keepalive_requests number;
Default: keepalive_requests 100; (1000 since nginx 1.19.10)
Context: http,server,location

Syntax:  keepalive_timeout timeout [header_timeout];
Default: keepalive_timeout 75s;
Context: http,server,location

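A Connection: keep-alive header in the request means the client asks for a persistent connection; in the response it means the server supports persistent connections. HTTP/1.1 uses persistent connections by default, so Connection: keep-alive adds nothing there. HTTP/1.0 does not use persistent connections by default, so an HTTP/1.0 client has to send Connection: keep-alive to ask for one.

Questions:

(1) How long is a persistent connection usually kept, and when do client and server decide to close it? Is the close always initiated by the client? Does the server close the connection after receiving no data for a long time, or does the browser close it once the page has been rendered? For example, how many persistent connections are opened in total when loading a page (with CSS style sheets and other JS requests)?

Whichever side times out first closes the connection. For example, Nginx's built-in keepalive_timeout defaults to 75 seconds (the nginx.conf shipped with Nginx sets it to 65), and it can be changed in the configuration file.
When loading a page, Chrome currently opens at most 6 connections per host by default.

(2) How is Proxy-Connection: keep-alive produced?
Browsers do not send Proxy-Connection: keep-alive by default, unless a forward proxy has been enabled with a tool such as SwitchyOmega. Once the browser uses the Proxy-Connection header it no longer uses the Connection header; only one of the two is used.

(3) How does Proxy-Connection get added to the headers? And how do old and new proxy servers handle the Proxy-Connection header?

1. Every programming framework has a way to add headers: Nginx has the add_header directive, and JS, Python's requests library, and others have their own methods.
2. An old proxy that does not recognize Proxy-Connection simply forwards it; a proxy that does recognize it replaces Proxy-Connection with Connection and establishes a persistent connection to the upstream.

(4) An app sends requests to the backend over HTTP persistent connections, with 3 proxy servers in between. Occasionally a request returns the normal status code 200 but no data, and the backend has no record of the result. What causes this, and how can it be solved?

Do you mean the status code is 200 but there is no body? Check access.log on the proxy servers in between to see whether all 3 proxies saw no body.
The response's Content-Length tells you; with Nginx you can look at the $body_bytes_sent variable.

(5) The browser knows for certain when the user has configured a proxy server. A reverse proxy such as nginx, however, is deployed on some other server, so the browser is unaware of it and does not send Proxy-Connection.

1. As a reverse proxy, Nginx usually does not need to carry the Proxy-Connection header, because it faces the company's own servers and it is clear whether they support persistent connections.
2. If Nginx connects to an external service, the connection properties toward the upstream have to be configured by hand with directives such as proxy_set_header.
3. The current Nginx framework does not process the Proxy-Connection header.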

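Appendix 1 Regular expressions in Nginx

Metacharacters

Code	Description
.	matches any character except a newline
\w	matches a letter, digit, or underscore
\s	matches any whitespace character
\d	matches a digit
\b	matches the beginning or end of a word
^	matches the beginning of the string
$	matches the end of the string

Repetition

Code	Description
*	repeat zero or more times
+	repeat one or more times
?	repeat zero or one time
{n}	repeat n times
{n,}	repeat n or more times
{n,m}	repeat n to m times

Other

Code	Description
\	escape: removes the special meaning of a metacharacter
()	grouping and capturing

Example

Original URL: /admin/website/article/35/change/uploads/party/5.jpg
Rewritten URL: /static/uploads/party/5.jpg

Regular expression matching the original URL: /^\/admin\/website\/article\/(\d+)\/change\/uploads\/(\w+)\/(\w+)\.(png|jpg|gif|jpeg|bmp)$/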

rewrite ^/admin/website/article/(\d+)/change/uploads/(\w+)\/(\w+)\.(png|jpg|gif|jpeg|bmp)$ /static/uploads/$2/$3.$4 last;

Verifying a regular expression with pcre2test

$ wget https://github.com/PhilipHazel/pcre2/releases/download/pcre2-10.39/pcre2-10.39.tar.gz
$ tar -xzf pcre2-10.39.tar.gz
$ cd pcre2-10.39
$ ./configure --prefix=/usr/local/pcre
# if configure reports errors, install the build dependencies
$ yum -y install gcc gcc-c++ autoconf automake make
$ yum -y install zlib zlib-devel openssl openssl-devel pcre pcre-devel
# rerun the previous step, then build and install
$ make && make install
# check the version
$ pcre2-config --version

Using pcre2test:

$ pcre2test
PCRE version 8.43 2019-02-23

  re> /^\/admin\/website\/article\/(\d+)\/change\/uploads\/(\w+)\/(\w+)\.(png|jpg|gif|jpeg|bmp)$/
data> /admin/website/article/35/change/uploads/party/5.jpg
 0: /admin/website/article/35/change/uploads/party/5.jpg
 1: 35
 2: party
 3: 5
 4: jpg
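
The same check can be repeated in Python, whose re module accepts this particular pattern unchanged. The sketch also applies the substitution that the rewrite directive above performs; the function name rewrite is chosen for this example only, and Python's regex engine is not PCRE, so this is a convenience check and not a replacement for pcre2test.

```python
import re

PATTERN = re.compile(
    r"^/admin/website/article/(\d+)/change/uploads/(\w+)/(\w+)\.(png|jpg|gif|jpeg|bmp)$"
)


def rewrite(uri):
    """Apply the substitution from the rewrite directive; other URIs pass through."""
    return PATTERN.sub(r"/static/uploads/\2/\3.\4", uri)


original = "/admin/website/article/35/change/uploads/party/5.jpg"
print(PATTERN.match(original).groups())
print(rewrite(original))
```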

When using the HTTP image filter module, the build fails with: error: the HTTP image filter module requires the GD library.

$ yum install gd gd-devel -y

This article was published on 0001-01-01 and last modified on 0001-01-01.
