后浪笔记一零二四

$ docker run -d --name nginx-1 nginx
$ docker exec -it nginx-1 /bin/bash
root@e8fbf25d12b7:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
5: eth0@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever
root@e8fbf25d12b7:/# ip route
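default via 172.17.0.1 dev eth0
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.0.2

Reading the fields: "via 172.17.0.1" means the gateway is 172.17.0.1; "proto kernel" means the kernel created this route automatically; "scope link" means the network is directly connected, that is, the gateway address is 0.0.0.0 and no gateway is needed.

The first line of the ip route output shows that eth0 is the default route device in this container. The second line shows that all traffic to the 172.17.0.0/16 subnet is handed to eth0.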
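
To make the two routes concrete, here is a minimal sketch of the lookup the kernel performs, assuming a plain longest-prefix match over just these two entries. The names `ROUTES`, `lookup`, and `next_hop` are made up for illustration, and 192.0.2.10 is a hypothetical off-subnet address that does not appear in the setup above.

```python
import ipaddress

# Routing table copied from the container's `ip route` output.
# gateway=None marks a directly connected (scope link) route.
ROUTES = [
    {"dest": ipaddress.ip_network("0.0.0.0/0"), "gateway": "172.17.0.1", "dev": "eth0"},
    {"dest": ipaddress.ip_network("172.17.0.0/16"), "gateway": None, "dev": "eth0"},
]


def lookup(dst):
    """Return the most specific matching route (longest-prefix match)."""
    addr = ipaddress.ip_address(dst)
    matches = [r for r in ROUTES if addr in r["dest"]]
    return max(matches, key=lambda r: r["dest"].prefixlen)


def next_hop(dst):
    """Address to resolve via ARP: the gateway, or dst itself on a direct route."""
    route = lookup(dst)
    return route["gateway"] or dst


# Same subnet: the direct route wins, so the container ARPs for the peer itself.
print(next_hop("172.17.0.3"))
# Hypothetical off-subnet address: only the default route matches.
print(next_hop("192.0.2.10"))
```

This is why a ping to another container on 172.17.0.0/16 never touches the gateway: the /16 route is more specific than the default route, so the frame is addressed directly to the peer's MAC.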

On the host:

$ ip a
...
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:b2:76:b8:f2 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:b2ff:fe76:b8f2/64 scope link 
       valid_lft forever preferred_lft forever
4: br-9c8b000a0c11: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:c2:37:8a:33 brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 brd 172.18.255.255 scope global br-9c8b000a0c11
       valid_lft forever preferred_lft forever
6: vethd4be8d9@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether aa:28:0c:3a:c0:54 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::a828:cff:fe3a:c054/64 scope link 
       valid_lft forever preferred_lft forever

$ brctl show
bridge name	bridge id		STP enabled	interfaces
br-9c8b000a0c11		8000.0242c2378a33	no
docker0		8000.0242b276b8f2	no		vethd4be8d9

In the output of ip a you can see that the host-side end of nginx-1's veth pair shows up on the host as a virtual NIC named vethd4be8d9. The output of brctl show also shows that this NIC is "plugged into" docker0.
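
One way to confirm which host-side veth belongs to which container is the `@ifN` suffix that ip a prints: it records the interface index of the peer end. The sketch below parses the two header lines from the outputs in this post and checks that they reference each other. The helper names `parse_header` and `is_veth_pair` are hypothetical, and the regular expression only handles the `N: name@ifM:` form seen here.

```python
import re

# Header lines copied from the container and host `ip a` outputs.
CONTAINER_LINE = "5: eth0@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500"
HOST_LINE = "6: vethd4be8d9@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500"

# "<index>: <name>@if<peer index>:"
HEADER = re.compile(r"^(\d+):\s+([^@:\s]+)@if(\d+):")


def parse_header(line):
    """Extract interface index, name, and peer index from an `ip a` header line."""
    m = HEADER.match(line)
    if m is None:
        raise ValueError("not a veth-style header line: " + line)
    return {"index": int(m.group(1)), "name": m.group(2), "peer_index": int(m.group(3))}


def is_veth_pair(a, b):
    """Two ends form a pair when each one's peer index is the other's index."""
    return a["peer_index"] == b["index"] and b["peer_index"] == a["index"]


container_if = parse_header(CONTAINER_LINE)
host_if = parse_header(HOST_LINE)
print(container_if["name"], host_if["name"], is_veth_pair(container_if, host_if))
```

Here eth0 has index 5 and points at index 6, while vethd4be8d9 has index 6 and points at index 5, so the two are the ends of one veth pair.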

Now, if we start another Docker container on this host, say nginx-2:

$ docker run -d --name nginx-2 nginx
$ brctl show
bridge name	bridge id		STP enabled	interfaces
br-9c8b000a0c11		8000.0242c2378a33	no		
docker0		8000.0242b276b8f2	no		vethd3f80d9
							vethd4be8d9

You will find that a new virtual NIC named vethd3f80d9 has also been "plugged into" the docker0 bridge.

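The flow by which different containers on the same host communicate through the docker0 bridge is shown in the figure below:

[Figure: pod1-pod2-icmp]

When nginx-1 pings nginx-2, the iptables debug (TRACE) log looks like the three steps listed further down.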

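The other end of nginx-1's eth0, vethd4be8d9, is plugged into the docker0 bridge. Once a virtual NIC is "plugged into" a bridge, it becomes a "slave device" of that bridge. A slave device is "stripped" of the right to call the network protocol stack to process packets and is "demoted" to a port on the bridge. That is why the source MAC address is not the MAC address of vethd4be8d9 but that of eth0. The same holds for nginx-2.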

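Before sending the ICMP packet, nginx-1 first sends an ARP request through the docker0 bridge to obtain the MAC address that corresponds to nginx-2's IP address, 172.17.0.3. The network protocol stack of nginx-2, which sits on the same docker0 bridge, receives this ARP request and replies to nginx-1 with the MAC address for 172.17.0.3. This exchange also updates the CAM table of the docker0 bridge, that is, the port-to-MAC mapping that a switch maintains through "MAC address learning". With the destination MAC address in hand, the eth0 NIC of nginx-1 can send the ICMP packet.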
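
The ARP exchange and the MAC learning it triggers can be modeled with a toy learning bridge. This is a simplified sketch, not how the Linux bridge is implemented: it ignores entry aging, VLANs, and STP, and the class name `LearningBridge` is made up. The port names and MAC addresses are the ones that appear in this post's outputs and trace log.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"
NGINX1_MAC = "02:42:ac:11:00:02"  # eth0 inside nginx-1
NGINX2_MAC = "02:42:ac:11:00:03"  # eth0 inside nginx-2


class LearningBridge:
    """Toy model of a bridge with a CAM table (MAC address -> port)."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.cam = {}

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the sender's port, then return the ports the frame leaves on."""
        self.cam[src_mac] = in_port  # MAC address learning
        out_port = self.cam.get(dst_mac)
        if dst_mac == BROADCAST or out_port is None:
            # Broadcast or unknown destination: flood to every other port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination is on the arrival port, nothing to forward
        return [out_port]


bridge = LearningBridge(["vethd4be8d9", "vethd3f80d9"])

# 1. ARP request from nginx-1 is a broadcast, so it is flooded.
print(bridge.receive("vethd4be8d9", NGINX1_MAC, BROADCAST))
# 2. ARP reply from nginx-2 is unicast; nginx-1's port is already learned.
print(bridge.receive("vethd3f80d9", NGINX2_MAC, NGINX1_MAC))
# 3. ICMP echo request: nginx-2's port was learned from the ARP reply.
print(bridge.receive("vethd4be8d9", NGINX1_MAC, NGINX2_MAC))
```

After the ARP round trip the table holds both MAC addresses, so the ICMP frame goes out of exactly one port instead of being flooded.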

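1. The ICMP packet is sent to the docker0 bridge
----- In the MAC= field the destination MAC comes first, then the source MAC, then the EtherType (08:00 for IPv4): 02:42:ac:11:00:03 is nginx-2's MAC address and 02:42:ac:11:00:02 is nginx-1's MAC address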
TRACE: raw:PREROUTING:policy:4  IN=docker0 OUT= PHYSIN=vethd4be8d9 MAC=02:42:ac:11:00:03:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=172.17.0.3 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=53716 DF PROTO=ICMP TYPE=8 CODE=0 ID=433 SEQ=1
TRACE: mangle:PREROUTING:rule:1  
TRACE: mangle:PREROUTING_direct:return:1  
TRACE: mangle:PREROUTING:rule:2  
TRACE: mangle:PREROUTING_ZONES:rule:1  
TRACE: mangle:PRE_docker:rule:1  
TRACE: mangle:PRE_docker_log:return:1  
TRACE: mangle:PRE_docker:rule:2  
TRACE: mangle:PRE_docker_deny:return:1  
TRACE: mangle:PRE_docker:rule:3  
TRACE: mangle:PRE_docker_allow:return:1  
TRACE: mangle:PRE_docker:return:4  
TRACE: mangle:PREROUTING:policy:3  
TRACE: nat:PREROUTING:rule:1  
TRACE: nat:PREROUTING_direct:return:1  
TRACE: nat:PREROUTING:rule:2  
TRACE: nat:PREROUTING_ZONES:rule:1  
TRACE: nat:PRE_docker:rule:1  
TRACE: nat:PRE_docker_log:return:1  
TRACE: nat:PRE_docker:rule:2  
TRACE: nat:PRE_docker_deny:return:1  
TRACE: nat:PRE_docker:rule:3  
TRACE: nat:PRE_docker_allow:return:1  
TRACE: nat:PRE_docker:return:4  
TRACE: nat:PREROUTING:policy:4  
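
The MAC= field in the first TRACE line of step 1 packs three values into one colon-separated string. As a small sketch, the hypothetical helper `parse_mac_field` below splits it into destination MAC, source MAC, and EtherType, assuming the 14-octet Ethernet layout seen in this log.

```python
def parse_mac_field(value):
    """Split a 14-octet MAC= field into destination, source, and EtherType."""
    octets = value.split(":")
    if len(octets) != 14:
        raise ValueError("expected 14 octets, got %d" % len(octets))
    return {
        "dst": ":".join(octets[0:6]),    # first 6 octets: destination MAC
        "src": ":".join(octets[6:12]),   # next 6 octets: source MAC
        "ethertype": "0x" + "".join(octets[12:14]),  # last 2 octets
    }


# Value copied from the raw:PREROUTING trace line in step 1.
FIELD = "02:42:ac:11:00:03:02:42:ac:11:00:02:08:00"
parsed = parse_mac_field(FIELD)
print(parsed["src"], "->", parsed["dst"], parsed["ethertype"])
```

The source address is the MAC of eth0 inside nginx-1, not the MAC of vethd4be8d9 (aa:28:0c:3a:c0:54 in the host's ip a output), which matches the point that a bridge port does not originate frames itself.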

2. The destination MAC address is looked up in docker0's CAM table, which shows that nginx-2's port on docker0 is vethd3f80d9
TRACE: mangle:FORWARD:rule:1 IN=docker0 OUT=docker0 PHYSIN=vethd4be8d9 PHYSOUT=vethd3f80d9 MAC=02:42:ac:11:00:03:02:42:ac:11:00:02:08:00 SRC=172.17.0.2 DST=172.17.0.3 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=53716 DF PROTO=ICMP TYPE=8 CODE=0 ID=433 SEQ=1 
TRACE: mangle:FORWARD_direct:return:1 
TRACE: mangle:FORWARD:policy:2 
TRACE: filter:FORWARD:rule:1 
TRACE: filter:DOCKER-USER:return:1 
TRACE: filter:FORWARD:rule:2 
TRACE: filter:DOCKER-ISOLATION-STAGE-1:return:3 
TRACE: filter:FORWARD:rule:4 
TRACE: filter:DOCKER:return:1 
TRACE: filter:FORWARD:rule:6 
TRACE: security:FORWARD:rule:1 
TRACE: security:FORWARD_direct:return:1 
TRACE: security:FORWARD:policy:2 

3. The ICMP packet is sent out through port vethd3f80d9
TRACE: mangle:POSTROUTING:rule:1 IN= OUT=docker0 PHYSIN=vethd4be8d9 PHYSOUT=vethd3f80d9 SRC=172.17.0.2 DST=172.17.0.3 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=53716 DF PROTO=ICMP TYPE=8 CODE=0 ID=433 SEQ=1 
TRACE: mangle:POSTROUTING_direct:return:1 
TRACE: mangle:POSTROUTING:policy:2 
TRACE: nat:POSTROUTING:rule:3 
TRACE: nat:POSTROUTING_direct:return:1 
TRACE: nat:POSTROUTING:rule:4 
TRACE: nat:POSTROUTING_ZONES:rule:1 
TRACE: nat:POST_docker:rule:1 
TRACE: nat:POST_docker_log:return:1 
TRACE: nat:POST_docker:rule:2 
TRACE: nat:POST_docker_deny:return:1 
TRACE: nat:POST_docker:rule:3 
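
The three steps above visit many user-defined chains, which hides the overall path. The sketch below reduces a trace to the ordered sequence of built-in table:chain hooks. It assumes the `TRACE: table:chain:type:rulenum` prefix format seen in this log, the function name `hook_order` is made up, and `SAMPLE` is a shortened excerpt of the lines above rather than the full log.

```python
import re

# "TRACE: <table>:<chain>:<rule|return|policy>:<number>"
TRACE_RE = re.compile(r"^TRACE:\s+(\w+):([\w-]+):(rule|return|policy):(\d+)")
BUILTIN_CHAINS = {"PREROUTING", "INPUT", "FORWARD", "OUTPUT", "POSTROUTING"}

# Shortened excerpt of the trace log from the three steps.
SAMPLE = """\
TRACE: raw:PREROUTING:policy:4  IN=docker0 OUT= PHYSIN=vethd4be8d9
TRACE: mangle:PREROUTING:rule:1
TRACE: mangle:PREROUTING_direct:return:1
TRACE: mangle:PREROUTING:rule:2
TRACE: nat:PREROUTING:rule:1
TRACE: mangle:FORWARD:rule:1 IN=docker0 OUT=docker0
TRACE: filter:FORWARD:rule:1
TRACE: filter:DOCKER-USER:return:1
TRACE: filter:DOCKER-ISOLATION-STAGE-1:return:3
TRACE: security:FORWARD:rule:1
TRACE: mangle:POSTROUTING:rule:1 IN= OUT=docker0
TRACE: nat:POSTROUTING:rule:3
"""


def hook_order(lines):
    """Return built-in (table, chain) pairs in first-seen order, collapsing repeats."""
    order = []
    for line in lines:
        m = TRACE_RE.match(line.strip())
        if m is None:
            continue
        table, chain = m.group(1), m.group(2)
        if chain not in BUILTIN_CHAINS:
            continue  # skip user-defined chains such as DOCKER-USER
        if not order or order[-1] != (table, chain):
            order.append((table, chain))
    return order


path = hook_order(SAMPLE.splitlines())
print(" -> ".join(t + ":" + c for t, c in path))
```

For this excerpt the packet passes PREROUTING, FORWARD, and POSTROUTING and never reaches INPUT or OUTPUT, which is consistent with a frame that docker0 forwards from one port to another rather than delivering to the host itself.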

Published on 0001-01-01; last modified on 0001-01-01.
