Multi-Tenant Kubernetes Network Management with Kube-OVN

SuKai June 5, 2022

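Kubernetes is becoming the base platform for a growing number of data centers, and we would like it to meet some basic requirements of a virtualization platform, such as a flexible, multi-tenant software-defined network (SDN). One of my projects at work runs on Kubernetes, so I am considering KubeVirt to manage virtual machines and Kube-OVN to isolate tenant networks. Let's look at how to manage networking with Kube-OVN.

Basic Concepts

Underlay/Overlay networks

The Underlay network is the traditional IT infrastructure network: the underlying physical network built from switches, routers, load balancers, and similar devices. An Overlay network is a virtual, logical network built on top of the Underlay network with network virtualization.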

OVS/OVN

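Open vSwitch (OVS) is a multilayer software switch. OVS is only a single-host program and has no knowledge of the cluster. Open Virtual Network (OVN) provides a centralized controller for OVS and orchestrates the whole network from the cluster's point of view.

Once you start using Kubernetes, you find that its networking lacks SDN capabilities: common features such as VPC, Subnet, NAT, Route, and SecurityGroup are missing. Kube-OVN builds on OVN to give Kubernetes network orchestration.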

CNI

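The Container Network Interface (CNI) is a container networking specification proposed by CoreOS. It mainly covers allocating network resources when a container is created and releasing them when the container is deleted. CNI makes the network layer pluggable: as long as a plugin follows the CNI specification, a container management platform can invoke the plugin executable to provide networking. The Kubernetes network model adopts the CNI specification.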

macvlan

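macvlan is a Linux kernel network virtualization technology that creates multiple virtual network interfaces from one host interface. Sub-interfaces are added to a parent interface backed by a physical NIC, and each sub-interface has its own MAC address and IP address. A container bound to a sub-interface can communicate directly with the physical network, which covers the case where a container must sit on the physical network. For example, a GitLab service run under Docker needs ports 80, 443, and 22, and these common ports often conflict with other services on the host; creating a network with the macvlan driver through the docker command gives the container its own MAC and IP address. The reference CNI plugins commonly installed with Kubernetes include macvlan, and configuring the macvlan CNI lets Kubernetes Pods use the Underlay network.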

Kube-OVN

The Kube-OVN plugin connects the Kubernetes container network to an OVS network managed by OVN. It provides VPC, router, switch, and subnet management.

Multus

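The Multus CNI plugin lets a Kubernetes Pod have multiple network interfaces. A container can join several networks at once, which suits applications such as Ceph that separate traffic across multiple networks.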

IPAM

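IP Address Management (IPAM) allocates and maintains IP addresses, DNS, gateway, and route information. While running, a CNI plugin calls the corresponding IPAM plugin, which returns the IP information to the main CNI plugin. IPAM plugins spare CNI plugins from reimplementing the same IP management code, and they make it possible to manage IPs centrally across multiple CNI plugins.

Requirements

1. Isolate tenant networks with VPCs

2. Reach the external network through SNAT on a NAT gateway

3. Expose ports to the external network through DNAT on a NAT gateway

Installation

Installing Kube-OVN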

curl -O https://raw.githubusercontent.com/kubeovn/kube-ovn/release-1.10/yamls/crd.yaml
curl -O https://raw.githubusercontent.com/kubeovn/kube-ovn/release-1.10/yamls/ovn.yaml
curl -O https://raw.githubusercontent.com/kubeovn/kube-ovn/release-1.10/yamls/kube-ovn.yaml
curl -O https://raw.githubusercontent.com/kubeovn/kube-ovn/master/charts/templates/kubeovn-crd.yaml
sed -i 's/\$addresses/<Node IP>/g' ovn.yaml
kubectl label node ubuntuserver1 kube-ovn/role=master
kubectl apply -f crd.yaml
kubectl apply -f kubeovn-crd.yaml
kubectl apply -f ovn.yaml
kubectl apply -f kube-ovn.yaml

wget https://raw.githubusercontent.com/kubeovn/kube-ovn/release-1.10/dist/images/kubectl-ko
mv kubectl-ko /usr/local/bin/kubectl-ko
chmod +x /usr/local/bin/kubectl-ko

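Installing Multus

The Multus DaemonSet does three things:

1. Copies the multus executable into /opt/cni/bin/

2. Generates the multus configuration file and the kubeconfig file used to access Kubernetes

3. Reconciles NetworkAttachmentDefinition custom resources, creating additional networks that Pods can attach to.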

curl -O https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/master/deployments/multus-daemonset-thick-plugin.yml
kubectl apply -f multus-daemonset-thick-plugin.yml

sukai@ubuntuserver1:~$ sudo ls -al /opt/cni/bin/
total 175112
drwxrwxr-x 2 root root     4096 Jun  4 02:56 .
drwxr-xr-x 3 root root     4096 Jun  2 13:36 ..
-rwxr-xr-x 1 root root  4159518 May 13  2020 bandwidth
-rwxr-xr-x 1 root root  4671647 May 13  2020 bridge
-rwxr-xr-x 1 root root 12124326 May 13  2020 dhcp
-rwxr-xr-x 1 root root  5945760 May 13  2020 firewall
-rwxr-xr-x 1 root root  3069556 May 13  2020 flannel
-rwxr-xr-x 1 root root  4174394 May 13  2020 host-device
-rwxr-xr-x 1 root root  3614480 May 13  2020 host-local
-rwxr-xr-x 1 root root  4314598 May 13  2020 ipvlan
-rwxr-xr-x 1 root root 64466944 Jun  4 06:56 kube-ovn
-rwxr-xr-x 1 root root  3472123 Jun  4 06:56 loopback
-rwxr-xr-x 1 root root  4216875 Jun  4 06:56 macvlan
-rwxr-xr-x 1 root root 42573403 Jun  4 06:54 multus
-rwxr-xr-x 1 root root  3924908 Jun  4 06:56 portmap
-rwxr-xr-x 1 root root  4590277 May 13  2020 ptp
-rwxr-xr-x 1 root root  3392826 May 13  2020 sbr
-rwxr-xr-x 1 root root  2885430 May 13  2020 static
-rwxr-xr-x 1 root root  3356587 May 13  2020 tuning
-rwxr-xr-x 1 root root  4314446 May 13  2020 vlan

sukai@ubuntuserver1:~$ sudo ls -al /etc/cni/net.d/00-multus.conf
-rw------- 1 root root 399 Jun  4 06:56 /etc/cni/net.d/00-multus.conf
sukai@ubuntuserver1:~$ sudo ls -al /etc/cni/net.d/multus.d/
total 12
drw------- 2 root root 4096 Jun  4 02:56 .
drwx------ 3 root root 4096 Jun  4 06:56 ..
-rw------- 1 root root 2819 Jun  4 06:54 multus.kubeconfig

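Network Management

Creating the Underlay attachment network

A VPC subnet reaches the external network through a NAT gateway, so the gateway container needs two NICs: one on the subnet and one on the external network. Here a Multus NetworkAttachmentDefinition creates the macvlan attachment network, while the container's primary NIC stays on the Kube-OVN network. In the definition below, the CNI type is macvlan, the macvlan mode is bridge, and IPAM is delegated to kube-ovn for address allocation.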

sukai@ubuntuserver1:~/kube-ovn/vpc$ cat nad-macvlan.yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ovn-vpc-external-network
  namespace: kube-system
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "macvlan",
      "master": "eno2",
      "mode": "bridge",
      "ipam": {
        "type": "kube-ovn",
        "server_socket": "/run/openvswitch/kube-ovn-daemon.sock",
        "provider": "ovn-vpc-external-network.kube-system"
      }
    }'
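
Because spec.config is a JSON document embedded in a YAML string, quoting mistakes are easy to make. As a rough sketch, the same manifest can be assembled in Python and the embedded JSON parsed back to confirm it is valid. The helper build_macvlan_nad is hypothetical, and the provider format "<name>.<namespace>" is an assumption read off the manifest above.

```python
import json


def build_macvlan_nad(name: str, namespace: str, master: str) -> dict:
    """Return a NetworkAttachmentDefinition manifest as a plain dict (hypothetical helper)."""
    cni_config = {
        "cniVersion": "0.3.0",
        "type": "macvlan",
        "master": master,  # host NIC that carries the macvlan sub-interfaces
        "mode": "bridge",
        "ipam": {
            "type": "kube-ovn",
            "server_socket": "/run/openvswitch/kube-ovn-daemon.sock",
            # Assumption: provider is "<NAD name>.<NAD namespace>", as in the manifest above.
            "provider": f"{name}.{namespace}",
        },
    }
    return {
        "apiVersion": "k8s.cni.cncf.io/v1",
        "kind": "NetworkAttachmentDefinition",
        "metadata": {"name": name, "namespace": namespace},
        # spec.config is a JSON document stored as a string.
        "spec": {"config": json.dumps(cni_config, indent=2)},
    }


nad = build_macvlan_nad("ovn-vpc-external-network", "kube-system", "eno2")
config = json.loads(nad["spec"]["config"])  # raises ValueError if the JSON is malformed
print(config["ipam"]["provider"])
```

The resulting dict can be dumped to YAML or JSON and passed to kubectl apply; the values here are the ones used in this article.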

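Configuring the NAT gateway

Enable the VPC NAT gateway feature, and set the Docker image used by the gateway container and the physical NIC (eno2) used to reach the external network.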

kind: ConfigMap
apiVersion: v1
metadata:
  name: ovn-vpc-nat-gw-config
  namespace: kube-system
data:
  image: 'kubeovn/vpc-nat-gateway:v1.10.0'
  enable-vpc-nat-gw: 'true'
  nic: eno2

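Creating the VPC

A Virtual Private Cloud (VPC) is a logically isolated network in the cloud. The manifest specifies the namespaces the VPC applies to and its staticRoutes: Pods created in namespace sukai263 belong to VPC mail263 by default, and the default route's next hop is 10.0.1.254, the internal address of the NAT gateway.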

kind: Vpc
apiVersion: kubeovn.io/v1
metadata:
  name: mail263
spec:
  namespaces:
  - sukai263
  staticRoutes:
    - cidr: 0.0.0.0/0
      nextHopIP: 10.0.1.254
      policy: policyDst

Creating subnets

Create a subnet named vmservers with CIDR 10.0.1.0/24, bound to namespace sukai263.

kind: Subnet
apiVersion: kubeovn.io/v1
metadata:
  name: vmservers
spec:
  vpc: mail263
  cidrBlock: 10.0.1.0/24
  gateway: 10.0.1.254
  protocol: IPv4
  namespaces:
    - sukai263
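
A typo in cidrBlock, gateway, or the VPC's nextHopIP is easy to miss until Pods lose connectivity. The following is a small sanity check using Python's standard ipaddress module; check_subnet is a hypothetical helper for illustration, not part of Kube-OVN.

```python
import ipaddress
from typing import List


def check_subnet(cidr: str, gateway: str, next_hops: List[str]) -> List[str]:
    """Return human-readable problems; an empty list means the values are consistent."""
    network = ipaddress.ip_network(cidr)
    problems = []
    if ipaddress.ip_address(gateway) not in network:
        problems.append(f"gateway {gateway} is outside {cidr}")
    for hop in next_hops:
        if ipaddress.ip_address(hop) not in network:
            problems.append(f"next hop {hop} is outside {cidr}")
    return problems


# Values from the Vpc and Subnet manifests in this article.
print(check_subnet("10.0.1.0/24", "10.0.1.254", ["10.0.1.254"]))
```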

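Create a subnet named ovn-vpc-external-network. The NAT gateway uses a subnet with this fixed name to reach the external network, and provider points to the macvlan network ovn-vpc-external-network created above.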

apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: ovn-vpc-external-network
spec:
  protocol: IPv4
  provider: ovn-vpc-external-network.kube-system
  cidrBlock: 172.16.3.0/24
  gateway: 172.16.3.2
  excludeIps:
  - 172.16.3.1..172.16.3.100

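Creating the gateway

Create the gateway natgw with internal address 10.0.1.254, subnet vmservers, and VPC mail263. Only the internal subnet is specified here; when the gateway container is created, the fixed-name subnet ovn-vpc-external-network is attached automatically as the external network.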

kind: VpcNatGateway
apiVersion: kubeovn.io/v1
metadata:
  name: natgw
spec:
  vpc: mail263
  subnet: vmservers
  lanIp: 10.0.1.254
  selector:
    - "kubernetes.io/hostname: ubuntuserver1"
    - "kubernetes.io/os: linux"

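Configuring NAT rules

SNAT (source address translation): when the internal network accesses the external network, the internal source address is translated to an external IP.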

---
kind: IptablesEIP
apiVersion: kubeovn.io/v1
metadata:
  name: eips01
spec:
  natGwDp: natgw  # crd VpcNatGateway name
  v4ip: 172.16.3.208   # specify ip in macvlan public subnet

---
kind: IptablesSnatRule
apiVersion: kubeovn.io/v1
metadata:
  name: snat01
spec:
  eip: eips01                      # eip name
  internalCIDR: 10.0.1.0/24     # vpc subnet cidr
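
To make the effect of the SNAT rule concrete, here is a toy model of the decision it encodes: packets whose source falls inside internalCIDR leave with the EIP as their source address. This is only a sketch of the behavior described above, not the gateway's actual implementation, and snat_source is a hypothetical name.

```python
import ipaddress


def snat_source(src_ip: str, internal_cidr: str, eip: str) -> str:
    """Return the source address a packet would carry after the SNAT rule is applied."""
    if ipaddress.ip_address(src_ip) in ipaddress.ip_network(internal_cidr):
        return eip  # source is inside the VPC subnet: rewrite it to the EIP
    return src_ip   # anything else is left untouched


# Values from the IptablesEIP and IptablesSnatRule manifests above.
print(snat_source("10.0.1.2", "10.0.1.0/24", "172.16.3.208"))
```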

DNAT (destination address translation)

Map port 8888 on external address 172.16.3.203 to port 80 on internal address 10.0.1.2.

---
kind: IptablesEIP
apiVersion: kubeovn.io/v1
metadata:
  name: eipd01
spec:
  natGwDp: natgw
  v4ip: 172.16.3.203
---
kind: IptablesDnatRule
apiVersion: kubeovn.io/v1
metadata:
  name: dnat01
spec:
  eip: eipd01               # eip name
  externalPort: '8888'
  internalIp: 10.0.1.2
  internalPort: '80'
  protocol: tcp
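
The DNAT rule is effectively a lookup from (external IP, external port, protocol) to (internal IP, internal port). The sketch below models that lookup in Python; DnatRule and translate are hypothetical names for illustration and do not reflect how the gateway programs its rules internally.

```python
from typing import List, NamedTuple, Optional, Tuple


class DnatRule(NamedTuple):
    eip: str
    external_port: int
    internal_ip: str
    internal_port: int
    protocol: str


def translate(rules: List[DnatRule], dst_ip: str, dst_port: int,
              protocol: str) -> Optional[Tuple[str, int]]:
    """Return the internal (ip, port) for an incoming packet, or None if no rule matches."""
    for rule in rules:
        if (rule.eip, rule.external_port, rule.protocol) == (dst_ip, dst_port, protocol):
            return rule.internal_ip, rule.internal_port
    return None


# The single rule defined by dnat01 above.
rules = [DnatRule("172.16.3.203", 8888, "10.0.1.2", 80, "tcp")]
print(translate(rules, "172.16.3.203", 8888, "tcp"))
```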

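Inspecting the gateway container

eth0 is on the VPC subnet and net1 is on the external network. This container handles routing and network address translation (SNAT and DNAT).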

sukai@ubuntuserver1:~$ kubectl -n kube-system exec -it vpc-nat-gw-natgw-969fdb9f5-vb27r -- /bin/bash
bash-5.1# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: net1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 46:43:48:79:5d:70 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.16.3.101/24 brd 172.16.3.255 scope global net1
       valid_lft forever preferred_lft forever
    inet 172.16.3.208/24 scope global secondary net1
       valid_lft forever preferred_lft forever
    inet 172.16.3.200/24 scope global secondary net1
       valid_lft forever preferred_lft forever
    inet 172.16.3.203/24 scope global secondary net1
       valid_lft forever preferred_lft forever
    inet6 fe80::4443:48ff:fe79:5d70/64 scope link
       valid_lft forever preferred_lft forever
549: eth0@if550: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
    link/ether 00:00:00:f9:6c:f2 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.1.254/24 brd 10.0.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::200:ff:fef9:6cf2/64 scope link
       valid_lft forever preferred_lft forever
bash-5.1# ip route
default via 172.16.3.2 dev net1
default via 10.0.1.254 dev eth0
10.0.1.0/24 dev eth0 proto kernel scope link src 10.0.1.254
172.16.3.0/24 dev net1 proto kernel scope link src 172.16.3.101
bash-5.1#

Creating a Pod

Create an nginx Pod in namespace sukai263.

apiVersion: v1
kind: Pod
metadata:
  name: vpc-nginx-2
  namespace: sukai263
spec:
  containers:
  - name: vpc-nginx
    image: nginx:alpine

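Inspecting the Pod

The Pod was assigned address 10.0.1.2 from subnet 10.0.1.0/24, and its gateway is 10.0.1.254. It can reach the external network, and nginx is reachable from outside at 172.16.3.203:8888.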

sukai@ubuntuserver1:~$ kubectl -n sukai263 get pods -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP         NODE            NOMINATED NODE   READINESS GATES
vpc-nginx     1/1     Running   0          16h   10.0.1.1   ubuntuserver1   <none>           <none>
vpc-nginx-2   1/1     Running   0          16h   10.0.1.2   ubuntuserver1   <none>           <none>
sukai@ubuntuserver1:~$ kubectl -n sukai263 exec -it vpc-nginx-2 -- sh
/ # ping -c 1 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=114 time=41.192 ms

--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 41.192/41.192/41.192 ms
/ # ip route
default via 10.0.1.254 dev eth0
10.0.1.0/24 dev eth0 scope link  src 10.0.1.2
/ #
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
553: eth0@if554: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1400 qdisc noqueue state UP
    link/ether 00:00:00:dd:7e:e9 brd ff:ff:ff:ff:ff:ff
    inet 10.0.1.2/24 brd 10.0.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::200:ff:fedd:7ee9/64 scope link
       valid_lft forever preferred_lft forever
/ #


[root@sukai ~]# curl http://172.16.3.203:8888
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
[root@sukai ~]#

Inspecting Kube-OVN state

Allocated IPs

sukai@ubuntuserver1:~$ kubectl get ips
NAME                                                                                V4IP           V6IP   MAC                 NODE            SUBNET
busybox.sukai2                                                                      172.16.3.201          00:00:00:BC:3D:13   ubuntuserver1   office-2
coredns-6d4b75cb6d-6mkss.kube-system                                                10.16.0.3             00:00:00:55:EC:51   ubuntuserver1   ovn-default
coredns-6d4b75cb6d-ct9vd.kube-system                                                10.16.0.2             00:00:00:7E:1E:52   ubuntuserver1   ovn-default
kube-ovn-pinger-58dft.kube-system                                                   10.16.0.4             00:00:00:6C:BA:B8   ubuntuserver1   ovn-default
node-ubuntuserver1                                                                  100.64.0.2            00:00:00:36:FB:7D   ubuntuserver1   join
vpc-nat-gw-natgw-969fdb9f5-vb27r.kube-system                                        10.0.1.254            00:00:00:F9:6C:F2   ubuntuserver1   vmservers
vpc-nat-gw-natgw-969fdb9f5-vb27r.kube-system.ovn-vpc-external-network.kube-system   172.16.3.101          00:00:00:8F:17:B9   ubuntuserver1   ovn-vpc-external-network
vpc-nat-gw-ngw-55f54bd8db-mp9ql.kube-system.ovn-vpc-external-network.kube-system    172.16.3.104          00:00:00:7B:46:19   ubuntuserver1   ovn-vpc-external-network
vpc-nginx-2.sukai263                                                                10.0.1.2              00:00:00:DD:7E:E9   ubuntuserver1   vmservers
vpc-nginx.sukai263                                                                  10.0.1.1              00:00:00:A4:25:27   ubuntuserver1   vmservers

VPCs and subnets

sukai@ubuntuserver1:~$ kubectl get vpc
NAME          STANDBY   SUBNETS                                                                 NAMESPACES
mail263       true      ["vmservers"]                                                           ["sukai263"]
ovn-cluster   true      ["join","ovn-default","ovn-vpc-external-network","office","office-2"]
sukai@ubuntuserver1:~$ kubectl get subnet
NAME                       PROVIDER                               VPC           PROTOCOL   CIDR            PRIVATE   NAT     DEFAULT   GATEWAYTYPE   V4USED   V4AVAILABLE   V6USED   V6AVAILABLE   EXCLUDEIPS
join                       ovn                                    ovn-cluster   IPv4       100.64.0.0/16   false     false   false     distributed   1        65532         0        0             ["100.64.0.1"]
office                     ovn                                    ovn-cluster   IPv4       172.16.4.0/24   false     false   false     distributed   0        154           0        0             ["172.16.4.1..172.16.4.100"]
office-2                   ovn                                    ovn-cluster   IPv4       172.16.3.0/24   false     false   false     distributed   1        53            0        0             ["172.16.3.1..172.16.3.200"]
ovn-default                ovn                                    ovn-cluster   IPv4       10.16.0.0/16    false     true    true      distributed   3        65530         0        0             ["10.16.0.1"]
ovn-vpc-external-network   ovn-vpc-external-network.kube-system   ovn-cluster   IPv4       172.16.3.0/24   false     false   false     distributed   4        150           0        0             ["172.16.3.1..172.16.3.100"]
vmservers                  ovn                                    mail263       IPv4       10.0.1.0/24     false     false   false     distributed   3        250           0        0             ["10.0.1.254"]
sukai@ubuntuserver1:~$
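
The V4AVAILABLE column can be reproduced by hand: take the usable host addresses of the CIDR, remove everything listed in EXCLUDEIPS, and subtract V4USED. The sketch below is one reading of the numbers in the table, not Kube-OVN's own code; expand_exclude and v4_available are hypothetical helpers, and the sketch assumes used addresses never overlap with excluded ones.

```python
import ipaddress
from typing import Iterable, Set


def expand_exclude(entries: Iterable[str]) -> Set[int]:
    """Expand entries such as "10.0.1.254" or "172.16.3.1..172.16.3.100" into integer addresses."""
    excluded: Set[int] = set()
    for entry in entries:
        start, _, end = entry.partition("..")
        first = int(ipaddress.ip_address(start))
        last = int(ipaddress.ip_address(end or start))  # single address when no ".." is present
        excluded.update(range(first, last + 1))
    return excluded


def v4_available(cidr: str, exclude_ips: Iterable[str], used: int) -> int:
    """Usable hosts minus excluded addresses minus addresses already in use."""
    hosts = {int(h) for h in ipaddress.ip_network(cidr).hosts()}
    return len(hosts - expand_exclude(exclude_ips)) - used


# Rows taken from the `kubectl get subnet` output above.
print(v4_available("172.16.3.0/24", ["172.16.3.1..172.16.3.100"], used=4))
print(v4_available("10.0.1.0/24", ["10.0.1.254"], used=3))
```

With the rows above, this yields 150 for ovn-vpc-external-network and 250 for vmservers, matching the table.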

OVN information

The output shows the logical routers, switches, and ports.

sukai@ubuntuserver1:~$ kubectl ko nbctl show
switch 1343336c-ea3a-4ba0-ae7b-7c2fbf2570bb (vmservers)
    port vpc-nginx-2.sukai263
        addresses: ["00:00:00:DD:7E:E9 10.0.1.2"]
    port vpc-nat-gw-natgw-969fdb9f5-vb27r.kube-system
        addresses: ["00:00:00:F9:6C:F2 10.0.1.254"]
    port vpc-nginx.sukai263
        addresses: ["00:00:00:A4:25:27 10.0.1.1"]
    port vmservers-mail263
        type: router
        router-port: mail263-vmservers
switch 5084db34-7721-4fc1-b4c6-1c7bdf618a81 (ovn-default)
    port coredns-6d4b75cb6d-6mkss.kube-system
        addresses: ["00:00:00:55:EC:51 10.16.0.3"]
    port coredns-6d4b75cb6d-ct9vd.kube-system
        addresses: ["00:00:00:7E:1E:52 10.16.0.2"]
    port kube-ovn-pinger-58dft.kube-system
        addresses: ["00:00:00:6C:BA:B8 10.16.0.4"]
    port ovn-default-ovn-cluster
        type: router
        router-port: ovn-cluster-ovn-default
switch 804c73f2-7fd8-4ddc-a45b-479272368eb7 (office-2)
    port localnet.office-2
        type: localnet
        addresses: ["unknown"]
    port busybox.sukai2
        addresses: ["00:00:00:BC:3D:13 172.16.3.201"]
switch 09b2e08a-3bab-48da-9232-4b80e1e45a01 (join)
    port join-ovn-cluster
        type: router
        router-port: ovn-cluster-join
    port node-ubuntuserver1
        addresses: ["00:00:00:36:FB:7D 100.64.0.2"]
switch fa4131d7-61dd-4c62-8beb-ca647f035f64 (office)
    port localnet.office
        type: localnet
        addresses: ["unknown"]
router 2d42fd6c-1565-4fa3-be71-960eac2adfaf (mail263)
    port mail263-vmservers
        mac: "00:00:00:6A:33:EA"
        networks: ["10.0.1.254/24"]
router 914d28c9-b604-4db9-a505-b45080264ba7 (ovn-cluster)
    port ovn-cluster-join
        mac: "00:00:00:D7:D3:57"
        networks: ["100.64.0.1/16"]
    port ovn-cluster-ovn-default
        mac: "00:00:00:CB:F6:B0"
        networks: ["10.16.0.1/16"]
sukai@ubuntuserver1:~$

Switches

sukai@ubuntuserver1:~$ kubectl ko nbctl show vmservers
switch 1343336c-ea3a-4ba0-ae7b-7c2fbf2570bb (vmservers)
    port vpc-nginx-2.sukai263
        addresses: ["00:00:00:DD:7E:E9 10.0.1.2"]
    port vpc-nat-gw-natgw-969fdb9f5-vb27r.kube-system
        addresses: ["00:00:00:F9:6C:F2 10.0.1.254"]
    port vpc-nginx.sukai263
        addresses: ["00:00:00:A4:25:27 10.0.1.1"]
    port vmservers-mail263
        type: router
        router-port: mail263-vmservers
sukai@ubuntuserver1:~$

Routers and route tables

sukai@ubuntuserver1:~$ kubectl ko nbctl show mail263
router 2d42fd6c-1565-4fa3-be71-960eac2adfaf (mail263)
    port mail263-vmservers
        mac: "00:00:00:6A:33:EA"
        networks: ["10.0.1.254/24"]
sukai@ubuntuserver1:~$
sukai@ubuntuserver1:~$ kubectl ko nbctl lr-route-list mail263
IPv4 Routes
Route Table <main>:
                0.0.0.0/0                10.0.1.254 dst-ip
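
To see why traffic to 8.8.8.8 ends up at the NAT gateway while traffic inside 10.0.1.0/24 does not, the router's decision can be modeled as a longest-prefix match. This is a simplified illustration, not OVN's implementation: next_hop is a hypothetical function, and the connected route 10.0.1.0/24 is inferred from the router port mail263-vmservers shown above.

```python
import ipaddress
from typing import List, Optional, Tuple

# (destination CIDR, next hop); None marks a directly connected network.
Route = Tuple[str, Optional[str]]


def next_hop(routes: List[Route], dst: str) -> Optional[str]:
    """Pick the matching route with the longest prefix and return where the packet goes next."""
    addr = ipaddress.ip_address(dst)
    matches = []
    for cidr, hop in routes:
        network = ipaddress.ip_network(cidr)
        if addr in network:
            matches.append((network.prefixlen, hop))
    if not matches:
        return None  # no route to the destination
    _, hop = max(matches, key=lambda m: m[0])
    return hop if hop is not None else dst  # connected network: deliver directly


mail263_routes: List[Route] = [
    ("10.0.1.0/24", None),        # connected via port mail263-vmservers
    ("0.0.0.0/0", "10.0.1.254"),  # static default route from the Vpc manifest
]
print(next_hop(mail263_routes, "8.8.8.8"))
print(next_hop(mail263_routes, "10.0.1.1"))
```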

Summary

Kube-OVN brings the rich network orchestration of the IaaS layer into Kubernetes clusters and provides what is needed for tenant network isolation in Kubernetes.