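Kubernetes Pod

Source: original

Date: 2019-04-21

Author: 脚本小站

Category: Cloud Native

A Pod is the smallest schedulable unit in Kubernetes. A Pod object is a group of one or more containers.

Exposing a port:

kubectl expose works on these resource types: deployment, pod, replicaset, replicationcontroller, service.

The Pod being exposed must have labels, because the generated Service uses them as its selector.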

kubectl expose pod test-pod --port=80 --name=test-pod-svc --type=NodePort

The command above simply creates a resource of kind Service.

To access the exposed Pod, first look up the assigned port, then connect using NodeIP:NodePort.

]# kubectl get svc test-pod-svc 
NAME           TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
test-pod-svc   NodePort   10.105.77.130   <none>        80:30361/TCP   12m

curl 172.26.253.208:30361
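
For comparison, the kubectl expose command above corresponds roughly to a Service manifest like the following. This is a minimal sketch: the selector label app: test-pod is an assumption, since the labels of test-pod are not shown here, and the nodePort is left out so that the cluster assigns one.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: test-pod-svc
spec:
  type: NodePort
  selector:
    app: test-pod      # assumption: the Pod carries this label
  ports:
  - port: 80           # Service port, as in --port=80
    targetPort: 80     # container port that receives the traffic
    protocol: TCP
```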

Viewing labels:

kubectl get pods --show-labels

Sharing the node's network namespace:

To make a Pod listen on a port of the host node's IP, set spec.hostNetwork to true.

apiVersion: v1
kind: Pod
metadata:
  name: pod-use-hostnetwork
  labels:
    app: pod-use-hostnetwork
spec:
  containers:
  - name: nginx
    image: nginx
  hostNetwork: true

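Viewing the created Pod:

The IP column shows the node's IP address. On that node, netstat shows the listening port, and the Pod can be reached directly through the host IP. In the same way, spec.hostPID and spec.hostIPC share the node's PID and IPC namespaces.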

]# kubectl get pods -l app=pod-use-hostnetwork  -o wide
NAME                  READY   STATUS    RESTARTS   AGE     IP               NODE    NOMINATED NODE   READINESS GATES
pod-use-hostnetwork   1/1     Running   0          6m11s   172.26.253.209   node1   <none>           <none>
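
Sharing the PID and IPC namespaces follows the same pattern. The manifest below is a sketch only; the Pod name pod-use-hostpid is hypothetical and busybox is used merely as a convenient image.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-use-hostpid    # hypothetical name
spec:
  hostPID: true            # share the node's PID namespace
  hostIPC: true            # share the node's IPC namespace
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh","-c","sleep 3600"]
```

With hostPID enabled, running ps inside the container should list the node's processes rather than only the container's own.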

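Setting the security context of a Pod:

A typical use is running the application in a container as a non-root user. Other security context settings include fsGroup, seLinuxOptions, supplementalGroups, sysctls, capabilities, and privileged. Container-level fields such as capabilities and privileged are listed by kubectl explain pods.spec.containers.securityContext; Pod-level fields such as fsGroup, supplementalGroups, and sysctls are listed by kubectl explain pods.spec.securityContext.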

apiVersion: v1
kind: Pod
metadata:
  name: pod-with-securitycontext
  labels:
    app: pod-with-securitycontext
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh","-c","sleep 3600"]
    securityContext:
      runAsNonRoot: true
      runAsUser: 1000
      allowPrivilegeEscalation: false

Checking the running user: the processes run as UID 1000 rather than root.

]# kubectl exec -it pod-with-securitycontext -- ps aux
PID   USER     TIME  COMMAND
    1 1000      0:00 sleep 3600
   11 1000      0:00 ps aux
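
The Pod-level fields are set under spec.securityContext and apply to every container in the Pod. The following is a sketch with arbitrarily chosen IDs; the Pod name and all numeric values are assumptions for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-level-securitycontext   # hypothetical name
spec:
  securityContext:                  # Pod level: applies to all containers
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000                   # group ownership applied to mounted volumes
    supplementalGroups: [4000]
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh","-c","sleep 3600"]
    securityContext:                # container level
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
```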

Viewing Pods:

kubectl get pods --show-labels
kubectl get pods -L app,env # app and env are label keys

Labeling a Pod:

kubectl label pods/pod-use-hostnetwork env=pro
kubectl label pods/pod-use-hostnetwork env=production --overwrite # overwrite an existing label

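Label selectors: the equality-based operators =, ==, != and the set-based forms in, notin, and key existence (key, !key).

kubectl get pods -L app -l "app!=myapp" # Pods whose app label is not myapp (also matches Pods without an app label)
kubectl get pods -L app -l "app!=myapp,env=production"
kubectl get pods -L app -l 'app in (pod-use-hostnetwork)'
kubectl get pods -l 'env in (myapp,dev),!tier' -L env,tier # the !tier outside the parentheses is ANDed; use single quotes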
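
The same selector semantics appear in manifests, for example in the spec.selector of a Deployment or ReplicaSet. The fragment below is a sketch of how an equality-based and a set-based selector might be written; the label values are illustrative assumptions.

```yaml
selector:
  matchLabels:
    app: myapp                        # equality-based: app=myapp
  matchExpressions:
  - key: env
    operator: In                      # set-based: env in (production,dev)
    values: ["production", "dev"]
  - key: tier
    operator: DoesNotExist            # same as !tier
```

All entries under matchLabels and matchExpressions are ANDed together, just like the comma in the kubectl -l expressions.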

Pod node selector (nodeSelector):

Labeling a node:

kubectl label nodes node1 disktype=ssd

Creating a resource with spec.nodeSelector:

apiVersion: v1
kind: Pod
metadata:
  name: pod-with-nodeselector
  labels:
    app: pod-with-nodeSelector
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh","-c","sleep 3600"]
  nodeSelector:
    disktype: ssd

Modifying and viewing resource annotations:

Adding annotations: this works for Deployments, Pods, and other resources.

kubectl annotate deployments myapp mark="This is myapp"
kubectl annotate pods myapp kubernetes.io=myapp

Viewing annotations:

kubectl describe pods myapp | grep Annotations

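Force-deleting a Pod: this skips the grace period. The Pod object is removed from the API server without waiting for confirmation from the kubelet, so there is no guarantee that the containers stop immediately; they may keep running on the node.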

kubectl delete pods myapp --grace-period=0 --force

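Pod lifecycle

Lifecycle hooks:

spec.containers.lifecycle.postStart: runs immediately after the container is created; it is not guaranteed to run before the container's ENTRYPOINT.

spec.containers.lifecycle.preStop: runs before the container is terminated.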

apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec: 
  containers:
  - name: lifecycle-demo
    image: nginx
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh","-c","date > /usr/share/nginx/html/index.html"]
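
A preStop hook is declared the same way. The sketch below asks nginx to shut down gracefully before the container is stopped; the Pod name and the 5-second pause are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-prestop-demo   # hypothetical name
spec:
  containers:
  - name: nginx
    image: nginx
    lifecycle:
      preStop:
        exec:
          # graceful shutdown, then a short pause so in-flight requests can finish
          command: ["/bin/sh","-c","nginx -s quit; sleep 5"]
```

The hook has to finish within terminationGracePeriodSeconds; after that the container is killed regardless.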

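Pod liveness probes

A liveness probe confirms that the application is actually working, not merely that the Pod is Running; an application can fail while its container is still Running. The kubelet probes the application periodically, and if the probe fails it kills the container, which is then restarted according to the Pod's restartPolicy. There are three probe types: ExecAction, TCPSocketAction, and HTTPGetAction.

Configuration fields for readinessProbe/livenessProbe, valid for Exec, TCP, and HTTP probes:

initialDelaySeconds: how long to wait after the container starts before the first probe

periodSeconds: probe interval, default 10s, minimum 1s

successThreshold: number of consecutive successes needed, after a failure, for the probe to count as successful again; default 1 (it must be 1 for liveness probes)

timeoutSeconds: probe timeout, default 1s

failureThreshold: number of consecutive failures after which the probe is considered failed; default 3

Configuring an exec liveness probe:

The command in spec.containers.livenessProbe.exec.command succeeds if it exits with status 0; any other exit status is a failure.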

apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec
  labels:
    app: liveness-exec
spec:
  containers:
  - name: liveness-exec
    image: busybox
    args: ["/bin/sh","-c"," touch /tmp/healthy; sleep 60; rm -rf /tmp/healthy; sleep 600"]
    livenessProbe:
      exec:
        command: ["test","-e","/tmp/healthy"]

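After creation the Pod runs normally for a while. Once /tmp/healthy has been removed (about a minute later), kubectl describe shows "Liveness probe failed" under Events, which means the health check has failed.

HTTP probe:

Sends an HTTP request to the container; a response code of 2xx or 3xx passes the check.

spec.containers.livenessProbe.httpGet

httpGet.path: the path to request, e.g. /Healthy

httpGet.port: the port number, or the name of a port defined on the container, e.g. http

httpGet.scheme: the protocol used for the request, HTTP (the default) or HTTPS.

httpGet.httpHeaders: optional, custom headers for the request.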

apiVersion: v1
kind: Pod
metadata:
  name: liveness-http
  labels:
    app: liveness-http
spec:
  containers:
  - name: liveness-http
    image: nginx:1.12-alpine
    ports: 
    - name: http
      containerPort: 80
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh","-c","echo Healthy > /usr/share/nginx/html/Healthy"]
    livenessProbe:
      httpGet:
        path: /Healthy
        port: http
        scheme: HTTP
      initialDelaySeconds: 3
      timeoutSeconds: 2
      failureThreshold: 2
      periodSeconds: 5
      successThreshold: 1

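After creating the Pod, run kubectl describe pods liveness-http to check its state; the container is running normally.

Delete the test page:

kubectl exec liveness-http -- rm /usr/share/nginx/html/Healthy

After the test page is deleted, the describe output shows "Liveness probe failed: HTTP probe failed with statuscode: 404" and "Container will be killed and recreated", meaning the HTTP check failed and the container will be killed and recreated.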

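TCP probe:

Opens a TCP connection to a specific port of the container; the check passes if the connection is established.

spec.containers.livenessProbe.tcpSocket

tcpSocket.host: optional, the IP address to connect to; defaults to the Pod IP.

tcpSocket.port: the target port number or name.

TCP liveness probe example: the check succeeds as long as the port accepts connections.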

apiVersion: v1
kind: Pod
metadata:
  name: liveness-tcp
  labels:
    app: liveness-tcp
spec:
  containers:
  - name: liveness-tcp
    image: nginx:1.12-alpine
    ports:
    - name: http
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: http
      timeoutSeconds: 2
      failureThreshold: 2
      periodSeconds: 5
      successThreshold: 1

The describe command shows the configured settings.

Liveness:       tcp-socket :http delay=0s timeout=2s period=5s #success=1 #failure=2

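Pod readiness probes

After a Pod reaches Running, the application may still need some time to initialize and cannot serve requests yet. A readiness probe holds back external traffic until the application is actually able to serve; the kubelet runs the probe against the container, and a Pod that is not ready is left out of the Service endpoints. Readiness probes use the same three types as liveness probes: ExecAction, TCPSocketAction, and HTTPGetAction. The difference is that a failing readiness probe never kills or restarts the container.

Exec readiness probe: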

apiVersion: v1
kind: Pod
metadata:
  name: readiness-exec
  labels:
    app: readiness-exec
spec:
  containers:
  - name: readiness-exec
    image: busybox
    args: ["/bin/sh","-c","while true; do rm -rf /tmp/ready; sleep 30; touch /tmp/ready; sleep 300;done"]
    readinessProbe:
      exec:
        command: ["test","-e","/tmp/ready"]
      initialDelaySeconds: 5  # wait before the first probe
      periodSeconds: 5  # probe every 5 seconds

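After creating it, watch the Pod with kubectl get pods -l app=readiness-exec --watch: when STATUS becomes Running, READY is still 0/1, and it only switches to 1/1 once the readiness probe succeeds.

For containers that need initialization, the application must not receive requests before it is ready.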

NAME             READY   STATUS    RESTARTS   AGE
readiness-exec   0/1     Running   0          7s
readiness-exec   1/1     Running   0          2m57s
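
The HTTP variant of a readiness probe looks much like the HTTP liveness probe shown earlier. This is a sketch under assumptions: the Pod name readiness-http and the probed path / are chosen for illustration, relying on nginx serving its default page.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readiness-http       # hypothetical name
  labels:
    app: readiness-http
spec:
  containers:
  - name: readiness-http
    image: nginx:1.12-alpine
    ports:
    - name: http
      containerPort: 80
    readinessProbe:
      httpGet:
        path: /              # assumption: the default page signals readiness
        port: http
      initialDelaySeconds: 5
      periodSeconds: 5
```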

Official documentation:

https://v1-18.docs.kubernetes.io/zh/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

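Startup probe

livenessProbe only takes over after startupProbe has succeeded. The startupProbe below allows 10s * 10 + 10s, so the application is fine as long as it starts within 110s; after that, with a livenessProbe that runs every 10s, a crashed application is noticed in about 10s.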

startupProbe:
  httpGet:
    path: /test
    port: 80
  failureThreshold: 10
  initialDelaySeconds: 10
  periodSeconds: 10

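Annotated:

startupProbe:                     # startup check
  failureThreshold: 3             # 3 consecutive failures: startup has failed, container is restarted
  httpGet:                        # probe type
    path: /ready                  # request path
    port: 8182                    # request port
    scheme: HTTP                  # request protocol
  periodSeconds: 10               # probe interval in seconds
  successThreshold: 1             # 1 success marks the container as started
  timeoutSeconds: 1               # each probe times out after 1 second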
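
To show how the two probes fit together, here is a sketch of a complete Pod with both a startupProbe and a livenessProbe. The Pod name, the nginx image, and the probed path / are assumptions for illustration; the timing values follow the 10s * 10 + 10s budget described above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: startup-demo          # hypothetical name
spec:
  containers:
  - name: app
    image: nginx              # stand-in for a slow-starting application
    ports:
    - name: http
      containerPort: 80
    startupProbe:
      httpGet:
        path: /
        port: http
      initialDelaySeconds: 10
      periodSeconds: 10
      failureThreshold: 10    # up to 10s + 10 * 10s = 110s to start
    livenessProbe:            # only runs after startupProbe has succeeded
      httpGet:
        path: /
        port: http
      periodSeconds: 10
      failureThreshold: 1     # a single failed probe triggers a restart
```

Keeping failureThreshold at 1 on the liveness probe is what gives the roughly 10-second detection time; a higher value trades detection speed for tolerance of transient failures.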

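Resource requests and limits

CPU units: 1 CPU = 1000m (millicores).

Memory units: binary suffixes 1Gi = 1024Mi = 1024*1024Ki; decimal suffixes 1G = 1000M = 1000*1000k.

The amount of resources visible inside a container is still the node-level total.

Resource requests (requests):

For a Pod that defines requests, the scheduler uses them to pick a node with enough unreserved capacity, and that amount is reserved for the Pod on the node. A Pod without requests may have its resources squeezed under contention, up to the point where its processes are killed.

Example:

In this example stress-ng uses as many resources as it can. When other Pods need resources it may be squeezed or even OOMKilled, while the 128Mi of memory it requested stays guaranteed.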

apiVersion: v1
kind: Pod
metadata: 
  name: stress-pod
spec:
  containers:
  - name: stress
    image: ikubernetes/stress-ng
    command: ["/usr/bin/stress-ng","-m 1","-c 1","--metrics-brief"]
    resources:
      requests:
        memory: 128Mi
        cpu: 100m

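Resource limits (limits):

A container that exceeds its memory limit may be OOMKilled; the container may then be restarted, or the killed child process may be restarted by its parent process. CPU usage above the limit is throttled rather than killed.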

apiVersion: v1
kind: Pod
metadata:
  name: memleak-pod
  labels:
    app: memleak
spec:
  containers:
  - name: memleak
    image: saadali/simmemleak
    resources:
      requests:
        memory: 64Mi
        cpu: 100m
      limits:
        memory: 64Mi
        cpu: 100m

Checking the status:

[root@master resources]# kubectl get pods -l app=memleak --watch
NAME                     READY   STATUS              RESTARTS   AGE
memleak-pod              0/1     ContainerCreating   0          7s
memleak-pod              0/1     OOMKilled           1          16s
memleak-pod              0/1     CrashLoopBackOff    1          17s

Official documentation: resource limits can be set at the namespace, container, and Pod level.

https://kubernetes.io/zh/docs/tasks/administer-cluster/manage-resources/
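
As an illustration of the namespace level, a LimitRange can supply default requests and limits for containers that do not declare their own. The sketch below makes several assumptions: the object name, the target namespace prd, and all of the values are chosen only for the example.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits     # hypothetical name
  namespace: prd           # assumption: applies to this namespace
spec:
  limits:
  - type: Container
    defaultRequest:        # used when a container sets no requests
      cpu: 100m
      memory: 128Mi
    default:               # used when a container sets no limits
      cpu: 500m
      memory: 256Mi
    max:                   # upper bound any container may ask for
      cpu: "1"
      memory: 512Mi
```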

Other Pod fields

Common Pod fields:

apiVersion: v1
kind: Pod
metadata:
  name: pod
  namespace: prd
spec:
  containers:
  - image: nginx:1.12
    imagePullPolicy: IfNotPresent
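    name: nginx
  dnsPolicy: ClusterFirst # prefer the cluster DNS
  enableServiceLinks: true # whether Service information is injected into the Pod as environment variables
  nodeName: node1  # pin the Pod to a specific node
  priority: 0 # Pod priority
  restartPolicy: Always
  schedulerName: default-scheduler # which scheduler places this Pod
  serviceAccount: default
  terminationGracePeriodSeconds: 30 # termination grace period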

Getting the Pod IP inside a Pod: Pod fields such as the IP, name, and namespace can be exposed to the containers.

https://kubernetes.io/zh-cn/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
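
A minimal sketch of this mechanism, using environment variables populated through fieldRef. The Pod name and the variable names MY_POD_IP and MY_NODE_NAME are assumptions for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-downward-env     # hypothetical name
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh","-c","echo $MY_POD_IP $MY_NODE_NAME; sleep 3600"]
    env:
    - name: MY_POD_IP        # hypothetical variable name
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    - name: MY_NODE_NAME     # hypothetical variable name
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
```

Once the Pod is running, kubectl logs pod-downward-env should print the Pod IP and the node name the Pod was scheduled to.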
