K8s Application Liveness Probes and Container Start/Stop Hooks

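A healthy Pod does not guarantee that the Docker container inside it is healthy, and a healthy Docker container does not guarantee that the service running in it is healthy. Detecting these states correctly is therefore key to application health. livenessProbe determines whether the container is alive. readinessProbe determines whether the service in the container is ready to serve requests. Both probes matter: a container should be attached to a Service only after a probe has shown that it is ready, otherwise user requests may fail. Setting a readinessProbe also helps during rolling updates, because the state of the service in each new container is checked before it takes traffic, so the application keeps serving healthy responses.

livenessProbe and readinessProbe support three probing mechanisms: exec (run a command inside the container), tcpSocket, and httpGet. The postStart and preStop hooks take handlers of the same shape, but in practice only exec and httpGet apply to them: the tcpSocket field exists in the handler schema, and the API documentation marks it as not supported for hooks.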
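
As a rough illustration of how the two probes are usually combined, here is a minimal sketch of a Deployment whose container declares both. The Deployment name, the app: probe-demo label, the replica count, the probe timings, and the rolling-update settings are all assumptions chosen for illustration; the image and the probed path are borrowed from the examples later in this post.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: probe-demo              # hypothetical name
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: probe-demo           # hypothetical label
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0         # do not reduce available Pods below the desired count
      maxSurge: 1               # add at most one extra Pod during the update
  template:
    metadata:
      labels:
        app: probe-demo
    spec:
      containers:
      - name: probe-demo-container
        image: ikubernetes/myapp:v1
        ports:
        - name: http
          containerPort: 80
        livenessProbe:          # on failure the kubelet restarts the container
          httpGet:
            port: http
            path: /index.html
          initialDelaySeconds: 5
          periodSeconds: 10
        readinessProbe:         # on failure the Pod is taken out of Service endpoints
          httpGet:
            port: http
            path: /index.html
          periodSeconds: 3
```

With settings like these, a rolling update should only move on once each new Pod has passed its readinessProbe.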

livenessProbe

kubectl explain pods.spec.containers.livenessProbe

livenessProbe supports three kinds of liveness checks: tcpSocket, exec, and httpGet. Two of them are demonstrated below.

exec liveness probe

Create a YAML file with the following content:

vim liveness-exec.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec-pod      
  namespace: default
spec:
  containers:
  - name: liveness-exec-container
    image: busybox:latest
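    imagePullPolicy: IfNotPresent # image pull policy: pull only if the image is not already present on the node
    command: ["/bin/sh", "-c", "touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 3600"] # create /tmp/healthy, sleep 30s, delete the file, then sleep 3600s
    livenessProbe:  # liveness check: decides whether the container is alive (readinessProbe decides whether the service in it is ready)
      exec: # probe by running a command; tcpSocket and httpGet probes are also supported
        command: ["test", "-e", "/tmp/healthy"]
      initialDelaySeconds: 1  # default 0; seconds to wait after the container starts before probing begins
      periodSeconds: 3  # default 10; seconds between probes
      failureThreshold: 3  # default 3; consecutive failures before the probe counts as failed
      successThreshold: 1  # default 1; consecutive successes before the probe counts as successful
      timeoutSeconds: 1 # default 1; probe timeout in seconds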

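Once the Pod above is created, the container creates /tmp/healthy, sleeps for 30s, and then deletes the file. The health check starts 1s after the container starts, tests whether /tmp/healthy exists, and repeats every 3s. Three consecutive failures mark the probe as failed, so the check fails within about 9s of the file being deleted. When the liveness check fails, the kubelet restarts the container. The Pod listing below shows the restart count.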

[[email protected] rexyan]# kubectl get pods 
NAME                READY   STATUS    RESTARTS   AGE
liveness-exec-pod   1/1     Running   5          6m17s

View the details:

[[email protected] rexyan]# kubectl describe pods liveness-exec-pod
Name:               liveness-exec-pod
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               k8s002/172.20.245.189
Start Time:         Sun, 19 May 2019 16:05:01 +0800
Labels:             <none>
Annotations:        <none>
Status:             Running
IP:                 10.244.2.2
Containers:
  liveness-exec-container:
    Container ID:  docker://b6d08991993bb306f32b58f7bcc71651ac2b68d1021a05634bcae6832bbbe169
    Image:         busybox:latest
    Image ID:      docker-pullable://docker.io/busybox@sha256:4b6ad3a68d34da29bf7c8ccb5d355ba8b4babcad1f99798204e7abb43e54ee3d
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c
      touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 3600
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Sun, 19 May 2019 16:10:48 +0800
      Finished:     Sun, 19 May 2019 16:11:57 +0800
    Ready:          False
    Restart Count:  5
    Liveness:       exec [test -e /tmp/healthy] delay=1s timeout=1s period=3s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-vckdx (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-vckdx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-vckdx
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  7m16s                  default-scheduler  Successfully assigned default/liveness-exec-pod to k8s002
  Normal   Pulling    7m16s                  kubelet, k8s002    Pulling image "busybox:latest"
  Normal   Pulled     7m14s                  kubelet, k8s002    Successfully pulled image "busybox:latest"
  Normal   Killing    4m17s (x3 over 6m35s)  kubelet, k8s002    Container liveness-exec-container failed liveness probe, will be restarted
  Normal   Created    3m47s (x4 over 7m14s)  kubelet, k8s002    Created container liveness-exec-container
  Normal   Started    3m47s (x4 over 7m13s)  kubelet, k8s002    Started container liveness-exec-container
  Normal   Pulled     3m47s (x3 over 6m5s)   kubelet, k8s002    Container image "busybox:latest" already present on machine
  Warning  Unhealthy  2m5s (x13 over 6m41s)  kubelet, k8s002    Liveness probe failed:

The health check configured above appears under Containers:

Restart Count:  5
Liveness:       exec [test -e /tmp/healthy] delay=1s timeout=1s period=3s #success=1 #failure=3

http get liveness probe

apiVersion: v1
kind: Pod
metadata:
  name: liveness-http-pod
  namespace: default
spec:
  containers:
  - name: liveness-http-get-container
    image: ikubernetes/myapp:v1
    imagePullPolicy: IfNotPresent 
    ports:
    - name: http
      containerPort: 80
    livenessProbe: 
      httpGet:
        port: http
        path: /index.html
      initialDelaySeconds: 1  
      periodSeconds: 3  
      failureThreshold: 3  
      successThreshold: 1
      timeoutSeconds: 1                  

Check the container status:

[[email protected] rexyan]# kubectl get pods 
NAME                READY   STATUS             RESTARTS   AGE
liveness-exec-pod   0/1     CrashLoopBackOff   9          23m
liveness-http-pod   1/1     Running            0          104s

View the details:

[[email protected] rexyan]# kubectl describe pods liveness-http-pod
Name:               liveness-http-pod
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               k8s003/172.20.245.191
Start Time:         Sun, 19 May 2019 16:27:15 +0800
Labels:             <none>
Annotations:        <none>
Status:             Running
IP:                 10.244.1.3
Containers:
  liveness-http-get-container:
    Container ID:   docker://9cb65d175dc8263f54891b597e3a5f4a334f20c4ab636d532887cabfeb7cff3c
    Image:          ikubernetes/myapp:v1
    Image ID:       docker-pullable://docker.io/ikubernetes/myapp@sha256:9c3dc30b5219788b2b8a4b065f548b922a34479577befb54b03330999d30d513
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sun, 19 May 2019 16:27:18 +0800
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:http/index.html delay=1s timeout=1s period=3s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-vckdx (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-vckdx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-vckdx
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  2m58s  default-scheduler  Successfully assigned default/liveness-http-pod to k8s003
  Normal  Pulling    2m58s  kubelet, k8s003    Pulling image "ikubernetes/myapp:v1"
  Normal  Pulled     2m55s  kubelet, k8s003    Successfully pulled image "ikubernetes/myapp:v1"
  Normal  Created    2m55s  kubelet, k8s003    Created container liveness-http-get-container
  Normal  Started    2m55s  kubelet, k8s003    Started container liveness-http-get-container
[[email protected] rexyan]# 

The health check configured above appears under Containers:

Restart Count:  0
Liveness:       http-get http://:http/index.html delay=1s timeout=1s period=3s #success=1 #failure=3

Now enter the container manually and delete index.html, the page that the health check requests:

[[email protected] rexyan]# kubectl get pods 
NAME                READY   STATUS             RESTARTS   AGE
liveness-exec-pod   0/1     CrashLoopBackOff   11         28m
liveness-http-pod   1/1     Running            0          6m4s
[[email protected] rexyan]# kubectl exec -it liveness-http-pod -- /bin/sh 
/ # rm -f /usr/share/nginx/html/index.html

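Checking the Pod status again shows that the container has been restarted once. The restarted container is created afresh from the image, so the deleted file is back and no further restarts occur.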

[[email protected] rexyan]# kubectl get pods 
NAME                READY   STATUS             RESTARTS   AGE
liveness-exec-pod   0/1     CrashLoopBackOff   11         30m
liveness-http-pod   1/1     Running            1          8m12s
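
The third mechanism, tcpSocket, is not demonstrated above. As a minimal sketch, it could look like the manifest below. The Pod and container names are hypothetical, and the timing values simply mirror the earlier examples. The probe only checks that a TCP connection to the port can be opened, not that the application answers correctly.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-tcp-pod          # hypothetical name
  namespace: default
spec:
  containers:
  - name: liveness-tcp-container  # hypothetical name
    image: ikubernetes/myapp:v1
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: http                # passes if a TCP connection to this port can be opened
      initialDelaySeconds: 1
      periodSeconds: 3
      failureThreshold: 3
      successThreshold: 1
      timeoutSeconds: 1
```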

readinessProbe

kubectl explain pods.spec.containers.readinessProbe

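readinessProbe also supports three kinds of checks, tcpSocket, exec, and httpGet, and here they test readiness rather than liveness. One of them is demonstrated below.

http get readiness probe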

apiVersion: v1
kind: Pod
metadata:
  name: readiness-http-pod
  namespace: default
spec:
  containers:
  - name: readiness-http-get-container
    image: ikubernetes/myapp:v1
    imagePullPolicy: IfNotPresent 
    ports:
    - name: http
      containerPort: 80
    readinessProbe:
      httpGet:
        port: http
        path: /index.html
      initialDelaySeconds: 1  
      periodSeconds: 3  
      failureThreshold: 3  
      successThreshold: 1
      timeoutSeconds: 1

Save the manifest as readiness-http-get.yaml, create the Pod, and check its status:

[[email protected] rexyan]# kubectl create -f readiness-http-get.yaml 
pod/readiness-http-pod created
[[email protected] rexyan]# kubectl get pods 
NAME                 READY   STATUS    RESTARTS   AGE
liveness-http-pod    1/1     Running   1          26m
readiness-http-pod   1/1     Running   0          5s

Then enter the container and delete index.html:

[[email protected] rexyan]# kubectl exec -it readiness-http-pod -- /bin/sh 
/ # rm -f /usr/share/nginx/html/index.html 

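Check the Pod again: the READY count of readiness-http-pod has dropped to 0. In the READY column, the number before the slash is how many containers in the Pod are ready, and the number after it is the total number of containers in the Pod. Unlike a failed liveness probe, a failed readiness probe does not restart the container, so RESTARTS stays at 0.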

[[email protected] rexyan]# kubectl get pods 
NAME                 READY   STATUS    RESTARTS   AGE
liveness-http-pod    1/1     Running   1          30m
readiness-http-pod   0/1     Running   0          3m43s

Enter the container and write some content back into the nginx index file:

[[email protected] rexyan]# kubectl exec -it readiness-http-pod -- /bin/sh 
/ # echo "hi k8s" >> /usr/share/nginx/html/index.html 

Check the Pod once more, and its READY count has gone from 0 back to 1:

[[email protected] rexyan]# kubectl get pods 
NAME                 READY   STATUS    RESTARTS   AGE
liveness-http-pod    1/1     Running   1          38m
readiness-http-pod   1/1     Running   0          11m

View the detailed Pod information:

[[email protected] rexyan]# kubectl describe pods readiness-http-pod 
Name:               readiness-http-pod
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               k8s002/172.20.245.189
Start Time:         Sun, 19 May 2019 16:54:04 +0800
Labels:             <none>
Annotations:        <none>
Status:             Running
IP:                 10.244.2.3
Containers:
  readiness-http-get-container:
    Container ID:   docker://2989185e07600a552f6a57ecc3e813156002e2218701da07da8b2efbfaf7c966
    Image:          ikubernetes/myapp:v1
    Image ID:       docker-pullable://docker.io/ikubernetes/myapp@sha256:9c3dc30b5219788b2b8a4b065f548b922a34479577befb54b03330999d30d513
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sun, 19 May 2019 16:54:07 +0800
    Ready:          True
    Restart Count:  0
    Readiness:      http-get http://:http/index.html delay=1s timeout=1s period=3s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-vckdx (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-vckdx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-vckdx
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  14m                   default-scheduler  Successfully assigned default/readiness-http-pod to k8s002
  Normal   Pulling    14m                   kubelet, k8s002    Pulling image "ikubernetes/myapp:v1"
  Normal   Pulled     14m                   kubelet, k8s002    Successfully pulled image "ikubernetes/myapp:v1"
  Normal   Created    14m                   kubelet, k8s002    Created container readiness-http-get-container
  Normal   Started    14m                   kubelet, k8s002    Started container readiness-http-get-container
  Warning  Unhealthy  4m4s (x134 over 10m)  kubelet, k8s002    Readiness probe failed: HTTP probe failed with statuscode: 404
[[email protected] rexyan]# 

The health check configured above appears under Containers:

Restart Count:  0
Readiness:      http-get http://:http/index.html delay=1s timeout=1s period=3s #success=1 #failure=3
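
To see how readiness ties into a Service, the following sketch puts a labelled Pod behind a Service. The Pod and Service names and the app: readiness-demo label are assumptions made for illustration, since the Pod used above carries no labels.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readiness-svc-pod         # hypothetical name
  namespace: default
  labels:
    app: readiness-demo           # hypothetical label, matched by the Service below
spec:
  containers:
  - name: readiness-svc-container # hypothetical name
    image: ikubernetes/myapp:v1
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
    readinessProbe:
      httpGet:
        port: http
        path: /index.html
      periodSeconds: 3
---
apiVersion: v1
kind: Service
metadata:
  name: readiness-demo-svc        # hypothetical name
  namespace: default
spec:
  selector:
    app: readiness-demo           # selects the Pod above
  ports:
  - name: http
    port: 80
    targetPort: http              # refers to the named container port
```

Under these assumptions, kubectl get endpoints readiness-demo-svc should list the Pod IP only while the readiness probe is passing. Deleting index.html as in the walkthrough above should remove the address until the file is restored.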

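Container start and stop hooks

A container has one hook that runs right after it starts and another that runs right before it stops: postStart and preStop.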

postStart

 kubectl explain pods.spec.containers.lifecycle.postStart

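postStart accepts exec and httpGet handlers. A tcpSocket field also appears in the handler schema, but the API documentation marks it as not supported for hooks.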
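
A minimal sketch of a postStart hook using an exec handler follows. The Pod and container names, the marker file /tmp/poststart, and the commands are hypothetical. Note that postStart runs asynchronously with the container's entrypoint, so the order between the two is not guaranteed. If the handler fails, the container is killed.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: poststart-pod             # hypothetical name
  namespace: default
spec:
  containers:
  - name: poststart-container     # hypothetical name
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh", "-c", "sleep 3600"]
    lifecycle:
      postStart:
        exec:
          # runs inside the container right after it is created
          command: ["/bin/sh", "-c", "echo started > /tmp/poststart"]
```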

preStop

 kubectl explain pods.spec.containers.lifecycle.preStop

preStop likewise accepts exec and httpGet handlers, and its tcpSocket field is also documented as not supported.
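
A minimal sketch of a preStop hook using an exec handler follows. It assumes that the ikubernetes/myapp:v1 image runs nginx and has the nginx binary on its PATH, which the /usr/share/nginx/html path used earlier suggests but does not prove. The Pod and container names and the 5s delay are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: prestop-pod                   # hypothetical name
  namespace: default
spec:
  terminationGracePeriodSeconds: 30   # the default; the preStop hook has to finish within it
  containers:
  - name: prestop-container           # hypothetical name
    image: ikubernetes/myapp:v1
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
    lifecycle:
      preStop:
        exec:
          # assumption: nginx is on PATH; wait briefly, then ask it to shut down gracefully
          command: ["/bin/sh", "-c", "sleep 5; nginx -s quit"]
```

The short sleep is one common way to give Service endpoints time to drop the Pod before the server stops accepting connections. Whether it is needed depends on the cluster.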

