K8s: Troubleshooting a Pod That Does Not Appear After It Is Started

Symptom

  • After the pod is started with the command below, kubectl get pod does not list it:
kubectl apply -f app.yaml

Troubleshooting

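A Kubernetes Deployment is actually a higher-level resource that uses other Kubernetes resources to create Pods. The reason for this layering is that the Deployment kind adds functionality on top of the lower-level resources. This post walks through how to inspect your Deployment and find information about it. The steps are the same for all Deployments.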

After you create a resource of kind Deployment, you can confirm that it was created by running the following command:

$ kubectl get deployment
NAME                          DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
cost-attribution-grafana      1         1         1            1           2m18s

You can describe it to see what it did:

$ kubectl describe deploy cost-attribution-mk-agent
Name:                   cost-attribution-mk-agent
Namespace:              kubernetes-cost-attribution
CreationTimestamp:      Wed, 21 Nov 2018 12:30:47 -0800
Labels:                 app=cost-attribution-mk-agent
Annotations:            deployment.kubernetes.io/revision: 1
                        kubectl.kubernetes.io/last-applied-configuration:
                          {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"name":"cost-attribution-mk-agent","namespace":"kubernetes-cost-a...
Selector:               app=cost-attribution-mk-agent
Replicas:               1 desired | 0 updated | 0 total | 0 available | 1 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app=cost-attribution-mk-agent
  Service Account:  cost-attribution-kube-state-metric
  Containers:
   mk-agent:
    Image:      gcr.io/managedkube/kubernetes-cost-attribution/agent:1.0
    Port:       9101/TCP
    Host Port:  0/TCP
    Limits:
      cpu:     500m
      memory:  500Mi
    Requests:
      cpu:        20m
      memory:     20Mi
    Liveness:     http-get http://:9101/metrics delay=5s timeout=5s period=10s #success=1 #failure=3
    Readiness:    http-get http://:9101/metrics delay=5s timeout=5s period=5s #success=1 #failure=3
    Environment:  <none>
    Mounts:       <none>
  Volumes:
   ubbagent-state:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
Conditions:
  Type             Status  Reason
  ----             ------  ------
  Progressing      True    NewReplicaSetCreated
  Available        False   MinimumReplicasUnavailable
  ReplicaFailure   True    FailedCreate
OldReplicaSets:    <none>
NewReplicaSet:     cost-attribution-mk-agent-6c78b8757f (0/1 replicas created)
Events:
  Type    Reason             Age    From                   Message
  ----    ------             ----   ----                   -------
  Normal  ScalingReplicaSet  2m27s  deployment-controller  Scaled up replica set cost-attribution-mk-agent-6c78b8757f to 1

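In the Events section, there is an event that scaled a ReplicaSet up to 1. These event messages are critical for debugging your Deployment. Other failure cases may also show up here, and the message will describe (or at least hint at) the cause of the failure so that you can fix it.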
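Events can also be queried directly, which may help when the describe output is long or when you do not yet know which object is failing. The following is a minimal sketch, not part of the original walkthrough: the function name warning_events_in is hypothetical, and it assumes kubectl is already configured for the cluster.

```shell
# Hypothetical helper: list only Warning events in a namespace, oldest first.
# Argument: the namespace to inspect.
warning_events_in() {
  kubectl get events -n "$1" \
    --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp
}

# Usage, with the namespace from the describe output above:
#   warning_events_in kubernetes-cost-attribution
```

Filtering on type=Warning hides the routine Normal events, such as ScalingReplicaSet, so that failures stand out.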

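Even though the Deployment did create the ReplicaSet, that does not mean any Pods were created. The next step is to look at the ReplicaSet resources by running the following command: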

$ kubectl get replicaset
NAME                                     DESIRED   CURRENT   READY   AGE
cost-attribution-grafana-bfdfddcbb       1         1         1       2m33s

This shows you the ReplicaSets that exist in this namespace. With that, you can describe the ReplicaSet to see what it did:

$ kubectl describe replicaset cost-attribution-mk-agent-6c78b8757f
Name:           cost-attribution-mk-agent-6c78b8757f
Namespace:      kubernetes-cost-attribution
Selector:       app=cost-attribution-mk-agent,pod-template-hash=2734643139
Labels:         app=cost-attribution-mk-agent
                pod-template-hash=2734643139
Annotations:    deployment.kubernetes.io/desired-replicas: 1
                deployment.kubernetes.io/max-replicas: 2
                deployment.kubernetes.io/revision: 1
Controlled By:  Deployment/cost-attribution-mk-agent
Replicas:       0 current / 1 desired
Pods Status:    0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app=cost-attribution-mk-agent
                    pod-template-hash=2734643139
  Service Account:  cost-attribution-kube-state-metric
  Containers:
   mk-agent:
    Image:      gcr.io/managedkube/kubernetes-cost-attribution/agent:1.0
    Port:       9101/TCP
    Host Port:  0/TCP
    Limits:
      cpu:     500m
      memory:  500Mi
    Requests:
      cpu:        20m
      memory:     20Mi
    Liveness:     http-get http://:9101/metrics delay=5s timeout=5s period=10s #success=1 #failure=3
    Readiness:    http-get http://:9101/metrics delay=5s timeout=5s period=5s #success=1 #failure=3
    Environment:  <none>
    Mounts:       <none>
  Volumes:
   ubbagent-state:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
Conditions:
  Type             Status  Reason
  ----             ------  ------
  ReplicaFailure   True    FailedCreate
Events:
  Type     Reason        Age                   From                   Message
  ----     ------        ----                  ----                   -------
  Warning  FailedCreate  76s (x15 over 2m38s)  replicaset-controller  Error creating: pods "cost-attribution-mk-agent-6c78b8757f-" is forbidden: error looking up service account kubernetes-cost-attribution/cost-attribution-kube-state-metric: serviceaccount "cost-attribution-kube-state-metric" not found

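In this particular case, the events report a FailedCreate. The specific reason here is that the service account referenced by the Pod was not found. Your error may well be different; this is only an example.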
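For this particular error, one possible remedy is to check whether the service account exists and create it if it does not. This is a sketch under assumptions: the function name ensure_service_account is hypothetical, and in practice the account often comes from a manifest that also sets up role bindings, in which case applying that manifest is the better fix.

```shell
# Sketch: create the service account only if it is missing.
# Arguments: namespace, service account name.
ensure_service_account() {
  ns="$1"
  name="$2"
  if kubectl get serviceaccount "$name" -n "$ns" >/dev/null 2>&1; then
    echo "serviceaccount $name already exists in $ns"
  else
    kubectl create serviceaccount "$name" -n "$ns"
  fi
}

# Usage, with the names from the error message above:
#   ensure_service_account kubernetes-cost-attribution cost-attribution-kube-state-metric
```

The ReplicaSet controller keeps retrying the Pod creation, as the "x15" count in the event shows, so the Pod should appear once the service account exists.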

Conclusion

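This post followed one specific case, but the steps outlined here are generic: they show how to inspect a Deployment whenever it does not behave as expected or does not create the Pods you expect it to create.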
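The Deployment-to-ReplicaSet walk can be sketched as a single shell function. This is an illustration only: the name trace_deployment is hypothetical, and it assumes the kubectl describe output has the layout shown earlier (a "NewReplicaSet:" line and an "Events:" table).

```shell
# Sketch: follow Deployment -> ReplicaSet and print Warning events at each level.
# Arguments: namespace, Deployment name. Assumes kubectl can reach the cluster.
trace_deployment() {
  ns="$1"
  deploy="$2"

  desc=$(kubectl describe deployment "$deploy" -n "$ns") || return 1
  echo "== Deployment $deploy: Warning events"
  printf '%s\n' "$desc" | awk '/^Events:/ { e = 1; next } e && $1 == "Warning"'

  # The line of interest looks like: "NewReplicaSet:  <name> (0/1 replicas created)"
  rs=$(printf '%s\n' "$desc" | awk '/^NewReplicaSet:/ { print $2 }')
  if [ -z "$rs" ] || [ "$rs" = "<none>" ]; then
    echo "no new ReplicaSet found for $deploy"
    return 0
  fi

  echo "== ReplicaSet $rs: Warning events"
  kubectl describe replicaset "$rs" -n "$ns" |
    awk '/^Events:/ { e = 1; next } e && $1 == "Warning"'
}

# Usage:
#   trace_deployment kubernetes-cost-attribution cost-attribution-mk-agent
```

In the case above, the Deployment level would print nothing, while the ReplicaSet level would print the FailedCreate warning about the missing service account.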