{"id":5465,"date":"2023-12-16T17:31:04","date_gmt":"2023-12-16T16:31:04","guid":{"rendered":"http:\/\/miro.borodziuk.eu\/?p=5465"},"modified":"2025-05-17T19:11:29","modified_gmt":"2025-05-17T17:11:29","slug":"logging-monitoring-and-troubleshooting-on-kubernetes","status":"publish","type":"post","link":"http:\/\/miro.borodziuk.eu\/index.php\/2023\/12\/16\/logging-monitoring-and-troubleshooting-on-kubernetes\/","title":{"rendered":"Logging, Monitoring and Troubleshooting on Kubernetes"},"content":{"rendered":"<p><!--more--><\/p>\n<p><span style=\"color: #3366ff;\">Resource Monitoring<\/span><\/p>\n<ul>\n<li><code>kubectl get<\/code> can be used on any resource and shows generic resource<br \/>\nhealth<\/li>\n<li>If metrics are collected, use <code>kubectl top pods<\/code> and <code>kubectl top nodes<\/code> to get performance-related information about Pods and Nodes<\/li>\n<\/ul>\n<pre class=\"lang:default mark:45 decode:true\">[root@k8s cka]# kubectl top pods\r\nNAME                           CPU(cores)   MEMORY(bytes)\r\nbusybox-6fc6c44c5b-xmmxd       0m           0Mi\r\ndeploydaemon-zzllp             0m           7Mi\r\nfirstnginx-d8679d567-249g9     0m           7Mi\r\nfirstnginx-d8679d567-66c4s     0m           7Mi\r\nfirstnginx-d8679d567-72qbd     0m           7Mi\r\nfirstnginx-d8679d567-rhhlz     0m           7Mi\r\nlab4-pod                       0m           7Mi\r\nmorevol                        0m           0Mi\r\nmydaemon-z7g9c                 0m           7Mi\r\nmypod                          0m           0Mi\r\nmysapod                        0m           0Mi\r\nmystaticpod-k8s.netico.pl      0m           7Mi\r\nnginx-taint-68bd5db674-7skqs   0m           7Mi\r\nnginx-taint-68bd5db674-vjq89   0m           7Mi\r\nnginx-taint-68bd5db674-vqz2z   0m           7Mi\r\nnginxsvc-5f8b7d4f4d-dtrs7      0m           7Mi\r\npv-pod                         0m           7Mi\r\nsecurity-context-demo          0m           0Mi\r\nsleepybox1                     0m           
0Mi\r\nsleepybox2                     0m           0Mi\r\nwebserver-76d44586d-8gqhf      0m           7Mi\r\nwebshop-7f9fd49d4c-92nj2       0m           7Mi\r\nwebshop-7f9fd49d4c-kqllw       0m           7Mi\r\nwebshop-7f9fd49d4c-x2czc       0m           7Mi\r\n\r\n[root@k8s cka]# kubectl top nodes\r\nNAME            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%\r\nk8s.example.pl   276m         3%     4282Mi          27%\r\n\r\n[root@k8s cka]# kubectl get all -n kube-system\r\nNAME                                        READY   STATUS             RESTARTS           AGE\r\npod\/coredns-5dd5756b68-sgfkj                0\/1     CrashLoopBackOff   3331 (4m12s ago)   13d\r\npod\/etcd-k8s.netico.pl                      1\/1     Running            0                  13d\r\npod\/kube-apiserver-k8s.netico.pl            1\/1     Running            0                  9d\r\npod\/kube-controller-manager-k8s.netico.pl   1\/1     Running            0                  9d\r\npod\/kube-proxy-hgh55                        1\/1     Running            0                  9d\r\npod\/kube-scheduler-k8s.netico.pl            1\/1     Running            0                  9d\r\npod\/metrics-server-5f8988d664-7r8j7         1\/1     Running            0                  9d\r\npod\/storage-provisioner                     1\/1     Running            8 (9d ago)         13d\r\n\r\nNAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE\r\nservice\/kube-dns         ClusterIP   10.96.0.10      &lt;none&gt;        53\/UDP,53\/TCP,9153\/TCP   13d\r\nservice\/metrics-server   ClusterIP   10.102.216.61   &lt;none&gt;        443\/TCP                  9d\r\n\r\nNAME                        DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE\r\ndaemonset.apps\/kube-proxy   1         1         1       1            1           kubernetes.io\/os=linux   13d\r\n\r\nNAME                             READY   UP-TO-DATE   AVAILABLE   
AGE\r\ndeployment.apps\/coredns          0\/1     1            0           13d\r\ndeployment.apps\/metrics-server   1\/1     1            1           9d\r\n\r\nNAME                                        DESIRED   CURRENT   READY   AGE\r\nreplicaset.apps\/coredns-5dd5756b68          1         1         0       13d\r\nreplicaset.apps\/metrics-server-5f8988d664   1         1         1       9d\r\nreplicaset.apps\/metrics-server-6db4d75b97   0         0         0       9d\r\n<\/pre>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Troubleshooting Flow<\/span><\/p>\n<ul>\n<li>Resources are first created in the Kubernetes etcd database<\/li>\n<li>Use <code>kubectl describe<\/code> and <code>kubectl events<\/code> to see how resource creation has been going<\/li>\n<li>After adding the resources to the database, the Pod application is started on the nodes where it is scheduled<\/li>\n<li>Before it can be started, the Pod image needs to be fetched\n<ul>\n<li>Use <code>sudo crictl images<\/code> to get a list<\/li>\n<\/ul>\n<\/li>\n<li>Once the application is started, use <code>kubectl logs<\/code> to read the output of the application<\/li>\n<\/ul>\n<p><span style=\"color: #3366ff;\">Troubleshooting Pods<\/span><\/p>\n<ul>\n<li>The first step is to use <code>kubectl get<\/code>, which will give a generic overview of<br \/>\nPod states<\/li>\n<li>A Pod can be in any of the following states:\n<ul>\n<li>Pending: the Pod has been created in etcd, but is waiting for an eligible node<\/li>\n<li>Running: the Pod is in a healthy state<\/li>\n<li>Succeeded: the Pod has done its work and there is no need to restart it<\/li>\n<li>Failed: one or more containers in the Pod have ended with an error code and will not be restarted<\/li>\n<li>Unknown: the state could not be obtained, often related to network issues<\/li>\n<li>Completed: the Pod has run to completion<\/li>\n<li>CrashLoopBackOff: one or more containers in the Pod have generated an error, but the kubelet is still trying to restart 
them<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><span style=\"color: #3366ff;\">Investigating Resource Problems<\/span><\/p>\n<ul>\n<li>If <code>kubectl get<\/code> indicates that there is an issue, the next step is to use <code>kubectl<\/code><br \/>\n<code>describe<\/code> to get more information<\/li>\n<li><code>kubectl describe<\/code> shows API information about the resource and often has good indicators of what is going wrong<\/li>\n<li>If <code>kubectl describe<\/code> shows that a Pod has an issue starting its primary container, use <code>kubectl logs<\/code> to investigate the application logs<\/li>\n<li>If the Pod is running, but not behaving as expected, open an interactive shell on the Pod for further troubleshooting:<code> kubectl exec -it mypod -- sh<\/code><\/li>\n<\/ul>\n<pre class=\"lang:default decode:true\">[root@k8s ~]# kubectl get pods\r\nNAME                           READY   STATUS             RESTARTS         AGE\r\nbusybox-6fc6c44c5b-xmmxd       1\/1     Running            165 (10m ago)    7d3h\r\ndeploydaemon-zzllp             1\/1     Running            0                12d\r\nfirstnginx-d8679d567-249g9     1\/1     Running            0                13d\r\nfirstnginx-d8679d567-66c4s     1\/1     Running            0                13d\r\nfirstnginx-d8679d567-72qbd     1\/1     Running            0                13d\r\nfirstnginx-d8679d567-rhhlz     1\/1     Running            0                13d\r\nlab4-pod                       1\/1     Running            0                12d\r\nmorevol                        2\/2     Running            590 (45m ago)    12d\r\nmydaemon-z7g9c                 1\/1     Running            0                7d1h\r\nmypod                          1\/1     Running            40 (10m ago)     46h\r\nmysapod                        1\/1     Running            39 (45m ago)     45h\r\nmystaticpod-k8s.netico.pl      1\/1     Running            0                10d\r\nnginx-taint-68bd5db674-7skqs   1\/1     Running          
  0                8d\r\nnginx-taint-68bd5db674-vjq89   1\/1     Running            0                8d\r\nnginx-taint-68bd5db674-vqz2z   1\/1     Running            0                8d\r\nnginxsvc-5f8b7d4f4d-dtrs7      1\/1     Running            0                11d\r\npv-pod                         1\/1     Running            0                12d\r\nsecurity-context-demo          1\/1     Running            135 (47m ago)    5d21h\r\nsleepybox1                     1\/1     Running            161 (46m ago)    6d23h\r\nsleepybox2                     1\/1     Running            161 (46m ago)    6d23h\r\ntestdb                         0\/1     CrashLoopBackOff   12 (3m23s ago)   40m\r\nwebserver-76d44586d-8gqhf      1\/1     Running            0                12d\r\nwebshop-7f9fd49d4c-92nj2       1\/1     Running            0                11d\r\nwebshop-7f9fd49d4c-kqllw       1\/1     Running            0                11d\r\nwebshop-7f9fd49d4c-x2czc       1\/1     Running            0                11d\r\n\r\n[root@k8s ~]# kubectl describe pod testdb\r\nName:             testdb\r\nNamespace:        default\r\nPriority:         0\r\nService Account:  default\r\nNode:             k8s.netico.pl\/172.30.9.24\r\nStart Time:       Wed, 14 Feb 2024 09:34:57 -0500\r\nLabels:           run=testdb\r\nAnnotations:      &lt;none&gt;\r\nStatus:           Running\r\nIP:               10.244.0.90\r\nIPs:\r\n  IP:  10.244.0.90\r\nContainers:\r\n  testdb:\r\n    Container ID:   docker:\/\/0b656c391a511154a4ef8638e175dc1a4bb675feef1ff717d2834464a9e4ebbc\r\n    Image:          mysql\r\n    Image ID:       docker-pullable:\/\/mysql@sha256:343b82684a6b05812c58ca20ccd3af8bcf8a5f48b1842f251c54379bfce848f9\r\n    Port:           &lt;none&gt;\r\n    Host Port:      &lt;none&gt;\r\n    State:          Waiting\r\n      Reason:       CrashLoopBackOff\r\n    Last State:     Terminated\r\n      Reason:       Error\r\n      Exit Code:    1\r\n      Started:      Wed, 14 Feb 2024 10:12:02 
-0500\r\n      Finished:     Wed, 14 Feb 2024 10:12:03 -0500\r\n    Ready:          False\r\n    Restart Count:  12\r\n    Environment:    &lt;none&gt;\r\n    Mounts:\r\n      \/var\/run\/secrets\/kubernetes.io\/serviceaccount from kube-api-access-bxf5c (ro)\r\nConditions:\r\n  Type              Status\r\n  Initialized       True\r\n  Ready             False\r\n  ContainersReady   False\r\n  PodScheduled      True\r\nVolumes:\r\n  kube-api-access-bxf5c:\r\n    Type:                    Projected (a volume that contains injected data from multiple sources)\r\n    TokenExpirationSeconds:  3607\r\n    ConfigMapName:           kube-root-ca.crt\r\n    ConfigMapOptional:       &lt;nil&gt;\r\n    DownwardAPI:             true\r\nQoS Class:                   BestEffort\r\nNode-Selectors:              &lt;none&gt;\r\nTolerations:                 node.kubernetes.io\/not-ready:NoExecute op=Exists for 300s\r\n                             node.kubernetes.io\/unreachable:NoExecute op=Exists for 300s\r\nEvents:\r\n  Type     Reason     Age                  From               Message\r\n  ----     ------     ----                 ----               -------\r\n  Normal   Scheduled  41m                  default-scheduler  Successfully assigned default\/testdb to k8s.netico.pl\r\n  Normal   Pulled     40m                  kubelet            Successfully pulled image \"mysql\" in 19.169s (19.17s including waiting)\r\n  Normal   Pulled     40m                  kubelet            Successfully pulled image \"mysql\" in 1.03s (1.03s including waiting)\r\n  Normal   Pulled     40m                  kubelet            Successfully pulled image \"mysql\" in 1.027s (1.027s including waiting)\r\n  Normal   Created    39m (x4 over 40m)    kubelet            Created container testdb\r\n  Normal   Started    39m (x4 over 40m)    kubelet            Started container testdb\r\n  Normal   Pulled     39m                  kubelet            Successfully pulled image \"mysql\" in 1.113s (1.113s including 
waiting)\r\n  Normal   Pulling    39m (x5 over 41m)    kubelet            Pulling image \"mysql\"\r\n  Warning  BackOff    62s (x183 over 40m)  kubelet            Back-off restarting failed container testdb in pod testdb_default(c6d47171-78db-4038-8c8e-0599cdebde2d)\r\n\r\n[root@k8s ~]# kubectl logs testdb\r\n2024-02-14 15:17:10+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.3.0-1.el8 started.\r\n2024-02-14 15:17:11+00:00 [Note] [Entrypoint]: Switching to dedicated user 'mysql'\r\n2024-02-14 15:17:11+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.3.0-1.el8 started.\r\n2024-02-14 15:17:11+00:00 [ERROR] [Entrypoint]: Database is uninitialized and password option is not specified\r\n    You need to specify one of the following as an environment variable:\r\n    - MYSQL_ROOT_PASSWORD\r\n    - MYSQL_ALLOW_EMPTY_PASSWORD\r\n    - MYSQL_RANDOM_ROOT_PASSWORD\r\n<\/pre>\n<p>Because testdb is an individual, unmanaged Pod, we must delete it and run it again with an appropriate environment variable set (see <code>kubectl set env -h<\/code> for help), for example: <code>kubectl run testdb --image=mysql --env=MYSQL_ROOT_PASSWORD=password<\/code><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Troubleshooting Cluster Nodes<\/span><\/p>\n<ul>\n<li>Use <code>kubectl cluster-info<\/code> for a generic impression of cluster health<\/li>\n<li>Use <code>kubectl cluster-info dump<\/code> for (too much) information coming from all the cluster log files<\/li>\n<li><code>kubectl get nodes<\/code> will give a generic overview of node health<\/li>\n<li><code>kubectl get pods -n kube-system<\/code> shows Kubernetes core services running on the control node<\/li>\n<li><code>kubectl describe node nodename<\/code> shows detailed information about nodes; check the &#8220;Conditions&#8221; section for operational information<\/li>\n<li><code>sudo systemctl status kubelet<\/code> will show current status information about the kubelet<\/li>\n<li><code>sudo systemctl restart kubelet<\/code> allows you to restart it<\/li>\n<li><code>sudo openssl x509 -in 
\/var\/lib\/kubelet\/pki\/kubelet.crt -text<\/code> allows you to verify kubelet certificates and verify they are still valid<\/li>\n<li>The kube-proxy Pods are running to ensure connectivity with worker nodes, use <code>kubectl get pods -n kube-system<\/code> for an overview<\/li>\n<\/ul>\n<pre class=\"lang:default decode:true \">[root@k8s cka]# kubectl cluster-info\r\nKubernetes control plane is running at https:\/\/172.30.9.24:8443\r\nCoreDNS is running at https:\/\/172.30.9.24:8443\/api\/v1\/namespaces\/kube-system\/services\/kube-dns:dns\/proxy\r\n\r\nTo further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.\r\n[root@k8s cka]# kubectl cluster-info dump | more\r\n{\r\n    \"kind\": \"NodeList\",\r\n    \"apiVersion\": \"v1\",\r\n    \"metadata\": {\r\n        \"resourceVersion\": \"1061012\"\r\n    },\r\n    \"items\": [\r\n        {\r\n            \"metadata\": {\r\n                \"name\": \"k8s.example.pl\",\r\n                \"uid\": \"e15d407a-6d1a-443f-93dd-e81ca406c89f\",\r\n                \"resourceVersion\": \"1060966\",\r\n                \"creationTimestamp\": \"2024-01-31T15:03:23Z\",\r\n                \"labels\": {\r\n                    \"beta.kubernetes.io\/arch\": \"amd64\",\r\n                    \"beta.kubernetes.io\/os\": \"linux\",\r\n                    \"kubernetes.io\/arch\": \"amd64\",\r\n                    \"kubernetes.io\/hostname\": \"k8s.example.pl\",\r\n                    \"kubernetes.io\/os\": \"linux\",\r\n                    \"minikube.k8s.io\/commit\": \"8220a6eb95f0a4d75f7f2d7b14cef975f050512d\",\r\n                    \"minikube.k8s.io\/name\": \"minikube\",\r\n                    \"minikube.k8s.io\/primary\": \"true\",\r\n                    \"minikube.k8s.io\/updated_at\": \"2024_01_31T10_03_27_0700\",\r\n                    \"minikube.k8s.io\/version\": \"v1.32.0\",\r\n                    \"node-role.kubernetes.io\/control-plane\": \"\",\r\n                    
\"node.kubernetes.io\/exclude-from-external-load-balancers\": \"\"\r\n                },\r\n                \"annotations\": {\r\n                    \"kubeadm.alpha.kubernetes.io\/cri-socket\": \"unix:\/\/\/var\/run\/cri-dockerd.sock\",\r\n                    \"node.alpha.kubernetes.io\/ttl\": \"0\",\r\n                    \"volumes.kubernetes.io\/controller-managed-attach-detach\": \"true\"\r\n                }\r\n            },\r\n   ...\r\n\r\n[root@k8s cka]# kubectl get pods -n kube-system\r\nNAME                                    READY   STATUS             RESTARTS           AGE\r\ncoredns-5dd5756b68-sgfkj                0\/1     CrashLoopBackOff   3537 (4m52s ago)   14d\r\netcd-k8s.example.pl                      1\/1     Running            0                  14d\r\nkube-apiserver-k8s.example.pl            1\/1     Running            0                  9d\r\nkube-controller-manager-k8s.example.pl   1\/1     Running            0                  9d\r\nkube-proxy-hgh55                        1\/1     Running            0                  9d\r\nkube-scheduler-k8s.example.pl            1\/1     Running            0                  9d\r\nmetrics-server-5f8988d664-7r8j7         1\/1     Running            0                  9d\r\nstorage-provisioner                     1\/1     Running            8 (9d ago)         14d\r\n[root@k8s cka]# kubectl describe node | more\r\nName:               k8s.example.pl\r\nRoles:              control-plane\r\nLabels:             beta.kubernetes.io\/arch=amd64\r\n                    beta.kubernetes.io\/os=linux\r\n                    kubernetes.io\/arch=amd64\r\n                    kubernetes.io\/hostname=k8s.example.pl\r\n                    kubernetes.io\/os=linux\r\n                    minikube.k8s.io\/commit=8220a6eb95f0a4d75f7f2d7b14cef975f050512d\r\n                    minikube.k8s.io\/name=minikube\r\n                    minikube.k8s.io\/primary=true\r\n                    
minikube.k8s.io\/updated_at=2024_01_31T10_03_27_0700\r\n                    minikube.k8s.io\/version=v1.32.0\r\n                    node-role.kubernetes.io\/control-plane=\r\n                    node.kubernetes.io\/exclude-from-external-load-balancers=\r\nAnnotations:        kubeadm.alpha.kubernetes.io\/cri-socket: unix:\/\/\/var\/run\/cri-dockerd.sock\r\n                    node.alpha.kubernetes.io\/ttl: 0\r\n                    volumes.kubernetes.io\/controller-managed-attach-detach: true\r\nCreationTimestamp:  Wed, 31 Jan 2024 10:03:23 -0500\r\nTaints:             &lt;none&gt;\r\nUnschedulable:      false\r\nLease:\r\n  HolderIdentity:  k8s.example.pl\r\n  AcquireTime:     &lt;unset&gt;\r\n  RenewTime:       Wed, 14 Feb 2024 11:03:31 -0500\r\nConditions:\r\n  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Messag\r\ne\r\n  ----             ------  -----------------                 ------------------                ------                       ------\r\n-\r\n  MemoryPressure   False   Wed, 14 Feb 2024 11:02:26 -0500   Sat, 03 Feb 2024 13:59:11 -0500   KubeletHasSufficientMemory   kubele\r\nt has sufficient memory available\r\n  DiskPressure     False   Wed, 14 Feb 2024 11:02:26 -0500   Sat, 03 Feb 2024 13:59:11 -0500   KubeletHasNoDiskPressure     kubele\r\nt has no disk pressure\r\n  PIDPressure      False   Wed, 14 Feb 2024 11:02:26 -0500   Sat, 03 Feb 2024 13:59:11 -0500   KubeletHasSufficientPID      kubele\r\nt has sufficient PID available\r\n  Ready            True    Wed, 14 Feb 2024 11:02:26 -0500   Sat, 03 Feb 2024 13:59:11 -0500   KubeletReady                 kubele\r\nt is posting ready status\r\nAddresses:\r\n  InternalIP:  172.30.9.24\r\n  Hostname:    k8s.example.pl\r\nCapacity:\r\n  cpu:                8\r\n  ephemeral-storage:  64177544Ki\r\n  hugepages-1Gi:      0\r\n  hugepages-2Mi:      0\r\n  memory:             16099960Ki\r\n  pods:               110\r\nAllocatable:\r\n 
 cpu:                8\r\n  ephemeral-storage:  64177544Ki\r\n  hugepages-1Gi:      0\r\n  hugepages-2Mi:      0\r\n  memory:             16099960Ki\r\n  pods:               110\r\nSystem Info:\r\n  Machine ID:                 0cc7c63085694b83adcd204eff748ff8\r\n[root@k8s cka]# openssl x589 -in \/var\/lib\/kubelet\/pki\/kubelet.crt -text\r\nInvalid command 'x589'; type \"help\" for a list.\r\n\r\n[root@k8s cka]# openssl x509 -in \/var\/lib\/kubelet\/pki\/kubelet.crt -text\r\nCertificate:\r\n    Data:\r\n        Version: 3 (0x2)\r\n        Serial Number: 6515317418553622450 (0x5a6b0eac2bcdbfb2)\r\n        Signature Algorithm: sha256WithRSAEncryption\r\n        Issuer: CN = k8s.example.pl-ca@1706713394\r\n        Validity\r\n            Not Before: Jan 31 14:03:14 2024 GMT\r\n            Not After : Jan 30 14:03:14 2025 GMT\r\n<\/pre>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Application Access<\/span><\/p>\n<ul>\n<li>To access applications running in the Pods, Services and Ingress are used<\/li>\n<li>The Service resource uses a selector label to connect to Pods with a matching label<\/li>\n<li>The Ingress resource connects to a Service and picks up its selector label to connect to the backend Pods directly<\/li>\n<li>To troubleshoot application access, check the labels in all of these resources<\/li>\n<\/ul>\n<pre class=\"lang:default decode:true\">[root@k8s cka]# kubectl get endpoints\r\nNAME         ENDPOINTS                                      AGE\r\napples       &lt;none&gt;                                         11d\r\ninterginx    &lt;none&gt;                                         7d17h\r\nkubernetes   172.30.9.24:8443                               14d\r\nnewdep       &lt;none&gt;                                         11d\r\nnginxsvc     &lt;none&gt;                                         11d\r\nnwp-nginx    &lt;none&gt;                                         7d5h\r\nwebserver    &lt;none&gt;                                         
7d17h\r\nwebshop      10.244.0.23:80,10.244.0.24:80,10.244.0.25:80   11d\r\n\r\n[root@k8s cka]# kubectl get svc\r\nNAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE\r\napples       ClusterIP   10.101.6.55      &lt;none&gt;        80\/TCP         11d\r\ninterginx    ClusterIP   10.102.130.239   &lt;none&gt;        80\/TCP         7d17h\r\nkubernetes   ClusterIP   10.96.0.1        &lt;none&gt;        443\/TCP        14d\r\nnewdep       ClusterIP   10.100.68.120    &lt;none&gt;        8080\/TCP       11d\r\nnginxsvc     ClusterIP   10.104.155.180   &lt;none&gt;        80\/TCP         11d\r\nnwp-nginx    ClusterIP   10.109.118.169   &lt;none&gt;        80\/TCP         7d5h\r\nwebserver    ClusterIP   10.109.5.62      &lt;none&gt;        80\/TCP         7d17h\r\nwebshop      NodePort    10.109.119.90    &lt;none&gt;        80:32064\/TCP   11d\r\n\r\n[root@k8s cka]# curl 10.109.5.62\r\ncurl: (7) Failed to connect to 10.109.5.62 port 80: Connection refused\r\n\r\n[root@k8s cka]# kubectl edit svc webserver\r\n<\/pre>\n<p>The selector is wrong:<\/p>\n<pre class=\"lang:default decode:true\">  selector:\r\n    run: webServer\r\n<\/pre>\n<p>Change the selector value to <code>run: webserver<\/code>:<\/p>\n<pre class=\"lang:default mark:24 decode:true\">apiVersion: v1\r\nkind: Service\r\nmetadata:\r\n  creationTimestamp: \"2024-02-06T22:33:54Z\"\r\n  labels:\r\n    run: webserver\r\n  name: webserver\r\n  namespace: default\r\n  resourceVersion: \"439277\"\r\n  uid: 58660b0f-ddc9-4a2b-9bc7-ec52973bc38f\r\nspec:\r\n  clusterIP: 10.109.5.62\r\n  clusterIPs:\r\n  - 10.109.5.62\r\n  internalTrafficPolicy: Cluster\r\n  ipFamilies:\r\n  - IPv4\r\n  ipFamilyPolicy: SingleStack\r\n  ports:\r\n  - port: 80\r\n    protocol: TCP\r\n    targetPort: 80\r\n  selector:\r\n    run: webserver\r\n  sessionAffinity: None\r\n  type: ClusterIP\r\nstatus:\r\n  loadBalancer: {}\r\n<\/pre>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Probes 
<\/span><\/p>\n<ul>\n<li>Probes can be used to test access to Pods<\/li>\n<li>They are a part of the Pod specification<\/li>\n<li>A <em>readinessProbe<\/em> is used to make sure a Pod is not published as available until the readinessProbe has been able to access it<\/li>\n<li>The <em>livenessProbe<\/em> is used to continue checking the availability of a Pod<\/li>\n<li>The <em>startupProbe<\/em> was introduced for legacy applications that require additional startup time on first initialization<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Probe Types<\/span><\/p>\n<ul>\n<li>The probe itself is a simple test, which is often a command<\/li>\n<li>The following probe types are defined in pods.spec.containers\n<ul>\n<li><code>exec<\/code>: a command is executed and returns a zero exit value<\/li>\n<li><code>httpGet<\/code>: an HTTP request returns a response code between 200 and 399<\/li>\n<li><code>tcpSocket<\/code>: connectivity to a TCP socket (available port) is successful<\/li>\n<\/ul>\n<\/li>\n<li>The Kubernetes API provides 3 endpoints that indicate the current status of the API server\n<ul>\n<li><code>\/healthz<\/code><\/li>\n<li><code>\/livez<\/code><\/li>\n<li><code>\/readyz<\/code><\/li>\n<\/ul>\n<\/li>\n<li>These endpoints can be used by different probes<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Using Probes<\/span><\/p>\n<ul>\n<li><code>kubectl create -f busybox-ready.yaml<\/code><\/li>\n<li><code>kubectl get pods<\/code>: note the READY state, which is set to 0\/1, meaning that the Pod has successfully started, but is not considered ready.<\/li>\n<li><code>kubectl edit pods busybox-ready<\/code> and change \/tmp\/nothing to \/etc\/hosts. 
Notice this is not allowed.<\/li>\n<li><code>kubectl exec -it busybox-ready -- \/bin\/sh<\/code>\n<ul>\n<li><code>touch \/tmp\/nothing; exit<\/code><\/li>\n<\/ul>\n<\/li>\n<li><code>kubectl get pods<\/code>: at this point we have a Pod that is started<\/li>\n<\/ul>\n<pre class=\"lang:default mark:90 decode:true\">[root@controller ckad]# cat busybox-ready.yaml\r\napiVersion: v1\r\nkind: Pod\r\nmetadata:\r\n  name: busybox-ready\r\n  namespace: default\r\nspec:\r\n  containers:\r\n  - name: busy\r\n    image: busybox\r\n    command:\r\n      - sleep\r\n      - \"3600\"\r\n    readinessProbe:\r\n      periodSeconds: 10\r\n      exec:\r\n        command:\r\n        - cat\r\n        - \/tmp\/nothing\r\n    resources: {}\r\n\r\n[root@controller ckad]# kubectl create -f busybox-ready.yaml\r\npod\/busybox-ready created\r\n\r\n[root@controller ckad]# kubectl get pods\r\nNAME                        READY   STATUS    RESTARTS   AGE\r\nbusybox-ready               0\/1     Running   0          8s\r\nnewnginx-798b4cfdf6-fz54b   1\/1     Running   0          167m\r\nnewnginx-798b4cfdf6-q9rqd   1\/1     Running   0          167m\r\nnewnginx-798b4cfdf6-tbwvm   1\/1     Running   0          170m\r\n\r\n[root@controller ckad]# kubectl describe pods busybox-ready\r\nName:             busybox-ready\r\nNamespace:        default\r\nPriority:         0\r\nService Account:  default\r\nNode:             worker2.example.com\/172.30.9.27\r\nStart Time:       Tue, 05 Mar 2024 12:52:36 -0500\r\nLabels:           &lt;none&gt;\r\nAnnotations:      cni.projectcalico.org\/containerID: 4ad005fe0bb34b3e7c7acbc12f8ce2aef0a7341e3baf4d220b6cdd925e612c3c\r\n                  cni.projectcalico.org\/podIP: 172.16.71.203\/32\r\n                  cni.projectcalico.org\/podIPs: 172.16.71.203\/32\r\nStatus:           Running\r\nIP:               172.16.71.203\r\nIPs:\r\n  IP:  172.16.71.203\r\nContainers:\r\n  busy:\r\n    Container ID:  
containerd:\/\/c8e5fa9d2ddb913249699e6dd042284cb6ee90603be31e1c43964699bfca8973\r\n    Image:         busybox\r\n    Image ID:      docker.io\/library\/busybox@sha256:6d9ac9237a84afe1516540f40a0fafdc86859b2141954b4d643af7066d598b74\r\n    Port:          &lt;none&gt;\r\n    Host Port:     &lt;none&gt;\r\n    Command:\r\n      sleep\r\n      3600\r\n    State:          Running\r\n      Started:      Tue, 05 Mar 2024 12:52:38 -0500\r\n    Ready:          False\r\n    Restart Count:  0\r\n    Readiness:      exec [cat \/tmp\/nothing] delay=0s timeout=1s period=10s #success=1 #failure=3\r\n    Environment:    &lt;none&gt;\r\n    Mounts:\r\n      \/var\/run\/secrets\/kubernetes.io\/serviceaccount from kube-api-access-p6pdg (ro)\r\nConditions:\r\n  Type              Status\r\n  Initialized       True\r\n  Ready             False\r\n  ContainersReady   False\r\n  PodScheduled      True\r\nVolumes:\r\n  kube-api-access-p6pdg:\r\n    Type:                    Projected (a volume that contains injected data from multiple sources)\r\n    TokenExpirationSeconds:  3607\r\n    ConfigMapName:           kube-root-ca.crt\r\n    ConfigMapOptional:       &lt;nil&gt;\r\n    DownwardAPI:             true\r\nQoS Class:                   BestEffort\r\nNode-Selectors:              &lt;none&gt;\r\nTolerations:                 node.kubernetes.io\/not-ready:NoExecute op=Exists for 300s\r\n                             node.kubernetes.io\/unreachable:NoExecute op=Exists for 300s\r\nEvents:\r\n  Type     Reason     Age               From               Message\r\n  ----     ------     ----              ----               -------\r\n  Normal   Scheduled  32s               default-scheduler  Successfully assigned default\/busybox-ready to worker2.example.com\r\n  Normal   Pulling    31s               kubelet            Pulling image \"busybox\"\r\n  Normal   Pulled     30s               kubelet            Successfully pulled image \"busybox\" in 1.241s (1.241s including waiting)\r\n  Normal   
Created    30s               kubelet            Created container busy\r\n  Normal   Started    30s               kubelet            Started container busy\r\n  Warning  Unhealthy  2s (x5 over 29s)  kubelet            Readiness probe failed: cat: can't open '\/tmp\/nothing': No such file or directory\r\n\r\n[root@controller ckad]# kubectl edit pods busybox-ready\r\n<\/pre>\n<p>Change \/tmp\/nothing<\/p>\n<pre class=\"lang:default decode:true \">   readinessProbe:\r\n      exec:\r\n        command:\r\n        - cat\r\n        - \/tmp\/nothing\r\n<\/pre>\n<p>to \/etc\/hosts<\/p>\n<pre class=\"lang:default decode:true \">    readinessProbe:\r\n      exec:\r\n        command:\r\n        - cat\r\n        - \/etc\/hosts\r\n<\/pre>\n<p>This is unfortunately forbidden, so we will do it in a different way:<\/p>\n<pre class=\"lang:default decode:true \">[root@controller ckad]# kubectl exec -it busybox-ready -- touch \/tmp\/nothing\r\n\r\n[root@controller ckad]# kubectl get pods\r\nNAME                        READY   STATUS    RESTARTS   AGE\r\nbusybox-ready               1\/1     Running   0          11m\r\nnewnginx-798b4cfdf6-fz54b   1\/1     Running   0          179m\r\nnewnginx-798b4cfdf6-q9rqd   1\/1     Running   0          179m\r\nnewnginx-798b4cfdf6-tbwvm   1\/1     Running   0          3h1m\r\n<\/pre>\n<p>Now the busybox-ready Pod is ready (1\/1).<\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Using Probes example 2<\/span><\/p>\n<ul>\n<li><code>kubectl create -f nginx-probes.yaml<\/code><\/li>\n<li><code>kubectl get pods<\/code><\/li>\n<\/ul>\n<pre class=\"lang:default decode:true \">[root@controller ckad]# cat nginx-probes.yaml\r\napiVersion: v1\r\nkind: Pod\r\nmetadata:\r\n  name: nginx-probes\r\n  labels:\r\n    role: web\r\nspec:\r\n  containers:\r\n  - name: nginx-probes\r\n    image: nginx\r\n    readinessProbe:\r\n      tcpSocket:\r\n        port: 80\r\n      initialDelaySeconds: 5\r\n      periodSeconds: 10\r\n    livenessProbe:\r\n      tcpSocket:\r\n 
       port: 80\r\n      initialDelaySeconds: 20\r\n      periodSeconds: 20\r\n\r\n[root@controller ckad]# kubectl create -f nginx-probes.yaml\r\npod\/nginx-probes created\r\n\r\n[root@controller ckad]# kubectl get pods\r\nNAME                        READY   STATUS    RESTARTS   AGE\r\nbusybox-ready               1\/1     Running   0          19m\r\nnewnginx-798b4cfdf6-fz54b   1\/1     Running   0          3h7m\r\nnewnginx-798b4cfdf6-q9rqd   1\/1     Running   0          3h7m\r\nnewnginx-798b4cfdf6-tbwvm   1\/1     Running   0          3h10m\r\nnginx-probes                0\/1     Running   0          7s\r\n\r\n[root@controller ckad]# kubectl get pods\r\nNAME                        READY   STATUS    RESTARTS   AGE\r\nbusybox-ready               1\/1     Running   0          20m\r\nnewnginx-798b4cfdf6-fz54b   1\/1     Running   0          3h7m\r\nnewnginx-798b4cfdf6-q9rqd   1\/1     Running   0          3h7m\r\nnewnginx-798b4cfdf6-tbwvm   1\/1     Running   0          3h10m\r\nnginx-probes                1\/1     Running   0          24s\r\n<\/pre>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Lab: Troubleshooting Kubernetes<\/span><\/p>\n<ul>\n<li>Create a Pod that is running the Busybox container with the sleep 3600<br \/>\ncommand. 
Configure a Probe that checks for the existence of the file<br \/>\n<code>\/etc\/hosts<\/code> on that Pod.<\/li>\n<\/ul>\n<p>Go to the Documentation: search &#8220;<em>probes&#8221; -&gt; Configure Liveness, Readiness and Startup Probes<\/em><\/p>\n<pre class=\"lang:default decode:true \">[root@controller ckad]# vi troublelab.yaml\r\n[root@controller ckad]# cat troublelab.yaml\r\napiVersion: v1\r\nkind: Pod\r\nmetadata:\r\n  labels:\r\n    test: liveness\r\n  name: liveness-exec\r\nspec:\r\n  containers:\r\n  - name: liveness\r\n    image: registry.k8s.io\/busybox\r\n    args:\r\n    - \/bin\/sh\r\n    - -c\r\n    - touch \/tmp\/healthy; sleep 30; rm -f \/tmp\/healthy; sleep 600\r\n    livenessProbe:\r\n      exec:\r\n        command:\r\n        - cat\r\n        - \/tmp\/healthy\r\n      initialDelaySeconds: 5\r\n      periodSeconds: 5\r\n\r\n[root@controller ckad]# vim troublelab.yaml\r\n[root@controller ckad]# cat troublelab.yaml\r\napiVersion: v1\r\nkind: Pod\r\nmetadata:\r\n  labels:\r\n    test: liveness\r\n  name: liveness-exec\r\nspec:\r\n  containers:\r\n  - name: liveness\r\n    image: registry.k8s.io\/busybox\r\n    args:\r\n    - \/bin\/sh\r\n    - -c\r\n    - sleep 3600\r\n    livenessProbe:\r\n      exec:\r\n        command:\r\n        - cat\r\n        - \/etc\/hosts\r\n      initialDelaySeconds: 5\r\n      periodSeconds: 5\r\n\r\n[root@controller ckad]# kubectl create -f troublelab.yaml\r\npod\/liveness-exec created\r\n\r\n[root@controller ckad]# kubectl get all\r\nNAME                READY   STATUS    RESTARTS        AGE\r\npod\/busybox-ready   0\/1     Running   3 (7m29s ago)   3h7m\r\npod\/liveness-exec   1\/1     Running   0               90s\r\npod\/nginx-probes    1\/1     Running   0               167m\r\n\r\nNAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE\r\nservice\/canary       ClusterIP   10.96.148.14   &lt;none&gt;        80\/TCP    6h31m\r\nservice\/kubernetes   ClusterIP   10.96.0.1      &lt;none&gt;        
443\/TCP   5d4h\r\n<\/pre>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Lab: Troubleshooting Nodes<\/span><\/p>\n<ul>\n<li>Use the appropriate tools to find out if the cluster nodes are in good health.<\/li>\n<\/ul>\n<pre class=\"lang:default decode:true \">[root@k8s cka]# kubectl get nodes\r\nNAME            STATUS   ROLES           AGE   VERSION\r\nk8s.example.pl   Ready    control-plane   14d   v1.28.3\r\n\r\n[root@k8s cka]# kubectl describe node k8s.example.pl | more\r\nName:               k8s.example.pl\r\nRoles:              control-plane\r\nLabels:             beta.kubernetes.io\/arch=amd64\r\n                    beta.kubernetes.io\/os=linux\r\n                    kubernetes.io\/arch=amd64\r\n                    kubernetes.io\/hostname=k8s.example.pl\r\n                    kubernetes.io\/os=linux\r\n                    minikube.k8s.io\/commit=8220a6eb95f0a4d75f7f2d7b14cef975f050512d\r\n                    minikube.k8s.io\/name=minikube\r\n                    minikube.k8s.io\/primary=true\r\n                    minikube.k8s.io\/updated_at=2024_01_31T10_03_27_0700\r\n                    minikube.k8s.io\/version=v1.32.0\r\n                    node-role.kubernetes.io\/control-plane=\r\n                    node.kubernetes.io\/exclude-from-external-load-balancers=\r\nAnnotations:        kubeadm.alpha.kubernetes.io\/cri-socket: unix:\/\/\/var\/run\/cri-dockerd.sock\r\n                    node.alpha.kubernetes.io\/ttl: 0\r\n                    volumes.kubernetes.io\/controller-managed-attach-detach: true\r\nCreationTimestamp:  Wed, 31 Jan 2024 10:03:23 -0500\r\nTaints:             &lt;none&gt;\r\nUnschedulable:      false\r\nLease:\r\n  HolderIdentity:  k8s.example.pl\r\n  AcquireTime:     &lt;unset&gt;\r\n  RenewTime:       Wed, 14 Feb 2024 11:27:10 -0500\r\nConditions:\r\n  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message\r\n  ----             ------  -----------------                 ------------------                ------                       -------\r\n  MemoryPressure   False   Wed, 14 Feb 2024 11:22:50 -0500   Sat, 03 Feb 2024 13:59:11 -0500   KubeletHasSufficientMemory   kubelet has sufficient memory available\r\n  DiskPressure     False   Wed, 14 Feb 2024 11:22:50 -0500   Sat, 03 Feb 2024 13:59:11 -0500   KubeletHasNoDiskPressure     kubelet has no disk pressure\r\n  PIDPressure      False   Wed, 14 Feb 2024 11:22:50 -0500   Sat, 03 Feb 2024 13:59:11 -0500   KubeletHasSufficientPID      kubelet has sufficient PID available\r\n  Ready            True    Wed, 14 Feb 2024 11:22:50 -0500   Sat, 03 Feb 2024 13:59:11 -0500   KubeletReady                 kubelet is posting ready status\r\nAddresses:\r\n  InternalIP:  172.30.9.24\r\n  Hostname:    k8s.example.pl\r\nCapacity:\r\n  cpu:                8\r\n  ephemeral-storage:  64177544Ki\r\n  hugepages-1Gi:      0\r\n  hugepages-2Mi:      0\r\n  memory:             16099960Ki\r\n  pods:               110\r\nAllocatable:\r\n  cpu:                8\r\n  ephemeral-storage:  64177544Ki\r\n  hugepages-1Gi:      0\r\n  hugepages-2Mi:      0\r\n  memory:             16099960Ki\r\n  pods:               110\r\nSystem Info:\r\n  Machine ID:                 0cc7c63085694b83adcd204eff748ff8\r\n\r\n[root@k8s cka]# systemctl status kubelet\r\n\u25cf kubelet.service - kubelet: The Kubernetes Node Agent\r\n   Loaded: loaded (\/usr\/lib\/systemd\/system\/kubelet.service; disabled; vendor preset: disabled)\r\n  Drop-In: \/etc\/systemd\/system\/kubelet.service.d\r\n           \u2514\u250010-kubeadm.conf\r\n   Active: active (running) since Sat 2024-02-03 13:59:11 EST; 1 weeks 3 days ago\r\n     Docs: https:\/\/kubernetes.io\/docs\/\r\n Main PID: 555436 (kubelet)\r\n    Tasks: 17 (limit: 100376)\r\n   Memory: 103.1M\r\n   CGroup: \/system.slice\/kubelet.service\r\n           \u2514\u2500555436 \/var\/lib\/minikube\/binaries\/v1.28.3\/kubelet 
--bootstrap-kubeconfig=\/etc\/kubernetes\/bootstrap-kubelet.conf --co&gt;\r\n\r\nlut 14 11:27:50 k8s.example.pl kubelet[555436]: E0214 11:27:50.488077  555436 file.go:182] \"Provided manifest path is a directory,&gt;\r\n\r\n[root@k8s cka]# systemctl status containerd\r\n\u25cf containerd.service - containerd container runtime\r\n   Loaded: loaded (\/usr\/lib\/systemd\/system\/containerd.service; disabled; vendor preset: disabled)\r\n   Active: active (running) since Thu 2024-02-01 15:23:34 EST; 1 weeks 5 days ago\r\n     Docs: https:\/\/containerd.io\r\n Main PID: 15732 (containerd)\r\n    Tasks: 1102\r\n   Memory: 1.0G\r\n   CGroup: \/system.slice\/containerd.service\r\n           \u251c\u2500  15732 \/usr\/bin\/containerd\r\n           \u251c\u2500  17376 \/usr\/bin\/containerd-shim-runc-v2 -namespace moby -id 10ffa7290bd3850824976e9be9652e17b05bffbd329cf05186e0019&gt;\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":5948,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[99],"tags":[],"_links":{"self":[{"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/posts\/5465"}],"collection":[{"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/comments?post=5465"}],"version-history":[{"count":32,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/posts\/5465\/revisions"}],"predecessor-version":[{"id":5476,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/posts\/5465\/revisions\/5476"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/media\
/5948"}],"wp:attachment":[{"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/media?parent=5465"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/categories?post=5465"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/tags?post=5465"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}