{"id":5111,"date":"2023-08-19T15:40:33","date_gmt":"2023-08-19T13:40:33","guid":{"rendered":"http:\/\/miro.borodziuk.eu\/?p=5111"},"modified":"2025-05-19T16:46:44","modified_gmt":"2025-05-19T14:46:44","slug":"troubleshooting-openshift-applications","status":"publish","type":"post","link":"http:\/\/miro.borodziuk.eu\/index.php\/2023\/08\/19\/troubleshooting-openshift-applications\/","title":{"rendered":"Troubleshooting OpenShift Applications"},"content":{"rendered":"<p>You can usually ignore the differences between Kubernetes deployments and OpenShift deployment configurations when troubleshooting applications. The common failure scenarios and the ways to troubleshoot them are essentially the same.<\/p>\n<p><!--more--><\/p>\n<p><span style=\"color: #3366ff;\">Troubleshooting Pods That Fail to Start<\/span><br \/>\nA common scenario is that OpenShift creates a pod and that pod never reaches the <em>Running<\/em> state. At some point, the pod ends up in an error state, such as <code>ErrImagePull<\/code> or <code>ImagePullBackOff<\/code>. Useful troubleshooting commands:<\/p>\n<ul>\n<li><code>oc get pod<\/code><\/li>\n<li><code>oc status<\/code><\/li>\n<li><code>oc get events<\/code><\/li>\n<li><code>oc describe pod &lt;my-pod-name&gt;<\/code><\/li>\n<li><code>oc describe<\/code> (for other resource types)<\/li>\n<\/ul>\n<p><span style=\"color: #3366ff;\">Troubleshooting Running and Terminated Pods<\/span><br \/>\nOpenShift creates a pod, and for a short time no problem is encountered. The pod enters the <em>Running<\/em> state, which means at least one of its containers started running. Later, an application running inside one of the pod containers stops working, and OpenShift tries to restart the container several times. 
If the application keeps terminating, whether because of failing health probes or for other reasons, the pod is left in the <code>CrashLoopBackOff<\/code> state.<\/p>\n<ul>\n<li><code>oc logs &lt;my-pod-name&gt;<\/code><br \/>\nIf the pod contains multiple containers, the <code>oc logs<\/code> command requires the <code>-c<\/code> option.<\/li>\n<li><code>oc logs &lt;my-pod-name&gt; -c &lt;my-container-name&gt;<\/code><\/li>\n<\/ul>\n<p><span style=\"color: #3366ff;\">Using oc debug<\/span><\/p>\n<ul>\n<li>When troubleshooting, it&#8217;s useful to get an exact copy of a running Pod and troubleshoot from there<\/li>\n<li>Since a failing Pod may not be started, and for that reason is not accessible to <code>rsh<\/code> and <code>exec<\/code>, the <code>debug<\/code> command provides an alternative<\/li>\n<li>The <code>debug<\/code> command starts a shell inside the first container of the referenced Pod<\/li>\n<li>The started Pod is a copy of the source Pod, with labels stripped, no probes, and the command changed to <code>\/bin\/sh<\/code><\/li>\n<li>Useful arguments include <code>--as-root<\/code> or <code>--as-user=10000<\/code> to run as root or as a specific user<\/li>\n<li>Use <code>exit<\/code> to close and destroy the debug Pod<\/li>\n<\/ul>\n<p><span style=\"color: #3366ff;\">Demo: Using oc debug<\/span><\/p>\n<ul>\n<li><code>oc login -u developer -p developer<\/code><\/li>\n<li><code>oc create deployment dnginx --image=nginx<\/code><\/li>\n<li><code>oc get pods<\/code> # shows failure<\/li>\n<li><code>oc debug deployment\/dnginx --as-user=10000<\/code> # will fail, select a user ID in the range suggested by the error message\n<ul>\n<li><code>nginx<\/code> # will fail<\/li>\n<li><code>exit<\/code><\/li>\n<\/ul>\n<\/li>\n<li><code>oc debug deployment\/dnginx --as-root<\/code> # will fail, log in as admin and try again\n<ul>\n<li><code>nginx<\/code> # will run<\/li>\n<li><code>exit<\/code><\/li>\n<\/ul>\n<\/li>\n<li>This test has shown that the nginx image needs to run as 
root<\/li>\n<\/ul>\n<p>Let&#8217;s create a new project and new deployment:<\/p>\n<pre class=\"lang:default decode:true \">$ oc login -u developer -p developer\r\n\r\n$ oc new-project debug\r\n\r\n$ oc create deployment dnginx --image=nginx\r\ndeployment.apps\/dnginx created\r\n\r\n$ oc get all\r\nNAME                         READY     STATUS    RESTARTS   AGE\r\npod\/dnginx-88c7766dd-hlbtd   0\/1       Error     2          30s\r\n\r\nNAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE\r\ndeployment.apps\/dnginx   1         1         1            0           30s\r\n\r\nNAME                               DESIRED   CURRENT   READY     AGE\r\nreplicaset.apps\/dnginx-88c7766dd   1         1         0         30s\r\n\r\n$ oc get pods\r\nNAME                     READY     STATUS             RESTARTS   AGE\r\ndnginx-88c7766dd-hlbtd   0\/1       CrashLoopBackOff   6          8m\r\n<\/pre>\n<p>Let&#8217;s debug the pod:<\/p>\n<pre class=\"lang:default decode:true\">$ oc debug deploy\/dnginx --as-user=10000\r\nDefaulting container name to nginx.\r\nUse 'oc describe pod\/dnginx-debug -n debug' to see all of the containers in this pod.\r\n\r\nDebugging with pod\/dnginx-debug, original command: &lt;image entrypoint&gt;\r\nError from server (Forbidden): pods \"dnginx-debug\" is forbidden: unable to validate against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 10000: must be in the ranges: [1000200000, 1000209999]]\r\n\r\n$ oc debug deploy\/dnginx --as-user=1000200000\r\nDefaulting container name to nginx.\r\nUse 'oc describe pod\/dnginx-debug -n debug' to see all of the containers in this pod.\r\n\r\nDebugging with pod\/dnginx-debug, original command: &lt;image entrypoint&gt;\r\nWaiting for pod to start ...\r\nIf you don't see a command prompt, try pressing enter.\r\n\r\n$ nginx\r\n2023\/07\/27 09:13:33 [warn] 7#7: the \"user\" directive makes sense only if the master process runs with 
super-user privileges, ignored in \/etc\/nginx\/nginx.conf:2\r\nnginx: [warn] the \"user\" directive makes sense only if the master process runs with super-user privileges, ignored in \/etc\/nginx\/nginx.conf:2\r\n2023\/07\/27 09:13:33 [emerg] 7#7: mkdir() \"\/var\/cache\/nginx\/client_temp\" failed (13: Permission denied)\r\nnginx: [emerg] mkdir() \"\/var\/cache\/nginx\/client_temp\" failed (13: Permission denied)\r\n\r\n$ exit\r\nRemoving debug pod ...\r\n<\/pre>\n<p>As we can see in the log, we are getting &#8220;Permission denied&#8221; errors; that is why the pod fails.<\/p>\n<p>Now let&#8217;s debug the pod as the admin user:<\/p>\n<pre class=\"lang:default decode:true\">$ oc debug deploy\/dnginx --as-root\r\nDefaulting container name to nginx.\r\nUse 'oc describe pod\/dnginx-debug -n debug' to see all of the containers in this pod.\r\n\r\nDebugging with pod\/dnginx-debug, original command: &lt;image entrypoint&gt;\r\nError from server (Forbidden): pods \"dnginx-debug\" is forbidden: unable to validate against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000200000, 1000209999]]\r\n\r\n\r\n$ oc login -u kubeadmin -p redhat\r\n\r\n$ oc project debug\r\n\r\n$ oc get all\r\nNAME                         READY     STATUS             RESTARTS   AGE\r\npod\/dnginx-88c7766dd-hlbtd   0\/1       CrashLoopBackOff   11         33m\r\n\r\nNAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE\r\ndeployment.apps\/dnginx   1         1         1            0           33m\r\n\r\nNAME                               DESIRED   CURRENT   READY     AGE\r\nreplicaset.apps\/dnginx-88c7766dd   1         1         0         33m\r\n\r\n\r\n$ oc debug deploy\/dnginx --as-root\r\nDefaulting container name to nginx.\r\nUse 'oc describe pod\/dnginx-debug -n debug' to see all of the containers in this pod.\r\n\r\nDebugging with pod\/dnginx-debug, original command: &lt;image 
entrypoint&gt;\r\nWaiting for pod to start ...\r\nIf you don't see a command prompt, try pressing enter.\r\n\r\n# nginx\r\n2023\/07\/27 09:35:32 [notice] 7#7: using the \"epoll\" event method\r\n2023\/07\/27 09:35:32 [notice] 7#7: nginx\/1.25.1\r\n2023\/07\/27 09:35:32 [notice] 7#7: built by gcc 12.2.0 (Debian 12.2.0-14)\r\n2023\/07\/27 09:35:32 [notice] 7#7: OS: Linux 3.10.0-1160.92.1.el7.x86_64\r\n2023\/07\/27 09:35:32 [notice] 7#7: getrlimit(RLIMIT_NOFILE): 1048576:1048576\r\n2023\/07\/27 09:35:32 [notice] 8#8: start worker processes\r\n2023\/07\/27 09:35:32 [notice] 8#8: start worker process 9\r\n# 2023\/07\/27 09:35:32 [notice] 8#8: start worker process 10\r\n2023\/07\/27 09:35:32 [notice] 8#8: start worker process 11\r\n2023\/07\/27 09:35:32 [notice] 8#8: start worker process 12\r\n2023\/07\/27 09:35:32 [notice] 8#8: start worker process 13\r\n2023\/07\/27 09:35:32 [notice] 8#8: start worker process 14\r\n2023\/07\/27 09:35:32 [notice] 8#8: start worker process 15\r\n2023\/07\/27 09:35:32 [notice] 8#8: start worker process 16\r\n\r\n# ps aux\r\n\/bin\/sh: 4: ps: not found\r\n# ls \/proc\r\n1   14  9          cmdline   diskstats    filesystems  irq        kmsg        mdstat   mtrr          schedstat  stat           timer_list   vmallocinfo\r\n10  15  acpi       consoles  dma          fs           kallsyms   kpagecount  meminfo  net           scsi       swaps          timer_stats  vmstat\r\n11  16  buddyinfo  cpuinfo   driver       interrupts   kcore      kpageflags  misc     pagetypeinfo  self       sys            tty          xen\r\n12  17  bus        crypto    execdomains  iomem        key-users  loadavg     modules  partitions    slabinfo   sysrq-trigger  uptime       zoneinfo\r\n13  8   cgroups    devices   fb           ioports      keys       locks       mounts   sched_debug   softirqs   sysvipc        version\r\n# cat \/proc\/8\/cmdline\r\nnginx: master process nginx#\r\n# cat \/proc\/9\/cmdline\r\nnginx: worker process#\r\n# exit\r\n\r\nRemoving debug 
pod ...\r\n<\/pre>\n<p>As we can see, when run as the admin user, the nginx pod works properly.<\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Lab: Fixing Application Permissions<\/span><\/p>\n<ul>\n<li>Use <code>oc run mynginx --image=nginx<\/code> to run an Nginx webserver Pod<\/li>\n<li>It fails. Fix it.<\/li>\n<\/ul>\n<pre class=\"lang:default decode:true \">$ oc new-project fixapp\r\n$ oc run mynginx --image=nginx\r\n$ oc get pods\r\nNAME      READY   STATUS    RESTARTS   AGE\r\nmynginx   0\/1     Pending   0          11s\r\n\r\n$ oc logs mynginx\r\n$ oc describe pod mynginx\r\n\r\n$ oc get pod mynginx -o yaml | grep message\r\nmessage: 0\/1 nodes are available: 1 node(s) had untolerated taint {Node: Worker}.\r\n\r\n$ oc describe pod\/mynginx\r\nName:         mynginx\r\nNamespace:    fixapp\r\nPriority:     0\r\nNode:         &lt;none&gt;\r\nLabels:       run=mynginx\r\nAnnotations:  openshift.io\/scc: anyuid\r\nStatus:       Pending\r\nIP:\r\nIPs:          &lt;none&gt;\r\nContainers:\r\n  mynginx:\r\n    Image:        nginx\r\n    Port:         &lt;none&gt;\r\n    Host Port:    &lt;none&gt;\r\n    Environment:  &lt;none&gt;\r\n    Mounts:\r\n      \/var\/run\/secrets\/kubernetes.io\/serviceaccount from kube-api-access-wk9p9 (ro)\r\nConditions:\r\n  Type           Status\r\n  PodScheduled   False\r\nVolumes:\r\n  kube-api-access-wk9p9:\r\n    Type:                    Projected (a volume that contains injected data from multiple sources)\r\n    TokenExpirationSeconds:  3607\r\n    ConfigMapName:           kube-root-ca.crt\r\n    ConfigMapOptional:       &lt;nil&gt;\r\n    DownwardAPI:             true\r\n    ConfigMapName:           openshift-service-ca.crt\r\n    ConfigMapOptional:       &lt;nil&gt;\r\nQoS Class:                   BestEffort\r\nNode-Selectors:              &lt;none&gt;\r\nTolerations:                 node.kubernetes.io\/not-ready:NoExecute op=Exists for 300s\r\n                             node.kubernetes.io\/unreachable:NoExecute op=Exists for 
300s\r\nEvents:\r\n  Type     Reason            Age                From               Message\r\n  ----     ------            ----               ----               -------\r\n  Warning  FailedScheduling  96s (x7 over 33m)  default-scheduler  0\/1 nodes are available: 1 node(s) had untolerated taint {Node: Worker}. preemption: 0\/1 nodes are available: 1 Preemption is not helpful for scheduling.\r\n\r\n\r\n$ oc get all\r\nNAME          READY   STATUS    RESTARTS   AGE\r\npod\/mynginx   0\/1     Pending   0          2m38s\r\n\r\n$ oc get pod mynginx -o yaml | oc adm policy scc-subject-review -f -\r\nRESOURCE      ALLOWED BY\r\nPod\/mynginx   anyuid\r\n\r\n$ oc get pods\r\nNAME      READY   STATUS    RESTARTS   AGE\r\nmynginx   0\/1     Pending   0          11m\r\n\r\n$ oc get all\r\nNAME          READY   STATUS    RESTARTS   AGE\r\npod\/mynginx   0\/1     Pending   0          11m\r\n\r\n\r\n$ oc get pods -o yaml\r\n...\r\n      message: '0\/1 nodes are available: 1 node(s) had untolerated taint {Node: Worker}.\r\n...\r\n$ oc get pods -o yaml &gt; fixapp.yaml\r\n\r\n$ oc delete pod mynginx\r\npod \"mynginx\" deleted\r\n\r\n$ oc get pods\r\nNo resources found in fixapp namespace.\r\n\r\n$ vi fixapp.yaml\r\n---\r\n    serviceAccount: fixapp-sa\r\n    serviceAccountName: fixapp-sa\r\n---\r\n\r\n tolerations:\r\n - effect: NoSchedule\r\n   key: Node\r\n   value: Worker\r\n   operator: Equal\r\n---\r\n\r\n\r\n$ oc create sa fixapp-sa\r\n\r\n$ oc adm policy add-scc-to-user anyuid -z fixapp-sa\r\n\r\n$ oc create -f fixapp.yaml\r\n\r\n\r\n$ oc get pods\r\nNAME      READY   STATUS    RESTARTS   AGE\r\nmynginx   1\/1     Running   0          23s\r\n\r\n<\/pre>\n<p>Change the yaml file:<\/p>\n<pre class=\"lang:default decode:true \">    securityContext:\r\n      fsGroup: 1000510000\r\n      seLinuxOptions:\r\n        level: s0:c23,c2\r\n    serviceAccount: fixapp-sa\r\n    serviceAccountName: fixapp-sa\r\n    terminationGracePeriodSeconds: 10\r\n    volumes:\r\n    - name: 
deployer-token-c84pp\r\n      secret:\r\n        defaultMode: 420\r\n        secretName: deployer-token-c84pp\r\n  status:\r\n<\/pre>\n<p>And then:<\/p>\n<pre class=\"lang:default decode:true\">$ oc create -f fixapp.yaml \r\npod\/mynginx-1-deploy created \r\n\r\n$ oc get pods \r\nNAME READY STATUS RESTARTS AGE \r\nmynginx-1-deploy 0\/1 ContainerCreating 0 6s \r\n\r\n<\/pre>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Lab: Configuring MySQL<\/span><\/p>\n<p>\u2022 As the developer user, use a deployment to create an application named mysql in the microservice project<br \/>\n\u2022 Create a generic secret named mysql, using password as the key and mypassword as its value.<br \/>\nUse this secret to set the MYSQL_ROOT_PASSWORD environment variable to the value of the password in the secret.<br \/>\n\u2022 Configure the MySQL application to mount a PVC to \/mnt. The PVC must have a 1GiB size, and the ReadWriteOnce access mode<br \/>\n\u2022 Use a nodeSelector to ensure that MySQL will only run on your CRC node<\/p>\n<p>&nbsp;<\/p>\n<pre class=\"lang:default decode:true \">$ oc login -u kubeadmin -p $(cat ~\/.crc\/machines\/crc\/kubeadmin-password) https:\/\/api.crc.testing:6443\r\n\r\n$ oc new-project microservice\r\n$ oc new-app --name mysql --docker-image mysql\r\n\r\n$ oc get pods\r\nNAME                     READY   STATUS              RESTARTS   AGE\r\nmysql-59cd867785-qd6gb   0\/1     ContainerCreating   0          4s\r\n\r\n$ oc logs mysql-59cd867785-qd6gb\r\n2023-09-18 14:32:59+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 8.1.0-1.el8 started.\r\n2023-09-18 14:33:01+00:00 [ERROR] [Entrypoint]: Database is uninitialized and password option is not specified\r\n    You need to specify one of the following as an environment variable:\r\n    - MYSQL_ROOT_PASSWORD\r\n    - MYSQL_ALLOW_EMPTY_PASSWORD\r\n    - MYSQL_RANDOM_ROOT_PASSWORD\r\n\r\n$ oc create secret generic mysql --from-literal=password=mypassword\r\n\r\n$ oc set env deployment 
mysql --prefix MYSQL_ROOT_ --from secret\/mysql\r\n\r\n$ oc get pods\r\nRunning\r\n\r\n$ oc set volumes deployment\/mysql --name mysql-pvc --add --type pvc --claim-size 1Gi --claim-mode rwo --mount-path \/mnt\r\n\r\n$ oc get nodes\r\nNAME STATUS ROLES AGE VERSION\r\ncrc-lgph7-master-0 Ready master,worker 321d v1.24.0+3882f8f\r\n\r\n$ oc label nodes crc-lgph7-master-0 role=master\r\nnode\/crc-lgph7-master-0 labeled\r\n\r\n\r\n$ oc edit deployment\r\n\r\ndnsPolicy: ClusterFirst\r\nnodeSelector:\r\n  role: master\r\nrestartPolicy: Always\r\n\r\n$ oc get pods\r\nNAME READY STATUS RESTARTS AGE\r\nmysql-767bb84f9-8944q 1\/1 Running 0 11m<\/pre>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #3366ff;\">Lab: Configuring WordPress<\/span><\/p>\n<p>\u2022 As the developer user, use a deployment to create an application named wordpress in the microservice project<br \/>\n\u2022 Run this application with the anyuid security context assigned to the wordpress-sa service account<br \/>\n\u2022 Create a route to the WordPress application, using the hostname wordpress-microservice.apps-crc.testing<br \/>\n\u2022 Use Secrets and\/or ConfigMaps to set environment variables:<br \/>\n\u2022 WORDPRESS_DB_HOST: is set to mysql<br \/>\n\u2022 WORDPRESS_DB_NAME: is set to the value of wordpress<br \/>\n\u2022 WORDPRESS_DB_USER: has the value &#8220;root&#8221;<br \/>\n\u2022 WORDPRESS_DB_PASSWORD is set to the value of the password key in the mysql secret<\/p>\n<pre class=\"lang:default decode:true \">$ oc whoami\r\ndeveloper\r\n$ oc new-project wordpress\r\n\r\n$ oc new-app --name wordpress --docker-image wordpress\r\nFlag --docker-image has been deprecated, Deprecated flag use --image\r\n\r\n$ oc get pods\r\nNAME READY STATUS RESTARTS AGE\r\nwordpress-5db7955867-65p8n 0\/1 CrashLoopBackOff 1 (15s ago) 2m18s\r\n\r\n$ oc login -u kubeadmin -p $(cat ~\/.crc\/machines\/crc\/kubeadmin-password) https:\/\/api.crc.testing:6443\r\n\r\n$ oc create sa wordpress-sa\r\n\r\n$ oc set sa deployment wordpress 
wordpress-sa\r\n\r\n$ oc adm policy add-scc-to-user anyuid -z wordpress-sa\r\n\r\n$ oc expose svc wordpress\r\n\r\n$ oc create cm wordpress-cm --from-literal=host=mysql --from-literal=name=wordpress --from-literal=user=root --from-literal=password=mypassword\r\n\r\n$ oc set env deployment wordpress --prefix WORDPRESS_DB_ --from configmap\/wordpress-cm\r\n\r\n$ oc get pods\r\nNAME READY STATUS RESTARTS AGE\r\nwordpress-8c97b779d-svrsh 1\/1 Running 0 11s\r\n\r\n<\/pre>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>You can usually ignore the differences between Kubernetes deployments and OpenShift deployment configurations when troubleshooting applications. The common failure scenarios and the ways to troubleshoot them are essentially the same.<\/p>\n","protected":false},"author":2,"featured_media":5960,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[93],"tags":[],"_links":{"self":[{"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/posts\/5111"}],"collection":[{"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/comments?post=5111"}],"version-history":[{"count":11,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/posts\/5111\/revisions"}],"predecessor-version":[{"id":5126,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/posts\/5111\/revisions\/5126"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/media\/5960"}],"wp:attachment":[{"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/media?parent=5111"}],"wp:term":[{"taxonomy":"category","embeddable":true,"h
ref":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/categories?post=5111"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/miro.borodziuk.eu\/index.php\/wp-json\/wp\/v2\/tags?post=5111"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}