Portworx: Failed to load PX filesystem dependencies for kernel
September 22, 2020
This is a follow-up to my previous post on Architecture considerations for stateful Kubernetes applications and is specific to VMware’s Tanzu Kubernetes Grid (TKG) implementation of Kubernetes. Instead of using an NFS pod to gain RWX (ReadWriteMany) access to vSphere volumes, I decided to go a different route.
Important caveat #1 for TKG users: This is currently only recommended in test/dev environments. Portworx confirmed that a fix for the kernel headers issue is planned for their v3 release.
Important caveat #2 for TKG/vSphere users: At the time of writing, you could not generate a spec from PX Central for the ‘Essentials’ tier that includes support for creating vSphere volumes, although Portworx confirmed this was on the way. Update: I just checked PX Central and it appears this functionality has been added since.
Portworx Install & Kernel headers
VMware’s vSAN is not currently an option for me, so while looking for an alternative I eventually stumbled onto Portworx, a Kubernetes storage provider built to provide storage pools that are accessible from multiple nodes. This addresses the RWX (ReadWriteMany) issue that I discussed in my previous post. Their free version supports up to 5 nodes and a 5 TB storage pool.
After getting Portworx installed, I started checking on the progress and noticed that none of the Portworx pods were starting correctly. What is going on?
List the Portworx pods:
kubectl get pods -l name=portworx -nkube-system
Then inspect pod(s) failing to start:
kubectl describe pod portworx-cbmsl -nkube-system
Events:
  Type     Reason                             Age                     From                                     Message
  ----     ------                             ----                    ----                                     -------
  Normal   PortworxMonitorImagePullInPrgress  56m                     portworx, tkg-dev-md-0-565567f9c9-wxz7m  Portworx image docker.io/portworx/px-essentials:2.6.0 pull and extraction in progress
  Normal   PortworxMonitorImagePullInPrgress  40m                     portworx, tkg-dev-md-0-565567f9c9-wxz7m  Portworx image docker.io/portworx/px-essentials:2.6.0 pull and extraction in progress
  Warning  FileSystemDependency               40m                     portworx, tkg-dev-md-0-565567f9c9-wxz7m  Failed to load PX filesystem dependencies for kernel 4.19.132-1.ph3
  Normal   PortworxMonitorImagePullInPrgress  25m                     portworx, tkg-dev-md-0-565567f9c9-wxz7m  Portworx image docker.io/portworx/px-essentials:2.6.0 pull and extraction in progress
  Normal   PortworxMonitorImagePullInPrgress  9m54s                   portworx, tkg-dev-md-0-565567f9c9-wxz7m  Portworx image docker.io/portworx/px-essentials:2.6.0 pull and extraction in progress
  Warning  Unhealthy                          63s (x11389 over 31h)   kubelet, tkg-dev-md-0-565567f9c9-wxz7m   Readiness probe failed: HTTP probe failed with statuscode: 503
Looks like we found the issue:
Warning FileSystemDependency 40m portworx, tkg-dev-md-0-565567f9c9-wxz7m Failed to load PX filesystem dependencies for kernel 4.19.132-1.ph3
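On a busy pod the warning is easy to miss among the image-pull events. As a small sketch (written for bash, whereas the commands in this post are formatted for PowerShell), the helper below keeps only the Warning rows of the describe output. The function name px_warning_events is my own invention and is not part of kubectl or Portworx.

```shell
# Keep only rows whose first column is "Warning" from `kubectl describe pod`
# output supplied on stdin. px_warning_events is a hypothetical helper name;
# it only filters text and never calls kubectl itself.
px_warning_events() {
  awk '$1 == "Warning"'
}

# Example (needs cluster access, so it is left commented out):
#   kubectl describe pod portworx-cbmsl -nkube-system | px_warning_events
```

Because the filter only reads stdin, it can also be run against a saved copy of the describe output.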
Verify Portworx’s status with pxctl (commands formatted for PowerShell):
$PX_POD=kubectl get pods -l name=portworx -nkube-system -o jsonpath="{.items[0].metadata.name}"
kubectl exec $PX_POD -nkube-system -- /opt/pwx/bin/pxctl status
PX stopped working 12m45.8s ago. Last status: Failed to load PX filesystem dependencies for kernel 4.19.132-1.ph3
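After some digging and searching with Google and tdnf, I was able to determine that the Linux kernel headers were the missing piece. Portworx loads its own kernel module, and when it has no prebuilt module for the running kernel it needs the headers to build one. In Photon OS 3, the needed package is linux-devel.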
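A quick way to check a node before and after the fix is to look for the kernel build tree that matches the running kernel. This is a minimal sketch in bash: /lib/modules/<release>/build is the conventional place where out-of-tree module builds look for headers, but that linux-devel on Photon OS populates exactly this path is an assumption on my part, and the function name has_kernel_headers is hypothetical.

```shell
# Report whether a kernel build tree appears to exist for a given release.
#   $1 = kernel release, e.g. the output of `uname -r`
#   $2 = modules root (optional, defaults to /lib/modules)
# Assumption: the headers package provides <root>/<release>/build.
has_kernel_headers() {
  release="$1"
  root="${2:-/lib/modules}"
  [ -e "$root/$release/build" ]
}

# Example, run on the node itself:
#   if has_kernel_headers "$(uname -r)"; then echo "headers present"; else echo "headers missing"; fi
```

The optional second argument exists only so the check can be pointed at a different directory, for example when testing it off the node.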
TKG nodes are deployed with key-based SSH access under the username ‘capv’, so I need to connect to the nodes and see if I can get the dependencies installed.
SSH into the node:
ssh capv@192.168.58.28
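Install the kernel headers:
sudo su -
tdnf install linux-devel
Once the package has been installed, reboot the node. The pxctl output further down reports newer kernels (4.19.138-2.ph3 and 4.19.145-1.ph3) than the 4.19.132-1.ph3 in the original error, so the install appears to bring in an updated kernel as well, which only takes effect after the reboot.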
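With several worker nodes, the same two steps have to be repeated on each one. The sketch below only prints the commands rather than running them, so they can be reviewed first. It assumes the ‘capv’ user can run sudo non-interactively and uses tdnf's -y flag to skip the confirmation prompt; the function name print_node_fix_commands is hypothetical, and the IP addresses in the example are the ones from my cluster.

```shell
# Print (do not run) one ssh command per node IP that installs linux-devel
# and then reboots the node. Assumes passwordless sudo for the capv user.
print_node_fix_commands() {
  for ip in "$@"; do
    printf "ssh capv@%s 'sudo tdnf install -y linux-devel && sudo reboot'\n" "$ip"
  done
}

# Example:
#   print_node_fix_commands 192.168.58.28 192.168.58.29 192.168.58.21
```

Rebooting one node at a time and waiting for it to rejoin keeps the Portworx cluster closer to quorum than rebooting them all at once.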
Confirm Portworx has started:
kubectl describe pod portworx-cbmsl -nkube-system
Events:
  Type     Reason                             Age                     From                                     Message
  ----     ------                             ----                    ----                                     -------
  Normal   PortworxMonitorImagePullInPrgress  44m                     portworx, tkg-dev-md-0-565567f9c9-wxz7m  Portworx image docker.io/portworx/px-essentials:2.6.0 pull and extraction in progress
  Normal   PortworxMonitorImagePullInPrgress  29m                     portworx, tkg-dev-md-0-565567f9c9-wxz7m  Portworx image docker.io/portworx/px-essentials:2.6.0 pull and extraction in progress
  Warning  FileSystemDependency               29m                     portworx, tkg-dev-md-0-565567f9c9-wxz7m  Failed to load PX filesystem dependencies for kernel 4.19.132-1.ph3
  Normal   PortworxMonitorImagePullInPrgress  13m                     portworx, tkg-dev-md-0-565567f9c9-wxz7m  Portworx image docker.io/portworx/px-essentials:2.6.0 pull and extraction in progress
  Warning  Unhealthy                          6m8s (x11569 over 32h)  kubelet, tkg-dev-md-0-565567f9c9-wxz7m   Readiness probe failed: HTTP probe failed with statuscode: 503
  Warning  FailedMount                        3m2s                    kubelet, tkg-dev-md-0-565567f9c9-wxz7m   MountVolume.SetUp failed for volume "px-account-token-z587c" : failed to sync secret cache: timed out waiting for the condition
  Normal   SandboxChanged                     3m1s                    kubelet, tkg-dev-md-0-565567f9c9-wxz7m   Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling                            3m1s                    kubelet, tkg-dev-md-0-565567f9c9-wxz7m   Pulling image "portworx/oci-monitor:2.6.0"
  Normal   Pulled                             3m                      kubelet, tkg-dev-md-0-565567f9c9-wxz7m   Successfully pulled image "portworx/oci-monitor:2.6.0"
  Normal   Pulled                             2m59s                   kubelet, tkg-dev-md-0-565567f9c9-wxz7m   Successfully pulled image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
  Normal   Started                            2m59s                   kubelet, tkg-dev-md-0-565567f9c9-wxz7m   Started container portworx
  Normal   Pulling                            2m59s                   kubelet, tkg-dev-md-0-565567f9c9-wxz7m   Pulling image "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
  Normal   Created                            2m59s                   kubelet, tkg-dev-md-0-565567f9c9-wxz7m   Created container portworx
  Normal   PortworxMonitorImagePullInPrgress  2m58s                   portworx, tkg-dev-md-0-565567f9c9-wxz7m  Portworx image docker.io/portworx/px-essentials:2.6.0 pull and extraction in progress
  Normal   Created                            2m58s                   kubelet, tkg-dev-md-0-565567f9c9-wxz7m   Created container csi-node-driver-registrar
  Normal   Started                            2m58s                   kubelet, tkg-dev-md-0-565567f9c9-wxz7m   Started container csi-node-driver-registrar
  Warning  NodeStateChange                    2m12s                   portworx, tkg-dev-md-0-565567f9c9-wxz7m  Node is not in quorum. Waiting to connect to peer nodes on port 9002.
  Warning  Unhealthy                          63s (x12 over 2m53s)    kubelet, tkg-dev-md-0-565567f9c9-wxz7m   Readiness probe failed: HTTP probe failed with statuscode: 503
  Normal   NodeStartSuccess                   62s                     portworx, tkg-dev-md-0-565567f9c9-wxz7m  PX is ready on this node
Excellent, the pod has started.
Verify Portworx status with pxctl:
$PX_POD=kubectl get pods -l name=portworx -nkube-system -o jsonpath="{.items[0].metadata.name}"
kubectl exec $PX_POD -nkube-system -- /opt/pwx/bin/pxctl status
Status: PX is operational
License: PX-Essential (lease renewal in 23h, 49m)
Node ID: {GUID}
        IP: 192.168.58.28
        Local Storage Pool: 1 pool
        POOL  IO_PRIORITY  RAID_LEVEL  USABLE  USED     STATUS  ZONE     REGION
        0     HIGH         raid0       57 GiB  5.0 GiB  Online  default  default
        Local Storage Devices: 1 device
        Device  Path       Media Type               Size    Last-Scan
        0:1     /dev/sdb2  STORAGE_MEDIUM_MAGNETIC  57 GiB  09 Sep 20 21:56 UTC
        total              -                        57 GiB
        Cache Devices:
         * No cache devices
        Kvdb Device:
        Device Path  Size
        /dev/sdc     150 GiB
         * Internal kvdb on this node is using this dedicated kvdb device to store its data.
        Journal Device:
        1       /dev/sdb1  STORAGE_MEDIUM_MAGNETIC
Cluster Summary
        Cluster ID: px-cluster-{GUID}
        Cluster UUID: {GUID}
        Scheduler: kubernetes
        Nodes: 3 node(s) with storage (3 online)
        IP             ID      SchedulerNodeName              StorageNode  Used     Capacity  Status  StorageStatus   Version          Kernel          OS
        192.168.58.28  {GUID}  tkg-dev-md-0-565567f9c9-gvp2k  Yes          5.0 GiB  57 GiB    Online  Up (This node)  2.6.0.0-208389c  4.19.138-2.ph3  VMware Photon OS/Linux
        192.168.58.29  {GUID}  tkg-dev-md-0-565567f9c9-2hd5q  Yes          7.2 GiB  57 GiB    Online  Up              2.6.0.0-208389c  4.19.138-2.ph3  VMware Photon OS/Linux
        192.168.58.21  {GUID}  tkg-dev-md-0-565567f9c9-wxz7m  Yes          5.0 GiB  57 GiB    Online  Up              2.6.0.0-208389c  4.19.145-1.ph3  VMware Photon OS/Linux
        Warnings:
                 WARNING: Persistent journald logging is not enabled on this node.
Global Storage Pool
        Total Used      :  17 GiB
        Total Capacity  :  171 GiB
Portworx has fully started and can be utilized.
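If you want to script this check rather than read the output by eye, the first line of the status output is enough to tell the two states in this post apart. A minimal bash sketch, assuming the "Status: PX is operational" line shown in my output is stable across versions (I have only seen it on 2.6.0); px_is_operational is a hypothetical helper name.

```shell
# Succeed if `pxctl status` output on stdin starts a line with
# "Status: PX is operational"; fail otherwise.
px_is_operational() {
  grep -q '^Status: PX is operational'
}

# Example (needs cluster access and $PX_POD set as in the commands in this post):
#   kubectl exec "$PX_POD" -nkube-system -- /opt/pwx/bin/pxctl status | px_is_operational \
#     && echo "Portworx is up"
```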
Create a StorageClass and the shared PVC:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: px-shared-sc
provisioner: kubernetes.io/portworx-volume
parameters:
  repl: "1"
  shared: "true"
allowVolumeExpansion: true
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: px-shared-pvc
  annotations:
    volume.beta.kubernetes.io/storage-class: px-shared-sc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 60Gi
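To see the RWX access in action, more than one pod has to mount the same claim. The bash sketch below prints a throwaway Pod manifest that mounts px-shared-pvc; the busybox image, the /data mount path, the pod names, and the function name rwx_pod_manifest are all illustrative choices of mine and not something Portworx requires.

```shell
# Print a Pod manifest that mounts the px-shared-pvc claim at /data.
#   $1 = pod name
# busybox, /data and the sleep command are placeholders for a real workload.
rwx_pod_manifest() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: shared
          mountPath: /data
  volumes:
    - name: shared
      persistentVolumeClaim:
        claimName: px-shared-pvc
EOF
}

# Example (needs cluster access):
#   rwx_pod_manifest writer-1 | kubectl apply -f -
#   rwx_pod_manifest writer-2 | kubectl apply -f -
```

If both pods reach Running and a file written under /data in one is visible in the other, the shared volume is behaving as ReadWriteMany.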
Hopefully this helps if you happen to encounter this issue before Portworx rolls out version 3.