Portworx LVM script may choose the wrong disks #45

Open
displague opened this issue Feb 19, 2021 · 0 comments

displague commented Feb 19, 2021

Here are some of the inconsistencies:

  1. Two out of three nodes did not create an LVM volume group for the PX KVDB; only one node did so successfully.
  2. The third node, which did create the pwx_vg, picked up the larger 480GB drive instead of the 240GB one. (A sketch of what a deterministic selection might look like follows this list.)
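
For reference, a deterministic selection would presumably enumerate the unpartitioned disks and pick the smallest. Below is a minimal shell sketch of that idea; it is purely illustrative, not the actual Portworx script, and the pwx_vg/pwxkvdb names simply mirror the output further down.

# Illustrative only: pick the smallest disk with no partitions, then
# build the KVDB volume group on it.
KVDB_DISK=$(lsblk -dnb -o NAME,TYPE,SIZE |
    awk '$2 == "disk" {print $3, "/dev/"$1}' |
    while read -r size dev; do
        # Skip disks that already carry partitions.
        [ "$(lsblk -n "$dev" | wc -l)" -eq 1 ] && echo "$size $dev"
    done | sort -n | head -n 1 | cut -d' ' -f2)

pvcreate "$KVDB_DISK"
vgcreate pwx_vg "$KVDB_DISK"
lvcreate -l 100%FREE -n pwxkvdb pwx_vg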

I am including the relevant snippets below.

This is worker node 1, where the pwx_vg could not be created; the warning message is clearly visible.

root@equinix-metal-gke-cluster-yk9or-worker-01:~# pxctl status
Status: PX is operational
License: Trial (expires in 31 days)
Node ID: 1534165d-4b6b-41df-b8e1-03e8c8d5c4d1
    IP: 145.40.77.105 
    Local Storage Pool: 2 pools
    POOL    IO_PRIORITY RAID_LEVEL  USABLE  USED    STATUS  ZONE    REGION
    0   HIGH        raid0       447 GiB 10 GiB  Online  default default
    1   HIGH        raid0       224 GiB 10 GiB  Online  default default
    Local Storage Devices: 2 devices
    Device  Path        Media Type      Size        Last-Scan
    0:1 /dev/sdb    STORAGE_MEDIUM_SSD  447 GiB     12 Feb 21 17:34 UTC
    1:1 /dev/sdc    STORAGE_MEDIUM_SSD  224 GiB     12 Feb 21 17:34 UTC
    * Internal kvdb on this node is sharing this storage device /dev/sdc  to store its data.
    total       -   671 GiB
    Cache Devices:
     * No cache devices
Cluster Summary
    Cluster ID: equinix-metal-gke-cluster-yk9or
    Cluster UUID: 47eb0c51-b2c1-456b-a254-e5c849a7d1db
    Scheduler: kubernetes
    Nodes: 3 node(s) with storage (3 online)
    IP      ID                  SchedulerNodeName               StorageNode Used    Capacity    Status  StorageStatus   Version     Kernel          OS
    145.40.77.101   9afd9a30-0eb3-4a8d-937f-86f5cf63c4bc    equinix-metal-gke-cluster-yk9or-worker-03   Yes     20 GiB  671 GiB Online  Up      2.6.3.0-4419aa4 5.4.0-52-generic    Ubuntu 20.04.1 LTS
    145.40.77.211   99a6f578-6c6f-4b09-b516-8dd332beef7e    equinix-metal-gke-cluster-yk9or-worker-02   Yes     20 GiB  668 GiB Online  Up      2.6.3.0-4419aa4 5.4.0-52-generic    Ubuntu 20.04.1 LTS
    145.40.77.105   1534165d-4b6b-41df-b8e1-03e8c8d5c4d1    equinix-metal-gke-cluster-yk9or-worker-01   Yes     20 GiB  671 GiB Online  Up (This node)  2.6.3.0-4419aa4 5.4.0-52-generic    Ubuntu 20.04.1 LTS
    Warnings: 
         WARNING: Internal Kvdb is not using dedicated drive on nodes [145.40.77.105]. This configuration is not recommended for production clusters.
Global Storage Pool
    Total Used      :  60 GiB
    Total Capacity  :  2.0 TiB
root@equinix-metal-gke-cluster-yk9or-worker-01:~# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 447.1G  0 disk 
sdb      8:16   0 447.1G  0 disk 
sdc      8:32   0 223.6G  0 disk 
sdd      8:48   0 223.6G  0 disk 
├─sdd1   8:49   0     2M  0 part 
├─sdd2   8:50   0   1.9G  0 part 
└─sdd3   8:51   0 221.7G  0 part /
root@equinix-metal-gke-cluster-yk9or-worker-01:~# 
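
If it helps narrow this down, checking the LVM state directly on this node should show whether a pvcreate/vgcreate was attempted at all. These are standard LVM and util-linux commands, nothing Portworx-specific:

# Any physical volumes or volume groups present on this node?
pvs
vgs
# Leftover signatures on the unused disks (device names from lsblk above)
blkid /dev/sda /dev/sdc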

This is worker node 2, where there is no pwx_vg for the KVDB; the internal KVDB is sharing the /dev/sdc2 partition instead.

root@equinix-metal-gke-cluster-yk9or-worker-02:~# pxctl status
Status: PX is operational
License: Trial (expires in 31 days)
Node ID: 99a6f578-6c6f-4b09-b516-8dd332beef7e
    IP: 145.40.77.211 
    Local Storage Pool: 2 pools
    POOL    IO_PRIORITY RAID_LEVEL  USABLE  USED    STATUS  ZONE    REGION
    0   HIGH        raid0       447 GiB 10 GiB  Online  default default
    1   HIGH        raid0       221 GiB 10 GiB  Online  default default
    Local Storage Devices: 2 devices
    Device  Path        Media Type      Size        Last-Scan
    0:1 /dev/sdb    STORAGE_MEDIUM_SSD  447 GiB     12 Feb 21 17:47 UTC
    1:1 /dev/sdc2   STORAGE_MEDIUM_SSD  221 GiB     12 Feb 21 17:47 UTC
    * Internal kvdb on this node is sharing this storage device /dev/sdc2  to store its data.
    total       -   668 GiB
    Cache Devices:
     * No cache devices
    Journal Device: 
    1   /dev/sdc1   STORAGE_MEDIUM_SSD
Cluster Summary
    Cluster ID: equinix-metal-gke-cluster-yk9or
    Cluster UUID: 47eb0c51-b2c1-456b-a254-e5c849a7d1db
    Scheduler: kubernetes
    Nodes: 3 node(s) with storage (3 online)
    IP      ID                  SchedulerNodeName               StorageNode Used    Capacity    Status  StorageStatus   Version     Kernel          OS
    145.40.77.101   9afd9a30-0eb3-4a8d-937f-86f5cf63c4bc    equinix-metal-gke-cluster-yk9or-worker-03   Yes     20 GiB  671 GiB Online  Up      2.6.3.0-4419aa4 5.4.0-52-generic    Ubuntu 20.04.1 LTS
    145.40.77.211   99a6f578-6c6f-4b09-b516-8dd332beef7e    equinix-metal-gke-cluster-yk9or-worker-02   Yes     20 GiB  668 GiB Online  Up (This node)  2.6.3.0-4419aa4 5.4.0-52-generic    Ubuntu 20.04.1 LTS
    145.40.77.105   1534165d-4b6b-41df-b8e1-03e8c8d5c4d1    equinix-metal-gke-cluster-yk9or-worker-01   Yes     20 GiB  671 GiB Online  Up      2.6.3.0-4419aa4 5.4.0-52-generic    Ubuntu 20.04.1 LTS
    Warnings: 
         WARNING: Internal Kvdb is not using dedicated drive on nodes [145.40.77.105 145.40.77.211]. This configuration is not recommended for production clusters.
Global Storage Pool
    Total Used      :  60 GiB
    Total Capacity  :  2.0 TiB
root@equinix-metal-gke-cluster-yk9or-worker-02:~# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 447.1G  0 disk 
sdb      8:16   0 447.1G  0 disk 
sdc      8:32   0 223.6G  0 disk 
├─sdc1   8:33   0     3G  0 part 
└─sdc2   8:34   0 220.6G  0 part 
sdd      8:48   0 223.6G  0 disk 
├─sdd1   8:49   0     2M  0 part 
├─sdd2   8:50   0   1.9G  0 part 
└─sdd3   8:51   0 221.7G  0 part /
root@equinix-metal-gke-cluster-yk9or-worker-02:~# 

Finally, this is worker node 3, the only node that created the pwx_vg, and it did so on the larger-capacity 447 GiB drive (/dev/sda).

root@equinix-metal-gke-cluster-yk9or-worker-03:~# pxctl status
Status: PX is operational
License: Trial (expires in 31 days)
Node ID: 9afd9a30-0eb3-4a8d-937f-86f5cf63c4bc
    IP: 145.40.77.101 
    Local Storage Pool: 2 pools
    POOL    IO_PRIORITY RAID_LEVEL  USABLE  USED    STATUS  ZONE    REGION
    0   HIGH        raid0       447 GiB 10 GiB  Online  default default
    1   HIGH        raid0       224 GiB 10 GiB  Online  default default
    Local Storage Devices: 2 devices
    Device  Path        Media Type      Size        Last-Scan
    0:1 /dev/sdb    STORAGE_MEDIUM_SSD  447 GiB     12 Feb 21 17:34 UTC
    1:1 /dev/sdc    STORAGE_MEDIUM_SSD  224 GiB     12 Feb 21 17:34 UTC
    total           -           671 GiB
    Cache Devices:
     * No cache devices
    Kvdb Device:
    Device Path     Size
    /dev/pwx_vg/pwxkvdb 447 GiB
     * Internal kvdb on this node is using this dedicated kvdb device to store its data.
Cluster Summary
    Cluster ID: equinix-metal-gke-cluster-yk9or
    Cluster UUID: 47eb0c51-b2c1-456b-a254-e5c849a7d1db
    Scheduler: kubernetes
    Nodes: 3 node(s) with storage (3 online)
    IP      ID                  SchedulerNodeName               StorageNode Used    Capacity    Status  StorageStatus   Version     Kernel          OS
    145.40.77.101   9afd9a30-0eb3-4a8d-937f-86f5cf63c4bc    equinix-metal-gke-cluster-yk9or-worker-03   Yes     20 GiB  671 GiB Online  Up (This node)  2.6.3.0-4419aa4 5.4.0-52-generic    Ubuntu 20.04.1 LTS
    145.40.77.211   99a6f578-6c6f-4b09-b516-8dd332beef7e    equinix-metal-gke-cluster-yk9or-worker-02   Yes     20 GiB  668 GiB Online  Up      2.6.3.0-4419aa4 5.4.0-52-generic    Ubuntu 20.04.1 LTS
    145.40.77.105   1534165d-4b6b-41df-b8e1-03e8c8d5c4d1    equinix-metal-gke-cluster-yk9or-worker-01   Yes     20 GiB  671 GiB Online  Up      2.6.3.0-4419aa4 5.4.0-52-generic    Ubuntu 20.04.1 LTS
Global Storage Pool
    Total Used      :  60 GiB
    Total Capacity  :  2.0 TiB
root@equinix-metal-gke-cluster-yk9or-worker-03:~# lsblk
NAME             MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                8:0    0 447.1G  0 disk 
└─pwx_vg-pwxkvdb 253:0    0 447.1G  0 lvm  
sdb                8:16   0 447.1G  0 disk 
sdc                8:32   0 223.6G  0 disk 
sdd                8:48   0 223.6G  0 disk 
├─sdd1             8:49   0     2M  0 part 
├─sdd2             8:50   0   1.9G  0 part 
└─sdd3             8:51   0 221.7G  0 part /
root@equinix-metal-gke-cluster-yk9or-worker-03:~# 

Any thoughts regarding these inconsistencies?
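
To compare all three nodes at once, something like the loop below would report which physical volume each pwx_vg actually landed on. It assumes the worker hostnames resolve and are reachable over SSH as root; adjust to your environment:

# Illustrative: report PV-to-VG mapping on every worker node.
for n in 01 02 03; do
    echo "--- worker-$n ---"
    ssh "root@equinix-metal-gke-cluster-yk9or-worker-$n" \
        pvs -o pv_name,pv_size,vg_name
done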

Originally posted by @bikashrc25 in #37 (comment)
