Replies: 6 comments 2 replies
-
My configuration is on an iSCSI volume (backed by https://github.com/ressu/synology-csi) and aside from CSI issues, the configuration is stable and works as expected. But I do recall some mentions (https://www.reddit.com/r/PleX/comments/ff4a59/plex_hangs_with_library_and_database_on_nfs/) of issues when the database is on NFS. There are some mentions of locking issues in the discussion there, but if your issues are resolved by simply recreating the PVC (effectively remounting the volume), then locking is an unlikely cause; locking issues would show up as data corruption instead. Have you tried rebooting the node? While it is disruptive to the other pods running on the node, it would confirm whether this is a mount issue or an issue somewhere else.
-
It isn't fixed by remounting, only by deleting the PVC, recreating it, and mounting the new PVC. It definitely appears to be corruption. The issues occur within an hour or two of the initial discovery and loading of the available media files into the libraries. I've started over several times with different mount options, with no luck.
My main share is on a 2019 Windows server that I mount on a Linux system, which re-exports it over NFS. I then use that NFS share for dynamic provisioning in the cluster. It seems to work for everything so far, except for Plex and Nextcloud. I am concerned that everything else I have recently set up will become corrupted as well. Looks like I need a better solution. Looking for ideas, though I wasn't expecting a sudden expense...
-
Oh, so you need to delete the data to get everything back running. Got it. I'm not sure how locking with Windows 2019 works, so you could try setting local_lock=all in the NFS mount options. On the upside, the way my fork of kube-plex is built avoids mounting the configuration on anything but the main Plex pod. I did this because multi-mounting an ext4 filesystem isn't supported and I was too lazy to seek out other alternatives 😆
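For reference, `local_lock=all` (which keeps flock/POSIX locks client-local rather than sending them to the server, so it is only safe when a single client touches the files) can be tested with a manual mount before wiring it into the cluster. Server name and paths below are placeholders:

```
# Manual mount for testing (nfs-server and paths are placeholders):
mount -t nfs -o hard,nfsvers=4.2,local_lock=all nfs-server:/export/plex-config /mnt/plex-config

# Equivalent /etc/fstab entry:
nfs-server:/export/plex-config  /mnt/plex-config  nfs  hard,nfsvers=4.2,local_lock=all  0  0
```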
-
Actually, I got some details wrong. My data folder is a CIFS mount, but my config mount is an NFS share hosted on a Linux system backed by a local folder; the local folder is on a locally mounted drive (a virtual disk). So the NFS share is just a disk on a Linux system, which at least removes one layer. I've learned that if I wait a while, the Plex server comes back up. It's very odd; the logs don't give any clues as far as I can tell. Still looking into local_lock=all. I can log into the pod, install telnet, connect to 32400, and do a GET /web/index.html, and it will just hang, most of the time eventually working after several minutes. Though once it took so long I just restarted the pod. Continuing to investigate...
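As an aside, when the server hangs for minutes like this, a liveness probe can at least restart the pod automatically instead of requiring a manual restart. A minimal sketch, assuming Plex's default port 32400 and its lightweight /identity status endpoint; the timing values are illustrative, not tuned:

```yaml
# Illustrative liveness probe for the Plex container spec.
# /identity is a small XML status endpoint served by Plex Media Server.
livenessProbe:
  httpGet:
    path: /identity
    port: 32400
  initialDelaySeconds: 60   # give Plex time to start
  periodSeconds: 15
  timeoutSeconds: 10        # treat long hangs as failures
  failureThreshold: 4       # ~1 minute of failures before restart
```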
-
Started working after I did two things:
I also switched to Longhorn (iSCSI), but I think my NFS setup would also have worked with RWO. Without RWX I can't play around with transcoding running in a separate pod, so I'm unable to take advantage of resources on other nodes. Well, one thing at a time; at least I can retire my Docker VM now and stay in Kubernetes. Thank you for your help.
-
Discussed this with Plex and ended up completely abandoning NFS in my Kubernetes clusters, and I recommend anyone using NFS do the same. Don't believe anyone when they tell you NFS or CIFS has fully functional file locking and can be used as a Kubernetes storage solution. Even if they work for a while, data corruption is inevitable. I ended up going with Longhorn, which uses a local disk on each worker node to allocate PVCs. It uses iSCSI as needed without you having to be involved in the iSCSI setup at all; it does everything other than installing iSCSI itself. Longhorn and other similar products also create multiple replicas of each PVC, so if you were to lose a worker node completely, the data would not be lost. I've had zero issues since switching to this solution and can hardly believe how fast Plex is. Also, I should be able to re-enable kube-plex now and enjoy this project again with its separate transcoding pod. My media folders are still mounted via CIFS, but this is OK as they are only used in a read-only manner (except for occasionally deleting something via the GUI). See the discussion here:
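For anyone following the same route, here is a minimal Longhorn StorageClass sketch. The class name and values are illustrative; the parameter names follow Longhorn's driver.longhorn.io CSI driver, but check the Longhorn docs for the full list:

```yaml
# Illustrative Longhorn StorageClass; values here are examples only.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-replicated
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Retain
parameters:
  numberOfReplicas: "3"        # replicas spread across nodes, so losing a worker doesn't lose data
  staleReplicaTimeout: "2880"  # minutes before a stale replica is cleaned up
```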
-
I've learned the hard way that Plex has issues with using an NFS share as its config folder. The plexinc/pms-docker image notes that the issue is around file locking not being enabled by default with NFS.
In my StorageClass NFS mount options I'm using 'hard' and 'nfsvers=4.2', yet I still watch in amazement as my Plex server stops working after a while and seems to only come back to a working condition by deleting the deployment, erasing the config PVC, recreating the PVC, and restarting.
My NFS server and all my cluster nodes are running the latest CentOS 8 Stream.
If you are using NFS for your config mount, what mount options are you using (on both the server and client side)? If you are using something else, what has worked for you as an on-prem storage solution to keep /config on as a network share?
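For concreteness, mount options like these are typically expressed on the StorageClass itself; the provisioner name below is a placeholder for whatever dynamic NFS provisioner is in use:

```yaml
# Illustrative StorageClass with NFS mount options; the provisioner
# name is a placeholder, not a real component from this thread.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-config
provisioner: example.com/nfs   # placeholder for your NFS provisioner
mountOptions:
  - hard
  - nfsvers=4.2
```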