
Feature Request: Make Read_Smb_Spool a separate task running concurrently #328

Open
sabatech opened this issue Jun 10, 2024 · 7 comments

@sabatech

I find that my system pauses for very long times during some actions on my greyhole server. I was wondering if it's possible to make Read_Smb_Spool a separately run task, so that greyhole can continue processing files while the spool is being processed. Maybe make the SMB spool action its own daemon or something.

Last logged action: read_smb_spool
on 2024-06-10 20:27:16 (1h 4m 8s ago)

@sabatech
Author

Last logged action: read_smb_spool
on 2024-06-10 20:27:16 (3h 13m 3s ago)

@sabatech
Author

And here I am a few hours later: it finally finished its read_smb_spool, and it's back to processing the spool AGAIN instead of processing files. File copies are getting backed up and waiting instead of Greyhole actually managing the storage copies. I want it to keep moving data, not spend hours processing the spool while files wait to be processed.

@sabatech
Author

Last logged action: read_smb_spool
on 2024-06-11 04:28:08 (29s ago)

@sabatech
Author

I think separating the spool process from the file handling process would be an incredible performance boost on top of everything else.

@sabatech
Author

Everything seems to just wait on the SMB spool handling, and once enough spool activity has accumulated, the spool itself becomes the bottleneck.

@sabatech
Author

It seems like all it wants to do is process the spool rather than process the files; it's building a huge backlog of work to do, but not actually doing it.

@gboudreau
Owner

Greyhole tries to process the spool often precisely to prevent what is happening to you right now: having so many spooled operations that it takes a very long time just to list and sort them correctly, each time it needs to do so (before moving them from files in the spool folder to rows in MySQL).
You're now at a point where you're so far behind in spool processing, with probably millions of operations spooled, that a simple ls (I think it actually uses find) and sort takes forever.
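As a rough way to gauge how far behind the spool is (a minimal sketch; /var/spool/greyhole is the usual spool location and may differ on your install):

```shell
# Count how many spooled operations are still waiting to be ingested into MySQL.
# Adjust the path if your spool folder lives elsewhere.
find /var/spool/greyhole -type f | wc -l
```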

Your suggestion might be beneficial in some very specific situations, but it is not that simple to implement.

To resolve your current situation, you can try lowering the value of max_queued_tasks to something below the default (10,000,000). This setting limits the number of rows inserted in MySQL: once that limit is reached, the spool processor will NOT do anything, and the daemon will instead work on file operations until the number of queued tasks in MySQL drops back below that number. Look in MySQL at the number of rows in the tasks table, and configure greyhole.conf with a number much lower than that.
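As a sketch of what that looks like in practice (the database name, credentials, example value, and service name below are placeholders; only the tasks table and the max_queued_tasks setting come from Greyhole itself):

```shell
# Count the currently queued tasks (database name and credentials are examples; adjust to your setup):
mysql -u greyhole -p greyhole -e "SELECT COUNT(*) FROM tasks;"

# Then, in greyhole.conf, set max_queued_tasks well below that count, e.g.:
#   max_queued_tasks = 100000

# Restart the daemon so the new value is picked up (service name may vary by distro):
systemctl restart greyhole
```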

Do you know why you have so many file operations? If you do, and, for example, you are adding a lot of files into your Greyhole pool through Samba in a specific share or folder, then maybe you can just delete your complete spool folder, re-create it (greyhole --create-mem-spool), and once you're done copying files, run greyhole --fsck to handle the new/changed files.
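A rough outline of that reset, assuming the default /var/spool/greyhole location, a systemd service named greyhole, and that you stop the daemon before touching the spool (adjust all of these to your setup):

```shell
# Stop the daemon so nothing writes to the spool while it is being reset (assumed service name)
systemctl stop greyhole

# Remove the backed-up spool and re-create it
rm -rf /var/spool/greyhole
greyhole --create-mem-spool

systemctl start greyhole

# Once the large copy is finished, have Greyhole pick up the new/changed files
greyhole --fsck
```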
