Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize bibtex-completion-prepare-entry #419

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mpedramfar
Copy link
Contributor

If there are a lot of folders to search through, especially if some of them are on the cloud, bibtex-completion-find-pdf can be slow.
Currently, bibtex-completion-prepare-entry searches for the pdf files for all the entries.
If bibtex-completion-pdf-symbol is the empty string, then it will still search for every single pdf file and then show nothing for it.

This pull request checks if bibtex-completion-pdf-symbol is the empty string, and if it is, it will skip searching for it.
The same is done for notes and bibtex-completion-notes-symbol.

@tmalsburg
Copy link
Owner

tmalsburg commented Dec 18, 2022

Hi! Just to make sure there are no misunderstandings: I think you don't want to show PDF symbols in the results list but still be able to open PDFs? If correct, I see how this PR fixes your performance issue, but it seems like an extension that is very specific to your personal setup. And I don't understand why finding PDFs is so slow in your setup. Even with many folders it should be fast. My suspicion is that the best solution for this issue may be something else. Question: What do you mean when you say that some folders are in the cloud? Are these Dropbox folders or something like that? These should be cached locally and fast.

@mpedramfar
Copy link
Contributor Author

Yes. I don't want to lose any functionality and I don't want any changes in the behavior (even when bibtex-completion-pdf-symbol is the empty string) except for speed.

It probably is a bit specific to my setup. However, if you use rclone to mount a cloud storage (Dropbox, Google Drive, One Drive, etc) or use sshfs to mount a remote folder, then even running ls to list the contents of a remote directory could be slow.

@mpedramfar
Copy link
Contributor Author

This pull request also resolve #350 and provides a workaround for #226

@tmalsburg
Copy link
Owner

Not sure but I think a better solution might be to read the list of available PDFs once and then to check that list, not each file separately on disk. In general, we need a better approach to detecting PDFs and I have an idea for a plug-in approach that users could flexibly configure to their specific needs. Such a system would close half of all issues for this project or so. Perhaps I will find some time to work on this over the holidays but I can't promise it.

@tmalsburg
Copy link
Owner

How many folders to you have? Asking because it's useful for me to know what setups users have.

@mpedramfar
Copy link
Contributor Author

mpedramfar commented Dec 19, 2022

I have less than 500 folders, but some of them are in the cloud (mounted network files, not cached) and I've also made some tweaks to other places in bibtex-completion so that my pdf files are not necessarily of the format citekey.*.pdf
Because of this, the search becomes very slow.
So, in a way, this PR helps with 2 issues for me:

  1. slow access to network mounted drive
  2. support for more sophisticated pdf file names (which inevitably slows down the search, no matter how it's implemented)

On the other hand, I'm not entirely sure how a plug-in approach could be different enough to not have the first issue.
As I understand, if the new system is at least as capable as the current one, then this PR can help.

By capable, I mean this:
Are we caching a list of pdf files, or are we detecting them?
If we are caching the list, then we need to watch for any change in the filesystem. Except for local directories, this can be a bit of a hit and miss (e.g. see https://github.com/libfuse/libfuse/wiki/Fsnotify-and-FUSE and https://github.com/fsnotify/fsnotify#why-dont-notifications-work-with-nfs-smb-fuse-proc-or-sys). So, unless we wait until inotify support has been added for various FUSE and network filesystems, this is a loss in capability.
One way to "watch" for changes is to periodically detect them, which brings us back to the issue of detecting.
If we're going to detect them, then the cost of accessing the filessystem is always there, especially in network mounted folders. And this is exactly where this PR helps.

On the other hand, I can't think of how this change could be bad. Do you have a specific situation/plug-in system in mind where this change can be undesirable?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants