Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust cache ModuleDataProvider to re-retrieve delegate values from store after putting in store #3449

Merged
merged 9 commits into from
Nov 8, 2024

Conversation

doriable
Copy link
Member

@doriable doriable commented Nov 5, 2024

This PR changes the caching behaviour so that once remote values
have been retrieved and stored in the cache, we re-retrieve these
values from the cache. This ensures that if the cache is an on-disk
store, then all values will have a LocalPath value.

This is needed to fix the LSP when there is no cache. When the LSP
checks for importable files, it builds the workspace.
Previous to this change, an empty cache results in the remotely fetched
imports to not have LocalPath, even though they are put in the cache
after being retrieved. The LSP relies on LocalPath, so it breaks.

Tested this locally with my own Neovim + LSP setup.

Fixes #3360

@doriable doriable requested a review from mcy November 5, 2024 20:32
Copy link
Contributor

github-actions bot commented Nov 5, 2024

The latest Buf updates on your PR. Results from workflow Buf CI / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedNov 7, 2024, 11:30 PM

@stefanvanburen
Copy link
Member

tried this out and works for me locally. Without this + an empty cache, the first open of a file with imports shows a ton of diagnostics. (I think some process in the lsp does eventually populate the cache, because after an edit of the file + a save, the diagnostics go away.)

With this + an empty cache, no diagnostics are shown. Seems like a clear improvement to me.

)

var (
errFileInfoNoLocalPath = errors.New("file info has no local path")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this error printed for a user? Even if not, I don't actually know what it means - what does it mean for a file info to not have a path? This should be clearer

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is printed in logs, but not directly exposed to the user in-line. I can clarify this error, however it is mostly for logging and debugging.

// LocalPath will not always be present if the file did not originate from disk. If an import's
// module data was fetched remotely, then it will not have a path on disk to the cache. By
// refreshing the module, and rebuilding the workspace, we populate the module data from the cache.
func getWorkspaceWithLocalPaths(ctx context.Context, fileName string, lsp *lsp) (bufworkspace.Workspace, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this isn't a method on *lsp?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to follow the pattern set out by the call-site findImportable, which is being called by *file.IndexImports, and *file.lsp is passed through.

return nil, err
}
var refreshWorkspace bool
for _, module := range workspace.Modules() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very expensive - this means that for every call of getWorkspaceWithLocalPaths, you are going to iterate over every file in the workspace. This feels like it would be prohibitive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes -- covered some of this in the response to a comment below. If we do pursue changing the semantics ModuleDataProvider to return file info after writing to the cache, then we can avoid checking each file since we'll have guarantees to the file on-disk in the cache.

for _, module := range workspace.Modules() {
if err := module.WalkFileInfos(ctx, func(fileInfo bufmodule.FileInfo) error {
if fileInfo.FileType() == bufmodule.FileTypeProto && fileInfo.LocalPath() == "" {
return fmt.Errorf("file %q no local path: %w", fileInfo.Path(), errFileInfoNoLocalPath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message that would be printed here is file "foo.proto" no local path: file info has no local path. I don't know what this means or what to do with it as a user, can you make the error clearer?

}
}
if refreshWorkspace {
workspace, err = getWorkspaceWithLocalPaths(ctx, fileName, lsp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what is going on here (which means either it needs code clarification, or this function needs to be refactored). It appears you're making a recursive call with no clear base case that will result in recursion terminating. Moreover, each recursive call is O(files).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can clarify the comment at the top-level of the function and in-line more information, however this is an accurate description of what is going on. We are trying to ensure that the workspace that is built, for all imports, we can resolve a LocalPath. And the only way to check for that is to iterate through each file, there are no other guarantees otherwise.

The exit condition is where all files have a LocalPath, which should be the state we converge to once all imports are being resolved through the cache.

This is an expensive operation as it is implemented because of the conditions set by the LSP. In practice, we should have a fully populated cache and module data on-disk fairly quickly, since we cache all module data.

The other option we have is to change the semantics of ModuleDataProvider, such that if we resolve remote module data and put those values in the cache after, then we return module data with the cached information.

@doriable doriable changed the title Ensure all import file have a local path for the LSP Adjust cache ModuleDataProvider to re-retrieve delegate values from store after putting in store Nov 7, 2024
@doriable
Copy link
Member Author

doriable commented Nov 7, 2024

@bufdev pulled in the changes today for ModuleDataStore. Made some small adjustments:

  • checking to make sure there are no unfound keys from the store
  • adjusted the wording to specify on-disk cache, since LocalPath would still be absent for an in-memory cache, for example

Also updated the PR title and description and tested the LSP (and other buf operations) locally.

@doriable doriable requested a review from bufdev November 7, 2024 23:33
@doriable doriable merged commit 62082b2 into main Nov 8, 2024
10 checks passed
@doriable doriable deleted the 3360-fix-lsp-no-cache branch November 8, 2024 11:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

buf beta lsp doesn't work with an empty cache
3 participants