✨ Exclude archived steps from VSCode search #3746

Open
wants to merge 5 commits into
base: master

Collaborator

@Marigold Marigold commented Dec 20, 2024

Motivation

Searching in VSCode can be frustrating when we keep old, archived versions of code in the repository. Quite frequently, I find myself yelling at my computer, only to realize that I was editing the wrong dataset (found through VSCode quick search). This problem has already been addressed by the find-latest-step extension, although it lacks some capabilities of the native VSCode quick search. I don't personally use the find-latest-etl-step extension because it's slow on my laptop and I prefer the fuzzy matching in quick search (although other people use it).

Solution

Identify all inactive steps and exclude them from the VSCode search toolbar and file explorer by adding them to "files.exclude" and "search.exclude" in .vscode/settings.json. Running the script excluded over 600 datasets. One disadvantage is that all VSCode users must be aware that archived files can't be found there and must instead be opened, for example, via code etl/path/to/script from the terminal.
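A minimal sketch of the idea, not the actual script in this PR. It assumes the list of archived step folders is already known (how inactive steps are detected is not shown here) and that settings.json is plain JSON without comments. The names `add_excludes` and `update_settings_file` are hypothetical.

```python
import json
from pathlib import Path


def add_excludes(settings: dict, archived_dirs: list[str]) -> dict:
    """Return a copy of `settings` with every archived folder hidden from
    both the file explorer ("files.exclude") and search ("search.exclude")."""
    updated = dict(settings)
    for key in ("files.exclude", "search.exclude"):
        # Copy the existing mapping so unrelated excludes are preserved.
        excludes = dict(updated.get(key, {}))
        for folder in archived_dirs:
            excludes[folder] = True
        updated[key] = excludes
    return updated


def update_settings_file(path: Path, archived_dirs: list[str]) -> None:
    """Read settings.json (assumed comment-free), add the excludes, write back."""
    settings = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_excludes(settings, archived_dirs), indent=4) + "\n")
```

Calling `update_settings_file(Path(".vscode/settings.json"), archived_dirs)` would then be enough to refresh both exclude lists. Existing settings and other exclude patterns are kept.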

@owidbot
Contributor

owidbot commented Dec 20, 2024

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs

Login: ssh owid@staging-site-vscode-exclude

chart-diff: ✅ No charts for review.
data-diff: ✅ No differences found
Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet

Automatically updated datasets matching weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included

Edited: 2025-01-14 10:05:40 UTC
Execution time: 16.43 seconds

@Marigold Marigold marked this pull request as ready for review January 13, 2025 10:10
@lucasrodes
Member

@Marigold Thanks for looking into this.

> I don't personally use the find-latest-etl-step extension because it's slow on my laptop and I prefer the fuzzy matching in quick search (although other people use it).

That's been my experience as well. The extension is a bit slow, and I've also preferred to use the native search for now.

If I understand your proposal, I need to first run python scripts/exclude_archived_steps.py and then open the VS Code search?

I would find it nice if there was the option to "unarchive steps" so that I can search for all steps.

@Marigold
Collaborator Author

There are two ways we can do it: either save it to the repository's .vscode/settings.json (VSCode Project settings) so that all users have it enabled by default, or let everyone decide whether to use it and write to User settings (~/Library/Application Support/Code/User/settings.json). The latter would mean that everyone has to run it manually once in a while, whereas with the former we could run it automatically and commit the result to the repo.

> If I understand your proposal, I need to first run python scripts/exclude_archived_steps.py and then open the VS Code search?

As per above, you wouldn't have to run it if we go with the project settings.

> I would find it nice if there was the option to "unarchive steps" so that I can search for all steps.

If you want to search for a text inside all steps, you can click on the cog button

image

Finding archived files is trickier, as they won't show up in quick search (Cmd + P). You'd have to do it either from the terminal with find . -type f -path "*garden/who/2024-09-09/flu_test*" (then run code etl/steps/data/grapher/who/2024-09-09/flu_test.py to open it) or from Finder.
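For anyone who prefers a script over find, here is a rough Python equivalent of that terminal lookup. It is only a sketch: `find_files` is a hypothetical helper, not something that exists in the repo, and it matches the pattern against paths relative to the search root.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def find_files(root: Path, pattern: str) -> list[Path]:
    """Return files under `root` whose relative path matches `pattern`,
    roughly like `find . -type f -path PATTERN`."""
    return sorted(
        p
        for p in root.rglob("*")
        # `*` in fnmatch patterns also matches "/", as it does for find's -path.
        if p.is_file() and fnmatchcase(p.relative_to(root).as_posix(), pattern)
    )
```

For example, `find_files(Path("."), "*garden/who/2024-09-09/flu_test*")` returns the matching paths, each of which can then be passed to code on the command line.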

I could also make it possible to keep files that are manually commented out in .vscode/settings.json. For instance

"files.exclude": {
    "etl/steps/data/garden/agriculture/2017-03-08": true,
    // "etl/steps/data/garden/agriculture/2023-04-20": true,
    ...

wouldn't hide the folder etl/steps/data/garden/agriculture/2023-04-20, and running exclude_archived_steps.py wouldn't overwrite it. That way, you could manually comment out archived steps you frequently work on.
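One way this could work, sketched under assumptions: the script scans the existing settings text for commented-out entries and writes them back as comments when regenerating the list. The helper names and the regular expression below are hypothetical, and the entry format is taken from the snippet above.

```python
import re

# Matches a commented-out entry such as:  // "etl/steps/...": true,
COMMENTED_ENTRY = re.compile(r'^\s*//\s*"([^"]+)"\s*:\s*true\s*,?\s*$')


def commented_out_paths(settings_text: str) -> set[str]:
    """Collect folders that a user has manually commented out."""
    return {
        m.group(1)
        for line in settings_text.splitlines()
        if (m := COMMENTED_ENTRY.match(line))
    }


def render_excludes(archived_dirs: list[str], keep_visible: set[str]) -> str:
    """Render the body of an exclude block, writing kept folders as comments
    so that the next run still finds them."""
    lines = []
    for folder in sorted(archived_dirs):
        prefix = "// " if folder in keep_visible else ""
        lines.append(f'    {prefix}"{folder}": true,')
    return "\n".join(lines)
```

The output is only the inner fragment of the block. A real script would still need to place it inside "files.exclude" and "search.exclude" and decide how to handle the trailing comma on the last entry.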

@pabloarosado
Contributor

Hi @Marigold, thanks for looking into this. I use find-latest-step: it takes 2 seconds to launch, and the hits it displays are more useful than those in the default cmd+p bar. So, for finding the latest ETL steps, I find it quite useful:

Screen.Recording.2025-01-21.at.16.38.02.mov

I don't know why it's so slow for you. But we could improve the extension. For example, it could search only in etl/steps/data, etl/steps/export and snapshots, and only for *.py files. Maybe that would speed things up (and exclude yaml and dvc files, if that's more convenient).

But if you think that's not a good approach, then your solution is also fine. Feel free to merge if you think this is convenient for you, thanks!
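To make the narrower search concrete, here is a sketch of the filter in Python. It only illustrates the rule (three root folders, *.py files only); it is not the extension's code, and `SEARCH_ROOTS` and `is_candidate` are hypothetical names.

```python
from pathlib import PurePosixPath

# Folders proposed above as the only places worth searching.
SEARCH_ROOTS = ("etl/steps/data", "etl/steps/export", "snapshots")


def is_candidate(path: str) -> bool:
    """True for *.py files located under one of the search roots."""
    p = PurePosixPath(path)
    return p.suffix == ".py" and any(p.is_relative_to(root) for root in SEARCH_ROOTS)
```

With this rule, yaml and dvc files are skipped along with anything outside the three folders.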

@Marigold
Collaborator Author

> I don't know why it's so slow for you

Well, I have a hunch: my old laptop thinks twice about every operation it does.

Personally, I find this approach more useful for things like large-scale refactorings (where you don’t want to be fixing archived steps) and for reducing clutter when searching for non-step modules. It’s definitely more of a nice-to-have, though.

I’ll keep it open a bit longer before deciding whether to merge it or not. I’m still unsure if the potential confusion from not being able to find archived steps is worth it.

4 participants