ZIM filename does not match requirements #43

Closed
benoit74 opened this issue Nov 6, 2023 · 6 comments
Labels: bug, prio1, upstream

Comments

@benoit74
Collaborator

benoit74 commented Nov 6, 2023

When creating a schedule, the provided ZIM filename does not match our convention, and the Zimfarm has refused it since openzim/zimfarm#865 was merged.

@benoit74
Collaborator Author

benoit74 commented Nov 6, 2023

I see two paths forward on this issue:

  1. we modify the check for zim filename validity on farm.youzim.it
    1. either we allow passing a custom regex for this instance via an optional environment variable
    2. or we completely disable the ZIM filename check via an optional environment variable
  2. we modify zimit to also support the {lang} placeholder in ZIM filename + zimfarm to allow this as a valid filename
    1. youzim.it ZIM filename would become {url_hostname}_{lang}_{unique_id_6_chars}_{period}.zim instead of {url_hostname}_{unique_id_8_chars}.zim
    2. {unique_id_8_chars} is not really a selection ...
    3. maybe {url_hostname}_{lang}_all_{unique_id_6_chars}_{period}.zim would make more sense
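To make option 1 concrete, here is a minimal sketch of what an environment-driven check could look like. Everything named here is an assumption for illustration: the variable names `ZIM_FILENAME_REGEX` and `ZIM_FILENAME_CHECK_DISABLED`, the function name, and the default pattern are hypothetical and are not the actual Zimfarm code or regex.

```python
import os
import re

# Hypothetical default pattern (NOT the real Zimfarm regex): it only
# approximates a "name_lang_selection_{period}.zim" style convention.
DEFAULT_PATTERN = r"^[a-z0-9.\-]+_[a-z]{2,3}_[a-z0-9\-]+_\{period\}\.zim$"


def is_valid_zim_filename(filename, environ=None):
    """Validate a ZIM filename, honouring two optional environment variables."""
    env = os.environ if environ is None else environ

    # Option 1.2: an instance can switch the check off entirely.
    if env.get("ZIM_FILENAME_CHECK_DISABLED", "").lower() in ("1", "true", "yes"):
        return True

    # Option 1.1: an instance can supply its own regex; otherwise use the default.
    pattern = env.get("ZIM_FILENAME_REGEX", DEFAULT_PATTERN)
    return re.fullmatch(pattern, filename) is not None
```

With this shape, an instance like farm.youzim.it could set a regex matching `{url_hostname}_{unique_id_8_chars}.zim`, while other instances keep the default without any configuration.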

I prefer option 2 since:

  • it would allow producing filenames that almost match our own convention, which is probably not a bad thing
  • adding support for the {lang} placeholder in zimfarm is most probably a useful feature in other situations (scrapers producing multiple languages at once would benefit from this as well)
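For illustration, option 2 could work roughly as sketched below: the frontend builds a filename that still contains the `{lang}` and `{period}` placeholders, and the scraper fills them in at run time. The function names, the use of `secrets.token_hex` for the unique ID, and the `YYYY-MM` period format are assumptions for this sketch, not how zimit or the Zimfarm actually implement it.

```python
import secrets
from urllib.parse import urlsplit


def build_template(url, unique_id=None):
    """Frontend side: resolve hostname and unique ID, keep {lang} and {period}."""
    hostname = urlsplit(url).hostname or "unknown"
    # 3 random bytes rendered as hex give a 6-character identifier.
    uid = unique_id if unique_id is not None else secrets.token_hex(3)
    # Doubled braces keep the placeholders literal in the resulting string.
    return f"{hostname}_{{lang}}_all_{uid}_{{period}}.zim"


def expand_placeholders(template, lang, period):
    """Scraper side: substitute the remaining placeholders."""
    return template.format(lang=lang, period=period)


template = build_template("https://www.example.com/some/page", unique_id="abc123")
filename = expand_placeholders(template, lang="eng", period="2023-11")
```

Here `template` keeps both placeholders until the scraper knows the content language, which is what would let multi-language scrapers reuse the same mechanism.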

@rgaudin
Member

rgaudin commented Nov 6, 2023

youzim.it filenames have already been debated a lot. @Popolechien has been very insistent on the filenames being user-centered, given those are not in the catalog. Including the unique ID (so we can safely store them in a flat folder) required some convincing.

Additionally, the period makes zero sense in the context of youzim.it: its creation is manual, not periodic, and this month-period would only bring confusion.

I thus think option 1 is more practical. I don't recall how wp1 filenames are constructed. They might not fit the pattern either.

@benoit74
Collaborator Author

benoit74 commented Nov 6, 2023

WP1 is not impacted, from what I have checked, because there we only validate output, description, long description, and other basic range / oneOf validations.

OK for option 1 if that is what suits you.

@rgaudin
Member

rgaudin commented Nov 6, 2023

@Popolechien can confirm whether it still makes sense to him

@Popolechien
Contributor

yes.

@benoit74
Collaborator Author

benoit74 commented Nov 6, 2023

I've opened an upstream issue since nothing has to be done in zimit-frontend in the end: openzim/zimfarm#867 (and I will work on it tomorrow)

benoit74 changed the title from "Zim filename does not match requirements" to "ZIM filename does not match requirements" on Nov 6, 2023
benoit74 closed this as completed on Nov 9, 2023
3 participants