Just found a tiny issue for all sub-dagws:
Here Peter specified `source` as `dagw-xxx`, but the validator checks whether the source equals the name of the dataset folder, `dataset_name = document_file.parent.parent.name`, which is `dagw`.
Therefore we get:
Checking dataset: dagw: 86%|████████████████████████████████████████████▉ | 19/22 [00:09<00:02, 1.46it/s]
ERROR:__main__:--- Dataset dagw failed validation ------------
ERROR:__main__:Datasheet dagw does not exist.
Error reading datasheet dagw: [Errno 2] No such file or directory: '/work/github/danish-foundation-models/docs/datasheets/dagw'
Error in document file dagw-retsinformationdk.jsonl.gz: Source should be dagw, but is dagw-retsinformationdk
Error in document file dagw-ep.jsonl.gz: Source should be dagw, but is dagw-ep
Error in document file dagw-hest.jsonl.gz: Source should be dagw, but is dagw-hest
The same applies to the check for whether datasheets exist: it only checks whether `dagw.md` exists.
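The mismatch can be reproduced with a minimal sketch of the check. Only the expression `document_file.parent.parent.name` and the wording of the error message come from the output above; the function names, the `documents/` subfolder, and the per-subset folder name `dagw-ep` are assumptions made for illustration, not the validator's actual code.

```python
from pathlib import PurePosixPath
from typing import Optional


def expected_source(document_file: PurePosixPath) -> str:
    # Same expression as quoted above: the dataset name is the
    # grandparent folder of the document file.
    return document_file.parent.parent.name


def check_source(document_file: PurePosixPath, source: str) -> Optional[str]:
    # Hypothetical helper: returns an error message on mismatch, else None.
    expected = expected_source(document_file)
    if source != expected:
        return f"Source should be {expected}, but is {source}"
    return None


# Assumed layout: <dataset folder>/documents/<file>.jsonl.gz
shared = PurePosixPath("data/dagw/documents/dagw-ep.jsonl.gz")
print(check_source(shared, "dagw-ep"))  # Source should be dagw, but is dagw-ep

# With one folder per subset (hypothetical folder name), the check passes.
split = PurePosixPath("data/dagw-ep/documents/dagw-ep.jsonl.gz")
print(check_source(split, "dagw-ep"))  # None
```

Under these assumptions, any file stored under a shared `dagw` folder fails as long as its `source` carries a `dagw-xxx` suffix.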
So I guess we have to separate each of the sub-dagw datasets into individual folders like:
Originally posted by @TTTTao725 in #266 (comment)