Clearer error messages when Python corpus definition isn't found #1639

Open
lukavdplas opened this issue Jul 29, 2024 · 0 comments

lukavdplas commented Jul 29, 2024

If you add a Python corpus definition through your settings, the name you assign in the settings is not arbitrary. It's used to select the corpus class, so if you choose an unexpected name, the import module won't be able to find the Python corpus.

If a corpus is included in your settings but can't be loaded, the console will show an error message. This always starts with "Could not load corpus {corpus-name}:", followed by whatever error was raised during the import process. This can happen for all sorts of reasons, including mistakes in the corpus definition.

I've seen these two messages pop up when the corpus name was wrong; both went away when I corrected the name of the corpus:

  • getattr(): attribute name must be string
  • expected str, bytes or os.PathLike object, not NoneType

These errors are caused somewhere in this block of code:

corpus_spec.loader.exec_module(corpus_mod)
# assume the class name is the same as the corpus name,
# allowing for differences in camel case vs. lower case
regex = re.compile('[^a-zA-Z]')
corpus_name = regex.sub('', corpus_name).lower()
endpoint = next((attr for attr in dir(corpus_mod)
                 if attr.lower() == corpus_name), None)
corpus_class = getattr(corpus_mod, endpoint)
return corpus_class()

Tracing the error is left as an exercise to the reader. In any case, the issue here is that these messages aren't helpful in identifying the problem (that you picked an incorrect name).
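For what it's worth, here is a minimal sketch of how the first message can arise. It assumes the lookup behaves as in the block above; `find_endpoint`, `MyCorpus` and the in-memory module are hypothetical stand-ins, not the project's actual code. When no attribute matches, `next(..., None)` returns `None`, and passing `None` to `getattr` raises a `TypeError`. The sketch does not try to trace the second message.

```python
import re
import types

# Hypothetical stand-in for a corpus module that has already been imported
corpus_mod = types.ModuleType('my_corpus')


class MyCorpus:
    pass


corpus_mod.MyCorpus = MyCorpus


def find_endpoint(corpus_mod, corpus_name):
    """Same lookup as in the issue: strip non-letters, compare lowercased."""
    regex = re.compile('[^a-zA-Z]')
    normalised = regex.sub('', corpus_name).lower()
    return next((attr for attr in dir(corpus_mod)
                 if attr.lower() == normalised), None)


# A matching name resolves to the attribute name of the class
print(find_endpoint(corpus_mod, 'my-corpus'))

# A non-matching name silently yields None ...
endpoint = find_endpoint(corpus_mod, 'wrong-name')
print(endpoint)

# ... and getattr() then fails with a TypeError about the attribute name
try:
    getattr(corpus_mod, endpoint)
except TypeError as e:
    print(f'TypeError: {e}')
```

The exact wording of the `TypeError` differs between Python versions, so the printed text may not match the message quoted above character for character.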

Solution: wrap these lines in a try-except block. You should get a message like "cannot find an object matching name 'my-corpus' in python module blablabla/my-corpus.py", or something like that.
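A rough sketch of what that could look like, under some assumptions: `get_corpus_class` is a hypothetical helper name, raising `ImportError` is just one possible choice of exception, and the module is passed in already loaded so the example runs on its own. The real fix would live in the existing loading code.

```python
import re
import types


def get_corpus_class(corpus_mod, corpus_name):
    """Hypothetical helper: return the class in corpus_mod matching corpus_name."""
    normalised = re.sub('[^a-zA-Z]', '', corpus_name).lower()
    endpoint = next((attr for attr in dir(corpus_mod)
                     if attr.lower() == normalised), None)
    try:
        return getattr(corpus_mod, endpoint)
    except TypeError as e:
        # endpoint is None: nothing in the module matched the configured name.
        # Fall back to the module name if the module has no file path.
        location = getattr(corpus_mod, '__file__', None) or corpus_mod.__name__
        raise ImportError(
            f"cannot find an object matching name '{corpus_name}' "
            f"in python module {location}"
        ) from e


# Usage with an in-memory stand-in module (hypothetical names)
demo_mod = types.ModuleType('my_corpus')


class MyCorpus:
    pass


demo_mod.MyCorpus = MyCorpus

print(get_corpus_class(demo_mod, 'my-corpus'))

try:
    get_corpus_class(demo_mod, 'wrong-name')
except ImportError as e:
    print(e)
```

An explicit `if endpoint is None` check before the `getattr` call would work just as well and avoids catching unrelated `TypeError`s; the try-except form is shown here because it is what the issue proposes.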
