Catch Guzzle Exception to avoid breaking harvest #4075
Comments
Besides this problem, there is another possible issue with that method. If the download URL times out, the request can take very long, or hang forever, when the harvest is started via drush, because the Guzzle client's default timeout of 0 (no timeout) is used. Maybe one should use Drupal's http_client service instead, to get Drupal's default settings and be able to override them via settings?
Switching to the http_client service seems like a good move. As @janette mentions in the PR, we are still thinking through the right way to deal with bad URLs.
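For illustration, the http_client suggestion above might look roughly like the fragment below. This is a sketch rather than the change made in the PR: it assumes the calling code can reach the service through `\Drupal::httpClient()`, and the 30 second value is an arbitrary example, not a recommendation from this thread.

```php
<?php

// In settings.php: options under 'http_client_config' are merged into the
// defaults that Drupal passes to Guzzle when it builds the http_client service.
// The value 30 (seconds) is only an example.
$settings['http_client_config']['timeout'] = 30;

// In the calling code (a different file; shown here only for context):
// obtain the preconfigured client instead of instantiating a bare Guzzle
// client, so the timeout above applies.
$client = \Drupal::httpClient();
```

In a service class, injecting `http_client` through the constructor would usually be preferred over the static call.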
#4075: Catch Guzzle Exception to avoid breaking harvest
We found that this issue also affects the use case of creating a dataset in the Drupal UI with a bad URL as the source URL. This led to a WSOD (white screen of death) with an error message saying that Guzzle had received a 404.
Fixed. Special thanks to @stefan-korn
Describe the bug
During harvesting, the method getRemoteMimeType is called to determine the MIME type of a distribution resource by requesting its downloadURL. If the downloadURL is unavailable or broken for any reason, the whole harvest fails and a Guzzle exception is logged.
Steps To Reproduce
Run a harvest in which one resource's download URL is not available.
Expected behavior
The method getRemoteMimeType should return NULL when the request to the downloadURL fails (as the method's declaration already states) instead of aborting with an exception.
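A minimal sketch of the expected behavior, assuming Guzzle is available through Composer autoloading. The function signature, the use of a HEAD request, and the 10 second timeout are assumptions made for illustration and are not taken from the project's code.

```php
<?php

use GuzzleHttp\ClientInterface;
use GuzzleHttp\Exception\GuzzleException;

/**
 * Hypothetical standalone version of getRemoteMimeType.
 *
 * Returns the MIME type reported by the remote server, or NULL if the
 * request fails or no Content-Type header is present.
 */
function getRemoteMimeType(ClientInterface $client, string $downloadUrl): ?string {
  try {
    // An explicit timeout keeps a dead host from stalling the harvest.
    $response = $client->request('HEAD', $downloadUrl, ['timeout' => 10]);
  }
  catch (GuzzleException $e) {
    // All Guzzle exceptions implement this interface: connection errors,
    // timeouts, and (with the default http_errors option) 4xx/5xx responses.
    return NULL;
  }

  $header = $response->getHeader('Content-Type');
  if (empty($header)) {
    return NULL;
  }

  // Drop parameters such as "; charset=utf-8".
  return trim(explode(';', $header[0])[0]);
}
```

With this shape, a single broken downloadURL yields NULL for that resource and the remaining items in the harvest can still be processed.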