Link checker times out #8

Closed
cyplas opened this issue Jun 23, 2017 · 2 comments
@cyplas
Collaborator

cyplas commented Jun 23, 2017

For some items, the fastchecklinks curation task times out:

2017-06-21 21:14:55,654 INFO  org.dspace.curate.Curator @ Curation task: fastchecklinks performed on: 11356/1125 with status: 1. Result: 'Item: 11356/1125  [https://www.clarin.si/repository/xmlui/admin/item?itemID=1665] has 3 urls to check...
 - http://hdl.handle.net/11356/1125 = -2 - TIMEOUT

Curiously, when I increased the lr.link.checker.connect.timeout and lr.link.checker.read.timeout parameters in local.conf, the curation task yielded OK, but still generated an ERROR in the logs and an email notification.

But in fact it would probably be better to get at the root of the problem: why do some items take longer to load and what could be done about it?

(BTW, the fastchecklinks curation task also yields "403 - FAILED" for the licenses: this is ufal#678.)

@kosarko

kosarko commented Sep 7, 2017

One reason why some items take longer is the archive previews (a zip in the case of 11356/1125). The files in the archive are essentially added to the item metadata; currently these are "baked" into the page, so for large archives (those with many files) there is a lot of data to be transferred. It's something like 8.5MB in the case of 11356/1125.
Filed ufal#785 for that.
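
To see how much data a given item page actually transfers, and how long that takes, something along these lines could be used. This is only a sketch in Python, not code from the repository: `measure_download`, its parameters, and the 60-second default are all hypothetical names and values chosen for illustration.

```python
import time
import urllib.request


def measure_download(url, timeout=60.0, opener=urllib.request.urlopen,
                     chunk_size=65536):
    """Download `url` and return (total_bytes, elapsed_seconds).

    `opener` is injectable so the function can be exercised without
    network access; by default it is urllib.request.urlopen.
    All names here are illustrative assumptions.
    """
    start = time.monotonic()
    total = 0
    with opener(url, timeout=timeout) as response:
        # Read in chunks so a large page is never held in memory at once.
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.monotonic() - start
```

Calling `measure_download("http://hdl.handle.net/11356/1125")` would return the byte count and the elapsed time for that page, which can then be compared with the configured timeouts.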

@cyplas
Collaborator Author

cyplas commented Sep 27, 2017

OK, thanks. In the meantime, I think we should just be aware of what the timeout means and occasionally try the timed-out URL manually. So I'm closing this (@TomazErjavec: reopen if you disagree).
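
For re-checking a timed-out URL by hand with a more generous timeout, a minimal sketch in Python might look like this. It is not the fastchecklinks implementation: `check_url` is a hypothetical helper, the `-2` code for a timeout simply mirrors the log entry quoted earlier in this issue, and the `-1` code for other connection errors is an assumption made for the example. Note that `urllib` takes a single timeout, so it does not distinguish connect and read timeouts the way the two `local.conf` parameters do.

```python
import socket
import urllib.error
import urllib.request


def check_url(url, timeout=30.0, opener=urllib.request.urlopen):
    """Return (code, label) for a single URL.

    -2 / "TIMEOUT" mirrors the log line in this issue; -1 for other
    connection failures is an assumption made for this sketch.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status, "OK"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403.
        return err.code, "FAILED"
    except urllib.error.URLError as err:
        # Timeouts while connecting are wrapped in URLError.
        if isinstance(err.reason, socket.timeout):
            return -2, "TIMEOUT"
        return -1, "FAILED"
    except socket.timeout:
        # Timeouts while reading the response are raised directly.
        return -2, "TIMEOUT"
```

For example, `check_url("http://hdl.handle.net/11356/1125", timeout=120.0)` would show whether the page loads at all when given more time than the link checker allows.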

> Curiously, when I increased the lr.link.checker.connect.timeout and lr.link.checker.read.timeout parameters in local.conf, the curation task yielded OK, but still generated an ERROR in the logs and an email notification.

Actually, this happens only for some items, and happens even if I keep the default parameter values (see ufal#792).

cyplas closed this as completed Sep 27, 2017