Dataset with large number of files #8928
FWIW: I think we know performance gets worse as the number of files increases, but I don't think there are any known hard limits. My first guess for errors like this would be timeouts or memory/temp-space issues, i.e. it takes Dataverse too long to generate and send the JSON for 75K files and the connection gets closed. Other than the usual checks of looking in the logs and at server load, and looking in the browser dev console or going verbose with curl to see the specific status code and responder (i.e. for timeouts you can see whether it is a load balancer, Apache, etc. that timed out), I'm not sure what else to suggest. (It is surprising that a 500 error can occur without any log info, though.)
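As a concrete illustration of the "verbose with curl" suggestion above (a sketch only; the hostname, persistent identifier, and API token below are placeholders, not values from this issue):

```sh
# Request the dataset's JSON via the native API with verbose output, so the
# response headers reveal which layer (load balancer, Apache, application
# server) actually answered. -o /dev/null discards the (potentially huge) body;
# -w prints the final status code and total request time.
curl -v -sS -H "X-Dataverse-key: $API_TOKEN" \
  -o /dev/null \
  -w "HTTP %{http_code} in %{time_total}s\n" \
  "https://dataverse.example.edu/api/datasets/:persistentId/?persistentId=doi:10.5072/FK2/EXAMPLE"
```

A 502/503/504 identified by a proxy's or load balancer's `Server` header points at a timeout in front of Dataverse, whereas an application-level 500 would normally leave a stack trace in server.log.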
We're talking about this issue in tech hours. Here are some pain points for users:
Other discussion:
Other point:
Hi,
I have a dataset that contains roughly 75k files (each less than 1 MB).
Problem:
I can see and access (open) the files of the dataset by clicking "Files" in the facet area.
But when I click the dataset itself, I receive a "500 Internal Server Error" (after some minutes), with nothing written to server.log.
However, if I make the request via an API call, e.g. for the JSON representation of the dataset, server.log does record an error (text file attached).
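For reference, the API call I mean is roughly the following (host and DOI are placeholders, not my actual dataset):

```sh
# Fetch the JSON representation of the dataset via the native API
# (placeholder host and persistent identifier).
curl -sS -H "X-Dataverse-key: $API_TOKEN" \
  "https://dataverse.example.edu/api/datasets/:persistentId/?persistentId=doi:10.5072/FK2/EXAMPLE" \
  -o dataset.json
```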
Are there any other solutions or ideas to tackle this, apart from zipping (double-zipping) the files?
[Ticket Number Reference: #324896]
link: https://help.hmdc.harvard.edu/Ticket/Display.html?id=324896
Best Regards
Lincoln
error_serverlog_apicall.txt