"This repository is over its data quota" #162
Comments
@schuemie OK, I'll work on removing the older Shiny apps from this repo, starting with SystematicEvidence. Is there already a list of the Shiny apps we can delete? If not, what's the best way to generate that list? Post the question on the OHDSI forums, perhaps? |
It might make sense to focus on the apps that take up the most space. I ran the script below, but I'm not sure my clone is correct since I got the error message above, so maybe you can rerun?
folder <- "../ShinyDeploy"
subFolders <- list.files(folder, include.dirs = TRUE)
subFolders <- subFolders[dir.exists(file.path(folder, subFolders))]
computeSize <- function(subFolder) {
  sum(file.info(list.files(file.path(folder, subFolder), recursive = TRUE, full.names = TRUE))$size)
}
sizes <- plyr::laply(subFolders, computeSize, .progress = "text")
data <- data.frame(
  subFolder = subFolders,
  mbs = sizes / 1024 ^ 2
)
data <- data[order(-data$mbs), ]
head(data, 10)
# subFolder mbs
# 62 EhdenRaDmardsEstimation 1211.1382
# 166 SystematicEvidence 625.8941
# 76 IbdCharacterization 521.5577
# 157 Sglt2iDka 482.2656
# 168 TicagrelorVsClopidogrel 400.0139
# 87 MskaiEstimationPrelim 324.2289
# 17 corazon 303.5255
# 1 AceBeta9Outcomes 254.4665
# 145 RanitidineCancerRisk 234.2958
# 119 OutcomeMisclassificationEval 193.2686
If this is correct, then |
@leeevans: could you remove the biggest apps from this repo (but keep them in the Shiny server)? I still can't clone this repo (which means I also can't push anything) |
The Shiny server deploy script just does a git pull so I don’t know how we could do that. As a workaround are you able to do a ‘partial clone’ or ‘shallow clone’ of the repo so you can make changes locally and push them back to the repo? https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/ My suggestion would be that we inform the OHDSI community about the repo space issue and ask the shiny developers to remove any shiny apps/data they no longer need from the repo. Maybe share that on the OHDSI Forums/community call/HADES call? |
I'm probably missing something, but couldn't we just remove the big apps from version control? How I would do that (maybe not optimal): on the Shiny server, temporarily move the apps to some folder outside the clone. Commit these changes (git will think the files have been deleted) and push. Move the apps back to the folder. Now they are no longer version controlled. We could add them to .gitignore to ensure they are not overwritten or pushed back to the repo. |
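A minimal sketch of the same idea without physically moving the folders, assuming `git rm -r --cached` (which removes paths from the index but leaves the files on disk) is an acceptable substitute for the move, commit, move-back steps described above. The function name `untrackApps`, its arguments, and the `dryRun` flag are hypothetical; this is not the actual deploy script.

```r
# Hypothetical helper: stop tracking app folders while leaving the files on disk.
# Assumes `git` is on the PATH and `repo` is the root of a ShinyDeploy clone.
untrackApps <- function(repo, apps, dryRun = TRUE) {
  # One `git rm -r --cached <app>` per app: removes it from the index only.
  commands <- lapply(apps, function(app) c("-C", repo, "rm", "-r", "--cached", app))
  if (!dryRun) {
    for (args in commands) {
      status <- system2("git", shQuote(args))
      if (status != 0) stop("git rm failed for: ", args[length(args)])
    }
  }
  # Work out which folders still need a .gitignore entry,
  # so they are not pushed back to the repo by accident.
  ignoreFile <- file.path(repo, ".gitignore")
  existing <- if (file.exists(ignoreFile)) readLines(ignoreFile, warn = FALSE) else character(0)
  newEntries <- setdiff(paste0("/", apps, "/"), existing)
  if (!dryRun && length(newEntries) > 0) {
    writeLines(c(existing, newEntries), ignoreFile)
  }
  # Return what was (or would be) run, for inspection before committing.
  list(
    gitCommands = vapply(commands, function(args) paste(c("git", args), collapse = " "), character(1)),
    ignoreEntries = newEntries
  )
}

# Dry run: only reports the commands and .gitignore entries, changes nothing.
untrackApps("../ShinyDeploy", c("EhdenRaDmardsEstimation", "SystematicEvidence"))
```

With `dryRun = FALSE` the changes would still need to be committed and pushed by hand. Older commits keep their copies of the files, so this only stops the folders from being tracked going forward.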
Alternatively, I remember we have another Shiny server (e.g. running http://shiny.ohdsi.org:2020/AhasHfBkleAmputation/ ). Maybe we could move the large apps there? (And redirect from their current URLs like this) |
(Note that both a blobless and a treeless clone run into the same error. The document seems to suggest a shallow clone is a bad idea) |
I've moved the data from the EhdenRaDmardsEstimation Shiny app out of the ShinyDeploy GitHub repo but kept the code in GitHub by making a one-line change to the global.R file (see below). I moved the application data on the server to a separate Shiny-server-level 'data' directory, in a subdirectory named after the Shiny app. Here is the R code change I made to reference the data under the Shiny server data directory:
I will do the same for the other Shiny apps with large data files (>100MB) that reference their data files in the same way. In future, all OHDSI Shiny apps with large data files (>100MB) must access their data from the Shiny PostgreSQL database to avoid this repository data quota issue. |
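A hypothetical sketch of the kind of `global.R` change described above, not the actual change made on the server. It assumes a server-level data root with one subdirectory per app; the helper name `resolveDataFolder`, the default path `/srv/shiny-server-data`, and the `SHINY_DATA_ROOT` environment variable are all illustrative assumptions.

```r
# Hypothetical sketch: look for app data under a server-level data directory,
# in a subdirectory named after the app, falling back to the app's own folder.
# The default root and the SHINY_DATA_ROOT variable are illustrative assumptions.
resolveDataFolder <- function(appName,
                              dataRoot = Sys.getenv("SHINY_DATA_ROOT", "/srv/shiny-server-data"),
                              localFolder = "data") {
  serverFolder <- file.path(dataRoot, appName)
  if (dir.exists(serverFolder)) serverFolder else localFolder
}

# In global.R, the one-line change would then be something like:
dataFolder <- resolveDataFolder("EhdenRaDmardsEstimation")
```

The fallback to a local `data` folder is a design choice so the app still runs from a developer's clone where the server directory does not exist.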
Thanks @leeevans ! Unfortunately, I still am unable to clone the repo:
|
deleting files from the |
@msuchard the GitHub LFS data quota is reset monthly. As it is now June 1st, could you try this again? |
We need a better solution for data.ohdsi.org in the long term. (Apps break and don't get fixed quickly.) https://forums.ohdsi.org/t/multiple-shiny-apps-fail-to-load/18852/3 Also, study results should be in some kind of large data store, like a database or flat-file storage system. Git/GitHub isn't good for storing data. Linking discussion here: https://forums.ohdsi.org/t/organization-of-shinydeploy/6223 |
Agreed. Continuing the discussion on the forums |
When I try to clone this repo, I get
@leeevans: Maybe we can remove some of the older Shiny apps (from when we didn't use databases to store the results) from this repo, while keeping them on the Shiny server? For example, SystematicEvidence takes up a lot of space and doesn't need to be modified anymore.