Refactoring of Downloaders to add incremental data file update #60
This is a bit of a refactoring of the downloader, changing it to an iterator-like model where the data is written incrementally to the output file. The reason is that on larger repos it can take hours to download all the commits, and if something failed or you needed to abort the downloader you would previously lose all the already downloaded data. This should now be better, as the last downloaded chunk should already have been written to the output file.
I have not checked, but this should probably also improve memory usage, since all commits no longer need to be held in memory until the downloader is done downloading.
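To illustrate the idea, here is a minimal sketch of the iterator-like model, not the actual code in this PR. The names `fetch_chunks` and `write_incrementally` and the JSON-lines output format are assumptions made for the example. Each chunk is appended and flushed before the next one is requested, so an abort only loses the chunk in flight.

```python
import json
from typing import Iterable, Iterator, List


def fetch_chunks(commits: List[dict], chunk_size: int) -> Iterator[List[dict]]:
    """Hypothetical stand-in for a downloader: yields commits one chunk at a time."""
    for start in range(0, len(commits), chunk_size):
        yield commits[start:start + chunk_size]


def write_incrementally(chunks: Iterable[List[dict]], path: str) -> int:
    """Append each chunk to `path` as JSON lines as soon as it arrives.

    Returns the number of commits written. Only one chunk is held in
    memory at a time.
    """
    written = 0
    with open(path, "a", encoding="utf-8") as out:
        for chunk in chunks:
            for commit in chunk:
                out.write(json.dumps(commit) + "\n")
            out.flush()  # hand the chunk to the OS before fetching the next one
            written += len(chunk)
    return written
```

If the chunk source raises partway through, the chunks written before the failure are still in the output file.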
I also removed the extra layer inside the Ssh downloader and made the Legacy downloader just another independent downloader class. This was just to simplify the structure, as it was a bit hard to keep track of the layers when coming into the repo.
My main goal with this is to add a diff-only option to the downloader, so that it only downloads new commits from Gerrit. That way the existing output file can be updated incrementally on a regular basis, or you can continue after a failed download without having to re-download all commits.
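As a rough sketch of what a diff-only update could look like (this is not implemented in this PR, and the function names, the `id` field, and the JSON-lines format are all assumptions for illustration): read the ids already present in the output file, then append only commits that are not there yet. The sketch filters locally; a real implementation would presumably restrict what is requested from Gerrit instead.

```python
import json
import os
from typing import Iterable, Iterator, Set


def load_known_ids(path: str) -> Set[str]:
    """Collect the ids of commits already stored in the output file."""
    known: Set[str] = set()
    if not os.path.exists(path):
        return known
    with open(path, encoding="utf-8") as existing:
        for line in existing:
            line = line.strip()
            if line:
                known.add(json.loads(line)["id"])
    return known


def only_new(commits: Iterable[dict], known: Set[str]) -> Iterator[dict]:
    """Yield only commits whose id has not been seen before."""
    for commit in commits:
        if commit["id"] not in known:
            yield commit


def update_file(commits: Iterable[dict], path: str) -> int:
    """Append commits missing from `path`; return how many were added."""
    known = load_known_ids(path)
    added = 0
    with open(path, "a", encoding="utf-8") as out:
        for commit in only_new(commits, known):
            out.write(json.dumps(commit) + "\n")
            known.add(commit["id"])  # also guards against duplicates within one run
            added += 1
    return added
```

Running `update_file` a second time with overlapping input appends only the commits that were not in the file after the first run.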