What is the best way to increase download speed? #58
2320sharon asked this question in Q&A
Going Forward with Joblib
During the CoastSeg meeting today (8/22/22), we discussed the advantages and disadvantages of using either asyncio or joblib to speed up downloads in CoastSat and in our other download code. We decided to use JobLib to decrease download times because it will take much less work to implement and maintain. Although JobLib is not as fast as asyncio, it requires much less refactoring and won't require creating a separate version of CoastSat with asynchronous functions.
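To illustrate why joblib needs so little refactoring, here is a minimal sketch that runs an unchanged synchronous function in parallel with `joblib.Parallel`. Note that `download_file`, `download_all`, and the file names are hypothetical stand-ins, not CoastSat's actual code: the function only sleeps to simulate network latency, and the thread count is an arbitrary choice.

```python
import time

from joblib import Parallel, delayed


def download_file(name: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for a blocking download; sleeps instead of using the network."""
    time.sleep(delay)
    return f"{name}.done"


def download_all(names, n_jobs: int = 4):
    # prefer="threads" suits I/O-bound work such as downloads.
    # Parallel returns results in the same order as the inputs.
    return Parallel(n_jobs=n_jobs, prefer="threads")(
        delayed(download_file)(name) for name in names
    )


if __name__ == "__main__":
    files = [f"image_{i}" for i in range(8)]
    start = time.perf_counter()
    results = download_all(files)
    print(results[:2], f"{time.perf_counter() - start:.2f}s")
```

The existing function stays synchronous; only the loop that calls it changes, which is the main reason this route is cheaper to maintain than an async rewrite.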
My Research on Increasing Download Speed
I've been working on issue #51 to speed up downloads. Currently, downloads are performed synchronously, which means the user has to wait for each download to complete before the next one can start. I've been exploring JobLib, asyncio, aiohttp, aiofiles, multiprocessing, and multithreading as ways to increase download speeds. The learning curve was quite steep because much of the documentation is dense, and many of the Stack Overflow posts, blogs, and examples for these libraries are out of date.
I'm at a point where I can say that asyncio, aiohttp, and aiofiles would be the best libraries to use to increase download speeds, but using them will require refactoring functions in both CoastSat and CoastSeg. The bad news is that the async packages do not work well with ipywidgets (buttons, etc.) in Jupyter notebooks. The good news is that Panel is compatible with asyncio, aiohttp, and aiofiles, which means we can make our Panel apps blazingly fast 🔥 without dealing with Jupyter's ipywidgets async madness 😵. I think it would be a good idea to build a set of async download functions for CoastSat when we build the Panel apps for both CoastSeg and Seg2Map.
Comparing Async and JobLib
I've performed a few different experiments to compare regular downloads with requests, downloads with JobLib, and downloads with asyncio, aiohttp, and aiofiles. Here is how they compare on the same set of 20 files:
asyncio, aiohttp, aiofiles: 3.8 seconds
JobLib: 6.6 seconds
requests: 13.3 seconds
Basically, joblib takes roughly 1.7 times as long as the async packages (6.6 seconds versus 3.8 seconds) and about half as long as plain requests, but joblib is much easier to configure with ipywidgets than any of the async packages.
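For anyone who wants to see the shape of the async approach, here is a small sketch that contrasts one-at-a-time awaiting with `asyncio.gather`. It uses only the standard library: `fetch` is a hypothetical stand-in that awaits a sleep rather than a real request, so its timings are not comparable to the numbers above. In a real version, an `aiohttp.ClientSession` request and an `aiofiles.open` write would presumably take the place of the sleep. The concurrency limit of 10 is an arbitrary assumption.

```python
import asyncio
import time


async def fetch(name: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for one async download; awaits a sleep, not a network call."""
    await asyncio.sleep(delay)
    return f"{name}.done"


async def fetch_sequential(names):
    # One at a time: total time is roughly the sum of all delays.
    return [await fetch(name) for name in names]


async def fetch_concurrent(names, limit: int = 10):
    # The semaphore caps how many downloads are in flight at once.
    semaphore = asyncio.Semaphore(limit)

    async def bounded(name):
        async with semaphore:
            return await fetch(name)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(name) for name in names))


def timed(coro_fn, names):
    """Run an async function to completion and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(names))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    files = [f"file_{i}" for i in range(20)]
    _, t_seq = timed(fetch_sequential, files)
    _, t_con = timed(fetch_concurrent, files)
    print(f"sequential: {t_seq:.2f}s  concurrent: {t_con:.2f}s")
```

The catch discussed above still applies: every caller up the chain has to become `async` too, which is where the refactoring cost in CoastSat would come from.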
My questions now are:
Do we want to use asyncio, aiohttp, and aiofiles for the Panel versions of CoastSeg and Seg2Map to give them super fast downloads?
Do we want to use JobLib for the Jupyter notebooks to cut download times?
Do we want to take the time to refactor CoastSat to use one or both of these downloading strategies? Regardless of which one we choose, creating a new version of CoastSat, or modifying the existing one, to use async or joblib downloads is going to take some time.
If none of these options sound appealing, we can save this issue for later, when CoastSeg is closer to being finished, and figure out how to speed up downloads then. I think we should take some time to think about what tradeoffs we want to make for this project.