When trying to run the `hash` or `cmd` commands with Spark in cluster mode, we hit the same problem we used to have with `ml`: the workers do not have the apollo lib, and it is not added to the Spark session using `addPyFile`.
I think we should either modify the way the `--dep-zip` flag works so that it adds it, or change the logic:

- when the `-s` flag is used to specify a non-local master, we should add `ml`, `engine`, and all the other dependencies; if the call is made by a command outside the `ml` library (e.g. `apollo`), we should add it and its dependencies as well (see the sketch after this list).
- the `--dep-zip` flag should be used to add the `ml` dependencies. It is of no use to us, since our workers use the `ml-core` image and already have them, but it will be useful for other users.
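A minimal sketch of what "add the dependencies when the master is not local" could look like, using the standard `SparkContext.addPyFile` API. The helper name and the zipping step are illustrative, not the current apollo code:

```python
import os
import shutil
import tempfile

from pyspark.sql import SparkSession


def ship_package(session, package_dir, zip_name="ml-deps"):
    """Zip a local Python package and distribute it to the workers.

    Hypothetical helper: `package_dir` would point at the installed
    `ml` (or `apollo`) package directory.
    """
    zip_base = os.path.join(tempfile.mkdtemp(), zip_name)
    # shutil.make_archive appends the ".zip" extension itself.
    archive = shutil.make_archive(
        zip_base, "zip",
        root_dir=os.path.dirname(package_dir),
        base_dir=os.path.basename(package_dir))
    # addPyFile ships the archive and puts it on the workers' sys.path.
    session.sparkContext.addPyFile(archive)


session = SparkSession.builder.master("spark://master:7077").getOrCreate()
# Only ship the libraries when the master is not local; a local master
# shares the driver's environment and already sees the packages.
if not session.sparkContext.master.startswith("local"):
    import ml  # assumption: the library is importable on the driver
    ship_package(session, os.path.dirname(ml.__file__))
```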
As was pointed out in this issue, I think we should also add to the Spark conf, by default, the flags that clean up after us, because the leftover worker data ends up taking a lot of memory.
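For reference, a sketch of the cleanup settings this could mean, using Spark's standalone-mode worker cleanup options (which defaults aside, are read by the worker at startup, so they may belong in `SPARK_WORKER_OPTS` rather than the application conf; shown here only for illustration):

```python
from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = (SparkConf()
        # Periodically sweep finished applications' work directories.
        .set("spark.worker.cleanup.enabled", "true")
        # Seconds between cleanup sweeps (Spark's default is 1800).
        .set("spark.worker.cleanup.interval", "1800")
        # Keep application data for one day instead of the 7-day default.
        .set("spark.worker.cleanup.appDataTtl", "86400"))

session = SparkSession.builder.config(conf=conf).getOrCreate()
```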