Machine Learning and Data Science Environment with Jupyter Notebook
Based on the jupyter and luissalgadofreire/h2o-pysparkling Docker images
- This image creates a machine learning environment using H2O's Sparkling Water in PySparkling mode
- Image based on jupyter/all-spark-notebook - Python3, R, and Scala support for Apache Spark, optionally on Mesos
- Libraries for data analysis from the Julia, Python3, and R communities
- Popular Python deep learning libraries: TensorFlow, Keras + Theano
- Scalable and Flexible Gradient Boosting
- Notebook environment for both interactive data analysis and batch data processing (one notebook, multiple languages)
docker pull luluisco/mlds-notebook:latest
docker run --rm -ti -p 8888:8888 luluisco/mlds-notebook:latest
Under discussion: -v custom:/home/mlds/.custom
(creates the volume custom if it does not exist; in the make command this is the parameter 'custom')
-v workXXX:/home/mlds/work
(creates the volume workXXX if it does not exist, or use the path to the directory holding your work; in the make command this is the parameter 'work'. For a path, don't forget to make it absolute: '$(pwd)/workXXX')
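The difference between a named volume and a host path can be sketched as a small shell helper. This is only an illustration, not part of the image or the Makefile: the function name `volume_flag` is hypothetical, and it assumes that any source containing a `/` is meant as a path and should be made absolute with `$PWD`.

```shell
# Hypothetical helper: print the -v flag for a volume name or a host path.
# Assumption: a source containing "/" is a path; anything else is a named volume.
volume_flag() {
  src=$1
  dest=$2
  case "$src" in
    /*) ;;                    # already an absolute path, keep as-is
    */*) src="$PWD/$src" ;;   # relative path, prefix the current directory
    *) ;;                     # plain name, treated as a named volume
  esac
  printf '%s %s:%s\n' -v "$src" "$dest"
}

# Example (prints the flags only, does not call docker):
volume_flag custom /home/mlds/.custom
volume_flag ./workXXX /home/mlds/work
```

The output can be pasted into a `docker run` command line in place of the hand-written `-v` options.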
-p 8888:8888 -p 6006:6006
- 8888 #jupyter notebook port
- 6006 #tensorboard port
- 54321-54331 #H2O flow
- 7077 #spark port
- 8080 #spark master port
- 8081 #spark slave port
- 6066 #spark rest port
- 4040-4050 #spark context port
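To expose every port in the list above, each entry needs its own `-p` flag. A minimal sketch, assuming the same port number (or range) is used on the host and in the container; `publish_flags` is a hypothetical name, not something shipped with the image.

```shell
# Hypothetical helper: turn a list of ports or port ranges into -p flags,
# mapping each one to the same number on the host.
publish_flags() {
  flags=""
  for p in "$@"; do
    # Append with a separating space, except before the first flag.
    flags="${flags:+$flags }-p $p:$p"
  done
  printf '%s\n' "$flags"
}

# Example with the ports listed above (prints the flags only):
publish_flags 8888 6006 54321-54331 7077 8080 8081 6066 4040-4050
```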
`(...) --NotebookApp.token="YOUR_TOKEN"`
(a token or a password, e.g. mlds)
docker run -d -v $PWD/work/:/home/mlds/work -p 8888:8888 -p 6006:6006 luluisco/mlds-notebook --NotebookApp.token="mlds"
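Instead of a fixed value such as mlds, a random token can be generated on the host and passed to the same option. A minimal sketch, assuming a POSIX system with `/dev/urandom`; the helper name `new_token` is hypothetical, and the example only prints the resulting command.

```shell
# Hypothetical helper: print 16 random bytes as 32 lowercase hex characters.
new_token() {
  od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
}

TOKEN=$(new_token)
# Print the command instead of running it, so the token can be checked first.
echo "docker run -d -p 8888:8888 luluisco/mlds-notebook --NotebookApp.token=\"$TOKEN\""
```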
- Default user is jovyan
- has root permission
- password: mlds
- sudo works without a password (the PATH environment variable is shared with root automatically)
- sudo -i for a root shell
- /home/mlds is a symbolic link to /home/jovyan
- Problem (under discussion; docker commit is preferred): /home/mlds/.custom (or /home/jovyan/.custom) contains all packages added later in a container
- python (pip install --user)
- R (install.packages)
- julia (Pkg.add)
- Problem with R and Julia: installing a new package reinstalls it even if it already exists globally, and saves it in .custom/ etc.
- create a wrapper package for R and Julia that checks whether a package already exists before calling the real function
- Add RStudio
- Automatically choose ports
- choose ports between XXXMin and XXXMax (set XXXMax or XXXNb) (XXX = portNb, portTensorBoard, portH2o, portSpark, etc.)
- multiple confs
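The "automatically choose ports" item could work along these lines: scan the range starting at XXXMin and take the first port nothing is listening on. This is a sketch, not the actual getP implementation, which is not shown here. It assumes the range runs from XXXMin up to but not including XXXMin + XXXNb, and the probe relies on bash's `/dev/tcp`, so it needs bash and only detects listeners on localhost. Both function names are hypothetical.

```shell
# Probe (assumption): a port counts as "in use" if a TCP connection to
# localhost succeeds. Relies on bash's /dev/tcp pseudo-device.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# first_free_port MIN NB: print the first free port in [MIN, MIN + NB).
first_free_port() {
  port=$1
  max=$(($1 + $2))
  while [ "$port" -lt "$max" ]; do
    if ! port_in_use "$port"; then
      printf '%s\n' "$port"
      return 0
    fi
    port=$((port + 1))
  done
  return 1   # every port in the range is taken
}

# Example: first_free_port 8888 10
```

Keeping the probe in its own function makes it easy to swap in another check (for example one based on `ss` or `nc`) without touching the scan loop.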
- start-master
- start-slave
- exec
- logs
- open
Put the directory "bashCmd" in the root of your project
- download
wget -q && unzip && cd bashCmd
curl -s -o && unzip && cd bashCmd
- Warning: don't move the Makefile or getP; if you do, change the variables 'work' and 'getP' accordingly
curl -sL | bash
make (or make help) # get help
make mlds
- latest=:latest
- image=luluisco/mlds-notebook
- cmd: command to execute after run
- by default, opens the notebook and sets the token to mlds
- more=
- is empty
- supplementary parameters
- getP=getP
- command used to find a port
- quiet=no
- debug=-d
- docker run -d (detached)
- could be -ti
- run_rm=--rm
- removes the container after it exits
- custom= (under discussion)
- is empty
- if set, creates a volume for /home/mlds/.custom
- can be a volume name or a path (remember to give the full path, e.g. "$PWD/customXXX")
- work=../work
- if set, creates a volume for /home/mlds/work
- home=/home/mlds/ #internal
- home_custom=.custom #internal, under discussion
- home_work=work #internal
- portNb=8888 #internal jupyter notebook port
- portNbMin=8888 #jupyter begin port
- portNbNb=10 #nb of available ports
- portNbMax=portNbMin + portNbNb #or set manually
#### Spark
- portSpark=7077 #internal spark port
- portSparkMaster=8080 #internal spark master port
- portSparkSlave=8081 #internal spark slave port
- portSparkRest=6066 #internal spark rest port
- portSparkContextB=4040 #internal spark context begin port
- portSparkContextE=4050 #internal spark context end port
- portSparkMin=7077 #spark begin port
- portSparkMasterMin=8080 #spark master begin port
- portSparkSlaveMin=8081 #spark slave begin port
- portSparkRestMin=6066 #spark rest begin port
- portSparkContextBMin=4040 #spark context begin port
- portSparkContextEMin=4050 #spark context end port
- NB
- portSparkNb=10 #nb of available ports for each run
- portSparkMasterNb=10 #spark master nb of available ports for each run
- portSparkSlaveNb=10 #spark slave nb of available ports for each run
- portSparkRestNb=10 #spark rest nb of available ports for each run
- portSparkContextBNb=100 #spark context begin port: 100 because 10 ports are available for each run (nb of available ports for each run)
- portSparkContextENb=100 #spark context end port: 100 because 10 ports are available for each run (nb of available ports for each run)
- portSparkMasterMax=portSparkMasterMin + portSparkMasterNb #or set manually
- portSparkSlaveMax=portSparkSlaveMin + portSparkSlaveNb #or set manually
- portSparkRestMax=portSparkRestMin + portSparkRestNb #or set manually
- portSparkContextBMax=portSparkContextBMin + portSparkContextBNb #or set manually
- portSparkContextEMax=portSparkContextEMin + portSparkContextENb #or set manually
- portSparkMax=portSparkMin + portSparkNb #or set manually
- portH2o=54321-54331 #internal h2o ports
- portH2oMin=54321 #H2o begin port
- portH2oNb=100 #nb of available ports: 100 because each instance uses 10 ports (10^2 = 100)
- portH2oMax=portH2oMin + portH2oNb #or set manually
- portTensorBoard=6006 #internal tensorflow tensorboard port
- portTensorBoardMin=6006 #TensorBoard begin port
- portTensorBoardNb=10 #nb of available ports
- portTensorBoardMax=portTensorBoardMin + portTensorBoardNb #or set manually
- printCommand="yes"
- show command
docker run ${name} "${debug}" "${run_rm}" $VOLUMES $PORTS ${more} "${image}${latest}" ${cmd}
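How the pieces of this command line fit together can be sketched in plain shell. The defaults below are the ones from the parameter list (debug=-d, run_rm=--rm, image, latest). The function name `build_run_cmd` is hypothetical, the sketch only prints the command (much like printCommand="yes") rather than running it, and unlike the template it skips empty parts so that no empty argument reaches docker.

```shell
# Hypothetical helper: assemble the docker run command line and print it.
build_run_cmd() {
  # Defaults from the parameter list; only applied when a variable is unset.
  : "${debug=-d}" "${run_rm=--rm}" "${image=luluisco/mlds-notebook}" "${latest=:latest}"
  line="docker run"
  for part in "${name-}" "$debug" "$run_rm" "${VOLUMES-}" "${PORTS-}" "${more-}" \
              "$image$latest" "${cmd-}"; do
    # Skip empty parts so the printed command has no stray gaps or empty arguments.
    if [ -n "$part" ]; then
      line="$line $part"
    fi
  done
  printf '%s\n' "$line"
}

# Example values (assumptions for illustration):
VOLUMES="-v $PWD/work:/home/mlds/work"
PORTS="-p 8888:8888 -p 6006:6006"
build_run_cmd
```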
make mlds
make mlds work=work/
make mlds custom=customMyProject work=../work
- etc
When the container is running, go to the terminal and write
$ ./ [id_or_name_of_container_or_nothing_if_MLDS_C_CURR_IS_SET]
$ ./ [id_or_name_of_container_or_nothing_if_MLDS_C_CURR_IS_SET] IP_OF_MASTER (192.168.X.X or localhost if local)
$ ./ [-ti_or_nothing] CMD_TO_EXECUTE # if MLDS_C_CURR is set
$ ./ id_or_name_of_container CMD_TO_EXECUTE [-ti_or_nothing]
$ ./open nb
$ ./open IP
$ ./logs # if MLDS_C_CURR is set
$ ./logs id_or_name_of_container
- create
$ ./ NAME
- inspect
$ ./ NAME
- remove
$ ./ NAME
- volumes
$ ./
- stop
$ ./ [id_or_name_of_container_or_nothing_if_MLDS_C_CURR_IS_SET]
- stopAll
$ ./
- ps
$ ./ps ...
- ip local
$ ./ l
- ip internet
$ ./ i
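Several of the scripts above take an optional container id or name and fall back to MLDS_C_CURR when none is given. That fallback could look like the following sketch; `resolve_container` is a hypothetical name and the real scripts may handle this differently.

```shell
# Hypothetical helper: pick the container from the first argument,
# or from MLDS_C_CURR when no argument is given.
resolve_container() {
  target=${1:-${MLDS_C_CURR:-}}
  if [ -z "$target" ]; then
    echo "no container given and MLDS_C_CURR is not set" >&2
    return 1
  fi
  printf '%s\n' "$target"
}

# Example: an explicit argument wins over the environment variable.
MLDS_C_CURR=my_notebook
resolve_container            # prints my_notebook
resolve_container other_one  # prints other_one
```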