diff --git a/README.md b/README.md
index 72241be3..c615019d 100644
--- a/README.md
+++ b/README.md
@@ -70,7 +70,9 @@ Installed version of *^Python3.7*, *Docker* and *docker-compose v2*
 ([*Go here for instructions*](https://docs.docker.com/compose/install/)) or use the Binder batch in the next section.
-***Note***: We recommend giving Docker at least 8 GB of RAM (On Docker Desktop you can go under settings -> resources)
+***Note***:
+1) We recommend giving Docker at least 8 GB of RAM (On Docker Desktop you can go under settings -> resources)
+2) If you are a Windows user, we recommend installing the *Windows Subsystem for Linux (WSL)* and integrating Docker with WSL. WSL installation guidance can be found [here](https://docs.microsoft.com/en-us/windows/wsl/install). We use Ubuntu on WSL for debugging and testing purposes.
 #### Demo correlating Uber traversals with Google popularities
@@ -87,15 +89,15 @@ You could either use the deployed example on Binder using the badge above or run
 simply uses Pandas dataframes and is not connecting to a data warehouse.
 \
 To run the demo locally, launch Docker in the background and from inside the root directory run:
-Linux/Mac:
 ```zsh
-cd kuwala/scripts && sh initialize_core_components.sh && sh run_cli.sh
-```
-and for Windows (Please use PowerShell or any Docker integrated terminal):
-```PS
-cd kuwala/scripts && sh initialize_windows.sh && cd windows && sh initialize_core_components.sh && sh run_cli.sh
+cd kuwala/scripts/shell && sh initialize_core_components.sh && sh run_cli.sh
 ```
+Or, if you are a Windows user having issues running a `shell` script, you can run a `python` script as an alternative:
+```zsh
+cd kuwala/scripts/python && python3 initialize_core_components.py && python3 run_cli.py
+```
+*All `shell` scripts are also available as `python` scripts with the same file names, inside the `/python` directory.*
 #### Run the data pipelines yourself
 To run the pipelines yourself, please follow the instructions for the
diff --git a/kuwala/README.md b/kuwala/README.md
index 461b02b6..2b5c4d43 100644
--- a/kuwala/README.md
+++ b/kuwala/README.md
@@ -9,14 +9,14 @@ Installed version of *Docker* and *docker-compose v2*
 ### Pipelines
-If you want to build all containers for all pipelines, change your working directory to `./kuwala/scripts` (or move to
-`./kuwala/scripts/`, run `initialize_windows.sh`, and change directory to `windows/` if you are running a Windows
-machine) and run:
+If you want to build all containers for all pipelines, change your working directory to `./kuwala/scripts/shell` and run:
+
 ```zsh
 sh initialize_all_components.sh
 ```
+
 You can also build the containers individually for single pipelines. All services are listed in the [`./docker-compose.yml`](https://github.com/kuwala-io/kuwala/tree/master/kuwala/docker-compose.yml). Please refer to each pipeline's `README.md` on how to run them.
 You can find the pipeline directories under
@@ -35,7 +35,7 @@ Now you can proceed to any of the pipelines' `README.md` and follow the steps to
 ### Core
-To initialize the CLI and Jupyter notebook run within the `./kuwala/scripts` directory (or `./kuwala/scripts/windows` ):
+To initialize the CLI and Jupyter notebook run within the `./kuwala/scripts/shell` directory:
 ```zsh
 sh initialize_core_components.sh
@@ -51,4 +51,7 @@ If you only want to start the Jupyter environment run:
 ```zsh
 sh run_jupyter_notebook.sh
-```
\ No newline at end of file
+```
+Or, if you are a Windows user having issues running a `shell` script, you can run a `python` script as an alternative. For example, you can run `initialize_core_components.py` under `kuwala/scripts/python`.\
+\
+*All `shell` scripts are also available as `python` scripts with the same file names, inside the `kuwala/scripts/python` directory.*
\ No newline at end of file
diff --git a/kuwala/core/cli/README.md b/kuwala/core/cli/README.md
index 14da49c9..f85d823e 100644
--- a/kuwala/core/cli/README.md
+++ b/kuwala/core/cli/README.md
@@ -20,20 +20,23 @@ The following pipelines can currently be selected through the CLI:
 To make sure you are running the latest version of all pipelines, run from inside the root directory:
-Linux/Mac:
 ```zsh
-cd kuwala/scripts && sh initialize_all_components.sh
+cd kuwala/scripts/shell && sh initialize_all_components.sh
 ```
-Windows:
+Or, if you are a Windows user having issues running a `shell` script, you can run a `python` script as an alternative:
 ```zsh
-cd kuwala/scripts && sh initialize_windows.sh && cd windows && sh initialize_all_components.sh
+cd kuwala/scripts/python && python3 initialize_all_components.py
 ```
-To start the CLI, run the following script from inside the `kuwala/scripts` directory and follow the instructions:
+*All `shell` scripts are also available as `python` scripts with the same file names, inside the `/python` directory.*
+
+
+To start the CLI, run the following script from inside the `kuwala/scripts/shell` directory and follow the instructions:
 ```zsh
 sh run_cli.sh
-```
\ No newline at end of file
+```
+or `python3 run_cli.py` from inside the `kuwala/scripts/python` directory.
\ No newline at end of file
diff --git a/kuwala/core/cli/src/main.py b/kuwala/core/cli/src/main.py
index cc2c66fd..01358529 100644
--- a/kuwala/core/cli/src/main.py
+++ b/kuwala/core/cli/src/main.py
@@ -18,6 +18,7 @@ def launch_jupyter_notebook():
     )
     run_command("docker-compose run --service-ports jupyter", exit_keyword="or http")
+
     webbrowser.open(
         "http://localhost:8888/lab/tree/kuwala/notebooks/popularity_correlation.ipynb"
     )
@@ -26,6 +27,7 @@ def launch_jupyter_notebook():
     )
+
 if __name__ == "__main__":
     logging.basicConfig(
         format="%(levelname)s %(asctime)s: %(message)s",
diff --git a/kuwala/pipelines/osm-poi/README.md b/kuwala/pipelines/osm-poi/README.md
index 9ce55165..80dfe567 100644
--- a/kuwala/pipelines/osm-poi/README.md
+++ b/kuwala/pipelines/osm-poi/README.md
@@ -23,18 +23,17 @@ To transform the standard `pbf` files, which is the file format of the OSM data,
 OSM-parquetizer is based on a Git submodule which needs to be initialized first.
 To initialize the submodule, run from inside the root directory:
-Linux/Mac:
-
 ```zsh
-cd kuwala/scripts && sh initialize_git_submodules.sh
+cd kuwala/scripts/shell && sh initialize_git_submodules.sh
 ```
-
-Windows:
+Or, if you are a Windows user having issues running a `shell` script, you can run a `python` script as an alternative:
 ```zsh
-cd kuwala/scripts && sh initialize_windows.sh && cd windows && sh initialize_git_submodules.sh
+cd kuwala/scripts/python && python3 initialize_git_submodules.py
 ```
+*All `shell` scripts are also available as `python` scripts with the same file names, inside the `/python` directory.*
+
 To make sure you are running the latest version of the pipeline, build the Docker images from inside the `kuwala` directory by running:
diff --git a/kuwala/scripts/initialize_windows.sh b/kuwala/scripts/initialize_windows.sh
deleted file mode 100644
index b6247719..00000000
--- a/kuwala/scripts/initialize_windows.sh
+++ /dev/null
@@ -1,12 +0,0 @@
-mkdir windows
-sed 's/\r$//' build_all_containers.sh > ./windows/build_all_containers.sh
-sed 's/\r$//' build_cli.sh > ./windows/build_cli.sh
-sed 's/\r$//' build_jupyter_notebook.sh > ./windows/build_jupyter_notebook.sh
-sed 's/\r$//' build_postgres.sh > ./windows/build_postgres.sh
-sed 's/\r$//' create_zip_archive.sh > ./windows/create_zip_archive.sh
-sed 's/\r$//' initialize_all_components.sh > ./windows/initialize_all_components.sh
-sed 's/\r$//' initialize_core_components.sh > ./windows/initialize_core_components.sh
-sed 's/\r$//' initialize_git_submodules.sh > ./windows/initialize_git_submodules.sh
-sed 's/\r$//' run_cli.sh > ./windows/run_cli.sh
-sed 's/\r$//' run_jupyter_notebook.sh > ./windows/run_jupyter_notebook.sh
-sed 's/\r$//' stop_all_containers.sh > ./windows/stop_all_containers.sh
diff --git a/kuwala/scripts/python/build_all_containers.py b/kuwala/scripts/python/build_all_containers.py
new file mode 100644
index 00000000..c58bdf53
--- /dev/null
+++ b/kuwala/scripts/python/build_all_containers.py
@@ -0,0 +1,14 @@
+"""
+build_all_containers.sh:
+cd ..
+docker-compose build postgres database-importer database-transformer jupyter admin-boundaries google-poi-api google-poi-pipeline google-trends osm-parquetizer osm-poi population-density'
+"""
+
+import os
+import run_command as rc
+
+script_dir = os.path.dirname(os.path.abspath(__file__))
+os.chdir(os.path.join(script_dir,'../../'))
+
+rc.run_command(['docker-compose build postgres database-importer database-transformer jupyter admin-boundaries google-poi-api google-poi-pipeline google-trends osm-parquetizer osm-poi population-density'])
+ 
\ No newline at end of file
diff --git a/kuwala/scripts/python/build_cli.py b/kuwala/scripts/python/build_cli.py
new file mode 100644
index 00000000..63a16b86
--- /dev/null
+++ b/kuwala/scripts/python/build_cli.py
@@ -0,0 +1,22 @@
+"""
+build_cli.sh:
+
+cd ../..
+pip3 install virtualenv
+virtualenv -p python3 venv
+source ./venv/bin/activate
+pip install -r kuwala/core/cli/requirements.txt
+pip install -e .
+"""
+
+import os
+import run_command as rc
+
+script_dir = os.path.dirname(os.path.abspath(__file__))
+
+os.chdir(os.path.join(script_dir,'../../../'))
+rc.run_command(['pip3 install virtualenv'])
+rc.run_command(['virtualenv -p python3 venv'])
+rc.run_command(['source ./venv/bin/activate'])
+rc.run_command(['pip install -r kuwala/core/cli/requirements.txt'])
+rc.run_command(['pip install -e .'])
\ No newline at end of file
diff --git a/kuwala/scripts/python/build_jupyter_notebook.py b/kuwala/scripts/python/build_jupyter_notebook.py
new file mode 100644
index 00000000..14f94bf1
--- /dev/null
+++ b/kuwala/scripts/python/build_jupyter_notebook.py
@@ -0,0 +1,14 @@
+"""
+build_jupyter_notebook.sh:
+cd ..
+docker-compose build jupyter
+
+"""
+
+import os
+import run_command as rc
+
+script_dir = os.path.dirname(os.path.abspath(__file__))
+
+os.chdir(os.path.join(script_dir,'../../'))
+rc.run_command(['docker-compose build jupyter'])
\ No newline at end of file
diff --git a/kuwala/scripts/python/build_postgres.py b/kuwala/scripts/python/build_postgres.py
new file mode 100644
index 00000000..fdc98bd0
--- /dev/null
+++ b/kuwala/scripts/python/build_postgres.py
@@ -0,0 +1,12 @@
+"""
+cd ..
+docker-compose build postgres
+"""
+
+import os
+import run_command as rc
+
+script_dir = os.path.dirname(os.path.abspath(__file__))
+
+os.chdir(os.path.join(script_dir,'../../'))
+rc.run_command(['docker-compose build postgres'])
\ No newline at end of file
diff --git a/kuwala/scripts/python/create_zip_archive.py b/kuwala/scripts/python/create_zip_archive.py
new file mode 100644
index 00000000..c3b5d26b
--- /dev/null
+++ b/kuwala/scripts/python/create_zip_archive.py
@@ -0,0 +1,12 @@
+"""
+cd ../..
+git archive --format=zip HEAD -o kuwala.zip
+"""
+
+import os
+import run_command as rc
+
+script_dir = os.path.dirname(os.path.abspath(__file__))
+
+os.chdir(os.path.join(script_dir,'../../../'))
+rc.run_command(['git archive --format=zip HEAD -o kuwala.zip'])
\ No newline at end of file
diff --git a/kuwala/scripts/python/initialize_all_components.py b/kuwala/scripts/python/initialize_all_components.py
new file mode 100644
index 00000000..99a3aeed
--- /dev/null
+++ b/kuwala/scripts/python/initialize_all_components.py
@@ -0,0 +1,14 @@
+"""
+sh initialize_git_submodules.sh
+sh build_cli.sh
+sh build_all_containers.sh
+"""
+
+import os
+import run_command as rc
+
+script_dir = os.path.dirname(os.path.abspath(__file__))
+
+rc.run_command(['python3 initialize_git_submodules.py'])
+rc.run_command(['python3 build_cli.py'])
+rc.run_command(['python3 build_all_containers.py'])
diff --git a/kuwala/scripts/python/initialize_core_components.py b/kuwala/scripts/python/initialize_core_components.py
new file mode 100644
index 00000000..875bf180
--- /dev/null
+++ b/kuwala/scripts/python/initialize_core_components.py
@@ -0,0 +1,14 @@
+"""
+sh build_postgres.sh
+sh build_cli.sh
+sh build_jupyter_notebook.sh
+"""
+
+import os
+import run_command as rc
+
+script_dir = os.path.dirname(os.path.abspath(__file__))
+
+rc.run_command(['python3 build_postgres.py'])
+rc.run_command(['python3 build_cli.py'])
+rc.run_command(['python3 build_jupyter_notebook.py'])
diff --git a/kuwala/scripts/python/initialize_git_submodules.py b/kuwala/scripts/python/initialize_git_submodules.py
new file mode 100644
index 00000000..37e94d6b
--- /dev/null
+++ b/kuwala/scripts/python/initialize_git_submodules.py
@@ -0,0 +1,12 @@
+"""
+cd ../..
+git submodule update --init --recursive
+"""
+
+import os
+import run_command as rc
+
+script_dir = os.path.dirname(os.path.abspath(__file__))
+
+os.chdir(os.path.join(script_dir,'../../../'))
+rc.run_command(['git submodule update --init --recursive'])
\ No newline at end of file
diff --git a/kuwala/scripts/python/run_cli.py b/kuwala/scripts/python/run_cli.py
new file mode 100644
index 00000000..ed2c26f7
--- /dev/null
+++ b/kuwala/scripts/python/run_cli.py
@@ -0,0 +1,16 @@
+"""
+cd ../../
+source ./venv/bin/activate
+cd kuwala/core/cli
+python3 src/main.py
+"""
+
+import os
+import run_command as rc
+
+script_dir = os.path.dirname(os.path.abspath(__file__))
+
+os.chdir(os.path.join(script_dir,'../../../'))
+rc.run_command(['source ./venv/bin/activate'])
+os.chdir(os.path.join(script_dir,'../../../','kuwala/core/cli'))
+rc.run_command(['python3 src/main.py'])
diff --git a/kuwala/scripts/python/run_command.py b/kuwala/scripts/python/run_command.py
new file mode 100644
index 00000000..eb9a568b
--- /dev/null
+++ b/kuwala/scripts/python/run_command.py
@@ -0,0 +1,51 @@
+from threading import Thread
+import subprocess
+import os
+
+def run_command(command: [str], exit_keyword=None):
+    process = subprocess.Popen(
+        command,
+        bufsize=1,
+        stdout=subprocess.PIPE,
+        stderr=subprocess.PIPE,
+        universal_newlines=True,
+        shell=True
+    )
+    thread_result = dict(hit_exit_keyword=False)
+
+    def print_std(std, result):
+        while True:
+            line = std.readline()
+
+            if len(line.strip()) > 0:
+                print(line if 'Stage' not in line and '%' not in line else line.strip(), end='\r')
+
+            if exit_keyword is not None and exit_keyword in line:
+                result['hit_exit_keyword'] = True
+
+                break
+
+            return_code = process.poll()
+
+            if return_code is not None:
+                if return_code != 0:
+                    raise RuntimeError()
+
+                break
+
+    stdout_thread = Thread(target=print_std, args=(process.stdout, thread_result,), daemon=True)
+    stderr_thread = Thread(target=print_std, args=(process.stderr, thread_result,), daemon=True)
+
+    stdout_thread.start()
+    stderr_thread.start()
+
+    while stdout_thread.is_alive() and stderr_thread.is_alive():
+        pass
+
+    if thread_result['hit_exit_keyword']:
+        return process
+
+
+
+if __name__=='__main__':
+    run_command()
\ No newline at end of file
diff --git a/kuwala/scripts/python/run_jupyter_notebook.py b/kuwala/scripts/python/run_jupyter_notebook.py
new file mode 100644
index 00000000..cd10d616
--- /dev/null
+++ b/kuwala/scripts/python/run_jupyter_notebook.py
@@ -0,0 +1,12 @@
+"""
+cd ../
+docker-compose run --service-ports jupyter
+"""
+import os
+import run_command as rc
+
+script_dir = os.path.dirname(os.path.abspath(__file__))
+
+os.chdir(os.path.join(script_dir,'../../'))
+rc.run_command(['docker-compose run --service-ports jupyter'])
+
diff --git a/kuwala/scripts/python/stop_all_containers.py b/kuwala/scripts/python/stop_all_containers.py
new file mode 100644
index 00000000..4cfccf1d
--- /dev/null
+++ b/kuwala/scripts/python/stop_all_containers.py
@@ -0,0 +1,14 @@
+"""
+reset
+docker stop $(docker ps -a -q)
+docker-compose down
+docker-compose rm -f
+"""
+
+import os
+import run_command as rc
+
+rc.run_command(['reset'])
+rc.run_command(['docker stop $(docker ps -a -q)'])
+rc.run_command(['docker-compose down'])
+rc.run_command(['docker-compose rm -f'])
\ No newline at end of file
diff --git a/kuwala/scripts/build_all_containers.sh b/kuwala/scripts/shell/build_all_containers.sh
similarity index 95%
rename from kuwala/scripts/build_all_containers.sh
rename to kuwala/scripts/shell/build_all_containers.sh
index d96105d6..3997f87b 100644
--- a/kuwala/scripts/build_all_containers.sh
+++ b/kuwala/scripts/shell/build_all_containers.sh
@@ -1,2 +1,2 @@
-cd ..
+cd ../..
 docker-compose build postgres database-importer database-transformer jupyter admin-boundaries google-poi-api google-poi-pipeline google-trends osm-parquetizer osm-poi population-density
\ No newline at end of file
diff --git a/kuwala/scripts/build_cli.sh b/kuwala/scripts/shell/build_cli.sh
similarity index 92%
rename from kuwala/scripts/build_cli.sh
rename to kuwala/scripts/shell/build_cli.sh
index ce6f1b70..c5299afc 100644
--- a/kuwala/scripts/build_cli.sh
+++ b/kuwala/scripts/shell/build_cli.sh
@@ -1,4 +1,4 @@
-cd ../..
+cd ../../..
 pip3 install virtualenv
 virtualenv -p python3 venv
 source ./venv/bin/activate
diff --git a/kuwala/scripts/build_jupyter_notebook.sh b/kuwala/scripts/shell/build_jupyter_notebook.sh
similarity index 75%
rename from kuwala/scripts/build_jupyter_notebook.sh
rename to kuwala/scripts/shell/build_jupyter_notebook.sh
index b6320b2a..bf29848f 100644
--- a/kuwala/scripts/build_jupyter_notebook.sh
+++ b/kuwala/scripts/shell/build_jupyter_notebook.sh
@@ -1,2 +1,2 @@
-cd ..
+cd ../..
 docker-compose build jupyter
\ No newline at end of file
diff --git a/kuwala/scripts/build_postgres.sh b/kuwala/scripts/shell/build_postgres.sh
similarity index 100%
rename from kuwala/scripts/build_postgres.sh
rename to kuwala/scripts/shell/build_postgres.sh
diff --git a/kuwala/scripts/create_zip_archive.sh b/kuwala/scripts/shell/create_zip_archive.sh
similarity index 78%
rename from kuwala/scripts/create_zip_archive.sh
rename to kuwala/scripts/shell/create_zip_archive.sh
index a74b5584..23124d24 100644
--- a/kuwala/scripts/create_zip_archive.sh
+++ b/kuwala/scripts/shell/create_zip_archive.sh
@@ -1,2 +1,2 @@
-cd ../..
+cd ../../..
 git archive --format=zip HEAD -o kuwala.zip
\ No newline at end of file
diff --git a/kuwala/scripts/initialize_all_components.sh b/kuwala/scripts/shell/initialize_all_components.sh
similarity index 100%
rename from kuwala/scripts/initialize_all_components.sh
rename to kuwala/scripts/shell/initialize_all_components.sh
diff --git a/kuwala/scripts/initialize_core_components.sh b/kuwala/scripts/shell/initialize_core_components.sh
similarity index 100%
rename from kuwala/scripts/initialize_core_components.sh
rename to kuwala/scripts/shell/initialize_core_components.sh
diff --git a/kuwala/scripts/initialize_git_submodules.sh b/kuwala/scripts/shell/initialize_git_submodules.sh
similarity index 76%
rename from kuwala/scripts/initialize_git_submodules.sh
rename to kuwala/scripts/shell/initialize_git_submodules.sh
index 3e482e1e..da0e44e2 100644
--- a/kuwala/scripts/initialize_git_submodules.sh
+++ b/kuwala/scripts/shell/initialize_git_submodules.sh
@@ -1,2 +1,2 @@
-cd ../..
+cd ../../..
 git submodule update --init --recursive
\ No newline at end of file
diff --git a/kuwala/scripts/run_cli.sh b/kuwala/scripts/shell/run_cli.sh
similarity index 83%
rename from kuwala/scripts/run_cli.sh
rename to kuwala/scripts/shell/run_cli.sh
index cbedab1d..7e370716 100644
--- a/kuwala/scripts/run_cli.sh
+++ b/kuwala/scripts/shell/run_cli.sh
@@ -1,4 +1,4 @@
-cd ../../
+cd ../../../
 source ./venv/bin/activate
 cd kuwala/core/cli
 python3 src/main.py
\ No newline at end of file
diff --git a/kuwala/scripts/run_jupyter_notebook.sh b/kuwala/scripts/shell/run_jupyter_notebook.sh
similarity index 82%
rename from kuwala/scripts/run_jupyter_notebook.sh
rename to kuwala/scripts/shell/run_jupyter_notebook.sh
index 432fb557..6a482838 100644
--- a/kuwala/scripts/run_jupyter_notebook.sh
+++ b/kuwala/scripts/shell/run_jupyter_notebook.sh
@@ -1,2 +1,2 @@
-cd ../
+cd ../..
 docker-compose run --service-ports jupyter
\ No newline at end of file
diff --git a/kuwala/scripts/stop_all_containers.sh b/kuwala/scripts/shell/stop_all_containers.sh
similarity index 100%
rename from kuwala/scripts/stop_all_containers.sh
rename to kuwala/scripts/shell/stop_all_containers.sh
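
Review note: `build_cli.py` and `run_cli.py` call `rc.run_command(['source ./venv/bin/activate'])`. Each `run_command` call starts its own shell process, so an activation performed in one call cannot carry over to the next, and `source` is not a command in the default Windows shell. A minimal sketch of one possible alternative follows: call the virtual environment's interpreter by path instead of activating it. The helper names (`repo_root`, `venv_python`, `run`, `build_cli`) are hypothetical and not part of this change, and the sketch swaps the `virtualenv` package for the standard-library `venv` module, which is an assumption about what the project would accept.

```python
import os
import subprocess
import sys
from pathlib import Path


def repo_root(script_file: str) -> Path:
    """Repository root, assuming the script lives in kuwala/scripts/python/."""
    return Path(script_file).resolve().parents[3]


def venv_python(venv_dir: Path) -> Path:
    """Path of the interpreter inside a virtual environment on this platform."""
    if os.name == "nt":  # Windows layout: venv\Scripts\python.exe
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"  # POSIX layout: venv/bin/python


def run(args, cwd=None) -> int:
    """Run a command given as an argument list; raise CalledProcessError on failure."""
    return subprocess.run(args, cwd=cwd, check=True).returncode


def build_cli(root: Path) -> None:
    """Hypothetical equivalent of build_cli.sh that needs no `source` step."""
    venv_dir = root / "venv"
    # Create the environment with the stdlib venv module (assumption: replaces virtualenv).
    run([sys.executable, "-m", "venv", str(venv_dir)], cwd=root)
    python = str(venv_python(venv_dir))
    # Installing through the environment's own interpreter targets that environment.
    run([python, "-m", "pip", "install", "-r", "kuwala/core/cli/requirements.txt"], cwd=root)
    run([python, "-m", "pip", "install", "-e", "."], cwd=root)
```

A script such as `build_cli.py` could then call `build_cli(repo_root(__file__))`. Passing an argument list with `check=True` avoids `shell=True` and makes a failing command stop the script instead of being ignored.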