This tutorial helps you get started with setting up your computing environment, decide what to use your local laptop/desktop for, what to do on the server (and how), and how to go back and forth between the environments and tools on your laptop, the server, and your remote database (and other data resources).
We assume a GNU/Linux (Ubuntu) server that has been set up for you, and access to a PostgreSQL database.
You should have the following tools installed on your local machine (whether it runs macOS, Windows, or GNU/Linux); these are the tools you will use primarily locally:

- `ssh` (to connect to the server)
- `psql` (to connect to the database through the command line)
- `dbeaver` (or `dbvisualizer`) to connect to the database through a GUI
- `git` client (to work with GitHub repositories)
- Tableau
- GNU/Emacs, Vi, Sublime or Atom (text editor to edit code locally)

`python`, `jupyter` and other coding tools are helpful, but you will be primarily using them on the server and not on your laptop.
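
A quick way to see which of the command-line tools from the list are already installed is to ask the shell where it finds them. This is only a sketch: the function name `check_tools` is made up for this example, and GUI applications such as Tableau or DBeaver may not show up on your `PATH` at all.

```shell
# check_tools is a hypothetical helper name, not part of any tool listed above.
# It prints one line per tool and returns the number of missing tools.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

check_tools ssh psql git || echo "install the missing tools before continuing"
```
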
- Decide which shell you're using. You have `bash` by default, but many of us like `zsh`.
- Set up dotfiles. You can clone this repo with Adolfo's dotfiles.

    !!! danger
        You should **never** blindly copy lines that you don't understand into your dotfiles. Check the files in the dotfiles repository and adapt/adopt what suits your needs and tastes.

- Decide on your editor (vim or GNU/Emacs).

    ??? note "For vim users"
        If you choose vim, get a good `.vimrc` file to make life easier for yourself. See for example [this](https://dougblack.io/words/a-good-vimrc.html).

    ??? note "If you prefer GNU/Emacs"
        There are several options and the choice depends on your taste, but [Emacs prelude](https://prelude.emacsredux.com/en/latest/) is a good start.

- Create a file with your database credentials (sample file) or (recommended) set up a `.pg_service.conf`.
- Learn about virtual environments and set one up (if it hasn't been set up for you).
- Learn how to install new Python packages through `pip install`.
- `screen`/`tmux`: When you log in to your remote machine, run `screen` or `tmux` and work from a screen/tmux session, so that your work keeps running if the SSH connection drops.
- (Optional) When using the database for any reason from your laptop (to connect with Tableau or DBeaver, or for any other application), open an SSH tunnel from your local machine to the remote server.

    ??? info "Windows"
        See [here for instructions](https://www.skyverge.com/blog/how-to-set-up-an-ssh-tunnel-with-putty/).

    ??? info "MacOS, GNU/Linux"
        As a reminder of [another section](../software-setup/#psql):

        `ssh -N -L localhost:8888:localhost:8888 username@[projectname].dssg.io`
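
The `.pg_service.conf` option mentioned in the list above could look roughly like the sketch below. Everything in the file is a placeholder: the service name `myproject`, the database name `myproject_db` and the user `username` are assumptions for illustration, and `databaseservername` stands for whatever host your project's database actually runs on. The password is left out on purpose; `~/.pgpass` is one place where the PostgreSQL client library looks for it.

```shell
# All values written to the file are hypothetical placeholders.
# PGSERVICEFILE is the environment variable the PostgreSQL client library
# checks for a non-default location of the service file.
service_file="${PGSERVICEFILE:-$HOME/.pg_service.conf}"

if [ -e "$service_file" ]; then
    # Never overwrite an existing file; edit it by hand instead.
    echo "$service_file already exists, leaving it untouched" >&2
else
    cat > "$service_file" <<'EOF'
[myproject]
host=databaseservername
port=5432
dbname=myproject_db
user=username
EOF
    chmod 600 "$service_file"   # readable by you only
fi

# Afterwards you should be able to connect with:
#   psql service=myproject
```
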

Writing and Running Code

- If you're using your laptop (Sublime, Atom, or some other editor) to edit code, use git to commit and push to the repo, and then do a `git pull` on the server to get your code there.
- If you're writing code on the server directly, you should use vim or GNU/Emacs.
- `git commit` often. Every time you finish a chunk of work, do a `git commit`. `git push` when you've tested it and it is doing what you intended it to do. Do not push broken code to master. You will annoy your teammates :) Later in the summer, we'll talk more about how to create git branches.
- Every time you resume working, do a `git pull` to get the latest version of the code.
- If you need to copy files from your laptop to the server, use `scp`.

    !!! danger
        The other way around, i.e. *from the server to your laptop*, **DON'T!** All the data needs to stay on the remote server.

- If you're writing (or running) your code in Jupyter notebooks, then you should:

    1. Create a no-browser Jupyter session on the server:

        `jupyter notebook --no-browser --port=8889`

        You may need to change the port number to avoid conflicts with teammates using the same port.

    2. On your local machine, create an SSH tunnel that forwards the port for Jupyter Notebook (`8889` in the above command) on the remote machine to a port on the local machine (`8888` in the command below) so that you can access it using your local browser:

        `ssh -N -L localhost:8888:localhost:8889 username@[projectname].dssg.io`

    3. Access the remote Jupyter server via your local browser. Open your browser and go to http://localhost:8888

        !!! info ""
            You may need to copy and paste the longer URL with a token that is generated when you run the command in step 1. It looks like `http://localhost:8889/?token=343vdfvdfggdfgfdt345&token=fdsfdf345353vc`. When you open it through the tunnel, use your local port (`8888` here) instead of the remote one.
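
One possible way to avoid port clashes with teammates is to derive the Jupyter port from your numeric user id on the server. The sketch below is an assumption about how you might do this, not a project convention: the base of 8000 and the range of 1000 ports are arbitrary, and `username@projectservername` is a placeholder. Run it on the server, then copy the printed tunnel command to your laptop.

```shell
# Run this on the server: id -u is your numeric user id there.
# The result is a port between 8000 and 8999 (arbitrary choice of range).
remote_port=$((8000 + $(id -u) % 1000))
local_port=8888   # port on your laptop

echo "On the server:  jupyter notebook --no-browser --port=$remote_port"
echo "On your laptop: ssh -N -L localhost:$local_port:localhost:$remote_port username@projectservername"
echo "In the browser: http://localhost:$local_port"
```

Two users can still land on the same port if their ids differ by a multiple of 1000, so treat the result as a starting point rather than a guarantee.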
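
The commit, push and pull cycle described above can be tried out safely without touching a real repository. In this sketch a local bare repository stands in for the shared GitHub repo, and two clones play the roles of your laptop and the server; the directory names, the file `analysis.py` and the demo identity are all made up for the example.

```shell
set -e
work=$(mktemp -d)

# A local bare repository stands in for the shared GitHub repo.
git init --quiet --bare "$work/remote.git"

# "Laptop" clone: edit, commit, push.
git clone --quiet "$work/remote.git" "$work/laptop"
cd "$work/laptop"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo 'print("step 1")' > analysis.py
git add analysis.py
git commit --quiet -m "Add analysis script"
git push --quiet origin HEAD

# "Server" clone: receives the code that was pushed.
git clone --quiet "$work/remote.git" "$work/server"

# Back on the laptop: another chunk of work, committed and pushed.
cd "$work/laptop"
echo 'print("step 2")' >> analysis.py
git commit --quiet -am "Extend analysis script"
git push --quiet origin HEAD

# On the server: pull before resuming work.
cd "$work/server"
git pull --quiet
cat analysis.py
```
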
- When to use Jupyter notebooks versus `.py` files to write code
- When to use `psql` versus DBeaver
- When to use SQL versus Python and/or Pandas
- Tunneling to the DB for Tableau (or another app like QGIS): `ssh -L 5433:databaseservername:5432 username@projectservername`
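
With a tunnel like the one in the last item, the application on your laptop connects to `localhost` on the local port rather than to the database server directly. The sketch below only prints the pair of commands so that you can check them before running anything; `dbuser` and `dbname` are hypothetical, the host names are the placeholders used above, and `-N` is added so that ssh opens the tunnel without starting a remote shell.

```shell
# Keep the local port in one variable so the tunnel and the client agree.
local_port=5433

tunnel_cmd="ssh -N -L $local_port:databaseservername:5432 username@projectservername"
psql_cmd="psql -h localhost -p $local_port -U dbuser dbname"   # dbuser, dbname: placeholders

echo "Terminal 1: $tunnel_cmd"
echo "Terminal 2: $psql_cmd"
```

In Tableau or QGIS the same idea applies: point the connection at host `localhost` and the local port instead of the database server's own address.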