Installation Guide on Greenplum (v0.1.0alpha)
The following software must be installed before proceeding with the Installation Steps:
- Database: Greenplum™ 3.3 or higher
- Other software: LAPACK-DEVEL (http://www.netlib.org/lapack/)
These instructions assume your database is installed in $GPHOME, and that you have sudo access to the database installation user account (default: gpadmin).
- Log in to the master server using the gpadmin account and set the Greenplum environment:
$> source $GPHOME/greenplum_path.sh
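Once the environment script has been loaded, a quick sanity check can confirm that the shell actually picked up the settings. The following is a minimal sketch; the function name `check_gp_env` is hypothetical and not part of Greenplum or MADlib.

```shell
# Hypothetical helper (not part of Greenplum or MADlib):
# reports whether the Greenplum environment appears to be loaded.
check_gp_env() {
    # GPHOME must be set and must point at an existing directory.
    if [ -z "${GPHOME:-}" ] || [ ! -d "$GPHOME" ]; then
        echo "GPHOME is not set or is not a directory" >&2
        return 1
    fi
    # Client tools such as psql should be reachable on PATH.
    if ! command -v psql >/dev/null 2>&1; then
        echo "psql not found on PATH" >&2
        return 1
    fi
    echo "Greenplum environment looks OK: $GPHOME"
}
```

Running `check_gp_env` in the same shell session should print the GPHOME path; a non-zero exit status suggests the environment script was not loaded into the current shell.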
- Prepare the database and schema designated for the MADlib installation:
a) If the database does not exist, create it:
$> createdb <db_name>
b) Create the target schema:
sql> CREATE SCHEMA <schema_name>;
c) Create the database languages:
sql> CREATE LANGUAGE plpgsql;
sql> CREATE LANGUAGE plpythonu;
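The schema and language statements above can also be generated and piped into psql in one go. This is only a sketch: `madlib_prep_sql` is a hypothetical helper name, and it emits exactly the three statements listed in this step. Note that CREATE LANGUAGE reports an error if the language is already installed in the database.

```shell
# Hypothetical helper: print the SQL from the preparation step
# for a given schema name, one statement per line.
madlib_prep_sql() {
    schema="${1:-}"
    if [ -z "$schema" ]; then
        echo "usage: madlib_prep_sql <schema_name>" >&2
        return 1
    fi
    printf 'CREATE SCHEMA %s;\n' "$schema"
    printf 'CREATE LANGUAGE plpgsql;\n'
    printf 'CREATE LANGUAGE plpythonu;\n'
}
```

A possible invocation, assuming the database already exists: `madlib_prep_sql <schema_name> | psql -d <db_name>`.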
- Make sure you have Python setuptools installed:
$> cd /tmp
$> wget http://pypi.python.org/packages/2.6/s/setuptools/setuptools-0.6c11-py2.6.egg#md5=bfa92100bd772d5a213eedd356d64086
$> sh setuptools-0.6c11-py2.6.egg
Note!: If you are using a Mac, you are probably missing wget. You can either install it or copy the above link into your browser and download the setuptools egg file. Then go to the download directory and run sh setuptools-0.6c11-py2.6.egg.
- Install the following Python libraries used by the MADlib installer:
$> $GPHOME/ext/python/bin/easy_install argparse sqlparse hashlib
Note!: If you see an error while the hashlib module is being installed, you can ignore it.
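To confirm that the libraries ended up importable, each module can be imported with the interpreter that will run the installer. The sketch below uses a hypothetical helper, `check_py_modules`; the interpreter path in the usage note is an assumption based on the easy_install location shown above.

```shell
# Hypothetical helper: try to import each named module with the
# given Python interpreter and report the result per module.
check_py_modules() {
    py="$1"
    shift
    missing=0
    for mod in "$@"; do
        if "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "ok: $mod"
        else
            echo "missing: $mod"
            missing=1
        fi
    done
    # Exit status is 0 only if every module imported cleanly.
    return "$missing"
}
```

For example, assuming the bundled interpreter lives next to easy_install: `check_py_modules "$GPHOME/ext/python/bin/python" setuptools argparse sqlparse hashlib`.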
- Download and uncompress the MADlib repository from https://github.com/madlib/madlib/archives/master, or use git to clone it from git@github.com:madlib/madlib.git.
- Change directory to the madlib root and edit the madpy/Config.yml file to reflect your installation:
$> vi madpy/Config.yml
* Note #1: Make sure to verify `connect_args`.
* Note #2: Set `target_schema` to the schema you created with CREATE SCHEMA above.
* Note #3: Uncomment `prep_flags: -DGREENPLUM`.
* Note #4: If you have a multi-node Greenplum setup, uncomment `post_hook = greenplum.py`.
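After editing, it can help to list the four settings mentioned in the notes and see at a glance which ones are still commented out. This is a rough sketch: `show_madlib_config` is a hypothetical helper, and it only assumes that each setting name appears at the start of a line, optionally preceded by `#`.

```shell
# Hypothetical helper: print (with line numbers) every line of the
# config file that mentions one of the settings from the notes above.
# Lines that still begin with '#' are commented out.
show_madlib_config() {
    cfg="${1:-madpy/Config.yml}"
    grep -nE '^[[:space:]]*#?[[:space:]]*(connect_args|target_schema|prep_flags|post_hook)' "$cfg"
}
```

Run `show_madlib_config` from the madlib root, or pass another path as the first argument.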
- From the madlib directory, run the Python install command. This step deploys the MADlib libraries into the target directories:
$> python setup.py install
- Now that the Python libraries are installed and the database is ready, build the database extensions and install them.
Note!: If you uncommented post_hook = greenplum.py, you will be asked for the full path to your host_file so that the C shared objects can be deployed automatically across your Greenplum cluster. Locate your host_file before running this:
$> export CFLAGS="-L$GPHOME/ext/python/lib/ -L$GPHOME/lib/"
$> export C_INCLUDE_PATH=$GPHOME/include/
$> madpack install
- To test your installation you can run the following command to list all the functions installed in your MADlib designated schema:
$> psql -d <db_name> -c "\df <schema_name>.*"
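For a scriptable version of this check, the number of functions in the schema can be read from the system catalogs instead of scanning the \df listing by eye. The sketch below is one possible approach; `count_schema_functions` is a hypothetical helper, and the schema name is interpolated directly into the query, so it should only be given trusted input.

```shell
# Hypothetical helper: print the number of functions defined in a schema.
count_schema_functions() {
    db="${1:-}"
    schema="${2:-}"
    if [ -z "$db" ] || [ -z "$schema" ]; then
        echo "usage: count_schema_functions <db_name> <schema_name>" >&2
        return 1
    fi
    # -A (unaligned) and -t (tuples only) make psql print just the number.
    psql -d "$db" -A -t -c \
        "SELECT count(*) FROM pg_catalog.pg_proc p
         JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace
         WHERE n.nspname = '$schema';"
}
```

A result of 0 for `count_schema_functions <db_name> <schema_name>` would suggest that the install step did not deploy anything into that schema.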
- Done. Visit MADlib User Documentation for more info.