
Installation Guide on Greenplum (v0.1.0alpha)

Rahul Iyer edited this page Jul 1, 2015 · 2 revisions

Prerequisites:

The following software must be installed before proceeding with the Installation Steps:

Installation Steps:

These instructions assume your database is installed in $GPHOME, and you have sudo access to the database installation user account (default: gpadmin).
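
Before starting, it can help to confirm that these assumptions hold. The following is a minimal sketch, not part of the official procedure. The function name `check_gphome` is hypothetical, and it only checks that `GPHOME` is set and that the `greenplum_path.sh` script used in the first step exists under it.

```shell
# Hypothetical pre-flight check (not part of MADlib or Greenplum).
# Succeeds only if GPHOME is set and contains greenplum_path.sh.
check_gphome() {
    if [ -z "${GPHOME:-}" ]; then
        echo "GPHOME is not set" >&2
        return 1
    fi
    if [ ! -f "$GPHOME/greenplum_path.sh" ]; then
        echo "greenplum_path.sh not found in $GPHOME" >&2
        return 1
    fi
    echo "GPHOME looks usable: $GPHOME"
}
```

Running `check_gphome` in the gpadmin shell prints a confirmation, or returns a non-zero status with a message on stderr if either check fails.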

  1. Log in to the master server with the gpadmin account and load the Greenplum environment into your current shell:
     $> source $GPHOME/greenplum_path.sh
  1. Prepare your database and schema designated for MADlib installation:
    1. If the database does not exist, create it:
     $> createdb <db_name>
    1. Create target schema:
     sql> CREATE SCHEMA <schema_name>;
    1. Create database languages:
     sql> CREATE LANGUAGE plpgsql;
     sql> CREATE LANGUAGE plpythonu;
  1. Make sure you have Python setuptools installed:
     $> cd /tmp
     $> wget http://pypi.python.org/packages/2.6/s/setuptools/setuptools-0.6c11-py2.6.egg#md5=bfa92100bd772d5a213eedd356d64086
     $> sh setuptools-0.6c11-py2.6.egg

Note!: On a Mac, wget is probably missing. Either install it, or paste the link above into your browser to download the setuptools egg file, then change to the download directory and run sh setuptools-0.6c11-py2.6.egg.
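
As another option when wget is missing, curl can do the same download. The sketch below is an illustration only: the helper name `fetch` is hypothetical, and it assumes that either wget or curl is on your PATH.

```shell
# Hypothetical helper: download a URL into the current directory
# with whichever of wget or curl is available.
fetch() {
    if command -v wget >/dev/null 2>&1; then
        wget "$1"
    elif command -v curl >/dev/null 2>&1; then
        # -L follows redirects; -O saves under the remote file name
        curl -L -O "$1"
    else
        echo "neither wget nor curl found" >&2
        return 1
    fi
}
```

After `cd /tmp`, call `fetch` with the quoted setuptools URL from the step above, then run the `sh setuptools-0.6c11-py2.6.egg` command as before.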

  1. Install the following Python libraries, which are used by the MADlib installer:
     $> $GPHOME/ext/python/bin/easy_install argparse sqlparse hashlib

Note!: If the hashlib module installation reports an error, you can ignore it (hashlib has been part of the Python standard library since Python 2.5).
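
To confirm that the libraries are importable despite any installer warnings, something like the following can be used. This is a minimal sketch: the helper name `check_py_modules` is hypothetical, and the interpreter path in the usage note is an assumption based on the easy_install location above.

```shell
# Hypothetical helper: try to import each named module with the
# given Python interpreter and report which ones are missing.
check_py_modules() {
    py="$1"
    shift
    missing=0
    for mod in "$@"; do
        if "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "ok: $mod"
        else
            echo "MISSING: $mod"
            missing=1
        fi
    done
    # Non-zero exit status if any module failed to import
    return "$missing"
}
```

For example, `check_py_modules "$GPHOME/ext/python/bin/python" argparse sqlparse hashlib` (assuming the Greenplum-bundled interpreter lives next to easy_install) prints one line per module.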

  1. Download and uncompress the MADlib repository from https://github.com/madlib/madlib/archives/master, or use git to clone it from git@github.com:madlib/madlib.git.

  2. Change directory to the madlib root and edit the madpy/Config.yml file to reflect your installation:

     $> vi madpy/Config.yml
* Note #1: Make sure to verify `connect_args`.
* Note #2: Set `target_schema` according to step 2b).
* Note #3: Uncomment `prep_flags: -DGREENPLUM`.
* Note #4: If you have a multi-node Greenplum setup, uncomment `post_hook = greenplum.py`.
  1. From the madlib directory run the Python install command. This step deploys MADlib libraries into the target directories:
     $> python setup.py install
  1. Now that the Python libraries are installed and the database is ready, build the database extensions and install them. Note!: If you uncommented post_hook = greenplum.py, you will be asked for the full path to your host_file so that the C shared objects can be deployed automatically on your Greenplum cluster. Locate your host_file before running this:
     $> export CFLAGS="-L$GPHOME/ext/python/lib/ -L$GPHOME/lib/"
     $> export C_INCLUDE_PATH=$GPHOME/include/
     $> madpack install
  1. To test your installation, run the following command to list all the functions installed in your MADlib designated schema:
     $> psql -d <db_name> -c "\df <schema_name>.*"
  1. Done. Visit the MADlib User Documentation for more information.
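
The installation test above can also be scripted so that it fails loudly when nothing was installed. The following is a minimal sketch rather than an official MADlib check. Both function names are hypothetical, it assumes psql is on your PATH and can connect without prompting, and the schema name is interpolated directly into the query, so pass only trusted values.

```shell
# Hypothetical helper: print how many functions exist in a schema.
# -A (unaligned) and -t (tuples only) make psql print just the number.
count_schema_functions() {
    psql -d "$1" -A -t -c "SELECT count(*) FROM pg_catalog.pg_proc p JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace WHERE n.nspname = '$2';"
}

# Succeed only if at least one function is installed in the schema.
verify_madlib_install() {
    n="$(count_schema_functions "$1" "$2")" || return 1
    if [ "$n" -gt 0 ]; then
        echo "found $n functions in schema $2"
    else
        echo "no functions found in schema $2" >&2
        return 1
    fi
}
```

Call it as `verify_madlib_install <db_name> <schema_name>`, using the same database and schema names as in the earlier steps.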