MADlib Module Anatomy
agorajek edited this page May 13, 2011
This page explains all the elements needed to successfully develop and plug in a new MADlib module.
Say you want to write a new MADlib module called NewModule (code name: newmod).
./src/
    modules/
        newmod/                 # (REQUIRED) new directory for the module code
            newmod.sql_in       # (REQUIRED) SQL file to create DB objects
            newmod.py_in        # (optional) Python code
            newmod.c/cpp        # (optional) C/C++ code for this module
            test/               # (optional) directory for SQL test scripts
                newmod.sql_in
...
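As a rough illustration of the layout above, the following Python sketch creates empty placeholder files for a new module. The helper name `scaffold_module` and its flags are hypothetical and not part of MADlib; the script only mirrors the directory structure described on this page.

```python
import tempfile
from pathlib import Path


def scaffold_module(src_root, name, with_python=False, with_c=False, with_tests=False):
    """Create empty placeholder files under <src_root>/modules/<name>/ (hypothetical helper)."""
    module_dir = Path(src_root) / "modules" / name
    module_dir.mkdir(parents=True, exist_ok=True)

    files = [module_dir / f"{name}.sql_in"]          # the only required code file
    if with_python:
        files.append(module_dir / f"{name}.py_in")   # optional Python code
    if with_c:
        files.append(module_dir / f"{name}.c")       # optional C code (.cpp is also possible)
    if with_tests:
        test_dir = module_dir / "test"
        test_dir.mkdir(exist_ok=True)
        files.append(test_dir / f"{name}.sql_in")    # optional SQL test script

    for path in files:
        path.touch()
    return files


if __name__ == "__main__":
    # Work in a throwaway directory so nothing is written into a real source tree.
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold_module(Path(tmp) / "src", "newmod",
                                  with_python=True, with_tests=True)
        for path in created:
            print(path.relative_to(tmp))
```

Only the `.sql_in` file is created unconditionally, matching the REQUIRED/optional markers in the tree.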
- newmod.sql_in - SQL file that creates the database objects for this module. This is the only required code file, because a module/method could be written completely in SQL, in which case no Python or C/C++ code is needed.
  This file is preprocessed with m4 during the installation phase and currently uses the following meta variables:
  - MADLIB_SCHEMA - replaced with the target schema name
  - PLPYTHON_LIBDIR - used inside PL/Python routines (UDFs); replaced with the path to the directory containing the Python module of each method
  - MODULE_PATHNAME - used inside C routines (UDFs); replaced with the path to the directory containing the C/C++ module of each method
- newmod.py_in - Python code for the newmod module (preprocessed during the build phase for each DB platform)
- newmod.c/cpp - C/C++ code for the newmod module
- test/newmod.sql_in - SQL test script written according to the Unit-Testing-Guide
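To make the role of the meta variables more concrete, here is a minimal Python sketch that imitates the substitution on a made-up `newmod.sql_in` fragment. It is not how MADlib performs the step (the page says m4 is used), and everything in it is an assumption for illustration: the function names `newmod_c` and `newmod_py`, the PL/Python body, the schema name `madlib`, and both paths.

```python
import re

# Hypothetical fragment of newmod.sql_in; the function names and bodies are invented.
TEMPLATE = """\
CREATE FUNCTION MADLIB_SCHEMA.newmod_c(x float8) RETURNS float8
AS 'MODULE_PATHNAME', 'newmod_c' LANGUAGE C;

CREATE FUNCTION MADLIB_SCHEMA.newmod_py(x float8) RETURNS float8 AS $$
    import sys
    sys.path.append("PLPYTHON_LIBDIR")
    from newmod import newmod
    return newmod.run(x)
$$ LANGUAGE plpythonu;
"""

# Assumed values; a real installation would supply its own schema name and paths.
VALUES = {
    "MADLIB_SCHEMA": "madlib",
    "PLPYTHON_LIBDIR": "/hypothetical/madlib/modules",
    "MODULE_PATHNAME": "/hypothetical/madlib/lib",
}


def expand(template, values):
    """Replace each meta variable that appears as a whole word, roughly like an m4 macro."""
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in values) + r")\b")
    return pattern.sub(lambda match: values[match.group(1)], template)


if __name__ == "__main__":
    print(expand(TEMPLATE, VALUES))
```

After expansion, none of the three meta variable names remain in the SQL text, and each object is created in the target schema.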
To include the new module in the generic (not database-dependent) installation, only one config file needs to be edited, ./config/Modules.yml:
- name: newmod
  depends: ['othermod1', 'othermod2']
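The `depends` list suggests that the listed modules need to be available before newmod. This page does not say how the installer uses the field, so the following Python sketch only shows one plausible reading: ordering modules so that dependencies come first. The entries are written as Python data instead of being parsed from Modules.yml, and the dependency of othermod2 on othermod1 is invented for the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Entries shaped like those in Modules.yml; othermod2 -> othermod1 is an assumption.
MODULES = [
    {"name": "othermod1"},
    {"name": "othermod2", "depends": ["othermod1"]},
    {"name": "newmod", "depends": ["othermod1", "othermod2"]},
]


def install_order(entries):
    """Return module names ordered so that every module follows its dependencies."""
    graph = {entry["name"]: set(entry.get("depends", [])) for entry in entries}
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(install_order(MODULES))  # ['othermod1', 'othermod2', 'newmod']
```

A module without a `depends` key is treated as having no dependencies.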