This repository contains modules for preprocessing data for machine learning. Each module has a test file that verifies its functions.
Machine learning often requires categorical data to be encoded as binary variables. Sometimes categorical data is represented by numeric values (postal codes, for example), and most encoders skip numeric columns. This module converts numeric columns that store categorical data to dtype "string", so that encoders treat them as categorical.
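The conversion described above can be sketched roughly as follows. This is a minimal illustration that assumes the data lives in a pandas DataFrame; the function name `numeric_to_string` and the example column names are hypothetical and not taken from this repository.

```python
import pandas as pd


def numeric_to_string(df, columns):
    """Return a copy of df with the given columns cast to dtype "string".

    Hypothetical helper for illustration; the module's real API may differ.
    """
    out = df.copy()
    for col in columns:
        # astype("string") selects pandas' dedicated string dtype
        out[col] = out[col].astype("string")
    return out


# zip_code holds numbers but is really a category; price is truly numeric.
df = pd.DataFrame({"zip_code": [10115, 20095, 10115], "price": [1.5, 2.0, 3.25]})
converted = numeric_to_string(df, ["zip_code"])
```

Only the columns named explicitly are converted, since a column's dtype alone does not say whether its numbers are measurements or category codes.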
Machine learning often requires numeric data to be free of missing values. To avoid dropping every row that has at least one missing numeric value, this module imputes the missing values with an outlier value.
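One possible way to impute with an outlier is sketched below. The text does not say how the outlier value is chosen, so the rule used here (column minimum minus a multiple of the column's range) is purely an assumption, as are the function name `impute_with_outlier`, the `factor` parameter, and the use of a pandas DataFrame.

```python
import pandas as pd


def impute_with_outlier(df, factor=10.0):
    """Fill missing numeric values with a value far below each column's minimum.

    Hypothetical helper; the outlier rule is an assumption made for this sketch.
    """
    out = df.copy()
    for col in out.select_dtypes(include="number").columns:
        col_min = out[col].min()
        spread = out[col].max() - col_min
        if pd.isna(spread):
            # Column has no observed values, so no outlier can be derived.
            continue
        # Use at least 1.0 as the spread so constant columns still get an outlier.
        fill_value = col_min - factor * max(spread, 1.0)
        out[col] = out[col].fillna(fill_value)
    return out


df = pd.DataFrame({"age": [20.0, None, 40.0], "city": ["a", None, "c"]})
filled = impute_with_outlier(df)
```

A value far outside the observed range keeps the imputed rows distinguishable from real measurements, which tree-based models in particular can exploit. Non-numeric columns such as `city` are left untouched.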