This repository has been archived by the owner on Jun 23, 2020. It is now read-only.

Desk-LM

Desk-LM is a Python environment for training machine learning models. It currently implements the following ML algorithms:

  • Linear SVM
  • Decision Tree
  • K-NN
  • ANN

We are extending the library to other algorithms, including unsupervised ones. Contributions are welcome.

The user specifies a .csv dataset, an algorithm and a set of parameters. Desk-LM then trains the candidate models, selects the best one and exports it for use on edge devices through the twin tool Micro-LM.

For ANNs, Desk-LM outputs the model in hdf5 file format, to be imported by STM32 CubeAI, together with some .h files that could be useful for testing the whole dataset performance on the microcontroller (STM32 Nucleo boards only).

For all the other algorithms, Desk-LM produces .c and .h files that are used as source files in an Edge-LM project, for an optimized memory footprint on edge devices. These files contain the parameters of the selected ML model.

We are working on having Desk-LM also output .json files, to allow dynamic usage by microcontrollers.

Reference article for more information

F. Sakr, F. Bellotti, R. Berta, A. De Gloria, "Machine Learning on Mainstream Microcontrollers," Sensors 2020, 20, 2638. https://www.mdpi.com/1424-8220/20/9/2638

Usage

Input

The command line expects two input parameters (see launch.json):

-a <algorithm_name>

Currently, we support SVM, K-nn, ANN, DT.

-d <dataset_name>. The software expects a <dataset_name>.csv file in ../datasets/

Only numeric datasets are accepted for now.
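The input handling described above could be sketched as follows. This is a minimal illustration using only the standard library, not the actual code of main.py: the function names (`parse_args`, `dataset_path`, `non_numeric_columns`) are hypothetical, and the sketch assumes that the first row of the .csv file is a header.

```python
import argparse
import csv
from pathlib import Path


def parse_args(argv=None):
    """Parse the two options described above: -a (algorithm) and -d (dataset)."""
    parser = argparse.ArgumentParser(description="Desk-LM style launcher (sketch)")
    parser.add_argument("-a", dest="algo", required=True, help="algorithm name")
    parser.add_argument("-d", dest="dataset", required=True,
                        help="dataset name, without the .csv extension")
    return parser.parse_args(argv)


def dataset_path(name, base_dir="../datasets"):
    """Map <dataset_name> to <base_dir>/<dataset_name>.csv."""
    return Path(base_dir) / f"{name}.csv"


def non_numeric_columns(lines):
    """Return the header names of columns with at least one non-numeric value.

    `lines` is any iterable of CSV text lines (an open file works).
    Assumption of this sketch: the first row is a header.
    """
    bad = []
    reader = csv.reader(lines)
    header = next(reader)
    for row in reader:
        for name, value in zip(header, row):
            try:
                float(value)
            except ValueError:
                if name not in bad:
                    bad.append(name)
    return bad
```

A check like `non_numeric_columns(open(dataset_path(args.dataset)))` before training would flag columns that need to be encoded or dropped by hand.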

We have tried the software with the following datasets:

Heart Disease UCI | Kaggle. Available online: http://www.kaggle.com/ronitf/heart-disease-uci

Boero, L.; Cello, M.; Marchese, M.; Mariconti, E.; Naqash, T.; Zappatore, S. Statistical fingerprint—Based intrusion detection system (SF-IDS). Int. J. Commun. Syst. 2017, 30, e3225.

Fausto, A.; Marchese, M. Implementation Details to Reduce the Latency of an SDN Statistical Fingerprint-Based IDS. In Proceedings of the IEEE International Symposium on Advanced Electrical and Communication Technologies (ISAECT), Rome, Italy, 27–29 November 2019.

http://www.fizyka.umk.pl/kis-old/projects/datasets.html#Sonar

Traffic, Driving Style and Road Surface Condition | Kaggle. Available online: http://www.kaggle.com/gloseto/traffic-driving-style-road-surface-condition

EnviroCar—Datasets—the Datahub. Available online: http://www.old.datahub.io/dataset/envirocar (accessed on 13 February 2020).

Massoud, R.; Poslad, S.; Bellotti, F.; Berta, R.; Mehran, K.; Gloria, A.D. A fuzzy logic module to estimate a driver’s fuel consumption for reality-enhanced serious games. Int. J. Serious Games 2018, 5, 45–62.

Massoud, R.; Bellotti, F.; Poslad, S.; Berta, R.; De Gloria, A. Towards a reality-enhanced serious game to promote eco-driving in the wild. In Games and Learning Alliance. GALA 2019. Lecture Notes in Computer Science; Liapis, A., Yannakakis, G., Gentile, M., Ninaus, M., Eds.; Springer: Berlin, Germany, 2019.

Search for and download air quality data | NSW Dept of Planning, Industry and Environment. Available online: http://www.dpie.nsw.gov.au/air-quality/search-for-and-download-air-quality-data (accessed on 13 February 2020).

Configuration files

config.py exposes the characteristics of the dataset to be processed (e.g., what the target column is), and the parameters to be analyzed for the pre-processing (e.g., 'mle' algorithm for automatic PCA) and for the selected algorithm training and cross-validation.

For ANN only:

  • ./config/<ds_name>/activeFuncs.dat specifies the various activation functions that could be used in all the layers
  • ./config/<ds_name>/layerShape.dat specifies the possible shapes of the layers of the ANN.

Output

In config.py the user can specify the export_dir variable, the directory where all the files usable by Micro-LM will be exported (please see https://github.com/Edge-Learning-Machine/Micro-LM for usage instructions). In particular, the files are written under "export_dir/ds/source" and "export_dir/ds/include". Output files are also duplicated in the following directories:

Linear SVM / DT / K-NN

The .c and .h files containing the selected model parameters are generated in './out/source/' and './out/include/'. They need to be compiled in an Edge-LM project.

The same output is also provided under: './out/' + cfg.ds_name + '/include/' + cfg.algo.lower() + '/'
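The directory layout above could be reproduced with a small helper like the one below. It is only a sketch: `output_dirs` is a hypothetical name, the `cfg` object is a stand-in for the real config module, and the sketch assumes that "ds" in "export_dir/ds" is the dataset name (`cfg.ds_name`).

```python
from pathlib import Path
from types import SimpleNamespace


def output_dirs(cfg, out_root="./out"):
    """Return the output directories described above for a given configuration."""
    out_root = Path(out_root)
    # Assumption: "ds" in "export_dir/ds/..." stands for the dataset name.
    export_root = Path(cfg.export_dir) / cfg.ds_name
    return {
        "export_source": export_root / "source",
        "export_include": export_root / "include",
        "out_source": out_root / "source",
        "out_include": out_root / "include",
        # './out/' + cfg.ds_name + '/include/' + cfg.algo.lower() + '/'
        "out_per_algo": out_root / cfg.ds_name / "include" / cfg.algo.lower(),
    }


# Stand-in for config.py; the values are purely illustrative.
cfg = SimpleNamespace(export_dir="export", ds_name="heart", algo="SVM")
for name, path in output_dirs(cfg).items():
    print(f"{name}: {path.as_posix()}")
```

Keeping the per-algorithm copy under the dataset name means that runs with different algorithms on the same dataset do not overwrite each other, while './out/source/' and './out/include/' always hold the latest run.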

ANN

In "export_dir/ds/source" and './out/source/', the preprocess_params.c file is saved, together with the .c file for dataset testing.

In "export_dir/ds/include" and './out/include/', the preprocess_params.h file is saved, together with the .h file for dataset testing and the ANN model in HDF5 format.

The same output is also provided under: './out/' + cfg.ds_name + '/include/' + cfg.algo.lower()

To use an ANN in an STM32CubeIDE project with the ST X-Cube-AI package (STM32 Nucleo boards only):

1- Load the .h5 model generated by Desk-LM into STM32CubeIDE and generate the code for your target board

2- Use the generated files in your project for pre-processing and/or dataset testing

Testing

If the nTests variable (config.py) is equal to 'full', a testing_set (.c and .h files) is produced. Otherwise, a minimal_testing_set (.c and .h files) is produced. These files are used by Micro-LM to test the whole dataset or part of it.

Log

A log file, ./<ds_name>.log, is written for each dataset.

Data type

32-bit floating-point (float32) data are used.
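Since Python floats are 64-bit, values exported as float32 can differ slightly from the ones seen during training on the desktop. The sketch below shows the size of that effect using only the standard library; `to_float32` is a hypothetical helper and not part of Desk-LM.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


x = 0.1
x32 = to_float32(x)
# The rounding error for 0.1 is on the order of 1e-9.
print(x32 == x, abs(x32 - x))
```

Values that are exactly representable in float32, such as 0.5, survive the round trip unchanged, so small mismatches between desktop and microcontroller results are to be expected only for parameters like 0.1 above.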

Run

python main.py -d <dataset_name> -a <algo_name>

Version

Currently tested with Python 3.6, Keras 2.2.4, Pandas 0.25.2, and TensorFlow 1.8.0, which is needed for importing the ANN model in Cube-AI.