diff --git a/joss-paper/images/transformation_run_example.png b/joss-paper/images/transformation_run_example.png
index b16d4cc..a740849 100644
Binary files a/joss-paper/images/transformation_run_example.png and b/joss-paper/images/transformation_run_example.png differ
diff --git a/joss-paper/paper.md b/joss-paper/paper.md
index 63651bc..890e850 100644
--- a/joss-paper/paper.md
+++ b/joss-paper/paper.md
@@ -62,7 +62,7 @@ One of the most widely applied strategy to enhance the performance of machine le
 
 The python-based package `HARDy` is a modularly structured package which classifies data using 2D convolutional neural networks. A schematic for the package can be found in figure 1.
 
-![Workflow schematics for HARDy. After the data files _(.csv)_ are loaded, the transformations indicated in the _configuration file (.yaml)_ will be applied to the indicated variables. The configuration file will also contain information on which axis (x/y for Cartesian Representation or RGB channels for color images) to use for the loaded data. Once the images are generated, they will be used as input for a Convolutional Neural Network (CNN) for the data classification step. Another _configuration file (.yaml)_ is used to indicate the hyperparameters and network structure of the CNN or the hyperparameter space to use for the tuning session. Each combination of transformations will be used to train and test its own CNN/CNN tuning session. Finally, the data reporting will provide a final summary of the best classification accuracy for each transformation tested in the session, as well as the final trained model.](./images/HARDy_diagram.png)
+![Workflow schematics for HARDy. The data files (left-most column) are subjected to numerical and visual transformations according to the rules outlined in the user-defined configuration files. The images are then fed into either a CNN or a tuner, whose hyperparameter space is controlled through another configuration file. Finally, each transformation produces a report comprising the best trained model file, the training-session log, and the model validation results.](./images/HARDy_diagram.png)
 
 The package was tested on a set of simulated Small Angle Scattering (SAS) data to be classified into four different particle models: spherical, ellipsoidal, cylindrical and core-shell spherical. A total of ten thousand files were generated for each model. The data was generated using \textit{sasmodels}. The geometrical and physical parameters used to obtain each spectrum were taken from a published work discussing a similar classification task [@ArchibaldRichardK2020Caas]. The name of each SAS model was used as label for the data, allowing for further validation of the test set results. These models were selected as they present similar parameters and data features, which at times make it challenging to distinguish between them.
 
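As context for the data-generation step referenced in the hunk above, the sketch below shows how a single simulated SAS spectrum could be produced with `sasmodels` and written to a `.csv` file of the kind HARDy ingests. The q-range, parameter values, and output file name are illustrative placeholders, not the settings used in the paper, which takes its geometrical and physical parameters from [@ArchibaldRichardK2020Caas].

```python
# Minimal sketch: simulate one SAS spectrum with sasmodels and save it as CSV.
# The q-range, parameter values, and output file name are assumptions for
# illustration only.
import numpy as np
from sasmodels.core import load_model
from sasmodels.data import empty_data1D
from sasmodels.direct_model import DirectModel

q = np.logspace(-3, 0, 200)      # scattering vector q (1/Angstrom), assumed range
kernel = load_model("sphere")    # other classes: "ellipsoid", "cylinder", "core_shell_sphere"
compute_iq = DirectModel(empty_data1D(q), kernel)

# Placeholder sphere parameters; the paper draws its values from the cited work.
iq = compute_iq(radius=60.0, sld=1.0, sld_solvent=6.3)

# One spectrum per file; the model name doubles as the classification label.
np.savetxt("sphere_0001.csv", np.column_stack([q, iq]),
           delimiter=",", header="q,I(q)", comments="")
```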