
CSV Format Support in Shifu

Zhang Pengshan (David) edited this page Apr 26, 2017 · 2 revisions

When using Shifu, the header path and header delimiter are normally specified in ModelConfig.json. For CSV format data, where the header is the first line of the data file, Shifu can work without a separate header file.

How to run Shifu using a CSV format file

   "dataSet" : {
     "source" : "HDFS",
     "dataPath" : "train.csv",
     "dataDelimiter" : "|",
     "headerPath" : "",
     "headerDelimiter" : "",
     ...

With 'headerPath' left empty as above, the header is parsed from the first line of 'dataPath'. During training and data processing, that first line of 'train.csv' is then skipped so it is not treated as a data record.
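The behavior described above can be illustrated with a minimal Python sketch. This is not Shifu's code: the function name `read_with_inline_header` is hypothetical, and the sketch assumes the header line is split with the same delimiter as the data ('dataDelimiter'), which the page does not state explicitly.

```python
def read_with_inline_header(lines, data_delimiter="|"):
    """Treat the first line as the header and the rest as data records.

    Illustrative only; assumes the header uses the same delimiter as the data.
    """
    it = iter(lines)
    first = next(it, None)
    if first is None:
        # Empty input: no header and no records.
        return [], []
    # Column names come from the first line.
    header = [name.strip() for name in first.rstrip("\n").split(data_delimiter)]
    # Remaining non-blank lines are data; the header line is not among them.
    rows = [line.rstrip("\n").split(data_delimiter) for line in it if line.strip()]
    return header, rows
```

For example, passing the lines of a small pipe-delimited file yields the column names separately from the records, so the header line never reaches training or statistics code as a data row.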

In the 'eval' step, CSV format data is supported in the same way.

  "evals" : [ {
      "name" : "Eval1",
      "dataSet" : {
         "source" : "HDFS",
         "dataPath" : "test.csv",
         "dataDelimiter" : "|",
         "headerPath" : "",
         "headerDelimiter" : "",
         ...

Please note that CSV format data is supported since Shifu 0.10.0.
