Meta data file explained

sf-wind edited this page Oct 16, 2018 · 6 revisions

The meta data file is in JSON format and is kept deliberately flexible. Different frameworks can add their own extensions, but some fields are required for all frameworks. The format is described below through an example for ShuffleNet on the Caffe2 framework:

{
  "model": {
    "category": "CNN",
    "description": "Trained ShuffleNet on Caffe2",
    "files": {
      "init": {
        "filename": "init_net.pb",
        "location": "https://s3.amazonaws.com/download.caffe2.ai/models/shufflenet/init_net.pb",
        "md5": "b4769da2f2090e2b5a87347bb35b274d"
      },
      "predict": {
        "filename": "predict_net.pb",
        "location": "https://s3.amazonaws.com/download.caffe2.ai/models/shufflenet/predict_net.pb",
        "md5": "711758bb6d38ca8f74adda2fe72340a9"
      }
    },
    "format": "caffe2",
    "kind": "deployment",
    "name": "shufflenet"
  },
  "tests": [
    {
      "commands": [
        "{program} --net {files.predict} --init_net {files.init} --warmup {warmup} --iter {iter} --input \"gpu_0/data\" --input_dims \"1,3,224,224\" --input_type float --run_individual true"
      ],
      "identifier": "shufflenet_1,3,224,224",
      "metric": "delay",
      "iter": 50,
      "warmup": 1
    }
  ]
}

The JSON file is composed of two main fields: model and tests.
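As a minimal sketch of that top-level layout, the check below loads a meta data document and verifies only that model is a dictionary and tests is a list. The function name `check_meta` is hypothetical and is not part of the harness; framework-specific required fields are not checked.

```python
import json

def check_meta(meta):
    """Return a list of problems found in the top-level layout of a meta data dict."""
    problems = []
    if not isinstance(meta.get("model"), dict):
        problems.append("'model' must be a dictionary")
    if not isinstance(meta.get("tests"), list):
        problems.append("'tests' must be a list")
    return problems

# A tiny stand-in document; a real file would be read with json.load(open(path)).
example = json.loads('{"model": {"name": "shufflenet"}, "tests": [{"metric": "delay"}]}')
print(check_meta(example))  # prints []
```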

model is a dictionary containing all information related to the model. The required fields are different for different frameworks. Some commonly used fields are:

  • name: the name of the model.
  • files: a dictionary describing the model files or other files used by the benchmark binary. Each key is a unique name that can be referenced in the test's commands field (for example {files.predict}). Each value is another dictionary containing the following fields:
    • filename: the name of the file.
    • location: a full path indicating where the file is saved. It can be a local directory or a web URL.
    • md5: the md5 hash of the file, used for caching purposes. You usually do not need to calculate it manually: the harness may calculate it and update the meta data file, and it re-calculates the hash if the file changes. You can submit a PR to make the updated hash available to all users.
    • target (optional): if the test is only meant for one platform, this field specifies the target directory the file is copied to on that platform.
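If you do want to verify an md5 value by hand, the following sketch computes the digest of a local file with Python's standard hashlib module and compares it with the md5 field of a files entry. The helper names `file_md5` and `md5_matches` are hypothetical; how the harness itself computes and caches the hash is not shown here.

```python
import hashlib

def file_md5(path, chunk_size=1 << 20):
    """Return the hex md5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def md5_matches(path, entry):
    """Compare a local file against the `md5` field of one `files` entry."""
    return file_md5(path) == entry.get("md5")
```

For example, `md5_matches("init_net.pb", meta["model"]["files"]["init"])` would report whether a downloaded copy of init_net.pb still matches the recorded hash, assuming `meta` holds the parsed meta data file.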

tests is a list containing one or more benchmark tests on the specified model. Each test may contain multiple fields, for example:

  • metric: the metric to collect in the benchmark. Note that this field is only used to parse the data output by the benchmark binary; the binary itself may report the metric that was actually collected.
  • commands: a list of strings for the commands to run. You can put arbitrary arguments in each string. Values stored in fields of the test or the model can be referenced in the string using {<dot separated hierarchical field names>}. This is useful for referring to files that need to be copied to the target platform, where the absolute path changes. It is also useful for referring to fields such as iter and warmup, which may be used by the benchmark harness. See specifications/models/caffe2/squeezenet/squeezenet.json for an example. {program} is a special token that may not exist in the JSON file; it is the main binary to be benchmarked. Details of the placeholder mapping can be found in Json placeholder string explained.
  • iter: the number of iterations to run the benchmark binary. Its value is used by the harness. If the value also needs to be passed to the benchmark binary, reference it as a placeholder ({iter}) in the commands field.
  • warmup: the number of warmup iterations the benchmark binary runs without collecting the metrics.
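The placeholder syntax used in commands can be illustrated with a small sketch. It is not the harness's implementation: the names `lookup` and `expand` are hypothetical, the lookup order (test, then model, then extra values such as {program}) is an assumption, and the files entries are assumed to have already been mapped to paths on the target platform. The actual mapping rules are described in Json placeholder string explained.

```python
import re

# Matches tokens such as {iter} or {files.predict}.
PLACEHOLDER = re.compile(r"\{([A-Za-z_][\w.]*)\}")

def lookup(path, scopes):
    """Resolve a dot separated name against each scope in order; None if missing."""
    for scope in scopes:
        node = scope
        for key in path.split("."):
            if isinstance(node, dict) and key in node:
                node = node[key]
            else:
                node = None
                break
        if node is not None:
            return node
    return None

def expand(command, test, model, extra=None):
    """Replace {a.b.c} tokens with values from the test, the model, or `extra`."""
    scopes = [test, model, extra or {}]

    def substitute(match):
        value = lookup(match.group(1), scopes)
        # Unknown tokens are left untouched rather than guessed.
        return match.group(0) if value is None else str(value)

    return PLACEHOLDER.sub(substitute, command)

# Assumed inputs: file entries already resolved to paths, and a made-up binary path.
model = {"files": {"predict": "/tmp/predict_net.pb", "init": "/tmp/init_net.pb"}}
test = {"iter": 50, "warmup": 1}
cmd = "{program} --net {files.predict} --init_net {files.init} --warmup {warmup} --iter {iter}"
print(expand(cmd, test, model, {"program": "/tmp/benchmark"}))
# prints: /tmp/benchmark --net /tmp/predict_net.pb --init_net /tmp/init_net.pb --warmup 1 --iter 50
```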