# List of ML File Formats

This repository lists file formats used in ML/AI systems. It can be used as a resource for tool development and vulnerability research. We aim to keep this list as up-to-date and accurate as possible. If you find a missing file format or an inaccuracy, or have more details to contribute, please raise an issue or submit a pull request.

| Name | ML-specific | Framework/Organization (if applicable) | Identification Tooling | Extensions | Additional Notes |
| --- | --- | --- | --- | --- | --- |
| PyTorch v1.3 | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file containing data.pkl (1 pickle file) |
| PyTorch v0.1.1 | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: Tar file with sys_info, pickle, storages, and tensors |
| PyTorch v0.1.10 | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: Stacked pickle files |
| TorchScript v1.4 | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file with data.pkl, constants.pkl, and version (2 pickle files and a folder) |
| TorchScript v1.3 (deprecated) | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file with data.pkl and constants.pkl (2 pickle files) |
| TorchScript v1.1 (deprecated) | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file with model.json and attributes.pkl (a JSON file and a pickle file) |
| TorchScript v1.0 (deprecated) | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file with model.json |
| PyTorch model archive format [ZIP] | Yes | PyTorch | Fickling | .mar | Description: ZIP file that includes Python code files and pickle files |
| PyTorch model archive format [TAR] | Yes | PyTorch | - | .mar | Description: TAR file that includes Python code files and pickle files |
| PyTorch Package | Yes | PyTorch | - | .pt, .pth, .bin | Description: ZIP file that includes a pickled model, user files represented as a Python package, and framework files including serialized tensor data |
| ExecuTorch | Yes | PyTorch | - | .pte | Description: Modified binary flatbuffer file with optional data segments appended |
| Torch.export | Yes | PyTorch | - | .pt2 | Description: ZIP file with JSON files and Python code file |
| PyTorch Mobile | Yes | PyTorch | - | .ptl | Description: Modified binary flatbuffer file |
| Safetensors | Yes | - | PolyFile | .safetensors | Refer to our audit |
| ONNX | Yes | - | - | .onnx | Refer to LobotoMI |
| Keras native file format | Yes | Keras | - | .keras | Description: ZIP archive with 2 JSON files and 1 h5 file |
| TensorFlow Saved Models | Yes | TensorFlow | - | .pb | Description: Custom Protobuf format. Can result in arbitrary code execution. |
| TensorFlow Checkpoint | Yes | TensorFlow | - | .ckpt | Description: Custom Protobuf format. Can result in arbitrary code execution. |
| TFLite | Yes | TensorFlow | - | .tflite | Description: Modified binary flatbuffer file |
| TFJS | Yes | TensorFlow | - | - | Description: JSON file and binary file with weights. Technically not a singular file format. |
| TF1 Hub format (deprecated) | Yes | TensorFlow | - | - | Description: Custom Protobuf format. |
| Tensorizer | Yes | CoreWeave | - | - | Not uncommon especially in private production systems |
| TFRecords | Yes | TensorFlow | - | .tfrecords | Description: Wrapper around a Protocol Buffer |
| NPY | Yes | NumPy | - | .npy | Used to integrate pickle by default as well. |
| NPZ | Yes | NumPy | - | .npz | Description: ZIP file of NPY files |
| GGUF | Yes | llama.cpp/GGML | - | .gguf | - |
| GGML | Yes | llama.cpp/GGML | - | .ggml | - |
| GGMF (deprecated) | Yes | llama.cpp/GGML | - | .ggmf | - |
| GGJT (deprecated) | Yes | llama.cpp/GGML | - | .ggjt | - |
| NetCDF | Yes | - | - | .nc | - |
| PMML | Yes | - | - | - | - |
| MLeap | Yes | Spark | - | .mleap | - |
| CoreML | Yes | Apple | - | .coreml | - |
| MLFlow Format | Yes | MLFlow | - | - | - |
| MLFlow TensorSpec input format | Yes | MLFlow | - | - | - |
| SurrealML | Yes | SurrealDB | - | .surml | - |
| Llamafile | Yes | - | - | .llamafile | - |
| .prompt | Yes | HumanLoop | - | .prompt | - |
| Pickle | No | Python | PolyFile | .pkl | Refer to Fickling |
| Joblib | No | - | PolyFile | - | - |
| Nemo | Yes | NVIDIA | - | - | - |
| Riva | Yes | NVIDIA | - | - | - |
| AVRO | No | - | - | - | - |
| PARQUET | No | - | - | - | - |
| ORC | No | - | - | - | - |
| JSON | No | - | PolyFile | - | - |
| CSV | No | - | - | - | - |
| Protocol Buffers | No | - | - | - | Usually an underlying file format |
| HDF5 | No | - | - | .h5 | - |
| Caffe | Yes | Caffe | - | .caffemodel & .prototxt | Description: Protobuf-based file format |
| ArmNN Flatbuffers | Yes | ArmNN | - | - | - |
| Cambricon | Yes | - | - | - | - |
| Circle | Yes | - | - | - | - |
| ZIP | No | - | PolyFile | - | Usually an underlying file format |
| CNTK v1 (deprecated) | Yes | Microsoft Cognitive Toolkit | - | - | - |
| CNTK v2 | Yes | Microsoft Cognitive Toolkit | - | - | Description: Protobuf-based file format |
| Darknet | Yes | Hank.ai Darknet | - | - | - |
| DL4J | Yes | DL4J | - | - | Description: ZIP-based file format |
| Deep Learning Container (DLC) | Yes | Qualcomm Neural Processing SDK | - | .dlc | - |
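
Several entries in the table share a file extension (`.pt`, `.pth`, `.bin`) or wrap a generic container such as ZIP, tar, or pickle, so the extension alone does not identify the format. The following is a minimal sketch of content-based sniffing, not a replacement for Fickling or PolyFile. The function names `sniff_container` and `refine_zip` and the returned labels are hypothetical and chosen for illustration. The member-name checks are assumptions drawn from the descriptions in the table, and the magic bytes are the commonly documented ones for each container.

```python
import io
import zipfile

# Leading magic bytes for a few containers that appear in the table.
MAGIC_PREFIXES = [
    (b"PK\x03\x04", "zip"),          # ZIP local file header (non-empty archive)
    (b"\x93NUMPY", "npy"),           # NumPy .npy
    (b"GGUF", "gguf"),               # GGUF
    (b"\x89HDF\r\n\x1a\n", "hdf5"),  # HDF5 signature at offset 0
]


def sniff_container(data: bytes) -> str:
    """Return a coarse container guess for the given file bytes."""
    for prefix, label in MAGIC_PREFIXES:
        if data.startswith(prefix):
            return label
    # POSIX tar archives carry "ustar" at byte offset 257.
    # Older tar variants without this magic are not detected here.
    if data[257:262] == b"ustar":
        return "tar"
    # Pickle protocol 2+ starts with the PROTO opcode (0x80) and a version byte.
    # Protocol 0/1 pickles have no such header and are not detected here.
    if len(data) >= 2 and data[0] == 0x80 and 2 <= data[1] <= 5:
        return "pickle"
    return "unknown"


def refine_zip(data: bytes) -> str:
    """Guess a ZIP-based format from member names (assumed from the table)."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
    # Assumption: TorchScript archives hold both data.pkl and constants.pkl.
    if any(n.endswith("constants.pkl") for n in names):
        return "torchscript-zip"
    # Assumption: plain PyTorch archives hold a single data.pkl.
    if any(n.endswith("data.pkl") for n in names):
        return "pytorch-zip"
    # NPZ is described as a ZIP file of NPY files.
    if names and all(n.endswith(".npy") for n in names):
        return "npz"
    return "zip"
```

A caller would read the file as bytes, call `sniff_container`, and pass the data to `refine_zip` only when the result is `"zip"`. A match is only a hint about the container. It says nothing about whether the embedded pickle data is safe to load.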