Distributed Deep Learning Workshop

In this workshop, we will train a deep learning model in a distributed manner using Databricks. We will discuss how to leverage Delta Lake to prepare structured, semi-structured, or unstructured datasets and Petastorm to distribute those datasets efficiently across a cluster. We will also cover how to use Horovod for distributed training on both CPU- and GPU-based hardware. This example is intended to serve as a reusable template that can be tailored to your specific modeling needs.

Workshop structure

The workshop consists of a series of Databricks notebooks split into two parts.

In part 1 we look at how to leverage the parallelism of Spark to train deep learning models in a distributed manner. The notebooks cover the following:

  • Data Prep
    • How to create a Delta table from JPEG image sources using the binary file data source reader (see the sketch after this list)
  • Single-node training
  • Distributed training
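
The sketch below illustrates this flow end to end. It assumes a hypothetical directory of JPEG images at /tmp/flowers/jpg, a cluster running the Databricks ML Runtime (which ships with Petastorm and Horovod), and illustrative values for the output paths, Petastorm cache location, and number of Horovod processes; it is not the workshop's exact code.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Data prep: read raw JPEGs with the binary file data source and persist them
# as a Delta table (paths are hypothetical).
raw_images = (
    spark.read.format("binaryFile")
    .option("pathGlobFilter", "*.jpg")
    .option("recursiveFileLookup", "true")
    .load("/tmp/flowers/jpg")
)
raw_images.write.format("delta").mode("overwrite").save("/tmp/flowers/delta")

# Distributed training: expose the Delta table to the training framework via
# Petastorm, then launch the training function with HorovodRunner.
from petastorm.spark import SparkDatasetConverter, make_spark_converter
from sparkdl import HorovodRunner

spark.conf.set(
    SparkDatasetConverter.PARENT_CACHE_DIR_URL_CONF,
    "file:///dbfs/tmp/petastorm/cache",  # hypothetical cache location
)

train_df = spark.read.format("delta").load("/tmp/flowers/delta")
converter = make_spark_converter(train_df)

def train_fn():
    # Per-worker training loop goes here: initialise Horovod, open
    # converter.make_tf_dataset() (or make_torch_dataloader()) and fit the model.
    ...

# np=2 launches two training processes on the workers; np=-1 would instead run a
# single process locally on the driver for debugging.
hr = HorovodRunner(np=2)
hr.run(train_fn)
```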

In part 2 we look at how to parallelize both hyperparameter tuning and model inference. We illustrate:

  • Model tuning with Hyperopt
    • Tuning a single node DL model with Hyperopt
    • Tuning a distributed Horovod process with Hyperopt
  • Distributed model inference
    • How to package a custom Pyfunc with preprocessing/post-processing steps (see the sketch after this list)
    • Applying that logged custom Pyfunc in a single-node inference setting
    • Applying that logged custom Pyfunc in a distributed inference setting
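
As a rough sketch of the inference half of part 2, the example below packages a hypothetical Keras image classifier as a custom MLflow Pyfunc, with JPEG decoding as the preprocessing step and arg-max as the post-processing step, then applies the logged model on a single node and across the cluster with a scalar-iterator pandas UDF. The model artifact path, column names, and image size are illustrative assumptions, not the workshop's exact values.

```python
from typing import Iterator

import mlflow
import mlflow.pyfunc
import numpy as np
import pandas as pd
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import LongType

class ImageClassifierPyfunc(mlflow.pyfunc.PythonModel):
    """Wraps a trained Keras model with JPEG preprocessing and arg-max post-processing."""

    def load_context(self, context):
        import tensorflow as tf
        # The trained model is attached to the Pyfunc as the "model" artifact.
        self.model = tf.keras.models.load_model(context.artifacts["model"])

    def _preprocess(self, content: bytes) -> np.ndarray:
        import tensorflow as tf
        img = tf.io.decode_jpeg(content, channels=3)
        return (tf.image.resize(img, [224, 224]) / 255.0).numpy()

    def predict(self, context, model_input: pd.DataFrame) -> pd.Series:
        # Expect a single column of raw JPEG bytes, regardless of its name.
        batch = np.stack([self._preprocess(b) for b in model_input.iloc[:, 0]])
        probs = self.model.predict(batch)
        return pd.Series(probs.argmax(axis=1))  # post-processing: class index

# Log the Pyfunc, bundling a previously saved Keras model as an artifact
# (the artifact path is hypothetical).
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="image_classifier",
        python_model=ImageClassifierPyfunc(),
        artifacts={"model": "dbfs:/tmp/flowers/keras_model"},
    )
    model_uri = f"runs:/{mlflow.active_run().info.run_id}/image_classifier"

# Single-node inference: load the logged model and score a pandas DataFrame.
loaded_model = mlflow.pyfunc.load_model(model_uri)
# predictions = loaded_model.predict(pandas_df_with_a_jpeg_bytes_column)

# Distributed inference: apply the same logged model across the cluster with a
# scalar-iterator pandas UDF, loading the model once per executor task.
# `spark` is the SparkSession provided by the Databricks notebook.
@pandas_udf(LongType())
def predict_udf(batches: Iterator[pd.Series]) -> Iterator[pd.Series]:
    model = mlflow.pyfunc.load_model(model_uri)
    for content in batches:
        yield model.predict(pd.DataFrame({"content": content}))

scored = (
    spark.read.format("delta")
    .load("/tmp/flowers/delta")
    .withColumn("prediction", predict_udf("content"))
)
```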

Requirements

Databricks ML Runtime >= 7.3 LTS is recommended. Please use the Repos feature to clone this repository into your workspace and access the notebooks.
