Introduction to the Azure AI/ML platform

1. What is a Workspace?

The workspace is the top-level resource of the Azure Machine Learning service. It serves as a hub for building and deploying models. You can create a workspace in the Azure portal, or you can create and access it from Python in an IDE of your choice. The workspace stores the experiment objects required for each model you create. Additionally, it saves your compute targets. You can track training runs and easily retrieve logs, metrics, outputs, and scripts. This information is important for model evaluation and selection.
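Accessing a workspace from Python and tracking a run could look roughly like the sketch below. It assumes the v1 Python SDK (`azureml-core`) and a `config.json` downloaded from the Azure portal into the working directory. The function name, the experiment name, and the logged value are placeholders for illustration only.

```python
def log_demo_run(experiment_name="demo-experiment"):
    """Connect to a workspace, record one metric, and return the logged metrics."""
    # Imported inside the function so this file still loads when the
    # azureml-core package (an assumption of this sketch) is not installed.
    from azureml.core import Experiment, Workspace

    # Reads subscription, resource group, and workspace name from config.json.
    ws = Workspace.from_config()

    # An experiment groups related training runs inside the workspace.
    experiment = Experiment(workspace=ws, name=experiment_name)

    # Start an interactive run, log a placeholder metric, and close the run.
    run = experiment.start_logging()
    run.log("accuracy", 0.91)  # placeholder value, not a real result
    run.complete()

    # The metrics are stored with the run and can be retrieved later.
    return run.get_metrics()
```

Calling `log_demo_run()` needs valid Azure credentials, so it is left as a function definition here rather than executed.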

2. What is a Datastore?

A datastore is an abstraction over an Azure Storage account. Each workspace has a registered default datastore that you can use right away, but you can also register other Azure Blob containers or Azure File shares as datastores.
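As a rough sketch, again assuming the v1 Python SDK (`azureml-core`) and a local `config.json`, the default datastore can be fetched and an extra blob container registered as follows. The function name and the default datastore name are hypothetical, and the container and account values must come from your own storage account.

```python
def get_datastores(container_name, account_name, account_key,
                   datastore_name="extra_blob_store"):
    """Return the workspace's default datastore and a newly registered one."""
    # Imported lazily so the file loads even without azureml-core installed.
    from azureml.core import Datastore, Workspace

    ws = Workspace.from_config()

    # Every workspace comes with a default datastore that is ready to use.
    default_store = ws.get_default_datastore()

    # Register an additional Azure Blob container under a name of your choice.
    extra_store = Datastore.register_azure_blob_container(
        workspace=ws,
        datastore_name=datastore_name,
        container_name=container_name,
        account_name=account_name,
        account_key=account_key,
    )
    return default_store, extra_store
```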

3. What is a Pipeline?

A pipeline is a tool for creating and managing workflows in a data science process (data manipulation, model training and testing, development). Each step of the process can run unattended on a different compute target, which makes it easier to allocate resources.
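The idea of steps bound to compute targets can be illustrated with a small plain-Python sketch. This is not the Azure ML SDK: the `Step` class, the `run_pipeline` function, and the compute target names are made up for illustration, and everything runs locally.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Step:
    name: str                  # what the step does
    compute_target: str        # hypothetical label such as "cpu-cluster"
    run: Callable[[Any], Any]  # the work itself: takes input, returns output


def run_pipeline(steps: List[Step], data: Any) -> Tuple[Any, List[Tuple[str, str]]]:
    """Run the steps in order, feeding each step's output into the next."""
    log = []
    for step in steps:
        data = step.run(data)
        log.append((step.name, step.compute_target))
    return data, log


steps = [
    # Data manipulation: drop missing values.
    Step("prepare", "cpu-cluster", lambda rows: [r for r in rows if r is not None]),
    # "Training": here just the mean, standing in for a real model fit.
    Step("train", "gpu-cluster", lambda rows: sum(rows) / len(rows)),
    # "Testing": round the result for reporting.
    Step("test", "cpu-cluster", lambda mean: round(mean, 2)),
]

result, log = run_pipeline(steps, [1.0, None, 2.0, 4.0])
print(result)  # 2.33
print(log)
```

Because each step carries its own compute target label, a heavier step such as training can be given a different target than the lighter preparation and testing steps.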

Introduction to Azure Machine Learning Service

Azure Machine Learning Services

  • Model Management
  • Model Training
  • Model Selection
  • Hyperparameter Tuning
  • Feature Selection
  • Model Evaluation

Selecting a Development Environment

Popular IDEs

  • Jupyter Notebooks

    • Open Source
    • Originally written for Python and called IPython
    • Realtime execution and rendering
    • Supports Spark APIs (PySpark, SparkR)
    • Can share via GitHub, JupyterHub and Azure
    • Built-in viewer support by GitHub
    • Over 75 languages supported
    • The name emphasizes multi-language support
      • JUlia-PYthon-te-R

Demo: Setting up Conda and Jupyter Notebook

Anaconda is an integrated Python data science development environment that can be downloaded from "Anaconda.com". It is available for all three major operating systems: macOS, Linux, and Windows. The beauty of Anaconda is that, once it is downloaded and installed, it comes with built-in development environments such as Jupyter and Spyder, and it also brings in machine learning libraries such as NumPy, SciPy, pandas, scikit-learn, Dask, Numba, TensorFlow, and Theano, so you don't have to spend additional effort installing these libraries. Running the downloaded installer also installs an application called Anaconda Navigator on your system.
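Which of these libraries are actually present can depend on the installer and version, so a quick check from a notebook cell can help. The sketch below uses only the standard library. The function name `installed_libraries` is made up for this note, and the list holds the import names of the libraries mentioned above (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names of the libraries mentioned above.
LIBRARIES = ["numpy", "scipy", "pandas", "sklearn", "dask", "numba",
             "tensorflow", "theano"]


def installed_libraries(names):
    """Map each import name to True if it can be found in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in installed_libraries(LIBRARIES).items():
        print(f"{name:12s} {'installed' if found else 'missing'}")
```

Anything reported as missing can usually be added with `conda install` followed by the package name.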



1. Jupyter Notebook

-  Runs on the local machine
-  Consumes local resources
-  Computation power depends on the machine

2. Azure Notebook

-  Similar to Jupyter Notebook (in terms of interface)
-  Notebooks can be created and run using free compute resources provided by Azure

3. Azure Machine Learning Studio Classic

-  Visual Drag and Drop ML Training and Development
-  Complete Machine Learning Environment
-  Ideal for learning and beginner data scientists

Workflow

-  Import data
-  Explore and create summaries
-  Pre-process and clean the data
-  Algorithm selection
-  Model training and tuning
-  Deploy and consume
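The workflow steps above can be walked through end to end with a tiny plain-Python sketch. The dataset (hours studied against exam score) is a made-up toy example, the two candidate "algorithms" are deliberately simple, and the error is measured on the training data only to keep the example short.

```python
import csv
import io
import statistics

# Hypothetical toy data: hours studied vs. exam score, one score missing.
RAW_CSV = "hours,score\n1,52\n2,55\n3,\n4,61\n5,64\n"


def import_data(text):
    """Import data (here from an in-memory CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))


def summarize(rows, column):
    """Explore one column: count, number of blanks, and mean."""
    values = [float(r[column]) for r in rows if r[column] != ""]
    return {"count": len(values), "missing": len(rows) - len(values),
            "mean": statistics.mean(values)}


def clean(rows):
    """Fill missing scores with the column mean and convert to floats."""
    mean_score = summarize(rows, "score")["mean"]
    return [(float(r["hours"]),
             float(r["score"]) if r["score"] != "" else mean_score)
            for r in rows]


def fit_mean(points):
    """Candidate A: always predict the mean score."""
    mean_y = statistics.mean(y for _, y in points)
    return lambda x: mean_y


def fit_line(points):
    """Candidate B: least-squares straight line."""
    mean_x = statistics.mean(x for x, _ in points)
    mean_y = statistics.mean(y for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mse(model, points):
    """Mean squared error of a model on the given points."""
    return statistics.mean((model(x) - y) ** 2 for x, y in points)


def select_and_train(points):
    """Fit each candidate and keep the one with the lowest error."""
    candidates = {"mean": fit_mean(points), "line": fit_line(points)}
    best = min(candidates, key=lambda name: mse(candidates[name], points))
    return best, candidates[best]


rows = import_data(RAW_CSV)
summary = summarize(rows, "score")
points = clean(rows)
best_name, model = select_and_train(points)
prediction = model(6.0)  # consume the trained model
print(summary, best_name, prediction)
```

In a real project the selection step would compare candidates on held-out data rather than on the data used for fitting.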

4. Automated Machine Learning