The workspace is the top-level resource of Azure Machine Learning service. It serves as a hub for building and deploying models. You can create a workspace in the Azure portal, or you can create and access it from Python in an IDE of your choice. The workspace stores the experiment objects that are required for each model you create. Additionally, it saves your compute targets. You can track training runs and retrieve their logs, metrics, outputs, and scripts. This information is important for model evaluation and selection.
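As a rough illustration of creating a workspace from Python and reading back run metrics, here is a minimal sketch. It assumes the Azure ML Python SDK v1 (`azureml-core`) is installed and that you are logged in to Azure. The helper names `get_workspace` and `summarize_runs` are hypothetical, and every argument value is a placeholder you supply yourself.

```python
def get_workspace(name, subscription_id, resource_group, location):
    """Create the workspace, or return it if it already exists."""
    # Imported lazily so this module loads even without the SDK installed.
    from azureml.core import Workspace

    return Workspace.create(
        name=name,
        subscription_id=subscription_id,
        resource_group=resource_group,
        location=location,
        create_resource_group=True,  # create the resource group if missing
        exist_ok=True,               # do not fail if the workspace exists
    )


def summarize_runs(workspace, experiment_name):
    """Map each run id of an experiment to the metrics it logged."""
    from azureml.core import Experiment

    experiment = Experiment(workspace=workspace, name=experiment_name)
    return {run.id: run.get_metrics() for run in experiment.get_runs()}
```

The metrics returned per run are the kind of information you would compare when evaluating and selecting a model.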
A datastore is an abstraction over an Azure Storage account. Each workspace has a registered default datastore that you can use right away, but you can also register other Azure Blob or File storage containers as datastores.
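The two options described above (using the default datastore, or registering another container) might look like this with the Azure ML Python SDK v1. This is a sketch under that assumption. The function names are hypothetical, and the container and account values are placeholders.

```python
def default_datastore(workspace):
    """Return the datastore that is registered with every workspace."""
    return workspace.get_default_datastore()


def register_blob_datastore(workspace, datastore_name, container_name,
                            account_name, account_key):
    """Register an existing Azure Blob container as an additional datastore."""
    # Imported lazily so this module loads even without the SDK installed.
    from azureml.core import Datastore

    return Datastore.register_azure_blob_container(
        workspace=workspace,
        datastore_name=datastore_name,  # name used to look it up later
        container_name=container_name,  # existing blob container
        account_name=account_name,      # storage account that owns it
        account_key=account_key,        # keep this out of source control
    )
```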
A pipeline is a tool to create and manage workflows during a data science process (data manipulation, model training and testing, development). Each step of the process can run unattended on a different compute target, which makes it easier to allocate resources.
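Assuming the workflow tool described above corresponds to the SDK v1 `Pipeline` class (from the `azureml-pipeline` packages), a two-step workflow where each step runs on its own compute target could be sketched as follows. The script names `prep.py` and `train.py`, the step names, and the function name are hypothetical.

```python
def build_two_step_pipeline(workspace, source_directory,
                            prep_target, train_target):
    """Build a pipeline whose steps run on separate compute targets."""
    # Imported lazily so this module loads even without the SDK installed.
    from azureml.pipeline.core import Pipeline
    from azureml.pipeline.steps import PythonScriptStep

    # Data manipulation step, e.g. on a small CPU cluster.
    prep = PythonScriptStep(
        name="prepare_data",
        script_name="prep.py",
        source_directory=source_directory,
        compute_target=prep_target,
    )
    # Training step, e.g. on a larger or GPU-backed cluster.
    train = PythonScriptStep(
        name="train_model",
        script_name="train.py",
        source_directory=source_directory,
        compute_target=train_target,
    )
    train.run_after(prep)  # enforce ordering between the two steps

    return Pipeline(workspace=workspace, steps=[prep, train])
```

Submitting the returned pipeline through an experiment (`Experiment(workspace, name).submit(pipeline)`) starts the unattended run.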
- Model Management
- Model Training
- Model Selection
- Hyperparameter Tuning
- Feature Selection
- Model Evaluation
Popular IDEs
Jupyter Notebooks
- Open Source
- Originally written for Python and called IPython
- Realtime execution and rendering
- Many languages supported
- Supports Spark API (pySpark, SparkR)
- Can share via GitHub, JupyterHub and Azure
- Built-in viewer support by GitHub
- Over 75 languages supported
- The name emphasizes multi-language support
- JUlia-PYthon-te-R
Anaconda is an integrated Python data science development environment that can be downloaded from "Anaconda.com". It is available for macOS, Linux, and Windows. The beauty of Anaconda is that, once downloaded and installed, it comes with development environments like Jupyter and Spyder built in, and it also brings in machine learning libraries like NumPy, SciPy, pandas, Scikit-learn, Dask, Numba, TensorFlow, and Theano, so you don't have to spend any additional effort installing them. When you download and run the installer, it also installs an application called Anaconda Navigator on your system.
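Which of these libraries are actually present can vary with the Anaconda version and installer, so it may be worth checking your own environment. The sketch below uses only the standard library. The `LIBRARIES` mapping and the `check_libraries` helper are illustrative names, not part of Anaconda.

```python
from importlib.util import find_spec

# Display name -> importable module name for the libraries listed above.
LIBRARIES = {
    "NumPy": "numpy",
    "SciPy": "scipy",
    "pandas": "pandas",
    "Scikit-learn": "sklearn",
    "Dask": "dask",
    "Numba": "numba",
    "TensorFlow": "tensorflow",
    "Theano": "theano",
}


def check_libraries(modules=LIBRARIES):
    """Report which modules can be found, without importing them."""
    return {label: find_spec(module) is not None
            for label, module in modules.items()}


if __name__ == "__main__":
    for label, available in check_libraries().items():
        print(f"{label:12s} {'installed' if available else 'missing'}")
```

Anything reported as missing can be added from Anaconda Navigator or with `conda install`.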
workflow
automl
- On local machine
- Consumes resources
- Computation power depends on the machine
- Similar to Jupyter Notebook (in terms of interface)
- We can create and run notebooks using free resources provided by Azure (FREE COMPUTE)
- Visual Drag and Drop ML Training and Development
- Complete Machine Learning Environment
- Ideal for learning and beginner data scientists
import data → explore and create summaries → pre-process and clean the data → algorithm selection → model training and tuning → deploy and consume
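The steps above can be walked through in a few lines of code. The following is a minimal local sketch, assuming scikit-learn is installed. It uses the bundled Iris dataset and two candidate algorithms purely for illustration, and a plain `predict` call stands in for a deployed service. The function name `run_workflow` is hypothetical.

```python
def run_workflow(random_state=0):
    """Walk through the workflow steps on a small built-in dataset."""
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.tree import DecisionTreeClassifier

    # 1. Import data
    X, y = load_iris(return_X_y=True)

    # 2. Explore and create summaries
    summary = {"rows": X.shape[0], "columns": X.shape[1],
               "feature_means": X.mean(axis=0).round(2).tolist()}

    # 3. Pre-process: hold out a test set; scaling happens in the pipeline
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state)
    pipeline = Pipeline([("scale", StandardScaler()),
                         ("model", LogisticRegression(max_iter=1000))])

    # 4 + 5. Algorithm selection and tuning via cross-validated grid search
    param_grid = [
        {"model": [LogisticRegression(max_iter=1000)],
         "model__C": [0.1, 1.0, 10.0]},
        {"model": [DecisionTreeClassifier(random_state=random_state)],
         "model__max_depth": [2, 4, None]},
    ]
    search = GridSearchCV(pipeline, param_grid, cv=5)
    search.fit(X_train, y_train)

    # 6. "Deploy and consume": here just a local prediction on one sample
    prediction = search.predict(X_test[:1])

    chosen = type(search.best_estimator_.named_steps["model"]).__name__
    return {"summary": summary,
            "chosen_algorithm": chosen,
            "test_accuracy": search.score(X_test, y_test),
            "sample_prediction": int(prediction[0])}


if __name__ == "__main__":
    print(run_workflow())
```

In a real project the last step would publish the fitted model as a service rather than call it in-process.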