Azure Machine Learning Workshop

Welcome to the Azure Machine Learning Workshop! In this hands-on session, you’ll build and deploy machine learning models for practical geoscience scenarios using Azure Machine Learning's Designer, AutoML, and Notebooks.

This workshop provides step-by-step guidance for every exercise. Follow the instructions in order to get the most out of it.

Estimated Time to Complete: 1 to 2 hours

Use Case: Predicting Geothermal Characteristics in Colombia

Objectives

The goal of this project is to predict geothermal characteristics in Colombia. Specifically, you’ll train machine learning models to predict the Apparent Geothermal Gradient (°C/Km), a key quantity in geothermal exploration.

Methodology

This project combines geospatial data, geophysical information, and geothermal measurements. The data, found in normalized_data_minmax.csv, includes details such as well depths, temperatures, geological features, and proximity to volcanic structures.
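
For orientation, an apparent geothermal gradient is commonly computed as the difference between a temperature measured at depth and the surface temperature, divided by the depth. The sketch below assumes that definition. The function and parameter names are illustrative and do not come from the repository, and the values in the dataset may have been derived differently.

```python
def apparent_geothermal_gradient(bottom_temp_c, surface_temp_c, depth_m):
    """Return the apparent geothermal gradient in degrees Celsius per kilometer.

    Assumed definition: (temperature at depth - surface temperature) / depth.
    """
    if depth_m <= 0:
        raise ValueError("depth_m must be positive")
    depth_km = depth_m / 1000.0  # convert meters to kilometers
    return (bottom_temp_c - surface_temp_c) / depth_km


# Made-up example: 95 °C measured at 2000 m depth, 25 °C at the surface.
gradient = apparent_geothermal_gradient(95.0, 25.0, 2000.0)
print(gradient)
```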

Dataset Overview

The dataset used in this project is normalized: every feature has been scaled to a similar range, which helps many machine learning models train effectively. Each column in the dataset is explained below:

Latitude: Specifies the north-south position of a point on the Earth's surface in degrees.
Longitude: Specifies the east-west position of a point on the Earth's surface in degrees.
Elevation (m): The height of a point above sea level, measured in meters.
Surface Temperature (°C): The temperature at the Earth's surface at a specific location, measured in degrees Celsius.
Apparent Geothermal Gradient (°C/Km): The rate of temperature increase with depth beneath the Earth's surface, expressed in degrees Celsius per kilometer.
Moho Depth (m): The depth to the Mohorovičić discontinuity, the boundary between the Earth's crust and the mantle, measured in meters.
Magnetic Anomaly (nT): The deviation of the Earth's magnetic field from the expected value, measured in nanoteslas (nT), indicating variations in the magnetic properties of underlying rocks.
Fault: Indicates the presence (1) or absence (0) of a fault at the location.
Strike-slip Fault: A fault type where the motion is predominantly horizontal along the fault line.
Reverse or Thrust Fault: A fault where one block moves upwards relative to another, typically associated with compressional forces.
Lineament: Linear features on the Earth's surface representing underlying geological structures such as faults or fractures.
Right-lateral Fault: A type of strike-slip fault in which, viewed from either side, the block on the opposite side of the fault moves to the right.
Normal Fault: A fault where one block moves downward relative to another, usually associated with extensional forces.
Active Fault: A fault that has recently been active and may be prone to future earthquakes.
Curie Depth (Km): The depth at which magnetic minerals lose their permanent magnetism due to high temperatures, measured in kilometers.
Vertical Gravity Gradient (E): The rate of change of the gravitational field with respect to height, measured in Eötvös units (E).
Free Air Anomaly (mGal): The difference between measured gravity at a location and theoretical gravity, corrected for elevation, measured in milligals (mGal).
Bouguer Anomaly (mGal): The difference between measured gravity and theoretical gravity after correcting for elevation and the mass of rocks above sea level, measured in milligals (mGal).
Nearest Basement: The depth to the basement rock beneath sedimentary deposits.
Nearest Volcano: The distance to the nearest volcano from the given location.
Volcanic Domain: Classification of the area based on its volcanic activity or history.
Volcanic Weight: A weighted score representing volcanic activity in the area, often used in risk assessment models.
Gradient Weight: A weighted value representing the influence of the geothermal gradient in predictive models.
Sample Weight: The weight assigned to each sample in a dataset, used in machine learning models to give varying importance to samples.
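
One common way to bring features into a similar range is min-max scaling, which maps each value to [0, 1] via (x - min) / (max - min). The sketch below is a minimal illustration under the assumption that the dataset was scaled this way. The function name and sample values are hypothetical, and the repository's actual preprocessing may differ.

```python
def min_max_scale(values):
    """Scale a list of numbers to the range [0, 1].

    A constant column has no spread, so it is mapped to 0.0 throughout.
    """
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        return [0.0 for _ in values]
    return [(value - lowest) / span for value in values]


elevations_m = [120.0, 860.0, 2600.0, 1500.0]  # made-up sample values
scaled = min_max_scale(elevations_m)
print(scaled)
```

Scaling statistics (the minimum and maximum) are normally taken from the training data only and then reused for any new data, so that predictions are made on the same scale the model was trained on.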

Step-by-Step Guide

1. Clone the Repository

Start by cloning the repository to your local machine:

git clone https://github.com/GitHub-Nawatech-Lab/azureml-exercise.git

2. Upload to Azure Machine Learning

Sign in to Azure Portal: Access the Azure Portal and log in with your credentials.
Create an Azure Machine Learning Workspace: If you don’t have one already, follow the Azure Machine Learning documentation to set up a new workspace.
Upload the Repository: In Azure Machine Learning Studio, open the Notebooks section of your workspace and upload the cloned repository folder.

3. Upload the Dataset to Data Assets

Navigate to the Datasets section within your Azure Machine Learning workspace.
Click on + Create Dataset and select From local files.
Upload the normalized_data_minmax.csv file from the cloned repository.
Complete the dataset registration by providing a name and description and confirming that the correct file format is selected.
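
Before uploading, it can help to confirm that the CSV has the expected header and that the target column is present. The sketch below uses only the Python standard library. The helper name summarize_csv is hypothetical, and the exact header text of the target column is an assumption that should be checked against the real file.

```python
import csv
import io

# Assumed header text for the target column; verify it against the actual CSV.
TARGET = "Apparent Geothermal Gradient (°C/Km)"


def summarize_csv(handle, target):
    """Return (column names, number of data rows, whether target is a column)."""
    reader = csv.reader(handle)
    header = next(reader, [])
    row_count = sum(1 for row in reader if row)  # skip blank lines
    return header, row_count, target in header


# Tiny in-memory stand-in for the real file, with made-up values.
sample = io.StringIO(
    "Latitude,Longitude,Apparent Geothermal Gradient (°C/Km)\n"
    "0.12,0.55,0.31\n"
    "0.40,0.73,0.47\n"
)
columns, rows, has_target = summarize_csv(sample, TARGET)
print(columns, rows, has_target)
```

To check the real file, open it with open(path, newline="", encoding="utf-8") and pass the resulting handle to the same function.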

4. Using Designer

Create a New Pipeline: Go to the Designer section in Azure Machine Learning Studio and initiate a new pipeline.
Drag and Drop Modules: Use the drag-and-drop interface to add data input, data transformation, and machine learning modules.
Configure Modules: Set up each module for this project, for example by selecting Apparent Geothermal Gradient (°C/Km) as the label column of the training module.
Run the Pipeline: After configuring the pipeline, execute it to train and evaluate your model.
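
A training pipeline usually holds back part of the data for evaluation. The sketch below shows the idea behind such a split in plain Python; it is not the Designer's implementation. The function name, the 70/30 ratio, and the seed are illustrative assumptions.

```python
import random


def split_rows(rows, train_fraction=0.7, seed=42):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(10))  # stand-in for ten dataset rows
train, test = split_rows(rows)
print(len(train), len(test))
```

Fixing the seed makes the split repeatable, so results from different model configurations are compared on the same held-out rows.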

5. Using AutoML

Create a New AutoML Experiment: In Azure Machine Learning Studio, navigate to the Automated ML section and start a new experiment.
Select Dataset: Choose the dataset you uploaded to Data Assets.
Configure Experiment: Select regression as the task type, set the target column to Apparent Geothermal Gradient (°C/Km), and adjust other settings as necessary.
Run the Experiment: Launch the AutoML experiment to automatically train and evaluate multiple models.

6. Using Notebooks

Install Necessary Libraries: Open the terminal in Azure Machine Learning Studio, change to the uploaded repository folder, and run the following command:

pip install -r requirements.txt

Open the Notebook: In Azure Machine Learning Studio, open the Model_V4.ipynb notebook.
Run Cells: Execute each cell to preprocess data, train the model, and evaluate the results.
Analyze Results: Review the outputs and visualizations to assess the model's performance.
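
When assessing a regression model, common metrics are the mean absolute error (MAE), the root mean squared error (RMSE), and the coefficient of determination (R²). The sketch below computes them in plain Python as a reference for reading the notebook's outputs. The function name and the sample values are hypothetical, and the notebook may report different metrics.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R² for paired observed and predicted values."""
    n = len(y_true)
    errors = [pred - true for true, pred in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((true - mean_true) ** 2 for true in y_true)
    # R² is undefined when the observed values are all identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}


observed = [0.20, 0.35, 0.50, 0.65]   # made-up normalized gradients
predicted = [0.25, 0.30, 0.55, 0.60]  # made-up model outputs
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Because the target in this dataset is normalized, errors computed this way are on the normalized scale rather than in °C/km.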
