Semantic Segmentation Using U-Net for Architectural Data

Problem Statement

Semantic segmentation involves classifying each pixel in an image into a specific category. In the context of satellite imagery, this task is crucial for understanding and managing large-scale environments, including urban planning, resource management, disaster response, and environmental monitoring. Our goal is to create a model that can accurately segment land-cover classes such as water, land, roads, buildings, and vegetation from satellite images.

Real-Life Problem: Imagine a situation where a city is planning to expand its infrastructure. The urban planners need precise information on available land, existing roads, water bodies, and buildings to make informed decisions. Traditionally, this would require manual inspection of maps and satellite images, a time-consuming and error-prone process. Our solution aims to automate this process by providing pixel-level annotations of satellite images, thus offering a faster, more accurate, and scalable solution.

Exploratory Data Analysis (EDA)

Understanding the dataset is the first step toward building a successful model. The dataset used here consists of high-resolution satellite images, each annotated with six classes: water, land, road, building, vegetation, and unlabeled. Here's a brief overview of the dataset:

Class Distribution: The distribution of pixel counts across classes shows that most images are dominated by vegetation and land, while classes like roads and buildings are less frequent. This class imbalance is an important factor that we need to address during model training.
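
The class-distribution check described above can be sketched as follows. This is a minimal illustration, not the project's actual EDA code: the integer class ids, the toy masks, and the helper names `class_pixel_fractions` and `inverse_frequency_weights` are assumptions made for the example. Inverse-frequency weighting is one common way to counter imbalance; the source does not say which method the project uses.

```python
import numpy as np

# Class names follow the six classes listed above; the id order is an assumption.
CLASS_NAMES = ["water", "land", "road", "building", "vegetation", "unlabeled"]


def class_pixel_fractions(masks, num_classes=len(CLASS_NAMES)):
    """Return the fraction of all pixels that belongs to each class id."""
    counts = np.zeros(num_classes, dtype=np.int64)
    for mask in masks:
        counts += np.bincount(mask.ravel(), minlength=num_classes)[:num_classes]
    return counts / counts.sum()


def inverse_frequency_weights(fractions):
    """Weight each present class by 1 / frequency, scaled to average 1.

    Classes that never occur get weight 0 instead of an infinite weight.
    """
    present = fractions > 0
    weights = np.zeros_like(fractions)
    weights[present] = 1.0 / fractions[present]
    weights[present] /= weights[present].mean()
    return weights


# Toy 2x3 label maps standing in for real masks.
masks = [np.array([[4, 4, 1], [4, 2, 1]]), np.array([[4, 4, 4], [0, 1, 3]])]
fractions = class_pixel_fractions(masks)
weights = inverse_frequency_weights(fractions)
for name, frac, weight in zip(CLASS_NAMES, fractions, weights):
    print(f"{name:<10} fraction={frac:.3f} weight={weight:.3f}")
```

Rare classes such as roads and buildings end up with larger weights than vegetation, which could then be passed to a weighted loss during training.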

Sample Images and Masks: Below are some sample images and their corresponding segmentation masks. The images have been resized to a uniform size for consistency during model training.

Augmentation Visualization: Given the limited diversity in the dataset, we applied various data augmentation techniques such as rotation, width and height shifts, shear, and zoom. The images below show examples of augmented images and their corresponding masks.
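
The key constraint when augmenting segmentation data is that the image and its mask must receive exactly the same geometric transform. The sketch below illustrates that idea with NumPy only. It uses flips and 90-degree rotations rather than the shifts, shear, and zoom mentioned above, and the function name `augment_pair` is hypothetical.

```python
import numpy as np


def augment_pair(image, mask, rng):
    """Apply one random rotation/flip identically to an image and its mask."""
    k = int(rng.integers(0, 4))  # number of 90-degree rotations
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    if rng.random() < 0.5:  # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:  # vertical flip
        image, mask = image[::-1], mask[::-1]
    # Return contiguous copies so downstream code does not see negative strides.
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)


rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))           # toy RGB image
mask = rng.integers(0, 6, size=(4, 4))  # toy label map with six class ids
aug_image, aug_mask = augment_pair(image, mask, rng)
print(aug_image.shape, aug_mask.shape)
```

Because the mask only ever undergoes flips and right-angle rotations here, no interpolation is involved and the label ids stay intact.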

Co-occurrence Analysis: We performed a co-occurrence analysis to understand which classes often appear together in the same image. This helps in understanding the spatial relationships between classes, which can be useful for the model to learn complex patterns.
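
One simple way to compute such a co-occurrence table is to count, for every pair of classes, the number of images in which both appear. The following is a sketch under that assumption. The source does not specify how the analysis was implemented, and `cooccurrence_matrix` is a hypothetical helper.

```python
import numpy as np


def cooccurrence_matrix(masks, num_classes=6):
    """Entry [i, j] counts the images in which classes i and j both appear."""
    matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    for mask in masks:
        present = np.zeros(num_classes, dtype=np.int64)
        present[np.unique(mask)] = 1  # assumes ids lie in [0, num_classes)
        matrix += np.outer(present, present)
    return matrix


# Three toy 2x2 label maps.
masks = [
    np.array([[0, 0], [1, 4]]),
    np.array([[4, 4], [1, 1]]),
    np.array([[0, 0], [0, 0]]),
]
cooc = cooccurrence_matrix(masks)
print(cooc)
```

The diagonal holds the number of images containing each class, so dividing a row by its diagonal entry gives the conditional frequency of the other classes.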

Object Size and Aspect Ratio: Analyzing object sizes and aspect ratios helps us understand the scale of different objects in the images. This analysis is crucial for deciding the receptive field of the network layers and improving the model's performance in recognizing small or large objects.
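
As a rough illustration of this analysis, the sketch below finds the 4-connected regions of one class in a mask and reports each region's pixel area and bounding-box aspect ratio (width divided by height). The connectivity choice and the helper name `object_stats` are assumptions. The original analysis may have used a library routine instead.

```python
from collections import deque

import numpy as np


def object_stats(mask, class_id):
    """Return (area, aspect_ratio) for each 4-connected region of class_id."""
    target = mask == class_id
    seen = np.zeros(target.shape, dtype=bool)
    height, width = target.shape
    stats = []
    for r0, c0 in zip(*np.nonzero(target)):
        if seen[r0, c0]:
            continue
        # Breadth-first flood fill starting from an unvisited pixel.
        queue = deque([(r0, c0)])
        seen[r0, c0] = True
        rows, cols = [], []
        while queue:
            r, c = queue.popleft()
            rows.append(r)
            cols.append(c)
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                inside = 0 <= nr < height and 0 <= nc < width
                if inside and target[nr, nc] and not seen[nr, nc]:
                    seen[nr, nc] = True
                    queue.append((nr, nc))
        box_h = max(rows) - min(rows) + 1
        box_w = max(cols) - min(cols) + 1
        stats.append((len(rows), box_w / box_h))
    return stats


mask = np.array([
    [3, 3, 0, 0, 0],
    [3, 3, 0, 2, 0],
    [0, 0, 0, 2, 0],
    [0, 3, 0, 2, 0],
])
print(object_stats(mask, 3))  # regions of class id 3
print(object_stats(mask, 2))  # regions of class id 2
```

Aggregating these statistics over the dataset shows how small the smallest objects are, which helps when choosing the network depth and input resolution.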

Architecture Overview:

+---------------------------------------------+
|         Data Loading and Preprocessing      |
|  - Load images and corresponding masks      |
|  - Normalize and resize images and masks    |
|  - Optional data augmentation               |
+---------------------------------------------+
                      |
                      V
                Input Image
                      |
                      V
+---------------------------------------------+
|               U-Net Encoder                 |
|                                             |
|  [Block 1]                                  |
|   - Conv2D + ReLU                           |
|   - Conv2D + ReLU                           |
|   - MaxPooling2D                            |
|                                             |
|  [Block 2]                                  |
|   - Conv2D + ReLU                           |
|   - Conv2D + ReLU                           |
|   - MaxPooling2D                            |
|                                             |
|  ... (Repeated blocks)                      |
|                                             |
|  [Block N]                                  |
|   - Conv2D + ReLU                           |
|   - Conv2D + ReLU                           |
|   - MaxPooling2D                            |
+---------------------------------------------+
                      |
                      V
             Bottleneck Layer
             - Conv2D + ReLU
             - Conv2D + ReLU
                      |
                      V
+---------------------------------------------+
|               U-Net Decoder                 |
|                                             |
|  [Up Block N]                               |
|   - UpSampling2D                            |
|   - Concatenate with Encoder Block N output |
|   - Conv2D + ReLU                           |
|   - Conv2D + ReLU                           |
|                                             |
|  ... (Repeated blocks)                      |
|                                             |
|  [Up Block 1]                               |
|   - UpSampling2D                            |
|   - Concatenate with Encoder Block 1 output |
|   - Conv2D + ReLU                           |
|   - Conv2D + ReLU                           |
+---------------------------------------------+
                      |
                      V
              Output Layer
            - Conv2D (1x1 kernel)
            - Softmax Activation
                      |
                      V
              Segmentation Map
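
The diagram above maps fairly directly onto Keras layers. The following is a minimal sketch of such a model, not the code in src/: the function names, the filter counts, the default depth, the input size, and the use of "same" padding are all assumptions chosen for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers


def conv_block(x, filters):
    """Two 3x3 Conv2D + ReLU layers, as in each block of the diagram."""
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)


def build_unet(input_shape=(256, 256, 3), num_classes=6, depth=4, base_filters=16):
    """Build a U-Net; height and width must be divisible by 2 ** depth."""
    inputs = tf.keras.Input(shape=input_shape)
    x, skips = inputs, []
    # Encoder: keep each block's pre-pooling output for the skip connection.
    for level in range(depth):
        x = conv_block(x, base_filters * 2 ** level)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)
    # Bottleneck.
    x = conv_block(x, base_filters * 2 ** depth)
    # Decoder: upsample, concatenate the matching encoder output, convolve.
    for level in reversed(range(depth)):
        x = layers.UpSampling2D(2)(x)
        x = layers.Concatenate()([x, skips[level]])
        x = conv_block(x, base_filters * 2 ** level)
    # 1x1 convolution with softmax yields per-pixel class probabilities.
    outputs = layers.Conv2D(num_classes, 1, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)


model = build_unet(input_shape=(64, 64, 3), depth=2)
print(model.output_shape)
```

The skip connections are taken before each pooling step so that the decoder can concatenate feature maps of matching spatial size.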

Techniques Explained

Data Augmentation: Techniques like rotations, zooms, and flips are applied to increase data diversity and improve model robustness.
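Advanced U-Net: Adds dropout to reduce overfitting and batch normalization to stabilize training and accelerate convergence.
Early Stopping and Learning Rate Scheduling: Early stopping halts training once validation performance stops improving, and learning rate scheduling adjusts the learning rate during training; together they help the model converge without overfitting.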
Co-occurrence and Object Analysis: Understanding object distributions informs better model design and parameter tuning.
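
In Keras, early stopping and learning rate scheduling are usually configured as callbacks. The snippet below is a hedged sketch of one possible setup: the monitored metric, the patience values, and the reduction factor are assumptions, not the settings used in src/train.py.

```python
import tensorflow as tf

callbacks = [
    # Stop when validation loss has not improved for 8 epochs (assumed value)
    # and roll back to the best weights seen so far.
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=8, restore_best_weights=True
    ),
    # Halve the learning rate after 3 epochs without improvement (assumed values).
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.5, patience=3, min_lr=1e-6
    ),
]

# Typical usage, with train_data and val_data as hypothetical datasets:
# model.fit(train_data, validation_data=val_data, epochs=30, callbacks=callbacks)
```

Keeping the scheduler's patience shorter than the early-stopping patience gives the lower learning rate a chance to help before training is halted.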

Streamlit Application

A user-friendly Streamlit app allows real-time inference on uploaded images. The model generates a segmented mask, providing a visual representation of various land types.
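
To display a prediction, the per-pixel class probabilities have to be turned into a color image. A minimal sketch of that step is shown below. The RGB palette and the function name `probs_to_color_mask` are hypothetical and are not taken from streamlit_app.py.

```python
import numpy as np

# Hypothetical RGB palette, one color per class id.
PALETTE = np.array(
    [
        [0, 0, 255],      # water
        [210, 180, 140],  # land
        [128, 128, 128],  # road
        [255, 0, 0],      # building
        [0, 160, 0],      # vegetation
        [0, 0, 0],        # unlabeled
    ],
    dtype=np.uint8,
)


def probs_to_color_mask(probs):
    """Convert an (H, W, num_classes) softmax output to an (H, W, 3) RGB image."""
    class_ids = np.argmax(probs, axis=-1)  # most likely class per pixel
    return PALETTE[class_ids]              # look up each pixel's color


# Toy 2x2 prediction: water, vegetation, and two building pixels.
probs = np.zeros((2, 2, 6))
probs[0, 0, 0] = 1.0
probs[0, 1, 4] = 1.0
probs[1, :, 3] = 1.0
color_mask = probs_to_color_mask(probs)
print(color_mask.shape, color_mask.dtype)
```

The resulting uint8 array can be shown in a Streamlit page with `st.image`, next to the uploaded input image.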

How to Run the Project

Clone the Repository and Install Dependencies:

git clone https://github.com/yourusername/semantic-segmentation-satellite.git
cd semantic-segmentation-satellite
pip install -r requirements.txt

Preprocess Data:

python src/data_loader.py --config config.yaml

Train the Model:

python src/train.py --config config.yaml --epochs 30 --batch_size 8

Evaluate and Visualize:

python src/visualize.py --config config.yaml --model_path output/logs/best_model.h5

Run the Streamlit App:

streamlit run streamlit_app.py -- --config config.yaml --model_path output/logs/best_model.h5

Inference Result:

Streamlit App Result:

Key Points for Improvement:

Inference Results: Report the current performance and the challenges that remain.
Training Improvements: Train for more epochs.
Data Increase: Gather more labeled data.
Pretrained Models: Use pretrained backbones.
Hyperparameter Tuning: Fine-tune the hyperparameters.
Augmentation Strategies: Apply more advanced augmentation techniques for better generalization.
Post-Processing: Refine the predicted masks with conditional random fields (CRFs).
Ensemble Learning: Combine multiple models for improved results.
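
As an illustration of the ensemble point, one common approach is to average the softmax outputs of several models before taking the per-pixel argmax. The sketch below assumes that approach. The source does not say which ensembling scheme is intended, and `ensemble_predict` is a hypothetical helper.

```python
import numpy as np


def ensemble_predict(prob_maps):
    """Average per-model (H, W, C) softmax maps, then take the per-pixel argmax."""
    mean_probs = np.mean(np.stack(prob_maps, axis=0), axis=0)
    return np.argmax(mean_probs, axis=-1)


# Toy outputs of two models for a 1x2 image with three classes.
model_a = np.array([[[0.6, 0.4, 0.0], [0.2, 0.5, 0.3]]])
model_b = np.array([[[0.1, 0.9, 0.0], [0.1, 0.3, 0.6]]])
labels = ensemble_predict([model_a, model_b])
print(labels)
```

Averaging probabilities rather than hard labels lets a confident model outweigh an uncertain one on a given pixel.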

Conclusion

This project demonstrates an end-to-end solution for semantic segmentation of satellite images. From data preprocessing and EDA to model training and deployment, we provide a comprehensive pipeline. The modular design allows easy modifications and scalability for future improvements. Contributions and suggestions are welcome to make this solution even more robust and versatile.
