Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Colab examples #24

Merged
merged 8 commits into from
Jan 20, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
114 changes: 71 additions & 43 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
# Keras Image Models

<div align="center">
<img width="50%" src="docs/banner/kimm.png" alt="KIMM">
<img width="50%" src="https://github.com/james77777778/kimm/assets/20734616/b21db8f2-307b-4791-b93d-e913e45fb238" alt="KIMM">

[![PyPI](https://img.shields.io/pypi/v/kimm)](https://pypi.org/project/kimm/)
[![Contributions Welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/james77777778/kimm/issues)
Expand All @@ -24,10 +24,15 @@ pip install keras kimm

## Quickstart

### Use Pretrained Model
### Image Classification Using the Model Pretrained on ImageNet

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/14WxYgVjlwCIO9MwqPYW-dskbTL2UHsVN?usp=sharing)

```python
from keras import random
import cv2
import keras
from keras import ops
from keras.applications.imagenet_utils import decode_predictions

import kimm

Expand All @@ -38,58 +43,81 @@ print(kimm.list_models())
print(kimm.list_models("efficientnet", weights="imagenet")) # fuzzy search

# Initialize the model with pretrained weights
model = kimm.models.EfficientNetV2B0(weights="imagenet")

# Predict
x = random.uniform([1, 192, 192, 3]) * 255.0
y = model.predict(x)
print(y.shape)
model = kimm.models.EfficientNetV2B0()
image_size = model._default_size

# Initialize the model as a feature extractor with pretrained weights
model = kimm.models.EfficientNetV2B0(
feature_extractor=True, weights="imagenet"
# Load an image as the model input
image_path = keras.utils.get_file(
"african_elephant.jpg", "https://i.imgur.com/Bvro0YD.png"
)
image = cv2.imread(image_path)
image = cv2.resize(image, (image_size, image_size))
x = ops.convert_to_tensor(image)
x = ops.expand_dims(x, axis=0)

# Extract features for downstream tasks
y = model.predict(x)
print(y.keys())
print(y["BLOCK5_S32"].shape)
# Predict
preds = model.predict(x)
print("Predicted:", decode_predictions(preds, top=3)[0])
```

### Transfer Learning
```bash
['ConvMixer1024D20', 'ConvMixer1536D20', 'ConvMixer736D32', 'ConvNeXtAtto', ...]
['EfficientNetB0', 'EfficientNetB1', 'EfficientNetB2', 'EfficientNetB3', ...]
1/1 ━━━━━━━━━━━━━━━━━━━━ 11s 11s/step
Predicted: [('n02504458', 'African_elephant', 0.90578836), ('n01871265', 'tusker', 0.024864597), ('n02504013', 'Indian_elephant', 0.01161992)]
```

```python
from keras import layers
from keras import models
from keras import random
### An end-to-end example: fine-tuning an image classification model on a cats vs. dogs dataset

import kimm
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1IbqfqG2NKEOKvBOznIPT1kjOdVPfThmd?usp=sharing)

# Initialize the model as a backbone with pretrained weights
backbone = kimm.models.EfficientNetV2B0(
input_shape=[224, 224, 3],
include_top=False,
pooling="avg",
weights="imagenet",
)
Using `kimm.models.EfficientNetLiteB0`:

# Freeze the backbone for transfer learning
backbone.trainable = False
<div align="center">
<img width="75%" src="https://github.com/james77777778/kimm/assets/20734616/cbfc0773-a3fa-407d-be9a-fba4f19da6d3" alt="kimm_prediction_0">

# Construct the model with new head
inputs = layers.Input([224, 224, 3])
x = backbone(inputs, training=False)
x = layers.Dropout(0.2)(x)
outputs = layers.Dense(2)(x)
model = models.Model(inputs, outputs)
<img width="75%" src="https://github.com/james77777778/kimm/assets/20734616/2eac0831-75bb-4790-a3af-412c3e09cf8f" alt="kimm_prediction_1">
</div>

# Train the new model (put your own logic here)
Reference: [Transfer learning & fine-tuning (keras.io)](https://keras.io/guides/transfer_learning/#an-endtoend-example-finetuning-an-image-classification-model-on-a-cats-vs-dogs-dataset)

# Predict
x = random.uniform([1, 224, 224, 3]) * 255.0
y = model.predict(x)
print(y.shape)
```
### Grad-CAM

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1h25VmsYDOLL6BNbRPEVOh1arIgcEoHu6?usp=sharing)

Using `kimm.models.MobileViTS`:

<div align="center">
<img width="75%" src="https://github.com/james77777778/kimm/assets/20734616/cb5022a3-aaea-4324-a9cd-3d2e63a0a6b2" alt="grad_cam">
</div>

Reference: [Grad-CAM class activation visualization (keras.io)](https://keras.io/examples/vision/grad_cam/)

## Model Zoo

|Model|Paper|Weights are ported from|API|
|-|-|-|-|
|ConvMixer|[ICLR 2022 Submission](https://arxiv.org/abs/2201.09792)|`timm`|`kimm.models.ConvMixer*`|
|ConvNeXt|[CVPR 2022](https://arxiv.org/abs/2201.03545)|`timm`|`kimm.models.ConvNeXt*`|
|DenseNet|[CVPR 2017](https://arxiv.org/abs/1608.06993)|`timm`|`kimm.models.DenseNet*`|
|EfficientNet|[ICML 2019](https://arxiv.org/abs/1905.11946)|`timm`|`kimm.models.EfficientNet*`|
|EfficientNetLite|[ICML 2019](https://arxiv.org/abs/1905.11946)|`timm`|`kimm.models.EfficientNetLite*`|
|EfficientNetV2|[ICML 2021](https://arxiv.org/abs/2104.00298)|`timm`|`kimm.models.EfficientNetV2*`|
|GhostNet|[CVPR 2020](https://arxiv.org/abs/1911.11907)|`timm`|`kimm.models.GhostNet*`|
|GhostNetV2|[NeurIPS 2022](https://arxiv.org/abs/2211.12905)|`timm`|`kimm.models.GhostNetV2*`|
|InceptionV3|[CVPR 2016](https://arxiv.org/abs/1512.00567)|`timm`|`kimm.models.InceptionV3`|
|LCNet|[arXiv 2021](https://arxiv.org/abs/2109.15099)|`timm`|`kimm.models.LCNet*`|
|MobileNetV2|[CVPR 2018](https://arxiv.org/abs/1801.04381)|`timm`|`kimm.models.MobileNetV2*`|
|MobileNetV3|[ICCV 2019](https://arxiv.org/abs/1905.02244)|`timm`|`kimm.models.MobileNetV3*`|
|MobileViT|[ICLR 2022](https://arxiv.org/abs/2110.02178)|`timm`|`kimm.models.MobileViT*`|
|RegNet|[CVPR 2020](https://arxiv.org/abs/2003.13678)|`timm`|`kimm.models.RegNet*`|
|ResNet|[CVPR 2015](https://arxiv.org/abs/1512.03385)|`timm`|`kimm.models.ResNet*`|
|TinyNet|[NeurIPS 2020](https://arxiv.org/abs/2010.14819)|`timm`|`kimm.models.TinyNet*`|
|VGG|[ICLR 2015](https://arxiv.org/abs/1409.1556)|`timm`|`kimm.models.VGG*`|
|ViT|[ICLR 2021](https://arxiv.org/abs/2010.11929)|`timm`|`kimm.models.VisionTransformer*`|
|Xception|[CVPR 2017](https://arxiv.org/abs/1610.02357)|`keras`|`kimm.models.Xception`|

The export scripts can be found in `tools/convert_*.py`.

## License

Expand Down