chapter-5/1-develop-tool.ipynb #174

Open
zoldaten opened this issue Apr 5, 2023 · 0 comments

zoldaten commented Apr 5, 2023

When training the model (transfer learning):

# select the percentage of layers to be trained while using the transfer learning
# technique. The selected layers will be close to the output/final layers.
unfreeze_percentage = 0

learning_rate = 0.001

if training_format == "scratch":
    print("Training a model from scratch")
    model = scratch(train, val, learning_rate)
elif training_format == "transfer_learning":
    print("Fine Tuning the MobileNet model")
    model = transfer_learn(train, val, unfreeze_percentage, learning_rate)

I see messages like this:
Corrupt JPEG data: 65 extraneous bytes before marker 0xd9

I googled this, and the problem appears to be corrupted images in the dataset (cats and dogs). To fix it, one needs to clean the dataset with code like this:

import os

import tensorflow as tf

num_skipped = 0
for folder_name in ("Cat", "Dog"):
    folder_path = os.path.join("PetImages", folder_name)
    for fname in os.listdir(folder_path):
        fpath = os.path.join(folder_path, fname)
        # A JFIF file carries the "JFIF" identifier near the start of its header
        with open(fpath, "rb") as fobj:
            is_jfif = tf.compat.as_bytes("JFIF") in fobj.peek(10)

        if not is_jfif:
            num_skipped += 1
            # Delete corrupted image
            os.remove(fpath)
print("Deleted %d images" % num_skipped)

Run this before training.
See also: https://discuss.tensorflow.org/t/first-steps-in-keras-error/8049/11

But I have no idea how to clean the dataset when it is stored as TFRecords:

cats_vs_dogs-train.tfrecord-00000-of-00008
cats_vs_dogs-train.tfrecord-00001-of-00008
cats_vs_dogs-train.tfrecord-00002-of-00008
cats_vs_dogs-train.tfrecord-00003-of-00008
cats_vs_dogs-train.tfrecord-00004-of-00008
cats_vs_dogs-train.tfrecord-00005-of-00008
cats_vs_dogs-train.tfrecord-00006-of-00008
cats_vs_dogs-train.tfrecord-00007-of-00008

How would you fix that? And if it is not fixed, will the trained model still be correct?
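
One possible approach, offered only as a rough sketch: apply the same "JFIF" header check to the encoded image bytes inside each record and write the passing records to a new shard. Everything here is an assumption rather than something confirmed by the notebook: the helper names `has_jfif_marker` and `clean_tfrecord_shard` are hypothetical, the records are assumed to be serialized `tf.train.Example` protos, and the encoded image is assumed to live under the feature key `"image"`. Note also that the message above looks like a libjpeg warning rather than an error, so the affected images may still decode, and that this header check only finds non-JFIF files, not every kind of corruption.

```python
def has_jfif_marker(data: bytes) -> bool:
    """Return True if 'JFIF' appears in the first 10 bytes of the encoded image.

    A standard JFIF header places the identifier at byte offset 6, so the
    first 10 bytes are enough to see it.
    """
    return b"JFIF" in data[:10]


def clean_tfrecord_shard(src_path: str, dst_path: str, image_key: str = "image") -> int:
    """Copy records that pass has_jfif_marker to dst_path; return how many were skipped.

    Assumes each record is a serialized tf.train.Example and that the encoded
    image bytes are stored under `image_key` (an assumption, not verified here).
    """
    # Imported here so the byte check above stays usable without TensorFlow.
    import tensorflow as tf

    skipped = 0
    with tf.io.TFRecordWriter(dst_path) as writer:
        for raw in tf.data.TFRecordDataset(src_path):
            serialized = raw.numpy()
            example = tf.train.Example.FromString(serialized)
            values = example.features.feature[image_key].bytes_list.value
            if values and has_jfif_marker(values[0]):
                writer.write(serialized)
            else:
                skipped += 1
    return skipped
```

If the shards were produced by a dataset library that also stores example counts in separate metadata, a rewritten shard with fewer records may no longer match that metadata, so filtering at load time could be the safer route.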

manavrmoorthy pushed a commit to manavrmoorthy/Practical-Deep-Learning-Book that referenced this issue Aug 28, 2023
- removing deprecated files
- the gcloud sessions are not using the same runtime (so we need to re-download data) so writing outputs to gdrive to make sure things are running and outputs are accessible for subsequent notebooks in colab
- renaming 4/1 notebook - since it had underscores instead of hyphens, the colab link in the notebook was not working
- xception added to imports for notebook 1, since it is part of the model_maker
- PracticalDL#163 - fixed
- PracticalDL#164 - fixed
- PracticalDL#169 - fixed
- metric='angular' added for annoy as default arg will be removed in subsequent releases
- removing a duplicate PCA + Annoy section
- PracticalDL#170 - fixed
- time is a negligible factor here, and we do not need it in the plots (since we are using optimised accuracy calculation using numpy from issue 170) - hence, modifying the plots
- removing matplotlib.style.use('seaborn') since it is deprecated
- the final fine-tuning notebook uses Caltech256 features (as per the book), which do not exist, since fine-tuning was done on Caltech101 - hence, renaming those files to caltech101. Can we retain caltech101 to test?
- PracticalDL#167 - fixed, if the above is okay
- formatted the code

chapter 5:
- write_grads and batch_size params have been removed from callback, or will be removed in subsequent releases
- PracticalDL#174 - not able to replicate this issue
- added a pointer to the notebook that suggests that for tensorboard to work without a 403 Forbidden error on Colab, cookies need to be allowed (I faced this issue)
- notebook 3 in chapter 5 is the exact same as notebook 2 in chapter 2 - replaced the file directly
- the autokeras notebook in Colab is named autokeras-error.ipynb - where can we change this to autokeras.ipynb?
- fixing accuracy score calculation in the autokeras notebook
- formatted the code

chapter 6:
- including the download_sample_image function
- formatted the code