
[BUG] Error when loading the session-based model in example #908

Closed
FredHJC opened this issue Nov 30, 2022 · 5 comments
Labels: bug (Something isn't working), P0, status/needs-triage

FredHJC commented Nov 30, 2022

❓ Questions & Help

Details

https://github.com/NVIDIA-Merlin/models/blob/main/examples/usecases/ecommerce-session-based-next-item-prediction-for-fashion.ipynb

We are trying to implement the session-based models shown in the above notebook. However, there is a consistent error when loading the saved model: TypeError: ('Keyword argument not understood:', 'layer was saved without config')

We can run the example notebook and save the trained model without problems; the error only occurs when trying to load it back. It looks like custom layers / prediction tasks need a manually specified config in order to round-trip through serialization.

[Screenshot: TypeError traceback raised when loading the saved model]
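For context, here is a toy sketch (plain Python, not Merlin or Keras code; all names are hypothetical) of how this particular TypeError can arise: Keras-style serialization stores each layer as a class name plus the dict returned by get_config(). If a custom layer cannot provide a config, a placeholder dict may be saved instead, and feeding that placeholder back into a constructor as keyword arguments trips the "Keyword argument not understood" check.

```python
# Toy sketch (not Merlin/Keras internals) of the reported TypeError.
class BaseLayer:
    def __init__(self, units=None, **kwargs):
        if kwargs:  # mimics Keras rejecting unrecognized constructor kwargs
            raise TypeError(
                "Keyword argument not understood:", next(iter(kwargs))
            )
        self.units = units

    def get_config(self):
        raise NotImplementedError  # subclasses must opt in to serialization


def save_layer(layer):
    """Serialize a layer; fall back to a placeholder if it has no config."""
    try:
        config = layer.get_config()
    except NotImplementedError:
        config = {"layer was saved without config": True}
    return {"class_name": type(layer).__name__, "config": config}


def load_layer(saved):
    """Rebuild a layer by passing the saved config back as keyword args."""
    return BaseLayer(**saved["config"])


saved = save_layer(BaseLayer())  # this layer never implemented get_config
try:
    load_layer(saved)
except TypeError as err:
    print(err)  # ('Keyword argument not understood:', 'layer was saved without config')
```

Under this reading, the fix is for each custom layer/prediction task to implement proper config serialization, which is what the original report speculates.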

@zhiruiwang

Steps/Code to reproduce bug

  1. Run the Dressipi notebook Transformer-based model example and train the model:
model_transformer.fit(loader, 
                      validation_data=val_loader,
                      epochs=EPOCHS
                     )
  2. Save the model to disk, which succeeds:
model_transformer.save(os.path.join('/ecom_modeling/11-30-session', 'transformer'))

It does have some warnings though:

WARNING:tensorflow:Skipping full serialization of Keras layer TFSharedEmbeddings(
  (_feature_shapes): Dict(
    (f_47_list_seq): TensorShape([1024, None])
    (f_68_list_seq): TensorShape([1024, None])
    (item_id_list_seq): TensorShape([1024, None])
    (item_id_last): TensorShape([1024, 1])
  )
  (_feature_dtypes): Dict(
    (f_47_list_seq): tf.int32
    (f_68_list_seq): tf.int32
    (item_id_list_seq): tf.int32
    (item_id_last): tf.int32
  )
), because it is not built.
.......
INFO:tensorflow:Unsupported signature for serialization: ((Prediction(outputs={'purchase_id_first/categorical_output': TensorSpec(shape=(None, 23272), dtype=tf.float32, name='outputs/outputs/purchase_id_first/categorical_output')}, targets={'purchase_id_first/categorical_output': TensorSpec(shape=(None, 23272), dtype=tf.float32, name='outputs/targets/purchase_id_first/categorical_output')}, sample_weight={'purchase_id_first/categorical_output': None}, features=None, negative_candidate_ids=None), <tensorflow.python.framework.func_graph.UnknownArgument object at 0x7fb613ff95e0>), {}).
  3. Load the model from disk, which raises the error:
model_loaded = tf.keras.models.load_model(
        os.path.join('/ecom_modeling/11-30-session', 'transformer'))
TypeError: ('Keyword argument not understood:', 'layer was saved without config')

We are wondering if we did something wrong with the saving and loading of the model, or if there's a bug in saving and loading Merlin session-based models.

We're also not sure whether this is related to #898 or #889.

@rnyak rnyak added this to the Merlin 22.12 milestone Dec 14, 2022
@rnyak rnyak added bug Something isn't working P0 labels Dec 14, 2022
@rnyak rnyak changed the title [QST] Error when loading the session-based model in example [BUG] Error when loading the session-based model in example Dec 19, 2022
@rnyak rnyak modified the milestones: Merlin 22.12, Merlin 23.01 Dec 19, 2022

sararb commented Dec 19, 2022

Thank you for reporting the bug.

I was able to reproduce this error when loading the model without importing merlin.models.tf. The custom layers will not be understood by TensorFlow if Merlin Models (MM) is not imported.

The recommended way to load a trained MM model is to import merlin.models.tf before calling tf.keras.models.load_model, as follows:

import tensorflow as tf
import merlin.models.tf as mm
model = tf.keras.models.load_model('transformer-model')

Please let us know if this fixes the loading issue. Thanks!
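To illustrate why the import matters, here is a toy sketch in plain Python (not the actual Keras/Merlin internals; the registry, decorator, and class names are hypothetical): frameworks like Keras typically add custom classes to a global registry at import time, and the loader resolves saved class names against that registry. If the module defining the classes is never imported, the names are unknown at load time.

```python
# Toy sketch of an import-time custom-object registry.
CUSTOM_OBJECTS = {}  # stands in for the framework's custom-object registry


def register(cls):
    """Decorator that registers a class when its module is imported."""
    CUSTOM_OBJECTS[cls.__name__] = cls
    return cls


@register  # runs as a side effect of importing the defining module
class TransformerBlock:
    @classmethod
    def from_config(cls, config):
        return cls()


def load_layer(saved):
    """Resolve the saved class name and rebuild the layer from its config."""
    cls = CUSTOM_OBJECTS.get(saved["class_name"])
    if cls is None:
        raise ValueError(f"Unknown layer class: {saved['class_name']!r}")
    return cls.from_config(saved["config"])


layer = load_layer({"class_name": "TransformerBlock", "config": {}})
print(type(layer).__name__)  # TransformerBlock
```

In this analogy, `import merlin.models.tf` plays the role of the import that populates the registry before `tf.keras.models.load_model` is called.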

@oliverholworthy
Member

We also have a classmethod on our Model class, so you can also load the model as follows:

import merlin.models.tf as mm

mm.Model.load('<path-to-saved-model-directory>')


rnyak commented Jan 5, 2023

@zhiruiwang @FredHJC closing this issue since it should be solved via #927.

@rnyak rnyak closed this as completed Jan 5, 2023
@zhiruiwang

@rnyak We used the Merlin-tensorflow 22.12 image and refactored our pipeline to use the latest API of the merlin-models codebase; saving and loading two-tower, LSTM, and transformer models all work now. Thanks for the help!
