- training as a multi-class classification problem (which video will be watched; the number of classes will be > 1M) - will be a problem
video-watch sequence - apply word2vec; search - apply word2vec; static profile (user)
auto-feature engineering by DNN
- simple Dense layer x 3 with ReLU (check this)
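A minimal sketch of the "Dense x 3 with ReLU" idea, in plain Python so the shapes are easy to follow. The layer sizes (16, 8, 4), the 8-dimensional input, the He-style initialization, and the helper names (`build_tower`, `forward`) are all assumptions for illustration; the notes do not specify them.

```python
import random

def relu(vec):
    # Element-wise ReLU: negative activations become 0.
    return [max(0.0, v) for v in vec]

def dense(vec, weights, bias):
    # weights holds one row per output unit; each row has len(vec) entries.
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def init_layer(n_in, n_out, rng):
    # He-style scale (assumed choice), a common default for ReLU layers.
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def build_tower(n_in, hidden_sizes, seed=0):
    # One (weights, bias) pair per hidden layer, chained input -> output.
    rng = random.Random(seed)
    layers = []
    for n_out in hidden_sizes:
        layers.append(init_layer(n_in, n_out, rng))
        n_in = n_out
    return layers

def forward(layers, features):
    # Apply Dense + ReLU for every layer in order.
    out = features
    for weights, bias in layers:
        out = relu(dense(out, weights, bias))
    return out

# Hypothetical input: concatenated watch, search, and profile features.
tower = build_tower(n_in=8, hidden_sizes=[16, 8, 4])
user_vector = forward(tower, [0.1] * 8)
```

The output of the last layer can be read as a learned user vector; in a real model the weights would be trained, not left at their random initial values.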
apply ANN
- same video-watch embedding
- why impression and watched?
- language embedding
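A small sketch of the lookup step, assuming "ANN" here means approximate nearest neighbor search over the video-watch embedding space (the notes do not spell this out). The code below is an exact brute-force search, not an approximate one; it only shows what an ANN index would be approximating. The video IDs, vectors, and the `top_k` helper are made up for illustration.

```python
import math

def cosine(a, b):
    # Cosine similarity; returns 0.0 if either vector has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, video_embeddings, k):
    # Score every video against the query and keep the k most similar.
    scored = [(cosine(query, vec), vid)
              for vid, vec in video_embeddings.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [vid for _, vid in scored[:k]]

# Hypothetical embeddings that live in the same space as the user vector.
video_embeddings = {
    "v1": [1.0, 0.0],
    "v2": [0.0, 1.0],
    "v3": [0.9, 0.1],
}
user_vector = [1.0, 0.05]
nearest = top_k(user_vector, video_embeddings, k=2)
```

Brute force costs one similarity computation per video, which is why a catalog of > 1M videos would call for an approximate index instead.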