AFM

1. Paper

Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks

Innovation: an attention-based pooling layer, which differs from the standard attention mechanism; see the notes on the original paper for details.

Notes on the paper: https://mp.weixin.qq.com/s/hPCS9Dw2vT2pwdWwPo0EJg
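Concretely, the attention network in the paper scores each pairwise feature interaction and pools them with a softmax (formulation taken from the AFM paper; \(\odot\) denotes the element-wise product, \(\mathcal{R}_x\) the set of interacting feature pairs):

```latex
% Attention score for the interaction of features i and j
a'_{ij} = \mathbf{h}^{\top} \,\mathrm{ReLU}\!\left(\mathbf{W}\,(\mathbf{v}_i \odot \mathbf{v}_j)\,x_i x_j + \mathbf{b}\right),
\qquad
a_{ij} = \frac{\exp(a'_{ij})}{\sum_{(i,j)\in\mathcal{R}_x} \exp(a'_{ij})}

% AFM prediction: attention-weighted sum of the pairwise interactions
\hat{y}_{\mathrm{AFM}} = w_0 + \sum_{i} w_i x_i
  + \mathbf{p}^{\top} \sum_{i}\sum_{j>i} a_{ij}\,(\mathbf{v}_i \odot \mathbf{v}_j)\,x_i x_j
```

The 'max' and 'avg' modes below simply replace the \(a_{ij}\)-weighted sum with max or average pooling over the same interaction vectors.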

2. Model Structure

3. Dataset

The Criteo dataset is used for testing. Preprocessing lives in ../data_process and consists of:

  1. Since the Criteo file is very large, a subset can be loaded for testing via read_part and sample_num;
  2. Missing values are filled in;
  3. Dense features I1-I13 are discretized into buckets (bins=100); sparse features C1-C26 are re-encoded with LabelEncoder;
  4. feature_columns is assembled;
  5. The data is split, finally returning feature_columns, (train_X, train_y), (test_X, test_y).
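The steps above can be sketched roughly as follows. This is a hypothetical helper using pandas/scikit-learn, not the repository's actual ../data_process code; the function name `create_criteo_dataset` and the feature-column dict layout are assumptions, while column names I1-I13/C1-C26 follow the Criteo convention.

```python
# Hypothetical sketch of the Criteo preprocessing steps listed above.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import KBinsDiscretizer, LabelEncoder


def create_criteo_dataset(file, read_part=True, sample_num=100,
                          test_size=0.2, embed_dim=8):
    names = ['label'] + [f'I{i}' for i in range(1, 14)] \
                      + [f'C{i}' for i in range(1, 27)]
    # Step 1: optionally read only sample_num rows of the large file.
    nrows = sample_num if read_part else None
    df = pd.read_csv(file, sep='\t', names=names, nrows=nrows)

    dense = [f'I{i}' for i in range(1, 14)]
    sparse = [f'C{i}' for i in range(1, 27)]

    # Step 2: fill missing values.
    df[dense] = df[dense].fillna(0)
    df[sparse] = df[sparse].fillna('-1')

    # Step 3: bucketize dense features (bins=100), label-encode sparse ones.
    est = KBinsDiscretizer(n_bins=100, encode='ordinal', strategy='uniform')
    df[dense] = est.fit_transform(df[dense])
    for c in sparse:
        df[c] = LabelEncoder().fit_transform(df[c])

    # Step 4: assemble feature_columns (dict layout is an assumption).
    feature_columns = [
        {'feat': c, 'feat_num': int(df[c].max()) + 1, 'embed_dim': embed_dim}
        for c in dense + sparse
    ]

    # Step 5: split and return.
    X = df[dense + sparse].values.astype('int64')
    y = df['label'].values.astype('float32')
    train_X, test_X, train_y, test_y = train_test_split(
        X, y, test_size=test_size)
    return feature_columns, (train_X, train_y), (test_X, test_y)
```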

4. Model API

```python
class AFM(Model):
    def __init__(self, feature_columns, mode, att_vector=8, activation='relu', dropout=0.5, embed_reg=1e-6):
        """
        AFM
        :param feature_columns: A list. Sparse feature column information.
        :param mode: A string. 'max' (max pooling), 'avg' (average pooling) or 'att' (attention).
        :param att_vector: A scalar. Dimension of the attention vector.
        :param activation: A string. Activation function of the attention network.
        :param dropout: A scalar. Dropout rate.
        :param embed_reg: A scalar. Regularizer for the embeddings.
        """
```

5. Hyperparameters

  • file: the Criteo data file;
  • read_part: whether to read only part of the data, True;
  • sample_num: number of samples when reading part of the data, 5000000;
  • test_size: test-set fraction, 0.2;
  • embed_dim: embedding dimension, 8;
  • att_vector: hidden units of the attention layer, 8;
  • mode: pooling type, att;
  • dropout: 0.5;
  • activation: relu;
  • embed_reg: 1e-5;
  • learning_rate: 0.001;
  • batch_size: 4096;
  • epoch: 10.
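Wiring these settings into a training run could look like the fragment below. This is a sketch only: `create_criteo_dataset` is a hypothetical name for the ../data_process helper, `AFM` is the class above, and the repository's actual training script may differ.

```python
# Hypothetical training wiring for the hyperparameters above;
# assumes the repository's AFM class and a preprocessing helper exist.
import tensorflow as tf

feature_columns, (train_X, train_y), (test_X, test_y) = create_criteo_dataset(
    file='train.txt', read_part=True, sample_num=5000000,
    test_size=0.2, embed_dim=8)

model = AFM(feature_columns, mode='att', att_vector=8,
            activation='relu', dropout=0.5, embed_reg=1e-5)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=[tf.keras.metrics.AUC()])
model.fit(train_X, train_y, batch_size=4096, epochs=10)
model.evaluate(test_X, test_y, batch_size=4096)
```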

6. Results

  1. Using the first 5,000,000 rows of the Criteo dataset, test-set results:
    • max: AUC: 0.780834, loss: 0.4819
    • avg: AUC: 0.762366, loss: 0.4908
    • att: AUC: 0.770821, loss: 0.4806 (less prone to overfitting, but slower to run)
  2. Using the full Criteo dataset:
    • learnable parameters: 235,112,786;
    • time per epoch (GPU: Tesla V100S-PCI): 411 s;
    • test-set result (att): AUC: 0.787135, loss: 0.4692