xDeepFM Study Notes

Paper information

  1. Title: xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems

  2. Link: https://arxiv.org/pdf/1803.05170.pdf

  3. Source code:


xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems

Earlier models

  1. DeepFM

  2. Deep & Cross Network

    Cross Network

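    Each cross layer depends on the first layer (x0) and on the previous layer, which is how the network builds up multi-order feature combinations.

    Drawback: the Cross Network cannot effectively discover high-order feature combinations. As the xDeepFM paper shows, the output of every hidden layer is a scalar multiple of x0 (the scalar itself depends on x0, so the map is not linear), and feature interactions still happen at the bit-wise (element) level.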
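
The scalar-multiple property can be checked numerically. The following is a minimal NumPy sketch, assuming the simplified cross layer without bias terms, x_{k+1} = x0 (x_k^T w_{k+1}) + x_k. The function name `cross_network` and the random inputs are illustrative, not taken from any official implementation.

```python
import numpy as np

def cross_network(x0, weights):
    """Apply bias-free cross layers: x_{k+1} = x0 * (x_k . w_{k+1}) + x_k."""
    x = x0
    for w in weights:
        # (x @ w) is a scalar, so each layer only rescales x0 and adds x back
        x = x0 * (x @ w) + x
    return x

rng = np.random.default_rng(0)
x0 = rng.normal(size=6)                           # hypothetical input vector
weights = [rng.normal(size=6) for _ in range(3)]  # three cross layers

xk = cross_network(x0, weights)

# Project xk onto x0; if xk is a scalar multiple of x0, alpha * x0 recovers xk
alpha = (xk @ x0) / (x0 @ x0)
is_scalar_multiple = np.allclose(xk, alpha * x0)
```

Here `is_scalar_multiple` comes out true for any choice of weights, which is the limitation described above: however many layers are stacked, the output stays on the line spanned by x0.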

xDeepFM model architecture

  1. Embedding Layer

    The embedding layer maps the raw high-dimensional, sparse features to low-dimensional dense vectors.
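
As a rough illustration of the embedding step, the sketch below looks up one D-dimensional vector per field and stacks them into an m × D matrix. The vocabulary sizes, the sample ids, and the helper name `embed` are hypothetical, and each field is assumed to hold a single categorical value.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4                        # embedding dimension
vocab_sizes = [10, 5, 7]     # hypothetical vocabulary size of each of m = 3 fields

# One embedding table per field, shape (vocab_size, D)
tables = [rng.normal(scale=0.1, size=(v, D)) for v in vocab_sizes]

def embed(sample_ids, tables):
    """Look up one embedding row per field and stack them into an (m, D) matrix."""
    return np.stack([table[i] for table, i in zip(tables, sample_ids)])

sample = [3, 0, 6]           # one categorical id per field
X0 = embed(sample, tables)   # shape (m, D)

# The lookup is equivalent to multiplying a one-hot vector by the table
one_hot = np.zeros(vocab_sizes[0])
one_hot[sample[0]] = 1.0
same_as_lookup = np.allclose(one_hot @ tables[0], X0[0])
```

The one-hot comparison at the end shows why this counts as dimensionality reduction: a sparse vector of length 10 is replaced by a dense vector of length 4.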

  2. Compressed Interaction Network (CIN)

    • Input: the embedded data, i.e. m field embeddings of dimension D each, written X0 = [e1, e2, ···, em]

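    • Recurrence:

      The CIN has T layers. Hk is the number of embedding vectors (feature maps) in layer k, with H0 = m, and the output of layer k is Xk. Each layer takes the previous layer's output together with X0 as input. In the paper's words: "The structure of CIN is very similar to the Recurrent Neural Network (RNN), where the outputs of the next hidden layer are dependent on the last hidden layer and an additional input."

      Reading the formula as a CNN: take the Hadamard product of every row of Xk with every row of X0 to get an intermediate tensor Zk+1 of shape Hk × m × D. Zk+1 can be viewed as an image and Wk,h as a filter: the filter slides across Zk+1 along the embedding dimension (D), producing one feature map per filter. Xk+1 is the collection of Hk+1 such feature maps.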

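      Sum pooling is applied to every CIN layer: each feature map i ∈ [1, Hk] of layer k is summed over the embedding dimension D to give a scalar, and these scalars form the pooling vector pk. All pooling vectors from hidden layers are concatenated before being connected to the output units: p+ = [p1, p2, ···, pT]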
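
The CIN computation can be sketched in a few lines of NumPy. This assumes the layer recurrence as given in the paper, $X^{k}_{h,*} = \sum_{i=1}^{H_{k-1}} \sum_{j=1}^{m} W^{k,h}_{ij} \, (X^{k-1}_{i,*} \circ X^{0}_{j,*})$, with no activation function between layers. The function names, layer sizes, and random weights are illustrative only.

```python
import numpy as np

def cin_layer(X_prev, X0, W):
    """One CIN layer.

    X_prev: (H_prev, D) output of the previous layer
    X0:     (m, D) embedding matrix
    W:      (H_next, H_prev, m) one filter per output feature map
    returns (H_next, D)
    """
    # Intermediate tensor: Z[i, j, d] = X_prev[i, d] * X0[j, d]
    Z = np.einsum("id,jd->ijd", X_prev, X0)
    # Each filter W[h] weights all (i, j) pairs and is shared across d
    return np.einsum("hij,ijd->hd", W, Z)

def cin(X0, weights):
    """Stack CIN layers, sum-pool each layer over D, and concatenate into p+."""
    X, pooled = X0, []
    for W in weights:
        X = cin_layer(X, X0, W)
        pooled.append(X.sum(axis=1))   # sum pooling: (H_k, D) -> (H_k,)
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
m, D = 3, 4
layer_sizes = [5, 2]                   # hypothetical H_1, H_2
X0 = rng.normal(size=(m, D))

weights, h_prev = [], m                # H_0 = m
for h in layer_sizes:
    weights.append(rng.normal(size=(h, h_prev, m)))
    h_prev = h

p_plus = cin(X0, weights)              # length H_1 + H_2
```

Because each filter is shared across the embedding dimension, the interaction happens between whole embedding vectors (vector-wise) rather than between individual elements, and the length of `p_plus` is the total number of feature maps across all layers.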

  3. Final model structure

    σ is the sigmoid function, a is the raw features, xkdnn is the output of the DNN, p+ is the output of the CIN, and w* and b are learnable parameters.

    When the depth and feature maps of the CIN part are both set to 1, xDeepFM is a generalization of DeepFM by learning the linear regression weights for the FM layer
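
To make the roles of these symbols concrete, here is a minimal sketch of the output unit, assuming the form $\hat{y} = \sigma(w_{linear}^T a + w_{dnn}^T x^{k}_{dnn} + w_{cin}^T p^{+} + b)$ from the paper. The vector sizes and random values are placeholders; in a real model `x_dnn` and `p_plus` would come from the DNN and CIN branches.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def xdeepfm_output(a, x_dnn, p_plus, w_linear, w_dnn, w_cin, b):
    """Combine the linear, DNN, and CIN parts into one predicted probability."""
    logit = w_linear @ a + w_dnn @ x_dnn + w_cin @ p_plus + b
    return sigmoid(logit)

rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=8).astype(float)   # raw (sparse) features, placeholder
x_dnn = rng.normal(size=6)                     # placeholder DNN output
p_plus = rng.normal(size=7)                    # placeholder CIN output

# Placeholder parameters; these would be learned during training
w_linear, w_dnn, w_cin = (rng.normal(size=n) for n in (8, 6, 7))

y_hat = xdeepfm_output(a, x_dnn, p_plus, w_linear, w_dnn, w_cin, b=0.0)
```

The three branches contribute additively to a single logit, so they can be trained jointly with one loss.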

Experimental results

[Figure: experimental results]
