xDeepFM: Network Introduction and a Brief Source-Code Walkthrough

Preface (off-topic, feel free to skip)

Haha, the first blog post of October. I hope to push a bit harder this quarter~ I don't want to be a slacker anymore…


xDeepFM

Paper information

Paper title: xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems
Paper link:
DeepCTR also provides an implementation:
Published: 2018
Authors: Jianxun Lian, Xiaohuan Zhou, Fuzheng Zhang, Zhongxia Chen, Xing Xie, Guangzhong Sun
Affiliation: University of Science and Technology of China

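Core ideas

Overview of the core ideas

First, look at the network architecture of xDeepFM, shown in the figure below:

[Figure: xDeepFM network architecture]

xDeepFM can be viewed as an improvement on DCN: it introduces a CIN layer to replace the Cross layer in DCN, and uses it to learn feature interactions explicitly.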

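High-order Interactions

Before explaining how the CIN layer works, let's first look at the problem with the Cross layer, which is one of the key points discussed in the xDeepFM paper. To make the discussion easier, we first introduce some notation. Suppose the raw high-dimensional sparse features are mapped by the Embedding Layer into low-dimensional dense vectors: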
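For reference, the embedding output can be written as below. This assumes the notation of the xDeepFM paper, with m the number of fields and D the embedding size (the same m and D that appear in the code comments later); treat it as a reconstruction rather than a quotation.

```latex
\mathbf{e} = [\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_m],
\qquad \mathbf{e}_i \in \mathbb{R}^{D}, \quad i = 1, \dots, m
```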

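Before discussing the Cross layer in DCN, two concepts are introduced first:

The Cross layer in DCN models high-order feature interactions with the following formula: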
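The Cross layer in DCN is commonly written as $x_{k} = x_0 x_{k-1}^{\top} w_k + b_k + x_{k-1}$ (this comes from the DCN paper rather than from the text above, so treat the notation as an assumption). A minimal NumPy sketch of this recurrence follows; the helper name `cross_layer`, the toy dimension and the random weights are made up for illustration. With the bias set to zero, it also checks the limitation the xDeepFM paper points out: every x_k is a scalar multiple of x_0.

```python
import numpy as np


def cross_layer(x0, xk, w, b):
    """One DCN-style Cross layer: x0 * (xk . w) + b + xk, all vectors of size d."""
    return x0 * np.dot(xk, w) + b + xk


rng = np.random.default_rng(0)
d = 4                        # hypothetical feature dimension
x0 = rng.normal(size=d)      # stands in for the concatenated embedding vector
x = x0
for _ in range(3):           # three stacked Cross layers
    w = rng.normal(size=d)
    b = np.zeros(d)          # bias dropped, as in the simplified analysis
    x = cross_layer(x0, x, w, b)

# Without bias, each layer output equals alpha * x0 for some scalar alpha,
# so the element-wise ratio x / x0 is constant across dimensions.
ratios = x / x0
is_scalar_multiple = bool(np.allclose(ratios, ratios[0]))
print(is_scalar_multiple)
```

The scalar depends on x_0, so this does not make the layer linear, but it does mean the interactions are bit-wise and come in a restricted form.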

Based on these considerations, the paper proposes the CIN module, a new way of crossing features explicitly, in which the interactions happen at the vector-wise level.

CIN (Compressed Interaction Network)
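As a rough guide, and as I read the paper, one CIN layer computes each feature map as a weighted sum of Hadamard products between rows of the previous layer's output and rows of the input: $X^{k}_{h,*} = \sum_{i=1}^{H_{k-1}} \sum_{j=1}^{m} W^{k,h}_{ij} \, (X^{k-1}_{i,*} \circ X^{0}_{j,*})$. The NumPy sketch below is my own illustration of that formula, not the authors' code; `cin_layer` and the toy sizes are hypothetical, and the batch axis, bias and activation are left out.

```python
import numpy as np


def cin_layer(x0, x_prev, w):
    """One CIN layer without bias or activation.

    x0:     [m, D]               input field embeddings (X^0)
    x_prev: [H_prev, D]          previous layer output (X^{k-1})
    w:      [H_next, H_prev, m]  one weight matrix per output feature map
    returns [H_next, D]
    """
    # z[i, j, :] = x_prev[i, :] * x0[j, :]  (Hadamard product of two vectors)
    z = x_prev[:, None, :] * x0[None, :, :]      # [H_prev, m, D]
    # Weighted sum over all (i, j) pairs, separately for each feature map h.
    return np.einsum('hij,ijd->hd', w, z)        # [H_next, D]


rng = np.random.default_rng(0)
m, D, H1 = 3, 4, 5                               # toy sizes
x0 = rng.normal(size=(m, D))
w1 = rng.normal(size=(H1, m, m))                 # first layer, where H_0 = m
x1 = cin_layer(x0, x0, w1)
print(x1.shape)                                  # (5, 4)
```

Note that the embedding dimension D is kept intact from layer to layer, which is what "vector-wise" means here.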

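A brief look at the CIN source code

The paper's authors released their own implementation of CIN, but there is quite a lot of code and I didn't feel like reading all of it, so here I analyze DeepCTR's implementation of CIN instead. A rough understanding is enough for now; I'll dig deeper when I actually need it. DeepCTR implements the CIN module as the `CIN` layer class.

Detailed comments are written in the code. Two places are not very intuitive, so I wrote very simple test cases for them, which can serve as a reference later: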

dot_result_m = tf.matmul(split_tensor0, split_tensor, transpose_b=True)

    import tensorflow as tf  # TF 1.x API (uses tf.Session)

    B = 2
    D = 3
    m = 2
    H = 2  # think of it as H_{k-1}; equal to m in this toy case

    a = tf.reshape(tf.range(B * D * m, dtype=tf.float32), (B, m, D))
    b = tf.split(a, D * [1], 2)  # list of D tensors, each of shape [B, m, 1]
    c = tf.matmul(b, b, transpose_b=True)

    with tf.Session() as sess:
        print(sess.run(tf.shape(c)))  # shape is [D, B, m, H_{k-1}]
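To see what this split-then-matmul trick computes, here is a NumPy sketch of the same operation (my own restatement, not part of DeepCTR): for every embedding dimension d and every sample b, it forms the outer product of column d of the two inputs. Using `np.einsum` is just a choice made for readability.

```python
import numpy as np

B, D, m = 2, 3, 2
a = np.arange(B * D * m, dtype=np.float32).reshape(B, m, D)

# c[d, b, i, j] = a[b, i, d] * a[b, j, d]: one outer product per (d, b) pair.
c = np.einsum('bid,bjd->dbij', a, a)
print(c.shape)  # (3, 2, 2, 2), i.e. [D, B, m, H_{k-1}] with H_{k-1} = m here

# Spot check against an explicit outer product for d = 1, b = 0.
expected = np.outer(a[0, :, 1], a[0, :, 1])
matches = bool(np.array_equal(c[1, 0], expected))
```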

curr_out = tf.nn.conv1d(dot_result, filters=self.filters[idx], stride=1, padding='VALID')

    import tensorflow as tf  # TF 1.x API (uses tf.Session)

    B = 2
    D = 3
    E = 4  # stands for m * H_{k-1}
    F = 5  # stands for H_{k}

    a = tf.reshape(tf.range(B * D * E, dtype=tf.float32), (B, D, E))
    b = tf.reshape(tf.range(1 * E * F, dtype=tf.float32), (1, E, F))
    curr_out = tf.nn.conv1d(a, filters=b, stride=1, padding='VALID')

    with tf.Session() as sess:
        print(sess.run(tf.shape(curr_out)))  # result is [B, D, H_{k}]
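Because the filter has width 1, this `conv1d` call amounts to multiplying the last axis of the input by an [E, F] matrix, independently at each of the D positions. The NumPy sketch below restates that with the same toy sizes; it illustrates the equivalence and is not DeepCTR code.

```python
import numpy as np

B, D, E, F = 2, 3, 4, 5   # E stands for m * H_{k-1}, F for H_{k}
a = np.arange(B * D * E, dtype=np.float32).reshape(B, D, E)
b = np.arange(1 * E * F, dtype=np.float32).reshape(1, E, F)

# A width-1, stride-1 convolution is a matrix product applied at every position.
curr_out = a @ b[0]
print(curr_out.shape)  # (2, 3, 5), i.e. [B, D, H_{k}]
```

This is why the filter in CIN can hold the weights W^{k,h} for all H_{k} feature maps at once: each output channel is one weighted sum over the m * H_{k-1} pairwise products.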

The code of the CIN module is as follows:

    # Layer, K, glorot_uniform, l2, activation_layer and reduce_sum come from the
    # imports at the top of the DeepCTR module, which this excerpt does not show.
    class CIN(Layer):
        """Compressed Interaction Network used in xDeepFM.

        This implementation is adapted from code that the author of the paper published.

          Input shape
            - 3D tensor with shape: ``(batch_size,field_size,embedding_size)``.

          Output shape
            - 2D tensor with shape: ``(batch_size, featuremap_num)``
              ``featuremap_num = sum(self.layer_size[:-1]) // 2 + self.layer_size[-1]``
              if ``split_half=True``, else ``sum(layer_size)``.

          Arguments
            - **layer_size** : list of int. Feature maps in each layer.
            - **activation** : activation function used on feature maps.
            - **split_half** : bool. If set to True, half of the feature maps in each
              hidden layer will connect to the output unit.
            - **seed** : A Python integer to use as random seed.

          References
            - [Lian J, Zhou X, Zhang F, et al. xDeepFM: Combining Explicit and Implicit
              Feature Interactions for Recommender Systems[J]. arXiv preprint
              arXiv:1803.05170, 2018.]
        """

        def __init__(self, layer_size=(128, 128), activation='relu', split_half=True,
                     l2_reg=1e-5, seed=1024, **kwargs):
            if len(layer_size) == 0:
                raise ValueError(
                    "layer_size must be a list(tuple) of length greater than 1")
            self.layer_size = layer_size
            self.split_half = split_half
            self.activation = activation
            self.l2_reg = l2_reg
            self.seed = seed
            super(CIN, self).__init__(**kwargs)

        def build(self, input_shape):
            if len(input_shape) != 3:
                raise ValueError(
                    "Unexpected inputs dimensions %d, expect to be 3 dimensions"
                    % (len(input_shape)))

            self.field_nums = [int(input_shape[1])]
            self.filters = []
            self.bias = []
            for i, size in enumerate(self.layer_size):
                ## layer_size corresponds to H_{k} in the paper: the number of
                ## feature maps in each CIN layer.
                ## self.filters[i] has shape [1, m * H_{k-1}, H_{k}]
                self.filters.append(
                    self.add_weight(name='filter' + str(i),
                                    shape=[1, self.field_nums[-1] * self.field_nums[0], size],
                                    dtype=tf.float32,
                                    initializer=glorot_uniform(seed=self.seed + i),
                                    regularizer=l2(self.l2_reg)))
                ## self.bias[i] has shape [H_{k}]
                self.bias.append(
                    self.add_weight(name='bias' + str(i),
                                    shape=[size],
                                    dtype=tf.float32,
                                    initializer=tf.keras.initializers.Zeros()))

                if self.split_half:
                    if i != len(self.layer_size) - 1 and size % 2 > 0:
                        raise ValueError(
                            "layer_size must be even number except for the last layer "
                            "when split_half=True")
                    self.field_nums.append(size // 2)
                else:
                    self.field_nums.append(size)

            self.activation_layers = [activation_layer(self.activation)
                                      for _ in self.layer_size]

            super(CIN, self).build(input_shape)  # Be sure to call this somewhere!

        def call(self, inputs, **kwargs):
            ## inputs has shape [B, m, D], where m is the number of fields and
            ## D is the embedding size; the symbols in my comments follow the
            ## paper as closely as possible.
            if K.ndim(inputs) != 3:
                raise ValueError(
                    "Unexpected inputs dimensions %d, expect to be 3 dimensions"
                    % (K.ndim(inputs)))

            dim = int(inputs.get_shape()[-1])  # D
            hidden_nn_layers = [inputs]
            final_result = []

            ## split_tensor0 is the list [x1, x2, ..., xD], where each xi has
            ## shape [B, m, 1]
            split_tensor0 = tf.split(hidden_nn_layers[0], dim * [1], 2)
            for idx, layer_size in enumerate(self.layer_size):
                ## split_tensor is the list [t1, t2, ..., tD], i.e. D tensors,
                ## where each ti has shape [B, H_{k-1}, 1]
                split_tensor = tf.split(hidden_nn_layers[-1], dim * [1], 2)

                ## dot_result_m is a tensor of shape [D, B, m, H_{k-1}]
                dot_result_m = tf.matmul(
                    split_tensor0, split_tensor, transpose_b=True)

                ## dot_result_o has shape [D, B, m * H_{k-1}]
                dot_result_o = tf.reshape(
                    dot_result_m,
                    shape=[dim, -1, self.field_nums[0] * self.field_nums[idx]])

                ## dot_result has shape [B, D, m * H_{k-1}]
                dot_result = tf.transpose(dot_result_o, perm=[1, 0, 2])

                ## Wow, you can write it like this. Brilliant!
                ## self.filters[idx] has shape [1, m * H_{k-1}, H_{k}],
                ## so curr_out has shape [B, D, H_{k}]
                curr_out = tf.nn.conv1d(
                    dot_result, filters=self.filters[idx], stride=1, padding='VALID')

                ## self.bias[idx] has shape [H_{k}],
                ## so curr_out still has shape [B, D, H_{k}]
                curr_out = tf.nn.bias_add(curr_out, self.bias[idx])

                ## curr_out has shape [B, D, H_{k}]
                curr_out = self.activation_layers[idx](curr_out)

                ## curr_out has shape [B, H_{k}, D]
                curr_out = tf.transpose(curr_out, perm=[0, 2, 1])

                if self.split_half:
                    if idx != len(self.layer_size) - 1:
                        next_hidden, direct_connect = tf.split(
                            curr_out, 2 * [layer_size // 2], 1)
                    else:
                        direct_connect = curr_out
                        next_hidden = 0
                else:
                    direct_connect = curr_out
                    next_hidden = curr_out

                final_result.append(direct_connect)
                hidden_nn_layers.append(next_hidden)

            ## Assume for now that the self.split_half branch is not taken; then
            ## result has shape [B, sum(H_{k}), D] (k = 1 -> T, T being the total
            ## number of CIN layers)
            result = tf.concat(final_result, axis=1)

            ## the final shape of result is [B, sum(H_{k})]
            result = reduce_sum(result, -1, keep_dims=False)

            return result
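The docstring's rule for the output width can be checked with a few lines of plain Python. The helper `cin_output_dim` is hypothetical (DeepCTR has its own shape logic); it only restates the `featuremap_num` rule quoted in the docstring.

```python
def cin_output_dim(layer_size, split_half=True):
    """Width of the CIN output, following the featuremap_num rule in the docstring."""
    if split_half:
        # Every layer except the last sends only half of its feature maps to the output.
        return sum(layer_size[:-1]) // 2 + layer_size[-1]
    return sum(layer_size)


print(cin_output_dim((128, 128)))                    # 192
print(cin_output_dim((128, 128), split_half=False))  # 256
```

With the default `layer_size=(128, 128)` and `split_half=True`, the first layer contributes 64 maps and the last layer all 128, which gives 192 values per sample after the sum over the embedding axis.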

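Summary

This post was a bit of a hassle and took quite a while to write: I started in the morning and, after a lot of dawdling, finally finished it. One thing I have to mention: yesterday I uninstalled the Bilibili app from my phone, and it feels like I suddenly have a lot more time in my life…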
