Open Access

CAN: Effective Cross Features by Global Attention Mechanism and Neural Network for Ad Click Prediction

Nanjing University of Posts and Telecommunications (NJUPT), Nanjing, 210003, China
Digital Media Department in the Faculty of Computer and Information Sciences, Hosei University, Tokyo 163-8001, Japan
Networked Information Systems Laboratory, Department of Human Informatics and Cognitive Sciences, Waseda University, Tokyo 163-8001, Japan

Abstract

Click-through rate (CTR) prediction for online advertising aims to estimate the probability that a user clicks an ad, and it has developed considerably in recent years. A central topic in this area is the construction of feature interactions that support accurate prediction. Factorization machines provide second-order feature interactions by taking products of hidden (latent) feature factors. However, real-world data have a complex, nonlinear structure, so second-order feature interactions cannot represent cross information adequately. Deep neural networks (DNNs) address this drawback by enabling high-order nonlinear feature interactions. However, deep DNN-based structures are difficult to optimize because the original features contain no cross information. In this study, we propose an effective CTR prediction algorithm called CAN, which explicitly exploits the benefits of attention mechanisms and DNN models. The attention mechanism provides rich and expressive low-order feature interactions and facilitates the optimization of DNN-based predictors, which implicitly incorporate high-order nonlinear feature interactions. Experiments on two real-world datasets demonstrate that the proposed CAN model outperforms other cross-feature-based and DNN-based predictors.
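
To make the second-order interaction of a factorization machine concrete, the following is a minimal sketch of the standard pairwise term, the sum over i < j of <v_i, v_j> x_i x_j, where v_i is the latent factor vector of feature i. It illustrates only the factorization-machine baseline mentioned in the abstract, not the CAN model itself. The function names, the toy input x, and the factor matrix V are assumptions chosen for illustration.

```python
def fm_second_order(x, V):
    """Pairwise FM term computed in O(n * k) time.

    Uses the standard identity
        sum_{i<j} <v_i, v_j> x_i x_j
          = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ].
    x is a list of n feature values; V is an n x k list of latent factors.
    """
    n, k = len(x), len(V[0])
    total = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))            # sum_i v_if x_i
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))  # sum_i (v_if x_i)^2
        total += s * s - s_sq
    return 0.5 * total


def fm_second_order_naive(x, V):
    """Same term computed directly over all feature pairs, O(n^2 * k)."""
    n, k = len(x), len(V[0])
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(V[i][f] * V[j][f] for f in range(k))   # <v_i, v_j>
            total += dot * x[i] * x[j]
    return total


# Hypothetical toy input: three one-hot-style features, two latent dimensions.
x = [1.0, 0.0, 1.0]
V = [[0.5, -1.0], [2.0, 0.3], [1.5, 0.25]]
print(fm_second_order(x, V), fm_second_order_naive(x, V))
```

Both functions compute the same quantity; the first form is the one usually used in practice because its cost grows linearly with the number of features. Each pair contributes only a product of latent factors, which is why this term alone cannot express the higher-order nonlinear interactions that the abstract attributes to DNNs.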

Tsinghua Science and Technology
Pages 186-195
Cite this article:
Cai W, Wang Y, Ma J, et al. CAN: Effective Cross Features by Global Attention Mechanism and Neural Network for Ad Click Prediction. Tsinghua Science and Technology, 2022, 27(1): 186-195. https://doi.org/10.26599/TST.2020.9010053