Journal of Information Resources Management ›› 2022, Vol. 12 ›› Issue (4): 121-130. doi: 10.13365/j.jirm.2022.04.121

• Research Article •

An Empirical Study on the User Churn Prediction of Paid Knowledge Live

Xing Shaoyan, Zhu Xuefang

  1. School of Information Management, Nanjing University, Nanjing, 210023
  • Online: 2022-07-26 Published: 2022-09-18
  • About the authors: Xing Shaoyan, master's student, research interests: digital information resource management and services, Email: mg1914028@smail.nju.edu.cn; Zhu Xuefang (corresponding author), Ph.D., professor, doctoral supervisor, research interests: digital information resource management and services, multimedia information processing, etc., Email: xfzhu@nju.edu.cn.

Abstract: This paper applies machine learning classification algorithms to build and empirically evaluate user churn prediction models for paid knowledge live streaming, analyzes the predictive feature variables, and provides a decision-making basis for user retention management. Using Zhihu Live as the data source, it collects six features along two dimensions, user value and user evaluation: time of most recent consumption, average monthly consumption frequency, average amount per consumption, time of first consumption, rating, and comment text. Prediction models are built with six machine learning algorithms and their predictive performance is compared. The contribution of each feature variable to churn prediction is then analyzed, churned users are grouped into types according to the key feature variables, and corresponding retention strategies are proposed. The results show that rating and comment-text sentiment contribute significantly to churn prediction. The XGBoost model, which is based on ensemble learning, performs best overall, followed by random forest, which confirms the superior generalization performance of ensemble learning. Analysis of the important variables affecting churn prediction yields four types of churned users.

Key words: Machine learning, Knowledge online live, Paid knowledge, User churn, Prediction effect, User value, User evaluation
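The four user value features named in the abstract (most recent consumption time, monthly average consumption times, average consumption amount, first consumption time) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the `Purchase` record layout, the function name `user_value_features`, the 30-day month, the one-month minimum activity span, and the observation date. The paper's actual feature definitions and data schema are not given on this page.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Purchase:
    """One paid Live purchase (hypothetical record layout)."""
    user_id: str
    day: date
    amount: float  # price paid


def user_value_features(purchases, observed_on):
    """Return {user_id: feature dict} with four user value features.

    Time features are expressed as days before `observed_on`
    (an assumed convention, not taken from the paper).
    """
    by_user = defaultdict(list)
    for p in purchases:
        by_user[p.user_id].append(p)

    features = {}
    for user_id, rows in by_user.items():
        first = min(r.day for r in rows)
        last = max(r.day for r in rows)
        # Activity span in 30-day months, floored at one month so that
        # very new users do not get an inflated monthly rate (assumption).
        months_active = max((observed_on - first).days / 30.0, 1.0)
        features[user_id] = {
            "days_since_last": (observed_on - last).days,
            "days_since_first": (observed_on - first).days,
            "purchases_per_month": len(rows) / months_active,
            "avg_amount": sum(r.amount for r in rows) / len(rows),
        }
    return features


# Toy purchase log with made-up values, for illustration only.
log = [
    Purchase("u1", date(2019, 1, 1), 9.9),
    Purchase("u1", date(2019, 3, 2), 19.9),
    Purchase("u2", date(2019, 5, 20), 29.0),
]
features = user_value_features(log, observed_on=date(2019, 6, 30))
```

A table built this way, joined with the rating and a sentiment score derived from the comment text, would give one row per user that a classifier such as XGBoost or random forest could be trained on.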
