Journal of Information Resources Management ›› 2015, Vol. 5 ›› Issue (2): 63-69. doi: 10.13365/j.jirm.2015.02.063

• Research Paper •

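Automatic Evaluation of Baidu Encyclopedia Web Pages

Tong Zhaojuan, Xu Xin

  • Received: 2014-07-10 Online: 2015-04-26 Published: 2015-04-26
  • About the authors: Tong Zhaojuan, female, assistant librarian, master's degree; research interests: digital libraries and information resource development. Xu Xin, male, associate professor, Ph.D.; research interests: management information systems and web information processing and analysis. Email: xxu@infor.ecnu.edu.cn.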

Abstract:

This paper presents a highly automated and practical method for evaluating the quality of Baidu Encyclopedia web pages. It first explains why such evaluation is necessary and reviews the state of web page quality evaluation in China and abroad. It then outlines the automatic evaluation approach, which consists of determining and automatically extracting web page features, training an evaluation model, and automatically evaluating page quality. Using Baidu Encyclopedia pages related to Chinese culinary culture as the experimental subjects, the paper compares classification results, selects the J48 classifier to carry out the automatic evaluation, and examines how each feature influences the evaluation results. Finally, it discusses the limitations of this automatic evaluation method and directions for future research.

Key words: web page quality, Baidu Encyclopedia, automatic evaluation
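
As a rough illustration of the first two steps named in the abstract (extracting page features and training a classifier), the following Python sketch may help. It is not the authors' implementation: the feature set (`text_chars`, `links`, `images`, `headings`) and all function names are assumptions made for illustration, since the abstract does not list the features used. J48 is Weka's implementation of the C4.5 decision tree; rather than a full tree, the sketch shows only the gain ratio criterion that C4.5 uses to score a candidate split.

```python
import math
from collections import Counter
from html.parser import HTMLParser


class PageFeatureExtractor(HTMLParser):
    """Counts a few structural features of one HTML page (hypothetical feature set)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()
        self.text_chars = 0

    def handle_starttag(self, tag, attrs):
        # Tally only the tags this sketch treats as quality signals.
        if tag in ("a", "img", "h2", "h3"):
            self.counts[tag] += 1

    def handle_data(self, data):
        # Counts every text node, including script/style content if present.
        self.text_chars += len(data.strip())


def extract_features(html):
    """Return a feature dictionary for one page given its HTML source."""
    parser = PageFeatureExtractor()
    parser.feed(html)
    parser.close()
    return {
        "text_chars": parser.text_chars,
        "links": parser.counts["a"],
        "images": parser.counts["img"],
        "headings": parser.counts["h2"] + parser.counts["h3"],
    }


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on one numeric feature."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    weights = (len(left) / n, len(right) / n)
    gain = entropy(labels) - weights[0] * entropy(left) - weights[1] * entropy(right)
    split_info = -sum(w * math.log2(w) for w in weights)
    return gain / split_info
```

In such a setup, `extract_features` would be run over each downloaded page, and `gain_ratio` could then be used to compare candidate thresholds on a feature against manually assigned quality labels. A real experiment would more likely pass the extracted features to an existing decision tree implementation such as Weka's J48.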

CLC Number: