Journal of Information Resources Management ›› 2023, Vol. 13 ›› Issue (5): 137-148. doi: 10.13365/j.jirm.2023.05.137

The Identification of the Core Factors of Highly Cited Papers

Xu Linyu   

  1. School of Management, Xuzhou Medical University, Xuzhou, 221004
  • Online: 2023-09-26 Published: 2023-10-15

Abstract: Highly cited papers carry considerable academic discourse power and reference value. Identifying the core factors behind highly cited papers is important for helping academic papers attract citations and for building and strengthening competitive advantage. This paper combines literature extraction with a questionnaire to extract and screen a set of internal and external factors that influence academic papers. It then uses logistic regression to examine the linear and nonlinear effects of these factors on whether a paper becomes highly cited. Finally, it applies several classical machine learning classification algorithms to test the robustness of these results. The results show that the quality and the age of references both have a significant positive linear effect on the formation of highly cited papers, and that the quadratic term reinforces this effect as the variable values increase. Journal reputation has an approximately linear effect on the formation of highly cited papers. Author reputation, usage, and initial citations also have a significant positive linear effect, but the quadratic term weakens it as the variable values increase, producing a semi-"inverted U" trend that first rises and then levels off. Classical classification algorithms such as decision tree, naive Bayes, and random forest all predict highly cited papers well, which indicates that the findings of this paper are robust.
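
The linear and quadratic effects described in the abstract can be read against a generic logistic model with a squared term. The form below is only an illustrative sketch: the abstract does not give the paper's exact specification, and the symbols x_i (one influencing factor), z_i (the remaining factors), and the coefficients are assumptions introduced here.

```latex
% Assumed form: y_i = 1 if paper i is highly cited, 0 otherwise.
\[
  \operatorname{logit} P(y_i = 1)
    = \ln\frac{P(y_i = 1)}{1 - P(y_i = 1)}
    = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \gamma^{\top} z_i
\]
% Marginal effect of x_i on the log-odds:
\[
  \frac{\partial \operatorname{logit} P(y_i = 1)}{\partial x_i}
    = \beta_1 + 2\beta_2 x_i
\]
```

Under this assumed form, with a positive linear coefficient, a positive quadratic coefficient makes the marginal effect grow with x_i (the reinforcing pattern reported for reference quality and age), a quadratic coefficient near zero gives an approximately linear effect (journal reputation), and a negative quadratic coefficient makes the effect shrink toward zero as x_i approaches -\beta_1 / (2\beta_2), which matches a semi-"inverted U" curve that rises and then levels off when the observed values lie mostly below that point.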

Key words: Highly cited papers, Core factors, Preferential attachment, Logistic regression, Machine learning classification algorithm
