Journal of Information Resources Management, 2025, Vol. 15, Issue (4): 129-143. DOI: 10.13365/j.jirm.2025.04.129


Research on Collaborative Training of Small and Large Language Models for Scientific Entity Extraction with Few-Shot Data

Liang Zhu1,2, Liu Yinpeng1,2, Shi Xiang1,2, Huang Yong1,2, Cheng Qikai1,2

  1. School of Information Management, Wuhan University, Wuhan 430072;
  2. Institute of Intelligence and Innovation Governance, Wuhan University, Wuhan 430072
  • Online: 2025-07-26  Published: 2025-08-31
  • About authors: Liang Zhu, Ph.D. candidate, research interests include information retrieval and data mining; Liu Yinpeng, Ph.D. candidate, research interests include text mining and document intelligence; Shi Xiang, Ph.D. candidate, research interests include text mining and document intelligence; Huang Yong, associate professor, Ph.D., research interests include text mining and scientometrics; Cheng Qikai (corresponding author), associate professor, Ph.D., research interests include text mining and information retrieval, Email: chengqikai@whu.edu.cn.
  • Supported by:
    This work is supported by the National Science and Technology Major Project "Key Technologies Research and Development for High-Reliability Sci-Tech Literature Intelligent Engine and Its Demonstration Application" (2023ZD0121502), and by the National Natural Science Foundation of China through the project "Argumentation Logic Recognition of Scientific Proposition Text Based on Machine Reading Comprehension" (72174157) and the key project "Data and Intelligence Empowered Theoretic Change of Scientific Information Resource and Knowledge Management Theory" (72234005).

Abstract: To address the high resource consumption, long processing times, and poor scalability of scientific entity extraction, this paper proposes a collaborative training framework that combines the strengths of small and large language models. The framework's training effectiveness is evaluated under different data scales. On four datasets from different fields (NCBI, BC4CHEMD, S800, and SCIERC), the method is shown to achieve results comparable to full-data fine-tuning in few-shot settings. The paper provides an in-depth analysis of the limitations of large language model prediction strategies for scientific entity extraction, and systematically tests, over multiple rounds of collaborative training, how small and large language models perform at different data scales. In addition, from the dual perspectives of small-model recognition strategies and training-data similarity, the paper examines why the proposed framework improves performance. The collaborative training framework makes it possible to exploit the cognitive advantages of large language models and the low-cost, high-efficiency operation of small models at the same time, thereby better supporting efficient extraction of bibliometric information in low-resource, few-shot environments.
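The abstract describes an iterative small-large model collaboration rather than a concrete algorithm, so the sketch below is only one plausible reading of such a loop, assuming a BIO-tagged entity extraction setup: a large model pseudo-labels unlabeled sentences, a confidence filter selects reliable labels, and a small tagger is re-trained on the growing pool. All function names (llm_annotate, train_small_model) and the filtering heuristic are illustrative placeholders, not the authors' implementation.

```python
# Hypothetical sketch of a few-shot small/large-model collaborative training loop.
# Concrete prompting, filtering, and fine-tuning steps are placeholders.

from typing import List, Tuple

Sentence = List[str]   # tokenized sentence
Labels = List[str]     # BIO tags, one per token


def llm_annotate(sentences: List[Sentence]) -> List[Labels]:
    """Placeholder: query a large language model to produce BIO tags
    for unlabeled sentences (e.g., via a prompted extraction call)."""
    return [["O"] * len(s) for s in sentences]  # stub output


def train_small_model(train_set: List[Tuple[Sentence, Labels]]):
    """Placeholder: fine-tune a small tagger (e.g., a BERT-style encoder)
    on the current labeled pool and return the trained model."""
    class DummyTagger:
        def predict(self, s: Sentence) -> Labels:
            return ["O"] * len(s)

        def confidence(self, s: Sentence) -> float:
            return 1.0
    return DummyTagger()


def collaborative_training(
    seed_set: List[Tuple[Sentence, Labels]],  # few-shot gold data
    unlabeled: List[Sentence],
    rounds: int = 3,
    batch_size: int = 100,
    conf_threshold: float = 0.9,
):
    """Iteratively grow the small model's training pool with LLM pseudo-labels."""
    labeled_pool = list(seed_set)
    for _ in range(rounds):
        small_model = train_small_model(labeled_pool)

        # Large model annotates the next batch of unlabeled sentences.
        batch, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        pseudo_labels = llm_annotate(batch)

        # Keep only sentences the small model also scores as confident;
        # this agreement filter is an assumption made for illustration.
        for sent, tags in zip(batch, pseudo_labels):
            if small_model.confidence(sent) >= conf_threshold:
                labeled_pool.append((sent, tags))

    return train_small_model(labeled_pool)
```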

Key words: Few-shot, Large language model, Model distillation, Scientific entities, Entity extraction
