Journal of Information Resources Management, 2025, Vol. 15, Issue (4): 129-143. DOI: 10.13365/j.jirm.2025.04.129


Research on Collaborative Training of Small and Large Language Models for Scientific Entity Extraction with Few-Shot Data

Liang Zhu1,2, Liu Yinpeng1,2, Shi Xiang1,2, Huang Yong1,2, Cheng Qikai1,2

  1. School of Information Management, Wuhan University, Wuhan 430072;
  2. Institute of Intelligence and Innovation Governance, Wuhan University, Wuhan 430072
  • Online: 2025-07-26  Published: 2025-08-31
  • About authors: Liang Zhu, Ph.D. candidate, research interests include information retrieval and data mining; Liu Yinpeng, Ph.D. candidate, research interests include text mining and document intelligence; Shi Xiang, Ph.D. candidate, research interests include text mining and document intelligence; Huang Yong, associate professor, Ph.D., research interests include text mining and scientometrics; Cheng Qikai (corresponding author), associate professor, Ph.D., research interests include text mining and information retrieval, Email: chengqikai@whu.edu.cn.
  • Supported by:
    This work is supported by the National Science and Technology Major Project "Key Technologies Research and Development for High-Reliability Sci-Tech Literature Intelligent Engine and Its Demonstration Application" (2023ZD0121502), and by the National Natural Science Foundation of China through the project "Argumentation Logic Recognition of Scientific Proposition Text Based on Machine Reading Comprehension" (72174157) and the key project "Data and Intelligence Empowered Theoretic Change of Scientific Information Resource and Knowledge Management Theory" (72234005).

Abstract: To address the high resource consumption, long processing times, and poor scalability of scientific entity extraction, this paper proposes a collaborative training framework that combines the strengths of small and large language models. The framework's training effectiveness is evaluated under different data scales. On four datasets from different fields (NCBI, BC4CHEMD, S800, and SCIERC), the method is shown to achieve results comparable to full-data fine-tuning in few-shot settings. The paper provides an in-depth analysis of the limitations of large language model prediction strategies for scientific entity extraction, and systematically tests, over multiple rounds of collaborative training, how small and large language models perform at different data scales. In addition, from the dual perspectives of small-model recognition strategies and training-data similarity, the paper examines why the proposed framework improves performance. The collaborative training framework makes it possible to exploit the cognitive advantages of large language models and the low-cost, high-efficiency operation of small models at the same time, thereby better supporting efficient extraction of bibliometric information in low-resource, few-shot environments.
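The abstract describes an iterative small-large model collaboration rather than a concrete algorithm, so the sketch below is only one plausible reading of such a loop, assuming a BIO-tagged entity extraction setup: a large model pseudo-labels unlabeled sentences, a confidence filter selects reliable labels, and a small tagger is re-trained on the growing pool. All function names (llm_annotate, train_small_model) and the filtering heuristic are illustrative placeholders, not the authors' implementation.

```python
# Hypothetical sketch of a few-shot small/large-model collaborative training loop.
# Concrete prompting, filtering, and fine-tuning steps are placeholders.

from typing import List, Tuple

Sentence = List[str]   # tokenized sentence
Labels = List[str]     # BIO tags, one per token


def llm_annotate(sentences: List[Sentence]) -> List[Labels]:
    """Placeholder: query a large language model to produce BIO tags
    for unlabeled sentences (e.g., via a prompted extraction call)."""
    return [["O"] * len(s) for s in sentences]  # stub output


def train_small_model(train_set: List[Tuple[Sentence, Labels]]):
    """Placeholder: fine-tune a small tagger (e.g., a BERT-style encoder)
    on the current labeled pool and return the trained model."""
    class DummyTagger:
        def predict(self, s: Sentence) -> Labels:
            return ["O"] * len(s)

        def confidence(self, s: Sentence) -> float:
            return 1.0
    return DummyTagger()


def collaborative_training(
    seed_set: List[Tuple[Sentence, Labels]],  # few-shot gold data
    unlabeled: List[Sentence],
    rounds: int = 3,
    batch_size: int = 100,
    conf_threshold: float = 0.9,
):
    """Iteratively grow the small model's training pool with LLM pseudo-labels."""
    labeled_pool = list(seed_set)
    for _ in range(rounds):
        small_model = train_small_model(labeled_pool)

        # Large model annotates the next batch of unlabeled sentences.
        batch, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        pseudo_labels = llm_annotate(batch)

        # Keep only sentences the small model also scores as confident;
        # this agreement filter is an assumption made for illustration.
        for sent, tags in zip(batch, pseudo_labels):
            if small_model.confidence(sent) >= conf_threshold:
                labeled_pool.append((sent, tags))

    return train_small_model(labeled_pool)
```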

Key words: Few-shot, Large language model, Model distillation, Scientific entities, Entity extraction
