Journal of Information Resources Management ›› 2026, Vol. 16 ›› Issue (2): 55-68. doi: 10.13365/j.jirm.2026.02.055

• Research Article •

企业高质量数据集建构:内涵特征、逻辑框架与未来展望

张博睿1 陈桃2 胡婕2 夏义堃1,3   

    1.南京大学数据管理创新研究中心,苏州,215163; 
    2.南京大学-中国移动联合研究院,南京,210029; 
    3.南京大学数据智能与交叉创新实验室,南京,210046
  • 出版日期:2026-03-26 发布日期:2026-06-04
  • 基金资助:
    本文系南京大学-中国移动联合研究院课题“ 江苏移动数据要素布局和路径研究”(NJ20250045)研究成果之一。

Constructing High-Quality Enterprise Data Sets: Connotations, Characteristics, Logical Framework and Future Prospects

Zhang Borui1 Chen Tao2 Hu Jie2 Xia Yikun1,3   

    1. Research Institute for Data Management & Innovation, Nanjing University, Suzhou, 215163;
    2. Nanjing University-China Mobile Joint Research Institute, Nanjing, 210029;
    3. Laboratory of Data Intelligence and Interdisciplinary Innovation, Nanjing University, Nanjing, 210046
  • Online: 2026-03-26 Published: 2026-06-04
  • Supported by:
    This paper is a research outcome of the Nanjing University-China Mobile Joint Research Institute project "Research on the Layout and Path of Data Elements of Jiangsu Mobile" (NJ20250045).

摘要: 深入剖析企业高质量数据集建构的内涵特征与逻辑框架,对于释放企业数据要素潜能、支撑企业数智化转型具有重要意义。首先阐释企业高质量数据集的概念内涵,从场景适配性、质量递进性、智能迭代性、价值扩散性、运营持续性与规制系统性六个维度,系统揭示其关键特征;其次基于社会技术系统理论,构建价值、场景与技术三维逻辑框架,为理解“人工智能+”背景下企业高质量数据集建构提供理论支撑;再次以数据价值链与数据生命周期理论为分析视角,构建涵盖场景需求感知、数据资源编织、知识资源萃取、数知融通提炼与数智服务生态的五维建构模型;最后结合当前企业高质量数据集建构面临的现实困境,提出针对性优化策略。未来需从数据场景塑造、数据汇聚治理、数据标注优化与数据安全保障四个方面协同发力,破解企业高质量数据集建构难题,提升数据要素价值赋能效能,为企业数智化转型与高质量发展提供支撑。

关键词: 人工智能+, 企业数据, 高质量数据集, 数据要素, 数据治理

Abstract: Analyzing the connotations, characteristics, and logical framework of enterprise high-quality dataset construction is essential to unlocking the potential of enterprise data elements and supporting firms' digital-intelligent transformation. This paper first explains the concept of the enterprise high-quality dataset and identifies its key characteristics along six dimensions: scenario adaptability, progressive quality, intelligent iteration, value diffusion, operational sustainability, and systematic regulation. Second, drawing on socio-technical systems theory, it builds a three-dimensional logical framework of value, scenario, and technology, which provides theoretical support for understanding enterprise high-quality dataset construction in the "AI Plus" context. Third, from the perspectives of data value chain and data life cycle theory, it constructs a five-dimensional construction model covering scenario demand perception, data resource weaving, knowledge resource extraction, data-knowledge integration and refinement, and a digital-intelligence service ecosystem. Finally, in response to the practical difficulties enterprises currently face, it proposes targeted optimization strategies. Future efforts should work in concert on four fronts, namely data scenario shaping, data aggregation and governance, data labeling optimization, and data security assurance, to overcome the difficulties of constructing enterprise high-quality datasets, make data elements more effective at creating value, and support enterprises' digital-intelligent transformation and high-quality development.

Key words: Artificial intelligence plus, Enterprise data, High-quality datasets, Data elements, Data governance

CLC number: