Journal of Information Resources Management (信息资源管理学报), 2025, Vol. 15, Issue 5: 14-20. doi: 10.13365/j.jirm.2025.05.014

• Viewpoint Paper •

Beyond Text-Centrism: The Transformation of Chinese Digital Humanities Driven by Multimodal Technologies

Liu Wei (刘炜)¹, Shan Rongrong (单蓉蓉)², Jin Jiaqin (金家琴)²,³

  1. Shanghai Academy of Social Sciences Information Research Institute, Shanghai, 200235;
  2. School of Cultural Heritage and Information Management, Shanghai University, Shanghai, 200444;
  3. Shanghai Library, Shanghai, 200031
  • Online: 2025-09-26 Published: 2025-10-31
  • Supported by:
    This study is one of the outcomes of the Major Project of the National Social Science Fund of China "Theories and Practices of Digital Literacy Improvement for All in the Intelligence Age" (24&ZD180).

Abstract: Digital humanities research has long centered on textual analysis, yet this "text-centrism" paradigm shows clear limitations in the Chinese context: insufficient character set coverage, low OCR accuracy, and the loss of non-textual cultural information. Together these problems restrict the systematic study of China's tangible and intangible cultural heritage. The rise of multimodal technologies opens a path of transformation for Chinese digital humanities. This paper first analyzes the limitations of text-centrism, then examines key breakthroughs in multimodal fusion technology. Taking DeepSeek's Janus Pro model as an example, it argues for the application potential of unified multimodal large models in the datafication of ancient books, the construction of intelligent agents, and the preservation of cultural heritage. The study shows that multimodal technology reconstructs cultural memory and strengthens cultural identity through cross-modal synergy, giving the transformation of Chinese digital humanities a dual foundation of technical support and methodological reconstruction.

Key words: Digital humanities, Text-centrism, Unified multimodal large models, DeepSeek, Multimodal technology

CLC Number: