Journal of Information Resources Management (信息资源管理学报) ›› 2012, Vol. 2 ›› Issue (1): 50-58.

• Research Article •

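Methods for Mining Bibliographic Record Information and Their Implementation in the Software SATI: A Case Study of Chinese and International Library and Information Science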

刘启元 叶鹰   

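  • Received: 2011-10-30 Published: 2012-03-26 Released: 2012-03-26
  • About the authors: Liu Qiyuan (刘启元), male, master's student; Ye Ying (叶鹰), male, professor and doctoral supervisor.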

A Study on Mining Bibliographic Records by Designed Software SATI: Case Study on Library and Information Science

Liu Qiyuan, Ye Ying

  • Received: 2011-10-30 Online: 2012-03-26 Published: 2012-03-26

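Abstract (translated from the Chinese):

SATI, a statistical analysis tool for bibliographic records, was designed and developed in C# on the .NET platform. It imports domestic bibliographic records in EndNote, NoteExpress, and NoteFirst formats, as well as international bibliographic records from WoS in HTML format. It converts data formats, extracts field information, counts term frequencies, and builds knowledge-unit co-occurrence matrices, year-by-year term frequency distribution matrices, and document-term matrices. These outputs support the generation of visualizations such as cluster diagrams, multidimensional scaling maps, network knowledge maps, and strategic diagrams. Using data on 17,440 papers published from 2006 to 2010 in ten representative core journals each from Chinese and international library and information science as a case, the study presents three main research areas of Chinese and international library and information science, based on the results of cluster analysis and multidimensional scaling analysis. Combining co-word analysis with social network analysis, it then draws co-occurrence network knowledge maps and strategic diagrams to further reveal the internal relations and characteristics of the research-area structure.

Keywords: SATI, co-word analysis, cluster analysis, multidimensional scaling analysis, knowledge map, strategic diagram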
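The document-term matrix and the year-by-year term frequency matrix named in the abstract can be illustrated with a minimal Python sketch. This is not SATI's code (SATI is written in C#); the `records` sample, the field names `year` and `keywords`, and both function names are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical records: each has a publication year and a keyword list.
records = [
    {"year": 2006, "keywords": ["co-word analysis", "cluster analysis"]},
    {"year": 2006, "keywords": ["knowledge map"]},
    {"year": 2007, "keywords": ["co-word analysis", "knowledge map"]},
]

def document_term_matrix(records):
    """Return (terms, matrix); matrix[i][j] is 1 if record i contains term j."""
    terms = sorted({kw for rec in records for kw in rec["keywords"]})
    index = {term: j for j, term in enumerate(terms)}
    matrix = [[0] * len(terms) for _ in records]
    for i, rec in enumerate(records):
        for kw in rec["keywords"]:
            matrix[i][index[kw]] = 1
    return terms, matrix

def yearly_term_matrix(records):
    """Return (terms, years, matrix); matrix[i][j] counts records with term i in year j."""
    # set() guards against a keyword repeated within one record.
    counts = Counter(
        (kw, rec["year"]) for rec in records for kw in set(rec["keywords"])
    )
    terms = sorted({kw for kw, _ in counts})
    years = sorted({yr for _, yr in counts})
    # Counter returns 0 for (term, year) pairs that never occurred.
    matrix = [[counts[(t, y)] for y in years] for t in terms]
    return terms, years, matrix
```

Rows of the yearly matrix can be read as a term's trend over time, while the document-term matrix is the kind of input that clustering or multidimensional scaling routines typically take.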

Abstract:

SATI (Statistical Analysis Toolkit for Informetrics), a software tool for analyzing bibliographic information, was developed in C# on the Microsoft .NET platform. Domestic data in EndNote, NoteExpress, and NoteFirst formats can be imported into SATI, as can international data exported from WoS as HTML. To support the generation of clustering graphs, multidimensional scaling maps, network knowledge maps, and strategic diagrams, four basic functions have been implemented: transforming raw data into XML, extracting selected elements, counting term frequencies, and building knowledge-unit matrices. Taking as its sample 17,440 articles published from 2006 to 2010 in ten core Library and Information Science (LIS) journals each from China and abroad, this paper identifies three potential research fields in LIS based on the consistency between the results of clustering analysis and multidimensional scaling analysis, and characterizes the relations and features of the subject areas by interpreting the network knowledge map and the strategic diagram.

Key words: SATI, Co-word analysis, Cluster analysis, Multidimensional scaling analysis, Knowledge mapping, Strategic diagram
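As a rough illustration of the term-frequency counting and knowledge-unit matrix steps that underlie co-word analysis, the following Python sketch builds a symmetric co-occurrence matrix from keyword lists. It is not SATI's implementation; the sample data, the function names, the `top_n` parameter, and the choice to put each term's own frequency on the diagonal are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per bibliographic record.
keyword_lists = [
    ["co-word analysis", "cluster analysis", "knowledge mapping"],
    ["co-word analysis", "knowledge mapping"],
    ["cluster analysis"],
]

def term_frequencies(keyword_lists):
    """Count the number of records in which each term appears."""
    return Counter(kw for kws in keyword_lists for kw in set(kws))

def cooccurrence_matrix(keyword_lists, top_n):
    """Build a symmetric co-occurrence matrix over the top_n most frequent terms.

    Off-diagonal cells count records containing both terms; the diagonal
    holds each term's own frequency (an assumed convention).
    """
    freq = term_frequencies(keyword_lists)
    # Sort by descending frequency, then alphabetically, for a stable order.
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    terms = [term for term, _ in ranked][:top_n]
    index = {term: i for i, term in enumerate(terms)}
    matrix = [[0] * len(terms) for _ in terms]
    for kws in keyword_lists:
        present = sorted({kw for kw in kws if kw in index})
        for kw in present:
            matrix[index[kw]][index[kw]] += 1
        for a, b in combinations(present, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return terms, matrix

terms, matrix = cooccurrence_matrix(keyword_lists, top_n=3)
```

A matrix of this shape is the usual starting point for clustering, multidimensional scaling, and network visualization; restricting it to the most frequent terms keeps it small enough to interpret.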

CLC number: