Journal of Library and Information Science in Agriculture (农业图书情报学报), 2015, 27(2): 57-59

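Research on Automatic Extraction of Dissertation Metadata Based on Features and Rule Patterns

Author:
CHEN Shu-ping (陈淑平)
Affiliation:
Yanshan University Library (燕山大学图书馆)
Keywords:
dissertation; metadata; information extraction; regular expression; pattern matching
Abstract:
Dissertation databases are an important digital resource in the digital libraries of Chinese universities, yet their metadata has always been entered by hand, which is inefficient and labor-intensive. To address this problem, this study uses regular expressions to extract dissertation metadata automatically, following an approach based on document features and rule-pattern matching. The algorithm comprises two modules: information location and metadata extraction. Experimental data show that the algorithm achieves high precision and recall, as well as a high overall performance index F.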
English title:
Automatic Extraction of Metadata Information for Dissertation based on Feature and Rule Pattern
Author:
CHEN Shu-ping (Library of Yanshan University)
Keywords:
Dissertation; Metadata; Information extraction; Regular expression; Pattern matching
Abstract:
Dissertation databases are an important digital resource in our digital library. However, metadata entry has relied on manual work, which is inefficient and costs a great deal of manpower. To address this problem, we apply a method based on document features and rule-pattern matching, using regular expressions to extract dissertation metadata automatically. The algorithm consists of two modules: information field location and metadata extraction. Experimental data show that the algorithm achieves high precision and recall, as well as a high overall performance index F.
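The two-module design described in the abstract (field location, then metadata extraction) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual rules, so everything here is an assumption for illustration: the label patterns in FIELD_PATTERNS, the function names, and the reading of the overall performance index F as the balanced F-measure, 2PR/(P+R).

```python
import re

# Hypothetical label patterns for a dissertation front page. The paper's
# real rules are not stated in the abstract; these are illustrative only.
FIELD_PATTERNS = {
    "title": re.compile(
        r"^(?:Title|题\s*目)\s*[::]\s*(?P<value>.+)$", re.MULTILINE),
    "author": re.compile(
        r"^(?:Author|作\s*者)\s*[::]\s*(?P<value>.+)$", re.MULTILINE),
    "keywords": re.compile(
        r"^(?:Keywords|关\s*键\s*词)\s*[::]\s*(?P<value>.+)$", re.MULTILINE),
}


def locate_fields(text):
    """Module 1 (location): find the character span of each labelled value."""
    located = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            located[name] = match.span("value")
    return located


def extract_metadata(text):
    """Module 2 (extraction): slice out and normalise each located value."""
    record = {}
    for name, (start, end) in locate_fields(text).items():
        value = text[start:end].strip()
        if name == "keywords":
            # Split on ASCII or full-width semicolons and commas.
            value = [k.strip() for k in re.split(r"[;;,,]+", value) if k.strip()]
        record[name] = value
    return record


def precision_recall_f(extracted, correct, expected):
    """Return (P, R, F): P = correct/extracted, R = correct/expected,
    F = 2PR/(P+R). Assumes F is the balanced F-measure."""
    p = correct / extracted if extracted else 0.0
    r = correct / expected if expected else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

Calling extract_metadata on text with lines such as "Author: CHEN Shu-ping" returns a dictionary keyed by field name, with keywords split into a list. Separating location from extraction keeps the label rules in one table, so adapting the sketch to a different cover-page layout would only mean editing FIELD_PATTERNS.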
