Journal of Data and Information Science, 2012, 5(4)

A method for improving the accuracy of automatic indexing of Chinese-English mixed documents

Authors:
Zhao Yan; Hui She
Keywords:
documents; automatic indexing; English; mixed; Chinese; the accuracy
Abstract:
Purpose: This paper presents a method for improving the accuracy of automatic indexing of Chinese-English mixed documents. Design/methodology/approach: Based on the inherent characteristics of Chinese-English mixed texts and on cybernetics theory, we propose an integrated control method for indexing documents. It consists of "feed-forward control", "in-progress control" and "feed-back control", and aims to improve the accuracy of automatic indexing of Chinese-English mixed documents. An experiment was conducted to investigate the effect of the proposed method. Findings: The method distinguishes between Chinese and English in terms of grammatical structures and word formation rules. When it was applied in the three phases of automatic indexing of Chinese-English mixed documents, the results were encouraging: precision increased from 88.54% to 97.10% and recall improved from 97.37% to 99.47%. Research limitations: The indexing method is relatively complicated, and the whole indexing process requires substantial human intervention. Because pattern matching is based on a brute-force (BF) approach, indexing efficiency is reduced to some extent. Practical implications: The research has both theoretical significance and practical value for improving the accuracy of automatic indexing of multilingual documents (not only Chinese-English mixed documents). The proposed method will benefit the indexing of life science documents as well as documents in other subject areas. Originality/value: So far, few studies have been published on methods for increasing the accuracy of multilingual automatic indexing. This study provides insights into the automatic indexing of multilingual documents, especially Chinese-English mixed documents.
