Journal of Library and Information Science in Agriculture, 2016, 28(7): 5-9

Analysis of the Latent Semantic Indexing Text Mining Method

Author:
CAI Hao-yuan
Affiliation:
Guangzhou Library
Keywords:
latent semantic indexing; text mining; vector space model (VSM); singular value decomposition (SVD)
Abstract:
This paper introduces the application of latent semantic indexing (LSI) in information retrieval. It describes three methods of term weighting, analyzes the role of singular value decomposition (SVD) in extracting the important information in a matrix, and shows how a reduced-rank approximation of the term-document matrix simulates the way humans understand meaning. It then compares the search algorithms of the vector space model (VSM) and LSI, and uses a worked text mining example on a term-document matrix to show how LSI helps reveal the underlying connections between documents.
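The pipeline summarized in the abstract (term-document matrix, SVD, reduced-rank approximation, then VSM versus LSI search) can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's own example: the toy matrix, the term labels, the choice k = 2, and all function names are assumptions made here for demonstration, and raw counts stand in for whichever term-weighting scheme the paper uses.

```python
import numpy as np

# Hypothetical 5-term x 4-document count matrix (rows: terms, columns: documents).
# These values are illustrative only and do not come from the paper.
terms = ["ship", "boat", "ocean", "tree", "wood"]
A = np.array([
    [1, 0, 1, 0],  # ship
    [0, 1, 0, 0],  # boat
    [1, 1, 0, 0],  # ocean
    [0, 0, 0, 1],  # tree
    [0, 0, 1, 1],  # wood
], dtype=float)


def lsi_fit(A, k):
    """Return the top-k factors U_k, s_k, Vt_k of the SVD A = U diag(s) Vt."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def rank_k_approx(Uk, sk, Vtk):
    """Rebuild the reduced-rank approximation A_k = U_k diag(s_k) Vt_k."""
    return Uk @ np.diag(sk) @ Vtk


def fold_in_query(q, Uk, sk):
    """Map a term-space query into the k-dim latent space: diag(s_k)^-1 U_k^T q."""
    return (Uk.T @ q) / sk


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def vsm_scores(A, q):
    """Vector space model: compare the query with raw document columns."""
    return [cosine(A[:, j], q) for j in range(A.shape[1])]


def lsi_scores(A, q, k):
    """LSI: compare the folded-in query with documents in the latent space."""
    Uk, sk, Vtk = lsi_fit(A, k)
    q_hat = fold_in_query(q, Uk, sk)
    return [cosine(Vtk[:, j], q_hat) for j in range(Vtk.shape[1])]


if __name__ == "__main__":
    # Query containing only the term "boat".
    q = np.array([0, 1, 0, 0, 0], dtype=float)
    print("VSM:", np.round(vsm_scores(A, q), 3))
    print("LSI:", np.round(lsi_scores(A, q, k=2), 3))
```

In the VSM scoring, a document scores zero whenever it shares no term with the query. In the LSI scoring, documents are compared in the k-dimensional latent space, so a document can receive a nonzero score through co-occurrence patterns even without a literal term match. The value of k is a tuning choice in this sketch.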
