Journal of Library and Information Science in Agriculture, 2016, 28(7): 5-9
Analysis of the Latent Semantic Indexing Text Mining Method
Author:
CAI Hao-yuan
Affiliation:
Guangzhou Library
Keywords:
latent semantic indexing (LSI); text mining; vector space model (VSM); singular value decomposition (SVD)
Abstract:
This paper introduces the application of latent semantic indexing (LSI) to information retrieval. It presents three methods of term weighting, analyzes the role of singular value decomposition (SVD) in extracting the important information in a matrix, and shows how a reduced-rank approximation of the term-document matrix simulates the way humans understand meaning. It then compares the search algorithms of the vector space model (VSM) and LSI and, through a text mining example on a term-document matrix, shows how LSI helps uncover the latent connections between documents.
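As a rough illustration of the VSM versus LSI comparison the abstract describes, the following Python sketch scores a one-word query against a toy term-document matrix, first by plain cosine similarity (VSM) and then in a rank-2 SVD space (LSI). The toy matrix, the raw term-frequency weighting, the choice k = 2, and the function names are all assumptions made for illustration; none of them is taken from the paper, whose three weighting schemes are not specified here.

```python
import numpy as np

# Hypothetical toy corpus: rows are terms, columns are documents d0..d3.
# Raw term counts are used as weights (an assumption, not the paper's scheme).
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 0, 0],   # car
    [0, 1, 0, 0],   # automobile
    [1, 1, 0, 0],   # engine
    [0, 0, 1, 1],   # flower
    [0, 0, 1, 0],   # petal
], dtype=float)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def vsm_scores(A, q):
    """VSM: compare the query with each document column in term space."""
    return [cosine(q, A[:, j]) for j in range(A.shape[1])]


def lsi_scores(A, q, k):
    """LSI: compare query and documents in the rank-k latent space."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, s_k, V_k = U[:, :k], s[:k], Vt[:k, :].T
    # Fold the query into the latent space: q_k = Sigma_k^{-1} U_k^T q
    q_k = (U_k.T @ q) / s_k
    # Each row of V_k is one document's coordinates in the latent space.
    return [cosine(q_k, V_k[j]) for j in range(V_k.shape[0])]


# Query containing only the term "car".
q = np.array([1, 0, 0, 0, 0], dtype=float)
vsm = vsm_scores(A, q)
lsi = lsi_scores(A, q, k=2)
print("VSM:", np.round(vsm, 3))
print("LSI:", np.round(lsi, 3))
```

In this toy case, document d1 contains "automobile" and "engine" but not "car", so VSM gives it a score of zero, while the rank-2 LSI space places it alongside d0 because both documents share "engine". This is the kind of latent connection between documents that the abstract attributes to LSI.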