Journal of Central South University of Forestry & Technology, 2014 (8): 114-119

Forestry information text classification algorithm based on the Gaussian mixture model

Authors:
CHEN Yu; XU Li-wei
Affiliation:
School of Information and Computer Engineering, Northeast Forestry University
Keywords:
forestry information; text classification; Gaussian mixture model; parameter estimation
Abstract:
To address the low accuracy and the uneven distribution of per-class accuracy in traditional forestry information text classification algorithms, a forestry information text classification algorithm based on the Gaussian mixture model (GMM) is proposed. Building on the GMM and the EM algorithm for parameter estimation, the method uses TF-IDF to compute feature values for forestry information texts and reduces the dimensionality of the resulting text feature matrix. Combined with the K-means algorithm, the parameters of the GMM corresponding to each category of forestry information text are obtained through training, and a classifier is constructed for accurate and fast classification. Experimental results show that, compared with commonly used classification methods such as neural networks, Bayesian classifiers, and decision trees, the algorithm achieves higher accuracy and practicality, and it offers a new approach to research on forestry information text classification.
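
The pipeline described in the abstract (TF-IDF features, dimensionality reduction, K-means-initialised GMM training per category, then classification) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. The abstract does not name the dimensionality reduction method, the number of mixture components, or the covariance structure, so `TruncatedSVD`, `n_dims`, `n_components`, the diagonal covariance, and the class name `GMMTextClassifier` are all assumptions made here. The sketch also assumes whitespace-tokenised text; Chinese documents would need word segmentation first.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.mixture import GaussianMixture


class GMMTextClassifier:
    """Hypothetical sketch: one Gaussian mixture per category,
    prediction by the highest log-likelihood plus log-prior."""

    def __init__(self, n_dims=2, n_components=1, seed=0):
        self.vectorizer = TfidfVectorizer()
        # Assumption: truncated SVD stands in for the unspecified reduction step.
        self.reducer = TruncatedSVD(n_components=n_dims, random_state=seed)
        self.n_components = n_components
        self.seed = seed

    def fit(self, texts, labels):
        labels = np.asarray(labels)
        # TF-IDF feature matrix, then dimensionality reduction.
        X = self.reducer.fit_transform(self.vectorizer.fit_transform(texts))
        self.classes_ = np.unique(labels)
        self.models_, self.log_priors_ = [], []
        for c in self.classes_:
            Xc = X[labels == c]
            gmm = GaussianMixture(
                n_components=self.n_components,
                covariance_type="diag",   # assumption, keeps the toy example stable
                init_params="kmeans",     # K-means initialisation, refined by EM
                reg_covar=1e-3,           # guards against near-zero variances
                random_state=self.seed,
            )
            self.models_.append(gmm.fit(Xc))
            self.log_priors_.append(np.log(len(Xc) / len(X)))
        return self

    def predict(self, texts):
        X = self.reducer.transform(self.vectorizer.transform(texts))
        # score_samples gives each sample's log-likelihood under a class model.
        scores = np.column_stack(
            [m.score_samples(X) + p for m, p in zip(self.models_, self.log_priors_)]
        )
        return self.classes_[scores.argmax(axis=1)]


# Toy usage with made-up documents (not data from the paper).
train_texts = [
    "pine forest timber harvest",
    "timber harvest pine logging",
    "forest logging timber pine",
    "wildfire smoke forest fire",
    "fire wildfire burn smoke",
    "smoke burn wildfire fire",
]
train_labels = ["timber"] * 3 + ["fire"] * 3

clf = GMMTextClassifier().fit(train_texts, train_labels)
print(clf.predict(["pine timber logging", "wildfire smoke burn"]))
```

Fitting a separate mixture for each category and comparing likelihoods is one common way to turn a GMM into a classifier; whether the paper weights the classes by their priors, as done here, is not stated in the abstract.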
