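Keywords:
tobacco potato virus Y; three-base groups; probability; K-M clustering
Abstract:
Statistical features were extracted from the complete genomes of four tobacco potato virus Y samples from different sources, and the genomes were subjected to cluster analysis. On the base sequence of each complete genome, every base and the two bases that follow it form a three-base group, and these groups are arranged into a new sequence S. The probability with which each of the 64 distinct three-base groups occurs in S is computed, giving a 64-dimensional vector L. Comparing the L vectors of the genomes yields four three-base groups (CAA, GAT, GTA, GAC) whose probabilities differ markedly. The occurrence probabilities of these four three-base groups are strongly associated with genetic variation in the tobacco potato virus Y genome, and the four complete genomes from different sources fall into two major classes according to that variation.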
Translated title:
The Statistical Characteristics of Potato Virus Y Complete Genome
Authors:
YANG Shuo, LI Jian-xue (Xiangcheng Tobacco Monopoly Bureau, Xiangyang 441000, Hubei, China)
Keywords:
potato virus Y; three-base groups; probability; K-M clustering
Abstract:
Statistical characteristics were extracted from four complete potato virus Y (PVY) genomes from different sources, and the genomes were then cluster analyzed. Along each complete PVY genome, every base together with the two bases that follow it forms a three-base group, and these groups were arranged into a new sequence S. A 64-dimensional vector L was then obtained by calculating the appearance probability of each of the 64 three-base groups in S. Comparing the L vectors of the genomes identified four three-base groups (CAA, GAT, GTA, GAC) whose appearance probabilities differed markedly. The appearance probabilities of these four three-base groups are closely related to the genetic variation of PVY, and the four complete PVY genomes were clustered into two groups according to that variation.
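The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `triplet_probabilities`, `most_variable_triplets`, and `kmeans` are hypothetical, the ranking by max-minus-min spread is only one plausible way to find groups that "differ markedly", and reading "K-M clustering" as plain k-means with Euclidean distance is an assumption.

```python
from itertools import product

BASES = "ACGT"
# All 64 three-base groups in a fixed order (AAA, AAC, ..., TTT).
TRIPLETS = ["".join(p) for p in product(BASES, repeat=3)]


def triplet_probabilities(genome):
    """Return the 64-dimensional vector L of overlapping three-base group probabilities."""
    seq = genome.upper()
    counts = dict.fromkeys(TRIPLETS, 0)
    total = 0
    # Each base plus its two successors forms one group; the groups in order are S.
    for i in range(len(seq) - 2):
        group = seq[i:i + 3]
        if group in counts:  # assumption: windows with ambiguous bases (e.g. N) are skipped
            counts[group] += 1
            total += 1
    if total == 0:
        raise ValueError("no valid three-base groups found")
    return [counts[t] / total for t in TRIPLETS]


def most_variable_triplets(vectors, top=4):
    """Rank groups by the spread (max - min) of their probability across genomes."""
    spread = {
        t: max(v[i] for v in vectors) - min(v[i] for v in vectors)
        for i, t in enumerate(TRIPLETS)
    }
    return sorted(TRIPLETS, key=lambda t: spread[t], reverse=True)[:top]


def kmeans(points, k=2, iterations=20):
    """Plain k-means; the initial centers are simply the first k points (an assumption)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to the nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; leave it alone if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

In use, one would call `triplet_probabilities` on each genome string, pass the resulting vectors to `most_variable_triplets` to pick candidate groups, and then run `kmeans` with `k=2` on either the full vectors or only the selected components. Which of the two the authors clustered on is not stated in the abstract.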