ISSN Online (2278-8875) Print (2320-3765)
Comparative Analysis of Advanced Feature Selection Algorithms
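Feature selection is a preprocessing step that identifies a relevant subset of data from high-dimensional data. Several feature selection algorithms are used to identify the required data, such as ReliefF and Parzen-ReliefF, which try to maximize classification accuracy directly and whose objective naturally reflects the Bayes error. The proposed algorithmic framework selects a subset of features by minimizing a nonparametric estimate of the Bayes error rate. A set of existing algorithms, as well as new ones, can be derived naturally from this framework. As an example, we show that the Relief algorithm greedily attempts to minimize the Bayes error estimated by the k-Nearest-Neighbor (kNN) method. This new interpretation sheds light on the family of margin-based feature selection algorithms and also provides a principled way to build new alternatives with improved performance. In particular, using this framework, we build the Parzen-Relief (PRelief) algorithm based on the Parzen window estimator. The Relief algorithm is a popular approach to feature weight estimation, and many extensions of it have been developed. Because the instances used to calculate the feature weight vector in the Relief algorithm are chosen at random, the results fluctuate with the instances, which leads to poor evaluation accuracy. To solve this problem, a feature selection algorithm based on Parzen and ReliefF is proposed. It uses both the mean and the variance of the discrimination among instances and weights as the criterion for feature weight estimation, which makes the result more stable and accurate. The main aim is to estimate the performance of both algorithms. For this we use two methods to measure the quality of the generated outputs: the Leader and sub-leader algorithm and the Davies–Bouldin index (DBI). Both come from clustering and are used to assess cluster quality and cluster similarity.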
Kumaras Senapathi, Kanakeswari D, Ravi Bhushan Yadlapalli
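To make the feature weight estimation mentioned in the abstract concrete, the following is a minimal sketch of the basic two-class Relief weight update, not the Parzen-ReliefF variant studied in the paper. Several choices here are assumptions made for illustration: the function name `relief_weights` is hypothetical, feature values are assumed to be scaled to [0, 1], distances are Manhattan distances, each class is assumed to contain at least two instances, and every instance is visited once instead of drawing a random sample, which removes the fluctuation the abstract describes.

```python
def relief_weights(X, y):
    """Basic Relief feature weights for a two-class problem (illustrative sketch).

    X: list of feature vectors, values assumed scaled to [0, 1].
    y: list of class labels; each class needs at least two instances.
    """
    n, d = len(X), len(X[0])

    def dist(a, b):
        # Manhattan distance between two instances
        return sum(abs(u - v) for u, v in zip(a, b))

    w = [0.0] * d
    # Assumption: visit every instance once rather than sampling at random
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        near_hit = min(hits, key=lambda j: dist(X[i], X[j]))
        near_miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            # Reward features that differ from the nearest miss,
            # penalize features that differ from the nearest hit
            diff_miss = abs(X[i][f] - X[near_miss][f])
            diff_hit = abs(X[i][f] - X[near_hit][f])
            w[f] += (diff_miss - diff_hit) / n
    return w


# Toy data: feature 0 separates the classes, feature 1 does not
X = [[0.0, 0.2], [0.1, 0.9], [0.9, 0.1], [1.0, 0.8]]
y = [0, 0, 1, 1]
weights = relief_weights(X, y)
print(weights)
```

On this toy data the discriminative feature 0 receives a clearly higher weight than feature 1, which is the behavior a feature ranking based on Relief relies on.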