National Natural Science Foundation of China (No. 61673251); National Key Research and Development Program of China (No. 2016YFC0901900); Fundamental Research Funds for the Central Universities (No. GK2017010006); Postgraduate Cultivation Innovation Fund (No. 2015CX028, No. 2016CSY009)
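… is affected by noise points, which in turn affects the clustering results, and the instability of the K-means algorithm it uses also affects the clustering results; to address this, two fully self-adaptive spectral clustering algorithms are proposed, SC_SD (Spectral Clustering based on Standard Deviation) and SC_MD (Spectral Clustering based on Mean Distance), which respectively define the sample …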
To prevent outliers from distorting the local scaling parameter σ_i of the self-tuning spectral clustering algorithm, and to avoid the unstable clustering results caused by the K-means step in self-tuning, two fully self-adaptive spectral clustering algorithms are proposed: SC_SD (Spectral Clustering based on Standard Deviation) and SC_MD (Spectral Clustering based on Mean Distance). They define the neighborhood radius of point i as, respectively, the standard deviation of point i and the mean distance from point i to the other points, then count the number of points in that neighborhood and use the standard deviation of point i within the neighborhood as its local scaling parameter σ_i. This avoids the influence of outliers on σ_i and the resulting distortion in the clustering results of self-tuning. SD_K-medoids is adopted in place of the K-means used in self-tuning, to avoid the unstable clustering results of K-means and thereby recover the true clustering of a dataset. Experimental results on UCI datasets and on synthetic datasets demonstrate that SC_SD and SC_MD obtain better clustering results than the traditional spectral clustering algorithm NJW and the self-tuning spectral clustering algorithm, are robust to noise, and scale well. The proposed SC_SD and SC_MD can detect the clusters of a dataset without any given information, and SC_MD can be used to cluster comparatively large datasets.
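
The neighborhood-based local scaling parameter described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions, since the abstract does not give formulas: the "standard deviation of point i" is taken to be the root mean squared distance from point i to the other points, the affinity is taken to be the self-tuning form exp(-d_ij^2 / (σ_i σ_j)), and the function names `local_scales` and `affinity` are hypothetical.

```python
import numpy as np

def local_scales(X, radius="sd"):
    """Hypothetical helper: pairwise distances D and local scales sigma.

    radius="sd": neighborhood radius = root mean squared distance from
                 point i to the others (assumed reading of "standard
                 deviation of point i").
    radius="md": neighborhood radius = mean distance from point i to
                 the others.
    """
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))          # Euclidean distances
    if radius == "sd":
        r = np.sqrt((D ** 2).sum(axis=1) / (n - 1))
    elif radius == "md":
        r = D.sum(axis=1) / (n - 1)
    else:
        raise ValueError("radius must be 'sd' or 'md'")

    others = ~np.eye(n, dtype=bool)               # excludes the point itself
    sigma = np.empty(n)
    for i in range(n):
        mask = (D[i] <= r[i]) & others[i]         # neighbors inside the radius
        k = mask.sum()
        if k == 0:
            # Guard against rounding; fall back to the nearest other point.
            sigma[i] = D[i][others[i]].min()
        else:
            # Assumed: RMS distance from i to the neighbors inside the radius.
            sigma[i] = np.sqrt((D[i, mask] ** 2).sum() / k)
    return D, np.maximum(sigma, 1e-12)            # avoid division by zero

def affinity(X, radius="sd"):
    """Self-tuning style affinity matrix using the local scales above."""
    D, sigma = local_scales(X, radius)
    A = np.exp(-(D ** 2) / np.outer(sigma, sigma))
    np.fill_diagonal(A, 0.0)
    return A
```

Because only points inside the radius contribute to σ_i, a distant outlier does not inflate the scale of points in a dense cluster. The resulting matrix would then go through the usual spectral embedding, followed by a K-medoids-style clustering step; SD_K-medoids itself is not sketched here because the abstract does not specify it.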