In the era of big data, large volumes of data with complex structure, such as differing dimensions and scales, arise in many fields. The classical Pearson correlation measures only the linear relationship between two random variables of equal dimension. In 2007, Székely et al. proposed distance correlation (DC), which characterizes independence between random vectors of arbitrary, not necessarily equal, dimension. To explore the internal relationships among variables, this paper studies two agglomerative hierarchical clustering methods. We first propose complete distance correlation clustering (complete DC clustering) for variable clustering, which possesses ultrametricity and space contractibility. We then propose union DC clustering as an improvement on complete DC clustering. Numerical results on real data are reported to demonstrate the efficiency of the proposed union DC clustering.
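As a rough illustration of the ingredients named above, the following sketch computes the sample distance correlation of Székely et al. and feeds it to a naive complete-linkage agglomerative procedure. The dissimilarity `1 - dCor`, the function names, and the toy data are assumptions made here for illustration only; they are not necessarily the definitions used for complete DC clustering or union DC clustering in the paper.

```python
import numpy as np

def _centered_distances(x):
    """Double-centered Euclidean distance matrix of the rows of x."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # a 1-D sample is n observations of a scalar variable
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    """Sample distance correlation (V-statistic form) of two samples with equal n."""
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = (a * b).mean()                      # squared sample distance covariance
    dvar_x, dvar_y = (a * a).mean(), (b * b).mean()
    denom = np.sqrt(dvar_x * dvar_y)
    return 0.0 if denom == 0 else float(np.sqrt(max(dcov2, 0.0) / denom))

def complete_linkage(dissim, n_clusters):
    """Naive agglomerative clustering: merge the pair of clusters whose
    largest pairwise dissimilarity is smallest, until n_clusters remain."""
    clusters = [[i] for i in range(len(dissim))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(dissim[p][q] for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Toy data (hypothetical): four variables of different dimension, n = 200.
rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=200), rng.normal(size=200)
variables = [
    z1,                                                          # 1-D
    z1 ** 2,                                                     # 1-D, nonlinear in z1
    z2,                                                          # 1-D
    np.column_stack([z2 + 0.1 * rng.normal(size=200), np.sin(z2)]),  # 2-D
]

k = len(variables)
dissim = np.zeros((k, k))
for i in range(k):
    for j in range(i + 1, k):
        # Assumed dissimilarity: one minus the sample distance correlation.
        dissim[i, j] = dissim[j, i] = 1.0 - distance_correlation(variables[i], variables[j])

clusters = complete_linkage(dissim, n_clusters=2)
print(clusters)
```

In this sketch the second variable is a nonlinear function of the first, so its Pearson correlation with the first is near zero while its distance correlation is not, and the fourth variable is two-dimensional, which Pearson correlation cannot handle directly. The quadratic-time pairwise distance computation is adequate only for small samples.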
Pearson K. Contributions to the mathematical theory of evolution[J]. Philosophical Transactions of the Royal Society of London, 1894, A 185(1):71-110.
Li S Z, Rizzo M L. K-groups: A generalization of k-means clustering[J]. ArXiv e-prints, 2017.
Székely G J, Rizzo M L, Bakirov N K. Measuring and testing dependence by correlation of distances[J]. Annals of Statistics, 2007, 35(6):2769-2794.
Kong J, Klein B E K, Klein R, Lee K, Wahba G. Using distance correlation and SS-ANOVA to assess associations of familial relationships, lifestyle factors, diseases, and mortality[J]. Proceedings of the National Academy of Sciences, 2012, 109(50):20352-20357.
Li R, Zhong W, Zhu L. Feature screening via distance correlation learning[J]. Journal of the American Statistical Association, 2012, 107:1129-1139.
Sheng W, Yin X. Direction estimation in single-index models via distance covariance[J]. Journal of Multivariate Analysis, 2013, 122:148-161.
Sheng W, Yin X. Sufficient dimension reduction via distance covariance[J]. Journal of Computational and Graphical Statistics, 2016, 25:91-104.
Van Rooij A C M. Non-Archimedean Functional Analysis[M]. New York: M. Dekker, 1978.
Chen Z M, Van Ness J W. Space-conserving agglomerative algorithms[J]. Journal of Classification, 1996, 13:157-163.