Abstract
Community discovery is an important task in social network analysis. However, most existing methods rely on topological structure alone and ignore the rich information available in content data. To address this issue, we present a community discovery method based on heterogeneous information network decomposition and embedding. Unlike traditional methods, our method takes into account topology, node content, and edge content, which together supply abundant evidence for community discovery. First, we propose an embedding-based similarity evaluation method that decomposes the heterogeneous information network into several subnetworks and extracts their latent deep representations to evaluate the similarities between nodes. Second, we propose a bottom-up community discovery algorithm that finds communities more efficiently through leader node selection, initial community generation, and community expansion. Third, we propose incremental maintenance strategies for handling changes in the network. We conduct experimental studies on three real-world social networks. The experiments demonstrate the effectiveness and efficiency of the proposed method: compared with traditional methods, it improves normalized mutual information (NMI) and modularity by an average of 12% and 37%, respectively.
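The three stages of the bottom-up procedure (leader node selection, initial community generation, and community expansion) can be illustrated with a minimal sketch. It is not the algorithm of this paper: the function name `discover_communities`, the parameter `leader_sim_cap`, the leader-ranking rule, and the toy similarity matrix are all assumptions made for illustration, and the node similarities are taken as given rather than computed from network embeddings.

```python
from typing import Dict, List, Sequence


def discover_communities(
    sim: Sequence[Sequence[float]],
    num_leaders: int,
    leader_sim_cap: float = 0.5,  # hypothetical threshold, not from the paper
) -> Dict[int, List[int]]:
    """Group nodes bottom-up from a symmetric node-similarity matrix."""
    n = len(sim)

    # Stage 1 (assumed rule): rank nodes by total similarity to all other
    # nodes, and accept a candidate as leader only if it is not too similar
    # to a leader that was already chosen.
    ranked = sorted(range(n), key=lambda i: sum(sim[i]) - sim[i][i], reverse=True)
    leaders: List[int] = []
    for node in ranked:
        if len(leaders) == num_leaders:
            break
        if all(sim[node][leader] < leader_sim_cap for leader in leaders):
            leaders.append(node)

    # Stage 2: each leader seeds one initial community.
    communities: Dict[int, List[int]] = {leader: [leader] for leader in leaders}

    # Stage 3: repeatedly attach the unassigned node that has the highest
    # average similarity to the current members of some community.
    unassigned = set(range(n)) - set(leaders)
    while unassigned and communities:
        node, leader = max(
            ((v, c) for v in sorted(unassigned) for c in communities),
            key=lambda pair: sum(sim[pair[0]][m] for m in communities[pair[1]])
            / len(communities[pair[1]]),
        )
        communities[leader].append(node)
        unassigned.remove(node)
    return communities


# Toy similarity matrix (made up): nodes 0-2 and nodes 3-5 form two groups.
sim = [
    [1.0, 0.9, 0.8, 0.1, 0.1, 0.1],
    [0.9, 1.0, 0.8, 0.1, 0.1, 0.1],
    [0.8, 0.8, 1.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.9, 0.7],
    [0.1, 0.1, 0.1, 0.9, 1.0, 0.7],
    [0.1, 0.1, 0.1, 0.7, 0.7, 1.0],
]
communities = discover_communities(sim, num_leaders=2)
```

On this toy input the sketch separates nodes 0 to 2 from nodes 3 to 5. The greedy expansion step rescans all unassigned nodes on every iteration, which keeps the example short but is not meant to reflect the efficiency claims made for the proposed method.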