Community detection is a vital task in many fields, such as social network and financial analysis. The Louvain method, the main workhorse of community detection, is a popular heuristic. To apply it to large-scale graphs, researchers have proposed several parallel Louvain methods (PLMs), which suffer from two challenges: latency in information synchronization and community swap. To tackle these two challenges, we propose an isolate-set-based parallel Louvain method (IPLM) and a fusion of IPLM with a hashtable-based Louvain method (FIPLM), both built on a novel graph partition algorithm. Our partition algorithm divides the graph into subgraphs called isolate sets, in which the vertices are relatively decoupled from the rest of the graph. We first describe the concepts and properties of isolate sets. Second, we propose an algorithm that divides the graph into isolate sets with the same computational complexity as breadth-first search. Third, we propose IPLM, which efficiently calculates and updates vertex information in parallel without latency or community swap. Finally, we achieve further acceleration with FIPLM, which maintains high-quality community detection with a larger speedup than IPLM. Both methods target shared-memory architectures; implemented on an 8-core PC, IPLM achieves a maximum speedup of 4.62x and yields higher modularity (by up to 4.76%) than the serial Louvain method on 14 of 18 datasets, and FIPLM achieves a maximum speedup of 7.26x.
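For context, the sketch below shows one local-move pass of the standard serial Louvain heuristic, the baseline the abstract compares against. It is a minimal illustration only: the data-structure names (`adj`, `weights`, `community`) are placeholders, and the isolate-set partitioning and parallelization of IPLM/FIPLM are not reproduced here.

```python
# Minimal sketch of one local-move pass of the serial Louvain heuristic.
# Illustration only; not the paper's IPLM/FIPLM parallelization.
from collections import defaultdict

def local_move_pass(adj, weights, community):
    """Try to move each vertex to the neighboring community with the
    largest modularity gain.

    adj       : dict  vertex -> iterable of neighbors
    weights   : dict  (u, v) -> edge weight (symmetric keys expected)
    community : dict  vertex -> community label (modified in place)
    """
    # weighted degree of each vertex and total edge weight m
    k = {u: sum(weights[(u, v)] for v in adj[u]) for u in adj}
    m = sum(k.values()) / 2.0
    # total degree attached to each community (sigma_tot)
    sigma_tot = defaultdict(float)
    for u in adj:
        sigma_tot[community[u]] += k[u]

    improved = False
    for u in adj:
        c_old = community[u]
        # weight of links from u to each neighboring community (k_in)
        links = defaultdict(float)
        for v in adj[u]:
            if v != u:
                links[community[v]] += weights[(u, v)]
        # remove u from its community before evaluating moves
        sigma_tot[c_old] -= k[u]
        # modularity gain of joining community c, up to a factor 1/m:
        #   delta_Q ~ k_in(c) - sigma_tot(c) * k_u / (2m)
        best_c = c_old
        best_gain = links.get(c_old, 0.0) - sigma_tot[c_old] * k[u] / (2 * m)
        for c, k_in in links.items():
            gain = k_in - sigma_tot[c] * k[u] / (2 * m)
            if gain > best_gain:
                best_c, best_gain = c, gain
        sigma_tot[best_c] += k[u]
        if best_c != c_old:
            community[u] = best_c
            improved = True
    return improved
```

In a parallel setting, two threads moving adjacent vertices at the same time can read stale community labels (synchronization latency) or swap their vertices into each other's communities (community swap); restricting concurrent moves to decoupled vertex sets is the motivation the abstract gives for isolate sets.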
Instance-specific algorithm selection techniques have been successfully applied in many research fields, such as constraint satisfaction and planning. Researchers have increasingly tried to model the potential relations between candidate algorithms for algorithm selection. In this study, we propose an instance-specific algorithm selection method based on multi-output learning, which handles these relations more directly. Three kinds of multi-output learning methods are used to predict the performance of the candidate algorithms: (1) multi-output regressor stacking; (2) multi-output extremely randomized trees; and (3) hybrid single-output and multi-output trees. Experimental results on 11 SAT datasets and 5 MaxSAT datasets indicate that the proposed methods outperform state-of-the-art algorithm selection methods.
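As a rough illustration of the multi-output idea (variant (2) above), the sketch below trains a single extremely randomized trees regressor on the runtimes of all candidate solvers jointly and then picks the predicted-fastest solver per instance. The feature dimensions, solver count, and random data are made-up placeholders, and this is not the authors' exact pipeline or feature set.

```python
# Illustrative sketch: multi-output extremely randomized trees for
# per-instance algorithm selection. Data shapes and values are synthetic.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(0)
X_train = rng.random((200, 30))   # 200 instances x 30 instance features
Y_train = rng.random((200, 4))    # observed runtimes of 4 candidate solvers
X_test = rng.random((20, 30))     # unseen instances

# ExtraTreesRegressor handles multi-output targets natively, so one model
# predicts all solvers' runtimes and can exploit correlations between them.
model = ExtraTreesRegressor(n_estimators=300, random_state=0)
model.fit(X_train, Y_train)

predicted_runtimes = model.predict(X_test)            # shape (20, 4)
selected_solver = predicted_runtimes.argmin(axis=1)   # predicted-fastest solver per instance
print(selected_solver)
```

Fitting one multi-output model, rather than a separate regressor per solver, is what lets the tree splits share information across the candidate algorithms' performance targets.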