Overlapping community detection model based on a modularity-aware graph autoencoder

Jie CHEN; Binbin LIU; Shu ZHAO; Yanping ZHANG

doi:10.16511/j.cnki.qhdxxb.2024.26.018

Publishing Language: Chinese

Jie CHEN^{1,2}, Binbin LIU^{1,2}, Shu ZHAO^{1,2}, Yanping ZHANG^{1,2}

School of Computer Science and Technology, Anhui University, Hefei 230601, China

Ministry of Education Key Laboratory of Computational Intelligence and Signal Processing, Anhui University, Hefei 230601, China

Abstract

Objective

In network science, abstracting complex entity relationships into network structures provides a foundation for understanding real-world interactions. Community discovery identifies clusters of closely interconnected nodes within these networks. This process reveals latent patterns and functions, and it is valuable for tracking dynamic network behaviors and assessing community influence in phenomena ranging from rumor propagation to virus outbreaks and tumor evolution. A notable characteristic of real communities is that they overlap, with participants often belonging to several communities at once. This overlap complicates the exploration of network structure, making the discovery of overlapping communities essential for a comprehensive understanding of network structure and functional dynamics.

Methods

Network representation learning algorithms have significantly advanced community discovery. These algorithms transform complex network information into low-dimensional vectors while preserving the underlying network structure and attribute information. Such representations support downstream graph processing tasks, including link prediction, node classification, and community discovery. Among these algorithms, the graph autoencoder is a prominent representative: it learns network embeddings efficiently and has been applied to a variety of community discovery tasks. However, traditional graph autoencoder models focus predominantly on local node-edge reconstruction. This focus often overlooks the influence of community structure, particularly when nodes overlap across multiple communities, which makes it difficult to determine node affiliations and community distributions precisely. To address this issue, we introduce an unsupervised modularity-aware graph autoencoder model (GAME) for overlapping community discovery. The model incorporates an efficient modularity maximization loss function into the graph autoencoder framework, so that community structure is preserved throughout the network embedding process. The modularity-aware loss is used together with the reconstruction objective to update the encoder parameters, thereby improving model performance in overlapping community discovery tasks. The resulting community membership matrix is then used to assign communities to nodes probabilistically.
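To make the modularity term concrete, the following is a minimal NumPy sketch of a relaxed modularity score computed from a soft community membership matrix. It assumes the widely used trace form Q = Tr(PᵀBP)/(2m) with B = A − ddᵀ/(2m); the abstract does not state the exact loss used in GAME, so the function names `soft_modularity` and `modularity_loss` and this particular formulation are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def soft_modularity(adj, membership):
    """Relaxed modularity Q = Tr(P^T B P) / (2m), with B = A - d d^T / (2m).

    adj:        (N, N) symmetric adjacency matrix of an undirected graph.
    membership: (N, K) matrix; entry (i, k) is the strength with which
                node i belongs to community k (hypothetical encoder output).
    """
    adj = np.asarray(adj, dtype=float)
    membership = np.asarray(membership, dtype=float)
    degrees = adj.sum(axis=1)
    two_m = degrees.sum()  # equals twice the number of edges
    # Expand the trace so the dense N x N matrix B is never formed:
    # Tr(P^T A P) - ||P^T d||^2 / (2m)
    edge_term = np.sum(membership * (adj @ membership))
    degree_term = np.sum((membership.T @ degrees) ** 2) / two_m
    return (edge_term - degree_term) / two_m


def modularity_loss(adj, membership):
    """Negative relaxed modularity, so that minimizing it maximizes Q."""
    return -soft_modularity(adj, membership)


if __name__ == "__main__":
    # Toy graph (an assumption for illustration): two triangles joined by one edge.
    A = np.zeros((6, 6))
    for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
        A[i, j] = A[j, i] = 1.0
    P = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
    print(soft_modularity(A, P))
```

In a graph autoencoder, a term of this kind would typically be added to the reconstruction loss with a weighting coefficient; that weighting is likewise an assumption here and is not specified in the abstract.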

Results

The proposed GAME model was evaluated on six social network datasets (Facebook 348, Facebook 414, Facebook 686, Facebook 698, Facebook 1684, and Facebook 1912), with node counts ranging from 60 to 800. It was also evaluated on four collaborator network datasets (Computer Science, Engineering, Chemistry, and Medicine), with node counts ranging from 1.4×10⁴ to 6.4×10⁴. Compared with seven prevalent overlapping community discovery methods, including both traditional and graph autoencoder-based algorithms, GAME achieved a 2.1% improvement in normalized mutual information (NMI). This improvement supports the effectiveness of the proposed GAME model.

Conclusions

By integrating an efficient modularity maximization loss function into the graph autoencoder, the GAME model addresses a conventional limitation of graph autoencoders. In community discovery tasks, these models tend to prioritize the reconstruction of local node connections and overlook the overall community structure, particularly when nodes overlap. The experimentally validated performance gain indicates that GAME handles overlapping community discovery more effectively than mainstream methods. However, the model requires substantial memory, which can become a challenge when handling datasets that combine network structure and node attributes. This is especially apparent for small attributed networks (N≤800), where the model is insensitive to variation in the threshold ρ. Future work will focus on refining the model to mitigate these challenges and improve performance across a broader range of real-world scenarios.
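The role of a threshold such as ρ can be illustrated with a short sketch. It assumes that ρ is a cutoff applied to the entries of the community membership matrix, which the abstract does not spell out; the function name `assign_overlapping` and the fallback for nodes with no entry above the cutoff are hypothetical choices made for illustration only.

```python
import numpy as np


def assign_overlapping(membership, rho=0.5):
    """Turn a soft membership matrix into overlapping community assignments.

    A node joins every community whose membership score is at least rho
    (assumed meaning of the threshold). As an illustrative fallback, a node
    with no score above rho joins its single strongest community.
    """
    membership = np.asarray(membership, dtype=float)
    mask = membership >= rho
    # Rows with no community above the threshold.
    orphans = np.flatnonzero(~mask.any(axis=1))
    mask[orphans, membership[orphans].argmax(axis=1)] = True
    # One list of community indices per node.
    return [np.flatnonzero(row).tolist() for row in mask]


if __name__ == "__main__":
    scores = np.array([[0.9, 0.1],   # clearly in community 0
                       [0.6, 0.7],   # overlaps both communities
                       [0.2, 0.3]])  # below rho everywhere, uses fallback
    print(assign_overlapping(scores, rho=0.5))
```

Under this reading, insensitivity to ρ would mean that the assignments change little as the cutoff moves, for example because most membership scores lie far from the cutoff.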

Keywords

community detection; overlapping communities; graph autoencoder; modularity maximization; community membership matrix

CLC number: TP391 Document code: A Article ID: 1000-0054(2024)08-1319-11
