GAEBic: A Novel Biclustering Analysis Method for miRNA-Targeted Gene Data Based on Graph Autoencoder

Li Wang; Hao Zhang; Hao-Wu Chang; Qing-Ming Qin; Bo-Rui Zhang; Xue-Qing Li; Tian-Heng Zhao; Tian-Yue Zhang

doi:10.1007/s11390-021-0804-3


Regular Paper


Li Wang¹, Hao Zhang¹,², Hao-Wu Chang², Qing-Ming Qin³, Bo-Rui Zhang⁴, Xue-Qing Li², Tian-Heng Zhao², Tian-Yue Zhang²

¹ College of Software, Jilin University, Changchun 130012, China

² College of Computer Science and Technology, Jilin University, Changchun 130012, China

³ College of Plant Science, Jilin University, Changchun 130062, China

⁴ Department of Biochemistry, University of Illinois at Urbana-Champaign, Champaign 61820, U.S.A.


Abstract

Unlike traditional clustering analysis, biclustering works simultaneously on two dimensions: samples (rows) and variables (columns). In recent years, biclustering methods have developed rapidly and have been widely applied in biological data analysis, text clustering, recommendation systems, and other fields. Traditional clustering algorithms do not adapt well to high-dimensional and/or large-scale data. At present, most biclustering algorithms are designed for big biological data on differential expression. However, biclustering of binary data, such as miRNA-targeted gene data, has received little discussion. Here, we propose GAEBic, a novel biclustering method for miRNA-targeted gene data based on a graph autoencoder. GAEBic applies a graph autoencoder to capture the similarity of sample sets or variable sets, and adopts a new irregular clustering strategy to mine biclusters with excellent generalization. Using soybean miRNA-targeted gene data, we benchmark several types of biclustering algorithms and find that GAEBic outperforms Bimax, BiBit, and the Spectral Biclustering algorithm in terms of target gene enrichment. The method achieves comparable performance on high-throughput soybean miRNA data and can also be applied to other species.
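To make the idea of capturing sample similarity with a graph autoencoder more concrete, the following is a minimal sketch and not the authors' implementation. It uses the standard graph autoencoder formulation of Kipf and Welling (a two-layer graph convolutional encoder with an inner-product decoder) and runs a single untrained forward pass. The toy binary matrix, the rule of linking two miRNAs when they share at least one target gene, the layer sizes, and the random weights are all assumptions made for illustration; the abstract does not specify how GAEBic builds its graph or trains its model.

```python
import numpy as np

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the symmetric GCN normalization."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # degrees are >= 1 because of self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gae_forward(adj, features, w0, w1):
    """One forward pass of a two-layer GCN encoder and inner-product decoder."""
    a_norm = normalized_adjacency(adj)
    hidden = np.maximum(a_norm @ features @ w0, 0.0)   # ReLU activation
    z = a_norm @ hidden @ w1                           # node embeddings
    recon = 1.0 / (1.0 + np.exp(-(z @ z.T)))           # sigmoid(Z Z^T)
    return z, recon

# Hypothetical binary miRNA-by-gene matrix: 1 means the miRNA targets the gene.
targets = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

# Assumed graph construction: connect two miRNAs if they share a target gene.
shared = targets @ targets.T
adj = (shared > 0).astype(float)
np.fill_diagonal(adj, 0.0)

# Untrained random weights; a real model would learn these by minimizing
# a reconstruction loss between recon and the adjacency matrix.
rng = np.random.default_rng(0)
w0 = rng.normal(scale=0.1, size=(targets.shape[1], 4))
w1 = rng.normal(scale=0.1, size=(4, 2))

z, recon = gae_forward(adj, targets, w0, w1)
print("embedding shape:", z.shape)
print("reconstructed similarity shape:", recon.shape)
```

In a trained model, rows of `z` that lie close together would indicate miRNAs with similar targeting patterns, which is the kind of similarity a downstream clustering step could then group. The same construction could be applied to the transposed matrix to embed genes instead of miRNAs.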

Keywords

biclustering, graph autoencoder, miRNA-targeted gene, binary data

Electronic Supplementary Material


jcst-36-2-299-Highlights.pdf (111.5 KB)


Journal of Computer Science and Technology

Volume 36, Issue 2, March 2021

Pages 299-309

DOI: 10.1007/s11390-021-0804-3

Cite this article:

Wang L, Zhang H, Chang H-W, et al. GAEBic: A Novel Biclustering Analysis Method for miRNA-Targeted Gene Data Based on Graph Autoencoder. Journal of Computer Science and Technology, 2021, 36(2): 299-309. https://doi.org/10.1007/s11390-021-0804-3


Received: 14 July 2020

Accepted: 05 March 2021

Published: 05 March 2021