Many human diseases involve multiple genes in complex interactions. Large Genome-Wide Association Studies (GWASs) have been considered to hold promise for unraveling such interactions. However, statistical tests for high-order epistatic interactions (
In this article, we provide a thorough review of Bayesian-inference-based methods applied to studies of Hepatitis B Virus (HBV), Hepatitis C Virus (HCV), and Human Immunodeficiency Virus (HIV), with a focus on detecting viral mutations and on the problems associated with these mutations. Interacting mutation patterns are particularly difficult to detect and interpret, and Bayesian statistical modeling offers a powerful way to address this difficulty. Here we summarize Bayesian statistical approaches, including the Bayesian Variable Partition (BVP) model, the Bayesian Network (BN), and the Recursive Model Selection (RMS) procedure, which are designed to detect mutations and to infer the detailed dependence structure among their interactions. BVP, BN, and RMS, all of which use Markov Chain Monte Carlo (MCMC) methods, have been widely applied in HBV, HCV, and HIV studies in recent years. We also summarize the applications of these Bayesian methods to studies of the three viruses, which have yielded several important and useful results. We expect that further modified Bayesian methods will be applied to other infectious diseases and to cancer cells, and that these applications will produce medically important results before long.
People differ from one another in physical appearance, susceptibility to disease, response to medications, and so on, yet 99.9 percent of human DNA is the same across individuals. The remaining differences between human genomes are therefore well worth studying. Single-Nucleotide Polymorphisms (SNPs) are the simplest and most common form of genetic polymorphism. SNPs have been used successfully to identify defective genes that cause Mendelian diseases. However, most common human diseases are complex and are caused by multiple SNPs, each of which explains only a small fraction of the genetic cause. As a result, experiments on individual SNPs may show no detectable effect on a complex disease. Pathogenesis is complicated, and it is difficult to correctly predict the multiple SNPs involved. The analysis of SNP data is thus a critical task in the study of genetic diseases. In this paper, we divide the methods for genome-wide SNP data analysis into two categories: single-trait Genome-Wide Association Studies (GWAS), in which pathology is mined from data on a single phenotype, and multiple-trait GWAS, which identify cross-phenotype associations. For single-trait GWAS, we review methods ranging from the simple to the complex, including TEAM, BOOST, AntEpiSeeker, SNPRuler, EDCF, HiSeeker, ORF, MLR-tagging, MSCD, and MIC. For multiple-trait GWAS, we describe methods in terms of the regression models, dimension-reduction methods, and meta-analysis methods they employ. We also list the advantages and disadvantages of these methods. Finally, we discuss future directions for SNP data analysis in genome-wide association studies.