Abstract
Adverse drug reactions (ADRs) are one of the major challenges in evaluating drug safety in the medical field. The Bayesian Confidence Propagation Neural Network (BCPNN) algorithm is the main algorithm used by the World Health Organization to monitor ADRs. Currently, ADR reports are collected through spontaneous reporting systems. However, as the number of ADR reports and possible use scenarios continues to grow, the efficiency of the stand-alone ADR detection algorithm will face considerable challenges. Moreover, the BCPNN algorithm requires a number of disk I/O operations, which leads to considerable time consumption. In this study, we propose a Spark-based parallel BCPNN algorithm, which speeds up data processing and reduces the number of disk I/O operations in BCPNN, together with two optimization strategies. ADR data collected from the FDA Adverse Event Reporting System are then used to verify the performance of the proposed algorithm and its optimization strategies. Experiments show that the parallel BCPNN significantly accelerates data processing, and that the optimized algorithm achieves a high acceleration rate and effectively prevents memory overflow. Finally, we apply the proposed algorithm to a dataset provided by a real medical consortium. These experiments further demonstrate the performance and practical value of the proposed algorithm.