Graphical Abstract

Data science is an interdisciplinary field that uses big data, machine learning algorithms, data mining techniques, and scientific methods to extract insights from large volumes of structured and unstructured data. The healthcare industry continuously generates large and valuable datasets covering patient demographics, treatment plans, medical examination results, insurance coverage, and more. Data collected by Internet of Things (IoT) devices is also of interest to data scientists. Data science can help the healthcare industry handle these large volumes of heterogeneous structured and unstructured data by processing, managing, analyzing, and integrating them. Proper management and analysis are essential for drawing reliable findings from this data. This article provides a comprehensive review and discussion of the data analysis process as it pertains to healthcare applications. It discusses the advantages and disadvantages of using big data analytics (BDA) in the medical industry. The insights BDA provides can support strategic decision-making and thereby benefit the healthcare system.