Feature selection is central to improving the accuracy and efficiency of predictive models, particularly in nuanced domains such as socio-economic analysis. This study compares nine feature selection methods, using a heart disease dataset as a stand-in for complex socio-economic systems. All nine methods identified the same four features as critical. The methods disagreed, however, on the importance of the remaining features, which underscores the inherent variability among selection techniques. When the top four features were used in twelve classification models, predictive accuracy improved noticeably, emphasizing their foundational role in model performance. The variation among methods points to the need for a methodical and discerning approach to feature selection, especially in data-rich socio-economic settings. As decision-making becomes increasingly data-driven, rigour and precision in feature selection become indispensable. Future research should extend this approach to broader datasets to test the robustness and adaptability of these findings.
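As a rough illustration of how agreement across selection methods might be checked, the sketch below intersects the top-ranked features reported by several methods and counts how often each feature is selected. The method names, feature names, and rankings are hypothetical placeholders, not the study's data or its nine methods.

```python
from collections import Counter


def consensus_features(rankings, top_k):
    """Return the features found in the top_k of every ranking, plus vote counts.

    rankings maps a method name to a list of feature names, best first.
    """
    top_sets = [set(ranked[:top_k]) for ranked in rankings.values()]
    # Count how many methods placed each feature in their top_k.
    votes = Counter(feature for top in top_sets for feature in top)
    # Features on which every method agrees.
    common = set.intersection(*top_sets) if top_sets else set()
    return sorted(common), votes


# Hypothetical rankings from three selection methods (assumed for illustration).
rankings = {
    "method_a": ["f1", "f2", "f3", "f4", "f5"],
    "method_b": ["f2", "f1", "f4", "f3", "f6"],
    "method_c": ["f3", "f4", "f1", "f2", "f7"],
}

common, votes = consensus_features(rankings, top_k=5)
print("Selected by every method:", common)
print("Votes per feature:", dict(votes))
```

In this toy input, four features are shared by all methods while the fifth slot differs from method to method, which mirrors the kind of divergence the abstract describes.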
In the contemporary business landscape, software has become a strategic asset for organizations seeking sustainable competitive advantage. Ensuring software quality is imperative because low-quality software systems pose serious challenges to organizational performance. This study examines the impact of three key dimensions of information system quality on organizational performance: information quality (IQ), quality of service (QoS), and software quality (SQ). A quantitative questionnaire grounded in the DeLone and McLean information system (IS) success model was administered to 360 industry experts and academics. Data analysis using exploratory factor analysis (EFA), confirmatory factor analysis (CFA), and structural equation modeling (SEM) revealed significant positive effects of all three quality dimensions on organizational performance. Software quality emerged as the most influential dimension, with substantial total effects, closely followed by information quality and service quality. The study underscores the tangible value of strategic investments in software, information, and service quality. Improving these facets acts as a catalyst for organizational performance, giving decision-makers accurate and timely information and increasing user satisfaction with the system. This research contributes to the IS success literature by empirically validating the synergistic relationship between information quality, service quality, software quality, and organizational outcomes. Beyond theoretical validation, the analysis provides actionable insights for managers: the findings guide the prioritization of quality initiatives and resource allocation, enabling organizations to maximize competitive advantage. Future research could investigate moderator influences and explore alternative quality constructs relevant to contemporary technologies, including cyber-physical systems, cloud services, and crowdsensing.
Stroke prediction and prevention are an important focus in healthcare because of the significant morbidity and mortality associated with strokes. In this study, we investigate the use of Generative Adversarial Networks (GANs) to augment a stroke dataset and evaluate the effect on prediction performance. The original dataset contained patient medical records and demographics used to predict stroke occurrence. We trained a GAN on these data and generated synthetic samples to augment the training set. Five machine learning models were trained on the original and augmented datasets: decision tree, k-nearest neighbors, random forest, Support Vector Machine (SVM), and logistic regression classifiers. Experiments indicate statistically significant improvements in prediction accuracy, F1-score, specificity, and sensitivity with GAN augmentation across all models. The random forest classifier achieved the highest average accuracy, 0.981 on augmented data versus 0.967 on the original data. GANs prove effective for tackling class imbalance and enabling more robust stroke prediction from limited real-world data. This demonstrates the potential of data augmentation and generative models to enhance healthcare Artificial Intelligence (AI) applications.
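The four metrics named above can all be derived from a binary confusion matrix. The following is a minimal sketch using the standard definitions; the labels and predictions are made-up toy values, not results from the study, and the function name is our own.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)  # also called recall
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy example (assumed values): 1 = stroke, 0 = no stroke.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On an imbalanced dataset, accuracy alone can look high even when few positive cases are detected, which is why sensitivity and specificity are reported alongside it.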