Abstract
Investors, companies, and entrepreneurs involved in crowdfunding, particularly on the Kickstarter website, are increasingly interested in identifying the metrics that make campaigns successful. This study seeks to gauge the importance of key predictive variables (features) through statistical analysis, to identify machine learning methods that predict the success of a campaign on the basis of performance assessment, and to compare the selected algorithms. To achieve these objectives and maximize insight into the dataset, feature engineering was performed. Machine learning models, including Logistic Regression (LR), Support Vector Machines (SVMs), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), and tree-based ensemble methods (bagging and boosting), were then fitted and compared via cross-validation in terms of their test error rates, F1 scores, accuracy, precision, and recall. Across the three cross-validation approaches, the test error rates and the other classification metrics identified bagging and gradient boosting as the more robust methods for predicting the success of Kickstarter projects. The major research objectives of this paper were achieved by assessing the performance of key statistical learning methods, which guides the choice of model and provides a measure of the quality of the model ultimately chosen. Bayesian semi-parametric approaches remain a topic for future research; such methods allow an infinite number of parameters to be used to capture information about the underlying distributions of even more complex data.
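For readers less familiar with the classification metrics named above, the following is a minimal sketch of how test error rate, accuracy, precision, recall, and F1 score can be computed for a binary outcome such as campaign success. It is illustrative only and is not the code used in this study: the function name `classification_metrics`, the `ClassificationMetrics` container, the label coding (1 for a successful campaign), and the convention of reporting 0.0 when a denominator is zero are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ClassificationMetrics:
    accuracy: float
    error_rate: float
    precision: float
    recall: float
    f1: float


def classification_metrics(y_true, y_pred, positive=1):
    """Compute binary classification metrics from true and predicted labels.

    `positive` is the label treated as the positive class
    (assumed here to be 1 = successful campaign).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one observation is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Assumed convention: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # The error rate is the complement of accuracy.
    return ClassificationMetrics(accuracy, 1.0 - accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Made-up labels for eight hypothetical campaigns (not data from this study).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
    print(classification_metrics(y_true, y_pred))
```

In a cross-validation setting, these quantities would be computed on each held-out fold and then averaged across folds, so that models are compared on data not used to fit them.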