• Used R (tidyverse, tidymodels, janitor, dplyr) for data wrangling, exploratory data analysis, and statistical analysis, including handling missing values and outliers and examining correlations; visualized data with ggplot2 box plots and stacked column charts
• Developed a Random Forest classification model with 99.47% accuracy and 99.90% ROC AUC to predict the target ‘programs’; applied K-means clustering (factoextra) to segment 10M+ donors into 5 groups and identified the key features of each group
• Recommended the predicted target programs to enhance customer service and encourage donations