Description
This is a data analysis project that aims to predict the results of swing states in the 2020 election. The authors have collected county-specific data, ranging from racial composition to age and gender within those counties, and plan to train ML models on it to make their predictions.
3 things I like:
- I really like how this midterm report is formatted. It is clear that a lot of thought went into making it readable.
- I thought it was very clever how you took the log of "realGDP2016" to get a more linear relationship between the variables, so that the models you subsequently fit would work well.
- Additionally, I liked how clear you made your iterative model construction process. You added features gradually and explained how and why you proceeded this way.
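To illustrate why the log transform helps, here is a minimal sketch on synthetic data (not your dataset). It assumes a target that depends linearly on the log of a GDP-like variable, and compares the Pearson correlation before and after the transform. The variable names and the numbers used to generate the data are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


random.seed(0)

# Synthetic, hypothetical data: a heavily right-skewed "GDP" column and a
# target that is linear in log(GDP) plus Gaussian noise.
log_gdp = [random.uniform(5, 15) for _ in range(500)]
gdp = [math.exp(v) for v in log_gdp]
target = [0.03 * v + random.gauss(0, 0.05) for v in log_gdp]

r_raw = pearson(gdp, target)                          # raw, skewed feature
r_log = pearson([math.log(g) for g in gdp], target)   # log-transformed feature

print(f"correlation with raw GDP: {r_raw:.3f}")
print(f"correlation with log GDP: {r_log:.3f}")
```

On data like this, the log-transformed feature should show a noticeably stronger linear correlation with the target, which is the effect I understood you to be going for.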
3 things for improvement:
- I feel like you could have delved further into explaining the conclusions you drew from the correlation matrices you presented.
- I also would have loved to see a visualization or two showing how much your iterative models overfit or underfit depending on the number of features.
- Lastly, I would have wanted a bit more justification for why you chose only features with a correlation greater than or equal to 0.25. Is this cutoff industry practice, or an arbitrary number?
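One way to justify the cutoff would be a quick sensitivity check: vary the threshold and report which features survive at each value. Below is a rough sketch of that idea on synthetic data. The feature names, noise levels, and candidate cutoffs are all hypothetical, and I am assuming the filter is applied to the absolute value of the Pearson correlation, which may not match what you did.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def select_by_correlation(features, target, cutoff):
    """Names of features whose |Pearson r| with the target is >= cutoff.

    Assumption: the filter uses the absolute correlation.
    """
    return sorted(
        name for name, col in features.items()
        if abs(pearson(col, target)) >= cutoff
    )


random.seed(1)
n = 300
target = [random.gauss(0, 1) for _ in range(n)]

# Hypothetical features: the target plus increasing amounts of noise,
# so their correlation with the target ranges from strong to weak.
noise_levels = {"feat_a": 0.5, "feat_b": 2.0, "feat_c": 4.0, "feat_d": 10.0}
features = {
    name: [t + random.gauss(0, sd) for t in target]
    for name, sd in noise_levels.items()
}

cutoffs = [0.15, 0.25, 0.35, 0.50]
selected = {c: select_by_correlation(features, target, c) for c in cutoffs}

for c in cutoffs:
    print(f"cutoff {c:.2f}: {len(selected[c])} features -> {selected[c]}")
```

If the selected set (and the downstream validation error) barely changes between neighboring cutoffs, then 0.25 is a defensible choice. If it changes a lot, the cutoff is effectively a hyperparameter and probably deserves the same validation treatment as the rest of the model.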