3 things I like about the project:
- I like how the group collected a large amount of data from various sources and combined it into one large dataset. I can see how much work you put into that part.
- I like the group's approach of transforming realGDP2016 to log base 10 so the data are easier to interpret. That's smart!
- I like how the group added the features step by step (starting without the previous year's vote-share data, then adding it) in order to gauge how much the model improved.
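To make the point about the log transform concrete, here is a minimal sketch of what it does. The GDP values below are made up for illustration and are not taken from the group's dataset; only the idea of applying log base 10 comes from the report.

```python
import math

# Hypothetical real-GDP values spanning several orders of magnitude
# (illustrative numbers only, not from the project's data).
real_gdp = [1.5e9, 4.2e10, 3.1e12]

# log10 compresses the range: values that differ by factors of
# thousands end up only a few units apart.
log_gdp = [math.log10(v) for v in real_gdp]

for raw, logged in zip(real_gdp, log_gdp):
    print(f"{raw:>16,.0f} -> {logged:.2f}")
```

After the transform, a one-unit increase in the feature corresponds to a tenfold increase in GDP, which is what makes a regression coefficient on it easier to read.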
3 improvements I think the team can make:
- I found the report a bit confusing and convoluted. Readers should be able to understand what analyses you performed from the report alone, without reading the code. For this report, however, I had to navigate through the group's code to learn that you were doing linear regression, because the report never says so explicitly, and I think that is a crucial piece of information in this context. As for the correlation matrices, I think the one for the final feature selection would be sufficient; for the others, you could briefly summarize your findings instead of devoting a whole page to those graphs, and focus more on your analyses.
- I am curious why you chose 0.25 as the correlation-coefficient threshold for deciding whether to drop a feature. If there is any research to back this up, please include it in the report. For feature selection, I think you could try sparse modelling (your report refers to a "scarce model", so I am not sure we mean the same thing) to eliminate the features whose coefficients are 0. That may be a more efficient method.
- Future suggestion: in lecture, Professor Udell mentioned how the Obama campaign used the data it collected on voters to optimize its use of resources. It would be great if your group could develop some methods for achieving that for President-elect Biden! Just to make the project even more interesting! :)
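To illustrate the sparse-modelling suggestion above, here is a minimal sketch of lasso (L1-regularized linear regression) fitted by coordinate descent in plain Python. Everything in it is an assumption for illustration: the feature names, the synthetic data, and the penalty `alpha = 0.1` are hypothetical, and the features are assumed to be centered so no intercept is needed. It is not the group's model, only a picture of how an L1 penalty drives weak coefficients to exactly 0.

```python
def soft_threshold(value, threshold):
    """Shrink value toward 0 by threshold; return exactly 0.0 if it is small."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 over w.

    Assumes centered features (no intercept) and no all-zero columns.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
    return w


# Synthetic, mutually orthogonal features (hypothetical names).
names = ["feature_a", "feature_b", "feature_c"]
col_a = [1, 1, 1, 1, -1, -1, -1, -1]
col_b = [1, 1, -1, -1, 1, 1, -1, -1]
col_c = [1, -1, 1, -1, 1, -1, 1, -1]
X = [list(row) for row in zip(col_a, col_b, col_c)]

# The target depends strongly on feature_a, weakly on feature_b,
# and not at all on feature_c.
y = [3.0 * a + 0.05 * b for a, b in zip(col_a, col_b)]

w = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [name for name, coef in zip(names, w) if coef != 0.0]
print(dict(zip(names, w)))
print("kept:", selected)
```

In this toy setup the weak and irrelevant features receive coefficients of exactly 0 and drop out on their own, so no hand-picked cutoff such as 0.25 is required; the single tuning knob is `alpha`, which would normally be chosen by cross-validation. With scikit-learn available, `sklearn.linear_model.Lasso` fits the same objective.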