Random Forests etc.
Decision trees and random forests are powerful classification techniques, and today we will apply them to the Iris dataset. The tasks are as follows:
- After conducting data exploration, write a formal hypothesis on which is the most significant variable.
- Classify using DecisionTreeClassifier, RandomForestClassifier and LogisticRegression.
- Tune the hyperparameters of the DecisionTreeClassifier to get the best possible results.
- For the decision tree classifier, render two graphs using graphviz (Mac: brew install graphviz, Windows: conda install python-graphviz): one for the classifier without tuning and one for the classifier with tuning.
- For each classifier, plot the decision boundary (example: http://scikit-learn.org/stable/auto_examples/ensemble/plot_forest_iris.html#sphx-glr-auto-examples-ensemble-plot-forest-iris-py)
- Write down your interpretation of the results.
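
As a starting point, the classification, tuning and graphviz tasks above could be sketched roughly as follows. This is only one possible setup, not a reference solution: the 70/30 stratified split, the `random_state` values, the parameter grid and the variable names (`models`, `scores`, `search`, `dot_untuned`, `dot_tuned`) are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Load Iris and hold out a test set (split size and seed are assumptions).
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

# Fit the three classifiers with mostly default settings.
models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on the test set
    print(f"{name}: test accuracy = {scores[name]:.3f}")

# Tune the decision tree with a small, illustrative grid (values are assumptions).
param_grid = {
    "max_depth": [2, 3, 4, 5, None],
    "min_samples_leaf": [1, 2, 5],
    "criterion": ["gini", "entropy"],
}
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X_train, y_train)
tuned_tree = search.best_estimator_
print("best parameters:", search.best_params_)
print(f"tuned tree: test accuracy = {tuned_tree.score(X_test, y_test):.3f}")

# Export both trees as DOT source; out_file=None returns the DOT text as a string.
export_kwargs = dict(
    out_file=None,
    feature_names=list(iris.feature_names),
    class_names=list(iris.target_names),
    filled=True,
)
dot_untuned = export_graphviz(models["decision tree"], **export_kwargs)
dot_tuned = export_graphviz(tuned_tree, **export_kwargs)

# With the graphviz Python package installed, the DOT text can be rendered, e.g.:
#   import graphviz
#   graphviz.Source(dot_untuned).render("tree_untuned", format="png")
```

Comparing the two rendered trees side by side should make it easier to see how the tuned hyperparameters change the depth and the splits, which feeds into the interpretation task.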