The project has a lot of potential, as it uses one of the richer datasets and covers a very interesting topic. The preliminary analysis also shows that the dataset is relatively easy to work with: sample points are treated as all-or-nothing, with either complete or incomplete features and no corruption of the data. It also looks like decent progress has been made on the project, with the forest classifier achieving a 12% error rate.
Where I think the report is lacking is in the explanation of the decisions made for the project. For example, it's unclear why the two models, One-vs-Rest and the forest classifier, were chosen. I don't believe they were covered in the course, and even so, there should be some explanation of why each model is suited to this problem. Another example is the following: "we used a total of 158 features after preprocessing, which required dropping feature columns which represented one level of a particular categorical variable". Why did you decide to drop one column per categorical variable? And where do the 158 features come from? Each survey has 35 questions (with multiple parts), so do those amount to 158? If not, mention which features are dropped and why.
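For what it's worth, the column-dropping the quoted sentence describes is presumably standard dummy encoding with one reference level dropped per categorical variable. A minimal sketch of that idea in pandas (the `education` column and its levels are hypothetical, not from the report):

```python
import pandas as pd

# Hypothetical categorical survey answer with three levels.
df = pd.DataFrame({"education": ["hs", "college", "grad", "college"]})

# Full one-hot encoding: one indicator column per level (3 columns).
all_levels = pd.get_dummies(df["education"])

# drop_first=True drops one level per categorical variable (2 columns).
# This avoids the "dummy variable trap": with all k levels present, the
# indicator columns always sum to 1 and are perfectly collinear with an
# intercept term, which is a problem for linear models.
dropped = pd.get_dummies(df["education"], drop_first=True)

print(list(all_levels.columns))  # ['college', 'grad', 'hs']
print(list(dropped.columns))     # ['grad', 'hs']
```

If this is indeed what was done, a one-sentence note in the report to that effect, plus a count of how many columns each multi-part question expands into, would make the 158 figure easy to verify.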