Peer Review - Midterm Report #77

Open

posadaj opened this issue Nov 5, 2016 · 0 comments

posadaj commented Nov 5, 2016

The project has a lot of potential, as it seems to be one of the richer datasets and a very interesting topic. The preliminary analysis also shows that the dataset is relatively easy to work with: sample points are treated as all-or-nothing, with each one either complete or incomplete across its features, and there is no corruption of the data. It also looks like decent progress has been made on the project, with the forest classifier achieving a 12% error rate.

Where I think the report is lacking is in the explanation of the decisions made for the project. For example, it's unclear why the two models, One-vs-Rest and the forest classifier, were used. I don't believe they were covered in the course, and even so, there should be some explanation of why each model is suited to this problem. Another example is the following: "we used a total of 158 features after preprocessing, which required dropping feature columns which represented one level of a particular categorical variable". Why did you decide to drop one column per categorical variable? And where do the 158 features come from? Each survey has 35 questions (with multiple parts), so do those amount to 158? If not, mention which features were dropped and why.
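To make the question concrete, here is a minimal sketch of what I *assume* the preprocessing and modeling look like, since the report doesn't say. The file name, label column, split, and hyperparameters are all my guesses, not taken from the report:

```python
# Hypothetical reconstruction of the pipeline the report seems to describe.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

df = pd.read_csv("survey.csv")  # hypothetical file name and label column

# One-hot encode categorical variables, dropping the first level of each.
# Dropping one level per variable avoids perfectly collinear dummy columns,
# which may be why the feature count lands at 158 -- but the report should
# say so explicitly and account for the count.
X = pd.get_dummies(df.drop(columns=["label"]), drop_first=True)
y = df["label"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# One-vs-Rest wrapper around a random forest. Note that scikit-learn's
# random forest handles multiclass targets natively, so if this is the
# setup, the report should justify adding the OvR wrapper at all.
clf = OneVsRestClassifier(RandomForestClassifier(n_estimators=100, random_state=0))
clf.fit(X_train, y_train)
print("error rate:", 1 - clf.score(X_test, y_test))
```

If the actual pipeline differs from this sketch, describing where it differs would answer most of my questions above.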
