Add some form of plagiarism detection and relevance scoring #9

retypepassword · 2016-03-05T17:17:58Z

Many answers to the NB assignments are simply copied and pasted from elsewhere (often the first Google search result), which defeats the purpose of the assignments.

Plagiarism detection could mitigate this issue somewhat. One approach is to compare comments against one another and flag any pair of comments (above a minimum length) whose similarity exceeds a set threshold. A randomly selected sample of comments, or samples drawn from groups of highly similar comments, could also be checked for plagiarism against Google search results using the custom search engine API (search.cse.list).
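As a rough sketch of the comment-to-comment comparison, one option is Jaccard similarity over word trigrams. The similarity measure, the function names, and the default values (`threshold=0.6`, `min_words=8`) are all assumptions chosen for illustration, not something already decided for this project:

```python
import re
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in a comment."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_similar(comments, threshold=0.6, min_words=8, k=3):
    """Flag pairs of comments whose shingle similarity meets the threshold.

    comments: dict mapping a comment id to its text (hypothetical shape).
    Comments shorter than min_words are skipped entirely.
    Returns a list of (id_a, id_b, score) tuples.
    """
    sets = {
        cid: shingles(text, k)
        for cid, text in comments.items()
        if len(text.split()) >= min_words
    }
    flagged = []
    for (id_a, set_a), (id_b, set_b) in combinations(sets.items(), 2):
        score = jaccard(set_a, set_b)
        if score >= threshold:
            flagged.append((id_a, id_b, score))
    return flagged
```

Comparing every pair is quadratic in the number of comments, which is probably acceptable for a single assignment but would need something smarter (for example, hashing shingles into buckets) for larger collections.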

A relevance score could be used alongside plagiarism detection. Relevance could be calculated by checking the words in a comment against the corpus of all words used in responses to the assignment, excluding the 100 most commonly used words. It would also be important to require a sufficient number of distinct words, so that a comment consisting of just one or two words repeated over and over is not given a high relevance score.
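One possible way to turn this into a number, sketched under several assumptions: each distinct non-common word in a comment is scored by the fraction of *other* comments that also use it, and the comment's relevance is the mean of those fractions. The short `STOPWORDS` set below is only a placeholder for the 100 most common words, and `min_distinct=5` is an arbitrary illustrative cutoff:

```python
import re
from collections import Counter

# Placeholder only: stands in for the list of the 100 most common words.
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "to", "in", "is", "it", "that"})


def content_words(text, stopwords=STOPWORDS):
    """Return the distinct non-stopword words in a comment."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in stopwords}


def relevance_scores(comments, min_distinct=5, stopwords=STOPWORDS):
    """Score each comment against the words used by all other comments.

    comments: dict mapping a comment id to its text (hypothetical shape).
    A comment with fewer than min_distinct distinct content words scores 0.0,
    so repeating one or two words cannot produce a high score.
    """
    vocab = {cid: content_words(text, stopwords) for cid, text in comments.items()}
    # Number of comments in which each word appears at least once.
    doc_freq = Counter(w for words in vocab.values() for w in words)
    others = len(comments) - 1
    scores = {}
    for cid, words in vocab.items():
        if others < 1 or len(words) < min_distinct:
            scores[cid] = 0.0
            continue
        # Subtract 1 so a comment gets no credit for matching itself.
        shared = sum((doc_freq[w] - 1) / others for w in words)
        scores[cid] = shared / len(words)
    return scores
```

Because the score uses distinct words rather than raw counts, repetition does not inflate it, and the `min_distinct` check handles the one-or-two-word case directly.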

These features would likely take significant time and effort to implement, so I'm not inclined to work on them unless there's a consensus that they would be useful and necessary.

retypepassword added the enhancement and low priority labels on Mar 5, 2016

retypepassword changed the title ~~Add some form of plagiarism detection and relevance detection~~ Add some form of plagiarism detection and relevance scoring Mar 5, 2016
