Name scrubbing issues #19

srinivasgumdelli · 2016-03-09T22:16:18Z

I was testing out this impressive library, and my text files had some names that start with a lower-case letter (for example, sarah). These names were not being filtered.

One more issue I found with version 1.0.3 is that

Hello. Please testing

will be replaced by

{{NAME}}. {{NAME}} testing

Thanks,
Sri

deanmalmgren · 2016-04-15T14:32:33Z

Thanks for bringing this to our attention, @srinivasgumdelli! After digging around a bit, it appears that the problem with words like Hello and Please started with textblob version 0.10.1. I've pinned the textblob version to 0.10.0, which should address the Hello and Please issue you were having.
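
For readers who want to confirm which textblob version their environment actually resolved, here is a minimal sketch using only the standard library. The helper name `textblob_pin_status` is hypothetical and not part of scrubadub or textblob; the only value taken from this thread is the pinned version 0.10.0.

```python
from importlib import metadata

# Version named in the comment above; everything else here is illustrative.
PINNED_TEXTBLOB = "0.10.0"


def textblob_pin_status(pinned=PINNED_TEXTBLOB):
    """Report whether the installed textblob matches the pinned version."""
    try:
        installed = metadata.version("textblob")
    except metadata.PackageNotFoundError:
        return "not installed"
    if installed == pinned:
        return "ok"
    return f"mismatch: {installed}"


if __name__ == "__main__":
    print(textblob_pin_status())
```

A "mismatch" result would suggest that the environment picked up a newer textblob, which is the situation the pin is meant to avoid.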

Lower-case names remain an issue, though, which I suspect will be better addressed by machine learning techniques (#16) than by strict natural language processing. I added a unit test for this so we can be sure to address it in a more robust way in the future. For now, we're skipping that unit test.
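
As an illustration of the "add a test now, skip it until it passes" approach, here is a rough sketch with the standard `unittest` module. It is not the test that was added to scrubadub: `scrub_names` and `KNOWN_NAMES` are hypothetical stand-ins for a real detector, used only so the example runs on its own.

```python
import io
import re
import unittest

# Hypothetical stand-in for a name scrubber; NOT scrubadub's implementation.
# It only recognises capitalised names from a tiny fixed list.
KNOWN_NAMES = {"Sarah", "John"}


def scrub_names(text):
    """Replace known capitalised names with the {{NAME}} placeholder."""
    def replace(match):
        word = match.group(0)
        return "{{NAME}}" if word in KNOWN_NAMES else word

    return re.sub(r"\b\w+\b", replace, text)


class NameScrubbingTest(unittest.TestCase):
    def test_capitalized_name(self):
        self.assertEqual(scrub_names("Sarah is here"), "{{NAME}} is here")

    def test_non_names_untouched(self):
        # Guards against the Hello/Please false positives from the report.
        self.assertEqual(
            scrub_names("Hello. Please testing"), "Hello. Please testing"
        )

    @unittest.skip("lower-case names are not detected yet")
    def test_lowercase_name(self):
        # Documents the desired behaviour without failing the suite.
        self.assertEqual(scrub_names("sarah is here"), "{{NAME}} is here")


def run_suite():
    """Run the tests quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NameScrubbingTest)
    return unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
```

The skipped test keeps the known gap visible in every test run, and removing the decorator later turns it into a regular regression test.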

If you have any other suggestions for the package, please let me know!

deanmalmgren · 2016-04-15T14:56:24Z

Oh, and I just released version 1.1.0 of scrubadub that should address this issue. You should be able to pip install -U scrubadub and hopefully that will address most of the problems you were having.

deanmalmgren closed this as completed in 7491e94 Apr 15, 2016
