Drug names with minor typographical differences is repeated #1

Open
sunbiz opened this issue Jul 12, 2022 · 1 comment

Comments


sunbiz commented Jul 12, 2022

We need a mechanism to detect minor typographical differences in drug names so that the same name is not repeated in the concept or drug tables. Currently, single spaces, multiple spaces, or missing spaces are creating duplicate concept/drug names in the database.
Below is an example:

image
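
One possible way to catch such duplicates, as a minimal sketch rather than anything taken from the codebase: compare names on a normalized key that ignores letter case and all whitespace. The function name `normalize_drug_name` and the sample names are hypothetical, and the sketch assumes that spacing and case never distinguish two genuinely different drugs.

```python
import re


def normalize_drug_name(name: str) -> str:
    """Return a comparison key that ignores case and all whitespace."""
    # Assumption: spacing and letter case never separate two real drugs.
    return re.sub(r"\s+", "", name).casefold()


# Hypothetical sample names that differ only in spacing.
names = ["Paracetamol 500mg", "Paracetamol  500mg", "Paracetamol 500 mg"]
keys = {normalize_drug_name(n) for n in names}
print(keys)  # all three names collapse to the single key 'paracetamol500mg'
```

Two names would then be treated as duplicates whenever their keys are equal, while the originally entered spelling can still be stored for display.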


arygup commented Mar 15, 2023

Dear sir, after going through the codebase, I have tried my best to write a function that deletes all rows with similar names. The function can also be modified to prevent inserting new rows for data that already exists.

Kindly consider going through this small piece of code:
https://github.com/arygup/Projects/blob/main/HandlingTypos.ipynb
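
The notebook itself is not reproduced here. As a rough, independent sketch of the idea described above (dropping rows with similar names, and reusing the same check to block new duplicates), something along these lines could work. The names `name_key`, `add_if_new`, and `deduplicate` are hypothetical, an in-memory dict stands in for the real drug table, and "similar" is assumed to mean "equal after ignoring case and whitespace".

```python
import re


def name_key(name: str) -> str:
    """Comparison key: the name with whitespace removed and case folded."""
    return re.sub(r"\s+", "", name).casefold()


def add_if_new(existing: dict, name: str) -> bool:
    """Store name unless a spacing/case variant exists; return True if stored."""
    key = name_key(name)
    if key in existing:
        return False  # a variant of this name is already present
    # Keep a tidy spelling: trimmed, with runs of whitespace collapsed.
    existing[key] = " ".join(name.split())
    return True


def deduplicate(names):
    """Keep only the first spelling seen for each key, in input order."""
    seen = {}
    for name in names:
        add_if_new(seen, name)
    return list(seen.values())


rows = ["Amoxicillin 250mg", "Amoxicillin  250mg", "Ibuprofen"]
print(deduplicate(rows))  # ['Amoxicillin 250mg', 'Ibuprofen']
```

In a real database the same key could be kept in an extra column with a unique constraint, so that the insert guard is enforced by the database rather than by application code; that design is only a suggestion.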

Best Regards,
Aryan Gupta
