Reducing RAM footprint + Adding preferred term output #67
base: master
Hmm, after finding a bug in my code that excluded a good portion of terms from the simstring DB, the RAM reduction isn't as large as I had hoped. A couple of things are fixed, but large UMLS sets still cannot be processed.
Hi! Active Subset: excludes "legacy" sources that have not been updated for several years in the UMLS Metathesaurus.
Hello, thank you for your commit, but how can we get the synonyms and the source (Snomed, MSH, etc.)?
The preferred term for every match is now also returned (useful for normalizing terms in a text).
The RAM footprint is reduced by removing the sets in which all terms were accumulated. Instead, only a set of the terms already saved is kept per concept. As a consequence, duplicate terms can be inserted into the simstring database when two equal terms belong to different UMLS concepts. As a fix, duplicates returned by the simstring database are removed when matching.
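To illustrate the idea, here is a minimal sketch and not the code in this PR. The function names `iter_new_terms` and `dedupe_matches`, the made-up concept IDs, and the plain list standing in for the simstring database are all assumptions for illustration. It also assumes the rows arrive grouped by concept, so that only the current concept's terms need to be held in memory.

```python
from typing import Iterable, Iterator, List, Optional, Set, Tuple


def iter_new_terms(rows: Iterable[Tuple[str, str]]) -> Iterator[str]:
    """Yield each term the first time it is seen within a concept.

    Assumption: rows are (concept_id, term) pairs grouped by concept_id,
    so the `seen` set can be reset whenever the concept changes.
    """
    current_concept: Optional[str] = None
    seen: Set[str] = set()
    for concept_id, term in rows:
        if concept_id != current_concept:
            # New concept: forget the previous concept's terms to save RAM.
            current_concept = concept_id
            seen = set()
        if term not in seen:
            seen.add(term)
            yield term


def dedupe_matches(matches: Iterable[str]) -> List[str]:
    """Drop repeated terms at match time, keeping first-seen order."""
    return list(dict.fromkeys(matches))


# Made-up rows: the same term "cold" appears under two different concepts.
rows = [
    ("C001", "cold"),
    ("C001", "cold"),          # duplicate inside one concept: skipped
    ("C001", "common cold"),
    ("C002", "cold"),          # same term, different concept: inserted again
]

# A plain list stands in for the simstring database here.
db_terms = list(iter_new_terms(rows))

# Duplicates across concepts are only removed when results are read back.
unique_terms = dedupe_matches(db_terms)
```

The trade-off is that the database may hold a term once per concept it belongs to, in exchange for never keeping a global set of all terms in memory.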