Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

therapy names aren't unique per Evidence, and Disease property field sometimes is not there #153

Open
abidalikashi opened this issue Nov 1, 2024 · 2 comments

Comments

@abidalikashi
Copy link

I was going to open a PR with this fix but seems like its private access only
so my fix was adding those two properties to the evidence class
this maybe wrong approach I'd like to know if it is, I am not an expert on the data itself

@property
def get_unique_therapies(self):
    unique_therapy = set()
    for therapy in self.therapies:
        if therapy not in unique_therapy:
            unique_therapy.add(therapy.name)
    return list(unique_therapy)

@property
def get_disease_name(self):
    return getattr(self.disease, 'name', 'N/A')
@susannasiebert
Copy link
Collaborator

That is certainly unexpected. I contacted the curation team to see if there is a reason why these particular evidence items would be linked to the same therapy twice. Hopefully we can get this data cleaned up.

Regarding the disease name: that behavior is expected because functional evidence is not linked to a specific disease. I'm not sure whether or not to add the proposed property. On the one hand it's certainly a helpful shortcut but on the other hand it might cause confusion since the disease name is not literally N/A for functional evidence. I would be supportive if that property returned None instead of `N/A' if no disease was associated with the evidence item but at that point I'm not sure how much more helpful this method would be.

@abidalikashi
Copy link
Author

thank you for the clarification, I am not well versed in the data yet however I see your point about None, its a better approach, the reason I needed it to be that way is that it simplifies iterative operations where I have to account for disease name property not to exist at all, it would be much cleaner code if I just get None instead of the entire field missing and causing iteration to break.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants