Clarification on Handling ogbl-wikikg2 as Heterogeneous Dataset #482

Open
HumeraSabir7303 opened this issue May 30, 2024 · 0 comments

Hi OGB Team,

I am currently working with the ogbl-wikikg2 dataset for a knowledge graph completion task. According to the dataset description on the OGB website, ogbl-wikikg2 is a knowledge graph (KG) that contains triplet edges (head, relation, tail) with 2,500,604 entities and 535 relation types. This suggests that the dataset inherently contains heterogeneous data due to the multiple relation types.
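To make the data layout concrete, here is a minimal sketch of the triplet representation I am referring to. It assumes a split is available as parallel `head`, `relation`, and `tail` arrays; the toy dictionary below only imitates that layout, and its IDs are made up rather than taken from ogbl-wikikg2.

```python
from collections import Counter

# Toy stand-in for one split of a knowledge graph. The entity and relation IDs
# are invented for illustration and are NOT taken from ogbl-wikikg2.
toy_split = {
    "head":     [0, 1, 2, 0, 3],
    "relation": [0, 1, 0, 2, 1],
    "tail":     [1, 2, 3, 3, 0],
}

def to_triples(split):
    """Zip the parallel arrays into (head, relation, tail) tuples."""
    return list(zip(split["head"], split["relation"], split["tail"]))

def summarize(split):
    """Return (#distinct entities, #distinct relation types, edges per relation)."""
    entities = set(split["head"]) | set(split["tail"])
    per_relation = Counter(split["relation"])
    return len(entities), len(per_relation), dict(per_relation)

triples = to_triples(toy_split)
num_entities, num_relations, edges_per_relation = summarize(toy_split)
```

On the real dataset the same summary should report the entity and relation counts quoted above, provided the split covers every entity and relation type.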

However, I noticed that in the dataset's metadata, is_hetero is set to False. This raises a few questions and potential issues for users who wish to treat this dataset as a heterogeneous graph for KG completion tasks.
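For reference, this is roughly what I mean by "treating the dataset as heterogeneous": regrouping a flat, relation-typed edge list into one edge set per relation. It is only a sketch under my own assumptions. The helper `group_edges_by_relation` and the node type name `"entity"` are hypothetical, and the key layout `(source type, relation, destination type)` is simply a convention I picked for illustration.

```python
from collections import defaultdict

def group_edges_by_relation(heads, relations, tails, node_type="entity"):
    """Regroup a flat typed edge list into a per-relation dictionary.

    Keys are (source type, relation id, destination type) tuples and values
    are [heads, tails] lists for that relation. `node_type` is a placeholder
    name, since all nodes are assumed to share a single type.
    """
    grouped = defaultdict(lambda: [[], []])
    for h, r, t in zip(heads, relations, tails):
        key = (node_type, r, node_type)
        grouped[key][0].append(h)
        grouped[key][1].append(t)
    return dict(grouped)

# Toy edges, invented for illustration.
heads = [0, 1, 2, 0, 3]
relations = [0, 1, 0, 2, 1]
tails = [1, 2, 3, 3, 0]
edge_index_dict = group_edges_by_relation(heads, relations, tails)
```

With 535 relation types this produces 535 edge sets over a single node type, which is part of why I am unsure whether the flag is meant to describe node types, edge types, or just the storage format.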

Questions:

Why is is_hetero set to False for ogbl-wikikg2?

- Given the nature of the data, should this be considered a heterogeneous graph dataset?

Impact on Model Implementation:

- For tasks such as KG completion, how should users handle relation types if the dataset is treated as homogeneous?
- Can you provide guidance or best practices for users who want to leverage the heterogeneous nature of the dataset (i.e., using relation types effectively in models)?

Evaluation Metric:

- The current evaluation setup (e.g., Mean Reciprocal Rank, MRR) seems to align with KG completion tasks. Are there specific reasons for not treating this dataset as heterogeneous?

Documentation and Examples:

- Could you provide more detailed documentation or examples on how to implement models that can handle the multiple relation types in ogbl-wikikg2?
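For clarity on the metric I mention, this is how I understand MRR: each positive triple is scored against a set of negative candidates, and the reciprocal of its rank is averaged over all positives. The sketch below is my own simplified version, not the official evaluator. In particular, it assumes ties are resolved in favour of the positive, and the scores are made up.

```python
def reciprocal_rank(pos_score, neg_scores):
    """Reciprocal rank of one positive among its negatives.

    The rank is 1 plus the number of negatives scoring strictly higher, so a
    tie counts in favour of the positive (an assumption of this sketch).
    """
    rank = 1 + sum(1 for s in neg_scores if s > pos_score)
    return 1.0 / rank

def mean_reciprocal_rank(pos_scores, neg_scores_per_pos):
    """Average the reciprocal ranks over all positive triples."""
    rr = [reciprocal_rank(p, n) for p, n in zip(pos_scores, neg_scores_per_pos)]
    return sum(rr) / len(rr)

# Toy scores: one positive per row, with three negatives each.
pos_scores = [0.9, 0.2, 0.5]
neg_scores = [[0.1, 0.3, 0.8], [0.4, 0.1, 0.3], [0.7, 0.2, 0.1]]
mrr = mean_reciprocal_rank(pos_scores, neg_scores)  # ranks are 1, 3, 2
```

Nothing in this computation depends on whether the graph is stored as homogeneous or heterogeneous, which is why I am asking how the flag relates to the evaluation setup.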
Suggested Improvements:

- Provide additional examples or guidelines on handling ogbl-wikikg2 as a heterogeneous graph, specifically for KG completion tasks.
- Clarify any potential implications for using the dataset in its current form vs. treating it as heterogeneous.

I believe addressing these points would help many researchers and practitioners better utilize the ogbl-wikikg2 dataset for their projects.
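As an illustration of the kind of example I have in mind, here is a minimal sketch of relation-aware scoring using two standard KG embedding scores, TransE and DistMult. The embedding tables, their values, and the helper `score_triple` are all hypothetical. A real model would learn the embeddings rather than hard-code them.

```python
import math

def transe_score(h, r, t):
    """TransE: negative L2 distance between (h + r) and t; higher is more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def distmult_score(h, r, t):
    """DistMult: sum over dimensions of h * r * t; higher is more plausible."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))

# Hand-written 2-dimensional embeddings, one vector per entity and per relation.
entity_emb = {0: [1.0, 0.0], 1: [1.0, 1.0], 2: [0.0, 1.0]}
relation_emb = {0: [0.0, 1.0], 1: [-1.0, 0.0]}

def score_triple(head, relation, tail, score_fn=transe_score):
    """Look up the three embeddings and apply the chosen scoring function."""
    return score_fn(entity_emb[head], relation_emb[relation], entity_emb[tail])

good = score_triple(0, 0, 1)  # entity 0 plus relation 0 lands exactly on entity 1
bad = score_triple(0, 0, 2)   # same head and relation, but a tail that is off by 1
```

Here the relation type enters only through its own embedding vector, so the graph itself can stay a single typed edge list. Guidance on whether this is the intended way to use the relation types would be very helpful.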

Thank you for your attention to this matter.
