Build BOW model for Semantic Domain Identification #30

Open
7 tasks
Tracked by #11
janetzki opened this issue Feb 24, 2023 · 0 comments
Labels
GNN model nice-to-have RQ 2 Research Question #2 (GNN for semantic domain identification)

janetzki commented Feb 24, 2023

Goal

We want to build a GNN-based edge prediction BOW model for semantic domain identification (SDI). We hypothesize that it outperforms the simple baseline model.
Motivation: SDI with F1 > 0.30 for 1 tpi/meu

Tasks

  • Acquire refined mappings from verses to semantic domains (see "Acquire MARBLE Data" #1)
  • use the refined mappings from words in verses to semantic domains (SDs) to assign SDs to words in verses from the low-resource language (LRL)
    • simply assign the SDs of each eng word to its aligned word in the LRL
    • if this yields many false positive mappings (i.e., low precision): refine the assignments with the generated SD dicts for the LRL (set intersection)
  • collect a BOW for every word with an assigned SD (context window: 2 words before and 2 words after the word in the middle)
  • aggregate the BOWs by SD
  • perform SDI by extracting the BOW for every candidate word in the input sentence and computing the cosine distance to each aggregated BOW
  • try out baseline: look up each word in a dictionary
  • consider usefulness of WSD (word sense disambiguation) with pywsd or different tool: Eng verse → WordNet → SD (see Jonathan’s 2nd mail)
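
The BOW steps in the task list (collect, aggregate by SD, compare by cosine) could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the function names, the toy sentences, and the domain labels `"water"` and `"food"` are all hypothetical, and it ranks by cosine similarity (which is 1 minus cosine distance) rather than by the distance itself.

```python
from collections import Counter, defaultdict
from math import sqrt

WINDOW = 2  # two words before and after the target word, as in the task list


def context_bow(tokens, index, window=WINDOW):
    """Return the bag of words around tokens[index], excluding the word itself."""
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return Counter(left + right)


def aggregate_by_domain(labeled_sentences):
    """Sum the context BOWs of all labeled words, grouped by semantic domain.

    labeled_sentences: iterable of (tokens, {token_index: domain_id}) pairs.
    """
    domain_bows = defaultdict(Counter)
    for tokens, labels in labeled_sentences:
        for index, domain in labels.items():
            domain_bows[domain].update(context_bow(tokens, index))
    return domain_bows


def cosine_similarity(a, b):
    """Cosine similarity of two Counters; 0.0 if either one is empty."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_domains(tokens, domain_bows):
    """For each token, return (token, best-scoring domain, similarity)."""
    predictions = []
    for index, token in enumerate(tokens):
        bow = context_bow(tokens, index)
        scores = {d: cosine_similarity(bow, agg) for d, agg in domain_bows.items()}
        best = max(scores, key=scores.get) if scores else None
        predictions.append((token, best, scores.get(best, 0.0)))
    return predictions


# Toy data (hypothetical): each sentence has one word with an assigned SD.
training = [
    (["the", "river", "water", "flows", "fast"], {2: "water"}),
    (["he", "eats", "bread", "every", "day"], {2: "food"}),
]
domain_bows = aggregate_by_domain(training)
predictions = identify_domains(["the", "river", "stream", "flows", "fast"], domain_bows)
```

In this toy setup, the unseen word `stream` shares its full context with the labeled word `water`, so it is assigned that domain. A real run would probably need a similarity threshold so that words without a plausible SD stay unlabeled.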
janetzki added the "RQ 2" label (Research Question #2: GNN for semantic domain identification) on Feb 24, 2023
janetzki changed the title from "Build GNN model for Semantic Domain Identification" to "Build BOW model for Semantic Domain Identification" on Jul 24, 2023