dictScrap #17

Open
kernc opened this issue Dec 11, 2020 · 1 comment

kernc commented Dec 11, 2020

Say I have a dictionary like:

<TEI xmlns:m="http://elex.is/wp1/teiLex0Mapper/meta" xmlns:a="http://elex.is/wp1/teiLex0Mapper/legacyAttributes" xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>hebrew_syns-wordnet</title>
      </titleStmt>
      <extent></extent>
      <publicationStmt>
        <publisher>Wordnet</publisher>
        <availability>
          <licence></licence>
        </availability>
        <date when=""></date>
        <idno></idno>
      </publicationStmt>
      <sourceDesc>
        <p></p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <entry m:e="synonym" a:num="1214" a:ev="u" a:idmorpho="0" a:stat="assigned" a:tgr="tg1" a:modify="2005-06-22" xml:lang="" type="null" xml:id="entry_1">
        <form type="lemma">
          <orth m:e="lemma">!מַשֶׁהוּ</orth>
        </form>
        <dictScrap>
          <seg m:e="dictinfo" a:bidict="melingo" a:bisense="1" a:monodict="rav-milim"/>
          <cit type="translationEquivalent"><quote m:e="teqs" xml:lang="eng">something</quote></cit>
          <seg m:e="comment">#ASSIGN:50=[gnd=39.0,dfl=11.0] #MZ@#IE too general</seg>
          <seg m:e="history">eyal-22 Jun 2005, assign-6 Nov 2003</seg>
        </dictScrap>
        <gramGrp>
          <gram type="pos">n</gram>
        </gramGrp>
      </entry>
      <entry m:e="synonym" a:num="8653" a:ev="yy" a:idmorpho="95136" a:stat="checked" a:modify="2005-06-22" xml:lang="" type="null" xml:id="entry_2">
        <form type="lemma">
          <orth m:e="lemma">יֵשׁוּת</orth>
        </form>
        <dictScrap>
          <seg m:e="dictinfo" a:bidict="melingo" a:bisense="1.1" a:monodict="rav-milim" a:monosense="1.1"/>
          <cit type="translationEquivalent"><quote m:e="teqs" xml:lang="eng">entity</quote></cit><seg m:e="comment">#IE</seg>
          <seg m:e="history">eyal-22 Jun 2005</seg>
        </dictScrap>
        <gramGrp>
          <gram type="pos">n</gram>
        </gramGrp>
      </entry>
    </body>
  </text>
</TEI>

After the transformation, the resulting XML (with PR #16 applied):

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://lari-datasets.ilc.cnr.it/nenu_sample#" xmlns:void="http://rdfs.org/ns/void#" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:ns="http://creativecommons.org/ns#" xmlns:lime="http://www.w3.org/ns/lemon/lime#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:lexinfo="http://www.lexinfo.net/ontology/3.0/lexinfo#" xmlns:lexicog="http://www.w3.org/ns/lemon/lexicog#" xmlns:dct="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:ontolex="http://www.w3.org/ns/lemon/ontolex#" xmlns:vann="http://purl.org/vocab/vann/" xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <lime:Lexicon>
    <dc:title>hebrew_syns-wordnet</dc:title>
    <dc:publisher>Wordnet</dc:publisher>
    <lime:entry>
      <ontolex:LexicalEntry rdf:ID="entry_1">
        <ontolex:canonicalForm>
          <rdf:Description>
            <ontolex:writtenRep xml:lang="">!מַשֶׁהוּ</ontolex:writtenRep>
          </rdf:Description>
        </ontolex:canonicalForm>
        <dictScrap xmlns="http://www.tei-c.org/ns/1.0">
          <seg xmlns:m="http://elex.is/wp1/teiLex0Mapper/meta" xmlns:a="http://elex.is/wp1/teiLex0Mapper/legacyAttributes" m:e="dictinfo" a:bidict="melingo" a:bisense="1" a:monodict="rav-milim"/>
          <lexinfo:senseTranslation xmlns="http://lari-datasets.ilc.cnr.it/nenu_sample#" xml:lang="eng">something</lexinfo:senseTranslation>
          <seg xmlns:m="http://elex.is/wp1/teiLex0Mapper/meta" m:e="comment">#ASSIGN:50=[gnd=39.0,dfl=11.0] #MZ@#IE too general</seg>
          <seg xmlns:m="http://elex.is/wp1/teiLex0Mapper/meta" m:e="history">eyal-22 Jun 2005, assign-6 Nov 2003</seg>
        </dictScrap>
      </ontolex:LexicalEntry>
    </lime:entry>
    <lime:entry>
      <ontolex:LexicalEntry rdf:ID="entry_2">
        <ontolex:canonicalForm>
          <rdf:Description>
            <ontolex:writtenRep xml:lang="">יֵשׁוּת</ontolex:writtenRep>
          </rdf:Description>
        </ontolex:canonicalForm>
        <dictScrap xmlns="http://www.tei-c.org/ns/1.0">
          <seg xmlns:m="http://elex.is/wp1/teiLex0Mapper/meta" xmlns:a="http://elex.is/wp1/teiLex0Mapper/legacyAttributes" m:e="dictinfo" a:bidict="melingo" a:bisense="1.1" a:monodict="rav-milim" a:monosense="1.1"/>
          <lexinfo:senseTranslation xmlns="http://lari-datasets.ilc.cnr.it/nenu_sample#" xml:lang="eng">entity</lexinfo:senseTranslation>
          <seg xmlns:m="http://elex.is/wp1/teiLex0Mapper/meta" m:e="comment">#IE</seg>
          <seg xmlns:m="http://elex.is/wp1/teiLex0Mapper/meta" m:e="history">eyal-22 Jun 2005</seg>
        </dictScrap>
      </ontolex:LexicalEntry>
    </lime:entry>
  </lime:Lexicon>
</rdf:RDF>

is not valid RDF/XML because of "Multiple children of property element" errors inside the dictScrap elements.
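
For context, the error follows from the RDF/XML striping rule: node elements and property elements alternate, and a property element may hold at most one node element unless it carries an rdf:parseType attribute. Here dictScrap sits in a property position under ontolex:LexicalEntry and has several children. Below is a minimal Python sketch, standard library only, that flags such property elements. The function names and the reduced sample are illustrative assumptions and not part of the converter.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PARSE_TYPE = f"{{{RDF_NS}}}parseType"

# Reduced, hypothetical sample modelled on the output above.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:lime="http://www.w3.org/ns/lemon/lime#"
         xmlns:ontolex="http://www.w3.org/ns/lemon/ontolex#">
  <lime:Lexicon>
    <lime:entry>
      <ontolex:LexicalEntry rdf:ID="entry_1">
        <dictScrap xmlns="http://www.tei-c.org/ns/1.0">
          <seg>#IE</seg>
          <seg>eyal-22 Jun 2005</seg>
        </dictScrap>
      </ontolex:LexicalEntry>
    </lime:entry>
  </lime:Lexicon>
</rdf:RDF>
"""


def _check_node(node, problems):
    # Children of a node element are property elements.
    for prop in node:
        _check_property(prop, problems)


def _check_property(prop, problems):
    parse_type = prop.get(PARSE_TYPE)
    children = list(prop)
    if parse_type == "Resource":
        # Content is a list of property elements of a blank node.
        for child in children:
            _check_property(child, problems)
    elif parse_type == "Collection":
        # Content is a list of node elements; several are allowed.
        for child in children:
            _check_node(child, problems)
    elif parse_type is None:
        # Plain property element: at most one node element inside.
        if len(children) > 1:
            problems.append(prop.tag)
        for child in children:
            _check_node(child, problems)
    # Any other parseType (e.g. "Literal") makes the content an XML
    # literal, so nothing inside it is checked.


def find_multi_child_properties(rdf_xml):
    """Return the tags of property elements holding more than one child."""
    root = ET.fromstring(rdf_xml)
    problems = []
    # The rdf:RDF wrapper holds node elements; without it, the root is one.
    nodes = list(root) if root.tag == f"{{{RDF_NS}}}RDF" else [root]
    for node in nodes:
        _check_node(node, problems)
    return problems


if __name__ == "__main__":
    print(find_multi_child_properties(SAMPLE))
```

Run against the full output above, a check like this should report both dictScrap elements, since each one holds several seg children plus the lexinfo:senseTranslation element.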

What would be a good way to deal with that?

laurentromary commented Dec 11, 2020

Hi kernc. It's Laurent here. In keeping with the spirit in which we set up TEI Lex 0, I would suggest first making the content TEI Lex 0 compliant and then doing the transform. The source data is a little weird (under-encoded from a TEI point of view, plus quite a few additional hacks). @ttasovac what do you think?

kernc mentioned this issue Dec 17, 2020