Commit

Add Relation class and .relation_map() methods (#227)
* Change _DatabaseEntity._ENTITY_TYPE to enum
* Add Relation class
* Add Sense.relation_map() and Synset.relation_map() methods, which are
  like .relations(), but the keys are Relation objects and map 1-to-1
  to targets.
* Add unique_list() utility function for code clarity
goodmami authored Dec 11, 2024
1 parent 33d5bdb commit c994588
Showing 10 changed files with 414 additions and 115 deletions.
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -3,8 +3,15 @@
## [Unreleased][unreleased]

## Index

* Added `oewn:2024` ([#221])

## Added

* `Relation` class ([#216])
* `Sense.relation_map()` method ([#216])
* `Synset.relation_map()` method ([#167], [#216])


## [v0.10.1]

@@ -681,6 +688,7 @@ abandoned, but this is an entirely new codebase.
[#155]: https://github.com/goodmami/wn/issues/155
[#156]: https://github.com/goodmami/wn/issues/156
[#157]: https://github.com/goodmami/wn/issues/157
[#167]: https://github.com/goodmami/wn/issues/167
[#168]: https://github.com/goodmami/wn/issues/168
[#169]: https://github.com/goodmami/wn/issues/169
[#177]: https://github.com/goodmami/wn/issues/177
@@ -696,4 +704,5 @@ abandoned, but this is an entirely new codebase.
[#213]: https://github.com/goodmami/wn/issues/213
[#214]: https://github.com/goodmami/wn/issues/214
[#215]: https://github.com/goodmami/wn/issues/215
[#216]: https://github.com/goodmami/wn/issues/216
[#221]: https://github.com/goodmami/wn/issues/221
61 changes: 61 additions & 0 deletions docs/api/wn.rst
@@ -170,6 +170,7 @@ The Sense Class
   .. automethod:: counts
   .. automethod:: metadata
   .. automethod:: relations
   .. automethod:: relation_map
   .. automethod:: get_related
   .. automethod:: get_related_synsets
   .. automethod:: closure
@@ -221,6 +222,7 @@ The Synset Class
   .. automethod:: holonyms
   .. automethod:: meronyms
   .. automethod:: relations
   .. automethod:: relation_map
   .. automethod:: get_related
   .. automethod:: closure
   .. automethod:: relation_paths
@@ -253,6 +255,65 @@ The Synset Class
Shortcut for :func:`wn.taxonomy.lowest_common_hypernyms`.


The Relation Class
------------------

The :meth:`Sense.relation_map` and :meth:`Synset.relation_map` methods
return a dictionary mapping :class:`Relation` objects to resolved
target senses or synsets. They differ from :meth:`Sense.relations`
and :meth:`Synset.relations` in two main ways:

1. Relation objects map 1-to-1 to their targets instead of to a list
   of targets sharing the same relation name.
2. Relation objects encode not just relation names, but also the
   identifiers of sources and targets, the lexicons they came from, and
   any metadata they have.

One reason why :class:`Relation` objects are useful is for inspecting
relation metadata, particularly in order to distinguish ``other``
relations that differ only by the value of their ``dc:type`` metadata:

>>> oewn = wn.Wordnet('oewn:2024')
>>> alloy = oewn.senses("alloy", pos="v")[0]
>>> alloy.relations() # appears to only have one 'other' relation
{'derivation': [Sense('oewn-alloy__1.27.00..')], 'other': [Sense('oewn-alloy__1.27.00..')]}
>>> for rel in alloy.relation_map(): # but in fact there are two
... print(rel, rel.subtype)
...
Relation('derivation', 'oewn-alloy__2.30.00..', 'oewn-alloy__1.27.00..') None
Relation('other', 'oewn-alloy__2.30.00..', 'oewn-alloy__1.27.00..') material
Relation('other', 'oewn-alloy__2.30.00..', 'oewn-alloy__1.27.00..') result

Another reason why they are useful is to determine the source of a
relation used in :doc:`interlingual queries <../guides/interlingual>`.

>>> es = wn.Wordnet("omw-es", expand="omw-en")
>>> mapa = es.synsets("mapa", pos="n")[0]
>>> rel, tgt = next(iter(mapa.relation_map().items()))
>>> rel, rel.lexicon() # relation comes from omw-en
(Relation('hypernym', 'omw-en-03720163-n', 'omw-en-04076846-n'), <Lexicon omw-en:1.4 [en]>)
>>> tgt, tgt.words(), tgt.lexicon() # target is in omw-es
(Synset('omw-es-04076846-n'), [Word('omw-es-representación-n')], <Lexicon omw-es:1.4 [es]>)

.. autoclass:: Relation

   .. attribute:: name

      The name of the relation. Also called the relation "type".

   .. attribute:: source_id

      The identifier of the source entity of the relation.

   .. attribute:: target_id

      The identifier of the target entity of the relation.

   .. autoattribute:: subtype
   .. automethod:: lexicon
   .. automethod:: metadata
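
The difference between the two views described in this section can be sketched without the library. This is a minimal illustration, not wn's implementation: `Rel` is a hypothetical stand-in for `wn.Relation`, targets are plain ID strings rather than `Sense` objects, and the assumption that the subtype takes part in equality and hashing is made only so the two `other` relations stay distinct as dictionary keys.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rel:
    """Hypothetical stand-in for wn.Relation; not the library's class."""

    name: str
    source_id: str
    target_id: str
    subtype: Optional[str] = None  # stands in for the dc:type metadata value


src = "oewn-alloy__2.30.00.."
tgt = "oewn-alloy__1.27.00.."

# relation_map()-style view: one key per relation, one target per key.
relation_map = {
    Rel("derivation", src, tgt): tgt,
    Rel("other", src, tgt, "material"): tgt,
    Rel("other", src, tgt, "result"): tgt,
}

# relations()-style view: group targets by relation name, dropping repeats.
relations: dict[str, list[str]] = {}
for rel, target in relation_map.items():
    targets = relations.setdefault(rel.name, [])
    if target not in targets:
        targets.append(target)

# The subtypes are only recoverable from the relation_map()-style view.
other_subtypes = sorted(r.subtype for r in relation_map if r.name == "other")
```

Grouping by name collapses the two `other` relations into a single entry, which is why the name-keyed view in the doctest above appears to show only one.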


The ILI Class
-------------

24 changes: 24 additions & 0 deletions tests/_util_test.py
@@ -0,0 +1,24 @@

from wn._util import flatten, unique_list


def test_flatten():
    assert flatten([]) == []
    assert flatten([[]]) == []
    assert flatten([[], []]) == []
    assert flatten([[[], []], [[], []]]) == [[], [], [], []]
    assert flatten([[1]]) == [1]
    assert flatten([[1, 2], [3, 4]]) == [1, 2, 3, 4]
    assert flatten(["AB", "CD"]) == ["A", "B", "C", "D"]


def test_unique_list():
    assert unique_list([]) == []
    assert unique_list([1]) == [1]
    assert unique_list([1, 1, 1, 1, 1]) == [1]
    assert unique_list([1, 1, 2, 2, 1]) == [1, 2]
    assert unique_list([2, 1, 2, 2, 1]) == [2, 1]
    assert unique_list("A") == ["A"]
    assert unique_list("AAA") == ["A"]
    assert unique_list("ABABA") == ["A", "B"]
    assert unique_list([(1, 2), (1, 2), (2, 3)]) == [(1, 2), (2, 3)]
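
The bodies of `flatten` and `unique_list` in `wn/_util.py` are not shown in this view. As a rough sketch of functions that would satisfy the tests above, assuming one level of flattening and first-occurrence ordering (the actual implementation may differ):

```python
from collections.abc import Hashable, Iterable
from itertools import chain
from typing import TypeVar

T = TypeVar("T")
H = TypeVar("H", bound=Hashable)


def flatten(iterable: Iterable[Iterable[T]]) -> list[T]:
    """Remove exactly one level of nesting."""
    return list(chain.from_iterable(iterable))


def unique_list(items: Iterable[H]) -> list[H]:
    """Drop repeated items, keeping the first occurrence of each in order."""
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return list(dict.fromkeys(items))
```

Using `dict.fromkeys` rather than `set` keeps the input order, which the `[2, 1, 2, 2, 1]` case depends on; it also requires the items to be hashable, as they are in every test case.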
2 changes: 2 additions & 0 deletions tests/data/mini-lmf-1.0.xml
@@ -79,6 +79,8 @@ Spanish:
<Lemma partOfSpeech="v" writtenForm="illustrate" />
<Sense id="test-en-illustrate-v-0003-01" synset="test-en-0003-v" >
<SenseRelation relType="derivation" target="test-en-illustration-n-0002-01" />
<SenseRelation relType="other" target="test-en-illustration-n-0002-01" dc:type="result" />
<SenseRelation relType="other" target="test-en-illustration-n-0002-01" dc:type="event" />
</Sense>
<SyntacticBehaviour senses="test-en-illustrate-v-0003-01" subcategorizationFrame="Somebody ----s something" />
<SyntacticBehaviour senses="test-en-illustrate-v-0003-01" subcategorizationFrame="Something ----s something" />
36 changes: 36 additions & 0 deletions tests/relations_test.py
@@ -128,3 +128,39 @@ def test_synset_relations_issue_169():
def test_synset_relations_issue_177():
    # https://github.com/goodmami/wn/issues/177
    assert 'hyponym' in wn.synset('test-es-0001-n').relations()


@pytest.mark.usefixtures('mini_db')
def test_sense_relation_map():
    en = wn.Wordnet('test-en')
    assert en.sense('test-en-information-n-0001-01').relation_map() == {}
    relmap = en.sense('test-en-illustrate-v-0003-01').relation_map()
    # only sense-sense relations by default
    assert len(relmap) == 3
    assert all(isinstance(tgt, wn.Sense) for tgt in relmap.values())
    assert {rel.name for rel in relmap} == {'derivation', 'other'}
    assert {rel.target_id for rel in relmap} == {'test-en-illustration-n-0002-01'}
    # sense relations targets should always have same ids as resolved targets
    assert all(rel.target_id == tgt.id for rel, tgt in relmap.items())


@pytest.mark.usefixtures('mini_db')
def test_synset_relation_map():
    en = wn.Wordnet('test-en')
    assert en.synset('test-en-0003-v').relation_map() == {}
    relmap = en.synset('test-en-0002-n').relation_map()
    assert len(relmap) == 2
    assert {rel.name for rel in relmap} == {'hypernym', 'hyponym'}
    assert {rel.target_id for rel in relmap} == {'test-en-0001-n', 'test-en-0004-n'}
    # synset relation targets have same ids as resolved targets in same lexicon
    assert all(rel.target_id == tgt.id for rel, tgt in relmap.items())
    assert all(rel.lexicon().id == 'test-en' for rel in relmap)

    # interlingual synset relation targets show original target ids
    es = wn.Wordnet('test-es', expand='test-en')
    relmap = es.synset('test-es-0002-n').relation_map()
    assert len(relmap) == 2
    assert {rel.name for rel in relmap} == {'hypernym', 'hyponym'}
    assert {rel.target_id for rel in relmap} == {'test-en-0001-n', 'test-en-0004-n'}
    assert all(rel.target_id != tgt.id for rel, tgt in relmap.items())
    assert all(rel.lexicon().id == 'test-en' for rel in relmap)
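
The interlingual assertions above check that a relation keeps the target ID from the expand lexicon while the resolved target belongs to the queried lexicon. A toy model of that idea, assuming relations are borrowed and resolved through shared ILIs, might look like the following. Every ID, ILI value, and the `relation_map` function here are made up for illustration and are not wn's data or code.

```python
# Toy data (all hypothetical): synset ID -> ILI for an English and a Spanish lexicon.
ili = {"en-1": "i1", "en-2": "i2", "es-1": "i1", "es-2": "i2"}

# Only the English (expand) lexicon defines relations: (name, source_id, target_id).
en_relations = [("hypernym", "en-2", "en-1")]


def relation_map(synset_id: str, lexicon: str) -> dict[tuple[str, str, str], str]:
    """Map each applicable relation to a target synset ID resolved in `lexicon`."""
    result = {}
    for name, source_id, target_id in en_relations:
        # A relation applies when its source shares an ILI with the queried synset.
        if ili[source_id] != ili[synset_id]:
            continue
        # Resolve the target to a synset in the queried lexicon with the same ILI.
        for candidate, candidate_ili in ili.items():
            if candidate.startswith(lexicon + "-") and candidate_ili == ili[target_id]:
                result[(name, source_id, target_id)] = candidate
    return result


es_map = relation_map("es-2", "es")
en_map = relation_map("en-2", "en")
```

In this sketch the key for the Spanish query still names `en-1` as its target while the value is `es-1`, mirroring the `rel.target_id != tgt.id` check; for the English query the two IDs coincide.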
2 changes: 2 additions & 0 deletions wn/__init__.py
@@ -26,6 +26,7 @@
'synset',
'synsets',
'Synset',
'Relation',
'ili',
'ilis',
'ILI',
@@ -53,6 +54,7 @@
word, words, Word, Form, Pronunciation, Tag,
sense, senses, Sense, Count,
synset, synsets, Synset,
Relation,
ili, ilis, ILI,
Wordnet
)