CRUDOperations
These are speculations on possible CRUD- and provenance-related scenarios.
In Diversicon each LMF XML file should contain exactly one `LexicalResource`. This simplifies file management and provenance.
A Diversicon database should always contain a faithful representation of the imported XMLs. To allow this, any changes to an imported `LexicalResource` should be made in a controlled manner by Diversicon (i.e. ID renaming, or edges added when computing the transitive closure). This way, at any time you should be able to export a `LexicalResource` and obtain something nearly identical to the original XML it came from. Some differences from the original may be admitted for provenance purposes, such as additional metadata documenting the passage through Diversicon.
Creation of a `LexicalResource` should be done only through the import functions of Diversicon, as this preserves database integrity and allows metadata to be tracked properly.
There are different scenarios in which a user might want to update an existing `LexicalResource`. They might want to modify internal links, links to other resources, or just change fields such as synset descriptions. We now discuss in detail some relevant cases for synsets.
If you just want to append synsets under existing ones in an upper ontology, without actually modifying that ontology, it is sufficient to create a new `LexicalResource` XML and link its synsets to the synsets of the other XML by using `SynsetRelation` elements.
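As a rough illustration, the following Python sketch builds such an extension resource with the standard library. The element and attribute names (`Lexicon`, `Synset`, `SynsetRelation`, `target`, `relType`, `relName`) are assumed to follow UBY-LMF-style naming, and all IDs and prefixes (`my_`, `wn_`) are hypothetical; check them against the schema Diversicon actually validates.

```python
import xml.etree.ElementTree as ET

# Hypothetical names: "wn_" is the prefix of the upper ontology,
# "my_" the prefix of the new resource. Attribute names are assumptions.
resource = ET.Element("LexicalResource", {"name": "my-extension"})
lexicon = ET.SubElement(resource, "Lexicon", {"id": "my_lexicon"})

# New synset, appended under a synset that lives in another XML file.
synset = ET.SubElement(lexicon, "Synset", {"id": "my_ss_poodle"})
ET.SubElement(
    synset,
    "SynsetRelation",
    {"target": "wn_ss_dog", "relType": "taxonomic", "relName": "hypernym"},
)

xml_text = ET.tostring(resource, encoding="unicode")
print(xml_text)
```

The upper ontology itself is left untouched: the only reference to it is the `target` of the relation.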
Suppose you have an upper ontology with two `Synset`s linked by a `hypernym` relation, and you want to extend the relation by inserting a third node between the two.
Currently, to achieve this goal without manually modifying the original resource, you can create a new `LexicalResource` holding the middle synset, linked with a `hypernym` (a canonical relation) to the top node and with a `hyponym` (a non-canonical relation) to the bottom node. When importing the new `LexicalResource`, Diversicon will automatically run a normalization procedure, which will create the edges of canonical relations such as `hypernym`.
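The effect of normalization on this example can be pictured with a small Python sketch. It is not Diversicon's implementation: the edge representation as `(source, relation, target)` triples, the `CANONICAL_INVERSE` table and the node names are assumptions made for illustration.

```python
# Assumed mapping from a non-canonical relation to its canonical inverse.
CANONICAL_INVERSE = {"hyponym": "hypernym"}


def normalize(edges):
    """Return the edges plus the canonical inverse of each non-canonical edge."""
    result = set(edges)
    for source, rel, target in edges:
        inverse = CANONICAL_INVERSE.get(rel)
        if inverse is not None:
            # e.g. (middle, hyponym, bottom) yields (bottom, hypernym, middle)
            result.add((target, inverse, source))
    return result


edges = {
    ("bottom", "hypernym", "top"),     # from the original upper ontology
    ("middle", "hypernym", "top"),     # from the new resource (canonical)
    ("middle", "hyponym", "bottom"),   # from the new resource (non-canonical)
}
normalized = normalize(edges)
print(sorted(normalized))
```

After normalization the bottom node has a `hypernym` edge to the middle node, so the chain bottom, middle, top can be followed using canonical relations only, while the original edge between bottom and top is still there.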
In some cases you might be forced to change the original `LexicalResource` directly. For example, you found a nice WordNet in your favourite language but quickly discovered that some synsets have wrong relations and others don't have any description at all. So you want to fix the relations and add the missing descriptions: currently the best way to do this would be to create your own version of the `LexicalResource` and assign a different namespace to the resource. To make the changes, you could either:
- change the database:
  - import the original XML into Diversicon, specifying a different namespace
  - make the changes with some [Database browser](../master/docs/Database browsers.md)
  - manually run the function `processGraph` to validate, normalize, and compute the transitive closure of the graph

  To keep track of changes, you could use some DB diff tool.
- or change the XML:
  - modify the original XML, assigning a new namespace
  - import the modified XML into Diversicon

  To keep track of changes, you could use some diff tool, or even versioning with git (not ideal, but could still work).
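The transitive closure step attributed to `processGraph` above can be pictured with a minimal Python sketch. This is only an illustration of the concept on `(source, target)` pairs of a single relation, with hypothetical node names; it makes no claim about how Diversicon computes or stores the closure.

```python
def transitive_closure(edges):
    """Return the transitive closure of a set of (source, target) pairs."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a -> b and b -> d imply a -> d
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# hypernym edges after a manual edit: bottom -> middle -> top
hypernyms = {("bottom", "middle"), ("middle", "top")}
closure = transitive_closure(hypernyms)
print(sorted(closure))
```

This is why the function has to be run again after manual changes in the database: edges edited by hand leave the stored closure out of date.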
If the changes we want to make are additive, we might envisage a special "patch import" mode that would roughly merge content this way:
If an element in the input XML has the same ID as an element in the database:
- attributes in the DB element will be replaced with attributes from the XML
- for subelements of cardinality 1, the subelement in the DB will be merged with the subelement from the XML
- for subelements of cardinality greater than 1, if in the XML they have an ID matching an ID in the DB, they will be merged with the corresponding subelement in the DB; otherwise they will be added to the list of existing subelements.
If an element has an ID not matching any ID in the database:
- the element is added to the DB
If an element does not have an ID:
- a new ID with the default prefix will be generated and the element will be added to the elements in the DB
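The merge rules above could be sketched in Python as follows. This is one possible reading of a mode that does not exist yet, under several assumptions: elements are matched by ID among siblings of the same tag rather than globally, "replaced" is read as overwriting same-named attributes while keeping the others, element text is ignored, and the set of cardinality-1 tags (`single_tags`), the `gen_` prefix and the `text` and `status` attributes are all hypothetical.

```python
import copy
import itertools
import xml.etree.ElementTree as ET

_next_id = itertools.count(1)  # source of numbers for generated IDs


def merge(db_elem, xml_elem, single_tags, prefix="gen_"):
    """Merge xml_elem into db_elem in place (additive patch)."""
    # Attributes from the XML overwrite same-named attributes in the DB element.
    db_elem.attrib.update(xml_elem.attrib)
    for child in xml_elem:
        child_id = child.get("id")
        if child.tag in single_tags:
            # Cardinality 1: merge with the existing subelement, if any.
            match = db_elem.find(child.tag)
        elif child_id is not None:
            # Cardinality > 1: look for a subelement with the same ID.
            match = next(
                (e for e in db_elem.findall(child.tag) if e.get("id") == child_id),
                None,
            )
        else:
            match = None
        if match is not None:
            merge(match, child, single_tags, prefix)
            continue
        new_elem = copy.deepcopy(child)
        if child_id is None and child.tag not in single_tags:
            # No ID in the XML: generate one with the default prefix.
            new_elem.set("id", f"{prefix}{next(_next_id)}")
        db_elem.append(new_elem)


db = ET.fromstring(
    '<Lexicon id="lex"><Synset id="ss1" status="draft">'
    '<Definition text="old"/></Synset></Lexicon>'
)
patch = ET.fromstring(
    '<Lexicon id="lex"><Synset id="ss1"><Definition text="new"/></Synset>'
    '<Synset id="ss2"/><Synset/></Lexicon>'
)
merge(db, patch, single_tags={"Definition"})
print(ET.tostring(db, encoding="unicode"))
```

In this run the existing synset keeps its `status` attribute and gets the new definition, the synset with an unknown ID is appended, and the synset without an ID is appended under a generated one. Nothing is ever removed, which matches the additive intent of the mode.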
Currently, there is no special facility for deleting content. If you try to do it manually, the DB might complain that you are violating some constraint (although there are no constraints on `SynsetRelation`). In many cases, if you need to get rid of a `LexicalResource`, you could probably just create an empty database and reimport all the XMLs you want to keep.