
CRUDOperations

DavidLeoni edited this page Feb 24, 2017 · 1 revision

CRUD operations


These are speculations on possible CRUD and provenance-related scenarios.

Design requirements

Req 1: One LexicalResource per XML

In Diversicon each LMF XML file should contain exactly one LexicalResource. This simplifies file management and provenance.

Req 2: Imported LexicalResources shouldn't change

A Diversicon database should always contain a faithful representation of the imported XMLs. To allow this, any change to an imported LexicalResource should be made in a controlled manner by Diversicon itself (e.g. renaming IDs, or adding edges when computing the transitive closure). This way, at any time you should be able to export a LexicalResource and obtain something nearly identical to the original XML it came from. Some differences from the original could be admitted for provenance purposes, e.g. additional metadata documenting the passage into Diversicon.

Create

A LexicalResource should be created only through Diversicon's import functions, as this preserves database integrity and allows metadata to be tracked properly.

Update

There are different scenarios in which a user might want to update an existing LexicalResource: they might want to modify internal links, links to other resources, or just change fields such as synset descriptions. We now discuss in detail some relevant cases for synsets.

Appending synsets to an upper ontology

If you just want to append synsets under existing ones in an upper ontology, without actually modifying that ontology, it is sufficient to create a new LexicalResource XML whose synsets link to the synsets of the other XML through SynsetRelations.
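As a rough illustration, the following Python sketch builds such an extension resource with the standard library. The element and attribute names (`Lexicon`, `Synset`, `SynsetRelation`, `target`, `relName`, `relType`) follow a common reading of UBY-LMF and are assumptions here, as are the function name and all IDs; the result is not a complete, schema-valid Diversicon file and should be checked against the actual schema.

```python
import xml.etree.ElementTree as ET


def build_extension_resource(name, new_synsets):
    """Build a LexicalResource whose synsets hang under existing ones.

    new_synsets maps each new synset ID to the ID of the existing
    upper-ontology synset it should be appended under.
    All element/attribute names are assumptions, not a verified schema.
    """
    root = ET.Element("LexicalResource", {"name": name})
    lexicon = ET.SubElement(root, "Lexicon", {"id": name + "_lexicon"})
    for synset_id, parent_id in new_synsets.items():
        synset = ET.SubElement(lexicon, "Synset", {"id": synset_id})
        # The relation lives in the NEW resource and only points at the
        # upper ontology, so the upper ontology itself is left untouched.
        ET.SubElement(
            synset,
            "SynsetRelation",
            {"target": parent_id, "relName": "hypernym", "relType": "taxonomic"},
        )
    return root


# Hypothetical IDs: "upper_ss_machine" is assumed to exist in the upper ontology.
resource = build_extension_resource("my-robots", {"my_ss_robot": "upper_ss_machine"})
xml_text = ET.tostring(resource, encoding="unicode")
```

The point of the design is that only the new file contains the cross-resource link, which keeps the imported upper ontology unchanged, in line with Req 2.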

Inserting middle synsets

Suppose you have an upper ontology with two Synsets linked by a hypernym relation, and you want to extend the relation by inserting a third node between the two.

Currently, to achieve this without manually modifying the original resource, you can create a new LexicalResource holding the middle synset, linked by a hypernym (a canonical relation) to the top node and by a hyponym (a non-canonical relation) to the bottom node. When importing the new LexicalResource, Diversicon will automatically run a normalization procedure, which will create the edges of canonical relations such as hypernym.
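The normalization step can be pictured with a small Python sketch. It is not Diversicon's implementation: the edge representation, the `INVERSE_OF` table, and the choice to keep the non-canonical edges are all assumptions made for illustration. An edge `(source, relation, target)` is read as "source has relation target", so `("middle", "hypernym", "top")` means that top is a hypernym of middle.

```python
# Hypothetical table mapping non-canonical relations to their canonical inverse.
INVERSE_OF = {"hyponym": "hypernym"}


def normalize(edges):
    """Return the edges plus a canonical inverse edge for every
    non-canonical edge found in the input.

    edges is a set of (source, relation, target) triples.
    """
    result = set(edges)
    for source, relation, target in edges:
        canonical = INVERSE_OF.get(relation)
        if canonical is not None:
            # hyponym(middle, bottom) implies hypernym(bottom, middle)
            result.add((target, canonical, source))
    return result


# The new resource only declares the two edges leaving the middle synset.
declared = {
    ("middle", "hypernym", "top"),
    ("middle", "hyponym", "bottom"),
}
normalized = normalize(declared)
```

After normalization the bottom synset reaches the top one through the middle synset using only canonical hypernym edges, even though the new resource never declared an edge starting from the bottom node.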

Updating existing synsets

In some cases you might be forced to change the original LexicalResource directly. For example, suppose you found a nice WordNet in your favourite language but quickly discovered that some synsets have wrong relations and others have no description at all, so you want to fix the relations and add the missing descriptions. Currently the best way to do this is to create your own version of the LexicalResource and assign it a different namespace. To make the changes, you could use one of the following approaches.

Updating existing synsets: Manual DB edit

  1. import the original XML into Diversicon, specifying a different namespace
  2. make the changes with some [Database browser](../master/docs/Database browsers.md)
  3. manually run the function processGraph to validate, normalize, and compute the transitive closure of the graph.
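To make the last step more concrete, here is a minimal Python sketch of what computing a transitive closure over hypernym edges means. It only illustrates the concept and says nothing about how processGraph is actually implemented; the function name and the pair-based edge format are assumptions.

```python
def transitive_closure(edges):
    """Return all (descendant, ancestor) pairs implied by the input.

    edges is a set of (child, parent) hypernym pairs. The loop keeps
    adding implied pairs until a full pass adds nothing new.
    """
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a -> b and b -> d together imply a -> d
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


direct = {("bottom", "middle"), ("middle", "top")}
closed = transitive_closure(direct)
```

This naive fixed-point loop favours readability over speed; a manual edit that adds or removes a single hypernym edge can change many implied pairs, which is why the closure has to be recomputed after editing the database by hand.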

To keep track of changes, you could use a DB diff tool.

Updating existing synsets: Edit original XML

  1. modify the original XML, assigning a new namespace
  2. import the modified XML into Diversicon

To keep track of changes, you could use a diff tool, or even version the XML with git (not ideal, but it could still work).

Updating existing synsets: Patch with incremental XML

If the changes we want to make are additive, we might envisage a special "patch import" mode that merges the input XML into the database roughly as follows.

If an element in the input XML has the same ID as an element in the database:

  • attributes in the DB element will be replaced with attributes from the XML
  • for subelements of cardinality 1, the subelement in the DB will be merged with the subelement from the XML
  • for subelements of cardinality greater than one, if in the XML they have an ID matching an ID in the DB, they will be merged with the corresponding subelement in the DB; otherwise they will be added to the list of existing subelements.

If an element has an ID not matching any ID in the database:

  • the element is added to the DB

If an element does not have an ID:

  • a new ID with a default prefix will be generated and the element will be added to the elements in the DB
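The merge rules above can be sketched in Python as follows. This is only a thought experiment, since no such mode is described as existing: elements are modelled as plain dictionaries, every name is hypothetical, "replaced" is read as a per-attribute overwrite, and a cardinality-1 subelement missing from the DB is simply added (a case the rules do not cover).

```python
import itertools


def make_id_generator(prefix):
    """Return a function producing prefix_1, prefix_2, ... (hypothetical scheme)."""
    counter = itertools.count(1)
    return lambda: f"{prefix}_{next(counter)}"


def merge_element(db_elem, xml_elem, new_id):
    """Merge xml_elem into db_elem in place; both share the same ID.

    An element is a dict: {"id": str or None, "attrs": dict, "children": dict}.
    In "children", a dict value is a cardinality-1 subelement and a list
    value holds subelements of cardinality greater than one.
    """
    # Assumption: XML attributes overwrite DB attributes one by one.
    db_elem["attrs"].update(xml_elem["attrs"])
    for tag, xml_child in xml_elem["children"].items():
        if isinstance(xml_child, list):
            merge_list(db_elem["children"].setdefault(tag, []), xml_child, new_id)
        elif tag in db_elem["children"]:
            merge_element(db_elem["children"][tag], xml_child, new_id)
        else:
            # Not covered by the rules: add the missing single subelement.
            db_elem["children"][tag] = xml_child


def merge_list(db_list, xml_list, new_id):
    """Merge a list of XML elements into a list of DB elements in place."""
    by_id = {e["id"]: e for e in db_list if e["id"] is not None}
    for xml_elem in xml_list:
        if xml_elem["id"] is None:
            # No ID: generate one, then treat the element as new.
            xml_elem = dict(xml_elem, id=new_id())
        existing = by_id.get(xml_elem["id"])
        if existing is not None:
            merge_element(existing, xml_elem, new_id)
        else:
            db_list.append(xml_elem)
            by_id[xml_elem["id"]] = xml_elem
```

Calling `merge_list(db_synsets, patch_synsets, make_id_generator("default"))` applies a patch to the top-level elements. The sketch does not check that generated IDs are free in the DB, and it never removes anything, which matches the additive intent of the patch mode.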

Delete

Currently, there is no special facility for deleting content. If you try to delete manually, the DB might complain that you are violating some constraint (SynsetRelation has no such constraints). In many cases, if you need to get rid of a LexicalResource, you could probably just create an empty database and reimport all the other XMLs.