From 322862da685f14e7ae9a59835be7b1f344b6966b Mon Sep 17 00:00:00 2001
From: Filip Ginter
Date: Sat, 7 Dec 2024 18:06:14 +0200
Subject: [PATCH] autogenerated by https://github.com/spyysalo/jekyll-hook

---
 ab/template-index.html | 208 -----------------------------------------
 ka/index.html          |   4 +-
 2 files changed, 2 insertions(+), 210 deletions(-)
 delete mode 100644 ab/template-index.html

diff --git a/ab/template-index.html b/ab/template-index.html
deleted file mode 100644
index 8b7969965d5..00000000000
--- a/ab/template-index.html
+++ /dev/null
@@ -1,208 +0,0 @@
- - - - - Abkhaz UD - - - - - - - - - - - - - - - - -
- -
-
- home - - edit page - issue tracker - - -
-
- -
- - -
- This page pertains to UD version 2. -
- - -
- - - - -

UD for Abkhaz

- -

Tokenization and Word Segmentation

- -

*

- -
-

Instruction: Describe the general rules for delimiting words (for example, based on whitespace and punctuation) and exceptions to these rules. Specify whether words with spaces and/or multiword tokens occur. Include links to further language-specific documentation if available.

- -
- -

Morphology

- -

Tags

- -

*

- -
-

Instruction: Specify any unused tags. Explain what words are tagged as PART. Describe how the AUX-VERB and DET-PRON distinctions are drawn, and specify whether there are (de)verbal forms tagged as ADJ, ADV or NOUN. Include links to language-specific tag definitions if any.

- -
- -

Features

- -

*

- -
-

Instruction: Describe inherent and inflectional features for major word classes (at least NOUN and VERB). Describe other noteworthy features. Include links to language-specific feature definitions if any.

- -
- -

Syntax

- -

*

- -
-

Instruction: Give criteria for identifying core arguments (subjects and objects), and describe the range of copula constructions in nonverbal clauses. List all subtype relations used. Include links to language-specific relations definitions if any.

- -
- -

Treebanks

- -

There are N Abkhaz UD treebanks:

- - - -
-

Instruction: Treebank-specific pages are generated automatically from the README file in the treebank repository and -from the data in the latest release. Link to the respective *-index.html page in the treebanks folder, using the language code -and the treebank code in the file name.

- -
- -
- - - - - - - - - - -
- -
diff --git a/ka/index.html b/ka/index.html
index b7518cee679..f4c11ab3a39 100644
--- a/ka/index.html
+++ b/ka/index.html
@@ -76,7 +76,7 @@ root + 'lib/ext/jquery.address.min.js' );
-UD for LANGUAGE
+UD for Georgian

This is a work-in-progress overview of the UD annotation for Georgian.

@@ -85,7 +85,7 @@

Tokenization and Word Segmentation
  • In Modern Georgian, words are delimited regularly by white spaces and punctuation marks. However, in Old Georgian, tokenization was an irregular process, as words were sometimes separated by white spaces and sometimes not. Additionally, depending on the century, words in Old Georgian could also be separated by paragraph separators (჻).
  • Punctuation symbols are not separated from the words; that holds even for hyphenated compounds such as siblings “და-ძმა” ‘sister and brother’ (one token) etc. However, the dash is separated from the surrounding characters. They can consist of a sequence of symbols, such as a question mark followed by an exclamation mark (?!), an exclamation mark followed by two full stops (!..) and ellipsis (…) and appear: a) in abbreviations (ა.შ. ‘etc.’, ე.ი. ‘i.e.’, etc.) and b) in numeric expressions (1.2, 0,5, etc.).
-  • Due to rich agglutinating type of morphology, clitics can be treated as multi-word tokens and segmented to individual syntactic words in the following cases:
+  • Due to rich agglutinating type of morphology, clitics can be treated as multi-word tokens and segmented to individual syntactic words in the following cases: a) auxiliary verbs (AUX) attached to the nominal paradigm, which add functional and grammatical meaning to the sentence, expressing tense, aspect, mood, etc.: სახლია = სახლი+ა ‘is a house’; b) postpositions represented by a suffix attached to an inflected nominal (noun, adjective, numeral and pronoun): სახლში = სახლ+ში ‘in the house’; c) the indirect speech particle represented by a suffix attached to an inflected nominal or verb: სახლიო = სახლი+ო ‘a house as smb. said’, წერსო = წერს+ო ‘he writes as smb. said’.
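The three segmentation cases added above can be illustrated with a minimal sketch. It is not the treebank's actual tokenizer: the clitic inventory `CLITICS`, the toy host lexicon `HOSTS`, and the function `split_clitic` are hypothetical names introduced only for this example, and they cover just the four forms quoted in the text. A token is split only when the remaining host is a known form, because stripping suffixes without that check would over-segment any word that happens to end in ა or ო.

```python
# Hypothetical clitic inventory, limited to the examples quoted in the text.
CLITICS = {
    "ა": "auxiliary (AUX)",
    "ში": "postposition",
    "ო": "indirect speech particle",
}

# Hypothetical toy lexicon of host forms; a real system would need a full
# morphological lexicon or analyser here.
HOSTS = {"სახლი", "სახლ", "წერს"}


def split_clitic(token):
    """Split a multi-word token into [host, clitic] when the host is known.

    Returns [token] unchanged if no listed clitic applies.
    """
    # Try longer clitics first so that a two-letter suffix is not shadowed
    # by a one-letter one.
    for clitic in sorted(CLITICS, key=len, reverse=True):
        host = token[: -len(clitic)]
        if token.endswith(clitic) and host in HOSTS:
            return [host, clitic]
    return [token]


if __name__ == "__main__":
    for word in ["სახლია", "სახლში", "სახლიო", "წერსო", "სახლი"]:
        print(word, "->", "+".join(split_clitic(word)))
```

The lexicon check is the main design choice: it keeps an unsegmented form such as სახლი intact even though it ends in a letter that could otherwise be mistaken for a clitic.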