How to differentiate DET for quantifiers and DET for demonstrative determiners for isolating languages like Thai #1048

Closed
leky40 opened this issue Aug 5, 2024 · 4 comments
Labels
dependencies question Thai UPOS Universal part-of-speech tags: definitions and examples

leky40 commented Aug 5, 2024

I was wondering whether there is a syntactic relation that differentiates a quantifier tagged DET from a demonstrative determiner tagged DET in isolating languages without agreement or grammatical markers, such as Thai. I am trying to annotate the following construction: head noun + quantifier + noun (used as a classifier) + demonstrative determiner.

Please look at the treebank attached here.

Following the UD framework, I tagged both the quantifier (meaning "several") and the demonstrative determiner (meaning "this") as DET in the treebank. When I annotate their syntactic relations, both must be "det".

But is there any way to distinguish the syntactic relations of these two words?

I checked the UD relations and found two subtypes: det:numgov and det:nummod. After reading their definitions, I am not sure whether they fit Thai; only Czech examples are given.

Or should this Thai demonstrative determiner not be tagged DET?

Thai is an isolating, tonal language without grammatical markers.

Do you have any suggestions? And does my annotation seem plausible?

samples for classifier doc txt(2)

Contributor

ftyers commented Aug 5, 2024

There generally wouldn't be a different syntactic relation, but you could use a language-specific subtype, det:quant, alongside plain det. Alternatively, you could use the feature PronType=Dem for the demonstrative, or propose a PronType=Qnt lexico-morphological feature.
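
To make the two options concrete, here is a minimal Python sketch. It is not taken from the attached treebank: the forms are English glosses standing in for the Thai words, the heads show only one possible attachment, `det:quant` is the hypothetical language-specific subtype mentioned above, and `parse_token` is a helper invented for illustration. For brevity the quantifier carries the subtype and the demonstrative carries the feature.

```python
# Hypothetical simplified rows: ID, FORM, UPOS, FEATS, HEAD, DEPREL.
# Glosses replace the Thai forms; heads are an assumed analysis.
ROWS = """\
1\tHEAD-NOUN\tNOUN\t_\t0\troot
2\tseveral\tDET\t_\t1\tdet:quant
3\tCLF\tNOUN\t_\t2\tclf
4\tthis\tDET\tPronType=Dem\t1\tdet
"""

def parse_token(line):
    """Split one simplified row into a dict (six columns, not full CoNLL-U)."""
    tid, form, upos, feats, head, deprel = line.split("\t")
    features = {} if feats == "_" else dict(f.split("=", 1) for f in feats.split("|"))
    return {
        "id": int(tid), "form": form, "upos": upos, "feats": features,
        "head": int(head), "deprel": deprel,
        # The universal relation is whatever precedes the first colon.
        "udeprel": deprel.split(":", 1)[0],
    }

tokens = [parse_token(line) for line in ROWS.splitlines()]
for tok in tokens:
    if tok["upos"] == "DET":
        print(tok["form"], tok["deprel"], tok["udeprel"], tok["feats"])
```

Stripping the subtype leaves the same universal relation for both words, which is why the subtype or the feature has to carry the distinction.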

Author

leky40 commented Aug 5, 2024

> There generally wouldn't be a different syntactic relation, but you could use a language specific one det:quant vs. det. The other thing you could do is use the PronType=Dem for the demonstrative sense, or propose a PronType=Qnt lexico-morphological feature.

Ok thank you

Contributor

Stormur commented Aug 6, 2024

The syntactic relation is the same and has to stay the same. Some possible distinguishing traits, like the position in the phrase, are already represented in the tree structure.

Since these are lexical/semantic differences, the way to mark them is through PronType, as ftyers already suggested, and I would also point to NumType when dealing with quantities. The point is that if a determiner can answer a quantity question as a numeral can (How many? Several/Three), this feature makes sense. The difference from numerals is that the determiner has a PronType and/or lacks a specific value (NumValue).

I am skeptical about encoding semantic subtypes that concern only single elements in the dependency relations, as opposed to subtypes that concern whole constructions and clauses (e.g. pass, cmp, even numgov...), so personally I would avoid that. Features like PronType already make these elements retrievable.
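
As a rough illustration of retrieving the two kinds of determiner from features alone, the sketch below keeps the relation as plain det and classifies each token by its features. The feature values on the sample tokens (PronType=Ind with NumType=Card for the quantifier) are assumptions chosen for illustration, not a settled analysis for Thai, and `kind_of` is a hypothetical helper.

```python
def kind_of(token):
    """Classify a token from UPOS and features only; deprel is not consulted."""
    feats = token["feats"]
    if token["upos"] == "NUM":
        return "numeral"
    if token["upos"] == "DET":
        if feats.get("PronType") == "Dem":
            return "demonstrative"
        # Answers "how many?" like a numeral, but has no specific value.
        if "NumType" in feats and "NumValue" not in feats:
            return "quantifier"
    return "other"

# Glossed sample tokens; feature values are assumptions for illustration.
tokens = [
    {"form": "several", "upos": "DET", "deprel": "det",
     "feats": {"PronType": "Ind", "NumType": "Card"}},
    {"form": "this", "upos": "DET", "deprel": "det",
     "feats": {"PronType": "Dem"}},
    {"form": "three", "upos": "NUM", "deprel": "nummod",
     "feats": {"NumType": "Card"}},
]

kinds = [kind_of(tok) for tok in tokens]
for tok, kind in zip(tokens, kinds):
    print(tok["form"], tok["deprel"], kind)
```

Both determiners keep the same `det` relation here, and the distinction is recovered entirely from the FEATS column.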

Author

leky40 commented Aug 6, 2024

@Stormur thank you

@dan-zeman dan-zeman added question UPOS Universal part-of-speech tags: definitions and examples Thai labels Aug 29, 2024
@dan-zeman dan-zeman added this to the v2.15 milestone Aug 29, 2024