Neural dependency parsing of Uzbek texts based on the universal dependency treebank
Keywords:
Universal Dependency treebank, Uzbek language
Abstract
The Uzbek language is morphologically rich: it has numerous affixes, and much of a sentence's meaning is carried by the suffixes attached to nouns, verbs, and other parts of speech. In this article, we present neural dependency parsing of Uzbek texts based on a treebank built within the Universal Dependencies (UD) framework. The treebank contains 686 sentences and 7,950 tokens, an average of 11.6 tokens per sentence. We construct a graph-based parser with a two-stage architecture: a BiLSTM-based contextual encoder followed by a biaffine scoring function that combines head and dependent projections.
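To make the biaffine stage concrete, the following is a minimal NumPy sketch of how head and dependent projections might be combined into arc scores. It is an illustration under stated assumptions, not the authors' implementation: the function names, the ReLU projections, the dimensions, and the random weights are all hypothetical, the BiLSTM encoder is replaced by random vectors, and heads are picked greedily rather than with a tree-constrained decoder.

```python
import numpy as np


def biaffine_arc_scores(states, W_dep, W_head, U, b):
    """Return scores[i, j]: score of token j being the head of token i.

    `states` stands in for the contextual encoder outputs (one row per token).
    All parameter names here are hypothetical, chosen for illustration.
    """
    dep = np.maximum(states @ W_dep, 0.0)    # (n, k) dependent projections
    head = np.maximum(states @ W_head, 0.0)  # (n, k) head projections
    bilinear = dep @ U @ head.T              # (n, n): dep_i^T U head_j
    head_bias = head @ b                     # (n,): how head-like token j is
    return bilinear + head_bias[None, :]


def greedy_heads(scores):
    """Pick the best head for every non-ROOT token.

    Assumption: row/column 0 is an artificial ROOT. Self-attachment is
    masked out, but the result is not guaranteed to be a well-formed tree.
    """
    masked = scores.copy()
    np.fill_diagonal(masked, -np.inf)
    return masked[1:].argmax(axis=1)


rng = np.random.default_rng(0)
n_tokens, enc_dim, proj_dim = 5, 8, 4          # ROOT + 4 words (toy sizes)
states = rng.normal(size=(n_tokens, enc_dim))  # placeholder for BiLSTM output
W_dep = rng.normal(size=(enc_dim, proj_dim))
W_head = rng.normal(size=(enc_dim, proj_dim))
U = rng.normal(size=(proj_dim, proj_dim))
b = rng.normal(size=proj_dim)

scores = biaffine_arc_scores(states, W_dep, W_head, U, b)
heads = greedy_heads(scores)
print(heads)  # one predicted head index per word
```

In a trained parser the weights would be learned from the treebank, and a graph-based decoder such as a maximum spanning tree algorithm would typically replace the greedy step so that the output is always a valid dependency tree.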
License
Copyright (c) 2025 Sanatbek Matlatipov, Xurshid Fayzullayev

This work is licensed under a Creative Commons Attribution 4.0 International License.