Punctuation Analysis of Uzbek-Language Texts Based on the Conditional Random Fields Model

Authors

  • Xushnudbek Adinayev, TATU Urgench Branch
  • Maqsud Sharipov

Keywords:

Punctuation marks, NLP, CRF model, F1 metrics.

Abstract

This study proposes an approach based on the Conditional Random Fields (CRF) model for punctuation analysis of Uzbek-language texts. The primary goal is to preserve the structural integrity of Uzbek texts by accurately identifying where punctuation marks belong and placing them there. As a sequence-labeling model, the CRF uses each word together with its contextual features to predict the placement of punctuation marks. As part of the project, a specialized corpus for the Uzbek language is constructed, in which the relationships between each word and punctuation mark are annotated.
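To make the annotation idea concrete, the following is a minimal sketch of how punctuated text might be turned into word/label pairs and per-word contextual features of the kind a CRF consumes. It is an illustration only, not the authors' implementation: the label names, the tokenizer pattern, the feature set, and the helper names `annotate` and `word_features` are all assumptions, and no CRF is trained here.

```python
import re

# Hypothetical label set: the mark that follows a word, or "O" for no mark.
PUNCT_LABELS = {",": "COMMA", ".": "PERIOD", "?": "QUESTION", "!": "EXCLAM"}

# Words may contain the apostrophe-like characters used in Uzbek (e.g. o‘zbek).
TOKEN_RE = re.compile(r"[\w‘’ʻʼ'-]+|[,.?!]")


def annotate(text):
    """Turn punctuated text into (word, label) pairs.

    Each punctuation mark is attached as a label to the word before it.
    """
    pairs = []
    for token in TOKEN_RE.findall(text):
        if token in PUNCT_LABELS:
            if pairs:  # ignore a mark that has no preceding word
                word, _ = pairs[-1]
                pairs[-1] = (word, PUNCT_LABELS[token])
        else:
            pairs.append((token.lower(), "O"))
    return pairs


def word_features(words, i):
    """Contextual features for the word at position i (assumed feature set)."""
    word = words[i]
    feats = {
        "bias": 1.0,
        "word": word,
        "suffix3": word[-3:],
        "is_first": i == 0,
        "is_last": i == len(words) - 1,
    }
    if i > 0:
        feats["prev_word"] = words[i - 1]
    if i < len(words) - 1:
        feats["next_word"] = words[i + 1]
    return feats


pairs = annotate("Salom, dunyo! Bugun havo yaxshi.")
words = [w for w, _ in pairs]
labels = [label for _, label in pairs]
X = [word_features(words, i) for i in range(len(words))]
```

One list of feature dictionaries plus one list of labels per sentence is the input shape accepted by common CRF toolkits such as sklearn-crfsuite, so a sketch like this could feed a trainer directly. Which features the study actually uses is not stated in the abstract.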

 

Published

2025-06-02

How to Cite

Adinayev, Xushnudbek, & Sharipov, M. (2025). SHARTLI TASODIFIY MAYDONLAR MODELI ASOSIDA O‘ZBEK TILI MATNLARINI PUNKTUATSION TAHLIL QILISH. The Descendants of Al-Fargani, (2), 66–70. Retrieved from http://al-fargoniy.uz/index.php/journal/article/view/815

Section

Articles