EIJMRMS ISSN: 2750-8587
VOLUME04 ISSUE06
SIMILARITIES OF LEXICAL-SEMANTIC RELATIONS IN UZBEK AND ENGLISH LANGUAGES
Khurramova Gulchekhra
Teacher of Termiz University of Economics and Service, Uzbekistan
ABOUT ARTICLE
Key words: Corpus, corpus linguistics, parallel corpus, translation corpus, comparable corpus, segmentation, machine translation, tokenization, lemmatization, stemming.
Received: 18.06.2024
Accepted: 23.06.2024
Published: 28.06.2024
Abstract: This article deals with corpus linguistics: the notion of a corpus and its relation to the parallel corpus, corpus structure, corpus types, tokens, lemmas and stemming. Today, the theoretical and practical significance of the corpus in Uzbek linguistics lies in studying the existing possibilities of the language, identifying problematic aspects of linguistics, creating electronic dictionaries, and increasing the effectiveness of modern information technology in language learning, automatic translation, search and computer analysis. Solving these problems requires building language corpora for specific areas.
https://doi.org/10.55640/eijmrms-04-06-14
Pages: 91-94
EUROPEAN INTERNATIONAL JOURNAL OF MULTIDISCIPLINARY RESEARCH AND MANAGEMENT STUDIES
INTRODUCTION
One of the global problems of the 21st century is preserving the national character of natural languages. Sustained research on natural language processing (NLP) and language technologies has become an urgent task in creating and developing electronic corpora of the world's languages. Research in corpus linguistics conducted abroad has shown that a corpus is a necessary resource not only for specialists who work with words, but also for the development of the nation. Creating a national corpus of the Uzbek language has become one of the most important issues facing applied linguistics in our country today. In particular, the following were identified as important tasks facing linguistics: raising the prestige of the Uzbek language in society and at the international level; creating an electronic national corpus of the Uzbek language that includes all scientific, theoretical and practical information about the language; popularizing the Uzbek language on the Internet and ensuring that it occupies a worthy place there; creating Uzbek versions of software products; implementing computer programs for teaching Uzbek on a large scale; and creating computer programs for editing texts in the Uzbek language. Much research is being conducted in this regard.
In Russian and English corpus linguistics, research in various fields has been conducted by V. Zakharov, A. Sedov, A. Baranov, R. Potapova, V. Rykov, U. Francis, N. Leontyeva, V. Martin, S. Kubler, A. Laurence, E. Etwell, S. Hunston, L. Boizou, McKennery, J. Grafmiller, J. Grieva, N. Groom, S. Hansson, K. MMcAulif, M. Malberg, P. Milin, A. Murakami, R. Peych, A. Schembri, P. Thompson, B. Winter, G. Lynch and other foreign scientists. In Turkology, the following works in corpus linguistics are noteworthy: Aksan, Deniz, Zeyrek, Kemal Oflazer, Umut Özge Bular on the Turkish language; Yusup Aibaidulla, Kim-Teng Lua on the Uyghur language; I.A. Buskunbaeva, Z. Sirazitdinov on the Bashkir language; Sheymovich on the Khakas language; J. Suleymanov, A. Gatiatullin, O. Nevzorova, R. Gilmullin, B. Hakimov on the Tatar language; L. Kubedinova on the Crimean Tatar language; and Salchak on the Tuvan language.
Among Uzbek scholars, B. Mengliyev, Sh. Shahobiddinova, Z. Kholmanova, S. Karimov, N. Abdurakhmonova, L. Raupova, Sh. Hamroyeva, M. Abjalova, G. Toirova, G. Ikromova, J. Djumbayeva, G. Ergasheva and A. Eshmoʻminov have carried out research in this area. The concept of the national corpus of the Uzbek language is being developed by a team of scientists led by B. Mengliyev.
There are different criteria for building a corpus of texts. Corpora are classified by the type of data (spoken, written), the language of the texts (Russian, German, Turkish, ...), the parallelism of the translations (bilingual, trilingual), the style (colloquial, literary, official, scientific, journalistic), the availability of the database (open, closed) and geographical coverage (belonging to one country only, etc.). A corpus is a collection of spoken and written texts stored in a computer database. The corpus also records in detail when each text was written, which style it belongs to and which source it comes from. Depending on their interests, users can consult literary, scientific, official or journalistic texts. This is especially useful in language learning. In school education, a corpus helps teachers quickly prepare tasks that reinforce students' knowledge during the lesson.
The corpus is a systematized library with a very wide scope and great importance. It is easy to use and saves a lot of time. It differs from other programs in its electronic search system. Corpus search lets the user find all forms of a given word in different contexts. It clearly shows where the word appears in the dictionary and what its variants are. It can determine the range of words with which the searched word combines, as well as its denotative and connotative meanings. It reports the frequency or statistics of a word's use in a writer's work. It is a product of modern development that can show how the word was used in a given period.
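The search functions described above (finding the forms of a word in context and counting how often each form occurs) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-record `CORPUS`, its invented Uzbek sentences, and the function names are hypothetical, and word forms are matched naively by a shared prefix rather than by real morphological analysis or lemmatization.

```python
import re
from collections import Counter

# Hypothetical mini-corpus: each record keeps metadata (style, year)
# next to the text, as a corpus database would. Sentences are invented.
CORPUS = [
    {"style": "literary", "year": 1990,
     "text": "Men kitob oldim. Kitoblar javonda turibdi."},
    {"style": "scientific", "year": 2020,
     "text": "Bu kitobni korpus asosida tahlil qildik."},
]

def tokenize(text):
    """Split text into lowercase word tokens (naive regex tokenizer)."""
    return re.findall(r"[^\W\d_]+(?:[ʻ'’][^\W\d_]+)*", text.lower())

def concordance(corpus, stem, width=2):
    """Keyword-in-context hits for every token that starts with `stem`.

    Prefix matching is a crude stand-in for lemmatization (assumption).
    """
    hits = []
    for doc in corpus:
        tokens = tokenize(doc["text"])
        for i, tok in enumerate(tokens):
            if tok.startswith(stem):
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                hits.append((doc["style"], left, tok, right))
    return hits

def form_frequencies(corpus, stem):
    """Count how often each word form starting with `stem` occurs."""
    return Counter(
        tok
        for doc in corpus
        for tok in tokenize(doc["text"])
        if tok.startswith(stem)
    )

if __name__ == "__main__":
    for style, left, word, right in concordance(CORPUS, "kitob"):
        print(f"[{style}] {left} <{word}> {right}")
    print(form_frequencies(CORPUS, "kitob"))
```

Because each record carries its style and year, the same loop could be restricted to, say, scientific texts or to a given period, which is the kind of filtering the metadata makes possible.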
The parallel corpus is a new type of linguistic resource. As an autonomous part of the electronic corpus, it is important for its ability to collect a large amount of necessary information. In machine translation, there are specially formatted multilingual corpora for side-by-side comparison, which are called structured parallel corpora. An early example of parallel texts is the stone found in 1799 in the Nile Delta near the city of Rosetta: it dates back to 196 BC and carries an inscription about honors in three parallel versions (hieroglyphic, Demotic and Greek). Information about the structure, composition and capabilities of the parallel corpus can be found in the works of D.O. Dobrovolsky, Yu. Tao, V. Zakharov, A.A. Kokoreva and E.P. Sosnina. A parallel corpus, a collection of originals and their translations, can be used in many ways for the benefit of translation studies, machine translation, linguistics, computational linguistics, or simply the human translator. In computational linguistics, translation corpora have been used since the early 1980s for machine translation, as well as for term extraction, word semantics, etc. Among the first parallel texts, which appeared in the late 1980s and early 1990s, were avalanche reports collected in German, French and Italian in Switzerland and weather information provided by Canadian mass media in English and French.
A parallel corpus pairs texts with their translations. In translation studies, the main focus is on identifying the features that distinguish translations from original texts. These features may be specific to a given translation task or language pair, but they may also be general features that distinguish translations from untranslated texts. Corpus-based research is a clear way to identify specific features of translations empirically, and it has been used since the 1990s by Baker (1993; 1996), Johansson & Ebeling (1996) and, more recently, by Hansen (2003), Teich (2003), Mauranen & Kujamäki (2004) and Hansen-Schirra, Neumann & Steiner (2012). In addition, parallel corpora are used as a reference in translation teaching and professional translation settings, as they provide quick and interactive access to translation solutions (such as translation memories). The 2009 Corpus Linguistics Conference, hosted by the University of Liverpool, discussed the requirements of linguists and translation scholars working with parallel corpora, tools for using corpora for their own purposes, and issues related to corpus interfaces.
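The idea of a parallel corpus as aligned pairs of originals and translations, consulted like a translation memory, can be sketched as follows. This is only an illustration under stated assumptions: the `AlignedPair` type, the `find_translations` function and the three Uzbek-English sentence pairs are hypothetical and hand-aligned, whereas a real parallel corpus requires a segmentation and alignment step that is not shown here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AlignedPair:
    """One aligned unit of a parallel corpus (hypothetical structure)."""
    source: str  # original segment (here: Uzbek)
    target: str  # its translation (here: English)

# Invented, hand-aligned sentence pairs used only for illustration.
PARALLEL = [
    AlignedPair("Men kitob oldim.", "I took a book."),
    AlignedPair("Bugun havo sovuq.", "The weather is cold today."),
    AlignedPair("U kitobni o'qidi.", "He read the book."),
]

def find_translations(pairs, query):
    """Return (source, target) for every pair whose source contains `query`.

    Matching is a case-insensitive substring test, so inflected forms that
    share the queried stem are found as well.
    """
    q = query.lower()
    return [(p.source, p.target) for p in pairs if q in p.source.lower()]

if __name__ == "__main__":
    for src, tgt in find_translations(PARALLEL, "kitob"):
        print(f"{src}  |  {tgt}")
```

Showing each hit next to its translation is the side-by-side view that makes such a resource useful to translators and learners.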
REFERENCES
1. Abdurakhmonova N, Tuliyev U. Morphological analysis by finite state transducer for Uzbek-English machine translation. Foreign Philology: Language, Literature, Education. 2018(3):68.
2. Abdurakhmonova N, Urdishev K. Corpus based teaching Uzbek as a foreign language. Journal of Foreign Language Teaching and Applied Linguistics (JFLTAL). 2019;6(1-2019):131-7.
3. Abduraxmonova N. Z. Linguistic support of the program for translating English texts into Uzbek (on the example of simple sentences): Doctor of Philosophy (PhD) dissertation abstract. 2018.
4. Abdurakhmonova N. The bases of automatic morphological analysis for machine translation. Izvestiya Kyrgyzskogo gosudarstvennogo tekhnicheskogo universiteta. 2016;2(38):12-7.
5. Alaudinova D. Written translation of texts related to different spheres.
6. Alaudinova D. FRAZEOLOGIK BIRIKMALAR VA ULARNI TARJIMA QILISH USULLARI // XALQ TA’LIMI. – P. 57.
7. Alaudinova D. FRAZEOLOGIK (TURGʻUN) BIRIKMALAR VA ULARNI TARJIMA QILISH USULLARI // Ижтимоий-гуманитар фанларнинг долзарб муаммолари / Актуальные проблемы социально-гуманитарных наук / Actual Problems of Humanities and Social Sciences. – 2023. – Vol. 3. – No. S/9.
