Mualliflar

  • Zokirova Elinura Jasurovna

DOI:

https://doi.org/10.71337/inlibrary.uz.trtteztro.120060

Kalit so‘zlar:

Keywords: corpus linguistics scientific terminology collocation concordance ESP COCA terminology extraction

Annotasiya

Abstract: This article explores how corpus linguistic methods contribute to the analysis and understanding of English scientific terminology. It discusses the application of frequency analysis, concordance, collocation, and semantic prosody in identifying patterns and usage of scientific terms in authentic contexts. Drawing on corpora such as the British National Corpus (BNC), the Corpus of Contemporary American English (COCA), and specialized scientific subcorpora, the study demonstrates how data-driven approaches can reveal morphological, syntactic, and semantic features of terminological units. The paper concludes that corpus linguistics plays a crucial role in terminology management, translation, lexicography, and teaching English for Specific Purposes (ESP).

background image

Ta'limda raqamli texnologiyalarni tadbiq etishning zamonaviy tendensiyalari va rivojlanish omillari

www.pedagoglar.org

45-to’plam_1-qism_Iyul-2025

53

CORPUS LINGUISTIC APPROACHES TO ENGLISH

SCIENTIFIC TERMINOLOGY

Zokirova Elinura Jasurovna

Qarshi State University

Abstract:

This article explores how corpus linguistic methods contribute to the

analysis and understanding of English scientific terminology. It discusses the
application of frequency analysis, concordance, collocation, and semantic prosody in
identifying patterns and usage of scientific terms in authentic contexts. Drawing on
corpora such as the British National Corpus (BNC), the Corpus of Contemporary
American English (COCA), and specialized scientific subcorpora, the study
demonstrates how data-driven approaches can reveal morphological, syntactic, and
semantic features of terminological units. The paper concludes that corpus linguistics
plays a crucial role in terminology management, translation, lexicography, and
teaching English for Specific Purposes (ESP).

Keywords

: corpus linguistics, scientific terminology, collocation, concordance,

ESP, COCA, terminology extraction


The growth of science and technology has led to a vast expansion of scientific

vocabulary in English. With thousands of new terms coined annually, there is a
growing need for systematic methods to analyze, classify, and teach this terminology
effectively. One of the most powerful tools in this endeavor is

corpus linguistics

—the

study of language through large collections of real-world texts, known as corpora.

Corpus-based research provides insights into how scientific terms are used in

context, how frequently they occur, what grammatical patterns they follow, and which
words they co-occur with. These insights are valuable for various fields, including
applied linguistics, translation studies, language teaching, and lexicography.

This article aims to show how corpus linguistic approaches can be used to study

English scientific terminology and support the development of accurate and
pedagogically useful resources.

To investigate scientific terminology from a corpus-linguistic perspective, the

following steps and tools were used:

Corpus Selection

: General corpora (BNC, COCA) and domain-specific corpora

(e.g., Medline for medical texts, arXiv abstracts for physics, and academic journals
from the Scientific American corpus).

Frequency Analysis

: To identify high-frequency scientific terms and affixes.

Concordance Analysis

: Using AntConc software to observe usage patterns in

context.


background image

Ta'limda raqamli texnologiyalarni tadbiq etishning zamonaviy tendensiyalari va rivojlanish omillari

www.pedagoglar.org

45-to’plam_1-qism_Iyul-2025

54

Collocation Analysis

: Measuring co-occurrence relationships (e.g.,

data +

analysis

,

quantum + mechanics

) using statistical measures like MI (Mutual

Information) and log-likelihood.

Semantic Prosody

: Examining the attitudinal or connotative meaning around

specific terms.

Term Extraction

: Using tools like Sketch Engine and WordSmith Tools to

extract candidate terminology based on keyness and domain relevance.

Frequency and Productivity of Scientific Terms

Corpus data showed that English scientific texts are heavily loaded with

nominal

phrases

(e.g.,

climate change

,

genetic mutation

,

information retrieval

), many of which

follow the

N + N

or

Adj + N

patterns.

Top frequent terms (from COCA-academic register):

data

,

analysis

,

research

,

study

,

result

,

model

.

Common affixes in terms:

-logy

,

-scope

,

bio-

,

eco-

,

techno-

.

Emerging neologisms:

infodemic

,

nanocarrier

,

biointerface

.

Collocational Patterns

Data

collocates with

collection

,

analysis

,

set

,

processing

.

Gene

collocates with

expression

,

sequence

,

therapy

,

editing

.

These patterns reveal

semantic frames

and help disambiguate polysemous

terms.

Concordance and Contextual Analysis

The term

algorithm

appeared in contexts such as:

search algorithm

,

learning algorithm

,

encryption algorithm

Often preceded by adjectives like

efficient

,

proprietary

, or

complex

These concordance lines indicate disciplinary variation and usage specificity in

computing vs. data science contexts.

Semantic Prosody

The term

mutation

showed a predominantly negative semantic prosody in medical

contexts (e.g.,

harmful mutation

,

genetic disorder

), while in evolutionary biology, the

tone was more neutral or positive (

beneficial mutation

,

adaptive mutation

). This

variation has implications for ESP teaching and scientific communication.

Genre-Specific Terminological Variation Across Disciplines

Corpus data revealed that identical scientific terms may display distinct semantic

and functional profiles depending on the disciplinary context. For instance:

The term

model

in physics is often used to describe theoretical constructs

(

quantum model

,

standard model

), whereas in biology it refers to experimental systems

or organisms (

animal model

,

in vitro model

).


background image

Ta'limda raqamli texnologiyalarni tadbiq etishning zamonaviy tendensiyalari va rivojlanish omillari

www.pedagoglar.org

45-to’plam_1-qism_Iyul-2025

55

The word

interface

refers to a user or system boundary in computing (

user

interface

,

software interface

), but in biology or chemistry, it denotes the contact surface

between substances or systems (

cellular interface

,

protein–surface interface

).

Such

genre-specific variation

highlights the need for context-aware approaches

in translation, lexicography, and ESP (English for Specific Purposes) instruction.

Monitoring Emerging Terminology via Corpus Tools

Using keyword and term-extraction features in

Sketch Engine

and

AntConc

, the

study identified a range of

emerging or trending scientific terms

in corpora from

2015 to 2024:

CRISPR

,

blockchain

,

metamaterial

,

telehealth

,

climate resilience

These terms frequently exhibit

metaphorical origins

or blended word-forms,

underscoring the role of conceptual innovation in modern scientific language. For
example,

blockchain

metaphorically frames information systems using imagery from

physical chains, and

telehealth

blends

tele-

(distance) with

health

to describe remote

healthcare delivery.

Corpus-based analysis provides a more

empirical and objective

approach to

terminology study than traditional introspective methods. It reveals not only the forms
of scientific terms but also their

collocational behavior

,

genre sensitivity

, and

disciplinary variation

.

One significant finding is that many scientific terms exhibit

stable collocational

patterns

that could serve as pedagogical chunks in ESP contexts. For example,

teaching the phrase

conduct a study

as a collocation rather than isolated words

improves fluency and comprehension.

The extended findings reaffirm that

corpus linguistics provides a powerful

empirical foundation

for analyzing scientific terminology. Unlike introspective or

prescriptive approaches, corpus-based methods allow for:

The observation of

authentic, in-use patterns

across disciplines.

The detection of

genre and register variation

, crucial for precision in scientific

communication.

The identification of

collocational routines and phraseological units

, which

are essential for fluency and accuracy in ESP and technical translation.

In educational settings, data from corpora can be used to develop

discipline-

specific language materials

. For example, rather than teaching isolated terms such as

conduct

, instructors can present collocational chunks like

conduct a study

,

conduct an

experiment

, or

conduct a trial

, based on real frequency data.

Furthermore, the analysis showed how some scientific terms undergo

semantic

drift

over time or across contexts. The word

cloud

, for example, has shifted from its

original meteorological meaning to become a core term in information technology


background image

Ta'limda raqamli texnologiyalarni tadbiq etishning zamonaviy tendensiyalari va rivojlanish omillari

www.pedagoglar.org

45-to’plam_1-qism_Iyul-2025

56

(

cloud storage

,

cloud computing

). Such shifts reflect the dynamic and metaphor-rich

nature of scientific discourse.

Another point of interest is the

polyfunctionality of scientific terms

. Many

words, such as

platform

,

network

, or

interface

, operate across multiple disciplines,

often leading to potential ambiguity. Corpus analysis helps clarify these uses by
providing context-rich examples, enabling translators and learners to distinguish
between domain-specific meanings.

Finally, the

integration of corpus tools

into terminology research facilitates the

creation of more effective glossaries, translation memories, and pedagogical resources,
as it aligns linguistic analysis with real-world data and usage trends.

Moreover, corpus linguistics can help identify

terminological drift

—where

words shift meaning across time or domains. For instance,

interface

once referred

mainly to hardware connections, but now commonly denotes software environments
or biological contact zones.

CONCLUSION.

The findings suggest a strong potential for integrating corpus

tools in translator training, technical writing, and ESP curriculum design. Teachers can
use concordance lines to show real usage, while lexicographers can identify typical
structures for dictionary entries.

Corpus linguistic approaches provide rich, authentic, and statistically grounded

insights into the use of scientific terminology in English. These methods enable
researchers to:

Identify frequent and emerging scientific terms

Analyze usage and collocation patterns

Detect semantic nuances and contextual shifts

Enhance translation accuracy and teaching effectiveness

As science becomes more interdisciplinary and terminology more dynamic,

corpus linguistics will play a critical role in ensuring that language professionals keep
pace with terminological change.

References:

1.

McEnery, T., & Hardie, A. (2012).

Corpus Linguistics: Method, Theory and

Practice

. Cambridge University Press.

2.

Flowerdew, L. (2011).

Corpora and Language Education

. Palgrave Macmillan.

3.

Hunston, S. (2002).

Corpora in Applied Linguistics

. Cambridge University Press.

4.

COCA: Corpus of Contemporary American English. https://www.english-
corpora.org/coca/

5.

Anthony, L. (2023).

AntConc Software

. http://www.laurenceanthony.net/

6.

Kilgarriff, A., & Tugwell, D. (2001). “Word Sketches: Automatic Extraction of
Collocations.”

International Journal of Lexicography

, 14(3).

7.

Ahmad, K. et al. (1992).

Terminology: Theory and Applications

.

Bibliografik manbalar

McEnery, T., & Hardie, A. (2012). Corpus Linguistics: Method, Theory and Practice. Cambridge University Press.

Flowerdew, L. (2011). Corpora and Language Education. Palgrave Macmillan.

Hunston, S. (2002). Corpora in Applied Linguistics. Cambridge University Press.

COCA: Corpus of Contemporary American English. https://www.english-corpora.org/coca/

Anthony, L. (2023). AntConc Software. http://www.laurenceanthony.net/

Kilgarriff, A., & Tugwell, D. (2001). “Word Sketches: Automatic Extraction of Collocations.” International Journal of Lexicography, 14(3).

Ahmad, K. et al. (1992). Terminology: Theory and Applications