IMPLEMENTING AUTHENTIC AI ASSESSMENT IN TESOL: CHALLENGES AND RESEARCH DIRECTIONS

Dilnoza Usmanova

DOI: https://doi.org/10.71337/inlibrary.uz.tadqiqotlar.119093

TADQIQOTLAR: jahon ilmiy-metodik jurnali (World Scientific-Methodological Journal)
https://scientific-jl.com
Issue 65, Collection 1, July 2025, pp. 266-271
ISSN: 3030-3613

Abstract

This article addresses the practical challenges of implementing authentic AI-enhanced
language assessment in TESOL contexts. Drawing on a four-dimensional
framework of authenticity, we identify key implementation barriers at technological,
pedagogical, and institutional levels. We propose a research agenda to address these
challenges and offer practical guidelines for TESOL practitioners navigating the
integration of AI assessment tools. The article concludes with recommendations for
interdisciplinary collaboration between language educators, AI developers, and
assessment researchers.

Keywords: AI implementation, language assessment, TESOL, research agenda, educational technology

1. Introduction

Artificial intelligence technologies offer promising possibilities for language
assessment, but their successful implementation requires addressing significant
challenges related to authenticity. This article examines implementation challenges
through the lens of a four-dimensional authenticity framework (contextual,
interactional, consequential, and representational) and proposes research directions to
address these challenges.

2. Current Implementation Challenges

2.1 Technological Challenges

Computational Resources: Truly authentic AI assessment may require
substantial computing power not available in all educational contexts. Resource
disparities may create inequitable access to high-quality assessment technologies.

Technical Integration: Implementing AI systems within existing educational
technology infrastructure presents challenges. Many language programs use learning
management systems with limited AI integration capabilities.

Data Requirements: High-quality AI assessment requires extensive training
data. Smaller language programs may lack sufficient data for customization or
validation.

Algorithm Transparency: The “black box” nature of some AI systems
complicates validation against authenticity criteria. Educators may be unable to
determine how assessment decisions are made.

2.2 Pedagogical Challenges

Assessment Literacy: Many language educators lack sufficient understanding
of AI capabilities and limitations to implement these tools effectively.

Balancing Assessment Types: Determining appropriate roles for AI versus
human assessment remains challenging, particularly for complex language skills.

Feedback Integration: Incorporating AI feedback into broader pedagogical
approaches requires careful design to avoid overemphasis on machine-detectable
features.

Learner Resistance: Some learners may resist AI assessment due to concerns
about validity, fairness, or preference for human evaluation.

2.3 Institutional Challenges

Policy Development: Many institutions lack policies governing AI assessment
use, raising questions about validity, accessibility, and academic integrity.

Staff Development: Professional development related to AI assessment
implementation is often inadequate.

Cost-Benefit Analysis: Institutions struggle to evaluate return on investment for
AI assessment technologies, particularly regarding authentic assessment outcomes.

Ethical Considerations: Privacy concerns, data ownership, and potential bias
in AI systems raise significant ethical questions that institutions must address.

3. Integration Matrix: Current Status

The following matrix evaluates current implementation status across educational
contexts:

| Educational Context | Contextual Authenticity | Interactional Authenticity | Consequential Authenticity | Representational Authenticity |
|---|---|---|---|---|
| Higher Education | Moderate - some contextualized tasks | Low - limited dialogue capabilities | Variable - depends on implementation | Low - limited accommodation of diversity |
| Private Language Schools | Low - standardized assessments | Low - primarily one-way feedback | Variable - commercial pressures | Low - standard language focus |
| K-12 Settings | Low - often decontextualized | Low - limited interaction | Concerning - potential negative washback | Low - normative approaches |
| Self-Directed Learning | Moderate - some personalization | Low - scripted interaction | Variable - depends on learner attitudes | Low - mainstream language models |


This matrix highlights significant gaps in current implementation, particularly
regarding interactional and representational authenticity.
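
The pattern of gaps in the matrix can be checked with a short tally. The following is a minimal sketch, not part of the original study: it transcribes only the leading rating word of each matrix cell, and the names `DIMENSIONS`, `MATRIX`, and `low_counts` are illustrative assumptions.

```python
# Leading rating word of each cell, transcribed from the integration matrix.
# The variable and function names here are hypothetical, chosen for illustration.
DIMENSIONS = ["contextual", "interactional", "consequential", "representational"]

MATRIX = {
    "Higher Education": ["Moderate", "Low", "Variable", "Low"],
    "Private Language Schools": ["Low", "Low", "Variable", "Low"],
    "K-12 Settings": ["Low", "Low", "Concerning", "Low"],
    "Self-Directed Learning": ["Moderate", "Low", "Variable", "Low"],
}


def low_counts(matrix):
    """Count, per authenticity dimension, how many contexts are rated 'Low'."""
    counts = {dim: 0 for dim in DIMENSIONS}
    for ratings in matrix.values():
        for dim, rating in zip(DIMENSIONS, ratings):
            if rating == "Low":
                counts[dim] += 1
    return counts


print(low_counts(MATRIX))
```

Under this reading, interactional and representational authenticity are rated Low in all four contexts, while contextual authenticity is rated Low in two and consequential authenticity in none (it is instead Variable or Concerning).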

4. Practical Implementation Strategies

4.1 Short-Term Strategies

Hybrid Assessment Approaches: Combine AI assessment with human
evaluation, leveraging each for appropriate aspects of language performance.

Contextual Scaffolding: Provide rich contextual information around AI
assessment tasks to enhance contextual authenticity.

Feedback Mediation: Train educators to help learners interpret and apply AI
feedback within broader communicative contexts.

Transparency Practices: Clearly communicate to learners what AI can and
cannot effectively evaluate, preventing misaligned expectations.

4.2 Medium-Term Strategies

Customized Implementation: Develop institution-specific frameworks for AI
assessment integration based on learner needs and program goals.

Professional Development: Create comprehensive training programs
addressing both technical and pedagogical aspects of AI assessment.

Assessment Ecosystems: Design complementary assessment approaches that
collectively address all dimensions of authenticity.

Continuous Evaluation: Implement ongoing evaluation of AI assessment
impact on teaching practices and learning outcomes.

5. Research Agenda

To address implementation challenges, we propose a research agenda organized
around the four authenticity dimensions:

5.1 Contextual Authenticity Research

● Developing and validating context-rich assessment tasks compatible with AI evaluation
● Examining the relationship between contextual features and AI assessment accuracy
● Creating frameworks for adapting AI assessment to specific target language use domains
● Investigating multimodal integration in AI assessment

5.2 Interactional Authenticity Research

● Advancing dialogue-based assessment technologies that support authentic interaction
● Evaluating turn-taking and repair strategies in AI-human assessment interactions
● Developing metrics for evaluating interactional competence through AI
● Exploring the potential of LLMs for more contingent assessment interaction

5.3 Consequential Authenticity Research

● Studying washback effects of AI assessment on teaching and learning practices
● Investigating stakeholder perceptions and acceptance of AI assessment
● Examining transfer of learning between AI assessment contexts and real-world language use
● Developing approaches to enhance learner agency in AI assessment

5.4 Representational Authenticity Research

● Creating and validating AI systems that accommodate linguistic variation
● Developing assessment approaches for multilingual competence
● Investigating cultural bias in AI assessment and strategies for mitigation
● Expanding training data to represent diverse communication styles

5.5 Interdisciplinary Research Priorities

● Collaborative research involving TESOL practitioners, AI developers, and assessment specialists
● Mixed-methods approaches combining quantitative evaluation with qualitative insights
● Longitudinal studies tracking the impact of AI assessment implementation over time
● Action research by practitioners implementing AI assessment in diverse contexts

6. Case Study: Implementing Authentic AI Writing Assessment

To illustrate practical implementation, we present a case study of an English for
Academic Purposes program implementing an AI writing assessment system:

Initial Challenges:

● The system provided detailed feedback on grammar and vocabulary but limited feedback on rhetorical effectiveness
● Students focused primarily on sentence-level corrections rather than global improvements
● Faculty questioned the system’s alignment with the program’s genre-based writing approach
● The system showed bias against non-standard expressions common in multilingual writing

Implementation Strategies:

● Created supplementary rubrics addressing rhetorical dimensions the AI could not evaluate
● Developed faculty-led workshops helping students interpret AI feedback within genre expectations
● Implemented peer review focusing on content and organization to complement the AI’s linguistic focus
● Provided faculty training on guiding students to critically evaluate AI feedback

Outcomes:

● More balanced attention to both linguistic accuracy and rhetorical effectiveness
● Increased student agency in determining which AI suggestions to implement
● Development of metacognitive skills through critical engagement with AI feedback
● Improved faculty attitudes toward AI as a complement rather than a replacement

This case illustrates how thoughtful implementation addressing authenticity
gaps can leverage AI benefits while mitigating limitations.
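
The division of labor in this case can be sketched as a simple routing of assessment criteria to evaluators. This is an illustrative sketch only, not the program's actual system: the `ROUTING` table, the evaluator labels, and the `build_feedback_plan` function are hypothetical, and assigning rhetorical effectiveness to the instructor is an assumption, since the case study says only that supplementary rubrics were created for it.

```python
# Hypothetical mapping of assessment criteria to evaluators, loosely based on
# the case study: the AI covers grammar and vocabulary, peer review covers
# content and organization, and (by assumption) the instructor applies the
# supplementary rubric for rhetorical effectiveness.
ROUTING = {
    "grammar": "ai",
    "vocabulary": "ai",
    "content": "peer",
    "organization": "peer",
    "rhetorical_effectiveness": "instructor",
}


def build_feedback_plan(criteria):
    """Group criteria by evaluator; unlisted criteria default to the instructor."""
    plan = {}
    for criterion in criteria:
        evaluator = ROUTING.get(criterion, "instructor")  # human by default
        plan.setdefault(evaluator, []).append(criterion)
    return plan


plan = build_feedback_plan(
    ["grammar", "organization", "rhetorical_effectiveness", "register"]
)
print(plan)
```

Defaulting unlisted criteria to a human evaluator reflects the article's position that AI should complement human assessment rather than replace it.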

7. Conclusion

Implementing authentic AI assessment in TESOL contexts requires addressing
significant technological, pedagogical, and institutional challenges. The proposed
research agenda and implementation strategies provide a pathway toward more
authentic AI assessment integration.

While current AI capabilities show varying degrees of alignment with
authenticity dimensions, understanding these gaps enables more effective
implementation. By approaching AI assessment as a complement to, rather than a
replacement for, human assessment, TESOL practitioners can leverage technological
affordances while preserving the authenticity essential to communicative language
teaching.

Future progress will require interdisciplinary collaboration between language
educators, AI developers, and assessment researchers to create systems that better align
with all dimensions of authentic assessment. This collaboration should be guided by
clear pedagogical principles rather than technological possibilities alone.

