Session Information


Enhancing Automated Essay Scoring by Integrating Rule-Based Language Checking with Generative Models

Abstract

Recent advances in generative artificial intelligence (AI) have enabled automated feedback systems that offer scalable support for writing instruction in classroom settings. While large language models (LLMs) can generate formative feedback efficiently, prior research indicates that such feedback often contains hallucinations or lacks linguistic precision, thereby limiting its pedagogical usefulness (Jia et al., 2024; Cheng & Amiri, 2025). This study investigates whether integrating rule-based language-checking methods into a generative AI feedback system improves the accuracy and instructional value of automated feedback for student essays in primary and lower secondary education.

To this end, we developed an AI-based feedback system that generates (1) ratings of spelling and grammar on separate four-point scales and (2) written feedback summarizing linguistic quality and listing detected errors with suggested corrections. Using this system, feedback was generated for 100 student essays under two conditions: generative AI augmented with rule-based methods and generative AI only.

To evaluate the quality of both the ratings and the written feedback, linguistic experts independently scored the essays and reviewed the AI-generated feedback for hallucinations and inaccurate corrections. Preliminary results show that the correlation between human and AI spelling ratings increased from r = 0.608 to r = 0.713 when rule-based methods were integrated, while the correlation for grammar remained comparable (r = 0.607 vs. r = 0.576). To contextualize these findings, we present qualitative examples illustrating how the integration of rule-based checks corrected specific linguistic inaccuracies in the generative output. These findings suggest that hybrid systems can improve the accuracy of automated writing feedback, particularly for spelling.

References

Cheng, J., & Amiri, H. (2025). Linguistic blind spots of large language models. In NAACL 2025 Cognitive Modeling and Computational Linguistics Workshop. arXiv. https://doi.org/10.48550/arXiv.2503.19260

Jia, Q., Cui, J., Du, H., Rashid, P., Xi, R., Li, R., & Gehringer, E. (2024). LLM-generated feedback in real classes and beyond: Perspectives from students and instructors. In D. A. Joyner, B. Paaßen, & C. Demmans Epp (Eds.), Proceedings of the 17th International Conference on Educational Data Mining (pp. 862–867). International Educational Data Mining Society. https://doi.org/10.5281/zenodo.12729974
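
The agreement figures reported in this abstract (for example, r = 0.713 for spelling) compare human and AI ratings of the same essays. The abstract does not say which coefficient was used; the sketch below assumes Pearson's product-moment correlation, which is what r conventionally denotes. The function name `pearson_r` and the rating lists are hypothetical and are not the study's data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / sqrt(var_x * var_y)


# Made-up ratings on a four-point scale (assumption: NOT the study's data).
human_ratings = [1, 2, 2, 3, 4, 4, 3, 1]
ai_ratings = [1, 2, 3, 3, 4, 3, 3, 2]
print(round(pearson_r(human_ratings, ai_ratings), 3))
```

Running the same function once per condition (generative AI only, and generative AI with rule-based checks) would give the kind of before-and-after comparison the abstract describes.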

Social Regulation in AI-Supported Feedback Ecologies: Disciplinary vs Non-Disciplinary Peers

Abstract

Research on feedback literacy and social regulation of learning increasingly acknowledges the importance of multiple feedback sources; however, we still know relatively little about how regulation unfolds across different feedback ecologies, particularly in varied human–AI configurations. Drawing on models of self-, co-, and socially shared regulation of learning, this study examines how doctoral students regulate their writing when revising with (a) AI plus disciplinary peers and (b) AI plus non-disciplinary peers. Fifty-five PhD students were allocated to two conditions: one in which they received AI feedback and discussed their texts with disciplinary peers in groups of four, and another in which they received AI feedback and discussed their texts with non-disciplinary peers in groups of four. Data comprised (1) AI interaction histories, (2) 14 audio-recorded “listening room” discussions, and (3) ~300-word individual reflections comparing AI and peer feedback. Transcripts were segmented into episodes and coded for forms of regulation (self-, co-, and socially shared regulation) and functions of regulation (planning, monitoring, evaluating, adapting). Across ecologies, AI never participated in genuinely socially shared regulation; episodes of shared regulation emerged only in human–human negotiation. In AI + disciplinary peer groups, AI most often functioned as a co-regulator: students tended to follow AI suggestions when a disciplinary peer could “watch over”, with regulation distributed between AI guidance and expert peer oversight. In AI + non-disciplinary peer groups, AI was more often recruited as a resource for self-regulation: students critically evaluated and selectively adapted AI feedback in the absence of disciplinary authority. The study offers a nuanced account of how different actors in feedback ecologies shape regulatory processes, and the presentation will discuss pedagogical implications for designing feedback from multiple resources in doctoral writing courses.

The Effects of ChatGPT Feedback on Student Engagement: A Longitudinal Study

Abstract

ChatGPT can provide timely, personalized and informative feedback to improve text quality and learning success. It can thus mitigate teachers’ workload, particularly in writing-intensive courses. Despite these advantages, it remains unclear to what extent L2 learners engage with and incorporate feedback in the revision process for the improvement of text quality, as feedback uptake depends on several external and internal factors (Liu & Storch 2010). Furthermore, recent studies emphasize that students’ engagement with written corrective feedback changes over time, and that these dynamics of students’ engagement with feedback have not been explored yet (Mao & Icy 2024: 815). Therefore, the present study analyzes the impact of GenAI-assisted feedback (exemplified by ChatGPT-4) in combination with teacher feedback in extensive university German as a foreign language courses (CEFR, B2/C1). The study focuses on the following research questions:

RQ1: To what extent can the combination of GenAI-assisted feedback and teacher feedback support the revision phase in the writing process?
RQ2: Which dynamics can be identified in the learner profiles based on the engagement with ChatGPT-based feedback?

This longitudinal study with international students of German as a foreign language adopts a sequential explanatory mixed-methods research design (QUAN → qual) to answer the research questions. For the quantitative analysis (QUAN), learners’ engagement (including all subtypes: behavioral, emotional, cognitive and social) is measured using a standardized questionnaire with closed items in 13-week courses. These data (n = 74) are used to carry out a hierarchical cluster analysis with Ward linkage to identify latent learner profiles and to assess the dynamics of engagement over time. The qualitative component (qual) of the study comprises the analysis of open-ended questions in reflection sheets as well as focus group interviews to get a holistic view of feedback uptake and students’ engagement.

Preliminary findings indicate that ChatGPT feedback on syntactic complexity is effective in improving linguistic accuracy and syntactic range, while teacher feedback is beneficial for fostering self-reflection, strategic revision, and writing motivation. The results are transferable to other L2 contexts, in particular general language courses and academic writing, and thus offer a replicable framework for integrating GenAI feedback into writing pedagogy.
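
The hierarchical cluster analysis with Ward linkage named in this abstract can be illustrated with a minimal pure-Python sketch. It repeatedly merges the two clusters whose union adds the least within-cluster sum of squares, which is the Ward criterion. The function names, the four-dimensional score vectors (one value per engagement subtype), and the choice of two profiles are assumptions for illustration only; the abstract does not report the study's implementation, software, or number of profiles.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((u - v) ** 2 for u, v in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(points, k):
    """Merge clusters bottom-up by the Ward criterion until k remain."""
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[p] for p in points]  # start with one cluster per learner
    while len(clusters) > k:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        # Pick the pair whose merge is cheapest under the Ward criterion.
        i, j = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


# Made-up questionnaire scores (assumption: NOT the study's data), ordered as
# (behavioral, emotional, cognitive, social) engagement.
learners = [
    (1.0, 1.0, 1.0, 1.0),
    (1.2, 1.0, 1.1, 0.9),
    (4.0, 4.0, 4.0, 4.0),
    (3.9, 4.1, 4.0, 3.8),
]
for profile in ward_clusters(learners, k=2):
    print(profile)
```

This naive version recomputes every pairwise cost at each step, so it suits small samples such as the n = 74 mentioned above; a statistics package would normally be used in practice.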