Applying LLMs like ChatGPT, Deepseek, Grok for student work evaluation
https://doi.org/10.54596/10.54596/2958-0048-2025-2-220-230
Abstract
The article discusses the possibilities of using large language models (LLMs), such as ChatGPT, DeepSeek, and Grok, to evaluate student work. The author conducts a qualitative analysis of the results obtained with ChatGPT in comparison with teachers' assessments, with an emphasis on identifying the strengths and weaknesses of the automated approach. The potential advantages of LLMs (processing speed, adherence to criteria, scalability) are discussed, as well as limitations in evaluating creativity and depth of analysis. Special attention is paid to the applicability of different models depending on the type of assignment (text, code) and the specifics of the discipline. The work is a review and analytical study and can serve as a starting point for further research on the digitalization of educational assessment and the integration of LLMs into the educational process.
About the Author
K. D. Muntinov
Kazakhstan
Kairat D. Muntinov, corresponding author, Lecturer, Department of Information and Communication Technologies, Master's degree
Petropavlovsk
For citations:
Muntinov K.D. Applying LLMs like ChatGPT, Deepseek, Grok for student work evaluation. Vestnik of M. Kozybayev North Kazakhstan University. 2025;(2 (66)):220-230. (In Kazakh) https://doi.org/10.54596/10.54596/2958-0048-2025-2-220-230