An Integrated Intelligent Teaching Evaluation System Empowered by Multimodal Large Models

Wanli Wen; Jiayu Li

doi:10.6918/IJOSSER.202512_8(12).0021

Keywords:

Multimodal Large Models, Integrated Evaluation, Educational Artificial Intelligence, Evaluation-Teaching Closed Loop, Cognitive Diagnosis

Abstract

To address the structural contradiction between large-scale education and personalized assessment confronting the current "Four Evaluations" reform, and to overcome the drawbacks of traditional evaluation methods such as the separation of process and outcome, high subjectivity, and a lack of timeliness, this study proposes and designs an integrated intelligent teaching evaluation system empowered by Multimodal Large Models. The core of this system is designed around three main steps. First, it integrates multimodal data streams, including audio, vision, and text, to establish a unified data foundation for the in-depth perception and quantitative representation of all classroom elements. Second, grounded in core educational theories such as formative assessment, cognitive diagnosis, and multiple intelligences, a multidimensional evaluation indicator framework is proposed. Finally, by leveraging multi-task learning and joint inference algorithms, an integrated intelligent evaluation engine is conceptualized. This engine is capable of simultaneously producing outcome, process, value-added, and comprehensive evaluations within a unified computational framework. Theoretical analysis and architectural design suggest that this system will shift the evaluation focus from being solely score-oriented to being competency-oriented, and from summative assessment to formative empowerment. It aims to establish a real-time, closed-loop feedback mechanism that effectively integrates evaluation with teaching, thereby providing a practical technological pathway for implementing large-scale individualized instruction and fostering educational equity.
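The engine described above, in which fused multimodal features feed several evaluation outputs at once, can be illustrated with a minimal sketch. The abstract does not specify an implementation, so everything here is an assumption made for illustration: the class `MultiTaskEvaluator`, the early-fusion-by-concatenation step, the single shared ReLU layer, the feature sizes, and all weights are hypothetical and are not the authors' design.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


def fuse(audio: Sequence[float], vision: Sequence[float], text: Sequence[float]) -> List[float]:
    """Early fusion (assumed): concatenate the per-modality feature vectors."""
    return list(audio) + list(vision) + list(text)


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


@dataclass
class MultiTaskEvaluator:
    """Hypothetical shared-encoder model with one linear head per evaluation type."""
    shared: List[List[float]]       # shared layer: one weight row per hidden unit
    heads: Dict[str, List[float]]   # task name -> weights over the hidden units

    def evaluate(self, audio, vision, text) -> Dict[str, float]:
        x = fuse(audio, vision, text)
        # Shared representation used by every task (ReLU activation).
        h = [max(0.0, dot(row, x)) for row in self.shared]
        # All four scores come from the same forward pass.
        return {task: dot(w, h) for task, w in self.heads.items()}


# Toy weights chosen by hand; a real system would learn them from data.
model = MultiTaskEvaluator(
    shared=[[1.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 1.0]],
    heads={
        "outcome": [1.0, 0.0],
        "process": [0.0, 1.0],
        "value_added": [1.0, -1.0],
        "comprehensive": [0.5, 0.5],
    },
)

scores = model.evaluate(audio=[0.5], vision=[1.0, 0.0], text=[2.0])
print(scores)
```

The point of the sketch is only the structure: one shared representation and four task-specific outputs computed together, which is the usual shape of a multi-task model. How the actual system fuses modalities or couples the four evaluations is not stated in the abstract.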

References

[1] The CPC Central Committee and the State Council. General Plan for Deepening the Reform of Education Evaluation in the New Era. Information on: https://www.gov.cn/zhengce/2020-10/13/content_5551032.htm.

[2] Yubo Hou, Qiangqiang Li, Hao Li. The Construction of Critical Thinking Structure and Scale Development in China. Journal of Peking University (Natural Science Edition). 2022, Vol. 58 (No. 02), p. 383-390.

[3] Ismail S M, Rahul D R, Patra I, et al. Formative vs. summative assessment: impacts on academic motivation, attitude toward learning, test anxiety, and self-regulation skill. Language Testing in Asia. 2022, Vol. 12 (No. 01), p. 40.

[4] Xingnan Lu, Xuewei Gao. Artificial Intelligence Empowering Educational Evaluation Reform: Development Trends, Risk Assessment and Mitigation Strategies. China Journal of Education. 2023, Vol. (No. 02), p. 48-54.

[5] Zhinan Huang, Gen Li, Yafeng Zheng. Empowering the High-Quality Development of Science Education with Multimodal Large Models: Potential, Challenges, and Application Exploration. China Electro-education. 2025, Vol. (No. 06), p. 60-69.

[6] Jeon H, Jun Y, Laine T H, et al. Immersive virtual reality game for cognitive-empathy education: Implementation and formative evaluation. Education and Information Technologies. 2024, Vol. 29 (No. 02), p. 1559-1590.

[7] Han Y, Ji F, Jiang Z. Two-stage polytomous attribute estimation for cognitive diagnostic models: overcoming computational challenges in large-scale assessments with many polytomous attributes. Humanities and Social Sciences Communications. 2025, Vol. 12 (No. 01), p. 1-14.

[8] Yue Wang, Shujuan Chang, Xiaoling Han, et al. Item bank construction and validity testing based on item response theory: A case study of the public course "Modern Educational Technology". Modern Educational Technology. 2019, Vol. 29 (No. 10), p. 41-47.

[9] Gardner H. Frames of Mind: the theory of multiple intelligences. Basic Books, 1983.

[10] Vandenhende S, Georgoulis S, Van Gansbeke W, et al. Multi-task learning for dense prediction tasks: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2021, Vol. 44 (No. 07), p. 3614-3633.

[11] Zhang Y, Yang Q. A survey on multi-task learning. IEEE Transactions on Knowledge and Data Engineering. 2021, Vol. 34 (No. 12), p. 5586-5609.

[12] Rogers P M, Marine J M, Ives S T, et al. Validity evidence for a formative writing engagement assessment in elementary grades. Assessment in Education: Principles, Policy & Practice. 2022, Vol. 29 (No. 02), p. 262-284.

[13] Xie Q, Jiang S, Jiang L, et al. Efficiency optimization techniques in privacy-preserving federated learning with homomorphic encryption: A brief survey. IEEE Internet of Things Journal. 2024, Vol. 11 (No. 14), p. 24569-24580.

[14] Tingting Feng, Dejian Liu, Lulu Huang, et al. Digital Education: Application, Sharing, Innovation-Summary of the 2024 World Digital Education Conference. China Electro-education. 2024, Vol. (No. 03), p. 20-36.

[15] Li T, Sahu A K, Talwalkar A, et al. Federated learning: Challenges, methods, and future directions. IEEE Signal Processing Magazine. 2020, Vol. 37 (No. 03), p. 50-60.

[16] Yunong Yang, Ao Xu, Chunjiong Zhang, et al. Privacy Protection in Smart Classrooms Based on Federated Multi-task Learning. Modern Educational Technology. 2024, Vol. 34 (No. 09), p. 123-132.

[17] Leilei Zhao, Li Zhang, Jing Wang. Ethical risks of educational data in the intelligent era: typical representations and governance paths. China Distance Education. 2022, Vol. (No. 03), p. 17-25+77.

[18] Huibin Zhang, Lei Xu. Ethical Risks and Governance Approaches of Generative AI in Education: A Case Study of Russell Group. Modern Educational Technology. 2024, Vol. 34 (No. 06), p. 25-34.