DEVELOPMENT AND OPTIMIZATION OF SIMPLE-O: AN AUTOMATED ESSAY SCORING SYSTEM FOR JAPANESE LANGUAGE BASED ON BERT, BILSTM, AND BIGRU
DOI: https://doi.org/10.62146/ijecbe.v3i3.100
Keywords: BERT, BiLSTM, BiGRU, Manhattan Distance, Cosine Similarity
Abstract
This study develops an Automatic Essay Scoring System (SIMPLE-O) for Japanese essays consisting of five short essay questions. SIMPLE-O is designed to improve scoring accuracy by leveraging deep learning models such as BERT, BiLSTM, and BiGRU. The research evaluates per-question score predictions, rather than only the total score across the five questions, to provide more reliable and accurate assessments. SIMPLE-O compares student responses with three predefined answer keys using two similarity measures: Cosine Similarity and Manhattan Distance. The study employs datasets built through data augmentation applied to lecturer and student responses. The system is implemented in Python, and its performance is evaluated across several architectures under specified hyperparameters. The best results were achieved with a BERT-BiLSTM architecture using Cosine Similarity, configured with a batch size of 8, 256 hidden state units, a learning rate of 0.00001, and 100 epochs. This configuration achieved a Mean Absolute Percentage Error (MAPE) of 7.230% and an average score difference of 5.689. The results highlight the potential of SIMPLE-O for automated scoring of Japanese essays, offering improved accuracy, reliability, and deeper analytical insight.
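The abstract names the two similarity measures used to compare student answers with answer keys (Cosine Similarity and Manhattan Distance) and the evaluation metric (MAPE). A minimal pure-Python sketch of these three quantities is shown below; the four-dimensional "embedding" vectors and the five example scores are illustrative toy values, not data from the paper, and the real system would operate on BERT-derived sentence embeddings.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means identical direction
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def manhattan_distance(a, b):
    # Manhattan (L1) distance: sum of absolute component-wise differences
    return sum(abs(x - y) for x, y in zip(a, b))

def mape(actual, predicted):
    # Mean Absolute Percentage Error over paired score lists, in percent
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

# Toy 4-dimensional "embeddings" of a student answer and one answer key
student = [0.2, 0.7, 0.1, 0.5]
key = [0.3, 0.6, 0.2, 0.4]
print(round(cosine_similarity(student, key), 4))
print(round(manhattan_distance(student, key), 4))

# Toy example: true vs. predicted scores for five essay questions
print(round(mape([80, 90, 70, 85, 95], [75, 92, 68, 88, 90]), 3))
```

In the actual system, the most similar of the three answer keys would determine the predicted score for each question; the similarity measure (cosine vs. Manhattan) is one of the hyperparameters compared in the study.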
License
Copyright (c) 2025 International Journal of Electrical, Computer, and Biomedical Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.