Power Transformer Fault Identification Based on Random Forest and Data Resampling Considering Data Uncertainty

Authors

E. C. Y. Majok, B. Sudiarto

DOI:

https://doi.org/10.62146/ijecbe.v3i4.145

Keywords:

power transformers, random forest, dissolved gases, resampling, data imbalance

Abstract

Dissolved gas analysis (DGA) is an established and reliable technique for identifying faults in power transformers and thereby helping to prevent grid outages. However, DGA datasets are sometimes imbalanced and are also affected by uncertainty in the form of varying levels of noise, both of which reduce the prediction accuracy of classification models. In this paper, we propose a hybrid approach that combines a Random Forest classifier with one of several resampling techniques: Random Over-Sampling, SMOTE, ADASYN, Borderline-SMOTE (versions 1 and 2), SMOTE-ENN, and SMOTE-Tomek. The techniques are evaluated to identify the best combinations under different levels of uncertainty. Experiments use a publicly accessible DGA dataset to which Gaussian noise (0% to 20%) is added to simulate the practical uncertainty caused by measurement errors. SMOTE obtained the highest average accuracy with training-testing splits of 70:30 and 80:20 (82.46% and 81.29%, respectively), while Random Over-Sampling obtained the highest average accuracy, 83.19%, with a 90:10 split. These accuracy values are averages across all noise levels tested. The results suggest that selecting an appropriate resampling method improves the fault identification performance of a Random Forest classifier under Gaussian noise.
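The two data-level steps named in the abstract, noise injection and minority over-sampling, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The abstract does not say how the noise percentage is scaled, so the sketch assumes a per-value relative standard deviation and clips negative results to zero. The function names, the toy gas values, and the class labels are hypothetical, and the over-sampler is a simplified SMOTE-style interpolation rather than a full reproduction of any of the listed methods.

```python
import math
import random
from collections import Counter


def add_gaussian_noise(rows, level, rng):
    """Return a noisy copy of rows.

    ASSUMPTION: each value x receives zero-mean Gaussian noise with
    standard deviation level * |x| (e.g. level=0.10 for "10% noise"),
    and results are clipped at 0 because gas concentrations cannot be
    negative. The abstract does not specify either choice.
    """
    return [[max(0.0, x + rng.gauss(0.0, level * abs(x))) for x in row]
            for row in rows]


def smote_like_oversample(rows, labels, k, rng):
    """Balance classes by interpolating between same-class neighbours.

    Each synthetic sample lies on the segment between a randomly chosen
    minority sample and one of its k nearest same-class neighbours,
    which is the core idea of SMOTE.
    """
    counts = Counter(labels)
    target = max(counts.values())
    new_rows, new_labels = list(rows), list(labels)
    for cls, count in counts.items():
        members = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            base = rng.choice(members)
            # Same-class neighbours ordered by Euclidean distance to base.
            others = sorted((m for m in members if m is not base),
                            key=lambda m: math.dist(base, m))
            if not others:
                # A class with a single member can only be duplicated.
                new_rows.append(list(base))
            else:
                neighbour = rng.choice(others[:k])
                gap = rng.random()
                new_rows.append([b + gap * (n - b)
                                 for b, n in zip(base, neighbour)])
            new_labels.append(cls)
    return new_rows, new_labels


# Hypothetical toy data: two gas concentrations per sample, two classes.
rng = random.Random(0)
X = [[10.0, 2.0], [12.0, 3.0], [11.0, 2.5], [200.0, 50.0], [220.0, 55.0]]
y = ["normal", "normal", "normal", "arcing", "arcing"]

X_noisy = add_gaussian_noise(X, 0.10, rng)
X_bal, y_bal = smote_like_oversample(X_noisy, y, k=5, rng=rng)
print(Counter(y_bal))
```

In practice, resampling is normally applied to the training split only, so that synthetic samples do not leak into the test set. Libraries such as imbalanced-learn and scikit-learn provide maintained implementations of the resamplers and of the Random Forest classifier, although the abstract does not state which tools the study used.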

Published

2025-12-30

How to Cite

Majok, E. C. Y., & Sudiarto, B. (2025). Power Transformer Fault Identification Based on Random Forest and Data Resampling Considering Data Uncertainty. International Journal of Electrical, Computer, and Biomedical Engineering, 3(4), 693–711. https://doi.org/10.62146/ijecbe.v3i4.145

Issue

Vol. 3 No. 4 (2025)

Section

Electrical and Electronics Engineering
