Data Exploratory Analysis and Feature Selection of Low-Speed Wind Tunnel Data for Predicting Force and Moment of Aircraft


Fitra Hidiyanto
Shabrina Leksono
Sigit Tri Atmaja
Rizqon Fajar


This paper discusses exploratory data analysis (EDA) and feature selection for aircraft test results from the Indonesia Low Speed Tunnel (ILST). First, we briefly explain the input and output parameters and the data processing used to make the data readable and more accurate. We then apply feature selection using embedded and random forest methods to find the parameters that most affect the force and moment coefficients of the aircraft. The research activities comprise a literature review drawing on scientific journals, the internet, and books; an interview with an engineer who tests aircraft models at ILST; and the development of a Python program for processing the test results, covering data extraction, data cleaning, exploratory data analysis, and feature selection. After applying the feature selection methods, we found that all of them give similar results and succeed in separating the strong features from the weak ones with a significant score difference. We decided to use the Random Forest method. The three strongest features for each coefficient of an aircraft model in the ILST test (CL, CD, CM25, CYAW, CROLL, and CY) are as follows: for CL, ALFA (0.984), T0 (0.008), and P0 (0.004); for CD, ALFA (0.965), T0 (0.009), and RE (0.007); for CM25, ALFA (0.416), P0 (0.285), and T0 (0.168); for CYAW, BETA (0.44), T0 (0.141), and ALFA (0.141); for CROLL, BETA (0.79), ALFA (0.091), and P0 (0.036); and for CY, BETA (0.842), ALFA (0.114), and T0 (0.014). The results of this paper can help build machine learning models of aircraft design coefficients from ILST test data more effectively and efficiently.
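The random forest ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' program: it assumes scikit-learn's `RandomForestRegressor`, and it uses synthetic data rather than ILST measurements. The value ranges, the units, and the lift-like target `y` are hypothetical and chosen only so that the example runs; only the column names (ALFA, BETA, T0, P0, RE) come from the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for wind tunnel runs. This is NOT ILST data;
# all ranges and units below are assumptions made for illustration.
rng = np.random.default_rng(0)
n = 500
features = ["ALFA", "BETA", "T0", "P0", "RE"]
X = np.column_stack([
    rng.uniform(-5.0, 15.0, n),       # ALFA (assumed range, degrees)
    rng.uniform(-10.0, 10.0, n),      # BETA (assumed range, degrees)
    rng.normal(300.0, 2.0, n),        # T0 (assumed temperature-like values)
    rng.normal(101325.0, 200.0, n),   # P0 (assumed pressure-like values)
    rng.normal(1.0e6, 2.0e4, n),      # RE (assumed Reynolds-number-like values)
])

# Hypothetical lift-like target: linear in ALFA plus small noise.
y = 0.1 * X[:, 0] + 0.2 + rng.normal(0.0, 0.02, n)

# Fit the forest and read its impurity-based feature importances.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, y)

# Rank features from most to least important.
ranking = sorted(
    zip(features, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances in scikit-learn are normalized to sum to 1, so each score can be read as a share of the total. On this synthetic target ALFA dominates by construction; with real test data the ranking would depend on the measured coefficient.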

Keywords: Machine Learning, Feature Selection, Exploratory Data Analysis, Aircraft Modeling.
