ORIGINAL ARTICLE
Comparative analysis of the performance of selected machine learning algorithms depending on the size of the training sample
1 Faculty of Geodesy and Cartography, Warsaw University of Technology, pl. Politechniki 1, 00-661, Warsaw, Poland
2 Orbitile Ltd., Potułkały 6B/4, 02-791, Warsaw, Poland
 
 
A - Research concept and design; B - Collection and/or assembly of data; C - Data analysis and interpretation; D - Writing the article; E - Critical revision of the article; F - Final approval of article
 
 
Submission date: 2024-03-12
 
 
Final revision date: 2024-07-22
 
 
Acceptance date: 2024-07-29
 
 
Publication date: 2024-09-23
 
 
Corresponding author
Przemysław Kupidura   

Faculty of Geodesy and Cartography, Warsaw University of Technology, pl. Politechniki 1, 00-661, Warsaw, Poland
 
 
Reports on Geodesy and Geoinformatics 2024;118:53-69
 
ABSTRACT
The article analyzes the effectiveness of three machine learning methods, Random Forest (RF), Extreme Gradient Boosting (XGB), and Support Vector Machine (SVM), in classifying land use and land cover in satellite images. Several variants of each algorithm were tested, with different values of the parameters typical for each. Each variant was used for classification 20 times, with training samples of different sizes, ranging from 100 to 200,000 pixels. The tests were conducted independently on three Sentinel-2 satellite images, identifying five basic land cover classes: built-up areas, soil, forest, water, and low vegetation. Standard metrics were used for accuracy assessment: Cohen's kappa coefficient and overall accuracy (for whole images), and F1 score, precision, and recall (for individual classes). The results for the different images were consistent and clearly showed that classification accuracy increases with the size of the training sample. They also showed that, among the tested algorithms, XGB is the most sensitive to the size of the training sample and SVM the least sensitive: SVM achieved relatively good results even with the smallest training samples. At the same time, while the differences between the tested variants of RF and XGB were slight, the effectiveness of SVM depended strongly on the gamma parameter: when gamma was too high, the model tended to overfit and did not produce satisfactory results.
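The accuracy metrics named in the abstract can all be derived from a single confusion matrix. The following is a minimal sketch in plain Python, not the authors' implementation. It assumes that rows hold the reference classes and columns the predicted classes, and the function names and the example counts are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def overall_accuracy(cm: Matrix) -> float:
    """Share of all samples that lie on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohens_kappa(cm: Matrix) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(n)) / total
    row_totals = [sum(cm[i]) for i in range(n)]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement: product of the marginal proportions, summed over classes.
    p_expected = sum(row_totals[k] * col_totals[k] for k in range(n)) / total ** 2
    if p_expected == 1.0:
        # Degenerate case (everything in one class); kappa is undefined,
        # so this sketch returns 0.0 by convention.
        return 0.0
    return (p_observed - p_expected) / (1.0 - p_expected)


def per_class_scores(cm: Matrix, k: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 score for class index k."""
    tp = cm[k][k]
    predicted_k = sum(cm[i][k] for i in range(len(cm)))  # column sum
    reference_k = sum(cm[k])                             # row sum
    precision = tp / predicted_k if predicted_k else 0.0
    recall = tp / reference_k if reference_k else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up counts for the five classes listed in the abstract;
    # these are NOT results from the article.
    classes = ["built-up", "soil", "forest", "water", "low vegetation"]
    example_cm = [
        [80, 12, 1, 0, 7],
        [10, 75, 0, 0, 15],
        [0, 1, 92, 0, 7],
        [1, 0, 0, 99, 0],
        [4, 9, 6, 0, 81],
    ]
    print("overall accuracy:", round(overall_accuracy(example_cm), 3))
    print("Cohen's kappa:", round(cohens_kappa(example_cm), 3))
    for idx, name in enumerate(classes):
        p, r, f1 = per_class_scores(example_cm, idx)
        print(f"{name}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Overall accuracy and kappa summarize a whole image, whereas precision, recall and F1 are computed per class, which matches the split between whole-image and per-class metrics described in the abstract.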
 
 
eISSN:2391-8152
ISSN:2391-8365