Volume 16 Issue 1
Mar.  2025

HUANG Sihao, WANG Lixia, LIU Yongxia, JIANG Chengjun, SONG Kexin, ZHAO Yan, HE Yingdui. Banana yield prediction based on machine learning algorithm[J]. Journal of Tropical Biology, 2025, 16(1): 21-30. doi: 10.15886/j.cnki.rdswxb.20240031

Banana yield prediction based on machine learning algorithm

doi: 10.15886/j.cnki.rdswxb.20240031
  • Received Date: 2024-02-22
  • Revision Received Date: 2024-07-01
  • Publish Date: 2025-03-15
  • Abstract: Machine learning algorithms were used to predict the yield of banana (Musa AA), identify the best-performing prediction model and clarify the factors influencing yield, in order to provide technical support for integrated nutrient management and yield prediction of banana (Musa AA) in Chengmai County, Hainan Province. Input variables were screened by correlation analysis and stepwise regression analysis, and yield prediction models were then built with random forest (RF), support vector machine (SVM), K-nearest neighbor (KNN) and artificial neural network (ANN) algorithms. The models were interpreted with the SHapley Additive exPlanations (SHAP) method to reveal the dominant factors affecting yield, and the effect of these factors on yield was analyzed quantitatively. The ANN model gave the best yield prediction, with an R² of 0.98, a root-mean-square error (RMSE) of 0.16 kg·plant⁻¹ and a mean absolute error (MAE) of 0.10 kg·plant⁻¹, and its predictions showed essentially no bias. The error of the ANN model converged at about 100 samples, so a good prediction could be achieved at a lower sampling cost. The SVM model performed only slightly worse than the ANN model but carried a risk of underfitting, whereas the KNN and RF models performed worse and showed overfitting, and their errors had not converged at the current sample size.
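The three accuracy measures quoted for the models (R², RMSE and MAE) can be illustrated with a minimal sketch. The function name `regression_metrics` and the yield values are hypothetical and are not taken from the study; the formulas are the standard definitions.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, MAE) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical yields in kg per plant (illustrative only, not study data).
observed = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 7.0, 7.5, 9.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.2f}  RMSE={rmse:.3f}  MAE={mae:.2f}")
```

RMSE and MAE carry the unit of the target (here kg per plant), while R² is dimensionless, which is why the abstract reports units only for the two error measures.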
SHAP interpretation of the ANN predictions showed that the leading factors affecting the yield of banana (Musa AA) were available potassium, alkali-hydrolyzed nitrogen, exchangeable calcium and exchangeable magnesium. Yield was promoted when available potassium exceeded 100 mg·kg⁻¹, alkali-hydrolyzed nitrogen exceeded 100 mg·kg⁻¹, exchangeable calcium exceeded 600 mg·kg⁻¹ and exchangeable magnesium exceeded 60 mg·kg⁻¹. When exchangeable calcium and magnesium in the soil were deficient, the available manganese and available zinc in the soil should be increased to alleviate nutrient-deficiency stress in banana plantations.
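SHAP attributions are Shapley values, so the idea behind the interpretation step can be sketched by computing exact Shapley values for a small function. Everything below is an assumption made for illustration: `toy_yield` is a made-up stand-in for a trained model (not the authors' ANN), its coefficients and the soil values are hypothetical, and a single baseline sample replaces the background dataset that SHAP implementations normally average over.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_yield(features):
    """Hypothetical yield model with a potassium-calcium interaction."""
    potassium, nitrogen, calcium = features
    return 2.0 + 0.02 * potassium + 0.005 * nitrogen + 0.00001 * potassium * calcium


# Hypothetical soil values in mg per kg: [available K, alkali-hydrolyzed N, exchangeable Ca].
baseline = [100.0, 100.0, 600.0]
x = [150.0, 120.0, 700.0]
phi = shapley_values(toy_yield, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that lets SHAP quantify how much each soil nutrient raises or lowers a predicted yield. Exact enumeration costs on the order of 2ⁿ model evaluations per feature, so practical tools approximate it for models with many inputs.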