Performance Evaluation of Some Machine Learning Models for Music Genre Classification

Olojede, Ojo Abraham
Stephen Olatunde, Olabiyisi
Oluwaseun O. Alo

Abstract: Music genre classification is a challenging task in music information retrieval because some genres have overlapping characteristics and audio quality varies between recordings. Several techniques have been developed to classify music genres accurately, but they have not been adequately analysed and compared. Hence, this study compares the performance of Convolutional Neural Network (CNN), Support Vector Machine (SVM), and Random Forest (RF) models in music genre classification.


Mel-Frequency Cepstral Coefficients (MFCCs) were extracted from the audio samples using the Librosa library. Next, the three machine learning models (CNN, SVM and RF) were trained. The CNN model was designed with multiple convolutional and pooling layers, along with dropout for regularization. The SVM model constructed an optimal separating hyperplane for classification, while the RF model used an ensemble of decision trees. Finally, the models were evaluated and compared using accuracy, precision, recall and F1 score.
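
The four evaluation metrics named above can be illustrated with a minimal sketch. The abstract does not state how precision, recall and F1 score were averaged across genres, so macro-averaging (the unweighted mean of per-genre scores) is an assumption here. The function name `macro_metrics` and the toy genre labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall and F1.

    Assumption: macro-averaging over classes; the source does not
    specify which averaging scheme was used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gets a false positive
            fn[truth] += 1   # true class gets a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0  # harmonic mean of p and r
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for illustration only (not data from the study).
y_true = ["rock", "rock", "jazz", "jazz", "pop", "pop"]
y_pred = ["rock", "jazz", "jazz", "jazz", "pop", "rock"]
scores = macro_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in scores.items()})
```

Under macro-averaging every genre counts equally regardless of how many clips it has, so the averaged F1 score is not simply the harmonic mean of the averaged precision and recall.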


The results of the evaluation and comparison indicate that CNN achieved 95% accuracy, 93% precision, 92% recall and a 91% F1 score. SVM achieved 93% accuracy, 90% precision, 80% recall and a 70% F1 score, while RF achieved 77% accuracy, 77% precision, 72% recall and a 60% F1 score.


The results demonstrate that CNN outperformed SVM and RF in terms of accuracy, precision, recall, and F1 score. CNN is therefore recommended for music genre classification. This finding underscores the effectiveness of CNN in addressing the challenges of music information retrieval, and it can contribute to the advancement of automated music classification systems and improve the accessibility and enjoyment of digital music libraries.

Performance Evaluation of Some Machine Learning Models for Music Genre Classification. (2025). International Journal of Latest Technology in Engineering Management & Applied Science, 14(2), 18-24. https://doi.org/10.51583/IJLTEMAS.2025.1402003
