Voice Disorder Detection: Hybrid GTCC-MFCC Fusion with Source-Based Features Optimized for the Male Sub-Cohort
Abstract
Voice disorders present a significant clinical challenge: they impair communication and quality of life, which motivates the development of reliable, non-invasive diagnostic systems. This research proposes a diagnostic framework designed to overcome the limitations of traditional methodologies that rely exclusively on single-source spectral information. Acoustic features were extracted from sustained vowel /a/ signals of the male sub-cohort of the Saarbrücken Voice Database (SVD) using a systematic optimization methodology. The feature engineering phase combined spectral coefficients (GTCC and MFCC) with traditional source-based features. To identify the most discriminative and non-redundant feature subset, a Recursive Feature Elimination (RFE) algorithm was applied across 100 iterative experiments to ensure that the selection was statistically stable. This work substantiates two findings. First, a hybrid strategy that combines auditory-inspired spectral features with traditional source-based biomarkers is necessary for maximizing diagnostic stability. Second, the RFE process confirmed the indispensability of key source-based metrics (CPP and GNR), which achieved high ranks in the final feature vector. The proposed framework achieved a peak accuracy of 84.99% ± 4.50% and demonstrated good clinical stability in the early detection of voice disorders, confirming the need to integrate source-based biomarkers into spectral analysis frameworks.
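As an illustration of how results from repeated feature-selection experiments can be summarized, the following minimal Python sketch computes a mean ± standard deviation accuracy and the fraction of runs in which each feature survived elimination. It is not the authors' implementation: the function names, the aggregation scheme, the toy accuracies, and the feature labels such as "MFCC_1" and "GTCC_3" are assumptions made here for illustration only.

```python
import statistics
from collections import Counter


def summarize_runs(accuracies):
    """Return (mean, sample standard deviation) of per-run accuracies in percent."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


def selection_frequency(selected_per_run):
    """Return, for each feature, the fraction of runs in which it was selected."""
    n_runs = len(selected_per_run)
    # set(run) guards against a feature being listed twice within one run.
    counts = Counter(name for run in selected_per_run for name in set(run))
    return {name: count / n_runs for name, count in counts.items()}


# Hypothetical toy data: three runs instead of the 100 described in the abstract.
toy_accuracies = [80.0, 85.0, 90.0]
toy_selected = [
    ["CPP", "GNR", "MFCC_1"],
    ["CPP", "GTCC_3"],
    ["CPP", "GNR"],
]

mean_acc, std_acc = summarize_runs(toy_accuracies)
frequencies = selection_frequency(toy_selected)

print(f"accuracy: {mean_acc:.2f}% ± {std_acc:.2f}%")
for name, freq in sorted(frequencies.items(), key=lambda item: -item[1]):
    print(f"{name}: selected in {freq:.0%} of runs")
```

Under this scheme, a feature that is selected in nearly every run would count as stable, which is one plausible reading of the high ranks reported for CPP and GNR.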
Copyright (c) 2026 ITEGAM-JETIA

This work is licensed under a Creative Commons Attribution 4.0 International License.








