Resident, University of Calgary, Calgary, Alberta, Canada
Background: Dysphonia is a common symptom that primary care physicians encounter. Diagnostic tools for voice disorders are lacking for primary care physicians. Artificial intelligence (AI) tools may add to the armamentarium for physicians, decreasing the time to diagnosis and limiting the burden of dysphonia for patients.
Methods: Voice recordings from patients were prospectively collected from 2019 to 2021 using smartphones. The Saarbrucken dataset was included to increase heterogeneity and sample size. Audio files were converted to mel-spectrograms using TensorFlow and the short-time Fourier transform method. Diagnostic categories were created to group pathologies: neurological and muscular disorders, inflammatory disorders, mass lesions, and normal.
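The audio-to-mel-spectrogram step described in the Methods can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the abstract does not report the sample rate, frame length, hop size, number of mel bins, or frequency range, so every such value below is an assumption, and the function name `audio_to_log_mel` is hypothetical. A synthetic tone stands in for a voice recording.

```python
import numpy as np
import tensorflow as tf


def audio_to_log_mel(waveform, sample_rate=16000, frame_length=1024,
                     frame_step=256, num_mel_bins=64):
    """Convert a 1-D float waveform to a log-mel spectrogram [frames, mel bins].

    All default parameter values are assumptions chosen for illustration.
    """
    waveform = tf.convert_to_tensor(waveform, dtype=tf.float32)

    # Short-time Fourier transform: complex tensor of shape [frames, fft_bins].
    stft = tf.signal.stft(waveform, frame_length=frame_length,
                          frame_step=frame_step)
    magnitude = tf.abs(stft)

    # Triangular mel filterbank that maps linear-frequency bins to mel bins.
    mel_matrix = tf.signal.linear_to_mel_weight_matrix(
        num_mel_bins=num_mel_bins,
        num_spectrogram_bins=magnitude.shape[-1],
        sample_rate=sample_rate,
        lower_edge_hertz=80.0,     # assumed lower band edge
        upper_edge_hertz=7600.0,   # assumed upper band edge (below Nyquist)
    )
    mel = tf.matmul(magnitude, mel_matrix)

    # Log compression; the small offset avoids log(0) on silent frames.
    return tf.math.log(mel + 1e-6)


# Synthetic one-second 220 Hz tone standing in for a voice recording.
t = np.arange(16000) / 16000.0
tone = np.sin(2.0 * np.pi * 220.0 * t).astype(np.float32)
log_mel = audio_to_log_mel(tone)
```

The resulting two-dimensional array (time frames by mel bins) can be treated like an image, which is what makes spectrograms a convenient input for image-style classifiers.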
Results: A total of 1304 samples were collected, including 144 prospective and 712 normal samples. The binary AI was trained on 1025 samples and tested on 129, and the diagnostic category AI was trained on 800 samples and tested on 90. The binary AI detected pathology with 89% accuracy. The diagnostic category AI differentiated vocal cord pathology by diagnostic category with 85% accuracy. When disorders were not grouped into categories, accuracy in recognizing individual pathologies decreased to 55%.