Introduction: Gastroesophageal reflux disease (GERD) can lead to voice alterations, including hoarseness. This study aimed to identify specific voice biomarkers associated with pathologic GERD using advanced machine learning tools. Detecting pathologic GERD, including Barrett’s esophagus (BE), from voice biomarkers could serve as a simple, non-invasive screening tool.
Methods: Voice recordings were obtained from patients undergoing clinically indicated esophagogastroduodenoscopy (EGD) and/or ambulatory pH monitoring studies. Patients were excluded if they had another condition (pulmonary, cardiac, neurologic, etc.) associated with voice disturbance. Each voice recording consisted of a standard 6-sentence script read aloud over 25-45 seconds. GERD(+) patients were defined as those with erosive esophagitis (LA grade B-D), peptic stricture, or acid exposure time >6%. BE was defined as columnar mucosa >1 cm with confirmed specialized intestinal metaplasia. Patients without these findings were considered GERD(-). A vocally normal group, consisting of individuals whose voice was judged normal on speech pathology evaluation, served as an independent control group.
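The group definitions above can be read as a simple decision rule. The following Python sketch is illustrative only: the `Findings` record and `classify` function are hypothetical names, not part of the study, and checking BE before GERD(+) is an assumption made here so that the groups are mutually exclusive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Findings:
    """Hypothetical per-patient record of EGD and pH-monitoring findings."""
    la_grade: Optional[str] = None                  # "A".."D"; None if no erosive esophagitis
    peptic_stricture: bool = False
    acid_exposure_time_pct: Optional[float] = None  # None if pH monitoring was not done
    columnar_mucosa_cm: float = 0.0
    intestinal_metaplasia: bool = False             # confirmed specialized intestinal metaplasia


def classify(f: Findings) -> str:
    # Assumption: BE takes precedence over GERD(+) so each patient gets one label.
    if f.columnar_mucosa_cm > 1 and f.intestinal_metaplasia:
        return "BE"
    erosive = f.la_grade in {"B", "C", "D"}
    high_aet = f.acid_exposure_time_pct is not None and f.acid_exposure_time_pct > 6
    if erosive or f.peptic_stricture or high_aet:
        return "GERD(+)"
    return "GERD(-)"
```

Under this reading, a patient with LA grade A esophagitis and no other findings is labeled GERD(-), while an acid exposure time of 6.5% alone is enough for GERD(+).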
Random forest models were trained on a balanced number of subjects per condition, obtained by randomly selecting participants from the majority class. In a 5-fold nested cross-validation strategy, features were selected and ranked within each fold, and a series of models was trained within each fold using recursive feature elimination. Performance was assessed by the F1 score, the harmonic mean of precision and recall (range 0-100), averaged across all folds.
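Two of the steps described above, class balancing and F1 scoring, can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the study's code: the function names `undersample` and `f1_score_100` are hypothetical, the sketch assumes F1 is computed for a single designated positive class, and it leaves out the random forest, nested cross-validation, and recursive feature elimination.

```python
import random
from collections import defaultdict


def undersample(labels, seed=0):
    """Return sorted indices that keep every class at the size of the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    n_min = min(len(idx) for idx in by_class.values())
    kept = []
    for idx in by_class.values():
        # Randomly drop participants from any class larger than the smallest one.
        kept.extend(rng.sample(idx, n_min))
    return sorted(kept)


def f1_score_100(y_true, y_pred, positive):
    """F1 (harmonic mean of precision and recall) for one class, scaled to 0-100."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100 * 2 * precision * recall / (precision + recall)
```

For example, applying `undersample` to 35 BE labels and 98 vocally normal labels keeps all 35 BE indices and a random 35 of the vocally normal ones. Averaging `f1_score_100` over the held-out folds would then give a single score on the 0-100 scale.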
Results: The study sample consisted of 245 patients (vocally normal, n=98; GERD(-), n=78; GERD(+), n=34; BE, n=35) (Table 1). Feature rankings suggested voice quality differences between groups relating to voice signal periodicity. The models showed excellent ability to discern BE and GERD(+) from the vocally normal group: F1 scores were 82 (males) and 89 (females) for BE, and 80 (males) and 80 (females) for GERD(+). Voice features also separated the BE and GERD(+) groups from the GERD(-) group, though less strongly, with F1 scores ranging from 60 to 70. Figure 1 shows the models' receiver operating characteristic (ROC) curves.
Discussion: These results suggest that voice biomarkers may be useful as a non-invasive tool in the detection of pathologic GERD/BE. A deep learning diagnostic model will be developed using the identified voice biomarkers.