Introduction: PEDSnet, a clinical research network of pediatric health systems in the US, provides an opportunity to pool data across institutions for observational studies and clinical trials in rare diseases. We previously reported using diagnostic codes to identify patients in PEDSnet with primary hyperoxaluria (PH), a rare genetic disorder of liver oxalate overproduction. Here, we evaluate how accurately that code-based algorithm identifies patients with PH.
Methods: We reviewed the medical records of 341 previously identified patients <18 years old who had diagnostic codes for or related to PH between January 2009 and January 2021 at 7 PEDSnet institutions. We developed an algorithm using diagnostic codes that assigned patients to three tiers reflecting the hypothesized probability of truly having PH. Tier 1 had specific diagnostic codes for PH; tier 2 had codes for hyperoxaluria, oxalate nephropathy, or oxalosis; tier 3 had a combination of ≥2 codes for a disorder of carbohydrate metabolism and ≥1 code for kidney stones. A true PH diagnosis was defined by genetic testing, confirmatory urine collections of oxalate metabolites, or diagnosis through other sources, such as clinical documentation or liver biopsy/hepatic enzyme analysis. The proportion of confirmed PH diagnoses was compared across tiers and PEDSnet sites.
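The tiering rule described in the Methods can be sketched in Python as follows. This is a minimal illustration, not the study's implementation: the category labels (`PH_SPECIFIC`, `CARB_METABOLISM`, and so on) and the function name `assign_tier` are hypothetical, since the abstract does not list the underlying diagnostic codes. The sketch also assumes that the tier 3 thresholds mean "at least 2" and "at least 1", and that a patient who qualifies for more than one tier is placed in the most specific one.

```python
from collections import Counter
from typing import Iterable, Optional

# Hypothetical category labels standing in for groups of diagnostic codes;
# the actual codes used in the study are not given in the abstract.
TIER1_CATEGORIES = {"PH_SPECIFIC"}
TIER2_CATEGORIES = {"HYPEROXALURIA", "OXALATE_NEPHROPATHY", "OXALOSIS"}


def assign_tier(code_categories: Iterable[str]) -> Optional[int]:
    """Return 1, 2, or 3 for one patient's coded diagnoses, or None if no tier applies.

    Each element of code_categories is the category of one recorded code.
    Assumption: tiers are mutually exclusive and the most specific tier wins.
    """
    counts = Counter(code_categories)

    # Tier 1: any diagnostic code specific to PH.
    if any(counts[c] > 0 for c in TIER1_CATEGORIES):
        return 1

    # Tier 2: any code for hyperoxaluria, oxalate nephropathy, or oxalosis.
    if any(counts[c] > 0 for c in TIER2_CATEGORIES):
        return 2

    # Tier 3: at least 2 carbohydrate-metabolism codes AND at least 1 kidney stone code.
    if counts["CARB_METABOLISM"] >= 2 and counts["KIDNEY_STONE"] >= 1:
        return 3

    return None
```

For example, under these assumptions a patient with two carbohydrate-metabolism codes and one kidney stone code falls in tier 3, while the same patient with a single oxalosis code added moves to tier 2.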
Results: Of 330 patients with PH diagnosis codes for whom chart review was completed, 36 (10.9%) had confirmed PH. Tier 1 had the highest proportion of true PH; however, its accuracy was only 20% (Figure). Of those with confirmed PH, 25 had PH type 1 and the remainder had type 2, type 3, or unspecified PH. The diagnosis was confirmed by genetic testing in 28 patients, while the others were diagnosed based on clinical documentation or liver biopsy/hepatic enzyme analysis. The percentage of confirmed PH across sites ranged from 0-18% (median 14%) overall and improved to 0-47% (median 22%) when using tier 1 criteria.
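The overall proportion reported above is a simple ratio of confirmed cases to reviewed charts. A minimal sketch of that arithmetic is shown here, using only the counts stated in the Results; the helper name `percent_confirmed` is hypothetical.

```python
def percent_confirmed(confirmed: int, reviewed: int) -> float:
    """Percentage of reviewed patients with a confirmed diagnosis, to one decimal place."""
    if reviewed <= 0:
        raise ValueError("reviewed must be a positive count")
    if not 0 <= confirmed <= reviewed:
        raise ValueError("confirmed must be between 0 and reviewed")
    return round(100 * confirmed / reviewed, 1)


# Counts from the Results: 36 confirmed PH among 330 completed chart reviews.
overall = percent_confirmed(36, 330)
print(overall)  # 10.9
```

The same function could be applied to the counts within each tier or site to reproduce the per-group percentages, although those underlying counts are not given in the abstract.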
Conclusions: Diagnostic codes for hyperoxaluria, even those specific for PH, have a poor ability to accurately identify patients with PH. This poor performance raises concerns about relying solely on code-based methods to identify patient populations of interest. Particular care is needed when assembling study populations for rare disorders.