Karl Gisslander¹, Arthur White², Mårten Segelmark³, Mark Little² and Aladdin Mohammad⁴; ¹Lund University, Lund, Sweden, ²Trinity College, Dublin, Ireland, ³Nephrology, Department of Clinical Sciences, Lund University, Lund, Sweden, ⁴Rheumatology, Department of Clinical Sciences, Lund University, Lund, Sweden
Background/Purpose: The sub-classification of anti-neutrophil cytoplasmic antibody (ANCA)-associated vasculitis (AAV) has long been debated. Unsupervised learning has previously been used to partition patients into phenotypic groups, but because AAV is a rare disease, small sample sizes have been a limiting factor. Here we attempt clustering of a small dataset harmonised to the FAIRVASC ontology, which would allow the future inclusion of an additional 5000 AAV patients from the FAIRVASC collaboration registries in the cluster model. FAIRVASC is a research project seeking to federate AAV registries across Europe using semantic web technologies (https://fairvasc.eu).
Methods: This study used a dataset of 292 patients from southern Sweden, classified as granulomatosis with polyangiitis (GPA) or microscopic polyangiitis (MPA) according to the European Medicines Agency algorithm. The dataset was converted from a relational database format to a Resource Description Framework (RDF) graph-based data model, harmonising it to the FAIRVASC standard. Factor analysis of mixed data (FAMD) and agglomerative hierarchical clustering on principal components (HCPC) were used to develop a cluster model that included organ pattern, ANCA status, serum creatinine, C-reactive protein, gender, and age at diagnosis. The resulting clusters were evaluated by baseline characteristics, mortality, and renal outcome.
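As an illustration of the FAMD and HCPC steps named above, the following is a minimal sketch rather than the pipeline used in this study. It assumes the usual formulation of FAMD as a principal component analysis of z-scored quantitative variables together with indicator columns scaled by the square root of their level proportions, and it approximates HCPC with Ward linkage on the retained components (the k-means consolidation step that some HCPC implementations apply is omitted). The function names `famd_coordinates` and `hcpc_labels`, the variable subset, and all data values are hypothetical and synthetic.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def famd_coordinates(quantitative, categorical, n_components):
    """Row coordinates from a factor analysis of mixed data (sketch)."""
    quantitative = np.asarray(quantitative, dtype=float)
    # Quantitative block: z-score every column.
    z = (quantitative - quantitative.mean(axis=0)) / quantitative.std(axis=0)

    blocks = [z]
    for column in np.asarray(categorical, dtype=object).T:
        levels = sorted(set(column))
        # Indicator (one-hot) matrix for this categorical variable.
        indicators = np.array(
            [[value == level for level in levels] for value in column],
            dtype=float,
        )
        # Scale each indicator by 1/sqrt(level proportion), then centre.
        scaled = indicators / np.sqrt(indicators.mean(axis=0))
        blocks.append(scaled - scaled.mean(axis=0))

    matrix = np.hstack(blocks)
    # PCA of the combined matrix via singular value decomposition.
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def hcpc_labels(coordinates, n_clusters):
    """Ward hierarchical clustering on the components, cut into n_clusters."""
    tree = linkage(coordinates, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic demonstration data (NOT patient data): two artificial groups.
rng = np.random.default_rng(0)
n = 60
group = np.repeat([0, 1], n // 2)
age = rng.normal(loc=np.where(group == 0, 55, 72), scale=8)
creatinine = rng.lognormal(mean=np.where(group == 0, 4.5, 5.5), sigma=0.4)
anca = np.where(group == 0, "PR3", "MPO")
ent = np.where(rng.random(n) < np.where(group == 0, 0.8, 0.2), "yes", "no")

quantitative = np.column_stack([age, creatinine])
categorical = np.column_stack([anca, ent])

coords = famd_coordinates(quantitative, categorical, n_components=3)
labels = hcpc_labels(coords, n_clusters=2)
print(np.bincount(labels)[1:])  # cluster sizes
```

In practice the number of retained components and the number of clusters would be chosen from the data (for example from the explained inertia and the dendrogram) rather than fixed in advance as they are here.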
Results: The analyses involved data for 163 subjects with GPA and 129 with MPA. The clustering model resulted in two larger clusters and three smaller ones. The larger clusters were a predominantly anti-PR3 positive cluster of younger patients (mean 57.5 years at diagnosis) with ear-nose-throat involvement and a favourable outcome (Cluster 1), and a predominantly anti-MPO positive cluster with severe kidney involvement and high rates of mortality and end-stage kidney disease (Cluster 5). The three smaller clusters differed in organ involvement and ANCA status at diagnosis: one had severe lung and renal involvement and a poor outcome (Cluster 3), and the other two had similar outcomes, one being ANCA negative (Cluster 4) and the other characterised by peripheral nerve involvement (Cluster 2). The descriptive characteristics of the clusters are presented in Table 1.
Conclusion: Our analysis suggests five clusters of AAV patients based on baseline features, associated with different mortality and renal outcome. The investigation acts as a proof of concept of the FAIRVASC ontology and infrastructure for the harmonisation of heterogeneous AAV datasets. The cluster model may in the future readily include an unprecedented number of European AAV patients.
Table 1. Baseline characteristics and outcome of 292 patients with AAV by diagnosis and cluster affiliation
Disclosures: K. Gisslander, None; A. White, None; M. Segelmark, None; M. Little, None; A. Mohammad, None.