Publications
Recent publications
Machine learning model deployment in clinical practice demands real-time risk assessment to identify situations in which the model is uncertain. Once deployed, models should be accurate for classes seen during training while providing informative estimates of uncertainty to flag abnormalities and unseen classes for further analysis. Although recent developments in uncertainty estimation have resulted in an increasing number of methods, a rigorous empirical evaluation of their performance on large-scale digital pathology datasets is lacking. This work provides a benchmark for evaluating prevalent methods on multiple datasets by comparing the uncertainty estimates on both in-distribution and realistic near and far out-of-distribution (OOD) data on a whole-slide level. To this end, we aggregate uncertainty values from patch-based classifiers to whole-slide level uncertainty scores. We show that results found in classical computer vision benchmarks do not always translate to the medical imaging setting. Specifically, we demonstrate that deep ensembles perform best at detecting far-OOD data but can be outperformed on a more challenging near-OOD detection task by multi-head ensembles trained for optimal ensemble diversity. Furthermore, we demonstrate the harmful impact OOD data can have on the performance of deployed machine learning models. Overall, we show that uncertainty estimates can be used to discriminate in-distribution from OOD data with high AUC scores. Still, model deployment might require careful tuning based on prior knowledge of prospective OOD data.
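As a rough illustration of the aggregation step described above, the sketch below turns patch-level class probabilities into one whole-slide uncertainty score and then measures how well that score separates in-distribution from OOD slides. It is not the authors' implementation: predictive entropy as the patch uncertainty, the mean as the aggregation rule, the function names, and the toy probabilities are all assumptions made for this example.

```python
import math


def predictive_entropy(probs):
    """Entropy (in nats) of one patch's class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def slide_uncertainty(patch_probs):
    """Whole-slide score: mean patch entropy (assumed aggregation rule)."""
    entropies = [predictive_entropy(p) for p in patch_probs]
    return sum(entropies) / len(entropies)


def auc(id_scores, ood_scores):
    """AUC as the fraction of (OOD, ID) pairs where the OOD slide scores higher.

    Ties count as half a win.
    """
    wins = 0.0
    for ood in ood_scores:
        for ind in id_scores:
            if ood > ind:
                wins += 1.0
            elif ood == ind:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Made-up patch probabilities: two slides per group, two patches per slide.
id_slides = [
    [[0.95, 0.05], [0.90, 0.10]],
    [[0.85, 0.15], [0.99, 0.01]],
]
ood_slides = [
    [[0.55, 0.45], [0.60, 0.40]],
    [[0.50, 0.50], [0.70, 0.30]],
]

id_scores = [slide_uncertainty(s) for s in id_slides]
ood_scores = [slide_uncertainty(s) for s in ood_slides]
print(auc(id_scores, ood_scores))
```

Other aggregation rules (for example a high percentile of patch uncertainties) slot into `slide_uncertainty` without changing the AUC computation.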
Publication in DiVA
Background: Optimal fluid status is an important issue in hemodialysis. Clinical evaluation of volume status and different diagnostic tools are used to determine hydration status in these patients. However, there is still no accurate method for this assessment.
Purpose: To propose and evaluate relative lean water signal (LWSrel) as a water-fat MRI-based tissue hydration measurement.
Study Type: Prospective.
Population: A total of 16 healthy subjects (56 +/- 6 years, 0 male) and 11 dialysis patients (60.3 +/- 12.3 years, 9 male; dialysis time per week 15 +/- 3.5 hours, dialysis duration 31.4 +/- 27.9 months).
Field Strength/Sequence: A 3 T; 3D spoiled gradient echo.
Assessment: LWSrel, a measurement of the water concentration of tissue, was estimated from fat-referenced MR images. Segmentations of total adipose tissue as well as thigh and calf muscles were used to measure LWSrel and tissue volumes. LWSrel was compared between healthy subjects and dialysis patients, the latter before and after dialysis. Bioimpedance-based body composition monitor overhydration (BCM OH) was also measured.
Statistical Tests: T-tests were used to compare differences between the healthy subjects and dialysis patients, as well as changes between before and after dialysis. Pearson correlation was calculated between MRI and non-MRI biomarkers. A P value < 0.05 was considered statistically significant.
Results: The LWSrel in adipose tissue was significantly higher in the dialysis cohort compared with the healthy cohort (246.8% +/- 60.0% vs. 100.0% +/- 10.8%) and decreased significantly after dialysis (246.8 +/- 60.0% vs. 233.8 +/- 63.4%). Thigh and calf muscle volumes also significantly decreased by 3.78% +/- 1.73% and 2.02% +/- 2.50% after dialysis. There was a significant correlation between changes in adipose tissue LWSrel and ultrafiltration volume (r = 0.87), as well as with BCM OH (r = 0.66).
Data Conclusion: MRI-based LWSrel and tissue volume measurements are sensitive to tissue hydration changes occurring during dialysis.
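The two statistics named in this abstract can be sketched in a few lines. This is a minimal example rather than the study's analysis code: it assumes the before/after comparison is a paired t-test (the abstract does not say which variant was used), the function names are hypothetical, and all numbers are made up for illustration.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def paired_t_statistic(before, after):
    """t statistic for paired samples (assumed variant); df = n - 1."""
    diffs = [b - a for b, a in zip(before, after)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


# Made-up values, not data from the study.
change_in_signal = [1.0, 2.0, 3.0, 4.0]
removed_volume = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(change_in_signal, removed_volume))

before_dialysis = [10.0, 12.0, 9.0, 11.0]
after_dialysis = [9.0, 10.0, 8.0, 10.0]
print(paired_t_statistic(before_dialysis, after_dialysis))
```

Turning the t statistic into a P value requires the t distribution with n - 1 degrees of freedom, which is left out here to keep the sketch dependency-free.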
Publication in DiVA
Background Segmenting the whole heart over the cardiac cycle in 4D flow MRI is a challenging and time-consuming process, as there is considerable motion and limited contrast between blood and tissue.
Purpose To develop and evaluate a deep learning-based segmentation method to automatically segment the cardiac chambers and great thoracic vessels from 4D flow MRI.
Study Type Retrospective.
Subjects A total of 205 subjects were included: 40 healthy volunteers and 165 patients with a variety of cardiac disorders. Data were randomly divided into training (n = 144), validation (n = 20), and testing (n = 41) sets.
Field Strength/Sequence A 3 T/time-resolved velocity encoded 3D gradient echo sequence (4D flow MRI).
Assessment A 3D neural network based on the U-net architecture was trained to segment the four cardiac chambers, aorta, and pulmonary artery. The segmentations generated were compared to manually corrected atlas-based segmentations. End-diastolic (ED) and end-systolic (ES) volumes of the four cardiac chambers were calculated for both segmentations.
Statistical tests Dice score, Hausdorff distance, average surface distance, sensitivity, precision, and miss rate were used to measure segmentation accuracy. Bland-Altman analysis was used to evaluate agreement between volumetric parameters.
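The overlap-based metrics listed here all follow from voxel-wise counts of true positives, false positives, and false negatives. The sketch below shows those definitions for a single binary mask; it assumes flattened 0/1 masks with at least one foreground voxel in each, uses a hypothetical function name and toy masks, and leaves out the two surface-based metrics (Hausdorff distance and average surface distance), which need voxel coordinates and spacing.

```python
def overlap_metrics(pred, truth):
    """Dice, sensitivity, precision, and miss rate for flattened binary masks.

    Assumes both masks contain at least one foreground voxel, so no
    denominator is zero.
    """
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p and t)      # predicted and true
    fp = sum(1 for p, t in pairs if p and not t)  # predicted, not true
    fn = sum(1 for p, t in pairs if not p and t)  # true, not predicted
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "miss_rate": fn / (tp + fn),
    }


# Toy masks: 3 true positives, 1 false positive, 1 false negative.
predicted = [1, 1, 1, 0, 0, 0, 1, 0]
reference = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = overlap_metrics(predicted, reference)
print(metrics)
```

Miss rate is the complement of sensitivity, so the two always sum to 1.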
Results The following evaluation metrics were computed (mean +/- SD): Dice score (0.908 +/- 0.023), Hausdorff distance (1.253 +/- 0.293 mm), average surface distance (0.466 +/- 0.136 mm), sensitivity (0.907 +/- 0.032), precision (0.913 +/- 0.028), and miss rate (0.093 +/- 0.032). Bland-Altman analyses showed good agreement between volumetric parameters for all chambers. Limits of agreement as a percentage of mean chamber volume (LoA%) for ED and ES, respectively, were: left ventricular 9.3% and 13.5%; left atrial 12.4% and 16.9%; right ventricular 9.9% and 15.6%; and right atrial 18.7% and 14.4%.
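A Bland-Altman analysis of two paired volume measurements can be sketched as follows. The abstract does not define how LoA% was computed, so the formula used here (1.96 times the SD of the paired differences, divided by the mean volume) is an assumption, as are the function name and the toy volumes.

```python
import statistics


def bland_altman(method_a, method_b):
    """Bias and 95% limits of agreement between two paired measurements.

    loa_percent is the half-width of the limits as a percentage of the
    mean measured value (an assumed definition).
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    pair_means = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)
    return {
        "bias": bias,
        "lower": bias - half_width,
        "upper": bias + half_width,
        "loa_percent": 100 * half_width / statistics.mean(pair_means),
    }


# Made-up chamber volumes in mL, not data from the study.
automatic = [150.0, 120.0, 180.0, 140.0]
manual = [148.0, 123.0, 178.0, 141.0]
agreement = bland_altman(automatic, manual)
print(agreement)
```

A bias near zero with narrow limits indicates that the two segmentations give interchangeable volumes for practical purposes.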
Data conclusion The addition of this technique to the 4D flow MRI assessment pipeline could expedite and improve the utility of this type of acquisition in the clinical setting.
Evidence Level 4
Technical Efficacy Stage 1
Publication in DiVA