Hybrid CNN-Capsule Network Architecture for Automated Diabetic Retinopathy Classification: A Rigorous Statistical Validation on Clinical Data
DOI: https://doi.org/10.56294/saludcyt20252320
Keywords: Diabetic Retinopathy, Computer-Aided Diagnosis, Capsule Networks, Deep Learning, Medical Image Analysis, Clinical Validation
Abstract
Introduction: Diabetic retinopathy (DR) remains a leading cause of preventable blindness worldwide, and automated screening is needed to offset the global shortage of ophthalmological expertise. Many existing deep learning approaches lack rigorous statistical validation and fail to address the severe class imbalance characteristic of clinical DR populations.
Objective: To develop and validate a hybrid convolutional neural network (CNN)-capsule network architecture for automated DR severity grading, emphasizing statistical rigor and clinical applicability over benchmark optimization.
Method: A novel ensemble architecture combining ResNet-50 and MobileNet-V2 feature extractors with capsule network classifiers was developed and validated using 5-fold stratified cross-validation on the Messidor dataset (n = 1,200) with strict patient-level data splitting. The framework addresses severe class imbalance through multi-level mitigation strategies: SMOTE-ENN augmentation, focal loss optimization (α = 0.25, γ = 2.0), and dynamic class weighting. Statistical robustness was assessed through bootstrap confidence intervals (n = 1,000 iterations), McNemar's paired comparison tests, and comprehensive ablation studies.
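As an illustration of the focal loss term named in the Method, the following is a minimal NumPy sketch of the standard multi-class focal loss, FL = -α (1 - p_t)^γ log(p_t), using the reported α = 0.25 and γ = 2.0 as defaults. The function name `focal_loss`, the use of a single scalar α for all classes, and the example probabilities are assumptions made for illustration; the article does not state how the loss was implemented or how it was combined with dynamic class weighting.

```python
import numpy as np

def focal_loss(probs, labels, alpha=0.25, gamma=2.0, eps=1e-12):
    """Mean focal loss over a batch (illustrative sketch, not the authors' code).

    probs  : (n_samples, n_classes) predicted class probabilities
    labels : (n_samples,) integer class indices
    alpha  : scalar weighting factor (assumed shared across classes)
    gamma  : focusing parameter; gamma = 0 recovers scaled cross-entropy
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Probability assigned to the true class of each sample.
    p_t = probs[np.arange(len(labels)), labels]
    p_t = np.clip(p_t, eps, 1.0)  # avoid log(0)
    # (1 - p_t)^gamma down-weights samples that are already well classified.
    return float(np.mean(-alpha * (1.0 - p_t) ** gamma * np.log(p_t)))

# Hypothetical predictions for a 5-grade DR problem.
easy = focal_loss([[0.90, 0.04, 0.03, 0.02, 0.01]], [0])  # confident and correct
hard = focal_loss([[0.90, 0.04, 0.03, 0.02, 0.01]], [4])  # rare grade missed
```

In this sketch the misclassified rare-grade sample contributes a far larger loss than the confidently correct one, which is the property that makes focal loss useful under class imbalance.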
Results: The hybrid architecture achieved 88.67% ± 2.43% overall accuracy (95% CI: 86.24%-91.10%), with an area under the curve of 90.85% ± 2.15% (95% CI: 88.70%-93.00%). Critically, sensitivity for sight-threatening cases reached 84.0% for severe NPDR and 80.6% for proliferative DR, while maintaining 91.8% specificity for non-referable cases. The class imbalance mitigation strategy improved minority-class F1-scores by 19.3% relative to the standard CNN baseline (p < 0.001, McNemar's χ² = 15.67). Cross-validation consistency (coefficient of variation: 2.74%) demonstrated the model stability essential for clinical deployment.
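Two of the reported statistics can be cross-checked with a few lines of standard-library Python. The sketch below assumes that McNemar's statistic is referred to a chi-square distribution with 1 degree of freedom and that the coefficient of variation is the fold standard deviation divided by the mean accuracy; the article states neither explicitly. The function names are hypothetical, and the discordant-pair counts `b` and `c` are not given in the abstract, so `mcnemar_statistic` is shown only as the general formula.

```python
import math

def mcnemar_statistic(b, c, continuity=True):
    """McNemar chi-square from discordant pair counts.

    b : cases model A classified correctly and model B incorrectly
    c : cases model B classified correctly and model A incorrectly
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)  # Edwards continuity correction
    return diff ** 2 / (b + c)

def chi2_sf_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))

def coefficient_of_variation(mean, sd):
    """Coefficient of variation in percent."""
    return 100.0 * sd / mean

# Values taken from the Results paragraph.
p_value = chi2_sf_1df(15.67)                 # reported as p < 0.001
cv = coefficient_of_variation(88.67, 2.43)   # reported as 2.74%
```

Under these assumptions the reported χ² of 15.67 corresponds to a p-value well below 0.001, and 2.43 / 88.67 reproduces the reported coefficient of variation.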
Conclusion: The hybrid CNN-capsule architecture provides clinically relevant DR classification with transparent statistical validation. The demonstrated sensitivity for sight-threatening cases and conservative classification patterns support potential screening application, though prospective clinical validation remains necessary.
License
Copyright (c) 2025 Mini T. V (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
The article is distributed under the Creative Commons Attribution 4.0 License. Unless otherwise stated, associated published material is distributed under the same licence.
