Threshold-Optimized and Calibrated Logistic Regression for Breast Cancer Classification

Ronal Watrianthos; Rayendra; Ervan Asri; Yuhefizar; Humaira

doi:10.56294/saludcyt20252241

Authors

Ronal Watrianthos, Politeknik Negeri Padang, Department of Information Technology, Padang, Indonesia. ORCID: https://orcid.org/0000-0003-3475-7266
Rayendra, Politeknik Negeri Padang, Department of Information Technology, Padang, Indonesia. ORCID: https://orcid.org/0009-0005-5736-2082
Ervan Asri, Politeknik Negeri Padang, Department of Information Technology, Padang, Indonesia. ORCID: https://orcid.org/0000-0002-6566-7856
Yuhefizar, Politeknik Negeri Padang, Department of Information Technology, Padang, Indonesia. ORCID: https://orcid.org/0000-0001-9861-7962
Humaira, Politeknik Negeri Padang, Department of Information Technology, Padang, Indonesia. ORCID: https://orcid.org/0000-0002-9554-5358

Keywords:

Breast cancer classification, Logistic regression, Clinical decision support, Probability calibration, Fine needle aspirate cytology

Abstract

Breast cancer affects more than 2.3 million people worldwide each year. Traditional diagnostic methods are limited in consistency and objectivity, particularly in resource-constrained settings. This study developed a logistic regression-based clinical decision support system for breast cancer classification. We analyzed the Wisconsin Diagnostic Breast Cancer dataset, which contains 569 samples (357 benign, 212 malignant) described by 30 quantitative morphological features from fine needle aspirate cytology. Data were standardized with StandardScaler and then partitioned 75-25 into 426 training and 143 testing samples. We evaluated the logistic regression model through confusion matrix analysis, ROC curve assessment, threshold optimization via Youden's Index, and probability calibration assessed with Expected Calibration Error (ECE). On the held-out test data the model achieved 95.8% accuracy, 96.2% sensitivity, and 95.6% specificity, with an AUC-ROC of 0.993. It correctly classified 137 of 143 test samples (86 true negatives, 51 true positives, 4 false positives, 2 false negatives). Threshold optimization identified 0.560 as the optimal decision boundary, yielding a false negative rate of 3.77% and a false positive rate of 4.44%. The predicted probabilities were well calibrated, with an ECE of 0.0390 that improved to 0.0328 after isotonic regression. The logistic regression model thus demonstrated strong discriminative performance for breast cancer classification, but the single train-test split and dataset-specific characteristics call for cautious interpretation. Cross-validation and external validation remain necessary before clinical translation.
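
The evaluation quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The confusion-matrix counts are those reported above, and they reproduce the stated accuracy, sensitivity, specificity, false negative rate, and false positive rate. The helper names (rates, youden_threshold, expected_calibration_error), the equal-width binning used for ECE, and the toy labels and scores are assumptions introduced only for illustration.

```python
# Confusion-matrix counts reported in the abstract (143 test samples).
TN, FP, FN, TP = 86, 4, 2, 51


def rates(tn, fp, fn, tp):
    """Return accuracy, sensitivity, specificity, FNR and FPR as fractions."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fnr": fn / (tp + fn),
        "fpr": fp / (tn + fp),
    }


def youden_threshold(y_true, scores):
    """Pick the cutoff that maximizes J = sensitivity + specificity - 1.

    Candidate cutoffs are the observed scores; a sample is called positive
    when its score is >= the cutoff. Ties keep the lowest cutoff.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def expected_calibration_error(y_true, probs, n_bins=10):
    """ECE with equal-width bins (an assumed binning scheme).

    Each non-empty bin contributes |observed positive rate - mean predicted
    probability|, weighted by the fraction of samples in that bin.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    n = len(y_true)
    ece = 0.0
    for b in bins:
        if b:
            observed = sum(y for y, _ in b) / len(b)
            predicted = sum(p for _, p in b) / len(b)
            ece += len(b) / n * abs(observed - predicted)
    return ece


reported = rates(TN, FP, FN, TP)
for name, value in reported.items():
    print(f"{name}: {value:.4f}")

# Toy labels and scores, invented for illustration only (not study data).
toy_y = [0, 0, 0, 1, 0, 1, 1, 1]
toy_p = [0.05, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.95]
toy_t, toy_j = youden_threshold(toy_y, toy_p)
toy_ece = expected_calibration_error(toy_y, toy_p, n_bins=5)
print(f"toy Youden cutoff: {toy_t}, J = {toy_j:.2f}, toy ECE = {toy_ece:.3f}")
```

In practice the labels and scores would come from the fitted model's predicted probabilities on the test set. The pure-Python loops favor readability over speed, which is adequate for 143 samples.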

