Public Data, Reproducibility, and Benchmark-Building as Foundations for Generalizable Anastomotic Leak Prediction in Surgical Machine Learning

Nabin Thapa; Prakash Bhandari; Milan Karki

Published: 2025-07-04

Nabin Thapa

Far Western University, Department of Computer Science and Information Technology, Bhimdatta-18, Mahendranagar 10400, Mahakali Highway, Kanchanpur, Nepal

Prakash Bhandari

Mid-Western University, School of Engineering, Department of Computer Engineering and IT, Birendranagar 21700, Surkhet–Jumla Road, Surkhet, Nepal

Milan Karki

Nepal Open University, Faculty of Science, Health and Technology, Department of Computer Science, Manbhawan 44700, Kumaripati Road, Lalitpur, Nepal

Abstract

Benchmark culture has shaped progress in several computational disciplines because it converts isolated model claims into cumulative evidence. Clinical machine learning has adopted this logic unevenly, and anastomotic leak research illustrates why. Anastomotic leak is clinically consequential, relatively infrequent, operationally heterogeneous, and often documented through imperfect combinations of diagnosis codes, procedures, laboratory trajectories, and clinician judgment. These properties make the problem suitable for machine learning while simultaneously making evaluation unusually fragile. A model can appear promising under one cohort definition, one feature extraction protocol, or one institutional coding practice, then degrade when any of those conditions change. This paper develops a technical framework for benchmark-building in anastomotic leak research centered on three propositions: public data are necessary for cumulative method comparison, reproducibility must be treated as an evaluated property rather than a rhetorical aspiration, and benchmark design should foreground transportability rather than leaderboard maximization. The discussion formalizes benchmark tasks across preoperative, early postoperative, and longitudinal surveillance horizons; analyzes label uncertainty and missingness as structural properties of the data-generating process; and proposes an evaluation architecture that combines discrimination, calibration, shift robustness, fairness, and implementation-sensitive reporting. A central argument is that benchmark quality depends less on the novelty of any one algorithm than on precise cohort construction, deterministic pipelines, patient-level temporal splitting, and sustained governance of evolving public corpora. The result is a blueprint for anastomotic leak benchmarking that can support transparent model development, rigorous cross-site comparison, and more credible claims about clinical readiness.

Issue

Vol. 15 No. 7 (2025): TDSHBS-JULY-2025

Section

Articles

How to Cite

Thapa, N., Bhandari, P., & Karki, M. (2025). Public Data, Reproducibility, and Benchmark-Building as Foundations for Generalizable Anastomotic Leak Prediction in Surgical Machine Learning. Transactions on Digital Society, Human Behavior, and Socioeconomic Studies, 15(7), 1-24. https://sciencequill.com/index.php/TDSHBS/article/view/2025-07-04
