Optimizing Low-Cost Numerical Biomarker Panels via Feature Selection Machine Learning for Automated Cancer Staging in Resource-Constrained Clinics

Authors
  • Sunday Sunday
    Ladoke Akintola University of Technology
Keywords:
Cancer Staging, Feature Selection, Machine Learning, Numerical Biomarkers, Resource-Constrained Settings, Random Forest, XGBoost
Abstract

Cancer staging is a critical determinant of treatment pathways and patient prognosis, yet resource-constrained clinics in low- and middle-income countries (LMICs) face persistent barriers to accurate staging because of limited access to advanced imaging, pathology infrastructure, and specialist expertise. Machine learning has shown promise in cancer classification from high-dimensional biomarker data, but existing approaches typically require expensive imaging modalities or genomic sequencing, which makes them impractical for routine deployment in under-resourced settings. This study addresses that gap by developing and validating a framework for automated cancer staging that uses low-cost numerical biomarker panels optimized through machine-learning-based feature selection. We retrospectively analyzed numerical biomarker data, including C-reactive protein (CRP), lactate dehydrogenase (LDH), and tumor mutation burden (TMB), from 398 non-small-cell lung cancer patients, applying Recursive Feature Elimination (RFE) with Random Forest, Support Vector Machine (SVM), Gradient Boosting, and XGBoost classifiers. The optimized panel achieved a staging accuracy of 89.4% (95% CI: 87.1–91.7%) with only 12 key biomarkers. This is a 7% improvement over baseline models using the full feature set and is comparable to the 90.3% accuracy reported for prior radiomics-based approaches, at substantially lower cost. Feature importance analysis identified CRP, LDH, and albumin as the top three predictors. The proposed framework offers a scalable, non-invasive alternative to conventional staging methods, with significant implications for improving cancer care equity in LMICs.
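
The feature selection step described in the abstract can be illustrated with a minimal sketch using scikit-learn's `RFE` wrapped around a Random Forest. This is not the authors' pipeline: the data are synthetic (generated with `make_classification`), and the 30 candidate features, four stage labels, elimination step size, tree count, and 5-fold cross-validation are all assumptions made for illustration. Only the cohort size (398) and the panel size (12) come from the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in for a numerical biomarker table (assumption):
# 398 rows to mirror the cohort size, 30 hypothetical candidate features,
# and 4 hypothetical stage labels.
X, y = make_classification(
    n_samples=398,
    n_features=30,
    n_informative=12,
    n_redundant=4,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

N_SELECTED = 12  # panel size reported in the abstract

# RFE sits inside the pipeline so that feature selection is refit on each
# training fold, which keeps held-out samples from influencing the panel.
pipeline = Pipeline([
    ("rfe", RFE(
        estimator=RandomForestClassifier(n_estimators=50, random_state=0),
        n_features_to_select=N_SELECTED,
        step=2,  # drop the two least important features per iteration
    )),
    ("clf", RandomForestClassifier(n_estimators=50, random_state=0)),
])

# Cross-validated accuracy of the reduced panel (5 folds is an assumption).
scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")

# Refit on all data to inspect which feature columns were retained.
pipeline.fit(X, y)
selected = np.flatnonzero(pipeline.named_steps["rfe"].support_)

print("Mean CV accuracy:", scores.mean())
print("Selected feature indices:", selected.tolist())
```

The same pipeline could in principle wrap any estimator that exposes `feature_importances_` or `coef_`, such as a gradient boosting model or a linear-kernel SVM, in place of the Random Forest.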

Published
06/26/2026
Section
Articles
License

Copyright (c) 2026 Sunday Sunday (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Optimizing Low-Cost Numerical Biomarker Panels via Feature Selection Machine Learning for Automated Cancer Staging in Resource-Constrained Clinics. (2026). The Science Post, 2(2). https://www.thesciencepostjournal.com/index.php/tsp/article/view/138