Radiomics and artificial intelligence (AI)-based imaging models offer a noninvasive approach to preoperative risk stratification in localized renal cell carcinoma (RCC), where existing prognostic tools remain limited. We conducted a systematic review and meta-analysis to evaluate their predictive performance and methodological quality for recurrence and survival outcomes.
A systematic review was conducted in PubMed and Scopus from inception through April 2025. Radiomics and AI models were assessed for prognostic accuracy regarding 5-yr fixed-time recurrence-free survival (RFS) and overall survival after surgery for localized RCC. The extracted data included model type, radiomic features, validation methods, and area under the curve (AUC). Methodological quality was assessed using the APPRAISE-AI framework. Pooled 5-yr AUCs were synthesized using a prespecified random-effects model; heterogeneity was quantified (Q and τ²) and explored using sensitivity analyses and a prespecified analysis restricted to external validation cohorts.
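As an illustration of the pooling step described above, the following is a minimal sketch of random-effects pooling that returns the pooled estimate, Cochran's Q, and τ². It rests on assumptions that the text does not state: the DerSimonian-Laird estimator, pooling on the raw AUC scale, and a normal-approximation 95% CI. The function name `pool_random_effects` and the input values are hypothetical and are not data from the included studies.

```python
import math


def pool_random_effects(estimates, std_errors):
    """Pool per-cohort estimates with a DerSimonian-Laird random-effects model.

    Assumption: inputs are AUCs on the raw scale with their standard errors.
    Returns the pooled estimate, its 95% CI, Cochran's Q, and tau^2.
    """
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two cohorts with matching standard errors")

    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) weights and pooled mean, needed for Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled),
        "q": q,
        "tau2": tau2,
    }


# Hypothetical cohort AUCs and standard errors, for illustration only.
example = pool_random_effects([0.82, 0.88, 0.91, 0.85], [0.03, 0.02, 0.025, 0.04])
print(example)
```

A larger τ² widens the pooled CI and pulls the weights toward equality across cohorts, which is why a restriction to more homogeneous cohorts can yield a narrower interval even with fewer studies.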
Thirty studies (n = 17 639) were included, predominantly retrospective and computed tomography (CT) based. The most predictive and frequently retained radiomic features were from the gray-level co-occurrence matrix and shape families. A meta-analysis of 20 radiomic model cohorts showed a pooled AUC of 0.87 (95% confidence interval [CI]: 0.84-0.90) for 5-yr RFS (Q = 271.08; p < 0.001; τ² = 0.0037). External validation cohorts showed a pooled AUC of 0.86 (95% CI: 0.83-0.88; Q = 12.81; p = 0.172; τ² = 0.0004). APPRAISE-AI revealed overall moderate methodological quality (median score: 54/100), with limited adherence to TRIPOD-AI and underuse of explainability tools.
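To put the reported Q statistic on a more interpretable scale, the sketch below converts Q into the Higgins I² statistic, the share of total variability attributed to between-study heterogeneity. I² is not reported in the text above; the calculation assumes that Q = 271.08 was computed across all 20 pooled cohorts (19 degrees of freedom). The number of external validation cohorts is not stated here, so no I² is derived for that subset. The function name `i_squared` is hypothetical.

```python
def i_squared(q, n_cohorts):
    """Higgins I^2 (in percent) from Cochran's Q and the number of cohorts."""
    df = n_cohorts - 1
    if q <= 0:
        return 0.0
    # I^2 = (Q - df) / Q, truncated at zero when Q falls below its expectation.
    return max(0.0, (q - df) / q) * 100.0


# Reported Q for the 20 radiomic model cohorts (assumed df = 19).
print(round(i_squared(271.08, 20), 1))
```

Under that assumption, I² for the full pooled analysis comes to roughly 93%, which is conventionally read as considerable heterogeneity.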
Radiomic models for localized RCC that were built on standardized CT protocols and robust segmentation, and that combined shape and texture features with clinical variables, demonstrated high prognostic accuracy. Our meta-analysis confirms that such models predict recurrence and survival outcomes accurately.
Radiomics and image-based artificial intelligence (AI) have recently attracted considerable interest as potential tools to refine risk stratification in localised renal cell carcinoma (RCC), a setting in which clinicians still rely largely on traditional clinicopathological variables and postoperative nomograms. In this context, the recent systematic review and meta-analysis by Mjaess et al. provides a timely and comprehensive overview of the current evidence on these approaches. The authors present a methodologically well-conducted systematic review and apply the APPRAISE-AI tool, developed for the quantitative evaluation of AI studies for clinical decision support, to offer a critical appraisal of how such tools might eventually be implemented in clinical practice.
The authors identified 30 studies encompassing more than 17,000 patients, most of which were retrospective and based on preoperative computed tomography. Radiomic features were typically derived from tumour texture (particularly grey-level co-occurrence matrix metrics) and shape descriptors, often combined with clinical variables. In pooled analyses of 20 radiomic model cohorts, the discrimination for 5-year recurrence-free survival was strong, with an area under the curve (AUC) of 0.87. Importantly, models evaluated in external validation cohorts maintained similar performance (AUC 0.86) with lower heterogeneity, suggesting that at least some approaches may generalise beyond their original development datasets. Overall, these findings indicate that imaging-based models can capture biologically meaningful information and may offer accurate preoperative prognostication for recurrence and survival.
For the practising urologist, this is an appealing prospect. A non-invasive tool capable of identifying patients at higher risk before surgery could improve counselling, tailor follow-up intensity, or guide selection for adjuvant therapy trials. Radiomics might complement existing scores by adding objective, quantitative tumour phenotyping derived directly from routine imaging, thereby enriching traditional clinical and pathological risk models.
At the same time, the review exposes important shortcomings in the current literature. Despite the number of published studies, overall methodological quality was only moderate. Many investigations did not adhere to TRIPOD-AI reporting guidelines, relied on small and retrospective single-centre cohorts, and used heterogeneous feature selection and modelling strategies. External validation was frequently absent, and explainability tools were underused. Together, these issues raise the risk of overfitting and limit confidence that reported performance would translate reliably into everyday clinical settings.
Moreover, while these models may improve prognostic accuracy on paper, it remains unproven whether their use would actually improve patient outcomes. Demonstrating better discrimination is not the same as showing that changing management based on model predictions leads to fewer recurrences, better survival, or more efficient care. Prospective impact studies are therefore essential before integrating such tools into treatment decision-making pathways.
Taken together, radiomics and AI for localised RCC should currently be regarded as promising proof-of-concept technologies rather than practice-ready solutions. Future studies should focus on large, representative, multicentre cohorts, standardised imaging and analytical pipelines, transparent reporting aligned with TRIPOD-AI, and mandatory external validation, ideally coupled with trials that test clinical utility. Until then, cautious optimism is warranted: the signal is strong, but the evidence base still needs to mature before routine implementation.