Causal-Constrained Reinforcement Learning for Revenue Cycle Optimization
Abstract
Healthcare revenue cycle management (RCM) loses billions annually to claim denials, yet existing machine learning approaches treat billing as a prediction problem rather than a decision problem: they predict which claims will be denied but do not optimize the coding actions that cause denials. We formulate billing optimization as a sequential decision-making problem under regulatory constraints, integrating causal inference, constrained offline reinforcement learning, and uncertainty quantification into a unified framework. The central theoretical contribution is a proof that the interventional policy value is identifiable from observational claims data despite latent patient severity (Theorem 1), with explicit assumptions (consistency, positivity, conditional ignorability) and sensitivity analysis for unmeasured confounding. A regret decomposition (Theorem 2) isolates causal estimation error from optimization and constraint approximation errors, providing a diagnostic for performance losses. A finite-sample conformal coverage guarantee (Lemma 1) handles policy-induced covariate shift. Uncertainty quantification is embedded in policy training via reward shaping, not merely as a post-hoc filter. On semi-synthetic data with five baselines, the framework achieves a 36% denial rate reduction with stable revenue, zero constraint violations, and well-calibrated uncertainty. A real-world validation on 25,734 claims episodes (151 CPT codes, 3 payers) confirms scalability and produces statistically significant causal ATE estimates. On real data lacking clinical features, the causal reward does not outperform non-causal constrained RL, which is consistent with the theory, since identification requires observing the full adjustment set. The framework’s primary value is the identification and regret decomposition machinery that enables principled decision-making under causal uncertainty, with practical revenue gains contingent on the quality of available causal estimates.
Citation Information
@article{yunguoyu2026,
title={Causal-Constrained Reinforcement Learning for Revenue Cycle Optimization},
author={Yunguo Yu},
journal={Research Square},
year={2026},
doi={https://doi.org/10.21203/rs.3.rs-9465761/v1}
}