A Deterministic, Rule‑Based Pipeline for Automated Cervical Cytology-Histology Correlation Quality Assurance
Abstract
Background: Cervical cytology–histology correlation (CHC) is a key component of quality assurance (QA) in gynecologic cytopathology, enabling assessment of diagnostic concordance and identification of clinically significant discrepancies. Despite established guidance from the College of American Pathologists (CAP) and the Bethesda System, CHC practices remain variable across laboratories due to inconsistent terminology mapping, heterogeneous discrepancy classification, and reliance on manual workflows, all of which limit reproducibility and scalability.
Objective: To develop and evaluate a deterministic, rule-based pipeline for automated CHC quality assurance using laboratory information system (LIS)-derived data.
Methods: A modular pipeline was implemented to perform diagnostic normalization, rule-based classification of cytology–histology pairs, and computation of QA metrics. Diagnostic terminology was standardized using configurable dictionary-based mappings, and discrepancy classification was performed using predefined rules aligned with CAP CHC guidelines. The system generates concordance rates, discordance subtypes, and the positive predictive value of high-grade squamous intraepithelial lesion (HSIL) cytology, along with structured outputs and visualizations. Performance was evaluated using synthetic datasets of 300, 500, and 1,000 cases to assess normalization coverage and classification stability.
Results: Terminology coverage improved progressively across the three datasets, with unmapped cases decreasing from 4.0% (12/300) to 1.2% (6/500) and ultimately 0% (0/1,000). Rule-based classification remained stable across all datasets, with every case in the final evaluation assigned to the concordant, minor discordant, or major discordant category. The normalization dictionary achieved full convergence (no unmapped terms in the final dataset), and all QA metrics were reproducibly generated across runs.
Conclusions: This study presents a reproducible, deterministic pipeline for automated CHC quality assurance that standardizes diagnostic terminology, ensures consistent discrepancy classification, and generates interpretable QA metrics. The approach addresses key limitations of manual CHC workflows and provides a scalable framework aligned with contemporary cytopathology practice and risk-based management principles. Future validation using real-world LIS datasets will be needed to establish its clinical utility.
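The three stages named in the Methods (dictionary-based normalization, rule-based pair classification, and QA metric computation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the dictionary entries, the three severity tiers, the "one tier apart is minor, two tiers apart is major" rule, and the names `NORMALIZATION_MAP`, `SEVERITY`, `normalize`, `classify_pair`, and `qa_metrics` are hypothetical and are not taken from the study's implementation or from the CAP guidelines.

```python
from collections import Counter

# Hypothetical normalization dictionary: raw LIS text -> canonical tier.
# A real deployment would load a much larger, configurable mapping.
NORMALIZATION_MAP = {
    "nilm": "NEGATIVE",
    "benign": "NEGATIVE",
    "lsil": "LOW_GRADE",
    "cin 1": "LOW_GRADE",
    "hsil": "HIGH_GRADE",
    "cin 2": "HIGH_GRADE",
    "cin 3": "HIGH_GRADE",
}

# Illustrative severity ranks (an assumption, not the study's rule set).
SEVERITY = {"NEGATIVE": 0, "LOW_GRADE": 1, "HIGH_GRADE": 2}


def normalize(raw):
    """Map a raw diagnosis string to a canonical tier, or 'UNMAPPED'."""
    return NORMALIZATION_MAP.get(raw.strip().lower(), "UNMAPPED")


def classify_pair(cyto, histo):
    """Classify one normalized cytology-histology pair by tier distance."""
    if "UNMAPPED" in (cyto, histo):
        return "unmapped"
    gap = abs(SEVERITY[cyto] - SEVERITY[histo])
    return ("concordant", "minor_discordant", "major_discordant")[gap]


def qa_metrics(pairs):
    """Compute category counts and summary rates for raw (cyto, histo) pairs."""
    normalized = [(normalize(c), normalize(h)) for c, h in pairs]
    counts = Counter(classify_pair(c, h) for c, h in normalized)
    total = len(normalized)
    mapped = total - counts["unmapped"]
    # Histology outcomes for cases whose cytology was high grade.
    hsil_histo = [h for c, h in normalized
                  if c == "HIGH_GRADE" and h != "UNMAPPED"]
    confirmed = sum(h == "HIGH_GRADE" for h in hsil_histo)
    return {
        "counts": dict(counts),
        "unmapped_rate": counts["unmapped"] / total if total else None,
        "concordance_rate": counts["concordant"] / mapped if mapped else None,
        "hsil_ppv": confirmed / len(hsil_histo) if hsil_histo else None,
    }
```

Calling `qa_metrics` on a list of raw string pairs such as `[("HSIL", "CIN 3"), ("NILM", "Benign")]` returns a plain dictionary, so the same input always yields the same output. In this sketch, a term missing from the dictionary is surfaced as `unmapped` rather than guessed, which is one way the unmapped-case rate reported in the Results could be tracked as the dictionary grows.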
Citation Information
@article{solomontessega2026,
title={A Deterministic, Rule‑Based Pipeline for Automated Cervical Cytology-Histology Correlation Quality Assurance},
author={Solomon Tessega},
journal={Research Square},
year={2026},
doi={10.21203/rs.3.rs-9460175/v1}
}