JVLGS: Joint Vision–Language Gas Leak Segmentation
Abstract
Gas leaks pose severe risks to human health and industrial safety, yet accurate and timely monitoring of them remains a major challenge. Existing vision-based methods that use infrared (IR) imagery are limited by the inherently blurry and non-rigid nature of leak plumes, which reduces detection reliability and precision. To overcome these limitations, this paper proposes a Joint Vision–Language Gas leak Segmentation (JVLGS) framework that integrates the complementary strengths of the visual and textual modalities to improve gas leak segmentation. Because gas leaks are sporadic and many video frames contain no leakage, JVLGS incorporates an adaptive postprocessing module that suppresses false positives caused by noise and non-target objects, a common limitation of existing approaches. Extensive experiments across diverse industrial scenarios demonstrate that JVLGS significantly outperforms state-of-the-art gas leak segmentation methods. Furthermore, it achieves consistently strong performance under both supervised and few-shot learning settings, whereas competing methods typically perform well in only one setting or underperform in both. Our code is available at https://github.com/GeekEagle/JVLGS.
Citation Information
@article{xinlongzhao2026,
title={JVLGS: Joint Vision–Language Gas Leak Segmentation},
author={Xinlong Zhao and Qixiang Pang and Shan Du},
journal={The Visual Computer},
year={2026},
doi={10.21203/rs.3.rs-9331073/v1}
}