Multi-Output Machine-Learning Prediction of Volatile Organic Compounds (VOCs): Learning from Co-Emitted VOCs
Publication Type
Original research
Authors

Volatile organic compounds (VOCs) are important contributors to indoor and occupational air pollution, particularly in environments involving the extensive use of paints and solvents. Routine measurement of VOCs is often limited by resource constraints, creating a need for indirect estimation techniques. This work addresses that need with a predictive framework that offers a practical, interpretable alternative to full-spectrum chemical analysis and supports early exposure detection in resource-limited settings, contributing to environmental health monitoring and occupational risk assessment. The study explores the capability of machine learning to simultaneously predict the concentrations of five paint-related VOCs from other co-emitted VOCs together with demographic variables. Three models were calibrated: Multi-Output Gaussian Process Regression (MOGP), a CatBoost Multi-Output Regressor, and Multi-Output Neural Networks; each achieved high predictive performance. A feature importance analysis showed that certain VOCs and some demographic variables consistently influenced the predictions across all models, pointing to common exposure determinants for individuals regardless of their specific exposure setting. Additionally, a subgroup analysis identified exposure disparities across demographic groups, supporting targeted risk mitigation efforts.

Journal
Title
Environments
Publisher
MDPI
Publisher Country
Switzerland
Indexing
Scopus
Impact Factor
3.7
Publication Type
Both (Printed and Online)
Volume
12
Year
2025
Pages
--