Automated Data Quality Validation for Smart City Data Ecosystems
Publication Type
Conference Paper
Authors

Smart cities generate massive volumes of heterogeneous data from sources such as traffic systems, environmental sensors, public transport, and citizen applications. Ensuring the quality of this urban data is crucial for reliable analytics, service optimization, and policy-making. However, data validation in smart city systems remains largely manual, error-prone, and non-scalable due to frequent schema evolution and variable data standards across departments.

In this paper, we apply the DQGen framework to automate data quality validation in smart city environments. Using metadata extracted from open urban datasets, the framework maps standard quality dimensions (completeness, consistency, validity, and timeliness) to executable validation rules expressed with Great Expectations. The generated scripts can be integrated into city dashboards or batch pipelines, enabling continuous, transparent, and repeatable validation as datasets evolve.
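As a rough illustration of the metadata-to-rule mapping described above, the following Python sketch derives rule specifications from column metadata. It is not the DQGen implementation: the metadata fields (`required`, `min`, `max`, `allowed`), the sample columns, and the function `rules_from_metadata` are assumptions made for this example. Only the expectation names are taken from Great Expectations, and the `expectation_type`/`kwargs` dictionary layout follows the suite format of older releases (newer releases may use different keys). Timeliness is left out because it depends on each dataset's update schedule.

```python
from typing import Any

# Hypothetical column metadata for an air quality dataset. The field names
# ("required", "min", "max", "allowed") are assumptions for illustration only.
AIR_QUALITY_METADATA = [
    {"name": "station_id", "required": True},
    {"name": "pm10", "required": True, "min": 0, "max": 1000},
    {"name": "status", "allowed": ["valid", "provisional"]},
]


def rules_from_metadata(columns: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Translate column metadata into expectation specs held as plain dicts."""
    rules: list[dict[str, Any]] = []
    for col in columns:
        name = col["name"]
        if col.get("required"):
            # Completeness: required columns must not contain missing values.
            rules.append({
                "expectation_type": "expect_column_values_to_not_be_null",
                "kwargs": {"column": name},
            })
        if "min" in col or "max" in col:
            # Validity: numeric values must fall inside the declared range.
            rules.append({
                "expectation_type": "expect_column_values_to_be_between",
                "kwargs": {
                    "column": name,
                    "min_value": col.get("min"),
                    "max_value": col.get("max"),
                },
            })
        if "allowed" in col:
            # Consistency: categorical values must come from the declared set.
            rules.append({
                "expectation_type": "expect_column_values_to_be_in_set",
                "kwargs": {"column": name, "value_set": col["allowed"]},
            })
    return rules


rules = rules_from_metadata(AIR_QUALITY_METADATA)
```

Keeping the rules as plain dictionaries means they can be regenerated whenever a dataset's schema changes and then loaded into whichever Great Expectations version a given pipeline uses.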

We evaluate the framework on datasets from a municipal open data portal, covering traffic flow, air quality, and public transportation usage records.

Conference
Conference Title
I-CiTies 2025 - 11th CINI Annual Conference on ICT for Smart Cities & Communities
Conference Country
Italy
Conference Date
Sept. 17, 2025 - Sept. 19, 2025
Conference Sponsor
Springer