Data quality (DQ) has been a major concern for organizations for decades, leading to the introduction of standards and quality frameworks. Recent advances in artificial intelligence (AI), such as generative AI, have brought DQ back into the spotlight. In enterprises, it is particularly important to build data ecosystems that can cope with the emerging challenges posed by AI-based systems. DQ has been tackled from different perspectives: the database community has made significant advances in data profiling and data cleaning and continues to focus on DQ issues such as duplicate detection and missing data handling; the information systems community provides solutions for addressing DQ at an organizational level; and the machine learning (ML) community focuses mainly on developing robust models that can deal with issues in the data. We want to build upon the success of QDB 23 and QDB 24 and continue offering an open format for joint discussions between different communities on the future of DQ assessment and improvement.
This workshop aims to foster the exchange of novel ideas and best practices in data quality assessment and improvement in the era of AI. The event is intended to bring together senior data quality researchers with junior researchers and PhD students. We expect junior researchers in particular to benefit, since they get to meet the community and can continue to pursue high-quality research on data quality. The suggested topics of interest include, but are not limited to:
We appreciate submissions on all these topics for different domains (e.g., healthcare, mobility, production) and for various types of data (e.g., graphs, time series).