This repository contains the code and data for the short paper titled "Towards Formalizing Concept Drift and Its Variants: A Case Study Using Past COSIT Proceedings," accepted at Conference on Spatial Information Theory (COSIT'24).
The dataset includes paper titles and keywords from COSIT proceedings in 1992, 2001, 2011, and 2022.
In the classic Philosophical Investigations, Ludwig Wittgenstein suggests that the meaning of words is rooted in their use in ordinary language, challenging the idea of fixed rules determining the meaning of words. Likewise, we believe that the meaning of keywords and concepts in academic papers is shaped by their usage within the articles and evolves as research progresses. For example, the terms natural hazards and natural disasters were once used interchangeably, but this is rarely the case today. When searching for archived documents, such as those related to disaster relief, choosing the appropriate keyword is crucial and requires a deeper understanding of the historical context. To improve interoperability and promote reusability from a Research Data Management (RDM) perspective, we examine the dynamic nature of concepts, providing formal definitions of concept drift and its variants. By employing a case study of past COSIT (Conference on Spatial Information Theory) proceedings to support these definitions, we argue that a quantitative formalization can help systematically detect subsequent changes and enhance the overall interpretation of concepts.
For questions or collaboration, please contact [email protected].