Julien Martin-Prin, class of 2022, together with Imen Ouled Dlala and Nicolas Travers, both teaching researchers at ESILV and members of the Digital Group of the De Vinci Research Center, coauthored a research paper that received an award at the international conference ADMA 2022 (Advanced Data Mining and Applications).
Julien Martin-Prin, class of 2022, mentored by Imen Ouled Dlala and Nicolas Travers, won the prize for best research paper at the ADMA 2022 conference.
The 18th edition of this prestigious international conference took place from November 28 to 30 in Brisbane, Australia. Nicolas Travers, a faculty member of the Digital Group, participated in the conference to present this work.
The work of the Digital Group awarded at the ADMA conference
Julien is a student in the Connected Objects & Cybersecurity major. As part of his research program at the De Vinci Research Center, he became interested in the impact of Big Data on tourism and contributed to the work of the Digital Group.
ADMA, the International Conference on Advanced Data Mining and Applications, is a conference with high international impact.
In this context, the research paper “A Distributed SAT-based Framework for Closed Frequent Item Mining” was accepted for publication at the 18th edition of the conference (ADMA 2022) and received the best paper award in the plenary session.
A scientific contribution to the mining of complex data
Data mining aims at extracting knowledge from databases. In particular, it involves searching for recurring patterns in set data where a set of characteristics describes each data item. The criteria used to define patterns (frequency, closure, maximality, etc.) are generally easily modeled in terms of constraints.
Declarative and flexible approaches have been proposed to model and extract patterns: this allows the user to easily describe the criteria, in terms of constraints, and to use the algorithmic machinery of these formalisms to retrieve the patterns satisfying these criteria.
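To make the idea of criteria expressed as constraints concrete, here is a minimal Python sketch. It is a brute-force illustration only, not the SAT-based method of the paper: the toy data `VISITS`, the function names, and the support threshold are all assumptions introduced for this example. Each criterion (frequency, closure) is written as a predicate, and a generic enumerator keeps the patterns that satisfy all of them.

```python
from itertools import combinations

# Hypothetical toy data: each transaction is the set of places one tourist visited.
VISITS = [
    {"Louvre", "Eiffel Tower", "Orsay"},
    {"Louvre", "Eiffel Tower"},
    {"Louvre", "Orsay"},
    {"Eiffel Tower", "Orsay", "Louvre"},
    {"Versailles"},
]

def support(pattern, transactions):
    """Number of transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t)

def is_frequent(pattern, transactions, min_support):
    """Frequency constraint: the pattern appears in at least min_support transactions."""
    return support(pattern, transactions) >= min_support

def is_closed(pattern, transactions):
    """Closure constraint: no strict superset has the same support.

    Equivalently, the pattern equals the intersection of all
    transactions that contain it.
    """
    cover = [t for t in transactions if pattern <= t]
    if not cover:
        return False
    return set.intersection(*cover) == set(pattern)

def mine(transactions, constraints):
    """Enumerate every non-empty itemset and keep those satisfying all constraints.

    Exhaustive enumeration is exponential in the number of items,
    so this only works on tiny examples.
    """
    items = sorted(set().union(*transactions))
    results = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            pattern = frozenset(combo)
            if all(check(pattern) for check in constraints):
                results.append(pattern)
    return results

closed_frequent = mine(
    VISITS,
    [
        lambda p: is_frequent(p, VISITS, min_support=2),
        lambda p: is_closed(p, VISITS),
    ],
)
```

Swapping or adding a predicate (for example, a maximality check) changes what is mined without touching the enumerator, which is the flexibility declarative approaches aim for. The exhaustive loop also shows why such generic approaches struggle with large volumes of data.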
However, this high level of genericity comes at a cost: these approaches do not scale. In other words, they cannot handle huge volumes of data.
Thus, this work aims to enable data mining methods to handle large volumes of data.
Moreover, we focus in this study on the huge volumes of tourism data that we use in our work on tourism behavior analysis [WISE’2019, WISE’2020, Tourism Worlds 2018 & 2021, SAC’2022].
Symbolic artificial intelligence at the service of tourism
The goal here is to enumerate the places that tourists visit frequently together, allowing tourism players to understand the interconnections among frequent destinations.
Tourism data from review and recommendation sites such as TripAdvisor is often voluminous, which limits the effectiveness of declarative approaches.
To overcome this problem, we proposed a distributed approach based on decomposing the search space of tourists’ visits. This strategy uses a distributed computing paradigm to efficiently enumerate the set of frequent patterns, reducing the processing time and, in particular, the communication between tasks (performed by the Solver Nodes).
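The article does not detail how the search space is decomposed, so the following Python sketch shows one common way to do it, under stated assumptions. Items are put in a fixed order, and subproblem i enumerates only the patterns whose smallest item is the i-th one. The subproblems are disjoint and need no communication with each other. The toy data `VISITS`, the function names, and the use of local threads as stand-ins for the Solver Nodes are hypothetical; the brute-force check inside each subproblem replaces the SAT solving used in the paper.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

# Hypothetical toy data: each transaction is the set of places one tourist visited.
VISITS = [
    {"Louvre", "Eiffel Tower", "Orsay"},
    {"Louvre", "Eiffel Tower"},
    {"Louvre", "Orsay"},
    {"Eiffel Tower", "Orsay", "Louvre"},
    {"Versailles"},
]

def solve_subproblem(index, items, transactions, min_support):
    """Find closed frequent patterns whose smallest item is items[index].

    Only items that come later in the order may be added, so two
    different subproblems can never produce the same pattern.
    """
    first = items[index]
    rest = items[index + 1:]
    found = []
    for size in range(len(rest) + 1):
        for combo in combinations(rest, size):
            pattern = frozenset((first,) + combo)
            cover = [t for t in transactions if pattern <= t]
            if not cover or len(cover) < min_support:
                continue  # frequency constraint violated
            if set.intersection(*cover) == pattern:
                found.append(pattern)  # closure constraint satisfied
    return found

def mine_distributed(transactions, min_support, workers=4):
    """Run one independent subproblem per item and merge the results.

    Threads are an assumption standing in for separate solver nodes.
    """
    items = sorted(set().union(*transactions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(
            lambda i: solve_subproblem(i, items, transactions, min_support),
            range(len(items)),
        )
        return [pattern for part in parts for pattern in part]

patterns = mine_distributed(VISITS, min_support=2)
```

Because each worker only returns its own list of patterns, the merge step is a simple concatenation. This is one way a decomposition can keep communication between tasks low.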
To the best of our knowledge, this paper presents the first attempt at a distributed approach based on symbolic artificial intelligence, specifically Boolean satisfiability (SAT), for enumerating frequent patterns from transactional databases. An extensive empirical evaluation on tourism data shows the effectiveness of our approach.