Introducing the team
DM2L is a team created in 2012 whose scientific activity is devoted to Knowledge Discovery from Data using automatic or semi-automatic techniques. This covers data mining, machine learning, pattern recognition, statistical learning, data analysis, data archeology, etc. The team's research interests are mainly data mining and machine learning.
Data mining is an area of research that emerged in the early 1990s from the need for methods of knowledge discovery from large amounts of data. Initially related to disciplines such as statistics, machine learning, and databases, data mining is now a mature field with its own major annual conferences (ACM SIGKDD, IEEE ICDM, SIAM DM, ECML/PKDD, PAKDD) and well-established journals (Data Mining and Knowledge Discovery, ACM Transactions on Knowledge Discovery from Data, IEEE Transactions on Knowledge and Data Engineering). Data mining methods are often regarded as unsupervised processes intended to describe, summarize, and raise hypotheses from data.
The field of machine learning covers the development, analysis, and implementation of methods that learn to perform a task from examples. Its main objective is to build systems capable not only of learning but also of generalization, i.e. the ability to extend to the whole population what has been observed in a sample. Basic research in recent decades has led to the development of many tools for practitioners from varied fields such as industrial production. As designers of learning algorithms, we pay particular attention to the natural mechanisms of learning and generalization, the solid mathematical foundations of the methods, and rigorous statistical validation on data not used to fit the model.
Our results span theory, methodology, algorithms, software, and applications. Our guiding principle is to help data owners throughout the interactive process of knowledge discovery from data. As these processes require combining a wide range of descriptive or inductive paradigms (pattern extraction, classification, statistical learning, including Bayesian networks, ensemble methods, kernel methods, connectionist methods, etc.), the DM2L team brings together ten permanent researchers whose computing skills are complementary.
More specifically, the team works on the following problems:
- Fundamentals of constraint-based mining
- Mining of n-ary relations or Boolean tensors
- Spatio-temporal data mining
- Dynamic attributed graph mining
- Ensemble learning
- Learning probabilistic graphical models
- Unsupervised learning with and without constraints
In addition to the many major annual conferences common to data mining and machine learning, the main conferences specific to machine learning are ICML, UAI, and NIPS. Similarly, in addition to the major journals on KDD, there are a few well-established journals dedicated to learning methods (Journal of Machine Learning Research, Pattern Recognition, Machine Learning, Neurocomputing).
Significant contributions in data mining and machine learning also appear at conferences and in journals devoted to Artificial Intelligence (e.g., IJCAI, AAAI, ECAI, Artificial Intelligence) and data management (e.g., VLDB, ACM CIKM, Information Systems).
This research is developed in close connection with the analysis of real data: quantitative and qualitative empirical study on real data is essential.
While the DM2L research team mainly develops methods and algorithms rather than applications, it works with data owners from several domains. Although the life sciences and molecular biology have been particularly targeted in recent years, we are also interested in data related to the "Intelligences Urban Worlds" labex (eco-technologies and "urban monitoring", transport and mobility, the emergence of communities and the analysis of social interactions, understanding of the human impact on the environment and biodiversity, etc.), while pursuing applications in areas such as health and the design and monitoring of manufactured goods. This diversity reflects our choice to remain methods-centered and to develop generic algorithms applicable to a broad spectrum of applications.