Peter Triantafillou
Towards Learned Data Management
Machine Learning (ML) is revolutionizing data management. Because many internal DB components essentially rest on a prediction function, ML promises to improve both functionality and performance, and examples of ML algorithms, models, and principles applied to this end abound. This talk gives an overview of our recent research in three key areas.
Specifically, we will discuss three topics. First, how best to adapt specific deep learning networks for fast and accurate Approximate Query Processing (AQP). Second, how adapting principles from Probabilistic Graphical Models can lead to new ways to perform physical joins, run analytical queries over joins, and/or facilitate downstream analytics over joins, in a manner that significantly outperforms the state of the art in terms of time, space, and scalability. Finally, we will present a general framework that deals effectively with the problems learned DB components/models face when new data follows distributions different from those they were trained on. The proposed framework can handle different types of neural networks, trained for a variety of learning tasks (e.g., AQP, selectivity estimation, synthetic data generation/sampling).