Prof. Matteo Marsili (Abdus Salam ICTP)
Mass is a relevant variable in experiments on freely falling bodies; colour is not. Mass enters the laws that govern how objects fall; colour does not. How can one identify relevant variables when data is scarce and high-dimensional and the laws that govern the phenomena under study are unknown? To address this question, I will first argue that relevance can be quantified unambiguously in information-theoretic terms, on the basis of the data alone. Samples with maximal relevance, i.e. those that are most informative about the generative process, exhibit power-law distributions, suggesting a possible origin for the ubiquitous observation of such distributions. In addition, this opens the way to model-free approaches for extracting relevant information from high-dimensional datasets. This will be illustrated in the cases of protein sequences and multi-electrode array recordings of neural activity.