J. Martín-Fernández, M. Vives-Mestres, R. Kenett
Association rule (AR) mining is one of the major techniques for detecting and extracting useful information from large databases of unstructured semantic data. Measures of interestingness are indices that quantify the strength of an AR. Because any AR can be expressed as a contingency table, compositional techniques are a natural approach to defining these measures. Compositions are vectors whose elements, called parts, carry relative information about a whole. Researchers generally agree that the geometry of the simplex is based on log-ratio coordinates. We introduce log-ratio measures and analyse their major properties. We give a contrast to confirm the significance of an AR, together with an interpretation of the effects between the itemsets. The relation between these measures and other common measures facilitates the interpretation of negative and positive effects between itemsets. An example illustrates the performance of these measures of interestingness.
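The idea of reading an AR as a contingency table can be sketched in a few lines. The abstract does not give the formulas for the proposed log-ratio measures, so the code below is only an illustration under stated assumptions: it computes three common measures (support, confidence, lift) for a rule A -> B from a 2x2 table, and uses the log odds ratio as a stand-in for a log-ratio quantity. The function name, argument names, and counts are hypothetical.

```python
import math


def rule_measures(n_ab, n_a_notb, n_nota_b, n_nota_notb):
    """Measures for the rule A -> B from the four cells of a 2x2 table.

    Assumption: all four counts are strictly positive, since the
    log odds ratio is undefined when a cell is zero.
    """
    n = n_ab + n_a_notb + n_nota_b + n_nota_notb
    support = n_ab / n                         # P(A and B)
    confidence = n_ab / (n_ab + n_a_notb)      # P(B | A)
    lift = confidence / ((n_ab + n_nota_b) / n)  # P(B | A) / P(B)
    # Illustrative log-ratio quantity (not necessarily the authors' measure):
    # the log odds ratio of the table.
    log_odds_ratio = math.log((n_ab * n_nota_notb) / (n_a_notb * n_nota_b))
    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "log_odds_ratio": log_odds_ratio,
    }


# Hypothetical counts: 40 transactions with A and B, 10 with A only,
# 20 with B only, 30 with neither.
measures = rule_measures(40, 10, 20, 30)
scaled = rule_measures(400, 100, 200, 300)
```

In this sketch the log odds ratio is zero when A and B are independent, positive for a positive association, and negative for a negative one. It is also unchanged when every cell is multiplied by the same constant (compare `measures` and `scaled`), which matches the compositional view that only relative information matters.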
Keywords: compositional data, multivariate analysis, log-ratio, simplex
Scheduled
M08.2 Multivariate Analysis and Classification Group IV
6 September 2016, 15:20
0.09 - Project Room 2