A possibilistic reward method for the multi-armed bandit problem

M. C. Martín Blanco, A. Jiménez Martín, A. Mateos Caballero

The multi-armed bandit problem has been studied extensively in statistics and has become fundamental in several areas of economics and artificial intelligence. The literature offers many allocation strategies (policies) for this problem, from both frequentist and Bayesian perspectives. In this paper, we propose a novel allocation strategy, the possibilistic reward method, together with a dynamic extension. The uncertainty about each arm's expected reward is first modelled by means of a possibilistic reward distribution. Next, a pignistic probability transformation converts these possibilistic functions into probability distributions. Finally, a simulation experiment is carried out by sampling from each arm's probability distribution to identify the arm with the highest expected reward, which is then pulled. A numerical study shows that the proposed method outperforms other policies in the literature in all tested scenarios.
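
The three steps described in the abstract (possibilistic modelling, pignistic transformation, sampling and pulling the best draw) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the form of the possibilistic functions, so the Hoeffding-style shape in `possibility`, the discretised grid `GRID`, the Bernoulli rewards, and all function names are assumptions made purely for illustration. Only the pignistic step follows the standard transformation of a finite possibility distribution.

```python
import math
import random

# Assumption: expected rewards lie in [0, 1] and are discretised on a grid.
GRID = [k / 100 for k in range(101)]


def possibility(mean, pulls):
    """Hypothetical possibility function for one arm's expected reward.

    Assumption: a Hoeffding-style shape exp(-2 n (x - mean)^2), rescaled so
    that its maximum is 1. An arm never pulled is fully uncertain.
    """
    if pulls == 0:
        return [1.0 for _ in GRID]
    raw = [math.exp(-2.0 * pulls * (x - mean) ** 2) for x in GRID]
    peak = max(raw)
    return [value / peak for value in raw]


def pignistic(poss):
    """Pignistic transformation of a finite possibility distribution.

    With possibilities sorted as pi_1 >= ... >= pi_n and pi_{n+1} = 0, the
    element of rank i receives sum over j >= i of (pi_j - pi_{j+1}) / j.
    """
    order = sorted(range(len(poss)), key=lambda i: poss[i], reverse=True)
    probs = [0.0] * len(poss)
    tail = 0.0
    for rank in range(len(order), 0, -1):  # rank j = n, n-1, ..., 1
        current = poss[order[rank - 1]]
        following = poss[order[rank]] if rank < len(order) else 0.0
        tail += (current - following) / rank
        probs[order[rank - 1]] = tail
    return probs


def choose_arm(means, pulls, rng):
    """Draw one value per arm from its pignistic distribution; pick the largest."""
    best_arm, best_draw = 0, -1.0
    for arm, (mean, n) in enumerate(zip(means, pulls)):
        probs = pignistic(possibility(mean, n))
        draw = rng.choices(GRID, weights=probs, k=1)[0]
        if draw > best_draw:
            best_arm, best_draw = arm, draw
    return best_arm


def run(true_probs, horizon, seed=0):
    """Simulate Bernoulli arms (an assumption) and return pull counts per arm."""
    rng = random.Random(seed)
    means = [0.0] * len(true_probs)
    pulls = [0] * len(true_probs)
    for _ in range(horizon):
        arm = choose_arm(means, pulls, rng)
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        pulls[arm] += 1
        means[arm] += (reward - means[arm]) / pulls[arm]  # running sample mean
    return pulls
```

Calling `run([0.2, 0.8], 200)` returns the number of pulls given to each arm over 200 rounds. As an arm is pulled more often, its possibility function narrows around the sample mean, so the sampled values concentrate there and exploration of that arm decreases.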

Keywords: multi-armed bandit problem, possibilistic reward, numerical study

Scheduled

X09.3 Statistical Inference II
7 September 2016, 17:30
Room 21.08

Other papers in the same session

Uniformly consistent tests for contamination neighbourhoods

H. Inouzhe Valdes

Corrección de residuos atípicos en modelos de frontera estocástica transversales (Correction of outlying residuals in cross-sectional stochastic frontier models)

A. Shatla, N. Corral Blanco, C. E. Carleos Artime

Latest news

22/06/16
Programme of SEIO 2016 and the X Jornadas de Estadística Pública

The programme of the XXXVI Congreso Nacional de la SEIO and the X Jornadas de Estadística Pública is now available on the website.

16/06/16
Deadline for registering at the reduced rate.
The Organising Committee of the XXXVI Congreso Nacional de Estadística e Investigación Operativa and the X Jornadas de Estadística Pública, to be held in Toledo from 5 to 7 September 2016, reminds participants that Friday, 1 July 2016 is the deadline for registering at the reduced rate. After that date, registration remains open at the standard rate.
25/05/16
Accommodation in university residences

The Universidad de Castilla-La Mancha offers attendees of the XXXVI Congreso Nacional de Estadística e Investigación Operativa and the X Jornadas de Estadística Pública the option of staying at the Colegio Mayor Gregorio Marañón, located in the historic centre of Toledo.


Organised by

UCLM
SEIO
