Author: Porto Lucila
JEL: L1, L4
What if Q-Learning algorithms set not only prices but also the degree of differentiation between them? In this paper, I tackle this question by analyzing the competition between two Q-Learning algorithms in a Hotelling setting. I find that most simulations converge to a Nash equilibrium in which the algorithms play non-competitive strategies. In most simulations, they optimally learn not to differentiate themselves from each other and to set a collusive price. An underlying deviation-and-punishment scheme sustains this implicit agreement. The results are robust to enlarging the action space and to introducing relocation costs.
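To make the setting concrete, the following is a minimal sketch of the two building blocks such a simulation would need: a Hotelling demand function and a tabular Q-Learning update. Several details are assumptions of this sketch and are not stated in the abstract: a unit-length linear city, quadratic transport costs with parameter `t`, zero marginal cost, and the function names `demand_share`, `profit`, and `q_update` with their default learning rate `alpha` and discount factor `delta`.

```python
import numpy as np


def demand_share(loc_i, loc_j, p_i, p_j, t=1.0):
    """Market share of firm i on the unit line [0, 1].

    Assumes quadratic transport costs t * (x - loc)^2 (an assumption of
    this sketch, not taken from the paper).
    """
    if loc_i == loc_j:
        # No differentiation: the cheaper firm serves everyone, ties split.
        if p_i < p_j:
            return 1.0
        if p_i > p_j:
            return 0.0
        return 0.5
    # Location of the consumer indifferent between the two firms.
    x_hat = (loc_i + loc_j) / 2 + (p_j - p_i) / (2 * t * (loc_j - loc_i))
    x_hat = min(1.0, max(0.0, x_hat))
    # The firm located on the left serves consumers to the left of x_hat.
    return x_hat if loc_i < loc_j else 1.0 - x_hat


def profit(loc_i, loc_j, p_i, p_j, t=1.0):
    """Per-period profit of firm i, assuming zero marginal cost."""
    return p_i * demand_share(loc_i, loc_j, p_i, p_j, t)


def q_update(Q, state, action, reward, next_state, alpha=0.1, delta=0.95):
    """Standard tabular Q-Learning update, applied in place to Q."""
    target = reward + delta * Q[next_state].max()
    Q[state, action] = (1 - alpha) * Q[state, action] + alpha * target
```

In a full simulation, each action would index a (location, price) pair and the state would encode the previous period's actions, which is what allows a deviation to be observed and punished in later periods. That encoding is likewise a modeling choice of this sketch.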