We consider the dynamics of Q-learning in two-player two-action games with a Boltzmann exploration mechanism. For any non-zero exploration rate the dynamics is dissipative, which guarantees that agent strategies converge to rest points that are generally different from the game's Nash equilibria (NE). We provide a comprehensive characterization of the rest-point structure for different games and examine the sensitivity of this structure to the noise due to exploration. Our results indicate that for a class of games with multiple NE, the asymptotic behavior of the learning dynamics can undergo drastic changes at critical exploration rates. Furthermore, we demonstrate that for certain games with a single NE, additional rest points (not corresponding to any NE) can appear, persist over a finite range of exploration rates, and disappear when the exploration rates of both players tend to zero.

In addition to Boltzmann Q-learning, we study adaptive dynamics in games where players abandon the population at a given rate and are replaced by naive players characterized by a prior distribution over the admitted strategies. We show how the Nash equilibria are modified by this turnover, and we study the changes in the dynamical features of the system for prototypical examples, such as different classes of two-action games played between two distinct populations.

Finally, we present a model of network formation in repeated games where the players adapt their strategies and network ties simultaneously, using a simple reinforcement learning scheme. We demonstrate that the co-evolutionary dynamics of such systems can be described via coupled replicator equations. We provide a comprehensive analysis of three-player two-action games, which constitute the minimum system size with non-trivial structural dynamics. In particular, we characterize the NE of such games and examine the local stability of the rest points corresponding to those equilibria. We also study general N-player networks via both simulations and analytical methods, and find that in the absence of exploration, the stable equilibria consist of star motifs as the main building blocks of the network. Furthermore, in all stable equilibria the agents play pure strategies, even when the game admits mixed NE. Lastly, we study the impact of exploration on learning outcomes and observe that there is a critical exploration rate above which the symmetric, uniformly connected network topology becomes stable.
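To make the first setting concrete, the following is a minimal simulation sketch of Boltzmann Q-learning in a two-player two-action game. The payoff matrices, learning rate, and temperature are illustrative assumptions rather than parameters taken from the analysis; the temperature T plays the role of the exploration rate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative payoff bimatrix (a coordination game with two pure NE);
# the matrices, learning rate, and temperature below are assumptions.
A = np.array([[1.0, 0.0], [0.0, 2.0]])  # row player's payoffs
B = np.array([[1.0, 0.0], [0.0, 2.0]])  # column player's payoffs

alpha, T, steps = 0.01, 0.2, 200_000    # learning rate, exploration temperature

def boltzmann(Q, T):
    """Softmax (Boltzmann) action probabilities at temperature T."""
    w = np.exp(Q / T)
    return w / w.sum()

Qx, Qy = np.zeros(2), np.zeros(2)       # stateless Q-values of the two players
for _ in range(steps):
    px, py = boltzmann(Qx, T), boltzmann(Qy, T)
    i, j = rng.choice(2, p=px), rng.choice(2, p=py)
    # Standard Q-updates toward the realized payoffs
    Qx[i] += alpha * (A[i, j] - Qx[i])
    Qy[j] += alpha * (B[i, j] - Qy[j])

print("rest-point strategies:", boltzmann(Qx, T), boltzmann(Qy, T))
```

Sweeping T in such a simulation is one way to probe the drastic changes in asymptotic behavior described above: at small T the strategies settle near one of the NE, while past a critical T the rest-point structure changes.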
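For the turnover model of the second part, one natural way to write the dynamics is two-population replicator dynamics with a relaxation term that replaces departing players by naive ones drawn from a prior. The sketch below uses this form; the symmetric game, turnover rate gamma, and prior p0 are assumed for illustration and are not the exact equations of the model.

```python
import numpy as np

# Replicator dynamics with turnover: at rate gamma, players leave and are
# replaced by naive players drawn from the prior p0. Symmetric game: both
# populations face the same payoff matrix A (illustrative values).
A = np.array([[3.0, 0.0], [5.0, 1.0]])
gamma = 0.05
p0 = np.array([0.5, 0.5])               # prior of incoming naive players

x = np.array([0.9, 0.1])                # population-1 mixed strategy
y = np.array([0.9, 0.1])                # population-2 mixed strategy
dt = 0.01

for _ in range(100_000):
    fx, fy = A @ y, A @ x               # expected payoff of each pure action
    dx = x * (fx - x @ fx) + gamma * (p0 - x)
    dy = y * (fy - y @ fy) + gamma * (p0 - y)
    x, y = x + dt * dx, y + dt * dy     # Euler integration step

print("turnover-shifted rest point:", x, y)
```

With gamma = 0 this reduces to plain replicator dynamics; a non-zero gamma pulls the rest point away from the NE toward the prior, which is the kind of modification studied above.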
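For the network-formation model of the final part, the sketch below illustrates one plausible form of coupled replicator equations in which each player's mixed strategy and tie weights grow in proportion to their relative payoffs. The precise coupling, the payoff matrix, and all parameter values are assumptions for illustration, not the exact equations derived in the text.

```python
import numpy as np

# Coupled replicator sketch for a 3-player, 2-action game: x[i] is player
# i's mixed strategy, c[i, j] the weight of the tie from i to j.
N, dt = 3, 0.01
A = np.array([[2.0, 0.0], [0.0, 1.0]])            # illustrative coordination game

rng = np.random.default_rng(1)
x = np.abs(0.5 + 0.01 * rng.standard_normal((N, 2)))
x /= x.sum(axis=1, keepdims=True)                 # normalized mixed strategies
c = np.full((N, N), 1.0 / (N - 1))
np.fill_diagonal(c, 0.0)                          # uniform initial tie weights

for _ in range(200_000):
    u = np.einsum('ij,ab,jb->ia', c, A, x)        # u[i,a]: payoff of action a for i
    w = np.einsum('ia,ab,jb->ij', x, A, x)        # w[i,j]: payoff of i matched with j
    np.fill_diagonal(w, 0.0)
    # Replicator updates: growth proportional to payoff above the own average
    dx = x * (u - (x * u).sum(axis=1, keepdims=True))
    dc = c * (w - (c * w).sum(axis=1, keepdims=True))
    x, c = x + dt * dx, c + dt * dc

print("strategies:\n", x.round(3), "\ntie weights:\n", c.round(3))
```

In this exploration-free sketch the strategies converge to pure actions, consistent with the observation above that stable equilibria involve pure strategies; adding an exploration (mutation) term to both updates is the natural way to probe the critical exploration rate at which the uniformly connected topology stabilizes.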