hyu2000

Well, it’s been a fun journey. DeepMind has made a movie about AlphaGo. Here I mostly try to trace back its academic lineage.

AlphaGo lineage

MCTS, RLGo, DQN (Atari, 2013), DNN for policy, AlphaGo (2016), AlphaGoZero, AlphaZero, MuZero, Capture The Flag, AlphaStar (2019).

Game theory, open-ended learning, game of skills

At this point, AlphaGo is super-human. In terms of machine “learning”, we have little to teach it anymore. However, interesting things can be said about open-ended learning. Deepmind has published a few papers:

They came from a game-theory point of view, and put AGZ-style training in a firmer theoretic footing. For example, some (SAI) had observed self-play may not always lead to monotonic increase in performance. This is explained by the “spinning tops”-shaped gamescape quite well.