This article is based on a podcast episode. You can listen to or watch the episode here.
DeepMind, a London-based subsidiary of Google’s parent company Alphabet, has done it again: it has pushed the boundaries of what can be achieved in machine learning, moving us one step closer to artificial general intelligence.
Artificial general intelligence (or AGI, for short) would be a single algorithm that could perform well at any cognitive task that a person could do. For example, this hypothetical AGI algorithm could recognize your face, converse with you about current events, translate what you’re saying into another language, make reasonable trades for your retirement portfolio, beat you at chess, and drive you safely from one location to another. In effect, the AGI’s intellectual capabilities would be indistinguishable from yours or mine.
While some optimists like the author and entrepreneur Ray Kurzweil suppose that AGI is less than a decade away, a large survey of machine learning experts suggests that we might not have AGI until around the year 2040, if at all. That said, since DeepMind’s founding ten years ago, the company has been methodically (and surprisingly rapidly!) taking steps toward attaining AGI by focusing on designing deep reinforcement learning algorithms that can excel at broader and broader ranges of tasks.
Let’s take a moment to clarify what deep reinforcement learning is before we continue. Deep reinforcement learning is a collection of modeling techniques that combines two machine learning subfields: deep learning and reinforcement learning. This combination is particularly potent for designing artificial intelligence algorithms that need to make real-time decisions using real-world data.
This is because the first field, deep learning, excels at distilling the most relevant signals from large quantities of noisy data such as video or sound, just as the human brain regularly, and seemingly effortlessly, picks out the most relevant information from all of the raw visual and auditory data detected by our eyes and ears.
The second field, reinforcement learning, is proficient at selecting a good option from among countless possible actions in order to achieve a particular goal, much as you and I must choose from nearly infinite options in everyday tasks such as deciding what to have for dinner, how to get to the grocery store to buy the ingredients, and so on. Thus, a deep reinforcement learning algorithm has the capacity both to identify signals within noisy input data and to take immediate actions based on those signals, making it well suited to tackling real-world problems in real time.
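To make the reinforcement learning half of that pairing a little more concrete, here is a minimal sketch of tabular Q-learning, a classic reinforcement learning method, running on a made-up five-cell corridor. The corridor, the reward of 1 at the rightmost cell, and every name and hyperparameter below are assumptions chosen purely for illustration; none of it is taken from DeepMind's code.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, goal at cell 4
ACTIONS = (-1, +1)    # step left or step right
GOAL = N_STATES - 1


def step(state, action):
    """Move within the corridor; reward 1.0 only on reaching the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values q[state][action_index] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore occasionally (and on ties); otherwise take the best-known action.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward
            # reward + discounted best value of the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action (-1 or +1) for each non-goal cell."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:GOAL]]
```

Calling `greedy_policy(train())` returns the action the agent has learned to prefer in each cell. A table like this only works when the situations can be enumerated; with raw pixels or audio as input, the table is replaced by a deep network, which is where the "deep" in deep reinforcement learning comes in.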
All right, now let’s return to DeepMind and the company’s use of deep reinforcement learning to make progress toward an AGI algorithm that would have all of the intellectual abilities of a person. One of DeepMind’s first algorithms to generate a major publicity splash was their Deep Q-Network (DQN), first described in 2013: a single deep reinforcement learning algorithm that went on to match or exceed human performance on dozens of different Atari video games, including beloved classics like Pong, Space Invaders, and Breakout.
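The central idea behind deep Q-learning can be sketched in a few lines: a model predicts a value for every available action, and training nudges the prediction for the action actually taken toward a one-step target built from the reward and the best predicted value of the next state. The sketch below uses a tiny linear model in plain Python as a stand-in for a deep network, and it leaves out ingredients that a real system relies on, such as experience replay. All function names, the weight layout (one row of weights per action), and the hyperparameters are illustrative assumptions, not DeepMind's implementation.

```python
def q_values(weights, state):
    """Predict one value per action; a linear stand-in for a deep network."""
    return [sum(w * x for w, x in zip(row, state)) for row in weights]


def td_target(weights, reward, next_state, done, gamma=0.99):
    """One-step target: reward plus the discounted best predicted next value."""
    if done:
        return reward  # no future value once the episode has ended
    return reward + gamma * max(q_values(weights, next_state))


def sgd_step(weights, state, action, target, lr=0.1):
    """Nudge the taken action's prediction toward the target.

    This is one gradient step on half the squared error between the
    predicted value of (state, action) and the target.
    """
    error = q_values(weights, state)[action] - target
    updated = [row[:] for row in weights]
    updated[action] = [w - lr * error * x for w, x in zip(weights[action], state)]
    return updated
```

In use, an agent would repeatedly observe a transition (state, action, reward, next state), compute `td_target`, and apply `sgd_step`. Swapping the linear model for a convolutional network that reads screen pixels gives the general shape of an Atari-playing agent.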
A few years later, in 2016, DeepMind made headlines again when their AlphaGo algorithm defeated one of the world’s best players of Go, a board game that is extremely popular in Asia and that requires a great deal of “human intuition” to play at a high level. As a side note, I highly recommend the documentary AlphaGo, which has a perfect 100% rating on Rotten Tomatoes, is available via many digital film providers such as Netflix, and is even free to watch on YouTube.
Now, Go is only one game, so defeating one of the all-time great human Go players may not sound like progress toward an AGI algorithm that excels at a broad range of tasks. However, AlphaGo was only a stepping stone toward DeepMind’s AlphaZero algorithm, a single deep reinforcement learning model with superhuman ability at several board games: not just Go, but also chess and shogi, a Japanese chess-like game.
All right, so AlphaZero was the world’s most advanced and exciting algorithm for approaching artificial general intelligence until DeepMind published their MuZero algorithm only a couple of weeks ago. Be sure to tune in to my Five-Minute Friday episode next week: having set the stage today by introducing AGI and MuZero’s preeminent predecessors, we can dive specifically into the now-state-of-the-art MuZero algorithm, as well as tons of great resources for learning about deep reinforcement learning theory and applications in detail yourself. See you next week on Five-Minute Friday!
DeepMind’s blog post on MuZero is here and their Nature paper is here.