Today I want to introduce my faithful audience to a very special person. He has an h-index of about 80, and his two most-cited papers total more than 40,000 citations between them; his works have made the front cover of Nature six times and that of Science twice; his PNAS work on hippocampal amnesia was listed by Science among the 10 breakthroughs of 2007. He graduated in computer science from Cambridge at age 21 with a double first, was elected a fellow of the Royal Society, of the Royal Academy of Engineering and of the Royal Society of Arts, and was appointed Commander of the Order of the British Empire for his scientific merits. There is little doubt that this man, at the still young age of 45, is an outstanding scientist. However, strictly speaking, science is not his main occupation. For most of his life, and still today, he has been a gamer.
Demis Hassabis was born in North London, in July 1976, to a Greek Cypriot father and a Chinese Singaporean mother. From the age of 4 he displayed an exceptional talent for chess, and at 13 he achieved the title of master, ranked the world's second-best under-14 player. He led various junior chess teams, while completing his secondary-school exams at 15, two years early. When he applied to Cambridge University they found him too young, and required him to take a gap year. So, Demis went to Bullfrog Productions, where at age 17 he designed the videogame Theme Park, which sold more than 10 million copies worldwide. He was then accepted at Queens' College, Cambridge for his undergraduate studies, and also played on the Cambridge chess team for three years, winning several competitions. After his degree he was hired at Lionhead Studios, where he worked on one of the first AI-based computer games, Black & White, which made close to 20 million dollars in revenue in its first year. In 1998 he left Lionhead to create his first company, Elixir Studios, where he designed other successful games, like the popular Evil Genius. In 2005 Hassabis sold Elixir. In the meantime, he had won the Pentamind five times and the Decamentathlon twice, multi-game best-player competitions awarded at the Mind Sports Olympiad.
At that time, Hassabis was one of the best gamers in the world, a true Magister Ludi as in Hermann Hesse's famous book, Das Glasperlenspiel. However, das wirkliche Spiel, the actual meaning of the game, was not yet clear in his mind. In Hesse's novel, the glass beads are but a symbol of the past: they were used in ancient times but disappeared from the modern game, replaced by abstract spoken formulas connecting seemingly disparate experiences, such as a Bach concerto and the shape of a plant. From his abstract province of Castalia, Demis Hassabis was starting to grasp a link between the virtual space of events created in a video game and the way the human brain organizes our experiences in space and time. Seeking inspiration for his innovative AI algorithms in the functioning of the human brain, in 2005 he went to UCL to study under Eleanor Maguire, and obtained a PhD in cognitive neuroscience; he then spent some time at MIT with Tomaso Poggio; finally, in 2009 he became a postdoc in the group of Peter Dayan. In short, he touched base with some of the greatest theoretical neuroscientists of our times. During these years, he co-authored a landmark paper (PNAS 104(5) (2007) 1726) showing that patients with a damaged hippocampus, a condition causing amnesia, are unable to imagine themselves in new experiences. This was the starting point for his new theoretical model of episodic memory, in which the construction of a coherent and complex scene is the key to both recalling the past and imagining the future. Later, he generalized these ideas into a "mind-simulation engine" needed to construct imagined events and scenarios. Then, in 2010, he left academia and founded DeepMind.
With a few old friends as partners (Shane Legg and Mustafa Suleyman as co-founders, soon joined by David Silver, his old associate from Elixir), Hassabis aimed his new creature at "solving intelligence, and then using this intelligence to solve everything else". But in practice, they started playing games. Again. Their first application required the AI code to just "watch" the screen on which a sample game was played, with no prior knowledge of the game rules, in the hope that after some time the program would become an expert, just as a human player would. To begin with, they picked old arcade games from the '70s and '80s with very primitive graphics, such as Breakout, Space Invaders, and even the famous Pong (anybody remember the video clip by Air?). They eventually published Agent57, an AI agent capable of beating any human on all 57 games of the Atari 2600. Such a fixation on games as the key to conveying human intelligence into a computer model has never been just a ruse at DeepMind. In 2015 they produced AlphaGo, whose improved version beat a 9-dan grandmaster of Go, Lee Se-dol, by a score of 4-1 the following year. In 2017 they introduced AlphaZero, which learns game strategy purely by playing against itself. Go is a very big deal in the Far East, considered far more difficult and complex than chess, and this research apparently had a huge psychological impact on the local culture. A computer beating a grandmaster of Go would translate in our Western culture roughly as a computer writing a credible 39th play by Shakespeare. It has been said that this event gave the definitive push to the Chinese government to invest massively in AI. Though difficult to verify, the number of Chinese AI patents did increase by almost 3000% between 2010 and 2019, now ranking third in the world, with a nearly four-fold increase in published papers, almost double the US output over the same period.
DeepMind also went ahead with practical applications, each time providing astonishing results. The most celebrated is AlphaFold2, released in 2020, an AI system that uses a nest of deep-learning subnetworks embedded in a single model with iterative refinement to predict the 3D structure of proteins. As of July 2022, the model has predicted the structures of more than 200 million proteins, a huge step forward and a remarkable technical achievement, although the claim of having "solved the protein folding problem" is definitely an overstatement. This is one problem with any AI-based solution: the results obtained may be excellent, but we learn nothing of the details. In this case, we still do not know how a protein folds; we only get the final, probably correct result (see e.g. Nussinov et al., AlphaFold, AI and Allostery, J Phys Chem B 2022). Google acquired DeepMind in 2014 for some $500 million, a necessary move, since the hefty computing power required could only be provided by a high-tech giant. Few other practical applications have come out of DeepMind; one example is a collaboration with English hospitals on an NHS program for patient-records management. The fact is, DeepMind is 99% a research powerhouse, hiring highly paid experts and scientists for expensive, frontier projects. It loses hundreds of millions of dollars each year for Alphabet, Google's parent company, with negligible revenues. Demis Hassabis continues to invest in games, one of the latest projects being AlphaStar, an AI whose only purpose was to reach the level of Grandmaster in the popular videogame StarCraft II.
As already said, however, gaming is never just for the sake of the game. DeepMind's main, if not only, current focus is on reinforcement learning, a field in which they are making breakthrough advances. Apparently, Google does not mind losing money on this part of its activities. (Google's practical AI applications are mostly developed by another subsidiary, Google Brain, which created the popular AI software TensorFlow.) One needs to set aside the old stereotype that computer programs simply follow fixed rules and can do only what humans have programmed them to do, hence lacking any capacity for creativity. AI can learn from enormous sets of unstructured data, without a predesigned search space or rules. What the code will learn, and how it will behave after learning, is very difficult, practically impossible, to predict in advance. If you browse the internet, you will see terms such as artificial intelligence, machine learning, neural networks and deep learning playing a widespread role, but they tend to be used interchangeably in conversation, leading to confusion about the nuances between them. The metaphor often used is that of the Russian Matryoshka dolls, in which each of the four terms above contains the following one. Hence, AI is the master field which contains all the variants; and deep learning, at the other extreme, is the most specialized version of neural networks (NN). However, NNs are anything but the latest novelty in computer science.
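To give a flavor of what reinforcement learning means in its simplest, tabular form, here is a toy sketch (every state, reward and hyperparameter is invented for illustration; it has nothing to do with DeepMind's actual systems): an agent in a five-state corridor learns, purely from trial, error and a single delayed reward, that "go right" is the winning strategy.

```python
import random

# Toy tabular Q-learning on a 5-state corridor: the agent starts at state 0
# and receives reward +1 only upon reaching state 4; actions: 0 = left, 1 = right.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1   # learning rate, discount, exploration rate

def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q[state][action], all zeros at start
for _ in range(500):                        # training episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy action choice, breaking exact ties at random
        if random.random() < EPS or Q[s][0] == Q[s][1]:
            a = random.randrange(2)
        else:
            a = 0 if Q[s][0] > Q[s][1] else 1
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

policy = [0 if Q[s][0] > Q[s][1] else 1 for s in range(N_STATES)]
print(policy[:GOAL])   # learned greedy policy in the non-goal states
```

Nobody told the agent the rules of the corridor: the preference for "right" emerges only from the reward signal propagating backwards through the Q-table, which is the essence of the approach DeepMind scaled up, with deep networks in place of the table, to Atari and Go.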
The world of artificial NNs began already in the '40s and '50s, with the pioneering works of McCulloch & Pitts and of Hebb, who sought to imitate real neurons in the brain; in 1958 Rosenblatt created the perceptron, the first working example of an artificial classifier; however, in 1969 Marvin Minsky and Seymour Papert published their seminal work demonstrating that a perceptron with a single layer of trainable weights can only solve linearly separable tasks, and cannot reproduce the logical XOR. This led to a stall, and the field of NNs was practically dead for 20 years. In the mid-'80s it was resurrected by David Rumelhart, who applied the backpropagation algorithm (a theoretical idea proposed a decade earlier by Paul Werbos) to train multi-layer NNs. Since then, with key advances by Elman, Kohonen and many others, the science around NNs started flourishing again. I first encountered the NN programming model around 1988, stumbling by chance on a beautiful review paper, Brain without mind, by John Clark and Johann Rafelski (Phys Rep 123 (1985) 215). With the help of my friend Luciano Tondinelli, a brilliant mathematician and engineer who left this world too early, we programmed a 3-layer perceptron with backpropagation training. Despite the clumsiness of Fortran77, our NN was capable of recognizing the 26 letters of the alphabet, even with added noise and mistakes (we tried a very distorted A and a very distorted R, and the IBM-3090 running our simple code was able to tell one from the other with decent accuracy).
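For readers who never wrote one, a minimal version of such a 3-layer network trained with backpropagation fits today in a few lines of Python with NumPy (replacing our Fortran77); here it learns the very XOR function that Minsky and Papert showed a single-layer perceptron cannot compute. All hyperparameters are arbitrary illustrative choices.

```python
import numpy as np

# A toy 2-8-1 sigmoid network trained by backpropagation on the XOR problem.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)     # input -> hidden weights
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)     # hidden -> output weights
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0

for _ in range(20000):
    # forward pass: two layers of matrix algebra plus nonlinearities
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: gradients of the squared error, layer by layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(np.round(out).ravel())   # the four XOR outputs: [0. 1. 1. 0.]
```

Note that the entire machinery is matrix products and elementwise functions; a modern deep network is, structurally, the same thing with many more layers and astronomically more weights.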
Therefore, I was very much surprised when I realized that all the current hype around deep learning is "just" about neural networks. When, out of curiosity, I started reading about the marvels of AI a few years ago, I thought that somebody must in the meantime have made a true breakthrough, invented some crazy, entirely new cognitive model, something that went beyond anything we had until then. In fact, a deep-learning machine is "nothing more" than a humongous, multi-layer, multi-level, possibly convolutional and recurrent, linear-algebra construct. A gigantic network of neural networks. What made all the difference with respect to our naïve age of perceptrons has been the explosion in computing power over recent years, not least the advent of GPUs and, very recently, TPUs (again, a Google trademark). Such vast computing power (not many official figures are available, but the combined Google data centers used 15.5 TWh of electricity in 2020, about 1/3 of the annual consumption of the whole city of New York) is what allows complex AI systems to run, integrating many nested levels of NNs, with dozens or hundreds of hidden layers and continuously refined training modes and schedules. Just have a look at the schematic outline of an image-processing model like Inception-v3. IBM Cloud offers an open-access platform called Watson, allowing anybody to run remote applications on its servers, under both free and paid-access models.
I am running out of time and space to even start hinting at the possible criticisms of this approach. In your spare time, you may have a look, for example, at this video from Lille's DevFest 2019 (but if you do, please watch until the end!). One common objection is that each AI application has to be trained on specific sets of data, and thus becomes a target-oriented tool: would it be possible for the same AI module, say, to rank cancer patients, and to find the best coffee prices in its spare time? An initial answer to multitasking is coming, again, from a recent DeepMind prototype, the Gato generalist agent. According to the report that appeared last November, the same network with the same weights can play Atari, caption images, chat with humans, stack blocks with a real robot arm, and much more, deciding based on its context whether to output text, joint torques, button presses, or other tokens.
While terrible and unexpected (?) wars loom over our present, and the "pedagogical province" of Castalia, with its dream of a synthesis of arts and sciences, may seem to be waning in a ruthless competition over technology, I believe the future of the Glasperlenspiel will be bright with breakthrough ideas, waiting for new and ever more innovative Magistri Ludi.