BlizzCon 2018 StarCraft II: What’s Next Panel Transcript
Day: Thank you, Scip. Now Oriol, we’re going to pass the baton to you; but before we get to talk about DeepMind — after this, we’re going to be doing a Q&A and there’s a microphone… I believe somewhere over here. I can’t see it, because these lights are really bright; but I have confidence that there’s a microphone over there. We’d love to invite you up there to ask questions to our lovely panelists. Now Oriol, please talk to us about what’s been going on with DeepMind.
Oriol: Right. So it’s been quite an exciting couple of years. Two years ago, we announced that we would collaborate with Blizzard to release StarCraft II as an environment for building artificial intelligence; and last year, we actually released it; and it’s been an amazing experience to see the community come together. There’s a very nice Discord channel where people do all kinds of things in StarCraft that are maybe not what you would expect you could do, like building bots and iterating, and having a ladder for AIs, and so on. So that’s been great to see. And then, after the release, we started working a bit more on other aspects of the game; but first, we wanted to get the release right.
Day: Yeah, it must have been a really interesting 2 years… like going from the very first day trying to even approach the StarCraft II problem to now. What are some of the big highlights in your eyes?
Oriol: We started from just understanding what is needed to expose StarCraft II to a computer, not to a human. So that required a bit of work, and what’s been sort of exciting for DeepMind… we have a bit of history of using video games for researching artificial intelligence; and what’s been super nice is that StarCraft ticks a lot of boxes of difficulty compared to games that were previously declared solved, or at least played at a very high human level.
So we started with Atari, which are fairly simple old games, but it was still an impressive achievement when we did that 3 or 4 years ago; and then, as people might remember… 2 years ago we beat the world champion at Go; but obviously, Go is a game where you, for instance, see the whole board; whereas in StarCraft you do have to go out and seek information. So that aspect, in particular, is going to be one of the challenging things to deal with for the current AI technology.
Day: Yeah. I’m curious, because in a game like Go there are 361 board positions. So it’s really clear what a move is, you just place the stone; but when it comes to something like StarCraft, how do you even define what an action is that a computer might want to take, or that a player would want to take?
Oriol: So for this, we took a very human-centric approach. We look at how you play the game. I was a gamer back in the day. Actually, I played quite a bit of StarCraft 1, and we just tried to emulate this idea of mouse clicks and key presses, and dragging rectangles across the screen, and so on.
So when we defined the API with Blizzard, we tried to really follow closely what someone playing with a keyboard and a mouse would look like. We also have an interface that is more like programming, but what we care about is playing like a human; because that’s where the challenges of StarCraft are: span of attention, the economy of where you have your camera, micro versus macro… All these are much harder when you expose it with the human API, as we call it.
Day: Well, what’s the progress been like so far?
Oriol: So as I was saying, the first things we open-sourced were these mini-games, which were really pieces of StarCraft that any player should master. So simple things like moving units around, moving groups of units around, also dealing with economy, like building buildings to mine more minerals; or building as large an army as you can in a limited time span; and since the release, we actually have gone quite far in these mini-games and we have essentially achieved what I would call very competitive human performance, like maybe grandmaster level or so; but these are obviously extremely limited sets of things in StarCraft.
Day: And is this just to get a feel for it, to make sure that the framework that you’re working with is causing proper learning to happen?
Oriol: Exactly. So we learned a lot. A lot of the difficulties of the action space, the mouse clicks, were exposed thanks to this; and so it was a very good first stepping stone; and also for the community, it’s a bit less daunting to play these mini-games than to play the full 1v1. Because if you do try to play 1v1, I’ll show you… when we released, we also attempted to see what would happen if you play StarCraft, like the full game; and–
Day: What does 1v1 look like?
Oriol: A year ago, it looked a bit like this. This was an agent that was trained mostly to imitate humans; and you can see it’s kind of moving the camera a little bit, building random buildings in random locations. It’s quite fun to see for a little bit, but it’s not even going to produce units sometimes. So it’s very limited, and that’s what it was like maybe a year ago; but the simplest approach was to take human data and try to imitate how humans play the game. That was what came out.
Day: I can imagine it’s really hard to start learning from scratch, because there are millions of different combinations of click and drag, and release; and then, what button you press in what order.
Oriol: Right, right, and just remember we are playing the game by looking at the screen. We don’t see the pixels exactly. We see kind of a sketch version of StarCraft, so very much different from the cool art that we saw. It’s kind of the opposite way. It’s kind of ugly, but the computers understand it; but really, between the observations being the raw imagery and the actions being so high dimensional, millions of actions that you need to decide between absolutely every second… the game is quite difficult to play.
Day: And what evolved from this? Did the AIs learn that cannon rushing is the most overpowered strategy in StarCraft? Is that what happened?
Oriol: So I’ll show you a bit of what we went through in the last few months. The first time we actually scaled this up, we played against bots. So all the bots that are built into the game (as we know, many of us started playing against one bot and then eventually 7, and what not). So surprisingly perhaps, or perhaps not, we trained an agent to play against these bots, and it decided to do a very simple strategy.
Day: Look it’s NaNiwa.
Oriol: So agents obviously are lazy if they can be; but this agent, in particular, is not just simply attacking with its probes; it’s actually doing some micro. Essentially, exploiting the built-in AI; and this is not the easiest built-in AI. This is actually Cheat Insane, which is the hardest AI there is. So just with its probes, it actually manages to beat the hardest AI in the game. Obviously, not in a very elegant way. It’s going to finish. Yes. So this tells you a lot about the game, and it also tells you a lot about these bots that people make. They’re quite exploitable. They rely on single strategies. They might not account for certain variability in how you build buildings, and so on. So it’s quite difficult to play; and for humans, it’s so natural; but for these agents it’s quite a struggle.
Day: I’m now getting concerned about queuing up against any of you on the ladder going: “I have a build order… look at this.”
Oriol: Well, that’s right. Try it at your own risk; but anyways, after a little bit of training and refinement, we started seeing a bit more like normal play (let’s say). So the next video shows an agent that is actually playing somewhat… it’s perhaps not the best at macro or micro; but what’s happening in this game is… you notice there’s a pylon that is from the enemy, so this agent is being cannon rushed.
Again, the agent just through learning understands that… well, as soon as these cannons come up, I need to pull all my probes and start defending, which is something where, if you are a programmer, you need to start accounting for all sorts of… well, if there’s 1 pylon and 2 cannons, should I attack or not. The cool thing is this agent learned this behavior, and it’s defending this cannon rush in particular, but it also does other things like… well, after I defeated the cannons, I go back to mining because I eventually want to win the game; and then, there’s a zealot coming out that starts clearing and so on.
Day: And then there’s also like the subtlety of leaving some workers back to mine not sending the entire crew over right away; and that’s a very easy thing to overreact to — I feel like, as a human player.
Oriol: Right, and in the previous video that bot almost sent all the probes and pulled them back; and back and forth, and that was quite exploitable. So it is very nice that we can learn these sorts of behaviors; but perhaps what I’m most excited about is the way the agents play. So this next video shows maybe later-game, and it’s quite interesting because what we’re going to see is the first-person view of the agent. So let’s see if it plays.
So here you’re really seeing what the agent sees. So you see the agent is actually playing like a human. In fact, it starts by imitating how humans play; and then, refining its play. So you see it’s kind of warping in units, moving its camera, it has this thing where, when you move units, you kind of spam-click a little bit. So you’ll see all these behaviors, and remember every single action you see, like building a new Nexus, moving the units, moving the camera, comes essentially through looking at the screen, passing it through a neural network, and deciding I’m going to click at this point at this time, and it’s very nice to see that. Just by learning from humans first, and maybe a bit of fine-tuning, it starts to level up, like revising build orders. You can see it has reasonable macro. It’s not accumulating many minerals, it’s reinforcing its army, it’s building cannons in its base, it’s chrono-boosting its Nexus and…
Day: –(talks on top of Oriol) vision on the map it seems.
Oriol: It’s not yet at an extremely good level, but it’s definitely very nice to see it play with this interface, when you actually see what the agent is trying to do all the time. It’s really awesome, and there’s been a lot of progress on this; and it’s quite exciting right now.
Day: It’s so cool to see. I’m curious, what are some of your goals and timelines in terms of being able to play against humans? Could we play against it today for instance?
Oriol: For us (and for me as a player), it obviously would be super exciting because this is very much unlike any other bot. It would feel very natural to play against it. So I can’t wait to play it. For us, it’s still a bit early in the research. We definitely want to do more research, and do things like, for example: right now the agents might not care too much about scouting. So that’s something we want to see agents do very well. That being said, obviously we would be extremely excited to allow the community to play it. So when we have something to say, you definitely will be the first ones to know; because you will be the guys playing and testing the bot. As you can see here, it’s winning the battle, it’s playing against another bot. So it’s not very good, but the nice thing about playing against yourself is you’re going to win.
Day: Nice, nice. I’ll quote you on that. Well, I think at this point this is where… ahh, yes. There is the mic. I would love to invite you up to ask some of your questions here today. Thank you, Oriol. Thank you, gentlemen, for getting a chance to talk about SC2 with us.