Let me introduce Odysseus
I just started the actual implementation of my machine-learning approach to strategy generation in the Steamhammer AI. Let me introduce Odysseus. The name refers to the great tactician from Homer's Odyssey, which seems a good fit to me.
The creation of genius always seems like a miracle, because it is, for the most part, created far out of the reach of observation.
- Homer, The Odyssey
The topic of my bachelor thesis is the potential of machine learning in the domain of real-time decision making under uncertainty. Works like AlphaGo and OpenAI's results on the Atari 2600 got me really interested, so I started with a deep dive into the domain of machine learning and, because of its active developer/research community and great competitions, settled on StarCraft: Brood War as the test-bed.
To simplify my first baby steps in this domain a bit, I chose the Steamhammer bot (SH) by Jay Scott as a solid starting point with a decent open codebase.
With both docker-starcraft and docker-botcraft I created a cross-platform environment powered by Docker, enabling the execution and compilation of StarCraft BWAPI bots on Linux-based systems. I did that mainly out of curiosity, though it's not really necessary for my actual AI implementation. The next step in this direction is contributing a cross-compile-ready BWAPI, but that has to wait.
A first concept
While a game of StarCraft includes several problems worth the time, strategy and reactive decision making based on uncertain opponent behaviour excited me the most.
I believe that tasks like building placement, scouting and micro-management already work quite well compared to the static strategies that currently ship with most bots. Let's take the advances of current StarCraft: Brood War AI and replace the static opening books and build orders with reactive decision making based on observed time-series data from played games.
For this purpose four descriptions are needed:
- a game state
- a series of game states
- a desired game/self state
- a measure of performance
While a game state consists of capturable information like own units, enemy units, buildings, time and resources, a series of states captures timing data and the correlations between them. A machine learning algorithm forms a model that encodes the experience linking game state series, taken decisions, resulting outcomes and performance.
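As a rough sketch of what these descriptions could look like in code (all names, fields and the toy performance heuristic are my own placeholders, not anything final):

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class GameState:
    """Snapshot of capturable information at one point in time."""
    frame: int                  # game time in frames
    minerals: int
    gas: int
    own_units: Dict[str, int]   # unit type -> count
    enemy_units: Dict[str, int] # observed enemy units (inherently uncertain)
    buildings: Dict[str, int]

# A series of states captures timing data and correlations between states.
GameSeries = List[GameState]

def performance(series: GameSeries, won: bool) -> float:
    """Placeholder performance measure: win/loss plus a tiny bonus
    for shorter games, so faster wins score slightly higher."""
    duration = series[-1].frame if series else 0
    return (1.0 if won else 0.0) - duration * 1e-6
```

A learned model would then map such series (plus the decisions taken along the way) to expected performance; the snippet only pins down the shapes of the data involved.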
The strategy generation will take the game state as input and return a desired unit composition as output. The process of achieving this unit mix and using it in combat will be handled as two separate problems. With this simple interface, different approaches can be plugged in and tested against each other.
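That pluggable interface might be sketched like this (class and method names are hypothetical; the static baseline just illustrates that a learned model would sit behind the same interface):

```python
from abc import ABC, abstractmethod
from typing import Any, Dict

class StrategyModel(ABC):
    """Pluggable interface: a game state goes in, a desired unit
    composition (unit type -> fraction of the army) comes out."""
    @abstractmethod
    def desired_composition(self, state: Any) -> Dict[str, float]:
        ...

class StaticMix(StrategyModel):
    """Trivial baseline that ignores the state entirely, playing the
    role of today's static build orders. A learned decision model
    would implement the same method, but condition on the state."""
    def desired_composition(self, state: Any) -> Dict[str, float]:
        return {"Zergling": 0.7, "Hydralisk": 0.3}
```

Producing the returned mix and using it in combat remain separate problems, so any number of such models can be swapped in and compared under identical conditions.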
I was positively surprised to notice that Jay Scott is working on something similar with his opponent model. I believe this will create great synergies, and it encourages me that I'm on a promising track. The difference between the approaches, however, is that Jay learns only a prediction model, which is then used in specifically coded decision processes, while this approach learns a decision model that encodes the whole decision process.
For game state extraction I'm building upon the work of Jay Scott and the TorchCraft project, which enables the use of Lua and Torch to develop StarCraft: Brood War AIs.
The actual architecture will definitely look different, since my experience with machine learning is still limited. I feel there are many places where my concept is still too naive. Well, figuring that out will be the fun part!
Any thoughts of your own?
Feel free to raise a discussion with me on Mastodon or drop me an email.