Artificial Intelligence (AI) — Part two

Graphs showing some ML data

Spoiler alert: I won't be using Unity's machine learning toolkit (also called ML-Agents) to implement the artificial intelligence for my bots. If you want to know why, read on.

At first it was hard to use (see my previous post), but then Unity helped me by giving me access to their alpha mlagents-cloud. That fixed my previous problem, which was mostly a hardware one.

It went from hard to use to easy to iterate with, and that's exactly what I needed to find out whether it was a good approach for my idea of bots driven by “real” AI.


When you train a model, you have to give it three main kinds of data:

1 – Observations: what the model (called an agent) can see from its environment
2 – Actions: what the agent can do
3 – Rewards: feedback on how well it performs
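As a rough illustration, here is what those three kinds of data could look like for a move-to-target agent. This is a plain Python sketch of the concepts, not the actual ML-Agents C# API; all names and values are made up:

```python
# Hypothetical sketch of the three kinds of data given to a trained agent.
# NOT the ML-Agents API, just an illustration of the concepts.

# 2 - Actions: what the agent can do (a discrete action space)
ACTIONS = ["left", "right", "jump", "double_jump"]

def observations(agent_x, agent_y, target_x, target_y):
    # 1 - Observations: what the agent can see from its environment,
    # here its own position and the offset to the target.
    return [agent_x, agent_y, target_x - agent_x, target_y - agent_y]

def reward(prev_distance, new_distance):
    # 3 - Rewards: feedback on performance, positive when the agent
    # got closer to the target, negative when it moved away.
    return prev_distance - new_distance
```

The real inputs are of course richer than this, but the shape of the problem is the same: a vector of observations in, one action out, and a scalar reward as the training signal.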

You have to think quite hard about these, but since you know your environment, you can eventually come up with good inputs for each of them (or so you think).

In the very beginning, I tried to train a single model that would both move and shoot at the target.

Let’s dive into some details

I had 12 observations, 10 actions, and plenty of reward points here and there. But I found that no matter what, my model could not figure out how to fire: it moved reasonably well but never fired.

I decided to split the model in two: one for moving, and one for aiming and firing. I found out online that most people do it this way when the problem is too hard for a single agent. It's a first trade-off, but I thought it was acceptable.

Now I have two experiments, one to learn to move and the other to aim and fire.


The agent has to reach the target, so the reward is calculated from how close it gets to the target. It can go left/right, jump, and double jump. The map can be pretty hard to navigate, sometimes even impossible (something machine learning does not like).

After 7 iterations, in which I changed reward values, added/removed some observations, made the map easier to navigate, and so on, this is what I got:

The agent mostly succeeds, but it sometimes goes in the wrong direction, it is always jumping like crazy, it does not handle the double jump when needed, and it does not look natural at all.

I gave it 8 more iterations, trying things like adding a negative reward for jumping so it would stop doing it so much, but I did not get anything better.

Note that even though Unity mlagents-cloud allows me to iterate quickly, it still takes a couple of hours between model changes.


The agent has to hit the target with the bazooka, so the reward is calculated from how much damage it deals, and also from how close the shot lands when it misses. It can aim up/down, and load and release to shoot. This time the map was made easy from the beginning.

But after 5 iterations, I found that this was already too complex for the model. It never managed to hit the target, only itself. From what I understood, the load-and-release firing action is too complex.


The problem is that machine learning is hard, and I’m not an expert in it

It took me a full week of working like crazy to conclude that I'm not expert enough to know the limits of this approach and how to work around them. Of course, I could spend more time on it, but it seems that no matter what, the outcome will not be as good as I first imagined.

By working on machine learning in this scope (training an agent to be a bot in a game), I also realized that, as a developer, you lose all control over your bot. I'm quite sure that when the AI is well trained the result is nice for the player, but as a game designer you cannot dictate how your bot behaves (short of training a new model each time).

This adds up to my final conclusion: machine learning is not what I need, so I'll have to write a manual AI for Artillery Battle, and that will be hard.

Funny note

When working with machine learning, you can come across some funny (but logical) behaviors. For example, in my first “fire” experiment, the agent learned that not firing at all was the best way to go, because if it fired and hit itself, it was punished. So I had to give it a small positive reward for firing and lower the negative reward for hitting itself (this is an example of what you have to do between experiments to get a better outcome).
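In expected-reward terms, the degenerate behavior makes perfect sense, and so does the fix. A small sketch with made-up numbers (these are not my actual reward values):

```python
def expected_firing_reward(p_hit, hit_reward, p_self_hit, self_hit_penalty,
                           fire_bonus=0.0):
    # Expected reward for taking the fire action, given rough estimates of
    # how likely the shot is to hit the target vs. the agent itself.
    return fire_bonus + p_hit * hit_reward - p_self_hit * self_hit_penalty

# Early in training, the untrained agent mostly hits itself:
before = expected_firing_reward(p_hit=0.05, hit_reward=1.0,
                                p_self_hit=0.6, self_hit_penalty=1.0)
# before is negative, while never firing yields exactly 0,
# so "never fire" is the rational policy.

# After tuning: a small positive reward for firing, lower self-hit penalty:
after = expected_firing_reward(p_hit=0.05, hit_reward=1.0,
                               p_self_hit=0.6, self_hit_penalty=0.2,
                               fire_bonus=0.1)
# after is positive, so firing is now worth trying, and the agent
# can actually start learning to aim.
```

The point of the tuning is just to flip the sign of that expected value so the agent keeps exploring the fire action long enough to learn from it.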

Network implementation

Naive start

My idea was to implement the network part from day one because that way I could play with beta testers right away.

First, I started with my own implementation using Firebase and found out that it was pretty hard to get everything working. Then I benchmarked a bunch of solutions and settled on Photon Unity Networking (PUN).

It was great: the code was not that hard, and it seemed to work. Until I had the chance to test an early version with a friend in real conditions (meaning over the internet, not on a local machine). The result was too laggy for me. I'm pretty sure I could have improved some details, but I didn't want to fight against the code.

I decided to stop developing the network part right away, but thanks to this first step, I now know how to structure the code.

Custom solution

Later on, I made a prototype with a new solution of my own, tailored to this particular game. Since it is a turn-based game, I went for a “turn replay” mechanism: record the player's turn and broadcast it to the other player in near real time. This also allows keeping a record of any game for later replays.

You can now see how important it is to have deterministic physics: that way, I don't need to record every movement in the replay stream.

Let’s dive into some details

The “Stream Play” code (that's what I call it internally) is split into two main components: the Recorder, which is in charge of — hum — recording events, and the Player, which replays those events. In between, there is a websocket connection to transfer recorded events from player A to player B (it goes through a server for extra control).

The recorder does not save everything that happens; it only saves important information called snapshots: the positions of the characters, the state of the map (holes and other changes like that), and the positions of the bonus boxes and mines. That way, at the end of the turn, we are sure that both players are in sync.

The recorder also sends the active player's inputs, this time in real time, and those inputs are played right away on the other side. But because the output can slightly diverge, the snapshots are the source of truth at the end of the turn.

The player, on the other end, buffers a few seconds of data (since Artillery Royale is turn-based and not real time, the delay does not matter much), then runs the inputs and applies the snapshots. Both are time-based, so the player can follow the right timeline.
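A minimal sketch of that idea, in Python rather than the actual C# Stream Play code (class and field names are mine, for illustration): timed inputs drive a local simulation that may drift a little, and the end-of-turn snapshot always wins:

```python
class ReplayPlayer:
    """Replays a recorded turn: timed inputs first, snapshot as source of truth."""

    def __init__(self):
        self.position = 0.0

    def play_turn(self, inputs, snapshot):
        # inputs: list of (time, dx) pairs, already buffered ahead of playback.
        # Replay them in timeline order; the local simulation may drift slightly.
        for _, dx in sorted(inputs):
            self.position += dx
        # The snapshot sent at the end of the turn overrides any divergence,
        # so both clients are guaranteed to agree before the next turn.
        self.position = snapshot["position"]
        return self.position
```

For example, if the replayed inputs put the character at 2.0 but the recording side ended the turn at 2.05, applying the snapshot snaps the player to 2.05 and the two clients stay in sync.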

In the middle sits a NodeJS server. It does not do much: it mostly forwards data from player A to player B, using a game id shared by both clients. This server, and this whole network layer really, is still an early prototype. But so far I have some good results!
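Conceptually, the server is little more than the following routing logic (a Python sketch; the real server is NodeJS over websockets, and these names are hypothetical):

```python
class Relay:
    """Routes messages between the clients sharing a game id."""

    def __init__(self):
        self.games = {}  # game_id -> {player_id: inbox list}

    def join(self, game_id, player_id):
        # Register a player under the shared game id.
        self.games.setdefault(game_id, {})[player_id] = []

    def send(self, game_id, sender_id, message):
        # Forward the message to every other player in the same game
        # (i.e. the opponent, for a two-player game).
        for player_id, inbox in self.games[game_id].items():
            if player_id != sender_id:
                inbox.append(message)

    def receive(self, game_id, player_id):
        # Drain and return this player's pending messages.
        inbox = self.games[game_id][player_id]
        self.games[game_id][player_id] = []
        return inbox
```

The real server also gives a hook for “extra control” (validation, persistence for replays), but the core job really is just forwarding.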

@koalefant asked on the discord server (click to join): “I am curious why did you end up using both snapshots and input simulation for networking? Would not snapshots be sufficient?”

The answer: basically, I use custom physics for movement (characters and ammo), but I still use Unity colliders, and I'm worried that collisions will drift apart at some point (I mean, not worried: they will at some point). That's why I use both inputs and snapshots.


You can see that I chose a deterministic approach, sending inputs and letting the physics play out on both sides; because the physics in Artillery Royale is — mostly — deterministic, it works. But I'm extra careful and send snapshots just in case!

The data that flows between the players is very light; even the real-time inputs do not represent much information. This approach will also allow saving replays in a very optimized format.

Artificial Intelligence (AI) — Part one

Graph showing Cumulative Reward

A demo for Artillery Royale is planned for the end of September.

At first, this demo was going to be two-player only, but I quickly realized that this wouldn't make much sense for most players, because right now there is no network support, nor enough players.

So to play the demo, you'd have to be two people in the same room, playing turn by turn (something that is intended for the finished game, but probably not ideal for a demo, where I want a quick iteration and feedback loop).

On the other hand, I always thought a basic AI would be a pain to code and not fun to play against (because it's based on a set of rules, players can understand and predict it quickly).

So what was the solution?

Fortunately, we are at a time when you can use “real” AI in your games. Real like the one in self-driving cars, or the one that won at Go. That kind of real. The kind you can train yourself, giving rewards to get a neural network in return. The kind that is mostly unpredictable, creative, and fun to interact with.

I mean, that’s the theory.

That being said, it's still a hard topic. AI development is fun to play with but hard to get right. And to be honest, I'm a total noob in this area: I understand the basics and how it works as a whole, but implementing it is something else.

Fortunately, Unity has some good pieces in place to help you start: they include a good toolkit (API) and some tutorials too. I thought it would be easy to apply their examples to my specific problem, but OMG, it was way harder than I thought.

I took a first, very naive approach, using what I had just learned in a good tutorial and thinking it would be quite easy to apply to my own problem. Not true. I had to refactor all the code first to make it work with multiple environments (but that's a detail); I also had to think hard to design a good reward system and implement it, and to find how to express the AI's objectives and translate them into code.

Once that was done, I had some AI training going on, and to be honest, it looked like it could yield results. But then I found another problem: computational power. My old MacBook Pro is, well, old, and does not have the CPU power needed to train an AI model in a reasonable time. After a lot of internet searching, I found out that my objective was too complex for my AI model anyway (no matter how powerful my computer was).

Note on the hardware: at some point, I was looking to pimp my MacBook with an external GPU, thinking that it would help. I discovered that it won't: TensorFlow, the software used behind the scenes, only uses Nvidia GPUs (and falls back to the CPU otherwise), and unfortunately, Nvidia and Apple are at war (and thus not compatible). So it won't help.

Getting ready for part two.

Today I designed a new way of getting the AI to work. From my research, it seems people split their objective into small chunks and train multiple brains, then add some code to switch between those brains during gameplay. I'm hoping it will work that way.

Also, Unity contacted me because they have some AI Training Cloud in their roadmap, and they may provide that service in the future. Hopefully, they will let me try it soon.

Character Movement

Today I spent way too much time on my character movement controller. It turns out this part is more complicated than I expected. But it's so important to get right that I took the time and spent a number of days working on it.

At first, I was a little bit disappointed that something so basic was not included in the game engine. So I browsed the asset store and tried a few assets with no success.

The problem, kind of as usual, is that I want something very specific, with a certain feel to it, and even though most of the assets were configurable, I never found the right combination. So I decided to give it a try myself.

I'm a little torn: in a way, it was not so hard to do, but on the other hand, I had to deal with so many cases. As you may know from my article on my technology choices, I decided to handle most of the physics myself, so it took some work to get the character to collide with things and bounce when projected.

But it’s done.

  • Implemented left/right movement
  • Implemented jump (and double jump) in the player physics controller
  • Handled pretty much all collision cases, including hitting multiple walls at the same time (or while falling or jumping)
  • Implemented distance limits (in Artillery Royale, a character can move a certain distance depending on its class; not sure that mechanic will make it into the final version)
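The distance-limit rule can be sketched like this (a Python stand-in for the C# controller; the class names and budget numbers are illustrative, not the game's actual values):

```python
# Hypothetical per-class movement budgets (made-up numbers).
MOVE_BUDGET = {"bishop": 5.0, "knight": 7.0, "queen": 9.0}

class Character:
    def __init__(self, cls, x=0.0):
        self.cls = cls
        self.x = x
        self.moved = 0.0  # horizontal distance consumed this turn

    def try_move(self, dx):
        """Allow the move only while the class budget for this turn remains."""
        if self.moved + abs(dx) > MOVE_BUDGET[self.cls]:
            return False
        self.x += dx
        self.moved += abs(dx)
        return True

    def end_turn(self):
        # The budget resets at the start of the character's next turn.
        self.moved = 0.0
```

Tracking the consumed distance (rather than the absolute position) means the limit works the same whether the player walks in one direction or paces back and forth.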

Prototyping with a designer

Image with test for different sizes

These days I’m lucky enough to spend time with a great artist: Jean-Baptiste Dessaux (Jb for short). He is going to work on the graphic side of Artillery Royale.

I’m very happy that this part of the project is moving forward!
We are going to try to answer some critical gameplay questions: for example, the proportion of the characters relative to the map, or the style of the game.


In classic artillery games, players often move military tanks (or, in the particular case of the Worms series, worms). These characters are slow, which gives artillery all its meaning.

In Artillery Royale, the characters are not intended to move slowly; they are humanoids. This could be a gameplay problem: if a player can quickly move his characters close to the other player's, why would he use artillery rather than close-range weapons?

That's partly why we have another gameplay mechanic: distance limits per character. For example, the bishop can move a certain distance, the knight a bit more, and so on (the game has some chess roots). But even to me, it seems a little odd.

We will see how it goes!

Image with test for different sizes
Size tests. Concept art by Jean-Baptiste Dessaux

By working with Jb, I came to realize that building a map generator is probably premature for the game. I'll go with pre-built maps so the gameplay can be tested in advance. It's always sad to remove features from the codebase, but that's how projects go.

Learn more about the map generator here.

First results

Jb is now iterating on concept art, some for the map and some for the main character: the queen.

Image with 9 objects
Concept art for map objects by Jean-Baptiste Dessaux
Image with 4 tests for the Queen character
Concept art for the Queen character by Jean-Baptiste Dessaux

I paused some dev work to focus on helping him deliver. Then I can integrate his work into the game and see everything come to life.

Exciting times!


This article will be updated as I add parts to the game.
Last update was on the 30th of October 2020

In this blog post, I will talk about the technological choices I made and go deeper in some technical details about the game.


The game is based on the Unity engine, which lets you export to all platforms, from PC/Mac to consoles and mobile.
I made that choice because I know Unity quite well and I like coding in C#. But to be honest, not everything is perfect in the Unity world.

Love / hate relationship

I love Unity because it's very easy to get started with, there are plenty of online tutorials and examples, and the asset store has some real gems. All this gets you past the prototype phase pretty quickly.

I hate Unity because they have different ways of doing the same thing (usually they buy a popular asset, integrate it into the engine, and that's it; they do not remove the part it replaces). They also sometimes remove features without providing a replacement (the latest example being the networking stack).

Physics 2D

When I started the game, I thought: “it's going to be easy, because Unity has all the needed tools, including a good 2D physics engine”. But after playing with it for a while, I found a bunch of blocking limitations.

First, it's very hard to customize. You can play with physics materials and change the mass, friction, and all those variables, even the gravity scale itself. But in the end, it's a bunch of variables that you've changed here and there: you easily get confused, and it's not even practical to test.

Second, and this was the blocker: the physics engine is non-deterministic. This means that given a set of inputs (i.e., force, mass, and start position), if you play it two times, you will probably not get the same output. Unfortunately, this is not compatible with Artillery Royale's gameplay: if a player finds a good aim and the right force and plays the same move again, the bullet should land in the very same spot.

You will see that the deterministic part is even more important for the network implementation.

So I took some time to re-implement the basic physics equations, while keeping some of Unity's existing pieces, like colliders and collision resolution.
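The deterministic replacement boils down to a fixed-timestep integration of the basic equations of motion: the same inputs always produce the exact same trajectory. A simplified Python sketch of the idea (the real code is C# inside Unity, and these names are mine):

```python
def simulate_shot(x, y, vx, vy, gravity=-9.81, dt=0.02, steps=100):
    """Fixed-timestep (semi-implicit Euler) integration of a projectile.

    Because every step uses the same dt and the same order of operations,
    identical inputs always yield an identical trajectory.
    """
    trajectory = []
    for _ in range(steps):
        vy += gravity * dt   # apply gravity to the velocity first
        x += vx * dt         # then advance the position
        y += vy * dt
        trajectory.append((x, y))
    return trajectory
```

Running this twice with the same force and start position gives bit-for-bit identical results, which is exactly the property a replayed move (and the network layer) needs.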

About the destructible map

I found a great asset on the store that lets you take any Sprite and make it destructible. I bought it without even thinking, and at first, I was in love.

But after some prototyping, I found out that because the asset works on pixels rather than shapes, it would never render the way I wanted.

So I made my own destructible map, based on sprite masks and custom polygon colliders; all of this is easily done with Unity and a bit of shape math.
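The shape-based idea can be sketched as follows: the solid ground is the map polygon minus a list of circular holes punched by explosions. This is a simplified Python model of the geometry only; the actual game does the equivalent with Unity sprite masks and polygon colliders:

```python
def point_in_polygon(px, py, polygon):
    """Standard ray-casting test for a point inside a polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            # Does a ray cast to the right cross this edge?
            if px < (x2 - x1) * (py - y1) / (y2 - y1) + x1:
                inside = not inside
    return inside

class DestructibleMap:
    def __init__(self, polygon):
        self.polygon = polygon   # outline of the ground
        self.holes = []          # circles carved out by explosions

    def explode(self, cx, cy, radius):
        self.holes.append((cx, cy, radius))

    def is_solid(self, px, py):
        # Solid = inside the ground outline AND outside every hole.
        if not point_in_polygon(px, py, self.polygon):
            return False
        return all((px - cx) ** 2 + (py - cy) ** 2 > r * r
                   for cx, cy, r in self.holes)
```

Working with shapes like this, instead of per-pixel data, is what allows clean rendering and exact colliders after each explosion.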

Map generation

See detailed article here: Map generator


Custom solution

I made a new prototype with a solution of my own, tailored to this particular game. Since it is a turn-based game, I went for a “turn replay” mechanism: record the player's turn (in a specific, optimized stream format) and broadcast it to the other player in near real time. This also allows keeping a record of any game for later replays.

You can now see how important it is to have deterministic physics: that way, I don't need to record every movement in the replay stream.

See the detailed article here: Network

Artificial intelligence

Learn about AI in this detailed blog post