Sandremss128
  • Member since: Aug. 22, 2009
  • Supporter | Level 11 | Programmer
AI programming 2011-10-15 11:02:36

Hey there,

This week the homework assignment is to make an AI that can play a Reversi game. If you don't know what Reversi is:
http://en.wikipedia.org/wiki/Reversi

The AI I make has to compete against various other AIs, and if it has a high enough win ratio after 1000 games it counts as a winner; the more AIs it beats, the more points I get. It has to win at least 2/3 of the games.

Of course various Reversi AIs have been made already, so the game that will be played is a variant:
1. Your move doesn't have to flip a piece of the other player.
2. An additional piece is introduced: the 'unchangeable piece'. This piece can't be flipped by the opponent, but it also never triggers a flip itself. On each turn you can place either a normal piece or an unchangeable one.

The amount of resources available to your AI is also limited: 10 seconds for 1000 games on the server computer, and this also includes the time the other AI gets to think. This means I can't brute-force my way to victory.

I've already read various articles about AI programming, and about AI for Reversi in particular, but the 2 new rules change a lot:
1. Mobility isn't as important any more, since you don't have to place a piece at a position that flips a piece of the opponent.
2. The 'unchangeable piece' opens up new strategies that are not discussed anywhere.
3. You can't force your opponent to pass; in the original Reversi, if your opponent can make no moves you can take most of the remaining pieces.

I've already tried to make an AI that only uses this 'unchangeable piece', but there is a problem: a draw counts as a win for the opposing AI. If you only play the unchangeable piece all the time, the best case is 32 pieces for you and 32 pieces for the opponent.

I haven't really started coding seriously because I have no idea where to start. I think I will have to make something that can look into the future, because the greedy AI has already been given as an example (that is, an AI that makes the best move for the current round without looking ahead).
Where do I start?

sharpnova
  • Member since: Feb. 19, 2005
  • Member | Level 09 | Blank Slate
Response to AI programming 2011-10-15 11:34:08

http://en.wikipedia.org/wiki/Minimax

this is the general algorithm used for games of perfect information (and even some games of imperfect information). it won't work on its own, just like it doesn't work on its own in chess, since the number of possible moves per position makes the game tree fill up too fast.

http://en.wikipedia.org/wiki/Alpha-beta_pruning

here is part of the solution to the branching problem. there are many ways to do move ordering. the "greedy AI" you cited is probably a good way to do a simplistic move ordering, and a simplistic pruning could then be searching only the top 3 or 4 moves in that list. with decent move ordering, a-b pruning reduces the size of the game tree for a given ply (depth) and branching factor to roughly its square root.
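
in rough java it looks something like this (just a sketch: Board, Move, generateMoves(), evaluate() and quickScore() are stand-ins for whatever your own engine ends up having):

    // minimax with alpha-beta pruning, sketched in java.
    // Board, Move, generateMoves(), evaluate() and quickScore() are placeholders.
    int alphaBeta(Board board, int depth, int alpha, int beta, boolean maximizing) {
        if (depth == 0 || board.isGameOver())
            return evaluate(board);                      // static score at the leaves
        List<Move> moves = generateMoves(board, maximizing);
        // move ordering: try the greedy-best moves first so cutoffs come early
        moves.sort((a, b) -> quickScore(board, b) - quickScore(board, a));
        int best = maximizing ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (Move m : moves) {
            int score = alphaBeta(board.play(m), depth - 1, alpha, beta, !maximizing);
            if (maximizing) {
                best = Math.max(best, score);
                alpha = Math.max(alpha, best);
            } else {
                best = Math.min(best, score);
                beta = Math.min(beta, best);
            }
            if (beta <= alpha) break;                    // the cutoff is the whole point
        }
        return best;
    }

you'd call it once per legal root move (alphaBeta(childPosition, someDepth, Integer.MIN_VALUE, Integer.MAX_VALUE, false)) and keep the root move with the best score.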

http://en.wikipedia.org/wiki/Bitboard

bitboards are especially awesome for reversi, since the board can be represented in 32- or 64-bit words depending on what you're targeting (hint: you get about 60% gains on 64-bit cpus), and since reversi (your reversi variant) only has 4 or 5 states a square can be in.
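
a rough idea of what that looks like in java (square i lives at bit i; the extra longs for unchangeable pieces are my assumption about how you'd extend it to your variant):

    // bitboard sketch: bit i (0..63) is square (i / 8, i % 8).
    // one long per colour; your variant would need an extra long (or two)
    // for the unchangeable pieces.
    long black = (1L << 28) | (1L << 35);      // the two starting discs of one colour
    long white = (1L << 27) | (1L << 36);      // and of the other

    long occupied = black | white;
    long empty = ~occupied;
    int blackCount = Long.bitCount(black);     // piece counting is one instruction

    // shifting every black disc one square east at once; the mask stops
    // discs on the last file from wrapping round to the first file
    long notAFile = 0xFEFEFEFEFEFEFEFEL;
    long blackEast = (black << 1) & notAFile;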

http://en.wikipedia.org/wiki/Genetic_algorithm
http://en.wikipedia.org/wiki/Neural_network

so far i've only talked about search, but the next level is evaluation.

search lets you generate a game tree, and minimax lets you recurse up the tree to determine the best move (based on scores). the scores themselves come from evaluation.

an evaluation function (similar to a fitness function in genetic simulations) looks at a position without searching it and tries to give it a score. (for minimax purposes, score one player as positive and the other as negative.)
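
a toy version, just to show the shape of it (the terms and the weights are made up -- picking and tuning them is the actual work):

    // toy evaluation: positive is good for you, negative is good for the opponent.
    // count(), cornerCount() and unchangeableCount() are placeholder methods.
    int evaluate(Board b) {
        int material     = b.count(ME) - b.count(OPPONENT);
        int corners      = b.cornerCount(ME) - b.cornerCount(OPPONENT);
        int unchangeable = b.unchangeableCount(ME) - b.unchangeableCount(OPPONENT);
        return material + 25 * corners + 10 * unchangeable;   // made-up weights
    }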

btw, since you don't want draws, you need to count a draw as a loss for scoring purposes, or at least use a large contempt factor.

for fine-tuning your evaluation function you need to isolate the parameters you want to weigh: mobility, # of legal moves, squares you control, etc. then (and this is just me; as far as i know very few others have done it this way) you can use a genetic algorithm, with a pool of versions of your engine playing against itself thousands of times, to tweak the magnitudes, weights and thresholds of your evaluation function's treatment of the parameters you selected.
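
the tuning loop itself is short; the work is in playMatch() (self-play) and in deciding what a Weights object holds -- all of these names are placeholders:

    // rough shape of the self-play tuning loop.
    // Weights, randomWeights(), mutate() and playMatch() are placeholders.
    List<Weights> population = new ArrayList<>();
    for (int i = 0; i < 32; i++)
        population.add(randomWeights());

    Random rng = new Random();
    for (int generation = 0; generation < 100; generation++) {
        // fitness = result of a round-robin of self-play games
        Map<Weights, Integer> fitness = new HashMap<>();
        for (Weights w : population)
            fitness.put(w, playMatch(w, population));    // e.g. wins minus losses

        // keep the best half, refill with mutated copies of the survivors
        population.sort((a, b) -> fitness.get(b) - fitness.get(a));
        List<Weights> next = new ArrayList<>(population.subList(0, 16));
        while (next.size() < 32)
            next.add(mutate(next.get(rng.nextInt(16))));
        population = next;
    }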

i have gone as far as implementing neural networks to control the pruning and singular-extensions aspect, but i wouldn't recommend it. i got minuscule gains and it actually resulted in weird "local optima" play: when it was presented with situations outside the realm of its training sample, it played terribly.

http://en.wikipedia.org/wiki/Chess_opening_book#Computers
http://en.wikipedia.org/wiki/Endgame_tablebase

for the opening, giving it a small (or large) human-generated or self-generated library to help it make its first moves is a great way to give it an advantage in time and general strategic position throughout the early middle game. since your reversi variant is not studied, a book would probably give you a large advantage over other programs not using any opening book.
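
the book itself can be as dumb as a map from position to move (keyOf() and how you fill the map are up to you; all the names here are placeholders):

    // opening book sketch: position key -> book move.
    // keyOf() (e.g. the board serialised to a string) and the entries are
    // placeholders; you'd fill the map by hand or from self-play.
    Map<String, Move> book = new HashMap<>();

    Move pickMove(Board b) {
        Move bookMove = book.get(keyOf(b));
        if (bookMove != null)
            return bookMove;          // instant reply, no search time spent
        return searchBestMove(b);     // fall back to the normal search
    }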

read about depth-to-mate and depth-to-conversion tablebases. in chess it's simple: they're generated on an "n pieces left" basis. for your variant you'd want to generate them for "n empty squares left" situations. they would serve as search enhancers, and in the case of your variant could be applied dynamically throughout the game, not just at the end.
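
a full tablebase is probably overkill for a homework deadline; the cheap cousin of the same idea is to just solve the position exactly once few squares are left -- something like this (emptySquares(), solveExact() and searchBestMove() being placeholders):

    // endgame sketch: with only a handful of empty squares left, drop the
    // evaluation function and search to the very end of the game, scoring
    // leaves by the final disc difference. solveExact() is just the normal
    // alpha-beta search run to full depth.
    Move chooseMove(Board b, int normalDepth) {
        if (b.emptySquares() <= 10)
            return solveExact(b);                 // exact result, no heuristics
        return searchBestMove(b, normalDepth);    // normal heuristic search
    }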

there is a lot more to say and i haven't even scratched the surface of what game engines can do and how they can be made to do it, but this should get you started.


= + ^ e * i pi 1 0

Sandremss128
  • Member since: Aug. 22, 2009
  • Supporter | Level 11 | Programmer
Response to AI programming 2011-10-15 12:15:23

Thank you very much sharpnova! You gave me a lot to read and a lot to think about.

Only the fine-tuning will be a problem, since I can't see how the opposing AIs think; I can only submit mine and see afterwards how much they won. So maybe some configuration will work really well against AI_A but lose dramatically to AI_B.

The library idea is a really good one! I hadn't thought of it myself, but I think it will cut down some execution time.

I will start making my AI and see where I get. Getting a sufficient grade is not really hard, but the gap between a 'sufficient grade' and the maximum points is a big one, I think.

I just hope that I don't have to start all over again with my AI later.

Sandremss128
  • Member since: Aug. 22, 2009
  • Supporter | Level 11 | Programmer
Response to AI programming 2011-10-17 08:26:01

Quick performance question:
Is passing an array to other functions through parameters all the time more expensive than making it a field?

Now it is something like:

// collect all valid moves
for (Move move : allPossibleMoves) {
    if (isValidMove(move))
        validMoveArray.add(move);
}

// try every valid move on a copy of the grid
for (Move move : validMoveArray) {
    int[][] tempField = copyGrid(grid);   // careful: clone() on an int[][] only copies the outer array
    tempField = makeMove(move, tempField);
}

int[][] makeMove(Move move, int[][] grid)
{
    for (Direction d : directions)
        grid = makeLine(move, d, grid);
    grid[move.x][move.y] = move.value;
    return grid;
}

So basically when it calls subroutines it passes the 2d array as a parameter. Is this the best way, or is it better to have a 2d array field that can be used anywhere in the class, so I don't have to pass it as a parameter?
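
What I mean with the field version would be something like this (just a sketch, the helper names are made up):

    // Sketch of the field version: one shared grid, with makeMove writing
    // into it and undoMove restoring it afterwards, so I never have to clone
    // the array or pass it around.
    int[][] grid = new int[8][8];     // field, visible to all methods

    void tryMoves(List<Move> validMoves) {
        for (Move move : validMoves) {
            List<Flip> flips = makeMove(move);   // apply the move in place, remember what flipped
            evaluateCurrentGrid();               // score the position in the field
            undoMove(move, flips);               // restore the grid for the next move
        }
    }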

Also I'm afraid that looking too deep is not feasible; even looking 2 levels deep is expensive.

Wolfos
  • Member since: Jan. 19, 2008
  • Member | Level 25 | Game Developer
Response to AI programming 2011-10-17 15:35:49

Well, benchmark it!

Sandremss128
  • Member since: Aug. 22, 2009
  • Supporter | Level 11 | Programmer
Response to AI programming 2011-10-17 16:42:23

Actually I shouldn't even worry about this small thing before I get the big thing working. I am encountering major performance issues. This is how my programme works now:
1. Get all possible moves
2. Get all possible answers to every move
3. Get the 'best' answer to each move
4. Choose the move for which that 'best' answer is the least damaging to us.

On the 1000 games my programme has to play in 10 seconds, mine is taking 20+ seconds. Which is actually pretty good, I found out after keeping track of the number of checks:
24,619,247 moves it has to check over 1000 games.
That is, it has to create a new 2d array 24 million times, apply the move to it, and then evaluate that move.
It did win 998 of the 1000 games though. But it seems to me that checking about 24,000 moves per game, while you only have to make about 30 moves a game, is a little too much.

I think that I might be doing it in the wrong order: first get the moves, then evaluate.
What I think would be better is: evaluate the grid, then get the moves.

Since this variant allows you to make moves that don't change anything, this adds a lot of extra checking. 99% of the time a move that does nothing but take a new square is not the best possible move you can make. I think that minimax is a little too brute-force-like in this case.

I think it's better to first look at the strong and weak points of the current position, for both you and the opponent, then get some moves that alter these strong and weak points to your advantage. Finally, when you have isolated the moves with the most potential, it is possible to dig a little deeper into the future. Just trying all possible moves isn't going to be the way, I think.
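
Roughly what I have in mind (not written yet, the helper names are just placeholders):

    // Shape of the selective search I mean: score every legal move with a
    // cheap evaluation, keep only the most promising few, and only search
    // those deeper. quickScore(), applyMove() and deepSearch() are placeholders.
    Move chooseMove(int[][] grid, List<Move> validMoves, int depth) {
        // cheap one-ply score for every move, best first
        validMoves.sort((a, b) -> quickScore(grid, b) - quickScore(grid, a));

        // only the top few candidates get the expensive deeper search
        int candidates = Math.min(4, validMoves.size());
        Move best = null;
        int bestScore = Integer.MIN_VALUE;
        for (Move move : validMoves.subList(0, candidates)) {
            int score = deepSearch(applyMove(grid, move), depth - 1);
            if (score > bestScore) {
                bestScore = score;
                best = move;
            }
        }
        return best;
    }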

And I've been thinking about a library to help me through the first 5-10 moves, but there is yet another problem: all submissions are limited in space! Both memory and storage space are an issue here.

Any thoughts on how to design an AI like this?

AhrimanZora
  • Member since: Oct. 21, 2011
  • Member | Level 01 | Blank Slate
Response to AI programming 2011-10-23 20:55:40

Your AI in its current implementation should use A*. A* is a pathfinding algorithm, but it is also a general-purpose problem solver whenever you have a large number of possibilities and want to prune them according to some fitness metric. This differs from a GA solution, because a GA does a random walk followed by pruning for fitness, then selects the best and does a near-random walk on the previous generation.

A* can be used on a bunch of arrays quite easily as long as you have a way to prune them at a depth less than complete. If you store an interim scoring mechanism, that's ideal. I did A* on a compression algorithm I wrote in Python, and spent some time tuning the depth before pruning, the width in branches, etc. If you prune too early, you lose some of your best possible results. There is also depth-first vs breadth-first. I went one layer deep, then did depth-first, and compared the later width-depth solutions against the best case already found. This was highly optimized to my specific problem set, but it ran very quickly for the amount of data searched. You can optimize your AI to run in the time/memory allotted by doing this sort of tuning.
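
To make the prune-as-you-go idea concrete, here is the skeleton of a beam-style search in Java (State, score and expand() are invented names; it only shows the shape of the thing):

    // beam-style search: at every depth keep only the `width` best positions
    // according to an interim score, and expand only those.
    // State (a position plus its interim score) and expand() are placeholders.
    int width = 32, maxDepth = 6;
    List<State> frontier = new ArrayList<>();
    frontier.add(startState);

    for (int depth = 0; depth < maxDepth; depth++) {
        List<State> next = new ArrayList<>();
        for (State s : frontier)
            next.addAll(expand(s));                     // positions one move further on
        next.sort((a, b) -> b.score - a.score);         // best interim score first
        frontier = new ArrayList<>(next.subList(0, Math.min(width, next.size())));
    }
    // the best state in the final frontier tells you which first move to keep

Tuning width and maxDepth is exactly the depth-before-prune / branches-in-width trade-off I mentioned.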

sharpnova
  • Member since: Feb. 19, 2005
  • Member | Level 09 | Blank Slate
Response to AI programming 2011-10-24 12:55:06

I'm going to disagree with A* being useful for a reversi engine.

Also, it seems you aren't doing any pruning whatsoever. That's why even ply-2 is expensive. You want to be searching dozens of ply, and pruning makes this possible. Singular extensions make things even more powerful.

Want to Skype this? I could spend hours writing a book on this subject but it would be easier if we just talked it over.


= + ^ e * i pi 1 0

sharpnova
  • Member since: Feb. 19, 2005
  • Member | Level 09 | Blank Slate
Response to AI programming 2011-10-24 12:56:13

The only sense in which A* is applicable is that it is a specific example of mini-max. But it's overly specific for this type of problem.


= + ^ e * i pi 1 0