Strategies
Strategies are the classes with common algorithms that can be shared among multiple agents. It makes easy to implement variances or combine diferent strategies when creating an new agent.
Example
# inheriting from two strategy classes class AgentSimulationTL(AgentBase, SimulationStrategy, TimerStrategy): name = 'Simulation TL' description = 'Simple simulation strategy (simulates managing the time limit)' kind = 'simulation' clock_management = True # picking a single method from a strategy class childs = TreeSearchStrategy.childs def action(self, board): # applying the strategy method from TimerStrategy rule = lambda time_left: time_left/(25-self.turn) self.start_timer(rule, max=20) ...
RandomStrategy
connectFourLab.game.agents.strategies.random.RandomStrategy()
Random strategy
RandomStrategy methods
random_choice
random_choice(board)
Random choice. Return a valid column from a given board state.
TimerStrategy
connectFourLab.game.agents.strategies.timer.TimerStrategy()
Timer strategy allow you to easily program a timer to control the time of each turn.
Using the start_timer
method you can set a rule
which will determine the start point of the timer
and when the time expires the time_out
flag will
be changed to True
.
Example
from . import AgentBase from .strategies import TimerStrategy class AgentNew(AgentBase, TimerStrategy): def action(self, board): rule = lambda time_left: time_left/(25-self.turn) self.start_timer(rule, max=20) while not self.time_out: # loop process return column
TimerStrategy methods
start_timer
start_timer(rule, max)
Set the rule and starts the timer
Arguments
rule
- lambda, required, rule which will determine the time limit of the turn.max
- float, required; maximum time allowed per turn.
ZobristHashingStrategy
connectFourLab.game.agents.strategies.treeSearch.ZobristHashingStrategy()
Zobrist hashing table strategy provid an easy way to implement a hash table using the a board state to crate the hash. Useful to store the data in a tree search.
Examples
TreeSearchStrategy
- documentationNode
- documentation
ZobristHashingStrategy methods
init_zobrist
init_zobrist()
Initialize Zobrist hashing table.
This method creates a list of two tables, filled randomly with int64 numbers, to represent all possitions in a 7x7 board, one table for each player. It also generate two random int64 numbers which will represent the owner of a board state.
Both lists are used when generating a hash in the
hash
method.
hash
hash(board, color)
Create a zobrist hash for a given board state.
Arguments:
board
- matrix, required, board statecolor
- int (1 or -1), required, owner of the board state.
Return:
Generated zobrist hash.
next_random64
next_random64()
Return a random int64 number
TreeSearchStrategy
connectFourLab.game.agents.strategies.treeSearch.TreeSearchStrategy()
Tree Search Strategy provide the necessary methods to an agent make a tree search in a given board state.
Example
AgentNegamax
- documentation
TreeSearchStrategy methods
negamax
negamax(node, depth, color)
Negamax algorithm
This algorithm, in order to be more efficiente, also uses a Zobrist hashing table to store the values of board states and avoid duplicated searches in equal board states in the same depth.
Arguments
node
- matrix, required, board statedepth
- int, required, number which controls the depth limit that the recursive calls should hitcolor
- int (1 or -1), required, owner of the board state
Return
Best value of a node
childs
childs(node, color=1)
Create an return the children of a given node
.
The children are generated from all available
positions in a given board
Arguments
node
- matrix, required, board statecolor
- int (1 or -1), optional, default 1 owner of the board state
Return
List of children. Each item = (column, new_board_state)
save_search
save_search(board, value, depth, color)
Save the data of a search in a zobrist hashing table.
Arguments:
board
- matrix, required, board statevalue
- int, value of the board statedepth
- int, current depth of the searchcolor
- int (1 or -1), owner of the board state
stored_value
stored_value(board, depth, color)
Get the stored value of a board state.
Search in the hash table if the is already a value
stored for the board
and return the value if there is
one and if it's necessary to update the value.
The search is made on a Zobrist hashing table.
It will require an update in the value stored when the stored value depth is inferior to the current search depth.
Arguments
board
- matrix, required, board statedepth
- int, required, current search depthcolor
- int (1 or -1), owner of the board state
Return
value
- int, value of the given board if it finds oneupdate
- boolean,True
when the value stored needs to be updated.
SimulationStrategy
connectFourLab.game.agents.strategies.monteCarlo.SimulationStrategy()
Simulation Stragegy provide the method necessary to simulate matches.
SimulationStrategy methods
simulate
simulate(board, color=-1)
Simulate a match to the end from a given board state. All turns are played randomly till the board hits a terminal state then the value is returned.
Arguments
board
- matrix, required, board state to be simulatedcolor
- int, required, id of the owner of the board state
Return
Id of the winner of the simulation or zero in case the simulation ends in a draw.
Example
AgentSimulation
- documentation
Node
connectFourLab.game.agents.strategies.monteCarlo.Node(board, memory=None, parent=None, position=None, color=1)
Node class is a base for a complex node for a Monte Carlo Tree Searches.
This class has specific methods to perform the Monte Carlo Tree Search.
This class don't work on its own, because it doesn't have a
default board evaluation algorithm. When inherited it's required
to implement the rollout_score
method which is an evaluation
of a given state.
This class uses a zobrist hashing table to optimize the search.
Arguments
board
- matrix, required, board statememory
- empty dictionary, required(*), default None- this dictionary will store all searches with zobrist hashing.
parent
-Node
object, required (**), default None- Node above in the tree hierarchy.
position
- int, required (**), default None- Index of the column which generated the board
current
board
.
- Index of the column which generated the board
current
color
- int, required (**), default 1
(*) required when creating a new root Node
object.
(**) required when creating a new "children"
Node
object (new_node
method).
Properties
UCB1
- float, UCB1 value (*) of the node.visits
- int, total number of Node visitsvalue
- float, total number of score divided by the number of visits- the total number of score is the sum of all scores made by the Node and his children.
(*) UBC1 is an algorithm which calculates the distribution of search effort for exploration and exploitation in the Monte Carlo Tree Search strategy.
Example
AgentMonteCarlo
- documentationAgentMCTSNN
- documentation
Node methods
rollout
rollout()
Node rollout.
It will execute a rollout if this is the first
visits of the Node, otherwise it will return False
.
Each rollout adds a visit in the counter and the score
of the board
to the Node and all the parents above
in the tree.
Return
True
when the rollout occur, False
when it do not.
rollout_score
rollout_score()
This method must return a score (float), evaluation,
for the Node board state (self.board
), from the
perspective of the player id 1.
This method is called every rollout. The rollout occur in the first visit of the Node, the value is stored in the zobrist table and it is not recalculated during the match.
Return
float, score of the Node board
children
children()
Get all the childen Nodes.
Generate or get from the memory, all the childen
Nodes. Each node is generated from the available
possitions in the board
of the current Node.
new_node
new_node(board, column)
This method is called by children
method to
generate a new Node.
Arguments:
board
- matrix, new board statecolumn
- int, index of the last column
Return
A new instance of Node
object.
DepthMeasure
connectFourLab.game.agents.strategies.monteCarlo.DepthMeasure()
Use this class to help when measuring the depth of a tree search. Useful when debugging and measuring performance optimization.
Example
class AgentNew(AgentBase, TimerStrategy): def action(self, board): DepthMeasure.start() self.start_timer(...) while not self.time_out: # place the line "DepthMeasure.add()" inside the # search method, when it creates a new depth self.run_search() DepthMeasure.reset() DepthMeasure.print() return ...
DepthMeasure methods
start
start()
Reset all the variables to begin a new measurement.
add
add()
Add more 1 depth.
reset
reset()
Reset the depth before start a new search episode. Save the current depth if it's the deepiest till know.
print()
Print the deepiest depth reached.