update
@@ -12,24 +12,24 @@ using ..type, ..mcts
""" Search for the best action to take for a given state and task

# Arguments
- `a::agent`
    one of Yiem's agents
- `initial state`
    initial state
- `decisionMaker::Function`
    decides what action to take
- `evaluator::Function`
    assesses the value of the state
- `reflector::Function`
    generates a lesson from the trajectory and reward
- `isterminal::Function`
    determines whether a given state is a terminal state
- `n::Integer`
    how many times an action is sampled from decisionMaker
- `w::Float64`
    exploration weight, usually between 1 and 2.
    A value of 1.0 balances exploration and exploitation roughly 50%-50%.
    A value of 2.0 makes MCTS explore the tree aggressively.
- `transition::Function`
    a function that defines how the state transitions
- `transitionargs::NamedTuple`
    arguments for the transition function

# Keyword Arguments
- `totalsample::Integer`
    number of child states MCTS samples at each node during the expansion phase
- `maxdepth::Integer`
    number of levels MCTS descends during the simulation phase
- `maxiterations::Integer`
    number of times MCTS runs the expansion -> simulation -> backpropagation cycle
- `explorationweight::Number`
    controls how much MCTS explores new states instead of exploiting
    known ones. 1.0 balances exploration and exploitation roughly 50%-50%; 2.0 makes MCTS
    explore new states aggressively.

# Return
- `plan::Vector{Dict}`
@@ -41,7 +41,7 @@ julia>
```

# TODO
-[] update docstring
+[x] update docstring
[] return best action

# Signature
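Note on `explorationweight`: the docstring describes its effect only qualitatively. A minimal sketch of how such a weight typically enters node selection, assuming the standard UCT formula (the commit does not show which formula the `mcts` module actually uses). The function `uctscore` and all of the numbers are hypothetical and serve only as illustration.

```julia
# Hypothetical helper, not taken from this commit: UCT score of one child node.
# totalreward / visits is the exploitation term; the square-root term is the
# exploration bonus, scaled by the exploration weight w.
function uctscore(totalreward::Real, visits::Integer, parentvisits::Integer, w::Real)
    visits == 0 && return Inf   # unvisited children are always tried first
    return totalreward / visits + w * sqrt(log(parentvisits) / visits)
end

# Two made-up children of a parent visited 100 times:
# a well-explored child with a high mean reward, and a rarely visited one.
for w in (0.0, 1.0, 2.0)
    known = uctscore(80.0, 90, 100, w)
    novel = uctscore(4.0, 10, 100, w)
    println("w = $w  known = $(round(known, digits=3))  novel = $(round(novel, digits=3))")
end
```

With these made-up numbers the well-explored child still scores higher at w = 1.0, while at w = 2.0 the rarely visited child overtakes it, which matches the "aggressively explore" wording in the docstring.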