update
@@ -12,24 +12,24 @@ using ..type, ..mcts
 """ Search the best action to take for a given state and task

 # Arguments
-- `a::agent`
-one of Yiem's agents
 - `initial state`
 initial state
-- `decisionMaker::Function`
-decide what action to take
-- `evaluator::Function`
-assess the value of the state
-- `reflector::Function`
-generate lesson from trajectory and reward
-- `isterminal::Function`
-determine whether a given state is a terminal state
-- `n::Integer`
-how many times action will be sampled from decisionMaker
-- `w::Float64`
-exploration weight. Value is usually between 1 to 2.
-Value 1.0 makes MCTS balance between exploration and exploitation like 50%-50%
-Value 2.0 makes MCTS aggressively search the tree
+- `transition::Function`
+a function that defines how the state transitions
+- `transitionargs::NamedTuple`
+arguments for the transition function
+
+# Keyword Arguments
+- `totalsample::Integer`
+the number of child states MCTS samples at each node during the expansion phase
+- `maxdepth::Integer`
+the maximum number of levels MCTS descends during the simulation phase
+- `maxiterations::Integer`
+the number of times MCTS runs the expansion -> simulation -> backpropagation cycle
+- `explorationweight::Number`
+controls how much MCTS explores new states instead of exploiting
+known states. 1.0 balances exploration and exploitation roughly 50%-50%; 2.0 makes MCTS
+explore new states aggressively.

 # Return
 - `plan::Vector{Dict}`
@@ -41,7 +41,7 @@ julia>
 ```

 # TODO
-[] update docstring
+[x] update docstring
 [] return best action

 # Signature
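The updated docstring says `explorationweight` trades exploring new states against exploiting known ones. A minimal sketch of how such a weight usually enters a UCT-style selection score is shown below. The `NodeStats` struct and the `uctscore` and `selectchild` functions are hypothetical names for illustration only; the commit does not show how the package's `mcts` module actually computes its score, and the formula here is one common form of UCT rather than the author's confirmed method.

```julia
# Hypothetical per-child statistics; field names are assumptions, not taken from the commit.
struct NodeStats
    visits::Int           # number of times this child has been visited
    totalreward::Float64  # sum of rewards backpropagated through this child
end

# UCT-style score: mean reward (exploitation) plus a bonus for rarely visited
# children (exploration), scaled by `explorationweight`.
function uctscore(child::NodeStats, parentvisits::Integer, explorationweight::Number)
    child.visits == 0 && return Inf  # unvisited children are always tried first
    exploitation = child.totalreward / child.visits
    exploration = explorationweight * sqrt(log(parentvisits) / child.visits)
    return exploitation + exploration
end

# Return the index of the child with the highest score.
function selectchild(children::Vector{NodeStats}, explorationweight::Number)
    parentvisits = max(1, sum(c.visits for c in children))
    scores = [uctscore(c, parentvisits, explorationweight) for c in children]
    return argmax(scores)
end

# Child 1 has the better mean reward (0.7); child 2 has been visited far less often.
children = [NodeStats(10, 7.0), NodeStats(2, 1.0)]

selectchild(children, 0.0)  # → 1, pure exploitation picks the best known mean
selectchild(children, 2.0)  # → 2, a large weight favors the rarely visited child
```

In this sketch a weight of 0.0 ignores visit counts entirely, while larger weights increasingly favor children that have been sampled less often, which matches the docstring's description of 2.0 as aggressive exploration.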