State
import mathy_envs.state
MathyAgentState
MathyAgentState(
    self,
    moves_remaining: int,
    problem: str,
    problem_type: str,
    reward: float = 0.0,
    history: Optional[List[mathy_envs.state.MathyEnvStateStep]] = None,
)
MathyEnvState
MathyEnvState(
    self,
    state: Optional[MathyEnvState] = None,
    problem: Optional[str] = None,
    max_moves: int = 10,
    num_rules: int = 0,
    problem_type: str = 'mathy.unknown',
)
All mutating operations return a copy of the environment adapter with its own state.
This allocation strategy uses more memory, but it removes a class of bugs caused by unintentionally shared data being mutated from two different places.
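The copy-on-mutation strategy described above can be sketched roughly as follows. This is a stand-alone illustration, not the mathy_envs implementation: `SketchState`, `SketchStep`, and `with_move` are hypothetical names, and only the field names (`problem`, `moves_remaining`, `history`, `raw`, `action`) are borrowed from the signatures on this page.

```python
from dataclasses import dataclass, replace
from typing import NamedTuple, Tuple


class SketchStep(NamedTuple):
    """Hypothetical stand-in for a history entry (raw text plus action)."""
    raw: str
    action: Tuple[int, int]


@dataclass(frozen=True)
class SketchState:
    """Hypothetical immutable state; not the real MathyEnvState."""
    problem: str
    moves_remaining: int
    history: Tuple[SketchStep, ...] = ()

    def with_move(self, problem: str, action: Tuple[int, int]) -> "SketchState":
        # Build a new object instead of mutating self, so callers that
        # still hold the old state never observe the change.
        step = SketchStep(raw=problem, action=action)
        return replace(
            self,
            problem=problem,
            moves_remaining=self.moves_remaining - 1,
            history=self.history + (step,),
        )


before = SketchState(problem="4x + 2x", moves_remaining=10)
after = before.with_move("6x", action=(1, 1))

print(before.problem, before.moves_remaining)  # 4x + 2x 10
print(after.problem, after.moves_remaining)    # 6x 9
```

The old object stays valid after the move, which is the property the trade-off above pays for with extra allocations.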
from_np
MathyEnvState.from_np(input_bytes: numpy.ndarray) -> 'MathyEnvState'
from_string
MathyEnvState.from_string(input_string: str) -> 'MathyEnvState'
get_out_state
MathyEnvState.get_out_state(
    self,
    problem: str,
    action: Tuple[int, int],
    moves_remaining: int,
) -> 'MathyEnvState'
get_problem_hash
MathyEnvState.get_problem_hash(self) -> List[int]
Example:
mycorp.envs.solve_impossible_problems -> [12375561, -12375561]
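As a rough illustration of the idea of mapping a namespace string to a two-element integer list, here is a minimal sketch. It does not reproduce the algorithm mathy_envs uses and will not yield the values in the example above; `sketch_problem_hash` is a hypothetical helper, and the choice of `zlib.crc32` and `zlib.adler32` is an assumption made only to get two deterministic integers.

```python
import zlib
from typing import List


def sketch_problem_hash(problem_type: str) -> List[int]:
    """Map a namespace string to two integers (illustrative only)."""
    data = problem_type.encode("utf-8")
    # Two different checksums stand in for the two hash columns.
    return [zlib.crc32(data), zlib.adler32(data)]


a = sketch_problem_hash("mycorp.envs.solve_impossible_problems")
b = sketch_problem_hash("mathy.unknown")
print(a)
print(b)
```

The useful properties are that the same string always maps to the same pair and that different problem types are very unlikely to collide.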
to_np
MathyEnvState.to_np(self, pad_to: Optional[int] = None) -> numpy.ndarray
to_observation
MathyEnvState.to_observation(
    self,
    move_mask: Optional[List[List[int]]] = None,
    hash_type: Optional[List[int]] = None,
    parser: Optional[mathy_core.parser.ExpressionParser] = None,
    normalize: bool = True,
    max_seq_len: Optional[int] = None,
) -> mathy_envs.state.MathyObservation
to_string
MathyEnvState.to_string(self) -> str
MathyEnvStateStep
MathyEnvStateStep(self, *args, **kwargs)
action
A tuple indicating the chosen action and the node it was applied to.
raw
The input text at the timestep.
MathyObservation
MathyObservation(self, *args, **kwargs)
mask
0/1 mask where 0 indicates an invalid action. shape=[n,]
nodes
Tree node types in the current environment state. shape=[n,]
time
Float value between 0.0 and 1.0 indicating the time elapsed. shape=[1,]
type
Two-column hash of the problem environment type. shape=[2,]
values
Tree node value sequences, with non-number indices set to 0.0. shape=[n,]
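To make the fields above concrete, here is a small sketch that builds an observation-like tuple and reads two of its fields. `SketchObservation`, `elapsed_time`, and `valid_actions` are hypothetical names, the sample values are made up, and the elapsed-time formula (fraction of the move budget already spent) is an assumption, not something this page confirms.

```python
from typing import List, NamedTuple


class SketchObservation(NamedTuple):
    """Hypothetical container mirroring the field names above."""
    nodes: List[int]
    mask: List[int]
    values: List[float]
    type: List[int]
    time: List[float]


def elapsed_time(max_moves: int, moves_remaining: int) -> List[float]:
    # Assumed formula: fraction of the move budget already spent.
    return [(max_moves - moves_remaining) / max_moves]


def valid_actions(mask: List[int]) -> List[int]:
    # Keep the indices whose mask entry is 1 (0 marks an invalid action).
    return [i for i, flag in enumerate(mask) if flag == 1]


obs = SketchObservation(
    nodes=[3, 1, 3],           # made-up node type ids
    mask=[0, 1, 0],            # only the action at index 1 is valid
    values=[4.0, 0.0, 2.0],    # non-number nodes carry 0.0
    type=[12375561, -12375561],
    time=elapsed_time(max_moves=10, moves_remaining=7),
)

print(valid_actions(obs.mask))  # [1]
print(obs.time)                 # [0.3]
```

An agent would typically restrict its choice to the indices returned by `valid_actions` before sampling an action.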
ObservationFeatureIndices
ObservationFeatureIndices(self, *args, **kwargs)
mask
An enumeration.
nodes
An enumeration.
time
An enumeration.
type
An enumeration.
values
An enumeration.
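One common use for an enumeration like this is to index into an observation that is stored positionally. The sketch below assumes the members are integer positions and picks an ordering purely for illustration; `SketchFeatureIndices` is a hypothetical class, and the actual values should be checked against the library.

```python
from enum import IntEnum


class SketchFeatureIndices(IntEnum):
    """Hypothetical ordering; check the library for the real values."""
    nodes = 0
    mask = 1
    values = 2
    type = 3
    time = 4


# An observation stored positionally, e.g. as a plain tuple.
observation = (
    [3, 1, 3],                # nodes
    [0, 1, 0],                # mask
    [4.0, 0.0, 2.0],          # values
    [12375561, -12375561],    # type
    [0.3],                    # time
)

# IntEnum members behave like ints, so they can be used as indices.
mask = observation[SketchFeatureIndices.mask]
time = observation[SketchFeatureIndices.time]
print(mask, time)  # [0, 1, 0] [0.3]
```

Named indices keep call sites readable and avoid scattering magic numbers through code that consumes observations.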