mathy_envs¶

Develop agents that can complete Mathy's challenging algebra environments.

Mathy includes a framework for building reinforcement learning environments that transform math expressions using a set of user-defined actions.

Built-in environments aim to simplify algebra problems and expose generous customization points for user-created ones.

Large Action Spaces: Mathy environments have 2D action spaces, in which any known rule can be applied to any node in the expression tree. Without masking, this makes Mathy environments very difficult to explore.
Masked Action Support: To enable better curriculum learning and toy problem creation, Mathy agents are given access to a mask of the actions that are valid in the current state. When agents use the mask to select actions, the environments become much easier to explore.
Rich Reward Signals: The environments are constructed such that agents receive reward feedback at every action. Custom environments can implement their own reward schemes.
Curriculum Learning: Mathy envs cover related but distinct math problem types, and scale their complexity based on multiple controllable inputs. They also include built-in easy/normal/hard variants of each environment.
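
To make the masking idea concrete, here is a minimal sketch of selecting an action from a 2D (rule, node) space using a validity mask. It uses only the Python standard library and does not call Mathy's API: the mask layout (one row per rule, one column per node) and the helper name sample_valid_action are assumptions made for illustration.

```python
import random
from typing import List, Tuple


def sample_valid_action(mask: List[List[int]], rng: random.Random) -> Tuple[int, int]:
    """Pick a uniformly random (rule, node) pair whose mask entry is nonzero."""
    valid = [
        (rule, node)
        for rule, row in enumerate(mask)
        for node, allowed in enumerate(row)
        if allowed
    ]
    if not valid:
        raise ValueError("mask contains no valid actions")
    return rng.choice(valid)


# Hypothetical mask: 3 rules x 5 tree nodes, with only 3 of the 15 pairs valid.
mask = [
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]

rule, node = sample_valid_action(mask, random.Random(0))
print(f"apply rule {rule} to node {node}")
```

Sampling only from the valid pairs means an agent never spends a step on a rule that cannot apply to the chosen node, which is why masked exploration is so much cheaper than sampling the full rule-by-node grid.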

Requirements¶

Python 3.6+

Installation¶

$ pip install mathy_envs

Episodes¶

Mathy agents interact with environments in sequences of steps called episodes, which follow the standard RL episode lifecycle:

Episode pseudocode:

set state to an initial state from the environment
while state is not terminal:
    take an action and update state
done
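
The lifecycle above can be sketched in runnable Python. The CountdownEnv class below is a hypothetical toy environment invented for this example, not part of Mathy; it only mirrors the reset/step shape common to RL libraries and returns a reward at every step.

```python
import random
from typing import Tuple


class CountdownEnv:
    """Hypothetical toy environment: reach zero by subtracting 1 or 2."""

    def __init__(self, start: int = 5) -> None:
        self.start = start
        self.state = start

    def reset(self) -> int:
        # Return the initial state for a new episode.
        self.state = self.start
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        # Apply the action, clamping so the state never drops below zero.
        self.state = max(0, self.state - action)
        terminal = self.state == 0
        # Small penalty per step, positive reward on reaching the goal.
        reward = 1.0 if terminal else -0.1
        return self.state, reward, terminal


def run_episode(env: CountdownEnv, rng: random.Random, max_steps: int = 100) -> Tuple[int, float]:
    """Run one episode with a random policy; return (steps taken, total reward)."""
    state = env.reset()
    total_reward = 0.0
    steps = 0
    terminal = False
    while not terminal and steps < max_steps:
        action = rng.choice([1, 2])  # a trained agent would choose based on state
        state, reward, terminal = env.step(action)
        total_reward += reward
        steps += 1
    return steps, total_reward


steps, total_reward = run_episode(CountdownEnv(), random.Random(0))
print(f"episode finished in {steps} steps, total reward {total_reward:.1f}")
```

The max_steps guard is a design choice for the sketch: it keeps the loop bounded even if a policy never reaches a terminal state.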

Other Libraries¶

Mathy supports alternative reinforcement learning libraries.

Gymnasium¶

Mathy supports Gymnasium via a small wrapper.

You can import the mathy_envs.gym module separately to register the environments:

#!pip install gymnasium
import gymnasium as gym
from mathy_envs.gym import MathyGymEnv

all_envs = gym.registry.values()
# Filter to just mathy registered envs
mathy_envs = [e for e in all_envs if e.id.startswith("mathy-")]

assert len(mathy_envs) > 0

# Each env can be created and produce an initial observation without
# special configuration.
for gym_env_spec in mathy_envs:
    wrapper_env: MathyGymEnv = gym.make(gym_env_spec.id)  # type:ignore
    assert wrapper_env is not None
    observation = wrapper_env.reset()
    assert observation is not None

Contributors¶

Mathy Envs wouldn't be possible without the contributions of the following people:

Justin DuJardin

This project follows the all-contributors specification. Contributions of any kind are welcome!