ALE Vector Environment Guide¶
Introduction¶
The Arcade Learning Environment (ALE) Vector Environment provides a high-performance implementation for running multiple Atari environments in parallel. This implementation utilizes native C++ code with multi-threading to achieve significant performance improvements, especially when running many environments simultaneously.
The vector environment is equivalent to applying Gymnasium's FrameStackObservation and AtariPreprocessing wrappers to each sub-environment, as shown below:
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)  # register the ALE environments with Gymnasium

env_id = "ALE/Breakout-v5"  # example game id
num_envs = 4                # number of parallel environments
stack_num = 4               # number of stacked frames

gym_envs = gym.vector.SyncVectorEnv(
    [
        lambda: gym.wrappers.FrameStackObservation(
            gym.wrappers.AtariPreprocessing(
                gym.make(env_id, frameskip=1),
            ),
            stack_size=stack_num,
            padding_type="zero",
        )
        for _ in range(num_envs)
    ],
)
ale_envs = gym.make_vec(
    env_id,
    num_envs,
    use_fire_reset=False,
    reward_clipping=False,
    repeat_action_probability=0.0,
)
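Whether the two setups match exactly can depend on wrapper defaults, so a simple sanity check is to print and compare the resulting spaces:
# The two setups should expose matching observation and action spaces;
# printing both is an easy way to compare them on a given install.
print(gym_envs.single_observation_space)
print(ale_envs.single_observation_space)
print(gym_envs.single_action_space)
print(ale_envs.single_action_space)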
Key Features¶
Parallel Execution: Run multiple Atari environments simultaneously with minimal overhead
Standard Preprocessing: Includes standard preprocessing steps from the Atari Deep RL literature:
Frame skipping
Observation resizing
Grayscale conversion
Frame stacking
NoOp initialization at reset
Fire reset (for games requiring the fire button to start)
Episodic life modes
Performance Optimizations:
Native C++ implementation
Same-step and Next-step autoreset (see blog for more detail)
Multi-threading for parallel execution
Thread affinity options for better performance on multi-core systems
Batch processing capabilities
Asynchronous Operation: Splits the step operation into send and recv for more flexible control flow
Gymnasium Compatible: Implements the Gymnasium VectorEnv interface
Installation¶
The vector implementation is packaged with ale-py, which can be installed from PyPI: pip install ale-py.
Optionally, users can build the project locally; this requires vcpkg, which installs OpenCV to support observation preprocessing.
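As a quick check that the installation works, the package can be imported and a minimal vector environment constructed and closed (a sketch, assuming the import path used throughout this guide):
import ale_py
from ale_py.vector_env import AtariVectorEnv

print(ale_py.__version__)  # installed ale-py version
envs = AtariVectorEnv(game="breakout", num_envs=1)
envs.close()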
Basic Usage¶
Creating a Vector Environment¶
from ale_py.vector_env import AtariVectorEnv
# Create a vector environment with 4 parallel instances of Breakout
envs = AtariVectorEnv(
    game="breakout",  # ROM id (snake_case), not the CamelCase name used with `gymnasium.make`
    num_envs=4,
)
# Reset all environments
observations, info = envs.reset()
# Take random actions in all environments
actions = envs.action_space.sample()
observations, rewards, terminations, truncations, infos = envs.step(actions)
# Close the environment when done
envs.close()
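Beyond a single step, the environment follows the standard Gymnasium vector API, so a simple random-action rollout can be written as below (the step count and random policy are purely illustrative):
import numpy as np
from ale_py.vector_env import AtariVectorEnv

envs = AtariVectorEnv(game="breakout", num_envs=4)
observations, infos = envs.reset(seed=0)

returns = np.zeros(envs.num_envs)
for _ in range(1_000):  # arbitrary number of steps
    actions = envs.action_space.sample()
    observations, rewards, terminations, truncations, infos = envs.step(actions)
    returns += rewards  # running sums; sub-environments autoreset automatically

print(returns)
envs.close()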
Advanced Configuration¶
The vector environment provides numerous configuration options:
envs = AtariVectorEnv(
    # Required parameters
    game: str = "breakout",  # ROM id (snake_case), not the CamelCase name used with `gymnasium.make`
    num_envs: int = 1,       # Number of parallel environments
    *,
    # Preprocessing parameters
    frameskip: int = 4,      # Number of frames to skip (action repeat)
    grayscale: bool = True,  # Use grayscale observations
    stack_num: int = 4,      # Number of frames to stack
    img_height: int = 84,    # Height to resize frames to
    img_width: int = 84,     # Width to resize frames to
    maxpool: bool = True,    # Whether to max-pool sequential frames
    reward_clipping: bool = True,  # Whether to clip environment step rewards between -1 and 1
    # Environment behavior
    noop_max: int = 30,            # Maximum number of no-ops at reset
    use_fire_reset: bool = True,   # Press FIRE on reset for games that require it
    episodic_life: bool = False,   # End episodes on life loss
    life_loss_info: bool = False,  # Return a termination signal on life loss but don't reset the environment until all lives are lost. If used, this MUST be reported, as it has a significant impact on training performance.
    max_num_frames_per_episode: int = 108000,  # Max frames per episode (27000 steps * 4 frame skip)
    repeat_action_probability: float = 0.0,    # Sticky actions probability
    full_action_space: bool = False,           # Use full action space (not minimal)
    continuous: bool = False,                  # Whether to use continuous actions
    continuous_action_threshold: float = 0.5,  # The threshold at which to use continuous actions
    # Performance options
    batch_size: int = 0,               # Number of environments to process at once (default=0 means `num_envs`)
    autoreset_mode = gym.vector.AutoresetMode.NEXT_STEP,  # How to reset sub-environments when they terminate (https://farama.org/Vector-Autoreset-Mode)
    num_threads: int = 0,              # Number of worker threads (0=auto)
    thread_affinity_offset: int = -1,  # CPU core offset for thread affinity (-1=no affinity)
)
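As a concrete example, a DQN-style training configuration might look like the sketch below; the specific option values are illustrative rather than recommendations:
import gymnasium as gym
from ale_py.vector_env import AtariVectorEnv

envs = AtariVectorEnv(
    game="breakout",
    num_envs=8,
    episodic_life=True,              # treat life loss as episode end during training
    reward_clipping=True,            # clip rewards to [-1, 1]
    repeat_action_probability=0.25,  # sticky actions
    autoreset_mode=gym.vector.AutoresetMode.SAME_STEP,
)
observations, infos = envs.reset()
print(envs.observation_space)  # expected to be roughly Box(0, 255, (8, 4, 84, 84), uint8) with these settings
envs.close()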
Observation Format¶
The observation format from the vector environment is:
observations.shape = (num_envs, stack_size, height, width)
Where:
num_envs: Number of parallel environments
stack_size: Number of stacked frames (typically 4)
height, width: Image dimensions (typically 84x84)
Additionally, with grayscale=False the shape is (num_envs, stack_size, height, width, 3) for RGB frames.
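A quick way to confirm the shapes on a given setup is to construct one environment of each kind and print what reset returns:
from ale_py.vector_env import AtariVectorEnv

gray_envs = AtariVectorEnv(game="breakout", num_envs=4, grayscale=True)
rgb_envs = AtariVectorEnv(game="breakout", num_envs=4, grayscale=False)

gray_obs, _ = gray_envs.reset()
rgb_obs, _ = rgb_envs.reset()

print(gray_obs.shape)  # (4, 4, 84, 84)    -> (num_envs, stack_num, height, width)
print(rgb_obs.shape)   # (4, 4, 84, 84, 3) -> (num_envs, stack_num, height, width, channels)

gray_envs.close()
rgb_envs.close()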
Performance Considerations¶
Number of Environments¶
Increasing the number of environments typically improves throughput until you hit CPU core limits.
For optimal performance, set num_envs close to the number of physical CPU cores.
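One way to follow this guideline is to derive num_envs from the core count at runtime; note that os.cpu_count() reports logical cores, so halving it is only a rough stand-in for the physical core count on machines with SMT:
import os
from ale_py.vector_env import AtariVectorEnv

logical_cores = os.cpu_count() or 1
num_envs = max(1, logical_cores // 2)  # rough estimate of physical cores on SMT machines

envs = AtariVectorEnv(game="breakout", num_envs=num_envs)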
Send/Recv vs Step¶
Using the send/recv API can allow for better overlapping of computation and environment stepping:
# Send actions to environments
envs.send(actions)
# Do other computation here while environments are stepping
# Receive results when ready
observations, rewards, terminations, truncations, infos = envs.recv()
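In a training loop this pattern lets environment stepping overlap with learner-side work; the policy below is a random-action placeholder for whatever model is actually used:
from ale_py.vector_env import AtariVectorEnv

envs = AtariVectorEnv(game="breakout", num_envs=8)
observations, infos = envs.reset()

def policy(obs):
    # placeholder: random actions with the correct batch shape
    return envs.action_space.sample()

for _ in range(100):
    actions = policy(observations)
    envs.send(actions)  # environments begin stepping in the background
    # ... do other work here, e.g. a gradient update on a replay batch ...
    observations, rewards, terminations, truncations, infos = envs.recv()

envs.close()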
Batch Size¶
The batch_size parameter controls how many environments are processed simultaneously by the worker threads:
# Process environments in batches of 4
envs = AtariVectorEnv(game="Breakout", num_envs=16, batch_size=4)
A smaller batch size can improve latency, while a larger batch size can improve throughput.
When a batch size is passed, the info dictionary includes the environment id of each observation;
this is critical because only the first batch_size observations are returned by reset and recv.
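The sketch below shows how those ids might be read back; the "env_id" key name is an assumption here and should be checked against the info dictionary the environment actually returns:
from ale_py.vector_env import AtariVectorEnv

envs = AtariVectorEnv(game="breakout", num_envs=16, batch_size=4)
observations, infos = envs.reset()

# "env_id" is an assumed key name; inspect `infos` to confirm how the
# sub-environment indices are reported by your version of ale-py.
env_ids = infos.get("env_id")
print(observations.shape)  # one batch of observations (batch_size leading dimension)
print(env_ids)             # which sub-environments these observations came from
envs.close()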
Thread Affinity¶
On systems with multiple CPU cores, setting thread affinity can improve performance:
# Set thread affinity starting from core 0
envs = AtariVectorEnv(game="Breakout", num_envs=8, thread_affinity_offset=0)