This repository contains the code for the paper Learning Discrete World Models for Heuristic Search, accepted to the first Reinforcement Learning Conference (RLC 2024).
DeepCubeAI is an algorithm that learns a discrete world model and uses deep reinforcement learning to learn a heuristic function that generalizes over start and goal states. The learned model and the learned heuristic function are then combined with heuristic search, such as Q* search, to solve sequential decision-making problems. For more details, please refer to the paper.
- Key contributions
- Main results
- Quick start: installing from PyPI or from source
- Install: docs/installation.md
- CLI reference: docs/cli.md
- Stage-by-stage usage (all flags and paths): docs/usage.md
- Reproduce the paper results: docs/reproduce.md
- Distributed training for Q-learning: docs/qlearning_distributed.md
- Environments and integration: docs/environments.md
- Python usage (API snippets): docs/python_api.md
- Citing the paper
- Contact
DeepCubeAI consists of three key components:
- Discrete World Model
  - Learns a world model that represents states in a discrete latent space.
  - This approach tackles two challenges: model degradation and state re-identification.
    - Prediction errors less than 0.5 are corrected by rounding.
    - Re-identifies states by comparing two binary vectors.
- Generalizable Heuristic Function
  - Utilizes Deep Q-Network (DQN) and hindsight experience replay (HER) to learn a goal-conditioned heuristic function that generalizes over start and goal states.
- Optimized Search
  - Integrates the learned model and the learned heuristic function with heuristic search to solve problems. It uses Q* search, a variant of A* search optimized for DQNs, which enables faster and more memory-efficient planning.
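To make the search component more concrete, here is a minimal, simplified sketch of the Q* idea on a toy problem. It is not the DeepCubeAI implementation. The open list stores (state, action) pairs, so popping an entry generates only one child, and one call to the Q-function scores all actions of that child at once. The names `q_star_search`, `step`, and `toy_q_values` are hypothetical. In particular, `toy_q_values` is a hand-written stand-in for a trained DQN and `step` is a stand-in for the learned world model.

```python
import heapq
from itertools import count
from typing import Callable, Dict, List, Optional, Sequence

ACTIONS = (-1, +1)  # toy problem: move left or right on a number line


def step(state: int, action: int) -> int:
    """Stand-in for the learned world model: apply an action to a state."""
    return state + action


def toy_q_values(state: int, goal: int) -> List[float]:
    """Stand-in for a DQN: estimated cost of taking each action, then reaching the goal."""
    return [1.0 + abs(step(state, a) - goal) for a in ACTIONS]


def q_star_search(
    start: int,
    goal: int,
    actions: Sequence[int],
    step_fn: Callable[[int, int], int],
    q_fn: Callable[[int, int], List[float]],
    max_iters: int = 10_000,
) -> Optional[List[int]]:
    """Return a list of actions from start to goal, or None if none is found."""
    if start == goal:
        return []
    tie = count()  # unique tiebreaker so the heap never compares states
    open_list: list = []
    best_g: Dict[int, float] = {start: 0.0}

    def push(state: int, g: float, path: List[int]) -> None:
        # One Q-function call scores every action; no child is generated yet.
        for a_idx, q in enumerate(q_fn(state, goal)):
            heapq.heappush(open_list, (g + q, next(tie), state, a_idx, g, path))

    push(start, 0.0, [])
    for _ in range(max_iters):
        if not open_list:
            return None
        _, _, state, a_idx, g, path = heapq.heappop(open_list)
        child = step_fn(state, actions[a_idx])  # generate a single child
        child_g = g + 1.0  # assumes unit transition cost
        child_path = path + [actions[a_idx]]
        if child == goal:
            return child_path
        if child_g >= best_g.get(child, float("inf")):
            continue  # already reached this state at least as cheaply
        best_g[child] = child_g
        push(child, child_g, child_path)
    return None


print(q_star_search(0, 3, ACTIONS, step, toy_q_values))
```

Storing the full path in every heap entry keeps the sketch short. A practical implementation would more likely keep parent pointers instead.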
Main results:
- Accurately reconstructs ground-truth images after thousands of timesteps.
- Achieves 100% success on Rubik's Cube (canonical goal), Sokoban, IceSlider, and DigitJump.
- Achieves 99.9% success on Rubik's Cube with reversed start/goal states.
- Demonstrates significant improvement in solving complex planning problems and generalizing to unseen goals.
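The rounding and re-identification ideas from the discrete world model described above can be illustrated with a small sketch. It assumes latent states are binary vectors and that the model outputs one real value per bit. The helper names `discretize` and `same_state` are hypothetical and are not part of the deepcubeai API.

```python
from typing import Sequence, Tuple


def discretize(pred: Sequence[float]) -> Tuple[int, ...]:
    """Round each predicted latent value to the nearest bit (0 or 1)."""
    return tuple(1 if p >= 0.5 else 0 for p in pred)


def same_state(a: Sequence[int], b: Sequence[int]) -> bool:
    """Re-identify a state by exact comparison of two binary vectors."""
    return tuple(a) == tuple(b)


true_next = (1, 0, 0, 1)

# Every per-bit error is below 0.5, so rounding recovers the true state
# and the error is not carried into the next prediction step.
noisy_pred = [0.93, 0.12, 0.41, 0.58]
assert same_state(discretize(noisy_pred), true_next)

# An error of 0.5 or more on any bit flips that bit after rounding.
bad_pred = [0.93, 0.12, 0.61, 0.58]
assert not same_state(discretize(bad_pred), true_next)

# Binary tuples are hashable, so previously seen states can be kept in a set.
seen = {true_next}
assert discretize(noisy_pred) in seen
```

Because rounded states are exact binary vectors, a search procedure can detect revisited states with a plain set or dictionary lookup instead of a distance threshold.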
DeepCubeAI provides a Python package and CLI. You can install it from PyPI or build it from source. The package supports Python 3.10-3.12.
Note
You can find detailed installation instructions, including using Conda for environment management, in the installation guide.
deepcubeai is available on PyPI. The following steps install it using uv:
- Install `uv` by following the "Install uv" guide on the official website.
- Create and activate a virtual environment:

      # create a .venv in the current folder
      uv venv
      # macOS & Linux
      source .venv/bin/activate
      # Windows (PowerShell)
      .venv\Scripts\activate

  If you have multiple Python versions, ensure you use a supported one (3.10-3.12), e.g.:

      uv venv --python 3.12

- Install the package (using uv's pip interface):

      uv pip install deepcubeai
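After installing, one way to sanity-check the environment is a short script like the following. It is a sketch that uses only the standard library. The helper names `python_supported` and `installed_version` are hypothetical, the supported range (3.10 to 3.12) is taken from the text above, and the distribution name `deepcubeai` is assumed from the install command.

```python
import sys
from importlib import metadata
from typing import Optional, Tuple

SUPPORTED_MIN = (3, 10)  # lowest supported Python version, per the text above
SUPPORTED_MAX = (3, 12)  # highest supported Python version, per the text above


def python_supported(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True if (major, minor) lies within the supported range."""
    return SUPPORTED_MIN <= tuple(version) <= SUPPORTED_MAX


def installed_version(dist_name: str = "deepcubeai") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("Python version supported:", python_supported())
print("deepcubeai version:", installed_version() or "not installed")
```

Reading the version through `importlib.metadata` avoids importing the package itself, which can be convenient when only the installation needs to be confirmed.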
To install from source, you can use Pixi, a package management tool that provides fast, reproducible environments with support for Conda and PyPI dependencies. The pixi.toml and pixi.lock files define reproducible environments with exact dependency versions.
- Install Pixi: Follow the official installation guide.
- Clone the repository:

      git clone https://github.com/misaghsoltani/DeepCubeAI.git
      cd DeepCubeAI

- Install the environment of your choice (the default is `default`):

      pixi install  # or: pixi install -e default
      # Or the dev environment with additional dev dependencies:
      pixi install -e dev

  You may also install all environments at once:

      pixi install --all

- Enter the environment. The first run may perform dependency resolution if the environment is not already installed:

      pixi shell  # or: pixi shell -e default
      # or for the dev environment:
      pixi shell -e dev
Note
There is also an environment named all, which installs all dependencies from every environment into a single environment. This differs from installing all environments separately.
- The command `pixi install -e all` installs the environment named `all`.
- The command `pixi install --all` installs each environment separately (i.e., `default`, `dev`, `build`, `glibc217`, `all`, and `cuda`).
To see the available CLI options, run:

    # If you have already entered the environment with Pixi:
    deepcubeai --help  # or -h
    # Or, without entering the environment:
    pixi run deepcubeai --help  # or -h

Or use it as a Python package:

    import deepcubeai
    print(deepcubeai.__version__)

MIT License - see LICENSE.
If you use DeepCubeAI in your research, please cite:
@article{agostinelli2025learning,
title={Learning Discrete World Models for Heuristic Search},
author={Agostinelli, Forest and Soltani, Misagh},
journal={Reinforcement Learning Journal},
volume={4},
pages={1781--1792},
year={2025}
}
If you have any questions or issues, please contact Misagh Soltani (msoltani@email.sc.edu).



