This repository presents Reinforcement Learning Guidance (RLG), an inference-time method for enhancing and controlling the alignment of diffusion models. RLG extends the widely used Classifier-Free Guidance (CFG): it combines the outputs of a base diffusion model and an RL-fine-tuned model through a geometric average, giving precise, adjustable control over alignment strength without any additional training.
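Conceptually, a geometric average of the two models' distributions corresponds to a linear interpolation/extrapolation of their predictions, analogous to how CFG combines unconditional and conditional predictions. The snippet below is only an illustrative sketch, not the repo's implementation; `w` plays the role of the `--tuned_guidance_scale` flag described later (`0.0` gives the base model, `1.0` the RL-fine-tuned model, larger values push alignment harder).

```python
import torch

def rlg_combine(base_pred: torch.Tensor, tuned_pred: torch.Tensor, w: float) -> torch.Tensor:
    """Illustrative CFG-style combination of base and RL-fine-tuned predictions.

    A geometric average of the two model distributions corresponds to linearly
    interpolating/extrapolating their score (noise / velocity) predictions:
      w = 0.0 -> base model only
      w = 1.0 -> RL-fine-tuned model only
      w > 1.0 -> stronger RL alignment (possibly at some cost to diversity)
    """
    return base_pred + w * (tuned_pred - base_pred)
```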
Let's get RLG up and running! Follow these simple steps to set up your environment and generate stunning images.
Begin by cloning the repository and installing the necessary packages.
```bash
git clone https://github.com/jinluo12345/Reinforcement-learning-guidance.git
cd Reinforcement-learning-guidance
conda create -n rlg python=3.10.16 -y
conda activate rlg  # Activate your new environment
pip install -e .
```
RLG requires both a reference base model and an RL-fine-tuned model to operate. Please download them in advance.
RLG currently supports stable-diffusion-3.5 as its base model.

- SD3.5: Accessible via `stabilityai/stable-diffusion-3.5-medium`.
RLG currently supports models fine-tuned with Flow-GRPO.
- PickScore Alignment: Download from 🤗PickScore.
- Text Rendering: Download from 🤗Text.
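One way to fetch the checkpoints ahead of time is with `huggingface_hub` (a sketch, not part of the repo's scripts; the LoRA repo ID below is a placeholder for whichever Flow-GRPO checkpoint you downloaded from the links above, and the base model may require accepting its license on the Hub):

```python
from huggingface_hub import snapshot_download

# Base model (accepting the SD3.5 license on the Hugging Face Hub may be required).
base_path = snapshot_download("stabilityai/stable-diffusion-3.5-medium")

# RL-fine-tuned LoRA checkpoint: replace this placeholder with the actual repo ID
# (or use a local path) of the Flow-GRPO checkpoint linked above.
lora_path = snapshot_download("<flow-grpo-lora-repo-id>")

print(base_path, lora_path)  # pass lora_path to --lora_path when generating
```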
For comprehensive evaluation, you might need these models:

- PickScore: `laion/CLIP-ViT-H-14-laion2B-s32B-b79K` and `yuvalkirstain/PickScore_v1`
- Aesthetic Score: `openai/clip-vit-large-patch14`
- ImageReward Score: `zai-org/ImageReward`
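For reference, here is a minimal sketch of how the PickScore pair above fits together (assuming the standard Transformers CLIP interface: the processor comes from the LAION CLIP checkpoint, the weights from `PickScore_v1`; the repo's own evaluation script may differ):

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

processor = AutoProcessor.from_pretrained("laion/CLIP-ViT-H-14-laion2B-s32B-b79K")
model = AutoModel.from_pretrained("yuvalkirstain/PickScore_v1").eval()

def pickscore(prompt: str, image: Image.Image) -> float:
    # Encode the image and the prompt, then take the scaled cosine similarity.
    image_inputs = processor(images=image, return_tensors="pt")
    text_inputs = processor(text=prompt, padding=True, truncation=True,
                            max_length=77, return_tensors="pt")
    with torch.no_grad():
        img_emb = model.get_image_features(**image_inputs)
        txt_emb = model.get_text_features(**text_inputs)
        img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
        txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
        score = model.logit_scale.exp() * (txt_emb @ img_emb.T)
    return score.item()
```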
Unleash the power of RLG to generate images!
To generate images using the Reinforcement Learning Guidance (RLG) framework, use the following command:
```bash
python scripts/generate.py \
    --config="config/dgx.py:pickscore_sd3" \
    --lora_path="{your downloaded RL-finetuned model's ckpt path}" \
    --tuned_guidance_scale=1.6 \
    --output_dir="logs/" \
    --prompt_file="dataset/pickscore/test.txt"
```
Note: If you wish to generate images using the original base model or the RL-fine-tuned model without RLG, simply set `tuned_guidance_scale` accordingly:

- Set `tuned_guidance_scale` to `0.0` for the original base model.
- Set `tuned_guidance_scale` to `1.0` for the original RL-fine-tuned model (sampling without RLG).
- `--config`: Specifies the configuration file and model setup (e.g., `config/dgx.py:pickscore_sd3` refers to the `pickscore_sd3` configuration in `dgx.py`, fine-tuned on the PickScore reward).
- `--lora_path`: The file path to the LoRA (Low-Rank Adaptation) weights of your downloaded RL-fine-tuned model.
- `--tuned_guidance_scale`: Adjusts the strength of RL-guided alignment during inference.
  - Recommended range: `1.0` to `3.0`.
  - Higher values enhance alignment but might reduce image diversity.
- `--output_dir`: The directory where your generated images will be saved.
- `--prompt_file`: Path to a text file containing the prompts for image generation (see the sketch below for the assumed format).
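A quick way to create your own prompt file, assuming (as with the provided `dataset/pickscore/test.txt`) that it is plain text with one prompt per line; check the bundled files to confirm the format:

```python
from pathlib import Path

# Hypothetical prompts; one prompt per line is the assumed format.
prompts = [
    "a photo of an astronaut riding a horse on mars",
    "a watercolor painting of a fox in a snowy forest",
]
Path("my_prompts.txt").write_text("\n".join(prompts) + "\n")
# Then pass --prompt_file="my_prompts.txt" to scripts/generate.py
```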
The above steps install tools for generation. Evaluation models, however, may have specific dependency versions that could conflict if installed in a single environment. To prevent this, install only the specific reward models you intend to use.
PickScore requires no additional installation steps. It's ready to go!
To use ImageReward, install it along with the CLIP library:
```bash
pip install image-reward
pip install git+https://github.com/openai/CLIP.git
```
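For reference, ImageReward exposes a simple scoring API (a sketch of its documented usage, independent of this repo's evaluation script; the prompt and image paths are placeholders):

```python
import ImageReward as RM

# Downloads the ImageReward-v1.0 checkpoint on first use.
model = RM.load("ImageReward-v1.0")

# Placeholder prompt and image paths -- substitute your generated images.
rewards = model.score("a photo of an astronaut riding a horse",
                      ["logs/sample_0.png", "logs/sample_1.png"])
print(rewards)
```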
We provide scripts to evaluate Aesthetic Score, PickScore, and ImageReward. To run evaluations, use the following command:
```bash
python scripts/cal_aes.py --input_dir logs/ --batch_size 1
```
This script will by default evaluate all supported rewards and save the results into a CSV file within your input directories.
Customizing Evaluation: If you wish to evaluate only specific scores (e.g., just PickScore), you can modify the `score_dict` within `cal_aes.py` by setting the other scores to `0.0`:
```python
# Example to evaluate only PickScore
score_dict = {
    "aesthetic": 0.0,
    "imagereward": 0.0,
    "pickscore": 1.0,  # Enable PickScore evaluation
}
```
This repo is based on Flow-GRPO. We thank the authors for their valuable contributions to the AIGC community. Special thanks to Liu Jie for the excellent flow_grpo repo.