This is an unofficial implementation of SplitMeanFlow from the paper "SplitMeanFlow: Interval Splitting Consistency in Few-Step Generative Modeling" by ByteDance. [arxiv]. This implementation is based on my understanding of the paper and uses the DiT architecture as the backbone.
Note: This is not the official implementation. For the official code, please refer to the authors' repository when it becomes available.
SplitMeanFlow introduces a novel approach for few-step generation by learning average velocity fields through an algebraic consistency principle. The key innovation is avoiding expensive Jacobian-Vector Product (JVP) computations while achieving one-step generation quality comparable to traditional multi-step methods.
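As I read the paper, the central object is the average velocity u(z_t, r, t) over an interval [r, t], which must stay consistent when the interval is split at any intermediate point s; one-step sampling then simply applies u once over the whole interval [0, 1]. Here is a toy NumPy sketch of both ideas — the constant field and all names are illustrative stand-ins, not the paper's or this repo's actual code:

```python
import numpy as np

def toy_average_velocity(z, r, t):
    """Toy stand-in for the learned field u(z, r, t): for a linear path the
    instantaneous velocity is constant, so every average velocity equals it
    too -- convenient for checking the splitting identity below."""
    return np.full_like(z, 2.0)

def split_consistency_gap(u, z_t, r, s, t):
    """Residual of the interval-splitting identity
    (t - r) * u(z_t, r, t) = (s - r) * u(z_s, r, s) + (t - s) * u(z_t, s, t),
    where z_s is reached by stepping z_t backward over [s, t]."""
    u_st = u(z_t, s, t)
    z_s = z_t - (t - s) * u_st
    lhs = (t - r) * u(z_t, r, t)
    rhs = (s - r) * u(z_s, r, s) + (t - s) * u_st
    return float(np.abs(lhs - rhs).max())

z1 = np.ones(4)  # "noise" sample at t = 1
gap = split_consistency_gap(toy_average_velocity, z1, 0.0, 0.5, 1.0)

# One-step generation: integrate the whole interval [0, 1] in a single step
x0 = z1 - 1.0 * toy_average_velocity(z1, 0.0, 1.0)
```

For this toy field the gap is exactly zero; a trained model only satisfies the identity approximately, which is what the consistency loss enforces.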
Create the conda environment:
conda env create -f environment.yml
conda activate splitmeanflow
First, train a standard flow matching teacher:
python train_teacher_mnist.py
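For reference, the teacher objective is plain flow matching. A minimal NumPy sketch of how the interpolated state and regression target can be constructed, assuming a linear path with data at t = 0 and noise at t = 1 (variable and function names are mine, not the repo's):

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_pair(x, eps, t):
    """Linear path z_t = (1 - t) * x + t * eps (data at t=0, noise at t=1).
    The teacher regresses v(z_t, t) onto the path velocity eps - x."""
    z_t = (1.0 - t) * x + t * eps
    return z_t, eps - x

x = rng.standard_normal((2, 4))    # a batch of (flattened) training images
eps = rng.standard_normal((2, 4))  # matching Gaussian noise samples
t = rng.uniform(size=(2, 1))       # per-sample times in [0, 1]
z_t, v_target = flow_matching_pair(x, eps, t)
# the teacher's loss would be mean((model(z_t, t) - v_target) ** 2)
```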
Samples during training:
You can download the pretrained teacher model here: model
Then train the SplitMeanFlow student:
python train_student.py
Samples during training:
The training follows Algorithm 1 from the paper with some modifications based on my interpretation.
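A hedged sketch of the student objective as I read Algorithm 1: the target for the full interval [r, t] is assembled from the two sub-intervals [r, s] and [s, t], with no JVP anywhere. All names are hypothetical, and the real training code would detach the target (stop-gradient) and include boundary/teacher terms omitted here:

```python
import numpy as np

def splitmeanflow_loss(u, z_t, r, s, t):
    """One student objective step: build a JVP-free target for u(z_t, r, t)
    from the two sub-interval predictions and regress onto it."""
    u_st = u(z_t, s, t)                  # average velocity over [s, t]
    z_s = z_t - (t - s) * u_st           # step backward to time s
    u_rs = u(z_s, r, s)                  # average velocity over [r, s]
    target = ((s - r) * u_rs + (t - s) * u_st) / (t - r)
    # in actual training code the target would be detached (stop-gradient)
    pred = u(z_t, r, t)
    return float(np.mean((pred - target) ** 2))

# sanity check: a field whose average velocity is constant everywhere
# satisfies the splitting identity exactly, so the loss vanishes
constant_field = lambda z, r, t: np.full_like(z, 2.0)
loss = splitmeanflow_loss(constant_field, np.ones(4), 0.0, 0.3, 0.9)
```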
- Trained on the MNIST dataset
- Trained on the CIFAR-10 dataset
Disclaimer: These are preliminary results from my implementation and may not match the paper's performance.
- Performance Gap: This implementation does not yet reach the paper's reported metrics.
- Training Stability: Training occasionally produces NaN losses; gradient clipping was added as a workaround.
- CFG Handling: My interpretation of CFG-free training may differ from the official approach.
- Time Sampling: Uniform time sampling is used instead of the log-normal schedule mentioned in the paper.
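On the time-sampling gap: a common non-uniform choice in this line of work is a logit-normal schedule (the sigmoid of a Gaussian), which keeps times inside (0, 1) while shifting mass away from the endpoints. Below is a sketch of what swapping it in might look like; the mu/sigma values are guesses, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_times_uniform(n):
    """What this repo currently does: two U[0, 1] draws, ordered so r <= t."""
    a, b = rng.uniform(size=(2, n))
    return np.minimum(a, b), np.maximum(a, b)

def sample_times_logit_normal(n, mu=-0.4, sigma=1.0):
    """Possible alternative: sigmoid of a Gaussian keeps times in (0, 1)
    while concentrating mass away from the endpoints.
    mu and sigma here are assumptions, not values from the paper."""
    a = 1.0 / (1.0 + np.exp(-rng.normal(mu, sigma, size=n)))
    b = 1.0 / (1.0 + np.exp(-rng.normal(mu, sigma, size=n)))
    return np.minimum(a, b), np.maximum(a, b)

r_u, t_u = sample_times_uniform(1024)
r, t = sample_times_logit_normal(1024)
```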
This is an unofficial implementation created for learning purposes. If you find any bugs or have suggestions for improvements, please open an issue or submit a PR. If you have insights about the paper or the official implementation, I'd love to hear them!
Please cite the original paper:
@article{guo2025splitmeanflow,
title={SplitMeanFlow: Interval Splitting Consistency in Few-Step Generative Modeling},
author={Guo, Yi and Wang, Wei and Yuan, Zhihang and others},
journal={arXiv preprint arXiv:2507.16884},
year={2025}
}
- Original paper by ByteDance Seed team
- DiT implementation adapted from facebookresearch/DiT
- Thanks to the authors for the clear paper and algorithm descriptions
This unofficial implementation is released under the MIT License. Please refer to the original paper and official implementation for any commercial use.