maxmelichov

This PR adds a setting for DiffMamba (Differential Mamba), which improves on the regular Mamba.
For more information, see the research paper: https://arxiv.org/html/2507.06204v1

maxmelichov and others added 10 commits August 21, 2025 12:29
… setup.py to BSD; adjust CUDA architecture flags in setup.py; update version to 2.2.4 in __init__.py; modify import paths in various files to reflect new package structure.
…e README.md with comprehensive setup instructions, experiment details, and citation information for Differential Mamba.
…nt setup instructions, and include additional package installation details.
…, improving cache management for mixers, and refining dimension inference logic. Update allocation_inference_cache to share a single cache object between mixers.