Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
Survey: https://arxiv.org/pdf/2507.20198
A High-Efficiency System for Large Language Model-Based Search Agents
Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)
TinyML and Efficient Deep Learning Computing | MIT 6.S965/6.5940
Dynamic Attention Mask (DAM) generates adaptive sparse attention masks per layer and head for Transformer models, enabling long-context inference with lower compute and memory overhead, without fine-tuning.
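As a rough illustration of the idea described above (not the repo's actual API; the function name, shapes, and the top-k criterion are assumptions), a per-query, per-head adaptive sparse mask might look like this:

```python
# Minimal sketch of per-head adaptive sparse attention masking (illustrative only;
# names and the top-k keep rule are assumptions, not the DAM implementation).
import torch

def topk_sparse_attention(q, k, v, keep_ratio=0.25):
    """q, k, v: [batch, heads, seq_len, head_dim]. Keeps only the highest-scoring
    keys per query per head and masks out the rest before the softmax."""
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)   # [B, H, L, L]
    k_keep = max(1, int(keep_ratio * scores.size(-1)))
    thresh = scores.topk(k_keep, dim=-1).values[..., -1:]    # per-query cutoff score
    mask = scores >= thresh                                  # adaptive mask per head
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

# Example: 2 heads, 16 tokens, 32-dim heads
q = k = v = torch.randn(1, 2, 16, 32)
out = topk_sparse_attention(q, k, v)   # [1, 2, 16, 32]
```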
Official PyTorch implementation of the paper "Dataset Distillation via the Wasserstein Metric" (ICCV 2025).
A deep learning framework that implements Early Exit strategies in Convolutional Neural Networks (CNNs) using Deep Q-Learning (DQN). The project improves computational efficiency by dynamically choosing the optimal exit point for image classification on CIFAR-10.
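A minimal sketch of the early-exit pattern, assuming CIFAR-10-sized inputs; the repo learns the exit decision with a DQN, whereas this sketch substitutes a fixed confidence threshold for that learned policy, and all class and layer names are illustrative:

```python
# Illustrative early-exit CNN. The exit decision here is a confidence threshold,
# standing in for the DQN policy described above (assumption, not the repo's code).
import torch
import torch.nn as nn

class EarlyExitCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.block2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.exit1 = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes))
        self.exit2 = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes))

    def forward(self, x, threshold=0.9):
        x = self.block1(x)
        logits1 = self.exit1(x)
        if torch.softmax(logits1, dim=-1).max() > threshold:   # confident enough: exit early
            return logits1, "exit1"
        x = self.block2(x)                                      # otherwise run the full network
        return self.exit2(x), "exit2"

model = EarlyExitCNN()
logits, exit_point = model(torch.randn(1, 3, 32, 32))   # CIFAR-10-sized input
```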
Code for the paper "Automated Design for Hardware-aware Graph Neural Networks on Edge Devices"
Task-Aware Dynamic Model Optimization for Multi-Task Learning (IEEE Access 2023)
MOCA-Net: a novel neural architecture combining sparse MoE, external memory, and budget-aware computation. Integrates the Stanford SST-2 dataset, runs in O(L) complexity, and reaches 96.40% accuracy. Built for efficient sequence modeling.
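For the sparse-MoE component only (the external memory and budget mechanism are omitted), a rough top-k routing sketch; layer names, sizes, and the routing rule are assumptions, not MOCA-Net's actual design:

```python
# Sketch of top-k sparse MoE routing: each token is processed by only its
# highest-scoring experts, keeping per-token compute roughly constant.
import torch
import torch.nn as nn

class SparseMoE(nn.Module):
    def __init__(self, dim=128, num_experts=4, top_k=1):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim * 2), nn.GELU(), nn.Linear(dim * 2, dim))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):                                  # x: [tokens, dim]
        gates = torch.softmax(self.router(x), dim=-1)
        weights, idx = gates.topk(self.top_k, dim=-1)      # route each token to its top-k experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = (idx == e).any(dim=-1)                  # tokens assigned to expert e
            if mask.any():
                w = weights[mask][idx[mask] == e].unsqueeze(-1)
                out[mask] += w * expert(x[mask])
        return out

moe = SparseMoE()
y = moe(torch.randn(16, 128))   # 16 tokens routed sparsely across 4 experts
```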
A production-grade GPT transformer implemented from scratch in C++, with complete mathematical derivations and optimized tensor operations; runs on modest hardware.