A curated list for Efficient Large Language Models
A model compression toolkit engineered for usability, comprehensiveness, and efficiency.
[ICML 2024] Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs (a toy pruning-metric sketch follows this list)
D^2-MoE: Delta Decompression for MoE-based LLMs Compression
[ICLR 2024] Jaiswal, A., Gan, Z., Du, X., Zhang, B., Wang, Z., & Yang, Y. Compressing LLMs: The Truth is Rarely Pure and Never Simple.
LLM Inference on AWS Lambda
Papers on LLM compression.
[CAAI AIR 2024] Minimize Quantization Output Error with Bias Compensation (see the bias-compensation sketch below)
Interpretation code for analyzing the effects of LLM compression, from the paper "When Reasoning Meets Compression: Understanding the Effects of LLMs Compression on Large Reasoning Models".
A standard PyTorch implementation of Google's paper "Language Modeling Is Compression", with no reliance on Haiku or JAX. Drawing on the original repository (https://github.com/google-deepmind/language_modeling_is_compression), this code can reproduce the key results from the paper (the code-length idea is sketched below).
Token Price Estimation for LLMs (a back-of-the-envelope estimate is sketched below)
NYCU Edge AI Final Project Using SGLang
Research code for LLM Compression using Functional Algorithms, exploring stratified manifold learning, clustering, and compression techniques. Experiments span synthetic datasets (Swiss Roll, Manifold Singularities) and real-world text embeddings (DBpedia-14). The goal is to preserve semantic structure while reducing model complexity.
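As a companion to the Pruner-Zero entry above, here is a minimal sketch of applying an element-wise pruning metric to a weight matrix. It uses a Wanda-style score |W| * ||x|| rather than an evolved metric, so it illustrates the general recipe only, not Pruner-Zero itself; the shapes and sparsity level are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(128, 256))        # hypothetical layer weight
X = rng.normal(size=(64, 256))         # hypothetical calibration activations
x_norm = np.linalg.norm(X, axis=0)     # per-input-channel activation norm

score = np.abs(W) * x_norm             # element-wise metric; Wanda-style, not an evolved one
sparsity = 0.5                         # prune the lowest-scoring 50% of weights
threshold = np.quantile(score, sparsity)
W_pruned = np.where(score > threshold, W, 0.0)

print("achieved sparsity:", (W_pruned == 0.0).mean())
```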
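For the bias-compensation entry, the sketch below shows the generic idea of folding the mean quantization output error on calibration data into the layer bias. It is an illustration under simple assumptions (symmetric per-tensor int8, random data), not the algorithm from the CAAI AIR 2024 paper.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 512)).astype(np.float32)   # hypothetical layer weight
b = np.zeros(256, dtype=np.float32)                  # hypothetical layer bias
X = rng.normal(size=(1024, 512)).astype(np.float32)  # hypothetical calibration inputs

# Symmetric per-tensor int8 quantization of the weights.
scale = np.abs(W).max() / 127.0
W_q = np.clip(np.round(W / scale), -127, 127) * scale

# Output error introduced by quantization on the calibration set.
err = X @ W.T - X @ W_q.T

# Bias compensation: absorb the mean per-channel error into the bias term,
# so the quantized layer matches the float layer in expectation.
b_comp = b + err.mean(axis=0)

print("mean |error| before:", np.abs(err).mean())
print("mean |error| after :", np.abs(X @ W_q.T + b_comp - (X @ W.T + b)).mean())
```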
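The entry on "Language Modeling Is Compression" rests on the equivalence between prediction and compression: an arithmetic coder driven by a language model spends roughly -log2 p(token) bits per token, so the model's cross-entropy on a text is its achievable code length. The sketch below computes that code length with GPT-2 via Hugging Face transformers; the choice of model and text is an arbitrary assumption, and no actual arithmetic coder is implemented.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

text = "Compression and prediction are two sides of the same coin."
ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits                                  # (1, seq_len, vocab)
log_probs = torch.log_softmax(logits[:, :-1], dim=-1)           # predictions for tokens 1..n-1
token_lp = log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)

bits = -(token_lp.sum() / math.log(2)).item()                   # ideal code length in bits
raw_bits = 8 * len(text.encode("utf-8"))
print(f"model code length: {bits:.1f} bits vs raw size: {raw_bits} bits")
```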
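Token price estimation itself is plain arithmetic; the sketch below shows the back-of-the-envelope version. The per-1k-token prices and the 4-characters-per-token heuristic are made-up placeholders, not any provider's real rates.

```python
def estimate_cost(prompt_chars: int, completion_tokens: int,
                  price_in_per_1k: float = 0.0005,    # placeholder input price (USD / 1k tokens)
                  price_out_per_1k: float = 0.0015,   # placeholder output price (USD / 1k tokens)
                  chars_per_token: float = 4.0) -> float:
    """Rough USD cost of one request under the assumed rates."""
    prompt_tokens = prompt_chars / chars_per_token
    return (prompt_tokens / 1000) * price_in_per_1k + (completion_tokens / 1000) * price_out_per_1k

# Example: a 2,000-character prompt answered with 300 tokens.
print(f"estimated cost: ${estimate_cost(2000, 300):.6f}")
```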