v0.22.0
Summary:
New Models:
We've integrated a range of cutting-edge models, each designed to tackle specific challenges in their respective domains:
- Gemma 3 270M: An 18-layer, text-only model with 270M parameters, released in pretrained and instruction-tuned variants and designed for hyper-efficient AI, particularly task-specific fine-tuning (see the generation sketch after this list).
- Qwen3: A powerful, large-scale multilingual language model that excels across natural language processing tasks, from text generation to complex reasoning.
- DeiT: Data-efficient Image Transformers (DeiT), designed to train Vision Transformers effectively with less data, making high-performance visual models more accessible.
- HGNetV2: An advanced version of the Hybrid-Grouped Network, known for its efficient architecture in computer vision tasks, particularly optimized for performance on diverse hardware.
- DINOV2: A state-of-the-art self-supervised Vision Transformer that learns robust visual representations without relying on explicit labels, ideal for foundation models (see the feature-extraction sketch after this list).
- ESM & ESM2: Evolutionary Scale Modeling (ESM & ESM2) protein language models for understanding protein sequences and structures, with ESM2 offering improved capabilities for bioinformatics research.
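Each of the new models is loaded through the usual KerasHub `from_preset` workflow. Here is a minimal generation sketch for Gemma 3 270M; the preset name is an assumption based on KerasHub naming conventions, so check the preset list on keras.io before running it.

```python
import keras_hub

# Minimal sketch: load a Gemma 3 270M preset and generate text.
# The preset name below is an assumption based on KerasHub naming
# conventions; check keras.io/keras_hub for the registered names.
gemma_lm = keras_hub.models.Gemma3CausalLM.from_preset("gemma3_instruct_270m")
print(gemma_lm.generate("Why is the sky blue?", max_length=64))
```

Qwen3 follows the same pattern through its own causal LM task class and presets.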
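The new vision models can likewise serve as drop-in feature extractors. Below is a minimal sketch for DINOV2; the backbone class and preset names are assumptions based on KerasHub conventions.

```python
import numpy as np
import keras_hub

# Minimal sketch: extract self-supervised image features with DINOV2.
# The class and preset names are assumptions; consult the KerasHub API
# docs for the exact identifiers.
backbone = keras_hub.models.DINOV2Backbone.from_preset("dinov2_base")
images = np.random.uniform(size=(1, 224, 224, 3)).astype("float32")
features = backbone(images)  # patch-level representations, no labels used
print(features.shape)
```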
Improvements & Enhancements
This update also includes several key improvements to enhance the platform's stability, compatibility, and flexibility:
- Python 3.10 Minimum Support: Updated the minimum supported Python version to 3.10, ensuring compatibility with the latest libraries and features.
- Gemma Conversion (Keras to SafeTensors): Added a conversion script to convert Gemma models from Keras format to Hugging Face's SafeTensors format.
- Gemma3 Conversion Script: Added conversion script for Gemma3 models, streamlining their integration into the Hugging Face ecosystem.
- ViT Non-Square Image Support: The Vision Transformer (ViT) model now accepts non-square images as input, providing greater flexibility for various computer vision applications (see the sketch after this list).
- LLM Left Padding Method: Added left-padding support to our LLM padding utilities, offering more control and better compatibility for specific model architectures and inference requirements.
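To illustrate the ViT change, here is a minimal sketch that builds a ViT backbone at a non-square resolution. The preset name and the `image_shape` override are assumptions based on how KerasHub vision backbones are typically configured.

```python
import keras_hub

# Minimal sketch: instantiate a ViT backbone with a non-square input.
# The preset name and `image_shape` override are assumptions; see the
# KerasHub docs for the exact preset identifiers.
backbone = keras_hub.models.ViTBackbone.from_preset(
    "vit_base_patch16_224_imagenet",
    image_shape=(256, 384, 3),  # height != width now works for ViT
)
```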
What's Changed
Complete list of all the changes included in this release.
- register presets by @sachinprasadhs in #2268
- Fix batch preprocessing bug in Moonshine generation by @harshaljanjani in #2266
- fix get_lora_target_names function by @divyashreepathihalli in #2167
- implement of leftpadding by @pass-lin in #2242
- make vit compatible with non square images by @sineeli in #2255
- Bump up master version to 0.22.0.dev0 by @laxmareddyp in #2277
- Fix keras-io integration test by @laxmareddyp in #2280
- Add Qwen3 by @kanpuriyanawab in #2249
- Add DeiT Model by @Sohaib-Ahmed21 in #2203
- [HOTFIX] Add Docstring for QwenCausalLM by @kanpuriyanawab in #2279
- Fix: Correct coverage tracking for keras_hub by @sachinprasadhs in #2283
- Update the sharded version number for Llama3 variants by @laxmareddyp in #2294
- Support None for max_shard_size by @laxmareddyp in #2261
- Sharded weights type error by @laxmareddyp in #2296
- Fix PaliGemmaCausalLM example. by @hertschuh in #2302
- Routine HF sync by @divyashreepathihalli in #2303
- Incorrect condition on sliding_window_size by @laxmareddyp in #2289
- Bump the python group with 2 updates by @dependabot[bot] in #2282
- Modify TransformerEncoder masking documentation by @sonali-kumari1 in #2297
- Fix Gemma3InterleaveEmbeddings JAX inference error by ensuring indices are int32 by @pctablet505 in #2305
- Update preset versions for Mixtral,Qwen-MoE and Mistral models by @laxmareddyp in #2307
- Fix Mistral conversion script by @laxmareddyp in #2306
- Bump the python group with 6 updates by @dependabot[bot] in #2317
- Qwen3 causal lm by @kanpuriyanawab in #2311
- Fix JAX GPU tests by @sachinprasadhs in #2319
- support flash-attn at torch backend by @pass-lin in #2257
- Add HGNetV2 to KerasHub by @harshaljanjani in #2293
- Qwen3 presets register by @laxmareddyp in #2325
- Register HGNetV2 presets by @laxmareddyp in #2326
- Safetensors conversion by @Bond099 in #2290
- Add DINOV2. by @james77777778 in #2328
- Refactor CLIP and update SD3. by @james77777778 in #2316
- add DINOv2 preset details by @sachinprasadhs in #2336
- Fix dtype issues on JAX CPU in SD3 tests. by @james77777778 in #2338
- Revert "Fix dtype issues of JAX CPU in SD3. (#2338)" by @divyashreepathihalli in #2344
- Resolve preset comparison bug in glue load model method by @emmanuel-ferdman in #2345
- Removes unnecessary call to `torch.no_grad()` by @JyotinderSingh in #2353
- Add Esm by @pass-lin in #2244
- Fix float16 issue in SD3 when using JAX CPU. by @james77777778 in #2354
- update python to 3.10 and Keras minimum version to 3.8 by @sachinprasadhs in #2292
- register DeiT presets by @sachinprasadhs in #2348
- Fix path for presets to link it to API docs in keras.io by @sachinprasadhs in #2357
- Fix for llama3.1 instruct models by @pctablet505 in #2355
- Add & register ESM presets by @sachinprasadhs in #2356
- Add Gemma 3 conversion script by @abheesht17 in #2358
- Remove exact matching of outputs from Gemma 3 conversion notebook by @abheesht17 in #2359
New Contributors
- @Sohaib-Ahmed21 made their first contribution in #2203
- @sonali-kumari1 made their first contribution in #2297
- @Bond099 made their first contribution in #2290
- @emmanuel-ferdman made their first contribution in #2345
Full Changelog: v0.21.1...v0.22.0
For detailed documentation and usage examples, please refer to the updated guides at https://keras.io/keras_hub/