Releases: keras-team/keras-hub

v0.22.2

12 Sep 15:31
f4b648d

New Model: VaultGemma

VaultGemma is a 1-billion-parameter, 26-layer, text-only decoder model trained with sequence-level differential privacy (DP).
Derived from Gemma 2, its architecture notably drops the normalization layers after the attention and MLP blocks and uses full attention in every layer, rather than alternating with local sliding-window attention.
The pretrained model is available with a 1024-token sequence length.
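Below is a minimal usage sketch via the generic `CausalLM` task API; the preset handle is an assumption, so check https://keras.io/keras_hub/ for the published VaultGemma preset names.

```python
import keras_hub

# Load VaultGemma through the generic CausalLM task API.
# NOTE: "vault_gemma_1b_en" is an assumed preset handle, not confirmed.
vault_gemma = keras_hub.models.CausalLM.from_preset("vault_gemma_1b_en")

# The pretrained model uses a 1024-token sequence length, so keep
# prompt plus generation within that budget.
print(vault_gemma.generate("Differential privacy is", max_length=128))
```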

What's Changed

Full Changelog: v0.22.1...v0.22.2

v0.22.1

15 Aug 18:59
56ba520

What's Changed

Full Changelog: v0.22.0...v0.22.1

v0.22.0

14 Aug 18:01

Summary:

New Models:

We've integrated a range of cutting-edge models, each designed to tackle specific challenges in their respective domains:

  • Gemma 3 270M: Released the Gemma 3 270M-parameter model in base and instruction-tuned variants, an 18-layer, text-only model designed for hyper-efficient AI, particularly task-specific fine-tuning (see the fine-tuning sketch after this list).

  • Qwen3: A powerful, large-scale multilingual language model, excelling in various natural language processing tasks, from text generation to complex reasoning.

  • DeiT: Data-efficient Image Transformers (DeiT), specifically designed to train Vision Transformers effectively with less data, making high-performance visual models more accessible.

  • HGNetV2: An advanced version of the Hybrid-Grouped Network, known for its efficient architecture in computer vision tasks, particularly optimized for performance on diverse hardware.

  • DINOV2: A state-of-the-art Self-Supervised Vision Transformer, enabling the learning of robust visual representations without relying on explicit labels, ideal for foundation models.

  • ESM & ESM2: Evolutionary Scale Modeling (ESM & ESM2), powerful protein language models used for understanding protein sequences and structures, with ESM2 offering improved capabilities for bioinformatics research.
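
As a quick illustration of the Gemma 3 270M fine-tuning workflow mentioned above, here is a minimal sketch; the preset handle "gemma3_instruct_270m" is an assumption, so verify the exact name on keras.io before use.

```python
import keras_hub

# Load the 270M instruction-tuned Gemma 3 model.
# NOTE: the preset handle below is assumed, not confirmed.
gemma = keras_hub.models.Gemma3CausalLM.from_preset("gemma3_instruct_270m")

# Task-specific fine-tuning on raw strings; the bundled preprocessor
# handles tokenization, packing, and label shifting.
train_data = [
    "Q: What is Keras? A: A multi-backend deep learning API.",
    "Q: What is KerasHub? A: A library of pretrained model components.",
]
gemma.fit(x=train_data, batch_size=2, epochs=1)

# Generate with the fine-tuned model.
print(gemma.generate("Q: What is a tensor?", max_length=64))
```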

Improvements & Enhancements

This update also includes several key improvements to enhance the platform's stability, compatibility, and flexibility:

  • Python 3.10 Minimum Support: Updated the minimum supported Python version to 3.10, ensuring compatibility with the latest libraries and features.
  • Gemma Conversion (Keras to SafeTensors): Added a new conversion script to effortlessly convert Gemma models from Keras format to Hugging Face's SafeTensors format.
  • Gemma3 Conversion Script: Added conversion script for Gemma3 models, streamlining their integration into the Hugging Face ecosystem.
  • ViT Non-Square Image Support: Enhanced the Vision Transformer (ViT) model to accept non-square images as input, providing greater flexibility for various computer vision applications (see the sketch after this list).
  • LLM Left Padding Method: Added support for left padding in our LLM padding methods, offering more control and compatibility for specific model architectures and inference requirements.
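
To illustrate the non-square ViT input support noted above, here is a sketch that builds a `ViTBackbone` directly; the constructor values mirror a ViT-Base-style configuration but are chosen for illustration, so treat them as assumptions rather than a published preset configuration.

```python
import numpy as np
import keras_hub

# A ViT backbone with a non-square input (height 224, width 448).
# Argument names follow the KerasHub ViT backbone; values are illustrative.
backbone = keras_hub.models.ViTBackbone(
    image_shape=(224, 448, 3),  # non-square inputs are now accepted
    patch_size=16,
    num_layers=12,
    num_heads=12,
    hidden_dim=768,
    mlp_dim=3072,
)

images = np.random.uniform(0, 255, size=(2, 224, 448, 3)).astype("float32")
features = backbone(images)
print(features.shape)
```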

What's Changed

Complete list of all the changes included in this release.

New Contributors

Full Changelog: v0.21.1...v0.22.0

For detailed documentation and usage guides, please refer to https://keras.io/keras_hub/

v0.22.0.dev0

13 Aug 18:32
4e3435f
Pre-release

What's Changed

New Contributors


v0.21.1

03 Jun 23:28
c019e50

Summary:

  • Added comprehensive docstrings to QwenCausalLM, resolved integration test issues for Keras-IO, and added coverage tracking for KerasHub.

What's Changed

Full Changelog: v0.21.0...v0.21.1

v0.21.0

28 May 19:07
933efe6

Summary

  • New Models:

    • Xception: Added Xception architecture for image classification tasks.
    • Qwen: Added the Qwen2.5 family of large language models, with presets for base and instruction-tuned variants ranging from 0.5 to 72 billion parameters.
    • Qwen MoE: Added a transformer-based Mixture-of-Experts (MoE) decoder-only language model; the base variant activates 2.7B parameters at runtime.
    • Mixtral: Added Mixtral, a generative sparse Mixture-of-Experts LLM, with pretrained and instruction-tuned variants having 7 billion activated parameters.
    • Moonshine: Added Moonshine, a speech recognition task model.
    • CSPNet: Added Cross Stage Partial Network (CSPNet) classification task model.
    • Llama3: Added support for Llama 3.1 and 3.2.
  • Added sharded weight support to KerasPresetSaver and KerasPresetLoader, defaulting to a 10GB maximum shard size (see the sketch after this list).
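
A sketch of the sharded preset flow; the `max_shard_size` argument name follows the feature description but should be treated as an assumption, as should the Gemma preset handle.

```python
import keras_hub

# Load any backbone and save it as a sharded preset. Per the release
# notes the maximum shard size defaults to 10GB; here we force smaller
# shards for illustration (argument name assumed).
backbone = keras_hub.models.Backbone.from_preset("gemma2_2b_en")
backbone.save_to_preset("./gemma2_sharded_preset", max_shard_size=2)

# Reloading is transparent: the loader reassembles the shards.
restored = keras_hub.models.Backbone.from_preset("./gemma2_sharded_preset")
```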

What's Changed

New Contributors

Full Changelog: v0.20.0...v0.21.0

v0.20.0

03 Apr 23:48
d907fed

What's Changed

New Contributors

Full Changelog: v0.19.3...v0.20.0

v0.20.0.dev1

03 Apr 19:11
50807f2
Pre-release

What's Changed

Full Changelog: v0.20.0.dev0...v0.20.0.dev1

v0.20.0.dev0

03 Apr 17:58
23ac977
Pre-release

What's Changed

New Contributors

Full Changelog: v0.19.0.dev0...v0.20.0.dev0

v0.19.3

26 Mar 08:50
9604a38

What's Changed

Full Changelog: v0.19.2...v0.19.3