
Releases: kozistr/pytorch_optimizer

pytorch-optimizer v3.8.2

26 Oct 05:57
4b866c2


Change Log

Feature

  • Speed up zeropower_via_newtonschulz by up to 20% by using the torch.baddmm and torch.addmm ops. (#448)
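
For intuition, here is a minimal sketch of how the multiply-accumulate steps of a quintic Newton-Schulz iteration can be fused with torch.addmm; the function name, shapes, and coefficients below are illustrative, not the library's exact zeropower_via_newtonschulz implementation.

```python
import torch

def newton_schulz_step(x: torch.Tensor, a: float, b: float, c: float) -> torch.Tensor:
    """One quintic Newton-Schulz iteration with the scale-and-add steps fused via torch.addmm."""
    s = x @ x.T
    # t = b * s + c * (s @ s), fused into a single addmm call
    t = torch.addmm(s, s, s, beta=b, alpha=c)
    # result = a * x + t @ x, fused into a single addmm call
    return torch.addmm(x, t, x, beta=a, alpha=1.0)

# illustrative usage with the quintic coefficients commonly used for this iteration
x = torch.randn(32, 64)
x = x / (x.norm() + 1e-7)
for _ in range(5):
    x = newton_schulz_step(x, 3.4445, -4.7750, 2.0315)
```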

Update

  • Refactor the type hints. (#448)

Fix

  • Resolve a compatibility issue with older PyTorch versions where torch.optim.optimizer.ParamT could not be imported. (#448)
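
A minimal sketch of how such a version-guarded import can look, assuming the upstream symbol is PyTorch's ParamsT alias; the fallback definition below is an assumption, not the library's exact shim.

```python
from typing import Any, Dict, Iterable, Union

import torch

try:
    # available in newer PyTorch releases
    from torch.optim.optimizer import ParamsT
except ImportError:
    # fallback for older PyTorch versions; this alias is illustrative only
    ParamsT = Union[Iterable[torch.Tensor], Iterable[Dict[str, Any]]]
```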

Docs

  • Convert the docstrings from reST style to Google style. (#449)

pytorch-optimizer v3.8.1

18 Oct 10:44
657092f


Change Log

Update

  • Accept the GaloreProjector parameters in the Conda optimizer's init parameters. (#443, #444)
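
A hypothetical construction for reference; the projector-related argument names below (rank, update_proj_gap, scale, projection_type) are borrowed from the GaloreProjector and may not match the Conda signature exactly, so check the documentation.

```python
import torch
from pytorch_optimizer import Conda

model = torch.nn.Linear(128, 128)
# projector-related arguments are assumptions taken from GaloreProjector
optimizer = Conda(
    model.parameters(),
    lr=1e-3,
    rank=64,
    update_proj_gap=200,
    scale=0.25,
    projection_type='std',
)
```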

Bug

  • Fix a NaN issue in the StableSPAM optimizer when the gradient norm is zero. (#431)

Docs

  • Update the documentation page. (#428)

Contributions

Thanks to @liveck, @AhmedMostafa16

pytorch-optimizer v3.8.0

13 Aug 14:06
483a816


Change Log

Update

  • Re-implement Muon and AdaMuon optimizers based on the recent official implementation. (#408, #410)
    • Their definitions have changed from the previous version, so please check out the documentation!
  • Add the missing optimizers to __init__.py. (#415)
  • Add the HuggingFace Trainer example. (#415)
  • Optimize the visualization outputs and change the visualization document to a table layout. (#416)

Dependency

  • Update mkdocs dependencies. (#417)

CI

  • Add GitHub Actions workflows to automate several processes. (#411, #412, #413)

Contributions

Thanks to @AidinHamedi

pytorch-optimizer v3.7.0

28 Jul 14:33
ecdf6b6


Change Log

CI

  • Enable CI for Python 3.8 through 3.13. (#402, #404)

Fix

  • Use a fixed eps value of 1e-15 when adding eps to exp_avg_sq. (#397, #398)
  • Fix the built-in type hints in the Kron optimizer. (#404)

Contributions

Thanks to @sobolevn

pytorch-optimizer v3.6.1

05 Jul 11:42
77098e9


Change Log

Update

  • Change the default range of the beta parameter from [0, 1] to [0, 1). (#392)

Fix

  • Use the momentum buffer instead of the gradient when calculating the LMO. (#385)

pytorch-optimizer v3.6.0

17 May 10:36
9753eda


Change Log

Update

  • Support tensors with more than two dimensions in the RACS and Alice optimizers. (#380)
  • Remove the auxiliary variants from the optimizers' default parameters and rename the related state and parameter names. (#380)
    • use_gc, adanorm, cautious, stable_adamw, and adam_debias are affected.
    • You can still use these variants by passing the parameters via **kwargs.
    • Notably, for the adanorm variant you now need to pass the adanorm parameter (and adanorm_r for the r option), and the state name changes from exp_avg_norm to exp_avg_adanorm (see the sketch after this list).
  • Refactor the reset() method into init_group() in the BaseOptimizer class. (#380)
  • Refactor SAM optimizer family. (#380)
  • Gather the AdamP and SGDP implementations into pytorch_optimizer.optimizer.adamp.*. (#381)
    • pytorch_optimizer.optimizer.sgdp.SGDP to pytorch_optimizer.optimizer.adamp.SGDP
    • pytorch_optimizer.optimizer.util.projection to pytorch_optimizer.optimizer.adamp.projection
    • pytorch_optimizer.optimizer.util.cosine_similarity_by_view to pytorch_optimizer.optimizer.adamp.cosine_similarity_by_view
  • Remove channel_view() and layer_view() from pytorch_optimizer.optimizer.util. (#381)
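
A minimal sketch of the new usage; AdamP is used here only as an example of an optimizer that accepts the variant via keyword arguments, and the adanorm_r value is illustrative.

```python
import torch
from pytorch_optimizer import AdamP

model = torch.nn.Linear(16, 16)
# the variant is no longer a default parameter; pass it explicitly via keyword arguments
optimizer = AdamP(model.parameters(), lr=1e-3, adanorm=True, adanorm_r=0.95)
```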

Fix

  • Fix shape mismatch issues in the Galore projection for reverse_std, right, and full projection types. (#376)

pytorch-optimizer v3.5.1

26 Apr 17:01
84b926c


Change Log

Feature

  • Implement ScionLight optimizer. (#369)

Update

  • Update SCION optimizer based on the official implementation. (#369)

Fix

  • Correct the learning rate ratio in the Muon optimizer. (#371, #372, #373)

pytorch-optimizer v3.5.0

16 Mar 07:03
6397d56


Change Log

Update

  • Update Muon optimizer. (#355, #356)
    • support decoupled weight decay.
    • adjust the default hyperparameters to match the original implementation.
    • support the adjusted lr from Moonlight; enable it by setting use_adjusted_lr=True (see the sketch after this list).
  • Tune the coupled Newton iteration method for a 5% performance increase. (#360)
  • Update SCION optimizer. (#361)
    • add scale parameter.
    • update get_lmo_direction.
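
A minimal sketch of enabling the adjusted learning rate; the lr value is illustrative, and the exact Muon constructor signature may differ between versions, so check the documentation.

```python
import torch
from pytorch_optimizer import Muon

model = torch.nn.Linear(128, 128, bias=False)  # Muon targets 2D weight matrices
# enable the Moonlight-style adjusted learning rate; other arguments are left at their defaults
optimizer = Muon(model.parameters(), lr=2e-2, use_adjusted_lr=True)
```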

Fix

  • Fix bias_correction2 in the ScheduleFreeRAdam optimizer. (#354)
  • Fix a potential bug in the SPAM optimizer. (#365)
  • Initialize the z state within step() of the ScheduleFreeWrapper. (#363, #366)

pytorch-optimizer v3.4.2

22 Feb 06:08
c09d18b


Change Log

Update

  • Update the ScheduleFree optimizers (ScheduleFreeSGD, ScheduleFreeAdamW, and ScheduleFreeRAdam) to the latest implementation. (#351, #353)
  • Remove the use_palm variant from the ScheduleFree optimizers due to instability. (#353)
  • Update the Ranger25 optimizer. (#353)

Fix

  • Remove the weight_decouple parameter from the ScheduleFree optimizers. (#351, #353)

Docs

  • Fix AliG optimizer visualization. (#350)

Contributions

Thanks to @AidinHamedi, @hatonosuke

pytorch-optimizer v3.4.1

14 Feb 11:57
00fbae0


Change Log

Update

  • Support alternative precision training for Shampoo optimizer. (#339)
  • Add more features to and tune Ranger25 optimizer. (#340)
    • AGC + Lookahead variants
    • change the default beta1 and beta2 to 0.95 and 0.98, respectively
  • Skip adding the Lookahead wrapper for Ranger* optimizers in create_optimizer(), since they already include it. (#340)
  • Improve the optimizer visualization. (#345)
  • Rename pytorch_optimizer.optimizer.gc to pytorch_optimizer.optimizer.gradient_centralization to avoid a possible conflict with the Python built-in gc module. (#349)
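
For reference, imports should now use the new module path; this sketch assumes only the module name changed.

```python
# old path (removed): pytorch_optimizer.optimizer.gc
# new path after the rename:
from pytorch_optimizer.optimizer import gradient_centralization
```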

Bug

  • Update exp_avg_sq after calculating the denominator in the ADOPT optimizer. (#346, #347)
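
An illustrative sketch of the corrected ordering, not the library's exact ADOPT code: the denominator is computed from the previous second moment, and exp_avg_sq is updated only afterwards.

```python
import torch

def adopt_like_step(p, grad, exp_avg, exp_avg_sq, lr=1e-3, betas=(0.9, 0.9999), eps=1e-6):
    # normalize with the *previous* second moment ...
    denom = exp_avg_sq.sqrt().clamp_(min=eps)
    exp_avg.mul_(betas[0]).add_(grad / denom, alpha=1.0 - betas[0])
    p.add_(exp_avg, alpha=-lr)
    # ... and only then update exp_avg_sq (the fix: update after computing the denominator)
    exp_avg_sq.mul_(betas[1]).addcmul_(grad, grad, value=1.0 - betas[1])
```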

Docs

  • Update the visualizations. (#340)

Contributions

Thanks to @AidinHamedi