Conversation

@quic-mamta quic-mamta commented Aug 19, 2025

Update Transformers to 4.55.0
Update PyTorch to 2.7.0+cpu
Update Torchvision to 0.22.0+cpu
Update the Python requirement to >=3.9

Updated modeling files and Cache Utils for transformers 4.55.0

Updated models:

  1. codegen
  2. falcon
  3. gemma
  4. gemma2
  5. gptj
  6. gpt2
  7. granite
  8. granite_moe
  9. grok1
  10. llama
  11. llama_swiftkv
  12. mistral
  13. mixtral_moe
  14. mpt
  15. phi
  16. phi3
  17. qwen2
  18. starcoder2
  19. gpt_bigcode
  20. internvl
  21. llava
  22. llava_next
  23. whisper
  24. gemma3
  25. llama4
  26. mllama
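
As a quick sanity check for the version bumps described above, a minimal sketch along these lines (a plain Python script, assuming a standard environment; the pins come straight from this PR description) can confirm that the installed packages match the new requirements:

import sys

import torch
import torchvision
import transformers

# Expected pins from this PR: Transformers 4.55.0, PyTorch 2.7.0+cpu,
# Torchvision 0.22.0+cpu, and Python >= 3.9.
assert sys.version_info >= (3, 9), f"Python >= 3.9 required, found {sys.version}"
assert transformers.__version__ == "4.55.0", transformers.__version__
assert torch.__version__.startswith("2.7.0"), torch.__version__
assert torchvision.__version__.startswith("0.22.0"), torchvision.__version__

print("Environment matches the upgraded requirements.")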

@quic-mamta quic-mamta changed the title Tf version 4.55 upgrade Transformers version 4.55 upgrade Aug 19, 2025
@quic-mamta quic-mamta marked this pull request as draft August 19, 2025 19:42
@asmigosw asmigosw force-pushed the TF_version_4.55_upgrade branch from d36c124 to a514d36 Compare September 2, 2025 08:29
@quic-mamta quic-mamta force-pushed the TF_version_4.55_upgrade branch 2 times, most recently from e15d548 to 3643fee Compare September 23, 2025 08:45
@quic-mamta quic-mamta marked this pull request as ready for review September 24, 2025 05:27
@quic-mamta quic-mamta requested a review from vbaddi September 24, 2025 05:27
@quic-mamta quic-mamta marked this pull request as draft September 24, 2025 19:31
@quic-mamta quic-mamta force-pushed the TF_version_4.55_upgrade branch from 69ec2a4 to 6ad267b Compare September 25, 2025 07:46
@quic-mamta quic-mamta changed the title Transformers version 4.55 upgrade Transformers version 4.55 upgrade, Update PyTorch to 2.7.0+cpu, Torchvision to 0.22.0+cpu, and Python Requirement to >=3.9 Sep 25, 2025
@quic-mamta quic-mamta force-pushed the TF_version_4.55_upgrade branch 3 times, most recently from dd8b38e to 940dfcf Compare September 26, 2025 11:44
@quic-mamta quic-mamta marked this pull request as ready for review September 26, 2025 11:44
@quic-mamta quic-mamta force-pushed the TF_version_4.55_upgrade branch from 940dfcf to 4f44dd4 Compare September 26, 2025 19:03
Signed-off-by: Mamta Singh <mamtsing@qti.qualcomm.com>
@quic-mamta quic-mamta force-pushed the TF_version_4.55_upgrade branch from 4f44dd4 to d8cf0a1 Compare September 28, 2025 13:10
# Apply the attention mask
attn_weights = torch.where(attention_mask, mask_value, attn_weights)

attn_weights = attn_weights / self.scale_attn

Why has it been moved from line 51?

It was made equivalent to the new TF code; they have moved it down, since its placement (whether at line 50 or 58) won't affect performance. Should I move it back to line 50?
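
For context, here is a small runnable sketch of the pattern in question. The names attention_mask, mask_value, and scale_attn mirror the snippet above, while the shapes and values are purely illustrative; in this convention the mask is True at positions to suppress:

import torch

# Illustrative shapes: (batch, heads, query_len, key_len)
attn_weights = torch.randn(1, 2, 4, 4)
# Boolean mask: True marks positions to suppress (e.g. future tokens in a causal mask).
attention_mask = torch.triu(torch.ones(4, 4, dtype=torch.bool), diagonal=1)
# Large negative fill so masked logits vanish after softmax.
mask_value = torch.full([], torch.finfo(attn_weights.dtype).min)
scale_attn = torch.full([], 8.0)  # stand-in for self.scale_attn

# Same ordering as the modeling code: apply the mask, then scale.
attn_weights = torch.where(attention_mask, mask_value, attn_weights)
attn_weights = attn_weights / scale_attn

probs = torch.softmax(attn_weights, dim=-1)
print(probs[0, 0])  # masked (upper-triangular) entries get ~zero probability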


EXTERNAL_MODELS = {
"hpcai-tech/grok-1",
"hpcai-tech/grok-1": {

nit: Do we need this?
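
For reference, the diff above appears to change EXTERNAL_MODELS from a plain set of model IDs to a dict keyed by model ID. A hypothetical sketch of the two shapes follows; the per-model value below is an illustrative assumption, not taken from this PR:

# Before: a set of model IDs treated as external.
EXTERNAL_MODELS = {
    "hpcai-tech/grok-1",
}

# After (shape implied by the diff): a dict keyed by model ID so each entry can
# carry extra metadata. The value contents here are purely illustrative.
EXTERNAL_MODELS = {
    "hpcai-tech/grok-1": {
        "revision": "main",  # hypothetical field, not from the PR
    },
}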

Signed-off-by: Asmita Goswami <asmigosw@qti.qualcomm.com>
@quic-mamta quic-mamta force-pushed the TF_version_4.55_upgrade branch 2 times, most recently from f4ade46 to 8217cb5 Compare October 8, 2025 10:48
Signed-off-by: Mamta Singh <mamtsing@qti.qualcomm.com>
@quic-mamta quic-mamta force-pushed the TF_version_4.55_upgrade branch from 8217cb5 to 7bf2298 Compare October 8, 2025 10:49