
Conversation

@GAD-cell commented Sep 2, 2025

Small patch to support LFM2 with vLLM.
Since LFM2 doesn’t support prefix caching with vLLM, I had to add enable_prefix_caching to both VLLMModelConfig and VLLMModel._create_auto_model to make it work.
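
For context, a minimal sketch of what such a change could look like, assuming a dataclass-style VLLMModelConfig and a _create_auto_model that forwards engine arguments to vLLM's LLM constructor (the real lighteval classes differ in detail):

```python
# Hypothetical sketch only; everything beyond enable_prefix_caching is a
# simplified stand-in for lighteval's actual VLLMModelConfig / VLLMModel.
from dataclasses import dataclass

from vllm import LLM


@dataclass
class VLLMModelConfig:
    model_name: str
    # Exposed so models such as LFM2, which do not support prefix caching
    # in vLLM, can turn it off explicitly.
    enable_prefix_caching: bool = True


class VLLMModel:
    def _create_auto_model(self, config: VLLMModelConfig) -> LLM:
        # Forward the flag to vLLM's engine arguments.
        return LLM(
            model=config.model_name,
            enable_prefix_caching=config.enable_prefix_caching,
        )
```

With that in place, an LFM2 checkpoint could be loaded with something like `VLLMModelConfig(model_name="LiquidAI/LFM2-1.2B", enable_prefix_caching=False)` (model id given only as an example).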

@HuggingFaceDocBuilderDev (Collaborator)

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@GAD-cell (Author) commented Sep 8, 2025

@NathanHB Seems like a CUDA compilation error, do you have any clue why?

@NathanHB (Member) commented Sep 9, 2025

Hmm, not sure why. I launched it again, but if it does not work, can you try setting the default value of enable_prefix_caching to None?

@GAD-cell (Author)

> Hmm, not sure why. I launched it again, but if it does not work, can you try setting the default value of enable_prefix_caching to None?

OK, I've changed the default value to None.
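
For reference, the suggested default change would look roughly like this (same sketch as above, only the field's default differs):

```python
from dataclasses import dataclass


@dataclass
class VLLMModelConfig:
    model_name: str
    # None defers to vLLM's own default; pass False explicitly for LFM2.
    enable_prefix_caching: bool | None = None
```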
