-
Notifications
You must be signed in to change notification settings - Fork 30.3k
Description
Feature request
Currently tokenizer.apply_chat_template
can be used to convert a chat in the form list of dicts into a formatted string for text generation. However, there is currently no convenient method for doing the reverse. A new function (hypothetically called tokenizer.parse_chat_template
) could handle this.
See also:
https://stackoverflow.com/questions/79248499/how-to-reverse-the-tokenizer-apply-chat-template-method-and-handle-streaming-r
https://stackoverflow.com/questions/79248486/how-to-reverse-the-tokenizer-apply-chat-template
Motivation
Logically if there is a function for converting a chat into a string, there should be a function that does the reverse, which would make things easier when working with AutoModelForCausalLM
with chat models. This feature is already implemented in the pipeline
API, so it should be straightforward to create a version that's exposed to end users.
Your contribution
N/A