New Modular Classifier #270

laerdon · 2025-02-01T18:38:25Z

ConvoKit currently does not let users plug in their own models, which matters most when those models have been fine-tuned on the datasets they use with ConvoKit. The classifiers behind features such as politeness analysis and hypergraph representation are built on scikit-learn models, which are generally older and less robust than those available through the HuggingFace Transformers library. We aim to update ConvoKit to support a more modular design that gives users a broader selection of models while keeping the ease of navigating conversational corpora that ConvoKit provides. As of now, the Classifier class contains all of the functionality, including methods like fit() and transform(). We aim to delegate that functionality to a ClassifierModel abstract class, which will be the type of the internal classification model, classifier_model.
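The delegation described above could look roughly like the following. This is a minimal sketch and not the code in this PR: only the names Classifier, ClassifierModel, classifier_model, fit() and transform() come from the description, while the method signatures and the toy MajorityLabelModel are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import Any, List, Sequence


class ClassifierModel(ABC):
    """Interface a pluggable model would implement (signatures are assumed)."""

    @abstractmethod
    def fit(self, inputs: Sequence[Any], labels: Sequence[int]) -> None:
        """Train the model on inputs and their labels."""

    @abstractmethod
    def transform(self, inputs: Sequence[Any]) -> List[int]:
        """Return one predicted label per input."""


class MajorityLabelModel(ClassifierModel):
    """Hypothetical toy model: always predicts the most common training label."""

    def __init__(self) -> None:
        self.majority_label = None

    def fit(self, inputs: Sequence[Any], labels: Sequence[int]) -> None:
        # Remember the label that occurs most often in the training data.
        self.majority_label = Counter(labels).most_common(1)[0][0]

    def transform(self, inputs: Sequence[Any]) -> List[int]:
        if self.majority_label is None:
            raise RuntimeError("fit() must be called before transform()")
        return [self.majority_label for _ in inputs]


class Classifier:
    """Thin wrapper that forwards fit/transform to its classifier_model."""

    def __init__(self, classifier_model: ClassifierModel) -> None:
        self.classifier_model = classifier_model

    def fit(self, inputs: Sequence[Any], labels: Sequence[int]) -> "Classifier":
        self.classifier_model.fit(inputs, labels)
        return self

    def transform(self, inputs: Sequence[Any]) -> List[int]:
        return self.classifier_model.transform(inputs)
```

With this shape, swapping in a different model (for example, a wrapper around a fine-tuned Transformers model) would only require another ClassifierModel subclass; the Classifier wrapper itself stays unchanged.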

Tested on a local machine: fit and transform run successfully. More testing may be needed in a GPU-enabled environment.
An example is provided in convokit/examples/classifier/modular-classifier-example.ipynb.
This change deprecates pred_feats, an attribute of Classifier; users are now expected to produce their own torch Dataset containing this information. It also deprecates the evaluate_with_cv and evaluate_with_train_test_split methods.
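To illustrate what moving the pred_feats information into a user-supplied dataset might involve, here is a dependency-free sketch. It implements only the `__len__`/`__getitem__` map-style protocol; a real version would presumably subclass `torch.utils.data.Dataset` and return tensors. The class name UtteranceFeatureDataset, the record layout, and the feature names are all hypothetical and do not come from this PR.

```python
from typing import Dict, List, Sequence, Tuple


class UtteranceFeatureDataset:
    """Hypothetical map-style dataset holding the features pred_feats used to name."""

    def __init__(
        self,
        records: Sequence[Dict[str, float]],
        feature_names: Sequence[str],
        label_name: str,
    ) -> None:
        self.feature_names = list(feature_names)
        # Each row pairs a feature vector with its integer label.
        self.rows: List[Tuple[List[float], int]] = [
            ([float(r[name]) for name in self.feature_names], int(r[label_name]))
            for r in records
        ]

    def __len__(self) -> int:
        return len(self.rows)

    def __getitem__(self, idx: int) -> Tuple[List[float], int]:
        return self.rows[idx]


# Made-up records standing in for per-utterance metadata.
records = [
    {"num_words": 5, "has_please": 1, "is_polite": 1},
    {"num_words": 12, "has_please": 0, "is_polite": 0},
]
dataset = UtteranceFeatureDataset(records, ["num_words", "has_please"], "is_polite")
```

The point of the sketch is that feature selection now happens when the dataset is built, not through an attribute on Classifier.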

laerdon and others added 9 commits January 1, 2025 22:04

Updated classifier and classifierModel

ea044c3

tentative done 1/x

bea8693

tentative done 2/x

beccf59

fixed formatting with black

6a88757

fixed formatting (2) with black

ca974a3

Merge branch 'master' into huggingface

18787ea

updated with notebook edits

fd6fa11

cleanup

0961b79

cleanup

05d9b94
