@patrickfleith patrickfleith commented Sep 27, 2025

Fix Question Rewriting Pipeline Issues

Relates to #178

1. Type Error in inference calls

  • Fixed "sequence item 0: expected str instance, list found" error
  • As I understand it, STAGE_TAG is already a list (["question_rewriting"]), so wrapping it in another list created an invalid nested structure ([["question_rewriting"]]).
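
A minimal sketch of how this error arises, assuming the tags end up being joined into a single string somewhere in the inference call. The `join_tags` helper is hypothetical and only stands in for that step; it is not the project's actual code.

```python
# Assumed value, taken from the description above.
STAGE_TAG = ["question_rewriting"]


def join_tags(tags):
    """Hypothetical stand-in for wherever the tags are joined into one string."""
    return ",".join(tags)


# Passing the list directly works.
ok = join_tags(STAGE_TAG)

# Wrapping it again nests the list, and str.join rejects the inner list.
try:
    join_tags([STAGE_TAG])
    error = None
except TypeError as exc:
    error = str(exc)

print(ok)
print(error)
```

The second call fails because `str.join` requires every item to be a string, and the only item here is itself a list.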

After the fix I ran the question_rewriting pipeline and found other issues.

2. Missing required field validation

  • Fixed QuestionRow validation errors for existing datasets
  • question_mode is not included in the to_dict() output (we don't save it as part of the dataset), so building a QuestionRow from an existing dataset in the question_rewriting pipeline triggers a validation error. The fix is to give the field a default value so that validation passes. question_mode is not saved in the new _questions_rewritten subset anyway.
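
A rough illustration of the idea, using a plain dataclass as a simplified stand-in for QuestionRow. The class name `QuestionRowSketch`, the fields `question` and `self_answer`, and the default value `"open-ended"` are all assumptions made for the example; the real class and the default chosen in this PR may differ.

```python
from dataclasses import asdict, dataclass


@dataclass
class QuestionRowSketch:
    """Simplified, hypothetical stand-in for QuestionRow."""

    question: str
    self_answer: str
    # Assumed default; without one, rebuilding from a saved row would fail.
    question_mode: str = "open-ended"

    def to_dict(self) -> dict:
        # question_mode is deliberately left out, mirroring the description above.
        data = asdict(self)
        data.pop("question_mode")
        return data


# Save a row, then rebuild it from the saved dict as the rewriting pipeline would.
saved = QuestionRowSketch("What is X?", "Y", question_mode="multi-choice").to_dict()
restored = QuestionRowSketch(**saved)

print(saved)
print(restored.question_mode)
```

Note the trade-off: the restored row carries the default rather than the original mode, which is acceptable here only because the mode is not written to the rewritten subset.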

3. Missing subset error

  • Added graceful handling when multi_hop_questions subset doesn't exist
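
One way such a fallback could look, as a sketch only. The helper `load_subset_or_empty`, the in-memory `fake_loader`, and the `single_shot_questions` subset name are hypothetical; the actual loader and the exceptions it raises in the project may be different.

```python
import logging

logger = logging.getLogger(__name__)


def load_subset_or_empty(load_fn, subset):
    """Return the rows of `subset`, or an empty list if it cannot be loaded."""
    try:
        return load_fn(subset)
    except (KeyError, ValueError, FileNotFoundError) as exc:
        # Log and continue instead of aborting the whole pipeline.
        logger.warning("Subset %r not found, skipping: %s", subset, exc)
        return []


# Hypothetical in-memory dataset that has no multi_hop_questions subset.
_FAKE_DATASET = {"single_shot_questions": [{"question": "What is X?"}]}


def fake_loader(subset):
    return _FAKE_DATASET[subset]


single = load_subset_or_empty(fake_loader, "single_shot_questions")
multi = load_subset_or_empty(fake_loader, "multi_hop_questions")

print(len(single), len(multi))
```

Returning an empty list lets the rest of the stage run on whatever subsets do exist.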

Result
The question_rewriting pipeline should now work again.

Please test and let me know if I missed something or if I should follow a different approach to fix this.

@sumukshashidhar (Collaborator)

Hi @patrickfleith, could you reformat with ruff so it passes CodeQL? Thanks!

@sumukshashidhar sumukshashidhar self-requested a review October 2, 2025 08:20