The UniformSynthesizer produces multiple UserWarning messages when run on a demo dataset
#452
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage diff, main vs. #452:
- Coverage: 71.90% to 71.94% (+0.03%)
- Files: 27 to 27
- Lines: 2168 to 2171 (+3)
- Hits: 1559 to 1562 (+3)
- Misses: 609 to 609
    warnings.filterwarnings(
        'ignore',
        message='.*is incompatible with transformer.*',
        category=UserWarning,
    )
    hyper_transformer.update_sdtypes(config)
What is the actual cause of the warnings? Could we change the logic to prevent them instead of filtering them out?
Hi @andrew,
The issue occurs because we first call `hyper_transformer.detect_initial_config(real_data)` and then update the configuration to match the metadata sdtypes. Sometimes, `detect_initial_config` assigns transformers that aren't compatible with the updated sdtypes.
For example, a datetime column stored as an object might be detected as categorical and assigned the `UniformEncoder`. Later, when we update the sdtype to datetime, the configuration becomes invalid because `UniformEncoder` doesn't support datetime sdtypes, which triggers the warning.
Instead of filtering out the warning, I decided to update the transformer assignment whenever the sdtype changes. This preserves the previous behavior while avoiding the warning.
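The mismatch described above can be reproduced in miniature without RDT. This is only a sketch under stated assumptions: `TOY_SUPPORTED_SDTYPES`, `toy_detect_config`, and `toy_update_sdtypes` are hypothetical stand-ins written for illustration, not RDT's implementation, and the compatibility table is made up to mirror the example in this comment.

```python
import warnings

# Hypothetical compatibility table for illustration only (not RDT's data).
TOY_SUPPORTED_SDTYPES = {
    'UniformEncoder': {'categorical'},
    'DatetimeTransformer': {'datetime'},
}


def toy_detect_config(dtypes):
    """Guess an sdtype and a transformer from the storage dtype alone."""
    config = {'sdtypes': {}, 'transformers': {}}
    for column, dtype in dtypes.items():
        if dtype == 'object':
            # A datetime stored as strings looks categorical at this point.
            config['sdtypes'][column] = 'categorical'
            config['transformers'][column] = 'UniformEncoder'
        else:
            config['sdtypes'][column] = 'datetime'
            config['transformers'][column] = 'DatetimeTransformer'
    return config


def toy_update_sdtypes(config, new_sdtypes):
    """Overwrite sdtypes but keep the transformers, warning on a mismatch."""
    for column, sdtype in new_sdtypes.items():
        config['sdtypes'][column] = sdtype
        transformer = config['transformers'][column]
        if sdtype not in TOY_SUPPORTED_SDTYPES[transformer]:
            warnings.warn(
                f"sdtype '{sdtype}' for column '{column}' is incompatible "
                f"with transformer '{transformer}'.",
                UserWarning,
            )


# Detection sees an object column; the metadata later says it is a datetime.
config = toy_detect_config({'signup_date': 'object'})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    toy_update_sdtypes(config, {'signup_date': 'datetime'})

messages = [str(w.message) for w in caught]
```

In this toy version the transformer chosen during detection is left in place after the sdtype changes, which is the state that produces the warning.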
Now that I understand the warning, I think it's better to just go back to silencing it 😅. I'd rather use the methods RDT exposes in its docs than change `field_transformers` directly.
This is annoying, but can you change it back to how you had it before?
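If the filter is restored, one option worth considering is scoping it with `warnings.catch_warnings()` so that it applies only around the call that emits the message and does not change the process-wide filter list. A minimal sketch using only the standard library; `noisy_update` and `run_silenced` are hypothetical helpers, and `noisy_update` merely imitates the warning text rather than calling RDT.

```python
import warnings


def noisy_update():
    """Hypothetical stand-in for a call that emits the incompatibility warning."""
    warnings.warn(
        "sdtype 'datetime' is incompatible with transformer 'UniformEncoder'.",
        UserWarning,
    )
    # An unrelated warning that should still reach the caller.
    warnings.warn('some unrelated warning', UserWarning)


def run_silenced(func):
    """Call func with the incompatibility warning ignored only for this call."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            'ignore',
            message='.*is incompatible with transformer.*',
            category=UserWarning,
        )
        return func()


# Record whatever escapes the scoped filter.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    run_silenced(noisy_update)

remaining = [str(w.message) for w in caught]
```

The filters are restored when the `with` block in `run_silenced` exits, so warnings raised elsewhere with the same text are still shown.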
4a65969 to 44eb234
44eb234 to 3c033e8
Resolve #449
CU-86b72ragn