kozzztik

ClickHouse is widely used to work with large amounts of data, and the best way to handle very large lists is streaming. However, SQLAlchemy itself does not support streaming for inserts (only for select queries), whereas clickhouse_driver accepts generators as the data source for insert queries.

Here is a small workaround that adds support for generators to clickhouse-sqlalchemy in a native way. I use it in my project to stream big batches of data (1M rows per query). I am sure it can be helpful for other ClickHouse users.
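For context, the driver-level behavior this relies on can be sketched roughly as follows. This is only an illustration, not the code in this PR: the `events` table, its columns, and the `generate_rows` / `stream_insert` helpers are hypothetical names made up for the example. The only assumption about the driver is that `Client.execute` accepts an iterable of rows as the second argument of an `INSERT ... VALUES` query.

```python
from typing import Iterator, Tuple


def generate_rows(n: int) -> Iterator[Tuple[int, str]]:
    """Yield rows one at a time so the whole batch never sits in memory."""
    for i in range(n):
        yield (i, f"event-{i}")


def stream_insert(client, n_rows: int):
    """Pass a generator (not a list) as the data source of an insert.

    `client` is expected to behave like clickhouse_driver.Client;
    the table and column names are hypothetical.
    """
    return client.execute(
        "INSERT INTO events (id, name) VALUES",
        generate_rows(n_rows),
    )


# Usage against a real server (assumes a reachable ClickHouse instance
# and an existing `events` table):
#
#     from clickhouse_driver import Client
#     client = Client("localhost")
#     stream_insert(client, 1_000_000)
```

Because the rows are produced lazily, memory use should stay roughly flat regardless of how many rows are sent in one query.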

Checklist:

  • Add tests that demonstrate the correct behavior of the change. Tests should fail without the change.
  • Add or update relevant docs, in the docs folder and in code.
  • Ensure PR doesn't contain untouched code reformatting: spaces, etc.
  • Run flake8 and fix issues.
  • Run pytest and make sure no tests fail. See https://clickhouse-sqlalchemy.readthedocs.io/en/latest/development.html.
