Skip to content

Impossible to stream query results from the DB #760

@SIEDA-EliasHeydrich

Description

@SIEDA-EliasHeydrich

Hi there 😺

we here at SIEDA have been using SqlKata in our main product for about 4 years now and are pretty happy with the library, as it nicely abstracts away the specific DBMS (and we have to support a number of them in the same product).

Recently though, we ran into what we believe is a design flaw (or we fail to find the correct method for the usecase):
It seems to be impossible to stream query results from the DB, the application is forced to hold the entire query result in memory.

Let's look at this simple case:

private readonly IDbTransaction DbTransaction;
private readonly QueryFactory Factory;

// goal: select something from a DB and stream the results as an IEnumerable
public IEnumerable<T> ExecSelectAndReturnIEnumerable<T>( Query sqlQueryObj ) => Factory.Get<T>( sqlQueryObj, DbTransaction )

It looked to us like SqlKata supports this usecase. Which makes sense, because the underlying Dapper-Library supports it.
However, the memory footprint indicated otherwise and so we investigated and found this:

public IEnumerable<T> Get<T>(Query query, IDbTransaction transaction = null, int? timeout = null)
{
   SqlResult sqlResult = CompileAndLog(query);
   List<T> result = Connection.Query<T>(sqlResult.Sql, sqlResult.NamedBindings, transaction, buffered: true, timeout ?? QueryTimeout).ToList();
   return handleIncludes(query, result).ToList();
}

Dapper's buffered is set, however the fact that a List<T> is used to store the result defeats the purpose (and also the IEnumerable<T> as a return type, at that point it would be better to just return real List<T> so the programmer knows what is what).

We updated our SqlKata version (which also did come with a new Dapper that changed the API on their side), but the new code suffers similar problems, again storing things in Lists:

public IEnumerable<T> Get<T>(Query query, IDbTransaction transaction = null, int? timeout = null)
{
    var compiled = CompileAndLog(query);

    var result = this.Connection.Query<T>(
        compiled.Sql,
        compiled.NamedBindings,
        transaction: transaction,
        commandTimeout: timeout ?? this.QueryTimeout
    ).ToList();

    result = handleIncludes<T>(query, result).ToList();

    return result;
}

So, our question / problem report:
It is impossible to stream query results from the DB, since SqlKata always stores the queried data in a list and thus fully in memory.

This is a problem when e.g. selecting large sections (or even everything) from a table with the intent of then streaming the resulting data somewhere else.

What we would expect:
An option to receive an "actually" enumerable result, aka something that can be iterated over one at a time in a streaming manner with each query result row being processed as soon as it is received from the DB.
Note that we suspect that such a result object will require to be explicitly disposed for technical reasons and assume this is half the reason why lists feature so prominently in SqlKata.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions