Conversation

tnull
Collaborator

@tnull tnull commented Sep 12, 2025

As an intermediary step towards making our IO fully async, we now require any store to implement both KVStore and KVStoreSync, which allows us to switch over to the fully-async background processor and take further migration steps bit-by-bit when we make more and more of the core codebase async.

To this end, we refactor VssStore and SqliteStore to implement KVStore.

TODOs:

  • Implement KVStore for TestStore upstream to fix tests
  • Implement write-order tracking for the VSS KVStore implementation

.. draft until then.

@ldk-reviews-bot

ldk-reviews-bot commented Sep 12, 2025

👋 Thanks for assigning @joostjager as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@tnull
Collaborator Author

tnull commented Sep 18, 2025

This now builds based on the just-merged lightningdevkit/rust-lightning#4069. We have yet to add write-version tracking for VSS.

@tnull tnull force-pushed the 2025-09-async-vss-store branch 3 times, most recently from c4251d4 to 7158653 Compare September 18, 2025 09:52
@tnull tnull moved this to Goal: Merge in Weekly Goals Sep 18, 2025
@tnull tnull self-assigned this Sep 18, 2025
@tnull tnull force-pushed the 2025-09-async-vss-store branch from 7158653 to 8c2ff8f Compare September 25, 2025 09:11
@tnull tnull force-pushed the 2025-09-async-vss-store branch 2 times, most recently from 464e6b7 to 9035b71 Compare September 29, 2025 13:35
@tnull tnull requested a review from joostjager September 29, 2025 13:38
@tnull
Collaborator Author

tnull commented Sep 29, 2025

Should be good for review. For the VssStore versioning I copied as much as possible from the already-reviewed approach over at lightningdevkit/rust-lightning#3931

@tnull tnull marked this pull request as ready for review September 29, 2025 13:39
@tnull tnull force-pushed the 2025-09-async-vss-store branch from 9035b71 to 750ab87 Compare September 29, 2025 14:54
@tnull
Collaborator Author

tnull commented Sep 29, 2025

Rebased to address conflicts post-#652.

if primary_namespace.is_empty() {
    key.to_owned()
} else {
    format!("{}#{}#{}", primary_namespace, secondary_namespace, key)
Contributor

Is it necessary to make a string out of this? It doesn't need to be mapped to a filename like for fs_store, so maybe it can also simply be a tuple?

Collaborator Author

I think so, as the HashMap needs to hold an owned value. We could have it be a (String, String, String), but that's worse.

Contributor

Is that worse? The string concatenation looks a bit unnecessary. Or make it a struct that is used as the key?

Collaborator Author

> Is that worse? The string concatenation looks a bit unnecessary. Or make it a struct that is used as the key?

It at least requires three individual allocations instead of one? I.e. more clutter on the heap, and probably also some slowdown?

FWIW, I mirrored what we do for the obfuscated key. Unfortunately it's not super easy to just reuse that.

Contributor

Not ideal indeed, the heap allocs
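
For reference, the two key representations discussed in this thread can be sketched side by side. This is only an illustration: the function name `build_locking_key` and the `LockingKey` struct are hypothetical and not taken from the PR; only the `#`-joined format mirrors the snippet under review.

```rust
/// Concatenated form, mirroring the snippet under review: one `String`,
/// so a single heap allocation per key.
/// (`build_locking_key` is a hypothetical name.)
fn build_locking_key(primary_namespace: &str, secondary_namespace: &str, key: &str) -> String {
    if primary_namespace.is_empty() {
        key.to_owned()
    } else {
        format!("{}#{}#{}", primary_namespace, secondary_namespace, key)
    }
}

/// Hypothetical struct alternative: usable as a `HashMap` key via the
/// derived `Hash`/`Eq`, but it owns three `String`s, i.e. up to three
/// heap allocations per key.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct LockingKey {
    primary_namespace: String,
    secondary_namespace: String,
    key: String,
}

impl LockingKey {
    fn new(primary_namespace: &str, secondary_namespace: &str, key: &str) -> Self {
        Self {
            primary_namespace: primary_namespace.to_owned(),
            secondary_namespace: secondary_namespace.to_owned(),
            key: key.to_owned(),
        }
    }
}
```

Both forms work as map keys; the trade-off raised above is purely allocation count versus avoiding the string formatting.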


Ok(self.storable_builder.deconstruct(storable)?.0)

self.execute_locked_read(locking_key, async move || {
Contributor

I think we don't need the lock for reading?

Collaborator Author

Dropped.

@tnull tnull force-pushed the 2025-09-async-vss-store branch from 750ab87 to 70404ed Compare September 30, 2025 07:45
@tnull tnull requested a review from joostjager September 30, 2025 07:46
@tnull
Collaborator Author

tnull commented Sep 30, 2025

Addressed pending comments. I probably still need to see why the CI job started hanging again.

@joostjager
Contributor

LGTM, can squash

@tnull tnull force-pushed the 2025-09-async-vss-store branch from 70404ed to 6c3fdf3 Compare September 30, 2025 08:47
@tnull
Collaborator Author

tnull commented Sep 30, 2025

> LGTM, can squash

Squashed without further changes.

let secondary_namespace = secondary_namespace.to_string();
let key = key.to_string();
let inner = Arc::clone(&self.inner);
let fut = tokio::task::spawn_blocking(move || {
Contributor

During final look, I am now again wondering if this function is actually preserving order?

Collaborator Author

Hmm, good point, it seems it would depend on how tokio exactly schedules the blocking tasks. I considered a few other options, but now simply also followed the tried and true 'write version locking' approach here, as we already use that in VSS and FS stores.

Contributor

Hmm but isn't this completely unnecessary because sqlite has its own global lock? With FS and VSS there is the actual possibility of parallel execution.

Collaborator Author

@tnull tnull Oct 7, 2025

> Hmm but isn't this completely unnecessary because sqlite has its own global lock? With FS and VSS there is the actual possibility of parallel execution.

Well, I first thought so, too, but I think the issue is that tokio gives no guarantee in which order spawned tasks are executed/scheduled. I.e., AFAICT it could happen that we spawn two writes w1, w2 but the task for w2 gets polled first, acquiring the Mutex first. Same goes for the case where multiple writes wait on the same connection lock: say w1 currently holds the Mutex and two more writes w2, w3 get queued, AFAIU there is no guarantee that when w1 drops the lock w2 always acquires the lock next.

TLDR: it seems we unfortunately need to do the version dance here, too. Maybe there's an easier mechanism in the SQLite case (e.g., prepare should technically take care of that, but we can't lock the connection Mutex outside of the spawned task as the guard is not Send) we could lean on to guarantee the ordered writes, but applying the same approach seemed simplest for now?
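
To make the "version dance" concrete, here is a minimal in-memory sketch of the idea, not the PR's implementation. It assumes a version is assigned synchronously when the write is requested, and that the spawned task later drops its write if a newer version for the same key has already landed. All names (`VersionedStore`, `assign_version`, `apply_write`) are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Hypothetical in-memory store illustrating write-version tracking.
struct VersionedStore {
    version_counter: AtomicU64,
    // key -> (version of the last applied write, value)
    entries: Mutex<HashMap<String, (u64, Vec<u8>)>>,
}

impl VersionedStore {
    fn new() -> Self {
        Self { version_counter: AtomicU64::new(0), entries: Mutex::new(HashMap::new()) }
    }

    /// Called synchronously when a write is requested, i.e. before any
    /// task is spawned. This fixes the intended order of writes.
    fn assign_version(&self) -> u64 {
        self.version_counter.fetch_add(1, Ordering::SeqCst) + 1
    }

    /// Called from the (possibly reordered) task. Returns `false` and
    /// leaves the entry untouched if a newer write was already applied.
    fn apply_write(&self, key: &str, version: u64, value: Vec<u8>) -> bool {
        let mut entries = self.entries.lock().unwrap();
        let is_stale = matches!(entries.get(key), Some((last, _)) if *last >= version);
        if is_stale {
            return false;
        }
        entries.insert(key.to_owned(), (version, value));
        true
    }

    fn read(&self, key: &str) -> Option<Vec<u8>> {
        self.entries.lock().unwrap().get(key).map(|(_, value)| value.clone())
    }
}
```

With this shape, if the task for w2 happens to acquire the lock before the task for w1, w1 is skipped as stale when it finally runs, so the stored value still reflects the order in which the writes were requested.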

Contributor

@joostjager joostjager Oct 7, 2025

I just refuse to believe that we need that version dance really for this problem. Isn't block_in_place meant for this? If we ensure that the write has happened before the fn returns, it doesn't matter that during that execution there may be other writes that get processed in a certain order?

Collaborator Author

Hmm? How do you imagine block_in_place to work here? block_in_place takes a future and drives it to completion, i.e., makes it a blocking operation. For the async KVStore we however exactly need a future, not a blocking operation, which is what spawn_blocking does for us.

Contributor

Does it? I saw

pub fn block_in_place<F, R>(f: F) -> R where F: FnOnce() -> R

Collaborator Author

> Does it? I saw
>
> pub fn block_in_place<F, R>(f: F) -> R where F: FnOnce() -> R

Sorry, please replace block_in_place with block_on above. block_in_place is simply a wrapper that spawns a blocking task on the outer runtime context so that the inner block_on call doesn't starve (that is, if it is indeed called on the same runtime, which it isn't always).

Contributor

@joostjager joostjager Oct 7, 2025

Discussed offline that even though block_in_place may work with caveats, we do prefer to implement sqlite async. And that for that to work, we need the versioning.

tnull added 5 commits October 7, 2025 10:08
.. first step to make review easier.
.. as we're gonna reuse the `async` `_internal` methods shortly.
.. where the former holds the latter in an `Arc` that can be used in
async/`Future` contexts more easily.
We implement the async `KVStore` trait for `VssStore`.
@tnull tnull force-pushed the 2025-09-async-vss-store branch from 6c3fdf3 to 1a55ab8 Compare October 7, 2025 08:45
tnull added 10 commits October 7, 2025 10:49
.. to be easier reusable via `KVStore` also
.. where the former holds the latter in an `Arc` that can be used in
async/`Future` contexts more easily.
.. to be easier reusable via `KVStore` also
.. where the former holds the latter in an `Arc` that can be used in
async/`Future` contexts more easily.
As an intermediary step, we require any store to implement both
`KVStore` and `KVStoreSync`, allowing us to switch over step-by-step.

We already switch to the fully-async background processor variant here.
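
The outer/inner split mentioned in the commit messages above (the outer store holding the inner one in an `Arc`) can be sketched roughly as follows. This is an assumption-laden illustration: `Store`, `StoreInner`, and the method names are hypothetical, and `std::thread::spawn` stands in for `tokio::task::spawn_blocking` so the example needs no external crates.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical inner store holding the actual state and `_internal` logic.
struct StoreInner {
    data: Mutex<HashMap<String, Vec<u8>>>,
}

impl StoreInner {
    fn write_internal(&self, key: &str, value: Vec<u8>) {
        self.data.lock().unwrap().insert(key.to_owned(), value);
    }

    fn read_internal(&self, key: &str) -> Option<Vec<u8>> {
        self.data.lock().unwrap().get(key).cloned()
    }
}

/// Hypothetical outer store: a thin handle around `Arc<StoreInner>`.
struct Store {
    inner: Arc<StoreInner>,
}

impl Store {
    fn new() -> Self {
        Self { inner: Arc::new(StoreInner { data: Mutex::new(HashMap::new()) }) }
    }

    /// Synchronous path: call straight through to the inner store.
    fn write_sync(&self, key: &str, value: Vec<u8>) {
        self.inner.write_internal(key, value);
    }

    /// Detached path: clone the `Arc` and own the arguments so the closure
    /// is `'static` and can outlive the borrow of `self`.
    fn write_detached(&self, key: &str, value: Vec<u8>) -> JoinHandle<()> {
        let key = key.to_owned();
        let inner = Arc::clone(&self.inner);
        thread::spawn(move || inner.write_internal(&key, value))
    }

    fn read(&self, key: &str) -> Option<Vec<u8>> {
        self.inner.read_internal(key)
    }
}
```

The point of the split is that both the sync and the async trait implementations can share the same `_internal` methods, while the `Arc` lets a spawned task hold its own handle to the state.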
@tnull tnull force-pushed the 2025-09-async-vss-store branch from 1a55ab8 to 8dada08 Compare October 7, 2025 08:49
@tnull tnull changed the base branch from develop to main October 7, 2025 08:49
@tnull tnull requested a review from joostjager October 7, 2025 08:49
@tnull
Collaborator Author

tnull commented Oct 7, 2025

Addressed remaining comments and changed base to main. Let me know if I can squash.

@tnull
Collaborator Author

tnull commented Oct 7, 2025

CI breakage is unrelated (#654)

    &self, primary_namespace: &str, secondary_namespace: &str, key: &str,
) -> String {
    if primary_namespace.is_empty() {
        key.to_owned()
Contributor

And secondary ns?

Is this actually used without a namespace?

io::Error::new(io::ErrorKind::Other, msg)
})?;
Ok(())
})
Contributor

Indent formatting issue?

@joostjager
Contributor

Squash = ok
