VecMat fusion autotune bug #3683

@laggui

Description

Running the wgpu tests fails because of the vecmat fusion autotune:

cargo test  --lib --bins --examples -p burn-wgpu --color=always --release
  thread 'tests::cube_fusion::autodiff::f32_ty::ad_transpose::tests::should_diff_swap_dims' panicked at /home/agent/.cargo/git/checkouts/cubecl-058c47895211d464/5a4ad7f/crates/cubecl-runtime/src/tune/local.rs:155:26:
  Should run when selected by autotune.: Unknown("RunnerError(LaunchError(Unable to launch matmul because the config is invalid: \"Lhs and Rhs must have same line size, got lhs=1 and rhs=2\"\n))")

When only the failing test, should_diff_swap_dims, is run from a clean cache, it passes. If it is run with the current cache, it fails, as expected:

cargo test  --lib --bins --examples -p burn-wgpu --color=always --release should_diff_swap_dims

If we comment out the SimpleVecMat and DoubleVecMat tunables, running all the tests produces no issues.

TunableSet::new(create_key::<R>, input_gen::<R>)
    .with(Tunable::new(tune_fallback::<R, BT>)) // First one should always work.
    .with(Tunable::new(tune_fused::<R, BT, SimpleUnit>).group(&unit, |_| PRIORITY_MAX))
    .with(Tunable::new(tune_fused::<R, BT, SimpleVecMat>).group(&unit, |_| PRIORITY_MAX))
    .with(Tunable::new(tune_fused::<R, BT, DoubleVecMat>).group(&unit, |_| PRIORITY_MAX))
    .with(

It looks like the vecmat tunables are selected during autotune by a previous test, so when should_diff_swap_dims runs, the vecmat kernel is picked as the fastest but fails to execute at runtime because of the line size mismatch (lhs=1, rhs=2).
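To make the suspected failure mode concrete, here is a minimal, self-contained sketch of it. This is not cubecl or burn code: `TuneKey`, `Kernel`, `Problem`, `launch`, and `Tuner` are all hypothetical names, and the sketch assumes (unconfirmed) that two tests end up with the same autotune key while having different line sizes. Under that assumption, a selection cached by an earlier test is reused for an input it cannot launch on.

```rust
use std::collections::HashMap;

/// Hypothetical autotune key. Assumption: it does NOT encode line sizes,
/// so problems with different line sizes can share one cache entry.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct TuneKey {
    m: usize,
    n: usize,
    k: usize,
}

/// Hypothetical kernels standing in for the fallback and a vecmat tunable.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kernel {
    Fallback,
    VecMat,
}

/// Hypothetical problem description: a key plus the actual line sizes.
struct Problem {
    key: TuneKey,
    lhs_line_size: u8,
    rhs_line_size: u8,
}

/// The fallback always launches; vecmat requires matching line sizes.
fn launch(kernel: Kernel, p: &Problem) -> Result<(), String> {
    match kernel {
        Kernel::Fallback => Ok(()),
        Kernel::VecMat if p.lhs_line_size == p.rhs_line_size => Ok(()),
        Kernel::VecMat => Err(format!(
            "Lhs and Rhs must have same line size, got lhs={} and rhs={}",
            p.lhs_line_size, p.rhs_line_size
        )),
    }
}

/// Toy tuner: "benchmarks" on a cache miss, trusts the cache on a hit.
struct Tuner {
    cache: HashMap<TuneKey, Kernel>,
}

impl Tuner {
    fn new() -> Self {
        Tuner { cache: HashMap::new() }
    }

    fn run(&mut self, p: &Problem) -> Result<Kernel, String> {
        let kernel = match self.cache.get(&p.key) {
            // Cache hit: the stored choice is reused without re-validation.
            Some(cached) => *cached,
            // Cache miss: pretend vecmat wins whenever it can launch at all.
            None => {
                let best = if launch(Kernel::VecMat, p).is_ok() {
                    Kernel::VecMat
                } else {
                    Kernel::Fallback
                };
                self.cache.insert(p.key.clone(), best);
                best
            }
        };
        launch(kernel, p).map(|_| kernel)
    }
}
```

In this toy model, running a problem with line sizes (2, 2) first caches `VecMat`; a later problem with the same key and line sizes (1, 2) then hits the cache and fails to launch, while the same problem on a fresh `Tuner` selects `Fallback` and succeeds. That matches the observed behavior (passes from a clean cache, fails with the current one), but whether the real key actually omits the line sizes is a guess that would need to be checked against create_key.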
