feat: Add stacksafe feature #857

osiewicz · 2025-08-07T16:29:57Z

Objective

This PR adds stacksafe feature. We ran taffy super_deep benchmark with different display kinds and annotated all stack overflow-inducing call-sites with stacksafe::stacksafe.

Changes that will affect external library users must update RELEASES.md before they will be merged.

Context

There was a prior discussion about stacksafe use in #853.

Co-authored-by: Anthony Eid <hello@anthonyeid.me> Co-authored-by: Ben Kunkle <ben@zed.dev>

nicoburns

This should probably also be added to https://github.com/DioxusLabs/taffy/blob/main/src/compute/grid/mod.rs#L43

osiewicz · 2025-08-08T09:14:51Z

Oddly enough I could not trigger an overflow with grid layout easily, but yeah, let's stay consistent.

osiewicz · 2025-08-08T09:41:59Z

I ran benchmarks against taffy without stacksafe enabled:
Grid:

Grid (tl;dr: 8-10% perf regression)


grid/wide/31x31/961     time:   [656.85 µs 664.80 µs 668.10 µs]
                        change: [−3.3794% −0.2172% +3.4173%] (p = 0.91 > 0.05)
                        No change in performance detected.
grid/wide/100x100/10000 time:   [9.0808 ms 9.1444 ms 9.1716 ms]
                        change: [−1.4588% +1.1055% +3.6078%] (p = 0.40 > 0.05)
                        No change in performance detected.
Benchmarking grid/wide/316x316/99856: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 9.2s or enable flat sampling.
grid/wide/316x316/99856 time:   [154.95 ms 155.43 ms 156.40 ms]
                        change: [+2.7604% +3.2565% +3.8323%] (p = 0.00 < 0.05)
                        Performance has regressed.
grid/deep/2x2/1024      time:   [1.6690 ms 1.6836 ms 1.6905 ms]
                        change: [+7.0525% +8.3548% +9.6254%] (p = 0.00 < 0.05)
                        Performance has regressed.
grid/deep/3x3/6561      time:   [9.1765 ms 9.2361 ms 9.2734 ms]
                        change: [+6.2477% +7.4270% +8.6247%] (p = 0.00 < 0.05)
                        Performance has regressed.
grid/deep/2x2/16384     time:   [34.196 ms 34.301 ms 34.448 ms]
                        change: [+6.3845% +7.2430% +8.0459%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
grid/superdeep/1x1/100  time:   [175.93 µs 179.93 µs 182.26 µs]
                        change: [+10.696% +12.494% +14.303%] (p = 0.00 < 0.05)
                        Performance has regressed.
grid/superdeep/1x1/1000 time:   [1.8662 ms 1.8758 ms 1.8812 ms]
                        change: [+9.9307% +10.608% +11.334%] (p = 0.00 < 0.05)
                        Performance has regressed.

Flexbox (tl;dr: >10% perf regression)


yoga 'huge nested'/Taffy 0.7 /10000
                        time:   [2.4040 ms 2.4058 ms 2.4077 ms]
                        change: [+11.913% +12.043% +12.166%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe
Wide tree/Taffy 0.7 (2-level hierarchy)/10000
                        time:   [3.9141 ms 3.9206 ms 3.9266 ms]
                        change: [+9.9228% +11.393% +12.883%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 10 measurements (20.00%)
  2 (20.00%) low severe
Deep tree (auto size)/Taffy 0.7 (12-level hierarchy)/4000
                        time:   [2.2935 ms 2.3227 ms 2.3361 ms]
                        change: [+14.057% +16.311% +18.752%] (p = 0.00 < 0.05)
                        Performance has regressed.
Deep tree (auto size)/Taffy 0.7 (14-level hierarchy)/10000
                        time:   [5.9560 ms 5.9848 ms 5.9957 ms]
                        change: [+13.704% +15.171% +16.676%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 10 measurements (20.00%)
  2 (20.00%) low mild
Benchmarking Deep tree (random size)/Taffy 0.7 (12-level hierarchy)/4000: Collecting 10 samples in estimated 5.0668 s (2530 iterationDeep tree (random size)/Taffy 0.7 (12-level hierarchy)/4000
                        time:   [1.5994 ms 1.6166 ms 1.6241 ms]
                        change: [+3.1367% +7.1762% +11.125%] (p = 0.00 < 0.05)
                        Performance has regressed.
Benchmarking Deep tree (random size)/Taffy 0.7 (14-level hierarchy)/10000: Collecting 10 samples in estimated 5.2958 s (935 iterationDeep tree (random size)/Taffy 0.7 (14-level hierarchy)/10000
                        time:   [4.3597 ms 4.3754 ms 4.3847 ms]
                        change: [+8.5868% +9.6264% +10.751%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low mild
super deep/Taffy 0.7 /100
                        time:   [196.86 µs 198.73 µs 199.47 µs]
                        change: [+15.924% +18.632% +21.418%] (p = 0.00 < 0.05)
                        Performance has regressed.

That's a bit yikes (even though we at Zed are likely to roll with it). I wonder if all of the callsites need the stacksafe annotation and if not, whether reducing them would reduce the perf impact of stacksafe on taffy's performance.

nicoburns

Some principled reasoning for where to include the annotations:

As a starting point, we want to consider the functions here: https://docs.rs/taffy/latest/taffy/compute/index.html. Looking at those:

compute_flexbox_layout
compute_grid_layout
compute_block_layout
compute_leaf_layout

Should definitely have the annotation. One of these functions gets called each time we recurse into a child node.

compute_root_layout doesn't need it. That only gets called once for the very root node of the tree.
compute_cached_layout doesn't need it because it is redundant with the layout-mode specific functions above. And we prefer to put it on the other methods to speed up cache access.
compute_hidden_layout maybe needs it? This also recurses down the tree but with a very minimal function that doesn't use a lot of stack space: https://docs.rs/taffy/latest/src/taffy/compute/mod.rs.html#266-278
round_layout - The inner round_layout_inner should probably have it (you already have this).
print_tree - The inner print_node should probably have it

nicoburns · 2025-08-12T10:20:18Z

src/compute/mod.rs

 /// Attempts to find a cached layout for the specified node and layout inputs.
 ///
 /// Uses the provided closure to compute the layout (and then stores the result in the cache) if no cached layout is found.
+#[cfg_attr(feature = "stacksafe", stacksafe::stacksafe)]


Removing from here might help perf, because this function gets called just prior to Taffy reading from it's cache. It's also probably not necessary here, because:

If there's a cache hit then we will immediately return back up the stack

If there's a cache miss then we will recurse into another function which will be annotated

nicoburns · 2025-08-12T10:21:12Z

src/tree/taffy_tree.rs

    }

    #[inline(always)]
+    #[cfg_attr(feature = "stacksafe", stacksafe::stacksafe)]


I'd also remove here for same reason as above. This is just prior to hitting the cache, and we'd like to only run this check if we get a cache miss to keep cache access fast.

nicoburns · 2025-08-12T10:21:45Z

src/compute/block.rs

 }

 /// Computes the layout of [`LayoutBlockContainer`] according to the block layout algorithm
+#[cfg_attr(feature = "stacksafe", stacksafe::stacksafe)]


I expect this and perform_final_layout_on_in_flow_children aren't necessary, but are probably harmless.

osiewicz and others added 2 commits August 7, 2025 17:55

Add stacksafe dependency/feature and enable it for block layouts

7d82192

Co-authored-by: Anthony Eid <hello@anthonyeid.me> Co-authored-by: Ben Kunkle <ben@zed.dev>

Fixes for flex layout

230e7dc

Co-authored-by: Anthony Eid <hello@anthonyeid.me> Co-authored-by: Ben Kunkle <ben@zed.dev>

nicoburns requested changes Aug 7, 2025

View reviewed changes

Add stacksafe to grid layout

64baf41

nicoburns reviewed Aug 12, 2025

View reviewed changes

Merge branch 'main' into add-stacksafe

b7dd579

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

feat: Add stacksafe feature #857

feat: Add stacksafe feature #857

Uh oh!

osiewicz commented Aug 7, 2025

Uh oh!

nicoburns left a comment

Uh oh!

osiewicz commented Aug 8, 2025

Uh oh!

osiewicz commented Aug 8, 2025

Uh oh!

nicoburns left a comment •

edited

Loading

Uh oh!

nicoburns Aug 12, 2025

Uh oh!

nicoburns Aug 12, 2025

Uh oh!

nicoburns Aug 12, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

feat: Add stacksafe feature #857

Are you sure you want to change the base?

feat: Add stacksafe feature #857

Uh oh!

Conversation

osiewicz commented Aug 7, 2025

Objective

Context

Uh oh!

nicoburns left a comment

Choose a reason for hiding this comment

Uh oh!

osiewicz commented Aug 8, 2025

Uh oh!

osiewicz commented Aug 8, 2025

Uh oh!

nicoburns left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nicoburns Aug 12, 2025

Choose a reason for hiding this comment

Uh oh!

nicoburns Aug 12, 2025

Choose a reason for hiding this comment

Uh oh!

nicoburns Aug 12, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

nicoburns left a comment •

edited

Loading