-
Couldn't load subscription status.
- Fork 153
feat: Add stacksafe feature #857
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: Anthony Eid <hello@anthonyeid.me> Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Anthony Eid <hello@anthonyeid.me> Co-authored-by: Ben Kunkle <ben@zed.dev>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably also be added to https://github.com/DioxusLabs/taffy/blob/main/src/compute/grid/mod.rs#L43
|
Oddly enough I could not trigger an overflow with grid layout easily, but yeah, let's stay consistent. |
|
I ran benchmarks against taffy without Grid (tl;dr: 8-10% perf regression)
grid/wide/31x31/961 time: [656.85 µs 664.80 µs 668.10 µs]
change: [−3.3794% −0.2172% +3.4173%] (p = 0.91 > 0.05)
No change in performance detected.
grid/wide/100x100/10000 time: [9.0808 ms 9.1444 ms 9.1716 ms]
change: [−1.4588% +1.1055% +3.6078%] (p = 0.40 > 0.05)
No change in performance detected.
Benchmarking grid/wide/316x316/99856: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 9.2s or enable flat sampling.
grid/wide/316x316/99856 time: [154.95 ms 155.43 ms 156.40 ms]
change: [+2.7604% +3.2565% +3.8323%] (p = 0.00 < 0.05)
Performance has regressed.
grid/deep/2x2/1024 time: [1.6690 ms 1.6836 ms 1.6905 ms]
change: [+7.0525% +8.3548% +9.6254%] (p = 0.00 < 0.05)
Performance has regressed.
grid/deep/3x3/6561 time: [9.1765 ms 9.2361 ms 9.2734 ms]
change: [+6.2477% +7.4270% +8.6247%] (p = 0.00 < 0.05)
Performance has regressed.
grid/deep/2x2/16384 time: [34.196 ms 34.301 ms 34.448 ms]
change: [+6.3845% +7.2430% +8.0459%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
grid/superdeep/1x1/100 time: [175.93 µs 179.93 µs 182.26 µs]
change: [+10.696% +12.494% +14.303%] (p = 0.00 < 0.05)
Performance has regressed.
grid/superdeep/1x1/1000 time: [1.8662 ms 1.8758 ms 1.8812 ms]
change: [+9.9307% +10.608% +11.334%] (p = 0.00 < 0.05)
Performance has regressed.
Flexbox (tl;dr: >10% perf regression)
yoga 'huge nested'/Taffy 0.7 /10000
time: [2.4040 ms 2.4058 ms 2.4077 ms]
change: [+11.913% +12.043% +12.166%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) low mild
1 (1.00%) high mild
2 (2.00%) high severe
Wide tree/Taffy 0.7 (2-level hierarchy)/10000
time: [3.9141 ms 3.9206 ms 3.9266 ms]
change: [+9.9228% +11.393% +12.883%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 10 measurements (20.00%)
2 (20.00%) low severe
Deep tree (auto size)/Taffy 0.7 (12-level hierarchy)/4000
time: [2.2935 ms 2.3227 ms 2.3361 ms]
change: [+14.057% +16.311% +18.752%] (p = 0.00 < 0.05)
Performance has regressed.
Deep tree (auto size)/Taffy 0.7 (14-level hierarchy)/10000
time: [5.9560 ms 5.9848 ms 5.9957 ms]
change: [+13.704% +15.171% +16.676%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 10 measurements (20.00%)
2 (20.00%) low mild
Benchmarking Deep tree (random size)/Taffy 0.7 (12-level hierarchy)/4000: Collecting 10 samples in estimated 5.0668 s (2530 iterationDeep tree (random size)/Taffy 0.7 (12-level hierarchy)/4000
time: [1.5994 ms 1.6166 ms 1.6241 ms]
change: [+3.1367% +7.1762% +11.125%] (p = 0.00 < 0.05)
Performance has regressed.
Benchmarking Deep tree (random size)/Taffy 0.7 (14-level hierarchy)/10000: Collecting 10 samples in estimated 5.2958 s (935 iterationDeep tree (random size)/Taffy 0.7 (14-level hierarchy)/10000
time: [4.3597 ms 4.3754 ms 4.3847 ms]
change: [+8.5868% +9.6264% +10.751%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) low mild
super deep/Taffy 0.7 /100
time: [196.86 µs 198.73 µs 199.47 µs]
change: [+15.924% +18.632% +21.418%] (p = 0.00 < 0.05)
Performance has regressed.
That's a bit yikes (even though we at Zed are likely to roll with it). I wonder if all of the callsites need the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some principled reasoning for where to include the annotations:
As a starting point, we want to consider the functions here: https://docs.rs/taffy/latest/taffy/compute/index.html. Looking at those:
compute_flexbox_layoutcompute_grid_layoutcompute_block_layoutcompute_leaf_layout
Should definitely have the annotation. One of these functions gets called each time we recurse into a child node.
compute_root_layoutdoesn't need it. That only gets called once for the very root node of the tree.compute_cached_layoutdoesn't need it because it is redundant with the layout-mode specific functions above. And we prefer to put it on the other methods to speed up cache access.compute_hidden_layoutmaybe needs it? This also recurses down the tree but with a very minimal function that doesn't use a lot of stack space: https://docs.rs/taffy/latest/src/taffy/compute/mod.rs.html#266-278round_layout- The innerround_layout_innershould probably have it (you already have this).print_tree- The innerprint_nodeshould probably have it
| /// Attempts to find a cached layout for the specified node and layout inputs. | ||
| /// | ||
| /// Uses the provided closure to compute the layout (and then stores the result in the cache) if no cached layout is found. | ||
| #[cfg_attr(feature = "stacksafe", stacksafe::stacksafe)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing from here might help perf, because this function gets called just prior to Taffy reading from it's cache. It's also probably not necessary here, because:
- If there's a cache hit then we will immediately return back up the stack
- If there's a cache miss then we will recurse into another function which will be annotated
| } | ||
|
|
||
| #[inline(always)] | ||
| #[cfg_attr(feature = "stacksafe", stacksafe::stacksafe)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also remove here for same reason as above. This is just prior to hitting the cache, and we'd like to only run this check if we get a cache miss to keep cache access fast.
| } | ||
|
|
||
| /// Computes the layout of [`LayoutBlockContainer`] according to the block layout algorithm | ||
| #[cfg_attr(feature = "stacksafe", stacksafe::stacksafe)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect this and perform_final_layout_on_in_flow_children aren't necessary, but are probably harmless.
Objective
This PR adds
stacksafefeature. We ran taffysuper_deepbenchmark with different display kinds and annotated all stack overflow-inducing call-sites with stacksafe::stacksafe.Changes that will affect external library users must update RELEASES.md before they will be merged.
Context
There was a prior discussion about
stacksafeuse in #853.