1.1.0: Standardization, reporting mode, extension of visualizations, and many enhancements

@fjwillemsen fjwillemsen released this 03 Sep 19:15
5c6cf6c

This major release introduces substantial updates that further standardize and extend this software for evaluating auto-tuning algorithms. In particular:

  • Visualization and reporting of results are now modular; scores can be obtained programmatically using report_experiments.
  • New benchmark_hub repository and submodule hosting brute-forced benchmarking resources.
  • Extended and improved visualizations:
    • Heatmaps for comparison per search space and / or over time
    • Head-to-head plots for direct comparisons on practical impact (beta)
    • New color handling based on existing color maps
  • Improved Experiments Schema:
    • Strategy grouping and coloring
    • Extended visualization settings such as visual minima and maxima
  • Improvements to consistency:
    • Improved and expanded tests
    • File handling modes explicitly define encodings for consistent behavior
    • Ensuring correct aggregation data is used
  • Further improvements and changes:
    • Switched to NumPy 2.0
    • Added support for Python 3.12 and 3.13; dropped support for Python 3.9
    • Auto-retry on missing data, smart cutoff handling, better error messages
    • Execution engine enhancements
    • Time-based cutoffs and runtime conversion
    • Flexible support for optimization direction, cutoffs, objectives, and valid result thresholds
    • Reformatted code with Ruff, improved test coverage
    • Implemented a new baseline strategy execution framework (ExecutedStrategyBaseline)
    • Implemented automatic hiding of strategies used as an executed baseline
    • Updated other plot types to respect the ‘hide’ key for strategies
    • Introduced get_colors function for strategy grouping and improved color assignment

In addition, there are various bug fixes and minor improvements.
For more details, refer to the full changelog: 1.0.0...1.1.0
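
To illustrate the item on file handling modes above: when no encoding is passed, Python's open() generally falls back to a platform-dependent default, so the same results file can be read differently on different systems. The sketch below shows the general technique of always passing an explicit encoding. It uses only the standard library; the helper names write_json and read_json and the file name results.json are hypothetical and are not taken from this package.

```python
import json
import tempfile
from pathlib import Path


def write_json(path: Path, data: dict) -> None:
    """Write JSON with an explicit encoding (hypothetical helper)."""
    # Passing encoding="utf-8" avoids relying on the platform default.
    with path.open("w", encoding="utf-8") as fh:
        json.dump(data, fh, ensure_ascii=False)


def read_json(path: Path) -> dict:
    """Read JSON back using the same explicit encoding (hypothetical helper)."""
    with path.open("r", encoding="utf-8") as fh:
        return json.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "results.json"  # illustrative file name
    original = {"strategy": "génétique", "unit": "µs"}
    write_json(target, original)
    raw = target.read_bytes()        # bytes as stored on disk
    roundtrip = read_json(target)    # decoded with the same encoding
```

Because reader and writer name the same encoding, non-ASCII content such as units or strategy names should round-trip identically regardless of the operating system's locale.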