Releases: KernelTuner/kernel_tuner

Version 0.1.2

29 Mar 07:02

Better defaults for grid divisor lists, full support for 3D grids, and a simpler way to specify the problem size of 1D grids.

[0.1.2] - 2017-03-29

Changed

  • allow non-tuple problem_size for 1D grids
  • changed default for grid_div_y from None to block_size_y
  • converted the tutorial to a Jupyter Notebook
  • CUDA backend prints device in use, similar to OpenCL backend
  • migrating from nosetests to pytest
  • rewrote many of the examples to save results to JSON files

Added

  • full support for 3D grids, including option for grid_div_z
  • separable convolution example
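
The grid-related changes in this release can be illustrated with a small, standalone sketch. It does not use or reproduce Kernel Tuner's own code: the helpers `normalize_problem_size` and `grid_dimensions` are hypothetical, and the default of `block_size_z` for `grid_div_z` is an assumption made by analogy with the new `grid_div_y` default. The idea shown is that each grid dimension is the problem size in that dimension divided by the product of the named tunable parameters, rounded up.

```python
import math


def normalize_problem_size(problem_size):
    """Accept a plain int for 1D problems, or a tuple of up to 3 ints.

    Hypothetical helper: pads the result to three dimensions with 1s.
    """
    if isinstance(problem_size, int):
        problem_size = (problem_size,)
    return tuple(problem_size) + (1,) * (3 - len(problem_size))


def grid_dimensions(problem_size, params,
                    grid_div_x=("block_size_x",),
                    grid_div_y=("block_size_y",),
                    grid_div_z=("block_size_z",)):  # assumed default
    """Divide each problem dimension by the product of its divisors."""
    size = normalize_problem_size(problem_size)
    dims = []
    for extent, divisors in zip(size, (grid_div_x, grid_div_y, grid_div_z)):
        divisor = 1
        for name in divisors:
            # Parameters absent from this configuration count as 1.
            divisor *= params.get(name, 1)
        dims.append(math.ceil(extent / divisor))
    return tuple(dims)


# 1D problem given as a plain integer instead of a tuple.
grid_1d = grid_dimensions(1000, {"block_size_x": 128})

# 3D problem with one divisor per dimension.
grid_3d = grid_dimensions(
    (4096, 4096, 64),
    {"block_size_x": 32, "block_size_y": 8, "block_size_z": 4},
)
```

Rounding up ensures that the grid still covers the whole problem when the problem size is not a multiple of the thread block size.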

Version 0.1.1

10 Feb 07:49

[0.1.1] - 2017-02-10

Changed

  • changed the output format to list of dictionaries

Added

  • option to set compiler options
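
A list-of-dictionaries output format is easy to post-process with plain Python. The sketch below is only an illustration: the key names (`block_size_x`, `time`) and all values are made-up placeholders, not measurements or a documented schema, and `best_configuration` is a hypothetical helper.

```python
import json

# Placeholder results: one dictionary per benchmarked configuration.
# Key names and numbers are invented for illustration only.
results = [
    {"block_size_x": 64, "time": 1.92},
    {"block_size_x": 128, "time": 1.41},
    {"block_size_x": 256, "time": 1.57},
]


def best_configuration(results, key="time"):
    """Return the entry with the smallest value for `key`."""
    return min(results, key=lambda entry: entry[key])


def to_json(results):
    """Serialize the results so they can be written to a file."""
    return json.dumps(results, indent=2)


best = best_configuration(results)
serialized = to_json(results)
```

Because every entry is a flat dictionary, the same list can be sorted, filtered, or serialized without any custom classes.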

Version 0.1.0

02 Nov 11:59

The Kernel Tuner should now be ready for production use. Over the last few months we have used it in several projects, which revealed a number of issues that are fixed in this version. This release also marks the end of a period in which the internal structure of the Kernel Tuner changed several times; we expect the current code structure to remain in place for a while. With this version we also publish the project roadmap, which shows the changes and additional features planned for the near and more distant future. We also feel that the software is now ready to be added to public software repositories, which we will do shortly.

First beta release

14 Jun 11:13

Pre-release

This is the first beta release of the Kernel Tuner.

This release marks the first version of the Kernel Tuner. It is currently in beta testing to determine what functionality is missing and what needs to be fixed before the code can be considered production-ready.

A brief description of the Kernel Tuner's functionality in this version:

  • Basic kernel tuning functionality for CUDA, OpenCL, and C functions
  • Many examples and rather extensive documentation
  • Search space restriction, using the 'restrictions' option
  • Kernel output verification, using the 'answer' option
  • Example showing how to tune both host code (number of streams) and GPU code
  • Run a single kernel with a specific parameter set and get the output
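
Two of the features listed above, search space restriction and output verification, can be sketched in plain Python. This is not Kernel Tuner's implementation: `search_space` and `verify` are hypothetical helpers, restrictions are modelled here as Python callables, and skipping `None` entries in the expected answer is an assumption made for the example.

```python
import itertools


def search_space(tune_params, restrictions=()):
    """Enumerate all parameter combinations that satisfy every restriction."""
    names = list(tune_params)
    configs = []
    for values in itertools.product(*(tune_params[n] for n in names)):
        config = dict(zip(names, values))
        if all(rule(config) for rule in restrictions):
            configs.append(config)
    return configs


def verify(output, answer, atol=1e-6):
    """Compare kernel outputs to expected values, element by element.

    Entries of `answer` that are None are not checked (assumed convention).
    """
    for got, expected in zip(output, answer):
        if expected is None:
            continue
        if len(got) != len(expected):
            return False
        if any(abs(g - e) > atol for g, e in zip(got, expected)):
            return False
    return True


tune_params = {
    "block_size_x": [16, 32, 64],
    "block_size_y": [1, 2, 4, 8],
}

# Keep only configurations with at most 128 threads per block.
restrictions = [lambda p: p["block_size_x"] * p["block_size_y"] <= 128]

configs = search_space(tune_params, restrictions)
```

Filtering before benchmarking avoids compiling and running configurations that are known in advance to be invalid, and verifying the output guards against configurations that are fast but wrong.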