What type of issue is this?
- Bug in the code or other problem
- Inadequate/incorrect documentation
- Feature request
LoopVectorization.jl usually does a better job than the Julia compiler plus LLVM at unrolling and vectorizing loops. You might want to use it for some of the benchmarks.
For instance, on Zen 2:
$ ~/julia-1.6.0-rc1/bin/julia -O3
(@v1.6) pkg> activate --temp
Activating new environment at `/tmp/jl_GYGsu9/Project.toml`
(jl_GYGsu9) pkg> add BenchmarkTools, LoopVectorization
julia> using LoopVectorization, BenchmarkTools
julia> r = 3;
julia> n = 1000;
julia> A = zeros(Float64, n, n);
julia> B = zeros(Float64, n, n);
julia> W = zeros(Float64, 2*r+1, 2*r+1);
julia> function do_stencil(A, W, B, r, n)
           for j=r:n-r-1
               for i=r:n-r-1
                   for jj=-r:r
                       for ii=-r:r
                           @inbounds B[i+1,j+1] += W[r+ii+1,r+jj+1] * A[i+ii+1,j+jj+1]
                       end
                   end
               end
           end
       end
do_stencil (generic function with 1 method)
julia> @benchmark do_stencil($A, $W, $B, $r, $n)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     24.744 ms (0.00% GC)
  median time:      24.799 ms (0.00% GC)
  mean time:        24.803 ms (0.00% GC)
  maximum time:     24.948 ms (0.00% GC)
  --------------
  samples:          202
  evals/sample:     1
julia> function do_stencil_avx(A, W, B, r, n)
           @avx for j=r:n-r-1, i=r:n-r-1, jj=-r:r, ii=-r:r
               B[i+1,j+1] += W[r+ii+1,r+jj+1] * A[i+ii+1,j+jj+1]
           end
       end
do_stencil_avx (generic function with 1 method)
julia> @benchmark do_stencil_avx($A, $W, $B, $r, $n)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     3.234 ms (0.00% GC)
  median time:      3.267 ms (0.00% GC)
  mean time:        3.275 ms (0.00% GC)
  maximum time:     3.452 ms (0.00% GC)
  --------------
  samples:          1527
  evals/sample:     1
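The timings above use zero-filled arrays, so on their own they do not show that the two versions compute the same result. Below is a minimal sketch of how one might check that on random inputs. `stencils_agree` is a hypothetical helper introduced here for illustration; it is not part of the benchmark suite or of LoopVectorization.jl, and it uses only Base so it runs without extra packages.

```julia
# Reference stencil, same loop nest as do_stencil in the session above.
function do_stencil(A, W, B, r, n)
    for j=r:n-r-1, i=r:n-r-1, jj=-r:r, ii=-r:r
        @inbounds B[i+1,j+1] += W[r+ii+1,r+jj+1] * A[i+ii+1,j+jj+1]
    end
end

# Hypothetical helper (an assumption, not from the issue): run two stencil
# implementations on the same random inputs and compare the outputs
# element by element, up to floating-point roundoff.
function stencils_agree(f, g; r=3, n=100)
    A = rand(n, n)
    W = rand(2r+1, 2r+1)
    Bf = zeros(n, n)   # output of the first implementation
    Bg = zeros(n, n)   # output of the second implementation
    f(A, W, Bf, r, n)
    g(A, W, Bg, r, n)
    return all(isapprox.(Bf, Bg))
end
```

With `do_stencil_avx` defined as in the session, the call would be `stencils_agree(do_stencil, do_stencil_avx)`. The comparison is approximate rather than exact because a vectorized loop may accumulate the products in a different order. In later LoopVectorization.jl releases the macro is spelled `@turbo` instead of `@avx`.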