
2 Optimizing Julia benchmarks

Thierry Dumont edited this page Mar 8, 2019 · 19 revisions

Optimizing Julia programs.

One must first read this and apply everything it says.

In all the benchmarks, the main problem was memory allocation. If the program allocates large chunks of memory, performance will, unsurprisingly, be poor.
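The effect is easy to see with `@allocated`. A small illustration (the two functions below are hypothetical, not from the benchmarks): the same reduction written with a temporary array and with a generator.

```julia
# Hypothetical illustration: the same reduction with and without a
# temporary array.
f_alloc(x)   = sum(x .^ 2 .+ 1)          # 2 .* ... builds a temporary array
f_noalloc(x) = sum(t^2 + 1 for t in x)   # generator: no temporary

x = rand(10^6)
f_alloc(x); f_noalloc(x)                 # warm up (compile) first
@allocated f_alloc(x)                    # ~8 MB for the temporary
@allocated f_noalloc(x)                  # (almost) nothing
```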

  • One must first type everything that can be typed, at least where significant computing time is spent. Let us look for example at the module Rando.jl in FeStiff/. If we replace:
  seed::Int64
  a::Int64
  c::Int64
  m::Int64

by

    seed
    a
    c
    m

the computing time for the generation of the triangles becomes 2.622462655 seconds (on my computer), whereas it was only 0.091999968 seconds when the variables were typed. Now launch the untyped version with ./script-m and have a look at Rando.jl.mem:

       - function fv!(R::RandoData,vmax=1.)
576000240     R.seed= (R.a * R.seed + R.c) % R.m
192000096     vmax*Float64(R.seed)/R.m
        - end

Thus, a lot of memory is allocated; return to the original "typed" version, and launch again ./script-m: you can verify that no memory is allocated in the function when the variables are typed!
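(./script-m presumably runs Julia with --track-allocation, which writes the per-line allocation counts into the .mem files.) For reference, the generator at issue is a linear congruential generator. A minimal sketch of the typed version follows; the LCG constants are illustrative, not necessarily those used in Rando.jl:

```julia
# Sketch of the typed version: every field has a concrete type, so the
# compiler knows R.seed, R.a, ... are Int64 and emits allocation-free code.
mutable struct RandoData
    seed::Int64
    a::Int64
    c::Int64
    m::Int64
end

function fv!(R::RandoData, vmax=1.0)
    R.seed = (R.a * R.seed + R.c) % R.m
    vmax * Float64(R.seed) / R.m
end

# Illustrative LCG constants (glibc-style), not taken from Rando.jl:
R = RandoData(42, 1103515245, 12345, 2^31)
x = fv!(R, 1.0)   # a Float64 in [0, 1)
```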

  • Forget what you learned with Python/Numpy:

Have a look at MicroBenchmarks/Ju: every benchmark is coded in different programming styles: a vectorized style (like what one would write with Python/Scipy/Numpy) and a naïve style with explicit loops. The explicit-loop style is always the fastest. Do not forget that arrays are stored Fortran-like, that is, column major (look at MicroBenchmarks/Ju/main_lapl_2.jl), and do not forget the @simd macro.
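The contrast above can be sketched as follows (an illustrative example of my own, not the benchmark's actual code): summing a scaled 2-D array, vectorized versus with explicit loops traversed in column-major order.

```julia
# Numpy-like vector style: 2 .* A allocates a full temporary array.
vec_style(A) = sum(2 .* A)

# Explicit-loop style: no temporaries; the inner loop runs over the
# FIRST index because Julia arrays are column major (Fortran order).
function loop_style(A)
    s = 0.0
    @inbounds for j in axes(A, 2)
        @simd for i in axes(A, 1)
            s += 2 * A[i, j]
        end
    end
    return s
end

A = ones(100, 100)
vec_style(A) == loop_style(A)   # both return 20000.0
```

Swapping the two loops (i outer, j inner) keeps the result correct but strides through memory, which defeats both the cache and @simd.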


---to be continued---
