
Implement FourierLayer for higher dimensions #31


Closed
pzimbrod wants to merge 8 commits into master from pzimbrod-FourierLayer-ND

Conversation

pzimbrod
Contributor

So far, FourierLayer only works in 1-D. This should be extended to more dimensions, ideally with minimal code repetition.

Replacing the standard implementation with a generated function can accomplish this.
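
A minimal sketch of the idea (hypothetical, not this PR's code): a @generated function can compute the set of transform dimensions from the input's dimensionality N at compile time, so one method body covers 1-D, 2-D and 3-D grids. A (channel, spatial..., batch) layout is assumed here.

# Hypothetical sketch: specialize on the input dimensionality N so the same
# code handles any number of spatial dimensions. Layout assumed to be
# (channel, spatial..., batch); not the implementation from this PR.
using FFTW

@generated function spectral_transform(x::AbstractArray{T,N}) where {T,N}
    dims = Tuple(2:N-1)          # spatial dimensions, computed at compile time
    return :(fft(x, $dims))      # spliced in as a constant, so no runtime branching
end

# e.g. spectral_transform(randn(Float32, 8, 32, 32, 16, 2)) transforms dims 2:4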

@codecov

codecov bot commented Feb 16, 2022

Codecov Report

Merging #31 (8e69b17) into master (9b16e02) will decrease coverage by 2.20%.
The diff coverage is 35.71%.

❗ Current head 8e69b17 differs from pull request most recent head 2e9d4f0. Consider uploading reports for the commit 2e9d4f0 to get more accurate results

@@            Coverage Diff             @@
##           master      #31      +/-   ##
==========================================
- Coverage   41.09%   38.88%   -2.21%     
==========================================
  Files           6        8       +2     
  Lines          73      108      +35     
==========================================
+ Hits           30       42      +12     
- Misses         43       66      +23     
Impacted Files            Coverage Δ
src/OperatorLearning.jl   100.00% <ø> (ø)
src/generated.jl          0.00% <0.00%> (ø)
src/FourierLayer.jl       43.75% <39.47%> (-1.42%) ⬇️
src/trainable.jl          100.00% <100.00%> (ø)


@pzimbrod
Contributor Author

Also, I ditched rfft for a regular fft for the sake of simplicity. But since this roughly doubles computation time and memory requirements, it should be changed later on.
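
For reference, a quick illustration of that cost difference (hypothetical shapes, not from this PR): for real input, rfft stores only the non-redundant half of the spectrum along the first transformed dimension.

# Rough illustration of the rfft vs. fft trade-off mentioned above
# (hypothetical shapes, channel-first layout):
using FFTW
x = randn(Float32, 8, 32, 32, 16, 2)   # (channel, x, y, z, batch)
size(fft(x, 2:4))                      # (8, 32, 32, 16, 2) -- full complex spectrum
size(rfft(x, 2:4))                     # (8, 17, 32, 16, 2) -- only 32÷2+1 = 17 modes along dim 2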

@ChrisRackauckas
Member

Also, I ditched rfft for a regular fft for the sake of simplicity. But since this roughly doubles computation time and memory requirements, it should be changed later on.

Can you capture this in an issue for a future performance improvement? Usually it's easier to remember these kinds of things when they are in the list.

@pzimbrod
Contributor Author

@ChrisRackauckas It seems that codecov complains about the new generated function not being tested. I imagine this can be fixed by taking care of #26 / including tests that call the layer on some data, right?

Can you capture this in an issue for a future performance improvement? Usually it's easier to remember these kinds of things when they are in the list.

Will do.

@ChrisRackauckas
Member

I imagine this can be fixed by taking care of #26 / including tests that call the layer on some data, right?

Maybe. @generated + coverage is a bit unreliable IIRC.
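
For reference, a minimal sketch of the kind of smoke test meant here (hypothetical; it assumes the N-D constructor from this branch and that the layer preserves the input shape when the channel counts match):

# Hypothetical smoke test calling the layer on data, roughly what #26 asks for.
# Assumes the N-D constructor from this branch and a shape-preserving output.
using OperatorLearning, Flux, Test
layer = FourierLayer(8, 8, (32, 32, 16), (2, 3, 4), σ)
x = randn(Float32, 8, 32, 32, 16, 2)
@test size(layer(x)) == size(x)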

@pzimbrod
Contributor Author

Training works fine on CPU and GPU with input randn(8,32,32,16,12) (a 3D grid of 32x32x16), but somehow memory requirements have shot up to the point where the burgers_FNO example - a rather simple 1D case - runs out of memory on a 24GB RTX A5000.

# examples/burgers_FNO.jl l.82
# Run on an RTX A5000 24GB
julia> train!(loss, parameters, train_loader, opt, cb = throttled_cb)
ERROR: a exception was thrown during kernel execution.
       Run Julia on debug level 2 for device stack traces.
ERROR: a exception was thrown during kernel execution.
       Run Julia on debug level 2 for device stack traces.
ERROR: a exception was thrown during kernel execution.
       Run Julia on debug level 2 for device stack traces.

[...] A whole lot more of those

ERROR: a exception was thrown during kernel execution.
       Run Julia on debug level 2 for device stack traces.
ERROR: a exception was thrown during kernel execution.
       Run Julia on debug level 2 for device stack traces.
ERROR: a exception was thrown during kernel execution.
       Run Julia on debug level 2 for device stack traces.
ERROR: KernelException: exception thrown during kernel execution on device NVIDIA RTX A5000
Stacktrace:
  [1] check_exceptions()
    @ CUDA ~/.julia/packages/CUDA/t62lT/src/compiler/exceptions.jl:34
  [2] nonblocking_synchronize
    @ ~/.julia/packages/CUDA/t62lT/lib/cudadrv/context.jl:329 [inlined]
  [3] device_synchronize()
    @ CUDA ~/.julia/packages/CUDA/t62lT/lib/cudadrv/context.jl:317
  [4] CUDA.CuModule(data::Vector{UInt8}, options::Dict{CUDA.CUjit_option_enum, Any})
    @ CUDA ~/.julia/packages/CUDA/t62lT/lib/cudadrv/module.jl:41
  [5] CuModule
    @ ~/.julia/packages/CUDA/t62lT/lib/cudadrv/module.jl:23 [inlined]
  [6] cufunction_link(job::GPUCompiler.CompilerJob, compiled::NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}})
    @ CUDA ~/.julia/packages/CUDA/t62lT/src/compiler/execution.jl:451
  [7] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/1Ajz2/src/cache.jl:95
  [8] cufunction(f::GPUArrays.var"#broadcast_kernel#17", tt::Type{Tuple{CUDA.CuKernelContext, CUDA.CuDeviceMatrix{Float32, 1}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(+), Tuple{Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Nothing, typeof(*), Tuple{Float64, Base.Broadcast.Extruded{CUDA.CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Nothing, typeof(*), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(-), Tuple{Int64, Float64}}, Base.Broadcast.Extruded{CUDA.CuDeviceMatrix{ComplexF32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Nothing, typeof(conj), Tuple{Base.Broadcast.Extruded{CUDA.CuDeviceMatrix{ComplexF32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}}}}}, Int64}}; name::Nothing, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA ~/.julia/packages/CUDA/t62lT/src/compiler/execution.jl:297
  [9] cufunction(f::GPUArrays.var"#broadcast_kernel#17", tt::Type{Tuple{CUDA.CuKernelContext, CUDA.CuDeviceMatrix{Float32, 1}, Base.Broadcast.Broadcasted{Nothing, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(+), Tuple{Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Nothing, typeof(*), Tuple{Float64, Base.Broadcast.Extruded{CUDA.CuDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Nothing, typeof(*), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(-), Tuple{Int64, Float64}}, Base.Broadcast.Extruded{CUDA.CuDeviceMatrix{ComplexF32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{2}, Nothing, typeof(conj), Tuple{Base.Broadcast.Extruded{CUDA.CuDeviceMatrix{ComplexF32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}}}}}}}, Int64}})
    @ CUDA ~/.julia/packages/CUDA/t62lT/src/compiler/execution.jl:291
 [10] macro expansion
    @ ~/.julia/packages/CUDA/t62lT/src/compiler/execution.jl:102 [inlined]
 [11] #launch_heuristic#274
    @ ~/.julia/packages/CUDA/t62lT/src/gpuarrays.jl:17 [inlined]
 [12] copyto!
    @ ~/.julia/packages/GPUArrays/umZob/src/host/broadcast.jl:65 [inlined]
 [13] copyto!
    @ ./broadcast.jl:913 [inlined]
 [14] materialize!
    @ ./broadcast.jl:871 [inlined]
 [15] materialize!
    @ ./broadcast.jl:868 [inlined]
 [16] apply!(o::ADAM, x::CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Δ::CUDA.CuArray{ComplexF32, 2, CUDA.Mem.DeviceBuffer})
    @ Flux.Optimise ~/.julia/packages/Flux/qAdFM/src/optimise/optimisers.jl:184
 [17] update!(opt::ADAM, x::CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, x̄::CUDA.CuArray{ComplexF32, 2, CUDA.Mem.DeviceBuffer})
    @ Flux.Optimise ~/.julia/packages/Flux/qAdFM/src/optimise/train.jl:26
 [18] update!(opt::ADAM, xs::Zygote.Params, gs::Zygote.Grads)
    @ Flux.Optimise ~/.julia/packages/Flux/qAdFM/src/optimise/train.jl:32
 [19] macro expansion
    @ ~/.julia/packages/Flux/qAdFM/src/optimise/train.jl:112 [inlined]
 [20] macro expansion
    @ ~/.julia/packages/Juno/n6wyj/src/progress.jl:134 [inlined]
 [21] train!(loss::Function, ps::Zygote.Params, data::Flux.Data.DataLoader{Tuple{CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}, CUDA.CuArray{Float32, 3, CUDA.Mem.DeviceBuffer}}, Random._GLOBAL_RNG}, opt::ADAM; cb::Flux.var"#throttled#74"{Flux.var"#throttled#70#75"{Bool, Bool, typeof(evalcb), Int64}})
    @ Flux.Optimise ~/.julia/packages/Flux/qAdFM/src/optimise/train.jl:107
 [22] top-level scope
    @ REPL[21]:1
 [23] top-level scope
    @ ~/.julia/packages/CUDA/t62lT/src/initialization.jl:52

So this will need some additional work.
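
As an aside, the failing frame [16] above is Flux's ADAM apply! receiving a ComplexF32 gradient for a Float32 parameter array. A minimal, hypothetical broadcast of that shape shows the same failure mode:

# Hypothetical minimal example of the pattern visible in frame [16] above:
# storing ComplexF32 values into a Float32 CuArray throws (an InexactError on
# the CPU, surfacing as a KernelException on the GPU).
using CUDA
W = CUDA.zeros(Float32, 4, 4)      # real parameter, as in the trace
Δ = CUDA.rand(ComplexF32, 4, 4)    # complex gradient, as in the trace
W .+= Δ                            # expected to fail for nonzero imaginary parts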

@pzimbrod
Contributor Author

pzimbrod commented Feb 18, 2022

It seems that using @ein within @generated messes up type inference, since i𝔉, 𝔉 and 𝔏 all show type Any. Perhaps related to under-Peter/OMEinsum.jl#92.

julia> model = FourierLayer(8,8,(32,32,16),(2,3,4),σ)
FourierLayer with
Convolution path: (8, 8, 32, 32, 16)
Linear path: (8, 8)
Fourier modes: (2, 3, 4)
Activation: σ

julia> x = randn(Float32, 8, 32, 32, 16, 2);

julia> model(x);

julia> @code_warntype model(x)
MethodInstance for (::FourierLayer{typeof(σ), Array{ComplexF32, 5}, Matrix{Float32}, Array{ComplexF32, 5}, Array{Float32, 5}})(::Array{Float32, 5})
  from (a::FourierLayer)(x::AbstractArray{T, N}) where {T, N} in OperatorLearning at /home/zimbropa/OperatorLearning.jl/src/FourierLayer.jl:124
Static Parameters
  T = Float32
  N = 5
Arguments
  a::FourierLayer{typeof(σ), Array{ComplexF32, 5}, Matrix{Float32}, Array{ComplexF32, 5}, Array{Float32, 5}}
  x::Array{Float32, 5}
Locals
  i𝔉::Any
  𝔉::Any
  𝔏::Any
  xₚ::Array{Float32, 5}
  σ::typeof(sigmoid_fast)
  bₗ::Array{Float32, 5}
  bᵩ::Array{ComplexF32, 5}
  Wₗ::Matrix{Float32}
  Wᵩ::Array{ComplexF32, 5}
Body::Any
1 ─       (Wᵩ = Base.getproperty(a, :Wf))
│         (Wₗ = Base.getproperty(a, :Wl))
│         (bᵩ = Base.getproperty(a, :bf))
│         (bₗ = Base.getproperty(a, :bl))
│   %5  = Base.getproperty(a, :σ)::Core.Const(NNlib.σ)
│         (σ = OperatorLearning.fast_act(%5, x))
│   %7  = Core.tuple($(Expr(:static_parameter, 2)))::Core.Const((5,))
│   %8  = ($(Expr(:static_parameter, 2)) - 1)::Core.Const(4)
│   %9  = (1:%8)::Core.Const(1:4)
│   %10 = Base.Generator(Base.identity, %9)::Core.Const(Base.Generator{UnitRange{Int64}, typeof(identity)}(identity, 1:4))
│   %11 = Base.collect(%10)::Vector{Int64}
│   %12 = Core._apply_iterate(Base.iterate, OperatorLearning.tuple, %7, %11)::Core.PartialStruct(Tuple{Int64, Vararg{Int64}}, Any[Core.Const(5), Vararg{Int64}])
│         (xₚ = OperatorLearning.permutedims(x, %12))
│   %14 = Core.tuple((1, 2), (3, 2, 4, 5, 6))::Core.Const(((1, 2), (3, 2, 4, 5, 6)))
│   %15 = OMEinsum.EinCode(%14, (3, 1, 4, 5, 6))::OMEinsum.DynamicEinCode{Int64}
│   %16 = Core.tuple(Wₗ, xₚ)::Tuple{Matrix{Float32}, Array{Float32, 5}}
│         (𝔏 = OMEinsum.einsum(%15, %16))
│   %18 = Base.broadcasted(OperatorLearning.:+, 𝔏, bₗ)::Any
│         Base.materialize(%18)
│   %20 = Core.tuple((1, 2, 3, 4, 5), (6, 1, 3, 4, 5))::Core.Const(((1, 2, 3, 4, 5), (6, 1, 3, 4, 5)))
│   %21 = OMEinsum.EinCode(%20, (6, 2, 3, 4, 5))::OMEinsum.DynamicEinCode{Int64}
│   %22 = Wᵩ::Array{ComplexF32, 5}
│   %23 = xₚ::Array{Float32, 5}
│   %24 = (3:$(Expr(:static_parameter, 2)))::Core.Const(3:5)
│   %25 = Base.Generator(Base.identity, %24)::Core.Const(Base.Generator{UnitRange{Int64}, typeof(identity)}(identity, 3:5))
│   %26 = Base.collect(%25)::Vector{Int64}
│   %27 = OperatorLearning.fft(%23, %26)::Array{ComplexF32, 5}
│   %28 = Core.tuple(%22, %27)::Tuple{Array{ComplexF32, 5}, Array{ComplexF32, 5}}
│         (𝔉 = OMEinsum.einsum(%21, %28))
│   %30 = Base.broadcasted(OperatorLearning.:+, 𝔉, bᵩ)::Any
│         Base.materialize(%30)
│   %32 = 𝔉::Any
│   %33 = (3:$(Expr(:static_parameter, 2)))::Core.Const(3:5)
│   %34 = Base.Generator(Base.identity, %33)::Core.Const(Base.Generator{UnitRange{Int64}, typeof(identity)}(identity, 3:5))
│   %35 = Base.collect(%34)::Vector{Int64}
│         (i𝔉 = OperatorLearning.ifft(%32, %35))
│   %37 = σ::Core.Const(NNlib.sigmoid_fast)
│   %38 = 𝔏::Any
│   %39 = OperatorLearning.real(i𝔉)::Any
│   %40 = (%38 + %39)::Any
│   %41 = Base.broadcasted(%37, %40)::Any
│   %42 = Base.materialize(%41)::Any
│   %43 = (2:$(Expr(:static_parameter, 2)))::Core.Const(2:5)
│   %44 = Base.Generator(Base.identity, %43)::Core.Const(Base.Generator{UnitRange{Int64}, typeof(identity)}(identity, 2:5))
│   %45 = Base.collect(%44)::Vector{Int64}
│   %46 = Core.tuple(1)::Core.Const((1,))
│   %47 = Core._apply_iterate(Base.iterate, OperatorLearning.tuple, %45, %46)::Tuple{Vararg{Int64}}
│   %48 = OperatorLearning.permutedims(%42, %47)::Any
└──       return %48

Can't use Tullio.jl as an alternative since it won't work in generated functions at all - mcabbott/Tullio.jl#129
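
For comparison, a type-stable way to express just the channel-mixing (linear path) contraction without @ein is plain reshape + matmul; a sketch, assuming a channel-first (C_in, spatial..., batch) layout, not code from this PR:

# Hypothetical, type-stable alternative for the linear-path contraction
# W[o, i] * x[i, spatial..., b]; layout assumed (channel, spatial..., batch).
function channel_mix(W::AbstractMatrix, x::AbstractArray{T,N}) where {T,N}
    y = W * reshape(x, size(x, 1), :)              # contract over the channel dim
    return reshape(y, size(W, 1), size(x)[2:N]...) # restore (C_out, spatial..., batch)
end

# e.g. channel_mix(randn(Float32, 8, 8), randn(Float32, 8, 32, 32, 16, 2))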

pzimbrod mentioned this pull request Feb 21, 2022
pzimbrod closed this Jun 24, 2022
pzimbrod deleted the pzimbrod-FourierLayer-ND branch June 24, 2022 06:19