gpu
Function defined in module Flux.

gpu(m)
Copies m to the current GPU device (using the current GPU backend), if one is available. If no GPU is available, it does nothing (but prints a warning the first time).
On arrays, this calls CUDA's cu, which also changes arrays with Float64 elements to Float32 while copying them to the device (the same applies for AMDGPU). To act on arrays within a struct, the struct type must be marked with @functor.
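For example, a minimal sketch of marking a custom struct so that gpu recurses into its fields (Affine is a hypothetical type, not part of Flux; the output assumes a working CUDA device):

julia> struct Affine       # hypothetical user-defined layer
           W
           b
       end

julia> Flux.@functor Affine    # gpu (and cpu, f32, ...) now recurse into W and b

julia> a = Affine(rand(2, 3), rand(2));

julia> typeof(gpu(a).W)        # moved to the device, Float64 changed to Float32
CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}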
Use cpu to copy back to ordinary Arrays. See also f32 and f16 to change the element type only.
See the CUDA.jl docs to help identify the current device.
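A quick check from the REPL (a sketch assuming CUDA.jl is installed and a GPU is present; outputs vary with your hardware):

julia> using CUDA

julia> CUDA.functional()   # true when a usable GPU and driver are found
true

julia> CUDA.device()       # identifies the currently active device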
julia> m = Dense(rand(2, 3)) # constructed with Float64 weight matrix
Dense(3 => 2) # 8 parameters
julia> typeof(m.weight)
Matrix{Float64} (alias for Array{Float64, 2})
julia> m_gpu = gpu(m) # can equivalently be written m_gpu = m |> gpu
Dense(3 => 2) # 8 parameters
julia> typeof(m_gpu.weight)
CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
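Continuing this example, a sketch of copying back and of changing the element type without moving (assumes the m and m_gpu defined above):

julia> m_cpu = cpu(m_gpu)      # copy back to an ordinary Array
Dense(3 => 2)  # 8 parameters

julia> typeof(m_cpu.weight)    # the element type stays Float32 after the round trip
Matrix{Float32} (alias for Array{Float32, 2})

julia> typeof(f32(m).weight)   # f32 changes the element type but stays on the CPU
Matrix{Float32} (alias for Array{Float32, 2})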
gpu(data::DataLoader)
Transforms a given DataLoader to apply gpu to each batch of data, when iterated over. (If no GPU is available, this does nothing.)
julia> dl = Flux.DataLoader((x = ones(2,10), y='a':'j'), batchsize=3)
4-element DataLoader(::NamedTuple{(:x, :y), Tuple{Matrix{Float64}, StepRange{Char, Int64}}}, batchsize=3)
with first element:
(; x = 2×3 Matrix{Float64}, y = 3-element StepRange{Char, Int64})
julia> first(dl)
(x = [1.0 1.0 1.0; 1.0 1.0 1.0], y = 'a':1:'c')
julia> c_dl = gpu(dl)
4-element DataLoader(::MLUtils.MappedData{:auto, typeof(gpu), NamedTuple{(:x, :y), Tuple{Matrix{Float64}, StepRange{Char, Int64}}}}, batchsize=3)
with first element:
(; x = 2×3 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, y = 3-element StepRange{Char, Int64})
julia> first(c_dl).x
2×3 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
1.0 1.0 1.0
1.0 1.0 1.0
For large datasets, this is preferred over moving all the data to the GPU before creating the DataLoader, like this:
julia> Flux.DataLoader((x = ones(2,10), y=2:11) |> gpu, batchsize=3)
4-element DataLoader(::NamedTuple{(:x, :y), Tuple{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, UnitRange{Int64}}}, batchsize=3)
with first element:
(; x = 2×3 CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, y = 3-element UnitRange{Int64})
This only works if gpu is applied directly to the DataLoader. While gpu acts recursively on Flux models and many basic Julia structs, it will not work on (say) a tuple of DataLoaders.
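To move several loaders, apply gpu to each one individually (a sketch; train_dl and test_dl are hypothetical names):

julia> train_dl = Flux.DataLoader((x = ones(2, 10),), batchsize=5);

julia> test_dl = Flux.DataLoader((x = ones(2, 4),), batchsize=2);

julia> train_dl, test_dl = gpu(train_dl), gpu(test_dl);   # each loader, not the tuple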