`MLUtils.eachobsparallel`: function defined in module `MLUtils`
eachobsparallel(data; buffer, executor, channelsize)
Construct a data iterator over the observations in the container `data`. It uses the available threads as workers to load observations in parallel, which can give large speedups when multiple threads are available.
To ensure that the active Julia session has multiple threads available, check that `Threads.nthreads() > 1`. You can start Julia with multiple threads using the `-t n` option. If your data loading is bottlenecked by the CPU, it is recommended to set `n` to the number of physical CPU cores.
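As a minimal sketch of the check described above, the following snippet reports whether the current session can load in parallel. The messages are illustrative only; `julia -t 4` is just an example value for `n`.

```julia
# Start Julia with several threads first, e.g. `julia -t 4`,
# then run this check inside the session.
n_threads = Threads.nthreads()

if n_threads > 1
    println("Parallel loading can use ", n_threads, " threads.")
else
    println("Only one thread is available; restart Julia with `-t n` to load in parallel.")
end
```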
- `data`: a data container that implements `getindex`/`getobs` and `length`/`numobs`.
- `buffer = false`: whether to use in-place data loading with `getobs!`. Only use this if you need the additional performance and `getobs!` is implemented for `data`. Setting `buffer = true` means that, when using the iterator, an observation is only valid for the current loop iteration. You can also pass in a preallocated `buffer = getobs(data, 1)`.
- `executor = Folds.ThreadedEx()`: the task scheduler. You may specify a different task scheduler, which can be any `Folds.Executor`.
- `channelsize = Threads.nthreads()`: the number of observations that are prefetched. Increasing `channelsize` can lead to speedups when per-observation processing time is irregular, but it will cause higher memory usage.
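The arguments above can be illustrated with a short sketch. It assumes a plain matrix as the data container, where each column is one observation, and uses a hypothetical helper `sum_observations` that is not part of MLUtils. The function is called with its module prefix so the example does not depend on whether it is exported. The result is computed so that it does not depend on the order in which observations arrive, since a parallel loader may not preserve it.

```julia
using MLUtils

# Hypothetical helper (not part of MLUtils): sums all entries of `data`
# by iterating over its observations, which are loaded in parallel.
function sum_observations(data; channelsize = Threads.nthreads())
    total = 0.0
    # The loop body runs in the calling task; only loading is parallel.
    for x in MLUtils.eachobsparallel(data; channelsize = channelsize)
        total += sum(x)
    end
    return total
end

# 4 observations (columns) with 3 features each.
X = reshape(collect(1.0:12.0), 3, 4)
total = sum_observations(X)
```

With `buffer = true`, each `x` would only be valid for the current iteration, so a loop that stores observations for later use would need to copy them first.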
There is 1 method for `MLUtils.eachobsparallel`: