`MLUtils.eachobsparallel`

function defined in module MLUtils


			eachobsparallel(data; buffer, executor, channelsize)

Construct a data iterator over the observations in the container data. It uses the available threads as workers to load observations in parallel, which can yield large speedups when multiple threads are available.

To ensure that the active Julia session has multiple threads available, check that Threads.nthreads() > 1. You can start Julia with multiple threads with the -t n option. If your data loading is bottlenecked by the CPU, it is recommended to set n to the number of physical CPU cores.
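The thread check and a basic iteration can be sketched as follows. This is a minimal illustration, not a canonical usage pattern: the toy container `data` and the `seen` vector are hypothetical names introduced here, and the function is called with its module prefix on the assumption that it is not exported. The sketch also avoids relying on the order in which observations arrive.

```julia
using MLUtils

# Hypothetical toy container: a vector of 100 integer observations.
data = collect(1:100)

# Parallel loading only helps when more than one thread is available.
if Threads.nthreads() == 1
    @warn "Julia is running with a single thread; start it with `-t n` to load in parallel."
end

# Gather the observations; push! is used so that no arrival order is assumed.
seen = Int[]
for obs in MLUtils.eachobsparallel(data)
    push!(seen, obs)
end
```

Sorting `seen` afterwards is a simple way to confirm that every observation was visited exactly once.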

Arguments

data: a data container that implements getindex/getobs and length/numobs
buffer = false: whether to use in-place data loading with getobs!. Only use this if you need the additional performance and getobs! is implemented for data. Setting buffer = true means that, when using the iterator, an observation is only valid for the current loop iteration. You can also pass in a preallocated buffer = getobs(data, 1).
executor = Folds.ThreadedEx(): the task scheduler. You may specify a different task scheduler, which can be any Folds.Executor.
channelsize = Threads.nthreads(): the number of observations that are prefetched. Increasing channelsize can lead to speedups when per-observation processing time is irregular but will cause higher memory usage.
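The keyword arguments above might be combined as in the sketch below. It assumes that getobs! is available for plain arrays in MLUtils; the matrix `data`, the accumulator `total`, and the choice of `2 * Threads.nthreads()` for channelsize are illustrative only. Because buffer = true makes each observation valid only for the current iteration, the loop reduces each observation to a number immediately instead of storing it.

```julia
using MLUtils

# Hypothetical toy container: 10 observations (columns) of 4 features each.
data = reshape(collect(1:40), 4, 10)

# A Ref accumulator, so the loop mutates it rather than rebinding a global.
total = Ref(0)

# buffer = true reuses preallocated storage; a larger channelsize prefetches
# more observations at the cost of higher memory usage.
for obs in MLUtils.eachobsparallel(data; buffer = true,
                                   channelsize = 2 * Threads.nthreads())
    # Use obs only inside this iteration; call copy(obs) to keep it longer.
    total[] += sum(obs)
end
```

If observations need to outlive the loop body, copying them gives up part of the benefit of buffer = true, in which case the default buffer = false may be the simpler choice.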

Methods

There is 1 method for MLUtils.eachobsparallel:

parallel.jl:29

Backlinks

The following pages link back here:

Performant data pipelines

eachobs.jl, parallel.jl