Nesterov
	
struct defined in module Flux.Optimise
    Nesterov(η = 0.001, ρ = 0.9)

Gradient descent optimizer with learning rate η and Nesterov momentum ρ.

- Learning rate (η): Amount by which gradients are discounted before updating the weights.
- Nesterov momentum (ρ): Controls the acceleration of gradient descent in the prominent direction, in effect damping oscillations.
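Conceptually, the optimiser keeps a per-parameter velocity and combines it with the current gradient in a "look-ahead" fashion. The following is a minimal, self-contained sketch of one common formulation of a Nesterov momentum step; nesterov_step!, w, v, and grad are hypothetical names used only for illustration and are not part of the Flux API.

    # Hypothetical helper, not part of Flux: one Nesterov momentum step on a
    # single parameter array `w`, with velocity buffer `v` and gradient `grad`.
    function nesterov_step!(w, v, grad; η = 0.001, ρ = 0.9)
        @. v = ρ * v - η * grad      # refresh the per-parameter velocity
        @. w += ρ * v - η * grad     # look-ahead update: momentum plus gradient correction
        return w
    end

    # Example: w = randn(3); v = zero(w); nesterov_step!(w, v, randn(3); η = 0.003, ρ = 0.95)

In Flux, this bookkeeping happens inside the optimiser; you only construct Nesterov(η, ρ) and hand it to the training utilities.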
			
			
			
    opt = Nesterov()

    opt = Nesterov(0.003, 0.95)
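Once constructed, the optimiser is passed to Flux's training utilities. The sketch below assumes a Flux version that ships Flux.Optimise and its implicit-parameters API (Flux.params, Flux.Optimise.update!, Flux.train!); model, loss, xs, and ys are placeholder names for illustration.

    using Flux
    using Flux.Optimise: Nesterov, update!

    # Placeholder toy model and data for illustration.
    model = Dense(10, 1)
    xs, ys = rand(Float32, 10, 16), rand(Float32, 1, 16)
    loss(x, y) = Flux.Losses.mse(model(x), y)

    opt = Nesterov(0.003, 0.95)
    ps = Flux.params(model)

    # One manual optimisation step with the implicit-parameters API.
    gs = gradient(() -> loss(xs, ys), ps)
    update!(opt, ps, gs)

    # Equivalent convenience form over a dataset of (x, y) batches.
    Flux.train!(loss, ps, [(xs, ys)], opt)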