Nesterov
	
struct defined in module Flux.Optimise
    Nesterov(η = 0.001, ρ = 0.9)

Gradient descent optimizer with learning rate η and Nesterov momentum ρ.

- Learning rate (η): Amount by which gradients are discounted before updating the weights.
- Nesterov momentum (ρ): Controls the acceleration of gradient descent in the prominent direction, in effect damping oscillations.
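Conceptually, the optimiser keeps a per-parameter velocity and combines it with the current gradient in a "look-ahead" fashion. The following is a minimal, self-contained sketch of one common formulation of a Nesterov momentum step; nesterov_step!, w, v, and grad are hypothetical names used only for illustration and are not part of the Flux API.

    # Hypothetical helper, not part of Flux: one Nesterov momentum step on a
    # single parameter array `w`, with velocity buffer `v` and gradient `grad`.
    function nesterov_step!(w, v, grad; η = 0.001, ρ = 0.9)
        @. v = ρ * v - η * grad      # refresh the per-parameter velocity
        @. w += ρ * v - η * grad     # look-ahead update: momentum plus gradient correction
        return w
    end

    # Example: w = randn(3); v = zero(w); nesterov_step!(w, v, randn(3); η = 0.003, ρ = 0.95)

In Flux, this bookkeeping happens inside the optimiser; you only construct Nesterov(η, ρ) and hand it to the training utilities.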
			
			
			
    opt = Nesterov()

    opt = Nesterov(0.003, 0.95)
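Once constructed, the optimiser is passed to Flux's training utilities. The sketch below assumes a Flux version that ships Flux.Optimise and its implicit-parameters API (Flux.params, Flux.Optimise.update!, Flux.train!); model, loss, xs, and ys are placeholder names for illustration.

    using Flux
    using Flux.Optimise: Nesterov, update!

    # Placeholder toy model and data for illustration.
    model = Dense(10, 1)
    xs, ys = rand(Float32, 10, 16), rand(Float32, 1, 16)
    loss(x, y) = Flux.Losses.mse(model(x), y)

    opt = Nesterov(0.003, 0.95)
    ps = Flux.params(model)

    # One manual optimisation step with the implicit-parameters API.
    gs = gradient(() -> loss(xs, ys), ps)
    update!(opt, ps, gs)

    # Equivalent convenience form over a dataset of (x, y) batches.
    Flux.train!(loss, ps, [(xs, ys)], opt)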