AMSGrad
	
struct defined in module Flux.Optimise

AMSGrad(η = 0.001, β::Tuple = (0.9, 0.999), ϵ = 1.0e-8)
The AMSGrad version of the Adam optimiser. Parameters don't need tuning.
Parameters:

Learning rate (η): Amount by which gradients are discounted before updating the weights.
Decay of momentums (β::Tuple): Exponential decay for the first (β1) and the second (β2) momentum estimate.
Machine epsilon (ϵ): Small constant added to the denominator to avoid division by zero; the default rarely needs changing.
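For reference, these parameters enter the AMSGrad update rule (Reddi et al., 2018) roughly as in the sketch below. This is a minimal illustration for a single parameter array, not Flux's internal implementation; the function name amsgrad_step! and its arguments are placeholders.

# Minimal sketch of the AMSGrad update for one parameter array θ with gradient g.
# The state m, v, v̂ persists across steps; names are illustrative, not Flux internals.
function amsgrad_step!(θ, g, m, v, v̂; η = 0.001, β = (0.9, 0.999), ϵ = 1.0e-8)
    @. m = β[1] * m + (1 - β[1]) * g      # first-moment (momentum) estimate
    @. v = β[2] * v + (1 - β[2]) * g^2    # second-moment estimate
    @. v̂ = max(v̂, v)                      # running maximum of v: the AMSGrad change to Adam
    @. θ = θ - η * m / (sqrt(v̂) + ϵ)      # discounted gradient step
    return θ
end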
			
			
			
Examples:

opt = AMSGrad()

opt = AMSGrad(0.001, (0.89, 0.995))
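Beyond constructing it, AMSGrad is used like any other Flux optimiser. The sketch below assumes the implicit-parameter training API from Flux.Optimise (Flux.params, gradient, and update!); the model, data, and loop length are toy placeholders.

using Flux
using Flux.Optimise: AMSGrad, update!

model = Dense(10, 1)                                    # toy model
x, y = rand(Float32, 10, 16), rand(Float32, 1, 16)      # toy batch

opt = AMSGrad()                                         # defaults: η = 0.001, β = (0.9, 0.999)
ps = Flux.params(model)

for epoch in 1:100
    gs = gradient(() -> Flux.Losses.mse(model(x), y), ps)  # gradients of the loss w.r.t. ps
    update!(opt, ps, gs)                                    # one AMSGrad step on all parameters
end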