RNN

function defined in module Flux


			RNN(in => out, σ = tanh)

The most basic recurrent layer; essentially acts as a Dense layer, but with the output fed back into the input each time step.

The arguments in and out describe the size of the feature vectors passed as input and as output. That is, the layer accepts a vector of length in, or a batch of vectors represented as an in x B matrix, and outputs a vector of length out, or a batch of vectors of size out x B.

This constructor is syntactic sugar for Recur(RNNCell(a...)), so RNNs are stateful. Note that the state shape can change depending on the inputs, so it is good to call reset! on the model between inference calls if the batch size changes. See the examples below.
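For example, one common calling pattern is to feed a sequence to the layer one timestep at a time, then call reset! before the next, independent sequence. The snippet below is a minimal sketch of that pattern with made-up data and sizes (7 timesteps of length-3 features); see the Examples section for the exact input and output shapes.


			julia> using Flux

julia> m = RNN(3 => 5);   # stateful: wraps RNNCell(3 => 5, tanh) in Recur

julia> seq = [rand(Float32, 3) for _ in 1:7];   # hypothetical sequence of 7 timesteps

julia> outs = [m(x) for x in seq];   # the hidden state is carried across these calls

julia> size(outs[end])
(5,)

julia> Flux.reset!(m);   # restore the learnable initial state before the next sequence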

Examples


			julia> r = RNN(3 => 5)
Recur(
  RNNCell(3 => 5, tanh),                # 50 parameters
)         # Total: 4 trainable arrays, 50 parameters,
          # plus 1 non-trainable, 5 parameters, summarysize 432 bytes.

julia> r(rand(Float32, 3)) |> size
(5,)

julia> Flux.reset!(r);

julia> r(rand(Float32, 3, 10)) |> size # batch size of 10
(5, 10)

Batch size changes

Failing to call reset! when the input batch size changes can lead to unexpected behavior. See the following example:

			julia> r = RNN(3 => 5)
Recur(
  RNNCell(3 => 5, tanh),                # 50 parameters
)         # Total: 4 trainable arrays, 50 parameters,
          # plus 1 non-trainable, 5 parameters, summarysize 432 bytes.

julia> r.state |> size
(5, 1)

julia> r(rand(Float32, 3)) |> size
(5,)

julia> r.state |> size
(5, 1)

julia> r(rand(Float32, 3, 10)) |> size # batch size of 10
(5, 10)

julia> r.state |> size # state shape has changed
(5, 10)

julia> r(rand(Float32, 3)) |> size # erroneously outputs a length 5*10 = 50 vector.
(50,)
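
Calling Flux.reset! before switching back to a single sample avoids this: it restores the state to the learnable initial state state0, whose size (5, 1) broadcasts against any batch size. Continuing the session above, a short sketch of the corrected calls:

			julia> Flux.reset!(r);   # discard the (5, 10) state left over from the batched call

julia> r.state |> size
(5, 1)

julia> r(rand(Float32, 3)) |> size
(5,)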

Note:

RNNCells can be constructed directly by specifying the non-linear function, the Wi and Wh internal matrices, a bias vector b, and a learnable initial state state0. The Wi and Wh matrices do not need to be the same type, but if Wh is dxd, then Wi should be of shape dxN.

			julia> using LinearAlgebra

julia> r = Flux.Recur(Flux.RNNCell(tanh, rand(5, 4), Tridiagonal(rand(5, 5)), rand(5), rand(5, 1)))

julia> r(rand(4, 10)) |> size # batch size of 10
(5, 10)

Methods

There is 1 method for Flux.RNN: